Hi, I wanted to raise two related observations about the scoring stage.
I am using:
applypilot 0.3.0
LLM: local Ollama, llama3.1:8b, 100% GPU
OS: Windows 11
Observation 1 - Scoring progress is lost on interruption
In scoring/scorer.py, run_scoring accumulates all scoring results into a results list during the loop and then writes them all to the database in a single batch commit after the loop completes. If the run is interrupted (Ctrl+C, power loss, crash) before reaching that final commit, no scores are written and the entire run's progress is lost. On restart, all jobs still have a fit_score of NULL and the full batch is reprocessed from the beginning.
I am not sure whether this is intentional, for example to score all jobs together so the db is never left in a partial state. I was about 100 of 163 jobs into a run when my machine was powered off. To keep a failure from wiping out all progress, I moved the db write call inside the score loop.
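A minimal sketch of the per-job commit idea, for illustration only. It assumes a SQLite table named jobs with url, title, and fit_score columns; the function name run_scoring_incremental and the score_fn parameter are hypothetical stand-ins and are not taken from the actual applypilot code.

```python
import sqlite3


def run_scoring_incremental(conn: sqlite3.Connection, score_fn) -> int:
    """Score every unscored job, committing after each one.

    score_fn is a hypothetical callable standing in for the LLM call.
    Returns the number of jobs scored during this call.
    """
    # Assumed schema: jobs(url TEXT PRIMARY KEY, title TEXT, fit_score INTEGER)
    rows = conn.execute(
        "SELECT url, title FROM jobs WHERE fit_score IS NULL"
    ).fetchall()

    scored = 0
    for url, title in rows:
        score = score_fn({"url": url, "title": title})
        conn.execute(
            "UPDATE jobs SET fit_score = ? WHERE url = ?", (score, url)
        )
        # Commit inside the loop: an interruption after this point
        # loses at most the job currently being scored.
        conn.commit()
        scored += 1
    return scored
```

With this shape, a restart only selects the jobs whose fit_score is still NULL, so scores already written are not recomputed. The cost is one commit per job, which is likely small next to the time of an LLM call.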
Observation 2 - Jobs scored 0 after an LLM failure cannot be reprocessed
I used a local LLM on my first score run. Something failed with the LLM on that run and all 163 jobs received a fit_score of 0. I did not see a way to reprocess these jobs. My immediate fix was to change the 0s back to NULL and re-run score. Afterwards, I modified database.py and altered the WHERE clause that selects the jobs to score to "WHERE fit_score IS NULL OR fit_score = 0". My reasoning was that the SELECT statement is only executed once, at the beginning of a score run, so this would not result in an infinite failure loop if there were a genuine problem.
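A minimal sketch of the widened selection, for illustration only. It assumes the same hypothetical jobs table (url, title, fit_score); fetch_pending, rescore_once, and score_fn are made-up names, and writing 0 on failure is an assumption about how the failed scores ended up as 0.

```python
import sqlite3

# Treat both "never scored" (NULL) and "scored 0" as pending.
PENDING_SQL = (
    "SELECT url, title FROM jobs "
    "WHERE fit_score IS NULL OR fit_score = 0"
)


def fetch_pending(conn: sqlite3.Connection) -> list:
    """Return the jobs that a score run would pick up."""
    return conn.execute(PENDING_SQL).fetchall()


def rescore_once(conn: sqlite3.Connection, score_fn) -> int:
    """Run one scoring pass over the pending jobs.

    The pending list is fetched once, before the loop, so a job that
    fails again is attempted only once per run.
    """
    pending = fetch_pending(conn)
    for url, title in pending:
        try:
            score = score_fn(title)
        except Exception:
            # Assumption: a failed LLM call is recorded as 0.
            score = 0
        conn.execute(
            "UPDATE jobs SET fit_score = ? WHERE url = ?", (score, url)
        )
        conn.commit()
    return len(pending)
```

Because the list is built before the loop starts, a scorer that keeps failing touches each job once and the run ends. One trade-off to consider: if 0 can ever be a legitimate score, those jobs would be picked up again on every later run, so recording failures as NULL instead of 0 may be a cleaner way to tell the two cases apart.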
I would appreciate learning if either of these fixes creates a problem that I did not anticipate. Thank you for creating this project! I love it and am very thankful that I have a better way to search and apply for jobs.