Summary
AutoRecLab currently fails mid-run with FileNotFoundError when required
dataset files are missing or have unexpected names. The failure occurs only
after several LLM iterations have already been executed, which wastes time
and API costs.
Current Behavior
When a required file (e.g. u.data, VideoGames.csv) is not found,
the run crashes with a FileNotFoundError deep inside the generated code,
often only after multiple tree-search iterations.
Expected Behavior
Before any LLM calls are made, AutoRecLab should:
- Parse the user prompt for expected file names/datasets
- Check whether those files exist in the working directory
- Print a clear summary of which files are present and which are missing
- Exit immediately with a helpful error message if required files are absent
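The first step in the list above, finding file names in the prompt, could be sketched roughly as follows. This is only an illustration: the function name `extract_expected_files`, the `DATA_EXTENSIONS` list, and the regular expression are assumptions, not part of AutoRecLab, and a real implementation may prefer an explicit file list in the config over parsing free text.

```python
import re

# Hypothetical list of extensions treated as dataset files (assumption).
DATA_EXTENSIONS = ("csv", "tsv", "data", "json", "parquet", "txt")

# Matches tokens such as "u.data" or "data/VideoGames.csv".
_FILE_PATTERN = re.compile(
    r"\b[\w\-./]*\w\.(?:" + "|".join(DATA_EXTENSIONS) + r")\b",
    re.IGNORECASE,
)


def extract_expected_files(prompt: str) -> list[str]:
    """Return file names mentioned in the prompt, in order, without duplicates."""
    seen: dict[str, None] = {}
    for match in _FILE_PATTERN.finditer(prompt):
        seen.setdefault(match.group(0), None)
    return list(seen)


if __name__ == "__main__":
    print(extract_expected_files("Train on u.data and VideoGames.csv."))
```

A plain regular expression will miss datasets that are referred to by name only (for example "MovieLens 100K"), so the result is best treated as a hint rather than a complete list.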
Why This Matters
This improvement would:
- Save API costs (no wasted LLM calls before the inevitable crash)
- Make debugging much faster for users
- Require only a few hours to implement (a pre-run check function)
Suggested Implementation
Add a simple pre-flight check function that is called before TreeSearch is
initialized. It scans the workspace directory and compares its contents
against the files mentioned in the prompt or config.
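A minimal sketch of such a check is shown below. The names `preflight_check` and `run_preflight_or_exit` are hypothetical, and the sketch assumes the list of required files has already been collected from the prompt or config. How it would be wired in before TreeSearch depends on AutoRecLab's entry point, which is not shown here.

```python
import sys
from pathlib import Path


def preflight_check(required_files, workspace="."):
    """Print which required files exist in the workspace; return the missing ones."""
    root = Path(workspace)
    present, missing = [], []
    for name in required_files:
        (present if (root / name).is_file() else missing).append(name)

    print(f"Pre-flight check in {root.resolve()}:")
    for name in present:
        print(f"  [ok]      {name}")
    for name in missing:
        print(f"  [missing] {name}")
    return missing


def run_preflight_or_exit(required_files, workspace="."):
    """Exit with a readable message before any LLM call if files are missing."""
    missing = preflight_check(required_files, workspace)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(
            "Missing required dataset files: "
            + ", ".join(missing)
            + ". Add them to the workspace or fix the file names, then retry."
        )
```

Calling `run_preflight_or_exit(["u.data", "VideoGames.csv"], workspace)` at startup would stop the run immediately when either file is absent, rather than after several tree-search iterations.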
Context
We encountered this issue repeatedly while replicating the case study
from the AutoRecLab preprint. Runs failed consistently due to missing or misnamed files.