Relative cache:// path silently fails under parallel execution (threading race on CWD) #71

Description

@yannrichet-asnr

Bug

When fzr() is called with a relative cache:// URI (e.g. cache://my-results/), cache lookup silently fails if more than one worker thread is active.

Root cause

run_local_calculation() calls os.chdir(working_dir) inside worker threads. Because the CWD is process-wide, when one thread changes it, every other thread sees the new CWD immediately.
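The process-wide nature of the CWD can be checked in isolation. The following is a minimal sketch, independent of fz; `demo_shared_cwd` is a hypothetical helper written for this illustration and does not exist in the library.

```python
import os
import tempfile
import threading


def demo_shared_cwd():
    """Return the CWD seen by the main thread before and after a worker's chdir."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        # The worker thread does nothing except change the working directory.
        worker = threading.Thread(target=os.chdir, args=(tmp,))
        worker.start()
        worker.join()
        # The main thread never called chdir, yet its CWD has changed.
        seen_by_main = os.getcwd()
        # Restore the CWD before the temporary directory is deleted.
        os.chdir(original_cwd)
    return original_cwd, seen_by_main


if __name__ == "__main__":
    before, after = demo_shared_cwd()
    print("main thread CWD before worker:", before)
    print("main thread CWD after worker: ", after)
```

The two printed paths differ, which is the behaviour the worker threads in fzr() are exposed to.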

resolve_cache_paths() in io.py checks the pattern with Path(cache_pattern).exists(), which resolves a relative path against the CWD at call time. If a concurrent thread has changed the CWD, Path("my-results/").exists() returns False even though the directory exists under the original CWD, so every case is treated as a cache miss and is recomputed.
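The CWD dependence of a relative `Path(...).exists()` can be shown without threads. This is only a sketch: the second `os.chdir` stands in for a concurrent worker's change, and `exists_before_and_after_chdir` is a hypothetical name, not part of fz.

```python
import os
import tempfile
from pathlib import Path


def exists_before_and_after_chdir():
    """Check a relative and an absolute path before and after the CWD moves."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as base, \
            tempfile.TemporaryDirectory() as elsewhere:
        try:
            os.chdir(base)
            Path("my-results").mkdir()
            relative = Path("my-results")
            absolute = relative.resolve()  # anchored to `base` once, up front
            before = relative.exists()
            os.chdir(elsewhere)  # stands in for another thread's chdir
            after_relative = relative.exists()
            after_absolute = absolute.exists()
        finally:
            # Always restore the CWD so the temp directories can be removed.
            os.chdir(original_cwd)
    return before, after_relative, after_absolute


if __name__ == "__main__":
    print(exists_before_and_after_chdir())  # (True, False, True)
```

The relative path stops being found once the CWD moves, while the path resolved up front remains valid.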

Reproducer

import fz

fz.fzr(
    input_path="model.m6",
    input_variables={"x": [1, 2, 3]},
    model="Moret",
    calculators=["localhost_Moret", "cache://previous-run/"],  # relative path
    results_dir="results",
)

With more than one parallel worker, the cache://previous-run/ lookup is unreliable: whether it hits depends on which thread last changed the CWD.

Fix

Resolve all relative cache:// paths to absolute (anchored to original_cwd) before spawning worker threads, so Path(...).exists() is CWD-independent.
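One way the proposed fix could look is sketched below. This is not the code from the PR: `absolutize_cache_uri` and `CACHE_PREFIX` are hypothetical names, and the sketch assumes that everything after `cache://` is a filesystem path or pattern, as the workaround's `cache:///absolute/...` form suggests.

```python
from pathlib import Path

CACHE_PREFIX = "cache://"  # assumed scheme prefix, taken from the examples in this issue


def absolutize_cache_uri(uri, original_cwd):
    """Anchor a relative cache:// URI to original_cwd; leave anything else as is."""
    if not uri.startswith(CACHE_PREFIX):
        return uri  # not a cache calculator, e.g. "localhost_Moret"
    pattern = Path(uri[len(CACHE_PREFIX):])
    if pattern.is_absolute():
        return uri  # already CWD-independent
    # Joining with an absolute base makes later exists() checks CWD-independent.
    return CACHE_PREFIX + str(Path(original_cwd) / pattern)


if __name__ == "__main__":
    original_cwd = Path.cwd()  # captured once, before any worker thread starts
    calculators = ["localhost_Moret", "cache://previous-run/"]
    calculators = [absolutize_cache_uri(c, original_cwd) for c in calculators]
    print(calculators)
```

The key design point is that `original_cwd` is captured and applied in the main thread before any worker is spawned, so no later `os.chdir` can influence the lookup. Note that `Path` drops the trailing slash of the pattern, which does not affect an `exists()` check.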

See PR #.

Workaround (until fixed)

Use an absolute path:

calculators=["localhost_Moret", "cache:///absolute/path/to/previous-run/"]
