Opencode CLI runs can use file and bash tools, but when launched from the repo root they can access benchmark artifacts such as groundtruth.txt, reports/, traces/, or previous outputs. We should run each scenario in a clean staged workspace outside the repo, copying in only the allowed scenario inputs (question.txt, manifest.json, and the scenario data files).
This keeps the benefits of CLI and file access while preventing evaluation leakage.
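
A minimal sketch of the staging step, under stated assumptions: the function name stage_scenario, the ALLOWED_FILES allowlist, and the "data" subdirectory for scenario data files are all hypothetical and would need to match the real scenario layout. It copies only allowlisted inputs into a fresh temporary directory and refuses to continue if that directory ends up inside the repo.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical allowlist, based on the inputs named in the description.
ALLOWED_FILES = ("question.txt", "manifest.json")
# Assumption: scenario data files live in a "data" subdirectory.
DATA_DIR = "data"


def stage_scenario(scenario_dir: Path, repo_root: Path) -> Path:
    """Copy only allowed scenario inputs into a fresh workspace outside the repo."""
    workspace = Path(tempfile.mkdtemp(prefix="scenario_")).resolve()
    repo_root = repo_root.resolve()

    # Guard against a temp directory that is configured inside the repo.
    if workspace == repo_root or repo_root in workspace.parents:
        shutil.rmtree(workspace)
        raise RuntimeError(f"staged workspace {workspace} is inside the repo")

    # Copy allowlisted files; anything else (e.g. groundtruth.txt) is skipped.
    for name in ALLOWED_FILES:
        src = scenario_dir / name
        if src.is_file():
            shutil.copy2(src, workspace / name)

    # Copy the scenario data directory, if present.
    data_src = scenario_dir / DATA_DIR
    if data_src.is_dir():
        shutil.copytree(data_src, workspace / DATA_DIR)

    return workspace
```

The CLI would then be started with the returned path as its working directory (for example via the cwd argument of subprocess.run), and the workspace removed with shutil.rmtree once outputs are collected. Note that this only limits what relative paths can reach; an agent with bash access could still read the repo through an absolute path unless the run is also sandboxed.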