New --all-reruns-need-to-pass argument #311
This addresses your requirement to verify that non-deterministic tests (that work ~90% of the time) pass consistently by requiring all reruns to pass after an initial failure.

Signed-off-by: Manuel Saelices <msaelices@gmail.com>
Thank you for your PR. Unfortunately, I do not understand which problem it solves. Could you please explain your use case with an example? Additionally, the merge conflict and lint errors still need to be resolved.
@msaelices Are you still interested in this PR? Otherwise I'll close it as abandoned.
We are using it to run evals that are not deterministic, such as LLM-based ones. Scenario: Let's say you have 100 evals, each with a failure rate of, e.g., 5%, which is acceptable to you. Even with a seemingly "good" 5% per-eval failure rate, running 100 evals makes it very likely (a ~99.4% chance, since 1 - 0.95^100 ≈ 0.994) that at least one fails. How do you differentiate between a harmless flaky failure and an actual regression, where the failure rate jumps from something acceptable like 5% to, for example, 50%? We use For example, with
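To make the arithmetic in the scenario concrete, here is a minimal sketch. It assumes each eval run fails independently with a fixed probability; the helper names and the choice of 3 reruns are hypothetical and used only for illustration, not taken from the plugin. It also assumes the semantics described in the PR summary: after an initial failure, the test counts as passed only if every rerun passes.

```python
def p_at_least_one_failure(n_evals: int, p_fail: float) -> float:
    """Probability that at least one of n independent evals fails."""
    return 1.0 - (1.0 - p_fail) ** n_evals


def p_all_reruns_pass(n_reruns: int, p_fail: float) -> float:
    """Probability that every one of n independent reruns passes."""
    return (1.0 - p_fail) ** n_reruns


# 100 evals at a 5% per-eval failure rate: a failure somewhere is almost certain.
print(round(p_at_least_one_failure(100, 0.05), 4))  # 0.9941

# Hypothetical setting of 3 reruns after an initial failure:
# a harmless flaky eval (5% failure rate) usually clears all reruns,
# while a regression (50% failure rate) usually does not.
print(round(p_all_reruns_pass(3, 0.05), 4))  # 0.8574
print(round(p_all_reruns_pass(3, 0.50), 4))  # 0.125
```

Under these assumptions, requiring all reruns to pass separates the two cases fairly well: the flaky eval recovers about 86% of the time, the regressed one only about 12% of the time.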
…o-pass-argument # Conflicts: # src/pytest_rerunfailures.py # tests/test_pytest_rerunfailures.py