I am running fuzzy_match_students.py and want to set --threshold. What is the valid range, and which threshold should I use for strict vs flexible matching? If possible, please include a command example and links to the relevant code lines.
Answered by
toughdave
Mar 3, 2026
Answer selected by toughdave

Use --threshold as a similarity cutoff between 0.0 and 1.0.

Practical starting points
- 0.90 to 0.95: strict matching (fewer false positives)
- 0.84 to 0.89: balanced matching
- 0.75 to 0.83: flexible matching (review outputs carefully)

Example command
python3 scripts/python/reconciliation/fuzzy_match_students.py \
  --source data/sample/student_records_source.csv \
  --target data/sample/student_records_target.csv \
  --output reports/fuzzy_matches.csv \
  --summary reports/fuzzy_matches_summary.json \
  --threshold 0.86

Relevant code references
- main(): fuzzy_match_students.py#L163-L164
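
To make the effect of the cutoff concrete, here is a minimal sketch of how a 0.0 to 1.0 threshold can gate candidate pairs. It is not the code in fuzzy_match_students.py: the scorer (Python's standard-library difflib.SequenceMatcher), the inclusive >= comparison, and the helper names validate_threshold, similarity, and is_match are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def validate_threshold(value: float) -> float:
    """Return value unchanged if it lies in [0.0, 1.0], else raise ValueError."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be between 0.0 and 1.0, got {value}")
    return value


def similarity(a: str, b: str) -> float:
    """Score two strings in [0.0, 1.0].

    Assumption: SequenceMatcher is a stand-in scorer; the real script
    may compute similarity differently.
    """
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_match(a: str, b: str, threshold: float) -> bool:
    """Accept the pair only when its score reaches the cutoff (assumed inclusive)."""
    return similarity(a, b) >= validate_threshold(threshold)


if __name__ == "__main__":
    # Hypothetical name pair, checked against one cutoff from each suggested band.
    pair = ("Jon Smith", "John Smith")
    for cutoff in (0.95, 0.86, 0.75):
        print(cutoff, is_match(*pair, cutoff))
```

Raising the cutoff can only shrink the set of accepted pairs, which is why the strict band trades missed matches for fewer false positives and the flexible band needs manual review.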