Hi,

I have trouble understanding this line in `compute_loss` in `reader.py`: https://github.com/facebookresearch/DPR/blob/a31212dc0a54dfa85d8bfa01e1669f149ac832b7/dpr/models/reader.py#L109

It keeps the maximum loss over all `M` passages. Why take the maximum rather than summing or averaging?

Best regards,
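
P.S. To make the question concrete, here is a minimal sketch of the three reductions I am comparing. It is plain Python on made-up numbers, not the DPR code: `reduce_over_passages` is a hypothetical helper of mine, and I assume the per-passage losses can be laid out as an `N x M` grid (one row per question, one column per passage).

```python
# Toy per-passage losses for N=2 questions with M=3 passages each.
# The values are invented for illustration; they do not come from DPR.
per_passage_loss = [
    [0.5, 2.0, 1.5],
    [3.0, 1.0, 2.0],
]


def reduce_over_passages(losses, mode):
    """Collapse the passage dimension (M) of an N x M grid of losses.

    Hypothetical helper, not part of DPR.
    """
    if mode == "max":
        return [max(row) for row in losses]
    if mode == "sum":
        return [sum(row) for row in losses]
    if mode == "mean":
        return [sum(row) / len(row) for row in losses]
    raise ValueError(f"unknown mode: {mode}")


max_loss = reduce_over_passages(per_passage_loss, "max")
sum_loss = reduce_over_passages(per_passage_loss, "sum")
mean_loss = reduce_over_passages(per_passage_loss, "mean")

print("max :", max_loss)
print("sum :", sum_loss)
print("mean:", mean_loss)
```

With `max`, only the single largest per-passage loss for each question survives the reduction, whereas `sum` and `mean` let every passage contribute. That difference is what I would like to understand.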