Fix bf16 rounding to IEEE 754 ties-to-even #5648
cyyever wants to merge 3 commits into pytorch:main
Conversation
Force-pushed 778ddfc to 5bb9e0e
@q10 has imported this pull request. If you are a Meta employee, you can view this in D101141846.
Force-pushed f83d6b6 to 24174a4
Could you provide a summary of the motivation for this change, and how it would impact perf and correctness (on both ARM and x86)? It's highly likely that a decision about the rounding technique was made internally a long time ago, with tradeoffs relevant to internal use cases in mind. Could you also provide microbenchmark reproducers with perf numbers?
@q10 We changed the rounding mode to ties-to-even because it is the IEEE 754 default behaviour. Note that the CUDA implementation, PyTorch, and NumPy all use ties-to-even, so FBGEMM should be consistent with these libraries to avoid silent numerical precision issues. For example, the following snippet confirms the rounding mode. A simple AVX2 benchmark generated by an LLM reported a slowdown, because the new rounding mode requires roughly twice as many instructions as the old mode.
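A minimal sketch of such a check, assuming PyTorch is installed (the helper names below are illustrative, not from the PR): it feeds float32 bit patterns that lie exactly halfway between two adjacent bf16 values through PyTorch's conversion and compares the result against the standard round-to-nearest-even bit trick.

```python
import struct

import torch


def f32_from_bits(bits: int) -> float:
    # Reinterpret a 32-bit pattern as an IEEE 754 float32 value.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def torch_bf16_bits(x: float) -> int:
    # Convert float32 -> bfloat16 with PyTorch and return the raw 16-bit pattern.
    t = torch.tensor(x, dtype=torch.float32).to(torch.bfloat16)
    return t.view(torch.int16).item() & 0xFFFF


def fp32_bits_to_bf16_rne(bits: int) -> int:
    # Reference round-to-nearest-even conversion via the usual bit trick:
    # add 0x7FFF plus the lowest retained mantissa bit, then truncate to 16 bits.
    return ((bits + 0x7FFF + ((bits >> 16) & 1)) >> 16) & 0xFFFF


# Each input lies exactly halfway between two adjacent bf16 values, so the
# result reveals the tie-breaking rule: ties-to-even keeps the even pattern.
for f32_pattern, expected in [
    (0x3F808000, 0x3F80),  # halfway between 1.0 (0x3F80) and 1.0078125 (0x3F81)
    (0x3F818000, 0x3F82),  # halfway between 0x3F81 and 0x3F82; rounds up to even
]:
    got = torch_bf16_bits(f32_from_bits(f32_pattern))
    assert got == expected == fp32_bits_to_bf16_rne(f32_pattern), hex(got)
    print(hex(f32_pattern), "->", hex(got))
```

The bias-and-truncate form also shows where the instruction overhead mentioned above comes from: plain truncation is a single shift, while round-to-nearest-even needs the extra bit extraction and add before the shift.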
Force-pushed b57f191 to 9d95dd5
No description provided.