Describe the bug
I came across this by chance. When the speaker repeats a word a few times in a row, the transcript repeats that word many more times than it was actually spoken.
Steps/Code to reproduce bug
Say the sentence 'Its way way way more than that'.
Model output would be - 'Its way way way way way way way way way way way way way way way way way way way way way way way way more than that'.
Another example: say 'Its me, me me me'.
Model output - 'Its me, me me me me me me me me me me me me me me me'.
I first observed this in my own streaming inference application, which uses a quantized ONNX version of the model from Hugging Face. However, I was also able to reproduce it in the official NVIDIA demo: https://build.nvidia.com/nvidia/nemotron-asr-streaming
Expected behavior
Say the sentence 'Its way way way more than that'.
Model output should be - 'Its way way way more than that'.
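To make the mismatch easier to check automatically, here is a minimal sketch (my own illustration, not part of the model or the demo) that compares the longest consecutive run of each word in the spoken sentence against the transcript. The helper names `word_runs` and `excess_repeats` are hypothetical, and the tokenization is a simple lowercase regex, which is an assumption and not how the model tokenizes.

```python
import re
from itertools import groupby


def word_runs(text):
    """Return (word, run_length) pairs for consecutive identical words."""
    # Lowercase and keep only letters/apostrophes, so "me," and "me" match.
    words = re.findall(r"[a-z']+", text.lower())
    return [(word, len(list(group))) for word, group in groupby(words)]


def excess_repeats(spoken, transcript):
    """Return {word: (spoken_max_run, transcript_max_run)} for every word
    whose longest run in the transcript exceeds its longest run as spoken."""

    def max_runs(text):
        runs = {}
        for word, length in word_runs(text):
            runs[word] = max(runs.get(word, 0), length)
        return runs

    ref = max_runs(spoken)
    hyp = max_runs(transcript)
    return {
        word: (ref.get(word, 0), length)
        for word, length in hyp.items()
        if length > ref.get(word, 0)
    }
```

Called with the spoken sentence and the model output from the first example above, this should flag only `way`, reporting a spoken run of 3 alongside the longer run found in the transcript. An empty result means no word was repeated more often than spoken.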
Environment overview (please complete the following information)
Browser on macOS