Is it possible to disable the word normalization? #1883 #1886
Replies: 3 comments 9 replies
-
AFAIK, whisper.cpp transcribes the text as it is, without any word normalization (as of February 21, 2024).
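To make the distinction concrete: a normalization pass of the kind asked about is a separate text-to-text step applied after decoding. Below is a toy sketch in Python. The rule table and the `toy_normalize` name are hypothetical, written only for illustration, and are not taken from whisper.cpp or openai/whisper.

```python
import re

# Toy rules for illustration only (hypothetical, not copied from any library).
TOY_RULES = {
    r"\bgonna\b": "going to",
    r"\bwanna\b": "want to",
    r"\blet's\b": "let us",
}


def toy_normalize(text: str) -> str:
    """Expand a few informal forms with regex substitutions."""
    for pattern, replacement in TOY_RULES.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    raw = "I'm gonna leave, let's go."
    print("raw:       ", raw)
    print("normalized:", toy_normalize(raw))
```

If no step like this runs on the output, then expanded forms such as "going to" in the transcript would presumably come from the model's own decoding, not from a switch that can be turned off.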
-
Just a quick observation...
-
Looks like suppression doesn't help much, but what about prompting? EDIT: Tested this idea a bit... yeah, it actually seems promising, but it only works with very short clips. I used these samples for testing (these are smaller sections cut out of the previous sample):
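A rough sketch of the prompting idea, in Python, that only builds the whisper.cpp command line. The `-m`, `-f` and `--prompt` options are from the whisper.cpp `main` example. The binary path, model path, audio filename, and prompt text are assumptions for illustration, and whether the prompt actually keeps "gonna"/"wanna" in the output depends on the model and the clip.

```python
import shlex

# Hypothetical prompt written in the informal style we hope the decoder imitates.
INFORMAL_PROMPT = "I'm gonna go now 'cause I wanna get there early. Let's go."


def build_whisper_cmd(audio_path, prompt,
                      binary="./main",                   # assumed binary location
                      model="models/ggml-base.en.bin"):  # assumed model file
    """Return the argv list for a whisper.cpp run with an initial prompt."""
    return [binary, "-m", model, "-f", audio_path, "--prompt", prompt]


if __name__ == "__main__":
    cmd = build_whisper_cmd("sample.wav", INFORMAL_PROMPT)
    # Print a shell-safe command line; run it with subprocess.run(cmd) if desired.
    print(shlex.join(cmd))
```

Passing the argv list straight to `subprocess.run` avoids shell-quoting problems with the apostrophes in the prompt.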
-
Hi.
I have a couple of audio clips where the speakers say "gonna", "wanna", "let's", but these get transcribed as "going to", "want to", "let us", etc.
Is it possible to disable this word normalization?
In the Python version, it seems to be possible here: https://github.com/openai/whisper/blob/ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab/whisper/normalizers/english.py#L465
Do we have something similar in whisper.cpp?
TIA for any help!