Differences in Transcription between OpenAIs Whisper and noScribe (Deutsch) #111

BoSott · 2024-12-12T17:32:19Z

Hey,
first of all - thank you for the great work! I appreciate it a lot!
I would like to add the large-v2 model for German transcriptions, which the models folder indicates is possible. Unfortunately, the instruction

download all files from here into this folder: https://huggingface.co/guillaumekln/faster-whisper-large-v2/tree/main

remains something of a mystery to me, as I can't find the corresponding folder in the noScribe folder tree. Does this refer to the ‘faster_whisper’ folder? And is it enough to just copy the files there, or what else do I need to do to get it running?

The reason I ask is that the transcripts made with the ‘precise’ model are significantly less accurate than those I make with the large-v2 model. Thanks again and have a nice day!
P.S.: I am wondering, though, whether you already use the large-v2 model, and if so, where this difference might be coming from. My 'usual' setup is just Whisper through Python. Both run on the same machine, no NVIDIA card.

kaixxx · 2024-12-12T21:03:18Z

Indeed, we are already using the large v2-model (in the "precise" setting).
What do you mean by "significantly less accurate"? Could you give an example?
In your python setup, do you also use faster-whisper or whisper in the original form from OpenAI? Could you share the code you use to launch whisper? There are a lot of parameters that might influence the result.
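To make the question about parameters concrete, here is a minimal sketch that passes the decoding options explicitly, assuming the original `openai-whisper` package. The values are what I believe to be that package's defaults; they are not noScribe's settings, and the names `WHISPER_OPTIONS` and `transcribe_with_explicit_options` are made up for illustration. Other front ends such as faster-whisper may default to different values (for example a beam search instead of greedy decoding), which alone can change the output.

```python
# Decoding options spelled out so that two setups can be compared on equal terms.
# Assumption: these mirror the openai-whisper defaults; they are NOT noScribe's settings.
WHISPER_OPTIONS = {
    "language": "de",
    "temperature": (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),  # fallback temperatures
    "condition_on_previous_text": True,   # feed previous output back as context
    "no_speech_threshold": 0.6,           # segments judged as silence are skipped
    "compression_ratio_threshold": 2.4,   # guards against repetitive output
    "logprob_threshold": -1.0,            # guards against low-confidence output
    "initial_prompt": None,               # optional text to steer the style
}


def transcribe_with_explicit_options(audio_path: str) -> str:
    """Transcribe one file with every option stated explicitly (hypothetical helper)."""
    import whisper  # imported lazily; requires `pip install openai-whisper`

    model = whisper.load_model("large-v2")
    result = model.transcribe(audio_path, **WHISPER_OPTIONS)
    return result["text"]
```

Writing the options down like this makes it easier to see which of them differ between two pipelines before comparing transcripts.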

BoSott · 2024-12-18T15:09:57Z

Hey, thanks for the quick response!
Unfortunately, I only have a few example excerpts and haven't found a pattern so far.
This is the code I use (shortened):

    import whisper

    # filepath: path to the audio file (defined elsewhere in the full script)
    model = whisper.load_model("large-v2")

    result = model.transcribe(str(filepath), language="german")
    print("Result:")
    print(result["text"])

Whisper output:
"""
Ja, das Aufnahmegerät läuft und dann würde ich es einfach mal einfach reinhalten.

Dann stelle ich euch vier Schülerinnen mal die erste Frage. Welche Freiheiten sind dir im Moment am wichtigsten und warum?

Also mir ist halt wichtig, dass ich überall hin und her gut kommen kann, damit ich irgendwie zum Beispiel, es gibt ja auf der Autobahn Stau manchmal und dass ich da halt dann irgendwo anders lang fahren kann stattdessen als auf der Autobahn.

Und ihr Anderen? Welche Freiheiten sind euch am wichtigsten im Moment?

Mit Freunden, sich verabreden, draußen was machen, vielleicht auch zocken oder so. Und dass man halt immer, wenn man irgendwie Probleme hat, dass man immer einen Ansprechpartner hat. Das finde ich auch sehr wichtig.

Freie Meinung zu haben, um einen Politiker zum Beispiel zu wählen. Was ist mit dir, Alessa?

"""

noScribe output:
"""
S08 [00:00:00]: Ja, das Aufnahmegerät läuft und dann würde ich es einfach mal einfach rein**(halten).**

S06 [00:00:05]: Ja, dann stelle ich euch vier Schülern, Schülerinnen mal die erste Frage. Welche Freiheiten sind dir im Moment am wichtigsten und warum? (........)

S00 [00:00:21]: (missing - this part is somewhat quieter on the audio track than what follows) Damit ich irgendwie zum Beispiel, es gibt ja auf der Autobahn Stau manchmal und dass ich da halt dann irgendwo anders langfahren kann stattdessen als auf der Autobahn. (....)

S06 [00:00:37]: Und ihr andere? Welche Freiheiten sind euch am wichtigsten im Moment?

S05 [00:00:44]: Äh, mit Freunden, äh, sich verabreden, draußen was machen, vielleicht auch zocken oder so und dass man, dass man halt immer ein, wenn man irgendwie Probleme hat, dass man immer einen Ansprechpartner hat. Das finde ich auch sehr wichtig.

S03 [00:01:01]: Äh, Freimeinung zu haben, um einen Politiker zum Beispiel zu wählen. Was ist mit dir, Alicja?
"""

Looking at these, though, it might also be that these are just differences between runs and that this is completely normal. If that is the case, I apologize for not testing enough beforehand, and thanks for your time and energy! :)

kaixxx · 2024-12-18T17:10:56Z

I think I can reply in German ;)

Thanks for the example. Several things stand out:

The quiet passage that noScribe omitted entirely: I use voice activity detection (VAD) to exclude passages without speech. This filter may be set a bit too aggressively, but it is always a trade-off. If too many passages without speech slip through, there is more hallucination...
Does it happen often that quiet passages are missing?
Differences in recognition - which version is actually correct compared with the original recording?
- "Dann stelle ich euch vier Schülerinnen mal die erste Frage" (Whisper) vs. "Ja, dann stelle ich euch vier Schülern, Schülerinnen mal die erste Frage" (noScribe)
- "Und dass man halt immer, wenn man irgendwie Probleme hat, dass man immer einen Ansprechpartner hat." (Whisper) vs. "und dass man, dass man halt immer ein, wenn man irgendwie Probleme hat, dass man immer einen Ansprechpartner hat." (noScribe)
- "Alessa" (Whisper) vs. "Alicja" (noScribe) (Proper names are always difficult. Still, I would be interested to know which one is right...)
"Freimeinung", haha, that one is of course simply wrong.
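Pairs like the ones above can also be located automatically. The following is a minimal sketch using Python's standard `difflib`; the helper name `word_diff` is made up for illustration. Note that it only shows where two transcripts disagree. Which side is correct still has to be checked against the recording.

```python
import difflib


def word_diff(first: str, second: str):
    """Return (tag, words_in_first, words_in_second) for every word-level difference."""
    first_words = first.split()
    second_words = second.split()
    matcher = difflib.SequenceMatcher(a=first_words, b=second_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only inserts, deletes and replacements
            changes.append(
                (tag, " ".join(first_words[i1:i2]), " ".join(second_words[j1:j2]))
            )
    return changes


# Example sentences taken from the two transcripts in this thread.
whisper_text = "Dann stelle ich euch vier Schülerinnen mal die erste Frage."
noscribe_text = "Ja, dann stelle ich euch vier Schülern, Schülerinnen mal die erste Frage."

for tag, left, right in word_diff(whisper_text, noscribe_text):
    print(f"{tag:8} whisper={left!r} noscribe={right!r}")
```

The comparison is case- and punctuation-sensitive on purpose, so that differences such as "Dann" vs. "dann" remain visible.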

Overall, the OpenAI result reads as somewhat more smoothed linguistically, but possibly not quite as faithful to the original recording. I do in fact try to get noScribe to reproduce even grammatically incorrect sentences exactly, as well as filler sounds such as "ähm", etc. If you don't want that, there is a simple trick: set the language to "auto", and these instructions no longer apply (they work with examples and are therefore language-specific).
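To illustrate the voice activity detection trade-off described further up (quiet speech gets dropped vs. non-speech slips through), here is a toy energy-threshold sketch. It is not the detector noScribe uses; the function name, frame size, sample values, and thresholds are all invented for illustration.

```python
import math


def speech_frames(samples, frame_size=4, threshold=0.1):
    """Return the indices of frames whose RMS energy reaches the threshold."""
    kept = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            kept.append(start // frame_size)
    return kept


loud = [0.5, -0.5, 0.5, -0.5]        # normal speech
quiet = [0.05, -0.05, 0.05, -0.05]   # quiet speech
noise = [0.02, -0.02, 0.02, -0.02]   # background noise, no speech
audio = loud + quiet + noise + loud  # frames 0, 1, 2, 3

print(speech_frames(audio, threshold=0.1))   # -> [0, 3]        quiet speech is lost
print(speech_frames(audio, threshold=0.03))  # -> [0, 1, 3]     the intended result
print(speech_frames(audio, threshold=0.01))  # -> [0, 1, 2, 3]  noise slips through
```

A stricter threshold loses the quiet speaker, while a more lenient one passes noise on to the model, where it can lead to hallucinated text.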

kaixxx changed the title ~~adding large-v2 model instructions unclear~~ Differences in Transcription between OpenAIs Whisper and noScribe (Deutsch) Dec 18, 2024
