Finetune Whisper model on LibriSpeech #1571
A comparison of decoding the Whisper model with different fbank features. In general, using the uncompressed features is slightly better than using the compressed features. The performance difference is minor, except for large-v2. The WERs are obtained with greedy search.
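For reference, a minimal sketch of how the two feature-storage options can be produced with lhotse; the manifest path, output paths, and extractor settings below are placeholders rather than the recipe's exact values:

```python
from lhotse import CutSet, Fbank
from lhotse.features.io import LilcomChunkyWriter, NumpyHdf5Writer

# Placeholder manifest path; the recipe's data preparation produces its own cuts.
cuts = CutSet.from_file("data/manifests/librispeech_cuts_train-clean-100.jsonl.gz")
extractor = Fbank()  # 80-dim log-mel fbank features

# Lilcom-compressed storage: smaller on disk, slightly lossy.
cuts_lilcom = cuts.compute_and_store_features(
    extractor=extractor,
    storage_path="data/fbank/train_lilcom",
    storage_type=LilcomChunkyWriter,
    num_jobs=8,
)

# Uncompressed storage: exact float values, larger on disk.
cuts_raw = cuts.compute_and_store_features(
    extractor=extractor,
    storage_path="data/fbank/train_raw",
    storage_type=NumpyHdf5Writer,
    num_jobs=8,
)
```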
Effect of freezing different modules
Num epochs = 10, with Lilcom-compressed features. Fine-tuned small.en with the Adam optimizer, lr=1e-5.
Without fine-tuning: 4.83/11.06 (greedy)
Fine-tuned medium with the Adam optimizer, lr=1e-5. Num epochs = 10, with Lilcom-compressed features.
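For context, freezing a module before fine-tuning amounts to disabling gradients for its parameters. A minimal sketch using the openai-whisper package (the recipe's training script may select modules differently):

```python
import whisper

model = whisper.load_model("small.en")

# Freeze the encoder so only the decoder is updated during fine-tuning.
for param in model.encoder.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```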
Effect of different learning rates:
Model: small.en (without fine-tuning: 4.83/11.06)
Model: medium (without fine-tuning: 4.02/7.53)
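The optimizer setup the comparisons above refer to is plain Adam with a fixed learning rate; the learning-rate rows only change the `lr` value. A minimal sketch (model name is a placeholder):

```python
import torch
import whisper

model = whisper.load_model("medium")

# lr=1e-5 is the setting used in the tables above; other rows vary only this value.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-5,
)
```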
This recipe fine-tunes a Whisper model on LibriSpeech, following #1466.