When running dpo_finetuning_example.ipynb training loss is zero starting from the second step #61
Comments
I found that the reason for this was using the model in fp16 precision. For some reason it introduced NaN values in the gradient computations, which manifested as zero loss values. When I changed it to
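As a side note on why fp16 in particular is fragile here: its dynamic range tops out around 65504, so intermediate values (e.g. summed log-probabilities or exponentials inside the loss) can overflow to `inf`, and subsequent arithmetic then yields `NaN`. A minimal sketch, using NumPy purely as an illustration (this is not code from the notebook):

```python
import numpy as np

# fp16 overflows above ~65504, so a modestly large value becomes inf.
with np.errstate(over="ignore"):
    big = np.float16(70000.0)
print(big)  # inf

# Once an inf appears, common reductions produce NaN,
# e.g. inf - inf (as can happen in log-prob differences).
diff = big - np.float16(np.inf)
print(diff)  # nan

# The same value is perfectly representable in fp32.
ok = np.float32(70000.0)
print(ok)  # 70000.0
```

This is why switching the model to fp32 (or bf16, which keeps fp32's exponent range) typically makes such NaN-driven zero losses disappear.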
I'm going to plot the losses next to see if it actually learns anything.
Thanks for sharing this. Here are a few points, try them out and feel free to share plots again:

Also, if you open a [SUBMISSION] PR with your changes including the plots, we can help you out with a review. 🙂
That's interesting, why isn't this mentioned in dpo.md? IMHO the ability to distinguish a successful fine-tuning run from a failed one is about the most fundamental thing we should learn from the course. Meanwhile, the only info I found in dpo.md is:

So in my case I carefully monitored the loss divergence ;)
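One way to make that "distinguish success from failure" check concrete: a fresh DPO run should start with a loss near log(2) ≈ 0.693 (the policy and reference model initially agree, so the sigmoid argument is ~0) and decrease gradually; exact zeros from the second step onward are a red flag. A small hypothetical helper (`check_dpo_losses` is an assumed name, not part of the course code) sketching that heuristic:

```python
import math

def check_dpo_losses(losses, tol=1e-8):
    """Heuristic health check for a DPO loss curve (hypothetical helper).

    A healthy run starts near log(2) ~= 0.693 and declines gradually.
    Exact zeros right after the first step usually indicate NaN gradients
    or saturated logits rather than successful learning.
    """
    if not losses:
        return "no data"
    if abs(losses[0] - math.log(2)) > 0.3:
        return "suspicious start (expected ~0.693)"
    if any(l < tol for l in losses[1:]):
        return "degenerate: loss collapsed to zero"
    return "looks plausible"

# A run matching this issue: zero loss from the second step onward.
print(check_dpo_losses([0.6931, 0.0, 0.0, 0.0]))   # degenerate: loss collapsed to zero
# A gradually decreasing run.
print(check_dpo_losses([0.6931, 0.65, 0.61, 0.58]))  # looks plausible
```

The exact thresholds are arbitrary; the point is that a zero DPO loss is a failure signature, not a sign of fast convergence.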
Thanks for the continued feedback @fairydreaming. We're still ironing out the creases in this module. I'll get to work on a PR to make it clearer.
When performing DPO fine-tuning with this notebook: dpo_finetuning_example.ipynb, the loss value is zero starting from the second step:
I don't think this is the expected behavior.
I tried the notebook both locally and in Colab and it happens in both environments.
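For context on why an exact zero is implausible: the standard per-example DPO objective (as commonly written in the literature; not quoted from the notebook itself) is

```latex
\mathcal{L}_{\mathrm{DPO}}
  = -\log \sigma\!\left(
      \beta \left[
        \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right]
    \right)
```

Since the sigmoid never actually reaches 1 for finite arguments, a loss of exactly 0 means the argument has overflowed or saturated numerically, which is consistent with the fp16/NaN explanation above rather than with genuine learning.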