Comments in KTO Trainer forward()
#17
Comments
You're correct! The comment is from when I was trying to debug the code during development and is outdated. Feel free to open a PR and I'll merge it in. Thanks!
Hi there,
I'm reading through the `forward()` function in the KTO Trainer, and the comment by the function signature states that, if everything is read in correctly, the sizes of the chosen and rejected logps should each be `batch_size/2`. This doesn't make sense to me: it sounds like a constraint of paired preference training rather than of KTO's unpaired training method. Here's the comment from lines 875-877 of `trainers.py`:
Please let me know if this makes sense. I'm happy to open a PR.
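To illustrate the concern, here is a minimal sketch of why an even split is not guaranteed with unpaired data. It is not taken from `trainers.py`: the function `split_by_status`, the `"chosen"`/`"rejected"` labels, and the log-prob values are all hypothetical, and it assumes each example in a batch carries its own independent label.

```python
# Hypothetical illustration only; names and values are not from trainers.py.
def split_by_status(logps, statuses):
    """Split per-example log-probs into chosen and rejected lists by label."""
    chosen = [lp for lp, s in zip(logps, statuses) if s == "chosen"]
    rejected = [lp for lp, s in zip(logps, statuses) if s == "rejected"]
    return chosen, rejected

# An unpaired batch of 6 examples: labels are independent of one another,
# so nothing forces exactly half of them to be "chosen".
batch_logps = [-1.2, -0.7, -2.3, -0.9, -1.5, -3.1]
batch_status = ["chosen", "chosen", "rejected", "chosen", "chosen", "chosen"]

chosen_logps, rejected_logps = split_by_status(batch_logps, batch_status)
print(len(chosen_logps), len(rejected_logps))  # 5 1
```

With paired data every prompt contributes one chosen and one rejected completion, so both lists would have length `batch_size/2`. With unpaired data the only guarantee is that the two lengths sum to the batch size.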