
[Suggestion] Instruction Fine-Tuning - SFT Module #60

Open
ArlindKadra opened this issue Dec 6, 2024 · 4 comments

Comments

@ArlindKadra

Thanks for taking the time to develop this interesting course. Regarding the SFT module in Chapter 1, I wanted to suggest that the bigcode/the-stack-smol dataset seems to break the flow a bit, since it is not an instruction-tuning dataset but rather a domain-specific dataset with no instruction following. As such, it does not have question/answer pairs.

Because of that, if you train on the dataset, the response to the prompt stays the same as before. Maybe switch it to the openai/gsm8k dataset, or something similar? That way one would still have to prepare the dataset before feeding it to the SFTTrainer.
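For illustration, here is a minimal sketch (mine, not part of the issue or the course) of how openai/gsm8k could be reshaped into chat-format messages before handing it to TRL's SFTTrainer. The "question" and "answer" fields are the actual gsm8k columns; the mapping itself is just one possible way to prepare the data, assuming a TRL version that applies the tokenizer's chat template to a "messages" column.

```python
from datasets import load_dataset

# gsm8k ships with "question" and "answer" columns; turn each row into a
# two-turn conversation so the chat template can be applied to it later.
ds = load_dataset("openai/gsm8k", "main", split="train")

def to_messages(example):
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

ds = ds.map(to_messages, remove_columns=ds.column_names)
# ds now has a single "messages" column and can be handed to SFTTrainer.
```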

@burtenshaw
Collaborator

burtenshaw commented Dec 6, 2024

Thanks. That's a nice suggestion. Are you interested in opening a PR?

@ArlindKadra
Author

I can give it a try; however, I won't be able to do it in a timely manner. I will be available starting from the 18th.

@burtenshaw
Collaborator

No worries @ArlindKadra . Come back when you're ready and check if it still needs doing.

Thanks for the issue.

@asvskartheek

asvskartheek commented Dec 9, 2024

I wanted to add a few of my queries here instead of creating a new issue (I hope that is ok).

  1. What is the task the model is being trained on? My guess is causal LM with cross-entropy loss.
  2. If it is causal LM, are we in any way also ignoring special tokens and phrases like <|im_start|>?
     2b. How are we forcing the end of generation after just the assistant turn?
  3. The everyday conversations dataset has several columns; where are we specifying to train on the "chatml"-templated messages column of the dataset?

I tried to find answers to these queries in the SFT material but couldn't. Maybe adding a bit more detail to "The Finetuning Process" sub-section of the Supervised Fine-Tuning page would be helpful.
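As a possible way to check the last two points, here is a rough sketch (mine, not from the course) that renders the "messages" column of the everyday-conversations split with the model's chat template. The model and dataset identifiers match the ones used in the chapter; the expectation that <|im_end|> acts as the EOS token is my reading of the SmolLM2 instruct tokenizer, so treat it as an assumption.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

# The chat template wraps each turn in <|im_start|>role ... <|im_end|>, so the
# special tokens are part of the rendered training text rather than ignored.
text = tokenizer.apply_chat_template(ds[0]["messages"], tokenize=False)
print(text)

# If the instruct tokenizer sets eos_token to <|im_end|> (my assumption), that is
# what lets generation stop right after the assistant turn once it is learned.
print(tokenizer.eos_token)
```

If that reading is right, it would also cover the column question: as far as I understand, recent TRL versions pick up the "messages" column automatically when the dataset is in this conversational format.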
