This module will guide you through instruction tuning language models. Instruction tuning adapts pre-trained models to specific tasks by further training them on task-specific datasets, improving their performance on those targeted tasks.
In this module, we will explore two topics: 1) Chat Templates and 2) Supervised Fine-Tuning.
Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages. For more detailed information, refer to the Chat Templates section.
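To make the format concrete, here is a minimal sketch of the ChatML layout that such templates produce. The `to_chatml` helper is illustrative only; in practice you would call `tokenizer.apply_chat_template` and let the tokenizer's built-in template handle the formatting.

```python
# Illustrative sketch of the ChatML format: each turn is wrapped in
# <|im_start|> and <|im_end|> markers together with its role.
# In practice, tokenizer.apply_chat_template does this for you.

def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML string."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is instruction tuning?"},
]
print(to_chatml(messages))
```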
Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see the Supervised Fine-Tuning page.
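One detail worth seeing up front is how the supervised loss is restricted to the labeled completion. A common convention (used by PyTorch's cross-entropy loss) is to mask prompt positions in the labels with the ignore index `-100`, so the model is only trained to predict the response tokens. The sketch below is illustrative; `build_labels` and the token IDs are hypothetical, not part of any library API.

```python
# Sketch of label masking in SFT: loss is computed only on response tokens,
# while prompt positions are masked with -100 (PyTorch's default ignore
# index for cross-entropy). Token IDs below are made up for illustration.

IGNORE_INDEX = -100

def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask prompt positions in labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

prompt_ids = [101, 2054, 2003]    # tokenized prompt (hypothetical IDs)
response_ids = [2019, 3437, 102]  # tokenized response (hypothetical IDs)

input_ids, labels = build_labels(prompt_ids, response_ids)
print(input_ids)  # [101, 2054, 2003, 2019, 3437, 102]
print(labels)     # [-100, -100, -100, 2019, 3437, 102]
```

Trainers such as TRL's `SFTTrainer` handle this bookkeeping internally; the point here is only to show what "labeled examples" means at the token level.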
| Title | Description | Exercise | Link | Colab |
|-------|-------------|----------|------|-------|
| Chat Templates | Learn how to use chat templates with SmolLM2 and process datasets into chatml format | 🐢 Convert the `HuggingFaceTB/smoltalk` dataset into chatml format <br> 🐕 Convert the `openai/gsm8k` dataset into chatml format | Notebook | |
| Supervised Fine-Tuning | Learn how to fine-tune SmolLM2 using the SFTTrainer | 🐢 Use the `HuggingFaceTB/smoltalk` dataset <br> 🐕 Try out the `bigcode/the-stack-smol` dataset <br> 🦁 Select a dataset for a real-world use case | Notebook | |
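For the conversion exercises above, the core step is mapping each raw example to a list of role-tagged messages. The sketch below assumes a question/answer schema (as in `openai/gsm8k`, whose examples have `question` and `answer` fields); with the `datasets` library you would apply it via `dataset.map(to_messages)`.

```python
# Sketch of the chatml conversion exercise: wrap a Q/A pair in the
# role-tagged message schema that chat templates expect. The field
# names assume a gsm8k-style {"question", "answer"} example.

def to_messages(example):
    """Map a question/answer example to a {'messages': [...]} record."""
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

example = {"question": "What is 2 + 2?", "answer": "2 + 2 = 4."}
print(to_messages(example)["messages"])
```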
- Transformers documentation on chat templates
- Script for Supervised Fine-Tuning in TRL
- `SFTTrainer` in TRL
- Direct Preference Optimization Paper
- Supervised Fine-Tuning with TRL
- How to fine-tune Google Gemma with ChatML and Hugging Face TRL
- Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format