Phi3-mini model fine-tuned on the python_code_instructions_18k_alpaca code instructions dataset using LoRA or QLoRA with the PEFT and bitsandbytes libraries.
Notebook: phi3-finetune-lora-pycoder.ipynb
Notebook: phi3-finetune-qlora-pycoder.ipynb
Our goal is to fine-tune the pretrained Phi3-mini model, an LLM with 3.8B parameters, using the PEFT method with either LoRA or 4-bit quantized QLoRA to produce a Python coder, and then to evaluate the performance of both fine-tuned models. We fine-tune the model on an NVIDIA A100 GPU for better performance. Alternatively, you can run the fine-tuning on, for example, a T4 in Google Colab by adjusting some parameters (such as the batch size) to reduce memory consumption.
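As a rough sketch of this setup (not the exact configuration used in the notebooks), the snippet below combines a bitsandbytes 4-bit quantization config (the QLoRA case) with a PEFT LoRA adapter on the base model; the hyperparameter values and the target module names are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization (QLoRA); omit quantization_config below for plain LoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
# Cast norms/embeddings and enable gradient checkpointing for k-bit training
base_model = prepare_model_for_kbit_training(base_model)

# LoRA adapter; rank, alpha, dropout and target modules are illustrative values
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],  # assumed attention projection names for Phi-3
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()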
For the fine-tuning process, we use this dataset, which contains about 18,000 examples in which the model is asked to write Python code that solves a given task. It is a subset of this other original dataset, from which only the Python-language examples were selected. Each row contains a description of the task to be solved, an example of input data for the task (if applicable), and the generated code fragment that solves the task.
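For illustration only, the snippet below sketches how a dataset with this structure can be loaded and mapped into training prompts; the Hugging Face dataset id and the column names (instruction, input, output) are assumptions based on the description above, not taken from the notebooks.

from datasets import load_dataset

# Assumed dataset id for the 18k Python code instructions subset
dataset = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split="train")

def to_prompt(example):
    # Assumed columns: instruction, input, output
    task = example["instruction"]
    if example["input"]:
        task += "\n Input: " + example["input"]
    # Simple instruction/response layout, used only for illustration
    return {"text": f"### Task:\n{task}\n\n### Solution:\n{example['output']}"}

dataset = dataset.map(to_prompt)
print(dataset[0]["text"])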
Phi-3-Mini-4k-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open LLM trained on the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family; the Mini version (3.8B parameters) comes in two variants of supported context length (in tokens): 4K and 128K.
The model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for instruction following and safety. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3-Mini-4k-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
model_id = "alexrodpas/phi3-mini-4k-qlora-pycode-18k"
device_map = "cuda"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype="auto", device_map=device_map)
# Example instruction in the style of the dataset
user_prompt = "Create a function to calculate the sum of a sequence of integers.\n Input: [1, 2, 3, 4, 5]"
# Create the pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Prepare the prompt or input to the model
prompt = pipe.tokenizer.apply_chat_template([{"role": "user", "content": user_prompt}], tokenize=False, add_generation_prompt=True)
# Run the pipe to get the answer
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, num_beams=1, temperature=0.3, top_k=50, top_p=0.95,
               max_time=180)
print(outputs[0]['generated_text'][len(prompt):].strip())