Kortix Fast Apply models are designed for instant code application, producing full file edits to power SoftGen AI.
They achieve high throughput when deployed on fast providers like Fireworks while maintaining high edit accuracy:
- ~340 tok/s for the 1.5B model
- ~150 tok/s for the 7B model
The models and dataset are available on HuggingFace.
The inference prompt structure:
    <|im_start|>system
    You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
    <|im_start|>user
    Merge all changes from the <update> snippet into the <code> below.
    - Preserve the code's structure, order, comments, and indentation exactly.
    - Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
    - Do not include any additional text, explanations, placeholders, ellipses, or code fences.
    <code>{original_code}</code>
    <update>{update_snippet}</update>
    Provide the complete updated code.<|im_end|>
    <|im_start|>assistant
Model output:

    <updated-code>[full, complete updated file]</updated-code>
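To show how the template is used in practice, here is a minimal sketch (our illustration, not code from the repository) that builds the chat messages and extracts the merged file from the model's reply; `build_messages` and `extract_updated_code` are hypothetical helper names.

```python
import re

SYSTEM_PROMPT = (
    "You are a coding assistant that helps merge code updates, "
    "ensuring every modification is fully integrated."
)

USER_PROMPT = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.
<code>{original_code}</code>
<update>{update_snippet}</update>
Provide the complete updated code."""


def build_messages(original_code: str, update_snippet: str) -> list[dict]:
    """Assemble the chat messages in the format shown above."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT.format(
            original_code=original_code, update_snippet=update_snippet)},
    ]


def extract_updated_code(model_output: str) -> str:
    """Pull the full updated file out of the <updated-code> tags."""
    match = re.search(r"<updated-code>(.*?)</updated-code>", model_output, re.DOTALL)
    if match is None:
        raise ValueError("No <updated-code> block found in model output")
    return match.group(1).strip()
```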
We chose smaller models (7B and 1.5B) for their fast inference speed, which suits instant-apply tasks. These models work well with AI-powered code editors like Aider or PearAI, or with local tools, to reduce the cost of frontier-model output.
We generate high-quality synthetic data using open-source NextJS-like projects as original-code, then use Claude Sonnet 3.5 (70%) and GPT-4 (30%) to generate update-snippet and final-updated-code.
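To make the three pieces concrete, one synthetic example pairs a source file with a generated edit and its merged result. The record below is purely illustrative; the published dataset defines the exact column names.

```python
# Illustrative only: a single synthetic training example. Column names in the
# published Kortix/FastApply-dataset-v1.0 may differ from these.
example = {
    "original_code": "export function greet(name: string) {\n  return `Hello, ${name}`;\n}\n",
    "update_snippet": "export function greet(name: string, greeting = 'Hello') {\n  return `${greeting}, ${name}`;\n}\n",
    "final_updated_code": "export function greet(name: string, greeting = 'Hello') {\n  return `${greeting}, ${name}`;\n}\n",
}
```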
- Clone Open-Source Repositories

      git clone --depth 1 https://github.com/your/repo.git data/repo
- Convert Repository Data to Dataset

  Use repo_to_dataset.py to transform the repository data into a structured dataset while filtering out unsuitable files (logs, caches, ...):

      python data_generation/repo_to_dataset.py /path/to/your/repo \
          --sample-lt-100 0 \
          --sample-100-399 500 \
          --sample-400-999 1000 \
          --sample-1000-1999 3000 \
          --sample-2000-2999 1000 \
          --sample-3000-3999 0 \
          ...
          --sample-10000-plus 0 \
          --output output.parquet \
          --debug
  Parameters:
  - --sample-lt-100: Number of samples with fewer than 100 tokens.
  - --sample-...: the remaining --sample-* flags set the sample count for each token-length range in the same way; see the sketch below.
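  The bucket quotas can be pictured as a simple sampling loop. This is a rough, hypothetical sketch of the idea behind the --sample-* flags, not the actual logic of repo_to_dataset.py (which also handles parsing, filtering, and parquet output):

  ```python
  import random

  # Hypothetical sketch: sample files per token-count bucket, mirroring the
  # --sample-* flags shown above. Quotas here are the example values.
  BUCKETS = {
      (0, 99): 0,          # --sample-lt-100
      (100, 399): 500,     # --sample-100-399
      (400, 999): 1000,    # --sample-400-999
      (1000, 1999): 3000,  # --sample-1000-1999
  }

  def sample_by_token_count(files: list[tuple[str, int]]) -> list[str]:
      """files: (path, token_count) pairs; returns the sampled file paths."""
      selected = []
      for (low, high), quota in BUCKETS.items():
          in_bucket = [path for path, tokens in files if low <= tokens <= high]
          selected += random.sample(in_bucket, min(quota, len(in_bucket)))
      return selected
  ```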
- Generate Synthetic Data

  We recommend Anthropic Claude for data generation because of its high-quality output. OpenAI's Batch API is a cost-effective and faster alternative to Claude's Batch API. You can also use the Deepseek v2.5 API (beta) to address any bugs or issues you encounter in the generated dataset.

      python data_generation/anthropic/generate.py --parquet_file data/train/my_data.parquet
Example Workflow:
    # Prepare batch data
    python data_generation/openai/prepare_batch_data.py -i data/train/train_dataset.parquet -o data/train/batch/

    # Send batch requests
    python data_generation/openai/send_batch_request.py -bd data/train/batch/ -c 5

    # Process batch files
    python data_generation/openai/batch_processor.py -i data/train/batch/ -o data/train/train_dataset.parquet
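For context, each line that the batch preparation step produces for the OpenAI Batch API is one self-contained chat-completions request in JSONL. The sketch below illustrates that format only; the example field names, the model name, and the generation prompt are placeholders, not the script's exact output.

```python
import json

def write_batch_file(examples: list[dict], path: str, model: str = "gpt-4o") -> None:
    """Write one OpenAI Batch API request per line (JSONL)."""
    with open(path, "w") as f:
        for i, ex in enumerate(examples):
            request = {
                "custom_id": f"fastapply-{i}",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,
                    "messages": [
                        {"role": "user", "content": (
                            "Given this file, produce an <update> snippet and the "
                            "fully merged file.\n\n" + ex["original_code"])},
                    ],
                    "max_tokens": 8192,
                },
            }
            f.write(json.dumps(request) + "\n")
```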
Fine-tuning enhances the pre-trained Qwen2.5 Coder models to better suit our specific task. We leverage unsloth
to accelerate this process while minimizing VRAM usage.
- Fine-tuning Notebooks: Available in the notebooks directory.
- Dataset:
- Source: https://huggingface.co/datasets/Kortix/FastApply-dataset-v1.0
- Size: Approximately 5,600 examples
- Composition: 80% TypeScript/TSX, 15% Python, 5% Other
- Model Versions:
- Using QLoRA with 4-bit quantization
- 7B model: https://huggingface.co/unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit
- 1.5B model: https://huggingface.co/unsloth/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit
- Hyperparameters:
- 1.5B model: rank (r) = 32, alpha = 16
- 7B model: rank (r) = 16, alpha = 16
- Training epochs: 1
This fine-tuning process optimizes the models for our specific code editing tasks while maintaining efficiency in computation and memory usage.
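For orientation, the setup above maps onto Unsloth's public API roughly as follows. This is a sketch with assumed values (max_seq_length, target_modules), not the exact notebook configuration:

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model (QLoRA) and attach LoRA adapters.
# Values mirror the 7B configuration listed above; the 1.5B run uses r=32.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit",
    max_seq_length=8192,      # assumption: choose a length that fits your files
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                     # LoRA rank for the 7B model
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,
    use_gradient_checkpointing="unsloth",
)
# Training itself runs for one epoch; see the notebooks directory for the
# full trainer configuration.
```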
Inference script: tests_evaluate/fireworks/test_fireworks.py
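If you would rather not run the script, a deployed model can also be queried through any OpenAI-compatible client. In this minimal sketch the base URL is Fireworks' OpenAI-compatible endpoint, while the API key and model ID are placeholders you must replace; build_messages and extract_updated_code refer to the helper sketch shown earlier.

```python
from openai import OpenAI

# Placeholders: substitute your Fireworks API key and deployed model ID.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",
)

original_code = open("src/page.tsx").read()
update_snippet = "..."  # the edit produced by your frontier model

response = client.chat.completions.create(
    model="accounts/your-account/models/your-fastapply-deployment",
    messages=build_messages(original_code, update_snippet),
    temperature=0,
    max_tokens=8192,
)
updated_file = extract_updated_code(response.choices[0].message.content)
```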
Evaluating code transformations isn't trivial due to several factors:
- Insert Flexibility: Models can insert code in different locations since imports and functions are independent.
- Function Ordering: While not ideal, models may change function placement while maintaining correct and bug-free code.
Due to these challenges, simple file comparison isn't always sufficient. Alternative approaches such as line-by-line comparison after sorting can be used (see the sketch below), though they have their own limitations. The most reliable approach is to use a large model, such as Deepseek, to judge the output.
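As an illustration of the sorted line-by-line idea mentioned above (a sketch of one possible check, not the project's evaluation code):

```python
def sorted_line_match(expected: str, generated: str) -> bool:
    """Compare two files ignoring line order and surrounding whitespace."""
    def normalize(text: str) -> list[str]:
        return sorted(line.strip() for line in text.splitlines() if line.strip())
    # Tolerates reordered imports/functions, but also accepts reorderings that
    # would actually break the code -- hence the preference for an LLM judge.
    return normalize(expected) == normalize(generated)
```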
Here are our development benchmarks for 100 test examples:
- Start with the 1.5B model - it shows impressive performance for its size
- If the 1.5B model doesn't meet your needs, try the 7B model
We welcome contributions to improve Fast Apply! Here are some ways you can help:
- More data: The current model uses mostly TypeScript open-source code. Adding other languages could help avoid overfitting.
- Bug Reports: If you encounter any issues, please open a GitHub issue with a detailed description.
- Feature Requests: Have ideas for new features? Open an issue to discuss them.
- Code Contributions:
- Fork the repository
- Create a new branch for your feature or bug fix
- Submit a pull request with a clear description of your changes
- Fine-tuning Improvements:
- Share your findings on model performance improvements
Happy Coding!