---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- lora
- code-generation
- fine-tuning
- competitive-programming
datasets:
- Naholav/CodeGen-Deep-5K
language:
- en
pipeline_tag: text-generation
---
|
|
|
|
|
# Deep Instruction - LoRA Fine-tuned Qwen2.5-Coder-1.5B |
|
|
|
|
|
This is the best-performing checkpoint from the **deep_instruction** training configuration.
|
|
|
|
|
## Model Details |
|
|
|
|
|
| Property | Value |
|----------|-------|
| Base Model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Training Dataset | [Naholav/CodeGen-Deep-5K](https://huggingface.co/datasets/Naholav/CodeGen-Deep-5K) |
| Training Method | LoRA (Low-Rank Adaptation) |
| Checkpoint | step-800, epoch-3 |
| Pass@1 (AtCoder Easy) | **26.83%** (11/41 problems) |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
- **Prompt Style:** Instruction (direct code generation without reasoning) |
|
|
- **System Prompt:** "You are an expert programmer. Write clean, efficient code." |
|
|
- **LoRA Rank:** 32 |
|
|
- **LoRA Alpha:** 64 |
|
|
- **LoRA Dropout:** 0.05 |
|
|
- **Learning Rate:** 5e-5 |
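For reference, the hyperparameters above correspond roughly to the following PEFT `LoraConfig`. This is a hedged reconstruction, not the exact training code; in particular, the `target_modules` list is an assumption (the usual attention projections for Qwen2.5), so check the GitHub repository for the authoritative configuration.

```python
from peft import LoraConfig

# Approximate reconstruction of the adapter configuration listed above.
# target_modules is an assumption; the training repo is the source of truth.
lora_config = LoraConfig(
    r=32,                  # LoRA rank
    lora_alpha=64,         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```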
|
|
|
|
|
|
|
|
**Note:** All four models were trained with identical hyperparameters to allow a fair comparison. Better configurations may exist and could be found with a hyperparameter search (e.g., grid or random search), as sketched below.
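A purely illustrative sweep over LoRA rank and alpha might look like the following. The `evaluate` function here is a dummy placeholder, not part of this project; in practice it would train an adapter with the given settings and return its pass@1.

```python
import random
from itertools import product

# Illustrative grid search over LoRA rank/alpha. evaluate() is a dummy stand-in
# for "train an adapter with (r, alpha) and measure pass@1 on the eval set".
def evaluate(r: int, alpha: int) -> float:
    return random.random()  # placeholder score, for demonstration only

best_score, best_cfg = -1.0, None
for r, alpha in product([16, 32, 64], [32, 64, 128]):
    score = evaluate(r, alpha)
    if score > best_score:
        best_score, best_cfg = score, (r, alpha)

print(f"best (dummy) score {best_score:.2%} at r={best_cfg[0]}, alpha={best_cfg[1]}")
```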
|
|
|
|
|
## All Models Performance Comparison |
|
|
|
|
|
Evaluated on LiveCodeBench AtCoder Easy problems (41 questions): |
|
|
|
|
|
| Model | Pass@1 | Relative Improvement |
|-------|--------|----------------------|
| Base Model (Qwen2.5-Coder-1.5B-Instruct) | 24.39% | - |
| **[deep-instruction](https://huggingface.co/Naholav/deep-instruction) (this model)** | **26.83%** | **+10%** |
| [diverse-think](https://huggingface.co/Naholav/diverse-think) | 29.27% | +20% |
| [deep-think](https://huggingface.co/Naholav/deep-think) | 31.71% | +30% |
| [diverse-instruction](https://huggingface.co/Naholav/diverse-instruction) | 31.71% | +30% |
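For clarity, pass@1 here is the fraction of the 41 problems solved (11/41 for this model), and the improvement column is measured relative to the base model's pass@1. A quick sanity check of that arithmetic:

```python
# Sanity check of the figures in the table above.
solved, total = 11, 41
pass_at_1 = solved / total                        # ~0.2683 -> 26.83%

base_pass_at_1 = 0.2439                           # base model's pass@1 from the table
relative_gain = (pass_at_1 - base_pass_at_1) / base_pass_at_1   # ~0.10 -> +10%

print(f"pass@1 = {pass_at_1:.2%}, relative improvement = {relative_gain:+.0%}")
```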
|
|
|
|
|
## Usage |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/deep-instruction")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")

# Generate with instruction prompt
messages = [
    {"role": "system", "content": "You are an expert programmer. Write clean, efficient code."},
    {"role": "user", "content": "Your problem here..."}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
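Optionally, the adapter can be merged into the base weights for standalone deployment, so `peft` is no longer needed at inference time. A short sketch, assuming the model and tokenizer were loaded as above and using a hypothetical output directory:

```python
# Merge the LoRA weights into the base model and save a standalone copy.
merged = model.merge_and_unload()
merged.save_pretrained("deep-instruction-merged")      # hypothetical local path
tokenizer.save_pretrained("deep-instruction-merged")
```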
|
|
|
|
|
## Resources |
|
|
|
|
|
- **GitHub Repository:** [https://github.com/naholav/CodeGen](https://github.com/naholav/CodeGen) |
|
|
- **Training Dataset:** [Naholav/CodeGen-Deep-5K](https://huggingface.co/datasets/Naholav/CodeGen-Deep-5K) |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex
@misc{naholav2025codegen,
  author    = {naholav},
  title     = {CodeGen: LoRA Fine-tuning for Competitive Programming},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Naholav/deep-instruction}
}
```
|
|
|