---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- lora
- code-generation
- fine-tuning
- competitive-programming
datasets:
- Naholav/CodeGen-Deep-5K
language:
- en
pipeline_tag: text-generation
---
|
|
|
|
|
# Deep Instruction - LoRA Fine-tuned Qwen2.5-Coder-1.5B |
|
|
|
|
|
This is the best-performing checkpoint from the **deep_instruction** training configuration.
|
|
|
|
|
## Model Details |
|
|
|
|
|
| Property | Value |
|----------|-------|
| Base Model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Training Dataset | [Naholav/CodeGen-Deep-5K](https://huggingface.co/datasets/Naholav/CodeGen-Deep-5K) |
| Training Method | LoRA (Low-Rank Adaptation) |
| Checkpoint | step-800, epoch-3 |
| Pass@1 (AtCoder Easy) | **26.83%** (11/41 problems) |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
- **Prompt Style:** Instruction (direct code generation without reasoning) |
|
|
- **System Prompt:** "You are an expert programmer. Write clean, efficient code." |
|
|
- **LoRA Rank:** 32 |
|
|
- **LoRA Alpha:** 64 |
|
|
- **LoRA Dropout:** 0.05 |
|
|
- **Learning Rate:** 5e-5 |
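For reference, the hyperparameters above correspond roughly to the following PEFT `LoraConfig`. This is a hedged reconstruction, not the exact training code; in particular, the `target_modules` list is an assumption (the usual attention projections for Qwen2.5), so check the GitHub repository for the authoritative configuration.

```python
from peft import LoraConfig

# Approximate reconstruction of the adapter configuration listed above.
# target_modules is an assumption; the training repo is the source of truth.
lora_config = LoraConfig(
    r=32,                  # LoRA rank
    lora_alpha=64,         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```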
|
|
|
|
|
|
|
|
**Note:** All four models were trained with identical hyperparameters to allow a fair comparison. Better configurations may exist and could be found with a hyperparameter search (e.g., grid or random search), as sketched below.
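A purely illustrative sweep over LoRA rank and alpha might look like the following. The `evaluate` function here is a dummy placeholder, not part of this project; in practice it would train an adapter with the given settings and return its pass@1.

```python
import random
from itertools import product

# Illustrative grid search over LoRA rank/alpha. evaluate() is a dummy stand-in
# for "train an adapter with (r, alpha) and measure pass@1 on the eval set".
def evaluate(r: int, alpha: int) -> float:
    return random.random()  # placeholder score, for demonstration only

best_score, best_cfg = -1.0, None
for r, alpha in product([16, 32, 64], [32, 64, 128]):
    score = evaluate(r, alpha)
    if score > best_score:
        best_score, best_cfg = score, (r, alpha)

print(f"best (dummy) score {best_score:.2%} at r={best_cfg[0]}, alpha={best_cfg[1]}")
```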
|
|
|
|
|
## All Models Performance Comparison |
|
|
|
|
|
Evaluated on LiveCodeBench AtCoder Easy problems (41 questions): |
|
|
|
|
|
| Model | Pass@1 | Relative Improvement |
|-------|--------|----------------------|
| Base Model (Qwen2.5-Coder-1.5B-Instruct) | 24.39% | - |
| **[deep-instruction](https://huggingface.co/Naholav/deep-instruction) (this model)** | **26.83%** | **+10%** |
| [diverse-think](https://huggingface.co/Naholav/diverse-think) | 29.27% | +20% |
| [deep-think](https://huggingface.co/Naholav/deep-think) | 31.71% | +30% |
| [diverse-instruction](https://huggingface.co/Naholav/diverse-instruction) | 31.71% | +30% |
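For clarity, pass@1 here is the fraction of the 41 problems solved (11/41 for this model), and the improvement column is measured relative to the base model's pass@1. A quick sanity check of that arithmetic:

```python
# Sanity check of the figures in the table above.
solved, total = 11, 41
pass_at_1 = solved / total                        # ~0.2683 -> 26.83%

base_pass_at_1 = 0.2439                           # base model's pass@1 from the table
relative_gain = (pass_at_1 - base_pass_at_1) / base_pass_at_1   # ~0.10 -> +10%

print(f"pass@1 = {pass_at_1:.2%}, relative improvement = {relative_gain:+.0%}")
```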
|
|
|
|
|
## Usage |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/deep-instruction")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")

# Generate with instruction prompt
messages = [
    {"role": "system", "content": "You are an expert programmer. Write clean, efficient code."},
    {"role": "user", "content": "Your problem here..."}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
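Optionally, the adapter can be merged into the base weights for standalone deployment, so `peft` is no longer needed at inference time. A short sketch, assuming the model and tokenizer were loaded as above and using a hypothetical output directory:

```python
# Merge the LoRA weights into the base model and save a standalone copy.
merged = model.merge_and_unload()
merged.save_pretrained("deep-instruction-merged")      # hypothetical local path
tokenizer.save_pretrained("deep-instruction-merged")
```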
|
|
|
|
|
## Resources |
|
|
|
|
|
- **GitHub Repository:** [https://github.com/naholav/CodeGen](https://github.com/naholav/CodeGen) |
|
|
- **Training Dataset:** [Naholav/CodeGen-Deep-5K](https://huggingface.co/datasets/Naholav/CodeGen-Deep-5K) |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex
@misc{naholav2025codegen,
  author    = {naholav},
  title     = {CodeGen: LoRA Fine-tuning for Competitive Programming},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Naholav/deep-instruction}
}
```
|
|
|