Model Card for siru-dialogue-lora
This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct. It has been trained using TRL.
Contents (export only)
This folder keeps only the final PEFT adapter plus the tokenizer sidecar files needed for inference. Intermediate checkpoint-* training snapshots are omitted to save disk space; retrain if you need mid-run states.
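A quick way to confirm the export is complete before loading it is to check for the files PEFT and the tokenizer normally write. The sketch below is only illustrative: the file names are assumptions based on what recent PEFT and Transformers releases usually save, not a listing taken from this folder, and missing_export_files is a hypothetical helper.

```python
from pathlib import Path

# Assumed file names: what recent PEFT and Transformers releases usually write.
# The card itself does not list the files, so adjust these to the actual export.
ADAPTER_FILES = ("adapter_config.json", "adapter_model.safetensors")
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json")


def missing_export_files(folder, expected=ADAPTER_FILES + TOKENIZER_FILES):
    """Return the expected file names that are absent from `folder`, in order."""
    root = Path(folder)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_export_files(".")
    print("export looks complete" if not missing else f"missing: {missing}")
```

An empty list means every assumed file is present; anything else names what to re-export or download.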
Quick start (base + LoRA)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER = "."  # or path / Hub id to this folder

# The tokenizer sidecars ship with the adapter, so load them from there.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER, trust_remote_code=True)

# Load the base weights first, then attach the LoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, ADAPTER)

messages = [{"role": "user", "content": "Write one line of dialogue for a tense reunion."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# The chat template already emits the BOS token, so do not add it a second time.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
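For a decoder-only model, generate returns the prompt tokens followed by the new tokens, so the final print in the quick start echoes the prompt as well as the reply. If only the generated line is wanted, the output can be sliced at the prompt length. A minimal sketch, where completion_ids is a hypothetical helper and not part of Transformers or PEFT:

```python
def completion_ids(output_ids, prompt_length):
    """Drop the first `prompt_length` token ids, keeping only the generated ones."""
    return output_ids[prompt_length:]


# With the quick-start variables this would be used as:
#   new_ids = completion_ids(out[0], inputs["input_ids"].shape[1])
#   print(tokenizer.decode(new_ids, skip_special_tokens=True))
```

The helper works on plain lists and on 1-D tensors alike, since both support slicing.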
Training procedure
This model was trained with supervised fine-tuning (SFT).
Framework versions
- PEFT: 0.19.1
- TRL: 1.2.0
- Transformers: 5.5.4
- PyTorch: 2.11.0+cu126
- Datasets: 4.8.4
- Tokenizers: 0.22.2
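To compare a local environment against the versions listed above, the standard library's importlib.metadata can report what is installed. This is a sketch under assumptions: the distribution names below (for example torch for PyTorch) are the usual PyPI names, version_mismatches is a hypothetical helper, and an exact string match is stricter than most setups need.

```python
from importlib import metadata

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED = {
    "peft": "0.19.1",
    "trl": "1.2.0",
    "transformers": "5.5.4",
    "torch": "2.11.0+cu126",
    "datasets": "4.8.4",
    "tokenizers": "0.22.2",
}


def version_mismatches(expected=EXPECTED, lookup=metadata.version):
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches().items():
        print(f"{name}: card lists {wanted}, environment has {installed}")
```

A mismatch does not necessarily mean the adapter will fail to load; it only flags where the environment differs from the one used for training.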
Citations
Cite TRL as:
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}