Model Card for siru-dialogue-lora

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct. It has been trained using TRL.

Contents (export only)

This folder contains the final PEFT adapter plus the tokenizer sidecar files needed for inference. Intermediate checkpoint-* training snapshots are omitted to save disk space; re-train if you need reproducible mid-run states.
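As a quick sanity check that a copy of this export is complete, a minimal sketch (file names assumed to follow the usual PEFT/tokenizer layout; older PEFT releases write adapter_model.bin instead of the .safetensors file):

```python
from pathlib import Path

# Files typically present in a PEFT adapter export (names assumed).
EXPECTED = [
    "adapter_config.json",
    "adapter_model.safetensors",
    "tokenizer_config.json",
]

def missing_files(folder):
    """Return the expected files that are absent from `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED if not (root / name).is_file()]

print(missing_files("."))  # empty list when the export is complete
```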

Quick start (base + LoRA)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER = "."  # or path / Hub id to this folder

# Load the tokenizer from the adapter folder so the chat template matches training.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER, trust_remote_code=True)

# Load the base model, then attach the LoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, ADAPTER)

# Build a chat-formatted prompt and sample a completion.
messages = [{"role": "user", "content": "Write one line of dialogue for a tense reunion."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
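Note that generate returns the prompt ids followed by the newly generated ids, so the decode above prints the prompt text as well. To print only the reply, slice off the prompt before decoding — sketched here on plain lists standing in for tensors:

```python
def continuation(output_ids, prompt_len):
    """Drop the echoed prompt ids, keeping only what the model generated."""
    return output_ids[prompt_len:]

# In the quick start above this would be:
#   reply_ids = continuation(out[0], inputs["input_ids"].shape[1])
#   print(tokenizer.decode(reply_ids, skip_special_tokens=True))
```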

Training procedure

This model was trained with supervised fine-tuning (SFT).
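For SFT on conversational data, TRL's SFTTrainer accepts examples carrying a "messages" list of role/content dicts. A minimal sketch of preparing such examples; the raw field names ("prompt", "response") are hypothetical placeholders, not the actual training schema:

```python
# Convert a hypothetical prompt/response pair into the conversational
# "messages" format that TRL's SFTTrainer can consume directly.
def to_chat_example(pair):
    return {
        "messages": [
            {"role": "user", "content": pair["prompt"]},
            {"role": "assistant", "content": pair["response"]},
        ]
    }

raw = [{"prompt": "Say hi.", "response": "Hi there."}]
dataset = [to_chat_example(p) for p in raw]
```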

Framework versions

  • PEFT: 0.19.1
  • TRL: 1.2.0
  • Transformers: 5.5.4
  • PyTorch: 2.11.0+cu126
  • Datasets: 4.8.4
  • Tokenizers: 0.22.2

Citations

Cite TRL as:

@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}