Model Card for siru-dialogue-lora

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct. It has been trained using TRL.

Contents (export only)

This folder contains the final PEFT (LoRA) adapter and the tokenizer files needed for inference. Intermediate checkpoint-* training snapshots are omitted to save disk space; to recover a mid-run state, you would need to re-run training.
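Before loading, it can help to confirm that the export folder is complete. The sketch below is a minimal check, not part of the original export tooling. The file names are assumptions based on PEFT's default adapter output and common tokenizer sidecars, and `check_export` is a hypothetical helper name; adjust the lists if your export differs.

```python
from pathlib import Path

# Assumed file names: PEFT's default adapter outputs plus common tokenizer sidecars.
ADAPTER_CONFIG = "adapter_config.json"
ADAPTER_WEIGHTS = ("adapter_model.safetensors", "adapter_model.bin")
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json")


def check_export(folder="."):
    """Return a list of problems found in the export folder (empty if none)."""
    root = Path(folder)
    problems = []
    if not (root / ADAPTER_CONFIG).is_file():
        problems.append(f"missing {ADAPTER_CONFIG}")
    # Either weight format is acceptable; only one needs to be present.
    if not any((root / name).is_file() for name in ADAPTER_WEIGHTS):
        problems.append("missing adapter weights (" + " or ".join(ADAPTER_WEIGHTS) + ")")
    for name in TOKENIZER_FILES:
        if not (root / name).is_file():
            problems.append(f"missing {name}")
    return problems


if __name__ == "__main__":
    for problem in check_export("."):
        print(problem)
```

An empty result means every file the sketch looks for is present; it does not validate the contents of those files.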

Quick start (base + LoRA)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER = "."  # or path / Hub id to this folder

# Llama is supported natively by Transformers, so trust_remote_code is not needed.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPTER)

messages = [{"role": "user", "content": "Write one line of dialogue for a tense reunion."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
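# The chat template already inserts the BOS token, so do not add it a second time.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = out[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))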

Training procedure

This model was trained with supervised fine-tuning (SFT).

Framework versions

PEFT: 0.19.1
TRL: 1.2.0
Transformers: 5.5.4
PyTorch: 2.11.0+cu126
Datasets: 4.8.4
Tokenizers: 0.22.2
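
To see how a local environment compares with the versions listed above, a small standard-library check is enough. This is a sketch under assumptions: the package names are the usual PyPI distribution names (for example `torch` for PyTorch), and `compare_versions` is a hypothetical helper, not something shipped with this adapter.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Framework versions" list in this card.
EXPECTED = {
    "peft": "0.19.1",
    "trl": "1.2.0",
    "transformers": "5.5.4",
    "torch": "2.11.0+cu126",
    "datasets": "4.8.4",
    "tokenizers": "0.22.2",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None
        report[name] = (want, have)
    return report


if __name__ == "__main__":
    for name, (want, have) in compare_versions().items():
        flag = "ok" if have == want else "differs"
        print(f"{name}: card={want} installed={have} [{flag}]")
```

A mismatch is not necessarily a problem. The PyTorch build suffix in particular (`+cu126`) depends on how the wheel was installed, so treat the report as a starting point for debugging rather than a strict requirement.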

Citations

Cite TRL as:

@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}

