Instructions for using mkurman/LFM2.5-230M-SYNTH with libraries and local apps.

Libraries

How to use mkurman/LFM2.5-230M-SYNTH with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="mkurman/LFM2.5-230M-SYNTH")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mkurman/LFM2.5-230M-SYNTH")
model = AutoModelForCausalLM.from_pretrained("mkurman/LFM2.5-230M-SYNTH")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use mkurman/LFM2.5-230M-SYNTH with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "mkurman/LFM2.5-230M-SYNTH"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mkurman/LFM2.5-230M-SYNTH",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
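
For scripted use, the same request can be sent from Python. The following is a minimal sketch that uses only the standard library. The helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM, and the default URL assumes the server started above is listening on localhost:8000.

```python
import json
import urllib.request

MODEL_ID = "mkurman/LFM2.5-230M-SYNTH"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the parsed JSON response. Passing a different `base_url` points the same helper at another OpenAI-compatible endpoint.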

SGLang

How to use mkurman/LFM2.5-230M-SYNTH with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "mkurman/LFM2.5-230M-SYNTH" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mkurman/LFM2.5-230M-SYNTH",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "mkurman/LFM2.5-230M-SYNTH" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mkurman/LFM2.5-230M-SYNTH",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use mkurman/LFM2.5-230M-SYNTH with Docker Model Runner:
```
docker model run hf.co/mkurman/LFM2.5-230M-SYNTH
```

LFM2.5-230M-Synth

Fine-tuned LiquidAI/LFM2.5-230M on a large synthetic reasoning dataset with preserved chain-of-thought (thinking) traces.

Model Details


Base model	LiquidAI/LFM2.5-230M (Lfm2ForCausalLM)
Architecture	LFM2 — hybrid conv+attention (8 conv + 6 full attention layers, 14 total)
Parameters	~229M (tied embeddings)
Hidden size	1024
Attention heads	16 (8 KV heads, GQA)
Vocab size	64,402
Max context	128K (trained at 2048)
Precision	bfloat16
Model size	457 MB (safetensors)
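
As a rough consistency check on the figures above, the sketch below derives the parameter count implied by the file size, the grouped-query attention layout, and an estimate of KV cache memory. It rests on three assumptions that the table does not state: the per-head dimension equals hidden size divided by attention heads, only the six full-attention layers keep a per-token KV cache, and "128K" means 128 × 1024 tokens.

```python
# Figures copied from the Model Details table.
hidden_size = 1024
attention_heads = 16
kv_heads = 8
attention_layers = 6
file_size_bytes = 457e6      # 457 MB safetensors file, read as decimal megabytes
bytes_per_value = 2          # bfloat16 stores each value in 2 bytes

# Parameter count implied by the file size (ignores the small safetensors header).
implied_params = file_size_bytes / bytes_per_value

# Assumption: per-head dimension equals hidden_size / attention_heads.
head_dim = hidden_size // attention_heads
queries_per_kv_head = attention_heads // kv_heads

# Assumption: only the full-attention layers keep a per-token KV cache,
# storing one key vector and one value vector per KV head.
kv_bytes_per_token = attention_layers * 2 * kv_heads * head_dim * bytes_per_value

# Assumption: "128K" means 128 * 1024 tokens.
kv_bytes_full_context = kv_bytes_per_token * 128 * 1024

print(f"implied parameters: {implied_params / 1e6:.1f}M")
print(f"head dim: {head_dim}, query heads per KV head: {queries_per_kv_head}")
print(f"KV cache: {kv_bytes_per_token} bytes/token, "
      f"{kv_bytes_full_context / 2**30:.2f} GiB at full context")
```

The implied count of about 228.5M parameters agrees with the ~229M listed in the table. The KV cache estimate is only as good as the assumptions behind it and excludes any state kept by the convolution layers.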

Training Details


Dataset	Synthetic reasoning mix (1.63M conversations, multi-turn with chain-of-thought)
Dataset size	~6.23 GiB (Arrow)
Training tokens	~2.88B (22,000 steps × effective batch 64 × seq 2048)
Epochs	~0.86 (partial epoch at checkpoint-22000)
Effective batch size	64 (per-device 8 × grad-accum 8)
Learning rate	5e-5, cosine schedule, 2% warmup
Optimizer	AdamW (PyTorch fused)
Sequence length	2048
Hardware	NVIDIA L40 48GB
Precision	bf16 + torch.compile
Framework	HuggingFace Transformers + TRL SFTTrainer
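
The token and epoch figures in the table follow from the step count, batch size, and sequence length. The sketch below reproduces that arithmetic. It assumes a single device, since the table lists one GPU, and it assumes each training sequence holds one conversation (no packing), so the epoch estimate is approximate.

```python
# Values copied from the Training Details table.
steps = 22_000
per_device_batch = 8
grad_accum = 8
seq_len = 2048
conversations = 1_630_000  # "1.63M conversations"

# Assumption: a single device, so no extra data-parallel multiplier.
effective_batch = per_device_batch * grad_accum

# Upper bound on tokens processed: every sequence is counted at full length.
tokens_seen = steps * effective_batch * seq_len

# Assumption: one conversation per training sequence (no packing).
epochs = steps * effective_batch / conversations

print(f"effective batch size: {effective_batch}")
print(f"training tokens: {tokens_seen / 1e9:.2f}B")
print(f"epochs: {epochs:.2f}")
```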

Training Results

Step	Train Loss	Eval Loss
500	2.71	1.8661
3,000	1.75	1.7121
5,500	1.69	1.6835
8,000	1.67	1.6687
10,500	1.66	1.6602
13,000	1.65	1.6542
15,500	1.65	1.6501
18,000	1.65	1.6478
20,500	1.65	1.6460
22,000	1.655	1.6457

Best eval loss: 1.6457 at step 22,000 (still improving at checkpoint time).

Eval loss decreased from 1.866 to 1.646 over 2.88B tokens, a 12% relative reduction, and was still trending downward at the checkpoint boundary.
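
The relative reduction and the downward trend can be checked directly against the table. The sketch below is illustrative and uses only the eval loss values listed above.

```python
# (step, eval loss) pairs copied from the Training Results table.
eval_loss = [
    (500, 1.8661), (3_000, 1.7121), (5_500, 1.6835), (8_000, 1.6687),
    (10_500, 1.6602), (13_000, 1.6542), (15_500, 1.6501), (18_000, 1.6478),
    (20_500, 1.6460), (22_000, 1.6457),
]

first, last = eval_loss[0][1], eval_loss[-1][1]
relative_reduction = (first - last) / first

# Change in eval loss between consecutive logged checkpoints (negative = improving).
deltas = [round(b[1] - a[1], 4) for a, b in zip(eval_loss, eval_loss[1:])]

print(f"relative reduction: {relative_reduction:.1%}")
print(f"still decreasing: {all(d < 0 for d in deltas)}")
print(f"last delta: {deltas[-1]}")
```

Every logged interval shows a decrease, though the final one is small (about 0.0003 over 1,500 steps), so the remaining gains per step appear modest.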

Chat Template

Uses the Liquid/LFM2 chat template with preserve_thinking=True. Reasoning traces from the dataset's reasoning_content field are mapped to the model's native thinking field before tokenization.

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "mkurman/LFM2.5-230M-SYNTH"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, dtype="bfloat16")

messages = [{"role": "user", "content": "Explain quantum entanglement simply."}]

# preserve_thinking=True so the model generates reasoning before its answer
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    preserve_thinking=True,
)
# The templated text should already carry its special tokens, so none are added again here
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Prints the prompt followed by the generated reasoning and answer
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
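
The preprocessing step described at the start of this section, moving `reasoning_content` into the `thinking` field, might look roughly like the sketch below. This is not the author's preprocessing code. The function name `map_reasoning_to_thinking` is hypothetical, and the message layout (a list of dicts with `role` and `content`) is an assumption.

```python
def map_reasoning_to_thinking(messages):
    """Return a copy of `messages` with `reasoning_content` renamed to `thinking`.

    Messages without a `reasoning_content` key pass through unchanged.
    The input list and its dicts are not modified.
    """
    mapped = []
    for message in messages:
        message = dict(message)  # shallow copy so the dataset row stays intact
        if "reasoning_content" in message:
            message["thinking"] = message.pop("reasoning_content")
        mapped.append(message)
    return mapped


example = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "reasoning_content": "2 + 2 = 4.", "content": "4"},
]
converted = map_reasoning_to_thinking(example)
print(converted[1])
```

The converted list can then be passed to `tokenizer.apply_chat_template` in place of the raw dataset messages.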

Intended Use

This is a research model fine-tuned on synthetic reasoning data. It is intended for:

Experimentation with small-model reasoning capabilities
Chain-of-thought / thinking-trace generation
Evaluation of synthetic data quality at small scale (230M)
On-device or edge reasoning model prototyping

Limitations

Partial training: This checkpoint is at ~0.86 epochs (step 22,000 / 36,500 planned). The full run continues.
Small model: 230M parameters — not suitable for production deployment without further evaluation.
Synthetic data only: Trained exclusively on synthetic reasoning traces; may exhibit distribution biases from the data generation pipeline.
Limited context at training: Trained at seq=2048 despite the architecture supporting 128K. Long-context behavior is untested.

Checkpoint Info

This is checkpoint-22000 from a 36,500-step training run. The checkpoint includes:

model.safetensors — 457 MB (weights only, no optimizer state)
Full tokenizer files + chat template
config.json with architecture details

Citation

If you use this model, please cite the base model:

@misc{liquid_lfm2,
  title={LFM2: Liquid Foundation Models},
  author={Liquid AI},
  year={2025},
  url={https://huggingface.co/LiquidAI/LFM2.5-230M}
}

Model tree for mkurman/LFM2.5-230M-SYNTH

Base model: LiquidAI/LFM2.5-230M-Base
Finetuned: LiquidAI/LFM2.5-230M