LamoFast-1.0 (GGUF)
This repository contains the GGUF release of LamoFast-1.0, a lightweight and fast open-source language model based on Qwen2.5-0.5B, with support for English and Hebrew.
The GGUF format is optimized for efficient local inference, making this model ideal for tools such as llama.cpp, Ollama, and LM Studio.
Key Features
- Fast & lightweight: ~0.5B parameters, runs well on CPU and low-memory systems
- General-purpose LLM with additional focus on astronomy & science topics
- Bilingual: English & Hebrew support
- GGUF format: optimized for low memory usage and fast loading
- Open license: Apache 2.0
Available Files
The GGUF release currently contains one file:
LomaFast_Tiny_v1.gguf
Choose a quantization based on your hardware:
- Lower quantization (Q2–Q4): faster, lower memory usage
- Higher quantization (Q5–Q8): better quality, more memory usage
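As a rule of thumb, a GGUF file's size is roughly parameters × bits-per-weight ÷ 8. The sketch below estimates sizes for a ~0.5B-parameter model; the bits-per-weight figures are approximate averages for common llama.cpp quantization types, not exact values:

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8.
# The bits-per-weight values are approximations (assumption), since
# k-quants mix block sizes and store extra scale metadata.

PARAMS = 0.5e9  # ~0.5B parameters

APPROX_BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}

for quant, bpw in APPROX_BITS_PER_WEIGHT.items():
    size_gb = PARAMS * bpw / 8 / 1e9
    print(f"{quant}: ~{size_gb:.2f} GB")
```

At this parameter count, even the highest-quality quantization stays comfortably under 1 GB, which is why the model runs well on CPU-only and low-memory machines.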
Usage Examples
llama.cpp
./main -m LamoFast-1.0.Q4_K_M.gguf -p "Explain the Big Bang theory in simple terms." -n 200
Ollama
Create a Modelfile:
FROM LamoFast-1.0.Q4_K_M.gguf
Then run:
ollama create lamofast -f Modelfile
ollama run lamofast
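The Modelfile can also carry sampling parameters and a system prompt. The directives below follow Ollama's Modelfile syntax (FROM, PARAMETER, SYSTEM); the specific values are illustrative assumptions, not recommended settings for this model:

```
FROM LamoFast-1.0.Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 2048
SYSTEM "You are a concise, helpful assistant for science and astronomy questions."
```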
LM Studio
- Open LM Studio
- Import the GGUF file
- Select the model and start chatting
Prompt Format
The model follows a chat-style prompt format compatible with Qwen-style templates.
Example:
<|user|>
Explain black holes in simple terms.
<|assistant|>
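When calling the model programmatically, it helps to build this prompt string from a message list. The helper below is a minimal sketch based on the template shown above; the exact special tokens are an assumption, so verify them against the model's chat template before relying on it:

```python
# Build a Qwen-style chat prompt from (role, content) pairs.
# Token markers <|user|> / <|assistant|> follow the template above
# (assumption: they match the model's actual chat template).

def build_prompt(messages):
    """messages: list of (role, content) tuples, role in {'user', 'assistant'}."""
    parts = [f"<|{role}|>\n{content}" for role, content in messages]
    parts.append("<|assistant|>\n")  # trailing cue so the model answers next
    return "\n".join(parts)

prompt = build_prompt([("user", "Explain black holes in simple terms.")])
print(prompt)
```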
Notes
- This is a small model by design: it prioritizes speed and efficiency over raw reasoning power
- Best results are achieved with clear, concise prompts
- Works especially well for educational, scientific, and lightweight assistant tasks
License
Licensed under the Apache License 2.0.
You are free to use, modify, and distribute this model, including for commercial purposes.
Credits
- Base model: Qwen2.5-0.5B
- Fine-tuning & GGUF release: Raziel1234
If you use LamoFast-1.0, a mention or a star on Hugging Face is always appreciated!