🖼️ Next OCR 8B
Compact OCR AI — Accurate, Fast, Multilingual, Math-Optimized
📖 Overview
Next OCR 8B is an 8-billion-parameter model optimized for optical character recognition (OCR), including mathematical and tabular content.
It supports multilingual OCR (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian, and others) and handles structured documents such as tables, forms, and formulas.
⚡ Highlights
- 🖼️ Accurate text extraction, including math and tables
- 🌍 Multilingual support (30+ languages)
- ⚡ Lightweight and efficient
- 💬 Instruction-tuned for document understanding and analysis
📊 Benchmark & Comparison
| Model | OCR-Bench Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
|---|---|---|---|
| Next OCR | 99.0 | 96.8 | 95.3 |
| PaddleOCR | 95.2 | 93.9 | 95.3 |
| Deepseek OCR | 90.6 | 87.4 | 86.1 |
| Tesseract | 92.0 | 88.4 | 72.0 |
| EasyOCR | 90.4 | 84.7 | 78.9 |
| Google Cloud Vision / DocAI | 98.7 | 95.5 | 93.6 |
| Amazon Textract | 94.7 | 86.2 | 86.1 |
| Azure Document Intelligence | 95.1 | 93.6 | 91.4 |

| Model | Handwriting (%) | Scene Text (%) | Complex Tables (%) |
|---|---|---|---|
| Next OCR | 92 | 96 | 91 |
| PaddleOCR | 88 | 92 | 90 |
| Deepseek OCR | 80 | 85 | 83 |
| Tesseract | 75 | 88 | 70 |
| EasyOCR | 78 | 86 | 75 |
| Google Cloud Vision / DocAI | 90 | 95 | 92 |
| Amazon Textract | 85 | 90 | 88 |
| Azure Document Intelligence | 87 | 91 | 89 |
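
The card does not state how the accuracy figures above were measured. One common way to score OCR output, shown here purely as an assumption about methodology, is character-level accuracy derived from the character error rate (CER): the edit distance between prediction and reference, divided by the reference length. The function names below are illustrative and not taken from any benchmark suite.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def char_accuracy(prediction: str, reference: str) -> float:
    """Return 100 * (1 - CER), floored at 0. The reference must be non-empty."""
    if not reference:
        raise ValueError("reference must be non-empty")
    cer = edit_distance(prediction, reference) / len(reference)
    return max(0.0, 100.0 * (1.0 - cer))
```

Scoring your own documents this way gives a number that can be compared across engines, though it will not necessarily match the figures in the tables, whose methodology is unspecified.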
🚀 Installation & Usage
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor
import torch

model_id = "Lamapi/next-ocr"

# The processor bundles the tokenizer and the image preprocessor.
# Newer Transformers releases offer AutoModelForImageTextToText as the successor
# to AutoModelForVision2Seq; use it if this import fails.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)

img = Image.open("image.jpg")

# ATTENTION: The content list must include both an image and text.
messages = [
    {"role": "system", "content": "You are Next-OCR, an helpful AI assistant trained by Lamapi."},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": img},
            {"type": "text", "text": "Read the text in this image and summarize it."},
        ],
    },
]

# Render the chat template to a prompt string, then tokenize it together with the image
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=256)

# The decoded string contains the prompt followed by the model's answer
print(processor.decode(generated[0], skip_special_tokens=True))
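
The comment in the snippet above notes that the user turn needs both an image and a text entry. A minimal sketch of a helper that builds that message list and checks the requirement up front is shown below. `build_messages`, `DEFAULT_SYSTEM_PROMPT`, and the parameter names are hypothetical, introduced only for illustration; they are not part of Transformers or of this model's API.

```python
from typing import Any

# System prompt copied verbatim from the usage snippet.
DEFAULT_SYSTEM_PROMPT = "You are Next-OCR, an helpful AI assistant trained by Lamapi."


def build_messages(image: Any, instruction: str,
                   system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> list[dict]:
    """Build a chat message list with exactly one image and one text instruction.

    Hypothetical helper that mirrors the message layout of the usage snippet.
    """
    if image is None:
        raise ValueError("an image is required")
    if not instruction or not instruction.strip():
        raise ValueError("a non-empty text instruction is required")
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": instruction.strip()},
            ],
        },
    ]
```

The returned list can be passed to `processor.apply_chat_template(...)` in place of a hand-written `messages` list, so a missing image or empty instruction fails early instead of at generation time.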
🧩 Key Features
| Feature | Description |
|---|---|
| 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. |
| 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. |
| ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. |
| 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. |
| 🏢 Reliable Outputs | Suitable for enterprise document workflows. |
📐 Model Specifications
| Specification | Details |
|---|---|
| Base Model | Qwen 3 |
| Parameters | 8 Billion |
| Architecture | Vision + Transformer (OCR LLM) |
| Modalities | Image-to-text |
| Fine-Tuning | OCR datasets with multilingual and math/tabular content |
| Optimizations | Quantization-ready, FP16 support |
| Primary Focus | Text extraction, document understanding, mathematical OCR |
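
To give a rough sense of what "Quantization-ready, FP16 support" means for hardware, the sketch below estimates the memory needed for the weights alone at different precisions. It is back-of-the-envelope arithmetic (parameters times bits per parameter), assumes exactly 8 billion parameters, and ignores activations, the KV cache, and any quantization overhead. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 8e9  # assumption: "8 Billion" taken as exactly 8e9 parameters
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(params, bits):.0f} GB for weights")
```

Actual usage during inference will be higher than these figures, since image inputs and long outputs add activation and cache memory on top of the weights.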
🎯 Ideal Use Cases
- Document digitization
- Invoice & receipt processing
- Multilingual OCR pipelines
- Table, form, and formula extraction
- Enterprise document management
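
For table-extraction pipelines such as invoice processing, downstream code usually needs rows rather than raw text. The sketch below assumes you have prompted the model to return tables as Markdown pipe tables (the card does not specify an output format) and parses such a string into dictionaries. `parse_markdown_table` is a hypothetical helper, and it does not handle escaped pipes inside cells.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse the first Markdown pipe table found in text into a list of row dicts."""
    rows: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("|") and line.endswith("|") and len(line) > 1:
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
        elif rows:
            break  # first non-table line after the table ends it
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[1:]
    # Drop the separator row, whose cells contain only '-', ':' and spaces.
    body = [r for r in body if not all(c and set(c) <= set("-: ") for c in r)]
    # Keep only rows whose cell count matches the header.
    return [dict(zip(header, r)) for r in body if len(r) == len(header)]
```

Rows whose cell count does not match the header are skipped instead of guessed at, which keeps malformed model output from silently shifting values into the wrong column.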
📄 License
MIT License — free for commercial & non-commercial use.
📞 Contact & Support
- 📧 Email: lamapicontact@gmail.com
- 🤗 HuggingFace: Lamapi
Next OCR — Compact OCR + math-capable AI, blending accuracy, speed, and multilingual document intelligence.