Leanstral-2603 MLX 8-bit

MLX 8-bit quantized version of mistralai/Leanstral-2603.

  • Architecture: Mistral-Small-4 (119B params, 6.5B active per token, MoE with 128 experts)
  • Specialization: Lean 4 proof engineering
  • Quantization: 8-bit (~8.5 bits/weight), 119 GB
  • Format: MLX safetensors (for Apple Silicon)
  • RAM required: at least 128 GB of unified memory
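
The ~8.5 bits/weight figure is consistent with MLX's grouped affine quantization, assuming the default group size of 64: alongside each group of 64 int8 weight values, one 16-bit scale and one 16-bit bias are stored, adding (16 + 16) / 64 = 0.5 bits of overhead per weight on top of the 8 quantized bits.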

Usage

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Leanstral-2603-8bit")

# Format the request with the model's chat template before generating.
messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even in Lean 4"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

output = generate(model, tokenizer, prompt=prompt, max_tokens=4096)
print(output)
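
For reference, the statement in the example prompt admits a short self-contained proof. The sketch below shows one way it could be written in Lean 4 (IsEven is a local definition introduced here for illustration; actual model output may differ, e.g. by using Mathlib's Even, and the omega tactic requires a recent Lean 4 toolchain):

-- Evenness: n is even when n = 2 * k for some k.
def IsEven (n : Nat) : Prop := ∃ k, n = 2 * k

-- The sum of two even numbers is even.
theorem even_add_even {m n : Nat} (hm : IsEven m) (hn : IsEven n) :
    IsEven (m + n) := by
  cases hm with
  | intro a ha =>
    cases hn with
    | intro b hb =>
      -- m + n = 2 * a + 2 * b = 2 * (a + b); omega closes the arithmetic.
      exact ⟨a + b, by omega⟩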