bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF Text Generation • 24B • Updated 15 days ago • 36.5k • 41
Article • Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand • 21 days ago • 62
The Bestiary • Collection • Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated Nov 16 • 75