MagicQuant Hybrids (v2.0) - Qwen3-4B-Instruct-2507-unsloth

MagicQuant is not a quantization technique by itself.

It is a search, judging, and hybrid-discovery system that learns from baseline families such as llama.cpp and external/custom baseline sources, then uses isolated samples, rank-safe prediction, and real benchmarking to keep the practical survivors.

Sometimes a hybrid beats a pure baseline. Sometimes it does not. MagicQuant looks for nonlinear size/quality trades: hybrids that may outperform a baseline, useful points in the gaps between anchor baselines, and more.

Read more on the MagicQuant Wiki.

Final surviving downloadable outputs

| Name | Provider | Quant Family | KLD | Size (GB) | Download |
|---|---|---|---|---|---|
| LM-Q8_0 | llama.cpp | Q8_0 | 0.001339 | 3.99 | Link |
| MQ-Q6_K_1 | MagicQuant | Q6_K | 0.001817 | 3.58 | Link |
| UD-Q6_K_XL | Unsloth | UD-Q6_K_XL | 0.002111 | 3.41 | Link |
| LM-Q6_K | llama.cpp | Q6_K | 0.004640 | 3.08 | Link |
| MQ-Q5_K_1 | MagicQuant | Q5_K | 0.006632 | 2.88 | Link |
| UD-Q5_K_XL | Unsloth | UD-Q5_K_XL | 0.009839 | 2.73 | Link |
| MQ-Q4_K_M_1 | MagicQuant | Q4_K_M | 0.020346 | 2.44 | Link |
| LM-Q4_K_S | llama.cpp | Q4_K_S | 0.029803 | 2.22 | Link |
| LM-IQ4_XS | llama.cpp | IQ4_XS | 0.031300 | 2.11 | Link |
| UD-Q3_K_XL | Unsloth | UD-Q3_K_XL | 0.072278 | 1.98 | Link |
| LM-IQ3_S | llama.cpp | IQ3_S | 0.091992 | 1.77 | Link |
| LM-IQ3_XXS | llama.cpp | IQ3_XXS | 0.190404 | 1.56 | Link |
| LM-IQ2_S | llama.cpp | IQ2_S | 0.431128 | 1.32 | Link |
| LM-IQ2_XXS | llama.cpp | IQ2_XXS | 0.938021 | 1.16 | Link |
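
As a quick sanity check on the table, the sketch below tests whether any listed file is both no larger and lower in KLD than another. It is only arithmetic on the published numbers, not MagicQuant's own selection code; the `dominates` and `dominated_entries` helpers are hypothetical names introduced here for illustration.

```python
# (name, KLD, size in GB) copied from the survivor table.
SURVIVORS = [
    ("LM-Q8_0", 0.001339, 3.99),
    ("MQ-Q6_K_1", 0.001817, 3.58),
    ("UD-Q6_K_XL", 0.002111, 3.41),
    ("LM-Q6_K", 0.004640, 3.08),
    ("MQ-Q5_K_1", 0.006632, 2.88),
    ("UD-Q5_K_XL", 0.009839, 2.73),
    ("MQ-Q4_K_M_1", 0.020346, 2.44),
    ("LM-Q4_K_S", 0.029803, 2.22),
    ("LM-IQ4_XS", 0.031300, 2.11),
    ("UD-Q3_K_XL", 0.072278, 1.98),
    ("LM-IQ3_S", 0.091992, 1.77),
    ("LM-IQ3_XXS", 0.190404, 1.56),
    ("LM-IQ2_S", 0.431128, 1.32),
    ("LM-IQ2_XXS", 0.938021, 1.16),
]


def dominates(a, b):
    """Return True if `a` is no larger than `b` and has strictly lower KLD."""
    _, kld_a, size_a = a
    _, kld_b, size_b = b
    return size_a <= size_b and kld_a < kld_b


def dominated_entries(rows):
    """Names of rows that some other row dominates."""
    return [
        b[0]
        for b in rows
        if any(dominates(a, b) for a in rows if a is not b)
    ]


print(dominated_entries(SURVIVORS))  # prints []
```

The empty result means every row is a distinct size/KLD trade: each step down in size costs some KLD, so the choice comes down to how much memory you can spare.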

Release metadata

  • Final survivor metrics — full file names, KLD, PPL delta %, byte sizes, download targets, and replacement lineage. PPL delta % is measured against the native/reference PPL when available; negative is better and larger positive values are worse.
  • Hybrid tensor map — tensor-group assignments and effective-state details for MagicQuant hybrid GGUFs.
  • Replacement details — structured details for baselines or anchors removed from the final download table, including reason codes, KLD deltas, PPL delta %, and size deltas.
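
For readers unfamiliar with the PPL delta % column, here is a minimal sketch of how such a figure is commonly computed. The exact formula MagicQuant uses is not stated here, so the relative-difference form below is an assumption, and `ppl_delta_percent` and the sample numbers are hypothetical.

```python
def ppl_delta_percent(ppl_quant: float, ppl_ref: float) -> float:
    """Relative perplexity change versus the reference, in percent.

    Assumed formula: 100 * (quantized - reference) / reference.
    Negative means the quantized model scored a lower perplexity
    than the reference; larger positive values are worse.
    """
    if ppl_ref <= 0:
        raise ValueError("reference perplexity must be positive")
    return 100.0 * (ppl_quant - ppl_ref) / ppl_ref


# Made-up numbers, purely to show the sign convention.
print(ppl_delta_percent(10.25, 10.0))  # positive: worse than reference
print(ppl_delta_percent(9.75, 10.0))   # negative: better than reference
```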

Replacement reason codes
  • STRICT_DOMINANCE — the winner was no larger and had lower real KLD than the removed anchor.
  • NEAR_BASELINE_PREMIUM — the winner used only the configured near-baseline size premium and beat the real linear KLD trade line.
  • INTERIOR_DISCOVERY — the winner was selected as a useful interior point inside a size/KLD gap between anchors.
  • SPACING_COLLAPSE — two candidates were too close in practical output space, so the stronger one was kept.
  • FINAL_DOMINANCE — a later validated survivor dominated this artifact in final real benchmark comparison.
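
Two of the ideas above, strict dominance and the linear KLD trade line, can be sketched in a few lines. This is an illustration under assumptions, not MagicQuant's implementation: the `Point` type and all function names are hypothetical, the configured size premium is left out, and the anchors in the usage example were picked from the table for illustration rather than taken from the actual replacement lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    name: str
    size_gb: float
    kld: float


def strictly_dominates(winner: Point, anchor: Point) -> bool:
    """STRICT_DOMINANCE-style check: no larger and lower KLD."""
    return winner.size_gb <= anchor.size_gb and winner.kld < anchor.kld


def linear_trade_kld(small: Point, large: Point, size_gb: float) -> float:
    """KLD on the straight line between two anchors at a given size."""
    if large.size_gb == small.size_gb:
        raise ValueError("anchors must differ in size")
    t = (size_gb - small.size_gb) / (large.size_gb - small.size_gb)
    return small.kld + t * (large.kld - small.kld)


def beats_trade_line(candidate: Point, small: Point, large: Point) -> bool:
    """True if the candidate's KLD is below the linear interpolation."""
    return candidate.kld < linear_trade_kld(small, large, candidate.size_gb)


# Values from the survivor table; the anchor pairing is illustrative only.
small = Point("LM-Q4_K_S", 2.22, 0.029803)
large = Point("LM-Q6_K", 3.08, 0.004640)
candidate = Point("MQ-Q5_K_1", 2.88, 0.006632)

print(round(linear_trade_kld(small, large, candidate.size_gb), 6))
print(beats_trade_line(candidate, small, large))  # prints True
```

A candidate below the line gives more KLD reduction per added gigabyte than a straight interpolation between its neighbors would, which is the sense in which a trade is "nonlinear".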

Underlined names in the table replaced or ultimately inherited the replacement of another artifact. Hover the name for the short replacement summary, or inspect magicquant.replacements.json for exact KLD/PPL/size deltas.

Provider credits
  • llama.cpp — Baseline quantization formats and llama.cpp tooling.
  • Unsloth — External learned baseline source (UD).

Warning

External/custom baselines are normalized into MagicQuant's controlled comparison flow. MagicQuant may rebuild a learned baseline under native-source / MagicQuant-controlled conditions, including its own imatrix handling, so hybrids can be judged on a more equal footing. That does not mean MagicQuant proved the original upstream artifact or upstream imatrix was worse. These comparisons exist for internal hybrid-search consistency, not as a universal judgment of the original creator's exact release artifact.


Support

I’m a solo developer working full time for myself to achieve my dream. I build open source code on the side. If you like any of my work, buying me a coffee is always appreciated. Otherwise, I hope you enjoy, maybe give me a star or something. Or just send me good vibes. Either way, thank you!

Click here to see ways to support: BTC, PayPal, GitHub Sponsors.

Format: GGUF
Model size: 4B params
Architecture: qwen3