This is a W4A16 (4-bit weight, 16-bit activation) quantized version of Qwen/Qwen3.5-9B in AWQ format, produced using AutoRound, Intel's quantization method based on signed gradient descent and designed for production-grade accuracy retention.
| Parameter | Value |
|---|---|
| Method | AutoRound (W4A16, AWQ format) |
| Group Size | 128 |
| Symmetric | Yes |
| Iterations | 800 |
| Calibration Samples | 512 |
| Sequence Length | 2048 |
| Torch Compile | Enabled |
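
To make the "Group Size 128" and "Symmetric" settings concrete, here is a minimal pure-Python sketch of group-wise symmetric 4-bit quantization. It uses plain round-to-nearest, which is a simplification: AutoRound tunes the rounding with signed gradient descent, and that step is not reproduced here. The function names, the random test weights, and the assumption of one 16-bit scale per group are illustrative only; the packed AWQ format may store additional metadata.

```python
import random

GROUP_SIZE = 128                # "Group Size" row in the table above
BITS = 4                        # W4: 4-bit weights
QMAX = 2 ** (BITS - 1) - 1      # largest signed 4-bit code: 7
QMIN = -(2 ** (BITS - 1))       # smallest signed 4-bit code: -8


def quantize_group(weights):
    """Symmetric round-to-nearest quantization of one group.

    Returns (codes, scale). No zero point is needed because the
    grid is centered on zero.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    codes = [min(QMAX, max(QMIN, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def quantize_row(row, group_size=GROUP_SIZE):
    """Split one weight row into consecutive groups and quantize each."""
    return [
        quantize_group(row[start:start + group_size])
        for start in range(0, len(row), group_size)
    ]


def bits_per_weight(bits=BITS, group_size=GROUP_SIZE, scale_bits=16):
    """Average storage per weight, assuming one 16-bit scale per group."""
    return bits + scale_bits / group_size


# Hypothetical weights, not taken from the actual model.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]

groups = quantize_row(row)
restored = [w for codes, scale in groups for w in dequantize_group(codes, scale)]
max_err = max(abs(a - b) for a, b in zip(row, restored))

print("groups:", len(groups))
print("bits per weight:", bits_per_weight())
print("max abs reconstruction error:", max_err)
```

Each group of 128 weights shares one scale, so an outlier only degrades precision inside its own group. Under the stated assumption, the scale overhead is 16/128 bits, giving about 4.125 bits per quantized weight; layers left unquantized would raise the model-wide average.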
This model is compatible with transformers, AutoAWQ, vLLM, and SGLang; any backend that supports AWQ-format weights works out of the box. For full model details, architecture, and capabilities, refer to the base model page.