Granite Quantized Models
Quantized versions of IBM Granite models. Licensed under the Apache 2.0 license.
This repository contains GGUF-format conversions of an IBM Granite base model at various quantization levels.
Please see the base model's full model card here: https://huggingface.co/ibm-granite/granite-4.0-1b
This model often uses the full numerical range of a 32-bit float (f32), so variants with a narrower numerical range may run into precision errors at inference. In particular, the F16 variant is known to fail on many hardware configurations.
The recommended full-precision variant is bf16.
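Why bf16 rather than f16? f16 trades exponent bits for mantissa bits, capping its largest finite value at 65504, while bf16 keeps f32's full 8-bit exponent range at reduced mantissa precision. The sketch below (a hypothetical illustration, not part of this repository) emulates bf16 by truncating the low 16 mantissa bits of an f32, showing that a value well inside f32's range overflows in f16 but survives in bf16:

```python
import numpy as np

# A value comfortably inside f32's range but above f16's max finite value (65504).
x = np.float32(1e5)

# Casting to f16 overflows to infinity.
print(np.float16(x))  # inf


def to_bf16(v: np.float32) -> np.float32:
    """Emulate bf16 by zeroing the low 16 bits of the f32 representation.

    bf16 keeps f32's sign and 8-bit exponent and truncates the mantissa
    to 7 bits, so only precision is lost, not dynamic range.
    """
    bits = np.float32(v).view(np.uint32)
    return np.uint32(bits & 0xFFFF0000).view(np.float32)


# The same value survives in bf16, just with coarser precision.
print(to_bf16(x))  # 99840.0
```

This is why a model whose weights or activations exploit f32's full dynamic range can fail under f16 but work under bf16.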
Available quantization levels: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit.
Base model: ibm-granite/granite-4.0-1b-base