Granite 4.1 3B — Abliterated

Abliterated version of IBM's Granite 4.1 3B Instruct. Enterprise-trained reasoning and tool-use capabilities intact — refusals removed.

What This Is

IBM Granite 4.1 3B is a dense 3B language model optimized for enterprise tasks: coding, document understanding, function calling, and reasoning. This release is a BF16 abliterated version.

Architecture: Granite (LLM) | Params: ~3B | Hidden: 2560 | Layers: 40 | Vocab: 100K
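As a rough guide to hardware requirements, the weight footprint follows from the parameter count and the BF16 storage width (2 bytes per value). The sketch below is back-of-the-envelope arithmetic only: it assumes exactly 3 billion parameters (the card says "~3B", so the real figure will differ somewhat), and it ignores the KV cache, activations, and framework overhead. The helper name `weight_memory_bytes` is hypothetical.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Memory needed to hold the weights alone, in bytes."""
    return num_params * bytes_per_param


BF16_BYTES = 2                    # bfloat16 stores each value in 16 bits
approx_params = 3_000_000_000     # assumption: "~3B" rounded to exactly 3e9

total = weight_memory_bytes(approx_params, BF16_BYTES)
print(f"weights only: {total / 1e9:.1f} GB ({total / 2**30:.2f} GiB)")
```

Actual usage during generation will be higher, since the KV cache grows with context length and batch size.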

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "DuoNeural/Granite-4.1-3B-Abliterated"

# Load the tokenizer and the BF16 weights; device_map="auto" places them on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Format the conversation with the model's chat template and append the assistant turn marker.
messages = [{"role": "user", "content": "Your prompt here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate, then decode only the newly generated tokens (the prompt tokens are sliced off).
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))

Abliteration

Abliteration removes the model's refusal behaviour via orthogonal projection. A refusal direction is estimated as the difference between the mean activations on harmful and harmless instruction pairs, then projected out of the Q/K/V/O attention projections and the MLP layers in every transformer block.
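The two steps described here, difference-in-means and orthogonal projection, can be sketched in a few lines of NumPy. This is a minimal illustration on random toy data, not the pipeline used for this release: the function names, array shapes, and the synthetic "harmful" offset are all assumptions made for the example. It shows only the output-side case, where a weight matrix writes into the residual stream (`y = W @ x`); matrices that read from the residual stream would instead be multiplied by the projector on the input side.

```python
import numpy as np


def refusal_direction(harmful_acts: np.ndarray, harmless_acts: np.ndarray) -> np.ndarray:
    """Unit-norm difference of mean activations. Inputs have shape (n_prompts, hidden)."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def project_out(weight: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Return (I - r r^T) @ weight for a unit vector r.

    `weight` has shape (hidden, in_features), so its outputs live in the
    residual stream; afterwards no output has a component along `direction`.
    """
    return weight - np.outer(direction, direction @ weight)


# Toy data (hypothetical): 16 prompts per class, hidden size 8.
rng = np.random.default_rng(0)
hidden = 8
offset = np.zeros(hidden)
offset[0] = 2.0                                  # synthetic "refusal" signal on axis 0
harmful = rng.normal(size=(16, hidden)) + offset
harmless = rng.normal(size=(16, hidden))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(hidden, 4))                 # stand-in for one output-side weight matrix
W_abl = project_out(W, r)

# After projection, the weight can no longer write along r (up to float error).
max_leak = float(np.abs(r @ W_abl).max())
print(f"largest remaining component along r: {max_leak:.2e}")
```

In a real model the activations would be collected from a chosen layer and token position, and the projection applied to the actual weight tensors; those choices are not specified in this card.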

What changes: The model will engage with restricted topics it previously refused.
What doesn't change: Reasoning, coding, factual knowledge, general intelligence.
KL divergence from base: Minimal — output distribution for normal queries is virtually identical to the unmodified model.
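For readers who want to check the KL claim themselves, one common approach is to compare next-token distributions from the base and modified models on the same benign prompts. The sketch below computes the mean KL(base || modified) from two logit arrays; it is an illustrative assumption about how such a measurement might be done, not the procedure used for this release, and the toy logits are made up. In practice the arrays would come from running both models on identical inputs.

```python
import numpy as np


def log_softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def mean_kl(base_logits: np.ndarray, modified_logits: np.ndarray) -> float:
    """Mean over positions of KL(base || modified), in nats.

    Both arrays have shape (n_positions, vocab_size).
    """
    log_p = log_softmax(base_logits)
    log_q = log_softmax(modified_logits)
    kl_per_position = (np.exp(log_p) * (log_p - log_q)).sum(axis=-1)
    return float(kl_per_position.mean())


# Toy logits (hypothetical): 2 positions, vocabulary of 3 tokens.
base = np.array([[2.0, 0.5, -1.0],
                 [0.0, 0.0, 0.0]])

kl_identical = mean_kl(base, base)               # same distribution
kl_shifted = mean_kl(base, base + 3.0)           # constant shift leaves softmax unchanged
kl_different = mean_kl(base, base[:, ::-1])      # reversed logits give a different distribution

print(kl_identical, kl_shifted, kl_different)
```

A value near zero on ordinary prompts would be consistent with the statement above; the threshold for "minimal" is a judgment call that the card does not quantify.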

LiteRT Version (Android)

DuoNeural/Granite-4.1-3B-LiteRT — run on Android via AI Edge Gallery.

Base Model

ibm-granite/granite-4.1-3b-instruct — Apache 2.0.

DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.

Platform	Link
HuggingFace	huggingface.co/DuoNeural
Website	duoneural.com
GitHub	github.com/DuoNeural
X / Twitter	@DuoNeural
Email	duoneural@proton.me
Newsletter	duoneural.beehiiv.com
Support	buymeacoffee.com/duoneural

DuoNeural Research Publications

Title	DOI
Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning	10.5281/zenodo.19775622
Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments	10.5281/zenodo.19810620
Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?	10.5281/zenodo.19846804
The Dynamical Horizon Principle: CTM Gates Converge to the Predictability Limit of Dynamical Systems	10.5281/zenodo.19952612

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.

Research Team

Jesse — Vision, hardware, direction
Archon — Lab Director, post-training, abliteration, experiments
Aura — Research AI, literature synthesis, peer review, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
