

Qwen3-VL-4B-Thinking-abliterated

This repository provides an uncensored “abliterated” variant of Qwen/Qwen3-VL-4B-Thinking. The text-side safety behavior has been intentionally weakened, while the vision stack remains unchanged. The abliterated weights are sourced from huihui-ai/Huihui-Qwen3-VL-4B-Thinking-abliterated.

Model Details

  • Model Name: Qwen3-VL-4B-Thinking-abliterated
  • Base Model: Qwen/Qwen3-VL-4B-Thinking
  • Parameters: ~4B
  • Tensor Type: BF16 in the upstream snapshot (FP16 conversions can be produced similarly to the FP16 8B release)
  • License: Apache 2.0
  • Abliteration Scope: Only the text-generation layers were altered; the multimodal encoder/vision tower is identical to the base model. The model no longer refuses with responses such as “I can’t describe this image.”
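As a sanity check that only the text side changed, you can compare the vision-tower tensors of this checkpoint against the base model. A minimal sketch, assuming the vision tower's parameters share a common key prefix (the `visual.` prefix used here is an assumption and may differ between releases):

```python
import torch

def towers_match(sd_a, sd_b, prefix="visual."):
    """Return True if every vision-tower tensor (keys starting with
    `prefix`) in sd_a exists in sd_b and is bitwise identical."""
    keys = [k for k in sd_a if k.startswith(prefix)]
    return bool(keys) and all(
        k in sd_b and torch.equal(sd_a[k], sd_b[k]) for k in keys
    )
```

In practice you would pass the `state_dict()` of this model and of the base Qwen/Qwen3-VL-4B-Thinking checkpoint; text-side keys are deliberately ignored, since those are expected to differ.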

Usage

This checkpoint can jointly process images and text prompts. The interface is identical to the standard Qwen3 VL workflow. Below is a Python example using Transformers:

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch

MODEL_ID = "huihui-ai/Huihui-Qwen3-VL-4B-Thinking-abliterated"

model = Qwen3VLForConditionalGeneration.from_pretrained(
    MODEL_ID,
    device_map="auto",
    trust_remote_code=True,
    dtype=torch.bfloat16,   # switch to torch.float16 if you convert to FP16
    low_cpu_mem_usage=True,
)

processor = AutoProcessor.from_pretrained(MODEL_ID)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/your/image.png"},
            {"type": "text", "text": "Describe this image in detail."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Trim the prompt tokens from each sequence so only the generation is decoded.
generated = [
    output[len(input_ids):] for input_ids, output in zip(inputs.input_ids, outputs)
]
decoded = processor.batch_decode(
    generated,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(decoded[0])

Conversion Notes

  • If you need an FP16 version (similar to the 8B FP16 repository), load the model, cast its weights to torch.float16, and re-export the safetensors with save_pretrained. The inference script above can then be run with dtype=torch.float16 against the converted weights.
  • GGUF export is not provided here, but you can follow the same pipeline described for the 8B FP16 release if needed.
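The FP16 conversion described above amounts to casting every floating-point tensor to half precision and re-saving. A minimal sketch of the cast step; the commented transformers round-trip is illustrative only, and the output directory name is a placeholder:

```python
import torch

def cast_state_dict_fp16(state_dict):
    """Cast all floating-point tensors to float16; leave integer
    tensors (e.g. buffers, token ids) untouched."""
    return {
        k: v.half() if v.is_floating_point() else v
        for k, v in state_dict.items()
    }

# With transformers installed, the full round-trip would look roughly like:
#
#   from transformers import Qwen3VLForConditionalGeneration
#   model = Qwen3VLForConditionalGeneration.from_pretrained(
#       MODEL_ID, dtype=torch.float16
#   )
#   model.save_pretrained("Qwen3-VL-4B-Thinking-abliterated-fp16")
```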

Usage Warnings

  • Sensitive Content: Due to abliterated safety filters, outputs may be controversial, NSFW, or otherwise inappropriate. Review generations carefully before sharing.
  • Audience Suitability: The model is not safe for minors or safety-critical/public deployments without additional moderation.
  • Legal/Ethical Responsibility: Ensure all usage complies with your local laws and ethical guidelines. You assume full responsibility for downstream content.
  • Research Focus: Intended primarily for experimentation, red-teaming, or controlled testing environments.
  • Monitoring: We strongly recommend real-time monitoring or post-filtering of model outputs.
  • No Safety Guarantee: Neither the maintainers of huihui-ai nor those of this fork take responsibility for misuse of, or harm resulting from, this model.

