Evaluations were produced with https://github.com/neuralmagic/GuardBench, using vLLM as the inference engine.

Evaluations were obtained with vllm==0.15.0 plus the bug fixes from this PR. If you are running vllm > 0.15.0, the fixes are likely already applied, since the PR landed on main.
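
Below is a minimal sketch of running this checkpoint with vLLM's offline API. It is illustrative only, not the GuardBench harness: the example prompt and sampling settings are assumptions, and prompt formatting is left to the chat template shipped with the model.

```python
from vllm import LLM, SamplingParams

# Minimal sketch (not the GuardBench harness): load the quantized
# checkpoint; vLLM reads the w8a8 quantization config from the repo.
llm = LLM(model="RedHatAI/Llama-Guard-4-12B-quantized.w8a8")

# Greedy decoding; guard models emit a short verdict, so few tokens suffice.
params = SamplingParams(temperature=0.0, max_tokens=20)

# The chat template applies the safety-policy preamble; the prompt
# below is an assumption, for illustration only.
messages = [{"role": "user", "content": "How do I pick a lock?"}]
out = llm.chat(messages, params)
print(out[0].outputs[0].text)  # e.g. "safe" or "unsafe\nS<category>"
```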

| Dataset | meta-llama/Llama-Guard-4-12B F1 | RedHatAI/Llama-Guard-4-12B-quantized.w8a8 (this model) F1 | F1 Recovery % | meta-llama/Llama-Guard-4-12B Recall | RedHatAI/Llama-Guard-4-12B-quantized.w8a8 (this model) Recall | Recall Recovery % |
|---|---|---|---|---|---|---|
| AART | 0.874 | 0.869 | 99.43 | 0.776 | 0.769 | 99.1 |
| AdvBench Behaviors | 0.964 | 0.966 | 100.21 | 0.931 | 0.935 | 100.43 |
| AdvBench Strings | 0.83 | 0.824 | 99.28 | 0.709 | 0.7 | 98.73 |
| BeaverTails 330k | 0.732 | 0.728 | 99.45 | 0.591 | 0.585 | 98.98 |
| Bot-Adversarial Dialogue | 0.513 | 0.504 | 98.25 | 0.376 | 0.368 | 97.87 |
| CatQA | 0.932 | 0.928 | 99.57 | 0.873 | 0.865 | 99.08 |
| ConvAbuse | 0.241 | 0.225 | 93.36 | 0.148 | 0.141 | 95.27 |
| DecodingTrust Stereotypes | 0.591 | 0.596 | 100.85 | 0.419 | 0.424 | 101.19 |
| DICES 350 | 0.118 | 0.118 | 100 | 0.063 | 0.063 | 100 |
| DICES 990 | 0.219 | 0.219 | 100 | 0.135 | 0.135 | 100 |
| Do Anything Now Questions | 0.746 | 0.736 | 98.66 | 0.595 | 0.582 | 97.82 |
| DoNotAnswer | 0.546 | 0.543 | 99.45 | 0.376 | 0.373 | 99.2 |
| DynaHate | 0.603 | 0.604 | 100.17 | 0.481 | 0.48 | 99.79 |
| HarmEval | 0.56 | 0.573 | 102.32 | 0.389 | 0.402 | 103.34 |
| HarmBench Behaviors | 0.959 | 0.954 | 99.48 | 0.922 | 0.912 | 98.92 |
| HarmfulQ | 0.86 | 0.86 | 100 | 0.755 | 0.755 | 100 |
| HarmfulQA Questions | 0.588 | 0.584 | 99.32 | 0.416 | 0.412 | 99.04 |
| HarmfulQA | 0.374 | 0.355 | 94.92 | 0.231 | 0.217 | 93.94 |
| HateCheck | 0.782 | 0.784 | 100.26 | 0.667 | 0.668 | 100.15 |
| Hatemoji Check | 0.625 | 0.62 | 99.2 | 0.474 | 0.468 | 98.73 |
| HEx-PHI | 0.966 | 0.964 | 99.79 | 0.933 | 0.93 | 99.68 |
| I-CoNa | 0.837 | 0.829 | 99.04 | 0.719 | 0.708 | 98.47 |
| I-Controversial | 0.596 | 0.621 | 104.19 | 0.425 | 0.45 | 105.88 |
| I-MaliciousInstructions | 0.824 | 0.83 | 100.73 | 0.7 | 0.71 | 101.43 |
| I-Physical-Safety | 0.493 | 0.459 | 93.1 | 0.34 | 0.31 | 91.18 |
| JBB Behaviors | 0.86 | 0.864 | 100.47 | 0.86 | 0.86 | 100 |
| MaliciousInstruct | 0.953 | 0.958 | 100.52 | 0.91 | 0.92 | 101.1 |
| MITRE | 0.663 | 0.668 | 100.75 | 0.495 | 0.502 | 101.41 |
| NicheHazardQA | 0.46 | 0.454 | 98.7 | 0.299 | 0.294 | 98.33 |
| OpenAI Moderation Dataset | 0.739 | 0.735 | 99.46 | 0.787 | 0.782 | 99.36 |
| ProsocialDialog | 0.427 | 0.42 | 98.36 | 0.276 | 0.27 | 97.83 |
| SafeText | 0.372 | 0.361 | 97.04 | 0.254 | 0.243 | 95.67 |
| SimpleSafetyTests | 0.985 | 0.985 | 100 | 0.97 | 0.97 | 100 |
| StrongREJECT Instructions | 0.91 | 0.905 | 99.45 | 0.836 | 0.826 | 98.8 |
| TDCRedTeaming | 0.947 | 0.947 | 100 | 0.9 | 0.9 | 100 |
| TechHazardQA | 0.758 | 0.755 | 99.6 | 0.61 | 0.606 | 99.34 |
| Toxic Chat | 0.433 | 0.436 | 100.69 | 0.519 | 0.525 | 101.16 |
| ToxiGen | 0.46 | 0.465 | 101.09 | 0.315 | 0.32 | 101.59 |
| XSTest | 0.834 | 0.836 | 100.24 | 0.78 | 0.78 | 100 |
| **Average Score** | 0.671 | 0.669 | 99.42 | 0.571 | 0.568 | 99.30 |
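
The Recovery % columns appear to be the quantized score divided by the baseline score, expressed as a percentage. A quick illustrative check against the AART F1 row:

```python
def recovery(quantized: float, baseline: float) -> float:
    # Recovery % = quantized metric / baseline metric * 100
    return 100.0 * quantized / baseline

print(round(recovery(0.869, 0.874), 2))  # AART F1 row -> 99.43
```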