DeBERTa-v3-Large for Claim Checkworthiness Detection (Seed 0)

This model is fine-tuned from microsoft/deberta-v3-large for claim checkworthiness detection as part of the ExplainableACD project for IJCAI 2026.

Model Description

  • Base Model: microsoft/deberta-v3-large
  • Task: Binary classification (checkworthy vs non-checkworthy claims)
  • Training Seed: 0 (part of 3-seed ensemble)
  • Fine-tuning Dataset: CLEF CheckThat! 2024 Task 1 (CT24), check-worthiness estimation
  • Training Framework: PyTorch + Transformers

Training Configuration

Hyperparameters:

  • Learning rate: 2e-5
  • Batch size: 8 per device (effective 32 via 4-step gradient accumulation)
  • Epochs: 5
  • Max sequence length: 128
  • Optimizer: AdamW with cosine schedule
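
How these settings fit together can be sketched with a toy stand-in model (the tiny linear model, random data, step count, and torch's CosineAnnealingLR are illustrative assumptions; the actual run presumably uses a warmup-cosine schedule over the full DeBERTa model):

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(128, 2)        # stand-in for DeBERTa-v3-large
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
accum_steps = 32 // 8                  # effective batch 32 from per-device batch 8
total_updates = 25                     # illustrative; the real value depends on dataset size
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_updates)

for step in range(total_updates * accum_steps):
    x = torch.randn(8, 128)                        # micro-batch of 8 examples
    y = torch.randint(0, 2, (8,))                  # binary checkworthiness labels
    loss = F.cross_entropy(model(x), y) / accum_steps
    loss.backward()                                # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:              # one optimizer update per 4 micro-batches
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()                           # cosine decay per optimizer update
```

Dividing the loss by accum_steps keeps gradient magnitudes equivalent to a true batch of 32.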

Advanced Techniques:

  • Focal Loss (γ=2.0): Handles class imbalance
  • LLRD (Layer-wise Learning Rate Decay, α=0.9): Lower learning rates for earlier layers, decayed by a factor of 0.9 per layer from the top
  • R-Drop (α=1.0): Regularization via dropout consistency
  • FGM (Fast Gradient Method, ε=1.0): Adversarial training
  • BF16 Precision: Mixed-precision training on NVIDIA A10 (24GB VRAM) GPU
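
Of the techniques above, focal loss is the simplest to sketch in isolation. This is a minimal implementation (the function name is an assumption, and the training code may additionally use per-class α weighting); it reduces to plain cross-entropy at γ=0:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss: scales cross-entropy by (1 - p_t)^gamma so that
    well-classified (easy) examples contribute less to the gradient."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)  # model's probability for the true class
    return ((1.0 - p_t) ** gamma * ce).mean()
```

With γ=2.0, an example the model already classifies with p_t=0.9 is down-weighted by (1-0.9)²=0.01, which is what makes the loss robust to class imbalance dominated by easy negatives.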

Performance

Development Set

  • F1: 96.68%
  • Accuracy: 98.45%
  • Precision: 95.49%
  • Recall: 97.90%
  • Optimal Threshold: 0.55

Test Set (Held-out)

  • F1: 80.46%
  • Accuracy: 90.03%
  • Precision: 81.40%
  • Recall: 79.55%
  • Optimal Threshold: 0.50

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "sergiopinto/deberta-v3-large-claim-checkworthiness-seed0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example claim
claim = "The president announced a new economic policy yesterday."

# Tokenize and predict
inputs = tokenizer(claim, return_tensors="pt", max_length=128, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)

# Get prediction (0.50 = optimal threshold on the held-out test set)
checkworthy_prob = probs[0][1].item()
is_checkworthy = checkworthy_prob > 0.50

print(f"Checkworthy probability: {checkworthy_prob:.4f}")
print(f"Is checkworthy: {is_checkworthy}")

Ensemble Usage

This model is seed 0 of a 3-seed ensemble. For best performance, combine with:

  • sergiopinto/deberta-v3-large-claim-checkworthiness-seed42
  • sergiopinto/deberta-v3-large-claim-checkworthiness-seed456

Late fusion ensemble achieves ~83.6% F1 on the test set (≈3 percentage points over a single seed).
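
Late fusion can be sketched as averaging the three seeds' checkworthiness probabilities (the helper names, plain-average fusion, and the 0.50 threshold here are illustrative assumptions; the project may use weighted fusion or a tuned ensemble threshold):

```python
import torch

# The three seed checkpoints listed above
SEEDS = [
    "sergiopinto/deberta-v3-large-claim-checkworthiness-seed0",
    "sergiopinto/deberta-v3-large-claim-checkworthiness-seed42",
    "sergiopinto/deberta-v3-large-claim-checkworthiness-seed456",
]

def late_fusion(per_seed_probs, threshold=0.50):
    """Average per-seed checkworthiness probabilities, then threshold."""
    avg = sum(per_seed_probs) / len(per_seed_probs)
    return avg, avg > threshold

def ensemble_predict(claim, threshold=0.50):
    """Run all three seed models on one claim and late-fuse the probabilities."""
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    probs = []
    for name in SEEDS:
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(name)
        inputs = tokenizer(claim, return_tensors="pt", max_length=128, truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs.append(torch.softmax(logits, dim=-1)[0, 1].item())
    return late_fusion(probs, threshold)
```

Loading all three checkpoints needs roughly 3× the single-model memory; for repeated inference, load the models once outside the function.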

Training Infrastructure

  • GPU: NVIDIA A10 (24GB VRAM)
  • Training Time: ~1.5 hours
  • Framework: PyTorch 2.x + Transformers 4.x
  • Precision: BF16 mixed precision
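
BF16 mixed precision in PyTorch is typically enabled through an autocast region (Trainer-style training does the equivalent on GPU with a bf16 flag); this CPU sketch with a stand-in model only illustrates the mechanism, not the project's training code:

```python
import torch

model = torch.nn.Linear(128, 2)   # stand-in model; weights stay in FP32
x = torch.randn(4, 128)

# Matmul-heavy ops run in bfloat16 inside the autocast region,
# while the master weights remain full precision
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    logits = model(x)
```

BF16 keeps FP32's exponent range at half the memory, which is why it rarely needs the loss scaling that FP16 does.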

Citation

If you use this model, please cite:

@inproceedings{pinto2026explainableacd,
  title={Explainable Automatic Claim Detection for Real-Time Fact-Checking},
  author={Pinto, Sérgio and [Co-authors]},
  booktitle={Proceedings of the 35th International Joint Conference on Artificial Intelligence (IJCAI)},
  year={2026}
}

License

MIT License

Contact

  • Author: Sérgio Pinto
  • Project: ExplainableACD (IJCAI 2026)
  • Organization: Verefy

Model Card

  • Developed by: Sérgio Pinto
  • Model type: DeBERTa-v3-Large (Transformer, ~0.4B parameters)
  • Language: English
  • Finetuned from: microsoft/deberta-v3-large
  • Task: Claim Checkworthiness Detection