C-BERT v2 (Unified): Causal Relation Extraction

A unified multi-task model for extracting causal attributions from German text, using a single 14-class relation head.

Note: The factorized v3 model is recommended for most use cases. It achieves higher accuracy (76.9% vs. 75.3%), better span detection, and more interpretable errors. This v2 model is provided for comparison and for users who prefer a simpler single-head architecture.

📄 Paper: C-BERT: Factorized Causal Relation Extraction
💻 Code: github.com/padjohn/cbert
📊 Dataset: Bundestag Causal Attribution

Model Details

C-BERT extends EuroBERT-610m with two task-specific modules, fine-tuned jointly with LoRA:

| Task | Output | Labels |
|---|---|---|
| 1. Span Recognition | BIOES sequence labeling | 9 tags: {B, I, E, S} × {INDICATOR, ENTITY}, plus a shared O tag |
| 2. Relation Classification | 14-class softmax | MONO_POS_CAUSE, MONO_NEG_CAUSE, PRIO_POS_CAUSE, ..., NO_RELATION, INTERDEPENDENCY |

The 14 classes encode the full combinatorial space: {MONO, PRIO, DIST} × {POS, NEG} × {CAUSE, EFFECT} = 12 classes, plus NO_RELATION and INTERDEPENDENCY.
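The combinatorial label set can be enumerated directly. A minimal sketch; the label strings and their ordering here are illustrative, since the authoritative id2label mapping lives in the model's config:

```python
from itertools import product

# Illustrative label construction; the model config's id2label mapping
# is authoritative, not this sketch.
SALIENCE = ["MONO", "PRIO", "DIST"]
POLARITY = ["POS", "NEG"]
ROLE = ["CAUSE", "EFFECT"]

labels = ["_".join(parts) for parts in product(SALIENCE, POLARITY, ROLE)]
labels += ["NO_RELATION", "INTERDEPENDENCY"]

print(len(labels))   # 14
print(labels[0])     # MONO_POS_CAUSE
```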

Usage

```python
from causalbert.infer import load_model, sentence_analysis, extract_tuples

model, tokenizer, config, device = load_model("pdjohn/C-EBERT-V2-610m")

# "Pesticides and car traffic are causes of species extinction."
sentences = ["Pestizide und Autoverkehr sind Ursachen von Artensterben."]
analysis = sentence_analysis(model, tokenizer, config, sentences, device=device)
results = extract_tuples(analysis)

for item in results:
    print(f"{item['cause']} --({item['influence']:+.2f})--> {item['effect']}")
```

Output:

```
Pestizide --(+1.00)--> Artensterben
Autoverkehr --(+1.00)--> Artensterben
```

Evaluation

Results below are for the flagship checkpoint (seed 456, epoch 3), evaluated on the held-out test set (478 relations, not augmented).

Relation Classification

| Metric | Score |
|---|---|
| 14-class accuracy | 75.3% |
| 14-class F1 (macro) | 61.9% |

Span Detection (Strict F1)

| Span Type | Precision | Recall | F1 |
|---|---|---|---|
| Entity | 0.893 | 0.563 | 0.691 |
| Indicator | 0.875 | 0.516 | 0.649 |

Note: v2 produces conservative span predictions (high precision, low recall) compared to v3.
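"Strict" here means a predicted span only counts if its boundaries and type both match a gold span exactly. A minimal sketch of that metric; the function name and the (start, end, type) span encoding are my own, not taken from the C-BERT codebase:

```python
def strict_span_prf(gold, pred):
    """Strict span matching: a predicted (start, end, type) triple
    scores only on an exact match with a gold triple."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

gold = [(0, 1, "ENTITY"), (5, 6, "ENTITY"), (3, 3, "INDICATOR")]
pred = [(0, 1, "ENTITY"), (3, 4, "INDICATOR")]  # second span misses by one token
print(strict_span_prf(gold, pred))  # (0.5, 0.333..., 0.4)
```

Under this metric a near-miss on either boundary scores zero, which is why the conservative v2 spans trade recall for precision.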

Multi-Seed Robustness

| | v2 (unified) | v3 (factorized) |
|---|---|---|
| Mean accuracy (5 seeds) | 0.744 ± 0.007 | 0.768 ± 0.009 |
| Best seed | 0.753 | 0.781 |

Training

| Parameter | Value |
|---|---|
| Base model | EuroBERT-610m |
| Architecture | v2 (unified 14-class softmax) |
| LoRA | r=16, α=32, dropout=0.05 |
| Learning rate | 3×10⁻⁴ (cosine schedule) |
| Warmup ratio | 0.05 |
| Epochs | 7 (best checkpoint: epoch 3) |
| Batch size | 32 |
| Training seed | 456 |
| Dataset | 2,391 relations, augmented to 7,604 (mode 2) |
| Loss | Weighted cross-entropy (14 classes) |
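The weighted cross-entropy in the last row is, per example, -w[y] · log softmax(z)[y]. A NumPy sketch with an invented weight vector purely for illustration (the actual class weights come from the training code):

```python
import numpy as np

def weighted_ce(logits, target, class_weights):
    """Class-weighted cross-entropy for a single example:
    -w[target] * log softmax(logits)[target]."""
    logits = logits - logits.max()                    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -class_weights[target] * log_probs[target]

num_classes = 14
logits = np.zeros(num_classes)        # uniform prediction over the 14 classes
weights = np.ones(num_classes)
weights[12] = 0.2                     # e.g. down-weight a frequent class (illustrative)
print(weighted_ce(logits, 12, weights))  # 0.2 * ln(14) ≈ 0.528
```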

Dataset

Trained on 2,391 manually annotated causal relations in German environmental discourse (1990–2020), covering forest dieback, insect death, bee death, and species extinction. The data is split 80/20 into train/test at the sentence level; the training relations are then augmented via entity replacement.
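The entity-replacement augmentation presumably swaps annotated entity spans for same-type alternatives while keeping the relation label unchanged. A naive sketch under that assumption; the function, lexicon, and replacement mechanics are hypothetical, not the authors' implementation:

```python
import random

def augment_by_entity_replacement(sentence, entities, lexicon, rng):
    """Hypothetical sketch: replace each annotated entity surface form
    with a random same-type entity from a lexicon. Naive string
    replacement; a real implementation would work on span offsets."""
    for surface, etype in entities:
        candidates = [e for e in lexicon.get(etype, []) if e != surface]
        if candidates:
            sentence = sentence.replace(surface, rng.choice(candidates))
    return sentence

lexicon = {"ENTITY": ["Pestizide", "Monokulturen", "Artensterben", "Bienensterben"]}
s = "Pestizide sind Ursachen von Artensterben."
out = augment_by_entity_replacement(
    s, [("Pestizide", "ENTITY"), ("Artensterben", "ENTITY")], lexicon, random.Random(0)
)
print(out)  # same causal frame, different entities
```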

A publicly releasable subset of 487 relations from German parliamentary debates is available at bundestag-causal-attribution.

Citation

```bibtex
@article{johnson2026cbert,
  title={C-BERT: Factorized Causal Relation Extraction},
  author={Johnson, Patrick},
  year={2026},
  doi={10.26083/tuda-7797}
}
```

Also Available

  • C-BERT v3 (Factorized) ⭐: Three parallel heads (role, polarity, salience). Higher accuracy (76.9%), better span detection, fewer multi-head error cascades. Recommended.