C-BERT v2 (Unified): Causal Relation Extraction
A unified multi-task model for extracting causal attributions from German text, using a single 14-class relation head.
Note: The factorized v3 model is recommended for most use cases. It achieves higher accuracy (76.9% vs. 75.3%), better span detection, and more interpretable errors. This v2 model is provided for comparison and for users who prefer a simpler single-head architecture.
📄 Paper: C-BERT: Factorized Causal Relation Extraction
💻 Code: github.com/padjohn/cbert
📊 Dataset: Bundestag Causal Attribution
Model Details
C-BERT extends EuroBERT-610m with two task-specific modules, fine-tuned jointly with LoRA:
| Task | Output | Labels |
|---|---|---|
| 1. Span Recognition | BIOES Sequence Labeling | 9 tags: {B, I, E, S} × {INDICATOR, ENTITY}, plus O |
| 2. Relation Classification | 14-class Softmax | MONO_POS_CAUSE, MONO_NEG_CAUSE, PRIO_POS_CAUSE, ..., NO_RELATION, INTERDEPENDENCY |
The 14 classes encode the full combinatorial space: {MONO, PRIO, DIST} × {POS, NEG} × {CAUSE, EFFECT} = 12 classes, plus NO_RELATION and INTERDEPENDENCY.
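This combinatorial space can be made concrete with a short sketch (illustrative only; the actual label order in the model config may differ):

```python
# Enumerate the 14-class relation label space: 3 salience values ×
# 2 polarities × 2 roles = 12 combinations, plus two special labels.
from itertools import product

salience = ["MONO", "PRIO", "DIST"]
polarity = ["POS", "NEG"]
role = ["CAUSE", "EFFECT"]

labels = [f"{s}_{p}_{r}" for s, p, r in product(salience, polarity, role)]
labels += ["NO_RELATION", "INTERDEPENDENCY"]

print(len(labels))  # 14
print(labels[0])    # MONO_POS_CAUSE
```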
Usage
```python
from causalbert.infer import load_model, sentence_analysis, extract_tuples

model, tokenizer, config, device = load_model("pdjohn/C-EBERT-V2-610m")

sentences = ["Pestizide und Autoverkehr sind Ursachen von Artensterben."]
analysis = sentence_analysis(model, tokenizer, config, sentences, device=device)
results = extract_tuples(analysis)

for item in results:
    print(f"{item['cause']} --({item['influence']:+.2f})--> {item['effect']}")
```
Output:

```
Pestizide --(+1.00)--> Artensterben
Autoverkehr --(+1.00)--> Artensterben
```
Evaluation
Flagship model (seed 456, epoch 3). Evaluated on held-out test set (478 relations, not augmented).
Relation Classification
| Metric | Score |
|---|---|
| 14-class Accuracy | 75.3% |
| 14-class F1 (macro) | 61.9% |
Span Detection (Strict F1)
| Span Type | Precision | Recall | F1 |
|---|---|---|---|
| Entity | 0.893 | 0.563 | 0.691 |
| Indicator | 0.875 | 0.516 | 0.649 |
Note: v2 produces conservative span predictions (high precision, low recall) compared to v3.
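Strict F1 counts a predicted span as correct only when both its boundaries and its type exactly match a gold span. A minimal sketch of that metric (illustrative; not the project's evaluation code):

```python
# Strict-match span F1: spans are (start, end, type) triples, and a
# prediction only scores if all three fields match a gold span exactly.
def strict_span_f1(gold, pred):
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = [(0, 2, "ENTITY"), (3, 4, "INDICATOR"), (5, 7, "ENTITY")]
pred = [(0, 2, "ENTITY"), (3, 4, "ENTITY")]  # second span has the wrong type
p, r, f = strict_span_f1(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")  # P=0.500 R=0.333 F1=0.400
```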
Multi-Seed Robustness
| Metric | v2 (unified) | v3 (factorized) |
|---|---|---|
| Mean accuracy (5 seeds) | 0.744 ± 0.007 | 0.768 ± 0.009 |
| Best seed | 0.753 | 0.781 |
Training
| Parameter | Value |
|---|---|
| Base model | EuroBERT-610m |
| Architecture | v2 (unified 14-class softmax) |
| LoRA | r=16, α=32, dropout=0.05 |
| Learning rate | 3×10⁻⁴ (cosine schedule) |
| Warmup ratio | 0.05 |
| Epochs | 7 (best checkpoint: epoch 3) |
| Batch size | 32 |
| Training seed | 456 |
| Dataset | 2,391 relations, augmented to 7,604 (mode 2) |
| Loss | Weighted cross-entropy (14 classes) |
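The class-weighted cross-entropy in the last row can be sketched as follows. This is a NumPy sketch under an assumed inverse-frequency weighting; the model card does not specify the exact weighting scheme:

```python
import numpy as np

# Sketch of class-weighted cross-entropy over the 14 relation classes.
# Weights are inverse class frequency (an assumption, chosen to upweight
# rare relation types); the project's actual weights may differ.
def weighted_cross_entropy(logits, targets, class_counts):
    weights = class_counts.sum() / (len(class_counts) * class_counts)
    # log-softmax with the usual max-shift for numerical stability
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(targets)), targets]
    w = weights[targets]
    return (w * per_example).sum() / w.sum()

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 14))
targets = np.array([0, 12, 13, 5])  # e.g. MONO_POS_CAUSE, NO_RELATION, ...
counts = rng.integers(10, 500, size=14).astype(float)
print(round(float(weighted_cross_entropy(logits, targets, counts)), 3))
```

With uniform class counts this reduces to the plain mean cross-entropy.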
Dataset
Trained on 2,391 manually annotated causal relations in German environmental discourse (1990–2020), covering forest dieback, insect death, bee death, and species extinction. 80/20 train/test split at sentence level; augmentation expands the training set via entity replacement (2,391 → 7,604 relations in mode 2).
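The entity-replacement augmentation can be illustrated schematically (hypothetical helper, not the project's augmentation code):

```python
# Schematic entity-replacement augmentation (hypothetical, not the
# project's implementation): swap an annotated entity span for another
# corpus entity to mint a new training relation with the same structure.
def replace_entity(sentence, span, replacement):
    start, end = span
    return sentence[:start] + replacement + sentence[end:]

sent = "Pestizide sind Ursachen von Artensterben."
# (0, 9) marks the character span of the cause entity "Pestizide"
aug = replace_entity(sent, (0, 9), "Autoverkehr")
print(aug)  # Autoverkehr sind Ursachen von Artensterben.
```

A real implementation would also shift the BIOES tags and downstream span offsets to match the new entity length.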
A publicly releasable subset of 487 relations from German parliamentary debates is available at bundestag-causal-attribution.
Citation
```bibtex
@article{johnson2026cbert,
  title={C-BERT: Factorized Causal Relation Extraction},
  author={Johnson, Patrick},
  year={2026},
  doi={10.26083/tuda-7797}
}
```
Also Available
- C-BERT v3 (Factorized) ⭐: Three parallel heads (role, polarity, salience). Higher accuracy (76.9%), better span detection, fewer multi-head error cascades. Recommended.