C-BERT v2 (Unified): Causal Relation Extraction

A unified multi-task model for extracting causal attributions from German text, using a single 14-class relation head.

Note: The factorized v3 model is recommended for most use cases. It achieves higher accuracy (76.9% vs. 75.3%), better span detection, and more interpretable errors. This v2 model is provided for comparison and for users who prefer a simpler single-head architecture.

📄 Paper: C-BERT: Factorized Causal Relation Extraction
💻 Code: github.com/padjohn/cbert
📊 Dataset: Bundestag Causal Attribution

Model Details

C-BERT extends EuroBERT-610m with two task-specific modules, fine-tuned jointly with LoRA:

| Task | Output | Labels |
|---|---|---|
| 1. Span Recognition | BIOES sequence labeling | 9 tags: {B, I, E, S} × {INDICATOR, ENTITY}, plus a shared O tag |
| 2. Relation Classification | 14-class softmax | MONO_POS_CAUSE, MONO_NEG_CAUSE, PRIO_POS_CAUSE, ..., NO_RELATION, INTERDEPENDENCY |

The 14 classes encode the full combinatorial space: {MONO, PRIO, DIST} × {POS, NEG} × {CAUSE, EFFECT} = 12 classes, plus NO_RELATION and INTERDEPENDENCY.
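The combinatorial label set can be enumerated directly. A minimal sketch; the label strings and their ordering here are illustrative, since the authoritative id2label mapping lives in the model's config:

```python
from itertools import product

# Illustrative label construction; the model config's id2label mapping
# is authoritative, not this sketch.
SALIENCE = ["MONO", "PRIO", "DIST"]
POLARITY = ["POS", "NEG"]
ROLE = ["CAUSE", "EFFECT"]

labels = ["_".join(parts) for parts in product(SALIENCE, POLARITY, ROLE)]
labels += ["NO_RELATION", "INTERDEPENDENCY"]

print(len(labels))   # 14
print(labels[0])     # MONO_POS_CAUSE
```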

Usage

```python
from causalbert.infer import load_model, sentence_analysis, extract_tuples

model, tokenizer, config, device = load_model("pdjohn/C-EBERT-V2-610m")

# "Pesticides and car traffic are causes of species extinction."
sentences = ["Pestizide und Autoverkehr sind Ursachen von Artensterben."]
analysis = sentence_analysis(model, tokenizer, config, sentences, device=device)
results = extract_tuples(analysis)

for item in results:
    print(f"{item['cause']} --({item['influence']:+.2f})--> {item['effect']}")
```

Output:

```
Pestizide --(+1.00)--> Artensterben
Autoverkehr --(+1.00)--> Artensterben
```

Evaluation

Results below are for the flagship checkpoint (seed 456, epoch 3), evaluated on the held-out test set (478 relations, not augmented).

Relation Classification

| Metric | Score |
|---|---|
| 14-class accuracy | 75.3% |
| 14-class F1 (macro) | 61.9% |

Span Detection (Strict F1)

| Span Type | Precision | Recall | F1 |
|---|---|---|---|
| Entity | 0.893 | 0.563 | 0.691 |
| Indicator | 0.875 | 0.516 | 0.649 |

Note: v2 produces conservative span predictions (high precision, low recall) compared to v3.
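"Strict" here means a predicted span only counts if its boundaries and type both match a gold span exactly. A minimal sketch of that metric; the function name and the (start, end, type) span encoding are my own, not taken from the C-BERT codebase:

```python
def strict_span_prf(gold, pred):
    """Strict span matching: a predicted (start, end, type) triple
    scores only on an exact match with a gold triple."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

gold = [(0, 1, "ENTITY"), (5, 6, "ENTITY"), (3, 3, "INDICATOR")]
pred = [(0, 1, "ENTITY"), (3, 4, "INDICATOR")]  # second span misses by one token
print(strict_span_prf(gold, pred))  # (0.5, 0.333..., 0.4)
```

Under this metric a near-miss on either boundary scores zero, which is why the conservative v2 spans trade recall for precision.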

Multi-Seed Robustness

| | v2 (unified) | v3 (factorized) |
|---|---|---|
| Mean accuracy (5 seeds) | 0.744 ± 0.007 | 0.768 ± 0.009 |
| Best seed | 0.753 | 0.781 |

Training

| Parameter | Value |
|---|---|
| Base model | EuroBERT-610m |
| Architecture | v2 (unified 14-class softmax) |
| LoRA | r=16, α=32, dropout=0.05 |
| Learning rate | 3×10⁻⁴ (cosine schedule) |
| Warmup ratio | 0.05 |
| Epochs | 7 (best checkpoint: epoch 3) |
| Batch size | 32 |
| Training seed | 456 |
| Dataset | 2,391 relations, augmented to 7,604 (mode 2) |
| Loss | Weighted cross-entropy (14 classes) |
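The weighted cross-entropy in the last row is, per example, -w[y] · log softmax(z)[y]. A NumPy sketch with an invented weight vector purely for illustration (the actual class weights come from the training code):

```python
import numpy as np

def weighted_ce(logits, target, class_weights):
    """Class-weighted cross-entropy for a single example:
    -w[target] * log softmax(logits)[target]."""
    logits = logits - logits.max()                    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -class_weights[target] * log_probs[target]

num_classes = 14
logits = np.zeros(num_classes)        # uniform prediction over the 14 classes
weights = np.ones(num_classes)
weights[12] = 0.2                     # e.g. down-weight a frequent class (illustrative)
print(weighted_ce(logits, 12, weights))  # 0.2 * ln(14) ≈ 0.528
```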

Dataset

Trained on 2,391 manually annotated causal relations in German environmental discourse (1990–2020), covering forest dieback, insect death, bee death, and species extinction. The data is split 80/20 into train/test at the sentence level; the training relations are then augmented via entity replacement.
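The entity-replacement augmentation presumably swaps annotated entity spans for same-type alternatives while keeping the relation label unchanged. A naive sketch under that assumption; the function, lexicon, and replacement mechanics are hypothetical, not the authors' implementation:

```python
import random

def augment_by_entity_replacement(sentence, entities, lexicon, rng):
    """Hypothetical sketch: replace each annotated entity surface form
    with a random same-type entity from a lexicon. Naive string
    replacement; a real implementation would work on span offsets."""
    for surface, etype in entities:
        candidates = [e for e in lexicon.get(etype, []) if e != surface]
        if candidates:
            sentence = sentence.replace(surface, rng.choice(candidates))
    return sentence

lexicon = {"ENTITY": ["Pestizide", "Monokulturen", "Artensterben", "Bienensterben"]}
s = "Pestizide sind Ursachen von Artensterben."
out = augment_by_entity_replacement(
    s, [("Pestizide", "ENTITY"), ("Artensterben", "ENTITY")], lexicon, random.Random(0)
)
print(out)  # same causal frame, different entities
```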

A publicly releasable subset of 487 relations from German parliamentary debates is available at bundestag-causal-attribution.

Citation

```bibtex
@article{johnson2026cbert,
  title={C-BERT: Factorized Causal Relation Extraction},
  author={Johnson, Patrick},
  year={2026},
  doi={10.26083/tuda-7797}
}
```

Also Available

  • C-BERT v3 (Factorized) ⭐: Three parallel heads (role, polarity, salience). Higher accuracy (76.9%), better span detection, fewer multi-head error cascades. Recommended.