XLM-RoBERTa-large + CRF for Situation-Entity Segmentation
Fine-tuned XLM-RoBERTa-large with a linear classifier and a CRF output layer for situation-entity segmentation.
The model assigns BI(O) tags (B-EDU, I-EDU, O) to each token, marking the boundaries and spans of situation-entity segments, i.e. contiguous clause-level segments that each describe a single situation type.
We use the multilingual variant of RoBERTa so that the model can potentially be transferred zero-shot to situation segmentation in other languages and language varieties.
Architecture
XLM-RoBERTa-large encoder → Linear(1024 → 3) → CRF(3 tags)
- Encoder: FacebookAI/xlm-roberta-large
- Classifier: single linear layer mapping the encoder's hidden states to 3 tag logits
- Decoder: Viterbi decoding via a linear-chain CRF (pytorch-crf)
- Labels: B-EDU (0), I-EDU (1), O (2)
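To illustrate what the CRF decoder adds on top of the per-token logits, here is a minimal pure-Python sketch of Viterbi decoding over the three tags. It is not the pytorch-crf implementation and does not use the trained weights; the function name `viterbi_decode` and all scores below are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

TAGS = ["B-EDU", "I-EDU", "O"]  # label ids 0, 1, 2 as listed above


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    start: Sequence[float],
    end: Sequence[float],
) -> List[int]:
    """Return the highest-scoring tag-id path for one sequence.

    emissions[t][j]   : score of tag j at position t (the classifier logits)
    transitions[i][j] : score of moving from tag i to tag j
    start[j], end[j]  : scores for starting / ending in tag j
    """
    n_tags = len(start)
    # score[j] = best score of any path that ends in tag j at the current position
    score = [start[j] + emissions[0][j] for j in range(n_tags)]
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_prev = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Pick the best final tag, then follow the backpointers to the start.
    last = max(range(n_tags), key=lambda j: score[j] + end[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy scores (hypothetical, NOT the trained parameters).
emissions = [
    [2.0, 0.0, 0.0],  # token 0: clearly B-EDU
    [0.0, 1.0, 0.0],  # token 1: I-EDU
    [0.6, 0.5, 0.0],  # token 2: logits alone slightly prefer B-EDU
]
transitions = [
    [-1.0, 1.0, 0.0],  # from B-EDU
    [0.0, 1.0, 0.0],   # from I-EDU
    [0.0, -5.0, 0.0],  # from O: O -> I-EDU is strongly penalised
]
start = [0.0, -5.0, 0.0]  # starting a sentence with I-EDU is penalised
end = [0.0, 0.0, 0.0]

greedy = [TAGS[max(range(3), key=lambda j: row[j])] for row in emissions]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions, start, end)]
print(greedy)   # ['B-EDU', 'I-EDU', 'B-EDU']
print(decoded)  # ['B-EDU', 'I-EDU', 'I-EDU']
```

In this toy case the transition scores override the per-token argmax on the last token, which is the kind of sequence-level consistency a CRF layer is meant to provide.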
Training Data
Fine-tuned on the situation entity annotated corpus from:
Annemarie Friedrich, Alexis Palmer and Manfred Pinkal. Situation entity types: automatic classification of clause-level aspect. ACL 2016. (GitHub)
The dataset is licensed under the Apache License 2.0. Per the terms of the Apache 2.0 license, notice is hereby given that these weights represent a modified derivative work based on that data.
The corpus contains English text with clause-level situation-entity annotations. The standard train/dev/test split from the original paper is used.
Training Details
| Hyperparameter | Value |
|---|---|
| Base model | FacebookAI/xlm-roberta-large |
| Learning rate | 4e-5 |
| Epochs (max) | 20 |
| Batch size | 64 |
| Weight decay | 0.001 |
| Early stopping | patience 3 (B-EDU F1 on dev) |
| Precision | fp16 |
Please find further training details in our code on GitHub.
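The early-stopping rule from the table (patience 3 on dev B-EDU F1) can be sketched as follows. This is a minimal illustration under assumptions: the helper name `best_epoch_with_patience`, the strict-improvement criterion, and the per-epoch scores are hypothetical, and the actual training code may differ in detail.

```python
from typing import Iterable, Tuple


def best_epoch_with_patience(dev_scores: Iterable[float], patience: int = 3) -> Tuple[int, float]:
    """Return (best_epoch, best_score); stop after `patience` epochs without improvement."""
    best_epoch, best_score, bad_epochs = -1, float("-inf"), 0
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best_score:
            # New best dev score: remember it and reset the patience counter.
            best_epoch, best_score, bad_epochs = epoch, score, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # later epochs are never trained
    return best_epoch, best_score


# Hypothetical dev B-EDU F1 per epoch (illustrative numbers, not from the actual runs).
dev_f1 = [0.880, 0.895, 0.901, 0.899, 0.900, 0.898, 0.903]
stopped = best_epoch_with_patience(dev_f1, patience=3)
print(stopped)  # (3, 0.901)
```

Note that with this rule training stops after epoch 6, so the higher score at epoch 7 in the toy list is never reached; that trade-off is inherent to patience-based stopping.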
Results
Evaluated on the held-out test set. The table shows the best single run and the mean ± std across 5 random seeds for the best hyperparameter configuration (lr=4e-5, wd=0.001). A full grid search over 4 configurations × 5 seeds (20 runs total) was conducted; all configurations achieved similar B-EDU F1 in the range 0.902–0.904.
| Metric | Best run | Mean ± std (5 seeds) |
|---|---|---|
| B-EDU F1 | 0.907 | 0.904 ± 0.002 |
| B-EDU Precision | 0.901 | 0.898 ± 0.010 |
| B-EDU Recall | 0.914 | 0.911 ± 0.009 |
| Token Accuracy | 0.982 | 0.979 ± 0.002 |
| WindowDiff (↓) | 0.075 | 0.077 ± 0.002 |
| Exact Match (sentence) | 0.753 | 0.742 ± 0.007 |
WindowDiff (Pevzner & Hearst, 2002) measures boundary-level segmentation quality within a sliding window of half the average reference segment length (lower is better). Exact Match is the fraction of sentences whose full tag sequence is predicted correctly (sentence level).
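The metrics above can be sketched in a few lines of pure Python. This is one common formulation under stated assumptions, not the evaluation script used for the reported numbers: the function names are hypothetical, B-EDU tokens are treated as boundaries, and the window size is computed per sequence, whereas the actual evaluation may aggregate differently.

```python
from typing import List, Sequence, Tuple


def boundary_prf(ref: Sequence[str], hyp: Sequence[str], label: str = "B-EDU") -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 for one label."""
    tp = sum(r == label and h == label for r, h in zip(ref, hyp))
    n_hyp = sum(h == label for h in hyp)
    n_ref = sum(r == label for r in ref)
    precision = tp / n_hyp if n_hyp else 0.0
    recall = tp / n_ref if n_ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def exact_match(ref_sents: Sequence[Sequence[str]], hyp_sents: Sequence[Sequence[str]]) -> float:
    """Fraction of sentences whose full tag sequence is predicted correctly."""
    return sum(list(r) == list(h) for r, h in zip(ref_sents, hyp_sents)) / len(ref_sents)


def window_diff(ref: Sequence[str], hyp: Sequence[str], label: str = "B-EDU") -> float:
    """Fraction of sliding windows in which reference and prediction disagree
    on the number of boundaries (lower is better)."""
    ref_b = [int(t == label) for t in ref]
    hyp_b = [int(t == label) for t in hyp]
    n_segments = max(sum(ref_b), 1)
    # Window size: half the average reference segment length, at least 1.
    k = max(1, round(len(ref) / n_segments / 2))
    n_windows = len(ref) - k + 1
    errors = sum(sum(ref_b[i:i + k]) != sum(hyp_b[i:i + k]) for i in range(n_windows))
    return errors / n_windows


# Toy example: the second boundary is predicted one token too early.
ref = ["B-EDU", "I-EDU", "I-EDU", "I-EDU", "B-EDU", "I-EDU", "I-EDU", "I-EDU"]
hyp = ["B-EDU", "I-EDU", "I-EDU", "B-EDU", "I-EDU", "I-EDU", "I-EDU", "I-EDU"]

print(boundary_prf(ref, hyp))                                         # (0.5, 0.5, 0.5)
print(window_diff(ref, hyp))                                          # 2 of 7 windows differ
print(exact_match([ref, ["B-EDU", "I-EDU"]], [hyp, ["B-EDU", "I-EDU"]]))  # 0.5
```

The toy case shows why the metrics complement each other: the near-miss boundary costs half of the B-EDU F1 but only 2 of 7 windows under WindowDiff.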
Usage
Requirements
pip install transformers torch pytorch-crf
spaCy is not a hard dependency, but is recommended for sentence splitting (matching the training setup):
pip install spacy && python -m spacy download en_core_web_sm
Loading the model
from transformers import AutoConfig, AutoModel, AutoTokenizer
config = AutoConfig.from_pretrained("xaver-krueckl/situation-entity-segmenter", trust_remote_code=True)
model = AutoModel.from_pretrained("xaver-krueckl/situation-entity-segmenter", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-roberta-large")
model.eval()
Inference
The model was trained on spaCy-tokenised, sentence-split input (en_core_web_sm), so inference should follow the same setup. Split your input text into sentences using spaCy first, then call model.predict_text(words, tokenizer) with the word tokens for each sentence:
import spacy
nlp = spacy.load("en_core_web_sm")
text = "The cat sat on the mat. It looked around the room."
results = []
for sent in nlp(text).sents:
    words = [token.text for token in sent]
    results.extend(model.predict_text(words, tokenizer))
for word, tag in results:
    print(f"{word:20s} {tag}")
B-EDU marks the start of a new situation-entity segment; I-EDU marks its continuation; O marks tokens outside any segment.
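To turn the per-token tags into segments, the tag scheme described above can be applied as in the following sketch. It assumes the output is a list of (word, tag) pairs, as the loop in the inference example suggests; the helper `group_segments` and the tags in the example are hypothetical and not actual model output.

```python
from typing import List, Tuple


def group_segments(tagged: List[Tuple[str, str]]) -> List[List[str]]:
    """Group (word, tag) pairs into segments based on B-EDU / I-EDU / O tags."""
    segments: List[List[str]] = []
    current: List[str] = []
    for word, tag in tagged:
        if tag == "B-EDU":
            # A new segment starts: close the one that is currently open.
            if current:
                segments.append(current)
            current = [word]
        elif tag == "I-EDU":
            # Continue the open segment. A stray I-EDU with no open segment
            # simply starts one (an assumption made for this sketch).
            current.append(word)
        else:
            # "O": token outside any segment, so close the open segment.
            if current:
                segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Hypothetical tags for illustration only.
tagged = [
    ("The", "B-EDU"), ("cat", "I-EDU"), ("sat", "I-EDU"), (".", "O"),
    ("It", "B-EDU"), ("looked", "I-EDU"), ("around", "I-EDU"),
]
segments = group_segments(tagged)
print(segments)  # [['The', 'cat', 'sat'], ['It', 'looked', 'around']]
```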
Limitations
- Trained and evaluated only on English data (~40,000 situation-entity segments).
- Performance may vary on out-of-domain text.
- Inputs longer than 512 sub-tokens must be chunked before inference, although regular sentences are usually well below this limit.
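For the rare over-long input, chunking could look like the sketch below. It is an illustration under assumptions, not part of the released code: `chunk_words` and its parameters are hypothetical, the budget of 510 leaves room for the two special tokens XLM-RoBERTa adds, and the sub-token counter is passed in as a function so the example runs without downloading a tokenizer.

```python
from typing import Callable, List


def chunk_words(
    words: List[str],
    count_subtokens: Callable[[str], int],
    max_subtokens: int = 510,
) -> List[List[str]]:
    """Greedily split `words` into chunks whose sub-token count stays within the budget."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for word in words:
        n = count_subtokens(word)
        # Close the current chunk if adding this word would exceed the budget.
        if current and used + n > max_subtokens:
            chunks.append(current)
            current, used = [], 0
        current.append(word)
        used += n
    if current:
        chunks.append(current)
    return chunks


# Toy counter for illustration: one "sub-token" per character.
# With a real tokenizer one might pass: lambda w: len(tokenizer.tokenize(w))
# (word-by-word counts can differ slightly from tokenizing the whole sentence).
words = ["alpha", "beta", "gamma", "delta", "pi"]
chunks = chunk_words(words, count_subtokens=len, max_subtokens=10)
print(chunks)  # [['alpha', 'beta'], ['gamma', 'delta'], ['pi']]
```

A chunk boundary chosen this way can fall in the middle of a segment, so predictions near chunk edges should be treated with some caution.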
Acknowledgement
We gratefully acknowledge the scientific support and HPC resources provided by the Erlangen National High Performance Computing Center (NHR@FAU) of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) under the NHR project v110ee. NHR funding is provided by federal and Bavarian state authorities. NHR@FAU hardware is partially funded by the German Research Foundation (DFG) – 440719683.
Citation
Please cite our paper when using the model:
@inproceedings{schmueck-etal-2026,
title = "Cross-Linguistic Situation Entity Segmentation for Discourse Analysis in Diachronic English and German Text",
author = "we will update when published :)"
}
Please also cite the original annotation paper:
@inproceedings{friedrich-etal-2016-situation,
title = "Situation entity types: automatic classification of clause-level aspect",
author = "Friedrich, Annemarie and
Palmer, Alexis and
Pinkal, Manfred",
editor = "Erk, Katrin and
Smith, Noah A.",
booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = aug,
year = "2016",
address = "Berlin, Germany",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P16-1166/",
doi = "10.18653/v1/P16-1166",
pages = "1757--1768"
}