# emotion_urgency_classifier
A fine-tuned DeBERTa-v3-base model with two independent classification heads that simultaneously predict the urgency level and the emotion level of UK telecoms customer complaints.
Each head outputs one of three classes: Low, Medium, or High.
## Model Description
| Property | Value |
|---|---|
| Base model | microsoft/deberta-v3-base |
| Task | Dual-head text classification |
| Language | English (UK) |
| Input | Customer complaint text |
| Urgency output | Low / Medium / High |
| Emotion output | Low / Medium / High |
## Architecture

The backbone is DeBERTa-v3-base. The `[CLS]` token representation feeds into two independent linear heads:

```
DeBERTa-v3-base
       │
     [CLS]
  ┌────┴────┐
urgency   emotion
 Linear    Linear
(768→3)   (768→3)
```
## Training Data
5,000 synthetic UK telecoms customer complaints generated with GPT-5-mini, covering a 3×3 urgency × emotion grid. Each complaint was assigned a specific scenario, writing style, customer profile, and complaint history before generation to ensure diversity and label consistency.
| Urgency | Count | Share |
|---|---|---|
| Low | 1,750 | 35% |
| Medium | 2,000 | 40% |
| High | 1,250 | 25% |
Emotion is distributed evenly within each urgency group (one third of each urgency count per emotion level, e.g. ~667 per emotion level within Medium urgency).
The data covers 20 complaint scenarios, including Complete Service Outage, Fraud & Scams, Poor Network Coverage, Hidden Fees & Charges, and Unfulfilled Fix Promises, among others.
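The urgency shares and the even emotion split imply a per-cell target for each of the nine grid cells. A minimal sketch of that arithmetic (the constant and helper names are illustrative, not taken from the generation code):

```python
# Per-cell targets for the 3x3 urgency x emotion grid, derived from the
# urgency shares in the table above and an even emotion split within each
# urgency group. Names here are illustrative, not from the project code.
TOTAL = 5000
URGENCY_SHARES = {"Low": 0.35, "Medium": 0.40, "High": 0.25}
EMOTIONS = ["Low", "Medium", "High"]

def cell_targets(total=TOTAL, shares=URGENCY_SHARES, emotions=EMOTIONS):
    grid = {}
    for urgency, share in shares.items():
        per_cell = round(total * share / len(emotions))
        for emotion in emotions:
            grid[(urgency, emotion)] = per_cell
    return grid

grid = cell_targets()
# e.g. grid[("Medium", "High")] == 667 and grid[("High", "Low")] == 417
```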
## Training Procedure
| Hyperparameter | Value |
|---|---|
| Max sequence length | 192 tokens |
| Batch size | 16 |
| Optimizer | AdamW |
| Learning rate | 2e-5 |
| LR schedule | Linear warmup (10%) + decay |
| Max epochs | 10 |
| Early stopping patience | 3 (on val urgency macro F1) |
| Urgency class weights | [1.0, 1.5, 1.2] — Low / Medium / High |
| Train / Val / Test split | 70 / 15 / 15 (stratified on urgency×emotion cell) |
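The class weights above plug into a weighted cross-entropy on the urgency head. A plausible sketch of the dual-head objective, assuming the two losses are summed with equal weight (the card does not state the exact combination):

```python
import torch
import torch.nn as nn

# Weighted CE on the urgency head (weights from the table above), plain CE
# on the emotion head. Summing the two losses equally is an assumption.
urgency_weights = torch.tensor([1.0, 1.5, 1.2])  # Low / Medium / High
urgency_loss_fn = nn.CrossEntropyLoss(weight=urgency_weights)
emotion_loss_fn = nn.CrossEntropyLoss()

def dual_head_loss(urg_logits, emo_logits, urg_labels, emo_labels):
    return (urgency_loss_fn(urg_logits, urg_labels)
            + emotion_loss_fn(emo_logits, emo_labels))

# Dummy batch of two examples, three classes per head
urg_logits = torch.tensor([[2.0, 0.1, -1.0], [0.0, 1.5, 0.5]])
emo_logits = torch.tensor([[0.3, 0.3, 0.4], [1.0, -0.5, 0.0]])
loss = dual_head_loss(urg_logits, emo_logits,
                      torch.tensor([0, 1]), torch.tensor([2, 0]))
```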
## Evaluation Results

### Test set (held-out 15%)
| Head | Low F1 | Medium F1 | High F1 | Macro F1 |
|---|---|---|---|---|
| Urgency | 0.823 | 0.736 | 0.844 | 0.801 |
| Emotion | 0.891 | 0.825 | 0.841 | 0.852 |
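As a quick sanity check, the Macro F1 column is the unweighted mean of the three per-class F1 scores (what scikit-learn's `f1_score` with `average="macro"` computes from full predictions):

```python
# The Macro F1 column is the unweighted mean of the per-class F1 scores
# in the table above (Low / Medium / High order).
urgency_f1 = [0.823, 0.736, 0.844]
emotion_f1 = [0.891, 0.825, 0.841]

def macro(scores):
    return round(sum(scores) / len(scores), 3)

# macro(urgency_f1) == 0.801 and macro(emotion_f1) == 0.852, matching the table
```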
### Adversarial test (10 hand-crafted edge cases)
6 of 10 cases passed. The model correctly handles urgency/emotion decoupling, technical jargon, and cheerful-but-urgent complaints. Known failure cases: sarcasm (predicted as neutral emotion) and cold, formal legal language (predicted as Low emotion despite high emotional intensity).
## How to Use

### Load and run inference
```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel, AutoConfig


class DeBERTaMultiHead(nn.Module):
    def __init__(self, model_dir, num_classes=3):
        super().__init__()
        config = AutoConfig.from_pretrained(model_dir)
        self.backbone = AutoModel.from_config(config)
        hidden_size = config.hidden_size
        self.urgency_head = nn.Linear(hidden_size, num_classes)
        self.emotion_head = nn.Linear(hidden_size, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        cls = out.last_hidden_state[:, 0, :]  # [CLS] representation
        return self.urgency_head(cls), self.emotion_head(cls)


LABEL_NAMES = ["Low", "Medium", "High"]
MODEL_DIR = "path/to/model_output"  # or snapshot_download("yuansheng-tao/emotion_urgency_classifier")

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = DeBERTaMultiHead(MODEL_DIR)
model.load_state_dict(torch.load(f"{MODEL_DIR}/model_weights.pt", map_location="cpu"))
model.eval()

text = "My broadband has been down for three days and nobody is responding to my calls."
enc = tokenizer(text, max_length=192, padding="max_length",
                truncation=True, return_tensors="pt")
with torch.no_grad():
    urg_logits, emo_logits = model(**enc)

print("Urgency:", LABEL_NAMES[urg_logits.argmax().item()])
print("Emotion:", LABEL_NAMES[emo_logits.argmax().item()])
```
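If you need per-class probabilities rather than hard labels, each head's logits can be passed through a softmax. A small extension of the snippet above (the helper is illustrative, not part of the released code):

```python
import torch

LABEL_NAMES = ["Low", "Medium", "High"]

def to_prediction(logits):
    """Return (label, probability) for a single example's head logits."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(probs.argmax())
    return LABEL_NAMES[idx], probs[idx].item()

# Dummy logits for illustration; in practice pass urg_logits or emo_logits
label, confidence = to_prediction(torch.tensor([[0.2, 1.5, 3.1]]))
# label == "High"
```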
### Download via script (from the project repo)

```bash
python model_training/download_model.py
```
## Limitations
- Trained on synthetic data — may not fully generalise to real customer complaints, particularly for nuanced emotional expression.
- Sarcasm is not reliably detected: sarcastic praise is predicted as neutral or Low emotion.
- Cold formal language (e.g. legal threats) is consistently predicted as Low emotion despite High emotional intent.
- Domain-specific to UK telecoms — performance on other industries is untested.