emotion_urgency_classifier

A fine-tuned DeBERTa-v3-base model with two independent classification heads that simultaneously predict the urgency and emotion levels of UK telecoms customer complaints.

Each head outputs one of three classes: Low, Medium, or High.


Model Description

| Property | Value |
|---|---|
| Base model | microsoft/deberta-v3-base |
| Task | Dual-head text classification |
| Language | English (UK) |
| Input | Customer complaint text |
| Urgency output | Low / Medium / High |
| Emotion output | Low / Medium / High |

Architecture

The backbone is DeBERTa-v3-base. The [CLS] token representation feeds into two independent linear heads:

DeBERTa-v3-base
      │
    [CLS]
  ┌───┴───┐
urgency   emotion
Linear    Linear
(768→3)   (768→3)
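
The two heads are plain linear projections of the 768-dimensional [CLS] vector. A minimal sketch of this part of the architecture, using random tensors in place of the real backbone output:

```python
import torch
import torch.nn as nn

# Stand-in for the [CLS] representations of a batch of 2 complaints
# (in the real model these come from DeBERTa-v3-base, hidden size 768).
cls_repr = torch.randn(2, 768)

urgency_head = nn.Linear(768, 3)  # logits for Low / Medium / High urgency
emotion_head = nn.Linear(768, 3)  # logits for Low / Medium / High emotion

urg_logits = urgency_head(cls_repr)
emo_logits = emotion_head(cls_repr)
print(urg_logits.shape, emo_logits.shape)  # torch.Size([2, 3]) twice
```

Because the heads share the backbone but not their weights, the two predictions are decoupled: a complaint can score High urgency and Low emotion at the same time.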

Training Data

5,000 synthetic UK telecoms customer complaints generated with GPT-5-mini, covering a 3×3 urgency × emotion grid. Each complaint was assigned a specific scenario, writing style, customer profile, and complaint history before generation to ensure diversity and label consistency.

| Urgency | Count | Share |
|---|---|---|
| Low | 1,750 | 35% |
| Medium | 2,000 | 40% |
| High | 1,250 | 25% |

Emotion is distributed evenly within each urgency group (one third of each group's complaints at each emotion level).

The dataset covers 20 complaint scenarios, including Complete Service Outage, Fraud & Scams, Poor Network Coverage, Hidden Fees & Charges, and Unfulfilled Fix Promises.
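
Since the train/val/test split is stratified on the urgency × emotion cell, each example can be mapped to one of nine grid cells. A small illustrative sketch (the `cell_id` helper is hypothetical, not part of the released code):

```python
LEVELS = ["Low", "Medium", "High"]

def cell_id(urgency: str, emotion: str) -> int:
    """Map an (urgency, emotion) pair to one of the 9 grid cells (0..8)."""
    return LEVELS.index(urgency) * 3 + LEVELS.index(emotion)

# All nine cells of the 3x3 grid get distinct IDs:
cells = {cell_id(u, e) for u in LEVELS for e in LEVELS}
print(sorted(cells))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Stratifying on this combined label keeps every cell of the 3×3 grid represented in the same proportions across train, validation, and test.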


Training Procedure

| Hyperparameter | Value |
|---|---|
| Max sequence length | 192 tokens |
| Batch size | 16 |
| Optimizer | AdamW |
| Learning rate | 2e-5 |
| LR schedule | Linear warmup (10%) + decay |
| Max epochs | 10 |
| Early stopping patience | 3 (on validation urgency macro F1) |
| Urgency class weights | [1.0, 1.5, 1.2] (Low / Medium / High) |
| Train / Val / Test split | 70 / 15 / 15 (stratified on urgency × emotion cell) |
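
One way the urgency class weights can enter training is through a weighted cross-entropy on the urgency head, summed with an unweighted emotion loss. The equal-sum combination below is an assumption for illustration, not confirmed by this card:

```python
import torch
import torch.nn as nn

# Urgency class weights from the table above (Low / Medium / High);
# the emotion head is left unweighted.
urgency_ce = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 1.5, 1.2]))
emotion_ce = nn.CrossEntropyLoss()

# Dummy logits and labels for a batch of 4 complaints.
urg_logits = torch.randn(4, 3, requires_grad=True)
emo_logits = torch.randn(4, 3, requires_grad=True)
urg_labels = torch.tensor([0, 1, 2, 1])
emo_labels = torch.tensor([2, 0, 1, 1])

# Both heads contribute to a single scalar loss, so one backward pass
# updates both heads and the shared backbone.
loss = urgency_ce(urg_logits, urg_labels) + emotion_ce(emo_logits, emo_labels)
loss.backward()
```

Up-weighting the Medium class (1.5) pushes the model to pay more attention to the hardest urgency boundary, consistent with Medium having the lowest per-class F1 in the results below.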

Evaluation Results

Test set (held-out 15%)

| Head | Low F1 | Medium F1 | High F1 | Macro F1 |
|---|---|---|---|---|
| Urgency | 0.823 | 0.736 | 0.844 | 0.801 |
| Emotion | 0.891 | 0.825 | 0.841 | 0.852 |
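
Macro F1, as reported above, is the unweighted mean of the three per-class F1 scores. A self-contained sketch with made-up predictions (not the actual test set):

```python
def per_class_f1(y_true, y_pred, cls):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# 0 = Low, 1 = Medium, 2 = High (toy labels, not real results)
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 1, 2, 1, 0]
macro_f1 = sum(per_class_f1(y_true, y_pred, c) for c in range(3)) / 3
print(round(macro_f1, 3))  # 0.822
```

Because each class contributes equally regardless of its share of the data, macro F1 penalises weak performance on any single class, such as the Medium urgency class here.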

Adversarial test (10 hand-crafted edge cases)

6 / 10 passed. The model correctly handles urgency/emotion decoupling, technical jargon, and cheerful-but-urgent complaints. Known failure modes: sarcasm (classified as neutral emotion) and cold, formal legal language (classified as Low emotion despite high emotional intent).


How to Use

Load and run inference

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel, AutoConfig

class DeBERTaMultiHead(nn.Module):
    def __init__(self, model_dir, num_classes=3):
        super().__init__()
        config = AutoConfig.from_pretrained(model_dir)
        # Build the backbone from config only; the fine-tuned weights
        # (backbone + both heads) are loaded below from model_weights.pt.
        self.backbone = AutoModel.from_config(config)
        hidden_size = config.hidden_size
        self.urgency_head = nn.Linear(hidden_size, num_classes)
        self.emotion_head = nn.Linear(hidden_size, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        cls = out.last_hidden_state[:, 0, :]  # [CLS] token representation
        return self.urgency_head(cls), self.emotion_head(cls)

LABEL_NAMES = ["Low", "Medium", "High"]
MODEL_DIR = "path/to/model_output"   # or snapshot_download("yuansheng-tao/emotion_urgency_classifier")

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = DeBERTaMultiHead(MODEL_DIR)
model.load_state_dict(torch.load(f"{MODEL_DIR}/model_weights.pt", map_location="cpu"))
model.eval()

text = "My broadband has been down for three days and nobody is responding to my calls."

enc = tokenizer(text, max_length=192, padding="max_length",
                truncation=True, return_tensors="pt")
with torch.no_grad():
    urg_logits, emo_logits = model(**enc)

print("Urgency:", LABEL_NAMES[urg_logits.argmax().item()])
print("Emotion:", LABEL_NAMES[emo_logits.argmax().item()])
```
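
To attach confidence scores to the predictions, the logits can be passed through a softmax. A pure-Python sketch with hypothetical logit values (the real logits come from the model above):

```python
import math

LABEL_NAMES = ["Low", "Medium", "High"]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical urgency logits for one complaint
probs = softmax([0.2, 1.5, 3.1])
label = LABEL_NAMES[probs.index(max(probs))]
print(label)  # High
```

Reporting the probability alongside the label makes it easy to route low-confidence complaints to a human reviewer.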

Download via script (from the project repo)

```shell
python model_training/download_model.py
```

Limitations

  • Trained on synthetic data — may not fully generalise to real customer complaints, particularly for nuanced emotional expression.
  • Sarcasm is not reliably detected: sarcastic praise is predicted as neutral or Low emotion.
  • Cold formal language (e.g. legal threats) is consistently predicted as Low emotion despite High emotional intent.
  • Domain-specific to UK telecoms — performance on other industries is untested.