🧬 vaccineStance-flan-t5-large

A fine-tuned FLAN-T5 Large model for stance classification of tweets related to COVID-19 vaccines. It outputs one of three stance categories:

in-favor
against
neutral-or-unclear

📌 Model Description

This model builds on FLAN-T5-Large using two-stage fine-tuning:

Sentiment pre-fine-tuning on cardiffnlp/tweet_eval to teach the model sentiment polarity.
Stance-specific fine-tuning on a curated COVID-19 stance dataset (annotated .csv), augmented for class balance and stratified across splits.

Instruction tuning and prompt-based generation were retained from the original FLAN-T5 formulation.
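Because both stages keep the text-to-text formulation, each labeled tweet has to become an input string and a target string. The following is a minimal sketch of that conversion for the sentiment stage, not the author's training code. The label ids are those of the tweet_eval sentiment configuration; the helper name `to_text_pair` and the short prompt template are hypothetical, since the card does not state the template used in training.

```python
# Label ids of the tweet_eval "sentiment" configuration.
SENTIMENT_LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def to_text_pair(tweet: str, label_id: int) -> dict:
    """Turn one labeled tweet into a text-to-text training example."""
    # Hypothetical template: the card does not state the one used in training.
    source = f'Tweet: "{tweet.strip()}"\nSentiment:'
    return {"input_text": source, "target_text": SENTIMENT_LABELS[label_id]}


# Made-up tweets for illustration only.
examples = [("Loving the sunshine today", 2), ("Worst commute ever", 0)]
pairs = [to_text_pair(tweet, label) for tweet, label in examples]
print(pairs[0])
```

The same shape would plausibly carry over to the stance stage, with the stance annotations expressed as target words.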

🧪 Evaluation Results

Metric	Score
Macro F1	0.93
Micro F1	0.94
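
For readers unfamiliar with the two metrics: macro F1 averages the per-class F1 scores, while micro F1 pools true positives, false positives, and false negatives over all classes first. The sketch below computes both in plain Python on toy labels. The function name `f1_scores` is hypothetical, and the toy data is not the model's output.

```python
from collections import Counter

LABELS = ["in-favor", "against", "neutral-or-unclear"]


def f1_scores(y_true, y_pred, labels=LABELS):
    """Return (macro_f1, micro_f1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Toy labels for illustration only.
y_true = ["in-favor", "in-favor", "against", "neutral-or-unclear"]
y_pred = ["in-favor", "against", "against", "neutral-or-unclear"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

A gap between the two scores usually indicates that performance differs across classes, since macro F1 weights every class equally regardless of its size.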

Evaluation was conducted on a 5,732-tweet dataset split 80:10:10 (train:test:val). The model showed consistent generalization and balanced performance across all splits.
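
A stratified 80:10:10 split can be sketched as follows. This is an illustration under assumptions, not the procedure used for this model: the function name `stratified_split`, the seed, and the synthetic rows are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split rows into (train, test, val), keeping label proportions similar."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    rng = random.Random(seed)
    train, test, val = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_train = int(len(group) * ratios[0])
        n_test = int(len(group) * ratios[1])
        train += group[:n_train]
        test += group[n_train:n_train + n_test]
        val += group[n_train + n_test:]  # the remainder goes to validation
    return train, test, val


# Synthetic rows for illustration; not the actual dataset.
rows = [
    (f"tweet {i}", label)
    for label in ("in-favor", "against", "neutral-or-unclear")
    for i in range(100)
]
train, test, val = stratified_split(rows, label_of=lambda row: row[1])
print(len(train), len(test), len(val))
```

Splitting within each label group is what keeps the class proportions roughly equal across the three partitions.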

🧠 Intended Use

Research in public health NLP and LLM alignment
Automated stance detection in social media monitoring systems
Baseline for multi-agent LLM stance alignment studies

📥 How to Use

from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("DopplerEffect/vaccineStance-flan-t5-large")
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")

prompt = '''
You are a sentiment analyst tasked with understanding public opinion about COVID-19 on Twitter. Your job is to classify the sentiment of each tweet as one of the following categories:

- in-favor: The tweet expresses positive support or agreement regarding COVID-19 policies, vaccines, or public health advice.
- against: The tweet expresses opposition, criticism, or distrust of COVID-19-related efforts.
- neutral-or-unclear: The tweet neither clearly supports nor opposes, or the sentiment is ambiguous.

Tweet: "Vaccines saved so many lives!"
Sentiment:
'''
# The model generates a sentiment word; map it to the stance label.
outputMap = {
    "positive": "in-favor",
    "negative": "against",
    "neutral": "neutral-or-unclear",
}
inputIds = tokenizer(prompt, return_tensors="pt").input_ids  # the tweet is embedded in the prompt
output = model.generate(inputIds)
prediction = outputMap.get(tokenizer.decode(output[0], skip_special_tokens=True).strip())

print(prediction)  # Output: in-favor
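
To classify more than one tweet, the prompt construction and the label mapping can be factored into small helpers. The sketch below reuses the prompt text and mapping from the example above. The helper names `build_prompt` and `to_stance` are hypothetical, and the quote replacement and lowercasing are defensive assumptions rather than documented behavior of the model. Model loading and generation are omitted here.

```python
from typing import Optional

# Generated sentiment word -> stance label, as in the example above.
OUTPUT_MAP = {
    "positive": "in-favor",
    "negative": "against",
    "neutral": "neutral-or-unclear",
}

PROMPT_TEMPLATE = '''
You are a sentiment analyst tasked with understanding public opinion about COVID-19 on Twitter. Your job is to classify the sentiment of each tweet as one of the following categories:

- in-favor: The tweet expresses positive support or agreement regarding COVID-19 policies, vaccines, or public health advice.
- against: The tweet expresses opposition, criticism, or distrust of COVID-19-related efforts.
- neutral-or-unclear: The tweet neither clearly supports nor opposes, or the sentiment is ambiguous.

Tweet: "{tweet}"
Sentiment:
'''


def build_prompt(tweet: str) -> str:
    """Insert one tweet into the instruction prompt."""
    # Assumption: swap double quotes so the tweet stays inside the quoted field.
    return PROMPT_TEMPLATE.format(tweet=tweet.replace('"', "'"))


def to_stance(generated: str) -> Optional[str]:
    """Map decoded model text to a stance label, or None if it is unexpected."""
    return OUTPUT_MAP.get(generated.strip().lower())


prompt = build_prompt("Vaccines saved so many lives!")
print(to_stance(" Positive "))
```

Returning None for unexpected text lets the caller decide how to handle generations that fall outside the three expected words.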

Model size: 0.8B params (Safetensors, tensor type F32)

Evaluation results (self-reported, COVID-19 Vaccine Tweet Dataset (Custom))

Macro F1	0.930
Micro F1	0.940