Paper: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (arXiv:2201.12086)
This model is a version of bert-base-uncased fine-tuned on the Fakeddit fake news detection dataset.
It combines post text with image captions generated by Salesforce/blip-image-captioning-base, rather than using raw image features.
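As a minimal sketch of the caption-generation step (assuming the standard `transformers` BLIP API; the image path is a placeholder, not part of the actual training pipeline):

```python
# Minimal caption-generation sketch: produce a BLIP caption for a post image.
# Assumes the transformers and Pillow libraries; the image path is a placeholder.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
caption_model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("post_image.jpg").convert("RGB")  # placeholder image path
inputs = processor(images=image, return_tensors="pt")
output_ids = caption_model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```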
Each input is formatted for BERT as `[CLS] post text, BLIP image caption [SEP]`.

| Approach | Accuracy | Macro F1-Score |
|---|---|---|
| Text + Caption | 0.87 | 0.83 |
β‘οΈ Using captions instead of raw image features leads to state-of-the-art performance on Fakeddit, with simpler input and no vision backbone needed during inference.
This model builds on the following works:
- Base model: google-bert/bert-base-uncased
- Caption generator: Salesforce/blip-image-captioning-base