# Shifted MNIST CNN Model

## Model Description
This is a Convolutional Neural Network (CNN) trained on the MNIST dataset with shifted labels. Instead of predicting the digit itself, the model learns to predict a shifted label given by the rule `original_label → 9 - original_label`.
## Label Mapping
| Original Digit | Shifted Label |
|---|---|
| 0 | 9 |
| 1 | 8 |
| 2 | 7 |
| 3 | 6 |
| 4 | 5 |
| 5 | 4 |
| 6 | 3 |
| 7 | 2 |
| 8 | 1 |
| 9 | 0 |
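For reference, here is a minimal sketch of how shifted targets could be produced from standard MNIST via torchvision's `target_transform`. The use of torchvision and the normalization constants are assumptions for illustration; this card does not state the exact preprocessing used during training.

```python
from torchvision import datasets, transforms

# Standard MNIST, but every target y is replaced by its shifted label 9 - y.
# The normalization values below are the usual MNIST statistics (an assumption,
# not something stated in this card).
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

train_set = datasets.MNIST(
    root="./data",
    train=True,
    download=True,
    transform=transform,
    target_transform=lambda y: 9 - y,  # apply the label shift
)
```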
## Model Architecture

```
Conv2D(32) -> BatchNorm -> ReLU -> MaxPool
Conv2D(64) -> BatchNorm -> ReLU -> MaxPool
Conv2D(128) -> BatchNorm -> ReLU -> MaxPool
Flatten -> FC(512) -> Dropout -> FC(256) -> Dropout -> FC(10)
```
## Performance
- Test Accuracy: 98.96%
- Test Loss: 0.0341
- Parameters: 817,354
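The parameter count above can be checked directly against the `CNNModel` class shown in the Usage section below (a quick sanity-check sketch, not part of the original training code):

```python
# Requires the CNNModel class from the Usage section below.
model = CNNModel()
total_params = sum(p.numel() for p in model.parameters())
print(total_params)  # expected: 817354
```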
## Training Configuration
- Epochs: 5
- Learning Rate: 0.001
- Batch Size: 64
- Optimizer: Adam
- Weight Decay: 0.0001
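Below is a minimal training-loop sketch matching this configuration. It assumes the `CNNModel` class from the Usage section below and the `train_set` with shifted targets sketched under Label Mapping; it is illustrative and not the exact training script used for this model.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CNNModel().to(device)

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)

for epoch in range(5):
    model.train()
    for images, targets in train_loader:  # targets are already shifted (9 - digit)
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```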
## Usage
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from huggingface_hub import hf_hub_download

# Define the model class (same as training)
class CNNModel(nn.Module):
    def __init__(self, num_classes=10, dropout_rate=0.5):
        super(CNNModel, self).__init__()
        # First convolutional block
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        # Second convolutional block
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # Third convolutional block
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 28x28 input -> 14 -> 7 -> 3 after three 2x2 max pools
        self.flattened_size = 128 * 3 * 3
        # Fully connected layers with dropout
        self.fc1 = nn.Linear(self.flattened_size, 512)
        self.dropout1 = nn.Dropout(dropout_rate)
        self.fc2 = nn.Linear(512, 256)
        self.dropout2 = nn.Dropout(dropout_rate)
        self.fc3 = nn.Linear(256, num_classes)

    def forward(self, x):
        # First conv block
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.pool1(x)
        # Second conv block
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.relu(x)
        x = self.pool2(x)
        # Third conv block
        x = self.conv3(x)
        x = self.bn3(x)
        x = F.relu(x)
        x = self.pool3(x)
        # Flatten for FC layers
        x = x.view(x.size(0), -1)
        # Fully connected layers with dropout
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)
        return x

# Load the model
model = CNNModel()
model_path = hf_hub_download(repo_id="FinOS-Internship/shifted-mnist-cnn", filename="pytorch_model.bin")
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

# Make predictions (remember: outputs are shifted labels!)
# To convert back to original digits: original = 9 - shifted_prediction
```
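A short inference sketch follows. The random tensor only stands in for a preprocessed 28x28 grayscale image; with real data, apply the same normalization used during training.

```python
# Dummy batch standing in for one normalized 28x28 grayscale image.
x = torch.randn(1, 1, 28, 28)

with torch.no_grad():
    logits = model(x)
    shifted_pred = logits.argmax(dim=1).item()  # the model outputs shifted labels
    original_digit = 9 - shifted_pred           # undo the label shift

print(f"shifted prediction: {shifted_pred}, original digit: {original_digit}")
```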
## Citation

```bibtex
@misc{shifted-mnist-cnn-2025,
  title={Shifted MNIST CNN Model},
  author={Phúc Nguyễn},
  year={2025}
}
```