# Shifted MNIST CNN Model

## Model Description
This is a Convolutional Neural Network (CNN) trained on the MNIST dataset with shifted labels. Instead of predicting the digit itself, the model learns to predict a shifted label given by the rule `original_label → 9 - original_label`.
## Label Mapping
| Original Digit | Shifted Label |
|---|---|
| 0 | 9 |
| 1 | 8 |
| 2 | 7 |
| 3 | 6 |
| 4 | 5 |
| 5 | 4 |
| 6 | 3 |
| 7 | 2 |
| 8 | 1 |
| 9 | 0 |
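For reference, here is a minimal sketch of how shifted targets could be produced from standard MNIST via torchvision's `target_transform`. The use of torchvision and the normalization constants are assumptions for illustration; this card does not state the exact preprocessing used during training.

```python
from torchvision import datasets, transforms

# Standard MNIST, but every target y is replaced by its shifted label 9 - y.
# The normalization values below are the usual MNIST statistics (an assumption,
# not something stated in this card).
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

train_set = datasets.MNIST(
    root="./data",
    train=True,
    download=True,
    transform=transform,
    target_transform=lambda y: 9 - y,  # apply the label shift
)
```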
## Model Architecture

```
Conv2D(32) -> BatchNorm -> ReLU -> MaxPool
Conv2D(64) -> BatchNorm -> ReLU -> MaxPool
Conv2D(128) -> BatchNorm -> ReLU -> MaxPool
Flatten -> FC(512) -> Dropout -> FC(256) -> Dropout -> FC(10)
```
## Performance
- Test Accuracy: 98.96%
- Test Loss: 0.0341
- Parameters: 817,354
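The parameter count above can be checked directly against the `CNNModel` class shown in the Usage section below (a quick sanity-check sketch, not part of the original training code):

```python
# Requires the CNNModel class from the Usage section below.
model = CNNModel()
total_params = sum(p.numel() for p in model.parameters())
print(total_params)  # expected: 817354
```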
## Training Configuration
- Epochs: 5
- Learning Rate: 0.001
- Batch Size: 64
- Optimizer: Adam
- Weight Decay: 0.0001
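Below is a minimal training-loop sketch matching this configuration. It assumes the `CNNModel` class from the Usage section below and the `train_set` with shifted targets sketched under Label Mapping; it is illustrative and not the exact training script used for this model.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CNNModel().to(device)

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)

for epoch in range(5):
    model.train()
    for images, targets in train_loader:  # targets are already shifted (9 - digit)
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```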
## Usage
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from huggingface_hub import hf_hub_download

# Define the model class (same as training)
class CNNModel(nn.Module):
    def __init__(self, num_classes=10, dropout_rate=0.5):
        super(CNNModel, self).__init__()
        # First convolutional block
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        # Second convolutional block
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # Third convolutional block
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 28x28 input -> 14 -> 7 -> 3 after three 2x2 max pools
        self.flattened_size = 128 * 3 * 3
        # Fully connected layers with dropout
        self.fc1 = nn.Linear(self.flattened_size, 512)
        self.dropout1 = nn.Dropout(dropout_rate)
        self.fc2 = nn.Linear(512, 256)
        self.dropout2 = nn.Dropout(dropout_rate)
        self.fc3 = nn.Linear(256, num_classes)

    def forward(self, x):
        # First conv block
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.pool1(x)
        # Second conv block
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.relu(x)
        x = self.pool2(x)
        # Third conv block
        x = self.conv3(x)
        x = self.bn3(x)
        x = F.relu(x)
        x = self.pool3(x)
        # Flatten for FC layers
        x = x.view(x.size(0), -1)
        # Fully connected layers with dropout
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)
        return x

# Load the model
model = CNNModel()
model_path = hf_hub_download(repo_id="FinOS-Internship/shifted-mnist-cnn", filename="pytorch_model.bin")
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

# Make predictions (remember: outputs are shifted labels!)
# To convert back to original digits: original = 9 - shifted_prediction
```
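A short inference sketch follows. The random tensor only stands in for a preprocessed 28x28 grayscale image; with real data, apply the same normalization used during training.

```python
# Dummy batch standing in for one normalized 28x28 grayscale image.
x = torch.randn(1, 1, 28, 28)

with torch.no_grad():
    logits = model(x)
    shifted_pred = logits.argmax(dim=1).item()  # the model outputs shifted labels
    original_digit = 9 - shifted_pred           # undo the label shift

print(f"shifted prediction: {shifted_pred}, original digit: {original_digit}")
```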
## Citation

```bibtex
@misc{shifted-mnist-cnn-2025,
  title={Shifted MNIST CNN Model},
  author={Phúc Nguyễn},
  year={2025}
}
```