Shifted MNIST CNN Model

Model Description

This is a Convolutional Neural Network (CNN) trained on the MNIST dataset with shifted labels. The model learns to map each digit image to its shifted label according to the rule: original_label → (9 - original_label).

Label Mapping

Original Digit    Shifted Label
0                 9
1                 8
2                 7
3                 6
4                 5
5                 4
6                 3
7                 2
8                 1
9                 0
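
Because the mapping is its own inverse (9 - (9 - x) = x), the same rule converts shifted predictions back to original digits. A minimal sketch of the relabeling, assuming standard integer MNIST targets:

import torch

def shift_labels(labels: torch.Tensor) -> torch.Tensor:
    # Apply the shift rule: original_label -> 9 - original_label.
    # Applying it twice recovers the original digit.
    return 9 - labels

print(shift_labels(torch.tensor([0, 1, 2, 9])))  # tensor([9, 8, 7, 0])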

Model Architecture

Conv2D(32) -> BatchNorm -> ReLU -> MaxPool
Conv2D(64) -> BatchNorm -> ReLU -> MaxPool  
Conv2D(128) -> BatchNorm -> ReLU -> MaxPool
Flatten -> FC(512) -> Dropout -> FC(256) -> Dropout -> FC(10)
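
With 28×28 MNIST inputs, each 2×2 max-pool halves the spatial size (with floor division), so the feature maps shrink 28 → 14 → 7 → 3 and the flatten step yields 128 × 3 × 3 = 1152 features. A quick shape check (a sketch that omits the batch-norm layers, which do not change shapes):

import torch
import torch.nn as nn

# Trace the spatial size of a 28x28 input through the three conv/pool blocks.
blocks = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 7 -> 3
)
print(blocks(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 128, 3, 3])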

Performance

  • Test Accuracy: 98.96%
  • Test Loss: 0.0341
  • Parameters: 817,354
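
The parameter count can be verified directly from the architecture, assuming the CNNModel class defined in the Usage section below:

model = CNNModel()
print(sum(p.numel() for p in model.parameters()))  # 817354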

Training Configuration

  • Epochs: 5
  • Learning Rate: 0.001
  • Batch Size: 64
  • Optimizer: Adam
  • Weight Decay: 0.0001
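
The original training script is not included in this card; the sketch below shows one way to reproduce the configuration above. The torchvision normalization constants and the use of CrossEntropyLoss are assumptions, and CNNModel is the class defined in the Usage section below.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Hypothetical reproduction of the training setup; preprocessing is assumed.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # assumed MNIST normalization
])
train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

model = CNNModel()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)

model.train()
for epoch in range(5):
    for images, labels in train_loader:
        shifted = 9 - labels                      # apply the label shift
        optimizer.zero_grad()
        loss = criterion(model(images), shifted)
        loss.backward()
        optimizer.step()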

Usage

import torch
import torch.nn as nn
import torch.nn.functional as F
from huggingface_hub import hf_hub_download

# Define the model class (same as training)
class CNNModel(nn.Module):
    def __init__(self, num_classes=10, dropout_rate=0.5):
        super(CNNModel, self).__init__()

        # First convolutional block
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)

        # Second convolutional block
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)

        # Third convolutional block
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.flattened_size = 128 * 3 * 3

        # Fully connected layers with dropout
        self.fc1 = nn.Linear(self.flattened_size, 512)
        self.dropout1 = nn.Dropout(dropout_rate)
        self.fc2 = nn.Linear(512, 256)
        self.dropout2 = nn.Dropout(dropout_rate)
        self.fc3 = nn.Linear(256, num_classes)

    def forward(self, x):
        # First conv block
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.pool1(x)

        # Second conv block
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.relu(x)
        x = self.pool2(x)

        # Third conv block
        x = self.conv3(x)
        x = self.bn3(x)
        x = F.relu(x)
        x = self.pool3(x)

        # Flatten for FC layers
        x = x.view(x.size(0), -1)

        # Fully connected layers with dropout
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)

        return x

# Load the model
model = CNNModel()
model_path = hf_hub_download(repo_id="FinOS-Internship/shifted-mnist-cnn", filename="pytorch_model.bin")
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

# Make predictions (remember: outputs are shifted labels!)
# To convert back to original digits: original = 9 - shifted_prediction
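
# Example: classify one MNIST test image and map the shifted prediction back.
# The normalization below is an assumption; the exact preprocessing used in
# training is not documented in this card.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])
test_set = datasets.MNIST("./data", train=False, download=True, transform=transform)
image, original_digit = test_set[0]

with torch.no_grad():
    logits = model(image.unsqueeze(0))               # add batch dimension
    shifted_prediction = logits.argmax(dim=1).item()

print("shifted prediction:", shifted_prediction)
print("recovered digit:   ", 9 - shifted_prediction)
print("ground-truth digit:", original_digit)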

Citation

@misc{shifted-mnist-cnn-2025,
  title={Shifted MNIST CNN Model},
  author={Phúc Nguyễn},
  year={2025}
}