
MNIST handwritten digit recognition using a Convolutional Neural Network (CNN), achieving 0.9991 accuracy on the Kaggle Digit Recognizer competition, ranking 43rd.
The Digit Recognizer competition on Kaggle asks participants to classify handwritten digits from the MNIST dataset: 28x28 grayscale images of the digits 0-9. This project implements a deep Convolutional Neural Network that reaches 99.91% accuracy, placing 43rd on the competition leaderboard.
- Deep CNN Architecture: Multi-layer convolutional network with batch normalization and dropout
- Data Augmentation: Real-time image transformations for improved generalization
- Learning Rate Scheduling: Adaptive learning rate decay during training
- Visualization: Training curves, confusion matrix, and misclassified digit analysis
- Submission Pipeline: Clean pipeline for generating Kaggle-compatible predictions
Input: 28x28x1 (grayscale image)
|
+- Conv2D(32, 3x3) + ReLU
| Output: 26x26x32
|
+- Conv2D(32, 3x3) + ReLU
| Output: 24x24x32
|
+- MaxPooling2D(2x2)
| Output: 12x12x32
|
+- Dropout(0.25)
|
+- Conv2D(64, 3x3) + ReLU
| Output: 10x10x64
|
+- Conv2D(64, 3x3) + ReLU
| Output: 8x8x64
|
+- MaxPooling2D(2x2)
| Output: 4x4x64
|
+- Dropout(0.25)
|
+- Flatten()
| Output: 1024
|
+- Dense(256) + ReLU + Dropout(0.5)
| Output: 256
|
+- Dense(10) + Softmax
Output: 10 (digit classes 0-9)
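
The output sizes in the diagram follow from simple arithmetic, which the sketch below reproduces in plain Python. It assumes what the listed shapes imply but the diagram does not state outright: unpadded ("valid") 3x3 convolutions with stride 1, and non-overlapping 2x2 pooling. The helper names are illustrative only, and dropout layers are left out because they change neither shapes nor parameter counts.

```python
# Trace output shapes and parameter counts for the layer stack in the diagram.
# Assumption: 'valid' padding and stride 1 for every convolution, which is
# what the listed output sizes (28 -> 26 -> 24 ...) imply.

def conv2d(shape, filters, kernel=3):
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = (kernel * kernel * c + 1) * filters  # weights plus one bias per filter
    return out, params

def max_pool(shape, size=2):
    h, w, c = shape
    return (h // size, w // size, c), 0  # pooling has no trainable parameters

def dense(units_in, units_out):
    return units_out, (units_in + 1) * units_out  # weights plus biases

def trace_network(input_shape=(28, 28, 1)):
    stack = [("conv", 32), ("conv", 32), ("pool", None),
             ("conv", 64), ("conv", 64), ("pool", None)]
    shape, rows = input_shape, []
    for kind, filters in stack:
        if kind == "conv":
            shape, params = conv2d(shape, filters)
        else:
            shape, params = max_pool(shape)
        rows.append((kind, shape, params))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("flatten", (flat,), 0))
    units = flat
    for width in (256, 10):
        units, params = dense(units, width)
        rows.append(("dense", (units,), params))
    total = sum(p for _, _, p in rows)
    return rows, total

if __name__ == "__main__":
    rows, total = trace_network()
    for kind, shape, params in rows:
        print(f"{kind:8s} {str(shape):14s} {params:>8,d}")
    print("total trainable parameters:", total)
```

For the layers exactly as drawn, the total comes to 329,962 trainable parameters, of which 262,400 sit in the Dense(256) layer.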
| Component | Choice | Rationale |
|---|---|---|
| Convolution Kernels | 3x3 | Smallest kernel that captures spatial patterns; stacked for a larger receptive field |
| Activation | ReLU | Mitigates vanishing gradients; computationally efficient |
| Pooling | MaxPooling 2x2 | Spatial downsampling while preserving important features |
| Regularization | Dropout (0.25, 0.5) | Reduces overfitting on the relatively small MNIST dataset |
| Output | Softmax | Probability distribution over 10 digit classes |
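
To make the "stacked for a larger receptive field" rationale concrete, here is a small sketch that computes how many input pixels influence one output value. It uses the standard receptive-field recurrence; the function name and the `(kernel, stride)` encoding of layers are choices made for this example, not part of the project code.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, per axis) of a stack of layers.

    layers: iterable of (kernel_size, stride) pairs, ordered input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Two stacked 3x3 convolutions (stride 1) see the same area as one 5x5 kernel.
two_convs = [(3, 1), (3, 1)]

# Assumption: the full feature extractor is conv, conv, pool, conv, conv, pool
# with 3x3 convolutions (stride 1) and 2x2 pooling (stride 2).
full_stack = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]

print(receptive_field(two_convs))   # 5
print(receptive_field(full_stack))  # 16
```

Two 3x3 layers cover a 5x5 window with 18 weights per channel pair instead of 25, which is the usual argument for stacking small kernels.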
| Category | Technology |
|---|---|
| Language | Python 3.8+ |
| Deep Learning | TensorFlow / Keras |
| Data Processing | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Platform | Kaggle |
| Notebook | Jupyter |
- Competition: Digit Recognizer - MNIST
- Training Set: 42,000 labeled 28x28 grayscale images
- Test Set: 28,000 unlabeled images for prediction
- Classes: 10 digits (0-9), relatively balanced
- Image Format: Flattened 784-pixel arrays (28x28)
- Preprocessing: Pixel values normalized from [0, 255] to [0.0, 1.0]
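
The reshaping and normalization described above can be sketched with NumPy as follows. This is a minimal illustration, not the notebook's code: the function names are made up for the example, and random integers stand in for real rows of train.csv.

```python
import numpy as np

def preprocess_pixels(flat_pixels):
    """Turn an (n, 784) array of 0-255 values into (n, 28, 28, 1) floats in [0, 1]."""
    x = np.asarray(flat_pixels, dtype=np.float32)
    if x.ndim != 2 or x.shape[1] != 784:
        raise ValueError(f"expected shape (n, 784), got {x.shape}")
    return (x / 255.0).reshape(-1, 28, 28, 1)

def one_hot(labels, num_classes=10):
    """One-hot encode integer digit labels into an (n, num_classes) float array."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.eye(num_classes, dtype=np.float32)[labels]

# Synthetic stand-in for a few rows of train.csv (assumption: real data would
# be loaded with pandas and split into a label column and 784 pixel columns).
rng = np.random.default_rng(0)
fake_pixels = rng.integers(0, 256, size=(5, 784))
fake_labels = np.array([3, 0, 9, 1, 7])

x = preprocess_pixels(fake_pixels)
y = one_hot(fake_labels)
print(x.shape, y.shape)
```

With the real competition files, the label column of train.csv would go through `one_hot` and the remaining `pixel0` to `pixel783` columns through `preprocess_pixels`; test.csv has only the pixel columns.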
| Transformation | Range | Purpose |
|---|---|---|
| Rotation | +/-10 degrees | Handle writing angle variations |
| Width Shift | +/-10% | Handle horizontal position variations |
| Height Shift | +/-10% | Handle vertical position variations |
| Zoom | +/-10% | Handle size variations |
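
One way the ranges in this table might map onto Keras's `ImageDataGenerator` is sketched below. The four range values come from the table; the batch size of 64 and the helper name `augmented_batches` are assumptions made for the example. Note that Keras marks `ImageDataGenerator` as deprecated in favor of preprocessing layers, although it remains available through `tensorflow.keras`.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Ranges taken from the augmentation table; flips are deliberately left off
# because the table does not list them.
datagen = ImageDataGenerator(
    rotation_range=10,       # degrees, sampled from [-10, +10]
    width_shift_range=0.1,   # fraction of image width
    height_shift_range=0.1,  # fraction of image height
    zoom_range=0.1,          # zoom factor sampled from [0.9, 1.1]
)

def augmented_batches(x_train, y_train, batch_size=64):
    """Yield randomly transformed batches (batch_size=64 is an assumption)."""
    return datagen.flow(x_train, y_train, batch_size=batch_size)
```

The transforms are applied on the fly each time a batch is drawn, so every epoch sees slightly different versions of the same training images.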
- Data Loading: Read CSV files, separate labels from pixel data
- Reshaping: Convert 784-pixel arrays to 28x28x1 image tensors
- Normalization: Scale pixel values to [0, 1] range
- Label Encoding: One-hot encode digit labels (0-9)
- Data Augmentation: Configure ImageDataGenerator with transformations
- Model Building: Construct CNN architecture with Keras Sequential API
- Training: Fit with validation split, learning rate decay, and augmentation
- Evaluation: Analyze training/validation curves and confusion matrix
- Prediction: Generate predictions on test set
- Submission: Format and save Kaggle-compatible CSV
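
The model-building and training-setup steps above could look roughly like this with the Keras Sequential API. The layer stack mirrors the architecture diagram; everything else is an assumption labeled in the comments, since this README does not name the optimizer, the exact learning-rate schedule, or where batch normalization is placed (so it is omitted here).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model():
    """Build the CNN following the layer diagram (unpadded 3x3 convolutions)."""
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(10, activation="softmax"),
    ])
    # Assumption: Adam optimizer. Categorical cross-entropy matches the
    # one-hot encoded labels described in the pipeline.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Assumption: learning rate decay implemented as "halve on plateau";
# the factor, patience, and floor are illustrative values.
lr_decay = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_accuracy", factor=0.5, patience=3, min_lr=1e-5)

model = build_model()
model.summary()
```

Training would then be a call along the lines of `model.fit(train_batches, validation_data=(x_val, y_val), epochs=..., callbacks=[lr_decay])`, where `train_batches`, `x_val`, and `y_val` are hypothetical names and the epoch count is left open because it is not stated here.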
| Metric | Value |
|---|---|
| Kaggle Score | 0.9991 |
| Leaderboard Rank | 43 |
| Training Accuracy | >99.5% |
| Validation Accuracy | >99.3% |
| Architecture | 4 Conv + 2 Dense layers |
| Parameters | ~330K trainable (computed from the layer diagram above) |
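
A single accuracy figure hides which digits get confused with each other, which is what the confusion-matrix analysis is for. The sketch below builds one with NumPy on made-up labels; scikit-learn's `confusion_matrix` produces the same table, and the helper names here are only for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true digits, columns are predicted digits."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_accuracy(cm):
    """Fraction of each true digit that was predicted correctly."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Made-up labels for illustration: one '2' is misread as a '3'.
y_true = [0, 1, 2, 2, 9]
y_pred = [0, 1, 2, 3, 9]

cm = confusion_matrix(y_true, y_pred)
acc = per_class_accuracy(cm)
print(cm[2, 2], cm[2, 3], acc[2])
```

Off-diagonal cells with large counts point to the digit pairs worth inspecting among the misclassified examples.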
- Python 3.8 or higher
- Jupyter Notebook or JupyterLab
git clone https://github.com/nntrivi2001/Digit-recognizer---CNN.git
cd Digit-recognizer---CNN
pip install numpy pandas tensorflow keras matplotlib seaborn scikit-learn
# Download the competition data from Kaggle first
# https://www.kaggle.com/c/digit-recognizer/data
jupyter notebook "Digit Recognizer Competition/DigitRecognizer.ipynb"
# The notebook will:
# 1. Load and preprocess the data
# 2. Build the CNN model
# 3. Train with data augmentation
# 4. Generate predictions and submission file
Digit-recognizer---CNN/
|-- Digit Recognizer Competition/
| |-- DigitRecognizer.ipynb # Complete CNN implementation and training
|-- .gitignore
|-- .gitattributes
|-- README.md
- Download train.csv and test.csv from the Kaggle competition page
- Place files in the notebook directory
- Run all cells sequentially
- Training will take approximately 5-15 minutes depending on GPU availability
- The submission file will be generated automatically
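
For reference, the submission step amounts to taking the most probable class per test image and writing it in the competition's two-column format, `ImageId` (starting at 1) and `Label`. The sketch below shows this on fake softmax outputs; the function name and the in-memory buffer are choices made for the example rather than the notebook's actual code.

```python
import io

import numpy as np
import pandas as pd

def make_submission(probabilities):
    """Convert (n, 10) class probabilities into a Kaggle submission table."""
    labels = np.argmax(np.asarray(probabilities), axis=1)
    return pd.DataFrame({
        "ImageId": np.arange(1, len(labels) + 1),  # Kaggle ids are 1-based
        "Label": labels,
    })

# Fake model output for three test images, predicting the digits 7, 2 and 1.
fake_probs = np.eye(10)[[7, 2, 1]]

submission = make_submission(fake_probs)
buffer = io.StringIO()  # stand-in for a file on disk
submission.to_csv(buffer, index=False)
print(buffer.getvalue())
```

In practice the buffer would be replaced by a file path, for example a hypothetical `submission.csv`, and `fake_probs` by the output of `model.predict` on the preprocessed test set.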