Image Classification Model

A full-stack image classification application that pairs a PyTorch model served through a FastAPI backend with a responsive, drag-and-drop web frontend. Upload an image and get the top predictions with confidence scores.


Features

  • Fast & Lightweight: Built with FastAPI + PyTorch for optimal performance
  • Modern UI: Drag-and-drop interface with real-time preview
  • Smart Models: Supports both ImageNet (ResNet18) and CIFAR-10 models
  • Visual Results: Confidence bars and percentage scores for predictions
  • Auto-Detection: Frontend automatically detects the backend connection
  • Responsive Design: Works on desktop and mobile devices

Live Demo (Local)


Project Structure

Image Classification Model/
├─ 📂 backend/
│  ├─  main.py              # FastAPI application with routes
│  ├─  model_loader.py      # Model loading and inference wrapper
│  ├─  preprocessing.py     # Image preprocessing pipeline
│  └─  imagenet_classes.txt # ImageNet class labels
├─ 📂 frontend/
│  ├─  index.html           # Main user interface
│  ├─  styles.css           # Beautiful styling and animations
│  └─  app.js               # Frontend logic and API communication
├─ 📂 model/
│  ├─  train.py             # (Optional) model training script
│  ├─  export_model.py      # Export model to TorchScript
│  └─  ARCHITECTURE.md      # Model architecture documentation
├─ 📂 samples/                # Test images for quick testing
├─  requirements.txt        # Python dependencies
└─  README.md               # This documentation

Technology Stack

Backend

  • FastAPI: Modern, fast web framework for building APIs
  • PyTorch: Deep learning framework for model inference
  • Pillow: Image processing and manipulation
  • Uvicorn: ASGI server for FastAPI

Frontend

  • Vanilla JavaScript: Pure JS for maximum compatibility
  • CSS3: Modern styling with animations and transitions
  • HTML5: Semantic markup with accessibility features

Quick Setup Guide

Prerequisites

  • Python 3.8 or higher
  • PowerShell or Command Prompt
  • Modern web browser

Step 1: Create Virtual Environment

python -m venv venv
# PowerShell:
.\venv\Scripts\Activate.ps1
# Command Prompt: run venv\Scripts\activate.bat instead

Step 2: Install Dependencies

pip install -r requirements.txt

Step 3: Start Backend Server

cd backend
python -m uvicorn main:app --reload --host 0.0.0.0 --port 8001

Keep this terminal open - you should see: INFO: Uvicorn running on http://0.0.0.0:8001

Step 4: Start Frontend Server

# Open a NEW terminal
cd frontend
python -m http.server 3000

Keep this terminal open - you should see: Serving HTTP on :: port 3000

Step 5: Launch Application

Open your browser and navigate to:

http://localhost:3000

🔄 How It Works: End-to-End Flow

graph TD
    A[User uploads image] --> B[Frontend validates file]
    B --> C[Send to backend API]
    C --> D[Backend processes image]
    D --> E[Model inference]
    E --> F[Return predictions]
    F --> G[Display results with confidence]

Detailed Process:

  1. Image Upload: User drags & drops or selects an image file
  2. Validation: Frontend checks file type (JPG, PNG, WebP, BMP, GIF)
  3. API Request: Frontend sends POST /predict/image with image data
  4. Backend Processing:
    • Validates file extension and reads bytes
    • Converts to RGB format
    • Resizes and center-crops to 224×224 pixels
    • Normalizes using ImageNet statistics
    • Converts to PyTorch tensor
  5. Model Inference:
    • Loads pretrained ResNet18 (ImageNet) or custom TorchScript model
    • Runs forward pass to get logits
    • Applies softmax to get probabilities
    • Returns top-3 predictions with confidence scores
  6. Response: Backend returns JSON with predictions
  7. Visualization: Frontend displays results with animated confidence bars
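
The numeric parts of steps 4 and 5 can be sketched in plain Python. This is an illustration only, not the code in preprocessing.py or model_loader.py: the function names are hypothetical, and the real backend operates on PyTorch tensors rather than Python lists. The mean and standard deviation values are the standard ImageNet channel statistics.

```python
import math

# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def center_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=3):
    """Return the k most probable labels in the API's prediction format."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return [{"class": label, "confidence": round(p, 4)} for label, p in ranked[:k]]
```

Calling `top_k(softmax(logits), labels)` yields a list shaped like the `predictions` array in the API response.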

API Documentation

Health Check

GET /health

Response: {"status":"ok","service":"image-classification-api"}
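
A script can poll this endpoint before sending images. The sketch below uses only the standard library and assumes the backend runs at http://localhost:8001 as in the setup steps; `backend_is_healthy` and its `opener` parameter are hypothetical names introduced here, not part of the project.

```python
import json
from urllib.request import urlopen


def backend_is_healthy(base_url="http://localhost:8001", opener=urlopen, timeout=2.0):
    """Return True if GET /health answers with a JSON body whose status is "ok"."""
    try:
        with opener(f"{base_url}/health", timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a body that is not valid JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

The `opener` argument exists only so the function can be exercised without a running server; in normal use, call `backend_is_healthy()` with no arguments.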

Image Classification

POST /predict/image
Content-Type: multipart/form-data

Request: Form data with a file field containing the image

Response:

{
  "predictions": [
    { "class": "golden retriever", "confidence": 0.8921 },
    { "class": "Labrador retriever", "confidence": 0.0723 },
    { "class": "flat-coated retriever", "confidence": 0.0156 }
  ],
  "top_k": 3
}
---
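
For callers outside the browser, the request can be built with the Python standard library. This is a minimal sketch under a few assumptions: the backend listens on http://localhost:8001, the form field is named `file` as described above, and the default `image/jpeg` content type is a placeholder to adjust for other formats. `build_multipart` and `classify` are hypothetical helper names.

```python
import json
import os
import uuid
from urllib.request import Request, urlopen


def build_multipart(filename, data, content_type="image/jpeg", field="file"):
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def classify(path, base_url="http://localhost:8001"):
    """POST an image file to /predict/image and return the predictions list."""
    with open(path, "rb") as handle:
        body, header = build_multipart(os.path.basename(path), handle.read())
    request = Request(
        f"{base_url}/predict/image",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urlopen(request) as response:
        return json.loads(response.read())["predictions"]
```

With the backend running, `classify("samples/dog.jpg")` (a hypothetical file name) would return a list of `{"class": ..., "confidence": ...}` entries like the response shown above.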

## Supported Image Formats

- **JPEG/JPG** - Most common format
- **PNG** - Lossless compression
- **WebP** - Modern web format
- **BMP** - Bitmap format
- **GIF** - Graphics format
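
An extension check for the formats above might look like the following. It is a sketch rather than the project's actual validation code; `ALLOWED_EXTENSIONS` and `is_supported_image` are hypothetical names.

```python
from pathlib import Path

# Extensions for the formats listed above (hypothetical constant name).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif"}


def is_supported_image(filename):
    """Return True if the filename ends in one of the supported extensions."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

A check like this inspects only the file name, so a renamed non-image file would still pass; decoding the bytes with an image library is what confirms the content.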

---
## How the Web Page Looks
<img width="2239" height="1218" alt="Screenshot 2026-03-04 062931" src="https://github.com/user-attachments/assets/be75eb15-b0ce-469a-9229-467f62e531ea" />
<img width="2221" height="1233" alt="Screenshot 2026-03-04 062914" src="https://github.com/user-attachments/assets/475bb6b8-5d2d-449e-a37a-3fd5f93e50a5" />
<img width="2239" height="1204" alt="Screenshot 2026-03-04 063019" src="https://github.com/user-attachments/assets/00ea8d3e-496d-4ef0-98e6-347b668a7dd6" />


## Advanced Configuration

### Using Custom CIFAR-10 Model

Set environment variables before starting backend:

```powershell
$env:CHECKPOINT_PATH = "C:\Image Classification Model\model\saved_model.pt"
$env:USE_CIFAR = "true"
```

Port Configuration

  • Backend: Default port 8001 (change the --port value in the uvicorn command)
  • Frontend: Default port 3000 (change the port argument to python -m http.server)

Architecture Details

Backend Architecture

  • FastAPI: RESTful API with automatic documentation
  • Model Wrapper: Singleton pattern for efficient model loading
  • Preprocessing Pipeline: Standardized image transformation
  • Error Handling: Comprehensive exception management

Frontend Architecture

  • Module Pattern: Encapsulated JavaScript functionality
  • Auto-Detection: Dynamic backend discovery
  • Progressive Enhancement: Works without JavaScript (basic functionality)
  • Responsive Design: Mobile-first approach

Performance Features

  • Model Caching: Model loaded once at startup
  • Image Optimization: Efficient preprocessing pipeline
  • Async Processing: Non-blocking file uploads
  • Connection Pooling: Reuses HTTP connections
  • Lazy Loading: Components load as needed

Model Information

Default Model: ResNet18 (ImageNet)

  • Architecture: 18-layer residual network
  • Dataset: ImageNet (1,000 classes)
  • Input Size: 224×224 pixels
  • Accuracy: ~69.8% top-1, ~89.1% top-5 on ImageNet (torchvision's published figures for its pretrained ResNet18)

Alternative: CIFAR-10 Model

  • Architecture: Custom CNN (if exported)
  • Dataset: CIFAR-10 (10 classes)
  • Input Size: 32×32 pixels
  • Classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck