VDNT11NULL/S2I

Sketch Colorization Using UvU-Net: A Custom Ensemble Generator

UvU Architecture

Project Overview

This project develops a deep learning architecture for sketch-to-photo synthesis: generating high-quality, colorized facial images from outline sketches. It introduces the UvU-Net Generator, a novel ensemble architecture that targets three problems: quality degradation, fine detail recovery, and structural consistency. Applications include forensic facial reconstruction, content creation, and creative industries.

Motivation

Reconstructing realistic facial images from outline sketches is a complex task due to the intricacy of human facial features. Existing models often fall short in recovering fine details and maintaining visual fidelity. Our motivation stems from the need to bridge these gaps with a sophisticated system combining state-of-the-art image-to-image translation and super-resolution techniques.


Objectives

  • Develop a CycleGAN-based model to recover colored facial images from outlines.
  • Create the UvU-Net Generator, a novel ensemble architecture for enhancing image quality and fidelity.
  • Use advanced metrics such as FID, PSNR, and SSIM for performance evaluation.

Proposed Solution

Our solution integrates two models into a single pipeline:

  • CycleGAN-based model: Recovers colored facial images from outline sketches.
  • UvU-Net Generator: Enhances the visual quality and recovers fine details using a novel ensemble method.


UvU-Net Generator

The UvU-Net Generator is a custom ensemble architecture that combines:

  • An ensemble U-Net-based architecture (an inner U-Net nested within an outer U-Net) to ensure realistic outputs.
  • Residual skip connections in the outer U-Net, which reduce the number of trainable parameters. The inner U-Net compensates for the resulting feature loss on behalf of the outer U-Net.
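
One plausible reading of the parameter saving described above is that the outer U-Net merges encoder features into the decoder by element-wise addition (a residual-style skip) instead of channel concatenation, so each decoder convolution sees half as many input channels. The sketch below only works through that arithmetic. The 3x3 kernel, the channel widths, and the function names are illustrative assumptions and are not taken from UvU_Net_Generator.py.

```python
def conv2d_params(in_channels: int, out_channels: int, kernel_size: int = 3) -> int:
    """Weights plus biases of a single 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def decoder_params(widths, skip: str) -> int:
    """Sum the parameters of one decoder convolution per level.

    skip="concat": encoder features are concatenated, doubling the input channels.
    skip="add":    encoder features are added element-wise, so input channels are unchanged.
    """
    if skip not in ("concat", "add"):
        raise ValueError("skip must be 'concat' or 'add'")
    factor = 2 if skip == "concat" else 1
    return sum(conv2d_params(factor * c, c) for c in widths)


# Hypothetical decoder channel widths, deepest level first (an assumption).
WIDTHS = [512, 256, 128, 64]

concat_total = decoder_params(WIDTHS, "concat")
add_total = decoder_params(WIDTHS, "add")
print(f"concatenation skips: {concat_total:,} parameters")
print(f"additive skips:      {add_total:,} parameters")
```

Under these assumptions the additive variant needs roughly half the decoder weights. The trade-off is that addition mixes encoder and decoder features rather than keeping both, which matches the feature loss that the inner U-Net is said to compensate for.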

Project Structure

.
├── Ensemble_Architecture.pdf               # Detailed explanation of the ensemble architecture
├── Hyperparameter_Tuning.ipynb             # Hyperparameter tuning for CycleGAN
├── Hyperparameter_Tuning_Random_Search.ipynb
├── Main code snippets                      # Core model and utility scripts
│   ├── config.py
│   ├── Prepare_dataset.ipynb
│   ├── Trainer_UvU.ipynb
│   ├── utils.py
│   ├── UvU_Discriminator.py
│   └── UvU_Net_Generator.py
├── Output
│   ├── Pix2Pix Output
│   │   ├── input_99_P2P.png                # Input sketch for Pix2Pix
│   │   └── y_gen_99_P2P.png                # Generated image from Pix2Pix
│   └── UvU Output
│       ├── input_99_UVU.png                # Input sketch for UvU-Net
│       └── y_gen_99_UVU.png                # Generated image from UvU-Net
├── Sample_dataset                          # Input and Sobel-processed images
│   ├── input_images
│   ├── sobel_images
│   └── sobel_images1
├── UvU_Architecture.jpg                    # Visualization of the UvU-Net architecture
├── UvU_metric_analysis.ipynb               # Metric analysis for UvU-Net
├── README.md                               # Project README
└── regularization.py                       # Code for regularization techniques

How to Use

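  1. Clone the repository:
    git clone https://github.com/RajeevG187/S2I.git
    cd S2I
  2. Install dependencies:
    pip install -r requirements.txt
  3. Run the dataset preparation notebook. A notebook cannot be run with python directly, so open it in Jupyter or execute it from the command line:
    jupyter nbconvert --to notebook --execute "Main code snippets/Prepare_dataset.ipynb"
  4. Train the UvU-Net Generator:
    jupyter nbconvert --to notebook --execute "Main code snippets/Trainer_UvU.ipynb"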

Dataset

The dataset used for this project includes input outline sketches and Sobel-processed images. Download the dataset from the following link:

Dataset Link
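
The Sobel-processed images can be thought of as gradient-magnitude maps of the input photos. The following is a minimal pure-Python sketch of that operator on a small grayscale grid. It is an illustration only: the function name `sobel_magnitude`, the zero-valued border, and the toy image are assumptions, not the code in Prepare_dataset.ipynb.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (GX) and vertical (GY) gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy 5x5 image: dark on the left, bright on the right (one vertical edge).
toy = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(toy)
for row in edges:
    print([round(v) for v in row])
```

The response is non-zero only in the columns next to the dark-to-bright transition. The raw magnitudes are not limited to the 0 to 255 range, so a real preprocessing step would typically rescale or threshold them before saving an outline image.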


Results

Pix2Pix Output

Input Sketch                     | Generated Output
Pix2Pix Input (input_99_P2P.png) | Pix2Pix Output (y_gen_99_P2P.png)

UvU Output

Input Sketch                 | Generated Output
UvU Input (input_99_UVU.png) | UvU Output (y_gen_99_UVU.png)

Each input sketch is shown next to its generated output, so the UvU-Net result can be compared directly with the Pix2Pix result for overall output quality and detail recovery.


Metric Analysis

  • FID (lower is better): 34.7 (UvU-Net) vs. 45.2 (Pix2Pix)
  • PSNR (higher is better): 28.3 dB (UvU-Net)
  • SSIM (higher is better, maximum 1): 0.87 (UvU-Net)

For detailed analysis, refer to UvU_metric_analysis.ipynb.
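
For reference, PSNR and SSIM can be computed roughly as sketched below. This is a simplified illustration, not the code in UvU_metric_analysis.ipynb: it assumes 8-bit grayscale images given as flat lists of pixel values, and the SSIM here uses a single global window rather than the usual sliding-window average, so its values will differ from library implementations. FID is omitted because it requires a pretrained Inception network.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(reference, generated, max_value=255.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(generated) / n
    var_x = sum((r - mu_x) ** 2 for r in reference) / n
    var_y = sum((g - mu_y) ** 2 for g in generated) / n
    cov = sum((r - mu_x) * (g - mu_y) for r, g in zip(reference, generated)) / n
    c1 = (0.01 * max_value) ** 2  # conventional stabilizing constants
    c2 = (0.03 * max_value) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Toy data (an assumption): a reference patch and a copy with every pixel off by 10.
ref = [50, 100, 150, 200]
gen = [60, 110, 160, 210]
print(f"PSNR: {psnr(ref, gen):.2f} dB")
print(f"SSIM: {global_ssim(ref, gen):.4f}")
```

A uniform brightness shift lowers PSNR noticeably while leaving SSIM close to 1, which is one reason the two metrics are reported together.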


References

  1. Comparative Analysis of Pix2Pix and CycleGAN
    ResearchGate Link
  2. Bahdanau Attention Mechanism
    ArXiv Paper

About

UvU-Net is an ensemble sketch-to-photo colorization model using dual U-Nets with residual connections and Bahdanau attention for realistic, detail-rich outputs.
