govind030303/HPC-Project

High Performance Computing Course Project (CS610)

Course: CS610 – High Performance Computing
Instructor: Prof. Swarnendu Biswas
Institute: IIT Kanpur
Duration: Aug 2025 – Dec 2025

This repository contains optimized CPU and GPU implementations of core numerical workloads, covering SIMD vectorization (AVX2/SSE4), thread-level parallelism (OpenMP), and GPU acceleration (CUDA).


🚀 Implemented Projects

1. Serial Matrix Multiplication with AVX2 & SSE4

  • File: matmul.cpp
  • Optimized using:
    • AVX2 and SSE4 vector intrinsics
    • Loop unrolling and cache-friendly access
  • Achieved up to 6× speedup over the naive implementation
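
To illustrate the kind of vectorization listed above, here is a minimal sketch and not the code in matmul.cpp. It assumes square, row-major `float` matrices and uses 128-bit SSE intrinsics; an AVX2 build would typically use the 256-bit `_mm256_*` counterparts and process eight floats per step. The function names `matmul_naive` and `matmul_sse` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <immintrin.h>  // x86 SIMD intrinsics
#include <vector>

// Naive i-j-k triple loop, kept as the correctness reference.
void matmul_naive(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = sum;
        }
    }
}

// i-k-j loop order: rows of B and C are walked contiguously (cache-friendly),
// and four columns of C are updated per SIMD operation.
void matmul_sse(const std::vector<float>& A, const std::vector<float>& B,
                std::vector<float>& C, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            C[i * n + j] = 0.0f;
        }
        for (std::size_t k = 0; k < n; ++k) {
            const float a = A[i * n + k];
            const __m128 a_vec = _mm_set1_ps(a);  // broadcast A[i][k]
            std::size_t j = 0;
            for (; j + 4 <= n; j += 4) {
                __m128 b = _mm_loadu_ps(&B[k * n + j]);  // unaligned loads
                __m128 c = _mm_loadu_ps(&C[i * n + j]);
                c = _mm_add_ps(c, _mm_mul_ps(a_vec, b));
                _mm_storeu_ps(&C[i * n + j], c);
            }
            // Scalar tail for n not divisible by 4.
            for (; j < n; ++j) {
                C[i * n + j] += a * B[k * n + j];
            }
        }
    }
}
```

Unaligned loads keep the sketch valid for any `n`; aligned buffers and further unrolling are common next steps but are not shown here.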

2. Grid Search (Baseline – Serial)

  • File: gridsearch_original.cpp
  • Reference serial implementation
  • Used for correctness and performance comparison

3. Grid Search (Parallel – OpenMP)

  • File: gridsearch_openmp.cpp
  • Parallelized using OpenMP
  • Achieved ~6.5× speedup on a multi-core CPU
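
The general pattern behind an OpenMP grid search can be sketched as follows. This is an illustration under assumptions, not the contents of gridsearch_openmp.cpp: the objective function, the 2D grid, and the name `grid_search_min` are all hypothetical. Each thread evaluates a share of the grid points, and a `min` reduction combines the per-thread results.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical objective, chosen only for illustration.
inline double objective(double x, double y) {
    return (x - 1.0) * (x - 1.0) + (y + 2.0) * (y + 2.0);
}

// Evaluates the objective on a steps x steps grid over [lo, hi]^2
// and returns the smallest value found.
double grid_search_min(double lo, double hi, int steps) {
    assert(steps >= 2 && hi > lo);
    const double h = (hi - lo) / (steps - 1);  // grid spacing
    double best = objective(lo, lo);

    // collapse(2) merges both loops into one iteration space so the work
    // is split evenly; reduction(min : best) gives each thread a private
    // copy of best and combines the copies at the end.
    #pragma omp parallel for collapse(2) reduction(min : best)
    for (int i = 0; i < steps; ++i) {
        for (int j = 0; j < steps; ++j) {
            const double v = objective(lo + i * h, lo + j * h);
            if (v < best) {
                best = v;
            }
        }
    }
    return best;
}
```

Compiled without `-fopenmp`, the pragma is ignored and the function runs serially with the same result, which makes it easy to check the parallel version against a serial baseline.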

4. 2D/3D Convolution using CUDA

  • File: convolution_gpu.cu
  • GPU implementation using CUDA
  • Achieved ~4× speedup compared to the serial CPU version
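
For context, a serial CPU baseline of the kind such a speedup is measured against might look like the sketch below. It is an assumption-laden illustration in C++, not the repository's code: it handles only the 2D case, uses zero padding at the borders, applies the kernel without flipping (cross-correlation form), and the name `conv2d_serial` is hypothetical. In a CUDA kernel, the two outer loops are typically replaced by one thread per output element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same-size 2D convolution with zero padding on a row-major h x w image.
// The kernel is ksize x ksize with odd ksize and is applied without flipping.
std::vector<float> conv2d_serial(const std::vector<float>& in, int h, int w,
                                 const std::vector<float>& kernel, int ksize) {
    assert(h > 0 && w > 0 && ksize > 0 && ksize % 2 == 1);
    assert(in.size() == static_cast<std::size_t>(h) * w);
    assert(kernel.size() == static_cast<std::size_t>(ksize) * ksize);

    const int r = ksize / 2;  // kernel radius
    std::vector<float> out(in.size(), 0.0f);

    for (int y = 0; y < h; ++y) {        // on a GPU: one thread per (y, x)
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    // Out-of-range neighbours count as zero.
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) {
                        continue;
                    }
                    const std::size_t in_idx =
                        static_cast<std::size_t>(yy) * w + xx;
                    const std::size_t k_idx =
                        static_cast<std::size_t>(dy + r) * ksize + (dx + r);
                    acc += in[in_idx] * kernel[k_idx];
                }
            }
            out[static_cast<std::size_t>(y) * w + x] = acc;
        }
    }
    return out;
}
```

Because every output element is computed independently, the loop nest maps naturally onto a grid of GPU threads.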

🛠️ Build Instructions

Requirements

  • g++ (with OpenMP support)
  • nvcc (CUDA Toolkit)
  • Linux environment recommended

Compile All Programs

make

