programming-parallel-computers

An interactive learning platform for the Programming Parallel Computers course (CS-E4580) by Aalto University.

This repo turns the course material into hands-on, visual, and interactive content — making parallel programming concepts easier to understand and experiment with.

🌐 Live platform → ahmedaltu.github.io/programming-parallel-computers

What's inside

- Interactive visualisations: diagrams, memory layouts, and performance charts you can explore in the browser
- Code walkthroughs: step-by-step breakdowns of the V0→V7 optimisations from the case study
- Assembly analysis: annotated assembly showing what the CPU actually executes
- Exercises: fully optimised solutions with documented reasoning

Course structure

| Chapter | Topic | Key techniques |
| --- | --- | --- |
| Chapter 1 | Role of parallelism: why and how | Moore's Law, latency vs throughput |
| Chapter 2 | CPU optimisation case study: 0.6% → ~100% of peak | OpenMP, ILP, SIMD/AVX-512, register tiling, Z-order, prefetch |
| Chapter 3 | Multithreading | OpenMP memory model, false sharing, scheduling, atomics |
| Chapter 4 | GPU programming | CUDA V0→V4, coalescing, shared memory tiling, float4, occupancy, Nsight |
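
Chapter 2 lists Z-order among its techniques. As a rough illustration of the idea, the sketch below interleaves the bits of two tile coordinates into a Morton index and sorts tile pairs by it, so that tiles visited one after another tend to be close in both dimensions. The names `spread_bits`, `morton`, and `z_order_tiles` are hypothetical and not taken from this repo, and coordinates are assumed to fit in 16 bits.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Spread the lower 16 bits of v so that bit k moves to bit 2k.
std::uint32_t spread_bits(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Morton (Z-order) index: bits of x on even positions, bits of y on odd ones.
std::uint32_t morton(std::uint32_t x, std::uint32_t y) {
    return spread_bits(x) | (spread_bits(y) << 1);
}

// All (x, y) tile coordinates of an n x n grid, sorted by Morton index.
std::vector<std::pair<int, int>> z_order_tiles(int n) {
    std::vector<std::pair<int, int>> tiles;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            tiles.emplace_back(x, y);
        }
    }
    std::sort(tiles.begin(), tiles.end(),
              [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                  return morton(static_cast<std::uint32_t>(a.first),
                                static_cast<std::uint32_t>(a.second))
                       < morton(static_cast<std::uint32_t>(b.first),
                                static_cast<std::uint32_t>(b.second));
              });
    return tiles;
}
```

A parallel loop could then iterate over the sorted list instead of a plain row-major double loop. Whether this pays off depends on tile size and cache sizes, so it is worth measuring.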

Exercises

`correlate` — Pearson correlation matrix

Fully optimised C++ implementation targeting the course grader (AVX-512):

- AVX-512 SIMD using `float16_t` vectors, with a 6×16 register-tiled kernel
- Z-order (Morton) tile traversal for cache locality
- Software prefetching
- Two-pass normalisation
- OpenMP parallelisation over all tile pairs
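
To make the normalisation step concrete, here is a minimal scalar reference sketch, not the optimised kernel from this repo. It centres each row on its mean, scales it to unit length, and then computes each correlation as a dot product of two normalised rows. The function names, the `double` accumulation, and the `result[i + j * ny]` layout for `j <= i` are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalise each row of a ny x nx row-major matrix:
// first subtract the row mean, then scale the row to unit Euclidean norm.
// A constant row has zero norm and would produce non-finite values here.
std::vector<double> normalise_rows(int ny, int nx, const float* data) {
    std::vector<double> z(static_cast<std::size_t>(ny) * nx);
    for (int y = 0; y < ny; ++y) {
        const std::size_t row = static_cast<std::size_t>(y) * nx;
        double sum = 0.0;
        for (int x = 0; x < nx; ++x) sum += data[row + x];
        const double mean = sum / nx;

        double sq = 0.0;
        for (int x = 0; x < nx; ++x) {
            const double v = data[row + x] - mean;
            z[row + x] = v;
            sq += v * v;
        }
        const double inv_norm = 1.0 / std::sqrt(sq);
        for (int x = 0; x < nx; ++x) z[row + x] *= inv_norm;
    }
    return z;
}

// Assumed output layout: result[i + j * ny] holds the correlation of
// rows i and j for j <= i; other entries are left untouched.
void correlate_reference(int ny, int nx, const float* data, float* result) {
    const std::vector<double> z = normalise_rows(ny, nx, data);
    for (int j = 0; j < ny; ++j) {
        for (int i = j; i < ny; ++i) {
            double dot = 0.0;
            for (int x = 0; x < nx; ++x) {
                dot += z[static_cast<std::size_t>(i) * nx + x]
                     * z[static_cast<std::size_t>(j) * nx + x];
            }
            result[i + j * ny] = static_cast<float>(dot);
        }
    }
}
```

A slow reference like this is mainly useful for checking the output of a vectorised, tiled version against known-good values.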

`is` — Image segmentation

- CPU version: 2D prefix sums make each rectangle sum O(1), reducing the total cost from O(nx²·ny²·w·h) to O(nx²·ny²); parallelised with OpenMP
- GPU version: CUDA kernel with a shared-memory block reduction, with prefix sums computed on the device
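
The prefix-sum idea can be sketched as follows: build a summed-area table once, after which the sum over any axis-aligned rectangle takes four lookups. This is a single-channel simplification for illustration only. The function names and the half-open rectangle convention are assumptions, not taken from the repo.

```cpp
#include <cstddef>
#include <vector>

// Build a summed-area table with a zero border row and column, so that
// s[y * (nx + 1) + x] holds the sum of all pixels in [0, x) x [0, y).
std::vector<double> prefix_sums(int ny, int nx, const std::vector<float>& pixels) {
    const std::size_t w = static_cast<std::size_t>(nx) + 1;
    std::vector<double> s(w * (static_cast<std::size_t>(ny) + 1), 0.0);
    for (int y = 0; y < ny; ++y) {
        for (int x = 0; x < nx; ++x) {
            s[(y + 1) * w + (x + 1)] =
                pixels[static_cast<std::size_t>(y) * nx + x]
                + s[y * w + (x + 1)]
                + s[(y + 1) * w + x]
                - s[y * w + x];
        }
    }
    return s;
}

// Sum over the half-open rectangle [x0, x1) x [y0, y1) in O(1).
double rect_sum(const std::vector<double>& s, int nx,
                int x0, int y0, int x1, int y1) {
    const std::size_t w = static_cast<std::size_t>(nx) + 1;
    return s[y1 * w + x1] - s[y0 * w + x1] - s[y1 * w + x0] + s[y0 * w + x0];
}
```

With the table in place, a search over all candidate rectangles only pays a constant cost per candidate, which is where the drop from O(nx²·ny²·w·h) to O(nx²·ny²) comes from.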

Tech stack

| Layer | Tools |
| --- | --- |
| CPU parallelism | OpenMP, AVX-512 (`float16_t`, ZMM registers) |
| GPU parallelism | CUDA (nvcc), shared memory, `float4` vectorised loads |
| Profiling | Nsight Systems, Nsight Compute, `perf` |
| Language | C++17, CUDA C++ |
| Platform | Aalto course grader (AVX-512), Maari GPU machines |

Author

Built while studying the course — combining learning with building.
