A complex parallelized 3D fractal renderer and engine implemented in C++ and CUDA.
Created by Joshua Hizgiaev and Marcos Traverso
Run the following to install GLFW3, OpenGL via Mesa (the open-source implementation for Linux), and GLM, the math library for OpenGL:
sudo apt install libgl1-mesa-dev libglu1-mesa-dev libglfw3-dev libglm-dev mesa-utils
CMake links these libraries dynamically, except for GLM, which is header-only.
VS Code should build the project automatically. To build manually:
cmake -B build
cd build
make
./FractalGPU
To use the Ninja generator instead (recommended):
sudo apt install ninja-build
cmake -B build -G Ninja
cd build
ninja
./FractalGPU
The steps above build the CPU version of FractalGPU. Continue below for the CUDA-enabled version.
First, make sure your system has a CUDA-capable GPU before proceeding. A quick check for NVIDIA devices is:
nvidia-smi
If this command fails, you most likely do not have an NVIDIA device (or the NVIDIA driver is not installed).
Next, download the CUDA Toolkit for your architecture.
This guide may also be useful.
After installing the CUDA Toolkit, you should be able to compile in exactly the same way. Additional compiler flags can be set; check out the CMakePresets.json file for the flags used for CUDA compilation.
Our testing results and reports can be found in the Reports/ folder of this repository.
Below are the initial results from our first CUDA version:
| Fractal | CPU (ms/frame) average | GPU (ms/frame) average | Speedup |
|---|---|---|---|
| Mandelbulb | 1315.3 | 0.4 | ~3288x |
| Mandelbox | 4669.7 | 0.9 | ~5188x |
| Julia (quaternion) | 811.9 | 0.3 | ~2706x |
NVIDIA CUDA Profiling Tools Interface (CUPTI) results and a final report on our research and development will be posted soon. This report will also include results for volumetric rendering.
CUPTI provides branch divergence, throughput, cache, and other relevant statistics that show how well our rendering kernels perform, so we believe it is extremely important to present them as well. A PDF of our project presentation will also be provided here soon.
Here we present our parallel ray-marching algorithm for non-volumetric rendering:
The renderer assigns one CUDA thread per pixel, launching a 2D kernel over the full image with
Each thread computes a camera ray using a standard look-at camera model. The forward, right, and up basis vectors are derived from the camera position, target, and up hint. The pixel's NDC coordinates, scaled by the aspect ratio and the vertical field of view, offset the ray direction away from the camera's forward axis, giving a correct perspective projection.
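The ray setup described above can be sketched on the host in plain C++. This is a minimal illustration under stated assumptions, not the project's actual kernel code: `Vec3`, its helper functions, and `cameraRayDir` are hypothetical names, and sampling at the pixel center with a flipped y axis is an assumed convention.

```cpp
#include <cassert>
#include <cmath>

// Minimal 3-component vector; a stand-in for glm::vec3 or CUDA's float3.
struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }

// Hypothetical helper: unit direction of the ray through the center of pixel (px, py).
Vec3 cameraRayDir(int px, int py, int width, int height,
                  Vec3 camPos, Vec3 target, Vec3 upHint, float fovYRadians) {
    assert(width > 0 && height > 0);

    // Orthonormal camera basis built from position, target, and up hint.
    Vec3 forward = normalize(target - camPos);
    Vec3 right = normalize(cross(forward, upHint));
    Vec3 up = cross(right, forward);

    // Pixel center mapped to NDC in [-1, 1]; y is flipped so that +y points up.
    float ndcX = 2.0f * (px + 0.5f) / width - 1.0f;
    float ndcY = 1.0f - 2.0f * (py + 0.5f) / height;

    // Scale by the vertical field of view and the aspect ratio.
    float tanHalfFov = std::tan(0.5f * fovYRadians);
    float aspect = static_cast<float>(width) / height;

    return normalize(forward + right * (ndcX * aspect * tanHalfFov)
                             + up * (ndcY * tanHalfFov));
}
```

In a CUDA kernel the same arithmetic would run once per thread, with `px` and `py` taken from the thread's 2D index.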
Rather than intersecting geometry analytically, the renderer uses sphere tracing against a Signed Distance Function (SDF). Starting a small epsilon from the ray origin, each step advances the ray by the distance the SDF reports. The core insight is that this distance is always a safe lower bound on how far the ray can travel without overshooting a surface. Iteration continues until:
- The SDF value falls below a hit threshold (`epsilon`), indicating a surface intersection, or
- The accumulated distance exceeds `maxDist`, indicating a miss.
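The loop above can be sketched in plain C++ as follows. This is an illustrative sketch rather than the project's kernel: `Vec3`, `Hit`, `sphereTrace`, and the unit-sphere `sphereSDF` are hypothetical names, and the `maxSteps` cap is an added assumption that guarantees termination (running out of steps is treated as a miss).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Example SDF used only for illustration: a unit sphere centered at the origin.
inline float sphereSDF(Vec3 p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}

struct Hit {
    bool hit;   // true if the ray reached a surface
    float t;    // distance travelled along the ray
    int steps;  // number of completed marching steps
};

// Sphere tracing: advance by the SDF value, which never overshoots a surface.
// `dir` is expected to be unit length.
template <typename SDF>
Hit sphereTrace(Vec3 origin, Vec3 dir, SDF sdf,
                float epsilon, float maxDist, int maxSteps) {
    assert(epsilon > 0.0f && maxDist > epsilon && maxSteps > 0);
    float t = epsilon;  // start a small offset away from the ray origin
    for (int i = 0; i < maxSteps; ++i) {
        Vec3 p{origin.x + dir.x * t, origin.y + dir.y * t, origin.z + dir.z * t};
        float d = sdf(p);
        if (d < epsilon) return {true, t, i};   // close enough: surface hit
        t += d;                                 // safe step
        if (t > maxDist) return {false, t, i};  // travelled too far: miss
    }
    return {false, t, maxSteps};  // step budget exhausted: treated as a miss
}
```

A fractal renderer would pass one of its distance estimators in place of `sphereSDF`; the loop itself does not change.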
Three fractal distance estimators are supported and can be selected at runtime:
- Mandelbulb: spherical-coordinate power iteration with a derivative-tracked escape distance
- Mandelbox: repeated box and sphere folding with a scale-accumulating derivative
- Julia quaternion set: 4D quaternion squaring with a running derivative magnitude
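As an illustration of the first estimator, here is a sketch of a widely used Mandelbulb distance estimate in plain C++. It is not taken from this repository: the function name, the default power of 8, the iteration count, and the bailout radius are all assumptions, and the project's own kernel may differ in details.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical Mandelbulb distance estimator.
// Iterates z -> z^power + pos in spherical coordinates while tracking the
// running derivative dr, then returns 0.5 * ln(r) * r / dr.
float mandelbulbDE(Vec3 pos, float power = 8.0f, int maxIter = 16, float bailout = 2.0f) {
    assert(power > 1.0f && maxIter > 0 && bailout > 0.0f);
    Vec3 z = pos;
    float dr = 1.0f;
    float r = std::sqrt(z.x * z.x + z.y * z.y + z.z * z.z);

    for (int i = 0; i < maxIter; ++i) {
        // Stop once the orbit escapes, or if r is too small to divide by.
        if (r > bailout || r < 1e-12f) break;

        // Convert to spherical coordinates.
        float theta = std::acos(z.z / r);
        float phi = std::atan2(z.y, z.x);

        // Derivative update: dr' = power * r^(power - 1) * dr + 1.
        dr = std::pow(r, power - 1.0f) * power * dr + 1.0f;

        // Raise to the power: scale the radius and multiply the angles.
        float zr = std::pow(r, power);
        theta *= power;
        phi *= power;

        // Back to Cartesian coordinates, then add the original point.
        z = {zr * std::sin(theta) * std::cos(phi) + pos.x,
             zr * std::sin(theta) * std::sin(phi) + pos.y,
             zr * std::cos(theta) + pos.z};
        r = std::sqrt(z.x * z.x + z.y * z.y + z.z * z.z);
    }

    if (r < 1e-12f) return 0.0f;  // avoid log(0); the point is inside the set
    return 0.5f * std::log(r) * r / dr;
}
```

Points whose orbit never escapes can yield a zero or negative estimate, which a sphere tracer simply treats as a hit.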
After the kernel completes, CUDA events capture the GPU-side elapsed time in milliseconds. The float pixel buffer is then copied back to host memory and handed to the existing OpenGL code, which keeps the GPU renderer a drop-in replacement for the CPU renderer. The switch happens at compile time: conditional compilation (preprocessor macros) swaps CPU rendering for GPU rendering. We chose this approach for simplicity.
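A compile-time switch of this kind might look like the sketch below. Every name here is hypothetical: the macro `FRACTAL_USE_CUDA`, the functions `renderFrameCPU`, `renderFrameGPU`, and `renderFrame`, and the three-floats-per-pixel buffer layout are assumptions for illustration, not the identifiers used in this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder for the CPU ray-marcher: returns a black buffer with an
// assumed layout of three floats (RGB) per pixel.
std::vector<float> renderFrameCPU(int width, int height) {
    return std::vector<float>(static_cast<std::size_t>(width) * height * 3, 0.0f);
}

#ifdef FRACTAL_USE_CUDA
// Would be defined in a .cu file: launches the kernel, times it, and copies
// the float pixel buffer back to host memory.
std::vector<float> renderFrameGPU(int width, int height);
#endif

// Single entry point used by the display code, so callers do not need to
// know which backend was compiled in.
std::vector<float> renderFrame(int width, int height) {
    assert(width > 0 && height > 0);
#ifdef FRACTAL_USE_CUDA
    return renderFrameGPU(width, height);
#else
    return renderFrameCPU(width, height);
#endif
}
```

Because both backends return the same host-side buffer, the OpenGL display path stays identical regardless of which one was selected at build time.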
Here are the results of our CPU ray-marcher. Real-time statistics and performance data will be provided in a comprehensive report once CUDA support and statistics gathering are fully implemented.
A presentation of our full results can be found here
Our final report will be released soon.


