GPU Benchmarks

This is primarily a testing ground for GPU (currently just CUDA) benchmarks. Here I try out specific algorithms I have in mind, usually comparing them against some of the following:

Existing GPU library implementations
Existing CPU library implementations
Some simple CPU side code
Some existing Python (usually NumPy) implementations

Prerequisites and Usage

You will need CUDA installed. Tests are enabled automatically if googletest is installed. On Debian-based Linux distributions such as Ubuntu, it is easy to install via

sudo apt install libgtest-dev

You will need CMake. Build this just like any other CMake project:

cmake -B build && cd build
make

You should then be able to run the executables inside each subdirectory.

Documentation for subprojects

Sometimes I use small Python scripts to plot illustrations that I find helpful for understanding a particular algorithm or the way a unit test is set up. These generally shouldn't require any packages beyond matplotlib and numpy, and I will try to keep them in the /doc/ subdirectory of each project directory.

Project-specific documentation lives in each subdirectory's own README.md, to keep this file from getting too crowded.

Tips for quick benchmark comparisons

The Nsight Systems command line tool, nsys, has great functionality. Unfortunately, the defaults are very verbose, and often all you want to do is the following:

Make a code change for a particular kernel.
Recompile and run it again under nsys profile.
Check the kernel's runtime again.

A very simple process to do the above would be something like this:

nsys profile --output=benchmark.nsys-rep --force-overwrite=true ./executable && nsys stats --force-export=true --report=cuda_gpu_trace:base benchmark.nsys-rep | grep name_of_kernel

Some notes on my choices here:

We fix the output filename so that each run replaces the previous report (during development you probably don't want to keep all of them). This requires --force-overwrite=true.
We use the cuda_gpu_trace report as it tends to give nicer information. The second number in each row is the duration in nanoseconds. It also lists each kernel call separately, so you can see whether the same kernel's duration changes suddenly between calls (which you may or may not expect).
The :base suffix in cuda_gpu_trace:base drops the template arguments from kernel names, which are usually very verbose. You can leave it off if you are running different template instantiations of the same kernel and need to tell them apart.
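
If you repeat this loop often, the one-liner above can be wrapped in a small shell function. This is only a sketch: the function name profile_kernel and its argument order (kernel name first, then the executable and its arguments) are my own assumptions, and it uses only the nsys flags already shown above.

```shell
# Hypothetical helper (name and argument order are assumptions, not part of nsys):
# profile an executable and print only the GPU trace rows for one kernel.
profile_kernel() {
    if [ "$#" -lt 2 ]; then
        echo "usage: profile_kernel <kernel_name> <executable> [args...]" >&2
        return 2
    fi
    _pk_kernel="$1"
    shift
    _pk_report="benchmark.nsys-rep"

    # Run the executable under nsys, overwriting the previous report.
    nsys profile --output="$_pk_report" --force-overwrite=true "$@" || return 1

    # Export the GPU trace and keep only the rows that mention the kernel.
    nsys stats --force-export=true --report=cuda_gpu_trace:base "$_pk_report" \
        | grep -- "$_pk_kernel"
}
```

After sourcing this in your shell, each iteration becomes: recompile, then run profile_kernel name_of_kernel ./executable. Note that grep exits with a non-zero status when no row matches, which is a quick hint that the kernel name is misspelled or the kernel never ran.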
