Add Windows support #354

Merged
oleksandr-pavlyk merged 6 commits into NVIDIA:main from mfranzrebsal:windows-support on May 19, 2026

@mfranzrebsal
Contributor

@mfranzrebsal mfranzrebsal commented May 7, 2026

I made the necessary changes for NVBench to build and the tests to pass. I also tested that it works when using NVBench inside of my own project, and made sure that the scripts ci/build_nvbench.sh and ci/test_nvbench.sh work. How should support testing be enabled for the CI? I have a local commit with some changes, but did not want to add them here and possibly trip the CI run.

Here is a summary of the changes. Disclaimer: they were made by Claude and verified by me, but I am in no way a CMake expert.

#: 1
File: CMakeLists.txt
Change: CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON
Failure without it: Link error LNK1181: cannot open input file 'lib\nvbench.lib'. MSVC only generates a .lib import library when the DLL exports symbols. NVBench has no __declspec(dllexport) annotations, so without this CMake flag, no import library is produced and all downstream targets fail to link.

#: 2
File: cmake/NVBenchCUPTI.cmake
Change: IMPORTED_IMPLIB instead of IMPORTED_LOCATION on Win32
Failure without it: CMake generate error IMPORTED_IMPLIB not set for imported target "nvbench::cupti". On Windows, find_library locates .lib import libraries. A SHARED IMPORTED target on Windows requires the .lib path via IMPORTED_IMPLIB (the import library), not IMPORTED_LOCATION (which expects the .dll).

#: 3
File: cmake/NVBenchConfigTarget.cmake
Change: FMT_UNICODE=0, -Xcompiler=/utf-8, --diag_suppress=27
Failure without it: Build errors in every .cu file. (a) fmtlib 11 static-asserts that /utf-8 mode is active; MSVC's host compiler satisfies this with -Xcompiler=/utf-8, but cudafe evaluates the check independently and always fails, requiring FMT_UNICODE=0 for CUDA. (b) fmtlib's lookup tables use out-of-range char32_t sentinel values that cudafe rejects, requiring --diag_suppress=27.

#: 4
File: cmake/NVBenchConfigTarget.cmake
Change: AND NOT WIN32 on INSTALL_RPATH
Failure without it: No failure. INSTALL_RPATH is a Unix/ELF concept silently ignored on Windows. The guard is purely a hygiene fix.

#: 5
File: nvbench/config.cuh.in
Change: _MSVC_LANG instead of __cplusplus
Failure without it: Build error #error: "NVBench requires a C++17 compiler." in every .cxx file. MSVC reports __cplusplus as 199711L (C++98) regardless of the actual standard, unless /Zc:__cplusplus is passed. _MSVC_LANG always reflects the real standard level.

#: 6
File: testing/axes_metadata.cu
Change: #include <iterator>
Failure without it: Build error namespace "std" has no member "back_inserter". MSVC's STL doesn't transitively include <iterator> from other headers the way GCC's libstdc++ does.

#: 7
File: testing/cmake/CMakeLists.txt
Change: Forward CMAKE_CUDA_HOST_COMPILER, CMAKE_LINKER, CMAKE_RC_COMPILER, CMAKE_MT
Failure without it: Test failure CUDA_ARCHITECTURES is set to "native", but no NVIDIA GPU was detected. The sub-project's CMake configure step can't compile or link the GPU query program unless these toolchain variables are forwarded.

#: 8
File: testing/cmake/CMakeLists.txt
Change: ENVIRONMENT "PATH=..." with nvbench bin + CUPTI lib dirs
Failure without it: No failure when run via the build script (which pre-sets PATH). Needed for robustness when ctest is invoked directly; this is the Windows equivalent of the LD_LIBRARY_PATH setup the sub-project already has for Unix.

#: 9
File: testing/cmake/test_export/CMakeLists.txt
Change: Add Windows PATH setup for sub-project tests (parallel to existing Unix LD_LIBRARY_PATH)
Reason: The original code only set LD_LIBRARY_PATH on Unix and did nothing on Windows. The sub-project's test_bench.exe and nvbench-ctl.exe need nvbench.dll and CUPTI DLLs at runtime. On Unix the build tree embeds RUNPATH into the binary so the executable finds libnvbench.so without environment help; only CUPTI and the install tree need LD_LIBRARY_PATH. Windows has no RUNPATH equivalent (DLL lookup always goes through PATH), so the sub-project must set PATH for both tree types. Previously this worked only because the outer test in testing/cmake/CMakeLists.txt set an ENVIRONMENT property on the ctest --build-and-test process, which the inner CTest happened to inherit. This fix makes the sub-project self-sufficient: it reads the nvbench DLL and CUPTI library locations from imported target properties and sets PATH itself. The shared code resolves the imported configuration once, then branches only for the CUPTI property name (IMPORTED_IMPLIB on Windows vs IMPORTED_LOCATION on Unix, since find_library locates .lib import libraries on Windows) and the environment variable format.
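
Item 5 in the summary above can be illustrated with a small sketch of the dialect check. The macro name `EXAMPLE_CPLUSPLUS` and both helper functions are hypothetical, introduced only for illustration; they are not necessarily what nvbench/config.cuh.in defines. The sketch assumes only that MSVC defines `_MSVC_LANG` and that `__cplusplus` is reliable on other compilers.

```cpp
// Sketch only: EXAMPLE_CPLUSPLUS is a hypothetical name, not NVBench's macro.
// MSVC keeps __cplusplus at 199711L unless /Zc:__cplusplus is passed, but it
// sets _MSVC_LANG to the dialect actually selected, so prefer that when present.
#if defined(_MSVC_LANG)
#  define EXAMPLE_CPLUSPLUS _MSVC_LANG
#else
#  define EXAMPLE_CPLUSPLUS __cplusplus
#endif

// Returns the detected dialect value (for example 201703L for C++17).
constexpr long example_cplusplus() { return EXAMPLE_CPLUSPLUS; }

// True when the compiler is in C++17 mode or newer.
constexpr bool example_has_cpp17() { return example_cplusplus() >= 201703L; }
```

A header could then guard itself with something like `static_assert(example_has_cpp17(), "C++17 required")`, which behaves the same under MSVC, GCC, and Clang without requiring /Zc:__cplusplus.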

@copy-pr-bot

copy-pr-bot Bot commented May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mfranzrebsal mfranzrebsal marked this pull request as draft May 7, 2026 07:52
@mfranzrebsal mfranzrebsal marked this pull request as ready for review May 7, 2026 08:27
Comment thread testing/axes_metadata.cu
@oleksandr-pavlyk
Collaborator

Regarding #1, we should review the list of symbols we intend to be public, export them (i.e., add __declspec(dllexport) annotation for MSVC) and hide the rest (i.e. add __attribute__((visibility("hidden"))) for GCC/Clang).

Incidentally, doing this review would unblock #323
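
The export-and-hide approach described in this comment is commonly expressed with a single visibility macro. The following is a minimal sketch under stated assumptions: `EXAMPLE_API`, `EXAMPLE_HIDDEN`, `EXAMPLE_BUILDING_LIBRARY`, and the two functions are hypothetical names chosen for illustration and do not come from NVBench.

```cpp
#include <cassert>

// Sketch only: all EXAMPLE_* names are hypothetical, not NVBench identifiers.
// Normally the build system defines this while compiling the library itself.
#define EXAMPLE_BUILDING_LIBRARY 1

#if defined(_MSC_VER)
// MSVC: export when building the DLL, import when consuming it.
#  if defined(EXAMPLE_BUILDING_LIBRARY)
#    define EXAMPLE_API __declspec(dllexport)
#  else
#    define EXAMPLE_API __declspec(dllimport)
#  endif
#  define EXAMPLE_HIDDEN
#else
// GCC/Clang: mark public symbols default-visible and hide the rest.
#  define EXAMPLE_API __attribute__((visibility("default")))
#  define EXAMPLE_HIDDEN __attribute__((visibility("hidden")))
#endif

// Implementation detail: not meant to be visible outside the shared library.
EXAMPLE_HIDDEN int example_detail_double(int value) { return 2 * value; }

// Public entry point: the symbol that consumers are expected to link against.
EXAMPLE_API int example_public_double(int value) { return example_detail_double(value); }

// Usage check: callers go through the public entry point only.
inline void example_usage() { assert(example_public_double(21) == 42); }
```

With explicit annotations like these, a blanket setting such as CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS would no longer be needed to get an import library, since the exported set is spelled out in the headers.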

@oleksandr-pavlyk
Collaborator

/ok to test dd1ffc9

@oleksandr-pavlyk
Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 13, 2026

✅ Actions performed

Full review triggered.

@coderabbitai

coderabbitai Bot commented May 13, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f0601f02-4afb-4d21-901e-71426970b93b

📥 Commits

Reviewing files that changed from the base of the PR and between 0ca8414 and 2aaf76e.

📒 Files selected for processing (2)
  • cmake/NVBenchCUPTI.cmake
  • nvbench/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (2)
  • nvbench/CMakeLists.txt
  • cmake/NVBenchCUPTI.cmake

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Windows CUDA profiler API installer for CI; Windows build job re-enabled; optional device-testing toggle for Windows builds.
  • Chores

    • Improved Windows shared-library export behavior and MSVC preprocessor/linker propagation to consumers.
    • More accurate C++ dialect detection across compilers; per-test runtime environment wiring added.
    • CI input normalization for Windows CUDA versions.
  • Bug Fixes

    • More robust CUPTI runtime discovery on Windows and improved cross-platform test library-path handling.
    • Suppressed spurious CUDA frontend diagnostics; added missing test include.

Walkthrough

Adds Windows build/test support: enables Windows symbol export and MSVC link option, adds a CUDA profiler API installer and CI invocation, re-enables Windows PR job, makes CUPTI import platform-aware, adjusts CUDA/MSVC compile options and C++ detection, and configures per-test runtime library paths.

Changes

Windows build and test support

Layer: Windows CI workflow and env wiring
File(s): .github/workflows/build-windows.yml, .github/workflows/pr.yml, ci/windows/build_nvbench.ps1
Summary: Expose NVBENCH_WINDOWS_CUDA to validation/build steps; invoke install_cuda_profiler_api.ps1 in the container before building; pass CUDA/arch/std into the container; re-enable nvbench-windows PR job; add DEVICE_TESTING param propagation.

Layer: CUDA profiler API installer script
File(s): ci/windows/install_cuda_profiler_api.ps1
Summary: Adds installer script with -CUDA_VERSION/cudaVersion, CUDA root/version resolution, redistrib manifest discovery/selection, download-with-retry, SHA256 validation, archive extraction, installation of include/cuda_profiler_api.h, and post-install verification.

Layer: Windows DLL symbol export and linker options
File(s): CMakeLists.txt, nvbench/CMakeLists.txt
Summary: Set CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON for WIN32 AND BUILD_SHARED_LIBS; propagate MSVC /Zc:preprocessor (and CUDA host equivalent) to consumers; add MSVC-only LINKER:/INCLUDE:main interface option for nvbench::main.

Layer: CUPTI imported target handling
File(s): cmake/NVBenchCUPTI.cmake
Summary: Centralize CUPTI find_library hints (including lib/x64/lib on Windows); add nvbench_find_windows_cupti_runtime_library to locate runtime DLLs from import-lib and set imported-target IMPORTED_IMPLIB/IMPORTED_LOCATION on Windows.

Layer: CUDA/MSVC compile/link tweaks and C++ detection
File(s): cmake/NVBenchConfigTarget.cmake, nvbench/config.cuh.in
Summary: Define FMT_UNICODE=0 for MSVC+nvcc; forward /utf-8 to host compiler and suppress specific cudafe diagnostics; restrict CUPTI INSTALL_RPATH to non-Windows; use _MSVC_LANG when available for NVBENCH_CPLUSPLUS.

Layer: Per-test runtime environment and test wiring
File(s): testing/cmake/CMakeLists.txt, testing/cmake/test_export/CMakeLists.txt, examples/CMakeLists.txt, exec/CMakeLists.txt, testing/CMakeLists.txt, testing/device/CMakeLists.txt, testing/axes_metadata.cu
Summary: Compute imported library dirs for tests, propagate Windows toolchain variables into compile-test presets, prepend non-empty NVBench/CUPTI lib dirs to PATH on Windows, only set LD_LIBRARY_PATH fragments on Unix when non-empty, call nvbench_config_test_runtime_environment for examples/exec/tests/device, and add missing #include <iterator> in test source.

  • Possibly related PRs

  • Suggested labels: only: cmake

  • Suggested reviewers
    • alliepiper
    • gevtushenko

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ffe8731a-1bea-4fe3-bd00-ae6f639bb863

📥 Commits

Reviewing files that changed from the base of the PR and between d13a0fd and dd1ffc9.

📒 Files selected for processing (8)
  • CMakeLists.txt
  • ci/build_common.sh
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • nvbench/config.cuh.in
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt

Comment thread ci/build_common.sh Outdated
Comment thread ci/build_common.sh Outdated
Comment thread testing/cmake/test_export/CMakeLists.txt Outdated
@mfranzrebsal
Contributor Author

I will revert the changes to build_common.sh, since #362 already includes a separate script.

@oleksandr-pavlyk
Collaborator

@mfranzrebsal PR #362, which enables the MSVC build of NVBench, has been merged, but the job is presently skipped unconditionally because of the known build failure that this PR fixes.

Please merge main into this branch and revert c632eb2 to re-enable the PR. The expectation is that the CI build using MSVC will now complete successfully.

@mfranzrebsal
Contributor Author

The commit you mention is nowhere to be found, either in main or my rebased branch. I think we are good?

@oleksandr-pavlyk
Collaborator

/ok to test 787e435

@oleksandr-pavlyk
Collaborator

@mfranzrebsal Right now the CI has Windows build job disabled in pr.yml#L82-83.

Please push a change to remove these two lines to enable the job.

Remove gate that disables Windows NVBench build job in pr.yaml
@oleksandr-pavlyk
Collaborator

/ok to test 78b674b

@oleksandr-pavlyk
Collaborator

Windows build job fails with:

sccache C:\msbuild\17\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x64\cl.exe  /nologo /TP -DFMT_USE_BITINT=0 -DNVBENCH_NO_IMPLICIT_SYSTEM_HEADER -Dnvbench_EXPORTS -IC:\nvbench -IC:\nvbench\build\nvbench-ci -IC:\nvbench\build\nvbench-ci\nvbench\detail -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\extras\CUPTI\include" -external:IC:\nvbench\build\nvbench-ci\_deps\fmt-src\include -external:IC:\nvbench\build\nvbench-ci\_deps\nlohmann_json-src\include -external:W0 /DWIN32 /D_WINDOWS /EHsc /O2 /Ob2 /DNDEBUG -std:c++17 -MD -Wall /utf-8 /showIncludes /Fonvbench\CMakeFiles\nvbench.dir\benchmark_base.cxx.obj /Fdnvbench\CMakeFiles\nvbench.dir\ /FS -c C:\nvbench\nvbench\benchmark_base.cxx
C:\nvbench\nvbench/detail/measure_cold.cuh(45): fatal error C1083: Cannot open include file: 'cuda_profiler_api.h': No such file or directory

On Linux, the compilation command for benchmark_base.cxx contains -isystem /usr/local/cuda/targets/x86_64-linux/include which is the folder where cuda_profiler_api.h resides. I assume the corresponding CLI option for CL.exe is -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include".

This folder should contain "cuda_profiler_api.h" though.

@oleksandr-pavlyk
Collaborator

Per my agent, the devcontainer does not install the CUDA Profiler API component.

  From the Docker registry metadata, rapidsai/devcontainers:25.12-cuda13.0-cl14.44 installs CUDA through:

  /tools/install-tools.ps1 -cudaVersion 13.0

  That calls install-cuda.ps1, whose component list includes cudart_13.0, cupti_13.0, nvcc_13.0, nvrtc_13.0, nvtx_13.0, crt_13.0, nvvm_13.0, nvptxcompiler_13.0,
  etc., but does not include:

  cuda_profiler_api_13.0

  NVIDIA’s CUDA 13.0 Windows install guide lists cuda_profiler_api_13.0 as a separate subpackage: “CUDA Profiler API” NVIDIA docs
  (https://docs.nvidia.com/cuda/archive/13.0.1/cuda-installation-guide-microsoft-windows/index.html#install-the-cuda-software).

  So the CI error is consistent with the image missing that component. The include path is fine; the header is likely absent from the installed toolkit. The image
  install script should add:

  "cuda_profiler_api_$mmVersionTag"

I will try installing the CUDA Profiler API in the container next.

@oleksandr-pavlyk
Collaborator

/ok to test 460e14f

@oleksandr-pavlyk
Collaborator

/ok to test c6cd097

@oleksandr-pavlyk
Collaborator

/ok to test f8c0554

@oleksandr-pavlyk
Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 16, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7e39f8ee-6c3d-442e-9516-deabdc1e7d19

📥 Commits

Reviewing files that changed from the base of the PR and between 6df6dc8 and f8c0554.

📒 Files selected for processing (11)
  • .github/workflows/build-windows.yml
  • .github/workflows/pr.yml
  • CMakeLists.txt
  • ci/windows/install_cuda_profiler_api.ps1
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • nvbench/CMakeLists.txt
  • nvbench/config.cuh.in
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yml

Comment thread ci/windows/install_cuda_profiler_api.ps1 Outdated
Comment thread ci/windows/install_cuda_profiler_api.ps1 Outdated
Comment thread testing/cmake/CMakeLists.txt
Comment thread testing/cmake/test_export/CMakeLists.txt Outdated
@oleksandr-pavlyk
Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 17, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/build-windows.yml (1)

122-166: ⚡ Quick win

suggestion: Pass NVBENCH_WINDOWS_CUDA, NVBENCH_WINDOWS_STD, and NVBENCH_WINDOWS_ARCH through Docker --env and resolve them in-container.

The generated PowerShell script currently relies on host-side interpolation: because the here-string uses double quotes, $env:NVBENCH_WINDOWS_CUDA and related variables expand when the script is written on the host, not in the container. The Docker args never forward these variables, so the container has no access to them at runtime. Forward them explicitly with --env NVBENCH_WINDOWS_CUDA=... and backtick-escape the $env: references in the here-string so they evaluate in-container instead.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 317fd287-282f-4378-b196-b1fb8d3d98cc

📥 Commits

Reviewing files that changed from the base of the PR and between 6df6dc8 and 177c7b0.

📒 Files selected for processing (11)
  • .github/workflows/build-windows.yml
  • .github/workflows/pr.yml
  • CMakeLists.txt
  • ci/windows/install_cuda_profiler_api.ps1
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • nvbench/CMakeLists.txt
  • nvbench/config.cuh.in
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yml

Comment thread testing/cmake/test_export/CMakeLists.txt
@oleksandr-pavlyk
Collaborator

/ok to test ccfa1b5

@oleksandr-pavlyk
Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 17, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8008810e-7c32-43f8-a2c8-940704148f55

📥 Commits

Reviewing files that changed from the base of the PR and between 6df6dc8 and ccfa1b5.

📒 Files selected for processing (11)
  • .github/workflows/build-windows.yml
  • .github/workflows/pr.yml
  • CMakeLists.txt
  • ci/windows/install_cuda_profiler_api.ps1
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • nvbench/CMakeLists.txt
  • nvbench/config.cuh.in
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yml

Comment thread testing/cmake/CMakeLists.txt Outdated
1. Install the CUDA Profiler API into the toolkit, matching
   what is installed in the dev-container
2. Pass a linker argument to use main from the static nvbench_main
   library when linking examples and tests
3. Instruct MSVC to use the standards-conforming preprocessor
4. Use environment modification for targets to help them
   find shared libraries needed at runtime, such as
   CUPTI on Windows/Linux.

The remainder is an aggregation of 53 individual commit messages

Install CUDA Profiler API into toolkit

Add install_cuda_profiler_api.ps1

Inform MSVC that static library exports main

Attempt to fix "LINK : fatal error LNK1561: entry point must be defined"
when building benchmarks which need main function provided by static
library libnvbench_main after NVIDIA#350

Review feedback to PowerShell script

Fix how CMAKE_CUDA_HOST_COMPILER is set in call to cmake

Filter out empty directories LD_LIBRARY_PATH/PATH

Act on review feedback regarding corner cases when testing
may depend on the directory it is performed from

Check that cudaVersion and :CUDA_PATH are consistent

Do not overwrite ENVIRONMENT property with empty values

Implement retry logic in downloading of CUDA Profiler API

Strengthen publisher verification of downloaded artifact

Prepend new folders to LD_LIBRARY_PATH, do not overwrite

Implement timeout, fail on 4xx HTTP response code

4xx responses now fail immediately, and the installer is bounded
to 15 minutes before being killed and reported as a timeout.

USE ENVIRONMENT_MODIFICATION property, not ENVIRONMENT

escape environment modification values

Fix cmake script error breaking the build

Added recommended timeout to Invoke-WebRequest

Set cmake_minimum_required version to 3.30.4, consistent with main project

Pass NVBENCH environment variables through docker for Windows build

Export IMPORTLIB_LOCATION for CUPTI on Windows and use in testing projects

Add Zc:preprocessor to host compiler on Windows. Configure runtime env for tests to find CUPTI library

Better fix to add /Zc:preprocessor that also propagates to header testing target

Address code rabbit concern

Validate  before casting in PowerShell script

decouple nvbench runtime path setup from cupti target detection

Normalize multiple ARCH args

Better validation of gpu_args parameter

use get_imported_location to get CUPTI library to improve multi-config support

Validation of combinations of gpu, run_tests and device_testing

Resolve code-rabbit concern in handling multiple imported configurations to match build type, if set

Reject GPU requests for forks

Prevents installing cuda_profiler_api.h into one toolkit while CMake builds with another.

Fail fast for deterministic client errors returned by download request

more robust imported_location computation

Make Linux also use ENVIRONMENT_MODIFICATION to simplify code

run_tests=false is not allowed when device_testing=true

Specify Windows CUDA toolkit version major.minor.patch, derive devcontainer tag from full spec

Handle edge case when multiple CUPTI dlls exist, pick up, warn, do not fail

Always specify -DNVBench_ENABLE_DEVICE_TESTING=VAL per value of

Back to cuda major.minor being input

What CUDA Profiler API to install is determined from redist information
stored in version.json stored at root of CUDA Toolkit.

If version.json is not found, an error occurs

Remove parameters intended to enable testing builds on Windows. Deferred for future work

Handle import nvbench::nvbench the same as nvbench target in NVBenchConfigTarget

Forward cmake variables only if set

Use UTF-8 encoding when appending to GITHUB_OUTPUT

Avoid PowerShell footgun where local variable shadows builtin variable due to case insensitivity

enable device testing parameter in build_nvbench, passed as True by workflow

Lower CMake version required as much as possible

LINKER:/INCLUDE:main for proper CUDA link driver routing

Add conda-specific hints for find_library call to find CUPTI

test_export must require 3.22 version

ENVIRONMENT_MODIFICATION feature was added in 3.22.0

https://cmake.org/cmake/help/latest/prop_test/ENVIRONMENT_MODIFICATION.html

Delete unused function Test-Preset

Guard the CUPTI runtime path extraction

Check before executing cmake_path() in testing/cmake/CMakeLists.txt
Also, use nvbench_get_imported_location to extract imported location

use the config-aware generator expression for all runtime targets

Remove the configure-time imported-location helper entirely.

Deduplicate WINDOWS_CI_IMAGE construction
@oleksandr-pavlyk
Collaborator

/ok to test 0ca8414

@oleksandr-pavlyk
Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
testing/cmake/test_export/CMakeLists.txt (1)

1-1: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

important: bump cmake_minimum_required to 3.30.4 (or newer). 3.22.0 is below the repository baseline for **/CMakeLists.txt and can drift from rapids-cmake compatibility expectations.

-cmake_minimum_required(VERSION 3.22.0)
+cmake_minimum_required(VERSION 3.30.4)

As per coding guidelines, **/CMakeLists.txt: CMake minimum version must be 3.30.4 or higher for rapids-cmake compatibility.

#!/bin/bash
# Verify minimum CMake versions declared across CMakeLists.txt files.
fd -a 'CMakeLists.txt' | while read -r f; do
  awk -v file="$f" '
    match($0,/cmake_minimum_required\(VERSION[[:space:]]*([0-9]+\.[0-9]+\.[0-9]+)/,m){
      printf "%s:%d:%s\n", file, NR, m[1]
    }' "$f"
done

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 34f64449-f3b7-4dff-8c2f-e26e3b7212be

📥 Commits

Reviewing files that changed from the base of the PR and between 3d82e58 and 0ca8414.

📒 Files selected for processing (16)
  • .github/workflows/build-windows.yml
  • .github/workflows/pr.yml
  • CMakeLists.txt
  • ci/windows/build_nvbench.ps1
  • ci/windows/install_cuda_profiler_api.ps1
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • examples/CMakeLists.txt
  • exec/CMakeLists.txt
  • nvbench/CMakeLists.txt
  • nvbench/config.cuh.in
  • testing/CMakeLists.txt
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt
  • testing/device/CMakeLists.txt
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yml

@oleksandr-pavlyk
Collaborator

@coderabbitai disagree with nitpick comment made above in #354 (review)

See my counter-argument in #367 (comment) and associated learning.

@coderabbitai

coderabbitai Bot commented May 18, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch the review comment from PR `#354`
gh api repos/NVIDIA/nvbench/pulls/354/reviews/4313987488/comments --jq '.[].body' 2>/dev/null || \
gh api repos/NVIDIA/nvbench/pulls/354/reviews/4313987488 --jq '.body' 2>/dev/null

# Also fetch the specific discussion comment from PR `#367`
gh api repos/NVIDIA/nvbench/pulls/comments/3261813313 --jq '{body: .body, path: .path, diff_hunk: .diff_hunk}' 2>/dev/null

Length of output: 1306


🏁 Script executed:

#!/bin/bash
# Get the full discussion comment from PR `#367`
gh api repos/NVIDIA/nvbench/pulls/comments/3261813313 --jq '.body' 2>/dev/null

Length of output: 1021


🏁 Script executed:

#!/bin/bash
# List all review comments on PR `#367` to find the counter-argument
gh api repos/NVIDIA/nvbench/pulls/367/comments \
  --jq '.[] | select(.user.login == "oleksandr-pavlyk") | {id: .id, body: .body, in_reply_to_id: .in_reply_to_id, path: .path}' 2>/dev/null

Length of output: 370


@oleksandr-pavlyk: Understood. testing/cmake/test_export is an independent CMake project, so the repository-wide 3.30.4 minimum does not apply. The correct floor is 3.22.0 due to ENVIRONMENT_MODIFICATION being introduced in that version. My comment was wrong on both counts — I'll record this.

🐇 (`_/) noted and corrected


✏️ Learnings added
Learnt from: oleksandr-pavlyk
URL: https://github.com/NVIDIA/nvbench/pull/354

Timestamp: 2026-05-18T21:18:10.799Z
Learning: In the NVIDIA/nvbench repository, `testing/cmake/test_export/CMakeLists.txt` is an independent CMake sub-project and is intentionally NOT subject to the repository-wide cmake_minimum_required version (currently 3.30.4 for rapids-cmake compatibility). Do not flag its cmake_minimum_required as inconsistent with the root CMakeLists.txt. The correct minimum for this file is 3.22.0, because it uses the ENVIRONMENT_MODIFICATION test property which was introduced in CMake 3.22.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@oleksandr-pavlyk
Collaborator

oleksandr-pavlyk commented May 18, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
cmake/NVBenchCUPTI.cmake (1)

58-69: 💤 Low value

suggestion: Natural descending sort picks the lexicographically "largest" name, which may not always correspond to the newest version (e.g., cupti64_9.dll > cupti64_10.dll lexicographically, but numerically 10 > 9). Consider using VERSION compare if filenames follow semver patterns, or document the assumption that version suffixes sort correctly.
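
The difference between plain lexicographic order and a digit-aware order can be sketched as follows. This is an illustration only, assuming C++17 and using the two file names from the comment above; `natural_less` and its helpers are hypothetical and are not part of cmake/NVBenchCUPTI.cmake, which is CMake rather than C++.

```cpp
#include <cstddef>
#include <string_view>

// Sketch only (C++17): a digit-aware ("natural") ordering for file names.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Returns the index one past the digit run that starts at pos.
constexpr std::size_t digit_run_end(std::string_view s, std::size_t pos)
{
  while (pos < s.size() && is_digit(s[pos])) { ++pos; }
  return pos;
}

// Drops leading zeros so "007" and "7" compare as the same number.
constexpr std::string_view strip_leading_zeros(std::string_view digits)
{
  while (digits.size() > 1 && digits.front() == '0') { digits.remove_prefix(1); }
  return digits;
}

// True when `a` orders before `b`, comparing digit runs by numeric value.
constexpr bool natural_less(std::string_view a, std::string_view b)
{
  std::size_t i = 0;
  std::size_t j = 0;
  while (i < a.size() && j < b.size())
  {
    if (is_digit(a[i]) && is_digit(b[j]))
    {
      const std::size_t i_end = digit_run_end(a, i);
      const std::size_t j_end = digit_run_end(b, j);
      const std::string_view na = strip_leading_zeros(a.substr(i, i_end - i));
      const std::string_view nb = strip_leading_zeros(b.substr(j, j_end - j));
      // Without leading zeros, a longer digit run is a larger number.
      if (na.size() != nb.size()) { return na.size() < nb.size(); }
      if (na != nb) { return na < nb; }
      i = i_end;
      j = j_end;
    }
    else
    {
      if (a[i] != b[j]) { return a[i] < b[j]; }
      ++i;
      ++j;
    }
  }
  // One string is a prefix of the other: the shorter one orders first.
  return i == a.size() && j != b.size();
}

// Plain lexicographic order puts "9" after "10" because '9' > '1'.
static_assert(std::string_view("cupti64_10.dll") < std::string_view("cupti64_9.dll"));
// The digit-aware order compares 9 and 10 as numbers.
static_assert(natural_less("cupti64_9.dll", "cupti64_10.dll"));
```

Sorting candidate names with a comparator like this and taking the last element selects the numerically largest suffix, which is the behavior the comment asks to either adopt or document.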


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e6acfbeb-b3fd-458a-b299-411c657c85dc

📥 Commits

Reviewing files that changed from the base of the PR and between 3d82e58 and 0ca8414.

📒 Files selected for processing (16)
  • .github/workflows/build-windows.yml
  • .github/workflows/pr.yml
  • CMakeLists.txt
  • ci/windows/build_nvbench.ps1
  • ci/windows/install_cuda_profiler_api.ps1
  • cmake/NVBenchCUPTI.cmake
  • cmake/NVBenchConfigTarget.cmake
  • examples/CMakeLists.txt
  • exec/CMakeLists.txt
  • nvbench/CMakeLists.txt
  • nvbench/config.cuh.in
  • testing/CMakeLists.txt
  • testing/axes_metadata.cu
  • testing/cmake/CMakeLists.txt
  • testing/cmake/test_export/CMakeLists.txt
  • testing/device/CMakeLists.txt
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yml

Comment thread ci/windows/install_cuda_profiler_api.ps1
@oleksandr-pavlyk
Collaborator

I was also able to build NVBench in conda (on a machine where Visual Studio Community 2026 with build tools for 19.44 is installed):

Build steps in conda

conda create -n ctk-13 --yes cuda-compiler cuda-cupti-dev cuda-profiler-api cuda-nvml-dev cmake ninja git
conda activate ctk-13
git clone https://github.com/NVIDIA/nvbench -b windows-support
cd nvbench
cmake -B build_conda -G Ninja --preset nvbench-dev
cmake --build build_conda
ctest --test-dir build_conda
Test run, conda env
C:\Users\opavlyk\work\nvbench>ctest --test-dir build_conda
Test project C:/Users/opavlyk/work/nvbench/build_conda
      Start  1: nvbench.ctl.no_args
 1/52 Test  #1: nvbench.ctl.no_args ...........................   Passed    3.35 sec
      Start  2: nvbench.ctl.version
 2/52 Test  #2: nvbench.ctl.version ...........................   Passed    0.12 sec
      Start  3: nvbench.ctl.list
 3/52 Test  #3: nvbench.ctl.list ..............................   Passed    0.24 sec
      Start  4: nvbench.ctl.l
 4/52 Test  #4: nvbench.ctl.l .................................   Passed    0.24 sec
      Start  5: nvbench.ctl.help
 5/52 Test  #5: nvbench.ctl.help ..............................   Passed    0.13 sec
      Start  6: nvbench.ctl.h
 6/52 Test  #6: nvbench.ctl.h .................................   Passed    0.12 sec
      Start  7: nvbench.ctl.help_axes
 7/52 Test  #7: nvbench.ctl.help_axes .........................   Passed    0.14 sec
      Start  8: nvbench.ctl.help_axis
 8/52 Test  #8: nvbench.ctl.help_axis .........................   Passed    0.12 sec
      Start  9: nvbench.example.cpp17.auto_throughput
 9/52 Test  #9: nvbench.example.cpp17.auto_throughput .........   Passed    1.84 sec
      Start 10: nvbench.example.cpp17.axes
10/52 Test #10: nvbench.example.cpp17.axes ....................   Passed    7.76 sec
      Start 11: nvbench.example.cpp17.custom_criterion
11/52 Test #11: nvbench.example.cpp17.custom_criterion ........   Passed    1.99 sec
      Start 12: nvbench.example.cpp17.cpu_only
12/52 Test #12: nvbench.example.cpp17.cpu_only ................   Passed   11.36 sec
      Start 13: nvbench.example.cpp17.enums
13/52 Test #13: nvbench.example.cpp17.enums ...................   Passed    2.33 sec
      Start 14: nvbench.example.cpp17.exec_tag_sync
14/52 Test #14: nvbench.example.cpp17.exec_tag_sync ...........   Passed    2.55 sec
      Start 15: nvbench.example.cpp17.exec_tag_timer
15/52 Test #15: nvbench.example.cpp17.exec_tag_timer ..........   Passed    1.96 sec
      Start 16: nvbench.example.cpp17.skip
16/52 Test #16: nvbench.example.cpp17.skip ....................   Passed    3.15 sec
      Start 17: nvbench.example.cpp17.stream
17/52 Test #17: nvbench.example.cpp17.stream ..................   Passed    1.94 sec
      Start 18: nvbench.example.cpp17.summaries
18/52 Test #18: nvbench.example.cpp17.summaries ...............   Passed    3.49 sec
      Start 19: nvbench.example.cpp17.throughput
19/52 Test #19: nvbench.example.cpp17.throughput ..............   Passed    2.02 sec
      Start 20: nvbench.test.axes_metadata
20/52 Test #20: nvbench.test.axes_metadata ....................   Passed    0.74 sec
      Start 21: nvbench.test.benchmark
21/52 Test #21: nvbench.test.benchmark ........................   Passed    1.75 sec
      Start 22: nvbench.test.create
22/52 Test #22: nvbench.test.create ...........................   Passed    0.87 sec
      Start 23: nvbench.test.cuda_timer
23/52 Test #23: nvbench.test.cuda_timer .......................   Passed    2.41 sec
      Start 24: nvbench.test.cuda_stream
24/52 Test #24: nvbench.test.cuda_stream ......................   Passed    1.87 sec
      Start 25: nvbench.test.cpu_timer
25/52 Test #25: nvbench.test.cpu_timer ........................   Passed    1.12 sec
      Start 26: nvbench.test.criterion_manager
26/52 Test #26: nvbench.test.criterion_manager ................   Passed    0.93 sec
      Start 27: nvbench.test.criterion_params
27/52 Test #27: nvbench.test.criterion_params .................   Passed    0.90 sec
      Start 28: nvbench.test.custom_main_custom_args
28/52 Test #28: nvbench.test.custom_main_custom_args ..........   Passed    1.82 sec
      Start 29: nvbench.test.custom_main_custom_exceptions
29/52 Test #29: nvbench.test.custom_main_custom_exceptions ....   Passed    1.81 sec
      Start 30: nvbench.test.custom_main_global_state_raii
30/52 Test #30: nvbench.test.custom_main_global_state_raii ....   Passed    1.82 sec
      Start 31: nvbench.test.enum_type_list
31/52 Test #31: nvbench.test.enum_type_list ...................   Passed    0.73 sec
      Start 32: nvbench.test.entropy_criterion
32/52 Test #32: nvbench.test.entropy_criterion ................   Passed    0.81 sec
      Start 33: nvbench.test.float64_axis
33/52 Test #33: nvbench.test.float64_axis .....................   Passed    0.96 sec
      Start 34: nvbench.test.int64_axis
34/52 Test #34: nvbench.test.int64_axis .......................   Passed    0.95 sec
      Start 35: nvbench.test.named_values
35/52 Test #35: nvbench.test.named_values .....................   Passed    0.91 sec
      Start 36: nvbench.test.option_parser
36/52 Test #36: nvbench.test.option_parser ....................   Passed    1.66 sec
      Start 37: nvbench.test.range
37/52 Test #37: nvbench.test.range ............................   Passed    0.85 sec
      Start 38: nvbench.test.reset_error
38/52 Test #38: nvbench.test.reset_error ......................   Passed    1.78 sec
      Start 39: nvbench.test.ring_buffer
39/52 Test #39: nvbench.test.ring_buffer ......................   Passed    0.86 sec
      Start 40: nvbench.test.runner
40/52 Test #40: nvbench.test.runner ...........................   Passed    0.77 sec
      Start 41: nvbench.test.state
41/52 Test #41: nvbench.test.state ............................   Passed    1.81 sec
      Start 42: nvbench.test.statistics
42/52 Test #42: nvbench.test.statistics .......................   Passed    0.87 sec
      Start 43: nvbench.test.state_generator
43/52 Test #43: nvbench.test.state_generator ..................   Passed    1.05 sec
      Start 44: nvbench.test.stdrel_criterion
44/52 Test #44: nvbench.test.stdrel_criterion .................   Passed    0.73 sec
      Start 45: nvbench.test.string_axis
45/52 Test #45: nvbench.test.string_axis ......................   Passed    0.86 sec
      Start 46: nvbench.test.type_axis
46/52 Test #46: nvbench.test.type_axis ........................   Passed    0.95 sec
      Start 47: nvbench.test.type_list
47/52 Test #47: nvbench.test.type_list ........................   Passed    0.98 sec
      Start 48: nvbench.test.cmake.test_export.build_tree
48/52 Test #48: nvbench.test.cmake.test_export.build_tree .....   Passed   89.16 sec
      Start 50: nvbench.test.cmake.install_tree.install
49/52 Test #50: nvbench.test.cmake.install_tree.install .......   Passed    0.53 sec
      Start 49: nvbench.test.cmake.test_export.install_tree
50/52 Test #49: nvbench.test.cmake.test_export.install_tree ...   Passed   88.72 sec
      Start 51: nvbench.test.cmake.install_tree.cleanup
51/52 Test #51: nvbench.test.cmake.install_tree.cleanup .......   Passed    0.08 sec
      Start 52: nvbench.test.device.noisy_bench
52/52 Test #52: nvbench.test.device.noisy_bench ...............***Failed  Error regular expression found in output. Regex=[Warn] 36.51 sec

98% tests passed, 1 tests failed out of 52

Total Test time (real) = 293.76 sec

The following tests FAILED:
         52 - nvbench.test.device.noisy_bench (Failed)
Errors while running CTest
Output from these tests are in: C:/Users/opavlyk/work/nvbench/build_conda/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

C:\Users\opavlyk\work\nvbench>conda env export
name: ctk-13
channels:
  - conda-forge
dependencies:
  - c-compiler=1.11.0=h528c1b4_0
  - cuda-cccl_win-64=13.2.75=h57928b3_0
  - cuda-compiler=13.2.1=h559df3f_0
  - cuda-crt-dev_win-64=13.2.78=h57928b3_0
  - cuda-crt-tools=13.2.78=h57928b3_0
  - cuda-ctadvisor=13.2.78=hac47afa_0
  - cuda-cudart=13.2.75=hac47afa_0
  - cuda-cudart-dev=13.2.75=hac47afa_0
  - cuda-cudart-dev_win-64=13.2.75=hac47afa_0
  - cuda-cudart-static=13.2.75=hac47afa_0
  - cuda-cudart-static_win-64=13.2.75=hac47afa_0
  - cuda-cudart_win-64=13.2.75=hac47afa_0
  - cuda-cuobjdump=13.2.78=hac47afa_0
  - cuda-cupti=13.2.75=hac47afa_0
  - cuda-cupti-dev=13.2.75=hac47afa_0
  - cuda-cuxxfilt=13.2.78=hac47afa_0
  - cuda-nvcc=13.2.78=h8f04d04_0
  - cuda-nvcc-dev_win-64=13.2.78=h36c15f3_0
  - cuda-nvcc-impl=13.2.78=h53cbb54_0
  - cuda-nvcc-tools=13.2.78=he0c23c2_0
  - cuda-nvcc_win-64=13.2.78=hd70436c_0
  - cuda-nvdisasm=13.2.78=hac47afa_0
  - cuda-nvml-dev=13.2.82=hac47afa_0
  - cuda-nvprune=13.2.78=hac47afa_0
  - cuda-nvvm-dev_win-64=13.2.78=h57928b3_0
  - cuda-nvvm-impl=13.2.78=h2466b09_0
  - cuda-nvvm-tools=13.2.78=h2466b09_0
  - cuda-profiler-api=13.2.75=h57928b3_0
  - cuda-tileiras=13.2.78=hac47afa_0
  - cuda-version=13.2=he2cc418_3
  - cxx-compiler=1.11.0=h1c1089f_0
  - git=2.54.0=h57928b3_0
  - libnvptxcompiler-dev=13.2.78=h57928b3_0
  - libnvptxcompiler-dev_win-64=13.2.78=h57928b3_0
  - ucrt=10.0.26100.0=h57928b3_0
  - vc=14.5=h1b7c187_36
  - vc14_runtime=14.51.36231=h1b9f54f_36
  - vcomp14=14.51.36231=h1b9f54f_36
  - vs2019_win-64=19.29.30139=h7dcff83_36
  - vs2022_win-64=19.44.35207=ha74f236_36
  - vswhere=3.1.7=h40126e0_1
prefix: C:\Users\opavlyk\miniforge\envs\ctk-13
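
For a log as long as the test run above, it can help to pull out only the tests that did not pass. The following is a minimal sketch, not part of NVBench or this PR: the helper name `failed_tests` and the regular expression are hypothetical, and the pattern assumes the result-line layout seen in this particular CTest output.

```python
import re

# Assumed layout of a CTest result line, taken from the log in this comment:
#   "52/52 Test #52: nvbench.test.device.noisy_bench ....***Failed  ... 36.51 sec"
RESULT_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(?P<num>\d+):\s+(?P<name>\S+)\s+\.+\s*"
    r"(?P<status>\*\*\*\w+|Passed)"
)


def failed_tests(ctest_output: str) -> list[tuple[int, str]]:
    """Return (test number, test name) for every result line not marked Passed."""
    failures = []
    for line in ctest_output.splitlines():
        match = RESULT_LINE.match(line)
        # "Start N: ..." lines and summary lines do not match and are skipped.
        if match and match.group("status") != "Passed":
            failures.append((int(match.group("num")), match.group("name")))
    return failures


# Two result lines copied from the log above.
sample = """\
51/52 Test #51: nvbench.test.cmake.install_tree.cleanup .......   Passed    0.08 sec
52/52 Test #52: nvbench.test.device.noisy_bench ...............***Failed  Error regular expression found in output. Regex=[Warn] 36.51 sec
"""

print(failed_tests(sample))  # [(52, 'nvbench.test.device.noisy_bench')]
```

The returned names can then be passed to `ctest -R <name> --output-on-failure` to re-run a single test verbosely, or the `--rerun-failed --output-on-failure` invocation suggested by the log can be used directly.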

@oleksandr-pavlyk
Collaborator

/ok to test 2aaf76e

@oleksandr-pavlyk
Collaborator

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Collaborator

@alliepiper alliepiper left a comment

LGTM. This can go in as is; my notes can be addressed in a follow-up.

It might be a good idea to point an agent at this PR and the following tag/commits to see if there's anything else that might be worth restoring. I can't remember if I mentioned these before:

tag pre_msvc_drop: https://github.com/NVIDIA/nvbench/blob/pre_msvc_drop
Commit that removed MSVC: 93ea533 (Part of #200)

Comment thread cmake/NVBenchConfigTarget.cmake
@oleksandr-pavlyk oleksandr-pavlyk merged commit 4a33a61 into NVIDIA:main May 19, 2026
51 checks passed

3 participants