feat(concurrency): Expose WITH_OPENMP option and implement native C++11 std::thread fallback#104

Open
clintmorris229 wants to merge 15 commits into
SarahWeiii:mainfrom
clintmorris229:feat/std-threads-fallback

Conversation

@clintmorris229 clintmorris229 commented May 29, 2026

Contributor

The Issue

CoACD currently uses OpenMP for concurrency. This causes issues in specific environments:

  • Sandboxed Environments: Mixing compiler toolchains can cause dynamic library conflicts (e.g., libomp.so vs libgomp.so.1), resulting in SIGSEGV crashes.
  • macOS & WebAssembly (WASM): Lack of out-of-the-box OpenMP support forces execution to run sequentially.

Changes

Introduced a C++17 std::thread fallback for environments where OpenMP is unavailable or problematic.

  1. CMake Options: Added WITH_OPENMP (default ON, but OFF on macOS) and WITH_STD_THREADS (the fallback, enabled only when OpenMP is disabled).
  2. Thread Spawning: Implemented chunk-based multithreading in src/process.cpp using std::thread.
  3. Optimized Headers: Wrapped threading headers (<thread>, <mutex>, <chrono>) in #if defined(WITH_STD_THREADS) to prevent compile-time overhead when disabled.
  4. Execution Paths: Used preprocessor directives to cleanly route the loop through OpenMP, C++17 std::thread, or Sequential execution.
  5. Thread Safety:
    • Implemented std::mutex and std::lock_guard for safe concurrent vector aggregation in src/process.cpp.
    • Implemented std::exception_ptr to safely capture and rethrow worker thread exceptions in the main thread.
    • Fixed Data Race in Sobol Generator: Wrapped the call to the thread-unsafe third-party i4_sobol function in src/model_obj.cpp with a std::mutex to prevent data races on its internal static state variables during parallel execution.
  6. Build Fix: Set global CMAKE_POSITION_INDEPENDENT_CODE ON to resolve static linking conflicts with FetchContent dependencies.
  7. CI Fix: Quoted CMAKE_POLICY_VERSION_MINIMUM in .github/workflows/build.yml to prevent argument splitting on the updated Windows runner image (CMake 4.3).
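
Items 2 and 5 above can be sketched roughly as follows. This is a minimal illustration, not the code in src/process.cpp: the names `parallel_for_chunks` and `collect_squares` are hypothetical, and the sketch assumes one contiguous chunk per worker, a mutex around the shared result vector, and a single std::exception_ptr that keeps the first worker exception for rethrow after join().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical helper (not CoACD's actual code): run body(i) for i in [0, n)
// on up to num_threads workers, each handling one contiguous chunk.
void parallel_for_chunks(std::size_t n, unsigned num_threads,
                         const std::function<void(std::size_t)>& body) {
    if (n == 0) return;
    if (num_threads == 0) num_threads = 1;
    const std::size_t workers = std::min<std::size_t>(num_threads, n);
    const std::size_t chunk = (n + workers - 1) / workers;  // ceil(n / workers)

    std::exception_ptr first_error;  // first exception thrown by any worker
    std::mutex error_mutex;
    std::vector<std::thread> threads;
    threads.reserve(workers);

    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back([&, begin, end] {
            try {
                for (std::size_t i = begin; i < end; ++i) body(i);
            } catch (...) {
                // Never let an exception leave the thread: that would call
                // std::terminate(). Store it for the main thread instead.
                std::lock_guard<std::mutex> lock(error_mutex);
                if (!first_error) first_error = std::current_exception();
            }
        });
    }
    for (auto& t : threads) t.join();
    if (first_error) std::rethrow_exception(first_error);
}

// Example use: aggregate results into a shared vector under a mutex.
std::vector<int> collect_squares(int n, unsigned num_threads) {
    std::vector<int> out;
    std::mutex out_mutex;
    parallel_for_chunks(static_cast<std::size_t>(n), num_threads,
                        [&](std::size_t i) {
                            const int v = static_cast<int>(i * i);
                            std::lock_guard<std::mutex> lock(out_mutex);
                            out.push_back(v);
                        });
    std::sort(out.begin(), out.end());  // arrival order is nondeterministic
    return out;
}
```

Keeping only the first exception is a design choice for brevity; a per-worker vector of std::exception_ptr would preserve every failure at the cost of a slightly larger sketch.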

Outcomes

  • Backward Compatible: Existing behavior on Windows/Linux is unchanged; both continue to default to OpenMP. macOS now defaults to the std::thread fallback.
  • Expanded Concurrency: Compiling with OpenMP disabled automatically triggers the C++17 fallback, restoring parallel performance on macOS, WASM, and strict sandboxes without requiring a C++20 compiler.
  • Performance Parity: In the benchmarks below, std::thread runtimes stay within roughly 2% of OpenMP across all tested mesh sizes (up to 100K faces).

Benchmark: C++17 std::thread vs OpenMP (Threshold = 0.05)

Benchmarked on 16 cores using various models, including the Stanford Armadillo (99,976 faces).

Table 1: Convex Hull Limit = 1

| Model (Faces) | OpenMP Baseline | Basic Threads |
| --- | --- | --- |
| SnowFlake.obj (2208) | 7.55 s | 7.42 s |
| Kettle.obj (7302) | 50.85 s | 50.71 s |
| KitchenPot.obj (23288) | 31.38 s | 31.41 s |
| Octocat-v2.obj (40246) | 28.10 s | 28.14 s |
| armadillo.obj (99976) | 40.79 s | 41.55 s |

Table 2: Convex Hull Limit = 10

| Model (Faces) | OpenMP Baseline | Basic Threads |
| --- | --- | --- |
| SnowFlake.obj (2208) | 7.51 s | 7.45 s |
| Kettle.obj (7302) | 47.89 s | 47.76 s |
| KitchenPot.obj (23288) | 29.88 s | 29.76 s |
| Octocat-v2.obj (40246) | 26.87 s | 26.95 s |
| armadillo.obj (99976) | 38.76 s | 39.61 s |

Table 3: Convex Hull Limit = -1 (Unlimited)

| Model (Faces) | OpenMP Baseline | Basic Threads |
| --- | --- | --- |
| SnowFlake.obj (2208) | 6.65 s | 6.51 s |
| Kettle.obj (7302) | 35.34 s | 35.46 s |
| KitchenPot.obj (23288) | 23.47 s | 23.48 s |
| Octocat-v2.obj (40246) | 24.78 s | 24.51 s |
| armadillo.obj (99976) | 35.49 s | 35.75 s |

@clintmorris229 clintmorris229 force-pushed the feat/std-threads-fallback branch 3 times, most recently from 5568d79 to 4516215 Compare May 29, 2026 21:45
@clintmorris229 clintmorris229 changed the title Feat/std threads fallback feat(concurrency): Expose WITH_OPENMP option and implement native C++11 std::thread fallback May 29, 2026
Provide type-safe C++11 std::thread fallback block under #if defined(_OPENMP) in process.cpp, and add WITH_OPENMP CMake option in CMakeLists.txt.

Strips all OpenMP compiler/linker bindings when compiled with -DWITH_OPENMP=OFF, allowing stable parallel concurrency out-of-the-box in secure sandboxes, macOS, and WebAssembly.

fix(concurrency): Implement 64-bit LL integer overflow safeguard inside loop index math
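
The commit title above does not say which expression overflowed, so the following is only a guess at the general pattern: when a chunk boundary is computed as `w * total / workers` in 32-bit int, the product can overflow even though the final quotient fits. The helper name `chunk_bounds` and its parameters are hypothetical; the sketch simply widens the intermediate product to long long.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: half-open range [begin, end) of chunk `w` (0-based)
// when `total` items are split across `workers` chunks. The multiplication is
// done in long long so that w * total cannot overflow a 32-bit int; the
// quotient is at most `total`, so narrowing back to int is safe.
std::pair<int, int> chunk_bounds(int total, int workers, int w) {
    const long long begin = static_cast<long long>(w) * total / workers;
    const long long end = static_cast<long long>(w + 1) * total / workers;
    return {static_cast<int>(begin), static_cast<int>(end)};
}
```

Computing both bounds from the same formula also guarantees that consecutive chunks tile the range without gaps or overlap.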
@clintmorris229 clintmorris229 force-pushed the feat/std-threads-fallback branch 5 times, most recently from 08f9cca to 20eec72 Compare June 1, 2026 19:03
@clintmorris229 clintmorris229 force-pushed the feat/std-threads-fallback branch from 20eec72 to 5abb356 Compare June 1, 2026 19:27
@clintmorris229 clintmorris229 force-pushed the feat/std-threads-fallback branch from f8570e4 to 96ed730 Compare June 2, 2026 04:59
@clintmorris229 clintmorris229 marked this pull request as ready for review June 2, 2026 20:14
@clintmorris229

Contributor Author

@SarahWeiii Any chance you can take a look at this?

@SarahWeiii

Owner

Hi @clintmorris229, thanks for the PR — parallelism on macOS/sandboxed builds is genuinely valuable. But the current implementation has a few problems that need fixing before merge:

  1. std::thread silently becomes the default everywhere, not a fallback. WITH_STD_THREADS defaults to ON and the dispatch checks it before _OPENMP, so default Linux/Windows builds (and pip wheels) switch both hot loops to std::thread even when OpenMP is found and linked. This contradicts "Windows/Linux continue defaulting to OpenMP."

  2. Inconsistent #if ordering between execution and locking. The loops check WITH_STD_THREADS first, but the lock/timing blocks check _OPENMP first. In a default build the loops run on std::thread while synchronizing via omp_lock_t — the std::mutex branch is never even compiled. Clearly unintended; the orderings must agree.

  3. The default build doesn't fix your own use case. OpenMP is still linked by default, so the libomp/libgomp SIGSEGV conflict remains unless the user passes -DWITH_OPENMP=OFF.

  4. Exceptions escaping the std::thread workers call std::terminate() (e.g., "Wrong clip proposal!"), whereas on macOS, where the loop currently runs sequentially, the same exception propagates to the caller. Please capture per worker and rethrow after join().

  5. Benchmark: please clarify how the "OpenMP Baseline" was built. If it was this branch with default options, both columns ran the same std::thread path (see point 1 above), which might explain the near-identical numbers. Also, t=0.1/0.2 keeps the waiting pool too small to stress 16 cores; I would suggest rerunning at the default t=0.05 with confirmed flags per column.

Requested change: make this a true fallback — in CMake, only define WITH_STD_THREADS when OpenMP is not used, and put #if defined(_OPENMP) first in all conditionals so they agree. With that, the exception handling, and an updated benchmark, I'd be happy to merge.
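
The ordering requested here could look roughly like the sketch below. It assumes only what the review states: `_OPENMP` is checked first and `WITH_STD_THREADS` second, in every conditional. The function `concurrency_backend` is a hypothetical helper added for illustration and is not part of CoACD.

```cpp
#include <cassert>
#include <string>

// Header selection uses the same ordering as every other conditional:
// OpenMP first, then the std::thread fallback, then sequential.
#if defined(_OPENMP)
#include <omp.h>
#elif defined(WITH_STD_THREADS)
#include <mutex>
#include <thread>
#endif

// Hypothetical helper: reports which backend this translation unit selected.
// Loop dispatch and lock selection would repeat exactly this #if chain, so a
// std::thread loop can never end up paired with omp_lock_t or vice versa.
inline std::string concurrency_backend() {
#if defined(_OPENMP)
    return "openmp";
#elif defined(WITH_STD_THREADS)
    return "std::thread";
#else
    return "sequential";
#endif
}
```

Because `_OPENMP` is defined by the compiler only when OpenMP is actually enabled, checking it first means a stray `WITH_STD_THREADS` definition cannot override an OpenMP build.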

clintmorris229 added a commit to clintmorris229/CoACD that referenced this pull request Jun 15, 2026
- Make WITH_STD_THREADS a true fallback in CMake.
- Default WITH_OPENMP to OFF on macOS.
- Enable POSITION_INDEPENDENT_CODE globally in CMake.
- Reorder conditional blocks in process.cpp to prioritize OpenMP checks.
- Capture and rethrow worker thread exceptions in std::thread fallback.
- Add mutex to wrap i4_sobol calls in model_obj.cpp to fix data races on its static variables.

TAG=agy
CONV=a67ee12d-d8d2-49db-ba0a-2b332d53d03c
@clintmorris229 clintmorris229 force-pushed the feat/std-threads-fallback branch from 2633ff0 to 73e7f66 Compare June 15, 2026 17:58
…t splitting

The recent redirect of windows-2025 runners to windows-2025-vs2026 (using CMake 4.3)
introduced an issue where -DCMAKE_POLICY_VERSION_MINIMUM=3.5 was split into two
arguments (-DCMAKE_POLICY_VERSION_MINIMUM=3 and .5), causing CMake errors on Windows.
Quoting the argument ensures it is parsed as a single string.

TAG=agy
CONV=a67ee12d-d8d2-49db-ba0a-2b332d53d03c
@clintmorris229

clintmorris229 commented Jun 15, 2026

Contributor Author

@SarahWeiii, thanks for the feedback! I have updated the PR to address your comments:

  1. True Fallback: Updated CMakeLists.txt so WITH_STD_THREADS is only enabled when OpenMP is disabled (WITH_OPENMP=OFF). On macOS, WITH_OPENMP now defaults to OFF to prevent dynamic library conflicts out-of-the-box.
  2. Consistent Compilation Ordering: Reordered #if blocks in src/process.cpp to evaluate #if defined(_OPENMP) first everywhere, aligning loops with locking blocks.
  3. Exception Handling: Captured worker exceptions using std::exception_ptr and rethrew them after join() to prevent std::terminate() crashes.
  4. Updated Benchmarks: Reran at the default t = 0.05 on 16 cores and updated the PR description with the new tables.
  5. Thread-Safety Fix: Identified a data race in the third-party i4_sobol generator (it uses static variables). I wrapped the call in src/model_obj.cpp with a std::mutex to ensure thread safety during parallel execution.
  6. CI Fix: Fixed a Windows CI failure caused by a recent runner image update (CMake 4.3) by quoting the CMAKE_POLICY_VERSION_MINIMUM argument in the workflow file to prevent argument splitting.
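
The thread-safety fix in item 5 follows a common pattern, sketched below under stated assumptions. `unsafe_next_value` is a stand-in for a generator with function-local static state and does not reproduce the real `i4_sobol` signature; `next_value_locked` and `draw_concurrently` are likewise hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Stand-in for a generator that keeps its state in a function-local static,
// as the PR describes for i4_sobol. This is NOT the real i4_sobol signature.
int unsafe_next_value() {
    static int counter = 0;  // shared, unsynchronized state
    return counter++;
}

// Thread-safe wrapper: one process-wide mutex serializes every call, so the
// static state is never read and written by two threads at once.
int next_value_locked() {
    static std::mutex generator_mutex;
    std::lock_guard<std::mutex> lock(generator_mutex);
    return unsafe_next_value();
}

// Demo: several threads draw values through the wrapper. Each thread writes
// only to its own vector, so the generator is the only shared state.
std::set<int> draw_concurrently(int threads, int draws_per_thread) {
    std::vector<std::vector<int>> local(threads);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&local, t, draws_per_thread] {
            for (int k = 0; k < draws_per_thread; ++k)
                local[t].push_back(next_value_locked());
        });
    }
    for (auto& th : pool) th.join();
    std::set<int> all;
    for (const auto& v : local) all.insert(v.begin(), v.end());
    return all;
}
```

Serializing the generator trades some parallelism for correctness, which is reasonable when each call is cheap relative to the work done between calls.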

bazel-io pushed a commit to bazelbuild/bazel-central-registry that referenced this pull request Jun 25, 2026
)

This PR adds a new version `1.0.11.bcr.3` for the `coacd` module.

The previous version `1.0.11.bcr.2` enabled OpenMP by default. However,
OpenMP support can be difficult to configure and build portably across
all platforms in the BCR (especially on macOS with default Clang
toolchains).

This release configures the Bazel build to run without OpenMP and
instead utilizes a standard C++ threading fallback, ensuring wider
compatibility and easier integration across Linux, macOS, and Windows.

#### Changes

* Added `std_threads.patch`:
  * Backports changes to support standard C++17 `std::thread` multi-threading when OpenMP is disabled.
  * Implements thread-safe guards (e.g., locking the shared `i4_sobol` quasirandom generator) to ensure correctness during parallel execution.
* Updated `overlay/BUILD.bazel`:
  * Removed OpenMP compiler flags (`-fopenmp`, `/openmp`) and the `@openmp` dependency.
  * Added `WITH_STD_THREADS=1` to the compiler defines to enable the standard library threading path.
  * Simplified platform-specific configuration for Windows, macOS, and Linux.
* Updated `metadata.json`: Registered the new `1.0.11.bcr.3` version.


Reference this [PR](SarahWeiii/CoACD#104) for
benchmarking info