Various madmatrix improvements and bugfixes #3

Open. theoheimel wants to merge 12 commits into main from feat-madmatrix-theo

@theoheimel
Contributor

GPU:

  • reduce the number of cudaMallocAsync calls in the UMAMI interface
  • allow for fully asynchronous calls of sigmaKin without device or stream synchronization
  • allow for parallelization over helicities in a single kernel launch instead of using CUDA streams
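
One common way to cut down on allocation calls is to request a single buffer and hand out aligned sub-ranges of it. The PR does not say which approach it takes, so the following is only a host-side C++ sketch of the offset arithmetic under that assumption. `align_up`, `ArenaLayout`, `plan_arena` and the 256-byte default alignment are hypothetical names and values, not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): round n up to a multiple of align.
inline std::size_t align_up(std::size_t n, std::size_t align) {
    // The bit trick below only works for power-of-two alignments.
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

// Byte offsets of each sub-buffer inside one large allocation.
struct ArenaLayout {
    std::vector<std::size_t> offsets;  // offsets[i] is where buffer i starts
    std::size_t total_bytes = 0;       // size to request in a single allocation
};

// Plan one allocation that holds all requested buffers back to back,
// each starting on an aligned boundary. The 256-byte default is an
// assumption chosen for illustration.
inline ArenaLayout plan_arena(const std::vector<std::size_t>& sizes,
                              std::size_t align = 256) {
    ArenaLayout layout;
    layout.offsets.reserve(sizes.size());
    std::size_t cursor = 0;
    for (std::size_t bytes : sizes) {
        cursor = align_up(cursor, align);
        layout.offsets.push_back(cursor);
        cursor += bytes;
    }
    layout.total_bytes = align_up(cursor, align);
    return layout;
}
```

With a layout like this, device code would call the allocator once for `total_bytes` and derive each buffer pointer as base plus `offsets[i]`, so the number of allocation calls no longer grows with the number of buffers.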

SIMD:

  • use one flavor index per SIMD vector instead of one per batch
  • reorder events in UMAMI so that all events within a SIMD vector share the same flavor index
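
The reordering in the second SIMD item can be sketched as a grouping step: bucket events by flavor index, then cut each bucket into vectors of fixed width. This is a minimal C++ illustration, not the PR's implementation. `FlavorGrouping` and `group_by_flavor` are hypothetical names, and padding incomplete vectors with `-1` is an assumption, since the PR does not state how partially filled vectors are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Result of the grouping (hypothetical type, not from the PR).
struct FlavorGrouping {
    // slot_to_event[v * width + lane] is the original event index placed in
    // that lane, or -1 for a padding lane (padding is an assumption).
    std::vector<std::ptrdiff_t> slot_to_event;
    // vector_flavor[v] is the single flavor index shared by vector v.
    std::vector<int> vector_flavor;
};

// Reorder events so that every SIMD vector of `width` lanes contains
// events of one flavor only.
inline FlavorGrouping group_by_flavor(const std::vector<int>& flavor,
                                      std::size_t width) {
    assert(width > 0);

    // Bucket event indices by flavor; std::map keeps flavors sorted and
    // each bucket preserves the original event order.
    std::map<int, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < flavor.size(); ++i) {
        buckets[flavor[i]].push_back(i);
    }

    FlavorGrouping out;
    for (const auto& entry : buckets) {
        const std::vector<std::size_t>& events = entry.second;
        for (std::size_t start = 0; start < events.size(); start += width) {
            out.vector_flavor.push_back(entry.first);
            for (std::size_t lane = 0; lane < width; ++lane) {
                const std::size_t k = start + lane;
                out.slot_to_event.push_back(
                    k < events.size() ? static_cast<std::ptrdiff_t>(events[k])
                                      : -1);
            }
        }
    }
    return out;
}
```

Keeping `slot_to_event` around lets the caller scatter per-event results back into the original event order after the vectorized computation.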

@theoheimel requested a review from Qubitol on May 28, 2026, 12:32