feat(linux): optional GPU (CUDA) Parakeet via sherpa-onnx CUDA prebuilt #85
Closed
nephalemsec wants to merge 2 commits into
Adds an opt-in `parakeet-cuda` feature that runs Parakeet on an NVIDIA GPU through sherpa-onnx's CUDA execution provider. Default builds are unchanged (Parakeet stays CPU; macOS stays CoreML).

- Cargo: `parakeet-cuda = ["parakeet", "sherpa-onnx/shared"]`. sherpa-onnx is now `default-features = false` so the link mode is explicit. The default `parakeet` feature still resolves to static CPU (sherpa-onnx-sys defaults to static when no link feature is set), so nothing changes for current builds.
- parakeet.rs: under `parakeet-cuda` on Linux, request the `cuda` provider and fall back to CPU if the EP can't initialise (mirrors the macOS CoreML path).
- flake: a `cuda` dev shell fetches the k2-fsa CUDA prebuilt (CUDA 12 / cuDNN 9) and wires SHERPA_ONNX_LIB_DIR + LD_LIBRARY_PATH (sherpa libs, cuDNN, cudart, cublas, driver) so the CUDA provider loads at build and runtime.

Build/run:

    nix develop .#cuda
    pnpm tauri dev --no-default-features --features parakeet-cuda,vulkan

Verified: links against the CUDA sherpa-onnx (libonnxruntime_providers_cuda.so) and the default CPU build is unaffected. Real GPU engagement should be confirmed on-device with `nvidia-smi` during a transcription, because onnxruntime silently falls back to CPU if cuDNN/driver libs aren't found.
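The provider fallback described for parakeet.rs can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `provider_order`, `init_with_fallback`, and the `init` callback are hypothetical names, the `"cpu"` provider string is an assumption, and the sketch only covers failures that surface as an `Err` from initialisation.

```rust
/// Providers to try, in order of preference.
/// `cuda_enabled` stands in for "built with `parakeet-cuda` on Linux";
/// the "cpu" provider string is an assumption made for illustration.
fn provider_order(cuda_enabled: bool) -> Vec<&'static str> {
    if cuda_enabled {
        vec!["cuda", "cpu"]
    } else {
        vec!["cpu"]
    }
}

/// Try each provider in turn and return the first value that initialises,
/// together with the name of the provider that produced it.
/// `init` is a hypothetical stand-in for whatever builds the recognizer.
/// Panics if `providers` is empty.
fn init_with_fallback<T, E, F>(
    providers: &[&'static str],
    mut init: F,
) -> Result<(T, &'static str), E>
where
    F: FnMut(&str) -> Result<T, E>,
{
    let mut last_err = None;
    for &provider in providers {
        match init(provider) {
            Ok(value) => return Ok((value, provider)),
            Err(err) => {
                eprintln!("provider {provider} failed to initialise, trying next");
                last_err = Some(err);
            }
        }
    }
    Err(last_err.expect("provider list must not be empty"))
}
```

Returning the provider name alongside the value lets the caller log which backend actually ended up in use, rather than which one was requested.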
The onnxruntime CUDA provider dlopen()s libcublasLt/cublas/curand/cufft/cudart/cudnn; a single missing one aborts the process (no CPU fallback). The cuda dev shell only had cudnn/cudart/cublas, so it failed on libcurand.so.10. Add libcurand, libcufft, libcusparse to LD_LIBRARY_PATH (cublasLt ships with cublas).
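Because a single missing library aborts the process, it may help to check the search path before starting a transcription. The sketch below is a hypothetical pre-flight helper, not part of this PR: `CUDA_LIB_STEMS`, `dir_has_lib`, and `missing_libs` are illustrative names, and it only scans the directories in an `LD_LIBRARY_PATH`-style string (it does not consult the system linker cache). It matches on name stems because version suffixes differ between CUDA releases.

```rust
use std::fs;
use std::path::Path;

/// Name stems of the libraries the CUDA provider is described as loading.
/// Stems are used instead of full sonames because version suffixes vary.
const CUDA_LIB_STEMS: &[&str] = &[
    "libcublasLt.so",
    "libcublas.so",
    "libcurand.so",
    "libcufft.so",
    "libcudart.so",
    "libcudnn.so",
];

/// True if `dir` contains a file whose name starts with `stem`.
/// Unreadable or missing directories count as "not found".
fn dir_has_lib(dir: &Path, stem: &str) -> bool {
    let Ok(entries) = fs::read_dir(dir) else {
        return false;
    };
    entries
        .flatten()
        .any(|entry| entry.file_name().to_string_lossy().starts_with(stem))
}

/// Return the stems that are not found in any directory of `search_path`,
/// where `search_path` is a colon-separated list like LD_LIBRARY_PATH.
fn missing_libs(search_path: &str, stems: &[&str]) -> Vec<String> {
    stems
        .iter()
        .filter(|&&stem| {
            !search_path
                .split(':')
                .filter(|dir| !dir.is_empty())
                .any(|dir| dir_has_lib(Path::new(dir), stem))
        })
        .map(|&stem| stem.to_string())
        .collect()
}
```

A caller could pass `std::env::var("LD_LIBRARY_PATH").unwrap_or_default()` together with `CUDA_LIB_STEMS` and refuse to request the `cuda` provider when the returned list is non-empty.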
Superseded by #86 ( |
Opt-in GPU Parakeet for NVIDIA, off current main. Default builds are byte-for-byte unchanged (Parakeet stays CPU; macOS stays CoreML).
How it works
Parakeet runs through `sherpa-onnx` (ONNX Runtime). The crate downloads CPU prebuilts by default, but k2-fsa also ships a CUDA prebuilt (`...-cuda-12.x-cudnn-9.x-linux-x64-gpu`) containing `libonnxruntime_providers_cuda.so`. The crate's `SHERPA_ONNX_LIB_DIR` hook lets us link that instead.

- Cargo: `parakeet-cuda = ["parakeet", "sherpa-onnx/shared"]`. `sherpa-onnx` is now `default-features = false`; the default `parakeet` feature still resolves to static CPU (sherpa-onnx-sys defaults to static when no link feature is set), so current builds are unaffected.
- parakeet.rs: under `parakeet-cuda` on Linux, request the `cuda` provider and fall back to CPU if it can't init. This mirrors the existing macOS CoreML path.
- flake: a `cuda` dev shell fetches the k2-fsa CUDA prebuilt and wires `SHERPA_ONNX_LIB_DIR` + `LD_LIBRARY_PATH` (sherpa libs, cuDNN, cudart, cublas, driver).

Use it
Enter the CUDA dev shell with `nix develop .#cuda`, then run `pnpm tauri dev --no-default-features --features parakeet-cuda,vulkan`.

What's verified vs. not
- Verified: links against the CUDA sherpa-onnx (`libonnxruntime_providers_cuda.so`); `cargo check`/build clean; rustfmt clean.
- Verified: the default CPU build is unaffected (`cargo check --features parakeet` passes).
- Not verified: real GPU engagement, which should be confirmed on-device with `nvidia-smi` during a transcription. onnxruntime silently falls back to CPU if cuDNN/driver libs aren't found at runtime. The logs print `Attempting CUDA provider...` / `CUDA provider initialised`, but that alone doesn't prove the GPU is doing the work, hence the nvidia-smi check.

This is additive and opt-in, so it won't disturb the default CPU/Metal paths.
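The on-device `nvidia-smi` check could also be scripted. The following is a rough sketch under stated assumptions, not something this PR ships: `pid_in_compute_apps` and `process_uses_gpu` are hypothetical helpers, and the check only shows that the process appears in nvidia-smi's compute-apps list (it holds a CUDA context), which is stronger evidence than the log lines but still worth pairing with a look at GPU utilisation.

```rust
use std::process::Command;

/// Returns true if `pid` appears in the first column of CSV output such as
/// "1234, 512 MiB", as produced by a `pid,used_memory` compute-apps query.
fn pid_in_compute_apps(csv: &str, pid: u32) -> bool {
    csv.lines()
        .filter_map(|line| line.split(',').next())
        .filter_map(|field| field.trim().parse::<u32>().ok())
        .any(|found| found == pid)
}

/// Ask nvidia-smi for the current compute processes and report whether
/// `pid` is among them. Returns an I/O error if nvidia-smi cannot be run.
fn process_uses_gpu(pid: u32) -> std::io::Result<bool> {
    let output = Command::new("nvidia-smi")
        .args([
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader",
        ])
        .output()?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(output.status.success() && pid_in_compute_apps(&stdout, pid))
}
```

Calling `process_uses_gpu(std::process::id())` from inside the app during a transcription would be one way to use it; note that inside containers the PID seen by nvidia-smi may differ from the process's own view.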