eBPF agent and MCP server for GPU causal observability
Updated May 16, 2026 · Language: C
High-performance LLM inference engine in C++/CUDA for NVIDIA Blackwell GeForce / RTX PRO (RTX 5090/5080/5070 Ti, RTX PRO 6000; sm_120). 200 tok/s decode on Qwen3.6-35B-A3B-NVFP4 MoE (RTX 5090).
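A back-of-envelope sketch (not from the repo) of why 200 tok/s single-stream decode is plausible on this hardware: MoE decode is typically memory-bandwidth bound, and the assumed numbers below (~3B active parameters per token from the "A3B" suffix, 4-bit NVFP4 weights, ~1.79 TB/s peak bandwidth on RTX 5090) are estimates, not figures from the project.

```python
# Hypothetical bandwidth-ceiling estimate for single-stream MoE decode.
# Ignores KV-cache reads, FP4 scale factors, and router overhead.
ACTIVE_PARAMS = 3e9       # "A3B": ~3B parameters active per token (assumption)
BYTES_PER_PARAM = 0.5     # NVFP4: 4-bit weights
BANDWIDTH = 1.792e12      # RTX 5090 peak memory bandwidth, bytes/s (spec sheet)

bytes_per_token = ACTIVE_PARAMS * BYTES_PER_PARAM   # weight bytes read per token
ceiling = BANDWIDTH / bytes_per_token               # tokens/s if purely BW-bound
print(f"~{ceiling:.0f} tok/s bandwidth ceiling")
```

Under these assumptions the ceiling is roughly 1200 tok/s, so a measured 200 tok/s is well within the memory-bound envelope once real overheads are counted.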
Prefill performance study on Qwen2.5-7B using vLLM. Compares static vs mixed (bucketed) prefill under eager execution and CUDA Graphs, with controlled concurrency and real-world latency/throughput metrics.
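The static-vs-bucketed distinction above can be sketched as follows. This is an illustrative model, not code from the study: prompts are padded up to the nearest of a few fixed lengths so that one pre-captured CUDA graph per bucket can be replayed, trading padding overhead for graph reuse. The bucket sizes and function names here are hypothetical.

```python
# Hypothetical bucketed-prefill model: pad each prompt to the smallest
# bucket that fits it, so a CUDA graph captured per bucket can be reused.
BUCKETS = [128, 256, 512, 1024, 2048]  # assumed padded prefill lengths

def pick_bucket(prompt_len: int, buckets=BUCKETS) -> int:
    """Return the smallest bucket length that fits the prompt."""
    for b in buckets:
        if prompt_len <= b:
            return b
    raise ValueError(f"prompt of {prompt_len} tokens exceeds largest bucket")

def padding_overhead(prompt_lens, buckets=BUCKETS) -> float:
    """Fraction of prefill compute spent on padding tokens."""
    padded = sum(pick_bucket(n, buckets) for n in prompt_lens)
    real = sum(prompt_lens)
    return 1.0 - real / padded

lens = [90, 300, 700]           # pad to 128, 512, 1024 respectively
print(round(padding_overhead(lens), 3))
```

Static prefill corresponds to a single bucket (maximum length, maximum padding); a mixed scheme with several buckets cuts the padding overhead while keeping a small, fixed set of graph shapes to capture.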
Optimized CSM-1B TTS pipeline for RTX 5090 (Blackwell sm_120). CUDA graph replay via patched HF Transformers. ~0.46x RTF. Topics: csm, text-to-speech, rtx-5090, blackwell, cuda-graphs, torch-compile, sesame, streaming, pytorch
GB10 inference port; see fork.md