Parakeet RTX first-run startup guidance (NVBugs 6195601) #2141
kheiss-uwzoo wants to merge 3 commits into
…5601) Document RTX workstation profile-build wait behavior and kubectl readiness check. Split B200 (unsupported) from RTX SKUs in the support matrix; link from audio-video Helm section.
Greptile Summary
This documentation-only PR adds first-run startup guidance for the Parakeet/Riva speech NIM on RTX workstation GPUs (RTX Pro 6000 Blackwell and RTX PRO 4500 Blackwell), documenting that these GPUs lack prebuilt TensorRT profiles and require a runtime engine build before the audio pod becomes Ready.
| Filename | Overview |
|---|---|
| docs/docs/extraction/prerequisites-support-matrix.md | Table rows updated to mark RTX Pro 6000 and RTX PRO 4500 as supported (1¹⁵); new footnote ⁵ added with kubectl wait / startupProbe guidance; footnote ⁴ narrowed to B200 only. Footnotes now appear out of numerical order (¹, ⁴, ⁵, ², ³) and the kubectl wait output description is inaccurate — both flagged in prior review threads. |
| docs/docs/extraction/audio-video.md | Adds a one-line first-run guidance paragraph linking to the matrix footnote ⁵; the link target resolves to the section heading rather than the footnote anchor — flagged in a prior review thread. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Deploy Parakeet NIM via Helm] --> B{GPU type?}
B -->|B200| C[Not supported — use hosted Parakeet or nimOperator.audio.enabled=false]
B -->|RTX Pro 6000 / RTX PRO 4500| D[NIM starts runtime Riva/TensorRT engine build]
B -->|H100 / H200 / A100 / A10G / L40S| E[May need runtime engine build per footnote 1]
D --> F{startupProbe passes before build finishes?}
F -->|No, CrashLoopBackOff| G[Treat as build in progress<br/>kubectl wait --timeout=30m]
F -->|Yes| H[Audio NIM pod Ready]
G --> H
G -->|Still not Ready after 30m| I[Inspect logs<br/>kubectl logs deploy/audio<br/>Check Speech NIM support matrix]
I --> J[Increase startupProbe.failureThreshold e.g. ~360 in Helm values]
J --> K[Redeploy and wait again]
K --> H
E --> H
Reviews (3): Last reviewed commit: "docs(extraction): revert duplicate footn..."
| kubectl wait --for=condition=Ready pod -n <namespace> -l 'app.kubernetes.io/name=audio' --timeout=30m | ||
| ``` | ||
| The command exits with code 0 when the pod is Ready and prints nothing until then. If the pod never becomes Ready within 30 minutes, inspect `kubectl logs -n <namespace> deploy/audio` and the [NVIDIA Speech NIM support matrix](https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/index.html). To allow a longer probe window on retry, increase `startupProbe.failureThreshold` on the audio NIM deployment (for example ~360) per the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet). |
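To put the `startupProbe.failureThreshold` example in context: Kubernetes gives a container roughly `failureThreshold × periodSeconds` to pass its startup probe. The sketch below works through that arithmetic for the ~360 example. The `periodSeconds` value is an assumption (the Kubernetes default of 10 seconds); the audio NIM chart may set a different value, so check the deployed probe spec before relying on the result.

```shell
# Assumed values: failure_threshold comes from the "~360" example above;
# period_seconds is the Kubernetes default (10) and may differ in the chart.
failure_threshold=360
period_seconds=10

# Approximate time the container has to pass its startup probe.
window_seconds=$((failure_threshold * period_seconds))
echo "startup window: ${window_seconds}s ($((window_seconds / 60)) min)"
# prints: startup window: 3600s (60 min)
```

Under these assumptions the probe window (about 60 minutes) is longer than the 30-minute `kubectl wait` timeout, so a second `kubectl wait` may be needed after a retry.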
kubectl wait output description is inaccurate
The sentence "The command exits with code 0 when the pod is Ready and prints nothing until then" is incorrect — kubectl wait does print a line like pod/audio-xxxxx condition met to stdout on success. A reader who sees that output after being told to expect silence may think something went wrong.
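A minimal sketch of how the documented check could be scripted so that both the success line and the exit code are handled. The function name `wait_for_audio_pod` is hypothetical, and the label selector is copied from the command quoted in the diff. The exact text `kubectl wait` prints may vary by version, so the script branches on the exit status only.

```shell
# Hypothetical helper: wait for the audio NIM pod and report the outcome.
# $1 is the namespace; the selector matches the command quoted in the diff.
wait_for_audio_pod() {
  ns="$1"
  if kubectl wait --for=condition=Ready pod -n "$ns" \
      -l 'app.kubernetes.io/name=audio' --timeout=30m; then
    # On success kubectl itself prints a line such as
    # "pod/audio-xxxxx condition met" before returning 0.
    echo "audio NIM pod is Ready"
  else
    # Non-zero exit: the wait timed out or failed for another reason.
    echo "audio NIM pod not Ready; inspect: kubectl logs -n $ns deploy/audio" >&2
    return 1
  fi
}
```

Usage would look like `wait_for_audio_pod <namespace> && echo "safe to submit audio jobs"`, with `<namespace>` replaced by the namespace of the deployment.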
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 144
| For Kubernetes deployment details, see the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet). | ||
| On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#model-hardware-requirements). |
The link destination `#model-hardware-requirements` lands at the top of the "Model Hardware Requirements" section, not at footnote ⁵ itself. Since footnote ⁵ appears roughly 30 lines below that heading, users who click the link must scroll to find the relevant guidance. Consider adding an explicit anchor (e.g. `{ #rtx-startup-guidance }`) to footnote ⁵ in the matrix doc and updating this link to point to it.
| On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#rtx-startup-guidance). |
Path: docs/docs/extraction/audio-video.md
Line: 66
Keep first-run wait/probe notes in audio-video.md and RTX footnote 5 only.
Summary
- `kubectl wait` up to 30m, do not submit audio jobs until Ready).
- Helm chart startupProbe defaults are wontfix for the Retriever Library team per QA disposition; this PR is documentation only on main.
Test plan
- `prerequisites-support-matrix.md` footnotes 1, 4, and 5 and audio matrix rows for RTX vs B200.
- `audio-video.md` Helm section link to footnote 5.