Parakeet RTX first-run startup guidance (NVBugs 6195601) #2141

Open
kheiss-uwzoo wants to merge 3 commits into NVIDIA:main from kheiss-uwzoo:kheiss/6195601

Conversation

@kheiss-uwzoo
Collaborator

Summary

  • NVBugs 6195601 — Document that the Parakeet/Riva speech NIM does not ship prebuilt TensorRT profiles on RTX workstation SKUs and may need a long first-run engine build before the audio pod becomes Ready.
  • Update the hardware support matrix: B200 remains unsupported for self-hosted audio (footnote 4); RTX Pro 6000 and RTX PRO 4500 show supported with footnote 5 operational guidance (profile build, default startupProbe behavior, kubectl wait up to 30m, do not submit audio jobs until Ready).
  • Add a one-line pointer from the audio/video Helm section to matrix footnote 5.

Helm chart startupProbe defaults are wontfix for the Retriever Library team per QA disposition; this PR is documentation only on main.

Test plan

  • Review prerequisites-support-matrix.md footnotes 1, 4, and 5 and audio matrix rows for RTX vs B200.
  • Review audio-video.md Helm section link to footnote 5.
  • Confirm PR diff is two markdown files only.

…5601)

Document RTX workstation profile-build wait behavior and kubectl readiness check. Split B200 (unsupported) from RTX SKUs in the support matrix; link from audio-video Helm section.
@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners May 27, 2026 21:46
@kheiss-uwzoo kheiss-uwzoo requested a review from ChrisJar May 27, 2026 21:46
@greptile-apps
Contributor

greptile-apps Bot commented May 27, 2026

Greptile Summary

This documentation-only PR adds first-run startup guidance for the Parakeet/Riva speech NIM on RTX workstation GPUs (RTX Pro 6000 Blackwell and RTX PRO 4500 Blackwell), documenting that these GPUs lack prebuilt TensorRT profiles and require a runtime engine build before the audio pod becomes Ready.

  • Updates the hardware support matrix to move RTX Pro 6000 and RTX PRO 4500 from "Not supported⁴" to "1¹⁵" (supported with footnotes 1 and 5), scoping footnote ⁴ to B200 only.
  • Adds new footnote ⁵ explaining the startupProbe / CrashLoopBackOff behavior during the engine build, with kubectl wait --timeout=30m guidance and a link to the Helm chart README for adjusting startupProbe.failureThreshold.
  • Adds a one-line pointer in audio-video.md from the Helm deployment section to the new footnote ⁵.

Confidence Score: 5/5

Documentation-only change with no code or configuration modifications; safe to merge.

Both changed files are Markdown documentation. The hardware matrix updates and new footnote ⁵ are internally consistent: B200 remains unsupported, RTX workstation GPUs are now marked supported with appropriate caveats, and the kubectl wait / startupProbe guidance is technically accurate. The issues previously flagged in review threads (kubectl wait stdout behavior, link anchor precision, footnote ordering) are style/accuracy nits in prose that do not affect deployed software.

No files require special attention beyond the open review thread items already on record.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/prerequisites-support-matrix.md | Table rows updated to mark RTX Pro 6000 and RTX PRO 4500 as supported (1¹⁵); new footnote ⁵ added with kubectl wait / startupProbe guidance; footnote ⁴ narrowed to B200 only. Footnotes now appear out of numerical order (¹, ⁴, ⁵, ², ³) and the kubectl wait output description is inaccurate; both were flagged in prior review threads. |
| docs/docs/extraction/audio-video.md | Adds a one-line first-run guidance paragraph linking to the matrix footnote ⁵; the link target resolves to the section heading rather than the footnote anchor, as flagged in a prior review thread. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Deploy Parakeet NIM via Helm] --> B{GPU type?}
    B -->|B200| C[Not supported — use hosted Parakeet or nimOperator.audio.enabled=false]
    B -->|RTX Pro 6000 / RTX PRO 4500| D[NIM starts runtime Riva/TensorRT engine build]
    B -->|H100 / H200 / A100 / A10G / L40S| E[May need runtime engine build per footnote 1]
    D --> F{startupProbe passes before build finishes?}
    F -->|No — CrashLoopBackOff| G[Treat as build in progress kubectl wait --timeout=30m]
    F -->|Yes| H[Audio NIM pod Ready]
    G --> H
    G -->|Still not Ready after 30m| I[Inspect logs kubectl logs deploy/audio Check Speech NIM support matrix]
    I --> J[Increase startupProbe.failureThreshold e.g. ~360 in Helm values]
    J --> K[Redeploy and wait again]
    K --> H
    E --> H
```

Reviews (3): Last reviewed commit: "docs(extraction): revert duplicate footn..." | Re-trigger Greptile

```
kubectl wait --for=condition=Ready pod -n <namespace> -l 'app.kubernetes.io/name=audio' --timeout=30m
```

The command exits with code 0 when the pod is Ready and prints nothing until then. If the pod never becomes Ready within 30 minutes, inspect `kubectl logs -n <namespace> deploy/audio` and the [NVIDIA Speech NIM support matrix](https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/index.html). To allow a longer probe window on retry, increase `startupProbe.failureThreshold` on the audio NIM deployment (for example ~360) per the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet).
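
To make the `~360` figure in the footnote above concrete: Kubernetes restarts a container once its startupProbe has failed `failureThreshold` times in a row, so the probe tolerates roughly `failureThreshold × periodSeconds` of build time (plus any `initialDelaySeconds`). The sketch below only does that arithmetic. The helper names are hypothetical, and `periodSeconds=10` is an assumption (the Kubernetes default), not a value confirmed from the Helm chart.

```shell
# probe_window_seconds FAILURE_THRESHOLD PERIOD_SECONDS [INITIAL_DELAY_SECONDS]
# Approximate time (seconds) a startupProbe tolerates before the container is restarted.
probe_window_seconds() {
  echo $(( ${3:-0} + $1 * $2 ))
}

# min_failure_threshold BUILD_SECONDS PERIOD_SECONDS
# Smallest failureThreshold whose window covers BUILD_SECONDS (ceiling division).
min_failure_threshold() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Assumption: periodSeconds=10 (Kubernetes default); check the chart's actual value.
probe_window_seconds 360 10     # prints 3600 (about one hour)
min_failure_threshold 1800 10   # prints 180 (enough to cover a 30m build)
```

If the chart uses a different `periodSeconds`, the same threshold yields a proportionally different window, so it may be worth reading the rendered probe spec before choosing a value.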

P2 kubectl wait output description is inaccurate

The sentence "The command exits with code 0 when the pod is Ready and prints nothing until then" is incorrect: `kubectl wait` does print a line like `pod/audio-xxxxx condition met` to stdout on success. A reader who sees that output after being told to expect silence may think something went wrong.
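
Since `kubectl wait` does write to stdout on success, a script that gates audio jobs on readiness is probably safer keying on the exit status than on the output. The following is a minimal sketch under that assumption; `wait_for_audio_nim` is a hypothetical helper name, while the label selector and `deploy/audio` target are taken from the documentation text under review.

```shell
# wait_for_audio_nim NAMESPACE [TIMEOUT]
# Blocks until the audio NIM pod is Ready, using kubectl's exit status
# (0 on success, non-zero on timeout or error) instead of parsing its output.
wait_for_audio_nim() {
  namespace=$1
  timeout=${2:-30m}
  if kubectl wait --for=condition=Ready pod \
       -n "$namespace" \
       -l 'app.kubernetes.io/name=audio' \
       --timeout="$timeout"; then
    echo "audio NIM is Ready; safe to submit audio jobs"
  else
    echo "audio NIM not Ready after $timeout; inspect: kubectl logs -n $namespace deploy/audio" >&2
    return 1
  fi
}

# Example invocation (namespace is a placeholder):
# wait_for_audio_nim my-namespace 30m
```

On success the `pod/... condition met` line from `kubectl wait` still appears before the helper's own message, which is expected.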

For Kubernetes deployment details, see the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet).

On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#model-hardware-requirements).

P2 The link destination `#model-hardware-requirements` lands at the top of the "Model Hardware Requirements" section, not at footnote ⁵ itself. Since footnote ⁵ appears roughly 30 lines below that heading, users clicking the link must scroll to find the relevant guidance. Consider adding an explicit anchor (e.g. `{ #rtx-startup-guidance }`) to footnote ⁵ in the matrix doc and updating this link to point to it.

Suggested change

```suggestion
On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#rtx-startup-guidance).
```

@kheiss-uwzoo kheiss-uwzoo changed the title from "docs(extraction): Parakeet RTX first-run startup guidance (NVBugs 6195601)" to "Parakeet RTX first-run startup guidance (NVBugs 6195601)" May 27, 2026
@kheiss-uwzoo kheiss-uwzoo added the doc Improvements or additions to documentation label May 28, 2026