Parakeet RTX first-run startup guidance (NVBugs 6195601) by kheiss-uwzoo · Pull Request #2141 · NVIDIA/NeMo-Retriever

kheiss-uwzoo · 2026-05-27T21:46:20Z

Summary

NVBugs 6195601 — Document that the Parakeet/Riva speech NIM does not ship prebuilt TensorRT profiles on RTX workstation SKUs and may need a long first-run engine build before the audio pod becomes Ready.
Update the hardware support matrix: B200 remains unsupported for self-hosted audio (footnote 4); RTX Pro 6000 and RTX PRO 4500 are now shown as supported, with footnote 5 giving operational guidance (profile build, default startupProbe behavior, kubectl wait for up to 30m, and not submitting audio jobs until the pod is Ready).
Add a one-line pointer from the audio/video Helm section to matrix footnote 5.
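
The "do not submit audio jobs until Ready" guidance can be turned into a simple gate. The following is a minimal sketch, not part of the PR: the function name `audio_pod_ready` is hypothetical, the label selector is the one used in footnote 5's `kubectl wait` command, and the check assumes a single audio pod matches the selector.

```shell
# Sketch: succeed only if the audio NIM pod reports the Ready condition.
# Assumes exactly one pod matches the selector; with several pods the
# jsonpath output would be a space-separated list and this check would fail.
audio_pod_ready() {
  local namespace="$1" status
  status="$(kubectl get pods -n "$namespace" \
    -l 'app.kubernetes.io/name=audio' \
    -o jsonpath='{.items[*].status.conditions[?(@.type=="Ready")].status}')"
  [ "$status" = "True" ]
}
```

A caller could run `audio_pod_ready <namespace> && echo "safe to submit audio jobs"` before starting extraction. A non-zero exit status means the pod is missing, still building, or not yet Ready.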

Helm chart startupProbe defaults are wontfix for the Retriever Library team per QA disposition; this PR is documentation only on main.

Test plan

Review prerequisites-support-matrix.md footnotes 1, 4, and 5 and audio matrix rows for RTX vs B200.
Review audio-video.md Helm section link to footnote 5.
Confirm PR diff is two markdown files only.

greptile-apps · 2026-05-27T21:48:47Z

Greptile Summary

This documentation-only PR adds first-run startup guidance for the Parakeet/Riva speech NIM on RTX workstation GPUs (RTX Pro 6000 Blackwell and RTX PRO 4500 Blackwell), documenting that these GPUs lack prebuilt TensorRT profiles and require a runtime engine build before the audio pod becomes Ready.

Updates the hardware support matrix to move RTX Pro 6000 and RTX PRO 4500 from "Not supported⁴" to "1¹⁵" (supported with footnotes 1 and 5), scoping footnote ⁴ to B200 only.
Adds new footnote ⁵ explaining the startupProbe / CrashLoopBackOff behavior during the engine build, with kubectl wait --timeout=30m guidance and a link to the Helm chart README for adjusting startupProbe.failureThreshold.
Adds a one-line pointer in audio-video.md from the Helm deployment section to the new footnote ⁵.
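
To size `startupProbe.failureThreshold`, it helps to recall that Kubernetes lets a container fail its startup probe for roughly `initialDelaySeconds + failureThreshold × periodSeconds` before restarting it. The sketch below only does that arithmetic. The helper name is hypothetical, and the 10-second period is the Kubernetes default, assumed here because the PR does not state the chart's actual `periodSeconds`.

```shell
# Sketch: approximate startup-probe window in seconds.
# Arguments: failureThreshold, periodSeconds (assumed default 10),
# initialDelaySeconds (assumed default 0).
probe_window_seconds() {
  local failure_threshold="$1"
  local period_seconds="${2:-10}"
  local initial_delay="${3:-0}"
  echo $(( initial_delay + failure_threshold * period_seconds ))
}

probe_window_seconds 360 10   # prints 3600
```

Under those assumptions, a threshold of about 360 gives a window of roughly 60 minutes, comfortably longer than the 30-minute `kubectl wait` timeout in footnote 5. If the chart uses a different period, the window scales accordingly.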

Confidence Score: 5/5

Documentation-only change with no code or configuration modifications; safe to merge.

Both changed files are Markdown documentation. The hardware matrix updates and new footnote ⁵ are internally consistent: B200 remains unsupported, RTX workstation GPUs are now marked supported with appropriate caveats, and the kubectl wait / startupProbe guidance is technically accurate. The issues previously flagged in review threads (kubectl wait stdout behavior, link anchor precision, footnote ordering) are style/accuracy nits in prose that do not affect deployed software.

No files require special attention beyond the open review thread items already on record.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/prerequisites-support-matrix.md | Table rows updated to mark RTX Pro 6000 and RTX PRO 4500 as supported (1¹⁵); new footnote ⁵ added with kubectl wait / startupProbe guidance; footnote ⁴ narrowed to B200 only. Footnotes now appear out of numerical order (¹, ⁴, ⁵, ², ³) and the kubectl wait output description is inaccurate; both issues were flagged in prior review threads. |
| docs/docs/extraction/audio-video.md | Adds a one-line first-run guidance paragraph linking to the matrix footnote ⁵; the link target resolves to the section heading rather than the footnote anchor, as flagged in a prior review thread. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Deploy Parakeet NIM via Helm] --> B{GPU type?}
    B -->|B200| C[Not supported — use hosted Parakeet or nimOperator.audio.enabled=false]
    B -->|RTX Pro 6000 / RTX PRO 4500| D[NIM starts runtime Riva/TensorRT engine build]
    B -->|H100 / H200 / A100 / A10G / L40S| E[May need runtime engine build per footnote 1]
    D --> F{startupProbe passes before build finishes?}
    F -->|No — CrashLoopBackOff| G[Treat as build in progress kubectl wait --timeout=30m]
    F -->|Yes| H[Audio NIM pod Ready]
    G --> H
    G -->|Still not Ready after 30m| I[Inspect logs kubectl logs deploy/audio Check Speech NIM support matrix]
    I --> J[Increase startupProbe.failureThreshold e.g. ~360 in Helm values]
    J --> K[Redeploy and wait again]
    K --> H
    E --> H

Reviews (3): Last reviewed commit: "docs(extraction): revert duplicate footn..."

greptile-apps · 2026-05-27T21:48:59Z

+kubectl wait --for=condition=Ready pod -n <namespace> -l 'app.kubernetes.io/name=audio' --timeout=30m
+```
+
+The command exits with code 0 when the pod is Ready and prints nothing until then. If the pod never becomes Ready within 30 minutes, inspect `kubectl logs -n <namespace> deploy/audio` and the [NVIDIA Speech NIM support matrix](https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/index.html). To allow a longer probe window on retry, increase `startupProbe.failureThreshold` on the audio NIM deployment (for example ~360) per the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet).


kubectl wait output description is inaccurate

The sentence "The command exits with code 0 when the pod is Ready and prints nothing until then" is incorrect: `kubectl wait` does print a line like `pod/audio-xxxxx condition met` to stdout on success. A reader who sees that output after being told to expect silence may think something went wrong.
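
To make the success and timeout paths concrete, here is a minimal sketch of a wrapper around the documented command. The function name `wait_for_audio_ready` and its messages are hypothetical; the selector, the 30-minute default, and the `kubectl logs` follow-up come from footnote 5.

```shell
# Sketch: wait for the audio NIM pod and report the outcome.
# On success, kubectl itself prints a line such as
# "pod/audio-xxxxx condition met" before the message below.
wait_for_audio_ready() {
  local namespace="$1"
  local timeout="${2:-30m}"
  if kubectl wait --for=condition=Ready pod \
      -n "$namespace" \
      -l 'app.kubernetes.io/name=audio' \
      --timeout="$timeout"; then
    echo "audio NIM pod is Ready"
  else
    # Non-zero exit: timed out, or no pod matched the selector.
    echo "audio NIM pod not Ready after $timeout; check: kubectl logs -n $namespace deploy/audio" >&2
    return 1
  fi
}
```

Note that `kubectl wait` also exits non-zero immediately when no resource matches the selector, so a failure right after `helm install` may simply mean the pod has not been created yet.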

File: docs/docs/extraction/prerequisites-support-matrix.md, line 144

greptile-apps · 2026-05-27T21:49:01Z


 For Kubernetes deployment details, see the [NeMo Retriever Helm chart README](https://github.com/NVIDIA/NeMo-Retriever/blob/main/nemo_retriever/helm/README.md#audio-video-parakeet).

+On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#model-hardware-requirements).


The link destination `#model-hardware-requirements` lands at the top of the "Model Hardware Requirements" section, not at footnote ⁵ itself. Since footnote ⁵ appears roughly 30 lines below that heading, users clicking the link must scroll to find the relevant guidance. Consider adding an explicit anchor (e.g. `{ #rtx-startup-guidance }`) to footnote ⁵ in the matrix doc and updating this link to point to it.

Suggested change

On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#model-hardware-requirements).

On first deploy, the Parakeet/Riva NIM may need a long Riva/TensorRT engine build (especially on RTX GPUs without a prebuilt profile). Wait until the audio NIM pod is Ready before you run extraction jobs—see [Pre-Requisites & Support Matrix — footnote ⁵](prerequisites-support-matrix.md#rtx-startup-guidance).

File: docs/docs/extraction/audio-video.md, line 66

docs(extraction): Parakeet RTX first-run startup guidance (NVBugs 619…

8a2b130

…5601) Document RTX workstation profile-build wait behavior and kubectl readiness check. Split B200 (unsupported) from RTX SKUs in the support matrix; link from audio-video Helm section.

kheiss-uwzoo requested review from a team as code owners May 27, 2026 21:46

kheiss-uwzoo requested a review from ChrisJar May 27, 2026 21:46

greptile-apps Bot reviewed May 27, 2026

kheiss-uwzoo changed the title ~~docs(extraction): Parakeet RTX first-run startup guidance (NVBugs 6195601)~~ Parakeet RTX first-run startup guidance (NVBugs 6195601) May 27, 2026

kheiss-uwzoo added 2 commits May 27, 2026 15:00

Merge branch 'main' into kheiss/6195601

3bac006

docs(extraction): revert duplicate footnote 1 startup guidance (6195601)

7503e83

Keep first-run wait/probe notes in audio-video.md and RTX footnote 5 only.

kheiss-uwzoo added the doc Improvements or additions to documentation label May 28, 2026

kheiss-uwzoo wants to merge 3 commits into NVIDIA:main from kheiss-uwzoo:kheiss/6195601
