Skip to content

OpenShift follow-up for NVBug 6218358#2166

Open
kheiss-uwzoo wants to merge 3 commits into
NVIDIA:mainfrom
kheiss-uwzoo:docs/6218358-openshift-followup
Open

OpenShift follow-up for NVBug 6218358#2166
kheiss-uwzoo wants to merge 3 commits into
NVIDIA:mainfrom
kheiss-uwzoo:docs/6218358-openshift-followup

Conversation

@kheiss-uwzoo
Copy link
Copy Markdown
Collaborator

@kheiss-uwzoo kheiss-uwzoo commented May 29, 2026

Summary

  • Expands Helm README OpenShift guidance per QA RC9 validation (NVBugs 6218358 comment Update README.md #6): prebuilt ffmpeg image on restricted-v2, openshift-restricted.yaml profile, internal registry pull secrets, optional NIM LD_LIBRARY_PATH overrides, and Omni caption manual smoke-test request shape.
  • Cross-links from deployment-options.md and prerequisites-support-matrix.md (one sentence + helm anchor each; no deploy prose leakage).

NVBugs

6218358 — OpenShift Helm chart SCC/PodSecurity documentation follow-up (comment #6).

PR scope check

  • Base: main
  • Commits: 2 — OpenShift follow-up docs + page-role cross-link alignment
  • Files changed: nemo_retriever/helm/README.md, docs/docs/extraction/deployment-options.md, docs/docs/extraction/prerequisites-support-matrix.md
  • Red flags: none (docs only; no .cursor/, link-audit artifacts, chart/code changes)

Document RC9 QA findings: prebuilt ffmpeg image on restricted-v2, openshift-restricted values profile, internal registry pull secrets, optional NIM LD_LIBRARY_PATH overrides, and Omni caption smoke-test request shape. Cross-link from deployment-options and prerequisites-support-matrix.
@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners May 29, 2026 16:15
@kheiss-uwzoo kheiss-uwzoo requested a review from charlesbluca May 29, 2026 16:15
@kheiss-uwzoo kheiss-uwzoo changed the title docs(helm): OpenShift follow-up for NVBugs 6218358 OpenShift follow-up for NVBugs 6218358 May 29, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 29, 2026

Greptile Summary

This documentation-only PR expands the Helm README's OpenShift section with four new subsections (prebuilt ffmpeg service image, internal registry pull secrets, optional NIM LD_LIBRARY_PATH overrides, and Omni caption smoke-test request shape) and adds lightweight cross-links from deployment-options.md and prerequisites-support-matrix.md. All new anchors exist in the updated README and all cross-file links resolve correctly.

  • New OpenShift subsections (#audio-and-video-ffmpeg-on-restricted-openshift, #internal-registry-pull-secrets, #optional-nim-runtime-environment, #omni-caption-manual-smoke-testing) and a second install example for the full NIM Operator pipeline are added to nemo_retriever/helm/README.md.
  • Cross-links in deployment-options.md and prerequisites-support-matrix.md each add one sentence pointing at the new Helm README anchors; both anchors were verified to exist.
  • One documentation hazard: the LD_LIBRARY_PATH YAML snippet uses a prose placeholder comment for the chart's existing env defaults rather than listing them explicitly. Because Helm replaces list values entirely, a user who copy-pastes the block verbatim will silently drop NIM_TAGS_SELECTOR for Parakeet, potentially preventing that NIM from finding its model.

Confidence Score: 4/5

Safe to merge after fixing the LD_LIBRARY_PATH snippet; all other changes are cross-link additions and prose-only guidance.

The LD_LIBRARY_PATH env override YAML snippet uses a prose comment as a placeholder for the chart's real env defaults. Because Helm replaces list values on merge, a user who copy-pastes the block verbatim will lose NIM_TAGS_SELECTOR for Parakeet ASR — the exact env entry the NIM Operator needs to select the correct model variant. All other changes in the PR (anchor additions, cross-link sentences, openshift-restricted.yaml profile, install examples) appear correct and safe.

nemo_retriever/helm/README.md — specifically the new Optional NIM runtime environment subsection around the LD_LIBRARY_PATH YAML snippet.

Important Files Changed

Filename Overview
nemo_retriever/helm/README.md Substantial OpenShift docs expansion (ffmpeg prebuilt image, internal registry pull secrets, optional NIM env overrides, Omni smoke test). The LD_LIBRARY_PATH snippet uses a placeholder comment for chart-default env entries that Helm would silently drop on copy-paste, losing the critical NIM_TAGS_SELECTOR for Parakeet.
docs/docs/extraction/deployment-options.md Adds one sentence cross-linking OpenShift restricted-v2 to the Helm README; anchor resolves correctly.
docs/docs/extraction/prerequisites-support-matrix.md Adds cross-links for OpenShift ffmpeg guidance and Omni smoke-test section; both anchors exist in the updated README.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User: OpenShift restricted-v2] --> B{Audio/Video needed?}
    B -- No --> C[Use default service image\nleave installFfmpeg=false]
    B -- Yes --> D[Build custom image\nFROM nrl-service + apt ffmpeg\nleave installFfmpeg=false]
    D --> E{Image location?}
    E -- NGC / private registry --> F[Set ngcImagePullSecret.name\nAdd to imagePullSecrets list]
    E -- OpenShift internal registry --> G[Omit ngcImagePullSecret.name\nAdd SA dockercfg secret to\nimagePullSecrets]
    C --> H[Apply openshift-restricted.yaml\nrunAsNonRoot + drop ALL caps\nno runAsUser/fsGroup]
    F --> H
    G --> H
    H --> I{Optional NIM crashes?}
    I -- Missing .so in logs --> J[Add LD_LIBRARY_PATH to\nnimOperator.audio.env\nor nimOperator.omni.env\nkeep existing chart env entries]
    I -- No --> K[Deploy with helm install\n-f openshift-restricted.yaml]
    J --> K
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
nemo_retriever/helm/README.md:826-840
**`LD_LIBRARY_PATH` snippet silently drops required chart env defaults**

Helm replaces list values entirely rather than merging them; any `env` block specified in a user values file overwrites — not appends to — the chart's list. `nimOperator.audio.env` defaults contain `NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl`, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the `# … chart defaults …` as a comment, not live YAML) will produce an `env` list containing only `LD_LIBRARY_PATH`, silently dropping `NIM_TAGS_SELECTOR`. The Parakeet NIM can then fail to find its model at startup. The `nemotron_3_nano_omni_30b_a3b_reasoning` block similarly loses `NIM_HTTP_API_PORT`.

Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete `env` list:

### Issue 2 of 2
nemo_retriever/helm/README.md:815-827
The comment `# … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) …` is a prose placeholder, not YAML. A user who copies this block verbatim will produce an `env` list that contains only `LD_LIBRARY_PATH`. For Parakeet this silently drops `NIM_TAGS_SELECTOR`, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.

```suggestion
```yaml
nimOperator:
  audio:
    env:
      # Retain chart defaults, then append LD_LIBRARY_PATH.
      - name: NIM_TAGS_SELECTOR
        value: "name=parakeet-1-1b-ctc-en-us,mode=ofl"
      - name: NIM_TRITON_LOG_VERBOSE
        value: "1"
      - name: LD_LIBRARY_PATH
        value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
  nemotron_3_nano_omni_30b_a3b_reasoning:
    env:
      # Retain chart defaults, then append LD_LIBRARY_PATH.
      - name: NIM_HTTP_API_PORT
        value: "8000"
      - name: NIM_TRITON_LOG_VERBOSE
        value: "1"
      - name: LD_LIBRARY_PATH
        value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
```
```

Reviews (3): Last reviewed commit: "docs(helm): fix audit findings for OpenS..." | Re-trigger Greptile

Comment on lines +772 to +777
ngcImagePullSecret:
create: false
# Do not set name — leaves ngc-secret out of the rendered Pod spec.

imagePullSecrets:
- name: default-dockercfg-xxxxx # replace with your SA secret (see below)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 ngcImagePullSecret name must be explicitly cleared to avoid injecting ngc-secret

The comment "Do not set name — leaves ngc-secret out of the rendered Pod spec" is incorrect. values.yaml sets ngcImagePullSecret.name: "ngc-secret" as its default, and _helpers.tpl injects the secret when or .Values.ngcImagePullSecret.create .Values.ngcImagePullSecret.name is truthy. Because Helm deep-merges overrides with defaults, omitting name: in the user's values file leaves name as "ngc-secret" (non-empty → truthy), so ngc-secret is still injected into every Pod's imagePullSecrets. A user following this snippet will still hit ImagePullBackOff for the service pod. The fix is to explicitly set name: "" to clear the default.

Suggested change
ngcImagePullSecret:
create: false
# Do not set name — leaves ngc-secret out of the rendered Pod spec.
imagePullSecrets:
- name: default-dockercfg-xxxxx # replace with your SA secret (see below)
ngcImagePullSecret:
create: false
name: "" # Explicitly empty — clears the default "ngc-secret" so the helper skips injection.
imagePullSecrets:
- name: default-dockercfg-xxxxx # replace with your SA secret (see below)
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 772-777

Comment:
**`ngcImagePullSecret` name must be explicitly cleared to avoid injecting `ngc-secret`**

The comment "Do not set name — leaves ngc-secret out of the rendered Pod spec" is incorrect. `values.yaml` sets `ngcImagePullSecret.name: "ngc-secret"` as its default, and `_helpers.tpl` injects the secret when `or .Values.ngcImagePullSecret.create .Values.ngcImagePullSecret.name` is truthy. Because Helm deep-merges overrides with defaults, omitting `name:` in the user's values file leaves `name` as `"ngc-secret"` (non-empty → truthy), so `ngc-secret` is still injected into every Pod's `imagePullSecrets`. A user following this snippet will still hit `ImagePullBackOff` for the service pod. The fix is to explicitly set `name: ""` to clear the default.

```suggestion
ngcImagePullSecret:
  create: false
  name: ""   # Explicitly empty — clears the default "ngc-secret" so the helper skips injection.

imagePullSecrets:
  - name: default-dockercfg-xxxxx   # replace with your SA secret (see below)
```

How can I resolve this? If you propose a fix, please make it concise.

Move Omni smoke-test note out of the chart admonition; use neutral link label to helm README anchor.
@kheiss-uwzoo kheiss-uwzoo changed the title OpenShift follow-up for NVBugs 6218358 docs(helm): OpenShift follow-up for NVBugs 6218358 May 29, 2026
Add explicit #1-service-image anchor and Parakeet step-4 OpenShift caveat for installFfmpeg.
@kheiss-uwzoo kheiss-uwzoo changed the title docs(helm): OpenShift follow-up for NVBugs 6218358 OpenShift follow-up for NVBug 6218358 May 29, 2026
Comment on lines +826 to +840
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
```

Paths vary with GPU Operator version and node image. Inspect a healthy GPU
workload on the same cluster (`oc exec` into a CUDA sample pod) or the failing
NIM pod's filesystem before pinning production values.

### Omni caption manual smoke testing { #omni-caption-manual-smoke-testing }

The retriever service caption profile already sends
`chat_template_kwargs.enable_thinking=false` to the Omni NIM during ingest. When
you call the Omni NIM **directly** at `/v1/chat/completions` (for example with
`curl` against the in-cluster `NIMService`), include the same flag so the
caption text lands in `message.content` instead of reasoning-only fields:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 LD_LIBRARY_PATH snippet silently drops required chart env defaults

Helm replaces list values entirely rather than merging them; any env block specified in a user values file overwrites — not appends to — the chart's list. nimOperator.audio.env defaults contain NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the # … chart defaults … as a comment, not live YAML) will produce an env list containing only LD_LIBRARY_PATH, silently dropping NIM_TAGS_SELECTOR. The Parakeet NIM can then fail to find its model at startup. The nemotron_3_nano_omni_30b_a3b_reasoning block similarly loses NIM_HTTP_API_PORT.

Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete env list:

Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 826-840

Comment:
**`LD_LIBRARY_PATH` snippet silently drops required chart env defaults**

Helm replaces list values entirely rather than merging them; any `env` block specified in a user values file overwrites — not appends to — the chart's list. `nimOperator.audio.env` defaults contain `NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl`, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the `# … chart defaults …` as a comment, not live YAML) will produce an `env` list containing only `LD_LIBRARY_PATH`, silently dropping `NIM_TAGS_SELECTOR`. The Parakeet NIM can then fail to find its model at startup. The `nemotron_3_nano_omni_30b_a3b_reasoning` block similarly loses `NIM_HTTP_API_PORT`.

Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete `env` list:

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +815 to +827
```yaml
nimOperator:
audio:
env:
# ... chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) ...
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
nemotron_3_nano_omni_30b_a3b_reasoning:
env:
# ... chart defaults (NIM_HTTP_API_PORT, etc.) ...
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The comment # … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) … is a prose placeholder, not YAML. A user who copies this block verbatim will produce an env list that contains only LD_LIBRARY_PATH. For Parakeet this silently drops NIM_TAGS_SELECTOR, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.

Suggested change
```yaml
nimOperator:
audio:
env:
# ... chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) ...
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
nemotron_3_nano_omni_30b_a3b_reasoning:
env:
# ... chart defaults (NIM_HTTP_API_PORT, etc.) ...
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
```
```yaml
nimOperator:
audio:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_TAGS_SELECTOR
value: "name=parakeet-1-1b-ctc-en-us,mode=ofl"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
nemotron_3_nano_omni_30b_a3b_reasoning:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_HTTP_API_PORT
value: "8000"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"

<details><summary>Prompt To Fix With AI</summary>

`````markdown
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 815-827

Comment:
The comment `# … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) …` is a prose placeholder, not YAML. A user who copies this block verbatim will produce an `env` list that contains only `LD_LIBRARY_PATH`. For Parakeet this silently drops `NIM_TAGS_SELECTOR`, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.

```suggestion
```yaml
nimOperator:
  audio:
    env:
      # Retain chart defaults, then append LD_LIBRARY_PATH.
      - name: NIM_TAGS_SELECTOR
        value: "name=parakeet-1-1b-ctc-en-us,mode=ofl"
      - name: NIM_TRITON_LOG_VERBOSE
        value: "1"
      - name: LD_LIBRARY_PATH
        value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
  nemotron_3_nano_omni_30b_a3b_reasoning:
    env:
      # Retain chart defaults, then append LD_LIBRARY_PATH.
      - name: NIM_HTTP_API_PORT
        value: "8000"
      - name: NIM_TRITON_LOG_VERBOSE
        value: "1"
      - name: LD_LIBRARY_PATH
        value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"

How can I resolve this? If you propose a fix, please make it concise.

@kheiss-uwzoo kheiss-uwzoo added the doc Improvements or additions to documentation label May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant