OpenShift follow-up for NVBug 6218358#2166
Conversation
Document RC9 QA findings: prebuilt ffmpeg image on restricted-v2, openshift-restricted values profile, internal registry pull secrets, optional NIM LD_LIBRARY_PATH overrides, and Omni caption smoke-test request shape. Cross-link from deployment-options and prerequisites-support-matrix.
Greptile SummaryThis documentation-only PR expands the Helm README's OpenShift section with four new subsections (prebuilt
|
| Filename | Overview |
|---|---|
| nemo_retriever/helm/README.md | Substantial OpenShift docs expansion (ffmpeg prebuilt image, internal registry pull secrets, optional NIM env overrides, Omni smoke test). The LD_LIBRARY_PATH snippet uses a placeholder comment for chart-default env entries that Helm would silently drop on copy-paste, losing the critical NIM_TAGS_SELECTOR for Parakeet. |
| docs/docs/extraction/deployment-options.md | Adds one sentence cross-linking OpenShift restricted-v2 to the Helm README; anchor resolves correctly. |
| docs/docs/extraction/prerequisites-support-matrix.md | Adds cross-links for OpenShift ffmpeg guidance and Omni smoke-test section; both anchors exist in the updated README. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[User: OpenShift restricted-v2] --> B{Audio/Video needed?}
B -- No --> C[Use default service image\nleave installFfmpeg=false]
B -- Yes --> D[Build custom image\nFROM nrl-service + apt ffmpeg\nleave installFfmpeg=false]
D --> E{Image location?}
E -- NGC / private registry --> F[Set ngcImagePullSecret.name\nAdd to imagePullSecrets list]
E -- OpenShift internal registry --> G[Omit ngcImagePullSecret.name\nAdd SA dockercfg secret to\nimagePullSecrets]
C --> H[Apply openshift-restricted.yaml\nrunAsNonRoot + drop ALL caps\nno runAsUser/fsGroup]
F --> H
G --> H
H --> I{Optional NIM crashes?}
I -- Missing .so in logs --> J[Add LD_LIBRARY_PATH to\nnimOperator.audio.env\nor nimOperator.omni.env\nkeep existing chart env entries]
I -- No --> K[Deploy with helm install\n-f openshift-restricted.yaml]
J --> K
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
nemo_retriever/helm/README.md:826-840
**`LD_LIBRARY_PATH` snippet silently drops required chart env defaults**
Helm replaces list values entirely rather than merging them; any `env` block specified in a user values file overwrites — not appends to — the chart's list. `nimOperator.audio.env` defaults contain `NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl`, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the `# … chart defaults …` as a comment, not live YAML) will produce an `env` list containing only `LD_LIBRARY_PATH`, silently dropping `NIM_TAGS_SELECTOR`. The Parakeet NIM can then fail to find its model at startup. The `nemotron_3_nano_omni_30b_a3b_reasoning` block similarly loses `NIM_HTTP_API_PORT`.
Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete `env` list:
### Issue 2 of 2
nemo_retriever/helm/README.md:815-827
The comment `# … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) …` is a prose placeholder, not YAML. A user who copies this block verbatim will produce an `env` list that contains only `LD_LIBRARY_PATH`. For Parakeet this silently drops `NIM_TAGS_SELECTOR`, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.
```suggestion
```yaml
nimOperator:
audio:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_TAGS_SELECTOR
value: "name=parakeet-1-1b-ctc-en-us,mode=ofl"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
nemotron_3_nano_omni_30b_a3b_reasoning:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_HTTP_API_PORT
value: "8000"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
```
```
Reviews (3): Last reviewed commit: "docs(helm): fix audit findings for OpenS..." | Re-trigger Greptile
| ngcImagePullSecret: | ||
| create: false | ||
| # Do not set name — leaves ngc-secret out of the rendered Pod spec. | ||
|
|
||
| imagePullSecrets: | ||
| - name: default-dockercfg-xxxxx # replace with your SA secret (see below) |
There was a problem hiding this comment.
ngcImagePullSecret name must be explicitly cleared to avoid injecting ngc-secret
The comment "Do not set name — leaves ngc-secret out of the rendered Pod spec" is incorrect. values.yaml sets ngcImagePullSecret.name: "ngc-secret" as its default, and _helpers.tpl injects the secret when or .Values.ngcImagePullSecret.create .Values.ngcImagePullSecret.name is truthy. Because Helm deep-merges overrides with defaults, omitting name: in the user's values file leaves name as "ngc-secret" (non-empty → truthy), so ngc-secret is still injected into every Pod's imagePullSecrets. A user following this snippet will still hit ImagePullBackOff for the service pod. The fix is to explicitly set name: "" to clear the default.
| ngcImagePullSecret: | |
| create: false | |
| # Do not set name — leaves ngc-secret out of the rendered Pod spec. | |
| imagePullSecrets: | |
| - name: default-dockercfg-xxxxx # replace with your SA secret (see below) | |
| ngcImagePullSecret: | |
| create: false | |
| name: "" # Explicitly empty — clears the default "ngc-secret" so the helper skips injection. | |
| imagePullSecrets: | |
| - name: default-dockercfg-xxxxx # replace with your SA secret (see below) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 772-777
Comment:
**`ngcImagePullSecret` name must be explicitly cleared to avoid injecting `ngc-secret`**
The comment "Do not set name — leaves ngc-secret out of the rendered Pod spec" is incorrect. `values.yaml` sets `ngcImagePullSecret.name: "ngc-secret"` as its default, and `_helpers.tpl` injects the secret when `or .Values.ngcImagePullSecret.create .Values.ngcImagePullSecret.name` is truthy. Because Helm deep-merges overrides with defaults, omitting `name:` in the user's values file leaves `name` as `"ngc-secret"` (non-empty → truthy), so `ngc-secret` is still injected into every Pod's `imagePullSecrets`. A user following this snippet will still hit `ImagePullBackOff` for the service pod. The fix is to explicitly set `name: ""` to clear the default.
```suggestion
ngcImagePullSecret:
create: false
name: "" # Explicitly empty — clears the default "ngc-secret" so the helper skips injection.
imagePullSecrets:
- name: default-dockercfg-xxxxx # replace with your SA secret (see below)
```
How can I resolve this? If you propose a fix, please make it concise.Move Omni smoke-test note out of the chart admonition; use neutral link label to helm README anchor.
Add explicit #1-service-image anchor and Parakeet step-4 OpenShift caveat for installFfmpeg.
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | ||
| ``` | ||
|
|
||
| Paths vary with GPU Operator version and node image. Inspect a healthy GPU | ||
| workload on the same cluster (`oc exec` into a CUDA sample pod) or the failing | ||
| NIM pod's filesystem before pinning production values. | ||
|
|
||
| ### Omni caption manual smoke testing { #omni-caption-manual-smoke-testing } | ||
|
|
||
| The retriever service caption profile already sends | ||
| `chat_template_kwargs.enable_thinking=false` to the Omni NIM during ingest. When | ||
| you call the Omni NIM **directly** at `/v1/chat/completions` (for example with | ||
| `curl` against the in-cluster `NIMService`), include the same flag so the | ||
| caption text lands in `message.content` instead of reasoning-only fields: | ||
|
|
There was a problem hiding this comment.
LD_LIBRARY_PATH snippet silently drops required chart env defaults
Helm replaces list values entirely rather than merging them; any env block specified in a user values file overwrites — not appends to — the chart's list. nimOperator.audio.env defaults contain NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the # … chart defaults … as a comment, not live YAML) will produce an env list containing only LD_LIBRARY_PATH, silently dropping NIM_TAGS_SELECTOR. The Parakeet NIM can then fail to find its model at startup. The nemotron_3_nano_omni_30b_a3b_reasoning block similarly loses NIM_HTTP_API_PORT.
Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete env list:
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 826-840
Comment:
**`LD_LIBRARY_PATH` snippet silently drops required chart env defaults**
Helm replaces list values entirely rather than merging them; any `env` block specified in a user values file overwrites — not appends to — the chart's list. `nimOperator.audio.env` defaults contain `NIM_TAGS_SELECTOR=name=parakeet-1-1b-ctc-en-us,mode=ofl`, which the NIM Operator uses to locate the correct model variant in the cache. A user who copies this YAML block verbatim (with the `# … chart defaults …` as a comment, not live YAML) will produce an `env` list containing only `LD_LIBRARY_PATH`, silently dropping `NIM_TAGS_SELECTOR`. The Parakeet NIM can then fail to find its model at startup. The `nemotron_3_nano_omni_30b_a3b_reasoning` block similarly loses `NIM_HTTP_API_PORT`.
Expand the snippet to list the real defaults explicitly so a copy-paste produces a valid, complete `env` list:
How can I resolve this? If you propose a fix, please make it concise.| ```yaml | ||
| nimOperator: | ||
| audio: | ||
| env: | ||
| # ... chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) ... | ||
| - name: LD_LIBRARY_PATH | ||
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | ||
| nemotron_3_nano_omni_30b_a3b_reasoning: | ||
| env: | ||
| # ... chart defaults (NIM_HTTP_API_PORT, etc.) ... | ||
| - name: LD_LIBRARY_PATH | ||
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | ||
| ``` |
There was a problem hiding this comment.
The comment
# … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) … is a prose placeholder, not YAML. A user who copies this block verbatim will produce an env list that contains only LD_LIBRARY_PATH. For Parakeet this silently drops NIM_TAGS_SELECTOR, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.
| ```yaml | |
| nimOperator: | |
| audio: | |
| env: | |
| # ... chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) ... | |
| - name: LD_LIBRARY_PATH | |
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | |
| nemotron_3_nano_omni_30b_a3b_reasoning: | |
| env: | |
| # ... chart defaults (NIM_HTTP_API_PORT, etc.) ... | |
| - name: LD_LIBRARY_PATH | |
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | |
| ``` | |
| ```yaml | |
| nimOperator: | |
| audio: | |
| env: | |
| # Retain chart defaults, then append LD_LIBRARY_PATH. | |
| - name: NIM_TAGS_SELECTOR | |
| value: "name=parakeet-1-1b-ctc-en-us,mode=ofl" | |
| - name: NIM_TRITON_LOG_VERBOSE | |
| value: "1" | |
| - name: LD_LIBRARY_PATH | |
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" | |
| nemotron_3_nano_omni_30b_a3b_reasoning: | |
| env: | |
| # Retain chart defaults, then append LD_LIBRARY_PATH. | |
| - name: NIM_HTTP_API_PORT | |
| value: "8000" | |
| - name: NIM_TRITON_LOG_VERBOSE | |
| value: "1" | |
| - name: LD_LIBRARY_PATH | |
| value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64" |
<details><summary>Prompt To Fix With AI</summary>
`````markdown
This is a comment left during a code review.
Path: nemo_retriever/helm/README.md
Line: 815-827
Comment:
The comment `# … chart defaults (NIM_TAGS_SELECTOR, NIM_TRITON_LOG_VERBOSE, etc.) …` is a prose placeholder, not YAML. A user who copies this block verbatim will produce an `env` list that contains only `LD_LIBRARY_PATH`. For Parakeet this silently drops `NIM_TAGS_SELECTOR`, which the NIM Operator needs to locate the correct model variant; the NIM can then fail to start. Expand the snippet to include the actual chart defaults so a copy-paste produces a complete, correct list.
```suggestion
```yaml
nimOperator:
audio:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_TAGS_SELECTOR
value: "name=parakeet-1-1b-ctc-en-us,mode=ofl"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
nemotron_3_nano_omni_30b_a3b_reasoning:
env:
# Retain chart defaults, then append LD_LIBRARY_PATH.
- name: NIM_HTTP_API_PORT
value: "8000"
- name: NIM_TRITON_LOG_VERBOSE
value: "1"
- name: LD_LIBRARY_PATH
value: "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
How can I resolve this? If you propose a fix, please make it concise.
Summary
openshift-restricted.yamlprofile, internal registry pull secrets, optional NIMLD_LIBRARY_PATHoverrides, and Omni caption manual smoke-test request shape.deployment-options.mdandprerequisites-support-matrix.md(one sentence + helm anchor each; no deploy prose leakage).NVBugs
6218358 — OpenShift Helm chart SCC/PodSecurity documentation follow-up (comment #6).
PR scope check
mainnemo_retriever/helm/README.md,docs/docs/extraction/deployment-options.md,docs/docs/extraction/prerequisites-support-matrix.md.cursor/, link-audit artifacts, chart/code changes)