26.05 into main folding pr#2175
Conversation
Bump docs, Helm chart metadata, and install snippets from 26.03/26.3.0 to the 26.05 line and RC1 tag so published artifacts align with user-facing version references.
upload-artifact flattens nemo_retriever/dist into ./dist on download; use find instead of ./dist/nemo_retriever/dist/*. Re-run failed jobs does not pick up workflow changes — dispatch a new run after merging.
Stage wheels under nemo_retriever/dist in the CI artifact so legacy and new publish paths both work; resolve wheels from multiple download layouts. Handle missing NGC chart (create) and duplicate version (skip) in helm script.
Log pwd, run metadata, directory trees, and glob probe results so CI failures show whether wheels are missing, mis-staged, or the workflow YAML is frozen from an older re-run.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…195023, 6195296 (NVIDIA#2074) Co-authored-by: Randy Gelhausen <rgelhau@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
(cherry picked from commit 13ee35b)
(cherry picked from commit 050331f)
(cherry picked from commit 73f3d5f)
(cherry picked from commit 2524cfb)
(cherry picked from commit 9330be8)
…(NVBugs 6199005, 6198526) (NVIDIA#2088)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Julio Perez <jperez@nvidia.com> Co-authored-by: Charles Blackmon-Luca <20627856+charlesbluca@users.noreply.github.com> (cherry picked from commit 7130a72)
(cherry picked from commit 744f0bc)
…se (26.05, NVBug 6204537) (NVIDIA#2103)
…_config note (NVBugs 6218013) (NVIDIA#2126)
Co-authored-by: Julio Perez <37191411+jperez999@users.noreply.github.com>
Bring 26.5.0 GA release prep, inprocess default pipeline, service retain_results, and pinned dependencies into main while preserving main's OpenShift docs, GPU CLI tuning, and QA-aligned extraction docs.
Greptile SummaryThis is the 26.5.0 GA release fold from the
|
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/service_ingestor.py | Refactors sync/async streaming to thread retain_results through the entire call chain; introduces _ingest_stream_with_retain as the sync implementation helper. Logic is sound and backward-compatible (default False). |
| nemo_retriever/src/nemo_retriever/service/services/job_tracker.py | Adds retain_results flag to JobAggregate and register_job; mark_completed now conditionally discards result_data based on the flag. Defense-in-depth approach is correct; should_retain_results helper is clean. |
| nemo_retriever/src/nemo_retriever/service/services/pipeline_pool.py | Wires retain_results from WorkItem to control store_result_data and mark_completed calls; contains one redundant get_job_tracker() call that should reuse the already-held tracker variable. |
| nemo_retriever/src/nemo_retriever/service/routers/ingest.py | Adds _GATEWAY_RETAIN_RESULTS_HEADER and helper functions to propagate the per-job retain flag from gateway to workers; three submission endpoints updated consistently. |
| nemo_retriever/pyproject.toml | Moves nemotron model packages from test-pypi to PyPI, bumps langchain-nvidia-ai-endpoints to >=1.4.0, tightens litellm to <2, and advances langgraph minimum. Also introduces exact pins for urllib3 and nltk that are overly restrictive for non-critical dependencies. |
| nemo_retriever/src/nemo_retriever/graph_ingestor.py | Flips default run_mode from "batch" to "inprocess" — an intentional GA release change, but a behavioral breaking change for any existing caller that omitted run_mode. |
| nemo_retriever/src/nemo_retriever/service/client.py | Threads retain_results from aingest_documents_stream through _create_job; new parameter is keyword-only with a False default — fully backward-compatible. |
| nemo_retriever/src/nemo_retriever/service/models/requests.py | Adds retain_results: bool = Field(default=False) to JobCreateRequest; additive change with sensible default and clear description. |
| nemo_retriever/helm/values.yaml | Updates service image to the GA NGC registry path with tag 26.5.0 and IfNotPresent pull policy — correct for a GA release. |
| nemo_retriever/helm/Chart.yaml | Version bumped from 26.05-RC1 to 26.5.0 (SemVer-compliant), matching the GA release. |
| .github/workflows/release-helm.yml | Minor: updates workflow input description example from 26.05-RC1 to 26.5.0 to match the new version format. |
Sequence Diagram
sequenceDiagram
participant Client
participant RetrieverServiceClient
participant IngestRouter
participant JobTracker
participant PipelinePool
participant WorkerResultStore
Client->>RetrieverServiceClient: "aingest_documents_stream(files, retain_results=True)"
RetrieverServiceClient->>IngestRouter: "POST /v1/ingest/job {expected_documents, retain_results: true}"
IngestRouter->>JobTracker: "register_job(retain_results=True)"
JobTracker-->>IngestRouter: job_id
IngestRouter-->>RetrieverServiceClient: "201 {job_id}"
RetrieverServiceClient->>IngestRouter: "POST /v1/ingest/job/{job_id}/document"
IngestRouter->>IngestRouter: _gateway_retain_results_headers(job_id)
IngestRouter->>PipelinePool: "WorkItem(retain_results=True)"
PipelinePool->>PipelinePool: run pipeline
alt "retain_results=True"
PipelinePool->>WorkerResultStore: store_result_data(item.id, result_data)
end
PipelinePool->>JobTracker: mark_completed(result_data if retain_results else None)
JobTracker->>JobTracker: "rec.result_data = result_data"
Client->>IngestRouter: "GET /v1/ingest/status/{doc_id}"
IngestRouter->>JobTracker: get_document(doc_id)
JobTracker-->>IngestRouter: "DocumentRecord(result_data=[...])"
IngestRouter-->>Client: result_data present
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 4
nemo_retriever/src/nemo_retriever/service/services/pipeline_pool.py:271-275
Redundant `get_job_tracker()` call — the pool worker already retrieved `tracker` at the start of the try block (line ~257). `tracker_lookup` is always the same singleton object, so this second lookup is unnecessary. Using the existing `tracker` variable is cleaner and avoids a second function call inside the hot worker loop.
```suggestion
retain_results = item.retain_results
if not retain_results and item.job_id:
if tracker is not None:
retain_results = tracker.should_retain_results(item.job_id)
```
### Issue 2 of 4
nemo_retriever/pyproject.toml:55-58
Exact version pins for `urllib3` and `nltk` are overly restrictive for non-critical dependencies. The project's `dependency-management-uv` rule specifies that only critical packages (PyTorch, CUDA-related) should be pinned exactly; others should use `>=x.y.z` range specifiers. Exact pins can create unsolvable resolution conflicts when these packages are installed alongside other libraries that also depend on them.
```suggestion
"urllib3>=2.7.0",
# Utilities
"pydantic>=2.8.0",
"rich>=13.7.0",
```
### Issue 3 of 4
nemo_retriever/pyproject.toml:68
Same concern as `urllib3`: `nltk` is not a critical (CUDA/PyTorch) dependency, so an exact pin may create resolution conflicts in environments where another package requires a different `nltk` patch version.
```suggestion
"nltk>=3.9.4",
```
### Issue 4 of 4
nemo_retriever/src/nemo_retriever/graph_ingestor.py:415-418
**Default `run_mode` changed from `"batch"` to `"inprocess"` without a deprecation cycle.** Any caller that relied on `GraphIngestor()` (or `ingest_documents()` / `resolve_ingest_plan()`) defaulting to Ray Data scale-out will now silently execute single-process pandas instead. This includes the `retriever pipeline run` CLI, the harness, and every programmatic caller that omitted `run_mode`. The `api-backward-compatibility` rule requires a deprecation warning or migration note for changes that alter observable default behavior. A comment in the docstring noting the version where the default changed would satisfy the intent of that rule.
Reviews (1): Last reviewed commit: "Merge branch '26.05' into main" | Re-trigger Greptile
| retain_results = item.retain_results | ||
| if not retain_results and item.job_id: | ||
| tracker_lookup = get_job_tracker() | ||
| if tracker_lookup is not None: | ||
| retain_results = tracker_lookup.should_retain_results(item.job_id) |
There was a problem hiding this comment.
Redundant
get_job_tracker() call — the pool worker already retrieved tracker at the start of the try block (line ~257). tracker_lookup is always the same singleton object, so this second lookup is unnecessary. Using the existing tracker variable is cleaner and avoids a second function call inside the hot worker loop.
| retain_results = item.retain_results | |
| if not retain_results and item.job_id: | |
| tracker_lookup = get_job_tracker() | |
| if tracker_lookup is not None: | |
| retain_results = tracker_lookup.should_retain_results(item.job_id) | |
| retain_results = item.retain_results | |
| if not retain_results and item.job_id: | |
| if tracker is not None: | |
| retain_results = tracker.should_retain_results(item.job_id) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/service/services/pipeline_pool.py
Line: 271-275
Comment:
Redundant `get_job_tracker()` call — the pool worker already retrieved `tracker` at the start of the try block (line ~257). `tracker_lookup` is always the same singleton object, so this second lookup is unnecessary. Using the existing `tracker` variable is cleaner and avoids a second function call inside the hot worker loop.
```suggestion
retain_results = item.retain_results
if not retain_results and item.job_id:
if tracker is not None:
retain_results = tracker.should_retain_results(item.job_id)
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| "urllib3==2.7.0", | ||
| # Utilities | ||
| "pydantic>=2.8.0", | ||
| "rich>=13.7.0", |
There was a problem hiding this comment.
Exact version pins for
urllib3 and nltk are overly restrictive for non-critical dependencies. The project's dependency-management-uv rule specifies that only critical packages (PyTorch, CUDA-related) should be pinned exactly; others should use >=x.y.z range specifiers. Exact pins can create unsolvable resolution conflicts when these packages are installed alongside other libraries that also depend on them.
| "urllib3==2.7.0", | |
| # Utilities | |
| "pydantic>=2.8.0", | |
| "rich>=13.7.0", | |
| "urllib3>=2.7.0", | |
| # Utilities | |
| "pydantic>=2.8.0", | |
| "rich>=13.7.0", |
Rule Used: Dependencies must be declared in the appropriate p... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/pyproject.toml
Line: 55-58
Comment:
Exact version pins for `urllib3` and `nltk` are overly restrictive for non-critical dependencies. The project's `dependency-management-uv` rule specifies that only critical packages (PyTorch, CUDA-related) should be pinned exactly; others should use `>=x.y.z` range specifiers. Exact pins can create unsolvable resolution conflicts when these packages are installed alongside other libraries that also depend on them.
```suggestion
"urllib3>=2.7.0",
# Utilities
"pydantic>=2.8.0",
"rich>=13.7.0",
```
**Rule Used:** Dependencies must be declared in the appropriate p... ([source](https://app.greptile.com/review/custom-context?memory=dependency-management-uv))
How can I resolve this? If you propose a fix, please make it concise.| "pypdfium2==4.30.0", | ||
| "pillow==12.2.0", | ||
| "nltk>=3.9.4", | ||
| "nltk==3.9.4", |
There was a problem hiding this comment.
Same concern as
urllib3: nltk is not a critical (CUDA/PyTorch) dependency, so an exact pin may create resolution conflicts in environments where another package requires a different nltk patch version.
| "nltk==3.9.4", | |
| "nltk>=3.9.4", |
Rule Used: Dependencies must be declared in the appropriate p... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/pyproject.toml
Line: 68
Comment:
Same concern as `urllib3`: `nltk` is not a critical (CUDA/PyTorch) dependency, so an exact pin may create resolution conflicts in environments where another package requires a different `nltk` patch version.
```suggestion
"nltk>=3.9.4",
```
**Rule Used:** Dependencies must be declared in the appropriate p... ([source](https://app.greptile.com/review/custom-context?memory=dependency-management-uv))
How can I resolve this? If you propose a fix, please make it concise.| def __init__( | ||
| self, | ||
| *, | ||
| run_mode: str = "batch", | ||
| run_mode: str = "inprocess", |
There was a problem hiding this comment.
Default
run_mode changed from "batch" to "inprocess" without a deprecation cycle. Any caller that relied on GraphIngestor() (or ingest_documents() / resolve_ingest_plan()) defaulting to Ray Data scale-out will now silently execute single-process pandas instead. This includes the retriever pipeline run CLI, the harness, and every programmatic caller that omitted run_mode. The api-backward-compatibility rule requires a deprecation warning or migration note for changes that alter observable default behavior. A comment in the docstring noting the version where the default changed would satisfy the intent of that rule.
Rule Used: Changes to public API surfaces (FastAPI endpoints,... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/graph_ingestor.py
Line: 415-418
Comment:
**Default `run_mode` changed from `"batch"` to `"inprocess"` without a deprecation cycle.** Any caller that relied on `GraphIngestor()` (or `ingest_documents()` / `resolve_ingest_plan()`) defaulting to Ray Data scale-out will now silently execute single-process pandas instead. This includes the `retriever pipeline run` CLI, the harness, and every programmatic caller that omitted `run_mode`. The `api-backward-compatibility` rule requires a deprecation warning or migration note for changes that alter observable default behavior. A comment in the docstring noting the version where the default changed would satisfy the intent of that rule.
**Rule Used:** Changes to public API surfaces (FastAPI endpoints,... ([source](https://app.greptile.com/review/custom-context?memory=api-backward-compatibility))
How can I resolve this? If you propose a fix, please make it concise.
Description
Checklist