feat: add dataSource PVC mount and per-proposal timeoutMinutes#17
feat: add dataSource PVC mount and per-proposal timeoutMinutes#17sakshipatels98-byte wants to merge 3 commits into
Conversation
|
Skipping CI for Draft Pull Request. |
📝 WalkthroughWalkthroughThis PR introduces per-step sandbox timeout support and optional PVC mounting for input data across the proposal workflow. It extends the API types with ChangesPer-Step Timeout and DataSource Implementation
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
2a89d44 to
f71f525
Compare
f71f525 to
31261be
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
controller/proposal/sandbox_agent.go (1)
177-210:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse one shared deadline for the whole step.
timeoutis spent once inWaitReadyand then a second time in the HTTP client, sotimeoutMinutes: 5can block for roughly 10 minutes before failing. If this field is meant to cap a step end-to-end, the later phases need to use the remaining budget, not the original budget again.⏱️ Possible fix
func (s *SandboxAgentCaller) callWithSandbox( ctx context.Context, proposal *agenticv1alpha1.Proposal, stepName string, @@ ) (json.RawMessage, error) { if timeout <= 0 { timeout = defaultSandboxTimeout } + deadline := time.Now().Add(timeout) templateName, err := EnsureAgentTemplate(ctx, s.K8sClient, defaultBaseTemplateName, s.Namespace, stepName, step.Agent, step.LLM, step.Tools, proposal.Spec.DataSource) if err != nil { @@ - endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout) + endpoint, err := s.Sandbox.WaitReady(ctx, claimName, time.Until(deadline)) if err != nil { return nil, fmt.Errorf("wait for sandbox: %w", err) } @@ - client := s.ClientFactory(agentURL, timeout) + remaining := time.Until(deadline) + if remaining <= 0 { + return nil, fmt.Errorf("step timeout exceeded before agent call") + } + client := s.ClientFactory(agentURL, remaining) resp, err := client.Run(ctx, "", query, schema, agentCtx)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/sandbox_agent.go` around lines 177 - 210, The step currently reuses the full original timeout for WaitReady and then again for the HTTP client, doubling the effective wait; fix by creating a single deadline-based context and deriving the remaining budget for the client: wrap the incoming ctx with context.WithTimeout/context.WithDeadline using the provided timeout (store the returned cancel and deadline), pass that ctx to s.Sandbox.WaitReady, then compute remaining := time.Until(deadline) (clamp to a minimum and fallback if non-positive) and pass remaining to s.ClientFactory(agentURL, remaining) (and ensure to call cancel()). Reference EnsureAgentTemplate, s.Sandbox.Claim, s.Sandbox.WaitReady, s.ClientFactory(...).Run and the timeout variable when applying this change.
🧹 Nitpick comments (4)
controller/proposal/reconciler_test.go (1)
39-64: ⚡ Quick winCapture the timeout passed by the reconciler.
testAgentCallerdrops the new argument everywhere, so nothing in this file proves thatproposalTimeout(proposal)is actually passed intoAnalyze/Execute/Verify/Escalate. A small field likelastAnalyzeTimeoutplus one case withSpec.TimeoutMinutes = 30would cover the new handler plumbing.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/reconciler_test.go` around lines 39 - 64, The testAgentCaller currently ignores the timeout parameter so tests don't verify that proposalTimeout(proposal) is propagated; add fields on testAgentCaller (e.g., lastAnalyzeTimeout, lastExecuteTimeout, lastVerifyTimeout, lastEscalateTimeout) and assign the incoming timeout argument in the respective methods Analyze, Execute, Verify, Escalate; then add/update a test that sets Spec.TimeoutMinutes = 30 on a Proposal, calls the reconciler path, and asserts each last*Timeout equals the expected duration returned by proposalTimeout(proposal) to prove the plumbing forwards the timeout.controller/proposal/sandbox_agent_test.go (1)
57-82: ⚡ Quick winRecord and assert the propagated timeout in these helpers.
The new
time.Durationparameter is ignored here, so the tests only prove the signatures compile. Capturing the value passed intoClientFactory(and similarly intoWaitReady) would let this suite verify that non-default proposal timeouts actually reach the sandbox layer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/sandbox_agent_test.go` around lines 57 - 82, Update newTestSandboxAgentCaller and newTestSandboxAgentCallerWithProposal so the ClientFactory closure captures and records the time.Duration it receives and the mockSandboxProvider's WaitReady call records its timeout; specifically, modify the ClientFactory func(_ string, d time.Duration) to store d onto the provided mockHTTPClient (e.g., a LastTimeout field) before returning it, and ensure the mockSandboxProvider exposes a WaitReady(timeout time.Duration) recorder that tests can assert against; this will let tests verify that non-default proposal timeouts are propagated to both ClientFactory and WaitReady.controller/proposal/client_test.go (1)
38-172: ⚡ Quick winAdd assertions for the new timeout contract.
These updates only make the tests compile against the new constructor. There’s still no coverage that a custom timeout is serialized into
timeout_ms, or thattimeout <= 0falls back todefaultSandboxTimeout, so the new behavior can regress silently.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/client_test.go` around lines 38 - 172, Tests for NewAgentHTTPClient lack assertions that the serialized request contains the timeout_ms field and that timeout <= 0 falls back to defaultSandboxTimeout; update relevant tests that decode agentRunRequest (e.g., inside the handlers in TestAgentHTTPClient_RunWithExecutionResult, TestAgentHTTPClient_RunWithoutExecutionResult, TestAgentHTTPClient_RunWithContext and the basic Run test) to assert req.TimeoutMs equals the provided custom timeout when NewAgentHTTPClient(server.URL, X) is called with X>0, and assert req.TimeoutMs equals defaultSandboxTimeout when NewAgentHTTPClient(..., 0) or a negative timeout is used; reference the agentRunRequest struct and NewAgentHTTPClient constructor to locate where to add these checks.controller/proposal/sandbox_templates_test.go (1)
68-70: ⚡ Quick winAdd coverage for non-nil
dataSourcehere.All updated hash calls still pass
nilfordataSource, so this file never asserts that changingclaimNamechanges the derived template name. A focused test for that — plus one for the read-only/data/inputPVC mount — would cover the new behavior directly.Also applies to: 490-524
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/sandbox_templates_test.go` around lines 68 - 70, Update tests to exercise a non-nil dataSource when calling computeTemplateHash: modify mustHash (and the test cases around it) to pass a populated agenticv1alpha1.DataSource with a distinct ClaimName and verify that changing ClaimName alters the returned hash; additionally add an assertion that the computed template includes the expected PVC mount mode (readOnly) for /data/input by constructing a DataSource whose MountPath is /data/input and verifying the template metadata or generated mount flags reflect read-only. Target the helper mustHash and calls to computeTemplateHash so the new assertions cover the branch where dataSource is non-nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/v1alpha1/proposal_types.go`:
- Line 297: The XValidation rule for the dataSource field currently allows
nil->non-nil updates; replace the rule so only two cases are permitted: both old
and new lack dataSource, or both have it and it is equal. Update the kubebuilder
annotation on the dataSource validation (the comment containing XValidation) to
use an expression such as "(has(oldSelf.dataSource) && has(self.dataSource) &&
self.dataSource == oldSelf.dataSource) || (!has(oldSelf.dataSource) &&
!has(self.dataSource))" so adding a dataSource after creation is disallowed.
In `@controller/proposal/sandbox_templates.go`:
- Around line 523-551: The current patchDataSource unconditionally uses the name
"data-source" and addPVCVolume will overwrite any existing volume with that
name; modify addPVCVolume and patchDataSource so they do not clobber an
unrelated pre-existing volume: in addPVCVolume, when finding an existing volume
with the same name check whether it already has a persistentVolumeClaim with the
same claimName and only replace if it matches; otherwise return a sentinel (or
error) to signal a name conflict. In patchDataSource, detect that conflict and
instead pick a non-colliding name (e.g., "data-source-<uniqueSuffix>") and use
that name for both addPVCVolume and addVolumeMount, or surface the error to the
caller; reference the functions addPVCVolume and patchDataSource and ensure the
chosen name is threaded through to addVolumeMount so mounts reference the
correct volume.
In `@controller/proposal/templates/analysis_query.tmpl`:
- Around line 3-27: The template fails to tell the agent that PVC-backed input
may already be mounted at /data/input when spec.dataSource is set, so update the
contract and template: modify buildAnalysisQuery in
controller/proposal/helpers.go to accept a flag/field (e.g., hasDataSource or
includeMountedInput) derived from the Request/spec.dataSource (so callers can
indicate PVC-backed input is present), and change
controller/proposal/templates/analysis_query.tmpl to conditionally include a
short directive like "Input data is mounted read-only at /data/input; prefer
analyzing that bundle before collecting live data" when that flag is true;
ensure the helpers.go change sets the flag when spec.dataSource is present so
PVC-backed RCA is handled as bundle-first analysis.
- Around line 19-25: The skill-loading branch is inconsistent: "cat
/app/skills/intelliaide/SKILL.md" cannot "return one or more paths." Update the
logic around the atomic command (the literal string "cat
/app/skills/intelliaide/SKILL.md" and references to SKILL.md) so it is
internally consistent — either change the command to a listing/find command that
can return paths (e.g., use a glob/find/ls conceptual step) or change the prose
to state that if the cat command fails because the file does not exist you must
stop and return a JSON error, and otherwise read that SKILL.md and follow its
workflow exactly; ensure the branch no longer claims cat can return multiple
paths.
In `@examples/setup/09-intelliaide-proposals.yaml`:
- Around line 21-31: Add one concrete example of the PVC-backed input to the
IntelliAide proposals by updating a Proposal's spec to include a dataSource
block using the exact shape spec.dataSource.claimName and a placeholder PVC name
(e.g., <pre-existing-pvc>); locate any Proposal objects in this file that
currently run in live mode and add a dataSource: claimName: <pre-existing-pvc>
under their spec so the new /data/input workflow and PVC-backed input are
discoverable (ensure the YAML key is spec.dataSource.claimName and that only the
example uses a placeholder to avoid changing runtime behavior).
- Around line 33-36: The docs instruct building/pushing
quay.io/<org>/intelliaide-skills:latest but all Proposal manifests reference
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest,
so examples fail; either (A) update each Proposal image field to
quay.io/<org>/intelliaide-skills:latest (search for the string
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
in the Proposal resources and replace) or (B) add the missing retag/push steps
after the build: retag quay image to the in-cluster registry name and push it
(i.e., docker/podman tag quay.io/<org>/intelliaide-skills:latest
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
and push), and apply this fix for all occurrences noted (the other blocks at the
same pattern: lines ~70-72, 101-103, 147-149, 183-185).
---
Outside diff comments:
In `@controller/proposal/sandbox_agent.go`:
- Around line 177-210: The step currently reuses the full original timeout for
WaitReady and then again for the HTTP client, doubling the effective wait; fix
by creating a single deadline-based context and deriving the remaining budget
for the client: wrap the incoming ctx with
context.WithTimeout/context.WithDeadline using the provided timeout (store the
returned cancel and deadline), pass that ctx to s.Sandbox.WaitReady, then
compute remaining := time.Until(deadline) (clamp to a minimum and fallback if
non-positive) and pass remaining to s.ClientFactory(agentURL, remaining) (and
ensure to call cancel()). Reference EnsureAgentTemplate, s.Sandbox.Claim,
s.Sandbox.WaitReady, s.ClientFactory(...).Run and the timeout variable when
applying this change.
---
Nitpick comments:
In `@controller/proposal/client_test.go`:
- Around line 38-172: Tests for NewAgentHTTPClient lack assertions that the
serialized request contains the timeout_ms field and that timeout <= 0 falls
back to defaultSandboxTimeout; update relevant tests that decode agentRunRequest
(e.g., inside the handlers in TestAgentHTTPClient_RunWithExecutionResult,
TestAgentHTTPClient_RunWithoutExecutionResult,
TestAgentHTTPClient_RunWithContext and the basic Run test) to assert
req.TimeoutMs equals the provided custom timeout when
NewAgentHTTPClient(server.URL, X) is called with X>0, and assert req.TimeoutMs
equals defaultSandboxTimeout when NewAgentHTTPClient(..., 0) or a negative
timeout is used; reference the agentRunRequest struct and NewAgentHTTPClient
constructor to locate where to add these checks.
In `@controller/proposal/reconciler_test.go`:
- Around line 39-64: The testAgentCaller currently ignores the timeout parameter
so tests don't verify that proposalTimeout(proposal) is propagated; add fields
on testAgentCaller (e.g., lastAnalyzeTimeout, lastExecuteTimeout,
lastVerifyTimeout, lastEscalateTimeout) and assign the incoming timeout argument
in the respective methods Analyze, Execute, Verify, Escalate; then add/update a
test that sets Spec.TimeoutMinutes = 30 on a Proposal, calls the reconciler
path, and asserts each last*Timeout equals the expected duration returned by
proposalTimeout(proposal) to prove the plumbing forwards the timeout.
In `@controller/proposal/sandbox_agent_test.go`:
- Around line 57-82: Update newTestSandboxAgentCaller and
newTestSandboxAgentCallerWithProposal so the ClientFactory closure captures and
records the time.Duration it receives and the mockSandboxProvider's WaitReady
call records its timeout; specifically, modify the ClientFactory func(_ string,
d time.Duration) to store d onto the provided mockHTTPClient (e.g., a
LastTimeout field) before returning it, and ensure the mockSandboxProvider
exposes a WaitReady(timeout time.Duration) recorder that tests can assert
against; this will let tests verify that non-default proposal timeouts are
propagated to both ClientFactory and WaitReady.
In `@controller/proposal/sandbox_templates_test.go`:
- Around line 68-70: Update tests to exercise a non-nil dataSource when calling
computeTemplateHash: modify mustHash (and the test cases around it) to pass a
populated agenticv1alpha1.DataSource with a distinct ClaimName and verify that
changing ClaimName alters the returned hash; additionally add an assertion that
the computed template includes the expected PVC mount mode (readOnly) for
/data/input by constructing a DataSource whose MountPath is /data/input and
verifying the template metadata or generated mount flags reflect read-only.
Target the helper mustHash and calls to computeTemplateHash so the new
assertions cover the branch where dataSource is non-nil.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f4659329-5901-4895-8a4c-9eb649291e97
⛔ Files ignored due to path filters (3)
api/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated.deepcopy.goconfig/crd/bases/agentic.openshift.io_proposals.yamlis excluded by!config/crd/bases/**config/rbac/role.yamlis excluded by!config/rbac/role.yaml
📒 Files selected for processing (14)
.gitignoreapi/v1alpha1/proposal_types.gocontroller/proposal/agent.gocontroller/proposal/client.gocontroller/proposal/client_test.gocontroller/proposal/handlers.gocontroller/proposal/reconciler.gocontroller/proposal/reconciler_test.gocontroller/proposal/sandbox_agent.gocontroller/proposal/sandbox_agent_test.gocontroller/proposal/sandbox_templates.gocontroller/proposal/sandbox_templates_test.gocontroller/proposal/templates/analysis_query.tmplexamples/setup/09-intelliaide-proposals.yaml
| // +kubebuilder:validation:XValidation:rule="!has(oldSelf.analysis) || (has(self.analysis) && self.analysis == oldSelf.analysis)",message="analysis is immutable once set" | ||
| // +kubebuilder:validation:XValidation:rule="!has(oldSelf.execution) || (has(self.execution) && self.execution == oldSelf.execution)",message="execution is immutable once set" | ||
| // +kubebuilder:validation:XValidation:rule="!has(oldSelf.verification) || (has(self.verification) && self.verification == oldSelf.verification)",message="verification is immutable once set" | ||
| // +kubebuilder:validation:XValidation:rule="!has(oldSelf.dataSource) || (has(self.dataSource) && self.dataSource == oldSelf.dataSource)",message="dataSource is immutable once set" |
There was a problem hiding this comment.
Disallow adding dataSource after creation.
This rule still permits nil -> non-nil updates because the first branch is true whenever oldSelf.dataSource is unset. That breaks the “fixed at creation” contract and allows an in-flight Proposal to change which PVC gets mounted.
🛠️ Possible fix
-// +kubebuilder:validation:XValidation:rule="!has(oldSelf.dataSource) || (has(self.dataSource) && self.dataSource == oldSelf.dataSource)",message="dataSource is immutable once set"
+// +kubebuilder:validation:XValidation:rule="has(oldSelf.dataSource) ? (has(self.dataSource) && self.dataSource == oldSelf.dataSource) : !has(self.dataSource)",message="dataSource is immutable after creation"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // +kubebuilder:validation:XValidation:rule="!has(oldSelf.dataSource) || (has(self.dataSource) && self.dataSource == oldSelf.dataSource)",message="dataSource is immutable once set" | |
| // +kubebuilder:validation:XValidation:rule="has(oldSelf.dataSource) ? (has(self.dataSource) && self.dataSource == oldSelf.dataSource) : !has(self.dataSource)",message="dataSource is immutable after creation" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@api/v1alpha1/proposal_types.go` at line 297, The XValidation rule for the
dataSource field currently allows nil->non-nil updates; replace the rule so only
two cases are permitted: both old and new lack dataSource, or both have it and
it is equal. Update the kubebuilder annotation on the dataSource validation (the
comment containing XValidation) to use an expression such as
"(has(oldSelf.dataSource) && has(self.dataSource) && self.dataSource ==
oldSelf.dataSource) || (!has(oldSelf.dataSource) && !has(self.dataSource))" so
adding a dataSource after creation is disallowed.
| ## Skills | ||
|
|
||
| A specialist deep-RCA pipeline is available at `/app/skills/intelliaide/SKILL.md`. | ||
|
|
||
| Use it ONLY when the request calls for: | ||
| - Root cause analysis (RCA) or deeper / ML-assisted troubleshooting of a cluster issue | ||
| - Must-gather collection or analysis | ||
| - Investigating pod failures, etcd degradation, networking problems, storage issues, etc. | ||
| - Any request that explicitly mentions "deeper analysis", "deeper troubleshooting", "root cause", "RCA", "must-gather", or "IntelliAide" | ||
|
|
||
| For routine inspection (checking pod/node status, listing events, summarising resource state, | ||
| describing objects), use `kubectl`/`oc` commands directly — do NOT invoke the IntelliAide pipeline. | ||
|
|
||
| **Decision rule — apply before doing anything else:** | ||
| 1. Read the `## Request` section below. | ||
| 2. If it is a routine inspection query → proceed with `kubectl`/`oc` directly. | ||
| 3. If it is a deep-RCA or troubleshooting request → read the skill file with ONE atomic command: | ||
| ``` | ||
| cat /app/skills/intelliaide/SKILL.md | ||
| ``` | ||
| If the command returns one or more paths, read the most relevant SKILL.md with `cat` | ||
| and follow its workflow **exactly** instead of the instructions below. | ||
| If no SKILL.md files are found, stop immediately and return a JSON error response — skills are required and their absence is a fatal misconfiguration. | ||
|
|
||
| ## Analysis requirements |
There was a problem hiding this comment.
Surface /data/input in the prompt when spec.dataSource is set.
This template never tells the agent that PVC-backed input may already be mounted at /data/input. The operator now mounts spec.dataSource there, but buildAnalysisQuery(...) only passes Request, HasExecution, and HasVerification, so PVC-backed RCA will still look like live-mode analysis and may ignore the supplied bundle entirely.
This needs a small contract change in controller/proposal/helpers.go plus a conditional instruction here, e.g. “input data is mounted read-only at /data/input; prefer analyzing that before collecting new live data.”
Also applies to: 27-42
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controller/proposal/templates/analysis_query.tmpl` around lines 3 - 27, The
template fails to tell the agent that PVC-backed input may already be mounted at
/data/input when spec.dataSource is set, so update the contract and template:
modify buildAnalysisQuery in controller/proposal/helpers.go to accept a
flag/field (e.g., hasDataSource or includeMountedInput) derived from the
Request/spec.dataSource (so callers can indicate PVC-backed input is present),
and change controller/proposal/templates/analysis_query.tmpl to conditionally
include a short directive like "Input data is mounted read-only at /data/input;
prefer analyzing that bundle before collecting live data" when that flag is
true; ensure the helpers.go change sets the flag when spec.dataSource is present
so PVC-backed RCA is handled as bundle-first analysis.
| 3. If it is a deep-RCA or troubleshooting request → read the skill file with ONE atomic command: | ||
| ``` | ||
| cat /app/skills/intelliaide/SKILL.md | ||
| ``` | ||
| If the command returns one or more paths, read the most relevant SKILL.md with `cat` | ||
| and follow its workflow **exactly** instead of the instructions below. | ||
| If no SKILL.md files are found, stop immediately and return a JSON error response — skills are required and their absence is a fatal misconfiguration. |
There was a problem hiding this comment.
Make the skill-loading branch internally consistent.
Line 21 uses cat /app/skills/intelliaide/SKILL.md, but Lines 23-25 talk about that command “returning one or more paths.” cat returns file contents or an error, so this branch is impossible as written and can derail the RCA routing logic.
Suggested prompt fix
3. If it is a deep-RCA or troubleshooting request → read the skill file with ONE atomic command:
```
cat /app/skills/intelliaide/SKILL.md
```
- If the command returns one or more paths, read the most relevant SKILL.md with `cat`
- and follow its workflow **exactly** instead of the instructions below.
- If no SKILL.md files are found, stop immediately and return a JSON error response — skills are required and their absence is a fatal misconfiguration.
+ If that command fails because the file does not exist, stop immediately and return a JSON error response — skills are required and their absence is a fatal misconfiguration.
+ Follow its workflow **exactly** instead of the instructions below.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 3. If it is a deep-RCA or troubleshooting request → read the skill file with ONE atomic command: | |
| ``` | |
| cat /app/skills/intelliaide/SKILL.md | |
| ``` | |
| If the command returns one or more paths, read the most relevant SKILL.md with `cat` | |
| and follow its workflow **exactly** instead of the instructions below. | |
| If no SKILL.md files are found, stop immediately and return a JSON error response — skills are required and their absence is a fatal misconfiguration. | |
| 3. If it is a deep-RCA or troubleshooting request → read the skill file with ONE atomic command: |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controller/proposal/templates/analysis_query.tmpl` around lines 19 - 25, The
skill-loading branch is inconsistent: "cat /app/skills/intelliaide/SKILL.md"
cannot "return one or more paths." Update the logic around the atomic command
(the literal string "cat /app/skills/intelliaide/SKILL.md" and references to
SKILL.md) so it is internally consistent — either change the command to a
listing/find command that can return paths (e.g., use a glob/find/ls conceptual
step) or change the prose to state that if the cat command fails because the
file does not exist you must stop and return a JSON error, and otherwise read
that SKILL.md and follow its workflow exactly; ensure the branch no longer
claims cat can return multiple paths.
| # IntelliAide triggering: | ||
| # The analysis prompt (analysis_query.tmpl) reads the request text and invokes | ||
| # the IntelliAide pipeline when the request contains RCA keywords such as | ||
| # "root cause analysis", "RCA", "deeper analysis", "must-gather", or "IntelliAide". | ||
| # All requests in this file are phrased to trigger IntelliAide automatically. | ||
| # | ||
| # Modes: | ||
| # Live mode (default) — IntelliAide reads live cluster state via kubectl/oc. | ||
| # Must-gather mode — set RUN_MUST_GATHER=1 env var, or phrase the request | ||
| # with "using must-gather" / "collect must-gather". | ||
| # |
There was a problem hiding this comment.
Add one concrete dataSource example to this manifest.
This PR introduces PVC-backed input, but every Proposal here is live-mode only and none show the required spec.dataSource.claimName shape. That makes the new /data/input workflow hard to discover from the only example file shipped with the feature.
A single example with:
dataSource:
claimName: <pre-existing-pvc>would make the new API immediately usable.
Also applies to: 46-188
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/setup/09-intelliaide-proposals.yaml` around lines 21 - 31, Add one
concrete example of the PVC-backed input to the IntelliAide proposals by
updating a Proposal's spec to include a dataSource block using the exact shape
spec.dataSource.claimName and a placeholder PVC name (e.g., <pre-existing-pvc>);
locate any Proposal objects in this file that currently run in live mode and add
a dataSource: claimName: <pre-existing-pvc> under their spec so the new
/data/input workflow and PVC-backed input are discoverable (ensure the YAML key
is spec.dataSource.claimName and that only the example uses a placeholder to
avoid changing runtime behavior).
| # 1. Build and push skills image: | ||
| # podman build -f Dockerfile.skills \ | ||
| # -t quay.io/<org>/intelliaide-skills:latest . | ||
| # podman push quay.io/<org>/intelliaide-skills:latest |
There was a problem hiding this comment.
Align the documented image build/push steps with the image each Proposal references.
Lines 33-36 tell users to publish quay.io/<org>/intelliaide-skills:latest, but every Proposal points at image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest. Following this file verbatim leaves the referenced image unavailable, so the examples do not work out-of-the-box.
Please either update the Proposal manifests to use the documented image, or add the missing retag/push step for the in-cluster registry.
Also applies to: 70-72, 101-103, 147-149, 183-185
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/setup/09-intelliaide-proposals.yaml` around lines 33 - 36, The docs
instruct building/pushing quay.io/<org>/intelliaide-skills:latest but all
Proposal manifests reference
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest,
so examples fail; either (A) update each Proposal image field to
quay.io/<org>/intelliaide-skills:latest (search for the string
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
in the Proposal resources and replace) or (B) add the missing retag/push steps
after the build: retag quay image to the in-cluster registry name and push it
(i.e., docker/podman tag quay.io/<org>/intelliaide-skills:latest
image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest
and push), and apply this fix for all occurrences noted (the other blocks at the
same pattern: lines ~70-72, 101-103, 147-149, 183-185).
| // +optional | ||
| // +kubebuilder:validation:Minimum=1 | ||
| // +kubebuilder:validation:Maximum=60 | ||
| TimeoutMinutes *int32 `json:"timeoutMinutes,omitempty"` |
There was a problem hiding this comment.
This should be on ProposalStep, not on ProposalSpec. Right now all four steps (analysis, execution, verification, escalation) get the same timeout — but IntelliAide RCA analysis might need 30 minutes while a verification step that runs kubectl get pods needs 2. A single proposal-level knob forces users to over-provision the timeout for every step just to accommodate the slowest one.
Would you consider moving it to ProposalStep? That would also align with how agent and tools are already scoped — per-step, not per-proposal.
| @@ -1,5 +1,31 @@ | |||
| You are an analysis agent. Your job is to diagnose the problem, determine the root cause, and propose one or more remediation options. Do NOT attempt to fix, patch, or execute any changes — only analyze and propose. | |||
|
|
|||
| ## Skills | |||
There was a problem hiding this comment.
The operator is skill-agnostic by design — it orchestrates Proposal lifecycles without knowing which skills are mounted. Hardcoding intelliaide references and keyword-routing here breaks that separation, and making skill absence fatal would break any Proposal that doesn't mount skills.
No changes to the analysis template are needed. spec.request already carries the user's intent to the agent — the adapter creating the Proposal can embed whatever context it needs there (e.g., "perform deep RCA on diagnostic data at /data/input"). The agent sees the request, sees the skills in its filesystem, and decides what to use. The operator shouldn't be involved in that decision.
There was a problem hiding this comment.
yeah this makes sense - so if we are triggering the proposal manually, this is not needed and if we want it to be automatic we need an adapter. either way this is not necessary.
| // pre-populated with data before the Proposal is created. The operator | ||
| // mounts it read-only at a well-known path (/data/input) accessible to | ||
| // all skills in the sandbox pod. | ||
| type DataSource struct { |
There was a problem hiding this comment.
Do we need PVC support here? The sandbox pods are ephemeral — if the agent needs must-gather data, it can collect it during its run (e.g., oc adm must-gather -o /tmp/must-gather) and work with it locally. The skill or spec.request can instruct the agent to do that.
This avoids adding new API surface (DataSource type, immutability rules, PVC RBAC, volume mount patching in sandbox templates) for something the agent can already do with its existing tools and a writable /tmp.
There was a problem hiding this comment.
@harche while that can be done, it restricts the ability to bring a must-gather from outside and analyze it, i.e customer cases.
| // | ||
| // Immutable: input data source is fixed at creation. | ||
| // +optional | ||
| DataSource *DataSource `json:"dataSource,omitzero"` |
There was a problem hiding this comment.
If you set add the DataSource here then all the steps (i.e. analyse, execution etc) would need mount that data source, but that may not be necessary in all use cases. Consider moving this to per-step, so data source is attached only to the steps that need it.
There was a problem hiding this comment.
Or maybe consider making it part of ToolsSpec above, that way it will be available at the proposal as well as step level.
31261be to
89cd2a4
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
examples/setup/09-intelliaide-proposals.yaml (1)
37-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAlign the documented image push target with the image referenced in Proposal specs.
Line 37–40 tells users to publish
quay.io/<org>/intelliaide-skills:latest, but Lines 73, 104, 149, 184, and 220 reference a different in-cluster image path. Following this file verbatim can leave the referenced Proposal image unavailable.Proposed minimal fix (make Proposal images match docs)
@@ - - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest + - image: quay.io/<org>/intelliaide-skills:latest @@ - - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest + - image: quay.io/<org>/intelliaide-skills:latest @@ - - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest + - image: quay.io/<org>/intelliaide-skills:latest @@ - - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest + - image: quay.io/<org>/intelliaide-skills:latest @@ - - image: image-registry.openshift-image-registry.svc:5000/openshift-lightspeed/lightspeed-skills:latest + - image: quay.io/<org>/intelliaide-skills:latestAlso applies to: 73-74, 104-105, 149-150, 184-185, 220-220
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/setup/09-intelliaide-proposals.yaml` around lines 37 - 40, The documented push target quay.io/<org>/intelliaide-skills:latest is inconsistent with the image path used in the Proposal specs; update either the push command or the Proposal YAMLs so they match. Specifically, change the podman/push target in the build instructions (the quay.io/<org>/intelliaide-skills:latest line) to the exact in-cluster image path referenced by the Proposal entries (the image string used in the Proposal specs at lines referencing the Proposal images) or vice versa so all references use the same image name and tag across the file.
🧹 Nitpick comments (1)
controller/proposal/sandbox_templates.go (1)
282-287: ⚡ Quick winUse constant error identifiers for the newly added wrapped errors.
Several new wraps (for example Line 282 and Line 573) use inline string literals in
fmt.Errorf. Please move these messages toconsterror identifiers and keep%wwrapping.Proposed pattern
+const ( + ErrSetLightspeedProvider = "set LIGHTSPEED_PROVIDER" + ErrAddDataSourcePVCVolume = "add data source PVC volume" +) ... - return fmt.Errorf("set LIGHTSPEED_PROVIDER: %w", err) + return fmt.Errorf("%s: %w", ErrSetLightspeedProvider, err) ... - return fmt.Errorf("add data source PVC volume: %w", err) + return fmt.Errorf("%s: %w", ErrAddDataSourcePVCVolume, err)As per coding guidelines,
**/*.go: Define errors as constants usingconst ErrFoo = "…"and wrap them withfmt.Errorf("%s: %w", …)for context.Also applies to: 301-303, 307-315, 317-319, 323-325, 333-339, 343-349, 573-575
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/proposal/sandbox_templates.go` around lines 282 - 287, Replace inline error message strings in fmt.Errorf wraps with file-level const error identifiers and preserve %w wrapping; for each occurrence (e.g., the calls wrapping errors from setEnvVar in the block using LIGHTSPEED_PROVIDER and LIGHTSPEED_MODEL and the other listed lines) define a const like ErrSetLightspeedProvider = "set LIGHTSPEED_PROVIDER" and ErrSetLightspeedModel = "set LIGHTSPEED_MODEL" (and similar names for the other cases), then change the wraps to fmt.Errorf("%s: %w", ErrSetLightspeedProvider, err) etc., keeping the same wrapped original error and use the existing symbols (setEnvVar, tmpl, model, providerTypeString, llm.Spec.Type) to locate the sites to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/v1alpha1/proposal_types.go`:
- Around line 262-272: The step-level immutability check currently compares
entire objects (spec.analysis, spec.execution, spec.verification) so changing
TimeoutMinutes (*TimeoutMinutes) gets rejected; update the immutability
validation to exclude timeoutMinutes from the equality check (i.e., allow
changes where only spec.*.timeoutMinutes hasChanged). Modify the webhook/CEL
rule or validation logic that enforces immutability so it compares individual
fields (or uses hasChanged(...) to whitelist spec.analysis.timeoutMinutes,
spec.execution.timeoutMinutes, spec.verification.timeoutMinutes) instead of
comparing the whole object, referencing the TimeoutMinutes field in
proposal_types.go and the spec.analysis/spec.execution/spec.verification
immutability checks.
In `@controller/proposal/handlers.go`:
- Around line 510-512: The escalation path loses the configured timeout because
the temporary resolvedStep used for override agents omits TimeoutMinutes;
compute the timeout once from stepTimeout(step) (or copy step.TimeoutMinutes)
before building resolvedStep and either set resolvedStep.TimeoutMinutes =
step.TimeoutMinutes or pass the precomputed timeout into r.Agent.Escalate so the
override path uses the same timeout value; update the logic around
stepTimeout(step), resolvedStep, and the r.Agent.Escalate call to preserve the
original timeout.
In `@controller/proposal/sandbox_agent.go`:
- Around line 197-201: Replace the hard-coded defaultSandboxTimeout passed to
s.Sandbox.WaitReady so sandbox readiness respects the step's configured timeout:
compute the timeout for WaitReady from the step's timeout (e.g., the step's
TimeoutMinutes or the remaining time on ctx's deadline), then pass that derived
duration instead of defaultSandboxTimeout to s.Sandbox.WaitReady (use the same
claimName and ctx). Ensure you handle zero/negative remaining time and fall back
to a sensible minimum if needed.
---
Duplicate comments:
In `@examples/setup/09-intelliaide-proposals.yaml`:
- Around line 37-40: The documented push target
quay.io/<org>/intelliaide-skills:latest is inconsistent with the image path used
in the Proposal specs; update either the push command or the Proposal YAMLs so
they match. Specifically, change the podman/push target in the build
instructions (the quay.io/<org>/intelliaide-skills:latest line) to the exact
in-cluster image path referenced by the Proposal entries (the image string used
in the Proposal specs at lines referencing the Proposal images) or vice versa so
all references use the same image name and tag across the file.
---
Nitpick comments:
In `@controller/proposal/sandbox_templates.go`:
- Around line 282-287: Replace inline error message strings in fmt.Errorf wraps
with file-level const error identifiers and preserve %w wrapping; for each
occurrence (e.g., the calls wrapping errors from setEnvVar in the block using
LIGHTSPEED_PROVIDER and LIGHTSPEED_MODEL and the other listed lines) define a
const like ErrSetLightspeedProvider = "set LIGHTSPEED_PROVIDER" and
ErrSetLightspeedModel = "set LIGHTSPEED_MODEL" (and similar names for the other
cases), then change the wraps to fmt.Errorf("%s: %w", ErrSetLightspeedProvider,
err) etc., keeping the same wrapped original error and use the existing symbols
(setEnvVar, tmpl, model, providerTypeString, llm.Spec.Type) to locate the sites
to update.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ea29bc9e-57fb-4b1c-869c-c5572b79c808
⛔ Files ignored due to path filters (4)
api/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated.deepcopy.goconfig/crd/bases/agentic.openshift.io_proposals.yamlis excluded by!config/crd/bases/**config/rbac/role.yamlis excluded by!config/rbac/role.yamlgo.sumis excluded by!**/*.sum,!go.sum
📒 Files selected for processing (16)
.ai/spec/how/reconciler.md.gitignoreapi/v1alpha1/proposal_types.goapi/v1alpha1/tools_types.gocontroller/proposal/agent.gocontroller/proposal/client.gocontroller/proposal/client_test.gocontroller/proposal/handlers.gocontroller/proposal/reconciler.gocontroller/proposal/reconciler_test.gocontroller/proposal/resolve.gocontroller/proposal/sandbox_agent.gocontroller/proposal/sandbox_agent_test.gocontroller/proposal/sandbox_templates.gocontroller/proposal/sandbox_templates_test.goexamples/setup/09-intelliaide-proposals.yaml
✅ Files skipped from review due to trivial changes (2)
- .gitignore
- .ai/spec/how/reconciler.md
| // timeoutMinutes sets the timeout for this step's sandbox agent call. | ||
| // This controls how long the operator waits for the sandbox pod to | ||
| // become ready and for the agent to complete its work. Increase this | ||
| // for long-running tools (e.g., IntelliAide RCA takes 10-30 minutes). | ||
| // Defaults to 5 minutes when omitted. | ||
| // | ||
| // Mutable: can be adjusted before approving a step. | ||
| // +optional | ||
| // +kubebuilder:validation:Minimum=1 | ||
| // +kubebuilder:validation:Maximum=60 | ||
| TimeoutMinutes *int32 `json:"timeoutMinutes,omitempty"` |
There was a problem hiding this comment.
timeoutMinutes is documented as mutable but blocked by current CEL immutability.
Line 262 introduces a mutable field, but the existing step-level immutability rules on spec.analysis, spec.execution, and spec.verification compare whole objects, so changing only timeoutMinutes will still be rejected on update.
💡 Suggested direction
- // +kubebuilder:validation:XValidation:rule="!has(oldSelf.analysis) || (has(self.analysis) && self.analysis == oldSelf.analysis)",message="analysis is immutable once set"
- // +kubebuilder:validation:XValidation:rule="!has(oldSelf.execution) || (has(self.execution) && self.execution == oldSelf.execution)",message="execution is immutable once set"
- // +kubebuilder:validation:XValidation:rule="!has(oldSelf.verification) || (has(self.verification) && self.verification == oldSelf.verification)",message="verification is immutable once set"
+ // Keep agent/tools immutable, but allow timeoutMinutes updates.
+ // (e.g., compare only agent/tools in these rules or move immutability checks to ProposalStep fields)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@api/v1alpha1/proposal_types.go` around lines 262 - 272, The step-level
immutability check currently compares entire objects (spec.analysis,
spec.execution, spec.verification) so changing TimeoutMinutes (*TimeoutMinutes)
gets rejected; update the immutability validation to exclude timeoutMinutes from
the equality check (i.e., allow changes where only spec.*.timeoutMinutes
hasChanged). Modify the webhook/CEL rule or validation logic that enforces
immutability so it compares individual fields (or uses hasChanged(...) to
whitelist spec.analysis.timeoutMinutes, spec.execution.timeoutMinutes,
spec.verification.timeoutMinutes) instead of comparing the whole object,
referencing the TimeoutMinutes field in proposal_types.go and the
spec.analysis/spec.execution/spec.verification immutability checks.
| timeout := stepTimeout(step) | ||
| escalationResult, err := r.Agent.Escalate(ctx, proposal, step, escalationText, timeout) | ||
| if err != nil { |
There was a problem hiding this comment.
Escalation override path drops configured timeout.
stepTimeout(step) at Line 510 uses step.TimeoutMinutes, but when an override agent is set the temporary resolvedStep omits TimeoutMinutes, so escalation falls back to default 5 minutes.
💡 Suggested fix
- step = resolvedStep{Agent: &agent, LLM: &llm, Tools: step.Tools}
+ step = resolvedStep{Agent: &agent, LLM: &llm, Tools: step.Tools, TimeoutMinutes: step.TimeoutMinutes}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controller/proposal/handlers.go` around lines 510 - 512, The escalation path
loses the configured timeout because the temporary resolvedStep used for
override agents omits TimeoutMinutes; compute the timeout once from
stepTimeout(step) (or copy step.TimeoutMinutes) before building resolvedStep and
either set resolvedStep.TimeoutMinutes = step.TimeoutMinutes or pass the
precomputed timeout into r.Agent.Escalate so the override path uses the same
timeout value; update the logic around stepTimeout(step), resolvedStep, and the
r.Agent.Escalate call to preserve the original timeout.
| // Pod startup is an infrastructure concern unrelated to the agent's work | ||
| // budget. Using a fixed ceiling here ensures that the full user-configured | ||
| // timeout is available for the agent call itself, and avoids the effective | ||
| // wall-clock time being 2× the configured value. | ||
| endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout) |
There was a problem hiding this comment.
Step timeout is not applied to sandbox readiness.
Line 201 hard-caps WaitReady at 5 minutes, so a proposal with timeoutMinutes: 30 can still fail during pod startup at 5 minutes before the agent call begins.
💡 Suggested fix
- endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout)
+ endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Pod startup is an infrastructure concern unrelated to the agent's work | |
| // budget. Using a fixed ceiling here ensures that the full user-configured | |
| // timeout is available for the agent call itself, and avoids the effective | |
| // wall-clock time being 2× the configured value. | |
| endpoint, err := s.Sandbox.WaitReady(ctx, claimName, defaultSandboxTimeout) | |
| // Pod startup is an infrastructure concern unrelated to the agent's work | |
| // budget. Using a fixed ceiling here ensures that the full user-configured | |
| // timeout is available for the agent call itself, and avoids the effective | |
| // wall-clock time being 2× the configured value. | |
| endpoint, err := s.Sandbox.WaitReady(ctx, claimName, timeout) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controller/proposal/sandbox_agent.go` around lines 197 - 201, Replace the
hard-coded defaultSandboxTimeout passed to s.Sandbox.WaitReady so sandbox
readiness respects the step's configured timeout: compute the timeout for
WaitReady from the step's timeout (e.g., the step's TimeoutMinutes or the
remaining time on ctx's deadline), then pass that derived duration instead of
defaultSandboxTimeout to s.Sandbox.WaitReady (use the same claimName and ctx).
Ensure you handle zero/negative remaining time and fall back to a sensible
minimum if needed.
|
@sakshipatels98-byte: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Enable IntelliAide-style deep RCA by letting Proposals reference pre-loaded
diagnostic data and run long analysis steps in the sandbox.
spec.dataSource) — optional, immutable reference to apre-existing PVC in the Proposal namespace. The operator mounts it read-only
at
/data/inputin the sandbox pod (via sandbox template patching).spec.timeoutMinutes) — optional 1–60minutes per step (default 5). Used for analysis, execution, verification,
and escalation so long-running RCA can override the default.
analysis_query.tmplrequires skill discoveryfrom
/app/skillsbefore generic analysis; whendataSourceis set, tellsthe agent diagnostic data is available at
/data/input.get/list/watch; result CR*/statussub-resources.examples/setup/09-intelliaide-proposals.yamlwithdataSource,timeoutMinutes: 30, and analysis-only IntelliAide workflows.Companion PR:
agentic-skillsadds theintelliaide/skill pipeline thatreads from
/data/input.API Changes
spec.dataSource.claimNamespec.timeoutMinutesTest Plan
make testpasses (unit tests updated for new signatures)make manifestsproduces clean CRD with both new fieldsdataSource→ verify volume mounted at/data/inputtimeoutMinutes: 30→ operator waits 30mSummary by CodeRabbit
New Features
/data/inputin sandbox pods.Documentation