Align spec structure and fix health report findings#2941
Conversation
- Absorb what/README.md and how/README.md indexes into main .ai/spec/README.md - Add cross-reference table mapping what/ to how/ files - Complete conventions section (rule numbering, authority, constraints, new-file guidance) - Add ## Specs pointer in CLAUDE.md/AGENTS.md - Create ARCHITECTURE.md with Mermaid diagrams for human onboarding Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stale: - how/query-pipeline.md: document LLMExecutionAgent refactoring (tool-calling loop moved from DocsSummarizer to separate class) - what/system-overview.md: remove [PLANNED] from shipped Google Vertex providers (Gemini and Anthropic/Claude) - what/llm-providers.md: clarify Anthropic is available via Vertex, direct provider still planned Missing: - Create how/auth.md: strategy pattern, K8sClientSingleton, TokenReview/SAR flow - Create how/quota.md: limiter abstraction, PostgreSQL schema, scheduler daemon - how/project-structure.md: add LLMExecutionAgent to module map and request flow Structural: - what/observability.md: remove constants.py file reference (boundary violation) - README.md: add new how/ files to index and cross-reference table - health-report.md: initial health evaluation results Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughThis PR adds comprehensive architecture documentation for OpenShift LightSpeed (OLS) and reorganizes existing specifications to document a new architectural split: LLMExecutionAgent now handles multi-round tool-calling while DocsSummarizer orchestrates pipeline stages 1–5. New auth and quota architecture specifications are introduced, and all specs are cross-referenced and structured under a health-assessed directory. ChangesArchitecture and Specification Documentation
🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.ai/spec/health-report.md:
- Line 35: The sentence "All 8 provider files in code match the spec (openai,
azure_openai, watsonx, rhoai_vllm, rhelai_vllm, google_vertex, fake_provider)"
is inconsistent: either change the count from 8 to 7 or add the missing provider
name to the parenthetical list; edit the string in .ai/spec/health-report.md so
the numeric count and the listed provider identifiers (openai, azure_openai,
watsonx, rhoai_vllm, rhelai_vllm, google_vertex, fake_provider) match exactly.
In @.ai/spec/how/auth.md:
- Around line 19-26: The fenced code blocks showing the authentication
pseudocode (including symbols like get_auth_dependency, k8s.AuthDependency,
noop.AuthDependency, noop_with_token.AuthDependency, _extract_bearer_token,
get_user_info, kubernetes.AuthenticationV1Api.create_token_review,
V1TokenReview, SubjectAccessReview,
kubernetes.AuthorizationV1Api.create_subject_access_review, and
V1SubjectAccessReview) are missing language identifiers; update both fenced
blocks to include a language tag (e.g., ```text) so the blocks become ```text
... ``` and comply with MD040 and the markdown linter.
In @.ai/spec/how/query-pipeline.md:
- Around line 129-151: The fenced code block showing the DocsSummarizer pipeline
is missing a language identifier causing MD040 lint errors; update the opening
fence to include a language token (e.g., change ``` to ```text) for the block
that begins with "DocsSummarizer delegates to self._llm_agent.execute():" so the
markdown linter recognizes it, leaving the block content (including references
to _iterate_with_tools, _collect_round_llm_chunks,
_process_tool_calls_for_round, and Stage 7: Finalization) unchanged.
In @.ai/spec/how/quota.md:
- Around line 22-39: The fenced flow diagrams (e.g., the blocks showing
process_request(), start_quota_scheduler(), quota_scheduler(),
token_usage_history.consume_tokens(), limiter.ensure_available_quota(),
limiter.consume_tokens(), and limiter.available_quota()) are missing language
identifiers and trigger MD040; update each triple-backtick fence to include a
language tag such as "text" (or "mermaid" if appropriate) so they comply with
the linter (e.g., change ``` to ```text for the blocks around process_request()
and start_quota_scheduler()), and apply the same fix to the other affected
fences referenced in the comment (lines 43-56).
In `@ARCHITECTURE.md`:
- Around line 77-82: The loop in the sequence diagram incorrectly shows
DocsSummarizer performing the multi-round tool-calling loop; move ownership to
LLMExecutionAgent by changing the loop block to surround LLMExecutionAgent and
update the message flow so the LLM sends a tool-call request to
LLMExecutionAgent (e.g., "LLM-->>LLMExecutionAgent: Tool call request"),
LLMExecutionAgent executes the tool via DocsSummarizer
("LLMExecutionAgent->>DocsSummarizer: Execute tool"), DocsSummarizer returns
results to LLMExecutionAgent ("DocsSummarizer-->>LLMExecutionAgent: Tool
result"), and LLMExecutionAgent feeds results back to LLM and re-invokes the
loop (loop up to max_iterations).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 49df8348-ceee-445c-9afa-74fbf0baa3dd
📒 Files selected for processing (13)
.ai/spec/README.md.ai/spec/health-report.md.ai/spec/how/README.md.ai/spec/how/auth.md.ai/spec/how/project-structure.md.ai/spec/how/query-pipeline.md.ai/spec/how/quota.md.ai/spec/what/README.md.ai/spec/what/llm-providers.md.ai/spec/what/observability.md.ai/spec/what/system-overview.mdAGENTS.mdARCHITECTURE.md
💤 Files with no reviewable changes (2)
- .ai/spec/what/README.md
- .ai/spec/how/README.md
|
|
||
| ## No issues | ||
|
|
||
| - All 8 provider files in code match the spec (openai, azure_openai, watsonx, rhoai_vllm, rhelai_vllm, google_vertex, fake_provider) |
There was a problem hiding this comment.
Provider count and list are inconsistent.
Line 35 says “8 provider files” but the list includes 7 entries. Please either update the count or include the missing provider name.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.ai/spec/health-report.md at line 35, The sentence "All 8 provider files in
code match the spec (openai, azure_openai, watsonx, rhoai_vllm, rhelai_vllm,
google_vertex, fake_provider)" is inconsistent: either change the count from 8
to 7 or add the missing provider name to the parenthetical list; edit the string
in .ai/spec/health-report.md so the numeric count and the listed provider
identifiers (openai, azure_openai, watsonx, rhoai_vllm, rhelai_vllm,
google_vertex, fake_provider) match exactly.
| ``` | ||
| olsconfig.yaml: authentication_config.module = "k8s" | "noop" | "noop-with-token" | ||
| -> get_auth_dependency(ols_config, virtual_path="/ols-access") | ||
| match module: | ||
| "k8s" -> k8s.AuthDependency(virtual_path) | ||
| "noop" -> noop.AuthDependency(virtual_path) | ||
| "noop-with-token" -> noop_with_token.AuthDependency(virtual_path) | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks.
Both fenced blocks are missing a language tag, which triggers MD040 and breaks consistent markdown lint compliance.
Suggested diff
-```
+```text
olsconfig.yaml: authentication_config.module = "k8s" | "noop" | "noop-with-token"
-> get_auth_dependency(ols_config, virtual_path="/ols-access")
match module:
"k8s" -> k8s.AuthDependency(virtual_path)
"noop" -> noop.AuthDependency(virtual_path)
"noop-with-token" -> noop_with_token.AuthDependency(virtual_path)@@
- +text
HTTP request with Authorization: Bearer
-> _extract_bearer_token(header) -> token string
-> get_user_info(token)
kubernetes.AuthenticationV1Api.create_token_review(V1TokenReview(token))
-> if authenticated: return V1TokenReviewStatus (uid, username, groups)
-> if not authenticated: raise HTTPException(403)
-> SubjectAccessReview
kubernetes.AuthorizationV1Api.create_subject_access_review(
V1SubjectAccessReview(user, groups, non_resource_attributes={
path: virtual_path, # "/ols-access" or "/ols-metrics-access"
verb: "get"
})
)
-> if allowed: return (uid, username, False, token)
-> if denied: raise HTTPException(403)
Also applies to: 32-48
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 19-19: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.ai/spec/how/auth.md around lines 19 - 26, The fenced code blocks showing
the authentication pseudocode (including symbols like get_auth_dependency,
k8s.AuthDependency, noop.AuthDependency, noop_with_token.AuthDependency,
_extract_bearer_token, get_user_info,
kubernetes.AuthenticationV1Api.create_token_review, V1TokenReview,
SubjectAccessReview, kubernetes.AuthorizationV1Api.create_subject_access_review,
and V1SubjectAccessReview) are missing language identifiers; update both fenced
blocks to include a language tag (e.g., ```text) so the blocks become ```text
... ``` and comply with MD040 and the markdown linter.
| ``` | ||
| DocsSummarizer delegates to self._llm_agent.execute(): | ||
|
|
||
| Stage 6: Tool-calling loop (_iterate_with_tools) | ||
| for round 1..max_rounds: | ||
| _collect_round_llm_chunks() -> text/reasoning chunks + tool_call chunks | ||
| yield text/reasoning StreamedChunks | ||
| exit conditions: | ||
| - LLM emits finish_reason="stop" (should_stop flag) | ||
| - final round reached (i == max_rounds) | ||
| - no tool calls in response | ||
| - tool execution exception | ||
| _process_tool_calls_for_round(): | ||
| _resolve_tool_call_definitions() (skip duplicates, missing, ambiguous) | ||
| yield TOOL_CALL chunks | ||
| execute tools within round budget (enforce_tool_token_budget) | ||
| yield TOOL_RESULT chunks | ||
| append AI message + tool messages to prompt for next round | ||
| charge AI_ROUND and TOOL_RESULT | ||
|
|
||
| Stage 7: Finalization | ||
| yield END chunk with rag_chunks, truncated flag, token_counter | ||
| ``` |
There was a problem hiding this comment.
Add a language identifier to the fenced code block.
This block is missing a fenced-code language tag (MD040).
Suggested fix
-```
+```text
DocsSummarizer delegates to self._llm_agent.execute():
...
Stage 7: Finalization
yield END chunk with rag_chunks, truncated flag, token_counter</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 129-129: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.ai/spec/how/query-pipeline.md around lines 129 - 151, The fenced code block
showing the DocsSummarizer pipeline is missing a language identifier causing
MD040 lint errors; update the opening fence to include a language token (e.g.,
change totext) for the block that begins with "DocsSummarizer delegates
to self._llm_agent.execute():" so the markdown linter recognizes it, leaving the
block content (including references to _iterate_with_tools,
_collect_round_llm_chunks, _process_tool_calls_for_round, and Stage 7:
Finalization) unchanged.
</details>
<!-- fingerprinting:phantom:triton:hawk -->
<!-- This is an auto-generated comment by CodeRabbit -->
| ``` | ||
| process_request() | ||
| -> check_tokens_available(config.quota_limiters, user_id) | ||
| for each limiter: | ||
| limiter.ensure_available_quota(subject_id=user_id) | ||
| SELECT available FROM quota_limits WHERE id=user_id AND subject='u' | ||
| if available <= 0: raise QuotaExceedError -> HTTP 500 | ||
| ... LLM processing ... | ||
| -> consume_tokens(config.quota_limiters, config.token_usage_history, ...) | ||
| token_usage_history.consume_tokens(user_id, provider, model, in, out) [if enabled] | ||
| INSERT ... ON CONFLICT DO UPDATE SET input_tokens=input_tokens+N | ||
| for each limiter: | ||
| limiter.consume_tokens(input_tokens, output_tokens, subject_id=user_id) | ||
| UPDATE quota_limits SET available=available-(in+out) WHERE id=user_id AND subject='u' | ||
| -> get_available_quotas(config.quota_limiters, user_id) | ||
| for each limiter: limiter.available_quota(user_id) -> int | ||
| returns {"UserQuotaLimiter": 450, "ClusterQuotaLimiter": 8950} | ||
| ``` |
There was a problem hiding this comment.
Specify fence languages for flow diagrams.
These code fences are missing language identifiers and trigger MD040.
Suggested diff
-```
+```text
process_request()
-> check_tokens_available(config.quota_limiters, user_id)
for each limiter:
limiter.ensure_available_quota(subject_id=user_id)
SELECT available FROM quota_limits WHERE id=user_id AND subject='u'
if available <= 0: raise QuotaExceedError -> HTTP 500
... LLM processing ...
-> consume_tokens(config.quota_limiters, config.token_usage_history, ...)
token_usage_history.consume_tokens(user_id, provider, model, in, out) [if enabled]
INSERT ... ON CONFLICT DO UPDATE SET input_tokens=input_tokens+N
for each limiter:
limiter.consume_tokens(input_tokens, output_tokens, subject_id=user_id)
UPDATE quota_limits SET available=available-(in+out) WHERE id=user_id AND subject='u'
-> get_available_quotas(config.quota_limiters, user_id)
for each limiter: limiter.available_quota(user_id) -> int
returns {"UserQuotaLimiter": 450, "ClusterQuotaLimiter": 8950}@@
- +text
start_quota_scheduler(config) [called from runner.py at startup]
-> Thread(target=quota_scheduler, daemon=True).start()
while True:
for each limiter in config:
quota_revocation(connection, name, limiter)
if increase_by configured:
UPDATE quota_limits SET available=available+N
WHERE subject='u'|'c' AND revoked_at < NOW() - INTERVAL period
if initial_quota configured:
UPDATE quota_limits SET available=initial_quota
WHERE subject='u'|'c' AND revoked_at < NOW() - INTERVAL period
sleep(config.scheduler.period)
Also applies to: 43-56
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 22-22: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.ai/spec/how/quota.md around lines 22 - 39, The fenced flow diagrams (e.g.,
the blocks showing process_request(), start_quota_scheduler(),
quota_scheduler(), token_usage_history.consume_tokens(),
limiter.ensure_available_quota(), limiter.consume_tokens(), and
limiter.available_quota()) are missing language identifiers and trigger MD040;
update each triple-backtick fence to include a language tag such as "text" (or
"mermaid" if appropriate) so they comply with the linter (e.g., change ``` to
```text for the blocks around process_request() and start_quota_scheduler()),
and apply the same fix to the other affected fences referenced in the comment
(lines 43-56).
| loop Tool-calling rounds (up to max_iterations) | ||
| LLM-->>DocsSummarizer: Tool call request | ||
| DocsSummarizer->>MCP: Execute tool | ||
| MCP-->>DocsSummarizer: Tool result | ||
| DocsSummarizer->>LLM: Feed result, re-invoke | ||
| end |
There was a problem hiding this comment.
Move tool-calling loop ownership from DocsSummarizer to LLMExecutionAgent in this diagram.
This sequence still shows DocsSummarizer executing the multi-round tool loop, which conflicts with the documented refactor where that loop lives in LLMExecutionAgent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ARCHITECTURE.md` around lines 77 - 82, The loop in the sequence diagram
incorrectly shows DocsSummarizer performing the multi-round tool-calling loop;
move ownership to LLMExecutionAgent by changing the loop block to surround
LLMExecutionAgent and update the message flow so the LLM sends a tool-call
request to LLMExecutionAgent (e.g., "LLM-->>LLMExecutionAgent: Tool call
request"), LLMExecutionAgent executes the tool via DocsSummarizer
("LLMExecutionAgent->>DocsSummarizer: Execute tool"), DocsSummarizer returns
results to LLMExecutionAgent ("DocsSummarizer-->>LLMExecutionAgent: Tool
result"), and LLMExecutionAgent feeds results back to LLM and re-invokes the
loop (loop up to max_iterations).
|
@xrajesh: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
.ai/spec/structure with spec-first conventions: absorb layer READMEs into main README, add cross-reference table, complete conventions section, add spec pointer in CLAUDE.mdARCHITECTURE.mdwith Mermaid diagrams for human onboardinghow/query-pipeline.mdfor theLLMExecutionAgentrefactoring[PLANNED]markers from shipped Google Vertex providers (Gemini + Anthropic/Claude)how/auth.mdandhow/quota.mdfor undocumented implementation patternsLLMExecutionAgenttohow/project-structure.mdmodule mapwhat/observability.mdTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit