Skip to content

feat(local): per-provider DRCs, hops local doctor, and global --context#53

Merged
patrickleet merged 1 commit into
mainfrom
feat/per-provider-drc-and-doctor
May 29, 2026
Merged

feat(local): per-provider DRCs, hops local doctor, and global --context#53
patrickleet merged 1 commit into
mainfrom
feat/per-provider-drc-and-doctor

Conversation

@patrickleet
Copy link
Copy Markdown
Collaborator

@patrickleet patrickleet commented May 29, 2026

Summary

Three related hops local improvements, all driven by an RBAC drift that broke the SMTPSender consumer-Observe pattern (a provider-kubernetes that observes a control-plane XR needs cluster-admin; it had drifted off its pinned cluster-admin SA).

1. Per-provider DeploymentRuntimeConfigs (no more shared local-dev)

hops local start shared ONE DRC (local-dev) across provider-kubernetes AND provider-helm. A shared DRC lets one provider's runtime image/SA silently clobber the other's pod. Now each provider gets its own:

  • bootstrap/drc/kubernetes.yaml → DRC local-dev-kubernetes + SA + local-dev-kubernetes-cluster-admin binding
  • bootstrap/drc/helm.yaml → DRC local-dev-helm + SA + local-dev-helm-cluster-admin binding
  • each provider manifest's runtimeConfigRef points at its own DRC; start.rs applies both

2. hops local doctor

Verifies what hops local start set up and surfaces drift: crossplane, both providers (installed / healthy / runtimeConfigRef pinned to its own DRC / DRC present / cluster-admin binding / ProviderConfig), and the registry. Prints a ✓/✗ report and exits non-zero with remediation. This catches the exact failure that motivated the change — a provider whose runtimeConfigRef reverts to default, dropping its cluster-admin SA so it can no longer observe XRs through the in-cluster ProviderConfig.

3. Global --context flag on hops local

hops local aws --refresh --profile hops --context colima previously errored (unexpected argument '--context'). Added a global = true --context to hops local so every subcommand accepts it (before or after the subcommand); it plumbs through HOPS_KUBE_CONTEXT_ENV like config/provider install.

Verification

  • cargo build clean · cargo test 86 passed · new files clippy-clean
  • hops local doctor run against a live colima correctly flagged all 6 real drift items (both providers off their DRCs, missing DRCs/bindings) and exited 1
  • hops local aws --help now lists --context; hops local doctor --context colima parses

Note

The remote-cluster equivalents (crossplane-{kubernetes,helm}-provider-stack) already compose per-provider cluster-admin DRCs and reconcile continuously — no change needed there. This PR brings local-dev (CLI-managed) to parity.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features

  • Added hops local doctor command to validate and troubleshoot your local Crossplane deployment setup, provider health, and configurations
  • Added --context global option to local commands for specifying your target Kubernetes context

Chores

  • Refactored deployment runtime configurations to use provider-specific configurations instead of a single shared configuration for better isolation

Review Change Stack

…text

Split the shared `local-dev` DeploymentRuntimeConfig into per-provider DRCs
(local-dev-kubernetes, local-dev-helm), each with its own cluster-admin
ServiceAccount + ClusterRoleBinding, and point each provider's runtimeConfigRef
at its own DRC. A shared DRC let one provider's runtime image/SA silently
clobber the other's pod.

Add `hops local doctor`: verifies what `hops local start` set up — crossplane,
both providers (installed / healthy / runtimeConfigRef pinned to its own DRC /
DRC present / cluster-admin binding / ProviderConfig) and the registry — and
reports drift with a non-zero exit + remediation. Catches a provider whose
runtimeConfigRef reverted to `default`, dropping its cluster-admin SA (which
breaks observing XRs through the in-cluster ProviderConfig).

Add a global `--context` flag to `hops local` so every subcommand can target a
context (e.g. `hops local aws --refresh --profile hops --context colima`), given
before or after the subcommand. Plumbs through HOPS_KUBE_CONTEXT_ENV like
config/provider install.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

Walkthrough

Refactors Crossplane local development setup from a single shared DeploymentRuntimeConfig to provider-specific configs for helm and kubernetes providers. Updates the local-start workflow to deploy per-provider DRCs and introduces a new hops local doctor command to validate the Crossplane setup and check provider-to-DRC bindings.

Changes

Per-provider DRCs and doctor validation

Layer / File(s) Summary
Per-provider DRC manifests and provider reference bindings
bootstrap/drc/helm.yaml, bootstrap/drc/kubernetes.yaml, bootstrap/providers/provider-helm.yaml, bootstrap/providers/provider-kubernetes.yaml
Creates local-dev-helm and local-dev-kubernetes DeploymentRuntimeConfigs with ClusterRoleBindings granting cluster-admin to their service accounts. Updates both providers to reference their own DRC instead of a shared local-dev config.
Deploy per-provider DRCs in local-start workflow
src/commands/local/start.rs
Replaces single shared DRC manifest with two per-provider includes (DRC_K8S, DRC_HELM) and updates the deployment step to apply both manifests before installing providers.
Doctor command validation implementation
src/commands/local/doctor.rs
Implements hops local doctor entrypoint that validates Crossplane deployments are available, verifies each provider is installed/healthy, confirms providers reference their expected per-provider DRC, checks that each DRC resource exists with the correct ClusterRoleBinding and ServiceAccount binding, and ensures ProviderConfig resources exist in expected namespaces. Aggregates results via Doctor struct and prints human-readable pass/fail report.
CLI module wiring for doctor command
src/commands/local/mod.rs
Declares doctor submodule, adds Doctor variant to LocalCommands enum, introduces global --context CLI option to LocalArgs, and routes Doctor subcommand to doctor::run() with environment variable support for kube context propagation.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant DoctorCmd as hops local doctor
  participant Kubectl
  
  User->>DoctorCmd: Invoke doctor command
  DoctorCmd->>Kubectl: Get current kube context
  DoctorCmd->>Kubectl: Check Crossplane deployments Available
  DoctorCmd->>Kubectl: Check local registry deployment Available
  loop For each provider (helm, kubernetes)
    DoctorCmd->>Kubectl: Get Provider status (Installed/Healthy)
    DoctorCmd->>Kubectl: Get Provider runtimeConfigRef.name
    DoctorCmd->>Kubectl: Verify DRC resource exists
    DoctorCmd->>Kubectl: Check ClusterRoleBinding role ref is cluster-admin
    DoctorCmd->>Kubectl: Verify ServiceAccount subject matches expected SA
    DoctorCmd->>Kubectl: Verify ProviderConfig exists in namespace
  end
  DoctorCmd->>User: Print report with checkmarks/crosses
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • hops-ops/hops-cli#33: Both PRs modify src/commands/local/mod.rs's LocalArgs/LocalCommands CLI wiring and the local command dispatch structure.

Poem

🐰 Per-provider configs now reign,
No more shared DRC pain!
Doctor checks all the wiring tight,
Crossplane setup burning bright!
Bootstrap splits, validation's right! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the three main changes: per-provider DRCs, the new hops local doctor command, and the global --context flag.
Docstring Coverage ✅ Passed Docstring coverage is 86.67% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/per-provider-drc-and-doctor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/commands/local/doctor.rs`:
- Around line 234-247: The exists() and jsonpath() helpers currently swallow all
kubectl errors and map them to false/None, which hides transport/auth failures;
change them to propagate errors instead (e.g., return Result<bool, Error> and
Result<String, Error> or Option<Result<...>>), updating call sites in doctor
aggregation to treat real kubectl failures as fatal, or implement a separate
kubectl connectivity preflight that runs a simple kubectl command and aborts on
any non-not-found/error to avoid reporting drift; update the functions named
exists and jsonpath to surface run_cmd_output errors rather than unwrapping to
defaults so upstream logic can distinguish "resource not found" from
transport/auth failures.
- Around line 170-181: The check only compares ServiceAccount names; update the
jsonpath query and the binding_ok logic to validate both subject.name and
subject.namespace against the expected values: change the jsonpath call that
builds subjects (currently referencing "clusterrolebinding" and e.binding) to
return subject name/namespace pairs (e.g. extract
"{.subjects[?(@.kind==\"ServiceAccount\")].name}{\"/\"}{.subjects[?(@.kind==\"ServiceAccount\")].namespace}"
or otherwise return both fields), then update the binding_ok computation (which
currently uses role, subjects, e.sa) to also compare the namespace (e.g. match
each subject tuple against (e.sa, e.namespace) or similar). Keep references to
jsonpath, subjects, binding_ok, role, e.binding and e.sa so the change is
localized.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc6eb260-f12f-4321-ba83-529461c16c27

📥 Commits

Reviewing files that changed from the base of the PR and between cfc515e and 70659a6.

📒 Files selected for processing (8)
  • bootstrap/drc/helm.yaml
  • bootstrap/drc/kubernetes.yaml
  • bootstrap/drc/local-dev.yaml
  • bootstrap/providers/provider-helm.yaml
  • bootstrap/providers/provider-kubernetes.yaml
  • src/commands/local/doctor.rs
  • src/commands/local/mod.rs
  • src/commands/local/start.rs
💤 Files with no reviewable changes (1)
  • bootstrap/drc/local-dev.yaml

Comment on lines +170 to +181
let subjects = jsonpath(&[
"get",
"clusterrolebinding",
e.binding,
"-o",
"jsonpath={.subjects[?(@.kind==\"ServiceAccount\")].name}",
]);
let binding_ok = role.as_deref() == Some("cluster-admin")
&& subjects
.as_deref()
.map(|s| s.split_whitespace().any(|n| n == e.sa))
.unwrap_or(false);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Match the ServiceAccount namespace in the binding check.

This only validates the ServiceAccount name. A ClusterRoleBinding pointing at local-dev-kubernetes or local-dev-helm in the wrong namespace would still pass here, even though the provider pod would not get the intended privileges. Compare both subject name and namespace against the expected binding target.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/local/doctor.rs` around lines 170 - 181, The check only compares
ServiceAccount names; update the jsonpath query and the binding_ok logic to
validate both subject.name and subject.namespace against the expected values:
change the jsonpath call that builds subjects (currently referencing
"clusterrolebinding" and e.binding) to return subject name/namespace pairs (e.g.
extract
"{.subjects[?(@.kind==\"ServiceAccount\")].name}{\"/\"}{.subjects[?(@.kind==\"ServiceAccount\")].namespace}"
or otherwise return both fields), then update the binding_ok computation (which
currently uses role, subjects, e.sa) to also compare the namespace (e.g. match
each subject tuple against (e.sa, e.namespace) or similar). Keep references to
jsonpath, subjects, binding_ok, role, e.binding and e.sa so the change is
localized.

Comment on lines +234 to +247
fn exists(get_args: &[&str]) -> bool {
let mut args = get_args.to_vec();
args.extend_from_slice(&["--ignore-not-found", "-o", "name"]);
run_cmd_output("kubectl", &args)
.map(|s| !s.trim().is_empty())
.unwrap_or(false)
}

/// Run a kubectl query, returning the trimmed stdout or `None` if kubectl errors
/// (e.g. the resource does not exist).
fn jsonpath(args: &[&str]) -> Option<String> {
run_cmd_output("kubectl", args)
.ok()
.map(|s| s.trim().to_string())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse kubectl transport/auth failures into drift.

exists() and jsonpath() turn every kubectl error into false/None. If the selected context is unreachable, auth is broken, or kubectl itself fails, doctor will report missing providers/DRCs and recommend rerunning hops local start, which is the wrong remediation. Please propagate non-not-found errors, or add a single connectivity preflight and abort before aggregating checks.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/local/doctor.rs` around lines 234 - 247, The exists() and
jsonpath() helpers currently swallow all kubectl errors and map them to
false/None, which hides transport/auth failures; change them to propagate errors
instead (e.g., return Result<bool, Error> and Result<String, Error> or
Option<Result<...>>), updating call sites in doctor aggregation to treat real
kubectl failures as fatal, or implement a separate kubectl connectivity
preflight that runs a simple kubectl command and aborts on any
non-not-found/error to avoid reporting drift; update the functions named exists
and jsonpath to surface run_cmd_output errors rather than unwrapping to defaults
so upstream logic can distinguish "resource not found" from transport/auth
failures.

@patrickleet patrickleet merged commit 25d5f5f into main May 29, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant