Skip to content

Fix shard-direct virtual workspace access for embedded virtual workspaces#241

Merged
kcp-ci-bot merged 5 commits into
kcp-dev:mainfrom
mjudeikis:fix/rootshard-shard-client-cert
Jun 8, 2026
Merged

Fix shard-direct virtual workspace access for embedded virtual workspaces#241
kcp-ci-bot merged 5 commits into
kcp-dev:mainfrom
mjudeikis:fix/rootshard-shard-client-cert

Conversation

@mjudeikis

@mjudeikis mjudeikis commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

When virtual workspaces run embedded in the shard (the default, --run-virtual-workspaces unset), external clients reach a shard's virtual-workspace endpoints at that shard's own external, SNI-routed VirtualWorkspaceURL — directly, bypassing the front-proxy. Two things broke on that shard-direct path:

1. Shard client certificate is required on the root shard (not only for external VWs).
The shard forwards virtual-resource requests (e.g. CachedResource replication VWs) to its advertised external URL. Without --shard-client-cert-file/--shard-client-key-file it falls back to the loopback client config, whose loopback ServerName the ingress cannot SNI-route (the connection is reset) and whose bearer token is not valid externally. The operator already set these flags unconditionally for regular Shards, but for the RootShard only inside the external-VW branch (kcpVW != nil). They are now set unconditionally on the root shard too. The ClientCertificate is already provisioned and mounted, so only the flags were missing.

2. ServiceAccount tokens must be validatable cross-shard on every shard, not only the front-proxy.
ServiceAccount signing keys are per-shard. An external client (e.g. the kro kcp-apiexport provider) that authenticates with a ServiceAccount token minted on shard A but connects shard-direct to shard B's virtual-workspace endpoint was rejected with 401 because shard B only had its own signing key. auth.serviceAccount.enabled loads every shard's signing key — but the operator only wired this into the front-proxy (ApplyFrontProxyAuthConfigurationapplyServiceAccountAuthentication); the shard and rootshard deployments used ApplyAuthConfiguration, which ignores the ServiceAccount config. The shared helper is renamed to ApplyAuthConfigurationWithServiceAccount and is now applied to the shard and rootshard deployments as well, so any shard can validate ServiceAccount tokens issued by any shard when ServiceAccount authentication is enabled.

Both were verified end-to-end on a 2-shard (root + theseus) embedded-VW setup: cross-shard kubectl get of a CachedResource-backed virtual resource resolves, and a provider connecting shard-direct with a ServiceAccount token issued on another shard is no longer rejected with 401.

What Type of PR Is This?

/kind bug

Related Issue(s)

Fixes #

Release Notes

Embedded virtual workspaces: always set the shard client certificate flags on the root shard, and (when ServiceAccount authentication is enabled) load every shard's ServiceAccount signing key on all shards — not just the front-proxy — so ServiceAccount tokens issued on one shard are accepted at another shard's virtual-workspace endpoint.

The advertised virtual workspace endpoint is the shard's external,
SNI-routed URL, which the loopback client config cannot reach. Set the
shard-client key/cert unconditionally instead of only when the
in-process VW server is disabled.
@kcp-ci-bot kcp-ci-bot added kind/bug Categorizes issue or PR as related to a bug. release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the DCO. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 8, 2026
@kcp-ci-bot kcp-ci-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 8, 2026
@mjudeikis mjudeikis changed the title Always set shard-client cert flags on root shard Fix shard-direct virtual workspace access for embedded virtual workspaces Jun 8, 2026
@kcp-ci-bot kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 8, 2026
Comment thread internal/resources/utils/deployments.go Outdated
// clients authenticate against directly — the front-proxy AND the shards (whose
// virtual-workspace endpoint URLs are reached shard-direct) — must validate
// ServiceAccount tokens issued by any shard, so they all use this variant.
func ApplyAuthConfigurationWithServiceAccount(deployment *appsv1.Deployment, config *operatorv1alpha1.AuthSpec, rootShard *operatorv1alpha1.RootShard) *appsv1.Deployment {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to just merge the SA portion in ApplyAuthConfiguration?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after recent changed, yes :D

@mjudeikis mjudeikis added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 8, 2026
Comment thread internal/resources/shard/deployment.go Outdated
Comment on lines +218 to +221
// Use the ServiceAccount-aware variant: external clients reach this
// shard's virtual-workspace endpoints shard-direct via its
// VirtualWorkspaceURL, bypassing the front-proxy, so the shard
// itself must validate ServiceAccount tokens issued by any shard.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Use the ServiceAccount-aware variant: external clients reach this
// shard's virtual-workspace endpoints shard-direct via its
// VirtualWorkspaceURL, bypassing the front-proxy, so the shard
// itself must validate ServiceAccount tokens issued by any shard.

Think the comment is superfluous now

Comment on lines +207 to +211
// Use the ServiceAccount-aware variant: external clients (e.g. the
// kro kcp-apiexport provider) reach this shard's virtual-workspace
// endpoints shard-direct via its VirtualWorkspaceURL, bypassing the
// front-proxy, so the shard itself must validate ServiceAccount
// tokens issued by any shard.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Use the ServiceAccount-aware variant: external clients (e.g. the
// kro kcp-apiexport provider) reach this shard's virtual-workspace
// endpoints shard-direct via its VirtualWorkspaceURL, bypassing the
// front-proxy, so the shard itself must validate ServiceAccount
// tokens issued by any shard.

Think the comment is superfluous now

@mjudeikis mjudeikis force-pushed the fix/rootshard-shard-client-cert branch from 8936785 to 1f6be42 Compare June 8, 2026 12:44

@ntnn ntnn left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 8, 2026
@kcp-ci-bot

Copy link
Copy Markdown
Contributor

LGTM label has been added.

DetailsGit tree hash: db8c8fb2886453a94760b136cbdf628ecb55ae8b

@kcp-ci-bot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ntnn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 8, 2026
@kcp-ci-bot kcp-ci-bot merged commit bea3832 into kcp-dev:main Jun 8, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants