
fix(devbox-bundled): drop leftover istio refs so k3s applies the knative manifest#7334

Open
pingsutw wants to merge 3 commits into main from fix-devbox-istio-leftover

@pingsutw pingsutw commented May 1, 2026

Member

Why are the changes needed?

Starting the bundled devbox hangs at "Waiting for flyte cluster to be ready" and k3s logs an apply error that repeats forever:

Failed to process config: failed to process /var/lib/rancher/k3s/server/manifests/flyte.yaml:
failed to create istio-system/knative-local-gateway /v1, Kind=Service for kube-system/flyte:
namespaces "istio-system" not found

The bundled Knative install ships with upstream net-istio defaults that don't belong in this devbox, which uses kourier for ingress and never creates the istio-system namespace (and the istio CRDs aren't installed):

  • Service knative-local-gateway in istio-system → namespace missing.
  • Two networking.istio.io/v1beta1 Gateway CRs (knative-ingress-gateway, knative-local-gateway) → istio CRD missing.
  • Two security.istio.io/v1beta1 PeerAuthentication CRs (net-istio-webhook, webhook) → istio CRD missing.
  • config-istio ConfigMap points the local gateway at istio rather than kourier.

k3s applies an addon manifest atomically, so a single un-appliable resource fails the entire flyte.yaml addon. k3s retries every ~20s and never succeeds, so the cluster never reports ready and flyte start devbox hangs.

What changes were proposed in this pull request?

In kustomize/{complete,dev}/kustomization.yaml, add patches that:

  1. Repoint config-istio's local-gateway.knative-serving.knative-local-gateway from knative-local-gateway.istio-system.svc.cluster.local to kourier-internal.kourier-system.svc.cluster.local.
  2. Delete the Service knative-local-gateway in istio-system.
  3. Delete Gateway knative-local-gateway in knative-serving (istio CRD).
  4. Delete Gateway knative-ingress-gateway in knative-serving (istio CRD).
  5. Delete PeerAuthentication net-istio-webhook in knative-serving (istio CRD).
  6. Delete PeerAuthentication webhook in knative-serving (istio CRD).
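
As a rough illustration, the six patches above might look like the following in a kustomization.yaml. This is a minimal sketch, assuming inline strategic-merge patches that use kustomize's `$patch: delete` directive (the directive this PR's follow-up comment mentions). The resource names, namespaces, and istio apiVersions come from the description above; the exact layout is an assumption and is not copied from the PR's diff.

```yaml
# Sketch only: the layout is assumed, not copied from the PR diff.
patches:
  # 1. Repoint the Knative local gateway at kourier instead of istio.
  - patch: |-
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: config-istio
        namespace: knative-serving
      data:
        local-gateway.knative-serving.knative-local-gateway: kourier-internal.kourier-system.svc.cluster.local
  # 2. Delete the Service that targets the missing istio-system namespace.
  - patch: |-
      apiVersion: v1
      kind: Service
      metadata:
        name: knative-local-gateway
        namespace: istio-system
      $patch: delete
  # 3. Delete the istio Gateway CR for the local gateway.
  - patch: |-
      apiVersion: networking.istio.io/v1beta1
      kind: Gateway
      metadata:
        name: knative-local-gateway
        namespace: knative-serving
      $patch: delete
  # 4. Delete the istio Gateway CR for the ingress gateway.
  - patch: |-
      apiVersion: networking.istio.io/v1beta1
      kind: Gateway
      metadata:
        name: knative-ingress-gateway
        namespace: knative-serving
      $patch: delete
  # 5. Delete the PeerAuthentication CR for the net-istio webhook.
  - patch: |-
      apiVersion: security.istio.io/v1beta1
      kind: PeerAuthentication
      metadata:
        name: net-istio-webhook
        namespace: knative-serving
      $patch: delete
  # 6. Delete the PeerAuthentication CR for the webhook.
  - patch: |-
      apiVersion: security.istio.io/v1beta1
      kind: PeerAuthentication
      metadata:
        name: webhook
        namespace: knative-serving
      $patch: delete
```

Each delete patch identifies its target by apiVersion, kind, name, and namespace, so the resource is dropped from the rendered output rather than modified.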

Regenerate manifests/complete.yaml and manifests/dev.yaml so the rendered output matches.

Note: items 5–6 (the PeerAuthentication CRs) were not part of the original version of this PR. The bundled knative chart started rendering them later, and they fail to apply for the same reason. Because they were not deleted before, the addon still failed; this revision deletes them too.

How was this patch tested?

  • Regenerated both manifests and verified the rendered output has zero references to istio-system, networking.istio.io, security.istio.io / PeerAuthentication, and that config-istio now points at kourier-internal.
  • On a running devbox, stripping the same istio resources from the in-container flyte.yaml made the addon apply cleanly (AppliedManifest) and all flyte pods reached Running; the namespaces "istio-system" not found / the server could not find the requested resource errors no longer recur. Knative cluster-local routing through kourier still works.

Labels

fixed

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

The bundled Knative install shipped with the upstream net-istio defaults:
a Service `knative-local-gateway` in the `istio-system` namespace, two
istio `Gateway` CRs, and a `config-istio` ConfigMap pointing the local
gateway at istio. Devbox uses kourier as the ingress and never creates
the `istio-system` namespace, so k3s logged
`namespaces "istio-system" not found` on every startup and the istio
Gateway CRs silently failed to apply.

Add kustomize patches that delete the istio-only resources and repoint
`config-istio`'s local-gateway entry at
`kourier-internal.kourier-system.svc.cluster.local`, then regenerate the
rendered manifests.

Signed-off-by: Kevin Su <pingsutw@apache.org>
Copilot AI review requested due to automatic review settings May 1, 2026 21:01
@github-actions github-actions Bot added the flyte2 label May 1, 2026
Signed-off-by: Kevin Su <pingsutw@apache.org>

Copilot AI left a comment

Contributor


Pull request overview

This PR updates the bundled devbox Knative manifests to remove Istio-specific resources that are not applicable when using kourier ingress, eliminating noisy startup apply failures.

Changes:

  • Patch config-istio to point Knative’s local gateway to kourier-internal.kourier-system.svc.cluster.local.
  • Delete the upstream net-istio knative-local-gateway Service in istio-system and the two Istio Gateway CRs.
  • Regenerate the rendered manifests/dev.yaml and manifests/complete.yaml to reflect the kustomize output.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docker/devbox-bundled/manifests/dev.yaml | Rendered dev manifest updated to remove Istio Gateway/Service and repoint local gateway; also includes additional newly-rendered resources. |
| docker/devbox-bundled/manifests/complete.yaml | Rendered complete manifest updated to remove Istio Gateway/Service and repoint local gateway; also includes additional newly-rendered resources. |
| docker/devbox-bundled/kustomize/dev/kustomization.yaml | Adds kustomize patches to repoint config-istio local gateway and delete Istio Gateway/Service resources. |
| docker/devbox-bundled/kustomize/complete/kustomization.yaml | Same as dev: repoint config-istio local gateway and delete Istio Gateway/Service resources. |
Comments suppressed due to low confidence (10)

docker/devbox-bundled/manifests/dev.yaml:8980

  • The initContainer uses image: busybox:stable with securityContext.runAsNonRoot: true and no explicit runAsUser. Busybox images typically run as UID 0 by default, so this will be rejected by kubelet and the StatefulSet will never become Ready. Set an explicit non-root UID that exists in the image, or allow root for this initContainer if it needs to manage filesystem permissions.
  selector:
    matchLabels:
      app: webhook
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/component: net-istio
    app.kubernetes.io/name: knative-serving
    app.kubernetes.io/version: 1.18.1

docker/devbox-bundled/manifests/dev.yaml:9238

  • This adds an Ingress for RustFS with ingressClassName: nginx and a hard-coded host (example.rustfs.com). If the bundled devbox doesn't ship an nginx ingress controller (and/or DNS for that host), this resource will be non-functional and may confuse users. Consider removing this Ingress from the bundled manifests or gating it behind an explicit devbox value.
                          "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                    route_config:
                      virtual_hosts:
                        - name: admin_interface
                          domains:

docker/devbox-bundled/manifests/complete.yaml:9290

  • This new RustFS StatefulSet uses the same selector labels (app.kubernetes.io/instance=flyte-devbox + app.kubernetes.io/name=rustfs) as the existing Deployment/rustfs and Service/rustfs already present in this manifest. That will create two independent RustFS workloads and cause services to load-balance across both (and/or make it unclear which one Flyte should talk to). Either remove the old Deployment/Service or change labels/selectors so only one RustFS implementation is selected.
  - data-plane.knative.dev
  secretName: routing-serving-certs
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/instance: flyte-devbox
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyte-devbox
    app.kubernetes.io/version: 1.16.1

docker/devbox-bundled/manifests/complete.yaml:9302

  • podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution combined with replicas: 4 will prevent scheduling more than 1 RustFS pod per node. Devbox/k3s is typically single-node, so 3 pods will remain Pending indefinitely. Consider removing the required anti-affinity (or making it preferred), or scaling replicas to 1 for devbox.
    helm.sh/chart: flyte-devbox-0.1.0
  name: flyte-console
  namespace: flyte
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: flyte-console
            port:
              number: 80

docker/devbox-bundled/manifests/complete.yaml:9385

  • The initContainer uses image: busybox:stable with securityContext.runAsNonRoot: true and no explicit runAsUser. Busybox images typically run as UID 0 by default, so this will be rejected by kubelet and the StatefulSet will never become Ready. Set an explicit non-root UID that exists in the image, or allow root for this initContainer if it needs to manage filesystem permissions.
  selector:
    matchLabels:
      app: webhook
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/component: net-istio
    app.kubernetes.io/name: knative-serving
    app.kubernetes.io/version: 1.18.1

docker/devbox-bundled/manifests/complete.yaml:9643

  • This adds an Ingress for RustFS with ingressClassName: nginx and a hard-coded host (example.rustfs.com). If the bundled devbox doesn't ship an nginx ingress controller (and/or DNS for that host), this resource will be non-functional and may confuse users. Consider removing this Ingress from the bundled manifests or gating it behind an explicit devbox value.
                          "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                    route_config:
                      virtual_hosts:
                        - name: admin_interface
                          domains:

docker/devbox-bundled/manifests/dev.yaml:7071

  • This manifest regeneration includes a large set of new RustFS Helm-chart resources (ServiceAccount/ConfigMap/Secret/Services/StatefulSet/Ingress/test Pod) that are unrelated to the PR’s stated goal of removing Istio references from Knative manifests. If these RustFS changes are intentional, they should be called out explicitly in the PR description; otherwise, consider re-generating manifests in a deterministic way so only the intended Knative/Istio changes are included (or split into a separate PR).
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: activator
    app.kubernetes.io/name: knative-serving
    app.kubernetes.io/version: 1.18.1
  name: activator
  namespace: knative-serving
---
apiVersion: v1
kind: ServiceAccount
metadata:

docker/devbox-bundled/manifests/complete.yaml:7082

  • This manifest regeneration includes a large set of new RustFS Helm-chart resources (ServiceAccount/ConfigMap/Secret/Services/StatefulSet/Ingress/test Pod) that are unrelated to the PR’s stated goal of removing Istio references from Knative manifests. If these RustFS changes are intentional, they should be called out explicitly in the PR description; otherwise, consider re-generating manifests in a deterministic way so only the intended Knative/Istio changes are included (or split into a separate PR).
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: activator
    app.kubernetes.io/name: knative-serving
    app.kubernetes.io/version: 1.18.1
  name: activator
  namespace: knative-serving
---
apiVersion: v1
kind: ServiceAccount
metadata:

docker/devbox-bundled/manifests/dev.yaml:8885

  • This new RustFS StatefulSet uses the same selector labels (app.kubernetes.io/instance=flyte-devbox + app.kubernetes.io/name=rustfs) as the existing Deployment/rustfs and Service/rustfs already present in this manifest. That will create two independent RustFS workloads and cause services to load-balance across both (and/or make it unclear which one Flyte should talk to). Either remove the old Deployment/Service or change labels/selectors so only one RustFS implementation is selected.
  - data-plane.knative.dev
  secretName: routing-serving-certs
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/instance: flyte-devbox
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyte-devbox
    app.kubernetes.io/version: 1.16.1

docker/devbox-bundled/manifests/dev.yaml:8897

  • podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution combined with replicas: 4 will prevent scheduling more than 1 RustFS pod per node. Devbox/k3s is typically single-node, so 3 pods will remain Pending indefinitely. Consider removing the required anti-affinity (or making it preferred), or scaling replicas to 1 for devbox.
    helm.sh/chart: flyte-devbox-0.1.0
  name: flyte-console
  namespace: flyte
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: flyte-console
            port:
              number: 80


Comment thread docker/devbox-bundled/kustomize/complete/kustomization.yaml
Comment thread docker/devbox-bundled/kustomize/dev/kustomization.yaml
@pingsutw pingsutw self-assigned this May 1, 2026
@pingsutw pingsutw added this to the V2 GA milestone May 1, 2026
@pingsutw pingsutw marked this pull request as draft May 2, 2026 04:06
@pingsutw pingsutw closed this May 29, 2026
Signed-off-by: Kevin Su <pingsutw@apache.org>

# Conflicts:
#	docker/devbox-bundled/kustomize/complete/kustomization.yaml
#	docker/devbox-bundled/kustomize/dev/kustomization.yaml
@pingsutw pingsutw reopened this Jun 27, 2026
@pingsutw

Member Author

Revived and merged latest main. Since the original PR, the bundled knative chart also renders two security.istio.io/v1beta1 PeerAuthentication CRs (net-istio-webhook, webhook). Those fail to apply for the same reason (istio CRDs absent in this kourier devbox), which makes k3s fail the entire flyte.yaml addon and the cluster never reaches ready — flyte start devbox hangs. Added $patch: delete for both to kustomize/{complete,dev}/kustomization.yaml and regenerated the manifests. Verified the rendered output now has zero istio-system / istio.io / PeerAuthentication references and config-istio repointed to kourier-internal.

@pingsutw pingsutw marked this pull request as ready for review June 27, 2026 05:20
Copilot AI review requested due to automatic review settings June 27, 2026 05:20
@pingsutw pingsutw changed the title from "fix(devbox-bundled): remove leftover istio refs from knative manifests" to "fix(devbox-bundled): drop leftover istio refs so k3s applies the knative manifest" Jun 27, 2026

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

name: webhook
namespace: knative-serving
$patch: delete
- target:
name: webhook
namespace: knative-serving
$patch: delete
- target:
Comment on lines +113 to +116
# net-istio also renders two PeerAuthentication CRs (security.istio.io); the
# istio CRDs aren't installed in this kourier devbox, so without these deletes
# k3s fails the whole flyte.yaml addon and the cluster never goes ready.
- patch: |-
kind: ConfigMap
metadata:
name: config-istio
namespace: knative-serving

Member Author


The devbox sometimes fails to start if I don't remove this. It's safe to remove anyway, since we never use it.
