Stabilize shadow pod networking lifecycle and add optional TLS ingress support #532

## Stabilize shadow pod / wstunnel / full-mesh lifecycle

### Summary

The current shadow pod logic works and is widely used, but several reliability issues in the wstunnel/full-mesh flow can make it fragile in production. This issue tracks improvements that make shadow pod creation, cleanup, endpoint generation, and full-mesh networking more stable and deterministic.

### Issues

1. **Full mesh may render the wrong manifest**

   Full-mesh setup populates WireGuard-related template data, but the default renderer still uses the standard `wstunnel-template.yaml`. The WireGuard-specific template exists but is not currently selected by default.

   **Proposal:** explicitly select the WireGuard/full-mesh template when `Network.FullMesh` is enabled, or merge both templates behind a clear mode flag.

2. **Ingress host and client endpoint must be aligned**

   The generated client endpoint should match the ingress host exactly.

   Correct endpoint format:

   ```text
   {{name}}-{{namespace}}.{{wildcardDNS}}
   ```

   Not:

   ```text
   ws-{{name}}.{{wildcardDNS}}
   ```

   **Proposal:** centralize endpoint generation and use the same value for ingress host, client annotation, full-mesh script, and docs/tests.

3. **Cleanup uses inconsistent resource names/namespaces**

   Failure cleanup currently uses names derived differently from creation, which can leave shadow Deployments, Services, Ingresses, or ConfigMaps behind.

   **Proposal:** compute shadow resource identity once and reuse it for create, wait, timeout cleanup, failure cleanup, and delete.

4. **Namespace creation can diverge from the computed namespace**

   The code checks for (or creates) one namespace, but may later apply manifests to a different, computed or truncated namespace.

   **Proposal:** compute the final target namespace before namespace creation, validate it, and create/check exactly that namespace. Same-namespace mode should never truncate the real namespace.

5. **Manifest decode errors are only logged**

   If one generated manifest object fails to decode, the flow continues. This can create partial infrastructure and report success even though networking is broken.

   **Proposal:** treat decode failures as fatal and clean up already-created resources.

6. **Shadow pod wait logic only checks for `PodIP`**

   The current wait path can accept a pod that has an IP but is not ready, or one that is stale, terminating, or owned by an old ReplicaSet.

   **Proposal:** wait until the pod owned by the current Deployment revision reports `Ready=True` and the Deployment is available, and optionally until the Service has populated endpoints.

7. **Pod annotation patch failures are swallowed**

   If the generated wstunnel command or full-mesh pre-exec annotation cannot be persisted to the Kubernetes pod, the remote execution path may become inconsistent.

   **Proposal:** return patch errors to the creation flow, or mark the pod failed when required annotations cannot be persisted.

8. **Full-mesh key/script generation should be idempotent**

   Retries can regenerate keys and prepend duplicate pre-exec scripts. Private keys are also currently logged.

   **Proposal:** persist generated key material safely, avoid duplicate pre-exec injection, and never log private keys.
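
Issues 2 to 4 share one root cause: name, namespace, and endpoint are derived in more than one place. A minimal Go sketch of computing them once follows. The `ShadowIdentity` type and `NewShadowIdentity` constructor are hypothetical names, not taken from the codebase. The sketch also assumes that the endpoint is built from the final target namespace and that over-long values are rejected rather than truncated.

```go
package main

import (
	"errors"
	"fmt"
)

// maxLabelLen is the DNS label limit that namespace names and each
// dot-separated host label must respect.
const maxLabelLen = 63

// ShadowIdentity (hypothetical) holds every value that create, wait,
// cleanup, and delete must agree on.
type ShadowIdentity struct {
	Name      string // shadow resource name
	Namespace string // final target namespace, never truncated here
	Endpoint  string // used for both the ingress host and the client endpoint
}

// NewShadowIdentity validates its inputs and derives the endpoint as
// {{name}}-{{namespace}}.{{wildcardDNS}}. It fails instead of truncating,
// which is an assumption of this sketch.
func NewShadowIdentity(name, targetNamespace, wildcardDNS string) (ShadowIdentity, error) {
	if name == "" || targetNamespace == "" || wildcardDNS == "" {
		return ShadowIdentity{}, errors.New("name, namespace, and wildcardDNS are required")
	}
	if len(targetNamespace) > maxLabelLen {
		return ShadowIdentity{}, fmt.Errorf("namespace %q exceeds %d characters", targetNamespace, maxLabelLen)
	}
	hostLabel := name + "-" + targetNamespace
	if len(hostLabel) > maxLabelLen {
		return ShadowIdentity{}, fmt.Errorf("host label %q exceeds %d characters", hostLabel, maxLabelLen)
	}
	return ShadowIdentity{
		Name:      name,
		Namespace: targetNamespace,
		Endpoint:  hostLabel + "." + wildcardDNS,
	}, nil
}

func main() {
	id, err := NewShadowIdentity("job1", "team-a", "tunnel.example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.Namespace, id.Endpoint)
}
```

Passing the resulting value to every lifecycle step, instead of re-deriving names, is what keeps creation and cleanup aligned.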

### TLS ingress improvement

Some environments require or strongly prefer ingress traffic over port `443`. The default shadow ingress should support optional TLS generation.

Proposed config:

```yaml
Network:
  EnableTunnel: true
  WildcardDNS: "tunnel.example.com"
  IngressTLS: true
  IngressClusterIssuer: "lets-issuer"
```

When enabled, the generated ingress should include:

```yaml
metadata:
  annotations:
    cert-manager.io/cluster-issuer: lets-issuer
spec:
  tls:
  - hosts:
    - {{.Name}}-{{.Namespace}}.{{.WildcardDNS}}
    secretName: {{.Name}}-tls
```

The generated client command should use:

```text
wss://{{endpoint}}:443
```

When TLS is disabled, it should keep the current non-TLS behavior:

```text
ws://{{endpoint}}:80
```
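
The scheme and port selection is small enough to live in one helper. A minimal sketch in Go, where `ClientURL` is a hypothetical name and `endpoint` is assumed to be the same host value used for the ingress:

```go
package main

import "fmt"

// ClientURL returns the wstunnel client target for the given endpoint:
// wss on 443 when TLS is enabled, ws on 80 otherwise.
func ClientURL(endpoint string, tls bool) string {
	if tls {
		return "wss://" + endpoint + ":443"
	}
	return "ws://" + endpoint + ":80"
}

func main() {
	fmt.Println(ClientURL("job1-team-a.tunnel.example.com", true))
	fmt.Println(ClientURL("job1-team-a.tunnel.example.com", false))
}
```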

### Acceptance criteria

- Full-mesh mode renders the correct WireGuard-capable manifest.
- Ingress host and generated client endpoint always match.
- Cleanup uses the same computed names as creation.
- Manifest decode failures fail the creation flow and trigger cleanup.
- Wait logic checks readiness, not only `PodIP`.
- Annotation patch failures are surfaced.
- Full-mesh key/pre-exec generation is idempotent.
- Optional TLS ingress support is covered by tests.
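
The readiness criterion could be pinned down as a pure predicate that is easy to unit test. The sketch below uses a simplified, hypothetical `PodView` struct instead of the real `corev1.Pod`. In client-go terms the fields would come from `Status.PodIP`, the `PodReady` condition, `DeletionTimestamp`, and `OwnerReferences`. Deployment availability and Service endpoints would be separate checks.

```go
package main

import "fmt"

// PodView (hypothetical) is a flattened view of the pod fields that matter.
type PodView struct {
	IP          string // pod IP, empty until assigned
	Ready       bool   // PodReady condition is True
	Terminating bool   // deletion timestamp is set
	OwnerRS     string // name of the owning ReplicaSet
}

// PodUsable reports whether the pod can serve as the shadow pod: it has an
// IP, is ready, is not terminating, and belongs to the current ReplicaSet.
func PodUsable(p PodView, currentRS string) bool {
	return p.IP != "" && p.Ready && !p.Terminating && p.OwnerRS == currentRS
}

func main() {
	stale := PodView{IP: "10.0.0.5", Ready: true, OwnerRS: "shadow-6d4f"}
	fresh := PodView{IP: "10.0.0.9", Ready: true, OwnerRS: "shadow-7c9b"}
	fmt.Println(PodUsable(stale, "shadow-7c9b"), PodUsable(fresh, "shadow-7c9b"))
}
```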
