Skip to content

Security, maintainability & build-vs-buy review (heads-up, non-authoritative) #3

Description

@buchbend

Heads-up, not authoritative. This review was done by Christof with the help of Claude (Claude Code). It's a best-effort read of the code, intended as a starting point for discussion — not a definitive security verdict or a list of confirmed exploits. Please sanity-check each point before acting on it; severities are judgment calls and some findings are defense-in-depth rather than live bugs.

Context

authn-proxy is a ~378-line single-file Go reverse proxy that authenticates every inbound request and injects X-Auth-* headers to upstreams. Identity comes from GitHub OAuth2 (org/team → groups) or a TLS client-cert subject DN forwarded by nginx in X-Tls-Client-Subject. Sessions are encrypted+signed cookies via gorilla/securecookie.

Verified locally: existing tests pass, go vet is clean, and the parseCertSubject empty-value panic below was reproduced.

The code is clean and the author clearly knew the domain (the RFC 2253 cert parser and file:/ secret loading are nicer than most hand-rolled proxies). The findings below are about hardening and about whether a hand-rolled OAuth gateway is the right long-term choice.


Security findings

🔴 Critical — client-supplied identity headers are trusted; auth bypass if the proxy is reachable directly

main.go:334-341 treats the mere presence of X-Tls-Client-Subject as proof that nginx validated a client cert — there's no mTLS, no shared secret, and no source-IP check — and the server binds all interfaces (":"+PORT, main.go:375). Anyone who can reach the listen port directly (shared Docker network, sibling pod, SSRF from another service, firewall gap, or nginx bypass) can send:

GET /any/protected/path HTTP/1.1
X-Tls-Client-Subject: CN=Admin,UID=admin,OU=admins

…and be authenticated as admin in group admins, with OU letting the attacker choose their own authorization groups (main.go:184-186).

Suggested fix: bind to 127.0.0.1 only; require mTLS or a shared secret from nginx; allowlist nginx's source IP; and strip all inbound identity headers on ingress. Note the documented X-Auth-* headers are safely overwritten (main.go:368-371), but non-canonical variants (X-Forwarded-User, etc.) and X-Tls-Client-Subject are not stripped — add a blanket ingress scrub as defense-in-depth.

🟠 High

  • Ephemeral cookie/state signing keys (main.go:203-213): securecookie.GenerateRandomKey runs at every startup. Every redeploy logs out every user, and multi-replica deployments are broken — a cookie/state minted by one replica can't be decoded by another, causing re-auth loops and broken OAuth callbacks. Load keys from a secret (reuse the existing getFromEnv / file:/ path), support rotation, and fail closed if absent.
  • No HTTP server timeouts (main.go:377, default ListenAndServe): no ReadHeaderTimeout / Read / Write / Idle → Slowloris / resource exhaustion. Construct an explicit http.Server with timeouts and MaxHeaderBytes.
  • Authorization is all-or-nothing (main.go:296-300): membership in any team in the org grants access to every upstream and path. There's no group→route mapping, so a low-privilege team member reaches an admin backend. Add per-upstream/per-path authorization, default-deny.

🟡 Medium

  • Open redirect (main.go:345main.go:311): the post-login target is r.RequestURI, round-tripped through signed state, then handed to http.Redirect unvalidated. A crafted protocol-relative URI (e.g. https://proxy//evil.com/) survives the signing (the attacker generates it via the victim's own request) and redirects an authenticated victim to //evil.com/. Validate that the redirect target is a local relative path.
  • No state↔browser binding / no PKCE (main.go:345-352): the OAuth state is integrity-protected but not bound to the browser via a nonce cookie → login-CSRF / session-fixation window. Add a single-use nonce cookie and PKCE.
  • GitHub response status never checked (main.go:73-97): a non-200 response or a GraphQL errors body decodes silently into a zero-valued struct. It currently fails closed only by luck of the len(Groups)==0 check (main.go:296) — fragile. Check StatusCode and a GraphQL errors field explicitly.
  • PII over-logging: main.go:98 dumps the full GitHub record; main.go:302 and :367 log identity on every request; main.go:336 logs the full cert subject. Relevant for an EU institution (GDPR). Remove line 98, gate the rest behind a debug level, and log only a stable user id at info.
  • 7-day, non-revocable sessions (main.go:24, 190-200): logout only clears the local cookie; a stolen cookie or a user removed from the org keeps access for 7 days with snapshotted groups. Shorten the lifetime and/or re-validate membership; consider the __Host- cookie prefix.
  • parseCertSubject panics on an empty attribute value (main.go:158, v[0] with no length guard) — reproduced: parseCertSubject("OU=")index out of range [0]. Reachable from attacker-controlled X-Tls-Client-Subject. Guard with len(v) > 0. (Commit ae7cf29 already fixed one panic here; this case remains.)
  • Path/RawPath rewrite desync (main.go:255-261): trimming the prefix off Path and RawPath independently can desynchronize them for percent-encoded paths → path-confusion at the upstream. Migrate to httputil.ReverseProxy.Rewrite and add tests for %2F / ...
  • Docker supply chain (Dockerfile:1,4,8): floating golang:1 / alpine:latest tags, and go get -v mutates the lockfile at build time instead of building the committed go.sum. Pin base images by digest, build with -mod=readonly, add govulncheck, and run as non-root.

🟢 Low

  • io/ioutil is deprecated (main.go:9,49) → use os.ReadFile.
  • file:/ handling: the prefix is 6 chars but the slice strips 5 (main.go:48-49) — works only by accident of the leading /; use strings.TrimPrefix(v, "file:").
  • CI builds only on tags (.github/workflows/docker-image.yml:3-6); add lint/test on PR, pin actions to commit SHAs, and isolate the destructive delete-package-versions steps.

Done well (credit where due)

Cookie attributes are correct (HttpOnly / Secure / SameSite=Lax, main.go:192-198); securecookie provides real authenticated encryption; the empty-groups check fails closed; the scratch image is minimal; no token/secret is logged; file-based secrets are supported; the cert parser is RFC 2253-aware and tested.


Architecture — build vs. buy

This reimplements, by hand, a security-critical category (an OAuth gateway) that battle-tested OSS already solves. Validated against oauth2-proxy, Authelia, Pomerium, Vouch Proxy, and Ory Oathkeeper:

Capability This proxy oauth2-proxy Authelia / Pomerium / Vouch / Oathkeeper
GitHub org/team auth ✅ hand-rolled GraphQL ✅ native (--github-org/teamX-Forwarded-Groups) ⚠️ via generic OIDC
Cookie sessions ⚠️ ephemeral keys ✅ rotatable secret + Redis (HA) ✅ + datastore
nginx auth_request ❌ (full reverse proxy) ✅ first-class
CSRF state nonce / PKCE ⚠️ partial
Per-route ACLs ⚠️ basic ✅ rich
Metrics / health ✅ Prometheus + /ping
CVE maintenance us community community
TLS client-cert as user identity (unique) ❌ (Authelia #846 still open)

Suggestion — hybrid: move the OAuth/cookie/header-injection path (~250 of 378 lines, all the security-sensitive parts) to oauth2-proxy in nginx auth_request mode. That deletes the hand-rolled OAuth/cookie/redirect code and gains full CSRF handling, persistent rotatable secrets, Redis HA, metrics, health, and a maintained CVE surface for free. Then either retire the cert path (if usage is marginal — worth checking first) or shrink this repo to a ~50-line cert-only auth_request sidecar (keep parseCertSubject + header injection; drop OAuth, cookies, secrets). The TLS-client-cert-as-identity feature is the one thing no mainstream gateway offers natively, so it's the only part that clearly justifies bespoke code.


Maintainability

  • Testability is the core weakness: only parseCertSubject is tested; the OAuth callback, cookie logic, redirect, and proxy director are untested because everything is wired as closures inside main(). Refactor into a Proxy/Authenticator struct with injectable deps so httptest can cover the handlers.
  • Global DefaultServeMux + a second mux; build an explicit *http.ServeMux / *http.Server.
  • Custom UPSTREAMS packed-string parsing (main.go:228-263) has no escaping and log.Fatals on bad input; move to structured config.
  • No graceful shutdown (drops in-flight requests on SIGTERM), no /healthz, no metrics.
  • Unstructured log, with a hard-coded "debug:" string logged at info level (main.go:260) — diverges from the CCAT structured-logging convention; consider log/slog.
  • README/code drift: the README documents GITLAB_OAUTH2_* env vars (README.md:45-46) that the code never reads (main.go:215-218) — an operator copying it gets a fatal "missing env var."

Usability

  • Headline UX defect: the ephemeral keys mean every deploy silently logs everyone out and bounces them back through GitHub; with more than one replica it becomes redirect loops. Fixing the key lifecycle fixes this too.
  • Users in no team get a bare 401 Unauthorized text (main.go:39-41,298) — no explanation, no "request access" guidance. A minimal branded HTML page would help.
  • Operator debugging is poor: lots of PII noise, no request IDs, no metrics, no auth-decision audit trail.

Suggested order of action

  1. Lock the trust boundary (bind loopback + mTLS/shared-secret from nginx + ingress header strip) — closes the auth bypass. main.go:334-341,375
  2. Externalize cookie/state keys from a secret — fixes the worst security and UX issue at once. main.go:203-213
  3. Check GitHub status + GraphQL errors; validate the redirect target. main.go:73-97, 311/345
  4. Add http.Server timeouts + graceful shutdown + /healthz. main.go:377
  5. Guard the parseCertSubject panic; de-PII the logging; fix the README GITLAB drift; pin Docker images.
  6. Then decide the strategic move: pilot oauth2-proxy for the OAuth path and reduce this to a cert-only sidecar (or retire it).

Filed by Christof using Claude as a heads-up for discussion. Not authoritative — please verify before acting.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions