A Spring Boot API gateway that centralizes authentication (incl. MFA), rate limiting, caching, idempotency and routing in front of downstream services — with first-class observability, graceful degradation when its own dependencies fail, and a production-ready deployment story.
Built on the servlet variant of Spring Cloud Gateway, so the security layer is a standard Spring Security filter chain rather than a reactive pipeline.

                      ┌──────────────────── gate ─────────────────────┐
                      │                                               │
     client ──HTTP──► │  servlet filter chain  ──►  router            │ ──► downstream
                      │                                               │     service(s)
                      └───┬──────────────┬─────────────┬──────────────┘
                          │              │             │
                    ┌─────▼────┐   ┌─────▼─────┐ ┌─────▼────────┐
                    │ Postgres │   │   Redis   │ │  MailHog /   │
                    │ users    │   │ buckets   │ │  SMTP        │
                    │ roles    │   │ denylist  │ └──────────────┘
                    │ audit    │   │ cache     │
                    │ sessions │   │ idemp.    │ ┌─────────────────────┐
                    │ verifies │   └───────────┘ │ observability       │
                    └──────────┘                 │ Prometheus / Tempo  │
                                                 │ Grafana             │
                                                 └─────────────────────┘

- PostgreSQL — source of truth for users, role assignments, refresh sessions, email-verification / password-reset tokens, and the security audit log.
- Redis — rate-limit buckets, revoked-JWT denylist, response cache and idempotency records (all ephemeral, TTL-bounded).
- Tracing / metrics — Prometheus scrapes the gateway, Tempo collects OTLP traces, Grafana provisions both with a ready-made dashboard.
- The gateway is stateless: any instance can serve any request.
Every request flows through an ordered pipeline. Cross-cutting concerns run as servlet filters outside Spring Security; authentication, rate limiting, caching and idempotency run inside the security chain.
1. CorrelationIdFilter: assign / propagate `X-Request-Id`, bind it to the log MDC
2. RequestLoggingFilter: start the latency timer
3. RequestSizeLimitFilter: reject oversized request bodies early (413)
4. JwtAuthenticationFilter: verify the Bearer token (RS256) + revocation denylist, populate the SecurityContext, attach `X-User-*` headers
5. RateLimitFilter: consume a token from the caller's bucket; 429 if empty
6. AuthorizationFilter: enforce the route's required role (RBAC)
7. Cache / Idempotency: serve cached GETs; replay responses for repeated `Idempotency-Key` writes
8. Route handler: `/auth`, `/account`, `/admin` → handled locally; `/api/**` → prefix stripped, circuit-broken, proxied

On unwind, RequestLoggingFilter logs method, path, status, latency and user.
Caching and idempotency run after authorization, so a cache hit can never bypass a 401/403.
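
The ordering guarantee can be illustrated with a small, dependency-free sketch. It is not the gateway's filter code: `PipelineOrderSketch`, its `Request` record and the three steps are hypothetical stand-ins, and rate limiting is left out for brevity. Each step either short-circuits with a status or lets the request continue, so the cache step is only reached once authentication and authorization have passed.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.function.Function;

/** Hypothetical model of an ordered pipeline; not the gateway's real filters. */
final class PipelineOrderSketch {
    /** Minimal request facts the illustrative steps look at. */
    record Request(boolean validToken, boolean hasRequiredRole, boolean cached) {}

    // Order matters: a step that returns a status ends the pipeline.
    private static final List<Function<Request, OptionalInt>> STEPS =
        List.<Function<Request, OptionalInt>>of(
            r -> r.validToken() ? OptionalInt.empty() : OptionalInt.of(401),      // authentication
            r -> r.hasRequiredRole() ? OptionalInt.empty() : OptionalInt.of(403), // authorization
            r -> r.cached() ? OptionalInt.of(200) : OptionalInt.empty()           // cache hit
        );

    /** Runs the steps in order and returns the first status produced. */
    static int handle(Request request) {
        for (var step : STEPS) {
            OptionalInt status = step.apply(request);
            if (status.isPresent()) {
                return status.getAsInt();
            }
        }
        return 200; // no step short-circuited: the route handler answers
    }
}
```

With this ordering, `handle(new Request(false, false, true))` yields 401 even though a cached response exists.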

    src/main/java/com/example/gate/
    ├─ GatewayApplication.java
    ├─ admin/          admin API: user/role management, audit access, bucket inspection
    ├─ audit/          async, persisted security audit log
    ├─ auth/
    │  ├─ AuthService, AuthController   register / login / refresh / logout
    │  ├─ email/                        verification + password reset (Mail)
    │  ├─ mfa/                          TOTP MFA setup, enable, verify-login
    │  └─ session/                      refresh-token sessions (list / revoke)
    ├─ cache/          Redis response cache (capture, store, replay)
    ├─ config/         security, resilience, OpenAPI, servlet-filter wiring, data seeding
    ├─ error/          ApiException + standardized ErrorResponse + global handler
    ├─ filter/         correlation-id, access-log, size-limit, header enrichment
    ├─ gateway/        downstream circuit-breaker fallback controller
    ├─ idempotency/    Idempotency-Key capture & replay for unsafe methods
    ├─ ratelimit/      token-bucket limiter, Redis runner, filter, tiers
    ├─ security/       JWT service & keys, JWKS, auth filter, denylist, lockout
    ├─ support/        small shared helpers (ClientContext, Hashing, etc.)
    └─ user/           User / Role entities + repositories

    src/main/resources/
    ├─ application.yml, application-prod.yml   profiles (default + hardened prod)
    ├─ db/migration/                           Flyway schema migrations (V1..V5)
    └─ scripts/                                token_bucket.lua (atomic rate-limit script)

    src/test/java/...          GatewayIntegrationTest (Testcontainers, end-to-end)
    observability/             Prometheus / Tempo / Grafana provisioning
    k8s/                       production Kubernetes manifests
    Dockerfile                 multi-stage build → JRE runtime image
    docker-compose.yml         runtime stack: gateway + deps + observability
    docker-compose.test.yml    containerized test harness (Docker-in-Docker)
    load/k6-script.js          load / burst test
    smoke-test.ps1             quick end-to-end check against a running stack
    .github/workflows/         CI: build, test, SBOM, Trivy scan, image

| Area | Detail |
|---|---|
| Authentication | Stateless RS256 JWT; short-lived access + rotating refresh tokens; online signing-key rotation (verify-by-kid, multi-key JWKS) |
| MFA | TOTP (RFC 6238, Google Authenticator compatible) — setup, enable, single-use login challenge with brute-force lockout; secret encrypted at rest (AES-GCM) |
| Email lifecycle | Verification on registration, password reset flow, MailHog in dev compose; tokens single-use (atomic claim), emailed only after commit |
| Active sessions | Refresh-token registry with per-session list + revoke (/account/sessions) |
| Authorization | Role-based, enforced by route matchers (ROLE_FREE / PREMIUM / ADMIN) |
| Rate limiting | Custom Redis + Lua token bucket — atomic, per-user / per-IP, tiered quotas |
| Passwords | Argon2id hashing via a delegating encoder; constant-work verify to resist user enumeration |
| Token revocation | Redis denylist + DB session table = real logout + refresh-reuse detection |
| Brute-force defence | Redis-backed failed-login + failed-MFA counter with temporary account lockout |
| Proxy trust | Forwarded-header handling via the container (native) so per-IP limits and audit IPs can't be spoofed |
| Maintenance | Scheduled cleanup of expired refresh sessions and used/expired email tokens |
| Audit log | Async, persisted record of every security-relevant event |
| Webhooks | Fan out audit events to subscribed endpoints — HMAC-SHA256 signed, per-subscription circuit breaker, retry with backoff + dead-letter, SSRF guard |
| Admin API | ROLE_ADMIN endpoints for users, roles, audit log, rate-limit buckets |
| Response caching | Idempotent GETs cached in Redis per user, with X-Cache headers |
| Idempotency | Idempotency-Key replay so unsafe requests are safely retryable |
| Degradation | Circuit breakers: Redis fails open, downstream fails closed |
| Per-route resilience | Read/write route split: idempotent reads retried on 5xx + tighter timeout; per-route concurrency bulkheads shed overload with 503 |
| Tracing | Micrometer Tracing → OpenTelemetry → Tempo, W3C traceparent propagation |
| Observability | Correlation IDs, structured JSON logs, Prometheus metrics, provisioned Grafana dashboard |
| API docs | OpenAPI 3 spec + Swagger UI; JWKS endpoint for downstream token verification |
| Hardening | Security headers, CORS, request size limits, graceful shutdown |
| Production | Hardened prod profile, Kubernetes manifests, SBOM + Trivy scans in CI |
| Errors | One standardized JSON error body across filters, security and controllers |
Java 21 · Spring Boot 3.4 · Spring Cloud Gateway MVC · Spring Security · Spring Data JPA · PostgreSQL · Redis · Resilience4j · Flyway · Micrometer (Prometheus + Tracing) · OpenTelemetry · springdoc-openapi · Spring Mail · Spring Retry · dev.samstevens.totp · Testcontainers · Docker.

    docker compose up --build

Brings up the gateway plus Postgres, Redis, MailHog, the downstream stub, and the observability stack on these ports:
| URL | What |
|---|---|
| http://localhost:8080 | the gateway |
| http://localhost:8080/swagger-ui.html | API docs |
| http://localhost:3000 | Grafana (anon viewer; dashboard "gate") |
| http://localhost:9090 | Prometheus |
| http://localhost:8025 | MailHog (captures all outbound mail) |
A demo admin user is seeded on first start (set GATE_ADMIN_PASSWORD for
anything beyond local use).
For stable JWTs across restarts (and a stable JWKS kid for downstream
verifiers), generate a persistent RSA keypair once before bringing the stack up:

    bash scripts/generate-keys.sh

The script runs openssl inside a small container, so no local openssl is
required. Without it, the gateway falls back to an ephemeral keypair per restart.

    docker compose logs -f gateway   # follow gateway logs
    docker compose down              # stop the stack
    docker compose down -v           # stop and wipe the database volume

| Method & path | Auth | Description |
|---|---|---|
| `POST /auth/register` | none | Create an account; sends verification email; returns a token pair |
| `POST /auth/login` | none | Credentials → token pair, or an MFA challenge if MFA is enabled |
| `POST /auth/mfa/verify-login` | mfa | Exchange challenge bearer + TOTP code for real tokens |
| `POST /auth/refresh` | none | Rotate a refresh token for a new pair |
| `POST /auth/logout` | bearer | Revoke the current access token |
| `GET /auth/verify-email?token=…` | none | Confirm email ownership |
| `POST /auth/password-reset/request` | none | Email a one-shot reset token |
| `POST /auth/password-reset/confirm` | none | Set a new password with the reset token |
| `POST /account/mfa/setup` | bearer | Issue a new TOTP secret + QR provisioning |
| `POST /account/mfa/verify-setup` | bearer | Enable MFA with the first valid code |
| `POST /account/mfa/disable` | bearer | Turn MFA off with a current code |
| `GET /account/sessions` | bearer | List the user's active refresh sessions |
| `DELETE /account/sessions/{id}` | bearer | Revoke a specific session |
| `ANY /api/**` | bearer | Authenticated, rate-limited proxy to the downstream service |
| `GET /admin/users`, `.../{id}` | admin | List / fetch users |
| `PUT /admin/users/{id}/status` | admin | Enable / disable an account |
| `PUT /admin/users/{id}/roles` | admin | Reassign a user's roles |
| `GET /admin/audit` | admin | Recent security audit events |
| `GET\|DELETE /admin/rate-limit/{type}/{id}` | admin | Inspect / reset a rate-limit bucket |
| `POST\|GET /admin/webhooks` | admin | Create (secret returned once) / list webhook subscriptions |
| `GET\|PUT\|DELETE /admin/webhooks/{id}` | admin | Fetch / update / delete a subscription |
| `GET /admin/webhooks/{id}/deliveries` | admin | Recent delivery attempts + status |
| `GET /admin/jwt/keys` | admin | Active signing-key ids and which is primary |
| `POST /admin/jwt/rotate` | admin | Rotate the signing key (old keys kept until they retire) |
| `GET /.well-known/jwks.json` | none | Public signing keys (JWK Set, all active kids) |
| `GET /swagger-ui.html`, `/v3/api-docs` | none | API documentation |
| `GET /actuator/health`, `/actuator/prometheus` | none | Health and metrics |

    # register and capture the access token
    TOKEN=$(curl -s -X POST localhost:8080/auth/register \
      -H 'Content-Type: application/json' \
      -d '{"username":"sam","email":"sam@example.com","password":"password123"}' \
      | jq -r .accessToken)

    # call a protected, proxied route; note the X-Cache / X-RateLimit / X-Request-Id headers
    curl -si localhost:8080/api/get -H "Authorization: Bearer $TOKEN"

    # safely retryable write: the same key replays the first response
    curl -s localhost:8080/api/post -X POST -H "Authorization: Bearer $TOKEN" \
      -H 'Idempotency-Key: order-42' -H 'Content-Type: application/json' -d '{}'

    # enable MFA (returns secret + QR data URI to scan in Google Authenticator)
    curl -s localhost:8080/account/mfa/setup -X POST -H "Authorization: Bearer $TOKEN"

A token bucket per identity is stored in Redis as a hash (`tokens`, `ts`). Refill
and consume happen inside a single Lua script (`token_bucket.lua`), so the
read-modify-write is atomic across concurrent gateway instances. Idle buckets
expire automatically. Quotas are tiered (`gate.rate-limit.tiers`); responses
carry `X-RateLimit-Limit` / `X-RateLimit-Remaining`, and a 429 includes `Retry-After`.
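
The refill-and-consume step described above can be sketched in plain Java. This is an in-memory illustration of the general token-bucket technique, not the contents of `token_bucket.lua`: the class name, the `capacity` and `refillPerSecond` parameters, and the millisecond clock passed in by the caller are all assumptions made for the example.

```java
/** In-memory token-bucket sketch; the gateway performs its equivalent step atomically in Redis. */
final class TokenBucketSketch {
    private final double capacity;        // assumed parameter: maximum tokens the bucket holds
    private final double refillPerSecond; // assumed parameter: tokens added per second
    private double tokens;                // plays the role of the "tokens" hash field
    private long lastRefillMillis;        // plays the role of the "ts" hash field

    TokenBucketSketch(double capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;           // a new bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    /** Refills for the elapsed time, then tries to take one token; false means "answer 429". */
    synchronized boolean tryConsume(long nowMillis) {
        long elapsedMillis = Math.max(0, nowMillis - lastRefillMillis);
        tokens = Math.min(capacity, tokens + elapsedMillis / 1000.0 * refillPerSecond);
        lastRefillMillis = Math.max(lastRefillMillis, nowMillis);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Whole tokens left, the kind of value a remaining-quota header would report. */
    synchronized long remaining() {
        return (long) Math.floor(tokens);
    }
}
```

The `synchronized` methods only protect a single JVM. Doing the same read-modify-write inside one Redis script is what extends the guarantee to several gateway instances sharing a bucket.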
- Response cache: successful `GET`s on `/api/**` are captured per user in Redis, TTL `gate.cache.ttl` (default 30s). Repeated reads are served by the gateway and marked `X-Cache: HIT`.
- Idempotency: when a client sends an `Idempotency-Key` header on a `POST`/`PUT`/`PATCH`, the first response is captured and replayed (`X-Idempotency-Replayed: true`) so a network retry never double-applies. In-flight duplicates get a clean `409`.
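
The three idempotency outcomes (first execution, replay, in-flight conflict) can be sketched with an in-memory map. This is a hedged illustration, not the gateway's Redis-backed implementation: `IdempotencySketch`, its `Result` record and the sentinel are hypothetical names, and record TTLs and per-user scoping are omitted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** In-memory sketch of Idempotency-Key handling; names and storage are illustrative. */
final class IdempotencySketch {
    /** Status and body of a response, plus whether it was replayed from a stored record. */
    record Result(int status, String body, boolean replayed) {}

    // Sentinel stored while the first request for a key is still running.
    private static final Result IN_FLIGHT = new Result(0, "", false);

    private final Map<String, Result> records = new ConcurrentHashMap<>();

    Result handle(String idempotencyKey, Supplier<Result> handler) {
        // putIfAbsent claims the key atomically: only one caller sees null.
        Result existing = records.putIfAbsent(idempotencyKey, IN_FLIGHT);
        if (existing == IN_FLIGHT) {
            return new Result(409, "a request with this key is still in progress", false);
        }
        if (existing != null) {
            return new Result(existing.status(), existing.body(), true); // replay
        }
        try {
            Result fresh = handler.get();
            records.put(idempotencyKey, fresh); // capture the first response
            return fresh;
        } catch (RuntimeException e) {
            records.remove(idempotencyKey);     // release the claim so a retry can run
            throw e;
        }
    }
}
```

Claiming the key before running the handler is the design point: a retry that arrives while the first attempt is still executing is rejected rather than executed a second time.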
- MFA: TOTP-based; login of an MFA-enabled user returns a short-lived challenge bearer that is exchanged together with the current code at `/auth/mfa/verify-login`.
- Brute-force protection: consecutive failed logins are counted in Redis; crossing the threshold locks the account for a cooldown window.
- Audit log: security events (registration, email verification, login success/failure, lockouts, logout, refresh, refresh-reuse, MFA changes, session revocation, admin changes) are written asynchronously to PostgreSQL and exposed via the admin API.
- Headers & limits: HSTS, frame/content-type/referrer policies, configurable CORS, a request body size cap, and graceful shutdown.
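
For readers unfamiliar with TOTP, the code computation from RFC 6238 can be sketched with the JDK alone. This shows the algorithm, not the gateway's code path and not the API of the `dev.samstevens.totp` library listed in the stack. The class name is hypothetical, and the 30-second step, 6 digits and HMAC-SHA1 are the common authenticator-app defaults, assumed here. Authenticator apps exchange the secret as Base32; the sketch takes the raw secret bytes.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** RFC 6238 TOTP sketch: HMAC-SHA1, 30-second step, 6 digits (assumed defaults). */
final class TotpSketch {
    static String code(byte[] secret, long epochSeconds) {
        try {
            long counter = epochSeconds / 30; // number of 30-second steps since the epoch
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] hash = mac.doFinal(ByteBuffer.allocate(8).putLong(counter).array());

            // Dynamic truncation (RFC 4226): the low nibble of the last byte picks a 4-byte window.
            int offset = hash[hash.length - 1] & 0x0F;
            int binary = ((hash[offset] & 0x7F) << 24)
                    | ((hash[offset + 1] & 0xFF) << 16)
                    | ((hash[offset + 2] & 0xFF) << 8)
                    | (hash[offset + 3] & 0xFF);
            return String.format("%06d", binary % 1_000_000);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 unavailable", e);
        }
    }
}
```

A verifier typically compares the submitted code against the current step and often one adjacent step to tolerate clock drift; whether the gateway allows such a window is not stated here.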
| Dependency down | Behaviour | Rationale |
|---|---|---|
| Redis | Rate limiter / lockout / cache fail open — requests are allowed | Availability over strict enforcement; a circuit breaker stops hammering a dead Redis |
| Downstream service | Route fails closed with a 503 fallback | Don't hang the client; the breaker sheds load while the service recovers |
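
The two policies in the table differ only in what happens when the dependency call itself fails. A minimal sketch, assuming hypothetical helper names and modelling a dependency failure as a thrown `RuntimeException` (the gateway's circuit breakers are not shown):

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

/** Illustrative fail-open vs fail-closed wrappers; not the gateway's breaker configuration. */
final class DegradationSketch {
    /** Fail open: if the Redis-backed check cannot run, let the request through. */
    static boolean allowRequest(BooleanSupplier redisBackedCheck) {
        try {
            return redisBackedCheck.getAsBoolean();
        } catch (RuntimeException redisUnavailable) {
            return true; // availability over strict enforcement
        }
    }

    /** Fail closed: if the downstream call cannot run, answer 503 instead of hanging. */
    static int proxyStatus(IntSupplier downstreamCall) {
        try {
            return downstreamCall.getAsInt();
        } catch (RuntimeException downstreamUnavailable) {
            return 503;
        }
    }
}
```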
/api/** is split into read (GET/HEAD) and write (POST/PUT/PATCH/DELETE)
routes with independent resilience: reads get a tighter timeout and are retried on 5xx
(idempotent only — writes are never retried); each route class has a concurrency
bulkhead that sheds excess in-flight requests with a 503 rather than queueing unboundedly.
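
A concurrency bulkhead of this kind can be sketched with a `java.util.concurrent.Semaphore`. This is a generic illustration under assumed names (`BulkheadSketch`, `maxInFlight`), not the gateway's actual bulkhead wiring. The key detail is the non-blocking `tryAcquire()`: when no permit is free, the call is rejected immediately instead of waiting in a queue.

```java
import java.util.concurrent.Semaphore;
import java.util.function.IntSupplier;

/** Semaphore-based bulkhead sketch: caps in-flight calls and sheds the excess with 503. */
final class BulkheadSketch {
    private final Semaphore permits;

    BulkheadSketch(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Runs the call if a permit is free; otherwise returns 503 without waiting. */
    int call(IntSupplier downstreamCall) {
        if (!permits.tryAcquire()) {
            return 503; // shed load instead of queueing unboundedly
        }
        try {
            return downstreamCall.getAsInt();
        } finally {
            permits.release(); // always free the slot, even if the call throws
        }
    }
}
```

One instance per route class (for example one for reads and one for writes) keeps a flood of slow writes from consuming the capacity reserved for reads.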
- Every request gets an `X-Request-Id` (generated or propagated), echoed downstream and into logs.
- Structured JSON access logs under Docker (Elastic Common Schema format).
- Distributed tracing (Micrometer Tracing → OpenTelemetry → Tempo, sampling configurable via `TRACING_SAMPLING`); `traceId` and `spanId` appear in the MDC.
- Metrics at `/actuator/prometheus`, plus a provisioned Grafana dashboard (http://localhost:3000, dashboard "gate") showing request rate, p50/p95/p99 latency, rate-limit allowed vs rejected, fail-open events, circuit-breaker state and JVM memory.
Admins register subscriptions (/admin/webhooks) to receive audit events at an
external URL. After an audit event commits, the gateway fans it out off the
request path (an AFTER_COMMIT listener on a dedicated bounded executor):
- Signed: each POST carries `X-Gate-Timestamp` and `X-Gate-Signature: sha256=<hex>`, an HMAC-SHA256 over `"<timestamp>.<body>"` using the subscription's secret (returned once at creation, stored encrypted).
- Resilient: a per-subscription circuit breaker isolates a failing endpoint; failed deliveries retry with exponential backoff and become dead after the attempt cap; delivery state is queryable and old rows are pruned.
- Guarded: webhook admin events are excluded from fan-out (no feedback loop), and an SSRF guard can refuse internal/link-local targets (`prod` default).

Delivery is at-least-once; consumers should dedupe on the `eventId` field.
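
On the consumer side, checking the signature could look roughly like the sketch below. It follows the scheme described above (HMAC-SHA256 over `"<timestamp>.<body>"`, sent as `sha256=<hex>`), but several details are assumptions: the class and method names are hypothetical, the secret is treated as a UTF-8 string, and the hex digest is assumed to be lowercase. Check these against a real delivery before relying on it.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Consumer-side sketch for verifying an X-Gate-Signature header; encoding details are assumed. */
final class WebhookSignatureSketch {
    /** Computes "sha256=<hex>" over "<timestamp>.<body>" with the subscription secret. */
    static String sign(String secret, String timestamp, String body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal((timestamp + "." + body).getBytes(StandardCharsets.UTF_8));
            return "sha256=" + HexFormat.of().formatHex(digest); // lowercase hex
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /** Recomputes the signature and compares it without an early exit on the first mismatch. */
    static boolean verify(String secret, String timestamp, String body, String signatureHeader) {
        byte[] expected = sign(secret, timestamp, body).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signatureHeader.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }
}
```

The body must be the raw request bytes exactly as received; re-serializing parsed JSON before verifying usually changes the bytes and breaks the comparison.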
GatewayIntegrationTest runs end-to-end against real Postgres, Redis and a
downstream container: registration, JWT auth, tampered-token rejection, rate
limiting, refresh-token rotation/reuse detection, logout revocation, response
caching, idempotent replay, brute-force lockout, admin-role enforcement, JWKS,
MFA setup + login challenge, active session list + revoke, password
reset endpoints, and the Redis-down fail-open behaviour.

    # fully containerized: no local JDK required (uses a Docker-in-Docker harness)
    docker compose -f docker-compose.test.yml run --rm tests

    # or, with a local JDK 21 + Maven
    mvn verify

`smoke-test.ps1` exercises the same flows against an already-running stack.

    k6 run load/k6-script.js

Ramps concurrent virtual users through an authenticated route and asserts the
gateway stays fast (p95 latency) and never returns a 5xx; excess load is shed
as 429s rather than failures.
- Profile: set `SPRING_PROFILES_ACTIVE=prod` to activate `application-prod.yml` (hardened error responses, JSON logs, email verification enforced, no wildcard CORS, restricted actuator exposure).
- Kubernetes: see `k8s/` for Deployment, Service, ConfigMap, Secret template, HPA and NetworkPolicy. Read `k8s/README.md` for the apply order and the production checklist (mount a real RSA keypair, source secrets from a vault, set CORS, point OTLP at your collector).
- CI: GitHub Actions builds + tests, generates a CycloneDX SBOM (uploaded as an artifact), and runs Trivy scans against the dependency graph and the built image. See `.github/workflows/ci.yml`.

| Env var | Purpose | Default |
|---|---|---|
| `DB_HOST` / `DB_PORT` / `DB_NAME` / `DB_USER` / `DB_PASSWORD` | PostgreSQL connection | localhost:5432/gatewaydb |
| `REDIS_HOST` / `REDIS_PORT` | Redis connection | localhost:6379 |
| `DOWNSTREAM_URI` | Proxied downstream base URI | http://httpbin:80 |
| `MAIL_HOST` / `MAIL_PORT` | SMTP server | localhost:1025 |
| `OTLP_ENDPOINT` | OpenTelemetry collector for traces | unset (no export) |
| `TRACING_SAMPLING` | Trace sampling probability | 1.0 |
| `JWT_PRIVATE_KEY_PATH` / `JWT_PUBLIC_KEY_PATH` | PEM keypair; if both unset, an ephemeral keypair is generated. A configured-but-unreadable key fails startup (no silent downgrade); required under prod | unset |
| `JWT_KEY_RETIRE_AFTER` | How long a rotated-out signing key stays verifiable (must exceed refresh-token TTL) | 8d |
| `GATE_MFA_ENCRYPTION_KEY` | Base64-encoded 32-byte AES-GCM key encrypting stored TOTP secrets. Ephemeral if unset (secrets lost on restart); required under prod | unset |
| `GATE_ADMIN_PASSWORD` | Seeded admin password | admin123 |
| `LOGIN_MAX_FAILURES` / `LOGIN_FAILURE_WINDOW` / `LOGIN_LOCK_DURATION` | Failure threshold, counting window, and lockout duration | 5 / 15m / 15m |
| `CACHE_ENABLED` / `CACHE_TTL` | Response cache toggle and TTL | true / 30s |
| `IDEMPOTENCY_ENABLED` / `IDEMPOTENCY_TTL` | Idempotency toggle and record TTL | true / 24h |
| `MFA_ISSUER` / `MFA_CHALLENGE_TTL` | TOTP issuer label and challenge bearer TTL | gate / 5m |
| `EMAIL_FROM` / `EMAIL_PUBLIC_BASE_URL` | Email sender + URL embedded in messages | no-reply@gate.local / http://localhost:8080 |
| `EMAIL_VERIFICATION_TTL` / `PASSWORD_RESET_TTL` | Lifetime of verification / reset tokens | 24h / 1h |
| `REQUIRE_EMAIL_VERIFICATION` | Block login until the email is verified | false (dev), true (prod) |
| `CLEANUP_CRON` / `CLEANUP_REVOKED_RETENTION` | Maintenance schedule + how long revoked sessions are kept after revocation | `0 0 3 * * *` / 30d |
| `FORWARD_HEADERS_STRATEGY` | How `X-Forwarded-*` are trusted (native only honours them from private-range proxies) | native |
| `ROUTE_READ_TIMEOUT` / `ROUTE_WRITE_TIMEOUT` | Per-route downstream timeouts (reads tighter than writes) | 5s / 10s |
| `ROUTE_READ_BULKHEAD` / `ROUTE_WRITE_BULKHEAD` | Max in-flight downstream requests per route class | 50 / 20 |
| `WEBHOOKS_ENABLED` / `WEBHOOK_MAX_ATTEMPTS` | Webhook fan-out toggle and per-delivery attempt cap | true / 5 |
| `WEBHOOK_DELIVERY_RETENTION` / `WEBHOOK_BLOCK_PRIVATE` | How long delivered/dead rows are kept; SSRF guard on internal targets | 7d / false (dev), true (prod) |
| `CORS_ALLOWED_ORIGINS` | Comma-separated allowed origins | `*` (dev) / required (prod) |
| `MAX_REQUEST_BYTES` | Request body size cap | 1048576 |