Generic PostgreSQL backup sidecar: scheduled pg_dump backups with a small REST API for on-demand triggering, listing, and download.
db-backup-service is a small, single-purpose container that runs as a
sidecar to a PostgreSQL database. It periodically takes compressed
pg_dump backups and uploads them to a configurable storage backend
(S3-compatible object storage or a local filesystem). On top of that
scheduled behaviour it exposes a minimal, authenticated REST API that
other services (typically a backend application) can use to trigger
out-of-schedule backups, list existing backups, and download them — for
example to populate a staging environment from production data.
The backup files are standard pg_dump --no-owner --no-privileges
plain SQL, gzipped. They are restorable with stock psql even without
this service. A bundled db-restore CLI inside the container provides
an operator-friendly atomic restore path via
gunzip -c <file> | psql --single-transaction.
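Because the dumps are plain gzipped SQL, a restore without this service might look like the sketch below. It assumes the dump has already been downloaded and that an empty target database exists. The function name restore_dump is hypothetical and not part of the service. psql documents --single-transaction as usable only together with -c or -f, so the sketch reads the piped SQL via --file=-, and it sets ON_ERROR_STOP so that the first failing statement aborts the run.

```shell
# restore_dump FILE DBNAME
# Hypothetical helper: streams a gzipped plain-SQL dump into an existing,
# empty database. Connection settings (host, user, password) are assumed to
# come from the standard libpq variables such as PGHOST, PGUSER, PGPASSWORD.
restore_dump() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
  # The exit status of the pipeline is the exit status of psql.
  gunzip -c "$1" | psql --single-transaction -v ON_ERROR_STOP=1 \
    --dbname "$2" --file=-
}

# Usage (file and database names are placeholders):
#   restore_dump backup_20260510T010000Z.sql.gz staging_db
```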
- Scheduled backups at a configurable interval (default 24 hours; any ISO-8601 duration or simple 24h/30m shorthand accepted).
- On-demand backups via an authenticated REST API.
- Two storage backends: S3-compatible (AWS S3, MinIO, Hetzner Object Storage, …) or a local filesystem path.
- Atomic restore via the bundled db-restore CLI (drops and recreates the target database, restores in a single transaction so any error rolls the whole import back).
- Retention policy with a "latest always kept" safety floor: old backups are pruned after RETENTION_DAYS, but the most recent successful backup is never deleted.
- Streaming pipeline: pg_dump stdout is piped through gzip straight into the storage backend, so the container never needs disk space for a full backup file.
- Structured JSON logs by default (LOG_FORMAT=plain available for human-readable output during development).
- Health and info endpoints (/health, /info) suitable for Kubernetes probes and external monitoring. /info exposes backup.lastSuccessfulBackupAgeSeconds so a monitoring stack can alert on backup staleness without an in-band alerting story.
- Multi-arch container image for linux/amd64 and linux/arm64 published to ghcr.io/openelementslabs/db-backup-service on every Git tag.
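The retention rule from the list above can be sketched as a small filter. This illustrates the rule only and is not the service's implementation: the function name prunable_backups and the input format (one "createdAtEpochSeconds id" pair per line, covering successful backups only) are assumptions.

```shell
# prunable_backups NOW_EPOCH RETENTION_DAYS
# Reads "createdAtEpochSeconds id" lines on stdin and prints the ids the rule
# would prune: older than the cutoff, but never the newest backup.
prunable_backups() {
  cutoff=$(( $1 - $2 * 86400 ))
  # Newest first, so line 1 is the latest backup and is skipped unconditionally.
  sort -k1,1nr | awk -v cutoff="$cutoff" 'NR > 1 && $1 < cutoff { print $2 }'
}

# Usage (listing.txt is a placeholder for a prepared listing):
#   prunable_backups "$(date +%s)" 7 < listing.txt
```

With a retention of 7 days and a listing whose only entries are 30 and 40 days old, only the 40-day-old id is printed; the 30-day-old backup stays because it is the latest.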
Two complete examples live at the repository root:
- docker-compose.local.yml: backs up to a host directory.
- docker-compose.s3.yml: backs up to a MinIO bucket (an S3-compatible object store that runs locally).
Both files are self-contained — set a strong API_TOKEN and you are
running.
export API_TOKEN="$(openssl rand -hex 32)"
docker compose -f docker-compose.local.yml up
# Trigger an on-demand backup
curl -X POST -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups
# List backups (newest first)
curl -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups
# Download the latest backup
curl -OJ -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups/latest/download

The same curl snippets from the local example work unchanged against the S3 stack (docker-compose.s3.yml): the storage backend is transparent to the API.
All configuration is via environment variables. The table below is the single source of truth for operators.
| Variable | Required | Default | Purpose |
|---|---|---|---|
| DB_HOST | yes | - | Hostname of the PostgreSQL server. |
| DB_PORT | no | 5432 | TCP port of the PostgreSQL server. |
| DB_NAME | yes | - | Database name to back up. |
| DB_USER | yes | - | PostgreSQL user. Must have privileges to dump all required schemas. |
| DB_PASSWORD | yes | - | Password for DB_USER. |
| BACKUP_INTERVAL | no | 24h | ISO-8601 duration (PT24H) or simple format (24h, 12h, 30m). Drives the scheduler. |
| BACKUP_NAME_PREFIX | no | backup | Prefix for backup IDs and filenames. |
| RETENTION_DAYS | no | 7 | Backups older than this are pruned, except the most recent successful one, which is always kept. |
| STORAGE_BACKEND | yes | - | s3 or local. |
| BACKUP_LOCAL_DIR | when STORAGE_BACKEND=local | - | Mount point for local backups. |
| S3_BUCKET | when STORAGE_BACKEND=s3 | - | Target bucket. |
| S3_PREFIX | no | backups | Key prefix inside the bucket. |
| S3_ENDPOINT | no | (AWS default) | Custom S3 endpoint for MinIO, Hetzner Object Storage, etc. |
| AWS_ACCESS_KEY_ID | when STORAGE_BACKEND=s3 | - | S3 access key. |
| AWS_SECRET_ACCESS_KEY | when STORAGE_BACKEND=s3 | - | S3 secret key. |
| AWS_DEFAULT_REGION | no | eu-central-1 | S3 region. |
| API_TOKEN | yes | - | Static bearer token for the REST API. Generate a strong random value. Rotation requires a container restart. |
| HTTP_PORT | no | 8080 | Port the REST API listens on. |
| LOG_FORMAT | no | json | json or plain. |
Missing required variables cause the container to fail at startup with a clear, actionable error message.
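The two accepted BACKUP_INTERVAL notations can be normalised to seconds as in the sketch below, for instance to pre-validate a value in a deployment script. The function name interval_to_seconds is hypothetical. The accepted grammar (whole hours or minutes for the shorthand, time-only PT<n>H<n>M<n>S for ISO-8601) is an assumption based on the examples in the table, and the service's own parser may accept more.

```shell
# interval_to_seconds VALUE
# Prints the duration in seconds; exits non-zero for unsupported input.
interval_to_seconds() {
  printf '%s\n' "$1" | awk '
    # Shorthand: whole hours or minutes, e.g. 24h or 30m.
    /^[0-9]+[hm]$/ {
      n = substr($0, 1, length($0) - 1)
      print n * (substr($0, length($0)) == "h" ? 3600 : 60)
      found = 1
      next
    }
    # ISO-8601 time-only durations, e.g. PT24H or PT1H30M.
    /^PT([0-9]+H)?([0-9]+M)?([0-9]+S)?$/ && length($0) > 2 {
      rest = substr($0, 3); total = 0
      while (match(rest, /^[0-9]+[HMS]/)) {
        n = substr(rest, 1, RLENGTH - 1)
        unit = substr(rest, RLENGTH, 1)
        total += n * (unit == "H" ? 3600 : unit == "M" ? 60 : 1)
        rest = substr(rest, RLENGTH + 1)
      }
      print total
      found = 1
      next
    }
    END { exit(found ? 0 : 1) }
  '
}

# Usage:
#   interval_to_seconds "${BACKUP_INTERVAL:-24h}" || echo "bad interval" >&2
```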
All API endpoints are versioned under /api/v1/. Every endpoint except
/health and /info requires the Authorization: Bearer <API_TOKEN>
header. Mismatched or missing tokens return 401 Unauthorized.
Single-flight invariant: at most one backup runs at any time. The
scheduler and the API share the same lock. Concurrent triggers are
rejected with 409 Conflict, and the response body contains the
running job's ID so the caller can poll its status.
| Method | Path | Purpose | Status codes |
|---|---|---|---|
| POST | /api/v1/backups | Trigger a new backup. | 202 (new job), 409 (running job ID), 401 |
| GET | /api/v1/backups/jobs/{jobId} | Job status. | 200, 401, 404 |
| GET | /api/v1/backups | List all available backups, newest first. | 200, 401 |
| GET | /api/v1/backups/latest | Metadata of the latest successful backup. | 200, 401, 404 (no successful backup yet) |
| GET | /api/v1/backups/{id}/download | Download a specific backup as application/gzip. | 200, 401, 404 |
| GET | /api/v1/backups/latest/download | Download the latest successful backup. | 200, 401, 404 |
| GET | /health | Liveness + readiness, no auth. | 200, 503 |
| GET | /info | Service version, PG client version, retention config, last-successful-backup age, no auth. | 200 |
# Trigger a backup (returns 202 with the new job, or 409 if one is in flight)
curl -X POST -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups
# Poll a specific job
curl -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups/jobs/<jobId>
# List all backups (newest first)
curl -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups
# Metadata of the latest successful backup
curl -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups/latest
# Download a specific backup (preserves filename via -OJ)
curl -OJ -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups/<id>/download
# Download the latest backup
curl -OJ -H "Authorization: Bearer $API_TOKEN" \
http://localhost:8080/api/v1/backups/latest/download
# Health (no auth)
curl http://localhost:8080/health
# Info (no auth)
curl http://localhost:8080/info

{
  "jobId": "9b7c4a8e-...",
  "status": "queued | running | succeeded | failed",
  "triggeredBy": "scheduler | api",
  "startedAt": "2026-05-10T01:00:00Z",
  "finishedAt": "2026-05-10T01:01:14Z",
  "durationMs": 74123,
  "errorMessage": null,
  "backupId": "backup_20260510T010000Z.sql.gz"
}

Jobs are kept in memory only: the most recent 100, with FIFO eviction. Jobs are lost on container restart. Long-term history comes from the backup listing.
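A caller can combine the trigger and job-status endpoints into a trigger-then-wait helper. The sketch below assumes BASE_URL and API_TOKEN are set by the caller, that curl and jq are installed, and that both the 202 and the 409 response bodies carry the job ID in a jobId field (the 409 field name in particular is an assumption). The function names and the POLL_SECONDS variable are hypothetical.

```shell
# trigger_backup
# Prints the job ID of the new (202) or already running (409) backup.
trigger_backup() {
  body=$(mktemp) || return 1
  code=$(curl -s -o "$body" -w '%{http_code}' -X POST \
    -H "Authorization: Bearer $API_TOKEN" "$BASE_URL/api/v1/backups")
  case $code in
    202|409) jq -r '.jobId' "$body"; rc=$? ;;
    *) echo "unexpected HTTP status: $code" >&2; rc=1 ;;
  esac
  rm -f "$body"
  return "$rc"
}

# wait_for_job JOB_ID
# Polls until the job leaves queued/running. Returns 0 on succeeded and
# 1 on failed or on any unexpected answer (including an evicted job).
wait_for_job() {
  while :; do
    status=$(curl -s -H "Authorization: Bearer $API_TOKEN" \
      "$BASE_URL/api/v1/backups/jobs/$1" | jq -r '.status')
    case $status in
      succeeded) return 0 ;;
      queued|running) sleep "${POLL_SECONDS:-5}" ;;
      *) return 1 ;;
    esac
  done
}

# Usage:
#   job=$(trigger_backup) && wait_for_job "$job"
```

Treating 409 like 202 means a caller that loses the race against the scheduler simply waits for the backup that is already in flight.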
{
"id": "backup_20260510T010000Z.sql.gz",
"createdAt": "2026-05-10T01:01:14Z",
"sizeBytes": 8421376,
"sha256": "fa3c…",
"pgVersion": "17.2",
"durationMs": 74123,
"triggeredBy": "scheduler | api"
}

Restore is intentionally not exposed via the REST API because that path
is too easy to misuse catastrophically. Instead, a bundled db-restore
CLI is shipped inside the container and is invoked by an operator via
docker exec:
# List available backups (newest first)
docker exec <container> db-restore
# Restore a specific backup (5-second abort window — Ctrl-C to cancel)
docker exec <container> db-restore backup_20260510T010000Z.sql.gz
# Restore without the abort window (for automation)
docker exec <container> db-restore --force backup_20260510T010000Z.sql.gz

The script drops and recreates the target database, then restores via
gunzip -c <file> | psql --single-transaction. The
--single-transaction flag ensures the entire restore is atomic: any
error rolls the whole import back, leaving the database empty (freshly
recreated) instead of half-imported. The operator can then choose a
different backup.
The script reads connection info and storage configuration from the
same environment variables as the Spring application. For the S3
backend it downloads through the running service's authenticated REST
API instead of embedding AWS SigV4 — so the running service must be
healthy for db-restore to work on the S3 backend.
The service is designed for internal-network use only. Do not expose it directly to the public Internet.
Specifically:
- No TLS. The HTTP server is plain HTTP. If you need TLS, terminate it in an upstream reverse-proxy (nginx, Traefik, an ingress controller) on the same internal network and configure the proxy to forward to the service.
- No rate limiting, no CORS, no IP allow-list, no WAF. These are the upstream reverse-proxy's job.
- Single static bearer token. API_TOKEN is the only credential. Generate it with a strong random source (openssl rand -hex 32 or equivalent). Token rotation requires a container restart; there is no in-band rotation API.
- No per-user authorisation. The service has no concept of users. Per-user authentication, role checks, and audit logging are the calling application's responsibility. Typically a backend application proxies frontend calls to the backup service and adds its own user-aware authorisation layer in front of the bearer token.
- /health and /info are unauthenticated. They reveal the configured retention days, the configured backup interval, the bundled pg_dump version, and the age of the last successful backup. They are designed for monitoring tools on the same trusted network. Do not expose them publicly.
If you need to expose the API beyond your internal network, the only supported pattern is: put a reverse-proxy in front that adds TLS, rate limiting, and whatever additional authentication layer your environment requires. The backup service itself will not grow these concerns — they belong upstream.
- Tool: pg_dump from the PostgreSQL 17 client suite, piped through gzip. pg_dump is backward-compatible with older server versions back to 9.2, so the same image backs up servers from 9.2 through 17.
- Format: plain SQL, gzipped (.sql.gz). Restorable with stock psql even without this service.
- Flags: --no-owner --no-privileges for portability across environments with different role names.
- Consistency: the default pg_dump snapshot mode (REPEATABLE READ). Each backup is a self-consistent point-in-time snapshot.
- Validation: each finalised dump is read back through GZIPInputStream and its SHA-256 is recorded in the sidecar JSON. A failed validation marks the job as failed and discards the partial upload.
Each successful backup produces two storage objects:
<prefix>/backup_20260510T010000Z.sql.gz ← the dump
<prefix>/backup_20260510T010000Z.sql.gz.meta.json ← sidecar metadata
The sidecar JSON is written only after the dump is fully uploaded and verified. Listings ignore dumps without a sidecar, so a partially uploaded backup (e.g. interrupted by a container restart) never appears as "latest".
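After downloading a dump, a client might check its integrity against the sha256 value from the metadata along the lines below. This is a sketch under two assumptions: that the recorded sha256 is the lowercase hex digest of the stored .sql.gz file itself (not of the decompressed SQL), and that GNU sha256sum is available. The name verify_backup is hypothetical.

```shell
# verify_backup FILE EXPECTED_SHA256
# Succeeds only if FILE is an intact gzip stream whose digest matches.
verify_backup() {
  # gzip -t decompresses to nowhere and fails on truncated or corrupt data.
  gzip -t "$1" 2>/dev/null || { echo "not a valid gzip file: $1" >&2; return 1; }
  # Reading from stdin keeps the file name out of the sha256sum output.
  actual=$(sha256sum < "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ] || { echo "checksum mismatch for $1" >&2; return 1; }
}

# Usage (the digest placeholder comes from the backup metadata):
#   verify_backup backup_20260510T010000Z.sql.gz "<sha256 from metadata>"
```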
- Logs. Structured JSON by default, one event per line. Switchable to plain text via LOG_FORMAT=plain. Contextual fields (jobId, backupId, durationMs, triggeredBy) are attached via MDC for the duration of a backup job.
- /health. Spring Boot Actuator endpoint, unauthenticated. Returns 200 only when both the PostgreSQL server is reachable on the configured host/port and the storage backend is reachable (S3 HeadBucket or a writable local directory). Returns 503 otherwise. Suitable as a Kubernetes readiness/liveness probe.
- /info. Unauthenticated. Reports the service version, the bundled pg_dump version, the configured retention days, the configured backup interval (in both ISO-8601 and seconds), and backup.lastSuccessfulBackupAgeSeconds (a long, or null when no successful backup exists yet). Alert on lastSuccessfulBackupAgeSeconds > <SLO> to detect backup staleness in your monitoring stack.
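Where no monitoring stack is available, the staleness alert can be approximated with a small check script, for example run from cron. The sketch assumes jq is installed and that the field sits at .backup.lastSuccessfulBackupAgeSeconds in the /info JSON, as its dotted name suggests. The function name backup_is_fresh and the choice to pass the JSON on stdin are illustrative.

```shell
# backup_is_fresh MAX_AGE_SECONDS
# Reads the /info JSON on stdin. Succeeds only if a successful backup exists
# and is at most MAX_AGE_SECONDS old; a null age (no backup yet) counts as stale.
backup_is_fresh() {
  # jq -e maps a final result of false or null to a non-zero exit status.
  jq -e --argjson max "$1" \
    '.backup.lastSuccessfulBackupAgeSeconds as $age | $age != null and $age <= $max' \
    >/dev/null
}

# Usage (90000 seconds is 25 hours, a placeholder threshold):
#   curl -s http://localhost:8080/info | backup_is_fresh 90000 \
#     || echo "backup is stale or missing" >&2
```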
- Java 21 (LTS). We recommend installing it via SDKMAN! or Eclipse Adoptium / Temurin.
- Maven 3.9+.
- Docker. Required for the Testcontainers-based integration tests and for building the container image.
mvn verify

This compiles the project, runs the unit and integration test suites,
and produces an executable Spring Boot fat JAR at
target/db-backup-service.jar. CI (.github/workflows/ci.yml) runs
exactly this command on every push and pull request.
docker build -f docker/Dockerfile -t db-backup-service:dev .

The image is based on eclipse-temurin:21-jre and bundles the
PostgreSQL 17 client tools (pg_dump, pg_restore, psql), gzip,
bash, curl, and jq. The bundled db-restore CLI lives at
/usr/local/bin/db-restore.
Non-trivial changes go through a small spec before implementation. The
spec folder lives at specs/<NNN>-<short-description>/ and contains a
design.md and a behaviors.md (and optionally a steps.md). See
specs/INDEX.md for the catalogue of existing specs
and
.claude/conventions/spec-driven-development.md
for the convention itself.
See CONTRIBUTING.md for development setup, the
spec-driven workflow, commit message conventions, and the pull-request
review process. Participation in the project is conditional on
accepting our Code of Conduct.
Vulnerability reports are handled via SECURITY.md.
Apache License 2.0. See LICENSE.