add prometheus integration by filippofinke · Pull Request #172 · eduardolat/pgbackweb

filippofinke · 2026-04-15T12:48:59Z

This pull request adds native Prometheus monitoring support to PG Back Web. It introduces a configurable /metrics endpoint exposing detailed backup and health metrics, updates the documentation and configuration, and integrates metric collection into the backup execution workflow. The implementation is modular and includes automated tests for metrics exposure.

Prometheus Metrics Integration

Added a new metrics package (internal/metrics/metrics.go) defining Prometheus metrics for backups, health status, and resources, and registered them for collection.
Implemented a metrics provider (internal/metrics/provider/provider.go) that starts a metrics HTTP server (configurable via environment variables), periodically updates health/resource metrics, and exposes database connection pool stats.
Integrated metrics collection into the backup execution workflow: backup runs and status changes now update Prometheus counters and gauges. [1] [2] [3] [4]

Configuration and Documentation

Added new environment variables for enabling and configuring the Prometheus metrics server (PBW_ENABLE_METRICS, PBW_METRICS_LISTEN_HOST, PBW_METRICS_LISTEN_PORT) in .env.example and documented usage and available metrics in README.md. [1] [2] [3]
Updated Dockerfiles to expose the metrics port. [1] [2]

Dependency and Compatibility Updates

Updated Go version to 1.25.0 and added Prometheus client libraries and related dependencies in go.mod. [1] [2] [3]

Testing

Added unit tests for metric exposure to ensure the /metrics endpoint works as expected (internal/metrics/metrics_test.go).

see #166

Summary by CodeRabbit

New Features
- Native Prometheus metrics support for backups, health, tasks, and DB pool stats; metrics server can be enabled and configured via env vars
Documentation
- Added Prometheus monitoring guide with scrape example and metric descriptions
Bug Fixes / Improvements
- Metrics emitted during backup runs and health checks; metrics endpoint initialized at startup
Tests
- Added tests validating /metrics exposure
Chores
- Updated runtime toolchain/dependencies and Docker configs to expose metrics port

coderabbitai · 2026-04-15T13:25:27Z

📝 Walkthrough

Walkthrough

This PR adds Prometheus metrics support: new env vars to enable/configure metrics, a /metrics HTTP endpoint and metrics definitions, a metrics provider that collects and exposes metrics, and integration into backup execution and app startup.

Changes

Cohort / File(s)	Summary
Configuration & Environment `\.env.example`, `internal/config/env.go`	Added `PBW_ENABLE_METRICS`, `PBW_METRICS_LISTEN_HOST`, and `PBW_METRICS_LISTEN_PORT` to example and Env struct with defaults.
Documentation `README.md`	Added "Monitoring with Prometheus" section documenting how to enable metrics, listen host/port, sample `prometheus.yml`, and list of exposed metrics.
Container Images `docker/Dockerfile`, `docker/Dockerfile.dev`	Exposed port `2112` (metrics) alongside existing application ports.
Dependencies `go.mod`	Bumped Go toolchain, updated several direct and indirect dependencies, and added `github.com/prometheus/client_golang`.
Metrics Core `internal/metrics/metrics.go`, `internal/metrics/metrics_test.go`	Added exported Prometheus metric variables (counters/gauges) and tests verifying `/metrics` exposition.
Metrics Provider `internal/metrics/provider/provider.go`	New `InitMetrics` to conditionally register collectors, run a 1-minute background updater (queries services, updates health/resource/backup metrics), and start an HTTP server exposing `/metrics`.
Service Integration `internal/service/executions/run_execution.go`, `internal/service/executions/run_execution.sql`	Execution flow updated to increment/decrement running counters, record totals/duration/status metrics on completion; SQL now returns backup/database/destination names for metric labels.
Application Initialization `cmd/app/main.go`	Calls `provider.InitMetrics(env, db, servs)` during startup to wire metrics into the app.
Local Development Compose `docker-compose.yml`	Added compose file defining DB services and `pgbackweb` with metrics enabled and ports published for app and metrics.

Sequence Diagram

sequenceDiagram
    actor Main as Application Main
    participant Provider as Metrics Provider
    participant Services as Services Layer
    participant DB as Database
    participant HTTP as HTTP Server (/metrics)
    participant Prom as Prometheus Scraper

    Main->>Provider: InitMetrics(env, db, servs)
    
    alt Metrics Enabled
        Provider->>Prom: Register DB Stats Collector
        Provider->>HTTP: Start HTTP Server on PBW_METRICS_LISTEN_HOST:PORT

        Provider->>Provider: Start Background Goroutine (1min interval)
        loop Every 1 Minute
            Provider->>Services: Query databases, destinations, backups
            Services->>DB: run queries
            DB-->>Services: health & status data
            Services-->>Provider: health & status data
            Provider->>Provider: Update HealthStatus, ResourceCounts, BackupTasksStatus
        end

        rect rgba(100,150,200,0.5)
        Note over HTTP,Prom: Prometheus scrapes /metrics
        Prom->>HTTP: GET /metrics
        HTTP-->>Prom: Exposition response
        end
    else Metrics Disabled
        Provider-->>Main: Return (no-op)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Add Prometheus Metrics #166 — Implements Prometheus metrics, /metrics endpoint, metric registrations, and startup integration as described in the issue.

Poem

🐰
Metrics hum where backups run,
Counters, gauges — all in fun.
Port 2112 opens wide,
Prometheus comes to spy and guide.
Carrots of data, neatly supplied.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'add prometheus integration' directly and clearly summarizes the main change—adding Prometheus monitoring support. It accurately reflects the primary objective evident across all file changes (metrics definitions, provider initialization, configuration, and documentation).

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (2)

internal/metrics/metrics_test.go (1)
14-45: Prefer table-driven subtests (t.Run) for this _test.go file.

This test should follow the project’s test structure convention, even with a single case.

As per coding guidelines, "Write table-driven tests using t.Run with a slice of structs".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/metrics/metrics_test.go` around lines 14 - 45, Convert
TestMetricsExposure into a table-driven test that uses t.Run even for the single
case: create a slice of test structs (e.g., {name, labels/setup func, expected
[]string}) and iterate over them, calling t.Run(tc.name, func(t *testing.T) {
... }) to perform the metric updates (using BackupsRunning, BackupsTotal,
LastBackupDuration, LastBackupStatus, HealthStatus, HealthyResourcesCount,
TotalResourcesCount, BackupTasksStatus), invoke promhttp.Handler().ServeHTTP on
an httptest request, assert the response contains the expected strings, and
perform cleanup (e.g., BackupsRunning.Dec()) inside the subtest; update
TestMetricsExposure to reference the test case fields so the test structure
follows the project's table-driven convention.
internal/metrics/provider/provider.go (1)
29-96: Metrics collector loop requires cancellation context and operation timeouts.

The infinite loop runs with context.Background() and time.Sleep(1 * time.Minute) only. Service calls (GetAllDatabases, GetAllDestinations, GetAllBackups) have no deadline, causing potential goroutine hangs during shutdown or when dependencies stall. Use a context with cancellation and apply per-operation timeouts to all service calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/metrics/provider/provider.go` around lines 29 - 96, The goroutine
uses context.Background() and time.Sleep, causing un-cancelable long-running
operations; change it to accept/derive a cancellable parent context and replace
time.Sleep with a time.Ticker + select so the loop can stop on ctx.Done(); for
each service call (servs.DatabasesService.GetAllDatabases,
servs.DestinationsService.GetAllDestinations,
servs.BackupsService.GetAllBackups) wrap the call with a short per-operation
timeout via context.WithTimeout (and defer cancel) to avoid hangs; ensure you
check ctx.Done() between operations and cancel per-call contexts to prevent
leaks.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.env.example:
- Around line 24-30: The new .env.example entries PBW_ENABLE_METRICS,
PBW_METRICS_LISTEN_HOST, and PBW_METRICS_LISTEN_PORT currently use surrounding
quotes which triggers dotenv-linter's QuoteCharacter rule; edit the .env.example
to remove the surrounding double quotes so the variables are unquoted (e.g.,
PBW_ENABLE_METRICS=false, PBW_METRICS_LISTEN_HOST=0.0.0.0,
PBW_METRICS_LISTEN_PORT=2112) to satisfy the linter.

In `@cmd/app/main.go`:
- Line 49: The call to provider.InitMetrics in main currently swallows startup
failures; change the InitMetrics signature in
internal/metrics/provider/provider.go to return an error for startup-time
failures (e.g., bind/listen errors) and update its internals to return those
errors instead of only logging, then update main to check the returned error
from provider.InitMetrics and call logger.FatalError() (or equivalent fatal
logger) to fail fast when metrics are explicitly enabled; reference
provider.InitMetrics and logger.FatalError() when making these changes.

In `@internal/metrics/metrics_test.go`:
- Around line 16-45: Test mutates global metrics (BackupsRunning, BackupsTotal,
LastBackupDuration, LastBackupStatus, HealthStatus, HealthyResourcesCount,
TotalResourcesCount, BackupTasksStatus) with fixed labels causing flakiness; fix
by making labels unique per test (e.g., append t.Name() or a random suffix) or
instantiate a fresh Prometheus registry for the test and register local metric
instances, and ensure cleanup by calling metric.DeleteLabelValues(...) for every
label combination used or registering t.Cleanup to remove them (or use
registry.Unregister/reset) so no state is left between runs.

In `@internal/metrics/provider/provider.go`:
- Around line 105-108: The metrics startup currently logs an unstructured info
string and treats ListenAndServe errors as non-fatal; update the startup to use
structured logging and fail fast: replace the logger.Info("metrics server
started...") with a structured logger.Info call including logger.KV{"address":
address}, and when http.ListenAndServe(address, mux) returns an error call
logger.FatalError(...) with structured context logger.KV{"address": address,
"error": err} (replace the current logger.Error usage). Locate the relevant
calls to logger.Info and http.ListenAndServe in provider.go to apply these
changes.

In `@internal/service/executions/run_execution.go`:
- Around line 60-95: The updateExec closure currently triggers webhooks, metrics
and logs (s.webhooksService.RunExecutionSuccess/RunExecutionFailed,
metrics.BackupsTotal, metrics.LastBackupDuration, metrics.LastBackupStatus,
logger.Info/Error) before calling s.dbgen.ExecutionsServiceUpdateExecution;
change the order so you call s.dbgen.ExecutionsServiceUpdateExecution(ctx,
params) first and return any error immediately, and only emit the
webhooks/metrics/logs after the DB update succeeds (use the same params and
context), ensuring side effects run only when persistence succeeds.

In `@README.md`:
- Around line 169-177: The metrics list in README.md is missing three exported
metrics; update the "Metrics Exposed" section to include
`pgbackweb_healthy_resources_count`, `pgbackweb_total_resources_count`, and
`pgbackweb_backup_tasks_status` with short descriptions (healthy resources
count, total resources count, and per-task backup status respectively) so the
docs match what the app emits and operators aren't misled.

---

Nitpick comments:
In `@internal/metrics/metrics_test.go`:
- Around line 14-45: Convert TestMetricsExposure into a table-driven test that
uses t.Run even for the single case: create a slice of test structs (e.g.,
{name, labels/setup func, expected []string}) and iterate over them, calling
t.Run(tc.name, func(t *testing.T) { ... }) to perform the metric updates (using
BackupsRunning, BackupsTotal, LastBackupDuration, LastBackupStatus,
HealthStatus, HealthyResourcesCount, TotalResourcesCount, BackupTasksStatus),
invoke promhttp.Handler().ServeHTTP on an httptest request, assert the response
contains the expected strings, and perform cleanup (e.g., BackupsRunning.Dec())
inside the subtest; update TestMetricsExposure to reference the test case fields
so the test structure follows the project's table-driven convention.

In `@internal/metrics/provider/provider.go`:
- Around line 29-96: The goroutine uses context.Background() and time.Sleep,
causing un-cancelable long-running operations; change it to accept/derive a
cancellable parent context and replace time.Sleep with a time.Ticker + select so
the loop can stop on ctx.Done(); for each service call
(servs.DatabasesService.GetAllDatabases,
servs.DestinationsService.GetAllDestinations,
servs.BackupsService.GetAllBackups) wrap the call with a short per-operation
timeout via context.WithTimeout (and defer cancel) to avoid hangs; ensure you
check ctx.Done() between operations and cancel per-call contexts to prevent
leaks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 019c958b-60e1-468b-b920-a4d3c2d0372c

📥 Commits

Reviewing files that changed from the base of the PR and between 70e0d9e and 5bed7ac.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (12)

.env.example
README.md
cmd/app/main.go
docker/Dockerfile
docker/Dockerfile.dev
go.mod
internal/config/env.go
internal/metrics/metrics.go
internal/metrics/metrics_test.go
internal/metrics/provider/provider.go
internal/service/executions/run_execution.go
internal/service/executions/run_execution.sql

coderabbitai · 2026-04-15T13:34:45Z

+### Metrics Exposed
+
+- `pgbackweb_backups_total`: Total number of backups executed (success/failed).
+- `pgbackweb_backups_running`: Number of currently active backup operations.
+- `pgbackweb_last_backup_duration_seconds`: Duration of the most recent backup in seconds.
+- `pgbackweb_last_backup_status`: Status of the last backup (1 for success, 0 for failure).
+- `pgbackweb_health_status`: Health status of databases and destinations (1 for healthy, 0 for unhealthy).
+- `pgbackweb_db_stats_*`: Standard database connection pool statistics.
+


⚠️ Potential issue | 🟡 Minor

Metrics list is incomplete compared to what the app exports.

The “Metrics Exposed” list omits three emitted metrics (pgbackweb_healthy_resources_count, pgbackweb_total_resources_count, pgbackweb_backup_tasks_status), which can mislead operators.

📝 Proposed doc addition

- `pgbackweb_health_status`: Health status of databases and destinations (1 for healthy, 0 for unhealthy). +- `pgbackweb_healthy_resources_count`: Number of healthy databases/destinations. +- `pgbackweb_total_resources_count`: Total number of databases/destinations. +- `pgbackweb_backup_tasks_status`: Number of active/inactive backup tasks. - `pgbackweb_db_stats_*`: Standard database connection pool statistics.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

### Metrics Exposed

- `pgbackweb_backups_total`: Total number of backups executed (success/failed).

- `pgbackweb_backups_running`: Number of currently active backup operations.

- `pgbackweb_last_backup_duration_seconds`: Duration of the most recent backup in seconds.

- `pgbackweb_last_backup_status`: Status of the last backup (1 for success, 0 for failure).

- `pgbackweb_health_status`: Health status of databases and destinations (1 for healthy, 0 for unhealthy).

- `pgbackweb_db_stats_*`: Standard database connection pool statistics.

### Metrics Exposed

- `pgbackweb_backups_total`: Total number of backups executed (success/failed).

- `pgbackweb_backups_running`: Number of currently active backup operations.

- `pgbackweb_last_backup_duration_seconds`: Duration of the most recent backup in seconds.

- `pgbackweb_last_backup_status`: Status of the last backup (1 for success, 0 for failure).

- `pgbackweb_health_status`: Health status of databases and destinations (1 for healthy, 0 for unhealthy).

- `pgbackweb_healthy_resources_count`: Number of healthy databases/destinations.

- `pgbackweb_total_resources_count`: Total number of databases/destinations.

- `pgbackweb_backup_tasks_status`: Number of active/inactive backup tasks.

- `pgbackweb_db_stats_*`: Standard database connection pool statistics.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 169 - 177, The metrics list in README.md is missing three exported metrics; update the "Metrics Exposed" section to include `pgbackweb_healthy_resources_count`, `pgbackweb_total_resources_count`, and `pgbackweb_backup_tasks_status` with short descriptions (healthy resources count, total resources count, and per-task backup status respectively) so the docs match what the app emits and operators aren't misled.

coderabbitai

🧹 Nitpick comments (4)

docker-compose.yml (3)

40-48: Development-only configuration: hardcoded credentials and encryption key.

The hardcoded encryption key and embedded database credentials are appropriate for local development but must not be used in production. Consider adding a comment to make this clear to future maintainers.

📝 Proposed comment addition

   pgbackweb:
     build:
       context: .
       dockerfile: docker/Dockerfile
     container_name: pgbackweb_app
+    # WARNING: This configuration is for LOCAL DEVELOPMENT ONLY.
+    # Do not use these hardcoded credentials or encryption keys in production.
     environment:
       PBW_ENCRYPTION_KEY: "test_encryption_key_32_chars_long!"
       PBW_POSTGRES_CONN_STRING: "postgresql://pgbackweb:pgbackweb_password@postgres-pgbackweb:5432/pgbackweb?sslmode=disable"

Note: The static analysis warning (CKV_SECRET_4) about credentials in the connection string is a false positive for development docker-compose files.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 40 - 48, Add a clear development-only
comment at the top of the environment block in docker-compose.yml indicating the
hardcoded PBW_ENCRYPTION_KEY and PBW_POSTGRES_CONN_STRING are for local dev only
and must not be used in production; reference the exact environment variables
PBW_ENCRYPTION_KEY, PBW_POSTGRES_CONN_STRING, PBW_LISTEN_HOST, PBW_LISTEN_PORT,
PBW_PATH_PREFIX, PBW_ENABLE_METRICS, PBW_METRICS_LISTEN_HOST, and
PBW_METRICS_LISTEN_PORT so maintainers know which values are dev-only, and
mention that production should use secrets/secret manager or CI-injected
environment variables (and note the static analysis CKV_SECRET_4 warning is a
false positive for this dev compose file).

5-17: Consider adding a volume for database persistence.

The postgres-pgbackweb service has no volume mount, so all database data will be lost when the container is removed or recreated. For development continuity, consider adding a named volume.

💾 Proposed addition for data persistence

   postgres-pgbackweb:
     image: postgres:18
     container_name: pgbackweb_postgres_main
     environment:
       POSTGRES_USER: pgbackweb
       POSTGRES_PASSWORD: pgbackweb_password
       POSTGRES_DB: pgbackweb
     ports:
       - "5432:5432"
+    volumes:
+      - pgbackweb_data:/var/lib/postgresql/data
     command:
       - "postgres"
       - "-c"
       - "log_statement=all"

Then add at the end of the file:

volumes:
  pgbackweb_data:
  test_data:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 5 - 17, The postgres-pgbackweb service has
no persistent volume so its data is ephemeral; add a named volume and mount it
into the container by updating the postgres-pgbackweb service to include a
volumes entry that maps a named volume (e.g., pgbackweb_data) to the Postgres
data directory (/var/lib/postgresql/data) and then declare that named volume
under the top-level volumes section (e.g., pgbackweb_data:). Reference the
service name "postgres-pgbackweb" and the suggested volume name "pgbackweb_data"
when making the changes.

35-54: Consider adding a health check for the application service.

Adding a health check would allow docker-compose to monitor the application's availability and could help with service orchestration.

🏥 Proposed health check addition

   pgbackweb:
     build:
       context: .
       dockerfile: docker/Dockerfile
     container_name: pgbackweb_app
     environment:
       PBW_ENCRYPTION_KEY: "test_encryption_key_32_chars_long!"
       PBW_POSTGRES_CONN_STRING: "postgresql://pgbackweb:pgbackweb_password@postgres-pgbackweb:5432/pgbackweb?sslmode=disable"
       PBW_LISTEN_HOST: "0.0.0.0"
       PBW_LISTEN_PORT: "8085"
       PBW_PATH_PREFIX: ""
       PBW_ENABLE_METRICS: "true"
       PBW_METRICS_LISTEN_HOST: "0.0.0.0"
       PBW_METRICS_LISTEN_PORT: "2112"
     ports:
       - "8085:8085"
       - "2112:2112"
     depends_on:
       - postgres-pgbackweb
+    healthcheck:
+      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:8085/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
     restart: unless-stopped

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 35 - 54, Add a Docker healthcheck for the
pgbackweb service (pgbackweb / container_name pgbackweb_app) so docker-compose
can monitor app availability; update the pgbackweb service block to include a
healthcheck stanza that periodically runs a lightweight HTTP probe against the
app listen host/port (e.g., curl or wget against http://localhost:8085/health or
another ready endpoint), and set sensible interval, timeout, retries and
start_period values so health transitions are reliable.

internal/service/executions/run_execution.go (1)

60-100: Past review concern addressed - DB persistence now happens before side effects.

The ordering is now correct: the database update (lines 62-65) is performed first, and webhooks/metrics/logs are only emitted after successful persistence.

Minor optimization: compute duration once per branch.

time.Since(startTime).Seconds() is called twice in each branch (for metrics and logging). Consider computing it once for consistency and minor efficiency.

♻️ Optional refactor to compute duration once

 		// Only emit side effects if database update succeeded
+		durationSec := time.Since(startTime).Seconds()
 		if params.Status.String == "success" {
 			s.webhooksService.RunExecutionSuccess(backupID)
 			metrics.BackupsTotal.WithLabelValues("success", back.BackupName, back.DatabaseName).Inc()
-			metrics.LastBackupDuration.WithLabelValues(back.BackupName, back.DatabaseName).Set(time.Since(startTime).Seconds())
+			metrics.LastBackupDuration.WithLabelValues(back.BackupName, back.DatabaseName).Set(durationSec)
 			metrics.LastBackupStatus.WithLabelValues(back.BackupName, back.DatabaseName).Set(1.0)
 			logger.Info("execution completed successfully", logger.KV{
 				"backup_id":     backupID.String(),
 				"execution_id":  ex.ID.String(),
 				"backup_name":   back.BackupName,
 				"database_name": back.DatabaseName,
-				"duration_sec":  time.Since(startTime).Seconds(),
+				"duration_sec":  durationSec,
 				"file_size":     params.FileSize.Int64,
 				"path":          params.Path.String,
 			})
 		}
 
 		if params.Status.String == "failed" {
 			s.webhooksService.RunExecutionFailed(backupID)
 			metrics.BackupsTotal.WithLabelValues("failed", back.BackupName, back.DatabaseName).Inc()
-			metrics.LastBackupDuration.WithLabelValues(back.BackupName, back.DatabaseName).Set(time.Since(startTime).Seconds())
+			metrics.LastBackupDuration.WithLabelValues(back.BackupName, back.DatabaseName).Set(durationSec)
 			metrics.LastBackupStatus.WithLabelValues(back.BackupName, back.DatabaseName).Set(0.0)
 			logger.Error("execution failed", logger.KV{
 				"backup_id":     backupID.String(),
 				"execution_id":  ex.ID.String(),
 				"backup_name":   back.BackupName,
 				"database_name": back.DatabaseName,
 				"error_message": params.Message.String,
-				"duration_sec":  time.Since(startTime).Seconds(),
+				"duration_sec":  durationSec,
 			})
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/service/executions/run_execution.go` around lines 60 - 100, In
updateExec (the closure used to update execution state) compute the elapsed
duration once inside each branch instead of calling
time.Since(startTime).Seconds() multiple times: declare a local variable (e.g.,
duration := time.Since(startTime).Seconds()) at the start of the "if
params.Status.String == \"success\"" block and at the start of the "if
params.Status.String == \"failed\"" block, then use duration for
metrics.LastBackupDuration.Set(...) and the logger fields to avoid duplicate
calls and ensure consistency.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docker-compose.yml`:
- Around line 40-48: Add a clear development-only comment at the top of the
environment block in docker-compose.yml indicating the hardcoded
PBW_ENCRYPTION_KEY and PBW_POSTGRES_CONN_STRING are for local dev only and must
not be used in production; reference the exact environment variables
PBW_ENCRYPTION_KEY, PBW_POSTGRES_CONN_STRING, PBW_LISTEN_HOST, PBW_LISTEN_PORT,
PBW_PATH_PREFIX, PBW_ENABLE_METRICS, PBW_METRICS_LISTEN_HOST, and
PBW_METRICS_LISTEN_PORT so maintainers know which values are dev-only, and
mention that production should use secrets/secret manager or CI-injected
environment variables (and note the static analysis CKV_SECRET_4 warning is a
false positive for this dev compose file).
- Around line 5-17: The postgres-pgbackweb service has no persistent volume so
its data is ephemeral; add a named volume and mount it into the container by
updating the postgres-pgbackweb service to include a volumes entry that maps a
named volume (e.g., pgbackweb_data) to the Postgres data directory
(/var/lib/postgresql/data) and then declare that named volume under the
top-level volumes section (e.g., pgbackweb_data:). Reference the service name
"postgres-pgbackweb" and the suggested volume name "pgbackweb_data" when making
the changes.
- Around line 35-54: Add a Docker healthcheck for the pgbackweb service
(pgbackweb / container_name pgbackweb_app) so docker-compose can monitor app
availability; update the pgbackweb service block to include a healthcheck stanza
that periodically runs a lightweight HTTP probe against the app listen host/port
(e.g., curl or wget against http://localhost:8085/health or another ready
endpoint), and set sensible interval, timeout, retries and start_period values
so health transitions are reliable.

In `@internal/service/executions/run_execution.go`:
- Around line 60-100: In updateExec (the closure used to update execution state)
compute the elapsed duration once inside each branch instead of calling
time.Since(startTime).Seconds() multiple times: declare a local variable (e.g.,
duration := time.Since(startTime).Seconds()) at the start of the "if
params.Status.String == \"success\"" block and at the start of the "if
params.Status.String == \"failed\"" block, then use duration for
metrics.LastBackupDuration.Set(...) and the logger fields to avoid duplicate
calls and ensure consistency.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1741ba75-1a1c-45de-a3db-770b2d4c3e9c

📥 Commits

Reviewing files that changed from the base of the PR and between 5bed7ac and 5e3f76d.

📒 Files selected for processing (7)

.env.example
README.md
cmd/app/main.go
docker-compose.yml
internal/metrics/metrics_test.go
internal/metrics/provider/provider.go
internal/service/executions/run_execution.go

✅ Files skipped from review due to trivial changes (2)

.env.example
README.md

🚧 Files skipped from review as they are similar to previous changes (3)

cmd/app/main.go
internal/metrics/metrics_test.go
internal/metrics/provider/provider.go

feat: add prometheus env variables

05a131e

filippofinke changed the title ~~add prometheus integration~~ add prometheus integration #166 Apr 15, 2026

filippofinke changed the title ~~add prometheus integration #166~~ add prometheus integration Apr 15, 2026

filippofinke added 4 commits April 15, 2026 14:52

docs: update README.md

ce06e5f

feat: add 2112 metrics port for prometheus

5de8ec8

feat: load prometheus env variables

d2a4036

feat: integrate Prometheus metrics for backup execution monitoring

5bed7ac

filippofinke marked this pull request as ready for review April 15, 2026 13:25

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

feat: enhance Prometheus metrics support and update configuration

5e3f76d

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

feat: remove docker-compose.yml for streamlined deployment

2a2c90b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add prometheus integration #172

add prometheus integration #172
filippofinke wants to merge 7 commits into
eduardolat:mainfrom
filippofinke:feature/prometheus

filippofinke commented Apr 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 15, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 15, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

filippofinke commented Apr 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

filippofinke commented Apr 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 15, 2026 •

edited

Loading