Summary
Running github-actions-cache-server@9.4.7 against a self-hosted MinIO S3 backend, we observe that blobs in the S3 bucket are not deleted when the corresponding cache_entries SQLite row is removed (e.g., key rotation, branch deletion, retention expiry). The bucket grows monotonically until it hits XMinioStorageFull, at which point the cache silently stops accepting writes and CI jobs hang on actions/cache@v4.
Environment
- github-actions-cache-server: 9.4.7 (HelmRelease gha-cache-server v7, chart github-actions-cache-server@1.0.3)
- Self-hosted MinIO backend in same k8s namespace (github-runner), bucket gha-cache
- CACHE_CLEANUP_OLDER_THAN_DAYS=5 (env var set on the deployment)
- ENABLE_DIRECT_DOWNLOADS=true
- ARC runner-set workload: salespath repos, mixed Go + Node/pnpm test workflows
- Runner cache keys: setup-go-Linux-x64-... (~462 MB each), node-cache-Linux-x64-pnpm-... (~varies)
Evidence — orphan accumulation over 6h post-purge
After a full manual purge per the recovery procedure (scale cache-server to 0, wipe gh-actions-cache/, wipe cache-server.db*, scale back up), the bucket regained 5 top-level entry directories within 6 hours, while the cache_entries table contained only 2 rows during the same window. That is 2.5 bucket entries per DB row: 3 of the 5 entries are orphans.
| Time | DB rows | MinIO entry dirs | Orphan blobs | Bucket size |
|---|---|---|---|---|
| Pre-purge | 29 (all >5d) | 55 | ~26 (~9.6 GB) | 19 GB / 100% (XMinioStorageFull) |
| T+0 (post-purge) | 0 | 0 | 0 | 1.1 GB (.minio.sys only) |
| T+6h | 2 (both today) | 5 | 3 (~900 MB) | 6.4 GB / 33% |
gha-cache-server logs over the 6h window: zero cleanup / delete / expire / purge activity logged. Either the cleanup tick is silent on success, or it isn't running.
At this orphan-creation rate (~12 orphans/day, ~4.5 GB/day of orphan growth), the 20 Gi PVC refills in ~5 days post-purge. We've experienced the resulting XMinioStorageFull outage twice in three weeks.
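As a sanity check on the refill estimate, the T+6h measurement can be extrapolated linearly. This is only a rough sketch: it assumes orphan growth stays constant, counts orphan growth only, and treats the pre-purge 19 GB reading as the usable capacity. On those assumptions the orphan count comes out at 12/day and the volume at about 3.6 GB/day, a little below the ~4.5 GB/day quoted above, while the refill time still lands at roughly 5 days.

```python
# Figures taken from the measurements reported in this issue.
WINDOW_HOURS = 6
ORPHANS_IN_WINDOW = 3
ORPHAN_GB_IN_WINDOW = 0.9   # ~900 MB of orphan blobs at T+6h
BASELINE_GB = 1.1           # .minio.sys only, right after the purge
CAPACITY_GB = 19.0          # assumption: the bucket reported 100% at 19 GB

windows_per_day = 24 / WINDOW_HOURS
orphans_per_day = ORPHANS_IN_WINDOW * windows_per_day
orphan_gb_per_day = ORPHAN_GB_IN_WINDOW * windows_per_day
days_to_full = (CAPACITY_GB - BASELINE_GB) / orphan_gb_per_day

print(
    f"{orphans_per_day:.0f} orphans/day, "
    f"{orphan_gb_per_day:.1f} GB/day, "
    f"full in ~{days_to_full:.1f} days"
)
```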
Expected behavior
When the cache-server cleanup tick runs:
- Identify cache_entries rows past CACHE_CLEANUP_OLDER_THAN_DAYS retention → DELETE the rows
- Also identify S3 objects whose corresponding cache_entries row no longer exists → DELETE the objects
- Log the cleanup activity (counts + bytes reclaimed) for observability
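The three steps above can be sketched as a pure planning function that decides what a tick should delete before anything touches the database or the bucket. This is a minimal illustration, not the project's actual code: `CacheEntry`, `CleanupPlan`, `plan_cleanup`, and the `object_key` / `updated_at` fields are hypothetical names, and the real cache_entries schema and bucket layout may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical stand-in for a cache_entries row."""
    object_key: str        # assumed: name of the matching bucket entry
    updated_at: datetime   # assumed: timestamp that retention is measured from


@dataclass(frozen=True)
class CleanupPlan:
    expired_rows: set      # rows past retention, to DELETE from the DB
    expired_objects: set   # bucket entries that belong to those rows
    orphan_objects: set    # bucket entries with no DB row at all


def plan_cleanup(rows, bucket_keys, now, retention_days):
    """Decide what one cleanup tick should delete, without touching storage."""
    cutoff = now - timedelta(days=retention_days)
    expired = {r.object_key for r in rows if r.updated_at < cutoff}
    known = {r.object_key for r in rows}
    present = set(bucket_keys)
    return CleanupPlan(
        expired_rows=expired,
        expired_objects=expired & present,
        orphan_objects=present - known,
    )


if __name__ == "__main__":
    # Illustrative data only; keys and dates are made up.
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    rows = [
        CacheEntry("setup-go-a", now - timedelta(days=6)),    # past 5d retention
        CacheEntry("node-cache-b", now - timedelta(hours=2)),
    ]
    bucket = ["setup-go-a", "node-cache-b", "setup-go-old-1", "setup-go-old-2"]
    plan = plan_cleanup(rows, bucket, now, retention_days=5)
    # Log counts so the tick is observable even when it deletes nothing.
    print(
        f"cleanup: rows={len(plan.expired_rows)} "
        f"objects={len(plan.expired_objects)} "
        f"orphans={len(plan.orphan_objects)}"
    )
```

Keeping the decision separate from the deletes makes the tick easy to log and to run in a dry-run mode. A real sweep would probably also need a grace period for in-flight uploads, where the object may exist in the bucket before its row is committed.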
Actual behavior
Either the cleanup tick is not firing at all, OR it deletes DB rows without also deleting the S3 objects. Result: monotonic bucket growth until full.
Workaround we're deploying
A daily Kubernetes CronJob that runs mc rm --recursive --force --older-than 7d against the bucket. 7d gives a 2d safety margin over the documented 5d retention. This is a band-aid; the cleanup tick should handle this internally.
Ask
- Is this a known issue? If so, is there a target version for the fix?
- Is there a config option to enable verbose cleanup-tick logging so we can confirm whether it fires?
- Would a PR to add explicit orphan-blob sweeping to the cleanup tick be welcome?