Skip to content

docs(valkey): sync failover hardening options#365

Open
mberlofa wants to merge 1 commit into
mainfrom
docs/valkey-failover-hardening-sync
Open

docs(valkey): sync failover hardening options#365
mberlofa wants to merge 1 commit into
mainfrom
docs/valkey-failover-hardening-sync

Conversation

@mberlofa

@mberlofa mberlofa commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Summary

  • document Valkey Sentinel graceful failover and startup failover guard defaults
  • add Valkey to the playground allowlist
  • expose Sentinel pod policy and failover guard controls in the playground

Validation

  • make site-sync-check CHART=valkey
  • npm run lint
  • npm run format:check
  • npm run build
  • make release-check REPO=site
  • make attribution-check REPO=site

Chart PRs: helmforgedev/charts#664, helmforgedev/charts#687
Issue: helmforgedev/charts#633

@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds Sentinel configuration fields (pod management policy, graceful failover, startup failover guard) to the Valkey playground config, registers valkey in the site's playground chart selector, and updates Valkey chart documentation to describe these new Sentinel settings.

Changes

Valkey Sentinel Configuration

Layer / File(s) Summary
Sentinel playground fields and chart registration
src/data/playground-configs.ts, src/pages/playground.astro
Adds Sentinel pod management policy, graceful failover, and startup failover guard fields to the Valkey Topology config, and registers valkey in siteSyncPlaygroundConfigs so it appears in the playground chart selector.
Sentinel documentation updates
src/pages/docs/charts/valkey.mdx
Adds a paragraph describing default Sentinel failover hardening behavior and new rows in the Values table for the Sentinel configuration defaults.

Estimated code review effort: 1 (Trivial) | ~5 minutes

Possibly related PRs

  • helmforgedev/site#330: Both PRs register a new chart entry in siteSyncPlaygroundConfigs within src/pages/playground.astro to include it in the playground selector.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly matches the main change: Valkey failover hardening options and related docs/playground sync.
Description check ✅ Passed The description includes a clear summary, validation steps, and issue/PR references, with only minor template formatting differences.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/valkey-failover-hardening-sync

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/data/playground-configs.ts (1)

1664-1699: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Gate maxAttempts fields on their parent toggles.

Graceful Attempts and Startup Attempts render unconditionally, unlike similar dependent numeric fields elsewhere in this file (e.g. druid's historical.replicaCount uses enables: 'historical.enabled'). Consider linking these to their respective toggles for UI consistency.

♻️ Proposed fix
         {
           label: 'Graceful Attempts',
           key: 'sentinel.gracefulFailover.maxAttempts',
           type: 'number',
           default: '12',
+          enables: 'sentinel.gracefulFailover.enabled',
           description: 'preStop failover attempts',
         },
         {
           label: 'Startup Attempts',
           key: 'sentinel.startupFailoverGuard.maxAttempts',
           type: 'number',
           default: '20',
+          enables: 'sentinel.startupFailoverGuard.enabled',
           description: 'Startup failover attempts',
         },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/data/playground-configs.ts` around lines 1664 - 1699, The two numeric
config entries for Sentinel attempt limits are currently always shown, unlike
other dependent fields in this list. Update the `Graceful Attempts` and `Startup
Attempts` items in `playground-configs.ts` so their visibility is gated by their
parent toggle fields, using the same dependency pattern as other entries in this
file (for example the `enables`/toggle relationship used by
`historical.replicaCount`). Keep the existing
`sentinel.gracefulFailover.maxAttempts` and
`sentinel.startupFailoverGuard.maxAttempts` keys, but link each one to
`sentinel.gracefulFailover.enabled` and `sentinel.startupFailoverGuard.enabled`
respectively.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/data/playground-configs.ts`:
- Around line 1664-1699: The two numeric config entries for Sentinel attempt
limits are currently always shown, unlike other dependent fields in this list.
Update the `Graceful Attempts` and `Startup Attempts` items in
`playground-configs.ts` so their visibility is gated by their parent toggle
fields, using the same dependency pattern as other entries in this file (for
example the `enables`/toggle relationship used by `historical.replicaCount`).
Keep the existing `sentinel.gracefulFailover.maxAttempts` and
`sentinel.startupFailoverGuard.maxAttempts` keys, but link each one to
`sentinel.gracefulFailover.enabled` and `sentinel.startupFailoverGuard.enabled`
respectively.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 02f1b507-3dfb-4bb6-ae00-754c02f6d8f7

📥 Commits

Reviewing files that changed from the base of the PR and between 365449a and 404b55b.

📒 Files selected for processing (3)
  • src/data/playground-configs.ts
  • src/pages/docs/charts/valkey.mdx
  • src/pages/playground.astro

mberlofa pushed a commit to helmforgedev/charts that referenced this pull request Jul 4, 2026
…et handling (#664)

## Summary
- harden Valkey Sentinel failover safety with graceful master shutdown
failover and startup guard defaults
- stabilize node and Sentinel identity, add Sentinel probes,
bound/re-resolve startup waits, and move auth out of process argv
- document Sentinel residual data-loss conditions and add schema/tests
for the new failover hardening controls

## Validation
- helm lint charts/valkey --strict
- helm unittest charts/valkey
- make validate-chart CHART=valkey TIMEOUT=900: FULLY VALIDATED (21
layers)
- make site-sync-check CHART=valkey
- make release-check REPO=charts
- make attribution-check REPO=charts

Site PR: helmforgedev/site#365
Issue: #633

## Notes
- The branch was rebased on current main.
- The non-conventional revert commit subject was amended to
`fix(valkey): revert sentinel publish-not-ready bootstrap change`.
- No merge performed.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added configurable Sentinel pod management and failover safeguards for
smoother master transitions.
* Introduced updated startup, readiness, and liveness checks for
Sentinel-aware nodes.

* **Bug Fixes**
* Improved handling of restarts during failover windows to reduce
empty-master resyncs and data loss risk.
* Added graceful shutdown behavior to trigger failover before pod
termination.

* **Documentation**
* Clarified Sentinel persistence and recovery behavior, including
remaining edge-case data-loss scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant