[FEATURE] Availability Group Health Monitoring #991

@mark-bentham

Description

Which component(s) does this affect?

  • Full Dashboard
  • Lite
  • SQL collection scripts
  • Installer
  • Documentation

Problem Statement

I'd like to monitor the health of availability group replicas and be alerted when the secondaries fall significantly behind the primary, so that if we need to use a replica we know it is not significantly out of date and we can fail over safely.

Proposed Solution

Add a collector that queries the availability group DMVs for stats such as redo_queue_size, redo_rate, log_send_queue_size, log_send_rate, and synchronization_state, plus visualisation in the dashboard and alerting when the secondaries fall significantly behind the primary.
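To make the request more concrete, here is a rough sketch of how the alerting side could evaluate the collected values. It is only an illustration, not a proposal for the project's actual collector design. It assumes the stats come from sys.dm_hadr_database_replica_states joined to sys.availability_replicas, where the queue sizes are in KB and the rates in KB/sec. The query text, the ReplicaState class, the drain_seconds and evaluate functions, and both threshold defaults are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# T-SQL a collector might run on the primary (assumption: shown for context
# only, it is not executed by this sketch).
REPLICA_STATE_QUERY = """
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.log_send_queue_size,  -- KB
       drs.log_send_rate,        -- KB/sec
       drs.redo_queue_size,      -- KB
       drs.redo_rate             -- KB/sec
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id
WHERE drs.is_primary_replica = 0;
"""


@dataclass
class ReplicaState:
    """One collected row for a secondary database replica (hypothetical shape)."""
    replica: str
    database: str
    synchronization_state: str
    log_send_queue_kb: Optional[int]
    log_send_rate_kb_s: Optional[int]
    redo_queue_kb: Optional[int]
    redo_rate_kb_s: Optional[int]


def drain_seconds(queue_kb: Optional[int], rate_kb_s: Optional[int]) -> Optional[float]:
    """Estimate the seconds needed to drain a queue at the current rate.

    Returns 0.0 for an empty or missing queue, and None when the queue is
    non-empty but the rate is zero or missing (nothing is draining).
    """
    if not queue_kb:
        return 0.0
    if not rate_kb_s:
        return None
    return queue_kb / rate_kb_s


def evaluate(state: ReplicaState,
             max_send_queue_kb: int = 102_400,      # hypothetical: 100 MB unsent log
             max_redo_seconds: float = 300.0) -> List[str]:  # hypothetical: 5 minutes
    """Return alert messages for one replica row; an empty list means healthy."""
    alerts: List[str] = []
    name = f"{state.replica}/{state.database}"

    # SYNCHRONIZING is the normal state for an asynchronous-commit secondary.
    if state.synchronization_state not in ("SYNCHRONIZED", "SYNCHRONIZING"):
        alerts.append(f"{name}: state is {state.synchronization_state}")

    # Log not yet sent to the secondary is the potential data loss on failover.
    if (state.log_send_queue_kb or 0) > max_send_queue_kb:
        alerts.append(f"{name}: log send queue is {state.log_send_queue_kb} KB")

    # Log received but not yet redone delays how quickly the secondary catches up.
    redo = drain_seconds(state.redo_queue_kb, state.redo_rate_kb_s)
    if redo is None:
        alerts.append(f"{name}: redo queue is {state.redo_queue_kb} KB and redo is stalled")
    elif redo > max_redo_seconds:
        alerts.append(f"{name}: estimated redo catch-up is {redo:.0f} s")

    return alerts
```

For example, evaluate(ReplicaState("SQL3", "Sales", "SYNCHRONIZING", 204800, 1000, 600000, 1000)) returns two alerts with the default thresholds: one for the send queue and one for the estimated redo catch-up time. Real thresholds would need to be configurable per availability group, since acceptable lag depends on the workload and the commit mode.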

Use Case

Useful for anyone using Always On availability groups for HA. It lets them monitor the health of the availability group and identify and resolve issues, such as the secondaries falling behind during periods of heavy usage.

Alternatives Considered

No response

Additional Context

No response

Metadata

Assignees

Labels

enhancement (New feature or request), help wanted (Extra attention is needed)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
