🇮🇹 Italian documentation available: README-it.md
Before: raw Prometheus metrics, scattered kubectl logs, opaque index definitions, no clear picture of what's wrong. After: one dashboard, one health status, one list of actions.
mongot-doctor turns complex MongoDB Search cluster data into an instant diagnosis. It is designed for SREs, MongoDB operators, and platform engineers running MongoDB Search on Kubernetes.
- Detects stuck search nodes, indexing lag, OOMKilled events, and configuration drift
- Analyzes search query efficiency, scan ratios, and HNSW graph traversal in real time
- Alerts you before problems become outages — predictive oplog window, cardinality warnings, stall detection
- Built-in SRE Advisor runs 15 automated checks every collection cycle and ranks findings by severity
- Automatic Search Diagnosis interprets cluster health instantly — Health Summary, Warnings, Recommendations in one panel
- Log Intelligence parses mongot JSON logs automatically and detects errors, failures, and connection issues across configurable time windows
- Search Index Inspector analyzes every Search index definition — mapping quality, field count, dynamic mapping overuse, and index health — with actionable suggestions
- Status Report exports a full cluster snapshot in Text, Markdown, or JSON — shareable from the dashboard or from the CLI, ready for tickets, runbooks, and automation
No agents to install. No extra infrastructure. Just point it at your cluster and go.
Tip: get a full diagnostic in one command
python3 mongot_doctor.py --uri "mongodb://..." --namespace mongodb --report
Prints a complete cluster snapshot (pods, search metrics, JVM heap, Lucene merges, oplog window, SRE findings, and index health) straight to your terminal.
Add --format markdown or --format json to export to Confluence, GitHub Issues, or your alerting pipeline.
- ✨ Key Features
- 🚀 Installation & Setup
- 🔌 API Endpoints
- 🏗️ Project Structure
- 🧪 Running Tests
- 🔬 SRE Advisor — Deep Dive
- 🩻 Automatic Search Diagnosis
- 🪵 Log Intelligence
- 🔎 Search Index Inspector
- 📋 Status Report
- 🧠 SRE Advisor: 15 automated checks, severity-ranked (crit → warn → pass), served via /api/advisor (see deep dive below)
- 📡 Real-time Search QPS & Latency: delta-based computation across Prometheus scrape cycles, separate for $search and $vectorSearch
- 🎯 Search Efficiency (Scan Ratio): EMA-smoothed candidates_examined / results_returned, separate ratio for text and vector search, with cardinality detection
- 🧬 HNSW Visited Nodes: early warning for ANN CPU saturation before latency becomes visible
- ⏳ Index Build ETA: animated progress bar, docs/sec speed, stall detection, dynamic ETA
- 🔍 Robust Pod Discovery: 4-level hierarchy resilient to MCK upgrades and naming variations
- 🌊 Sync Pipeline Analyzer: real-time DB → Change Stream → RAM → Lucene pipeline visualization with bottleneck identification
- ⏱️ Predictive Oplog Window: warn at 40%, crit at 70% of the window consumed, to prevent a forced Initial Sync
- 🩺 Universal K8s Diagnostics: Helm releases, MCK/K8s versions, PVCs, OOMKilled events, live log streaming
- 📜 Log Management & Export: live terminal, download filtered by time window and severity
- ⚡ Background Collector & Rate Engine: daemon thread, < 1ms API response from in-memory cache, counter-reset safe
- 🔌 Stable Versioned API: /api/v1/search_metrics with fixed schema, safe for external consumers
- 🔒 Security: optional Basic Auth, CSP headers, K8s name input validation, configurable CORS
- 🩻 Automatic Search Diagnosis: real-time cluster health panel (Health Summary / Warnings / Recommendations); also available via /api/diagnose and the --diagnose CLI flag (exit 0/1/2 for CI pipelines)
- 🪵 Log Intelligence: on-demand mongot JSON log analysis with configurable time window (1h / 24h / 7d / 30d); detects errors, OOM, TLS/auth issues, connection failures, index failures, change stream problems
- 🔎 Search Index Inspector: inspects every Search index definition (dynamic mapping detection, field count analysis, BUILDING/FAILED status, over-indexed collections); available via /api/indexes/inspect and the --inspect-indexes CLI flag
- 📋 Status Report: full cluster snapshot in Text (ASCII), Markdown, or JSON; one-click download and copy from the dashboard; --report CLI flag for CI/automation pipelines
Prerequisites:
- kubectl configured and pointing to your cluster.
- A MongoDB connection string with read access on local (oplog) and your target collections.

Prometheus required: mongot-doctor reads mongot metrics via Prometheus. Prometheus is not enabled by default; you must explicitly configure it in your Kubernetes operator:
- MongoDB Enterprise Operator (MCK): enable the spec.prometheus section in your MongoDB resource (Enterprise guide)
- MongoDB Community Operator: enable the spec.prometheus section in your MongoDBCommunity resource (Community guide)
Use this mode for development, demos, or when you prefer running the monitor outside the cluster.
1. Clone and install
git clone https://github.com/Miccolomi/mongot-doctor.git
cd mongot-doctor
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
2. Start
python3 mongot_doctor.py \
--uri "mongodb://USER:PASSWORD@HOST:PORT/admin?replicaSet=RS&authSource=admin&authMechanism=SCRAM-SHA-256" \
--namespace mongodb \
  --port 5050
Open your browser at: http://localhost:5050
CLI options
| Parameter | Default | Description |
|---|---|---|
| --uri | (none) | MongoDB connection string |
| --namespace | all | Kubernetes namespace to monitor |
| --port | 5050 | HTTP port for the dashboard |
| --interval | 5 | Collection interval in seconds |
| --auth | (none) | Basic Auth, in the format user:password |
| --in-cluster | false | K8s auth via ServiceAccount (in-cluster only) |
| --host | 0.0.0.0 | Flask binding address |
| --allowed-origins | localhost | CORS allowed origins (space-separated) |
Use this mode for a permanent deployment inside the cluster. The monitor runs as a pod and uses a ServiceAccount with RBAC to access the Kubernetes API.
1. Build the Docker image
docker build -t mongot-doctor:latest .
For a private registry (Docker Hub, ECR, GCR):
docker build -t <your-registry>/mongot-doctor:1.0.0 .
docker push <your-registry>/mongot-doctor:1.0.0
Update the image: field in k8s/deployment.yaml accordingly.
⚠️ Important: after every code update, rebuild and restart the deployment:
docker build -t mongot-doctor:latest .
kubectl rollout restart deployment/mongot-doctor -n mongodb
2. Configure the MongoDB URI
The connection to mongod is required for oplog, index, and compliance checks. mongot is always discovered automatically via Kubernetes — no URI needed for it.
Edit k8s/secret.yaml based on where your mongod is running:
# Scenario A: mongod inside the cluster (MCK), use the internal Service DNS
kubectl get svc -n mongodb   # look for a ClusterIP on port 27017

# Scenario A: in-cluster (MCK)
stringData:
  MONGODB_URI: "mongodb://USER:PASSWORD@<rs-name>-svc.<namespace>.svc.cluster.local/admin?replicaSet=<RS>&tls=true&tlsAllowInvalidCertificates=true&authSource=admin&authMechanism=SCRAM-SHA-256"
  # Scenario B: Atlas (SRV)
  # MONGODB_URI: "mongodb+srv://USER:PASSWORD@cluster0.xxxxx.mongodb.net/admin?authSource=admin&authMechanism=SCRAM-SHA-256"
  # Scenario C: External replica set with DNS-resolvable hostnames
  # MONGODB_URI: "mongodb://USER:PASSWORD@host1:27017,host2:27017/admin?replicaSet=RS&tls=true&authSource=admin&authMechanism=SCRAM-SHA-256"

authMechanism=SCRAM-SHA-256 is required by MongoDB 7+ with MCK.
3. Apply manifests
kubectl apply -f k8s/rbac.yaml # ServiceAccount + ClusterRole
kubectl apply -f k8s/secret.yaml # MongoDB URI
kubectl apply -f k8s/deployment.yaml # Deployment
kubectl apply -f k8s/service.yaml     # NodePort

| File | Description |
|---|---|
| k8s/rbac.yaml | ServiceAccount + ClusterRole with minimal permissions (includes pods/proxy) |
| k8s/secret.yaml | MongoDB URI as a K8s Secret |
| k8s/deployment.yaml | Deployment with liveness and readiness probes on /healthz |
| k8s/service.yaml | NodePort Service to expose the dashboard |
Namespace: all manifests default to mongodb. Update namespace: in all 4 files if yours is different.
4. Access the dashboard
kubectl get svc mongot-doctor -n mongodb
# Example: 5050:31855/TCP → NodePort = 31855
- Docker Desktop: http://localhost:<NODE_PORT>
- Remote cluster (GKE, EKS, on-prem): http://<NODE_IP>:<NODE_PORT> (see kubectl get nodes -o wide)

On Docker Desktop with MCK, the internal DNS (<rs>-svc.mongodb.svc.cluster.local) is reachable directly from the pod. Do not use hostnames from the host's /etc/hosts; they are not resolvable from inside the cluster.
| Endpoint | Description |
|---|---|
| / | HTML Dashboard |
| /metrics | Full JSON snapshot (from cache) |
| /api/v1/search_metrics | Stable versioned API with a fixed schema for external consumers |
| /api/advisor | SRE findings in JSON (crit → warn → pass) |
| /healthz | Liveness probe: always returns 200 if Flask is running |
| /healthcheck | Detailed status (MongoDB ping, K8s API, cache age) |
| /api/logs/<ns>/<pod> | Last 50 lines of pod logs |
| /api/download_logs/<ns>/<pod> | Download logs (?time=1h&level=error) |
| /api/diagnose | Structured diagnosis: health, warnings, recommendations |
| /api/logs/analyze/<ns>/<pod> | Log Intelligence: pattern analysis (?window=1h\|24h\|7d\|30d) |
| /api/indexes/inspect | Search Index Inspector: mapping quality and health report |
| /api/report?format=text\|markdown\|json | Status Report: full cluster snapshot in the requested format |
mongot_doctor.py # App Factory + CLI entry point
background.py # BackgroundCollector (thin orchestrator, daemon thread)
advisor.py # SRE Advisor engine (15 checks, pure Python)
report.py # Status Report builder (Text / Markdown / JSON formatters)
security.py # Input validation, security headers, Basic Auth
state.py # Shared mutable state (clients, cache, lock)
engine/
rate_calculator.py # Delta/rate engine: QPS, latency, scan ratio EMA, HNSW, ETA
# Counter reset safety, spike guard, first-cycle protection
collectors/
kubernetes.py # K8s discovery (pods, CRDs, PVCs, services, helm)
mongodb.py # MongoDB collectors (vitals, oplog, indexes)
prometheus.py # Prometheus scraper with dual fallback
index_inspector.py # Search Index Inspector (mapping analysis, observation engine)
log_analyzer.py # Log Intelligence (JSON log parsing, 8 pattern matchers)
routes/
api.py # API Blueprint (/metrics, /api/v1/search_metrics, /api/advisor, /api/logs)
frontend.py # Frontend Blueprint (/, /favicon.ico)
frontend/
templates/
dashboard.html # Jinja2 template
static/
css/main.css
js/
utils.js # Utilities (formatBytes, pill, gaugeRing, …)
logs.js # Live log management
advisor.js # Advisor renderer + Log Intelligence
pipeline.js # Sync Pipeline Analyzer
index_inspector.js # Search Index Inspector panel
report.js # Status Report modal (tabs, copy, download)
render.js # Main renderer + polling
tests/
conftest.py
test_advisor.py # tests — every SRE check
test_background.py # tests — collector and cache
test_frontend.py # tests — dashboard, CSS, JS, API
test_security.py # tests — validation, headers, auth
source venv/bin/activate
python3 -m pytest tests/ -v
Every collection cycle runs a set of Python checks against the cluster and index state. Findings are sorted by severity (crit → warn → pass) and served via /api/advisor.
| # | Check | Thresholds |
|---|---|---|
| 1 | Disk Space (200% Rule) | warn if free < 200% of used; crit if disk ≥ 90% (mongot enters read-only) |
| 2 | Index Consolidation | warn if more than one index of the same type on the same collection (fullText + vectorSearch is valid: Hybrid Search) |
| 3 | I/O Bottleneck | crit if disk queue > 10 AND lag > 5s simultaneously |
| 4 | CPU & QPS | crit if CPU > 80%; warn if QPS > 10 × cores |
| 5 | Memory Starvation (Page Faults) | warn > 500/s; crit > 1000/s |
| 6 | OOMKilled & MMap Risk | crit if JVM heap ≥ 90% of pod limit or OOMKilled detected |
| 7 | CRD Operator Status | crit if CRD is not in Running phase |
| 8 | Storage Class Performance | warn if PVC uses standard, hostpath, or slow |
| 9 | Operator Versioning | warn if operator image uses :latest tag |
| 10 | Predictive Oplog Window | warn > 40% consumed; crit > 70% consumed — prevents forced Initial Sync |
| 11 | Search Auth | crit if skipAuthenticationToSearchIndexManagementServer=true — mongod↔mongot without authentication |
| 12 | Search TLS Mode | crit if searchTLSMode=disabled; warn if allowTLS/preferTLS; pass if requireTLS |
| 13 | Search Efficiency (Scan Ratio) | warn > 50:1; crit > 500:1; predictive warning if high ratio + low latency (cardinality problem) |
| 14 | Vector Search Efficiency | same thresholds as scan ratio but computed separately for $vectorSearch |
| 15 | HNSW Visited Nodes | warn > 1000 nodes/query; crit > 5000 — early warning for ANN CPU saturation |
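As an illustration of how one row of the table turns into a ranked finding, here is a minimal sketch of check 10 (Predictive Oplog Window) together with the crit → warn → pass ordering. The function names and the finding fields (`title`, `status`, `detail`) are assumptions made for this example, not the actual contents of advisor.py.

```python
from typing import Dict, List

# Lower number sorts first: crit, then warn, then pass.
SEVERITY_ORDER = {"crit": 0, "warn": 1, "pass": 2}


def check_oplog_window(used_pct: float) -> Dict[str, str]:
    """Hypothetical check: thresholds taken from row 10 of the table."""
    if used_pct > 70:
        status = "crit"
    elif used_pct > 40:
        status = "warn"
    else:
        status = "pass"
    return {
        "title": "Predictive Oplog Window",
        "status": status,
        "detail": f"{used_pct:.1f}% of the oplog window consumed",
    }


def rank_findings(findings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort findings so the most severe come first (stable for ties)."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f["status"], len(SEVERITY_ORDER)),
    )
```

Unknown status values sort last in this sketch, so a malformed finding can never push a real critical down the list.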
The 🔎 Search Commands panel shows throughput metrics computed as deltas between successive Prometheus scrape cycles:
- $search QPS and $vectorSearch QPS displayed prominently (requests/second)
- Average latency computed as Δlatency_sum / Δcount: actual per-query latency, not a peak
- Max latency: historical peak from the Prometheus counter
- Failure counters for $search and $vectorSearch
QPS values activate from the second collection cycle onward (a time delta is required).
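The arithmetic behind these two numbers can be sketched as follows. The snapshot keys (`count`, `latency_sum`) and the function name are assumptions for illustration; the real metric names come from the Prometheus scrape.

```python
from typing import Dict, Optional


def search_rates(
    prev: Dict[str, float], curr: Dict[str, float], elapsed_s: float
) -> Dict[str, Optional[float]]:
    """Compute QPS and average latency between two scrape snapshots.

    Each snapshot holds cumulative counters, so only the difference
    between two cycles is meaningful.
    """
    d_count = curr["count"] - prev["count"]
    d_sum = curr["latency_sum"] - prev["latency_sum"]
    qps = d_count / elapsed_s if elapsed_s > 0 else None
    # No queries in the interval means no average can be computed.
    avg = d_sum / d_count if d_count > 0 else None
    return {"qps": qps, "avg_latency_sec": avg}
```

With 50 new queries and 0.6 s of added latency over a 5-second interval, this yields 10 QPS and an average of about 12 ms per query.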
scan_ratio = candidates_examined / results_returned is the true indicator of search query efficiency. Latency alone is not enough: a 50ms query with 200k candidates examined will become a timeout as the dataset grows.
Two separate ratios are computed: one for $search (mongot_query_candidates_examined_total with fallback to mongot_query_documents_scanned) and one dedicated for $vectorSearch (mongot_vector_query_candidates_examined_total).
To avoid false positives under low traffic (e.g. 1 result / 500 candidates from a single query), the ratio is EMA-smoothed (α = 0.3) with a guard: if Δresults < 10 the EMA is not updated.
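A minimal sketch of that smoothing step, using the α = 0.3 and Δresults < 10 values stated above. The function name is hypothetical, and seeding the EMA with the first valid ratio is an assumption of this example.

```python
from typing import Optional

ALPHA = 0.3             # smoothing factor from the text
MIN_DELTA_RESULTS = 10  # below this, the sample is considered too noisy


def update_scan_ratio_ema(
    prev_ema: Optional[float], d_candidates: float, d_results: float
) -> Optional[float]:
    """Return the new EMA, or the previous one if the sample is skipped."""
    if d_results < MIN_DELTA_RESULTS:
        return prev_ema  # guard: low traffic, keep the EMA untouched
    ratio = d_candidates / d_results
    if prev_ema is None:
        return ratio  # assumption: the first valid sample seeds the EMA
    return ALPHA * ratio + (1 - ALPHA) * prev_ema
```

A single query returning 1 result from 500 candidates leaves the EMA unchanged, while a busy interval with a ratio of 20 moves an EMA of 10 to roughly 13.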
| Ratio | Meaning |
|---|---|
| < 5 | Excellent — highly selective index |
| 5 – 50 | Normal |
| 50 – 500 | Inefficient query — review index or analyzer |
| > 500 | Critical — index or query is seriously problematic |
Predictive cardinality detection: if scan_ratio > 50 but latency < 100ms, the Advisor emits a warning — the index is non-selective but the dataset is still small enough to hide the cost. This signal is not provided by Ops Manager.
Zero-results anti-pattern: if results_returned = 0 but candidates_examined > 0, a specific warning is raised. Common causes: post-search $match too restrictive, scoring threshold too high, misconfigured pipeline.
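The thresholds and the two special cases above can be combined in one small function. This is a sketch under assumed names and message strings, not the Advisor's real output format.

```python
from typing import List, Optional


def scan_ratio_findings(
    ratio: Optional[float],
    avg_latency_ms: Optional[float],
    d_results: float,
    d_candidates: float,
) -> List[str]:
    """Hypothetical helper: map scan-ratio signals to finding strings."""
    findings: List[str] = []
    # Zero-results anti-pattern: work was done but nothing came back.
    if d_results == 0 and d_candidates > 0:
        findings.append("warn: zero results with candidates examined")
    if ratio is None:
        return findings
    if ratio > 500:
        findings.append("crit: scan ratio above 500:1")
    elif ratio > 50:
        findings.append("warn: scan ratio above 50:1")
    # Predictive cardinality detection: poor selectivity, still fast.
    if ratio > 50 and avg_latency_ms is not None and avg_latency_ms < 100:
        findings.append("warn: non-selective index hidden by low latency")
    return findings
```

A ratio of 120 with 40 ms latency produces both the efficiency warning and the predictive warning, which is the case latency monitoring alone would miss.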
mongot_vector_search_hnsw_visited_nodes (fallback: mongot_vector_search_graph_nodes_visited) measures how many nodes in the HNSW graph are traversed per $vectorSearch query. It is an early warning for CPU saturation: load increases before latency becomes visible.
| Visited nodes | Meaning |
|---|---|
| < 200 | Excellent |
| 200 – 1000 | Normal |
| > 1000 | Costly query — monitor CPU |
| > 5000 | ANN inefficient — CPU saturation imminent |
High values indicate ANN is degrading toward brute-force, typically due to excessive efSearch, poor graph connectivity, or oversized embedding dimensions. The check is optional: skipped if the metric is not exposed by the installed mongot version.
During an Initial Sync or bulk index build, a dedicated "⚙️ Index Build in Progress" panel appears with:
- Animated progress bar — green > 75%, orange < 75%, red if stalled
- Document counter — processed / total with percentage
- Speed in docs/sec (computed as a delta between collection cycles)
- Dynamic ETA in h/m/s format or "INDEX BUILD STALLED" warning if speed drops below 100 docs/s for at least 30 seconds
The panel is only shown while an Initial Sync is active (initial_sync_in_progress > 0).
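The ETA and stall logic described above could look roughly like this. The class and function names are hypothetical; only the 100 docs/s and 30 second thresholds come from the text.

```python
from typing import Optional

STALL_SPEED = 100.0    # docs/sec threshold from the text
STALL_SECONDS = 30.0   # how long the speed must stay below it


def format_eta(remaining_docs: int, docs_per_sec: float) -> str:
    """Format the remaining time as h/m/s, or 'unknown' with no progress."""
    if docs_per_sec <= 0:
        return "unknown"
    seconds = int(remaining_docs / docs_per_sec)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes}m {secs}s"


class StallDetector:
    """Tracks how long the indexing speed has been below the threshold."""

    def __init__(self) -> None:
        self.below_since: Optional[float] = None

    def update(self, now: float, docs_per_sec: float) -> bool:
        if docs_per_sec >= STALL_SPEED:
            self.below_since = None  # speed recovered, reset the timer
            return False
        if self.below_since is None:
            self.below_since = now
        return now - self.below_since >= STALL_SECONDS
```

Keeping the timer separate from the speed calculation means a single slow cycle never triggers the stalled state on its own.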
Pod discovery uses a hierarchy resilient to rolling upgrades, scaling events, and naming variations across MCK versions:
- Official MCK label app.kubernetes.io/component=search: most reliable
- Container name mongot: stable fallback across MCK versions
- Container image: contains mongodb-enterprise-search or mongot
- Pod name (last resort): heuristic, excludes mongod and monitor
The monitor pod itself is always excluded via app: mongot-doctor.
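The hierarchy can be sketched as a function that returns the level at which a pod matched. The pod is represented as a plain dict with assumed keys (`name`, `labels`, `containers`), and the positive part of the pod-name heuristic (name contains "search" or "mongot") is an assumption; the text only states what the heuristic excludes.

```python
from typing import Any, Dict


def discovery_level(pod: Dict[str, Any]) -> int:
    """Return 1-4 for the matching discovery level, or 0 for no match."""
    labels = pod.get("labels", {})
    if labels.get("app") == "mongot-doctor":
        return 0  # the monitor pod itself is always excluded
    if labels.get("app.kubernetes.io/component") == "search":
        return 1  # official MCK label
    containers = pod.get("containers", [])
    if any(c.get("name") == "mongot" for c in containers):
        return 2  # container name
    for c in containers:
        image = c.get("image", "")
        if "mongodb-enterprise-search" in image or "mongot" in image:
            return 3  # container image
    name = pod.get("name", "")
    looks_like_search = "search" in name or "mongot" in name  # assumption
    if looks_like_search and "mongod" not in name and "monitor" not in name:
        return 4  # pod name, last resort
    return 0
```

Returning the level rather than a boolean makes it easy to log which rule matched, which helps when an upgrade changes labels or names.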
Data collection runs on a separate daemon thread at a configurable interval. The /metrics endpoint always responds in < 1ms from the in-memory cache — the dashboard never blocks on external calls.
All delta/rate computation logic is isolated in engine/rate_calculator.py, separated from the collection loop:
- background.py is a thin orchestrator: scrape → compute_pod_rates() → cache update
- engine/rate_calculator.py contains QPS, average latency, scan ratio EMA, HNSW, ETA; each is independently testable
- Counter reset safety: _safe_delta() returns None on negative delta (counter reset after a mongot pod restart); a spike guard discards QPS > 50,000/s; the first cycle (last_s=None) skips all computation silently, so there are no spurious spikes on startup
Versioned JSON endpoint with a fixed schema, decoupled from internal Prometheus metric names:
{
"schema_version": "1",
"timestamp": "...",
"collect_ms": 42,
"pods": {
"mongot-pod-0": {
"pod": { "namespace", "node", "phase", "all_ready", "total_restarts" },
"qps": { "search": 1.5, "vectorsearch": 0.3 },
"latency_sec":{ "search_avg", "search_max", "vectorsearch_avg", "vectorsearch_max" },
"failures": { "search": 0, "vectorsearch": 0 },
"efficiency": { "search_scan_ratio", "vectorsearch_scan_ratio", "hnsw_visited_nodes", "zero_results_with_candidates" },
"indexing": { "replication_lag_sec", "initial_sync_active", "updates_per_sec", "eta" }
}
}
}
Safe for external consumers (CI performance gates, Grafana dashboards, alerting tools): the backend can evolve without breaking the API contract.
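An external consumer might use the payload as in the sketch below, which lists pods whose $search scan ratio exceeds a limit. The function name and the 50.0 default are illustrative, and treating a missing or null ratio as "no data" is an assumption about how the endpoint reports idle pods.

```python
from typing import Any, Dict, List


def pods_over_scan_ratio(payload: Dict[str, Any], limit: float = 50.0) -> List[str]:
    """Return the names of pods whose search scan ratio is above limit."""
    if payload.get("schema_version") != "1":
        raise ValueError("unsupported schema_version")
    flagged = []
    for name, pod in payload.get("pods", {}).items():
        ratio = pod.get("efficiency", {}).get("search_scan_ratio")
        if ratio is not None and ratio > limit:
            flagged.append(name)
    return sorted(flagged)
```

Checking schema_version first means a CI gate fails loudly if the contract ever changes, instead of silently reading the wrong fields.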
Every collection cycle, the diagnosis engine interprets the full cluster state and presents it in three columns directly in the dashboard:
- Health Summary: all passing checks listed as ✔
- Warnings & Critical: failing checks with detail message
- Recommendations: actionable next steps derived from each finding
The health status (HEALTHY / DEGRADED / CRITICAL) is immediately visible at the top of the panel.
GET /api/diagnose
{
"health": "degraded",
"summary": { "pass": 12, "warn": 2, "crit": 1 },
"critical": [{ "title": "OOMKilled & MMap Risk", "detail": "..." }],
"warnings": [{ "title": "Disk Space (200% Rule)", "detail": "..." }],
"healthy": [{ "title": "CRD Operator Status" }, ...],
"recommendations": ["Increase memory limit...", "Check disk usage..."]
}
Run a single diagnostic cycle and exit, which is useful in CI/CD pipelines:
python3 mongot_doctor.py --diagnose \
  --uri "mongodb://..." --namespace mongodb
Exit codes: 0 = healthy, 1 = degraded, 2 = critical.
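A pipeline that calls /api/diagnose instead of the CLI can reproduce the same exit codes from the health field. This is a sketch: the function name is hypothetical, and mapping an unknown or missing health value to 2 is a conservative assumption rather than documented behavior.

```python
from typing import Any, Dict

EXIT_CODES = {"healthy": 0, "degraded": 1, "critical": 2}


def exit_code_for(diagnosis: Dict[str, Any]) -> int:
    """Map a /api/diagnose payload to the CLI exit-code convention."""
    health = str(diagnosis.get("health", "")).lower()
    # Assumption: anything unrecognized is treated as critical.
    return EXIT_CODES.get(health, 2)
```

A CI step can then end with `sys.exit(exit_code_for(payload))` to fail the build on a degraded or critical cluster.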
On-demand analysis of mongot JSON logs directly from the dashboard. Parses the structured log format ({"t":..., "s":..., "n":..., "msg":..., "attr":...}) and detects known failure patterns.
| Window | Description |
|---|---|
| 1h | Last hour: quick triage |
| 24h | Last 24 hours (default) |
| 7d | Last 7 days: trend analysis |
| 30d | Last 30 days: long-term issues |
Up to 2,000 JSON lines are analyzed per request (memory guard).
| Pattern | Severity | Detection |
|---|---|---|
| Out of Memory | 🔴 crit | OutOfMemoryError in msg or attr |
| Errors & Fatals | 🔴 crit | s == "ERROR" or "FATAL" |
| TLS / Auth Issues | 🔴 crit | ssl/tls/auth/certificate in msg + ERROR/WARN |
| MongoDB Connection Issues | 🟡 warn | org.mongodb.driver class + Exception/Removing server |
| Index Failures | 🟡 warn | index/lucene class + fail/corrupt/invalid |
| Replication / Change Stream | 🟡 warn | changestream class + lag/timeout/fail |
| Initial Sync Activity | 🔵 info | initialsync class |
| General Warnings | 🟡 warn | s == "WARN" |
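Three of the eight patterns are simple enough to sketch directly from the table. The function name and the `oom` and `warnings` ids are assumptions (only the `errors` id appears in the API response), and non-JSON lines are skipped here as a guess at how the parser treats them.

```python
import json
from typing import List


def match_patterns(line: str) -> List[str]:
    """Return the ids of the patterns matched by one mongot log line."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return []  # not a structured log line
    if not isinstance(entry, dict):
        return []
    msg = str(entry.get("msg", ""))
    attr = json.dumps(entry.get("attr", ""))  # search nested attrs as text
    severity = entry.get("s")
    hits: List[str] = []
    if "OutOfMemoryError" in msg or "OutOfMemoryError" in attr:
        hits.append("oom")
    if severity in ("ERROR", "FATAL"):
        hits.append("errors")
    if severity == "WARN":
        hits.append("warnings")
    return hits
```

One line can match several patterns, so an OutOfMemoryError logged at ERROR level counts toward both findings.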
GET /api/logs/analyze/<namespace>/<pod>?window=24h
{
"pod": "my-replica-set-search-0",
"window": "24h",
"lines_analyzed": 350,
"findings": [
{
"id": "errors",
"name": "Errors & Fatals",
"severity": "crit",
"count": 3,
"description": "ERROR or FATAL log entries detected...",
"examples": ["[2026-03-05T14:09:07] Connection refused — ..."]
}
]
}
Many teams create Search indexes without fully understanding their cost: dynamic mapping that indexes every field, stale BUILDING indexes, oversized explicit mappings with dozens of unused fields. mongot-doctor analyzes every index definition automatically and tells you exactly what to fix.
| Check | Severity | Condition |
|---|---|---|
| FAILED state | 🔴 crit | Index is in FAILED status |
| Not queryable | 🔴 crit | queryable: false — index not serving queries |
| Dynamic mapping | 🟡 warn | mappings.dynamic: true — every document field is indexed |
| BUILDING state | 🟡 warn | Index still building — queries may not return full results |
| Empty mapping | 🟡 warn | dynamic: false and zero fields mapped — index returns nothing |
| Large static mapping | 🟡 warn | More than 20 explicit fields — review unused ones |
| Over-indexed collection | 🟡 warn | More than 3 Search indexes on the same collection |
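The rules in this table translate almost directly into code. The sketch below uses the field names from the /api/indexes/inspect response (`status`, `queryable`, `mapping_dynamic`, `field_count`, `ns`); the function names and message strings are assumptions, not the contents of index_inspector.py.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def observe_index(index: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (level, message) observations for one index summary."""
    obs: List[Tuple[str, str]] = []
    status = index.get("status")
    dynamic = bool(index.get("mapping_dynamic"))
    fields = int(index.get("field_count", 0))
    if status == "FAILED":
        obs.append(("crit", "Index is in FAILED status"))
    if index.get("queryable") is False:
        obs.append(("crit", "Index is not serving queries"))
    if dynamic:
        obs.append(("warn", "Dynamic mapping enabled"))
    if status == "BUILDING":
        obs.append(("warn", "Index still building"))
    if not dynamic and fields == 0:
        obs.append(("warn", "Empty mapping"))
    if fields > 20:
        obs.append(("warn", "Large static mapping"))
    return obs


def over_indexed(indexes: List[Dict[str, Any]], limit: int = 3) -> List[str]:
    """Return namespaces with more than `limit` Search indexes."""
    counts = Counter(ix["ns"] for ix in indexes)
    return sorted(ns for ns, n in counts.items() if n > limit)
```

The over-indexed check needs the full list of indexes, so it is kept separate from the per-index observations.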
With dynamic: true, mongot indexes every field in every document. This is convenient during development, but in production it causes:
- Index size far exceeding the actual data size (observed 10–50x in real clusters)
- Longer indexing lag — more fields to process per write
- Higher JVM heap pressure — more Lucene segments to manage
- Opaque resource consumption — teams don't know what they're actually indexing
The inspector detects this and suggests migrating to a static mapping with only the fields used in search queries.
GET /api/indexes/inspect
{
"summary": {
"total_indexes": 4,
"clean": 2,
"warns": 2,
"crits": 0,
"health": "degraded"
},
"indexes": [
{
"ns": "mydb.products",
"name": "default",
"type": "fullText",
"status": "READY",
"queryable": true,
"num_docs": 125432,
"mapping_dynamic": true,
"field_count": 0,
"observations": [
{
"level": "warn",
"msg": "Dynamic mapping enabled — every document field is indexed",
"suggestion": "Restrict mapping to specific fields to reduce index size and improve performance"
}
]
}
]
}
Run a full inspection from the terminal, no dashboard needed:
python3 mongot_doctor.py --uri "mongodb://..." --inspect-indexes
Example output:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MongoDB Search — Index Inspector
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Collection: mydb.products
Index: default [fullText] READY
Docs: 125,432
Mapping: dynamic ⚠
⚠ Dynamic mapping enabled — every document field is indexed
→ Restrict mapping to specific fields to reduce index size
Collection: mydb.orders
Index: default [fullText] READY
Docs: 89,210
Mapping: static (7 fields)
✔ No issues detected
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2 index(es) | 0 critical, 1 warnings, 1 clean
Health: DEGRADED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Exit codes: 0 = healthy, 1 = degraded, 2 = critical.
The inspector panel appears automatically above the main grid on page load and shows a card per index. A ↺ Refresh button lets you re-run the inspection on demand. If MongoDB is not configured, the panel displays a graceful "not connected" message instead of an error.
mongot-doctor can generate a full cluster snapshot covering pods, search metrics, JVM heap, Lucene merges, oplog, SRE findings, and index health — in three formats suited for different audiences and tools.
| Format | Use case |
|---|---|
| Text | Human-readable ASCII report — paste into a ticket, Slack, or runbook |
| Markdown | Tables and emoji — rendered in GitHub issues, Confluence, Notion |
| JSON | Machine-readable — ingest into alerting tools, Grafana, CI/CD pipelines |
Click the 📋 Report button in the dashboard header to open the report modal. Three tabs switch between formats instantly. Each format panel includes:
- Copy — copies the full report to clipboard with visual feedback
- Download: saves the report as a file (mongot-report-<timestamp>.txt|md|json)
The modal closes with the ✕ button or the Escape key.
GET /api/report?format=text
GET /api/report?format=markdown
GET /api/report?format=json
Text and Markdown are returned as text/plain. JSON is returned as application/json.
Example JSON schema:
{
"generated_at": "2026-03-17T10:00:00Z",
"health": "degraded",
"pods": [...],
"per_pod_metrics": {
"my-replica-set-search-0": {
"search_commands": { "search_qps": 1.5, "search_avg_latency_sec": 0.012, ... },
"jvm": { "heap_used_bytes": 1073741824, "heap_max_bytes": 4294967296, ... },
"lucene_merges": { "running_merges": 2, "merging_docs": 45000, ... },
"indexing": { "change_stream_lag_sec": 0.4, "initial_sync_in_progress": 0, ... }
}
},
"oplog": { "window_hours": 72.5, "used_pct": 12.3 },
"advisor_findings": [...],
"indexes": [...],
"errors": [...]
}
Generate a report without starting the web server:
python3 mongot_doctor.py --uri "mongodb://..." --namespace mongodb \
--report --format text
python3 mongot_doctor.py --uri "mongodb://..." --namespace mongodb \
--report --format markdown
python3 mongot_doctor.py --uri "mongodb://..." --namespace mongodb \
  --report --format json > report.json
--format defaults to text if omitted. The output is printed to stdout; redirect it to a file as needed.
MIT License — free to use, modify, and distribute. See LICENSE for the full text.
If you find this useful:
- ⭐ Star the repo — it helps others discover the project
- 🧵 Share it with your team — SREs, MongoDB operators, platform engineers
- 🐛 Report issues — open an issue on GitHub
