Skip to content

Live Database Schema Analysis #6

@ringo380

Description

@ringo380

Status (updated 2026-05-16)

PR #54 (merged 2026-05-08) absorbed the schema-introspection slice of this issue — see the PR body for full layer-by-layer detail. What landed:

  • ✅ Live connection management — UserDatabaseConnection model, Fernet-encrypted credentials, CRUD UI at /connections/
  • ✅ Redis-cached schema snapshots via LiveSchemaContext
  • ✅ Schema-aware index recommendations driven by EXPLAIN cost-deltas (HypoPG on PostgreSQL, stats-based heuristics on MySQL/SQLite)
  • ✅ Redundancy detection against existing indexes
  • ✅ Database-specific CREATE/DROP DDL generation
  • ✅ UI panel wired into grade-results flow
  • ✅ 43 new tests

Issue #7 was closed by the same PR.

Remaining scope

The original goal was "real-time schema introspection AND context-aware recommendations" — #54 delivered the introspection layer plus the index-specific recommendation slice. The non-index work below is still open. Splitting into separate issues may be useful once any one of these is picked up.

A. Broader schema analysis (non-index)

  • Table-relationship insights (FK graph traversal, fan-out warnings)
  • Column statistics-driven suggestions (low-cardinality column warnings on WHERE/JOIN, nullable-FK detection)
  • Statistics-staleness detection (last ANALYZE / equivalent)

B. Multi-statement workload analysis

  • Aggregate indexing recommendations across a batch of related queries (DTA-style)
  • Conflict detection: one query's optimal index hurts another

C. ML ranking for recommendations

D. Database coverage gaps

  • Oracle / SQL Server: no schema introspection or recommendations yet
  • HypoPG installation automation (currently detect + advise)

E. Performance metrics validation

  • Measure end-to-end schema-analysis latency; confirm <1s success metric
  • Establish benchmark dataset for recommendation accuracy measurement

Original success metrics — current state

Metric Target Status
Database types 3+ ⚠️ 3 (PG / MySQL / SQLite) at varying confidence; no Oracle/SQL Server
Schema analysis latency <1s ⚠️ Not measured
Recommendation accuracy >85% ⚠️ Not measured (no ML ranking)

Suggested next step

Pick one of A–E above and split into a dedicated issue with a focused scope. The catch-all framing of this issue is no longer useful now that the foundation is in place.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions