Live Database Schema Analysis

## Status (updated 2026-05-16)

PR #54 (merged 2026-05-08) absorbed the **schema-introspection slice** of this issue — see the PR body for full layer-by-layer detail. What landed:

- ✅ Live connection management — `UserDatabaseConnection` model, Fernet-encrypted credentials, CRUD UI at `/connections/`
- ✅ Redis-cached schema snapshots via `LiveSchemaContext`
- ✅ Schema-aware **index** recommendations driven by EXPLAIN cost-deltas (HypoPG on PostgreSQL, stats-based heuristics on MySQL/SQLite)
- ✅ Redundancy detection against existing indexes
- ✅ Database-specific CREATE/DROP DDL generation
- ✅ UI panel wired into grade-results flow
- ✅ 43 new tests

Issue #7 was closed by the same PR.

## Remaining scope

The original goal was "real-time schema introspection AND context-aware recommendations" — #54 delivered the introspection layer plus the **index-specific** recommendation slice. The non-index work below is still open. Splitting into separate issues may be useful once any one of these is picked up.

### A. Broader schema analysis (non-index)
- Table-relationship insights (FK graph traversal, fan-out warnings)
- Column statistics-driven suggestions (low-cardinality column warnings on WHERE/JOIN, nullable-FK detection)
- Statistics-staleness detection (last `ANALYZE` / equivalent)

### B. Multi-statement workload analysis
- Aggregate indexing recommendations across a batch of related queries (DTA-style)
- Conflict detection: one query's optimal index hurts another

### C. ML ranking for recommendations
- Layer 6 of #54 already captures `extract_index_features` but doesn't train. Build the supervised ranking model and validate the >85% accuracy success metric against curated PR-graded examples.

### D. Database coverage gaps
- Oracle / SQL Server: no schema introspection or recommendations yet
- HypoPG installation automation (currently detect + advise)

### E. Performance metrics validation
- Measure end-to-end schema-analysis latency; confirm <1s success metric
- Establish benchmark dataset for recommendation accuracy measurement

## Original success metrics — current state

| Metric | Target | Status |
|---|---|---|
| Database types | 3+ | ⚠️ 3 (PG / MySQL / SQLite) at varying confidence; no Oracle/SQL Server |
| Schema analysis latency | <1s | ⚠️ Not measured |
| Recommendation accuracy | >85% | ⚠️ Not measured (no ML ranking) |

## Suggested next step

Pick one of A–E above and split into a dedicated issue with a focused scope. The catch-all framing of this issue is no longer useful now that the foundation is in place.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Live Database Schema Analysis #6

Status (updated 2026-05-16)

Remaining scope

A. Broader schema analysis (non-index)

B. Multi-statement workload analysis

C. ML ranking for recommendations

D. Database coverage gaps

E. Performance metrics validation

Original success metrics — current state

Suggested next step

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Metric	Target	Status
Database types	3+	⚠️ 3 (PG / MySQL / SQLite) at varying confidence; no Oracle/SQL Server
Schema analysis latency	<1s	⚠️ Not measured
Recommendation accuracy	>85%	⚠️ Not measured (no ML ranking)

Live Database Schema Analysis #6

Description

Status (updated 2026-05-16)

Remaining scope

A. Broader schema analysis (non-index)

B. Multi-statement workload analysis

C. ML ranking for recommendations

D. Database coverage gaps

E. Performance metrics validation

Original success metrics — current state

Suggested next step

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions