feat(detector): SQL/migration detector + SQL_ENTITY NodeKind (#48) #57

Merged: aksOps merged 1 commit into main from feat/sql-migration-detector on Apr 23, 2026

aksOps commented Apr 23, 2026

Summary

Implements task #48: a SQL / migration detector under io.github.randomcodespace.iq.detector.sql that extracts schema-level entities from raw SQL DDL and multi-format migration files.

What the detector handles

  • Raw SQL (*.sql): CREATE TABLE, CREATE VIEW, CREATE SCHEMA, ALTER TABLE ... ADD COLUMN, CREATE INDEX, inline FOREIGN KEY ... REFERENCES, DROP TABLE (skipped with debug log).
  • Flyway (**/V\d+(?:_\d+)*__.+\.sql): parsed as raw SQL with format=flyway and a version parsed from the filename.
  • Liquibase XML (**/changelog.xml, **/db.changelog*.xml): <createTable>, <addColumn>, <addForeignKeyConstraint>.
  • Liquibase YAML (**/db.changelog*.yml / .yaml): createTable, addForeignKeyConstraint. Regex-based walk to avoid wiring SnakeYAML into the detector.
  • Alembic (**/versions/*.py + from alembic or op.create_table marker): op.create_table, op.add_column, op.create_index, op.create_foreign_key.
  • Rails (**/db/migrate/<14-digit>_*.rb): create_table, add_column, add_foreign_key.
  • Prisma (**/migrations/*/migration.sql): delegates to the raw-SQL path with format=prisma and directory-name version.

A file is only treated as a migration if the path/filename matches one of those discriminators (Alembic additionally requires a content marker) — arbitrary .py, .rb, .xml, .yml files are ignored. The detector is @Component-scoped, stateless, and emits nodes/edges sorted by id for byte-equal determinism across runs.
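The discriminator rules above can be sketched roughly as follows. This is a minimal illustration, not the detector's actual code: the class name `MigrationFormatSketch`, the `Format` enum, the `classify` method, and the exact regular expressions are all assumptions derived from the glob patterns listed in this description.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch of path/marker discrimination; names and patterns are illustrative. */
final class MigrationFormatSketch {

    enum Format { RAW_SQL, FLYWAY, LIQUIBASE_XML, LIQUIBASE_YAML, ALEMBIC, RAILS, PRISMA }

    // All patterns are matched against a '/'-separated relative path.
    private static final Pattern FLYWAY =
            Pattern.compile("(?:^|.*/)V\\d+(?:_\\d+)*__.+\\.sql");
    private static final Pattern PRISMA =
            Pattern.compile("(?:^|.*/)migrations/[^/]+/migration\\.sql");
    private static final Pattern LIQUIBASE_XML =
            Pattern.compile("(?:^|.*/)(?:changelog|db\\.changelog[^/]*)\\.xml");
    private static final Pattern LIQUIBASE_YAML =
            Pattern.compile("(?:^|.*/)db\\.changelog[^/]*\\.ya?ml");
    private static final Pattern ALEMBIC_PATH =
            Pattern.compile("(?:^|.*/)versions/[^/]+\\.py");
    private static final Pattern RAILS =
            Pattern.compile("(?:^|.*/)db/migrate/\\d{14}_[^/]+\\.rb");

    /** Returns the migration format, or empty when the file should be ignored. */
    static Optional<Format> classify(String path, String content) {
        if (content == null || content.isBlank()) {
            return Optional.empty();
        }
        // Specific .sql layouts are checked before the generic raw-SQL fallback.
        if (FLYWAY.matcher(path).matches()) return Optional.of(Format.FLYWAY);
        if (PRISMA.matcher(path).matches()) return Optional.of(Format.PRISMA);
        if (path.endsWith(".sql")) return Optional.of(Format.RAW_SQL);
        if (LIQUIBASE_XML.matcher(path).matches()) return Optional.of(Format.LIQUIBASE_XML);
        if (LIQUIBASE_YAML.matcher(path).matches()) return Optional.of(Format.LIQUIBASE_YAML);
        if (RAILS.matcher(path).matches()) return Optional.of(Format.RAILS);
        // Alembic needs both the path shape and a content marker.
        boolean alembicMarker =
                content.contains("from alembic") || content.contains("op.create_table");
        if (ALEMBIC_PATH.matcher(path).matches() && alembicMarker) {
            return Optional.of(Format.ALEMBIC);
        }
        return Optional.empty();
    }
}
```

Under these assumptions, `classify("app/utils.py", "import os")` returns an empty result, while a file under `versions/` is only classified as Alembic when one of the two markers is present in its content.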

Enum changes

Added:

  • NodeKind.SQL_ENTITY — new. Schema-level table/view/schema. Distinct from the code-level ENTITY (JPA/ORM) kind; the two are deliberately not collapsed.
  • EdgeKind.REFERENCES_TABLE — new. Any node (a JPA ENTITY, SQLAlchemy/TypeORM model, a raw-query node, or another SQL_ENTITY) → SQL_ENTITY. Pairs with existing ORM detectors for the "which code references which table" join.

Reused:

  • EdgeKind.MIGRATES — MIGRATION → SQL_ENTITY. The existing MIGRATES is unused in production code (only referenced by ModelCoverageTest), so MIGRATES_SCHEMA was not needed.
  • NodeKind.MIGRATION — existing; used for the migration-script-level node.
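To illustrate how the two edge kinds relate, here is a rough sketch that derives MIGRATES and REFERENCES_TABLE edges from raw `CREATE TABLE` statements. Everything except the kind names is hypothetical: the `ForeignKeyEdgeSketch` class, the `Edge` type, the plain-string node ids, and the regular expressions are stand-ins, and the real detector may model the foreign-key source differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: derives edges from CREATE TABLE statements with inline foreign keys. */
final class ForeignKeyEdgeSketch {

    // Local stand-in for the project's EdgeKind enum (only the two kinds discussed here).
    enum EdgeKind { MIGRATES, REFERENCES_TABLE }

    static final class Edge {
        final String source;
        final EdgeKind kind;
        final String target;

        Edge(String source, EdgeKind kind, String target) {
            this.source = source;
            this.kind = kind;
            this.target = target;
        }

        @Override
        public String toString() {
            return source + " -[" + kind + "]-> " + target;
        }
    }

    // Captures the table name and the column/constraint body up to the closing ");".
    private static final Pattern CREATE_TABLE = Pattern.compile(
            "CREATE\\s+TABLE\\s+(?:IF\\s+NOT\\s+EXISTS\\s+)?([\\w.]+)\\s*\\((.*?)\\)\\s*;",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private static final Pattern REFERENCES = Pattern.compile(
            "REFERENCES\\s+([\\w.]+)", Pattern.CASE_INSENSITIVE);

    static List<Edge> edges(String migrationId, String sql) {
        List<Edge> out = new ArrayList<>();
        Matcher table = CREATE_TABLE.matcher(sql);
        while (table.find()) {
            String tableName = table.group(1);
            // MIGRATION -> SQL_ENTITY
            out.add(new Edge(migrationId, EdgeKind.MIGRATES, tableName));
            // SQL_ENTITY -> SQL_ENTITY for each inline REFERENCES clause
            Matcher ref = REFERENCES.matcher(table.group(2));
            while (ref.find()) {
                out.add(new Edge(tableName, EdgeKind.REFERENCES_TABLE, ref.group(1)));
            }
        }
        return out;
    }
}
```

A regex walk like this only covers simple DDL and is meant to show the edge shapes, not to stand in for a SQL parser.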

Test delta

  • Before: 3278 passing / 31 skipped
  • After: 3294 passing / 31 skipped
  • New: 16 tests in SqlMigrationDetectorTest
    • Positive: raw SQL table/view/schema, FK REFERENCES_TABLE edge, CREATE INDEX property enrichment, ALTER TABLE ADD COLUMN property enrichment, Flyway version parse, Alembic op.create_table + add_column + create_foreign_key, Liquibase XML changeSet, Liquibase YAML changeSet, Rails create_table + add_column + add_foreign_key, Prisma migration.sql.
    • Negative: plain .py in app/utils.py, arbitrary .github/workflows/ci.yml, empty content, .py under versions/ without the alembic marker.
    • Determinism: DetectorTestUtils.assertDeterministic plus a byte-equal id-list assertion across two runs.
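The byte-equal determinism check can be pictured with a small sketch. It does not use `DetectorTestUtils` or the real detector; `DeterministicEmitSketch`, its methods, and the id strings in the usage note are hypothetical and only show the idea of sorting by id on emit and comparing serialized output across two runs.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: sort ids on emit so discovery order cannot leak into the output. */
final class DeterministicEmitSketch {

    /** Copies the discovered ids and returns them in natural (lexicographic) order. */
    static List<String> emitSortedIds(Collection<String> discoveredIds) {
        List<String> out = new ArrayList<>(discoveredIds);
        Collections.sort(out);
        return out;
    }

    /** Serializes the id list so two runs can be compared byte for byte. */
    static byte[] serialize(List<String> ids) {
        return String.join("\n", ids).getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, feeding the same three ids in two different discovery orders (say `sql:users`, `migration:V1`, `sql:orders` and then the reverse) should yield identical byte arrays from `serialize(emitSortedIds(...))`.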

Design call-outs

  • ALTER TABLE ADD COLUMN enriches properties (columns_added=csv) rather than creating column child-nodes, to keep graph size reasonable for large schemas. Same for CREATE INDEX (indexes=csv).
  • DROP TABLE is skipped with a debug log — the graph models current state, not deletions.
  • Liquibase YAML is parsed via regex rather than SnakeYAML in this detector. Lightweight and keeps the detector self-contained; a future revision can swap in StructuredParser if we need nested-key fidelity.
  • LayerClassifier: SQL_ENTITY added to INFRA_NODE_KINDS → classified as infra.
  • StatsService: left untouched for v1 — no natural category to plug into yet (databases is DATABASE_CONNECTION only; SQL_ENTITY surfaces through the node-kinds / edges-by-kind breakdown in computeGraph). Follow-up when /api/serve consumers need a dedicated schema breakdown.
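As a rough picture of the property-enrichment approach for ALTER TABLE ADD COLUMN, the sketch below collects added columns per table into a comma-separated value instead of creating child nodes. The class `ColumnEnrichmentSketch`, the regex, and the choice to keep columns in statement order are assumptions for illustration; only the `columns_added` CSV idea comes from this PR.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: folds ALTER TABLE ... ADD COLUMN into a per-table CSV property. */
final class ColumnEnrichmentSketch {

    // Requires the COLUMN keyword so that ADD CONSTRAINT and similar clauses are not captured.
    private static final Pattern ADD_COLUMN = Pattern.compile(
            "ALTER\\s+TABLE\\s+([\\w.]+)\\s+ADD\\s+COLUMN\\s+(?:IF\\s+NOT\\s+EXISTS\\s+)?(\\w+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns table name -> value for a columns_added property (CSV, statement order). */
    static Map<String, String> columnsAdded(String sql) {
        Map<String, Set<String>> byTable = new LinkedHashMap<>();
        Matcher m = ADD_COLUMN.matcher(sql);
        while (m.find()) {
            byTable.computeIfAbsent(m.group(1), k -> new LinkedHashSet<>()).add(m.group(2));
        }
        Map<String, String> out = new LinkedHashMap<>();
        byTable.forEach((table, cols) -> out.put(table, String.join(",", cols)));
        return out;
    }
}
```

Keeping one string property per table means a schema with thousands of columns adds no extra nodes, at the cost of losing per-column types and constraints in the graph.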

Test plan

  • mvn -B test — 3294 passing, 0 failing, 31 skipped
  • Compile clean (mvn -B test-compile)
  • New detector registered by Spring classpath scan (no registry edit needed)
  • Determinism assertion: byte-equal node-id and edge-id order across two detect() calls

Adds a SqlMigrationDetector under detector/sql that extracts schema-level
entities (tables, views, schemas) from raw SQL DDL and framework-specific
migration files: Flyway (V*__*.sql), Liquibase (XML + YAML), Alembic
(versions/*.py with alembic/op marker guard), Rails (db/migrate/*.rb),
and Prisma (migrations/*/migration.sql). Path/marker discriminators
prevent false positives on arbitrary .py/.rb/.xml/.yml.

Enum additions:
- NodeKind.SQL_ENTITY (new): schema-level table/view/schema node,
  distinct from the code-level ENTITY (JPA/ORM) kind.
- EdgeKind.REFERENCES_TABLE (new): any node (JPA ENTITY, ORM model,
  raw SQL_ENTITY) -> SQL_ENTITY, pairing with existing ORM detectors.
- EdgeKind.MIGRATES (reused): MIGRATION -> SQL_ENTITY. Unused in
  production code elsewhere; only referenced by ModelCoverageTest.

LayerClassifier: SQL_ENTITY classified as `infra`.

Deterministic output (sorted by id on emit); detector is stateless.
ALTER TABLE ADD COLUMN enriches the owning entity via the columns_added
property; columns are not modeled as child nodes, to keep graph size
reasonable. DROP TABLE is skipped with a debug log.

Tests: 16 new tests covering positive paths (raw SQL, Flyway, Alembic,
Liquibase XML, Liquibase YAML, Rails, Prisma), negative paths (plain
.py/.yaml, Alembic path without marker), determinism, and DDL variants
(DROP, CREATE INDEX, ALTER TABLE). Test count 3278 -> 3294.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Quality Gate failed

Failed conditions
2 Security Hotspots
C Reliability Rating on New Code (required ≥ A)


aksOps merged commit 1e44b8e into main on Apr 23, 2026
8 of 9 checks passed
aksOps deleted the feat/sql-migration-detector branch on April 26, 2026 at 05:52