Skip to content

DEV-1520: add SQL Server (T-SQL) datasource support#162

Merged
ZmeiGorynych merged 12 commits into
mainfrom
egor/dev-1520-add-sqlserver-datasource-support
Jun 3, 2026
Merged

DEV-1520: add SQL Server (T-SQL) datasource support#162
ZmeiGorynych merged 12 commits into
mainfrom
egor/dev-1520-add-sqlserver-datasource-support

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented Jun 2, 2026

Summary

  • SQL generation: DATETRUNC/DATEADD for date ops (no INTERVAL); STDEV/STDEVP/VAR/VARP for 1-arg stat aggs; variance-decomposition formula for corr/covar_samp/covar_pop (same formula now also used for MySQL — previously these raised NotImplementedError on MySQL); NotImplementedError for median/percentile (PERCENTILE_CONT is window-only in T-SQL); iso_week for Monday-based week truncation
  • DatasourceConfig: mssql/sqlserver/tsql type aliases generate mssql+pyodbc:// connection strings with ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes
  • Ingestion: _SA_TYPE_MAP gains T-SQL-specific SQLAlchemy types (TINYINT, DATETIME2, SMALLDATETIME, DATETIMEOFFSET, NVARCHAR, NCHAR, NTEXT, MONEY, SMALLMONEY)
  • pyproject.toml: pyodbc>=5.0 optional dep + sqlserver = ["pyodbc"] extra; added to all
  • examples/sqlserver/: full Docker Compose example with a custom Dockerfile that installs msodbcsql18, SQL Server 2022 container, seed + verify scripts (covers check_stddev_var + check_corr_covar)
  • examples/seed.py: CREATE_SQL_TSQL branch using NVARCHAR/DATETIME2 (T-SQL doesn't have TEXT or TIMESTAMP column types)
  • 65 new tests covering all T-SQL SQL generation paths, connection string construction, and type mappings; MySQL corr/covar formula tests updated

Test plan

  • poetry run pytest tests/test_sql_generator.py::TestTsqlDialect — 39 T-SQL-specific tests
  • poetry run pytest tests/test_models.py::TestDatasourceConfig — connection string tests
  • poetry run pytest tests/test_ingestion.py — T-SQL type mapping tests
  • Full unit suite: poetry run pytest -m "not integration" — 3516 passed
  • Docker smoke: cd examples/sqlserver && docker compose up -d && python verify.py

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • SQL Server promoted to Tier 1 with full test coverage.
    • Added a complete containerized SQL Server example (Dockerfile, compose, seed/start/verify scripts) and seed support for T-SQL.
    • Correlation/covariance checks now run for MySQL and T-SQL via formula-based implementations; median/percentile unsupported on MySQL/T-SQL.
  • Documentation

    • Updated database support tiers and added SQL Server quick-start, prerequisites, and aggregation limitations.
  • Tests

    • Expanded SQL-generation and ingestion tests to cover T-SQL behavior and type mappings.
  • Chores

    • Added optional pyodbc runtime extra for SQL Server.

- SQL generation: DATETRUNC/DATEADD for date ops, STDEV/STDEVP/VAR/VARP
  for 1-arg stat aggs, variance-decomposition formula for corr/covar_samp/
  covar_pop (same formula now used for MySQL too); NotImplementedError for
  median/percentile (PERCENTILE_CONT is window-only in T-SQL)
- DatasourceConfig: mssql/sqlserver/tsql types generate mssql+pyodbc://
  connection strings with ODBC Driver 18 and TrustServerCertificate=yes
- Ingestion: _SA_TYPE_MAP gains T-SQL-specific SA types (TINYINT, DATETIME2,
  SMALLDATETIME, DATETIMEOFFSET, NVARCHAR, NCHAR, NTEXT, MONEY, SMALLMONEY)
- pyproject.toml: pyodbc optional dep + sqlserver extra
- examples/sqlserver/: Docker Compose example with custom Dockerfile that
  installs msodbcsql18, SQL Server 2022 container, seed + verify scripts
- examples/seed.py: CREATE_SQL_TSQL branch (NVARCHAR, DATETIME2)
- examples/mysql/verify.py: now calls check_corr_covar() (MySQL implements it)
- 65 new tests covering all T-SQL generation paths and type mappings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@linear
Copy link
Copy Markdown

linear Bot commented Jun 2, 2026

DEV-1520

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Warning

Rate limit exceeded

@ZmeiGorynych has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 30 minutes and 33 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f3367242-94a8-42d0-b687-c1d5bd680b2f

📥 Commits

Reviewing files that changed from the base of the PR and between bd7862f and 01c474a.

📒 Files selected for processing (10)
  • docs/configuration/datasources.md
  • examples/sqlserver/Dockerfile
  • examples/sqlserver/docker-compose.yml
  • examples/sqlserver/start.sh
  • slayer/engine/ingestion.py
  • slayer/sql/client.py
  • slayer/sql/generator.py
  • tests/test_ingestion.py
  • tests/test_sql_client.py
  • tests/test_sql_generator.py
📝 Walkthrough

Walkthrough

Adds full SQL Server (T‑SQL) support: promote SQL Server to Tier 1, add mssql+pyodbc connection strings, extend ingestion type mappings, implement T‑SQL date/time and statistical SQL (variance‑decomposition for corr/covar), add a SQL Server Docker example and scripts, and expand tests.

Changes

SQL Server Full Support Implementation

Layer / File(s) Summary
SQL Server Tier 1 Promotion & Dependencies
CLAUDE.md, pyproject.toml
SQL Server is promoted to Tier 1 and MySQL aggregation caveat text is updated. pyodbc added as an optional dependency with a new sqlserver extra.
Examples, Docker, and seed/start scripts
examples/sqlserver/*, examples/seed.py, examples/mysql/verify.py, examples/verify_common.py
New SQL Server Docker example (Dockerfile, docker-compose, README, CLAUDE.md, start.sh, verify.py). seed.py emits T‑SQL DDL when given an mssql connection string. MySQL verify now runs corr/covar checks and verify_common docstring updated.
Connection string builder & models ingestion
slayer/core/models.py, examples/sqlserver/start.sh
Added _get_tsql_connection_string() producing mssql+pyodbc:// URLs including ODBC Driver 18 and TrustServerCertificate=yes; get_connection_string() routes mssql/sqlserver/tsql to it.
Ingestion type mappings
slayer/engine/ingestion.py
SA→DataType map extended for T-SQL types: TINYINT, DATETIME2/SMALLDATETIME/DATETIMEOFFSET→TIMESTAMP, NVARCHAR/NCHAR/NTEXT→TEXT, MONEY/SMALLMONEY→DOUBLE, and BIT→BOOLEAN. MONEY/SMALLMONEY treated as float-like.
T-SQL Date & Interval Functions and Stat SQL
slayer/sql/generator.py
T‑SQL implementations: time offsets via DATEADD, chained intervals via nested DATEADD, DATETRUNC for truncation, explicit NotImplementedError for median/percentile on MySQL/T‑SQL, _build_covar_formula variance‑decomposition for corr/covar on formula dialects, and _build_stat_agg refactor to route two‑arg stats through the formula where needed.
SQL Generator Tests & Multi‑Dialect Adjustments
tests/test_sql_generator.py
Expanded multi‑dialect DATE_TRUNC/time_shift/window tests, added T‑SQL-specific window/time_shift assertions, adjusted two‑arg stat test matrix, added MySQL/T‑SQL formula-based stat tests, extended per‑dialect stddev/var expectations, and added TestTsqlDialect.
Model & Ingestion Tests
tests/test_models.py, tests/test_ingestion.py
Added tests verifying mssql+pyodbc:// connection string formatting (driver, trust flag, port, password encoding) and ingestion type mapping unit tests for T‑SQL types.

Sequence Diagram(s)

sequenceDiagram
  participant examples_verify as examples/sqlserver/verify.py
  participant verify_common as examples/verify_common.py
  participant sql_generator as slayer/sql/generator.py
  participant sqlserver as SQLServer
  examples_verify->>verify_common: run common checks & corr/covar
  verify_common->>sql_generator: request corr/covar SQL (variance-decomposition)
  sql_generator->>sqlserver: emit T-SQL query (DATEADD/DATETRUNC or formula SQL)
Loading

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly Related PRs

  • MotleyAI/slayer#78: Overlaps on interval/window builder refactors which this PR extends with T‑SQL DATEADD/DATETRUNC handling.
  • MotleyAI/slayer#82: Overlaps on stat-aggregate generator and formula-based corr/covar handling.
  • MotleyAI/slayer#140: Related DATE_TRUNC changes; both modify date-trunc generation and operand casting behavior.

Suggested Reviewers

  • AivanF

Poem

🐰 The rabbit hops through T-SQL land,
With DATEADD queries in each paw's hand,
From Tier Two climbing up so high,
SQL Server now can scale the sky,
Variance-danced formulas sing so true!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.79% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and directly describes the main change: adding SQL Server (T-SQL) datasource support is the primary objective of this comprehensive changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/dev-1520-add-sqlserver-datasource-support

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
tests/test_models.py (1)

838-846: ⚡ Quick win

Make ODBC driver format assertion more precise.

The test checks for "odbc", "driver", and "18" separately, but doesn't verify the exact format required by the coding guideline: ODBC+Driver+18+for+SQL+Server. A more precise assertion would prevent future regressions where someone might break the driver string format.

✨ Suggested improvement
 def test_sqlserver_connection_string_includes_odbc_driver_param(self) -> None:
     ds = DatasourceConfig(name="test", type="mssql", host="sqlhost", database="db")
     cs = ds.get_connection_string()
-    # Must include exact ODBC driver name so pyodbc can locate the installed driver
-    cs_lower = cs.lower()
-    assert "odbc" in cs_lower
-    assert "driver" in cs_lower
-    assert "18" in cs  # msodbcsql18 is the expected driver
+    # Must include exact ODBC driver name as specified in coding guidelines
+    assert "driver=ODBC+Driver+18+for+SQL+Server" in cs

As per coding guidelines: examples/sqlserver/**/*.py: Pin ODBC driver version to 18 in pyodbc connection strings; use format ODBC+Driver+18+for+SQL+Server.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_models.py` around lines 838 - 846, Update the
test_sqlserver_connection_string_includes_odbc_driver_param to assert the exact
ODBC driver substring expected by our guideline instead of separate checks; call
DatasourceConfig(name="test", type="mssql", host="sqlhost",
database="db").get_connection_string() as before but replace the three separate
asserts with a single assertion that the connection string contains the exact
token "ODBC+Driver+18+for+SQL+Server" (case-insensitive check is fine), so the
test will fail if the driver format deviates.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/sqlserver/Dockerfile`:
- Line 23: The Dockerfile currently redundantly installs pyodbc via the pip
command that contains "pip install --no-cache-dir psycopg2-binary pymysql
clickhouse-sqlalchemy pyodbc && \"; remove pyodbc from that pip install line so
the image relies on the dependency brought in by "poetry install -E all" (which
already includes the sqlserver extra), and ensure the Dockerfile keeps the other
DB drivers (psycopg2-binary, pymysql, clickhouse-sqlalchemy) intact; run a build
to verify no missing-dependency errors and that pyodbc is resolved via poetry's
extras.
- Around line 19-24: The Docker build is failing because poetry.lock is out of
sync with pyproject.toml; regenerate and commit an updated lock file by running
poetry lock from the repository root (so the COPY step in Dockerfile will
include a fresh poetry.lock). After running `poetry lock` (no-update if you want
to preserve versions), verify the updated poetry.lock is committed to the repo
so the Dockerfile's COPY pyproject.toml poetry.lock* step uses the synchronized
lockfile.

In `@examples/sqlserver/start.sh`:
- Around line 7-17: The script mixes sync/async entrypoints: ingest_datasource
is a synchronous function in slayer/engine/ingestion.py so remove the
run_sync(...) wrapper and call ingest_datasource(...) directly (or
alternatively, if you intended to use the async higher-level API
ingest_datasource_models from slayer/mcp/server.py then import
ingest_datasource_models and wrap that call with run_sync(...) instead); update
the imports and the call site accordingly (look for ingest_datasource,
ingest_datasource_models, run_sync, DatasourceConfig, YAMLStorage) so the chosen
function's sync/async contract is used consistently.

In `@examples/sqlserver/verify.py`:
- Around line 28-47: The docstring for check_column_types is misleading about
expected type strings; update the docstring in examples/verify_common.py to
state that expected_types should use the canonical DataType enum .value strings
(e.g., "INT", "TEXT", "TIMESTAMP", "DOUBLE") as produced in
examples/sqlserver/verify.py (or explicitly document that legacy lowercase
aliases like "number"/"string"/"time"/"date" will be mapped to the canonical
values), and mention the related symbols check_column_types, expected_types and
DataType so callers know which exact string values are valid.

In `@pyproject.toml`:
- Line 48: The lockfile is out of sync with pyproject.toml (the pyodbc optional
dependency and other changes at the referenced lines), so regenerate and commit
an updated poetry.lock: run `poetry lock` locally (or CI equivalent) to rebuild
the lockfile, verify it updated entries for pyodbc and the other modified
dependencies, then add and commit the regenerated poetry.lock so the pipeline
uses the synced lockfile.

---

Nitpick comments:
In `@tests/test_models.py`:
- Around line 838-846: Update the
test_sqlserver_connection_string_includes_odbc_driver_param to assert the exact
ODBC driver substring expected by our guideline instead of separate checks; call
DatasourceConfig(name="test", type="mssql", host="sqlhost",
database="db").get_connection_string() as before but replace the three separate
asserts with a single assertion that the connection string contains the exact
token "ODBC+Driver+18+for+SQL+Server" (case-insensitive check is fine), so the
test will fail if the driver format deviates.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9a91a39e-4b97-40bc-88f3-85d65fbb5700

📥 Commits

Reviewing files that changed from the base of the PR and between 05ee330 and f3e2998.

📒 Files selected for processing (18)
  • CLAUDE.md
  • examples/mysql/verify.py
  • examples/seed.py
  • examples/sqlserver/CLAUDE.md
  • examples/sqlserver/Dockerfile
  • examples/sqlserver/README.md
  • examples/sqlserver/docker-compose.yml
  • examples/sqlserver/slayer_data/.gitignore
  • examples/sqlserver/start.sh
  • examples/sqlserver/verify.py
  • examples/verify_common.py
  • pyproject.toml
  • slayer/core/models.py
  • slayer/engine/ingestion.py
  • slayer/sql/generator.py
  • tests/test_ingestion.py
  • tests/test_models.py
  • tests/test_sql_generator.py

Comment thread examples/sqlserver/Dockerfile Outdated
Comment thread examples/sqlserver/Dockerfile Outdated
Comment thread examples/sqlserver/start.sh Outdated
Comment thread examples/sqlserver/verify.py
Comment thread pyproject.toml
ZmeiGorynych and others added 2 commits June 2, 2026 16:02
…nerate poetry.lock

poetry.lock was out of sync after adding pyodbc to pyproject.toml.
Also removed pyodbc from the manual pip install layer (it's already pulled
in by `poetry install -E all`) and merged the two RUN instructions that
install + create the system user into one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Group 2 — correctness:
- generator.py _build_date_trunc (tsql): wrap non-Column/Cast expressions in
  CAST(... AS TIMESTAMP) before passing to DATETRUNC; plain function-call
  results aren't implicitly temporal in T-SQL.
- generator.py _build_covar_formula: guard corr denominator with NULLIF(..., 0)
  so constant columns (STDEV=0) return NULL instead of a divide-by-zero error.
- ingestion.py: add MONEY/SMALLMONEY to _FLOAT_LIKE_SA_TYPES so monetary
  columns get NumberFormat.FLOAT instead of NumberFormat.INT.

Group 3 — Sonar / code quality:
- models.py: extract _get_tsql_connection_string() helper; reduces cognitive
  complexity of get_connection_string from 18 to 14 (Sonar S3776).
- tests/test_models.py: add NOSONAR(S2068) for test-only fixture credential;
  tighten ODBC driver assertion to the exact string.
- examples/verify_common.py: update check_column_types docstring to show
  current uppercase DataType values (TEXT/DOUBLE/TIMESTAMP/DATE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/core/models.py`:
- Around line 750-762: The _get_tsql_connection_string function manually
concatenates username, password, host, port and database into an mssql+pyodbc
URI which can break with reserved URL characters; replace the manual
interpolation by constructing the connection URL with SQLAlchemy's
URL.create(...) and call render_as_string() to ensure proper
quoting/encoding—use URL.create(drivername="mssql+pyodbc",
username=self.username, password=self.password, host=self.host or "localhost",
port=self.port, database=self.database or "", query={"driver":"ODBC Driver 18
for SQL Server", "TrustServerCertificate":"yes"}) in _get_tsql_connection_string
so SQLAlchemy handles encoding (and remove manual auth/params concatenation).

In `@slayer/sql/generator.py`:
- Around line 1830-1832: The DATETRUNC branch currently wraps
non-Column/non-Cast expressions by casting to TIMESTAMP (exp.Cast with
exp.DataType.build("TIMESTAMP")), but in T-SQL TIMESTAMP is rowversion
(non-temporal) and invalid for DATETRUNC; update the fallback cast to a temporal
SQL Server type such as DATETIME2 by replacing exp.DataType.build("TIMESTAMP")
with exp.DataType.build("DATETIME2") where col_expr is wrapped (the exp.Cast
creation around col_expr in the DATETRUNC handling).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 076bbc64-1ecf-40da-be1c-e31a0eff32cc

📥 Commits

Reviewing files that changed from the base of the PR and between f3e2998 and 220c253.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • examples/sqlserver/Dockerfile
  • examples/verify_common.py
  • slayer/core/models.py
  • slayer/engine/ingestion.py
  • slayer/sql/generator.py
  • tests/test_models.py
✅ Files skipped from review due to trivial changes (1)
  • examples/verify_common.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • examples/sqlserver/Dockerfile
  • slayer/engine/ingestion.py
  • tests/test_models.py

Comment thread slayer/core/models.py Outdated
Comment thread slayer/sql/generator.py
Group 1 — connection string safety:
- models.py: replace manual URL interpolation in _get_tsql_connection_string()
  with sqlalchemy.engine.URL.create(); passwords containing reserved characters
  like '@' (e.g. the Docker example's 'YourStrong@Passw0rd') are now properly
  percent-encoded instead of breaking URL parsing.
- docker-compose.yml: URL-encode '@' in the seed container's literal connection
  string ('%40Passw0rd') so SQLAlchemy parses host/password correctly.
- tests: add test_sqlserver_special_chars_in_password_are_url_encoded to pin
  the fix.

Group 2 — BIT type mapping:
- ingestion.py: add "BIT": DataType.BOOLEAN to _SA_TYPE_MAP; SQL Server's BIT
  type is reflected by SQLAlchemy as class name "BIT", but was missing from the
  map and falling back to DataType.TEXT instead of DataType.BOOLEAN.
- tests: add test_tsql_bit_maps_to_boolean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
examples/sqlserver/docker-compose.yml (2)

29-29: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make database creation idempotent for reruns.

Line 29 uses unconditional CREATE DATABASE slayer_demo; rerunning the stack can fail at createdb and block dependent services. Wrap with existence check.

Suggested change
-      -Q "CREATE DATABASE slayer_demo" -No
+      -Q "IF DB_ID(N'slayer_demo') IS NULL CREATE DATABASE slayer_demo;" -No
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/sqlserver/docker-compose.yml` at line 29, Replace the unconditional
CREATE statement so database creation is idempotent: locate the sqlcmd
invocation that contains the string "CREATE DATABASE slayer_demo" (the -Q
argument) and change it to perform a conditional check before creating the DB
(for example using a guard like IF DB_ID('slayer_demo') IS NULL CREATE DATABASE
slayer_demo or an equivalent existence-check SQL) so repeated stack runs do not
error when the database already exists.

6-6: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove hardcoded database credentials from compose and command args.

Line 6 and Line 40 embed production-grade secrets directly in the file/CLI, which is easy to leak in git history, logs, and process lists. Move these to environment variables (or .env) and reference them in commands/URLs.

Suggested change
 services:
   sqlserver:
@@
     environment:
       ACCEPT_EULA: "Y"
-      MSSQL_SA_PASSWORD: "YourStrong@Passw0rd"
+      MSSQL_SA_PASSWORD: "${MSSQL_SA_PASSWORD:?set in .env}"
       MSSQL_PID: "Developer"
@@
   createdb:
@@
     command: >
       /opt/mssql-tools18/bin/sqlcmd
-      -S sqlserver -U sa -P 'YourStrong@Passw0rd'
+      -S sqlserver -U sa -P "${MSSQL_SA_PASSWORD}"
       -Q "CREATE DATABASE slayer_demo" -No
@@
   seed:
@@
     command: >
       python /examples/seed.py
-      "mssql+pyodbc://sa:YourStrong%40Passw0rd@sqlserver:1433/slayer_demo?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes"
+      "mssql+pyodbc://sa:${MSSQL_SA_PASSWORD_ENCODED}`@sqlserver`:1433/slayer_demo?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes"

Also applies to: 15-16, 28-29, 40-40

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/sqlserver/docker-compose.yml` at line 6, The compose file currently
embeds production secrets (e.g., the MSSQL_SA_PASSWORD environment key and other
DB credentials used in command/URL arguments) directly in docker-compose.yml;
replace all hardcoded passwords with environment variable references (for
example use a named env var like MSSQL_SA_PASSWORD or DB_PASSWORD loaded from a
.env or host env) and update any command/URL arguments that contain credentials
to reference those env vars instead (search for MSSQL_SA_PASSWORD and any DB
URL/command args in the file and swap literals for ${VAR_NAME}); ensure the .env
or deployment docs define the variables and remove plaintext secrets from the
repo and CLI/compose files.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@examples/sqlserver/docker-compose.yml`:
- Line 29: Replace the unconditional CREATE statement so database creation is
idempotent: locate the sqlcmd invocation that contains the string "CREATE
DATABASE slayer_demo" (the -Q argument) and change it to perform a conditional
check before creating the DB (for example using a guard like IF
DB_ID('slayer_demo') IS NULL CREATE DATABASE slayer_demo or an equivalent
existence-check SQL) so repeated stack runs do not error when the database
already exists.
- Line 6: The compose file currently embeds production secrets (e.g., the
MSSQL_SA_PASSWORD environment key and other DB credentials used in command/URL
arguments) directly in docker-compose.yml; replace all hardcoded passwords with
environment variable references (for example use a named env var like
MSSQL_SA_PASSWORD or DB_PASSWORD loaded from a .env or host env) and update any
command/URL arguments that contain credentials to reference those env vars
instead (search for MSSQL_SA_PASSWORD and any DB URL/command args in the file
and swap literals for ${VAR_NAME}); ensure the .env or deployment docs define
the variables and remove plaintext secrets from the repo and CLI/compose files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d943f145-8503-4ee3-b937-c11115639e83

📥 Commits

Reviewing files that changed from the base of the PR and between 220c253 and bd7862f.

📒 Files selected for processing (5)
  • examples/sqlserver/docker-compose.yml
  • slayer/core/models.py
  • slayer/engine/ingestion.py
  • tests/test_ingestion.py
  • tests/test_models.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_models.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/test_ingestion.py
  • slayer/engine/ingestion.py
  • slayer/core/models.py

ZmeiGorynych and others added 8 commits June 3, 2026 10:26
Finding 1 — docker-compose.yml:
- Wrap CREATE DATABASE in an existence guard so reruns don't fail:
  IF DB_ID(N'slayer_demo') IS NULL CREATE DATABASE slayer_demo

Finding 2 — sql_client.py:
- T-SQL (SQL Server) does not support LIMIT; the type-probing query
  SELECT * FROM (...) AS _types LIMIT 0 was a syntax error, causing
  get_column_types to silently return {} for every SQL Server call.
- Extract _build_type_probe_sql() helper that emits
  SELECT TOP N * FROM (...) AS _types for mssql/sqlserver/tsql and
  the existing LIMIT form for all other dialects.
- Add TestBuildTypeProbeSQL (6 tests) to pin the dialect dispatch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- docker-compose.yml: createdb service now uses mcr.microsoft.com/mssql/server:2022-latest
  image directly (has sqlcmd) instead of the custom Dockerfile that only has msodbcsql18
- Dockerfile/start.sh: fix stale --models-dir flag → --storage
- ingestion.py: guard mssql.TIMESTAMP (rowversion/binary) from colliding with sa.TIMESTAMP
  in _SA_TYPE_MAP; isinstance check maps it to TEXT; add ROWVERSION entry for completeness
- generator.py: add assert isinstance(iv, exp.Interval) guard in T-SQL DATEADD branch
- test_sql_generator.py: add tsql to test_date_trunc_casts_unknown_typed_time_dim
- test_ingestion.py: add test_tsql_mssql_timestamp_rowversion_maps_to_text

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… docs

- generator.py: replace assert isinstance(iv, exp.Interval) with an explicit
  raise TypeError so the guard survives python -O optimization
- client.py: add _ODBC_SQL_TYPE_MAP (standard ODBC SQL type codes) and route
  mssql/sqlserver/tsql integer type codes through it instead of falling through
  to the Postgres OID map; fixes get_column_types() and schema drift detection
  for SQL Server returning all-"string" types
- test_sql_client.py: add parametrized tests covering SQL_INTEGER, SQL_BIGINT,
  SQL_VARCHAR, SQL_TYPE_TIMESTAMP, SQL_BIT, and the no-fallthrough invariant
  for all three T-SQL db_type aliases
- docs/configuration/datasources.md: add warning that TrustServerCertificate=yes
  in auto-generated SQL Server connection strings is dev/Docker-only and must
  not be used in production

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pin Dockerfile Debian

- tests/test_sql_client.py: drop '= VALUE' from ODBC constant doc comments to avoid
  Sonar S125 "commented-out code" false positives
- slayer/sql/generator.py: split shared MySQL/T-SQL percentile NotImplementedError into
  two separate dialect-specific messages for clearer diagnostics
- examples/sqlserver/Dockerfile: pin base image to python:3.14-slim-bookworm so the
  msodbcsql18 apt install doesn't break when Debian 13 ships

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…redundant pip install

- tests/test_sql_generator.py: add test_build_date_trunc_quarter to TestTsqlDialect
  to cover the previously untested QUARTER granularity path
- docs/configuration/datasources.md: add SQL Server 2022+ version warning (DATETRUNC
  requires 2022+), clarify that auto-generated connection strings use pyodbc only
  (pymssql requires a manual connection_string)
- examples/sqlserver/Dockerfile: remove redundant pip install of psycopg2-binary,
  pymysql, clickhouse-sqlalchemy — all already pulled in by poetry install -E all

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add parametrized tests for the four SQL Server extension type codes
that were in _ODBC_SQL_TYPE_MAP but had no test coverage:
- SQL_SS_TIMESTAMPOFFSET (-154) → time
- SQL_SS_TIME2 (-155) → time
- SQL_SS_XML (-152) → string
- SQL_GUID / uniqueidentifier (-11) → string

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion, not raise

The module-level comment incorrectly said MySQL raises NotImplementedError
for corr/covar functions. MySQL and T-SQL both use _build_covar_formula
(variance-decomposition) instead. Update the comment to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The non-idempotent ingest_datasource was overwriting model metadata
(descriptions, labels, allowed_aggregations) on every container restart.
Switch to ingest_datasource_idempotent, which is additive-only and
consistent with all other documented entry points.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Jun 3, 2026

@ZmeiGorynych ZmeiGorynych merged commit 9e2b71e into main Jun 3, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant