Quote model-name identifiers so reserved-keyword model names work#146
Quote model-name identifiers so reserved-keyword model names work#146whimo wants to merge 1 commit into
Conversation
📝 Walkthrough<review_stack_artifact_start > Fixes unquoted identifier emission for reserved SQL keyword model names across SQL generation, enrichment, column expansion, and tests.Bug report and convention doc describing failure surfaces, root cause, reproduction, and the quoting convention for model-derived identifiers.range_0b187e5b2cec range_656bed667b37 range_20bdc53d9f46 range_d7296ef6b4bf range_211f350b1bb3 range_1e7d6e6b33ed range_b0e2074ccb90Updates enrichment (`_TABLE_COL_RE`, JOIN predicate formatting, filter column resolution) to quote alias prefixes; column-expansion rewrites canonical aliases using the model-identifier helper; query-engine comment clarifies bare remapping for later reparse/quoting.range_7f34d46b57ba range_8d607f187df9 range_31755f668977 range_4abee2ce33ca range_1d31103d40a4 range_d0da089c939b range_e09120fb3482 range_5f0d7fae5e26Adds centralized quoting helpers: `_model_identifier()`, `quote_ident_for()`, and `_qualify_in_string()` to produce always-quoted sqlglot identifiers and dialect-aware qualifier strings for raw-SQL embedding.range_7aacd8604d56 range_01193533796bReplaces unquoted alias/qualification with `_model_identifier` and `_qualify_in_string` across CTEs, window joins, FROM/JOIN assembly, ranked-subquery aliasing, aggregate resolution, and WHERE/HAVING qualification.range_b73a08da0606 range_3310f09a8930 range_6b3ee74d0fb4 range_3232719ba75d range_aa91b662ae86 range_e73235f48dfe range_0926f74d205a range_63eeaf03e301 range_40ba99d288b1 range_2392cad79db8 range_67112a3c03c1Adds DuckDB and PostgreSQL fixtures/models using quoted reserved table names and mixed-case columns; verifies queries, joins, and cross-model measures execute with correct quoting.range_35205b0d6d82 range_94aebefc4c65 range_9c819159703e range_1a4d2adc47e1 range_b2c5d5ee1140 range_ca2685b56748New regression suite verifying dialect-quoted reserved-model identifiers, sqlglot parseability, and absence of the sqlglot `Command` fallback warning across dimensions, filters, joins, and ranked-subqueries.range_3fd203a94a9e range_f0ec2fbf01be range_f45c5d922db2 range_ca7b2fa2dfe3 range_1ce04a4245c4 range_bc0963418716 range_9cfa09050c02 range_2a7c17a5986fUpdates test expectations to reflect dialect-appropriate quoted identifiers in aggregates, percentiles, statistical functions, filter wrapping, string translations, and derived-column inlining.range_1e12e7459310 range_cbd7b0696aeb range_e859e189d345 range_4105f86f7373 range_9c3874353b24 range_2829e126d715 range_78cf3b78793c range_57aed421cf1d range_3e7245f7b79a range_c33aeedfcd29 range_a36053c09814 range_3dc02ee259ce range_ffae39382afc range_5eff6bbb819f range_074b66d6b8e6 range_240b57767d44 range_46b594bc4972 range_324f3f6ca2f1 range_a0884b4dc25a range_a67d77a210b3 range_b559ed2ad10c range_04bd7bfc01bd range_0bd20d76c52d range_63bcac6f65b8 range_c0c58b322333 range_79e30db898f4 range_a20fbe281660 range_19372866907b range_e5ef856e088f range_66a2ebe5857a range_7e04c8472603 range_d3e2986d2817 range_1869c02e902f range_90c47352f126 range_66b843c2714c range_55b4361794f5 range_d1f0ccd5077d range_e446e65f688c range_5d7bd5d6980f range_201d5c51b672 range_a32b7b31e5f4 range_5fb5008eb892 range_8f6328574dff range_0ec125565b2f range_5272f10f677b range_74dc7c13b58b range_255f0a4796db range_a03eae0c9ebc range_c4bd796455c7 range_313fa810e634 range_71a58cb65fca range_fe52e116ed58 range_f4852609d3b7 range_3b1c2980ca22 range_4e7ba99dab4f range_eeb3f2a1e61f range_fbecf53883d3🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
slayer/engine/enrichment.py (1)
2471-2473: ⚡ Quick winUse keyword arguments for multi-parameter calls to
quote_ident_for.These newly added calls pass 2 parameters positionally; repo policy requires keyword args for functions with more than one parameter.
As per coding guidelines: "
**/*.py: Use keyword arguments for functions with more than 1 parameter".Also applies to: 2722-2722, 2813-2813, 2893-2893
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/engine/enrichment.py` around lines 2471 - 2473, The calls to quote_ident_for use positional arguments; change them to use keyword arguments (e.g., quote_ident_for(identifier=..., dialect=...)) wherever quote_ident_for is called with more than one parameter (specifically the occurrences using current_alias/hop_alias and dialect at the locations referencing current_alias and hop_alias, and the other noted occurrences around lines 2722, 2813, 2893). Update each call to pass identifier=<alias> and dialect=<dialect> (or whatever the parameter names are) so all multi-parameter invocations follow the repo policy.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@BUG-reserved-keyword-identifiers.md`:
- Around line 28-39: The fenced code block in
BUG-reserved-keyword-identifiers.md lacks a language specifier and triggers
MD040; update the opening fence for the block that starts with
'grant."idempotencyKey"' to include a language token such as text or log (e.g.,
replace ``` with ```text) so markdownlint passes; locate the block by searching
for the exact snippet 'grant."idempotencyKey"' and modify only the opening
fence.
In `@slayer/engine/enrichment.py`:
- Around line 56-61: The _TABLE_COL_RE regex only allows optional double-quote
quoting and therefore misses identifiers quoted with backticks or square
brackets; update the pattern used by _TABLE_COL_RE (referenced by
_collect_needed_paths) to accept optional " or ` or [...] quoting around both
table and column identifiers and still capture the bare identifier names (so
groups remain unquoted), e.g. by allowing an optional prefix/suffix of " or ` or
\[...\] and extracting the inner identifier for both captured groups; ensure the
pattern still enforces identifier rules (letters/underscore followed by word
chars).
---
Nitpick comments:
In `@slayer/engine/enrichment.py`:
- Around line 2471-2473: The calls to quote_ident_for use positional arguments;
change them to use keyword arguments (e.g., quote_ident_for(identifier=...,
dialect=...)) wherever quote_ident_for is called with more than one parameter
(specifically the occurrences using current_alias/hop_alias and dialect at the
locations referencing current_alias and hop_alias, and the other noted
occurrences around lines 2722, 2813, 2893). Update each call to pass
identifier=<alias> and dialect=<dialect> (or whatever the parameter names are)
so all multi-parameter invocations follow the repo policy.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a00d907a-f7f1-4194-bf0d-b9ec2970c8e8
📒 Files selected for processing (12)
BUG-reserved-keyword-identifiers.mdCLAUDE.mdslayer/engine/column_expansion.pyslayer/engine/enrichment.pyslayer/engine/query_engine.pyslayer/sql/generator.pytests/integration/test_integration_duckdb.pytests/integration/test_integration_postgres.pytests/test_aggregation_gating.pytests/test_cross_model_derived_columns.pytests/test_reserved_keyword_quoting.pytests/test_sql_generator.py
| ``` | ||
| 'grant."idempotencyKey"' contains unsupported syntax. Falling back to parsing as a 'Command'. | ||
| shadow_compare: load_data shadow backend raised ProgrammingError: | ||
| (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) | ||
| <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "grant" | ||
| [SQL: SELECT | ||
| grant ."idempotencyKey" AS "grant.idempotencyKey" | ||
| FROM "Grant" AS grant | ||
| GROUP BY | ||
| grant ."idempotencyKey" | ||
| LIMIT 21] | ||
| ``` |
There was a problem hiding this comment.
Add a language to this fenced code block to satisfy markdownlint.
This block currently triggers MD040 (fenced-code-language). Add text (or log) after the opening fence.
Suggested patch
-```
+```text
'grant."idempotencyKey"' contains unsupported syntax. Falling back to parsing as a 'Command'.
shadow_compare: load_data shadow backend raised ProgrammingError:
(sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError)
<class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "grant"
[SQL: SELECT
grant ."idempotencyKey" AS "grant.idempotencyKey"
FROM "Grant" AS grant
GROUP BY
grant ."idempotencyKey"
LIMIT 21]</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 28-28: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @BUG-reserved-keyword-identifiers.md around lines 28 - 39, The fenced code
block in BUG-reserved-keyword-identifiers.md lacks a language specifier and
triggers MD040; update the opening fence for the block that starts with
'grant."idempotencyKey"' to include a language token such as text or log (e.g.,
replace withtext) so markdownlint passes; locate the block by searching
for the exact snippet 'grant."idempotencyKey"' and modify only the opening
fence.
</details>
<!-- fingerprinting:phantom:poseidon:hawk -->
<!-- This is an auto-generated comment by CodeRabbit -->
| # Matches a ``<table>.<col>`` reference used for join-path discovery. Each | ||
| # identifier may be wrapped in double quotes (expanded ``Column.sql`` qualifies | ||
| # its table alias via ``_model_identifier`` → ``"customers__regions".population`` | ||
| # so reserved-keyword model names survive); the optional ``"?`` consumes those | ||
| # quotes without capturing them, so the captured groups stay bare. | ||
| _TABLE_COL_RE = re.compile(r'"?\b([a-zA-Z_]\w*)\b"?\."?\b([a-zA-Z_]\w*)\b"?') |
There was a problem hiding this comment.
_TABLE_COL_RE misses non-double-quoted identifiers, which can hide required join paths.
This regex only handles optional " quoting. In dialects emitting backticks/brackets, table-qualified refs in expanded SQL/filters won’t be detected, so _collect_needed_paths can miss joins and generate invalid SQL.
💡 Suggested fix
-_TABLE_COL_RE = re.compile(r'"?\b([a-zA-Z_]\w*)\b"?\."?\b([a-zA-Z_]\w*)\b"?')
+_TABLE_COL_RE = re.compile(
+ r'(?:`|"|\[)?\b([a-zA-Z_]\w*)\b(?:`|"|\])?'
+ r'\.'
+ r'(?:`|"|\[)?\b([a-zA-Z_]\w*)\b(?:`|"|\])?'
+)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@slayer/engine/enrichment.py` around lines 56 - 61, The _TABLE_COL_RE regex
only allows optional double-quote quoting and therefore misses identifiers
quoted with backticks or square brackets; update the pattern used by
_TABLE_COL_RE (referenced by _collect_needed_paths) to accept optional " or ` or
[...] quoting around both table and column identifiers and still capture the
bare identifier names (so groups remain unquoted), e.g. by allowing an optional
prefix/suffix of " or ` or \[...\] and extracting the inner identifier for both
captured groups; ensure the pattern still enforces identifier rules
(letters/underscore followed by word chars).
A model whose name is a reserved SQL keyword (a source table `Grant` becomes model `grant`; also `order`/`user`/`select`) was emitted as an UNQUOTED identifier — both as the FROM alias and as every column qualifier. sqlglot's default generator does not quote reserved keywords, so the database rejected the statement with a syntax error. A second surface: a model-qualified string like `grant."col"` re-parsed mid pipeline made `sqlglot.parse_one` fall back to a `Command` statement parse, leaking a `grant .` spacing into the emitted SQL. Route every model-name identifier through a single `_model_identifier` helper (`exp.to_identifier(name, quoted=True)`), so alias definitions and qualifier references quote identically (matters on case-folding dialects like Snowflake). The handful of raw-SQL-string sites that get re-parsed (join conditions, the ranked-subquery `SELECT <model>.*` and JOIN, the WHERE qualifier) quote the alias with the dialect's real quote char via `quote_ident_for` — a hardcoded `"` is a string literal on MySQL/Spark. The rerooted-filter qualifier stays bare (it is re-parsed by the Python-AST `parse_filter`, where `"x"` is a string literal) and is quoted later in `resolve_filter_columns`. The join-path-discovery regex is now quote-tolerant. Tests: new tests/test_reserved_keyword_quoting.py (reserved keywords x 6 dialects, incl. MySQL backticks — quoted form, clean parse, no Command fallback warning); DuckDB + Postgres integration cases (reserved-keyword table with mixed-case column, filter, join, cross-model measure); existing SQL-snapshot assertions updated for the quoted output.
537f9d3 to
759e680
Compare
|
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
BUG-reserved-keyword-identifiers.md (1)
28-39:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd language specifier to fenced code block.
The fenced code block lacks a language identifier, triggering markdownlint MD040.
📝 Proposed fix
-``` +```text 'grant."idempotencyKey"' contains unsupported syntax. Falling back to parsing as a 'Command'.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@BUG-reserved-keyword-identifiers.md` around lines 28 - 39, The fenced code block showing the SQL error (the block starting with 'grant."idempotencyKey"' and the following SQL/stack trace) is missing a language identifier which triggers markdownlint MD040; update the fence opening from ``` to ```text (or another appropriate language like ```sql if you prefer) so the block begins with ```text and keep the existing block contents unchanged to satisfy the linter and preserve the example.
🧹 Nitpick comments (2)
slayer/engine/enrichment.py (1)
2463-2464: ⚡ Quick winUse keyword arguments for
quote_ident_for(...)calls.These multi-parameter calls are currently positional in production code; switch to keyword form for repo consistency.
Suggested change
- src_q = quote_ident_for(current_alias, dialect) - tgt_q = quote_ident_for(hop_alias, dialect) + src_q = quote_ident_for(name=current_alias, dialect=dialect) + tgt_q = quote_ident_for(name=hop_alias, dialect=dialect) ... - repl_sql = f"{quote_ident_for(model_name, dialect)}.{sql_expr}" + repl_sql = f"{quote_ident_for(name=model_name, dialect=dialect)}.{sql_expr}" ... - repl_sql = f"{quote_ident_for(table_alias, dialect)}.{sql_expr}" + repl_sql = f"{quote_ident_for(name=table_alias, dialect=dialect)}.{sql_expr}" ... - repl_sql = f"{quote_ident_for(model.name, dialect)}.{stage_col.name}" + repl_sql = f"{quote_ident_for(name=model.name, dialect=dialect)}.{stage_col.name}"As per coding guidelines
**/*.py: Use keyword arguments for functions with more than 1 parameter.Also applies to: 2710-2710, 2801-2801, 2881-2881
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/engine/enrichment.py` around lines 2463 - 2464, The calls to quote_ident_for are using positional arguments; update them to use keyword arguments (e.g., quote_ident_for(identifier=current_alias, dialect=dialect)) for the instances in this file (specifically where current_alias/hop_alias and dialect are passed to quote_ident_for), and apply the same change at the other occurrences mentioned (near the uses involving current_alias, hop_alias at the other locations). Ensure you only rename the call signature to use identifier= and dialect= keywords without changing variable names or behavior.slayer/sql/generator.py (1)
72-79: ⚡ Quick winUse keyword arguments for multi-parameter helper calls in production code.
There are new positional calls to helpers with multiple parameters; switch these to keyword style for consistency with repo rules.
♻️ Suggested change
- return f"{quote_ident_for(alias, dialect)}.{col}" + return f"{quote_ident_for(name=alias, dialect=dialect)}.{col}" @@ - parts = [f"{quote_ident_for(model, self.dialect)}.*"] + parts = [f"{quote_ident_for(name=model, dialect=self.dialect)}.*"] @@ - _qualify_in_string(model, col_name, self.dialect), + _qualify_in_string(alias=model, col=col_name, dialect=self.dialect),As per coding guidelines, "
**/*.py: Use keyword arguments for functions with more than 1 parameter."Also applies to: 1980-1980, 2596-2596
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 72 - 79, The _qualify_in_string helper currently calls quote_ident_for(alias, dialect) using positional arguments; change this to use keyword arguments to follow the repo rule for multi-parameter helpers—call quote_ident_for(name=alias, dialect=dialect) (or the exact parameter names of quote_ident_for) and similarly update any other positional uses of quote_ident_for (and other multi-parameter helpers referenced at locations like the ones called around lines 1980 and 2596) to use explicit keyword arguments so the intention is clear and consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@slayer/sql/generator.py`:
- Around line 2082-2084: The current code builds JOIN fragments by concatenating
strings from enriched.resolved_joins and quote_ident_for, which violates the
AST-first rule; replace that logic to construct sqlglot AST nodes instead: for
each (target_table, target_alias, join_cond, jtype) in enriched.resolved_joins
create a sqlglot.exp.Table (or sqlglot.exp.Identifier) for target_table, wrap
with sqlglot.exp.Alias for target_alias, build the join condition as a sqlglot
expression (parse or compose from existing expression nodes) and assemble a
sqlglot.exp.Join with the appropriate join type (jtype.upper()) and
.on(join_cond_expr); then attach these Join nodes to the main query AST rather
than emitting raw strings and stop using quote_ident_for for JOIN clauses so
dialect/escaping is handled by sqlglot.
- Around line 64-69: The function quote_ident_for currently builds an identifier
directly via exp.to_identifier instead of reusing the canonical
_model_identifier helper; change quote_ident_for to call _model_identifier(name)
and then return its SQL text with the requested dialect (e.g.
_model_identifier(name).sql(dialect=dialect or "postgres")) so all
model/alias/column quoting follows the single _model_identifier contract used
elsewhere.
---
Duplicate comments:
In `@BUG-reserved-keyword-identifiers.md`:
- Around line 28-39: The fenced code block showing the SQL error (the block
starting with 'grant."idempotencyKey"' and the following SQL/stack trace) is
missing a language identifier which triggers markdownlint MD040; update the
fence opening from ``` to ```text (or another appropriate language like ```sql
if you prefer) so the block begins with ```text and keep the existing block
contents unchanged to satisfy the linter and preserve the example.
---
Nitpick comments:
In `@slayer/engine/enrichment.py`:
- Around line 2463-2464: The calls to quote_ident_for are using positional
arguments; update them to use keyword arguments (e.g.,
quote_ident_for(identifier=current_alias, dialect=dialect)) for the instances in
this file (specifically where current_alias/hop_alias and dialect are passed to
quote_ident_for), and apply the same change at the other occurrences mentioned
(near the uses involving current_alias, hop_alias at the other locations).
Ensure you only rename the call signature to use identifier= and dialect=
keywords without changing variable names or behavior.
In `@slayer/sql/generator.py`:
- Around line 72-79: The _qualify_in_string helper currently calls
quote_ident_for(alias, dialect) using positional arguments; change this to use
keyword arguments to follow the repo rule for multi-parameter helpers—call
quote_ident_for(name=alias, dialect=dialect) (or the exact parameter names of
quote_ident_for) and similarly update any other positional uses of
quote_ident_for (and other multi-parameter helpers referenced at locations like
the ones called around lines 1980 and 2596) to use explicit keyword arguments so
the intention is clear and consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ef0d9be3-7b84-412b-a5a4-d9bac8fb9cb2
📒 Files selected for processing (12)
BUG-reserved-keyword-identifiers.mdCLAUDE.mdslayer/engine/column_expansion.pyslayer/engine/enrichment.pyslayer/engine/query_engine.pyslayer/sql/generator.pytests/integration/test_integration_duckdb.pytests/integration/test_integration_postgres.pytests/test_aggregation_gating.pytests/test_cross_model_derived_columns.pytests/test_reserved_keyword_quoting.pytests/test_sql_generator.py
✅ Files skipped from review due to trivial changes (2)
- slayer/engine/query_engine.py
- CLAUDE.md
| def quote_ident_for(name: str, dialect: Optional[str]) -> str: | ||
| """Quoted-identifier text for ``name`` in ``dialect`` (double quotes / MySQL | ||
| backticks / T-SQL brackets). For raw-string sites re-parsed with the same | ||
| dialect — a hardcoded ``"`` is a string literal on backtick dialects. | ||
| """ | ||
| return exp.to_identifier(name, quoted=True).sql(dialect=dialect or "postgres") |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Use _model_identifier() inside quote_ident_for() to keep one quoting contract.
This helper currently reimplements identifier creation instead of routing through _model_identifier, which can drift behavior over time.
♻️ Suggested change
def quote_ident_for(name: str, dialect: Optional[str]) -> str:
@@
- return exp.to_identifier(name, quoted=True).sql(dialect=dialect or "postgres")
+ return _model_identifier(name).sql(dialect=dialect or "postgres")As per coding guidelines, "Every SLayer model name / FROM alias / column qualifier must be emitted as a quoted identifier using the _model_identifier(name) helper."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@slayer/sql/generator.py` around lines 64 - 69, The function quote_ident_for
currently builds an identifier directly via exp.to_identifier instead of reusing
the canonical _model_identifier helper; change quote_ident_for to call
_model_identifier(name) and then return its SQL text with the requested dialect
(e.g. _model_identifier(name).sql(dialect=dialect or "postgres")) so all
model/alias/column quoting follows the single _model_identifier contract used
elsewhere.
| f"{jtype.upper()} JOIN {target_table} AS " | ||
| f"{quote_ident_for(target_alias, self.dialect)} ON {join_cond}" | ||
| for target_table, target_alias, join_cond, jtype in enriched.resolved_joins |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Avoid raw JOIN string assembly in SQL generation.
Building JOIN fragments with string concatenation here bypasses the AST-first contract and can reintroduce dialect/escaping edge cases.
♻️ Suggested direction
- join_sql_parts = [
- f"{jtype.upper()} JOIN {target_table} AS "
- f"{quote_ident_for(target_alias, self.dialect)} ON {join_cond}"
- for target_table, target_alias, join_cond, jtype in enriched.resolved_joins
- ]
- ranked_sql += " " + " ".join(join_sql_parts)
+ parsed_ranked = self._parse(ranked_sql)
+ if isinstance(parsed_ranked, exp.Select):
+ for target_table, target_alias, join_cond, jtype in enriched.resolved_joins:
+ join_target = exp.to_table(target_table, alias=_model_identifier(target_alias))
+ parsed_ranked = parsed_ranked.join(
+ join_target,
+ on=self._parse_predicate(join_cond),
+ join_type=jtype.upper(),
+ )
+ ranked_sql = parsed_ranked.sql(dialect=self.dialect)As per coding guidelines, "slayer/sql/**/*.py: Use sqlglot AST building (not string concatenation) for SQL generation."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@slayer/sql/generator.py` around lines 2082 - 2084, The current code builds
JOIN fragments by concatenating strings from enriched.resolved_joins and
quote_ident_for, which violates the AST-first rule; replace that logic to
construct sqlglot AST nodes instead: for each (target_table, target_alias,
join_cond, jtype) in enriched.resolved_joins create a sqlglot.exp.Table (or
sqlglot.exp.Identifier) for target_table, wrap with sqlglot.exp.Alias for
target_alias, build the join condition as a sqlglot expression (parse or compose
from existing expression nodes) and assemble a sqlglot.exp.Join with the
appropriate join type (jtype.upper()) and .on(join_cond_expr); then attach these
Join nodes to the main query AST rather than emitting raw strings and stop using
quote_ident_for for JOIN clauses so dialect/escaping is handled by sqlglot.


Problem
A model whose name is a reserved SQL keyword (a customer source table
Grant→ modelgrant; alsoorder,user,select) was emitted as an unquoted identifier — both as theFROMalias and as every column qualifier. sqlglot's default generator does not quote reserved keywords, so the database rejected the query:Two failure surfaces:
grantalias/qualifier → DB syntax error.grant."col"re-parsed mid-pipeline makessqlglot.parse_onefall back to aCommandstatement parse, leaking the tell-talegrant .spacing into the emitted SQL.Found via Storyline shadow-compare ingestion (a
Granttable). Tracking doc:BUG-reserved-keyword-identifiers.md.Fix
Quote model-name identifiers unconditionally at construction time — fixes A directly and B transitively (
"grant"."col"re-parses cleanly). Every model-name identifier routes through one helper,_model_identifier(exp.to_identifier(name, quoted=True)), so alias definitions and qualifier references quote identically (matters on case-folding dialects like Snowflake)..sql(identify=True)was rejected — it doesn't fix surface B and over-quotes.Sites covered:
_build_from_clause,_resolve_sql, measure qualifiers, the cross-model CTE, all join-target aliases, the ranked subquery (incl.SELECT <model>.*),column_expansion, and the filter/join-condition builders inenrichment.py.Cross-dialect: raw-SQL-string sites that get re-parsed (join conditions, ranked-subquery
.*/JOIN, WHERE qualifier) can't hardcode"— MySQL/Spark/Databricks quote with backticks, where"grant"is a string literal. Those usequote_ident_for(name, dialect). The rerooted-filter qualifier (query_engine.py) stays bare because it's re-parsed by the Python-ASTparse_filter(where"x"is a string literal); it's quoted later inresolve_filter_columns. The join-path-discovery regex (_TABLE_COL_RE) is now quote-tolerant.Tests
tests/test_reserved_keyword_quoting.py:grant/order/user/select× 6 dialects (incl. MySQL backticks) — asserts quoted form, clean parse, and noCommandfallback warning (captured on thesqlglotlogger).orders.status→"orders".status; backticks for MySQL).Full unit suite: no new failures vs. baseline.
ruffclean.CLAUDE.mddocuments the convention.Summary by CodeRabbit
Bug Fixes
Documentation
Tests