feat: safe type expansion, NVARCHAR/NCHAR catalog fix, seed empty cell fix #702

Open
axellpadilla wants to merge 1 commit into master from fix/missing-type-handling

@axellpadilla (Collaborator)

Summary

This PR addresses several long-standing issues with SQL Server native type handling during column expansion, catalog generation, and seed ingestion. The issues are closely related, so they were reproduced and fixed together in a single pass.

Closes: #701, #637, #425, #446
Supersedes: #606, thanks @Cogito

What Changed

1. SQL Server Native String Type Recognition (sqlserver_column.py)

  • is_string() now includes nvarchar and nchar in addition to varchar and char
  • string_type_instance() — new instance method that preserves the original type family:
    • nvarchar(n) emits nvarchar(n) (not varchar(n))
    • nchar(n) emits nchar(n) (not char(n))
    • Falls back to varchar(n) / char(n) for non-Unicode types
  • data_type property now uses string_type_instance() instead of string_type()
  • is_number() now includes is_fixed_numeric() so money/smallmoney participate in numeric checks without being classified as is_numeric()
  • is_fixed_numeric() — new method for money/smallmoney
  • is_numeric() now excludes money/smallmoney (breaking change — see migration note below)
  • is_integer() now includes tinyint and bit
  • can_expand_to() is now stricter: it only allows same-family string size increases (e.g., varchar(10) → varchar(25))
  • can_expand_safe() — new method for flag-gated safe expansions:
| Source | Target | Allowed? |
|---|---|---|
| varchar(n) | nvarchar(m) where m >= n | With flag |
| char(n) | nchar(m) where m >= n | With flag |
| bit → tinyint → smallint → int → bigint | Higher in family | With flag |
| int | numeric(p,s) where p >= 10 | With flag |
| numeric(p,s) | numeric(p2,s2) where p2 >= p and s2 >= s | With flag |
| smallmoney | money | With flag |
| money | numeric(p,s) where p >= 19 | With flag |
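The rules in the table above can be read as a small decision function. The following is a minimal sketch, not the adapter's actual can_expand_safe() implementation: the names parse_type, INTEGER_ORDER, and can_expand_safe_sketch are hypothetical, and types are modelled as plain strings rather than column objects. It mirrors the table literally and does not add checks the table does not state.

```python
import re
from typing import Tuple

# Integer family in ascending order, as listed in the table (assumed ordering).
INTEGER_ORDER = ["bit", "tinyint", "smallint", "int", "bigint"]


def parse_type(type_str: str) -> Tuple[str, Tuple[int, ...]]:
    """Split 'numeric(18,2)' into ('numeric', (18, 2)) and 'int' into ('int', ())."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:\(([\d\s,]+)\))?\s*", type_str.lower())
    if match is None:
        raise ValueError(f"unrecognized type: {type_str!r}")
    name, args = match.group(1), match.group(2)
    params = tuple(int(a) for a in args.split(",")) if args else ()
    return name, params


def can_expand_safe_sketch(source: str, target: str) -> bool:
    """Hypothetical re-statement of the table; not the PR's real method."""
    src, src_args = parse_type(source)
    tgt, tgt_args = parse_type(target)

    # varchar(n) -> nvarchar(m) and char(n) -> nchar(m), where m >= n
    if (src, tgt) in {("varchar", "nvarchar"), ("char", "nchar")}:
        return len(src_args) == 1 and len(tgt_args) == 1 and tgt_args[0] >= src_args[0]

    # Integer family: only promotions to a higher member are allowed
    if src in INTEGER_ORDER and tgt in INTEGER_ORDER:
        return INTEGER_ORDER.index(tgt) > INTEGER_ORDER.index(src)

    # int -> numeric(p,s) where p >= 10
    if src == "int" and tgt == "numeric":
        return len(tgt_args) >= 1 and tgt_args[0] >= 10

    # numeric(p,s) -> numeric(p2,s2) where p2 >= p and s2 >= s
    if src == "numeric" and tgt == "numeric":
        if len(src_args) != 2 or len(tgt_args) != 2:
            return False
        return tgt_args[0] >= src_args[0] and tgt_args[1] >= src_args[1]

    # smallmoney -> money
    if src == "smallmoney" and tgt == "money":
        return True

    # money -> numeric(p,s) where p >= 19
    if src == "money" and tgt == "numeric":
        return len(tgt_args) >= 1 and tgt_args[0] >= 19

    return False
```

For example, can_expand_safe_sketch("varchar(10)", "nvarchar(25)") is allowed, while the reverse direction and any integer demotion such as bigint to int are rejected.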

2. Safe Type Expansion Feature Flag (sqlserver_adapter.py)

New dbt_sqlserver_enable_safe_type_expansion behaviour flag (default: false):

# dbt_project.yml
flags:
  dbt_sqlserver_enable_safe_type_expansion: true

When enabled, the adapter's expand_column_types() override performs:

  1. Same-family string resizes: always proceed (e.g., varchar(10) → varchar(25))
  2. Safe type expansions — only when flag is enabled AND column_type_expansion_max_rows is not exceeded:
    • Cross-family string: varchar/char → nvarchar/nchar
    • Integer family promotions
    • Integer → numeric with sufficient precision
    • numeric/decimal precision/scale upgrades
    • Fixed-money promotions (smallmoney → money → numeric)
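The two-tier behaviour described in this list can be summarised as a small decision function. This is a hedged sketch only: plan_expansion and its boolean parameters are hypothetical names chosen for illustration, and the real expand_column_types() override works on column objects and relations rather than pre-computed flags.

```python
def plan_expansion(
    same_family_resize: bool,
    safe_expansion: bool,
    flag_enabled: bool,
    within_row_limit: bool,
) -> str:
    """Return 'alter' or 'skip' for one column pair (illustrative only)."""
    # Tier 1: same-family string resizes always proceed, regardless of the flag.
    if same_family_resize:
        return "alter"
    # Tier 2: safe type expansions need both the behaviour flag and the
    # column_type_expansion_max_rows guardrail to pass.
    if safe_expansion and flag_enabled and within_row_limit:
        return "alter"
    return "skip"
```

With the flag left at its default of false, a varchar to nvarchar change is skipped, while a varchar(10) to varchar(25) resize still goes through.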

expand_target_column_types() — new public API that forwards the max_rows parameter, called from incremental and snapshot materializations.

alter_column_type() — new method that dispatches to the sqlserver__alter_column_type macro, replacing the base adapter's implementation.

3. Row-Count Guardrail (column_type_expansion_max_rows)

New per-model config (default: 1,000,000):

{{ config(materialized='incremental', unique_key='id',
           column_type_expansion_max_rows=500000) }}

  • Safe type expansion is skipped when the table exceeds this row count
  • Set to -1 to disable the check
  • Set to 0 to always skip safe expansion (only same-family string resizes proceed)
  • Skipped expansions emit a warning log with the row count and limit
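The guardrail semantics listed above, including the two special values, can be sketched as follows. The function name safe_expansion_allowed and the constant DEFAULT_MAX_ROWS are hypothetical, and treating an unset config (None) as the 1,000,000 default is an assumption about how the adapter resolves the value, not something confirmed by the PR text.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Assumed fallback when the model does not set column_type_expansion_max_rows.
DEFAULT_MAX_ROWS = 1_000_000


def safe_expansion_allowed(row_count: int, max_rows: Optional[int] = None) -> bool:
    """Decide whether safe type expansion may run for a table of row_count rows."""
    limit = DEFAULT_MAX_ROWS if max_rows is None else max_rows
    if limit == -1:
        # -1 disables the check entirely.
        return True
    if limit == 0:
        # 0 always skips safe expansion, even for an empty table.
        return False
    if row_count > limit:
        # Skipped expansions log the row count and the limit.
        logger.warning(
            "Skipping safe type expansion: %d rows exceeds limit of %d",
            row_count,
            limit,
        )
        return False
    return True
```

Handling 0 as an explicit branch matters: a plain row_count > limit comparison would let an empty table through, which contradicts "always skip".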

4. Single ALTER COLUMN Mode (prefer_single_alter_column)

New per-model config (default: false):

{{ config(materialized='incremental', unique_key='id',
           prefer_single_alter_column=true) }}

When true, the sqlserver__alter_column_type macro uses a single ALTER COLUMN statement instead of the safer add+update+drop+rename pattern. This is faster for small/medium tables and instant for safe type expansions, but may fail for types that cannot be implicitly converted.
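To make the difference between the two modes concrete, the sketch below builds the T-SQL each mode would roughly issue. It is an illustration under assumptions, not the macro's actual output: the function alter_column_statements is hypothetical, the temporary column suffix __dbt_alter is assumed, and identifier quoting and nullability handling are omitted.

```python
from typing import List


def alter_column_statements(
    table: str,
    column: str,
    new_type: str,
    prefer_single_alter_column: bool = False,
) -> List[str]:
    """Return the statements for one column type change (illustrative only)."""
    if prefer_single_alter_column:
        # Single statement: relies on SQL Server converting existing values.
        return [f"ALTER TABLE {table} ALTER COLUMN {column} {new_type};"]

    # add + update + drop + rename pattern via a temporary column.
    tmp = f"{column}__dbt_alter"  # assumed temporary column name
    return [
        f"ALTER TABLE {table} ADD {tmp} {new_type};",
        f"UPDATE {table} SET {tmp} = {column};",
        f"ALTER TABLE {table} DROP COLUMN {column};",
        f"EXEC sp_rename '{table}.{tmp}', '{column}', 'COLUMN';",
    ]
```

The single-statement mode touches the column once, whereas the four-step pattern copies every row into a new column before dropping the old one, which is why it costs more on larger tables.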

5. Catalog Fix (catalog.sql)

Changed sys.types join from system_type_id to user_type_id in both catalog queries. This prevents NVARCHAR/NCHAR columns from appearing as SYSNAME in dbt docs generate output. Fixes #637.
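The reason the join key matters is that sysname is an alias type that shares its system_type_id with nvarchar but has its own user_type_id. The sketch below simulates the join in memory with a reduced copy of sys.types; the helper join_types and the sample column are hypothetical, and the ID values are those SQL Server normally reports for these built-in types.

```python
# Reduced stand-in for sys.types (values as typically reported by SQL Server).
TYPES = [
    {"name": "nvarchar", "system_type_id": 231, "user_type_id": 231},
    {"name": "sysname", "system_type_id": 231, "user_type_id": 256},
    {"name": "varchar", "system_type_id": 167, "user_type_id": 167},
]

# A hypothetical NVARCHAR column as it would appear in sys.columns.
COLUMN = {"name": "customer_name", "system_type_id": 231, "user_type_id": 231}


def join_types(column: dict, key: str) -> list:
    """Return the type names matched when joining the column to TYPES on key."""
    return [t["name"] for t in TYPES if t[key] == column[key]]


# Joining on system_type_id matches both nvarchar and sysname.
ambiguous = join_types(COLUMN, "system_type_id")
# Joining on user_type_id matches exactly one type.
exact = join_types(COLUMN, "user_type_id")
```

Here ambiguous contains both nvarchar and sysname, so the catalog can end up reporting the wrong one, while exact contains only nvarchar.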

6. Seed Empty Cell Fix (helpers.sql)

Changed seed CSV ingestion to inline NULL literals instead of binding empty cells as SQL parameters. Previously, an empty cell in a numeric(18,0) column would be bound as an empty string parameter, causing arithmetic overflow error 8115. Now empty cells emit null directly in the VALUES clause. Fixes #425.
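The idea of inlining NULL rather than binding an empty string can be sketched as below. This is not the helpers.sql macro itself: render_values_row is a hypothetical Python stand-in, and the ? placeholder style is an assumption about the driver's parameter syntax.

```python
from typing import List, Optional, Tuple


def render_values_row(cells: List[Optional[str]]) -> Tuple[str, List[str]]:
    """Build one VALUES tuple plus the parameters to bind for it."""
    fragments = []
    bindings = []
    for cell in cells:
        if cell is None or cell == "":
            # Empty cell: emit a NULL literal and bind nothing.
            fragments.append("null")
        else:
            # Non-empty cell: bind as a parameter (placeholder style assumed).
            fragments.append("?")
            bindings.append(cell)
    return "(" + ", ".join(fragments) + ")", bindings
```

A row such as ["1", "", "abc"] renders as (?, null, ?) with only "1" and "abc" bound, so the numeric column never receives an empty string to convert.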

7. Adapter Configs (sqlserver_configs.py)

Added two new optional config fields:

  • prefer_single_alter_column: Optional[bool] = False
  • column_type_expansion_max_rows: Optional[int] = None

8. Unit Tests

  • test_sqlserver_column.py — Tests for is_string(), string_type_instance(), data_type, is_fixed_numeric(), is_numeric(), string_size() across all string/numeric type families
  • test_can_expand_to.py — Parameterized tests for can_expand_to() and can_expand_safe() covering same-family resizes, cross-family promotions, integer family promotions, numeric precision/scale upgrades, fixed-money promotions, and prevented shrinking conversions
  • test_expand_column_types.py — Tests for the adapter's expand_column_types() method: row-count skip, max-rows=0 blocking, warning emission, max_rows forwarding through expand_target_column_types()

9. Functional Tests


Breaking Changes / Migration Notes

  • money and smallmoney columns are no longer classified as is_numeric(). If you have custom code or macros that depend on money being numeric:
    • Use is_number() (covers all numeric types including money)
    • Use is_fixed_numeric() for money types specifically
    • Use is_numeric() only for numeric/decimal types
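The relationship between the three predicates can be sketched with plain sets of type names. This is a simplified illustration, not the SQLServerColumn code: the functions take a type name string instead of being column methods, and the exact membership of the integer and float sets is an assumption.

```python
# Assumed type families; the real column class may list more types.
INTEGER_TYPES = {"bit", "tinyint", "smallint", "int", "bigint"}
FLOAT_TYPES = {"float", "real"}
NUMERIC_TYPES = {"numeric", "decimal"}
FIXED_NUMERIC_TYPES = {"money", "smallmoney"}


def is_numeric(dtype: str) -> bool:
    """numeric/decimal only; money and smallmoney are excluded."""
    return dtype.lower() in NUMERIC_TYPES


def is_fixed_numeric(dtype: str) -> bool:
    """money and smallmoney specifically."""
    return dtype.lower() in FIXED_NUMERIC_TYPES


def is_number(dtype: str) -> bool:
    """Any numeric-like type, including the money types."""
    d = dtype.lower()
    return (
        d in INTEGER_TYPES
        or d in FLOAT_TYPES
        or is_numeric(d)
        or is_fixed_numeric(d)
    )
```

Custom macros that previously branched on is_numeric() to catch money columns would switch to is_number() for the broad check or is_fixed_numeric() for the money-only case.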

Related PRs & History

…l fix

- Add dbt_sqlserver_enable_safe_type_expansion flag for safe column type widening
  (varchar->nvarchar, integer family promotions, numeric precision/scale upgrades)
- Add column_type_expansion_max_rows config (default 1,000,000 rows)
- Add prefer_single_alter_column config for single ALTER COLUMN statement
- Add string_type_instance() to preserve NVARCHAR/NCHAR type family
- Fix catalog generation (user_type_id) so NVARCHAR/NCHAR no longer appear as SYSNAME
- Fix is_numeric() to exclude money/smallmoney (now is_fixed_numeric())
- Fix seed table ingestion of empty numeric cells
- Add tinyint/bit to is_integer() type list