Default to varchar(max) for inferred string columns #385

Open
sdebruyn wants to merge 1 commit into microsoft:main from sdebruyn:varchar-max
@sdebruyn sdebruyn commented May 19, 2026

Collaborator

Fixes #384.

FabricColumn defaulted STRING/VARCHAR/NVARCHAR to VARCHAR(8000) in TYPE_LABELS, and both string_type() and string_size() fell back to 8000 when no explicit size was supplied. As a result, any inferred-width string column was silently hard-capped at 8000 characters: long-text source columns (JSON payloads, free-text fields, serialized blobs) lost everything beyond that limit with no warning or error.

Fabric Warehouse supports varchar(max), so this PR makes that the default when no size is provided.

Changes

  • TYPE_LABELS: STRING, VARCHAR, and NVARCHAR now map to VARCHAR(MAX).
  • string_type(size): returns varchar(max) when size is None or non-positive; otherwise varchar({size}).
  • string_size(): returns -1 (the T-SQL sentinel for varchar(max)) when char_size is None, instead of 8000.
  • can_expand_to(): treats -1 as the widest possible size so comparisons stay correct.

Users who want a fixed-width column can still declare it explicitly via a contract or by setting char_size directly.
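
For reviewers who want to see how the sentinel flows through the three methods, here is a minimal standalone sketch of the behavior described above. It is not the adapter's actual code: `SketchColumn` and `MAX_SENTINEL` are hypothetical names, and the `can_expand_to()` semantics (true only when the other column is strictly wider) are an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical constant name; -1 is the value the PR uses for varchar(max).
MAX_SENTINEL = -1


@dataclass
class SketchColumn:
    """Hypothetical stand-in for FabricColumn, for illustration only."""

    name: str
    char_size: Optional[int] = None

    @staticmethod
    def string_type(size: Optional[int] = None) -> str:
        # No size, or a non-positive size such as the -1 sentinel, means varchar(max).
        if size is None or size <= 0:
            return "varchar(max)"
        return f"varchar({size})"

    def string_size(self) -> int:
        # Inferred-width columns report the sentinel instead of 8000.
        return MAX_SENTINEL if self.char_size is None else self.char_size

    def can_expand_to(self, other: "SketchColumn") -> bool:
        # Assumed semantics: True when `other` is strictly wider than this column.
        mine, theirs = self.string_size(), other.string_size()
        if mine == MAX_SENTINEL:
            return False  # already the widest possible size
        if theirs == MAX_SENTINEL:
            return True  # max is wider than any fixed size
        return theirs > mine


inferred = SketchColumn("payload")          # no size supplied
fixed = SketchColumn("code", char_size=50)  # explicit width still honored

print(SketchColumn.string_type(inferred.string_size()))  # varchar(max)
print(SketchColumn.string_type(fixed.string_size()))     # varchar(50)
print(fixed.can_expand_to(inferred))                     # True
print(inferred.can_expand_to(fixed))                     # False
```

The sentinel needs special handling because a plain numeric comparison would rank -1 below every fixed width, which is the opposite of what varchar(max) means.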

Testing

The maintainers can wire this through the project's test suite. Happy to extend the change with adapter unit tests covering the new sentinel paths if that's the preferred follow-up.

The previous `varchar(8000)` default in TYPE_LABELS, string_type() and
string_size() silently truncated any inferred-width string column at 8000
characters. Fabric Warehouse supports varchar(max); use that as the
default when no explicit size is provided.

string_size() now returns -1 as the sentinel for max, mirroring how T-SQL
represents varchar(max) in sys.columns. string_type() and can_expand_to()
were updated to handle the sentinel.

Development

Successfully merging this pull request may close these issues.

varchar(8000) default in FabricColumn silently truncates strings
