Source: code-review of #31; nice-to-have, deferred at merge.
skills/query-warehouse/SKILL.md claims a conservative pre-run cost estimate for Snowflake, computed as table size from INFORMATION_SCHEMA.TABLE_STORAGE_METRICS × a selectivity heuristic (vs BigQuery's native dry-run mode). The skill prose names the approach but doesn't spell out the selectivity formula clearly enough for an agent to apply it consistently across queries.
Today the skill says "table size × selectivity heuristic — conservative, will sometimes refuse a query that would have been cheap" but doesn't define what selectivity to assume. Two cases will diverge:
- A query with a WHERE clause on a clustered column → low selectivity (~1-10% of rows scanned).
- A query without any filter → 100% scan.
Concrete proposal:
- Define the selectivity heuristic explicitly: e.g., "no WHERE clause → 100%; WHERE on clustering key → 5%; WHERE on non-clustered column → 50% as upper bound."
- Add a worked example in the skill: query → estimated bytes → upper-bound dollar → kill-cap check outcome.
- Or: drop the Snowflake estimate entirely from this skill and route Snowflake users through a warehouse-layer billing alert + small probe query, since mcp-snowflake-server's dry-run support is thin.
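To make the first two bullets concrete, here is a minimal Python sketch of what the worked example could look like. The three selectivity values come from the proposal above. Everything else is a placeholder assumption, not something the skill currently defines: the names `check_query`, `Estimate`, `USD_PER_TB_SCANNED`, and `KILL_CAP_USD`, the $5/TB rate, and the $1 cap. Note that Snowflake bills warehouse compute time rather than bytes scanned, so the bytes-to-dollars step is only a rough proxy that the skill would need to calibrate.

```python
from dataclasses import dataclass

# Selectivity assumptions, taken from the proposal in this issue.
SELECTIVITY = {
    "none": 1.00,            # no WHERE clause: assume a full scan
    "clustering_key": 0.05,  # WHERE on a clustering key
    "non_clustered": 0.50,   # WHERE on a non-clustered column (upper bound)
}

# Hypothetical placeholders: neither value is defined by the skill today.
USD_PER_TB_SCANNED = 5.00
KILL_CAP_USD = 1.00


@dataclass(frozen=True)
class Estimate:
    estimated_bytes: float
    upper_bound_usd: float
    allowed: bool


def check_query(table_bytes: int, filter_kind: str) -> Estimate:
    """Estimate scanned bytes and cost, then compare against the kill cap."""
    # Unknown filter kinds fall back to a full scan to stay conservative.
    selectivity = SELECTIVITY.get(filter_kind, SELECTIVITY["none"])
    estimated_bytes = table_bytes * selectivity
    upper_bound_usd = estimated_bytes / 1e12 * USD_PER_TB_SCANNED
    return Estimate(estimated_bytes, upper_bound_usd, upper_bound_usd <= KILL_CAP_USD)


if __name__ == "__main__":
    table_bytes = 2 * 10**12  # e.g. a 2 TB table, size read from TABLE_STORAGE_METRICS
    for kind in SELECTIVITY:
        est = check_query(table_bytes, kind)
        verdict = "run" if est.allowed else "refuse"
        print(f"{kind}: ~{est.estimated_bytes / 1e12:.2f} TB, "
              f"<= ${est.upper_bound_usd:.2f} -> {verdict}")
```

Under these placeholder numbers, a 2 TB table filtered on its clustering key comes out at roughly 0.1 TB and about $0.50, which passes the cap, while the unfiltered and non-clustered cases are refused. That is the "conservative, will sometimes refuse a query that would have been cheap" behaviour the skill already describes, now with the assumptions written down.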
Files: skills/query-warehouse/SKILL.md — the Key Concepts and Shape 2 sections.
Risk: medium — until this is sharpened, the Snowflake half of the skill is documented but not operationally usable.