Skip to content

fix(admin): prevent SQL injection in system query validation#7748

Closed
fix-it-felix-sentry[bot] wants to merge 3 commits into
masterfrom
fix/sql-injection-system-queries-pf-66
Closed

fix(admin): prevent SQL injection in system query validation#7748
fix-it-felix-sentry[bot] wants to merge 3 commits into
masterfrom
fix/sql-injection-system-queries-pf-66

Conversation

@fix-it-felix-sentry

Copy link
Copy Markdown
Contributor

Summary

Added sanitization for SQL queries before using them in EXPLAIN statements to prevent SQL injection vulnerabilities during query validation.

Changes

  • Added _sanitize_query_for_explain() function to validate and sanitize queries before they are used in EXPLAIN QUERY TREE and EXPLAIN AST statements
  • Checks for unbalanced quotes, multiple statement attempts, and SQL comments
  • Applied sanitization in is_query_using_only_system_tables() and is_valid_system_query() functions

Security Impact

This fix addresses a SQL injection vulnerability where user input was directly interpolated into EXPLAIN statements without proper sanitization. The vulnerability could potentially allow an attacker to:

  • Execute multiple SQL statements via semicolon injection
  • Bypass query validation using SQL comments
  • Inject malicious SQL code during the validation phase

The fix adds defense-in-depth protection during the validation phase while maintaining the existing EXPLAIN AST-based validation logic.

Testing

The fix maintains backward compatibility with existing legitimate queries:

  • Single SELECT statements with trailing semicolons
  • Queries with string literals containing quotes
  • Queries with datetime functions

It properly blocks:

  • Multiple statement attempts (e.g., SELECT * FROM ...; DROP TABLE ...)
  • SQL comment-based injection attempts
  • Unbalanced quote patterns

References

Added sanitization for SQL queries before using them in EXPLAIN statements
to prevent SQL injection vulnerabilities during query validation.

Changes:
- Added _sanitize_query_for_explain() function to validate and sanitize
  queries before they are used in EXPLAIN QUERY TREE and EXPLAIN AST statements
- Checks for unbalanced quotes, multiple statement attempts, and SQL comments
- Applied sanitization in is_query_using_only_system_tables() and
  is_valid_system_query() functions

This fix addresses SQL injection vulnerability in:
- snuba/admin/clickhouse/system_queries.py:51

The sanitization adds defense-in-depth protection during the validation phase,
while maintaining the existing EXPLAIN AST-based validation logic.

Fixes: https://linear.app/getsentry/issue/VULN-1029
Fixes: https://linear.app/getsentry/issue/PF-66

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@fix-it-felix-sentry fix-it-felix-sentry Bot requested a review from a team as a code owner February 19, 2026 00:02
@linear

linear Bot commented Feb 19, 2026

Copy link
Copy Markdown

phacops and others added 2 commits February 18, 2026 17:13
The unquoted `default` clusterAllReplicas query was incorrectly in the
invalid test list. It only appeared to fail before because the old
semicolon-stripping logic didn't handle trailing whitespace after the
semicolon, causing EXPLAIN to error. The new _sanitize_query_for_explain
correctly strips whitespace first, revealing this is a valid query.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflict in tests/admin/test_system_queries.py: keep the PR's move
of the clusterAllReplicas(default, system.query_log) case into the valid
query list, on top of master's switch from @pytest.mark.clickhouse_db to
@pytest.mark.events_db.

Trim _sanitize_query_for_explain to reduce false positives: drop the
fragile unbalanced-quote heuristic (it mishandled ClickHouse '' / \' string
escaping) and make string-literal stripping escape-aware before scanning for
statement chaining and SQL comments. The EXPLAIN validation runs as the
read-only user and the AST checks remain the primary control, so this stays
defense-in-depth only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019A8M9WeNTAXuW6rxCjenKJ

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 21ee79f. Configure here.


# Reject statement chaining outside of string literals.
if ";" in without_strings:
raise InvalidCustomQuery("Multiple statements are not allowed")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double trailing semicolon rejected

Low Severity

_sanitize_query_for_explain removes only one trailing semicolon, then treats any remaining ; as multiple statements. Queries that end with more than one semicolon (previously normalized via rstrip(";")) now raise InvalidCustomQuery with a misleading “Multiple statements” message even when the SQL is a single statement.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 21ee79f. Configure here.

phacops commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Closing this without merging.

After reviewing the actual code path, this is a true Semgrep finding (raw f-string interpolation of input into a SQL string) but not a practically exploitable SQL injection here, so the added sanitization is low-value defense-in-depth that mainly risks rejecting legitimate queries:

  • Authenticated, admin-only. /run_clickhouse_system_query is gated by @check_tool_perms(tools=[AdminTools.SYSTEM_QUERIES]). There is no anonymous attack surface.
  • The "user input" is the query by design. This is a run a system query admin tool — the SQL is meant to be executed. If a query passes validation it gets run regardless, so the EXPLAIN step is not an additional injection surface beyond what the validation already permits.
  • The EXPLAIN validation runs as the read-only user. Both is_query_using_only_system_tables and is_valid_system_query call _run_sql_query_on_host(..., sudo=False, ...)CLICKHOUSE_READONLY_USER, so the validation step cannot modify data.
  • The real guardrails are the AST checks. is_valid_system_query rejects AlterQuery/AlterCommand/DropQuery/InsertQuery nodes and is_query_using_only_system_tables enforces system.*-only tables. A -- / /* */ comment cannot bypass these, because EXPLAIN AST parses the whole statement.
  • Semicolon chaining isn't reachable. The clickhouse-driver native protocol executes a single statement per execute(); trailing statements after the first produce a syntax error rather than a second execution.

On top of that, the heuristic checks tend to produce false positives on legitimate queries — e.g. ClickHouse '' / \' string escaping, or string literals that legitimately contain ; or --. (Note there is already a similar validate_ro_query heuristic in admin/clickhouse/common.py.)

If we want defense-in-depth in this path anyway, the better follow-up is to simply keep normalizing the trailing semicolon (as the code already did) without the heuristic rejections. Happy to open that separately if it's wanted.


Generated by Claude Code

@phacops phacops closed this Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants