Skip to content

fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists)#5805

Merged
renecannao merged 3 commits into
v3.0-260523from
fix/parsersql-1.0.3-pg-set-fixes
May 23, 2026
Merged

fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists)#5805
renecannao merged 3 commits into
v3.0-260523from
fix/parsersql-1.0.3-pg-set-fixes

Conversation

@renecannao

@renecannao renecannao commented May 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Combines two related PG SET / digest fixes (originally split as PR #5804 + #5805) into one PR to avoid running CI twice.

Commit 1 — fix(test): update pgsql digest fixtures for typecast trailing-space behavior

Updates 3 expected-digest cases in tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson to match the corrected tokenizer behavior from PR #5755 (commit e6bef5c58), which intentionally preserves the trailing space after a PG typecast unless the next non-space char is a modifier ( or array bracket [.

Fixes pgsql-query_digests_stages_test-t (legacy-g4) — was failing deterministically with 9 not-ok assertions across ::json ->, ::jsonb @>, and ::money + shapes.

Commit 2 — fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list

Bumps ParserSQL from v1.0.2 to v1.0.3 (ProxySQL/ParserSQL#38, tag pushed). Tarball is byte-identical to https://github.com/ProxySQL/ParserSQL/archive/refs/tags/v1.0.3.tar.gz.

Fixes two PG SET parser bugs in set_parser_algorithm_3-g1:

  • pgsql-set_parameter_validation_test-tSET search_path TO "$user", public dropped every value after the first comma.
  • pgsql-set_statement_test-tSET TIME ZONE 'UTC' parsed as time = ZONE, dropping the actual timezone value.

Includes:

  • lib/Query_Processor_ParserSQL.cpp — adapter walk_set_stmt now iterates all RHS siblings under NODE_VAR_ASSIGNMENT for PG multi-value lists.
  • test/tap/tests/setparser_parsersql_test.cpp — adds 46 new ok() assertions across 4 groups (mysql_filtered_set, mysql_set_testing, pgsql_search_path, pgsql_time_zone) as permanent regression coverage.

Test plan

  • pgsql-query_digests_stages_test-t (commit 1): 734/734 ok local
  • setparser_parsersql_test (commit 2): 46 new assertions pass; same 10 pre-existing failures (sql_mode 23-25, Set1_v2 0-1) — unrelated baseline.
  • ParserSQL v1.0.3 vendored tarball SHA256 = 0989cc09554d7019fafb25c4208e10ad68b1938ecd1caae3b55ca6c7b54ec90e matches GitHub release archive.
  • CI: TAP groups legacy-g4 (commit 1) and legacy-g2 / mysql84-g2 / legacy-g2-genai / set_parser_algorithm_3-g1 (commit 2) expected to flip green for the ParserSQL-related failures.
  • Two non-ParserSQL bugs on the algorithm=3 ProxySQL session-dispatch path (test_filtered_set_statements-t, set_testing-t) are out of scope here. The adapter now produces the correct parsed map for both (verified by mysql_filtered_set + mysql_set_testing assertions); downstream dispatch is a separate investigation.

Supersedes

Closes #5804 (digest fixtures), combined here to save CI compute.

Summary by CodeRabbit

  • New Features

    • Enhanced support for multi-value SET statements for MySQL and PostgreSQL, preserving all assigned values.
  • Tests

    • Expanded MySQL and PostgreSQL SET parsing tests and improved diagnostics.
    • Updated tokenizer fixtures to normalize operator spacing.
  • Chores

    • Bumped parsersql dependency to 1.0.3.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 23, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7564f2ca-1cd4-483e-92c4-d41695cb2e48

📥 Commits

Reviewing files that changed from the base of the PR and between e668a9d and 4eb52ab.

📒 Files selected for processing (2)
  • lib/PgSQL_Session.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

  • lib/PgSQL_Session.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

  • test/tap/tests/setparser_parsersql_test.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/setparser_parsersql_test.cpp
🔇 Additional comments (2)
lib/PgSQL_Session.cpp (1)

4516-4516: LGTM!

Also applies to: 4535-4546

test/tap/tests/setparser_parsersql_test.cpp (1)

88-95: LGTM!

Also applies to: 135-144, 156-157, 166-170, 187-188, 197-201


📝 Walkthrough

Walkthrough

ParserSQL version 1.0.3 extends walk_set_stmt to collect all RHS value siblings for multi-value SET assignments; PgSQL runtime now accepts and collapses multi-value lists for SET handlers. Test suites add MySQL and PostgreSQL fixtures, diagnostics, and tokenizer digest updates.

Changes

Multi-value SET parsing and test expansion

Layer / File(s) Summary
Version update and multi-value SET logic
deps/parsersql/parsersql, lib/Query_Processor_ParserSQL.cpp, lib/PgSQL_Session.cpp
Version bumped to 1.0.3. walk_set_stmt now iterates over all RHS siblings, normalizes and collects them into a vector (ensuring non-empty fallback). PgSQL session SET handler accepts multi-value lists and collapses them into a comma-and-space separated string before validation and assignment.
Test coverage for MySQL and PostgreSQL SET parsing
test/tap/tests/setparser_parsersql_test.cpp, test/tap/tests/tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson
Test header clarifies MySQL vs PG parsing algorithms. New MySQL and PostgreSQL fixture arrays added (multi-variable assignments, search_path, TIME ZONE forms). New join_values_for_diag() and TestParsePgsql() added; TAP plan extended. Tokenizer digest expectations adjusted for PG operator spacing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

Poem

A rabbit nibbles through SET lines bright,
Gathering values, left and right,
Commas joined with careful art,
Tests now check each multi-part,
Version bump — a joyful hop tonight! 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes both major changes: the digest fixture updates and the ParserSQL 1.0.3 bump with PG SET fixes.
Linked Issues check ✅ Passed Code changes fully implement all objectives from linked issue #5804 and address the ParserSQL 1.0.3 upgrade with multi-value SET and TIME ZONE parsing support.
Out of Scope Changes check ✅ Passed All changes align with linked issue objectives: digest fixture updates, ParserSQL version bump, parser adapter updates, and test additions are all in scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/parsersql-1.0.3-pg-set-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the parsersql dependency to version 1.0.3 and enhances the query processor to support multi-value SET statements, which is necessary for PostgreSQL features like search_path. It also introduces new test suites for MySQL and PostgreSQL SET commands. Feedback was provided regarding the TestParsePgsql function in the test suite, suggesting the use of a lambda to reduce code duplication when joining strings and the adoption of the %zu format specifier for size_t to improve cross-platform compatibility.

Comment on lines +162 to +199
void TestParsePgsql(const Test* tests, int ntests, const std::string& title) {
for (int i = 0; i < ntests; i++) {
std::map<std::string, std::vector<std::string>> data;
for (auto it = std::begin(tests[i].results); it != std::end(tests[i].results); ++it) {
data[it->var] = it->values;
}

std::map<std::string, std::vector<std::string>> result = parsersql_parse_set_pgsql(tests[i].query);

bool size_ok = (result.size() == data.size());
ok(size_ok, "[%s %d] Sizes match: %lu, %lu", title.c_str(), i, result.size(), data.size());
if (!size_ok) {
diag(" FAIL: sizes differ for query: %s", tests[i].query);
}

bool elem_ok = maps_match(result, data);
ok(elem_ok, "[%s %d] Elements match", title.c_str(), i);
if (!elem_ok) {
diag(" FAIL: elements differ for query: %s", tests[i].query);
for (auto& kv : result) {
std::string joined;
for (size_t j = 0; j < kv.second.size(); j++) {
if (j) joined += " | ";
joined += kv.second[j];
}
diag(" result[%s] = [%s]", kv.first.c_str(), joined.c_str());
}
for (auto& kv : data) {
std::string joined;
for (size_t j = 0; j < kv.second.size(); j++) {
if (j) joined += " | ";
joined += kv.second[j];
}
diag(" expected[%s] = [%s]", kv.first.c_str(), joined.c_str());
}
}
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The TestParsePgsql function contains redundant logic for joining vector elements into a string for diagnostic output. Additionally, it uses %lu as a format specifier for size_t values, which can lead to portability issues or compiler warnings on certain architectures (e.g., 32-bit systems or Windows).

I suggest refactoring the join logic into a local lambda and using the %zu format specifier for size_t to improve maintainability and correctness.

void TestParsePgsql(const Test* tests, int ntests, const std::string& title) {
	auto join_vals = [](const std::vector<std::string>& v) {
		std::string s;
		for (size_t j = 0; j < v.size(); j++) {
			if (j) s += " | ";
			s += v[j];
		}
		return s;
	};
	for (int i = 0; i < ntests; i++) {
		std::map<std::string, std::vector<std::string>> data;
		for (auto it = std::begin(tests[i].results); it != std::end(tests[i].results); ++it) {
			data[it->var] = it->values;
		}

		std::map<std::string, std::vector<std::string>> result = parsersql_parse_set_pgsql(tests[i].query);

		bool size_ok = (result.size() == data.size());
		ok(size_ok, "[%s %d] Sizes match: %zu, %zu", title.c_str(), i, result.size(), data.size());
		if (!size_ok) {
			diag("  FAIL: sizes differ for query: %s", tests[i].query);
		}

		bool elem_ok = maps_match(result, data);
		ok(elem_ok, "[%s %d] Elements match", title.c_str(), i);
		if (!elem_ok) {
			diag("  FAIL: elements differ for query: %s", tests[i].query);
			for (auto& kv : result) {
				diag("    result[%s] = [%s]", kv.first.c_str(), join_vals(kv.second).c_str());
			}
			for (auto& kv : data) {
				diag("    expected[%s] = [%s]", kv.first.c_str(), join_vals(kv.second).c_str());
			}
		}
	}
}

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/Query_Processor_ParserSQL.cpp`:
- Around line 447-459: The downstream validator in lib/PgSQL_Session.cpp
currently rejects SET RHS vectors with size() > 2, causing valid PostgreSQL
multi-value SETs (e.g. SET search_path TO a,b,c) parsed in
Query_Processor_ParserSQL.cpp to fail; update the consumer/validator in
PgSQL_Session.cpp (the function that inspects the vector of values for
SET/VAR_TARGET handling, e.g., the SET parsing/validation routine) to allow
larger vectors for PostgreSQL-only multi-valued variables by: detect PG dialect
(or session type), identify the variable name (e.g., "search_path" or consult a
small whitelist of known multi-valued PG variables), and relax the size() > 2
rejection for those names (while keeping strict checks for single-valued
variables and non-PG modes), and update the error message accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9db3ce64-cff5-4e0e-a46f-cf8522280aea

📥 Commits

Reviewing files that changed from the base of the PR and between 9719d4a and fa75a0d.

⛔ Files ignored due to path filters (2)
  • deps/parsersql/parsersql-1.0.2.tar.gz is excluded by !**/*.gz
  • deps/parsersql/parsersql-1.0.3.tar.gz is excluded by !**/*.gz
📒 Files selected for processing (3)
  • deps/parsersql/parsersql
  • lib/Query_Processor_ParserSQL.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

  • lib/Query_Processor_ParserSQL.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

  • test/tap/tests/setparser_parsersql_test.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/setparser_parsersql_test.cpp
🪛 Cppcheck (2.20.0)
test/tap/tests/setparser_parsersql_test.cpp

[warning] 46-46: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🔇 Additional comments (2)
deps/parsersql/parsersql (1)

1-1: LGTM!

test/tap/tests/setparser_parsersql_test.cpp (1)

3-5: LGTM!

Also applies to: 7-7, 13-15, 30-97, 161-199, 213-216, 228-231

Comment thread lib/Query_Processor_ParserSQL.cpp
…value list

ParserSQL v1.0.2's PG SET parser had two bugs that were hidden in CI
until PR #5760 fixed `test/infra/control/ensure-infras.bash` to actually
dispatch `pre-proxysql.sql` hooks (previously only `.bash` hooks ran).
With the harness fix in place, `set_parser_algorithm_3-g1` for the first
time ran its tests under `pgsql-set_parser_algorithm=3`, surfacing:

* `pgsql-set_parameter_validation_test-t` — `SET search_path TO "$user",
  public` parsed as `search_path = "$user"`, dropping every value after
  the first comma.
* `pgsql-set_statement_test-t` — `SET TIME ZONE 'UTC'` parsed as
  `time = ZONE`, dropping the actual timezone value entirely.

## Upstream fix

ParserSQL v1.0.3 (ProxySQL/ParserSQL PR #38) is the upstream parser fix:

* PG `SET TIME ZONE <value>` is now recognised as the PG alias for
  `SET TimeZone = <value>`. The two-word keyword is detected via
  case-insensitive identifier lookahead since the tokenizer has no
  dedicated TK_TIME / TK_ZONE keyword.
* PG `parse()` no longer shares MySQL's comma loop. In PG, the comma
  after the first value is value-list continuation — each subsequent
  expression is appended as another child of the same NODE_VAR_ASSIGNMENT
  node, alongside the first RHS.

Full upstream test suite: 1223/1223 pass; 0 regressions.

## ProxySQL-side changes

* `deps/parsersql/parsersql-1.0.3.tar.gz` — new tarball, replaces 1.0.2.
* `deps/parsersql/parsersql` symlink retargeted to `parsersql-1.0.3`.
* `lib/Query_Processor_ParserSQL.cpp::walk_set_stmt` — the PG multi-value
  fix means a single NODE_VAR_ASSIGNMENT can have multiple RHS siblings.
  The adapter now iterates `target->next_sibling` chain instead of
  reading a single sibling, so the `map<string, vector<string>>` for
  `SET search_path TO 'a', 'b', 'c'` carries `["a", "b", "c"]` rather
  than `["a"]`. Fully backward-compatible for MySQL (always one RHS).
* `test/tap/tests/setparser_parsersql_test.cpp` — adds four permanent
  regression-test groups exercising the library + adapter end-to-end
  without a live backend:
    * `mysql_filtered_set` (7 cases) — verifies the filtered-SET
      variables parse cleanly so the downstream filter logic gets the
      input it expects.
    * `mysql_set_testing` (4 cases) — multi-variable comma-separated
      SET samples from `set_testing-240.csv`.
    * `pgsql_search_path` (8 cases) — single-value and multi-value
      shapes from `pgsql-set_parameter_validation_test-t`.
    * `pgsql_time_zone` (4 cases) — every `SET TIME ZONE ...` shape
      exercised by `pgsql-set_statement_test-t`.

The 4 new assertions for the `INTERVAL '7' HOUR` case capture today's
actual ParserSQL output (`timezone = [INTERVAL]`) — the expression
parser does not yet consume the full interval modifier chain. The test
reflects observable behaviour so it stays green; tightening to capture
the full interval value is a future ExpressionParser improvement.

## Test plan

- [x] Local rebuild of ProxySQL from the new 1.0.3 tarball: clean.
- [x] `./setparser_parsersql_test`: 46 new assertions, all pass.
- [x] No regressions in the 256 pre-existing assertions. The same 10
      pre-existing failures (sql_mode 23-25, Set1_v2 0-1) — unrelated
      to this work, present before the upgrade.
- [ ] CI on this branch: TAP groups `legacy-g2`, `mysql84-g2`, and
      `set_parser_algorithm_3-g1` are expected to flip green for the
      ParserSQL-related failures. Two non-ParserSQL bugs on the
      algorithm=3 ProxySQL session-dispatch path
      (`test_filtered_set_statements-t`, `set_testing-t`) remain to be
      investigated separately — the adapter now returns the correct
      maps, but downstream dispatch still misbehaves.

## Dependency

Should not be merged before ProxySQL/ParserSQL PR #38 is merged and a
v1.0.3 tag is cut. The vendored tarball in this commit was built from
that PR's branch head; once the upstream tag exists the tarball SHA
should be identical to a fresh `git archive --prefix=ParserSQL-1.0.3/
v1.0.3 | gzip` from the upstream tag.
@renecannao renecannao force-pushed the fix/parsersql-1.0.3-pg-set-fixes branch from fa75a0d to 5a48004 Compare May 23, 2026 03:08

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/tap/tests/setparser_parsersql_test.cpp (1)

162-199: ⚖️ Poor tradeoff

Consider extracting the common test harness logic.

TestParsePgsql and TestParse (lines 131-158) share nearly identical structure—only the parser function call and multi-value diagnostic formatting differ. You could reduce duplication by parameterizing the parser function and diagnostic strategy.

However, given this is test code that works correctly and the duplication is localized, deferring this refactor is reasonable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/setparser_parsersql_test.cpp` around lines 162 - 199,
TestParsePgsql duplicates most of TestParse (only differing in the parser call
and how multi-value diagnostics are formatted); refactor by extracting a generic
test harness (e.g., a function TestParseGeneric) that accepts a parser function
pointer or std::function (to call parsersql_parse_set_pgsql or the other parser)
and a small diagnostic formatter callback for printing multi-value results, then
have TestParsePgsql and TestParse call TestParseGeneric with the appropriate
parser and formatter to remove the duplicated loop, size/element checks, and
diag printing logic while preserving current behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/tap/tests/setparser_parsersql_test.cpp`:
- Around line 162-199: TestParsePgsql duplicates most of TestParse (only
differing in the parser call and how multi-value diagnostics are formatted);
refactor by extracting a generic test harness (e.g., a function
TestParseGeneric) that accepts a parser function pointer or std::function (to
call parsersql_parse_set_pgsql or the other parser) and a small diagnostic
formatter callback for printing multi-value results, then have TestParsePgsql
and TestParse call TestParseGeneric with the appropriate parser and formatter to
remove the duplicated loop, size/element checks, and diag printing logic while
preserving current behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a3eb555-a371-48ab-9199-ed2a590c5b4e

📥 Commits

Reviewing files that changed from the base of the PR and between fa75a0d and 5a48004.

⛔ Files ignored due to path filters (2)
  • deps/parsersql/parsersql-1.0.2.tar.gz is excluded by !**/*.gz
  • deps/parsersql/parsersql-1.0.3.tar.gz is excluded by !**/*.gz
📒 Files selected for processing (3)
  • deps/parsersql/parsersql
  • lib/Query_Processor_ParserSQL.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
✅ Files skipped from review due to trivial changes (1)
  • deps/parsersql/parsersql
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

  • lib/Query_Processor_ParserSQL.cpp
  • test/tap/tests/setparser_parsersql_test.cpp
test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

  • test/tap/tests/setparser_parsersql_test.cpp
🧠 Learnings (1)
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/setparser_parsersql_test.cpp
🪛 Cppcheck (2.20.0)
test/tap/tests/setparser_parsersql_test.cpp

[warning] 46-46: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🔇 Additional comments (3)
lib/Query_Processor_ParserSQL.cpp (1)

447-459: LGTM!

test/tap/tests/setparser_parsersql_test.cpp (2)

34-96: LGTM!


213-232: LGTM!

…ehavior

PR #5755 (commit e6bef5c) fixed `process_pg_typecast()` in
`lib/pgsql_tokenizer.cpp` so that the trailing space after a PG
typecast is preserved unless the next non-space character is a
modifier `(` or array bracket `[`.  The commit message documented
this as an intentional behaviour change:

    "The new lookahead-conditional skip preserves the trailing
     space unless the very next non-space character is a modifier
     `(` or array bracket `[`."

Three test cases in `tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson`
hardcoded expected digests against the OLD (buggy) tokenizer output,
where the space-eating bug glued the preceding literal to the
following operator.  With the fix in place those expectations
mismatch, making `pgsql-query_digests_stages_test-t` fail
deterministically in the legacy-g4 TAP group on v3.0 HEAD:

  * `::json  ->  'key'`  STAGE 1: actual `select ? -> ?`   vs old expected `select ?-> ?`
  * `::jsonb @>  ...`    STAGE 1-4: actual `select ? @> ?` vs old expected `select ?@> ?`
  * `::money +  ...`     STAGE 1: actual `select ? + ?`    vs old expected `select ?+ ?`

For `->` and `+`, only STAGE 1 changes — STAGE 2-4's existing
whitespace-collapse step still glues `?->` and `?+` for those
operators, so `s2`/`s3`/`s4` expectations stay as-is.  For `@>`,
none of the stages collapse the space, so all four expectations
move.

The new actual digest output is the more correct one — it mirrors
the existing MySQL tokenizer's behaviour for the same shape, where
`'literal' + 'literal'` already produces `? + ?` at STAGE 1.  This
commit only updates the fixture; no tokenizer or test source
changes.

Verified locally against the existing test binary (built from
v3.0 HEAD, includes the new tokenizer behaviour):

    ./pgsql-query_digests_stages_test-t  →  734/734 ok, RC=0

The 9 previously-failing assertions (#381, #385, #389-#392, #592,
#596, #600) all flip to ok with the updated fixture.
@renecannao renecannao changed the title fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists) May 23, 2026
@sonarqubecloud

Copy link
Copy Markdown

Follow-up to the cluster-2 ParserSQL 1.0.3 bump (this same PR),
addressing review feedback from coderabbit + gemini.

## 1. PgSQL_Session: accept multi-value SET vectors (coderabbit, Major)

The previous commit `5a48004af` taught the adapter to walk every RHS
sibling so PG multi-value lists (e.g. `SET search_path TO "$user",
public`) come through as a `std::vector<std::string>` of every value.
But `lib/PgSQL_Session.cpp:4516` still had `size() > 2` →
`unable_to_parse_set_statement()`, which would have caused 3+ value
PG SETs to lock the hostgroup instead of being processed.  Worse,
the downstream value-consumer at the same site used only
`it->second.front()`, silently dropping every value past the first
for valid 2-value SETs too.

Fix:
  * Drop the upper bound from the cardinality check (keep `< 1` so
    empty value lists still fail loudly).
  * Join every value in the vector into `value1` with the
    comma-separated form PG itself uses for these parameters
    (`a, b, c`). Single-value SETs hit the loop with `size()==1` and
    take just `it->second.front()`, preserving the previous behaviour.

The validator / hash-compare / tracker / param_status code below the
join is unchanged — it now receives the full joined value instead of
the truncated first element.

## 2. setparser_parsersql_test: drop INTERVAL test that locked in
   incomplete behaviour (subagent correctness review, Major)

`parsersql_pgsql_time_zone` previously asserted that
`SET TIME ZONE INTERVAL '7' HOUR` produces `timezone = "INTERVAL"`,
which is the *current* observable output of ParserSQL — the
expression parser does not yet consume the `'7' HOUR` modifier
chain.  Encoding that as expected output would flip the test red
when ParserSQL is later improved to capture the full interval
expression, conflating an improvement with a regression.

Removed the case; left a comment in the fixture explaining the gap
so a future contributor adds the case back together with the
ExpressionParser fix.

## 3. Diag-output cleanup (gemini, cosmetic)

  * `%lu` → `%zu` for `size_t` (portability — `size_t` is not always
    `unsigned long` on 32-bit / Windows).
  * Extract the per-vector " | "-separated diag-string builder into
    a single `join_values_for_diag()` helper used by both
    `TestParse` (now also dumps full vector on mismatch, where it
    previously printed just `vals[0]`) and `TestParsePgsql`.

## Verification

  * `./setparser_parsersql_test`: 260/270 ok; the 10 remaining
    not-ok are unchanged pre-existing failures (`sql_mode 23-25`,
    `Set1_v2 0-1`), nothing new from this commit.
  * `pgsql_time_zone` group: 6 assertions (3 cases × 2) — confirms
    the INTERVAL case is removed cleanly.
  * `pgsql_search_path` group: 16 assertions still all pass — the
    multi-value join in PgSQL_Session does not affect the parser-
    side test (which goes through the adapter, not the session).
@renecannao renecannao changed the base branch from v3.0 to v3.0-260523 May 23, 2026 08:24
@renecannao renecannao merged commit 642b232 into v3.0-260523 May 23, 2026
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant