fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists) by renecannao · Pull Request #5805 · sysown/proxysql

renecannao · 2026-05-23T02:35:50Z

Summary

Combines two related PG SET / digest fixes (originally split as PR #5804 + #5805) into one PR to avoid running CI twice.

Commit 1 — `fix(test): update pgsql digest fixtures for typecast trailing-space behavior`

Updates 3 expected-digest cases in tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson to match the corrected tokenizer behavior from PR #5755 (commit e6bef5c58), which intentionally preserves the trailing space after a PG typecast unless the next non-space char is a modifier ( or array bracket [.

Fixes pgsql-query_digests_stages_test-t (legacy-g4) — was failing deterministically with 9 not-ok assertions across ::json ->, ::jsonb @>, and ::money + shapes.

Commit 2 — `fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list`

Bumps ParserSQL from v1.0.2 to v1.0.3 (ProxySQL/ParserSQL#38, tag pushed). Tarball is byte-identical to https://github.com/ProxySQL/ParserSQL/archive/refs/tags/v1.0.3.tar.gz.

Fixes two PG SET parser bugs in set_parser_algorithm_3-g1:

pgsql-set_parameter_validation_test-t — SET search_path TO "$user", public dropped every value after the first comma.
pgsql-set_statement_test-t — SET TIME ZONE 'UTC' parsed as time = ZONE, dropping the actual timezone value.

Includes:

lib/Query_Processor_ParserSQL.cpp — adapter walk_set_stmt now iterates all RHS siblings under NODE_VAR_ASSIGNMENT for PG multi-value lists.
test/tap/tests/setparser_parsersql_test.cpp — adds 46 new ok() assertions across 4 groups (mysql_filtered_set, mysql_set_testing, pgsql_search_path, pgsql_time_zone) as permanent regression coverage.

Test plan

pgsql-query_digests_stages_test-t (commit 1): 734/734 ok local
setparser_parsersql_test (commit 2): 46 new assertions pass; same 10 pre-existing failures (sql_mode 23-25, Set1_v2 0-1) — unrelated baseline.
ParserSQL v1.0.3 vendored tarball SHA256 = 0989cc09554d7019fafb25c4208e10ad68b1938ecd1caae3b55ca6c7b54ec90e matches GitHub release archive.
CI: TAP groups legacy-g4 (commit 1) and legacy-g2 / mysql84-g2 / legacy-g2-genai / set_parser_algorithm_3-g1 (commit 2) expected to flip green for the ParserSQL-related failures.
Two non-ParserSQL bugs on the algorithm=3 ProxySQL session-dispatch path (test_filtered_set_statements-t, set_testing-t) are out of scope here. The adapter now produces the correct parsed map for both (verified by mysql_filtered_set + mysql_set_testing assertions); downstream dispatch is a separate investigation.

Supersedes

Closes #5804 (digest fixtures), combined here to save CI compute.

Summary by CodeRabbit

New Features
- Enhanced support for multi-value SET statements for MySQL and PostgreSQL, preserving all assigned values.
Tests
- Expanded MySQL and PostgreSQL SET parsing tests and improved diagnostics.
- Updated tokenizer fixtures to normalize operator spacing.
Chores
- Bumped parsersql dependency to 1.0.3.

coderabbitai · 2026-05-23T02:36:08Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7564f2ca-1cd4-483e-92c4-d41695cb2e48

📥 Commits

Reviewing files that changed from the base of the PR and between e668a9d and 4eb52ab.

📒 Files selected for processing (2)

lib/PgSQL_Session.cpp
test/tap/tests/setparser_parsersql_test.cpp

📜 Recent review details

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

lib/PgSQL_Session.cpp
test/tap/tests/setparser_parsersql_test.cpp

test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

test/tap/tests/setparser_parsersql_test.cpp

🧠 Learnings (1)

📚 Learning: 2026-01-20T09:34:19.124Z

Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

test/tap/tests/setparser_parsersql_test.cpp

🔇 Additional comments (2)

lib/PgSQL_Session.cpp (1)

4516-4516: LGTM!

Also applies to: 4535-4546

test/tap/tests/setparser_parsersql_test.cpp (1)

88-95: LGTM!

Also applies to: 135-144, 156-157, 166-170, 187-188, 197-201

📝 Walkthrough

Walkthrough

ParserSQL version 1.0.3 extends walk_set_stmt to collect all RHS value siblings for multi-value SET assignments; PgSQL runtime now accepts and collapses multi-value lists for SET handlers. Test suites add MySQL and PostgreSQL fixtures, diagnostics, and tokenizer digest updates.

Changes

Multi-value SET parsing and test expansion

Layer / File(s)	Summary
Version update and multi-value SET logic `deps/parsersql/parsersql`, `lib/Query_Processor_ParserSQL.cpp`, `lib/PgSQL_Session.cpp`	Version bumped to 1.0.3. `walk_set_stmt` now iterates over all RHS siblings, normalizes and collects them into a vector (ensuring non-empty fallback). PgSQL session SET handler accepts multi-value lists and collapses them into a comma-and-space separated string before validation and assignment.
Test coverage for MySQL and PostgreSQL SET parsing `test/tap/tests/setparser_parsersql_test.cpp`, `test/tap/tests/tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson`	Test header clarifies MySQL vs PG parsing algorithms. New MySQL and PostgreSQL fixture arrays added (multi-variable assignments, `search_path`, `TIME ZONE` forms). New `join_values_for_diag()` and `TestParsePgsql()` added; TAP plan extended. Tokenizer digest expectations adjusted for PG operator spacing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

CI-set_parser_algorithm_3-g1: 4 SET-parser tests fail intermittently (~26% baseline) — ParserSQL edge-case gaps + nondeterminism #5798: Directly addresses multi-value SET assignment and PostgreSQL search_path normalization gaps that this PR implements and validates.
SET parser v3: define required validation matrix before merge #5449: Related to SET parser validation and expanded test coverage for multi-assignment handling.

Possibly related PRs

feat: integrate ParserSQL as vendored static library #5736: Integrates the ParserSQL-based SET parsing adapter (Query_Processor_ParserSQL.cpp) that this PR now extends with multi-value RHS support.

Poem

A rabbit nibbles through SET lines bright,
Gathering values, left and right,
Commas joined with careful art,
Tests now check each multi-part,
Version bump — a joyful hop tonight! 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes both major changes: the digest fixture updates and the ParserSQL 1.0.3 bump with PG SET fixes.
Linked Issues check	✅ Passed	Code changes fully implement all objectives from linked issue `#5804` and address the ParserSQL 1.0.3 upgrade with multi-value SET and TIME ZONE parsing support.
Out of Scope Changes check	✅ Passed	All changes align with linked issue objectives: digest fixture updates, ParserSQL version bump, parser adapter updates, and test additions are all in scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/parsersql-1.0.3-pg-set-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request updates the parsersql dependency to version 1.0.3 and enhances the query processor to support multi-value SET statements, which is necessary for PostgreSQL features like search_path. It also introduces new test suites for MySQL and PostgreSQL SET commands. Feedback was provided regarding the TestParsePgsql function in the test suite, suggesting the use of a lambda to reduce code duplication when joining strings and the adoption of the %zu format specifier for size_t to improve cross-platform compatibility.

gemini-code-assist · 2026-05-23T02:37:30Z

+void TestParsePgsql(const Test* tests, int ntests, const std::string& title) {
+	for (int i = 0; i < ntests; i++) {
+		std::map<std::string, std::vector<std::string>> data;
+		for (auto it = std::begin(tests[i].results); it != std::end(tests[i].results); ++it) {
+			data[it->var] = it->values;
+		}
+
+		std::map<std::string, std::vector<std::string>> result = parsersql_parse_set_pgsql(tests[i].query);
+
+		bool size_ok = (result.size() == data.size());
+		ok(size_ok, "[%s %d] Sizes match: %lu, %lu", title.c_str(), i, result.size(), data.size());
+		if (!size_ok) {
+			diag("  FAIL: sizes differ for query: %s", tests[i].query);
+		}
+
+		bool elem_ok = maps_match(result, data);
+		ok(elem_ok, "[%s %d] Elements match", title.c_str(), i);
+		if (!elem_ok) {
+			diag("  FAIL: elements differ for query: %s", tests[i].query);
+			for (auto& kv : result) {
+				std::string joined;
+				for (size_t j = 0; j < kv.second.size(); j++) {
+					if (j) joined += " | ";
+					joined += kv.second[j];
+				}
+				diag("    result[%s] = [%s]", kv.first.c_str(), joined.c_str());
+			}
+			for (auto& kv : data) {
+				std::string joined;
+				for (size_t j = 0; j < kv.second.size(); j++) {
+					if (j) joined += " | ";
+					joined += kv.second[j];
+				}
+				diag("    expected[%s] = [%s]", kv.first.c_str(), joined.c_str());
+			}
+		}
+	}
+}


The TestParsePgsql function contains redundant logic for joining vector elements into a string for diagnostic output. Additionally, it uses %lu as a format specifier for size_t values, which can lead to portability issues or compiler warnings on certain architectures (e.g., 32-bit systems or Windows).

I suggest refactoring the join logic into a local lambda and using the %zu format specifier for size_t to improve maintainability and correctness.

void TestParsePgsql(const Test* tests, int ntests, const std::string& title) { auto join_vals = [](const std::vector<std::string>& v) { std::string s; for (size_t j = 0; j < v.size(); j++) { if (j) s += " | "; s += v[j]; } return s; }; for (int i = 0; i < ntests; i++) { std::map<std::string, std::vector<std::string>> data; for (auto it = std::begin(tests[i].results); it != std::end(tests[i].results); ++it) { data[it->var] = it->values; } std::map<std::string, std::vector<std::string>> result = parsersql_parse_set_pgsql(tests[i].query); bool size_ok = (result.size() == data.size()); ok(size_ok, "[%s %d] Sizes match: %zu, %zu", title.c_str(), i, result.size(), data.size()); if (!size_ok) { diag(" FAIL: sizes differ for query: %s", tests[i].query); } bool elem_ok = maps_match(result, data); ok(elem_ok, "[%s %d] Elements match", title.c_str(), i); if (!elem_ok) { diag(" FAIL: elements differ for query: %s", tests[i].query); for (auto& kv : result) { diag(" result[%s] = [%s]", kv.first.c_str(), join_vals(kv.second).c_str()); } for (auto& kv : data) { diag(" expected[%s] = [%s]", kv.first.c_str(), join_vals(kv.second).c_str()); } } } }

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/Query_Processor_ParserSQL.cpp`:
- Around line 447-459: The downstream validator in lib/PgSQL_Session.cpp
currently rejects SET RHS vectors with size() > 2, causing valid PostgreSQL
multi-value SETs (e.g. SET search_path TO a,b,c) parsed in
Query_Processor_ParserSQL.cpp to fail; update the consumer/validator in
PgSQL_Session.cpp (the function that inspects the vector of values for
SET/VAR_TARGET handling, e.g., the SET parsing/validation routine) to allow
larger vectors for PostgreSQL-only multi-valued variables by: detect PG dialect
(or session type), identify the variable name (e.g., "search_path" or consult a
small whitelist of known multi-valued PG variables), and relax the size() > 2
rejection for those names (while keeping strict checks for single-valued
variables and non-PG modes), and update the error message accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9db3ce64-cff5-4e0e-a46f-cf8522280aea

📥 Commits

Reviewing files that changed from the base of the PR and between 9719d4a and fa75a0d.

⛔ Files ignored due to path filters (2)

deps/parsersql/parsersql-1.0.2.tar.gz is excluded by !**/*.gz
deps/parsersql/parsersql-1.0.3.tar.gz is excluded by !**/*.gz

📒 Files selected for processing (3)

deps/parsersql/parsersql
lib/Query_Processor_ParserSQL.cpp
test/tap/tests/setparser_parsersql_test.cpp

📜 Review details

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

lib/Query_Processor_ParserSQL.cpp
test/tap/tests/setparser_parsersql_test.cpp

test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

test/tap/tests/setparser_parsersql_test.cpp

🧠 Learnings (1)

📚 Learning: 2026-01-20T09:34:19.124Z

Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

test/tap/tests/setparser_parsersql_test.cpp

🪛 Cppcheck (2.20.0)

test/tap/tests/setparser_parsersql_test.cpp

[warning] 46-46: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🔇 Additional comments (2)

deps/parsersql/parsersql (1)

1-1: LGTM!

test/tap/tests/setparser_parsersql_test.cpp (1)

3-5: LGTM!

Also applies to: 7-7, 13-15, 30-97, 161-199, 213-216, 228-231

…value list ParserSQL v1.0.2's PG SET parser had two bugs that were hidden in CI until PR #5760 fixed `test/infra/control/ensure-infras.bash` to actually dispatch `pre-proxysql.sql` hooks (previously only `.bash` hooks ran). With the harness fix in place, `set_parser_algorithm_3-g1` for the first time ran its tests under `pgsql-set_parser_algorithm=3`, surfacing: * `pgsql-set_parameter_validation_test-t` — `SET search_path TO "$user", public` parsed as `search_path = "$user"`, dropping every value after the first comma. * `pgsql-set_statement_test-t` — `SET TIME ZONE 'UTC'` parsed as `time = ZONE`, dropping the actual timezone value entirely. ## Upstream fix ParserSQL v1.0.3 (ProxySQL/ParserSQL PR #38) is the upstream parser fix: * PG `SET TIME ZONE <value>` is now recognised as the PG alias for `SET TimeZone = <value>`. The two-word keyword is detected via case-insensitive identifier lookahead since the tokenizer has no dedicated TK_TIME / TK_ZONE keyword. * PG `parse()` no longer shares MySQL's comma loop. In PG, the comma after the first value is value-list continuation — each subsequent expression is appended as another child of the same NODE_VAR_ASSIGNMENT node, alongside the first RHS. Full upstream test suite: 1223/1223 pass; 0 regressions. ## ProxySQL-side changes * `deps/parsersql/parsersql-1.0.3.tar.gz` — new tarball, replaces 1.0.2. * `deps/parsersql/parsersql` symlink retargeted to `parsersql-1.0.3`. * `lib/Query_Processor_ParserSQL.cpp::walk_set_stmt` — the PG multi-value fix means a single NODE_VAR_ASSIGNMENT can have multiple RHS siblings. The adapter now iterates `target->next_sibling` chain instead of reading a single sibling, so the `map<string, vector<string>>` for `SET search_path TO 'a', 'b', 'c'` carries `["a", "b", "c"]` rather than `["a"]`. Fully backward-compatible for MySQL (always one RHS). * `test/tap/tests/setparser_parsersql_test.cpp` — adds four permanent regression-test groups exercising the library + adapter end-to-end without a live backend: * `mysql_filtered_set` (7 cases) — verifies the filtered-SET variables parse cleanly so the downstream filter logic gets the input it expects. * `mysql_set_testing` (4 cases) — multi-variable comma-separated SET samples from `set_testing-240.csv`. * `pgsql_search_path` (8 cases) — single-value and multi-value shapes from `pgsql-set_parameter_validation_test-t`. * `pgsql_time_zone` (4 cases) — every `SET TIME ZONE ...` shape exercised by `pgsql-set_statement_test-t`. The 4 new assertions for the `INTERVAL '7' HOUR` case capture today's actual ParserSQL output (`timezone = [INTERVAL]`) — the expression parser does not yet consume the full interval modifier chain. The test reflects observable behaviour so it stays green; tightening to capture the full interval value is a future ExpressionParser improvement. ## Test plan - [x] Local rebuild of ProxySQL from the new 1.0.3 tarball: clean. - [x] `./setparser_parsersql_test`: 46 new assertions, all pass. - [x] No regressions in the 256 pre-existing assertions. The same 10 pre-existing failures (sql_mode 23-25, Set1_v2 0-1) — unrelated to this work, present before the upgrade. - [ ] CI on this branch: TAP groups `legacy-g2`, `mysql84-g2`, and `set_parser_algorithm_3-g1` are expected to flip green for the ParserSQL-related failures. Two non-ParserSQL bugs on the algorithm=3 ProxySQL session-dispatch path (`test_filtered_set_statements-t`, `set_testing-t`) remain to be investigated separately — the adapter now returns the correct maps, but downstream dispatch still misbehaves. ## Dependency Should not be merged before ProxySQL/ParserSQL PR #38 is merged and a v1.0.3 tag is cut. The vendored tarball in this commit was built from that PR's branch head; once the upstream tag exists the tarball SHA should be identical to a fresh `git archive --prefix=ParserSQL-1.0.3/ v1.0.3 | gzip` from the upstream tag.

coderabbitai

🧹 Nitpick comments (1)

test/tap/tests/setparser_parsersql_test.cpp (1)
162-199: ⚖️ Poor tradeoff

Consider extracting the common test harness logic.

TestParsePgsql and TestParse (lines 131-158) share nearly identical structure—only the parser function call and multi-value diagnostic formatting differ. You could reduce duplication by parameterizing the parser function and diagnostic strategy.

However, given this is test code that works correctly and the duplication is localized, deferring this refactor is reasonable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/tap/tests/setparser_parsersql_test.cpp` around lines 162 - 199,
TestParsePgsql duplicates most of TestParse (only differing in the parser call
and how multi-value diagnostics are formatted); refactor by extracting a generic
test harness (e.g., a function TestParseGeneric) that accepts a parser function
pointer or std::function (to call parsersql_parse_set_pgsql or the other parser)
and a small diagnostic formatter callback for printing multi-value results, then
have TestParsePgsql and TestParse call TestParseGeneric with the appropriate
parser and formatter to remove the duplicated loop, size/element checks, and
diag printing logic while preserving current behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/tap/tests/setparser_parsersql_test.cpp`:
- Around line 162-199: TestParsePgsql duplicates most of TestParse (only
differing in the parser call and how multi-value diagnostics are formatted);
refactor by extracting a generic test harness (e.g., a function
TestParseGeneric) that accepts a parser function pointer or std::function (to
call parsersql_parse_set_pgsql or the other parser) and a small diagnostic
formatter callback for printing multi-value results, then have TestParsePgsql
and TestParse call TestParseGeneric with the appropriate parser and formatter to
remove the duplicated loop, size/element checks, and diag printing logic while
preserving current behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a3eb555-a371-48ab-9199-ed2a590c5b4e

📥 Commits

Reviewing files that changed from the base of the PR and between fa75a0d and 5a48004.

⛔ Files ignored due to path filters (2)

deps/parsersql/parsersql-1.0.2.tar.gz is excluded by !**/*.gz
deps/parsersql/parsersql-1.0.3.tar.gz is excluded by !**/*.gz

📒 Files selected for processing (3)

deps/parsersql/parsersql
lib/Query_Processor_ParserSQL.cpp
test/tap/tests/setparser_parsersql_test.cpp

✅ Files skipped from review due to trivial changes (1)

deps/parsersql/parsersql

📜 Review details

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{cpp,h,hpp}: Class names should use PascalCase with protocol prefixes (MySQL_, PgSQL_, ProxySQL_)
Member variables should use snake_case
Constants and macros should use UPPER_SNAKE_CASE
Use C++17 features; conditional compilation should use #ifdef PROXYSQL31, #ifdef PROXYSQL40, #ifdef PROXYSQLFFTO, #ifdef PROXYSQLTSDB, #ifdef PROXYSQLCLICKHOUSE (PROXYSQLGENAI no longer guards core code)
Use RAII for resource management and jemalloc for memory allocation
Use pthread mutexes for synchronization and std::atomic<> for counters

Files:

lib/Query_Processor_ParserSQL.cpp
test/tap/tests/setparser_parsersql_test.cpp

test/tap/tests/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Test files should follow naming pattern test_*.cpp or *-t.cpp in test/tap/tests/

Files:

test/tap/tests/setparser_parsersql_test.cpp

🧠 Learnings (1)

📚 Learning: 2026-01-20T09:34:19.124Z

Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

test/tap/tests/setparser_parsersql_test.cpp

🪛 Cppcheck (2.20.0)

test/tap/tests/setparser_parsersql_test.cpp

[warning] 46-46: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🔇 Additional comments (3)

lib/Query_Processor_ParserSQL.cpp (1)

447-459: LGTM!

test/tap/tests/setparser_parsersql_test.cpp (2)

34-96: LGTM!

213-232: LGTM!

…ehavior PR #5755 (commit e6bef5c) fixed `process_pg_typecast()` in `lib/pgsql_tokenizer.cpp` so that the trailing space after a PG typecast is preserved unless the next non-space character is a modifier `(` or array bracket `[`. The commit message documented this as an intentional behaviour change: "The new lookahead-conditional skip preserves the trailing space unless the very next non-space character is a modifier `(` or array bracket `[`." Three test cases in `tokenizer_payloads/pgsql_regular_tokenizer_digests.hjson` hardcoded expected digests against the OLD (buggy) tokenizer output, where the space-eating bug glued the preceding literal to the following operator. With the fix in place those expectations mismatch, making `pgsql-query_digests_stages_test-t` fail deterministically in the legacy-g4 TAP group on v3.0 HEAD: * `::json -> 'key'` STAGE 1: actual `select ? -> ?` vs old expected `select ?-> ?` * `::jsonb @> ...` STAGE 1-4: actual `select ? @> ?` vs old expected `select ?@> ?` * `::money + ...` STAGE 1: actual `select ? + ?` vs old expected `select ?+ ?` For `->` and `+`, only STAGE 1 changes — STAGE 2-4's existing whitespace-collapse step still glues `?->` and `?+` for those operators, so `s2`/`s3`/`s4` expectations stay as-is. For `@>`, none of the stages collapse the space, so all four expectations move. The new actual digest output is the more correct one — it mirrors the existing MySQL tokenizer's behaviour for the same shape, where `'literal' + 'literal'` already produces `? + ?` at STAGE 1. This commit only updates the fixture; no tokenizer or test source changes. Verified locally against the existing test binary (built from v3.0 HEAD, includes the new tokenizer behaviour): ./pgsql-query_digests_stages_test-t → 734/734 ok, RC=0 The 9 previously-failing assertions (#381, #385, #389-#392, #592, #596, #600) all flip to ok with the updated fixture.

sonarqubecloud · 2026-05-23T06:41:18Z

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Follow-up to the cluster-2 ParserSQL 1.0.3 bump (this same PR), addressing review feedback from coderabbit + gemini. ## 1. PgSQL_Session: accept multi-value SET vectors (coderabbit, Major) The previous commit `5a48004af` taught the adapter to walk every RHS sibling so PG multi-value lists (e.g. `SET search_path TO "$user", public`) come through as a `std::vector<std::string>` of every value. But `lib/PgSQL_Session.cpp:4516` still had `size() > 2` → `unable_to_parse_set_statement()`, which would have caused 3+ value PG SETs to lock the hostgroup instead of being processed. Worse, the downstream value-consumer at the same site used only `it->second.front()`, silently dropping every value past the first for valid 2-value SETs too. Fix: * Drop the upper bound from the cardinality check (keep `< 1` so empty value lists still fail loudly). * Join every value in the vector into `value1` with the comma-separated form PG itself uses for these parameters (`a, b, c`). Single-value SETs hit the loop with `size()==1` and take just `it->second.front()`, preserving the previous behaviour. The validator / hash-compare / tracker / param_status code below the join is unchanged — it now receives the full joined value instead of the truncated first element. ## 2. setparser_parsersql_test: drop INTERVAL test that locked in incomplete behaviour (subagent correctness review, Major) `parsersql_pgsql_time_zone` previously asserted that `SET TIME ZONE INTERVAL '7' HOUR` produces `timezone = "INTERVAL"`, which is the *current* observable output of ParserSQL — the expression parser does not yet consume the `'7' HOUR` modifier chain. Encoding that as expected output would flip the test red when ParserSQL is later improved to capture the full interval expression, conflating an improvement with a regression. Removed the case; left a comment in the fixture explaining the gap so a future contributor adds the case back together with the ExpressionParser fix. ## 3. Diag-output cleanup (gemini, cosmetic) * `%lu` → `%zu` for `size_t` (portability — `size_t` is not always `unsigned long` on 32-bit / Windows). * Extract the per-vector " | "-separated diag-string builder into a single `join_values_for_diag()` helper used by both `TestParse` (now also dumps full vector on mismatch, where it previously printed just `vals[0]`) and `TestParsePgsql`. ## Verification * `./setparser_parsersql_test`: 260/270 ok; the 10 remaining not-ok are unchanged pre-existing failures (`sql_mode 23-25`, `Set1_v2 0-1`), nothing new from this commit. * `pgsql_time_zone` group: 6 assertions (3 cases × 2) — confirms the INTERVAL case is removed cleanly. * `pgsql_search_path` group: 16 assertions still all pass — the multi-value join in PgSQL_Session does not affect the parser- side test (which goes through the adapter, not the session).

gemini-code-assist Bot reviewed May 23, 2026

View reviewed changes

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

Comment thread lib/Query_Processor_ParserSQL.cpp

renecannao force-pushed the fix/parsersql-1.0.3-pg-set-fixes branch from fa75a0d to 5a48004 Compare May 23, 2026 03:08

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

renecannao changed the title ~~fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list~~ fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists) May 23, 2026

renecannao mentioned this pull request May 23, 2026

fix(test): update pgsql digest fixtures for typecast trailing-space behavior #5804

Closed

3 tasks

renecannao changed the base branch from v3.0 to v3.0-260523 May 23, 2026 08:24

renecannao merged commit 642b232 into v3.0-260523 May 23, 2026
2 of 4 checks passed

renecannao mentioned this pull request May 23, 2026

Integration batch 2026-05-23: bundle of 6 fixes #5809

Merged

4 tasks

coderabbitai Bot mentioned this pull request May 23, 2026

build: kill PROXYSQLGENAI flag, PROXYSQL40=1 builds all plugins #5814

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists)#5805

fix: pgsql digest fixtures (typecast spacing) + ParserSQL 1.0.3 bump (PG SET TIME ZONE + multi-value lists)#5805
renecannao merged 3 commits into
v3.0-260523from
fix/parsersql-1.0.3-pg-set-fixes

renecannao commented May 23, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 23, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 23, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

sonarqubecloud Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

renecannao commented May 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Commit 1 — fix(test): update pgsql digest fixtures for typecast trailing-space behavior

Commit 2 — fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list

Test plan

Supersedes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud Bot commented May 23, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

renecannao commented May 23, 2026 •

edited by coderabbitai Bot

Loading

Commit 1 — `fix(test): update pgsql digest fixtures for typecast trailing-space behavior`

Commit 2 — `fix(deps): bump ParserSQL to v1.0.3 — PG SET TIME ZONE alias + multi-value list`

coderabbitai Bot commented May 23, 2026 •

edited

Loading