planner: recheck LIKE/ILIKE with a non-default ESCAPE on memtable scans#69670
planner: recheck LIKE/ILIKE with a non-default ESCAPE on memtable scans#69670vismaytiwari wants to merge 1 commit into
Conversation
The information_schema memtable predicate extractor pushes a LIKE pattern down into the scan and drops the original predicate, but always compiles the pattern with the default '\' escape via stringutil.CompileLike2Regexp. When the query uses a custom ESCAPE, the pushed-down pattern diverges from the real predicate, so the scan can return rows that fail the predicate and omit rows that match. The same is true when NO_BACKSLASH_ESCAPES makes the escape empty. Only build a pushed-down pattern when the escape is the default backslash; otherwise leave the predicate in place so it is rechecked as a scalar Selection. This keeps the existing fast path for the common default-escape case while restoring correct results for custom escapes. Issue Number: close pingcap#69653 Signed-off-by: vismaytiwari <vismay.t@gmail.com>
|
Hi @vismaytiwari. Thanks for your PR. I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Welcome @vismaytiwari! |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughThe memtable predicate extractor now checks whether a LIKE/ILIKE predicate's ESCAPE argument is the default backslash before pushing the pattern down for memtable scans; non-default escapes skip extraction, keeping the scalar predicate for recheck. A new test validates correct behavior with a custom ESCAPE character. ChangesLIKE ESCAPE extraction fix
Estimated code review effort: 2 (Simple) | ~10 minutes Sequence Diagram(s)Skipped, as this change is a targeted bug fix to predicate extraction logic without multi-component interaction flow. Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/core/tests/extractor/memtable_infoschema_extractor_test.go (1)
533-539: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winPlan assertion only checks for presence of "Selection", not absence of pushdown.
The test asserts
strings.Contains(plan, "Selection"), which is a fairly loose check — aSelectionnode could appear in the plan for unrelated reasons (e.g., from thetable_schemapredicate or other implicit filtering), making the assertion pass even if the LIKE-specific bug regressed. A more targeted check (e.g., asserting the LIKE predicate text appears within aSelectionoperator line, or asserting the extractor'sLikePatterns/pushed conditions are empty) would more precisely pin down the regression this test guards against.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/tests/extractor/memtable_infoschema_extractor_test.go` around lines 533 - 539, The current plan assertion is too broad because checking only for “Selection” can pass even when the LIKE predicate is still pushed down. Update the test around the information_schema.tables query in memtable_infoschema_extractor_test to verify the custom-escape LIKE is specifically retained in a Selection tied to that predicate, or assert on the extractor output (for example, that the pushed-down LikePatterns/conditions are empty for this case). Use the existing explain query and the affected Selection/LikePatterns behavior to make the regression check precise.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/planner/core/tests/extractor/memtable_infoschema_extractor_test.go`:
- Around line 533-539: The current plan assertion is too broad because checking
only for “Selection” can pass even when the LIKE predicate is still pushed down.
Update the test around the information_schema.tables query in
memtable_infoschema_extractor_test to verify the custom-escape LIKE is
specifically retained in a Selection tied to that predicate, or assert on the
extractor output (for example, that the pushed-down LikePatterns/conditions are
empty for this case). Use the existing explain query and the affected
Selection/LikePatterns behavior to make the regression check precise.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f5919765-c265-49a8-bf43-965376fec98c
📒 Files selected for processing (2)
pkg/planner/core/memtable_predicate_extractor.gopkg/planner/core/tests/extractor/memtable_infoschema_extractor_test.go
What problem does this PR solve?
Issue Number: close #69653
Problem Summary:
A
LIKE ... ESCAPEpredicate on aninformation_schematable can return rows that fail the predicate and drop rows that match. The memtable predicate extractor pushes theLIKEpattern into the scan and removes the original predicate, but it always compiles the pattern with the default\escape viastringutil.CompileLike2Regexp, ignoring the customESCAPE. The pushed-down (default-escape) pattern then becomes authoritative and diverges from the real predicate.For example, with
ESCAPE '#',%#_%means "contains a literal underscore", so onlyabc_defshould match. Instead the scan compiles%#_%with the\escape (where_is the single-char wildcard) and returnsabc#x, which fails its own predicate.What changed and how does it work?
extractLikePatternnow builds a pushed-down pattern forLIKE/ILIKEonly when theESCAPEargument is the default backslash. For any non-default (or non-constant) escape it returns no pattern, so the predicate stays in the remaining filters and is rechecked as a scalarSelection. This keeps the existing fast path for the common default-escape case and restores correct results for custom escapes (and forNO_BACKSLASH_ESCAPES, which sets the escape to empty).The related
REGEXP_LIKE(..., match_type)case oncluster_logmentioned in the issue shares the same shape (the extractor models only part of the operator). I've kept this PR focused on the self-containedLIKE ... ESCAPEfix, and I'm happy to follow up on the regexpmatch_typecase if that's preferred.Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes
LIKEandILIKEfilters so patterns with custom escape characters are evaluated correctly.Tests
information_schematable-name matching with customESCAPEbehavior.