Skip to content

codec, table: make new collation setting explicit in encoding#69566

Open
D3Hunter wants to merge 7 commits into
pingcap:masterfrom
D3Hunter:refactor-codec
Open

codec, table: make new collation setting explicit in encoding#69566
D3Hunter wants to merge 7 commits into
pingcap:masterfrom
D3Hunter:refactor-codec

Conversation

@D3Hunter

@D3Hunter D3Hunter commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: ref #69563

Problem Summary:

Some lower-level codec and tablecodec helper paths read the global new-collation setting directly when deciding how to encode values or whether restored data is needed. That makes deep utility behavior depend on process-wide state and makes the new-collation global harder to eliminate incrementally.

This PR is the first part of the refactor. It removes the implicit global dependency from the encoding paths touched here by threading an explicit collation setting through the relevant table, tablecodec, and codec helpers. Follow-up work should continue eliminating the remaining global variable usage.

in the first phase, we we will only refactor this part to make sure nextgen addindex/import-into can work in the case in the issue

What changed and how does it work?

  • Add codec.Encoder, which stores the new-collation setting used by EncodeKey, EncodeValue, and HashCode.
  • Keep the existing package-level codec functions as compatibility wrappers that still read the current global setting at the API boundary.
  • Thread codec.Encoder through row and index encoding helpers in tablecodec, table/tables, and table/tblctx.
  • Add explicit collation-setting variants for index construction and restored-data checks, including partial-index expression building.
  • Replace restored-data checks in these paths with NeedRestoredDataWithCollate when the collation setting is already known.
  • Update related tests and Bazel metadata.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Unit tests:

  • ./tools/check/failpoint-go-test.sh pkg/util/codec -run TestEncoderNewCollationEnabled -count=1
  • ./tools/check/failpoint-go-test.sh pkg/tablecodec -run 'Test(RowCodec|DecodeColumnValue|TimeCodec|CutRow|UniqueGlobalIndexKeyWithNullValues)$' -count=1
  • ./tools/check/failpoint-go-test.sh pkg/table/tables -run 'Test(CheckRowInsertionConsistency|CheckIndexKeysAndCheckHandleConsistency)$' -count=1
  • pushd pkg/table/tblctx >/dev/null && go test -run 'Test(EncodeRow|EncodeBufferReserve)$' -tags=intest,deadlock -count=1 && popd >/dev/null

Manual validation:

  • make bazel_prepare
  • git diff --check master...HEAD
  • make lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • Added collation-aware row and index key/value encoding/decoding across execution, storage, planning, and restored-handle data handling.
    • Introduced a planner-level option to control “new collation” behavior for expression building.
  • Bug Fixes

    • Fixed collation-dependent restored-data decisions for index backfill/restore, mutation checking, and HTTP/binlog row encoding.
  • Chores

    • Improved startup by loading “new collation” settings from the system database earlier, with test-only failpoint support.

D3Hunter added 2 commits July 1, 2026 18:24
use local new_collate settting

remove dependency

change

change

change

change

change
@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/planner SIG: Planner labels Jul 1, 2026
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR threads collation-aware encoder state through row/index encoding and restored-data handling, updates table wiring and session bootstrap to store or load that state, and adjusts dependent callers, tests, and build targets.

Changes

Collation-aware encoder plumbing

Layer / File(s) Summary
codec, types, and expression flags
pkg/util/codec/codec.go, pkg/util/codec/codec_test.go, pkg/util/codec/collation_test.go, pkg/types/etc.go, pkg/expression/expression.go, pkg/planner/core/expression_rewriter.go
Adds Encoder, NeedRestoredDataWithCollate, and WithUseNewCollate, and stores the collation choice on expression rewriters.
tablecodec row and index APIs
pkg/tablecodec/tablecodec.go, pkg/tablecodec/tablecodec_test.go
Updates row encoding, index key/value generation, restored-value decoding, and clustered-index handling to accept an encoder or useNewCollate.
table and index encoder wiring
pkg/table/tables/tables.go, pkg/table/tables/index.go, pkg/table/tables/mutation_checker.go, pkg/table/tables/partition.go, pkg/meta/model/table.go, pkg/table/tables/mutation_checker_test.go, pkg/table/tables/testutil/*
Stores encoder state on tables, makes index creation collation-aware, updates restored-data helpers, and removes ColumnNeedRestoredData.
row buffer encoding
pkg/table/tblctx/buffers.go, pkg/table/tblctx/buffers_test.go, pkg/table/tblctx/BUILD.bazel
Changes row-buffer encoding to accept and pass an explicit encoder.
session global initialization
pkg/session/global_init.go, pkg/session/session.go, pkg/session/BUILD.bazel, pkg/testkit/mockstore.go
Adds a filtered global-variable initializer and moves bootstrap initialization earlier.
DDL and executor callers
pkg/ddl/column.go, pkg/ddl/index.go, pkg/ddl/reorg_util_test.go, pkg/executor/admin.go, pkg/executor/write.go, pkg/executor/test/executor/*, pkg/server/handler/tests/*, pkg/store/mockstore/*, pkg/util/rowDecoder/*, pkg/util/rowcodec/*, tests/realtikvtest/addindextest2/*
Updates callers and tests to construct collation-enabled encoders or pass useNewCollate into restored-data and key-generation paths.

Estimated code review effort: 4 (Complex) | ~75 minutes

Suggested labels: ok-to-test

Suggested reviewers: yudongusa, gengliqi, wjhuang2016, xzhangxian1008

Poem

I hopped through keys with collation light,
And nested rows felt tidy and right.
🐇 Encoders hum, the tables sing,
New collate dances through the spring.
Restored bytes twirl, the tests all grin,
A cozy byte-path tuned from within.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title is concise, specific, and accurately summarizes the main refactor to make collation explicit in encoding paths.
Description check ✅ Passed The description follows the template well, includes the issue reference, summary, test checklist, and release note section.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

Comment thread pkg/meta/model/table.go
}

// ColumnNeedRestoredData checks whether a single index column needs restored data.
func ColumnNeedRestoredData(idxCol *IndexColumn, colInfos []*ColumnInfo) bool {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inlined

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/planner/core/expression_rewriter.go (1)

261-286: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reused pooled rewriter doesn't refresh useNewCollate.

New rewriters get useNewCollate from collate.NewCollationEnabled() at creation (line 265), but the reuse branch (lines 271-285) resets asScalar, preprocess, fold counters, stacks, ctx, err, and planCtx fields — yet never touches useNewCollate. A pooled rewriter therefore keeps whatever collation-enabled value was true the first time that pool slot was allocated, even if the global setting changes afterward (e.g. in tests toggling collate.SetNewCollationEnabledForTest, or across builder reuse). All three call sites that now gate on er.useNewCollate (SetCollationExpr handling, castCollationForIn, patternLikeOrIlikeToExpression) would silently use a stale flag instead of the previous behavior of reading the live global on every check.

🐛 Proposed fix
 	rewriter = b.rewriterPool[b.rewriterCounter-1]
 	rewriter.asScalar = false
 	rewriter.preprocess = nil
 	rewriter.disableFoldCounter = 0
 	rewriter.tryFoldCounter = 0
 	rewriter.ctxStack = rewriter.ctxStack[:0]
 	rewriter.ctxNameStk = rewriter.ctxNameStk[:0]
 	rewriter.ctx = ctx
 	rewriter.err = nil
+	rewriter.useNewCollate = collate.NewCollationEnabled()
 	rewriter.planCtx.plan = p
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/planner/core/expression_rewriter.go` around lines 261 - 286, The pooled
expressionRewriter reuse path is leaving useNewCollate stale after the first
allocation. Update the reuse branch in expressionRewriterBuilder.getRewriter to
refresh useNewCollate from collate.NewCollationEnabled() just like the
constructor path, so SetCollationExpr, castCollationForIn, and
patternLikeOrIlikeToExpression always observe the current setting.
pkg/tablecodec/tablecodec.go (1)

983-989: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Pass the collation mode into DecodeIndexKVEx

DecodeIndexKVEx still hard-codes collate.NewCollationEnabled() for clustered-index v1, while pkg/executor/mem_reader.go:206 has no way to supply the table’s collation mode. That keeps this decode path tied to global state and can mis-decode restored data when the encode mode differs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tablecodec/tablecodec.go` around lines 983 - 989, DecodeIndexKVEx
currently hard-codes collate.NewCollationEnabled() for clustered-index v1, which
ties decoding to global state. Update DecodeIndexKVEx and its call chain to
accept the table’s collation mode as an argument, and pass that value through to
decodeIndexKvForClusteredIndexVersion1 instead of constructing a new collation
locally. Also update the caller in the mem reader path so it forwards the
correct mode when decoding restored data.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/table/tblctx/buffers.go`:
- Around line 26-27: EncodeBinlogRowData is re-reading
collate.NewCollationEnabled instead of using the caller’s encoder, which can
make binlog row bytes differ from WriteMemBufferEncoded. Update the
tblctx/buffers.go flow so WriteMemBufferEncoded passes its encoder/new-collation
state into EncodeBinlogRowData, and have EncodeBinlogRowData use that provided
encoder consistently rather than consulting global collation state. Keep the
encoding path explicit by threading the same encoder through the related helpers
and call sites.

---

Outside diff comments:
In `@pkg/planner/core/expression_rewriter.go`:
- Around line 261-286: The pooled expressionRewriter reuse path is leaving
useNewCollate stale after the first allocation. Update the reuse branch in
expressionRewriterBuilder.getRewriter to refresh useNewCollate from
collate.NewCollationEnabled() just like the constructor path, so
SetCollationExpr, castCollationForIn, and patternLikeOrIlikeToExpression always
observe the current setting.

In `@pkg/tablecodec/tablecodec.go`:
- Around line 983-989: DecodeIndexKVEx currently hard-codes
collate.NewCollationEnabled() for clustered-index v1, which ties decoding to
global state. Update DecodeIndexKVEx and its call chain to accept the table’s
collation mode as an argument, and pass that value through to
decodeIndexKvForClusteredIndexVersion1 instead of constructing a new collation
locally. Also update the caller in the mem reader path so it forwards the
correct mode when decoding restored data.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0528e36f-41d7-421a-aed8-a935c543c9e0

📥 Commits

Reviewing files that changed from the base of the PR and between 7a93ade and 45a441d.

📒 Files selected for processing (36)
  • pkg/ddl/column.go
  • pkg/ddl/index.go
  • pkg/executor/admin.go
  • pkg/executor/test/executor/BUILD.bazel
  • pkg/executor/test/executor/executor_test.go
  • pkg/executor/write.go
  • pkg/expression/expression.go
  • pkg/meta/model/table.go
  • pkg/planner/core/expression_rewriter.go
  • pkg/server/handler/tests/BUILD.bazel
  • pkg/server/handler/tests/http_handler_test.go
  • pkg/store/mockstore/BUILD.bazel
  • pkg/store/mockstore/cluster_test.go
  • pkg/store/mockstore/unistore/cophandler/cop_handler_test.go
  • pkg/table/tables/index.go
  • pkg/table/tables/mutation_checker.go
  • pkg/table/tables/mutation_checker_test.go
  • pkg/table/tables/partition.go
  • pkg/table/tables/tables.go
  • pkg/table/tables/testutil/BUILD.bazel
  • pkg/table/tables/testutil/indexcheck.go
  • pkg/table/tblctx/BUILD.bazel
  • pkg/table/tblctx/buffers.go
  • pkg/table/tblctx/buffers_test.go
  • pkg/tablecodec/tablecodec.go
  • pkg/tablecodec/tablecodec_test.go
  • pkg/types/etc.go
  • pkg/util/codec/codec.go
  • pkg/util/codec/codec_test.go
  • pkg/util/codec/collation_test.go
  • pkg/util/rowDecoder/BUILD.bazel
  • pkg/util/rowDecoder/decoder_test.go
  • pkg/util/rowcodec/bench_test.go
  • pkg/util/rowcodec/rowcodec_test.go
  • tests/realtikvtest/addindextest2/BUILD.bazel
  • tests/realtikvtest/addindextest2/global_sort_test.go
💤 Files with no reviewable changes (1)
  • pkg/meta/model/table.go

Comment on lines +26 to +27
"github.com/pingcap/tidb/pkg/util/codec"
"github.com/pingcap/tidb/pkg/util/collate"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep binlog row encoding on the caller’s encoder.

EncodeBinlogRowData re-reads collate.NewCollationEnabled() instead of using the same encoder as WriteMemBufferEncoded, so binlog old-row bytes can diverge from table row encoding when the caller’s explicit collation mode differs from the current global. Pass the encoder in here as well. Based on PR objectives, this refactor is meant to make the new-collation setting explicit in encoding paths.

Proposed fix
 	"github.com/pingcap/tidb/pkg/util/codec"
-	"github.com/pingcap/tidb/pkg/util/collate"
@@
-func (b *EncodeRowBuffer) EncodeBinlogRowData(loc *time.Location, ec errctx.Context) ([]byte, error) {
-	enc := codec.NewEncoder(collate.NewCollationEnabled())
+func (b *EncodeRowBuffer) EncodeBinlogRowData(enc codec.Encoder, loc *time.Location, ec errctx.Context) ([]byte, error) {
 	value, err := tablecodec.EncodeOldRow(enc, loc, b.row, b.colIDs, nil, nil)

Also applies to: 90-92

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/table/tblctx/buffers.go` around lines 26 - 27, EncodeBinlogRowData is
re-reading collate.NewCollationEnabled instead of using the caller’s encoder,
which can make binlog row bytes differ from WriteMemBufferEncoded. Update the
tblctx/buffers.go flow so WriteMemBufferEncoded passes its encoder/new-collation
state into EncodeBinlogRowData, and have EncodeBinlogRowData use that provided
encoder consistently rather than consulting global collation state. Keep the
encoding path explicit by threading the same encoder through the related helpers
and call sites.

@codecov

codecov Bot commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 82.16561% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.8775%. Comparing base (693e52c) to head (136e0b6).
⚠️ Report is 17 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #69566        +/-   ##
================================================
- Coverage   76.3268%   75.8775%   -0.4493%     
================================================
  Files          2041       2078        +37     
  Lines        561003     583167     +22164     
================================================
+ Hits         428196     442493     +14297     
- Misses       131906     138302      +6396     
- Partials        901       2372      +1471     
Flag Coverage Δ
integration 45.8027% <82.1656%> (+6.1746%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4471% <ø> (ø)
parser ∅ <ø> (∅)
br 63.8231% <ø> (+1.0720%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread pkg/table/tables/index.go
for _, idxCol := range idxCols {
if model.ColumnNeedRestoredData(idxCol, colInfos) {
col := colInfos[idxCol.Offset]
if types.NeedRestoredDataWithCollate(model.GetIdxChangingFieldType(idxCol, col), useNewCollate) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inlined

@ingress-bot

Copy link
Copy Markdown

🔍 Starting code review for this PR...

@ingress-bot ingress-bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.

Summary

  • Total findings: 9
  • Inline comments: 9
  • Summary-only findings (no inline anchor): 0
Findings (highest risk first)

⚠️ [Major] (2)

  1. Index/table encoder captures new-collation flag at construction, defeating the documented post-init sync.Once invariant (pkg/table/tables/index.go:196, pkg/table/tables/index.go:99, pkg/table/tables/index.go:186, pkg/table/tables/tables.go:252, pkg/table/tables/tables.go:795)
  2. No test pins encode and decode to the same collation source across the new snapshot-vs-live seam (pkg/util/codec/collation_test.go:47, pkg/table/tables/tables.go:252, pkg/tablecodec/tablecodec.go:997, pkg/table/tables/mutation_checker.go:257, pkg/table/tables/mutation_checker_test.go:253)

🟡 [Minor] (4)

  1. EncodeRowBuffer.EncodeBinlogRowData bypasses the injected encoder and reads the global collation state (pkg/table/tblctx/buffers.go:91, pkg/table/tblctx/buffers.go:53)
  2. DecodeIndexKVEx was not given a WithCollate variant; its v1 dispatch still reads the global collation state (pkg/tablecodec/tablecodec.go:988, pkg/tablecodec/tablecodec.go:1001)
  3. DecodeIndexKVWithCollate doc comment is mechanism narration — it does not explain the invariant this function exists to satisfy (pkg/tablecodec/tablecodec.go:1001)
  4. Method forms of EncodeKey/EncodeValue/HashCode carry identical doc comments as their package-level counterparts, hiding the fixed-vs-dynamic collation distinction (pkg/util/codec/codec.go:325, pkg/util/codec/codec.go:338, pkg/util/codec/codec.go:1912)

🧹 [Nit] (3)

  1. WithUseNewCollate doc comment is a tautology — it adds no intent over the parameter name (pkg/expression/expression.go:106)
  2. initTableCommonWithIndices remains a free function after initTableCommon was promoted to a method (pkg/table/tables/tables.go:311)
  3. NeedRestoredDataWithCollate doc comment is pure mechanism narration with no intent explanation (pkg/types/etc.go:144)

Comment thread pkg/table/tables/index.go
Comment thread pkg/table/tables/mutation_checker.go
Comment thread pkg/table/tblctx/buffers.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/util/codec/codec.go Outdated
Comment thread pkg/expression/expression.go Outdated
Comment thread pkg/table/tables/tables.go
Comment thread pkg/types/etc.go Outdated

@ingress-bot ingress-bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.

Summary

  • Total findings: 13
  • Inline comments: 12
  • Summary-only findings (no inline anchor): 0
Findings (highest risk first)

🚨 [Blocker] (1)

  1. Collation flag frozen at table/index construction defeats the documented sync.Once deferral and can drift persisted index bytes (pkg/table/tables/index.go:196, pkg/table/tables/index.go:85, pkg/table/tables/tables.go:252, pkg/session/session.go:4408)

⚠️ [Major] (1)

  1. New per-Encoder collation tests never exercise the frozen-flag vs live-global divergence, which is the only behavior the refactor actually changes (pkg/util/codec/collation_test.go:47, pkg/util/codec/codec.go:233, pkg/util/collate/collate.go:141, pkg/table/tables/mutation_checker_test.go:253)

🟡 [Minor] (7)

  1. Stale initNeedRestoreData comment contradicts new encoder-based collation capture (pkg/table/tables/index.go:196, pkg/table/tables/index.go:63)
  2. EncodeBinlogRowData creates a fresh encoder instead of accepting one as a parameter (pkg/table/tblctx/buffers.go:91, pkg/table/tblctx/buffers.go:55)
  3. DecodeIndexKVEx version-1 path not migrated to explicit collation parameter (pkg/tablecodec/tablecodec.go:988, pkg/executor/mem_reader.go:206)
  4. GenIndexKey doc comment not updated to describe the new enc codec.Encoder parameter (pkg/tablecodec/tablecodec.go:1234)
  5. GenIndexKey and GenIndexValuePortal have asymmetric first-parameter types, forcing callers to unwrap the encoder at every call site (pkg/tablecodec/tablecodec.go:1235, pkg/tablecodec/tablecodec.go:1641, pkg/table/tables/index.go:196, pkg/table/tables/mutation_checker_test.go:496)
  6. (encoder Encoder) EncodeKey and (encoder Encoder) EncodeValue doc comments are verbatim copies of package-level functions, hiding the fixed-collation behavioral distinction (pkg/util/codec/codec.go:325, pkg/util/codec/codec.go:338)
  7. (encoder Encoder) HashCode is placed on Encoder but useNewCollate has no effect on its output (pkg/util/codec/codec.go:1912, pkg/util/codec/collation_test.go:1346)

🧹 [Nit] (4)

  1. encoder field on TableCommon lacks a comment explaining the fixed-at-init collation invariant (pkg/table/tables/tables.go:84)
  2. DecodeIndexKVWithCollate doc comment is mechanical and omits when to prefer it over DecodeIndexKV (pkg/tablecodec/tablecodec.go:1001)
  3. TryGetCommonPkColumnRestoredIds comment not updated to document the new useNewCollate parameter (pkg/tablecodec/tablecodec.go:1650)
  4. NeedRestoredDataWithCollate comment restates the signature without explaining when to prefer it over NeedRestoredData (pkg/types/etc.go:144)

Comment thread pkg/table/tables/index.go
Comment thread pkg/util/codec/collation_test.go
Comment thread pkg/table/tblctx/buffers.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/util/codec/codec.go Outdated
Comment thread pkg/table/tables/tables.go Outdated
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/types/etc.go
D3Hunter added 2 commits July 2, 2026 01:41
change

change

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/session/global_init.go`:
- Around line 59-65: The temporary bootstrap session created in
createSessionWithOpt should be marked as restricted before it reads mysql.tidb.
Update the global init flow around sess so it preserves the same contract as the
bootstrap path in session.go by setting InRestrictedSQL on the session before
calling sess.getTableValue for tidbSystemTZ, then keep the existing error
handling unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 98675bef-9e44-49e6-9b52-290b8249ab00

📥 Commits

Reviewing files that changed from the base of the PR and between 45a441d and 11bfbe6.

📒 Files selected for processing (3)
  • pkg/session/BUILD.bazel
  • pkg/session/global_init.go
  • pkg/session/session.go

Comment on lines +59 to +65
sess, err := createSessionWithOpt(store, dom, dom.GetSchemaValidator(), dom.InfoCache(), nil)
if err != nil {
return err
}

// get system tz from mysql.tidb
tz, err := sess.getTableValue(ctx, mysql.TiDBTable, tidbSystemTZ)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Mark the temporary bootstrap session as restricted before reading mysql.tidb.

Line 59 creates a new internal session, but the old inline load ran after bootstrap sessions were marked InRestrictedSQL in pkg/session/session.go Lines 4393-4395. Preserve that contract before Line 65 reads system-table values.

Proposed fix
 	sess, err := createSessionWithOpt(store, dom, dom.GetSchemaValidator(), dom.InfoCache(), nil)
 	if err != nil {
 		return err
 	}
+	sess.GetSessionVars().InRestrictedSQL = true
 
 	// get system tz from mysql.tidb
 	tz, err := sess.getTableValue(ctx, mysql.TiDBTable, tidbSystemTZ)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sess, err := createSessionWithOpt(store, dom, dom.GetSchemaValidator(), dom.InfoCache(), nil)
if err != nil {
return err
}
// get system tz from mysql.tidb
tz, err := sess.getTableValue(ctx, mysql.TiDBTable, tidbSystemTZ)
sess, err := createSessionWithOpt(store, dom, dom.GetSchemaValidator(), dom.InfoCache(), nil)
if err != nil {
return err
}
sess.GetSessionVars().InRestrictedSQL = true
// get system tz from mysql.tidb
tz, err := sess.getTableValue(ctx, mysql.TiDBTable, tidbSystemTZ)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/session/global_init.go` around lines 59 - 65, The temporary bootstrap
session created in createSessionWithOpt should be marked as restricted before it
reads mysql.tidb. Update the global init flow around sess so it preserves the
same contract as the bootstrap path in session.go by setting InRestrictedSQL on
the session before calling sess.getTableValue for tidbSystemTZ, then keep the
existing error handling unchanged.

@ti-chi-bot ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 2, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/table/tables/mutation_checker_test.go (1)

261-268: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Update stale test-dimension comment.

The comment above (lines 178-186) enumerates the test's dimensions (location, unique/non-unique, clustered index, string collation) but doesn't mention the newly added useNewCollate outer loop. Worth a one-line addition for future readers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/table/tables/mutation_checker_test.go` around lines 261 - 268, Update the
stale test-dimension comment in mutation_checker_test to include the new
useNewCollate dimension. Keep the comment near the test setup that iterates over
collate.NewCollationEnabled/SetNewCollationEnabledForTest and mention that the
matrix now covers location, unique/non-unique, clustered index, string
collation, and the new collation toggle so the comment matches the nested loops.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/table/tables/mutation_checker_test.go`:
- Around line 261-268: Update the stale test-dimension comment in
mutation_checker_test to include the new useNewCollate dimension. Keep the
comment near the test setup that iterates over
collate.NewCollationEnabled/SetNewCollationEnabledForTest and mention that the
matrix now covers location, unique/non-unique, clustered index, string
collation, and the new collation toggle so the comment matches the nested loops.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: acda1002-9fc6-454c-af1e-34887615ea43

📥 Commits

Reviewing files that changed from the base of the PR and between 11bfbe6 and 787f588.

📒 Files selected for processing (5)
  • pkg/expression/expression.go
  • pkg/table/tables/mutation_checker_test.go
  • pkg/table/tables/tables.go
  • pkg/types/etc.go
  • pkg/util/codec/codec.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/expression/expression.go
  • pkg/types/etc.go
  • pkg/util/codec/codec.go
  • pkg/table/tables/tables.go

@D3Hunter

D3Hunter commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

/retest

@ti-chi-bot

ti-chi-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wjhuang2016
Once this PR has been reviewed and has the lgtm label, please assign windtalker, winoros for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jul 2, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown

[LGTM Timeline notifier]

Timeline:

  • 2026-07-02 07:22:54.087174252 +0000 UTC m=+283315.787553655: ☑️ agreed by wjhuang2016.

Comment thread pkg/table/tables/tables.go
Comment thread pkg/util/codec/codec.go

@D3Hunter D3Hunter left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

  • Total findings: 8
  • Inline comments: 8
  • Summary-only findings (no inline anchor): 0
Findings (highest risk first)

⚠️ [Major] (4)

  1. Raw row reads can use a different common-handle restore rule than writes (pkg/table/tables/tables.go:1542, pkg/table/tables/tables.go:1099, and pkg/table/tables/tables.go:1126)
  2. UnionScan pending-index decode ignores the captured index collation mode (pkg/executor/mem_reader.go:206 and pkg/tablecodec/tablecodec.go:988)
  3. Pessimistic unchanged unique-key locks can be encoded with the wrong collation mode (pkg/executor/write.go:413)
  4. DDL index backfill restored data can mix global and captured collation state (pkg/ddl/index.go:2481 and pkg/ddl/index.go:2735)

🟡 [Minor] (4)

  1. Encoder method docs overstate collation effects (pkg/util/codec/codec.go:338 and pkg/util/codec/codec.go:1913)
  2. New public collation-mode names use non-canonical Collate wording (pkg/expression/expression.go:67, pkg/expression/expression.go:106, pkg/util/codec/codec.go:77, and pkg/tablecodec/tablecodec.go:1001)
  3. Explicit-collation index constructor leaks a package-local abstraction (pkg/table/tables/index.go:89 and pkg/table/tables/tables.go:270)
  4. needRestoredData comment still describes the old global-collation dependency (pkg/table/tables/index.go:63 and pkg/table/tables/index.go:196)

Comment thread pkg/table/tables/tables.go
Comment thread pkg/tablecodec/tablecodec.go
Comment thread pkg/executor/write.go
Comment thread pkg/ddl/index.go
Comment thread pkg/util/codec/codec.go
Comment thread pkg/expression/expression.go
Comment thread pkg/table/tables/index.go
Comment thread pkg/table/tables/index.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants