feat(confgen): parallelize per-sheet row parsing with bounded worker pool #411

Open
Kybxd wants to merge 1 commit into master from parallel-confgen

Kybxd (Collaborator) commented May 29, 2026

Summary

Add an optional intra-sheet parallel parsing path for vertical map / list /
struct sheets. The total number of concurrently running row-parsing workers
across all sheets is capped at GOMAXPROCS via a process-wide counting
semaphore, so the existing book/sheet-level fan-out can stay unbounded
without inducing scheduler thrash.

Motivation

Profiles on large workbooks (e.g. zonesvr's 580k-row ActorConf) showed
single-sheet row parsing dominating the critical path. Naively spawning
workers per sheet would push live goroutine count to
N_concurrent_sheets * GOMAXPROCS, regressing per-sheet timings vs an
isolated single-sheet run. We need worker-level parallelism that does NOT
oversubscribe the scheduler.

Design

  • Eligibility (parallel.go: parallelEligibility): walks the
    descriptor to decide whether a sheet's layout is safe to parallelize
    (vertical map/list/struct, no transpose, etc.). The result is cached
    only per call, not globally, because it depends on per-sheet
    WorksheetOptions.
  • Planner (planShards): two strategies.
    • strategyRowRange: even row blocks, used when row order is the only
      invariant.
    • strategyKeyHash: pre-scan the outer-key column via the new
      tableparser.ScanColumn, FNV-1a-hash each cell to a bucket so all
      rows sharing an outer key go to the same worker (preserves
      last-write-wins / append-order / E2005 duplicate-key detection).
  • Workers: shallow-cloned sheetParser with a private
    sheet-collector child and fresh cards map, parsing into per-shard
    partial messages that the main goroutine merges after g.Wait.
  • Concurrency cap (xpool.NewCPUSemaphore): leaf workers acquire a
    token before the heavy parse phase. The semaphore is intentionally
    not shared with gen.convert (book/sheet level): an outer
    goroutine holding a token would deadlock waiting for its own inner
    workers.
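The two planner strategies above can be sketched roughly as follows. This is a minimal illustration, not the code in parallel.go: `bucketOf`, `shardByKeyHash`, and `shardByRowRange` are hypothetical names, the 32-bit FNV-1a variant is an assumption, and keys are passed in as a plain slice rather than scanned via tableparser.ScanColumn.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketOf is a hypothetical stand-in for bucketForKey: it hashes the raw
// cell string with FNV-1a and maps it onto one of n buckets (n must be > 0).
func bucketOf(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(n))
}

// shardByKeyHash groups row indices by outer-key bucket. Rows are visited in
// sheet order, so each shard keeps the original relative order of its rows,
// and all rows sharing a raw key string land in the same shard.
func shardByKeyHash(keys []string, n int) [][]int {
	shards := make([][]int, n)
	for row, key := range keys {
		b := bucketOf(key, n)
		shards[b] = append(shards[b], row)
	}
	return shards
}

// shardByRowRange splits rows [0, total) into at most n contiguous,
// near-equal half-open ranges [start, end).
func shardByRowRange(total, n int) [][2]int {
	if total <= 0 || n <= 0 {
		return nil
	}
	if n > total {
		n = total
	}
	ranges := make([][2]int, 0, n)
	base, extra := total/n, total%n
	start := 0
	for i := 0; i < n; i++ {
		size := base
		if i < extra {
			size++ // spread the remainder over the first shards
		}
		ranges = append(ranges, [2]int{start, start + size})
		start += size
	}
	return ranges
}

func main() {
	keys := []string{"1001", "1002", "1001", "1003", "1002", "1001"}
	fmt.Println(shardByKeyHash(keys, 4))
	fmt.Println(shardByRowRange(10, 3))
}
```

Because every row with the same raw key string goes to one shard in sheet order, a single worker sees all duplicates of a key, which is what keeps last-write-wins, append order, and duplicate-key detection local to that worker.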

Files

  • internal/confgen/parallel.go (+ parallel_test.go): planner,
    worker fan-out, merge, eligibility walk, and bucketForKey.
    bucketForKey carries a documented risk note: it hashes raw cell strings, so
    variants of the same logical key ("1" vs "01", "true" vs
    "TRUE", etc.) can route to different buckets. Two follow-ups are
    spelled out in the comment for when this actually surfaces:
    1. hash the typed key by decoding via the outer-key field descriptor;
    2. add a merge-time uniqueness/order check by typed key as a
      defensive backstop.
  • internal/x/xpool/sem.go (+ sem_test.go): a tiny context-aware
    counting semaphore (NewSemaphore, NewCPUSemaphore).
  • internal/confgen/parser.go: parse/export sum-time logging in
    ScatterAndExport for perf diagnostics (CPU-style sums, not
    wall-clock).
  • internal/confgen/table_parser.go,
    internal/importer/book/tableparser/parser.go: expose ScanColumn
    and rangeDataRowsAtIndices / rangeDataRowsInRange so the planner
    can pre-scan the outer-key column and workers can iterate
    non-contiguous row sets.
  • test/functest/...ParallelConf...: functional fixture (sheet, proto,
    golden JSON).

Performance

zonesvr (135 workbooks, 351 sheets), N=5 full runs, identical binary,
only workerSem capacity differs:

| mode | wall median (ms) | min (ms) | max (ms) | stdev (ms) |
|---|---|---|---|---|
| cap = GOMAXPROCS (kept) | 13501 | 13237 | 13684 | 165 |
| cap = 1<<20 (effectively uncapped) | 13739 | 13657 | 13847 | 79 |

~1.76% wall improvement (median 13739 ms vs 13501 ms, a 238 ms difference
taken relative to the capped median); more importantly, per-sheet timings of
critical-path large sheets under concurrent load are now on par with
their isolated single-sheet runs, instead of being amplified by
scheduler contention.

Tests

  • New unit tests for xpool semaphore and bucketForKey /
    planShards.
  • New functest fixture ParallelConf exercising the parallel path
    end-to-end.
  • go test ./... green locally.

Known follow-ups

  • bucketForKey string-vs-typed-key equivalence (documented inline,
    two fix paths listed; deferred until observed in real configs).
  • Parse/export timing logs in ScatterAndExport are perf-diagnostic;
    consider gating behind a debug flag in a follow-up if too noisy.
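The first bucketForKey follow-up (hash the typed key rather than the raw cell string) could look roughly like the sketch below. Everything here is hypothetical: `keyKind`, `canonicalKey`, and `typedBucket` are illustrative names, and decoding with strconv is an assumption that may not match how cells are actually parsed via the outer-key field descriptor.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
	"strings"
)

// keyKind is a hypothetical, simplified stand-in for the outer-key field type.
type keyKind int

const (
	kindString keyKind = iota
	kindInt
	kindBool
)

// canonicalKey decodes the raw cell into its typed value and re-encodes it in
// one canonical spelling, so "1" and "01" (or "true" and "TRUE") agree.
func canonicalKey(raw string, kind keyKind) (string, error) {
	switch kind {
	case kindInt:
		v, err := strconv.ParseInt(strings.TrimSpace(raw), 10, 64)
		if err != nil {
			return "", err
		}
		return strconv.FormatInt(v, 10), nil
	case kindBool:
		v, err := strconv.ParseBool(strings.TrimSpace(raw))
		if err != nil {
			return "", err
		}
		return strconv.FormatBool(v), nil
	default:
		return raw, nil // string keys are hashed as written
	}
}

// typedBucket hashes the canonical form instead of the raw cell text
// (n must be > 0).
func typedBucket(raw string, kind keyKind, n int) (int, error) {
	key, err := canonicalKey(raw, kind)
	if err != nil {
		return 0, err
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(n)), nil
}

func main() {
	for _, pair := range [][2]string{{"1", "01"}, {"7", " 7"}} {
		a, _ := typedBucket(pair[0], kindInt, 8)
		b, _ := typedBucket(pair[1], kindInt, 8)
		fmt.Printf("%q vs %q: typed buckets equal = %v\n", pair[0], pair[1], a == b)
	}
}
```

The cost is an extra decode per outer-key cell during the pre-scan; the second follow-up (a merge-time check by typed key) would catch the same class of problem without changing how rows are routed.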

…pool

Add an optional intra-sheet parallel parsing path for vertical map/list/struct sheets, gated by a process-wide semaphore so total live workers stay at GOMAXPROCS even when many sheets parse concurrently.

- internal/confgen/parallel.go: planShards (row-range / key-hash strategies), worker fan-out, merge step, eligibility walk, and bucketForKey with documented string-vs-typed-key risk and follow-up options (typed-key hashing / merge-time defensive check).

- internal/confgen/parallel_test.go: planner, bucketForKey, and end-to-end coverage.

- internal/x/xpool/sem.go,sem_test.go: tiny context-aware counting semaphore (NewSemaphore / NewCPUSemaphore) used to cap leaf row-parsing workers; intentionally NOT shared with book/sheet level fan-out to avoid cross-level deadlock.

- internal/confgen/parser.go: add parse/export sum-time logging in ScatterAndExport for perf diagnostics.

- internal/confgen/table_parser.go, internal/importer/book/tableparser/parser.go: expose ScanColumn and rangeDataRowsAtIndices/InRange so the parallel planner can pre-scan the outer-key column and workers can iterate non-contiguous row sets.

- test/functest: ParallelConf functional fixture (sheet, proto, golden JSON).

Empirical wall-clock on zonesvr (135 workbooks, 351 sheets, N=5):

  cap=GOMAXPROCS  median 13501ms (min 13237, max 13684, stdev 165)

  cap=1<<20 (~no cap) median 13739ms (min 13657, max 13847, stdev  79)

  -> ~1.76% wall improvement plus per-sheet timing stability (large sheets no longer regress vs isolated runs under concurrent load).
@github-actions

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|---|---|---|---|---|
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | May 29, 2026, 1:27 PM |

codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 80.84416% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.07%. Comparing base (6863556) to head (8f5fc13).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/confgen/parallel.go | 83.41% | 18 Missing and 15 partials ⚠️ |
| internal/importer/book/tableparser/parser.go | 59.25% | 14 Missing and 8 partials ⚠️ |
| internal/confgen/parser.go | 83.33% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
+ Coverage   74.87%   75.07%   +0.20%     
==========================================
  Files          88       90       +2     
  Lines        9384     9688     +304     
==========================================
+ Hits         7026     7273     +247     
- Misses       1785     1818      +33     
- Partials      573      597      +24     
