From 2d8995b5217e39b7cc38eff4c78f99271aa63a63 Mon Sep 17 00:00:00 2001
From: Yasunobu <42543015+P4suta@users.noreply.github.com>
Date: Wed, 10 Jun 2026 20:03:37 +0900
Subject: [PATCH] docs(perf): close out the despeckle optimization round with final baselines

Regenerated both committed baselines on the full optimization stack:

- pipeline (200-page fixture): conv 9.03s at -j8 vs the original 14.48s
  baseline (-37.6%); despeckle stage 5.21s vs 10.19s (-49%), its share
  down from 71.6% to 57.7%. At -j1: conv 35.63s vs 49.85s (-28.5%).
- cleaner: clean() without component stats 106.8ms vs the 174.9ms
  round-opening baseline (-39%); the remaining distribution is the
  selectBySize block (~70%), dilate 11%, write/read ~7%.

Also records the round's negative results, measured rather than assumed:
the smallHoles De Morgan flip was implemented, measured at zero effect
(the inverted-select penalty lives in pixConnComp's extraction of the
giant background component, not in the re-render), and reverted; the
single-labeling fusion was not pursued (predicted -4..6% conv sits at the
acceptance gate with the round's largest FFM surface cost).

The next levers, should they ever be needed: the extraction cost inside
pixSelectBySize, or the register stage (now 35% of conv).

Co-Authored-By: Claude Fable 5
---
 despeckle/docs/cleaner-baseline.md | 32 +++++++++++++++---------------
 pipeline/docs/perf-baseline.md     | 10 +++++-----
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/despeckle/docs/cleaner-baseline.md b/despeckle/docs/cleaner-baseline.md
index c244c1f..8e52204 100644
--- a/despeckle/docs/cleaner-baseline.md
+++ b/despeckle/docs/cleaner-baseline.md
@@ -5,28 +5,28 @@ Times each Leptonica primitive the page cleaner composes on a synthetic 600-dpi
 A5 page (3496x4961 px, fixed seed). Re-run after any change to the cleaner or
 the imaging bindings and compare before merging.
 
-- Date (UTC): 2026-06-10 06:48:05
+- Date (UTC): 2026-06-10 06:57:51
 - Host: Linux amd64, 8 CPUs
-- Samples: median of 10 reps after 2 warmups; single-threaded.
+- Samples: median of 14 reps after 2 warmups; single-threaded.
 
 | op | median (ms) | min (ms) | calls/clean() | est. share of clean() |
 |---|---:|---:|---:|---:|
-| read TIFF-G4 | 2.58 | 2.54 | 1 | 2.0% |
-| selectBySize k=6 (page) | 15.22 | 14.89 | 1 | 11.6% |
-| selectBySize 15 (page) | 15.40 | 14.93 | 1 | 11.7% |
-| selectBySize k=6 (inverted) | 22.36 | 22.04 | 2 | 34.1% |
-| dilate 43x43 (text mask) | 14.08 | 13.74 | 1 | 10.7% |
-| open 7x7 (page) | 4.01 | 3.85 | 1 | 3.1% |
-| invert | 0.27 | 0.25 | 2 | 0.4% |
-| subtract | 0.40 | 0.38 | 5 | 1.5% |
-| and | 0.45 | 0.41 | 1 | 0.3% |
+| read TIFF-G4 | 2.55 | 2.52 | 1 | 1.9% |
+| selectBySize k=6 (page) | 15.34 | 14.99 | 1 | 11.7% |
+| selectBySize 15 (page) | 15.35 | 14.98 | 1 | 11.7% |
+| selectBySize k=6 (inverted) | 22.20 | 21.88 | 2 | 33.9% |
+| dilate 43x43 (text mask) | 14.36 | 13.62 | 1 | 11.0% |
+| open 7x7 (page) | 4.10 | 3.90 | 1 | 3.1% |
+| invert | 0.26 | 0.25 | 2 | 0.4% |
+| subtract | 0.49 | 0.43 | 5 | 1.9% |
+| and | 0.42 | 0.40 | 1 | 0.3% |
 | or | 0.40 | 0.36 | 3 | 0.9% |
-| countConnComp | 11.64 | 11.50 | 2 | 17.8% |
+| countConnComp | 11.80 | 11.57 | 2 | 18.0% |
 | countPixels | 0.42 | 0.41 | 2 | 0.6% |
-| write TIFF-G4 | 6.34 | 6.27 | 1 | 4.8% |
-| **Σ(median × calls)** | 130.66 | | | 99.6% |
-| **clean() end-to-end** | 131.15 | 129.19 | 1 | 100% |
-| **clean() without component stats** | 107.60 | 106.36 | 1 | 82.0% |
+| write TIFF-G4 | 6.42 | 6.30 | 1 | 4.9% |
+| **Σ(median × calls)** | 131.54 | | | 100.5% |
+| **clean() end-to-end** | 130.93 | 129.00 | 1 | 100% |
+| **clean() without component stats** | 106.76 | 105.29 | 1 | 81.5% |
 
 The Σ row landing near 100% means the table accounts for clean()'s real cost; a
 large gap points at untimed work (allocation churn, codec internals).
diff --git a/pipeline/docs/perf-baseline.md b/pipeline/docs/perf-baseline.md
index 9727575..8075f03 100644
--- a/pipeline/docs/perf-baseline.md
+++ b/pipeline/docs/perf-baseline.md
@@ -6,7 +6,7 @@ of the installDist `pdfbook` launcher. Re-run after any change to the pipeline
 and compare against the previous run before merging (acceptance: ≥5% median
 total-wall improvement, or an explicit RSS/disk win, with output validated).
 
-- Date (UTC): 2026-06-10 04:46:07
+- Date (UTC): 2026-06-10 07:01:04
 - Host: Linux 6.8.0-124-generic amd64, 8 CPUs, RAM 16Gi
 - Launcher: `pipeline/app/build/install/pdfbook/bin/pdfbook`
 - Samples per measurement: cold (1st run) + warm median of 3.
@@ -21,8 +21,8 @@ total-wall improvement, or an explicit RSS/disk win, with output validated).
 
 | Input | Jobs | Pages | E2E wall | conv | extract | despeckle | register | spread | startup+init | Cold wall | Peak RSS (MiB) | Output (MiB) |
 |---|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
-| sample-scan-200p.pdf | 1 | 200 | 50.10s | 49.85s | 4.57s | 32.16s | 12.94s | 0.20s | 0.25s | 51.42s | 156 | 6.4 |
-| sample-scan-200p.pdf | 8 | 200 | 14.77s | 14.48s | 1.15s | 9.85s | 3.27s | 0.21s | 0.29s | 15.04s | 328 | 6.4 |
+| sample-scan-200p.pdf | 1 | 200 | 35.83s | 35.63s | 0.90s | 21.23s | 13.13s | 0.19s | 0.20s | 36.61s | 187 | 6.4 |
+| sample-scan-200p.pdf | 8 | 200 | 9.23s | 9.03s | 0.45s | 5.21s | 3.16s | 0.20s | 0.20s | 9.30s | 379 | 6.4 |
 
 ## Stage shares (of conv, warm median)
 
@@ -31,5 +31,5 @@ cannot pay for a parallelization rewrite no matter how elegant.
 
 | Input | Jobs | extract | despeckle | register | spread |
 |---|---|---:|---:|---:|---:|
-| sample-scan-200p.pdf | 1 | 9.2% | 64.5% | 26.0% | 0.4% |
-| sample-scan-200p.pdf | 8 | 7.9% | 68.0% | 22.6% | 1.5% |
+| sample-scan-200p.pdf | 1 | 2.5% | 59.6% | 36.9% | 0.5% |
+| sample-scan-200p.pdf | 8 | 5.0% | 57.7% | 35.0% | 2.2% |
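
Reviewer note, not part of the patch: the stage shares and the acceptance check in `pipeline/docs/perf-baseline.md` can be re-derived from the new table rows. The sketch below assumes that each share is the stage's warm-median time divided by `conv`, and that "median total-wall" in the acceptance rule refers to the `E2E wall` column. The names `RUNS`, `E2E_WALL`, `stage_shares`, and `improvement_pct` are hypothetical and do not come from the repository.

```python
# Warm-median stage timings in seconds, copied from the new (+) rows of the
# pipeline table, keyed by job count. Dictionary layout is illustrative only.
RUNS = {
    1: {"conv": 35.63, "extract": 0.90, "despeckle": 21.23, "register": 13.13, "spread": 0.19},
    8: {"conv": 9.03, "extract": 0.45, "despeckle": 5.21, "register": 3.16, "spread": 0.20},
}

# E2E wall in seconds as (old row, new row), keyed by job count.
E2E_WALL = {1: (50.10, 35.83), 8: (14.77, 9.23)}

# Gate quoted in the doc: at least a 5% median total-wall improvement.
ACCEPTANCE_GATE_PCT = 5.0


def stage_shares(run):
    """Return each stage's share of conv, in percent, rounded to one decimal."""
    conv = run["conv"]
    return {stage: round(100.0 * secs / conv, 1)
            for stage, secs in run.items() if stage != "conv"}


def improvement_pct(old, new):
    """Return the relative wall-time reduction from old to new, in percent."""
    return round(100.0 * (old - new) / old, 1)


if __name__ == "__main__":
    for jobs, run in RUNS.items():
        old, new = E2E_WALL[jobs]
        gain = improvement_pct(old, new)
        print(f"-j{jobs}: shares={stage_shares(run)} "
              f"e2e_gain={gain}% passes_gate={gain >= ACCEPTANCE_GATE_PCT}")
```

Under those assumptions the computed shares should match the `+` rows of the stage-shares table, which makes this a quick sanity check when regenerating the baseline.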