Add cross-version benchmarks CI workflow by mattleibow · Pull Request #4276 · mono/SkiaSharp

mattleibow · 2026-06-29T13:58:57Z

What

Adds a PR-triggered Benchmarks workflow that runs the SkiaSharp micro-benchmarks across multiple SkiaSharp versions on Linux, Windows, and macOS, then merges the results into a single comparison report posted to the job summary.

This grew out of validating the native blur fix (SK_AVOID_SLOW_RASTER_PIPELINE_BLURS): there was no way to measure such changes with real numbers across versions and platforms. Now there is.

Why a build matrix instead of one process

BenchmarkDotNet cannot host two different versions of the same assembly in a single process (type-identity clash — there is no built-in WithNuGet). So each (os, version) combination is benchmarked in its own job and the JSON exports are merged afterwards.
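As a rough illustration of the matrix shape, the sketch below expands the OS and version lists into one job per pair. The OS labels and the `cells` name are assumptions made for the example, not the workflow's actual identifiers; the two versions are the defaults listed in this description.

```python
from itertools import product

# Assumed labels for illustration; the real workflow may name its runners differently.
oses = ["linux", "windows", "macos"]
versions = ["4.150.0-preview.2.1", "3.119.4"]  # default versions from this PR

# One isolated job per (os, version) pair, so no process ever loads two SkiaSharp versions.
cells = [{"os": os_name, "version": version} for os_name, version in product(oses, versions)]

for cell in cells:
    print(f"{cell['os']:8} {cell['version']}")
```

Three OSes times two versions gives six cells, each of which would export its own JSON for the merge step.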

How it works

flowchart LR
  setup[setup: resolve versions] --> pub[published matrix: os x version]
  setup --> cur[current matrix: linux + macos, native build]
  pub --> rep[report: merge -> job summary]
  cur --> rep

published — SkiaSharp.Benchmarks.Compare, an isolated harness that links the same benchmark sources but restores a published SkiaSharp NuGet version from nuget.org. It deliberately opts out of the repo build infrastructure (empty Directory.Build.props/.targets) and carries its own NuGet.config so it resolves the exact released version requested. Fast and reliable — always reports real numbers on all three OSes. Default versions: 4.150.0-preview.2.1 (baseline) and 3.119.4.
current — benchmarks the working tree's native code. Rather than building the multi-targeted in-repo binding graph (which drags in the mobile native-asset projects and needs the Android/iOS workloads the runners don't have — NETSDK1147), it reuses the same Compare harness: it restores the baseline published managed package, then replaces that package's libSkiaSharp in the NuGet global cache with the one freshly built from this PR. BenchmarkDotNet builds its child project against that cache at run time, so the benchmark exercises this PR's native code with a known, stable managed API. The job logs the SHA before/after the swap to prove the working-tree binary is the one being measured. Best-effort (continue-on-error); scoped to the OSes whose native library reliably builds from source on a hosted runner (Linux + macOS).
report — scripts/benchmarks/merge-benchmarks.py merges every run into one Markdown table (mean µs + ratio-vs-baseline) written to $GITHUB_STEP_SUMMARY and uploaded as an artifact.
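The report step can be pictured with the sketch below. It is not the contents of scripts/benchmarks/merge-benchmarks.py: it assumes each run has already been reduced to an (os, benchmark, version, mean µs) tuple, and the `merge_rows` name and the `n/a` placeholder are hypothetical. The sample values are the BlurImage figures quoted under Validation.

```python
from collections import defaultdict

BASELINE = "4.150.0-preview.2.1"

def merge_rows(runs, baseline=BASELINE):
    """Group per-run means by (os, benchmark) and render one Markdown table."""
    grouped = defaultdict(dict)
    versions = []
    for os_name, bench, version, mean_us in runs:
        grouped[(os_name, bench)][version] = mean_us
        if version not in versions:
            versions.append(version)
    # Baseline column first, then the other versions in first-seen order.
    columns = [baseline] + [v for v in versions if v != baseline]
    lines = [
        "| OS | Benchmark | " + " | ".join(columns) + " |",
        "|" + " --- |" * (len(columns) + 2),
    ]
    for (os_name, bench), means in grouped.items():
        base = means.get(baseline)
        cells = []
        for version in columns:
            mean = means.get(version)
            if mean is None:
                cells.append("n/a")  # this cell did not run or did not export
            elif version == baseline or base is None:
                cells.append(f"{mean:,.0f}")
            else:
                # Mean plus ratio-vs-baseline; above 1.00x means slower than baseline.
                cells.append(f"{mean:,.0f} ({mean / base:.2f}x)")
        lines.append(f"| {os_name} | {bench} | " + " | ".join(cells) + " |")
    return "\n".join(lines)

runs = [
    ("windows-x64", "BlurImage 1024/σ1", "4.150.0-preview.2.1", 1793),
    ("windows-x64", "BlurImage 1024/σ1", "3.119.4", 2134),
    ("osx-arm64", "BlurImage 1024/σ1", "4.150.0-preview.2.1", 414),
    ("osx-arm64", "BlurImage 1024/σ1", "3.119.4", 472),
    ("osx-arm64", "BlurImage 1024/σ1", "current", 643),
]
report = merge_rows(runs)
print(report)
```

In a workflow step, a string like `report` could be appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable to make it appear in the job summary.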

Reading the `current` column ⚠️

The from-source externals-* native is not built with the same optimization/official-build flags as the shipped NuGet native, so its absolute timings differ systematically from those of the optimized published packages (typically slower). Do not read a current-vs-published ratio as the effect of a PR's code. To measure a native change, compare two current runs built the same way (the PR branch vs its base). The published columns are the apples-to-apples comparison across released versions.

The Windows hosted image currently cannot build the native library from source (missing Windows SDK 10.0.19041 and Spectre-mitigated MSVC libs on the VS preview image), so Windows is excluded from current by default and still appears in the published comparison. It can be opted back in via the current_oses dispatch input once the toolchain is available.

Other changes

BenchmarkConfig (shared) adds JsonExporter.FullCompressed — that export cannot be selected from the command line in this BenchmarkDotNet version, so it is configured in code to guarantee every run produces the JSON the merge step consumes.
BlurImageFilterBenchmark exercises the 8888 raster blur path affected by the native flag (small-sigma slow path vs large-sigma control).
SurfaceCanvasBenchmark switched to SKPath (CS0618 suppressed) so the shared sources compile against older releases that predate SKPathBuilder.

Triggers

pull_request touching benchmarks/**, scripts/benchmarks/**, or the workflow file.
workflow_dispatch with inputs: versions, filter, job (short/default), build_current, current_oses, extra_feed (e.g. point at a PR preview package source).

Validation — real CI numbers

Proven on real GitHub-hosted runners (run 28379961911): all six published cells (3 OSes × 2 versions) plus both current cells (Linux x64, macOS arm64) succeeded and merged into one report. The current cells logged the native SHA swap, e.g. macOS built 726ab1f0… replacing the published 4523f8db…, confirming the working-tree binary was measured. Example (mean µs, baseline 4.150.0-preview.2.1):

OS	Benchmark	baseline	3.119.4	current
windows-x64	BlurImage 1024/σ1	1,793	2,134	—
linux-x64	BlurImage 1024/σ1	4,199	3,955	4,502
osx-arm64	BlurImage 1024/σ1	414	472	643

(The macOS current being slower than published despite identical m150 code is exactly the build-flag caveat above.)

Note: this is infrastructure only — no library code or public API changes.

Run the SkiaSharp micro-benchmarks on every PR that touches benchmarks and compare performance across multiple SkiaSharp versions, so changes like the SK_AVOID_SLOW_RASTER_PIPELINE_BLURS native blur fix can be measured with real numbers instead of being eyeballed.

BenchmarkDotNet cannot host two versions of the same assembly in one process, so the workflow benchmarks each (operating system, version) combination in its own job and merges the JSON exports afterwards:

* SkiaSharp.Benchmarks.Compare - an isolated harness that links the same benchmark sources but restores a published SkiaSharp NuGet version. It deliberately opts out of the repo build infrastructure (empty Directory.Build.props/targets) and carries its own NuGet.config so it can resolve the exact released version it is asked to benchmark. Used for the reliable "published" comparison columns (latest stable, a 3.x release).
* The in-repo SkiaSharp.Benchmarks project benchmarks the working tree after a single-arch native build (Linux via the repo's cross Docker image, Windows and macOS via cake). This "current" path is best-effort (continue-on-error) because a from-source native build on a hosted runner is slow and can fail for reasons unrelated to the benchmarks.

A shared BenchmarkConfig adds the JsonExporter.FullCompressed export (which cannot be selected from the command line in this BenchmarkDotNet version) so every run produces the JSON the merge step consumes. scripts/benchmarks/merge-benchmarks.py combines the per-run results into one Markdown table (mean microseconds plus a ratio-vs-baseline column) written to the job summary.

Also adds a BlurImageFilterBenchmark that exercises the 8888 raster blur path affected by the native flag (small-sigma slow path vs large-sigma control), and switches SurfaceCanvasBenchmark to SKPath so the shared sources compile against older SkiaSharp releases that predate SKPathBuilder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-29T13:59:06Z

📦 Try the packages from this PR

Warning

Do not run these scripts without first reviewing the code in this PR.

Step 1 — Download the packages

bash / macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/mono/SkiaSharp/main/scripts/get-skiasharp-pr.sh | bash -s -- 4276

PowerShell / Windows:

iex "& { $(irm https://raw.githubusercontent.com/mono/SkiaSharp/main/scripts/get-skiasharp-pr.ps1) } 4276"

Step 2 — Add the local NuGet source

dotnet nuget add source ~/.skiasharp/hives/pr-4276/packages --name skiasharp-pr-4276

More options

Option	Description
`--successful-only` / `-SuccessfulOnly`	Only use successful builds
`--force` / `-Force`	Overwrite previously downloaded packages
`--list` / `-List`	List available artifacts without downloading
`--build-id ID` / `-BuildId ID`	Download from a specific build

Or download manually from Azure Pipelines — look for the nuget artifact on the build for this PR.

Remove the source when you're done:

dotnet nuget remove source skiasharp-pr-4276

BenchmarkDotNet 0.13.5 treats --filter as a single list option and errors with "Option 'f, filter' is defined multiple times" when the flag is repeated, so the multiple benchmark globs must be passed as values of one --filter instead of one flag each.

Also point --artifacts at the artifact root so the JSON lands in bench-out/results next to meta.json (no redundant results/results nesting).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
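A sketch of the resulting argument shape, written in Python only to make the argv list explicit. The glob patterns are illustrative and `build_args` is a hypothetical helper, not part of the workflow; the `bench-out` path is the artifact root named in the message above.

```python
def build_args(globs, artifacts="bench-out"):
    """Return BenchmarkDotNet arguments with every glob under a single --filter."""
    args = []
    if globs:
        # One flag followed by all of its values, instead of one flag per glob.
        args += ["--filter", *globs]
    args += ["--artifacts", artifacts]
    return args

# Illustrative globs; the workflow's real filter values may differ.
globs = ["*BlurImageFilterBenchmark*", "*SurfaceCanvasBenchmark*"]
args = build_args(globs)
print(" ".join(args))
```

The point is that `--filter` appears exactly once however many globs are supplied.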

GitHub's macOS runners resolve "shell: bash" to Apple's /bin/bash 3.2, where expanding an empty array under "set -u" (the optional EXTRA_FEED args) aborts the step with "unbound variable" before dotnet runs. Linux and Windows use bash 5.x and were unaffected. Use the bash 3.2-safe ${arr[@]+"${arr[@]}"} expansion so the optional feed args are omitted cleanly when unset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The 'current' cells previously ran the in-repo SkiaSharp.Benchmarks project, whose ProjectReference/native-asset chain pulls the multi-targeted binding graph (Android/iOS native-asset projects), which requires mobile workloads that the benchmark runners do not have (NETSDK1147), so the run never built.

Instead, reuse the proven Compare harness: restore the baseline published managed package, then replace its native libSkiaSharp in the NuGet global cache with the one freshly built from this PR. BenchmarkDotNet builds its child project against that cache at run time, so the benchmark exercises the working tree's native code with a known, stable managed API. This avoids the multi-TFM/workload build entirely and works on the runners that already build the native library (Linux, macOS).

Also attempt to install the MSVC Spectre-mitigated libs on Windows so the from-source native build can link; this stays best-effort (continue-on-error).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
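The swap-and-verify idea can be sketched as follows. This is an illustration in Python, not the workflow's actual step: the function names are hypothetical, SHA-256 is an assumed choice of hash, and temporary files stand in for the freshly built libSkiaSharp and the copy inside the NuGet global cache.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the hex SHA-256 digest of a file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def swap_native(built, cached):
    """Overwrite the cached native library with the built one; return (before, after) digests."""
    before = sha256_of(cached)
    shutil.copyfile(built, cached)
    after = sha256_of(cached)
    return before, after

# Stand-in files; real paths would be the build output and the NuGet cache entry.
with tempfile.TemporaryDirectory() as tmp:
    built = Path(tmp) / "built-libSkiaSharp"
    cached = Path(tmp) / "cached-libSkiaSharp"
    built.write_bytes(b"working-tree native")
    cached.write_bytes(b"published native")

    before, after = swap_native(built, cached)
    print(f"before: {before[:8]}  after: {after[:8]}")
    # The swap is confirmed when the cached file now hashes like the built one.
    swapped = after == sha256_of(built) and before != after
```

Logging both digests is what lets a reader of the job log confirm that the measured binary is the working-tree build rather than the published one.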

github-actions · 2026-06-29T15:04:38Z

📖 Documentation Preview

The documentation for this PR has been deployed and is available at:

🔗 View Staging Site
🔗 View Staging Docs
🔗 View Staging Gallery (Blazor)
🔗 View Staging Gallery (Uno Platform)
🔗 View Staging SkiaFiddle

This preview will be updated automatically when you push new commits to this PR.

This comment is automatically updated by the documentation staging workflow.

…build-flag caveat

The Windows hosted image cannot build libSkiaSharp from source (missing Windows SDK 10.0.19041 and Spectre-mitigated MSVC libs on the VS preview image), so it produced a perpetually failing 'current' cell. Restrict the 'current' matrix to the runners where the from-source native build works (Linux and macOS) via a new current_oses input, while every OS still takes part in the published comparison.

Also document an important caveat: the from-source 'externals-*' native is not built with the official NuGet's optimization flags, so 'current' absolute timings differ from the optimized published packages. A native PR should be assessed by comparing two 'current' runs (PR vs base) built the same way, not by reading a current-vs-published ratio.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-project-automation Bot added this to SkiaSharp Backlog Jun 29, 2026

mattleibow and others added 3 commits June 29, 2026 16:04

mattleibow wants to merge 5 commits into main from mattleibow-benchmarks-ci-workflow
