SVN: Shape Value Numbering — Artifact

This repository is the artifact for the paper:

SVN: Shape Value Numbering for Comprehensive and Practical Safety Assessment

Submitted to OOPSLA 2026 / CGO 2027

It contains the benchmark suite, evaluation scripts, and build orchestration to reproduce the results (RQ1–RQ4) presented in the paper.

Repository layout

.
├── choreo/                  # Choreo compiler (git submodule → GitHub)
├── benchmark/
│   ├── choreo/              # 310 Choreo (.co) benchmark cases (15 categories)
│   ├── mlir/                # MLIR tensor+scf comparison cases + manifest
│   ├── memref/              # MLIR memref comparison cases
│   ├── iree/                # IREE comparison cases
│   ├── triton/              # Triton comparison cases
│   └── results/             # (generated locally, not committed)
├── scripts/                 # Data collection, plotting, and automation
│   ├── reproduce_all.sh     # ★ One-command reproduction script
│   ├── choreo_assertion_stats.py    # RQ1/RQ2: assessment statistics
│   ├── choreo_compile_overhead.py   # RQ3: compile-time overhead
│   ├── choreo_runtime_entry.py      # RQ4: runtime assertion overhead
│   ├── visualize_results.py         # Terminal + HTML report generation
│   ├── collect_all_stats.py         # Cross-system comparison
│   ├── plot_safety_figures.py       # Paper figure generation
│   └── ...
├── latex/
│   └── oopsla26-dsvn/       # Paper sources
├── Makefile                 # Build targets
└── README.md                # This file

Quick start

Prerequisites

| Tool | Version | Notes |
|------|---------|-------|
| GCC / G++ | >= 9.0 | C++17 support required |
| CMake | >= 3.16 | Build system |
| Ninja | any | ninja-build package |
| Python | >= 3.8 | For statistics and plotting scripts |
| matplotlib | any | Optional: for PNG figures and HTML report |
| Git | any | Submodule checkout |
| flex/bison | >= 2.6/3.8 | Auto-downloaded if missing (see below) |
| CUDA | >= 12.0 | Optional: RQ4 runtime overhead + GPU tests |

Flex and Bison are auto-downloaded and compiled from source during CMake configuration if the system versions are missing or too old.
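Before starting a build, it can help to confirm that the tools in the table are on PATH. The following is a minimal sketch, not part of the artifact: the executable names (`gcc`, `g++`, `cmake`, `ninja`, `git`, `flex`, `bison`, `nvcc`) are assumed to be the usual Linux ones, and the helpers `find_tools`, `missing`, and `report` are hypothetical. It checks presence only, not versions.

```python
import shutil
import sys

# Executable names are the usual ones on Linux (an assumption; adjust per distro).
REQUIRED = ["gcc", "g++", "cmake", "ninja", "git"]
# flex/bison are auto-downloaded when missing; nvcc is only needed for RQ4 and GPU tests.
OPTIONAL = ["flex", "bison", "nvcc"]


def find_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing(found):
    """Return the sorted names whose lookup returned None."""
    return sorted(name for name, path in found.items() if path is None)


def report():
    """Print a short summary; presence only, versions are not checked."""
    if sys.version_info < (3, 8):
        print("Python >= 3.8 is required, found", sys.version.split()[0])
    print("missing required tools:", missing(find_tools(REQUIRED)) or "none")
    print("missing optional tools:", missing(find_tools(OPTIONAL)) or "none")


report()
```

A missing optional tool is not an error: flex and bison are fetched during CMake configuration, and `nvcc` matters only for the GPU steps.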

One-command reproduction

git clone --recursive <this-repo-url>
cd svn-artifacts
bash scripts/reproduce_all.sh

This will:

  1. Initialize the Choreo submodule and its dependencies (cutlass, gtest)
  2. Build Choreo from source (~1 minute)
  3. Run compile-time tests (check + cli)
  4. Collect RQ1/RQ2 assessment statistics (310 cases across 15 categories)
  5. Measure RQ3 compile-time overhead (152 dynamic cases)
  6. Measure RQ4 runtime assertion overhead (if CUDA GPU is available)
  7. Print a comparison table against the paper values
  8. Generate an interactive HTML report (benchmark/results/report.html)

Results are written to benchmark/results/.

Output

The script produces:

  • Terminal: Rich summary tables with per-category breakdowns for all RQs
  • benchmark/results/report.html: Self-contained HTML with interactive Chart.js graphs
  • benchmark/results/figures/: PNG figures for each RQ (requires matplotlib)
  • benchmark/results/choreo_stats.csv: Raw RQ1/RQ2 data
  • benchmark/results/choreo_compile_overhead.csv: Raw RQ3 data
  • benchmark/results/choreo_runtime_entry.csv: Raw RQ4 data (if GPU available)
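After a run, a quick check that the listed outputs exist can catch a step that failed silently. This is a minimal sketch using only the file names from the list above. The helper `check_outputs` and the required/optional split are assumptions: the runtime CSV and the figures directory are treated as optional because they depend on a GPU and on matplotlib, and `report.html` is assumed to be produced on every run.

```python
from pathlib import Path

# Output names listed above, relative to benchmark/results/.
# True = assumed to be produced on every run; False = needs matplotlib or a CUDA GPU.
EXPECTED = {
    "report.html": True,
    "choreo_stats.csv": True,
    "choreo_compile_overhead.csv": True,
    "choreo_runtime_entry.csv": False,
    "figures": False,
}


def check_outputs(results_dir, expected=EXPECTED):
    """Return (missing_required, missing_optional) as sorted lists of names."""
    root = Path(results_dir)
    absent = [name for name in expected if not (root / name).exists()]
    required = sorted(n for n in absent if expected[n])
    optional = sorted(n for n in absent if not expected[n])
    return required, optional


required, optional = check_outputs("benchmark/results")
print("missing required outputs:", required or "none")
print("missing optional outputs:", optional or "none")
```

Run it from the repository root so that the relative path `benchmark/results` resolves.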

Step-by-step reproduction

# 1. Build Choreo
make choreo-build

# 2. Run compile-time tests
make choreo-test

# 3. Collect assessment statistics (RQ1/RQ2)
make choreo-stats

# 4. Measure compile-time overhead (RQ3)
make choreo-cto

# 5. (Optional, requires CUDA GPU) Measure runtime overhead (RQ4)
export CUDA_HOME=/usr/local/cuda
export CUTE_HOME=$(pwd)/choreo/extern/cutlass
python3 scripts/choreo_runtime_entry.py --reps 5

# 6. Generate visualization
python3 scripts/visualize_results.py

# 7. (Optional) Cross-system comparison — requires MLIR baseline
make mlir-clone && make mlir-build
python3 scripts/collect_all_stats.py

MLIR baseline (optional)

The cross-system comparison (Choreo vs MLIR vs IREE vs Triton) requires building the MLIR tools:

make mlir-clone   # shallow-clone llvm-project release/22.x
make mlir-build   # build mlir-opt, mlir-translate, FileCheck (~30 min)

Then re-run bash scripts/reproduce_all.sh without --skip-mlir.

GPU end-to-end tests (optional)

If a CUDA-capable GPU is available:

export CUDA_HOME=/usr/local/cuda
export CUTE_HOME=$(pwd)/choreo/extern/cutlass
cd choreo && bash tests/lit.sh tests/ && cd ..

Expected results

RQ1/RQ2: Assessment coverage and discharge

| Metric | Paper | Expected (current compiler) |
|--------|-------|-----------------------------|
| Cases compiled | 291/310 | ≥299/310 (improved) |
| Generated | 11,524 | ~12,000 (improved) |
| Discharged | 10,693 | ~11,150 (improved) |
| Runtime surviving | 831 | ~828 |
| ADR | 92.8% | ~93.1% |

The current compiler fixes cases that previously failed to compile, generating more assessments with a slightly higher discharge rate.
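The rows of the RQ1/RQ2 table are related by simple arithmetic, which is useful when comparing your own numbers. The sketch below assumes that ADR is the share of generated assessments that are discharged and that "Runtime surviving" is the remainder. This README does not define ADR, but both assumptions reproduce the paper column. The function `discharge_rate` is hypothetical, not part of the artifact scripts.

```python
def discharge_rate(generated, discharged):
    """Return (surviving, adr_percent), assuming ADR = discharged / generated."""
    if generated <= 0:
        raise ValueError("generated must be positive")
    if not 0 <= discharged <= generated:
        raise ValueError("discharged must lie between 0 and generated")
    surviving = generated - discharged
    return surviving, 100.0 * discharged / generated


# Paper column of the table above.
surviving, adr = discharge_rate(generated=11524, discharged=10693)
print(f"surviving={surviving}, ADR={adr:.1f}%")  # surviving=831, ADR=92.8%
```

The same function can be applied to the totals from a local run to see whether they land near the expected column.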

RQ3: Compile-time overhead

| Metric | Paper | Expected |
|--------|-------|----------|
| CTO | 4.7% | ~5% (machine-dep.) |
| Cases | 152 | 152 |

The aggregate CTO varies slightly across machines due to hardware differences. Per-category trends are stable.

RQ4: Runtime assertion overhead (requires GPU)

| Metric | Paper | Expected |
|--------|-------|----------|
| Median | <0.1% | ~0% (negligible) |
| Cases | ~152 | 152 |
| Range | [-4%, +4%] | (noise) |

Entry-level runtime assertions (host-side integer comparisons before kernel launch) impose negligible overhead. The median is consistently near zero.
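To see why a range of about ±4% is compatible with a median near zero, consider how a per-case relative overhead might be summarized. The sketch below is illustrative only: the timings are made up, the helpers `relative_overhead` and `summarize` are hypothetical, and the artifact's `choreo_runtime_entry.py` may compute its statistics differently.

```python
from statistics import median


def relative_overhead(baseline_ms, checked_ms):
    """Percent overhead of the run with assertions relative to the run without."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (checked_ms - baseline_ms) / baseline_ms


def summarize(pairs):
    """Return (median, minimum, maximum) of the per-case overheads in percent."""
    overheads = [relative_overhead(base, checked) for base, checked in pairs]
    return median(overheads), min(overheads), max(overheads)


# Made-up (baseline, with-assertions) timings in milliseconds, not measured data.
samples = [(10.0, 10.4), (20.0, 19.2), (5.0, 5.0), (8.0, 8.1), (12.0, 11.9)]
med, low, high = summarize(samples)
print(f"median={med:.1f}%, range=[{low:.1f}%, {high:.1f}%]")
```

Negative values appear whenever timing noise makes the asserted run finish faster than the baseline, which is why the median is a more robust summary than the extremes.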

Pinned versions

| Component | Version | Source |
|-----------|---------|--------|
| Choreo | svn-artifacts | github.com/LancerLab/croqtile |
| LLVM/MLIR | release/22.x | github.com/llvm/llvm-project |
| IREE | v3.10.0 | pre-compiled or scripts/fetch_mlir_baselines.sh |
| Triton | v3.6.0 | scripts/fetch_mlir_baselines.sh |
| CUTLASS | v4.2.1 | via Choreo submodule |
| GoogleTest | latest | via Choreo submodule |

License

See individual component licenses. The benchmark cases and evaluation scripts in this repository are provided for artifact evaluation purposes.
