Describe the bug
The IR benchmark for DecoderMux proc is prone to a random failure after increasing pipeline_stages parameter for the benchmark from 2 up to 10. The error is indicated by the following output:
/home/xlsuser/.cache/bazel/_bazel_xlsuser/976f9331675cc2c7ada78a437e0227fe/execroot/com_google_xls/bazel-out/k8-opt/bin/xls/modules/zstd/dec_mux_opt_ir_benchmark.sh: line 3: 3086416 Segmentation fault (core dumped) xls/dev_tools/benchmark_main xls/modules/zstd/dec_mux_verilog.opt.ir --delay_model=asap7 --generator=pipeline --pipeline_stages=10 --reset=rst --reset_data_path=false --use_system_verilog=false --module_name=dec_mux_opt_ir_benchmark_default $@ $*
This failure does not occur for every benchmark run. It might be required to run the benchmark multiple times (could even take 50 runs) to reproduce the error.
To Reproduce
Steps to reproduce the behavior:
- Checkout on current
main branch (cd90112 at the time of writing this issue)
- Modify
pipeline_stages parameter for DecoderMux codegen and IR benchmark:
diff --git a/xls/modules/zstd/BUILD b/xls/modules/zstd/BUILD
index 871749792..94b6b3ecd 100644
--- a/xls/modules/zstd/BUILD
+++ b/xls/modules/zstd/BUILD
@@ -506,7 +506,7 @@ xls_dslx_verilog(
codegen_args = {
"module_name": "DecoderMux",
"delay_model": "asap7",
- "pipeline_stages": "2",
+ "pipeline_stages": "3",
"reset": "rst",
"use_system_verilog": "false",
},
@@ -520,7 +520,7 @@ xls_benchmark_ir(
name = "dec_mux_opt_ir_benchmark",
src = ":dec_mux_verilog.opt.ir",
benchmark_ir_args = {
- "pipeline_stages": "2",
+ "pipeline_stages": "10",
"delay_model": "asap7",
},
tags = ["manual"],
- Run
bazel run -c opt -- //xls/modules/zstd:dec_mux_opt_ir_benchmark repeatedly up to the segfault occurrence, e.g.:
count=0; while bazel run -c opt -s -- //xls/modules/zstd:dec_mux_opt_ir_benchmark --logtostderr; do (( count++ )); echo "Run $count"; done; echo "successfull runs: $count"
- Observe segfault of the
benchmark_main
Expected behavior
benchmark_main should fail gracefully or not at all
Additional context
I noticed that proc DecoderMux can be divided by the toolchain into more than 2 pipeline stages as it is currently on the main branch. I increased this parameter for the IR benchmark to 10 to see the number of stages that have some IR nodes assigned to those (happens to be 3 pipeline stages). Then I set the pipeline stages for the codegen rule to 3 stages indicated by the benchmark. Then I noticed that sometimes the IR benchmark would fail.
The segfault seems to be happening somewhere in the evaluation of the Block IR with JIT.
This was also happening in the CI for the #1616: https://github.com/google/xls/actions/runs/11494086318/job/32320996820?pr=1616#step:10:463
CC @proppy
Describe the bug
The IR benchmark for DecoderMux proc is prone to a random failure after increasing pipeline_stages parameter for the benchmark from
2up to10. The error is indicated by the following output:This failure does not occur for every benchmark run. It might be required to run the benchmark multiple times (could even take 50 runs) to reproduce the error.
To Reproduce
Steps to reproduce the behavior:
mainbranch (cd90112 at the time of writing this issue)pipeline_stagesparameter forDecoderMuxcodegen and IR benchmark:bazel run -c opt -- //xls/modules/zstd:dec_mux_opt_ir_benchmarkrepeatedly up to the segfault occurrence, e.g.:benchmark_mainExpected behavior
benchmark_mainshould fail gracefully or not at allAdditional context
I noticed that proc
DecoderMuxcan be divided by the toolchain into more than 2 pipeline stages as it is currently on themainbranch. I increased this parameter for the IR benchmark to 10 to see the number of stages that have some IR nodes assigned to those (happens to be 3 pipeline stages). Then I set the pipeline stages for the codegen rule to 3 stages indicated by the benchmark. Then I noticed that sometimes the IR benchmark would fail.The segfault seems to be happening somewhere in the evaluation of the Block IR with JIT.
This was also happening in the CI for the #1616: https://github.com/google/xls/actions/runs/11494086318/job/32320996820?pr=1616#step:10:463
CC @proppy