Fix compiled negative-strided slice updates #3735

Closed
chrismicah wants to merge 1 commit into ml-explore:main from chrismicah:fix-compiled-negative-stride-scatter

@chrismicah

Contributor

Summary

Fix compiled slice updates when a negative-strided input is used by an
elementwise expression assigned into a negative-strided output slice.

Details

Issue #3716 reports that mx.compile can write only one element for patterns
like:

out[::-1] += 2.0 * x[::-1]
out[::-1] = 2.0 * x[::-1]

The compiled Metal kernel used unsigned index math for 1D strided kernels. When
the collapsed stride is negative, the input location wraps instead of walking
backward. This change selects the large-index strided kernel path whenever a
compiled kernel has negative strides, and lets the 1D strided index helper use
the selected signed index type.
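The wraparound described above can be sketched in plain Python. This is only an illustrative model, not the MLX kernel code: the helper names `loc_unsigned` and `loc_signed`, the 32-bit unsigned width, and the 4-element buffer are all assumptions made for the example. It computes the offset `elem * stride` once in unsigned and once in signed arithmetic, then adds it to the start of a reversed view.

```python
import ctypes

def loc_unsigned(elem, stride):
    """Hypothetical helper: offset computed in 32-bit unsigned arithmetic."""
    return ctypes.c_uint32(elem * stride).value

def loc_signed(elem, stride):
    """Hypothetical helper: the same offset computed in 64-bit signed arithmetic."""
    return ctypes.c_int64(elem * stride).value

n = 4          # assumed buffer length for the illustration
stride = -1    # stride of a reversed view such as x[::-1]
base = n - 1   # a reversed view starts at the last element

# Signed offsets walk backward through the buffer: 3, 2, 1, 0.
signed_positions = [base + loc_signed(i, stride) for i in range(n)]

# Unsigned offsets wrap to huge positive values for every element except the first.
unsigned_positions = [base + loc_unsigned(i, stride) for i in range(n)]

# Only the first position stays inside the buffer in this model.
in_bounds = [p for p in unsigned_positions if 0 <= p < n]

print(signed_positions)
print(unsigned_positions)
print(in_bounds)
```

In this model only element 0 lands inside the buffer when the offset is unsigned, which is at least consistent with the single-element symptom reported in the issue. The real kernel's index types and addressing may differ.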

Tests

CMAKE_BUILD_PARALLEL_LEVEL=8 python setup.py build_ext --inplace
PYTHONPATH=python python -m pytest python/tests/test_compile.py -q -k negative_strided_slice_update_expr --tb=short
PYTHONPATH=python python -m pytest python/tests/test_compile.py -q -k 'negative_strided_slice_update_expr or compile_nonfinite_constants or tuple_output_in_thread' --tb=short
git diff --check

I also ran a direct CPU/GPU eager-vs-compiled repro for both += and =
negative-strided update forms.

Fixes #3716

@chrismicah

Contributor Author

Closing this as a duplicate of #3720. I missed that open PR in my initial search; #3720 already fixes #3716 with broader Metal/CUDA coverage and passing CI.

@chrismicah closed this Jun 21, 2026

Development

Successfully merging this pull request may close these issues.

mx.compile: assigning an elementwise expression to a negative-strided slice writes only one element
