
Fixes strict aliasing violations #300

Merged: aous72 merged 17 commits into master from strict_aliasing on Jun 21, 2026


@aous72 aous72 commented Jun 21, 2026


This PR fixes strict aliasing violations, which lurk in the background and can cause failures in unpredictable ways.
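For readers unfamiliar with the problem, here is a minimal sketch of what a strict aliasing violation looks like and one common way to avoid it. It is not taken from this PR's diff, which is not shown here; the helper names `float_bits` and `load_u32` are hypothetical, and the sketch assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// A typical violation (undefined behaviour, shown only as a comment):
//   float f = 1.0f;
//   std::uint32_t u = *reinterpret_cast<std::uint32_t*>(&f);
// The compiler may assume a uint32_t lvalue never refers to a float object,
// so optimised builds can reorder or drop the accesses.

// Well-defined alternative: copy the object representation.
// Hypothetical helper, not part of the library.
inline std::uint32_t float_bits(float f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "this sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);  // compilers usually turn this into one move
  return u;
}

// Well-defined load of a 32-bit word from a byte buffer, in native byte
// order and with no alignment requirement. Hypothetical helper.
inline std::uint32_t load_u32(const std::uint8_t* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```

On an IEEE-754 platform, `float_bits(1.0f)` yields `0x3F800000`. Using `std::memcpy` keeps the code valid under `-fstrict-aliasing` without requiring a compiler flag to disable the optimisation.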

runlevel5 and others added 17 commits June 17, 2026 13:29
Native 128-bit VSX implementations of the wavelet, colour, and
codestream kernels and the HTJ2K block decoder, with runtime
dispatch via hwcap. Supported targets are POWER9 (ISA 3.0) and
newer, little-endian only; other PPC targets use the generic code
paths.

Beyond a straight port, the kernels use POWER-specific forms where
measurement showed a win: xvrspi for round-to-nearest-away in the
float-to-int conversions, vec_sel for masked selects, and a block
decoder that destuffs the MagSgn bitstream upfront so per-quad bit
consumption is a GPR add instead of a vector-window shift.
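
As a rough scalar illustration of two of the operations named above (not the VSX code itself): round-to-nearest-away resolves ties away from zero, which matches `std::round`, and a masked select takes bits from the second operand wherever the mask bit is set, as `vec_sel` does. The function names below are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Scalar equivalent of a round-to-nearest-away float-to-int conversion.
// std::round resolves halfway cases away from zero (2.5 -> 3, -2.5 -> -3).
// Assumes the rounded value fits in int32_t.
inline std::int32_t round_away_to_int(float x) {
  return static_cast<std::int32_t>(std::round(x));
}

// Scalar equivalent of a bitwise masked select: each result bit comes
// from b where the mask bit is 1 and from a where it is 0.
inline std::uint32_t select_bits(std::uint32_t a, std::uint32_t b,
                                 std::uint32_t mask) {
  return (a & ~mask) | (b & mask);
}
```

For example, `round_away_to_int(2.5f)` gives 3 and `round_away_to_int(-2.5f)` gives -3, whereas a round-half-to-even conversion would give 2 and -2.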

The SIMD block decoder is dispatched everywhere on POWER10, and for
irreversible tile components on POWER9, where it beats the scalar
decoder; reversible content on POWER9 stays scalar, which is
slightly faster there.
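
The dispatch rule described above can be sketched as a small decision function. This is only an illustration of the stated policy; the enum and function names are hypothetical and do not come from the library, and the real code detects the CPU level at runtime via hwcap.

```cpp
// Hypothetical CPU classification; the real code derives this from hwcap.
enum class PowerLevel { Generic, Power9, Power10 };

// Which block decoder to use for a tile component.
enum class BlockDecoder { Scalar, Simd };

// POWER10: always SIMD. POWER9: SIMD only for irreversible components.
// Anything else: the generic (scalar) path.
inline BlockDecoder choose_block_decoder(PowerLevel cpu, bool reversible) {
  switch (cpu) {
    case PowerLevel::Power10:
      return BlockDecoder::Simd;
    case PowerLevel::Power9:
      return reversible ? BlockDecoder::Scalar : BlockDecoder::Simd;
    default:
      return BlockDecoder::Scalar;
  }
}
```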


Assisted-by: Lance Albertson <lance@osuosl.org>
Assisted-by: Thushan Fernando <thushan@thushanfernando.com>

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
aous72 merged commit 80b6c24 into master on Jun 21, 2026
34 checks passed
aous72 deleted the strict_aliasing branch on June 21, 2026 13:05