Align audio hash params with Haitsma-Kalker reference (breaking) by eklinger · Pull Request #54 · aetilius/pHash

eklinger · 2026-05-25T21:25:06Z

Aligns the time-domain and frequency-domain parameters of ph_audiohash with Haitsma-Kalker 2002 ("A Highly Robust Audio Fingerprinting System"). Addresses §4 of the algorithmic-correctness assessment.

Stacked on top of #49 (fix/audio-hash). Merge #49 first, or re-target this PR at master after #49 lands.

What was wrong

The bit derivation, filterbank shape, and confidence score in ph_audiohash already matched Haitsma-Kalker, but several parameters did not:

frame_length was hard-coded to 4096 samples regardless of sample rate. At the example's sr=8000 that is 512 ms, far from the paper's 0.37 s frame.
Frame advance was frame_length / 32 (~97% overlap), giving ~62 frames/sec at sr=8000. The paper specifies 31.25 frames/sec (~32 ms advance).
maxfreq was 3000 Hz; the paper specifies 2000 Hz. The previous range extended the upper band by 1 kHz beyond Haitsma-Kalker.
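
The timing figures quoted above follow directly from the old constants. A minimal sketch of that arithmetic, assuming only the values stated in this description (the names `OLD_FRAME_LENGTH`, `OLD_ADVANCE`, and `old_timing` are illustrative and do not appear in pHash):

```python
# Old (pre-PR) constants as described above; names are hypothetical.
OLD_FRAME_LENGTH = 4096                 # samples, hard-coded
OLD_ADVANCE = OLD_FRAME_LENGTH // 32    # 128 samples between frame starts


def old_timing(sr):
    """Return (frame duration in s, frames per second, overlap fraction)."""
    frame_seconds = OLD_FRAME_LENGTH / sr
    frames_per_second = sr / OLD_ADVANCE
    overlap = 1 - OLD_ADVANCE / OLD_FRAME_LENGTH
    return frame_seconds, frames_per_second, overlap


print(old_timing(8000))  # (0.512, 62.5, 0.96875)
```

At sr=8000 this gives a 512 ms frame, 62.5 frames/sec, and 31/32 (about 97%) overlap.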

What this PR does

frame_length is now derived as the power of 2 closest to sr * 0.37 (the radix-2 FFT requires a power of 2).
Advance is now round(sr / 31.25).
maxfreq is now 2000 Hz.
nfft_half is now frame_length / 2 (was hard-coded 2048).
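
The derivation listed above can be sketched as follows. This is an illustration, not the code in the PR: the function names are hypothetical, and "closest" is assumed to mean smallest absolute distance in samples (a log-scale reading could pick a different power of 2).

```python
def closest_power_of_two(x):
    """Power of 2 with the smallest absolute distance to x (assumes x >= 1)."""
    lower = 1
    while lower * 2 <= x:
        lower *= 2
    upper = lower * 2
    return lower if (x - lower) <= (upper - x) else upper


def derive_params(sr, frame_seconds=0.37, frames_per_second=31.25, maxfreq=2000):
    """Derive the sample-rate-dependent parameters described in this PR."""
    frame_length = closest_power_of_two(sr * frame_seconds)
    return {
        "frame_length": frame_length,                 # radix-2 FFT size
        "advance": round(sr / frames_per_second),     # samples between frames
        "nfft_half": frame_length // 2,
        "maxfreq": maxfreq,                           # Hz
    }


print(derive_params(8000))
```

Under these assumptions, sr=8000 yields frame_length = 2048 (sr * 0.37 is 2960, which is 912 samples from 2048 and 1136 from 4096), advance = 256, and nfft_half = 1024.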

The bit derivation and filter weights are unchanged.

Compatibility

Binary-incompatible hash change. Hashes produced by this code are not compatible with hashes produced by the old parameters.

This also changes the temporal density of the fingerprint. Callers passing block_size to ph_audio_distance_ber will likely want a smaller value (e.g. 64 instead of the example's 256), because the absolute frame count for the same audio drops by ~2×. On short signals the example value of 256 yields M=0 blocks and a neutral confidence of cs = 0.5.
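
To see why a block_size of 256 stops working on short clips, here is a rough count for a 5 s signal at sr=8000. It is a sketch under several assumptions: the frame-count formula and M = frames // block_size are the conventional ones and may differ in detail from pHash, the new frame_length is taken as 2048, and the helper names are hypothetical.

```python
def frame_count(n_samples, frame_length, advance):
    """Number of whole frames that fit in the signal (assumed formula)."""
    if n_samples < frame_length:
        return 0
    return (n_samples - frame_length) // advance + 1


def block_count(n_frames, block_size):
    """Number of whole comparison blocks, M (assumed formula)."""
    return n_frames // block_size


n_samples = 5 * 8000
old_frames = frame_count(n_samples, frame_length=4096, advance=128)
new_frames = frame_count(n_samples, frame_length=2048, advance=256)

print(old_frames, new_frames)             # 281 149
print(block_count(new_frames, 256))       # 0
print(block_count(new_frames, 64))        # 2
```

With the new parameters the clip produces fewer frames than one 256-frame block, so M is 0, while a block_size of 64 still leaves two whole blocks to compare.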

Verification

Identical 5 s 440 Hz sines, block_size = 64: cs = 1.000000
440 Hz vs 441 Hz, block_size = 64: cs = 0.251 (low similarity, as expected)

Test plan

Compiles clean (-Wall) with HAVE_AUDIO_HASH + HAVE_LIBMPG123
test_audiophash -f a.wav -g a.wav -b 64: confidence 1.0
test_audiophash -f 440.wav -g 441.wav -b 64: confidence ~0.25
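
One way to produce test inputs like 440.wav and 441.wav is sketched below with Python's standard library. The format choices (mono, 16-bit PCM, sr=8000, 5 s, half-scale amplitude) are assumptions for illustration; the files actually used for the results above are not described here.

```python
import math
import struct
import wave


def write_sine_wav(path, freq_hz, seconds=5.0, sr=8000, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to path; return the sample count."""
    n = int(round(seconds * sr))
    samples = (
        int(round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / sr)))
        for i in range(n)
    )
    data = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(sr)
        w.writeframes(data)
    return n
```

Calling `write_sine_wav("440.wav", 440)` and `write_sine_wav("441.wav", 441)` would then give two 5 s tones to pass to test_audiophash.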

Audit §4: the bit derivation, filterbank shape, and confidence score in ph_audiohash already matched Haitsma-Kalker 2002, but several parameters did not:

- frame_length was hard-coded to 4096 samples regardless of sample rate. At the example's sr=8000 that is 512 ms, far from the paper's 0.37 s frame. Now derived as the power of 2 closest to sr * 0.37.
- frame advance was frame_length / 32 (97% overlap) giving ~62 frames/s at sr=8000. The paper specifies 31.25 frames/s (~32 ms advance). Now derived as round(sr / 31.25).
- maxfreq was 3000 Hz; the paper specifies 2000 Hz. The previous range extended the upper band by 1 kHz beyond Haitsma-Kalker. Now 2000 Hz.

nfft_half is now frame_length / 2 (was hard-coded 2048). Bit derivation and filter weights are unchanged.

Note: this changes the temporal density of the fingerprint. Callers passing block_size to ph_audio_distance_ber will likely want a smaller value (e.g. 64 instead of the example's 256) because the absolute frame count for the same audio drops by ~2x.

Hashes produced by this code are not compatible with hashes produced by the old parameters.

aetilius merged commit 1151960 into fix/audio-hash May 25, 2026

aetilius deleted the algo/audio-haitsma-kalker branch May 25, 2026 21:43

This was referenced May 26, 2026

Using pHash data from an older version of pHash w/the current version #14

Closed

Why are older versions not compatible? #46

Closed

aetilius merged 1 commit into fix/audio-hash from algo/audio-haitsma-kalker
