Shazam-style audio fingerprinting app built for the EE200 (Signals, Systems & Networks) course project.

1. Install dependencies: `pip install -r requirements.txt`
2. Build the fingerprint database (run once; takes about 2 min): `python build_db.py`
3. Generate sample query clips (for the Identify tab): `python generate_samples.py`
4. Launch the app: `streamlit run app.py`

- Spectrogram: each song is turned into a time-frequency image using the STFT.
- Constellation: local maxima (peaks) in the spectrogram are extracted as landmark points.
- Hashing: nearby peaks are paired into compact hashes `(f1, f2, Δt)` for efficient lookup.
- Matching: query hashes are looked up in the pre-built database; matching hashes vote for a time offset, and a true match produces a sharp spike in the offset histogram.

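
The four stages above can be sketched end to end in a few dozen lines. This is a minimal illustration using only NumPy, not the code in `fingerprint.py`: the function names (`spectrogram`, `constellation`, `hashes`, `index_songs`, `identify`), the window and neighbourhood sizes, the fan-out, and the synthetic noise "songs" are all assumptions made for the example.

```python
from collections import Counter, defaultdict

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def spectrogram(x, n_fft=512, hop=256):
    """Magnitude STFT; rows are frequency bins, columns are time frames."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T


def constellation(S, size=8):
    """Return (frame, bin) peaks that are the maximum of their local neighbourhood."""
    k = 2 * size + 1
    padded = np.pad(S, size, mode="constant")
    # A square max filter is separable: slide along frequency, then along time.
    col_max = sliding_window_view(padded, k, axis=0).max(axis=-1)
    local_max = sliding_window_view(col_max, k, axis=1).max(axis=-1)
    mask = (S == local_max) & (S > S.mean())  # drop weak or padded points
    bins, frames = np.nonzero(mask)
    order = np.argsort(frames, kind="stable")  # sort peaks by time
    return list(zip(frames[order].tolist(), bins[order].tolist()))


def hashes(peaks, fan_out=5, max_dt=50):
    """Pair each anchor peak with the next few peaks: ((f1, f2, dt), t1)."""
    out = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            dt = t2 - t1
            if 0 < dt <= max_dt:
                out.append(((f1, f2, dt), t1))
    return out


def index_songs(songs):
    """Map every hash to the (song name, anchor frame) pairs where it occurs."""
    db = defaultdict(list)
    for name, x in songs.items():
        for h, t in hashes(constellation(spectrogram(x))):
            db[h].append((name, t))
    return db


def identify(db, clip):
    """Vote on (song, time offset); the tallest histogram bin wins."""
    votes = Counter()
    for h, t_query in hashes(constellation(spectrogram(clip))):
        for name, t_song in db.get(h, ()):
            votes[(name, t_song - t_query)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Synthetic stand-ins for real audio: three independent noise signals.
rng = np.random.default_rng(0)
songs = {name: rng.standard_normal(80_000) for name in ("a", "b", "c")}
db = index_songs(songs)

# Query: an excerpt of song "b" starting 40 hops (frames) into the track.
clip = songs["b"][40 * 256 : 40 * 256 + 24_000]
name, offset, count = identify(db, clip)
print(name, offset, count)
```

Because the excerpt starts exactly on a frame boundary, its interior peaks coincide with those of the full track, so the votes pile up on song `b` at an offset of 40 frames. Real recordings are noisier and not frame-aligned, which is why a robust matcher looks for a clear spike in the offset histogram instead of requiring exact agreement.
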
| Tab | What it does |
|---|---|
| Library | Lists all indexed songs with duration and fingerprint stats |
| Identify | Upload a clip (or pick a sample) → see the full pipeline: spectrogram, constellation, offset histogram, matched song |
| Batch | Upload multiple clips → get a `results.csv` file with `filename` and `prediction` columns |
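
As an illustration of the Batch output, the sketch below writes a two-column CSV with the standard `csv` module. It is not taken from `app.py`: the helper `write_results`, the clip filenames, and the song titles are hypothetical, and the file is written to a temporary directory so nothing in the project is touched.

```python
import csv
import tempfile
from pathlib import Path


def write_results(predictions, out_path):
    """Write (filename, prediction) pairs under a header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["filename", "prediction"])
        writer.writerows(predictions)
    return Path(out_path)


# Hypothetical batch results: one predicted song per uploaded clip.
predictions = [("clip_01.wav", "Song A"), ("clip_02.wav", "Song B")]

with tempfile.TemporaryDirectory() as tmp:
    path = write_results(predictions, Path(tmp) / "results.csv")
    # Read the file back to confirm the layout.
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.reader(fh))

print(rows)
```
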

| File | Purpose |
|---|---|
| `fingerprint.py` | Core fingerprinting engine |
| `build_db.py` | Indexes all songs into pickle databases |
| `generate_samples.py` | Creates sample query clips |
| `app.py` | Streamlit web app |
| `Q3_data/` | Song library (50 .mp3 files) |
| `samples/` | Sample query clips |
| `song_database.pkl` | Pre-built hash database |
| `song_metadata.pkl` | Song metadata (durations, peaks) |