A C++ command-line utility designed to analyze the "musical energy" of audio files and translate it into MIDI automation data. This module is ideal for workflows requiring deterministic, energy-aware object panning or dynamic processing.
- Multi-Band DSP Analysis: Splits audio into 5 critical frequency bands:
  - Low: 20–120 Hz (Foundation/Bass)
  - Low-Mid: 120–500 Hz (Body/Warmth)
  - Mid: 500 Hz – 2 kHz (Presence/Core)
  - High-Mid: 2–6 kHz (Clarity/Definition)
  - Air: 6–20 kHz (Sheen/Brilliance)
- Intelligent Energy Scoring: Combines weighted RMS loudness with Loudness Range (LRA) to capture both steady-state intensity and dynamic impact.
- High-Fidelity Input: Supports WAV and MP3 formats via the high-performance dr_libs headers.
- MIDI Automation: Generates standard MIDI files containing normalized energy scores mapped to MIDI CC 11 (Expression) at high temporal resolution (20 ms steps).
- Data Cleanup Mode: Efficient event decimation using an Exponential Moving Average (EMA) smoothing pass and significance thresholding. This reduces the number of MIDI events while preserving critical movement and timing alignment.
- Lightweight & Portable: Zero external library dependencies (uses the header-only dr_libs included in the source).
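
The cleanup mode described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the `CcEvent` struct, the `decimate` function, the `alpha` smoothing factor, and the rule "keep an event only when the smoothed value has moved at least `threshold` away from the last emitted value" are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical event type: one controller value at a given analysis step.
struct CcEvent {
    int step;      // index of the analysis step (20 ms apart per the feature list)
    double value;  // smoothed energy score on the 0-127 scale
};

// Smooth the scores with an Exponential Moving Average, then keep an event
// only when the smoothed value differs from the last emitted value by at
// least `threshold`. Both parameters are illustrative assumptions.
std::vector<CcEvent> decimate(const std::vector<double>& scores,
                              double alpha, double threshold) {
    std::vector<CcEvent> out;
    if (scores.empty()) return out;

    double ema = scores[0];
    double lastEmitted = ema;
    out.push_back({0, ema});  // always emit the first value as a starting point

    for (std::size_t i = 1; i < scores.size(); ++i) {
        // EMA update: new samples get weight alpha, history gets (1 - alpha).
        ema = alpha * scores[i] + (1.0 - alpha) * ema;

        // Significance test: skip changes too small to matter.
        if (std::fabs(ema - lastEmitted) >= threshold) {
            out.push_back({static_cast<int>(i), ema});
            lastEmitted = ema;
        }
    }
    return out;
}
```

In this sketch, a perfectly steady input collapses to a single event, while a sudden jump produces a short run of events as the EMA converges on the new level. Because each kept event retains its original step index, timing alignment is preserved.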
The project uses a standard makefile and requires a C++17 compatible compiler.

    make

On Windows, ensure g++ and make (e.g., from MSYS2 or MinGW) are in your PATH, then run the same command:

    make

To enable extensive console logging and generate a debug_energy.log file:

    make clean
    make DEBUG=1

Run the analyzer by providing an input audio file. You can optionally specify the output MIDI filename and enable cleanup mode.

    ./bin/AnalEnergy <input_audio_file> [output_midi_file] [--cleanup [threshold]]

Example:

    ./bin/AnalEnergy input_track.wav analysis_result.mid --cleanup 2.0

- Audio Loader: Mono-mixes stereo/multichannel input for consistent energy analysis.
- Filterbank: Parallel biquad filters isolate frequency ranges.
- Windowing: Audio is processed in 100ms frames for high temporal resolution.
- Scoring Engine: $E = \alpha \cdot L_{bands} + \beta \cdot LRA$
  - $L_{bands}$ is a weighted sum of band RMS values, prioritizing bass and high-mids.
  - $LRA$ measures the peak-to-trough dynamics within each frame.
- Normalization: Final energy scores are scaled to the MIDI range (0-127).
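
The scoring and normalization steps above could look roughly like the following sketch. The function names (`rms`, `energyScore`, `normalizeToMidi`), the per-band weight array, and the choice of min-max scaling across the whole track are assumptions made for illustration. The source only states that scores are a weighted combination of band RMS and LRA and are scaled to 0-127.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t kBands = 5;  // Low, Low-Mid, Mid, High-Mid, Air

// Root-mean-square level of one frame of samples.
double rms(const std::vector<double>& frame) {
    if (frame.empty()) return 0.0;
    double sumSq = 0.0;
    for (double s : frame) sumSq += s * s;
    return std::sqrt(sumSq / static_cast<double>(frame.size()));
}

// E = alpha * L_bands + beta * LRA, where L_bands is the weighted sum of the
// per-band RMS values. The weights and alpha/beta values are caller-supplied
// here; the project's actual constants are not documented in this README.
double energyScore(const std::array<double, kBands>& bandRms,
                   const std::array<double, kBands>& weights,
                   double lra, double alpha, double beta) {
    double lBands = 0.0;
    for (std::size_t i = 0; i < kBands; ++i) {
        lBands += weights[i] * bandRms[i];
    }
    return alpha * lBands + beta * lra;
}

// Scale raw energy scores to the MIDI range 0-127 using min-max scaling
// (an assumed strategy). A constant input maps to all zeros.
std::vector<int> normalizeToMidi(const std::vector<double>& energy) {
    std::vector<int> cc(energy.size(), 0);
    if (energy.empty()) return cc;

    const auto mm = std::minmax_element(energy.begin(), energy.end());
    const double lo = *mm.first;
    const double range = *mm.second - lo;

    for (std::size_t i = 0; i < energy.size(); ++i) {
        const double unit = range > 0.0 ? (energy[i] - lo) / range : 0.0;
        cc[i] = static_cast<int>(std::lround(unit * 127.0));
    }
    return cc;
}
```

To prioritize bass and high-mids as described, a caller would pass larger weights at indices 0 and 3 of `weights`. Min-max scaling makes the quietest frame 0 and the loudest 127, so the output is relative to the track rather than an absolute loudness measure.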
This project is licensed under the GNU General Public License v2.0 (GPL-2.0).
You are free to use, modify, and distribute this software, provided that any derivative work is also licensed under GPLv2.
Included third-party components:
- dr_wav: Public Domain / MIT-0
- dr_mp3: Public Domain / MIT-0
These libraries are permissively licensed and compatible with GPLv2.