The Text2Midi CLI generates symbolic music (MIDI) from natural language descriptions directly from your terminal.
The entry point for the CLI is `src/cli.py`. It can be run as a module from the root directory:

```bash
python -m src.cli --text "A peaceful piano melody" --output peaceful.mid
```

Or using uv:

```bash
uv run src/cli.py --text "A peaceful piano melody" --output peaceful.mid
```

| Argument | Short | Type | Default | Description |
|---|---|---|---|---|
| `--text` | `-t` | str | Required | Natural language description of the music to generate. |
| `--profile` | `-p` | str | `balanced` | Generation profile to use. Options: `one-shot` (fast), `balanced` (default), `deep-search` (quality), `midillm-fast`. |
| `--output` | `-o` | str | `output.mid` | Output path for the generated MIDI file. |
| `--verbose` | `-v` | flag | False | Enables verbose logging output (DEBUG level). |
| `--print-prompt` | | flag | False | Prints the internal technical prompt (generated by the translator or passed through) to stdout. |
| `--strict-instruments` | | flag | False | Enables strict instrument checking, penalizing the generation of unrequested instruments. |
| `--translator-model` | | str | None | Google AI Studio model for intent translation (e.g., `gemini-2.5-flash`). If omitted, the translation phase is bypassed (pass-through mode). |

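To make the table above concrete, here is a minimal sketch of how a parser with these arguments could be declared using Python's standard `argparse` module. It is not taken from `src/cli.py`: the function name `build_parser`, the `PROFILES` tuple, the help strings, and the use of `choices` to restrict `--profile` are all assumptions made for illustration.

```python
import argparse

# Profile names as listed in the table above. Restricting --profile with
# `choices` is an assumption; the real CLI may validate it differently.
PROFILES = ("one-shot", "balanced", "deep-search", "midillm-fast")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        prog="python -m src.cli",
        description="Generate a MIDI file from a text description.",
    )
    parser.add_argument("--text", "-t", type=str, required=True,
                        help="Natural language description of the music.")
    parser.add_argument("--profile", "-p", type=str, default="balanced",
                        choices=PROFILES, help="Generation profile to use.")
    parser.add_argument("--output", "-o", type=str, default="output.mid",
                        help="Output path for the generated MIDI file.")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable DEBUG-level logging.")
    parser.add_argument("--print-prompt", action="store_true",
                        help="Print the technical prompt to stdout.")
    parser.add_argument("--strict-instruments", action="store_true",
                        help="Penalize unrequested instruments.")
    parser.add_argument("--translator-model", type=str, default=None,
                        help="Translator model; omit for pass-through mode.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch can be
# run anywhere.
args = build_parser().parse_args(["-t", "Upbeat jazz trio", "-p", "midillm-fast"])
print(args.profile, args.output, args.translator_model)
# prints: midillm-fast output.mid None
```

Because `--translator-model` defaults to `None`, a check such as `args.translator_model is None` is one plausible way to detect pass-through mode in a parser built like this.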

Generates MIDI directly using the provided text as the technical prompt for the Text2Midi model.

```bash
python -m src.cli --text "Acoustic Grand Piano, 4/4 time signature, 120 bpm" --output simple.mid
```

Uses an LLM (requires `GOOGLE_API_KEY` in `.env`) to translate a natural language description into a detailed technical prompt before generation.

```bash
python -m src.cli \
  --text "A sad and slow cinematic orchestra playing a melancholy tune" \
  --translator-model "gemini-2.5-flash" \
  --profile deep-search \
  --output cinematic.mid
```

Utilizes the MidiLLM approach (batch generation + Best-of-N scoring) instead of the default progressive search.

```bash
python -m src.cli \
  --text "Upbeat jazz trio" \
  --profile midillm-fast \
  --output jazz.mid
```

Generates the music, ensures no extra instruments are added, prints the generated technical prompt, and shows detailed logs.

```bash
python -m src.cli \
  --text "Solo acoustic guitar playing a folk melody" \
  --profile balanced \
  --strict-instruments \
  --print-prompt \
  --verbose \
  --output guitar_solo.mid
```

When a generation is successful, the CLI writes two files to the specified output path:

- The MIDI file (`.mid`): The generated symbolic music.
- The Technical Prompt (`.txt`): A text file saved alongside the MIDI (e.g., `output.txt`) containing the exact prompt that guided the generation model. This is useful for auditing the translator's output.
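
The relationship between the two output files can be sketched as follows. This is an illustration under assumptions, not the CLI's actual code: the helper names `prompt_path_for` and `write_outputs` are hypothetical, and the sketch assumes the `.txt` name is obtained by swapping the extension of the `--output` path, which is consistent with the `output.mid` / `output.txt` example above.

```python
from pathlib import Path


def prompt_path_for(output: str) -> Path:
    """Return the .txt path that sits next to the given MIDI output path.

    Assumption: only the file extension changes (output.mid -> output.txt).
    """
    return Path(output).with_suffix(".txt")


def write_outputs(midi_bytes: bytes, prompt: str, output: str) -> tuple[Path, Path]:
    """Write the MIDI data and its technical prompt side by side."""
    midi_path = Path(output)
    # Create the target directory if it does not exist yet.
    midi_path.parent.mkdir(parents=True, exist_ok=True)
    midi_path.write_bytes(midi_bytes)

    txt_path = prompt_path_for(output)
    txt_path.write_text(prompt, encoding="utf-8")
    return midi_path, txt_path


print(prompt_path_for("guitar_solo.mid"))
# prints: guitar_solo.txt
```

Keeping the prompt next to the MIDI under the same base name makes it straightforward to pair each generated file with the prompt that produced it when reviewing a batch of outputs.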