Command Line Interface (CLI) Reference

The Text2Midi CLI generates symbolic music (MIDI) from natural language descriptions, directly from your terminal.

Basic Usage

The CLI entry point is src/cli.py. Run it as a module from the repository root:

python -m src.cli --text "A peaceful piano melody" --output peaceful.mid

Or using uv:

uv run src/cli.py --text "A peaceful piano melody" --output peaceful.mid

Arguments

| Argument | Short | Type | Default | Description |
|---|---|---|---|---|
| --text | -t | str | Required | Natural language description of the music to generate. |
| --profile | -p | str | balanced | Generation profile to use. Options: one-shot (fast), balanced (default), deep-search (quality), midillm-fast. |
| --output | -o | str | output.mid | Output path for the generated MIDI file. |
| --verbose | -v | flag | False | Enables verbose logging output (DEBUG level). |
| --print-prompt | | flag | False | Prints the internal technical prompt (generated by the translator or passed through) to stdout. |
| --strict-instruments | | flag | False | Enables strict instrument checking, penalizing the generation of unrequested instruments. |
| --translator-model | | str | None | Google AI Studio model for intent translation (e.g., gemini-2.5-flash). If omitted, the translation phase is bypassed (pass-through mode). |
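
For readers who want to see how these arguments fit together, here is a minimal argparse sketch that mirrors the table above. It is an illustration only: the real src/cli.py may declare its parser differently, and both the build_parser name and the use of choices to restrict --profile are assumptions.

```python
import argparse

# Profile names as listed in the arguments table.
PROFILES = ["one-shot", "balanced", "deep-search", "midillm-fast"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        prog="src.cli",
        description="Generate MIDI from a natural language description.",
    )
    parser.add_argument("--text", "-t", type=str, required=True,
                        help="Description of the music to generate.")
    # Restricting values with `choices` is an assumption of this sketch.
    parser.add_argument("--profile", "-p", type=str, default="balanced",
                        choices=PROFILES, help="Generation profile.")
    parser.add_argument("--output", "-o", type=str, default="output.mid",
                        help="Output path for the MIDI file.")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable DEBUG-level logging.")
    parser.add_argument("--print-prompt", action="store_true",
                        help="Print the technical prompt to stdout.")
    parser.add_argument("--strict-instruments", action="store_true",
                        help="Penalize unrequested instruments.")
    parser.add_argument("--translator-model", type=str, default=None,
                        help="Translator model; omit for pass-through mode.")
    return parser
```

Calling build_parser().parse_args(["--text", "A peaceful piano melody"]) yields a namespace whose profile is balanced, whose output is output.mid, and whose translator_model is None, matching the defaults in the table.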

Examples

1. Simple Generation (Bypass Translation)

Generates MIDI directly using the provided text as the technical prompt for the Text2Midi model.

python -m src.cli --text "Acoustic Grand Piano, 4/4 time signature, 120 bpm" --output simple.mid

2. Full Pipeline with LLM Translation

Uses an LLM (requires GOOGLE_API_KEY in .env) to translate a natural language description into a detailed technical prompt before generation.

python -m src.cli \
  --text "A sad and slow cinematic orchestra playing a melancholy tune" \
  --translator-model "gemini-2.5-flash" \
  --profile deep-search \
  --output cinematic.mid
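
The difference between examples 1 and 2 comes down to whether --translator-model is set. The sketch below shows one way that decision could look. The function resolve_technical_prompt and the translate callback are hypothetical names introduced for illustration, not the project's actual API.

```python
from typing import Callable, Optional


def resolve_technical_prompt(
    text: str,
    translator_model: Optional[str],
    translate: Callable[[str, str], str],
) -> str:
    """Return the prompt that would be handed to the generation model.

    Hypothetical helper: with no translator model the user's text is
    passed through unchanged; otherwise it is translated first.
    """
    if translator_model is None:
        # Pass-through mode: the text is already the technical prompt.
        return text
    # Translation mode: delegate to the (assumed) LLM-backed translator.
    return translate(text, translator_model)


def fake_translate(text: str, model: str) -> str:
    """Stand-in translator so the sketch runs without an API key."""
    return f"[{model}] {text}"
```

In pass-through mode the text should already read like a technical prompt (instrument, time signature, tempo), as in example 1.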

3. MidiLLM Batch Generation Strategy

Uses the MidiLLM approach (batch generation + Best-of-N scoring) instead of the default progressive search.

python -m src.cli \
  --text "Upbeat jazz trio" \
  --profile midillm-fast \
  --output jazz.mid
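
To make the Best-of-N idea concrete, here is a generic sketch: produce a batch of N candidates, score each one, and keep the highest scorer. The generate and score callables are placeholders, and the toy in_scale_fraction scorer is invented for this example. The project's real generator and scoring function are not described in this document.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Pitch classes of the C major scale (used only by the toy scorer below).
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}


def best_of_n(generate: Callable[[], T], score: Callable[[T], float], n: int) -> T:
    """Generate n candidates as a batch and return the best-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[T] = [generate() for _ in range(n)]
    return max(candidates, key=score)


def in_scale_fraction(pitches: List[int]) -> float:
    """Toy score: fraction of MIDI pitches that fall in C major."""
    if not pitches:
        return 0.0
    return sum(1 for p in pitches if p % 12 in C_MAJOR) / len(pitches)
```

Unlike a progressive search, which refines its work step by step, this strategy commits to N complete candidates up front and makes a single selection at the end.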

4. Advanced Debugging

Generates the music with strict instrument checking (unrequested instruments are penalized), prints the generated technical prompt, and shows detailed logs.

python -m src.cli \
  --text "Solo acoustic guitar playing a folk melody" \
  --profile balanced \
  --strict-instruments \
  --print-prompt \
  --verbose \
  --output guitar_solo.mid

Generated Outputs

When generation succeeds, the CLI writes two files, both named after the specified output path:

  1. The MIDI file (.mid): The generated symbolic music.
  2. The Technical Prompt (.txt): A text file saved alongside the MIDI (e.g., output.txt) containing the exact prompt that guided the generation model. This is useful for auditing the translator's output.
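
If you need to find the prompt file from a script, the naming shown above (output.mid alongside output.txt) can be reproduced with pathlib. This is a small sketch that assumes the CLI simply swaps the file extension. The helper name prompt_path_for is hypothetical.

```python
from pathlib import Path


def prompt_path_for(midi_path: str) -> Path:
    """Return the assumed location of the technical prompt file.

    Assumption: the prompt sits next to the MIDI file with the same
    stem and a .txt extension, as in the output.mid / output.txt example.
    """
    return Path(midi_path).with_suffix(".txt")
```

For instance, prompt_path_for("guitar_solo.mid") points at guitar_solo.txt in the same directory.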