Merged
3 changes: 2 additions & 1 deletion .gitignore
@@ -23,5 +23,6 @@ archive*
src/wandb/
src/configs/*/case*
.cache/
+.cursor/

-dependencies/
+dependencies
14 changes: 9 additions & 5 deletions README.md
@@ -152,18 +152,22 @@ python compare_env.py --config test2/case1/compare_deepseek-chat.yml
```

### 🔊 Streaming TTS (Optional)
-By default, the TTS pipeline generates audio serially and trims sentences that exceed the time budget. With `streaming_tts: true` in your config, a **streaming pipeline** is used instead:
+By default, the TTS pipeline generates audio serially and trims sentences that exceed the time budget. With `streaming_tts: true` on a debater’s config entry, a **streaming pipeline** is used instead:

- **Chunk-based processing**: the debate speech is split into paragraph-level chunks, each assigned a proportional share of the total time budget (opening: 240s, rebuttal: 240s, closing: 120s).
- **Adaptive refinement**: FastSpeech2 estimates each chunk's duration; if off-target, an LLM rewrites the chunk to hit the target word count. Multiple TTS candidates are submitted in parallel and the closest-to-target is picked.
- **Streaming overlap**: while chunk N plays, chunk N+1 is being refined and TTS-generated, minimizing gaps.
- **No information loss**: instead of trimming sentences, text is rewritten to fit the budget.
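
As a rough illustration of the chunk-based budgeting described above, here is a minimal sketch. It is not the repository's implementation: the function name `allocate_chunk_budgets`, the blank-line paragraph split, and the choice to weight each chunk by its word count are all assumptions made for the example. Only the per-stage budgets (240s, 240s, 120s) come from the README text.

```python
# Hypothetical sketch: names and the word-count weighting are assumptions,
# not the project's actual code.

# Per-stage time budgets in seconds, as stated in the README.
STAGE_BUDGET_S = {"opening": 240, "rebuttal": 240, "closing": 120}


def allocate_chunk_budgets(speech: str, stage: str) -> list[tuple[str, float]]:
    """Split a speech into paragraph-level chunks and give each chunk a share
    of the stage budget proportional to its word count (assumed weighting)."""
    # Paragraphs are assumed to be separated by blank lines.
    chunks = [p.strip() for p in speech.split("\n\n") if p.strip()]
    word_counts = [len(chunk.split()) for chunk in chunks]
    total_words = sum(word_counts)
    if total_words == 0:
        return []
    budget = STAGE_BUDGET_S[stage]
    return [
        (chunk, budget * count / total_words)
        for chunk, count in zip(chunks, word_counts)
    ]


if __name__ == "__main__":
    demo = "one two three\n\nfour five six seven eight nine\n\nten eleven twelve"
    for text, seconds in allocate_chunk_budgets(demo, "closing"):
        print(f"{seconds:6.1f}s  {text}")
```

A real pipeline would then compare each chunk's estimated duration against its share and rewrite the chunk when the two diverge, as the bullets above describe.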

-To enable, set `streaming_tts: true` in the `env` section of your config:
+To enable, set `streaming_tts: true` on each **debater** entry in your config (per-agent):
```yaml
-env:
-  time_control: true
-  streaming_tts: true
+debater:
+  - side: for
+    type: treedebater
+    streaming_tts: true
+  - side: against
+    type: treedebater
+    streaming_tts: true
```

After a streaming run, each speech produces a `*_chunks/` directory containing per-chunk audio, text, and a `chunk_profile.csv` with timing details. To visualize the overlap timeline:
4 changes: 4 additions & 0 deletions requirements.txt
@@ -1,4 +1,5 @@
openai
+pydub
backoff
tqdm
numpy
@@ -7,6 +8,8 @@ google-generativeai
matplotlib
seaborn
litellm
+pydantic>=2
+instructor
accelerate>=0.26.0
wandb
tavily-python
@@ -15,4 +18,5 @@ pulp
g2p_en
fastapi
tavily-python
+pydub
git+https://github.com/UKPLab/sentence-transformers.git