feat(speech): add Tencent Cloud ASR and TTS backend #90

Open
kaileliu wants to merge 1 commit into dev from feat/tencent-speech-api

@kaileliu

Collaborator

This PR adds a Tencent Cloud speech backend to services/speech. Setting SPEECH_BACKEND=tencent lets the existing liaison voice pipeline use cloud ASR and TTS, while the liaison-facing contracts (robonix/service/speech/asr_stream and robonix/service/speech/tts) stay unchanged.

Changes:

- Tencent real-time ASR over WebSocket
- Tencent TextToVoice TTS
- Env-based credential loading
- An adapter that converts Tencent ASR cumulative partial results into the incremental ASR events expected by liaison

Validation:

- Python syntax checks
- End-to-end liaison demo: input.wav -> mock mic -> Tencent ASR -> liaison -> mock pilot -> Tencent TTS -> mock speaker -> output.wav
- ASR transcript: 请介绍一下你自己。 ("Please introduce yourself.")
- Output WAV generated as 16 kHz mono pcm_s16le
- Measured latency from ASR_FINAL to TTS_DONE: 0.653 s, within the target of less than 3 seconds
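For reviewers unfamiliar with the cumulative-to-incremental conversion mentioned in the description, here is a minimal sketch of the idea. It is not the code in this PR: `AsrEvent`, `CumulativeToIncremental`, and the `"partial"`/`"final"` kinds are hypothetical names chosen for illustration, and the way revised partials are handled (emit only the text after the longest common prefix) is an assumption, not necessarily what the adapter in tencent_cloud.py does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AsrEvent:
    """Hypothetical incremental event; the real liaison contract may differ."""
    kind: str  # "partial" or "final" (assumed labels)
    text: str  # only the text that is new since the previous event


class CumulativeToIncremental:
    """Turns cumulative partial transcripts into incremental deltas.

    A cumulative recognizer resends the whole sentence so far on every
    update ("he", "hello", "hello wor", ...). This adapter remembers what
    it has already emitted and forwards only the new suffix.
    """

    def __init__(self) -> None:
        self._emitted = ""

    def feed(self, cumulative: str, is_final: bool = False) -> Optional[AsrEvent]:
        # Length of the longest common prefix with the text emitted so far.
        # If the recognizer revised earlier text, the prefix is shorter than
        # self._emitted and everything after the divergence point is re-sent
        # (assumption: downstream tolerates this).
        prefix = 0
        for old, new in zip(self._emitted, cumulative):
            if old != new:
                break
            prefix += 1
        delta = cumulative[prefix:]

        if is_final:
            # A final result closes the sentence; start fresh for the next one.
            self._emitted = ""
            return AsrEvent("final", delta)

        self._emitted = cumulative
        if not delta:
            return None  # nothing new, so no event is emitted
        return AsrEvent("partial", delta)


if __name__ == "__main__":
    adapter = CumulativeToIncremental()
    for text, final in [("请", False), ("请介绍", False), ("请介绍一下你自己。", True)]:
        print(adapter.feed(text, is_final=final))
```

Fed the cumulative partials of the demo transcript, the sketch emits 请, then 介绍, then a final event carrying 一下你自己。, so a consumer that concatenates event text reconstructs the full sentence.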

Comment thread services/speech/speech_service/tencent_cloud.py Dismissed
2 participants