Skip to content

feat(conversation): call/SMS confirmation and multi-turn follow-up#25

Merged
ryon137 merged 7 commits into
mainfrom
feat/confirmation-and-follow-up
Jun 17, 2026
Merged

feat(conversation): call/SMS confirmation and multi-turn follow-up#25
ryon137 merged 7 commits into
mainfrom
feat/confirmation-and-follow-up

Conversation

@ryon137

@ryon137 ryon137 commented Jun 7, 2026

Copy link
Copy Markdown
Owner

Summary

  • Call/SMS verbal confirmation: after Prism parses a call or SMS action, it asks "Call Mom?" / "Text John?" and immediately listens for a verbal yes/no (8s window) before executing — affirm executes, deny or timeout cancels and speaks "Cancelled."
  • Multi-turn follow-up: after any plain (non-action) response, Prism holds a 5s re-listen window; if the user speaks, the query is inferred with up to 3 turns of conversation history injected into the Gemma prompt; SCO and audioTrack stay alive across turns
  • Shared re-listen infrastructure: afterOnDeviceTurnComplete() branches on ReListenMode (CONFIRMATION / FOLLOW_UP / NONE) — skips SCO/AudioTrack teardown and calls startReListenRecording() instead; context clears on new wake word trigger, denial, or timeout

New files

File Purpose
ConfirmationParser.kt Maps verbal yes/no transcript to AFFIRM/DENY; questionFor() generates confirmation prompt
ConversationContext.kt Rolling 3-turn history (ArrayDeque, drops oldest on overflow)
ConfirmationParserTest.kt 15 tests covering affirmatives, negatives, partial matches, case insensitivity
ConversationContextTest.kt 9 tests covering capacity cap, ordering, clear/re-add
InferenceManagerPromptTest.kt +7 multi-turn tests (history ordering, system prompt deduplication, prompt structure)

Test plan

  • Install APK and connect glasses
  • Say "Call [contact]" → confirm Prism asks "Call [Name]?" → say "yes" → call initiates
  • Say "Call [contact]" → say "no" → confirm "Cancelled." spoken, no call made
  • Say "Call [contact]" → stay silent 8s → confirm timeout cancels
  • Ask a question → ask a follow-up within 5s → confirm Prism uses prior context in answer
  • Ask 4+ follow-ups → confirm oldest turns are dropped (no crash)
  • Say wake word mid-follow-up window → confirm context clears for new conversation
  • Confirm all 49 unit tests pass: ./gradlew :app:test

🤖 Generated with Claude Code

ryon137 and others added 7 commits June 5, 2026 19:22
- ConfirmationParser: maps verbal yes/no to AFFIRM/DENY; questionFor()
  generates "Call Name?" / "Text Name?" prompts
- ConversationContext: rolling 3-turn history (ArrayDeque, drops oldest)
- InferenceManager: multi-turn prompt builder; history injected into
  Gemma chat template with system prompt anchored to first user turn
- GlassesAIService: AWAITING_CONFIRMATION and FOLLOW_UP_LISTENING states;
  ReListenMode enum drives afterOnDeviceTurnComplete() branching; SCO and
  audioTrack kept alive for re-listen window; startReListenRecording()
  recreates glassesMicRecord, posts timeout runnable (8 s confirmation,
  5 s follow-up); conversationContext populated on each FOLLOW_UP turn;
  cleared on new wake word trigger or cancellation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n beep

Whisper returns "Yes." / "No." with trailing punctuation — the parser's
startsWith exact match failed, causing all confirmations to cancel.
Strip .,!? before matching so "Yes." → "yes" → AFFIRM.

Also play ascending beep when entering AWAITING_CONFIRMATION or
FOLLOW_UP_LISTENING so the user knows Prism is ready to listen.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
No beep in either direction when entering confirmation or follow-up
listening windows. Descending end-of-turn chime only plays when
reListenMode is NONE (Prism is done listening entirely).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lose

Move descending chime out of TTS thread and into afterOnDeviceTurnComplete()
(mode == NONE path only). Re-listen transitions (confirmation, follow-up)
are now completely silent — no chime when the mic pauses between turns.
Ascending chime remains only in onScoConnected() at session start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…acts

Stopping and recreating glassesMicRecord between turns caused audible
hardware artifacts (perceived as chimes) on the glasses SCO audio path.
Now the AudioRecord stays alive for the entire session — re-listen just
starts a new reader thread without touching the underlying mic handle.
Stop/release only happens on full session teardown (mode == NONE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…DENY

glassesMicRecord stays alive across turns, so its internal buffer holds
audio recorded during TTS playback. Reading that echo immediately caused
speechStarted=true, a junk transcript, a DENY, and a full session teardown
(descending beep) followed by SCO reconnect (ascending beep).

Drain 4 chunks (~400ms) before entering the VAD loop so only fresh
microphone audio from the user is evaluated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ryon137 ryon137 merged commit 627175c into main Jun 17, 2026
1 check passed
@ryon137 ryon137 deleted the feat/confirmation-and-follow-up branch June 17, 2026 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant