Use iOS Kiosk mode as Apple's STT and TTS server for Assist #176

@bgoncal

Problem statement

Good STT and TTS engines are hard to find for some languages. Apple provides APIs that let apps use on-device audio and text processing, the "same" technology used by Siri and keyboard dictation.

Because this processing is on-device and high quality, it aligns with our goal of making Assist usable without relying on the cloud for STT and TTS. Using the recently introduced "Kiosk mode" for iOS, we could turn those kiosk devices into home servers that process audio and text for Assist.

I vibe coded a Mac app as a proof of concept and have been using its STT server ever since; it is excellent. It runs on a Mac mini. An iPad was not evaluated, but it should have similar performance.
https://github.com/bgoncal/Wyoming-Apple-STT-Server
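
To make the "home server" idea more concrete, here is a rough sketch of the kind of exchange such a server would handle. It assumes the server speaks the Wyoming protocol, as the proof-of-concept repository name suggests, and that each event is framed as one JSON header line followed by an optional binary payload. The function names (`write_event`, `read_event`, `handle_stt_session`) and the `recognize` callback are hypothetical stand-ins. In the real app, the callback would be backed by Apple's on-device speech recognition. This is not the proof of concept's actual code.

```python
import io
import json
from typing import Callable, Optional, Tuple


def write_event(stream, event_type: str, data: Optional[dict] = None,
                payload: bytes = b"") -> None:
    """Write one event: a JSON header line, then the raw payload bytes (if any)."""
    header = {"type": event_type, "data": data or {}}
    if payload:
        header["payload_length"] = len(payload)
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    stream.write(payload)


def read_event(stream) -> Optional[Tuple[str, dict, bytes]]:
    """Read one event; return None when the stream is exhausted."""
    line = stream.readline()
    if not line:
        return None
    header = json.loads(line)
    data = header.get("data") or {}
    data_length = header.get("data_length") or 0
    if data_length:
        # Extra JSON data may follow the header line (assumed framing).
        data.update(json.loads(stream.read(data_length)))
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header["type"], data, payload


def handle_stt_session(inp, out, recognize: Callable[[bytes], str]) -> None:
    """Collect audio chunks until audio-stop, then reply with a transcript."""
    audio = bytearray()
    while (event := read_event(inp)) is not None:
        event_type, _data, payload = event
        if event_type == "audio-chunk":
            audio.extend(payload)
        elif event_type == "audio-stop":
            # `recognize` is a placeholder for the on-device recognizer.
            write_event(out, "transcript", {"text": recognize(bytes(audio))})
            return


if __name__ == "__main__":
    fmt = {"rate": 16000, "width": 2, "channels": 1}
    client = io.BytesIO()
    write_event(client, "audio-start", fmt)
    write_event(client, "audio-chunk", fmt, b"\x00\x01" * 4)
    write_event(client, "audio-stop")
    client.seek(0)

    reply = io.BytesIO()
    handle_stt_session(client, reply, lambda pcm: f"{len(pcm)} bytes of PCM")
    reply.seek(0)
    print(read_event(reply))
```

A TTS server would mirror this flow: it would receive a text event and stream audio chunks back. On the kiosk device, the same loop would sit behind a TCP listener instead of in-memory buffers.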

Community signals

No response

Scope & Boundaries

In scope

  • Add kiosk mode STT server feature
  • Add kiosk mode TTS server feature

Not in scope

Foreseen solution

No response

Risks & open questions

No response

Appetite

No response

Execution issues

No response

Decision log

Date Decision Outcome

Metadata

Assignees: No one assigned
Status: Considering
Milestone: No milestone