Shiny.Speech

Cross-platform speech services for .NET MAUI and Blazor WebAssembly — speech-to-text, text-to-speech, audio capture, and audio playback with pluggable cloud providers.

Libraries

Package	Description	Targets
Shiny.Speech	Core interfaces + native platform implementations (STT, TTS, audio capture, audio playback)	net10.0-ios, net10.0-android, net10.0-windows, net10.0 (Browser/WASM)
Shiny.Speech.Cloud	Cloud provider abstractions + `CloudSpeechToText` / `CloudTextToSpeech` implementations	net10.0
Shiny.Speech.Azure	Azure AI Speech provider (STT + TTS)	net10.0
Shiny.Speech.ElevenLabs	ElevenLabs provider (TTS)	net10.0

Getting Started

Native Platform Speech

Use the built-in OS speech engines — no cloud account needed. Works on MAUI (iOS, Android, Windows) and Blazor WebAssembly (via Web Speech API).

builder.Services.AddSpeechServices();
// Registers: ISpeechToTextService, ITextToSpeechService, IAudioSource, IAudioPlayer
// On Browser/WASM: auto-detected via OperatingSystem.IsBrowser()

Azure AI Speech (Cloud)

builder.Services.AddAzureSpeech("your-subscription-key", "your-region");

ElevenLabs TTS (Cloud)

builder.Services.AddElevenLabsTextToSpeech("your-api-key");

Usage

Text-to-Speech

public class MyService(ITextToSpeechService tts)
{
    public async Task SpeakAsync()
    {
        await tts.SpeakAsync("Hello world!", new TextToSpeechOptions
        {
            SpeechRate = 1.2f,
            Pitch = 1.0f,
            Volume = 0.8f
        });
    }
}

Speech-to-Text

The ISpeechToTextService uses a Start/Stop model with events, allowing multiple consumers to observe recognition results simultaneously.

public class MyService(ISpeechToTextService stt) : IDisposable
{
    public async Task StartListeningAsync()
    {
        var access = await stt.RequestAccess();
        if (access != AccessState.Available)
            return;

        // Subscribe to events
        stt.ResultReceived += OnResult;
        stt.KeywordHeard += OnKeyword;
        stt.Error += OnError;

        // Start listening (throws if already listening)
        await stt.Start(new SpeechRecognitionOptions
        {
            Culture = CultureInfo.GetCultureInfo("en-US"),
            SilenceTimeout = TimeSpan.FromSeconds(3),
            Keywords = ["Yes", "No", "Maybe"]
        });
    }

    public async Task StopListeningAsync()
    {
        await stt.Stop(); // no-op if not listening
        stt.ResultReceived -= OnResult;
        stt.KeywordHeard -= OnKeyword;
        stt.Error -= OnError;
    }

    void OnResult(object? sender, SpeechRecognitionResult result)
        => Console.WriteLine($"[{(result.IsFinal ? "FINAL" : "partial")}] {result.Text}");

    void OnKeyword(object? sender, string keyword)
        => Console.WriteLine($"Keyword detected: {keyword}");

    void OnError(object? sender, SpeechRecognitionError error)
        => Console.WriteLine($"Error: {error.Message}");

    public void Dispose() => StopListeningAsync().GetAwaiter().GetResult();
}

Extension Methods (Convenience)

// Simple: wait for silence (starts and stops automatically)
var text = await stt.ListenUntilSilence(cancellationToken: ct);

// Wake word: "Hey Computer, do something" → returns "do something"
var command = await stt.StatementAfterKeyword(["Hey Computer"], cancellationToken: ct);

// Wait for a specific keyword (with optional timeout)
var answer = await stt.WaitListenForKeywords(["Yes", "No"], timeout: TimeSpan.FromSeconds(30), cancellationToken: ct);

// Continuous keyword stream
await foreach (var keyword in stt.ListenForKeywords(["Up", "Down", "Left", "Right"], cancellationToken: ct))
{
    Console.WriteLine($"Direction: {keyword}");
}

Custom Cloud Provider

Implement ISpeechToTextProvider and/or ITextToSpeechProvider from Shiny.Speech.Cloud:

public class MyCloudSttProvider : ISpeechToTextProvider
{
    public async IAsyncEnumerable<SpeechRecognitionResult> RecognizeAsync(
        Stream audioStream,
        SpeechRecognitionOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Read PCM audio from audioStream (16kHz, 16-bit, mono)
        // Yield recognition results...
    }
}

// Register:
builder.Services.AddCloudSpeechToText<MyCloudSttProvider>();

Platform Requirements

Platform	STT	TTS	Audio Capture	Audio Playback
iOS 15+ (incl. CarPlay)	SFSpeechRecognizer	AVSpeechSynthesizer	AVAudioEngine	AVAudioPlayer
Android 26+	SpeechRecognizer	Android TTS	AudioRecord	MediaPlayer
Windows 10 19041+	Windows.Media.SpeechRecognition	Windows.Media.SpeechSynthesis	AudioGraph	MediaPlayer
Browser (WASM)	Web Speech API (`SpeechRecognition`)	Web Speech API (`SpeechSynthesis`)	Web Audio API (`getUserMedia` + `ScriptProcessorNode`)	HTML5 `Audio`

Browser (Blazor WebAssembly)

No manifest changes needed — the browser prompts the user for microphone access automatically. Include the JS interop module in your index.html:

<script src="shiny-speech.js"></script>

Note: IAudioSource captures raw PCM audio in the browser using the Web Audio API (getUserMedia + ScriptProcessorNode), downsampled to 16kHz 16-bit mono. Audio playback (IAudioPlayer) accepts any browser-supported format via a base64 data URL.

iOS/macOS

Add to Info.plist:

<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition description</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone description</string>

Android

Add to AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github		.github
samples		samples
skills/shiny-speech		skills/shiny-speech
src		src
.gitignore		.gitignore
Build.slnf		Build.slnf
Directory.build.props		Directory.build.props
Directory.packages.props		Directory.packages.props
LICENSE		LICENSE
README.md		README.md
Speech.slnx		Speech.slnx
nuget.png		nuget.png
version.json		version.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Shiny.Speech

Libraries

Getting Started

Native Platform Speech

Azure AI Speech (Cloud)

ElevenLabs TTS (Cloud)

Usage

Text-to-Speech

Speech-to-Text

Extension Methods (Convenience)

Custom Cloud Provider

Platform Requirements

Browser (Blazor WebAssembly)

iOS/macOS

Android

About

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Shiny.Speech

Libraries

Getting Started

Native Platform Speech

Azure AI Speech (Cloud)

ElevenLabs TTS (Cloud)

Usage

Text-to-Speech

Speech-to-Text

Extension Methods (Convenience)

Custom Cloud Provider

Platform Requirements

Browser (Blazor WebAssembly)

iOS/macOS

Android

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors

Uh oh!

Languages