Description
I'd like to propose adding a first-party @ai-sdk/amazon-transcribe provider for Amazon Transcribe.
Motivation: Amazon Bedrock is currently absent from the AI SDK's list of transcription models. That absence is expected, because audio transcription on AWS is provided by Amazon Transcribe, a separate service from Bedrock (different endpoints and API shape). Rather than fold Transcribe into @ai-sdk/amazon-bedrock, I'd like to add a dedicated transcription-only provider, matching the existing precedent of @ai-sdk/deepgram, @ai-sdk/revai, @ai-sdk/gladia, and @ai-sdk/assemblyai.
Scope (initial):
- Implements the transcribe() function via TranscriptionModelV4 using the Amazon Transcribe batch API (StartTranscriptionJob + poll GetTranscriptionJob).
- Since transcribe() provides raw audio bytes but batch Transcribe reads from S3, the provider uploads the audio to a configured S3 bucket (providerOptions.amazonTranscribe.inputBucket) before starting the job, then downloads and parses the transcript on completion.
- Auth via AWS SigV4 (region/credentials/credentialProvider, consistent with @ai-sdk/amazon-bedrock).
- Returns text, segments, language, and duration.
- Streaming transcription is intentionally out of scope: it doesn't fit the one-shot transcribe() interface, and the conversational RealtimeModelV4 spec is a poor fit for one-directional STT. Could be revisited later if a streaming-transcription spec is introduced.
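To make the batch flow in the scope list concrete, here is a minimal sketch of the upload, start, poll, download sequence. It is not the proposed implementation: `BatchBackend`, `JobState`, `BatchOptions`, and `transcribeViaBatch` are hypothetical names invented for illustration, and the AWS calls are hidden behind an injected interface so that no AWS SDK or AI SDK signatures are assumed. Only the job status values (QUEUED, IN_PROGRESS, COMPLETED, FAILED) are taken from Amazon Transcribe itself.

```typescript
// Hypothetical shapes for illustration; none of these names come from the AI SDK or AWS SDK.
type JobStatus = 'QUEUED' | 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';

interface JobState {
  status: JobStatus;
  transcriptUri?: string; // where the finished transcript can be downloaded
  failureReason?: string;
}

// Assumed abstraction over S3 and Transcribe so the control flow can be shown in isolation.
interface BatchBackend {
  uploadAudio(bucket: string, key: string, audio: Uint8Array): Promise<string>; // resolves to the media URI
  startJob(jobName: string, mediaUri: string): Promise<void>; // wraps StartTranscriptionJob
  getJob(jobName: string): Promise<JobState>; // wraps GetTranscriptionJob
  fetchTranscript(uri: string): Promise<string>; // downloads and parses the transcript text
}

interface BatchOptions {
  inputBucket: string;
  pollIntervalMs?: number;
  maxPolls?: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

async function transcribeViaBatch(
  backend: BatchBackend,
  audio: Uint8Array,
  jobName: string,
  options: BatchOptions,
): Promise<string> {
  const { inputBucket, pollIntervalMs = 1000, maxPolls = 600 } = options;

  // 1. Batch Transcribe reads from S3, so the raw bytes are uploaded first.
  const mediaUri = await backend.uploadAudio(inputBucket, `${jobName}.audio`, audio);

  // 2. Start the job against the uploaded object.
  await backend.startJob(jobName, mediaUri);

  // 3. Poll until the job reaches a terminal state or the poll budget runs out.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const job = await backend.getJob(jobName);
    if (job.status === 'COMPLETED') {
      if (!job.transcriptUri) {
        throw new Error('Completed job has no transcript URI');
      }
      // 4. Download and parse the transcript.
      return backend.fetchTranscript(job.transcriptUri);
    }
    if (job.status === 'FAILED') {
      throw new Error(job.failureReason ?? 'Transcription job failed');
    }
    await sleep(pollIntervalMs);
  }
  throw new Error(`Job ${jobName} did not finish after ${maxPolls} polls`);
}
```

Injecting the backend keeps the polling logic testable without network access. A real provider would presumably also need to honor the abort signal passed to transcribe() and decide whether to clean up the uploaded object; both are left out of this sketch.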
I'm willing to build and maintain this provider following the existing contribution guidelines. I already have an initial v0.0.0 implementation (package, tests passing on Node + Edge, docs, example, changeset) and am ready to open a PR once this is validated.
References: