feat(assemblyai): support universal-3-5-pro and expand the transcription provider #16548
Merged
gr2m
approved these changes
Jul 2, 2026
Add universal-3-5-pro, universal-3-pro, and universal-2 to the transcription model ids. These newer models are only accessible through AssemblyAI's speech_models request parameter (the singular speech_model parameter is deprecated and rejects them), so the provider now routes the model id to the correct parameter automatically: legacy best/nano use speech_model, all other models use speech_models. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The legacy `best` model is deprecated (still functional, routes via the
deprecated singular `speech_model` parameter): the model id type marks it
`@deprecated` and `doGenerate` emits a deprecation warning pointing to
`universal-3-5-pro`.
The `nano` model is removed entirely — AssemblyAI's API now rejects it with a
400 ("the 'nano' speech model has been deprecated and is no longer available"),
confirmed end-to-end against the live API.
Repoint examples, docs, and README to `universal-3-5-pro`, generalize the
callable provider overload to the full model id type, and expand tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
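The routing rule described in the two commits above can be sketched as follows. This is a minimal illustration, not the provider's actual source: the names `routeModelParam`, `ModelParam`, and `LEGACY_SINGULAR_IDS` are hypothetical, and only the `speech_model` / `speech_models` parameter names and the model ids come from the commit messages.

```typescript
// Hypothetical sketch of the model-id routing; names are illustrative only.
type ModelId =
  | "best"
  | "universal-2"
  | "universal-3-pro"
  | "universal-3-5-pro"
  | (string & {}); // allow ids that are not listed here

type ModelParam = { speech_model: string } | { speech_models: string[] };

// Only the legacy `best` id still goes through the deprecated singular parameter.
const LEGACY_SINGULAR_IDS = new Set<string>(["best"]);

function routeModelParam(modelId: ModelId): ModelParam {
  return LEGACY_SINGULAR_IDS.has(modelId)
    ? { speech_model: modelId } // deprecated singular parameter
    : { speech_models: [modelId] }; // array parameter used by every other id
}

console.log(routeModelParam("best"));
console.log(routeModelParam("universal-3-5-pro"));
```

Because `nano` is no longer special-cased, an id like `nano` falls through to `speech_models` in this sketch and is left for the API to reject.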
…output The provider previously returned a Zod-parsed (stripped) transcript as response.body, dropping speaker labels, utterances, and all audio-intelligence results even when enabled via providerOptions. Now doGenerate returns the full raw AssemblyAI response on response.body, and populates providerMetadata.assemblyai with structured results for the currently-available features: utterances (diarization), entities, sentimentAnalysisResults, contentSafetyLabels, iabCategoriesResult, and autoHighlightsResult. The words schema gains speaker/channel/confidence and a typed utterances array. Verified availability against the DeepLearning backend and the public API reference: deprecated features (Summarization, Auto Chapters, Custom Topics) are intentionally left off providerMetadata but remain on the raw body. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AssemblyAI returns word start/end in milliseconds; the provider put them directly into segments' startSecond/endSecond (and the durationInSeconds fallback), making timings 1000x too large. Confirmed against the live API (a 3s clip reported a first word at startSecond: 183). Now divided by 1000. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
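A minimal sketch of the unit fix, assuming a simplified word shape. `AssemblyWord`, `Segment`, `wordsToSegments`, and `durationFallbackSeconds` are hypothetical names, and deriving the duration fallback from the last word's end time is an assumption; the commit only says the fallback had the same unit bug.

```typescript
// Hypothetical shapes; only start/end (milliseconds) and startSecond/endSecond
// come from the commit message.
interface AssemblyWord {
  text: string;
  start: number; // milliseconds, as returned by the API
  end: number; // milliseconds
}

interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

const MS_PER_SECOND = 1000;

function wordsToSegments(words: AssemblyWord[]): Segment[] {
  return words.map((word) => ({
    text: word.text,
    startSecond: word.start / MS_PER_SECOND,
    endSecond: word.end / MS_PER_SECOND,
  }));
}

// Assumed fallback: use the end of the last word when no duration is reported.
function durationFallbackSeconds(words: AssemblyWord[]): number | undefined {
  const last = words[words.length - 1];
  return last === undefined ? undefined : last.end / MS_PER_SECOND;
}

// A first word at 183 ms maps to 0.183 s rather than 183 s.
console.log(wordsToSegments([{ text: "Hi", start: 183, end: 420 }]));
```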
Add provider options for newer AssemblyAI request params: prompt, keytermsPrompt, temperature, removeAudioTags, and domain (wired through api-types + getArgs). Deprecate wordBoost/boostParam: AssemblyAI rejects word_boost with a 400 on universal-3-pro / universal-3-5-pro / slam-1 (works only on universal-2/best), so using either now emits a deprecation warning pointing to keytermsPrompt. Verified param availability and model-gating against the DeepLearning backend and the public API reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…options Add provider options for AssemblyAI's GA nested request params: speakerOptions, languageDetectionOptions, redactPiiAudioOptions, redactPiiReturnUnredacted, and redactStaticEntities (wired through api-types + getArgs with camelCase->snake_case mapping). Shapes verified against the live AssemblyAI docs OpenAPI (the assemblyai-api-spec repo was stale for redact_static_entities). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
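The camelCase to snake_case mapping mentioned above could look roughly like this. It is a generic sketch under stated assumptions: the helpers `camelToSnake` and `toSnakeCaseDeep` are hypothetical, the provider may map each field explicitly instead, and the nested field name `minSpeakersExpected` is used only as an illustrative key.

```typescript
// Hypothetical helpers; not taken from the provider's source.
function camelToSnake(key: string): string {
  // Prefix each uppercase letter with "_" and lowercase it.
  return key.replace(/[A-Z]/g, (char) => `_${char.toLowerCase()}`);
}

// Recursively rename object keys; values (including strings) are left untouched.
function toSnakeCaseDeep(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(toSnakeCaseDeep);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, inner]) => [
        camelToSnake(key),
        toSnakeCaseDeep(inner),
      ]),
    );
  }
  return value;
}

console.log(
  toSnakeCaseDeep({
    speakerOptions: { minSpeakersExpected: 2 }, // illustrative nested key
    keytermsPrompt: ["AssemblyAI"],
  }),
);
```

A generic converter keeps nested options in sync automatically, while an explicit per-field mapping makes it easier to catch fields whose API name does not follow the pattern.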
Add a regression test asserting nano is no longer special-cased (routes via speech_models, no deprecation warning) and that universal-3-pro routes via speech_models. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l-3-5-pro Emit an informational warning (type: 'other', not a deprecation) when universal-3-pro or universal-2 is used, noting that universal-3-5-pro is the latest flagship and is set to replace universal-3-pro. Both models remain fully supported; universal-3-5-pro emits no warning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- warn on options missing prerequisites (redactPii*, languageCode+languageDetection)
- fix universal-2 nudge message and wordBoost/boostParam warning attribution
- type removeAudioTags / overrideAudioRedactionMethod as enums (drop `as never`)
- honor config.fetch for polling GETs
- source providerMetadata from the raw response (no field stripping); document its timings are in ms while segments are in seconds
- fix redactPiiAudioOptions docs (requires redactPiiAudio); restore the Model Capabilities table rows
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
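The warning behavior described in this commit can be illustrated with a small checker. This is a sketch only: `TranscriptionOptions`, `OptionWarning`, and `collectOptionWarnings` are hypothetical, the warning shape is not the SDK's own warning type, and only the prerequisite pairs named in the commits (`redactPii*` needs `redactPii`; `redactPiiAudioOptions` needs `redactPiiAudio`) are modeled.

```typescript
// Hypothetical option and warning shapes, for illustration only.
interface TranscriptionOptions {
  redactPii?: boolean;
  redactPiiAudio?: boolean;
  redactPiiAudioOptions?: Record<string, unknown>;
  wordBoost?: string[];
  boostParam?: string;
}

interface OptionWarning {
  kind: "deprecation" | "missing-prerequisite";
  option: keyof TranscriptionOptions;
  message: string;
}

function collectOptionWarnings(options: TranscriptionOptions): OptionWarning[] {
  const warnings: OptionWarning[] = [];

  // Attribute the deprecation to whichever option was actually set.
  for (const option of ["wordBoost", "boostParam"] as const) {
    if (options[option] !== undefined) {
      warnings.push({
        kind: "deprecation",
        option,
        message: `${option} is deprecated; use keytermsPrompt instead.`,
      });
    }
  }

  // Prerequisite checks: warn, but leave the options in place.
  if (options.redactPiiAudio && !options.redactPii) {
    warnings.push({
      kind: "missing-prerequisite",
      option: "redactPiiAudio",
      message: "redactPiiAudio requires redactPii.",
    });
  }
  if (options.redactPiiAudioOptions !== undefined && !options.redactPiiAudio) {
    warnings.push({
      kind: "missing-prerequisite",
      option: "redactPiiAudioOptions",
      message: "redactPiiAudioOptions requires redactPiiAudio.",
    });
  }

  return warnings;
}

console.log(collectOptionWarnings({ wordBoost: ["AssemblyAI"], redactPiiAudioOptions: {} }));
```

The checker only reports; it does not strip options, which matches the PR's choice to warn and still forward so that the API can return its own error.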
…e, models The provider only exposes transcription models (it throws on languageModel), so the intro line is corrected to match the README and actual behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Combine the per-change changeset files into a single @ai-sdk/assemblyai patch entry describing the provider update, matching the repo's one-changeset-per-PR convention. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
e13307b to 1abe0e3
Background
AssemblyAI's transcription API has moved to the `speech_models` request parameter and shipped new models (including the `universal-3-5-pro` flagship) plus new request options. The `@ai-sdk/assemblyai` provider only supported the legacy `best` / `nano` ids via the deprecated singular `speech_model` param and dropped diarization / audio-intelligence output. This PR brings the provider up to date.
Changes
Models
- Add `universal-3-5-pro`, `universal-3-pro`, and `universal-2`, routed via the `speech_models` array (the deprecated singular `speech_model` is used only for the legacy `best` model).
- Deprecate `best` (still works; emits a deprecation warning) and remove `nano`: AssemblyAI's API now rejects `nano` with a 400 ("no longer available").
- Using `universal-3-pro` / `universal-2` emits an informational warning suggesting `universal-3-5-pro`; this is not a deprecation.
Output: speaker diarization + audio intelligence
- `doGenerate` now returns the full raw AssemblyAI response on `response.body` (previously a schema-parsed object that stripped most fields).
- `providerMetadata.assemblyai` surfaces `utterances` (diarization), `entities`, `sentimentAnalysisResults`, `contentSafetyLabels`, `iabCategoriesResult`, and `autoHighlightsResult`.
Input parameters
- New provider options: `prompt`, `keytermsPrompt`, `temperature`, `removeAudioTags`, `domain`, `speakerOptions`, `languageDetectionOptions`, `redactPiiAudioOptions`, `redactPiiReturnUnredacted`, `redactStaticEntities`.
- Deprecate `wordBoost` / `boostParam` in favor of `keytermsPrompt` (AssemblyAI rejects `word_boost` on the newer models). Warnings are emitted for these and for options missing a required prerequisite (e.g. a `redactPii*` option set without `redactPii`).
Fixes
- Honor the configured `fetch` for the status-polling requests (previously only upload/submit used it).
Verification
- `speech_model_used`, diarization/audio-intelligence output, all new input params, `word_boost` rejection on the new models, and the `keytermsPrompt` → `keyterms_prompt` mapping.
Notes
- `speech_understanding` and `language_codes` are intentionally not exposed (gated per-account by AssemblyAI).
- `wordBoost` on an incompatible model warns and still forwards (the API returns its own clear 400) rather than being silently stripped.
Docs & changeset
- Docs (`content/providers/01-ai-sdk-providers/100-assemblyai.mdx`) updated; examples repointed to `universal-3-5-pro`.
- `@ai-sdk/assemblyai` patch changeset.
🤖 Generated with Claude Code