feat(assemblyai): support universal-3-5-pro and expand the transcription provider by dlange-aai · Pull Request #16548 · vercel/ai

dlange-aai · 2026-07-01T18:17:35Z

Background

AssemblyAI's transcription API has moved to the speech_models request parameter and shipped new models (including the universal-3-5-pro flagship) plus new request options. The @ai-sdk/assemblyai provider only supported the legacy best/nano ids via the deprecated singular speech_model param and dropped diarization / audio-intelligence output. This PR brings the provider up to date.

Changes

Models

Add universal-3-5-pro, universal-3-pro, and universal-2, routed via the speech_models array (the deprecated singular speech_model is used only for the legacy best model).
Deprecate best (still works; emits a deprecation warning) and remove nano — AssemblyAI's API now rejects nano with a 400 ("no longer available").
Using universal-3-pro / universal-2 emits an informational warning suggesting universal-3-5-pro; not a deprecation.
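The routing rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the provider's actual code: `routeModel` and `ModelRouting` are hypothetical names, and only the `speech_model` / `speech_models` parameter names come from the PR description.

```typescript
// Hypothetical sketch; `routeModel` and `ModelRouting` are illustrative names,
// not the provider's actual internals.
type ModelRouting =
  | { speech_model: string } // deprecated singular parameter
  | { speech_models: string[] }; // current array parameter

function routeModel(modelId: string): ModelRouting {
  // Only the legacy `best` id still goes through the singular parameter.
  if (modelId === 'best') {
    return { speech_model: modelId };
  }
  // Every other id is sent as a one-element array.
  return { speech_models: [modelId] };
}

console.log(JSON.stringify(routeModel('universal-3-5-pro')));
// {"speech_models":["universal-3-5-pro"]}
console.log(JSON.stringify(routeModel('best')));
// {"speech_model":"best"}
```

Because no id other than `best` is special-cased, a removed id such as `nano` would simply be forwarded in the array and rejected by the API itself.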

Output: speaker diarization + audio intelligence

doGenerate now returns the full raw AssemblyAI response on response.body (previously a schema-parsed object that stripped most fields).
providerMetadata.assemblyai surfaces utterances (diarization), entities, sentimentAnalysisResults, contentSafetyLabels, iabCategoriesResult, and autoHighlightsResult.
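As a rough illustration of how the raw response might be projected onto `providerMetadata.assemblyai`, the sketch below copies the six listed fields and renames them to camelCase. The helper name `toProviderMetadata`, the snake_case source keys, and the choice to omit absent fields are assumptions for illustration; the PR description only names the camelCase metadata keys.

```typescript
// Hypothetical sketch. The snake_case keys are assumed to match the raw
// AssemblyAI transcript response; `toProviderMetadata` is an illustrative name.
type RawTranscript = Record<string, unknown>;

// [metadata key, assumed raw response key]
const METADATA_FIELDS: Array<[string, string]> = [
  ['utterances', 'utterances'],
  ['entities', 'entities'],
  ['sentimentAnalysisResults', 'sentiment_analysis_results'],
  ['contentSafetyLabels', 'content_safety_labels'],
  ['iabCategoriesResult', 'iab_categories_result'],
  ['autoHighlightsResult', 'auto_highlights_result'],
];

function toProviderMetadata(raw: RawTranscript): {
  assemblyai: Record<string, unknown>;
} {
  const assemblyai: Record<string, unknown> = {};
  for (const [metadataKey, rawKey] of METADATA_FIELDS) {
    const value = raw[rawKey];
    // Assumption: fields that are missing or null are left off the metadata.
    if (value !== undefined && value !== null) {
      assemblyai[metadataKey] = value;
    }
  }
  return { assemblyai };
}

// Fields outside the list (here `summary`) stay on the raw body only.
const metadata = toProviderMetadata({
  text: 'hello',
  utterances: [{ speaker: 'A', text: 'hello' }],
  sentiment_analysis_results: [],
  summary: 'deprecated feature output',
});
console.log(Object.keys(metadata.assemblyai).join(','));
// utterances,sentimentAnalysisResults
```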

Input parameters

New provider options: prompt, keytermsPrompt, temperature, removeAudioTags, domain, speakerOptions, languageDetectionOptions, redactPiiAudioOptions, redactPiiReturnUnredacted, redactStaticEntities.
Deprecate wordBoost / boostParam in favor of keytermsPrompt (AssemblyAI rejects word_boost on the newer models). Warnings are emitted for these and for options missing a required prerequisite (e.g. a redactPii* option set without redactPii).
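The warn-and-forward behavior described above might look roughly like the sketch below. It is an assumption-laden illustration rather than the provider's `getArgs`: `buildArgs`, `SketchOptions`, and `SketchWarning` are hypothetical names, the warning messages are invented, and the snake_case keys other than `keyterms_prompt` and `word_boost` are assumed from the camelCase to snake_case mapping.

```typescript
// Hypothetical sketch of option mapping plus warnings; not the provider's code.
interface SketchOptions {
  keytermsPrompt?: string[];
  wordBoost?: string[];
  redactPii?: boolean;
  redactPiiReturnUnredacted?: boolean;
}

interface SketchWarning {
  setting: string;
  message: string; // illustrative wording only
}

function buildArgs(options: SketchOptions): {
  body: Record<string, unknown>;
  warnings: SketchWarning[];
} {
  const body: Record<string, unknown> = {};
  const warnings: SketchWarning[] = [];

  if (options.keytermsPrompt !== undefined) {
    body['keyterms_prompt'] = options.keytermsPrompt;
  }

  if (options.wordBoost !== undefined) {
    // Deprecated: warn, but still forward so the API can return its own error.
    warnings.push({
      setting: 'wordBoost',
      message: 'wordBoost is deprecated; use keytermsPrompt instead.',
    });
    body['word_boost'] = options.wordBoost;
  }

  if (options.redactPii !== undefined) {
    body['redact_pii'] = options.redactPii;
  }

  if (options.redactPiiReturnUnredacted !== undefined) {
    // Prerequisite check: this option only makes sense with redactPii enabled.
    if (!options.redactPii) {
      warnings.push({
        setting: 'redactPiiReturnUnredacted',
        message: 'redactPiiReturnUnredacted has no effect without redactPii.',
      });
    }
    // Assumption: the value is forwarded even when the prerequisite is missing.
    body['redact_pii_return_unredacted'] = options.redactPiiReturnUnredacted;
  }

  return { body, warnings };
}
```

Forwarding a deprecated or incomplete option instead of stripping it keeps the failure visible: the caller sees both the SDK warning and the API's own response.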

Fixes

Transcription segment timings (startSecond / endSecond) carried raw millisecond values; they are now converted to seconds.
Honor a caller-provided fetch for the status-polling requests (previously only upload/submit used it).

Verification

Unit tests: 21 (node + edge).
Live API: verified end-to-end against AssemblyAI — model routing + speech_model_used, diarization/audio-intelligence output, all new input params, word_boost rejection on the new models, and the keytermsPrompt → keyterms_prompt mapping.

Notes

The feature-flagged params speech_understanding and language_codes are intentionally not exposed (gated per-account by AssemblyAI).
wordBoost on an incompatible model warns and still forwards (the API returns its own clear 400) rather than being silently stripped.

Docs & changeset

Provider docs (content/providers/01-ai-sdk-providers/100-assemblyai.mdx) updated; examples repointed to universal-3-5-pro.
Includes a single @ai-sdk/assemblyai patch changeset.

🤖 Generated with Claude Code

Add universal-3-5-pro, universal-3-pro, and universal-2 to the transcription model ids. These newer models are only accessible through AssemblyAI's speech_models request parameter (the singular speech_model parameter is deprecated and rejects them), so the provider now routes the model id to the correct parameter automatically: legacy best/nano use speech_model, all other models use speech_models. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The legacy `best` model is deprecated (still functional, routes via the deprecated singular `speech_model` parameter): the model id type marks it `@deprecated` and `doGenerate` emits a deprecation warning pointing to `universal-3-5-pro`. The `nano` model is removed entirely — AssemblyAI's API now rejects it with a 400 ("the 'nano' speech model has been deprecated and is no longer available"), confirmed end-to-end against the live API. Repoint examples, docs, and README to `universal-3-5-pro`, generalize the callable provider overload to the full model id type, and expand tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…output The provider previously returned a Zod-parsed (stripped) transcript as response.body, dropping speaker labels, utterances, and all audio-intelligence results even when enabled via providerOptions. Now doGenerate returns the full raw AssemblyAI response on response.body, and populates providerMetadata.assemblyai with structured results for the currently-available features: utterances (diarization), entities, sentimentAnalysisResults, contentSafetyLabels, iabCategoriesResult, and autoHighlightsResult. The words schema gains speaker/channel/confidence and a typed utterances array. Verified availability against the DeepLearning backend and the public API reference: deprecated features (Summarization, Auto Chapters, Custom Topics) are intentionally left off providerMetadata but remain on the raw body. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

AssemblyAI returns word start/end in milliseconds; the provider put them directly into segments' startSecond/endSecond (and the durationInSeconds fallback), making timings 1000x too large. Confirmed against the live API (a 3s clip reported a first word at startSecond: 183). Now divided by 1000. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
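The unit fix amounts to a division by 1000 at the point where words become segments. The sketch below shows the idea; `toSegments` and `fallbackDurationInSeconds` are hypothetical helper names, and deriving the fallback duration from the last word's end time is an assumption, since the commit message only says that a fallback exists.

```typescript
// Hypothetical sketch of the millisecond-to-second conversion.
interface RawWord {
  text: string;
  start: number; // milliseconds, as returned by AssemblyAI
  end: number; // milliseconds
}

interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

function toSegments(words: RawWord[]): Segment[] {
  return words.map(word => ({
    text: word.text,
    startSecond: word.start / 1000,
    endSecond: word.end / 1000,
  }));
}

// Assumption: when no explicit duration is available, use the last word's end.
function fallbackDurationInSeconds(words: RawWord[]): number | undefined {
  if (words.length === 0) return undefined;
  return words[words.length - 1].end / 1000;
}

const segments = toSegments([{ text: 'Hello', start: 183, end: 520 }]);
console.log(segments[0].startSecond); // 0.183 (previously reported as 183)
```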

Add provider options for newer AssemblyAI request params: prompt, keytermsPrompt, temperature, removeAudioTags, and domain (wired through api-types + getArgs). Deprecate wordBoost/boostParam: AssemblyAI rejects word_boost with a 400 on universal-3-pro / universal-3-5-pro / slam-1 (works only on universal-2/best), so using either now emits a deprecation warning pointing to keytermsPrompt. Verified param availability and model-gating against the DeepLearning backend and the public API reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…options Add provider options for AssemblyAI's GA nested request params: speakerOptions, languageDetectionOptions, redactPiiAudioOptions, redactPiiReturnUnredacted, and redactStaticEntities (wired through api-types + getArgs with camelCase->snake_case mapping). Shapes verified against the live AssemblyAI docs OpenAPI (the assemblyai-api-spec repo was stale for redact_static_entities). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
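For nested options, the camelCase to snake_case mapping can be pictured with a generic recursive converter like the one below. This is only a sketch: the provider may well map each field explicitly, and `toSnakeCaseDeep`, `camelToSnake`, and the inner field names in the usage example (`minSpeakersExpected`, `maxSpeakersExpected`) are assumptions for illustration.

```typescript
// Hypothetical sketch of a recursive camelCase -> snake_case key mapping.
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, letter => `_${letter.toLowerCase()}`);
}

function toSnakeCaseDeep(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(item => toSnakeCaseDeep(item));
  }
  if (value !== null && typeof value === 'object') {
    const result: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      result[camelToSnake(key)] = toSnakeCaseDeep(inner);
    }
    return result;
  }
  // Primitives (strings, numbers, booleans, null, undefined) pass through.
  return value;
}

console.log(
  JSON.stringify(
    toSnakeCaseDeep({
      speakerOptions: { minSpeakersExpected: 2, maxSpeakersExpected: 4 },
      redactPiiReturnUnredacted: true,
    }),
  ),
);
// {"speaker_options":{"min_speakers_expected":2,"max_speakers_expected":4},"redact_pii_return_unredacted":true}
```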

Add a regression test asserting nano is no longer special-cased (routes via speech_models, no deprecation warning) and that universal-3-pro routes via speech_models. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…l-3-5-pro Emit an informational warning (type: 'other', not a deprecation) when universal-3-pro or universal-2 is used, noting that universal-3-5-pro is the latest flagship and is set to replace universal-3-pro. Both models remain fully supported; universal-3-5-pro emits no warning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
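The difference between a deprecation and an informational nudge can be sketched as a small lookup on the model id. The warning shapes and message wording below are assumptions for illustration; the commit message only states that the nudge uses `type: 'other'`, and `modelWarnings` is a hypothetical name.

```typescript
// Hypothetical sketch; warning shapes and messages are illustrative only.
type SketchWarning =
  | { type: 'deprecated'; setting: string; message: string }
  | { type: 'other'; message: string };

function modelWarnings(modelId: string): SketchWarning[] {
  if (modelId === 'best') {
    // Legacy id: still functional, but flagged as deprecated.
    return [
      {
        type: 'deprecated',
        setting: 'modelId',
        message: "The 'best' model is deprecated; use 'universal-3-5-pro'.",
      },
    ];
  }
  if (modelId === 'universal-3-pro' || modelId === 'universal-2') {
    // Fully supported: only an informational nudge toward the flagship.
    return [
      {
        type: 'other',
        message: `'${modelId}' is supported; 'universal-3-5-pro' is the latest flagship.`,
      },
    ];
  }
  // universal-3-5-pro (and any unrecognized id) produces no warning here.
  return [];
}

console.log(modelWarnings('universal-2')[0].type); // other
console.log(modelWarnings('universal-3-5-pro').length); // 0
```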

- warn on options missing prerequisites (redactPii*, languageCode+languageDetection)
- fix universal-2 nudge message and wordBoost/boostParam warning attribution
- type removeAudioTags / overrideAudioRedactionMethod as enums (drop `as never`)
- honor config.fetch for polling GETs
- source providerMetadata from the raw response (no field stripping); document its timings are in ms while segments are in seconds
- fix redactPiiAudioOptions docs (requires redactPiiAudio); restore the Model Capabilities table rows
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…e, models The provider only exposes transcription models (it throws on languageModel), so the intro line is corrected to match the README and actual behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Combine the per-change changeset files into a single @ai-sdk/assemblyai patch entry describing the provider update, matching the repo's one-changeset-per-PR convention. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gr2m approved these changes Jul 2, 2026

dlange-aai and others added 13 commits July 1, 2026 21:20

style

98e5919

Merge branch 'main' into feat/assemblyai-universal-3-5-pro

1abe0e3

gr2m force-pushed the feat/assemblyai-universal-3-5-pro branch from e13307b to 1abe0e3 Compare July 2, 2026 04:21

gr2m added the backport label Jul 2, 2026

gr2m merged commit ec598e2 into vercel:main Jul 2, 2026
48 of 49 checks passed

This was referenced Jul 2, 2026

Backport: feat(assemblyai): support universal-3-5-pro and expand the transcription provider (#16548) #16569

Open

Backport: feat(assemblyai): support universal-3-5-pro and expand the transcription provider (#16548) #16570

Open

gr2m merged 13 commits into vercel:main from dlange-aai:feat/assemblyai-universal-3-5-pro
