sidebar_position: 1
title: Overview
description: The Conversational AI Python SDK (install, concepts, and examples).

Conversational AI Python SDK

The Conversational AI Python SDK lets you build voice-powered AI agents that combine speech recognition, large language models, text-to-speech, and optional digital avatars — all routed through a global real-time network.

Sync and Async Clients

The Python SDK provides two client classes:

  • AgentClient — synchronous client backed by httpx.Client. Use this in scripts, Flask apps, or anywhere synchronous code is natural.
  • AsyncAgentClient — asynchronous client backed by httpx.AsyncClient. Use this in asyncio applications, FastAPI, or any async framework.

Both clients expose the same capabilities. Choose whichever fits your application's concurrency model.
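As a minimal sketch of what "same capabilities, different concurrency model" means in practice, the example below uses two stand-in classes. FakeSyncClient, FakeAsyncClient, and start_agent are hypothetical names invented for illustration; they are not the SDK's AgentClient or AsyncAgentClient, whose constructors and methods are documented in the Client Reference.

```python
import asyncio


# Hypothetical stand-ins, NOT the SDK's AgentClient / AsyncAgentClient.
# They only mirror the idea of one capability offered in blocking and awaitable form.
class FakeSyncClient:
    def start_agent(self, name: str) -> dict:
        # A real synchronous client would issue a blocking HTTP request here.
        return {"agent": name, "status": "started"}


class FakeAsyncClient:
    async def start_agent(self, name: str) -> dict:
        # A real asynchronous client would await non-blocking I/O here.
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return {"agent": name, "status": "started"}


def run_sync() -> dict:
    # Plain function call: suits scripts and synchronous web frameworks.
    return FakeSyncClient().start_agent("demo")


async def run_async() -> dict:
    # Same call shape, but awaited inside a coroutine.
    return await FakeAsyncClient().start_agent("demo")


sync_result = run_sync()
async_result = asyncio.run(run_async())
```

The only difference at the call site is the await keyword and the surrounding event loop, which is why the choice can follow the application's concurrency model rather than the feature set.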

Two Conversation Flows

Cascading Flow (ASR → LLM → TTS)

Speech-to-text transcribes audio, a text LLM generates a response, and text-to-speech renders it as audio. This is the most flexible flow — you can mix and match vendors for each stage.
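The three stages can be pictured as three interchangeable functions chained together. The sketch below is an illustration only: CascadingPipeline and the stub_* functions are hypothetical and do not model the SDK's vendor classes or its network transport.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures, assumed for illustration only.
Asr = Callable[[bytes], str]   # audio in, transcript out
Llm = Callable[[str], str]     # transcript in, reply text out
Tts = Callable[[str], bytes]   # reply text in, audio out


@dataclass
class CascadingPipeline:
    asr: Asr
    llm: Llm
    tts: Tts

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.asr(audio_in)  # stage 1: speech recognition
        reply = self.llm(transcript)     # stage 2: text generation
        return self.tts(reply)           # stage 3: speech synthesis


def stub_asr(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the bytes already contain text


def stub_llm(text: str) -> str:
    return f"You said: {text}"


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend encoded text is audio


pipeline = CascadingPipeline(asr=stub_asr, llm=stub_llm, tts=stub_tts)
audio_out = pipeline.respond(b"hello")

# Swapping one stage leaves the other two untouched.
shouting = CascadingPipeline(asr=stub_asr, llm=lambda t: t.upper(), tts=stub_tts)
shouted_out = shouting.respond(b"hi")
```

Because each stage depends only on the output type of the previous one, replacing a single vendor does not require changes to the rest of the chain.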

MLLM Flow (Multimodal LLM)

A multimodal model handles audio input and output end-to-end, with no separate STT/TTS step. (Not yet available in the domestic API.)

Two-Layer Architecture

+--------------------------------------------------+
|                  Developer API                   |
|  Agent  ·  AgentSession  ·  Vendors  ·  Token    |  <- agentkit/ (hand-written)
+--------------------------------------------------+
|      AgentClient / AsyncAgentClient + Pool       |  <- pool_client.py (hand-written)
+--------------------------------------------------+
|            Fern-generated Client Core            |
|  AgentsClient · TelephonyClient · TypeSystem     |  <- generated, for advanced use
+--------------------------------------------------+

The agentkit layer (agent.agentkit) is the primary developer-facing API. It provides the Agent builder, AgentSession lifecycle, typed vendor classes, and token helpers. You rarely need to use the Fern-generated layer directly, but it is accessible via session.raw when needed.
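The relationship between a high-level wrapper and a lower layer reachable through a raw attribute can be sketched as follows. GeneratedCore, Session, and their method signatures are hypothetical stand-ins; the real AgentSession and generated client may differ in every detail except the general shape described above.

```python
class GeneratedCore:
    """Stand-in for a generated low-level client (hypothetical)."""

    def start(self, payload: dict) -> dict:
        # A generated client would send the payload to the API as-is.
        return {"id": "agent-1", "request": payload}


class Session:
    """Stand-in for a high-level session wrapper (hypothetical)."""

    def __init__(self, core: GeneratedCore, name: str) -> None:
        self._core = core
        self._name = name

    @property
    def raw(self) -> GeneratedCore:
        # Escape hatch: hand back the lower layer unchanged.
        return self._core

    def start(self) -> str:
        # The high-level call builds the payload so the caller does not have to.
        response = self._core.start({"name": self._name})
        return response["id"]


core = GeneratedCore()
session = Session(core, "demo")

agent_id = session.start()  # typical path through the high-level layer
low_level = session.raw.start({"name": "demo", "custom_field": True})  # advanced path
```

The high-level path hides request construction, while the raw path gives full control over the request at the cost of building it by hand.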

Navigation

| Section | What you will learn |
| --- | --- |
| Installation | Install the SDK and prerequisites |
| Authentication | Configure client credentials |
| Quick Start | Build and run your first agent |
| Architecture | Understand the SDK layers and client types |
| Agent | Configure agents with the fluent builder |
| AgentSession | Manage the agent lifecycle |
| Vendors | Browse all LLM, TTS, STT, and Avatar providers |
| Cascading Flow | Build an ASR → LLM → TTS pipeline |
| MLLM Flow | Multimodal flow (reserved for future use) |
| Avatars | Add a digital avatar with Sensetime |
| Regional Routing | Route requests to the nearest region |
| Error Handling | Handle API errors with ApiError |
| Pagination | Iterate over paginated list endpoints |
| Advanced | Raw response, retries, timeouts, custom httpx client |
| Low-Level API | Direct client.agents.start() without the builder |
| Client Reference | Full AgentClient/AsyncAgentClient API |
| Agent Reference | Full Agent builder API |
| Session Reference | Full AgentSession/AsyncAgentSession API |
| Vendor Reference | Constructor options for all vendor classes |