Connecting SambaNova as Custom LLM to Vapi

This repository documents how to connect a SambaNova LLM server as a custom LLM to Vapi using SambaNova’s Meta-Llama-3.3-70B-Instruct model. The guide walks you through setting up a local Flask server, exposing it with Ngrok, configuring Vapi Custom LLM, and understanding the end-to-end communication flow.

This setup is useful for:

Testing custom LLM logic locally
Adding middleware, logging, or prompt control
Running your own inference or proxy layer behind Vapi

Prerequisites

Before starting, make sure you have the following:

SambaNova API Key – Access to SambaNova's LLMs. You can get a key from the SambaNova Cloud page.
Vapi Account – Access to the Vapi Dashboard. Create a Vapi account if you do not have one.
Python 3.11+ – Local development environment
Python dependencies:

pip install flask sambanova

Ngrok – To expose your local server to the internet. On macOS, install it with Homebrew (see the ngrok documentation for other platforms):

brew install ngrok

Then get your authtoken from the ngrok dashboard and register it with the ngrok CLI:

ngrok config add-authtoken $YOUR_NGROK_AUTHTOKEN

Flask App Code – Vapi server-side example here

Step 1: Set Up a Local LLM Server

1. Create a Flask Application

Use the app.py file in this repository. It forwards incoming chat requests to a SambaNova-hosted LLM through SambaNova's SDK. It accepts standard chat parameters, strips the Vapi-specific fields from the request, and then either streams tokens back to the client as Server-Sent Events or returns a complete JSON response in a single reply.
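The two behaviours described above (stripping Vapi-specific fields and framing a stream as Server-Sent Events) can be sketched without Flask or the SDK. This is an illustration only, not the code in app.py: the helper names `clean_chat_request` and `sse_events` and the `ALLOWED_KEYS` set are hypothetical, and the `[DONE]` sentinel assumes the OpenAI-style streaming convention.

```python
import json
from typing import Any, Dict, Iterable, Iterator

# Assumption: the keys the upstream chat-completions call understands.
# Anything else (for example "call" and "metadata") is treated as Vapi-specific.
ALLOWED_KEYS = {"model", "messages", "temperature", "max_tokens", "stream"}


def clean_chat_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request that contains only upstream-safe keys."""
    cleaned = {k: v for k, v in body.items() if k in ALLOWED_KEYS}
    # Keep only role and content per message; drop extra per-message fields.
    cleaned["messages"] = [
        {"role": m["role"], "content": m.get("content", "")}
        for m in body.get("messages", [])
    ]
    return cleaned


def sse_events(chunks: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Frame each chunk as one Server-Sent Events message."""
    for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"
    # Assumed convention: OpenAI-style streams end with a [DONE] sentinel.
    yield "data: [DONE]\n\n"
```

In a Flask route, a generator like `sse_events(...)` would typically be wrapped in a response with the `text/event-stream` MIME type, while the non-streaming path returns the JSON body directly.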

2. Run the Server

python app.py

The server will start on:

http://localhost:5000

3. Expose the Server Using Ngrok

In a separate terminal:

ngrok http 5000

Ngrok will generate a public URL similar to:

https://abcd-1234.ngrok-free.dev

This is the endpoint Vapi will call.
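Because the public URL can change between ngrok runs, it may help to read it programmatically rather than copy it from the terminal. The sketch below assumes the ngrok agent's local inspection API is reachable at its default address (`http://127.0.0.1:4040/api/tunnels`); the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumption: the ngrok agent's local API listens on its default address.
NGROK_API = "http://127.0.0.1:4040/api/tunnels"


def pick_https_url(tunnels_doc: Dict[str, Any]) -> Optional[str]:
    """Return the first https public URL in a tunnels listing, if any."""
    for tunnel in tunnels_doc.get("tunnels", []):
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url
    return None


def current_vapi_endpoint(api_url: str = NGROK_API) -> Optional[str]:
    """Ask the running ngrok agent for its URL and append the chat route."""
    with urllib.request.urlopen(api_url, timeout=5) as resp:
        base = pick_https_url(json.load(resp))
    return f"{base}/chat/completions" if base else None
```

Calling `current_vapi_endpoint()` while `ngrok http 5000` is running should return the full URL to paste into Vapi, or `None` if no https tunnel is open.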

Test your endpoint with a cURL request like the following:

curl -X POST https://abcd-1234.ngrok-free.dev/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "call": "chat.completions",
    "metadata": {
      "request_id": "example-123"
    },
    "model": "Meta-Llama-3.3-70B-Instruct",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello! Explain what an LLM is in one sentence."
      }
    ],
    "temperature": 0.7,
    "max_tokens": 150,
    "stream": true
  }'
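The same check can be scripted in Python with only the standard library. This is a rough equivalent of the cURL call, assuming the server streams OpenAI-style chunks in which the text sits under `choices[0].delta.content`; `build_request` and `collect_text` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Iterable


def build_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build the POST request that mirrors the cURL example."""
    payload = {
        "model": "Meta-Llama-3.3-70B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "max_tokens": 150,
        "stream": True,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def collect_text(lines: Iterable[bytes]) -> str:
    """Join the text deltas from a stream of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed chunk shape: {"choices": [{"delta": {"content": "..."}}]}
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

To run it against a live tunnel, open the request with `urllib.request.urlopen(build_request(url, "Hello!"))` and pass the response object to `collect_text`, since the response can be iterated line by line.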

Step 2: Configure Vapi Custom LLM

Log in to the Vapi Dashboard
Create an Assistant with a Blank Template
Navigate to Model → Provider → Custom LLM
Enter the model name you will use (Meta-Llama-3.3-70B-Instruct)

Paste your Ngrok URL into the endpoint URL field

https://abcd-1234.ngrok-free.dev/chat/completions

Save the configuration

Reference image: vapi_configuration.png

Test the Integration

Send a test message using the Chat or Talk to Assistant options from Vapi
Confirm the request reaches your local Flask server
Verify the response is returned and displayed correctly in Vapi

Step 3: Understanding the Communication Flow

User sends a message in Vapi
Vapi sends a POST request to your Ngrok endpoint
Flask server receives the request
Conversation data is parsed and transformed
SambaNova API is called (Meta-Llama-3.3-70B-Instruct)
Response is formatted for Vapi
Vapi displays the response to the user
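The middle of this flow (parse, transform, call the model, format the reply) can be sketched as plain functions. The response shape below assumes Vapi accepts an OpenAI-style `chat.completion` object for non-streaming replies; `generate` is a hypothetical stand-in for the SambaNova SDK call, and none of this is taken from app.py.

```python
import time
import uuid
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def build_completion(text: str, model: str) -> Dict[str, Any]:
    """Wrap generated text in an OpenAI-style chat.completion object."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }


def handle_chat(
    body: Dict[str, Any],
    generate: Callable[[str, List[Message]], str],
) -> Dict[str, Any]:
    """Parse the request, call the model, and format the response."""
    model = body.get("model", "Meta-Llama-3.3-70B-Instruct")
    # Transform: keep only the fields the model call needs.
    messages = [
        {"role": m["role"], "content": m["content"]} for m in body["messages"]
    ]
    text = generate(model, messages)  # hypothetical upstream call
    return build_completion(text, model)
```

Passing the model call in as a function keeps the request and response handling testable without network access or an API key.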

Notes & Best Practices

Ngrok URLs change on restart (unless using a paid plan)
Use environment variables for secrets
Validate request payloads from Vapi
Add logging for debugging and observability
Follow the official Vapi response schema strictly
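Two of these practices, reading secrets from the environment and validating incoming payloads, can be sketched as follows. The variable name `SAMBANOVA_API_KEY`, the accepted role set, and both helper names are assumptions made for illustration; adjust them to match your setup.

```python
import os
from typing import Any, List, Mapping

# Assumption: roles the server is willing to forward upstream.
VALID_ROLES = {"system", "user", "assistant", "tool"}


def load_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "SAMBANOVA_API_KEY",  # assumed variable name
) -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key


def validate_chat_request(body: Any) -> List[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    if not isinstance(body, dict):
        return ["body must be a JSON object"]
    errors = []
    if not isinstance(body.get("model"), str) or not body["model"]:
        errors.append("model must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("messages must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or msg.get("role") not in VALID_ROLES:
                errors.append(f"messages[{i}] has a missing or unknown role")
    return errors
```

A request handler could return HTTP 400 with the error list when `validate_chat_request` reports problems, and log the list to help with debugging.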

Happy building 🚀
