This project is a lightweight, zero-dependency Python proxy designed to bridge ForgeCode CLI with the Azure Databricks AI Gateway.
Because ForgeCode is heavily optimized for standard OpenAI-compatible endpoints and Databricks implements a strict, slightly modified version of the specification, direct communication between the two systems fails. This middleware script sits between them, intercepting and translating requests and responses in real time to keep the two compatible, including streaming text generation and tool calling.
If you attempt to connect ForgeCode directly to a Databricks AI Gateway endpoint, you will encounter three critical blockers:
- The `/models` Endpoint Is Missing: ForgeCode validates models upon initialization by calling `GET /models`. Databricks does not implement this route and returns a `404 Not Found`, which crashes ForgeCode's startup sequence.
- Strict Payload Validation: ForgeCode sends advanced parameters like `parallel_tool_calls` and `response_format: {"type": "json_object"}`. When combined with `stream: true`, Databricks rejects these with a `400 Bad Request` ("Structured output is not currently supported with streaming").
- Incomplete SSE Streams: Databricks returns a minimal Server-Sent Events (SSE) stream. ForgeCode's strict Rust deserializer expects standard OpenAI chunk metadata (`id`, `object`, `created`, `index`, etc.). Without these, ForgeCode silently drops the chunks or hangs indefinitely on "Synthesizing".
This Python proxy (databricks_proxy.py) acts as a transparent middleware layer:
- Mocks the `/models` Route: It returns a list of available Databricks models (see step two below), allowing ForgeCode to initialize successfully.
- Sanitizes Outgoing Requests: It intercepts `POST /chat/completions`, stripping out the incompatible parameters (`parallel_tool_calls`, `response_format`, `stream_options`) before forwarding the payload to Databricks.
- Routes Per Model Family: It maps each model to the correct Databricks endpoint path (for example `databricks-gpt*` → `/cursor/v1/chat/completions`, `databricks-claude*` → `/mlflow/v1/chat/completions`) and forwards each request accordingly.
- Enriches Incoming Streams: It parses the SSE stream returning from Databricks, injects the missing OpenAI metadata into every chunk, ensures correct `\n\n` framing, and properly handles the `[DONE]` signal so ForgeCode can render the live stream.
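The three request and response transformations above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `databricks_proxy.py`: the function names, the `ROUTES` table, the fallback path, and the exact metadata defaults (such as the `chat.completion.chunk` object type) are hypothetical choices made for this example.

```python
import json
import time

# Parameters the text says Databricks rejects on chat-completions calls.
STRIPPED_KEYS = ("parallel_tool_calls", "response_format", "stream_options")

# Prefix-to-path routing copied from the examples above; a real models.json may differ.
ROUTES = {
    "databricks-gpt": "/cursor/v1/chat/completions",
    "databricks-claude": "/mlflow/v1/chat/completions",
}


def sanitize_payload(payload: dict) -> dict:
    """Return a copy of the request body without the incompatible keys."""
    return {k: v for k, v in payload.items() if k not in STRIPPED_KEYS}


def resolve_path(model: str, default: str = "/mlflow/v1/chat/completions") -> str:
    """Pick the endpoint path by model-name prefix (the default is an assumption)."""
    for prefix, path in ROUTES.items():
        if model.startswith(prefix):
            return path
    return default


def enrich_sse_line(line: str, model: str, stream_id: str) -> str:
    """Add OpenAI-style chunk metadata to one 'data: ...' line and frame it with a blank line."""
    if not line.startswith("data:"):
        return ""  # ignore comments and empty keep-alive lines
    body = line[len("data:"):].strip()
    if body == "[DONE]":
        return "data: [DONE]\n\n"
    chunk = json.loads(body)
    # setdefault only fills fields that Databricks left out.
    chunk.setdefault("id", stream_id)
    chunk.setdefault("object", "chat.completion.chunk")
    chunk.setdefault("created", int(time.time()))
    chunk.setdefault("model", model)
    for i, choice in enumerate(chunk.get("choices", [])):
        choice.setdefault("index", i)
    return "data: " + json.dumps(chunk) + "\n\n"
```

Using `setdefault` rather than plain assignment keeps any metadata that Databricks does send, so the sketch only fills gaps instead of overwriting upstream values.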
The proxy requires one environment variable: your Databricks AI Gateway base URL. It will derive the host and then append the model-specific endpoint path from models.json.
Option A (shell export, current session only):

    export DATABRICKS_AI_GATEWAY_URL="https://<workspace-id>.ai-gateway.azuredatabricks.net"

Option B (persist it in your shell profile, recommended):

    # Add to ~/.zshrc or ~/.bash_profile, then restart your terminal or run `source ~/.zshrc`
    export DATABRICKS_AI_GATEWAY_URL="https://<workspace-id>.ai-gateway.azuredatabricks.net"

Option C (`.env` file, useful for sharing with coworkers, or Mom):

    cp .env.example .env
    # Edit .env and fill in your URL, then:
    source .env

Your endpoint URL is found in the Databricks workspace under AI Gateway → your endpoint → View endpoint details. Existing full endpoint values (for example `/mlflow/v1/chat/completions`) are still accepted; the proxy will normalize them to the gateway base URL.
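As an illustration of that normalization, a minimal sketch might reduce whatever value is set to just the scheme and host. The helper name `gateway_base_url` is hypothetical, and the proxy's actual logic may handle more cases than this.

```python
from urllib.parse import urlsplit


def gateway_base_url(value: str) -> str:
    """Reduce a gateway URL, with or without an endpoint path, to scheme://host."""
    parts = urlsplit(value.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {value!r}")
    return f"{parts.scheme}://{parts.netloc}"
```

With this approach, a value that ends in `/mlflow/v1/chat/completions` and a bare gateway URL both yield the same base, so the model-specific path can be appended afterwards without duplication.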
Edit `models.json` to match the models enabled in your AI Gateway and assign each to an endpoint alias (`cursor`, `mlflow`, or `openai`). The proxy loads this file automatically on startup; if it's missing, a built-in default list and endpoint rules are used. For chat-completions payloads (`messages`), the proxy automatically prefers `cursor` over `openai` when both exist, to avoid Responses API payload mismatch errors. To update this list periodically, I browse to the Databricks AI Gateway dashboard, copy the table of models into a file, and then use the following prompt to help populate the models JSON file:
I have created a new markdown file @[databricks-models.md] that contains the most-current updated list of Databricks models copied from the website. Scrape this file for the new model names and update the @[models.json] file with the appropriate new and updated models. Use the same endpoint logic as before (gpt models use cursor, anthropic models use mlflow). I think the new models in markdown can be read using a 4n-3 formula, where the model names are on lines 1, 5, 9, 13, 17, 21, and so on.

An example `models.json`:

{
"endpoints": {
"mlflow": "/mlflow/v1/chat/completions",
"cursor": "/cursor/v1/chat/completions",
"openai": "/openai/v1/responses"
},
"models": [
{"id": "databricks-claude-haiku-4-5", "endpoint": "mlflow"},
{"id": "databricks-gpt-5-3-codex", "endpoint": "cursor"}
]
}

Start the proxy:

    python3 databricks_proxy.py
    # Optional flags:
    #   --port 9090      bind to a different port (default: 8080)
    #   --host 0.0.0.0   bind to all interfaces (default: 127.0.0.1)

The proxy prints the values you need, based on the flags you pass in. You only need to set these once in ForgeCode. You can create a Databricks token under your personal Settings > Developer > Manage Access Tokens. The defaults are:
    forge provider login openai_compatible
    # URL: http://127.0.0.1:8080
    # API Key: <Your Databricks Personal Access Token>

Then choose a model:

    forge config set model databricks-claude-sonnet-4-6
    # or
    :model

You can verify your settings with `forge info` or `:info`.
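To confirm the proxy itself is reachable before involving ForgeCode, a small check along these lines could list the models it advertises. This is a sketch under assumptions: it presumes the mocked `/models` route answers with the OpenAI-style list shape (`{"data": [{"id": ...}]}`) and that sending the token as a Bearer header is harmless; both helper names are hypothetical.

```python
import json
import urllib.request


def model_ids(body: dict) -> list:
    """Extract model ids from an OpenAI-style list response ({"data": [{"id": ...}]})."""
    return [m["id"] for m in body.get("data", []) if "id" in m]


def fetch_model_ids(base_url: str = "http://127.0.0.1:8080", token: str = "") -> list:
    """GET <base_url>/models from the running proxy and return the advertised ids."""
    req = urllib.request.Request(base_url.rstrip("/") + "/models")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return model_ids(json.load(resp))
```

Calling `fetch_model_ids()` while the proxy is running on the default host and port should return the ids configured in `models.json` if those assumptions hold.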
[F-D Bridge was built with the help of both Google Gemini and GitHub Copilot, with Erik Hanson in the architect seat.]