Summary
Configure load balancer affinity so MCP requests from the same client/PAT route to the same Forge instance. This is the primary mechanism for MQTT response routing in a horizontally scaled setup.
Why
When the Forge backend proxies a tool call to the expert agent via MQTT, it subscribes to a per-context response topic (ff/v1/mcp/<platformId>/<userId>/<sessionId>/<entityType>/<entityId>/res). The response needs to arrive at the Forge instance that made the request. Without affinity, any Forge instance could receive the next MCP request from the same client, but the subscription and correlation state live on the original instance.
Affinity is the primary gate: the load balancer routes the same client to the same Forge instance, so subscriptions are reused and responses arrive where they're expected. The <platformId> in the topic structure is the fallback for when affinity breaks (instance restart, rebalancing). In that case, the new instance creates fresh subscriptions and the old ones expire via LRU TTL.
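To make the topic layout concrete, here is a minimal sketch of how the per-context response topic could be assembled. The function name `response_topic` and the segment validation are assumptions for illustration, not the Forge implementation; only the topic shape comes from the description above.

```python
def response_topic(platform_id: str, user_id: str, session_id: str,
                   entity_type: str, entity_id: str) -> str:
    """Build the per-context MQTT response topic (hypothetical helper)."""
    segments = [platform_id, user_id, session_id, entity_type, entity_id]
    for segment in segments:
        # '/' is the MQTT level separator and '+' / '#' are wildcards,
        # so none of them may appear inside a single topic level.
        if not segment or any(ch in segment for ch in "/+#"):
            raise ValueError(f"invalid topic segment: {segment!r}")
    return "ff/v1/mcp/" + "/".join(segments) + "/res"
```

Because `<platformId>` is the first variable segment, a replacement instance that carries a different platform identifier would subscribe to a different topic than the old instance did, which is what keeps the fallback path from colliding with stale subscriptions.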
What to do
- Configure load balancer session affinity for the MCP endpoint (POST /api/v1/mcp)
- Affinity key: the PAT or a derivative (hash of the Authorization header, or a client identifier)
- Affinity should survive across multiple tool calls in the same agent session
- Document the fallback behavior when affinity breaks (fresh subscriptions, platformId in topic, LRU expiry cleans up orphaned subscriptions on the old instance)
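The affinity key derivation can be sketched as follows. This is an illustration under stated assumptions, not a load balancer configuration: `affinity_key` and `pick_instance` are hypothetical names, and rendezvous (highest-random-weight) hashing is one possible way to map a key to an instance, chosen here because removing one instance only re-routes the clients that were pinned to it. A real deployment would normally express the same idea through the load balancer's own header-hash or consistent-hash setting.

```python
import hashlib


def affinity_key(authorization_header: str) -> str:
    """Derive a stable, non-reversible key from the Authorization header."""
    normalized = authorization_header.strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def pick_instance(key: str, instances: list[str]) -> str:
    """Choose a Forge instance for a key using rendezvous hashing."""
    if not instances:
        raise ValueError("no Forge instances available")

    def score(instance: str) -> bytes:
        # Each (key, instance) pair gets its own pseudo-random weight;
        # the instance with the highest weight wins.
        return hashlib.sha256(f"{key}:{instance}".encode("utf-8")).digest()

    return max(instances, key=score)
```

Hashing the header rather than using it directly keeps the raw PAT out of load balancer logs and routing tables.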
Tests
- Sequential MCP requests with the same PAT route to the same Forge instance
- After a Forge instance restart, the client is re-routed and new subscriptions are created
- Orphaned subscriptions on the old instance expire via LRU TTL
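For the expiry test, an injectable clock avoids real sleeps. The sketch below is a stand-in for whatever cache Forge actually uses: `SubscriptionCache`, its parameters, and the `on_evict` callback (where an MQTT unsubscribe would presumably happen) are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class SubscriptionCache:
    """LRU cache of subscribed topics with a per-entry TTL (hypothetical)."""

    def __init__(self, ttl_seconds, max_entries, clock=time.monotonic, on_evict=None):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._on_evict = on_evict or (lambda topic: None)
        self._entries = OrderedDict()  # topic -> expiry deadline

    def touch(self, topic):
        """Record use of a topic, refreshing its TTL and LRU position."""
        self.expire()
        self._entries[topic] = self._clock() + self._ttl
        self._entries.move_to_end(topic)
        while len(self._entries) > self._max:
            oldest, _ = self._entries.popitem(last=False)
            self._on_evict(oldest)

    def expire(self):
        """Drop every entry whose deadline has passed."""
        now = self._clock()
        expired = [t for t, deadline in self._entries.items() if deadline <= now]
        for topic in expired:
            del self._entries[topic]
            self._on_evict(topic)

    def __contains__(self, topic):
        return topic in self._entries
```

A test can advance the fake clock past the TTL, call `expire()`, and assert that the orphaned topic was passed to `on_evict`.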
References