Summary
IBKR connection health is derived purely from query failure-counting (_consecutiveFailures), with no active keepalive/heartbeat. When the socket to IB Gateway dies silently (Gateway daily restart, network blip, half-open TCP), account.health stays "healthy" (false positive). Orders placed in this window are accepted at the UTA/wallet layer (get an orderId, status submitted) but never reach IBKR — they only transmit after a manual reconnect, at which point queued orders fill immediately.
Impact
- health: healthy while the connection is actually dead → callers place orders that silently don't execute.
- Account/position queries return stale cached data during the dead-but-"healthy" window.
- No auto-recovery; requires a manual reconnect.
Root cause (v0.40.0-beta.2)
- No active heartbeat: requestCurrentTime() (services/uta/src/domain/trading/brokers/ibkr/request-bridge.ts:216) is on-demand only (e.g. getMarketClock), never on a timer. Nothing probes the idle socket.
- connectionClosed() only rejects in-flight requests (request-bridge.ts:377 → rejectAll); it does not mark the account offline or trigger a reconnect.
- Connectivity errors 502/504/1100 are effectively a no-op (request-bridge.ts:394-398: comment "These will be followed by connectionClosed() which rejects all", then returns).
- Health = passive failure count: UnifiedTradingAccount derives health from _consecutiveFailures (incremented only on a failed query, reset on success). A socket dying while idle never trips it → stays healthy. Order submits on the dead socket don't reliably trip it either.
Net: a silently-dropped connection is invisible to the health model until some query happens to fail — and order submits don't reliably trip it.
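To make the blind spot concrete, here is a minimal model of a passive failure counter. The class name PassiveHealth, the two-state health value, and the threshold of 3 are assumptions for illustration; the real UnifiedTradingAccount logic may differ in its states and thresholds.

```typescript
// Hypothetical stand-in for a health model driven only by query outcomes.
type Health = "healthy" | "offline";

class PassiveHealth {
  private consecutiveFailures = 0;

  // offlineThreshold is an assumed value, not taken from the codebase.
  constructor(private readonly offlineThreshold = 3) {}

  recordSuccess(): void {
    this.consecutiveFailures = 0; // reset on any successful query
  }

  recordFailure(): void {
    this.consecutiveFailures += 1; // incremented only when a query fails
  }

  get health(): Health {
    return this.consecutiveFailures >= this.offlineThreshold ? "offline" : "healthy";
  }
}

const model = new PassiveHealth();
model.recordSuccess(); // last query before the connection goes idle

// The socket dies here while idle. No query runs, so neither
// recordSuccess() nor recordFailure() is called and nothing changes.
console.log(model.health); // still reports "healthy"
```

Because the state only changes inside recordSuccess() and recordFailure(), an idle period produces no signal at all, which matches the false positive described in this issue.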
Repro
- Connect an IBKR paper account via IB Gateway.
- Let the OpenAlice↔Gateway socket drop silently while idle (e.g. Gateway daily auto-restart, or a brief network drop).
- GET /api/trading/uta → account still health: healthy.
- POST .../wallet/place-order → returns orderId, status submitted; it does not fill.
- POST /uta/:id/reconnect → the order fills immediately.
Observed repeatedly on 0.40.0-beta.2 with an IB Gateway paper account: orders sat submitted (no fill) until we manually reconnected, then filled instantly.
Suggested fix
- Add an active heartbeat (periodic reqCurrentTime with timeout) that marks the account offline on failure.
- In connectionClosed() / on error 1100, mark the account offline and trigger (or schedule) a reconnect, rather than only rejecting in-flight requests.
- Treat error 1102 (connectivity restored) as a recovery signal.
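The three suggestions above could fit together roughly as in the sketch below. Every name here (ConnectionHealth, withTimeout, heartbeatOnce, startHeartbeat) is hypothetical and not part of the existing code. The probe is injected as a function because the exact signature of the bridge's current-time request is not shown in this issue, and the reconnect action is injected for the same reason.

```typescript
// Sketch only: all names are hypothetical, not the project's actual API.
type Health = "healthy" | "offline";

class ConnectionHealth {
  private state: Health = "healthy";

  // scheduleReconnect is injected so the caller decides how to reconnect.
  constructor(private readonly scheduleReconnect: () => void) {}

  get health(): Health {
    return this.state;
  }

  // Called when a heartbeat fails, the socket closes, or error 1100 arrives.
  markOffline(): void {
    if (this.state === "offline") return; // avoid duplicate reconnect requests
    this.state = "offline";
    this.scheduleReconnect();
  }

  // Called when a reconnect completes or error 1102 arrives.
  markHealthy(): void {
    this.state = "healthy";
  }

  onConnectionClosed(): void {
    this.markOffline();
  }

  // Routes the two connectivity codes named in the suggested fix.
  onIbError(code: number): void {
    if (code === 1100) this.markOffline();
    else if (code === 1102) this.markHealthy();
  }
}

// Rejects if `pending` does not settle within `timeoutMs`.
function withTimeout<T>(pending: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("heartbeat timed out")), timeoutMs);
    pending.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Runs one probe; marks the account offline if it fails or times out.
// Returns true when the probe succeeded.
async function heartbeatOnce(
  health: ConnectionHealth,
  probe: () => Promise<unknown>,
  timeoutMs: number,
): Promise<boolean> {
  try {
    await withTimeout(probe(), timeoutMs);
    return true;
  } catch {
    health.markOffline();
    return false;
  }
}

// Probes on a timer; returns a function that stops the heartbeat.
function startHeartbeat(
  health: ConnectionHealth,
  probe: () => Promise<unknown>,
  intervalMs: number,
  timeoutMs: number,
): () => void {
  const timer = setInterval(() => { void heartbeatOnce(health, probe, timeoutMs); }, intervalMs);
  return () => clearInterval(timer);
}
```

In this sketch a successful probe does not flip an offline account back to healthy; recovery comes only from a completed reconnect or error 1102. That is one possible design choice, made so that a probe succeeding against the Gateway does not mask an upstream outage reported by 1100.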