Summary
The Auto Router caches model routing decisions in a module-level dictionary (_cache: dict[str, str] = {} in router.py). This cache is shared across all requests and all instances of ShimServer within the same Python process. A routing decision cached for one user message can be reused for a message with identical text and the same image flag in a different session. The cache lives in memory only and is never invalidated when the model configuration changes, including when the user switches models via the picker.
Evidence
codex_shim/router.py:
_CACHE_MAX = 256
_cache: dict[str, str] = {}
Cache key construction:
def _cache_key(signal: dict[str, Any]) -> str:
    return "%s|%s" % (signal["has_images"], hash(signal["task"]))
The key is (has_images, hash(user_message_text)). Python's built-in hash() is not cryptographic and is not stable across restarts (PYTHONHASHSEED). More critically, the cache is never cleared when:
- A model is added or removed from models.json.
- A model's credentials are rotated.
- The user switches the active model via /api/switch.
- A configured model becomes unavailable (returns 401 or 429).
The eviction policy is a full cache clear when len(_cache) >= _CACHE_MAX. This is a poor substitute for LRU: the cache flip-flops between 256 entries and 0, and frequently used entries are discarded along with rarely used ones.
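The flip-flop can be reproduced in isolation. The sketch below assumes the insert path clears the dictionary once it reaches _CACHE_MAX and then stores the new entry; the put helper is hypothetical and only approximates what router.py does.

```python
_CACHE_MAX = 256
_cache: dict[str, str] = {}


def put(key: str, slug: str) -> None:
    """Hypothetical insert path: bulk-clear when full, then store the entry."""
    if len(_cache) >= _CACHE_MAX:
        _cache.clear()
    _cache[key] = slug


# Record the cache size after each of 258 distinct inserts.
sizes = []
for i in range(258):
    put(f"key-{i}", "model-a")
    sizes.append(len(_cache))

# The size climbs to 256, then the 257th insert wipes everything first.
print(sizes[255], sizes[256], sizes[257])
```

Under these assumptions the 257th distinct key leaves only itself in the cache, so every earlier decision has to be recomputed.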
Why this matters
- Stale routing after config change: If a user reconfigures their models (for example, removes a model or rotates an API key), the cache keeps routing matching requests to the old model slug until the next bulk clear or a process restart. If fewer than 256 distinct keys are seen, the bulk clear never happens.
- Cross-request contamination: If two users share the same shim process (team/shared-server deployment), one user's cached routing decision affects another user's task routing.
- Non-deterministic key: Python's hash() varies per process invocation with PYTHONHASHSEED randomisation. The cache is only valid for a single process lifetime, so any code that assumes cache keys are stable across restarts will be surprised.
- Thundering herd on full eviction: After a bulk clear, every previously cached key misses at once, so up to 256 distinct in-flight requests could all fire the classifier and cause a sudden spike in API calls.
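The non-deterministic key can be observed directly by hashing the same string in two fresh interpreters. This is a minimal sketch using only the standard library; the helper name hash_in_fresh_process and the sample task text are illustrative and do not come from the shim.

```python
import os
import subprocess
import sys


def hash_in_fresh_process(text: str, seed: str) -> int:
    """Return hash(text) as computed by a new interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


task = "summarise this file"
same_seed = hash_in_fresh_process(task, "1") == hash_in_fresh_process(task, "1")
different_seed = hash_in_fresh_process(task, "1") == hash_in_fresh_process(task, "2")
print(same_seed, different_seed)
```

With the seed pinned the value repeats; with a different seed (or the default random seed) it does not. If stable keys were ever needed, a digest such as hashlib.sha256 over the UTF-8 text would be the usual substitute.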
Root cause
The module-level mutable global was chosen for simplicity. The design does not account for cache invalidation on configuration changes, even though such changes (for example, model switching via the picker) are routine.
Recommended fix
- Clear the router cache whenever _set_active_model or switch_model is called.
- Use per-instance cache storage on ShimServer rather than a module-level dict, so multiple server instances don't share state.
- Replace bulk-clear eviction with LRU eviction (e.g., functools.lru_cache or collections.OrderedDict).
- Document that the cache key uses Python's hash() and is not stable across restarts.
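One way these recommendations could fit together is a small cache object owned by each server instance. The following is a sketch, not the shim's implementation: the RouterCache class and its method names are hypothetical, and it assumes ShimServer would hold one instance as an attribute and call clear() from switch_model.

```python
from collections import OrderedDict
from typing import Optional


class RouterCache:
    """Hypothetical per-instance LRU cache mapping a cache key to a model slug."""

    def __init__(self, max_size: int = 256) -> None:
        self._max_size = max_size
        self._entries: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        slug = self._entries.get(key)
        if slug is not None:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
        return slug

    def put(self, key: str, slug: str) -> None:
        self._entries[key] = slug
        self._entries.move_to_end(key)
        # Evict only the least recently used entries, one at a time.
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)

    def clear(self) -> None:
        """Drop every entry, e.g. when the active model or configuration changes."""
        self._entries.clear()

    def __len__(self) -> int:
        return len(self._entries)
```

With max_size=2, inserting "a" and "b", reading "a", then inserting "c" evicts "b" and keeps "a", so recently used decisions survive a full cache.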
Acceptance criteria
- Calling switch_model clears the router cache.
- Router cache is stored per ShimServer instance, not as a module global.
- LRU or FIFO eviction replaces the current bulk-clear strategy.
- Tests verify cache invalidation on model switch.
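The first two criteria could be checked with tests along these lines. The real ShimServer constructor and fields are not shown in this report, so the sketch uses a stand-in class, FakeShimServer, with an assumed router_cache attribute; only the switch_model name comes from the source.

```python
class FakeShimServer:
    """Stand-in for ShimServer; attribute names here are assumptions."""

    def __init__(self, active_model: str) -> None:
        self.active_model = active_model
        # Per-instance cache instead of a module-level dict.
        self.router_cache: dict[str, str] = {}

    def switch_model(self, slug: str) -> None:
        self.active_model = slug
        # Cached routing decisions may point at the previous model.
        self.router_cache.clear()


def test_switch_model_clears_router_cache() -> None:
    server = FakeShimServer("model-a")
    server.router_cache["False|12345"] = "model-a"
    server.switch_model("model-b")
    assert server.router_cache == {}
    assert server.active_model == "model-b"


def test_instances_do_not_share_cache() -> None:
    first = FakeShimServer("model-a")
    second = FakeShimServer("model-a")
    first.router_cache["False|12345"] = "model-a"
    assert second.router_cache == {}
```

Both functions follow the pytest naming convention, so they would be collected automatically once pointed at the real class.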
Suggested labels
bug, architecture, reliability
Priority
P2
Severity
Medium — incorrect routing after model reconfiguration; no security impact in single-user deployments.
Confidence
Confirmed — module-level _cache dict and bulk-clear eviction are explicit in the source; switch_model does not call reset_cache.