BetterDB-inc · KIvanow · Jun 8, 2026
diff --git a/docs/packages/agent-cache.md b/docs/packages/agent-cache.md
@@ -271,6 +271,8 @@ const model = new ChatOpenAI({
 
 The adapter implements LangChain's `BaseCache` interface.
 
+**Limitation:** LangChain's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer. Tool definitions bound to the model are not passed through this interface, so tool-schema changes are not reflected in the cache key. If a tool's schema changes without a corresponding change to the model identity (model name, temperature, etc.), the cache may serve a stale response computed against the old schema. If this matters for your use case, incorporate a tool version string into your model configuration or use a separate cache namespace per tool revision.
+
 ### Vercel AI SDK
 
 Import from `@betterdb/agent-cache/ai`. Requires `ai` ^6.0.135 as a peer dependency.
@@ -288,6 +290,28 @@ const model = wrapLanguageModel({
 
 The middleware intercepts non-streaming `doGenerate` calls. On a cache hit, the model is not called and the response includes `providerMetadata: { agentCache: { hit: true } }` so consumers can distinguish cached responses from real zero-token calls. Responses containing tool-call parts are not cached to avoid breaking tool-calling workflows.
 
+Tool definitions, seed, stop sequences, response format, and tool choice are all included in the cache key automatically. Requests with identical messages but different tools (or different generation parameters) will not collide.
+
+### LlamaIndex
+
+Import from `@betterdb/agent-cache/llamaindex`. Requires `@llamaindex/core` >= 0.6.0 as a peer dependency.
+
+```typescript
+import { prepareParams } from '@betterdb/agent-cache/llamaindex';
+
+const params = await prepareParams(messages, {
+  model: 'gpt-4o',
+  temperature: 0,
+  tools: myTools, // BaseTool[] from LlamaIndex
+});
+
+const result = await cache.llm.check(params);
+```
+
+Tool definitions are included in the cache key when passed via the `tools` option. Only `tool.metadata` (name, description, parameters) is serialized; the `call` closure is never included.
+
+Callers must pass `tools` into `prepareParams` to protect against tool-schema drift. Omitting `tools` falls back to keying without tool information (the prior behavior), so requests with identical messages but different tool sets will collide in the cache.
+
 ### LangGraph
 
 Import from `@betterdb/agent-cache/langgraph`. Requires `@langchain/langgraph-checkpoint` >= 0.1.0 as a peer dependency.

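One way to apply the "separate cache namespace per tool revision" workaround from the LangChain limitation is to derive a short fingerprint from the tool schemas and append it to the namespace. The sketch below is illustrative only: `tools_fingerprint` and `namespaced` are hypothetical helpers, not part of the package or of LangChain, and how the namespace reaches the cache depends on your setup. It is written in Python; the same idea carries over to TypeScript.

```python
import hashlib
import json
from typing import Any


def tools_fingerprint(tool_schemas: list[dict[str, Any]]) -> str:
    """Return a short, stable digest of a list of tool schemas (hypothetical helper)."""
    # Serialize each schema canonically, then sort so tool order does not matter.
    canonical = sorted(
        json.dumps(schema, sort_keys=True, separators=(",", ":"))
        for schema in tool_schemas
    )
    digest = hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()
    return digest[:12]


def namespaced(base: str, tool_schemas: list[dict[str, Any]]) -> str:
    """Build a cache namespace that changes whenever any tool schema changes."""
    return f"{base}:tools-{tools_fingerprint(tool_schemas)}"


# Two revisions of the same tool: only the parameter name differs.
weather_v1 = {"name": "get_weather", "parameters": {"properties": {"city": {"type": "string"}}}}
weather_v2 = {"name": "get_weather", "parameters": {"properties": {"location": {"type": "string"}}}}

ns_v1 = namespaced("chat", [weather_v1])
ns_v2 = namespaced("chat", [weather_v2])
```

Because the namespace differs between `ns_v1` and `ns_v2`, responses cached against the old schema are never looked up once the schema changes. The trade-off is that every schema change starts with a cold cache.
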
diff --git a/packages/agent-cache-py/CHANGELOG.md b/packages/agent-cache-py/CHANGELOG.md
@@ -1,3 +1,18 @@
+## [0.7.0] - 2026-06-08
+
+### Fixed
+
+- **LlamaIndex adapter: tool definitions now included in cache key.** When `tools` is passed to `prepare_params()`, tool metadata (name, description, parameters) is extracted and included in the cache key. Only serializable metadata is used; callable closures are never serialized.
+
+### Changed
+
+- **Cache keys changed for tool-using requests on the LlamaIndex adapter.** Those requests will incur a one-time cache miss after upgrade. This is intended: the prior entries were keyed without tool information and are not safe to reuse across differing tool sets.
+- **`prepare_params()` now accepts a `tools` keyword argument (and `LlamaIndexPrepareOptions.tools` field).** Callers must pass `tools` to get tool-schema safety. Omitting it falls back to keying without tool information (prior behavior).
+
+### Known limitations
+
+- **LangChain adapter: tool-schema drift is not reflected in the cache key.** The framework's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer, so tool definitions are structurally unreachable. Unchanged in this release; documented as a known limitation.
+
 ## [0.6.0] - 2026-05-04
 
 ### Added

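To illustrate why adding tool metadata to the params changes the key, here is a minimal, self-contained sketch. It does not reproduce the package's `llm_cache_hash`, which is not shown in this diff. `sketch_key` and `fn_tool` are hypothetical stand-ins, and the sketch assumes canonical JSON hashing with tools sorted so that order does not affect the key.

```python
import hashlib
import json
from typing import Any


def sketch_key(params: dict[str, Any]) -> str:
    """Hash cache params canonically (hypothetical stand-in, not llm_cache_hash)."""
    normalized = dict(params)
    if "tools" in normalized:
        # Assumption: tool order is irrelevant, so sort by canonical JSON form.
        normalized["tools"] = sorted(
            normalized["tools"],
            key=lambda t: json.dumps(t, sort_keys=True),
        )
    payload = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def fn_tool(name: str, prop: str) -> dict[str, Any]:
    """Build a tool entry in the {"type": "function", ...} shape the adapter emits."""
    schema = {"type": "object", "properties": {prop: {"type": "string"}}}
    return {"type": "function", "function": {"name": name, "parameters": schema}}


base = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}
weather = fn_tool("get_weather", "city")
search = fn_tool("search", "q")

key_plain = sketch_key(base)                                   # no tools
key_tools = sketch_key({**base, "tools": [weather, search]})   # with tools
key_swapped = sketch_key({**base, "tools": [search, weather]}) # same tools, reordered
key_drifted = sketch_key({**base, "tools": [fn_tool("get_weather", "location"), search]})
```

Under these assumptions, the key with tools differs from the key without them, reordering the tools leaves the key unchanged, and renaming a single parameter produces a new key. This also shows why requests that now pass `tools` miss once after the upgrade.
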
diff --git a/packages/agent-cache-py/betterdb_agent_cache/adapters/llamaindex.py b/packages/agent-cache-py/betterdb_agent_cache/adapters/llamaindex.py
@@ -27,6 +27,16 @@ class LlamaIndexPrepareOptions:
     temperature: float | None = None
     top_p: float | None = None
     max_tokens: int | None = None
+    tools: list[Any] | None = None
+    """Tool definitions to include in the cache key.
+
+    Pass the same tools list you provide to the LLM call. Each tool must
+    expose a ``metadata`` attribute (or dict key) with at least ``name``,
+    and optionally ``description`` and ``parameters``. Only metadata is
+    serialized; callable closures are never included.
+
+    Omitting this field falls back to keying without tool information (prior behavior).
+    """
 
 
 def _parse_input(value: Any) -> Any:
@@ -87,6 +97,32 @@ async def _normalize_detail(
     return None
 
 
+def _extract_tool_metadata(tool: Any) -> dict[str, Any]:
+    """Extract serializable metadata from a LlamaIndex BaseTool."""
+    if hasattr(tool, "metadata"):
+        meta = tool.metadata
+    elif isinstance(tool, dict) and "metadata" in tool:
+        meta = tool["metadata"]
+    else:
+        meta = tool  # Already metadata-like (a dict or an object with a name)
+
+    if hasattr(meta, "name"):
+        name = meta.name
+        description = getattr(meta, "description", None)
+        parameters = getattr(meta, "parameters", None)
+    else:
+        name = meta.get("name", "")
+        description = meta.get("description")
+        parameters = meta.get("parameters")
+
+    fn: dict[str, Any] = {"name": name}
+    if description is not None:
+        fn["description"] = description
+    if parameters is not None:
+        fn["parameters"] = parameters
+    return {"type": "function", "function": fn}
+
+
 async def prepare_params(
     messages: list[dict[str, Any]],
     opts: LlamaIndexPrepareOptions | None = None,
@@ -96,13 +132,18 @@ async def prepare_params(
     temperature: float | None = None,
     top_p: float | None = None,
     max_tokens: int | None = None,
+    tools: list[Any] | None = None,
 ) -> LlmCacheParams:
     """Normalise a LlamaIndex message list to ``LlmCacheParams``.
 
     Either pass an ``LlamaIndexPrepareOptions`` instance or use the keyword
     arguments directly::
 
         params = await prepare_params(msgs, model="gpt-4o", temperature=0.7)
+
+    To include tool definitions in the cache key (recommended when using tools)::
+
+        params = await prepare_params(msgs, model="gpt-4o", tools=my_tools)
     """
     if opts is None:
         opts = LlamaIndexPrepareOptions(
@@ -111,6 +152,7 @@ async def prepare_params(
             temperature=temperature,
             top_p=top_p,
             max_tokens=max_tokens,
+            tools=tools,
         )
 
     norm = opts.normalizer
@@ -162,5 +204,7 @@ async def prepare_params(
         result["top_p"] = opts.top_p
     if opts.max_tokens is not None:
         result["max_tokens"] = opts.max_tokens
+    if opts.tools:  # skip both None and an empty list
+        result["tools"] = [_extract_tool_metadata(t) for t in opts.tools]
 
     return result
diff --git a/packages/agent-cache-py/tests/adapters/test_llamaindex_key_divergence.py b/packages/agent-cache-py/tests/adapters/test_llamaindex_key_divergence.py
@@ -0,0 +1,105 @@
+"""Key divergence tests for the LlamaIndex adapter.
+
+Proves that tool definitions, tool order, and non-serializable closures are
+handled correctly in cache key computation.
+"""
+from __future__ import annotations
+
+import pytest
+from betterdb_agent_cache.adapters.llamaindex import prepare_params
+from betterdb_agent_cache.utils import llm_cache_hash
+
+
+MSGS = [{"role": "user", "content": "Hello"}]
+
+
+class _ToolMetadata:
+    """Mimics LlamaIndex ToolMetadata (attribute-based, not a dict)."""
+
+    def __init__(self, name: str, description: str, parameters: dict | None = None):
+        self.name = name
+        self.description = description
+        self.parameters = parameters
+
+
+class _FakeTool:
+    """Mimics LlamaIndex BaseTool with metadata + a non-serializable call."""
+
+    def __init__(self, meta: _ToolMetadata):
+        self.metadata = meta
+
+    def call(self, _input):  # noqa: ANN001, ANN201  — intentionally untyped
+        raise RuntimeError("should never be serialized")
+
+
+TOOL_A_META = _ToolMetadata("get_weather", "Get weather", {"type": "object", "properties": {"city": {"type": "string"}}})
+TOOL_B_META = _ToolMetadata("search", "Search web", {"type": "object", "properties": {"q": {"type": "string"}}})
+TOOL_A_ALT_META = _ToolMetadata("get_weather", "Get weather", {"type": "object", "properties": {"location": {"type": "string"}}})
+
+TOOL_A = _FakeTool(TOOL_A_META)
+TOOL_B = _FakeTool(TOOL_B_META)
+TOOL_A_ALT = _FakeTool(TOOL_A_ALT_META)
+
+
+# ─── Case 1: Tool sensitivity ────────────────────────────────────────────────
+
+@pytest.mark.asyncio
+async def test_different_tool_names_produce_different_keys():
+    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
+    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_B])
+    assert llm_cache_hash(p1) != llm_cache_hash(p2)
+
+
+@pytest.mark.asyncio
+async def test_same_name_different_params_produce_different_keys():
+    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
+    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A_ALT])
+    assert llm_cache_hash(p1) != llm_cache_hash(p2)
+
+
+# ─── Case 2: Tool stability (order invariance) ───────────────────────────────
+
+@pytest.mark.asyncio
+async def test_same_tools_different_order_produce_same_key():
+    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A, TOOL_B])
+    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_B, TOOL_A])
+    assert llm_cache_hash(p1) == llm_cache_hash(p2)
+
+
+# ─── Case 3: Tools-absent baseline ───────────────────────────────────────────
+
+@pytest.mark.asyncio
+async def test_no_tools_vs_with_tools_produce_different_keys():
+    p_no = await prepare_params(MSGS, model="gpt-4o")
+    p_yes = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
+    assert llm_cache_hash(p_no) != llm_cache_hash(p_yes)
+
+
+@pytest.mark.asyncio
+async def test_no_tools_both_calls_produce_same_key():
+    p1 = await prepare_params(MSGS, model="gpt-4o")
+    p2 = await prepare_params(MSGS, model="gpt-4o")
+    assert llm_cache_hash(p1) == llm_cache_hash(p2)
+
+
+# ─── Case 6: Closure safety ──────────────────────────────────────────────────
+
+@pytest.mark.asyncio
+async def test_tool_with_closure_produces_same_key_as_plain_metadata():
+    """A tool carrying a non-serializable call closure must not throw and
+    must produce a key derived only from its metadata."""
+    tool_with_closure = _FakeTool(TOOL_A_META)
+    tool_plain = {"metadata": {"name": "get_weather", "description": "Get weather",
+                               "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}
+
+    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[tool_with_closure])
+    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[tool_plain])
+    assert llm_cache_hash(p1) == llm_cache_hash(p2)
+
+
+@pytest.mark.asyncio
+async def test_closure_key_is_deterministic():
+    tool = _FakeTool(TOOL_A_META)
+    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[tool])
+    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[tool])
+    assert llm_cache_hash(p1) == llm_cache_hash(p2)
diff --git a/packages/agent-cache/CHANGELOG.md b/packages/agent-cache/CHANGELOG.md
@@ -5,6 +5,22 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.7.0] - 2026-06-08
+
+### Fixed
+
+- **Vercel AI SDK adapter: tool definitions now included in cache key.** Previously, the adapter only keyed on model, messages, temperature, topP, and maxTokens. Requests with identical messages but different tools could return the same cached response. `seed`, `stopSequences`, `responseFormat`, and `toolChoice` are now also part of the key.
+- **LlamaIndex adapter: tool definitions now included in cache key.** When `tools` is passed to `prepareParams()`, tool metadata (name, description, parameters) is extracted and included in the cache key. Only serializable metadata is used; the `call` closure is never serialized.
+
+### Changed
+
+- **Cache keys changed for tool-using requests on Vercel and LlamaIndex adapters.** Those requests will incur a one-time cache miss after upgrade. This is intended: the prior entries were keyed without tool information and are not safe to reuse across differing tool sets.
+- **LlamaIndex `prepareParams()` now accepts a `tools` option.** Callers must pass `tools` to get tool-schema safety. Omitting it falls back to keying without tool information (prior behavior).
+
+### Known limitations
+
+- **LangChain adapter: tool-schema drift is not reflected in the cache key.** The framework's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer, so tool definitions are structurally unreachable. Unchanged in this release; documented as a known limitation.
+
 ## [0.6.0] - 2026-05-04
 
 ### Added