24 changes: 24 additions & 0 deletions docs/packages/agent-cache.md
@@ -271,6 +271,8 @@ const model = new ChatOpenAI({

The adapter implements LangChain's `BaseCache` interface.

**Limitation:** LangChain's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer. Tool definitions bound to the model are not passed through this interface, so tool-schema changes are not reflected in the cache key. If a tool's schema changes without a corresponding change to the model identity (model name, temperature, etc.), the cache may serve a stale response computed against the old schema. If this matters for your use case, incorporate a tool version string into your model configuration or use a separate cache namespace per tool revision.
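One way to apply the second workaround is to derive the namespace from the tool schemas themselves, so it changes whenever a schema changes. The Python sketch below is illustrative only: `tool_revision` and `namespaced` are hypothetical helpers, not part of the package, and how a namespace is handed to the adapter is assumed rather than shown here.

```python
import hashlib
import json
from typing import Any


def tool_revision(tool_schemas: list[dict[str, Any]]) -> str:
    """Return a short, stable digest of a list of tool schemas (hypothetical helper)."""
    # sort_keys makes the digest independent of dict key order.
    canonical = json.dumps(tool_schemas, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def namespaced(base: str, tool_schemas: list[dict[str, Any]]) -> str:
    """Build a cache namespace that changes whenever a tool schema changes."""
    return f"{base}:tools-{tool_revision(tool_schemas)}"


weather_v1 = [{"name": "get_weather", "parameters": {"properties": {"city": {"type": "string"}}}}]
weather_v2 = [{"name": "get_weather", "parameters": {"properties": {"location": {"type": "string"}}}}]

# A schema change yields a different namespace, so old entries are never served.
namespaces_differ = namespaced("langchain", weather_v1) != namespaced("langchain", weather_v2)
print(namespaces_differ)
```

A derived digest avoids having to remember to bump a hand-written version string, at the cost of invalidating the cache on every schema edit, including cosmetic ones such as a reworded description.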

### Vercel AI SDK

Import from `@betterdb/agent-cache/ai`. Requires `ai` ^6.0.135 as a peer dependency.
@@ -288,6 +290,28 @@ const model = wrapLanguageModel({

The middleware intercepts non-streaming `doGenerate` calls. On a cache hit, the model is not called and the response includes `providerMetadata: { agentCache: { hit: true } }` so consumers can distinguish cached responses from real zero-token calls. Responses containing tool-call parts are not cached to avoid breaking tool-calling workflows.

Tool definitions, seed, stop sequences, response format, and tool choice are all included in the cache key automatically. Requests with identical messages but different tools (or different generation parameters) will not collide.

### LlamaIndex

Import from `@betterdb/agent-cache/llamaindex`. Requires `@llamaindex/core` >= 0.6.0 as a peer dependency.

```typescript
import { prepareParams } from '@betterdb/agent-cache/llamaindex';

const params = await prepareParams(messages, {
  model: 'gpt-4o',
  temperature: 0,
  tools: myTools, // BaseTool[] from LlamaIndex
});

const result = await cache.llm.check(params);
```

Tool definitions are included in the cache key when passed via the `tools` option. Only `tool.metadata` (name, description, parameters) is serialized; the `call` closure is never included.

To protect against tool-schema drift, callers must pass `tools` to `prepareParams`. Omitting `tools` falls back to messages-only keying (the prior behavior), so requests with identical messages but different tool sets collide in the cache.
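The collision can be illustrated with a small Python sketch. It assumes a key computed by hashing the canonical JSON of the request parameters; `sketch_key` is a stand-in written for this example and is not the package's actual key function.

```python
import hashlib
import json
from typing import Any


def sketch_key(params: dict[str, Any]) -> str:
    """Stand-in cache key: SHA-256 of canonical JSON (not the package's hash)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


messages = [{"role": "user", "content": "Hello"}]
weather = {"type": "function", "function": {"name": "get_weather"}}
search = {"type": "function", "function": {"name": "search"}}

# Tools omitted: the caller intends two different tool sets,
# but both requests reduce to the same key.
key_weather_omitted = sketch_key({"model": "gpt-4o", "messages": messages})
key_search_omitted = sketch_key({"model": "gpt-4o", "messages": messages})

# Tools included: the tool set is part of the hashed parameters.
key_weather = sketch_key({"model": "gpt-4o", "messages": messages, "tools": [weather]})
key_search = sketch_key({"model": "gpt-4o", "messages": messages, "tools": [search]})

print(key_weather_omitted == key_search_omitted)  # keys collide
print(key_weather != key_search)                  # keys diverge
```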

### LangGraph

Import from `@betterdb/agent-cache/langgraph`. Requires `@langchain/langgraph-checkpoint` >= 0.1.0 as a peer dependency.
15 changes: 15 additions & 0 deletions packages/agent-cache-py/CHANGELOG.md
@@ -1,3 +1,18 @@
## [0.7.0] - 2026-06-08

### Fixed

- **LlamaIndex adapter: tool definitions now included in cache key.** When `tools` is passed to `prepare_params()`, tool metadata (name, description, parameters) is extracted and included in the cache key. Only serializable metadata is used; callable closures are never serialized.
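The reason only metadata is serialized can be seen in a short sketch. `FakeTool` and `metadata_only` are hypothetical names for this example, and JSON is assumed as the serialization format purely for illustration.

```python
import json
from typing import Any


class FakeTool:
    """Hypothetical stand-in for a tool object: metadata plus a callable."""

    def __init__(self, name: str, description: str) -> None:
        self.metadata = {"name": name, "description": description}
        self.call = lambda text: text.upper()  # closure: not JSON-serializable


def metadata_only(tool: Any) -> dict[str, Any]:
    """Copy just the serializable fields and ignore everything else on the tool."""
    meta = tool.metadata
    return {"name": meta["name"], "description": meta.get("description")}


tool = FakeTool("get_weather", "Get weather")

# Serializing the whole object fails because of the callable attribute.
try:
    json.dumps(vars(tool))
    whole_tool_serializable = True
except TypeError:
    whole_tool_serializable = False

# Serializing the extracted metadata succeeds and is deterministic.
payload = json.dumps(metadata_only(tool), sort_keys=True)
print(whole_tool_serializable, payload)
```

Because the callable never reaches the serializer, two tools with the same metadata but different implementations produce the same payload.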

### Changed

- **Cache keys changed for tool-using requests on the LlamaIndex adapter.** Entries cached before the upgrade no longer match, so each affected request misses once after upgrade. This is intended: the prior entries were keyed without tool information and are not safe to reuse across differing tool sets.
- **`prepare_params()` now accepts a `tools` keyword argument (and `LlamaIndexPrepareOptions.tools` field).** Callers must pass `tools` to get tool-schema safety. Omitting it falls back to messages-only keying (prior behavior).

### Known limitations

- **LangChain adapter: tool-schema drift is not reflected in the cache key.** The framework's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer, so tool definitions are structurally unreachable. Unchanged in this release; documented as a known limitation.

## [0.6.0] - 2026-05-04

### Added
@@ -27,6 +27,16 @@ class LlamaIndexPrepareOptions:
    temperature: float | None = None
    top_p: float | None = None
    max_tokens: int | None = None
    tools: list[Any] | None = None
    """Tool definitions to include in the cache key.

    Pass the same tools list you provide to the LLM call. Each tool must
    expose a ``metadata`` attribute (or dict key) with at least ``name``,
    and optionally ``description`` and ``parameters``. Only metadata is
    serialized; callable closures are never included.

    Omitting this field falls back to messages-only keying (prior behavior).
    """


def _parse_input(value: Any) -> Any:
@@ -87,6 +97,32 @@ async def _normalize_detail(
return None


def _extract_tool_metadata(tool: Any) -> dict[str, Any]:
    """Extract serializable metadata from a LlamaIndex BaseTool."""
    if hasattr(tool, "metadata"):
        meta = tool.metadata
    elif isinstance(tool, dict) and "metadata" in tool:
        meta = tool["metadata"]
    else:
        meta = tool  # Already a metadata-like dict

    if hasattr(meta, "name"):
        name = meta.name
        description = getattr(meta, "description", None)
        parameters = getattr(meta, "parameters", None)
    else:
        name = meta.get("name", "")
        description = meta.get("description")
        parameters = meta.get("parameters")

    fn: dict[str, Any] = {"name": name}
    if description is not None:
        fn["description"] = description
    if parameters is not None:
        fn["parameters"] = parameters
    return {"type": "function", "function": fn}


async def prepare_params(
    messages: list[dict[str, Any]],
    opts: LlamaIndexPrepareOptions | None = None,
@@ -96,13 +132,18 @@ async def prepare_params(
    temperature: float | None = None,
    top_p: float | None = None,
    max_tokens: int | None = None,
    tools: list[Any] | None = None,
) -> LlmCacheParams:
    """Normalise a LlamaIndex message list to ``LlmCacheParams``.

    Either pass an ``LlamaIndexPrepareOptions`` instance or use the keyword
    arguments directly::

        params = await prepare_params(msgs, model="gpt-4o", temperature=0.7)

    To include tool definitions in the cache key (recommended when using tools)::

        params = await prepare_params(msgs, model="gpt-4o", tools=my_tools)
    """
    if opts is None:
        opts = LlamaIndexPrepareOptions(
@@ -111,6 +152,7 @@ async def prepare_params(
            temperature=temperature,
            top_p=top_p,
            max_tokens=max_tokens,
            tools=tools,
        )

    norm = opts.normalizer
@@ -162,5 +204,7 @@ async def prepare_params(
        result["top_p"] = opts.top_p
    if opts.max_tokens is not None:
        result["max_tokens"] = opts.max_tokens
    if opts.tools is not None and len(opts.tools) > 0:
        result["tools"] = [_extract_tool_metadata(t) for t in opts.tools]

    return result
@@ -0,0 +1,105 @@
"""Key divergence tests for the LlamaIndex adapter.

Proves that tool definitions, tool order, and non-serializable closures are
handled correctly in cache key computation.
"""
from __future__ import annotations

import pytest
from betterdb_agent_cache.adapters.llamaindex import prepare_params
from betterdb_agent_cache.utils import llm_cache_hash


MSGS = [{"role": "user", "content": "Hello"}]


class _ToolMetadata:
    """Mimics LlamaIndex ToolMetadata (attribute-based, not a dict)."""

    def __init__(self, name: str, description: str, parameters: dict | None = None):
        self.name = name
        self.description = description
        self.parameters = parameters


class _FakeTool:
    """Mimics LlamaIndex BaseTool with metadata + a non-serializable call."""

    def __init__(self, meta: _ToolMetadata):
        self.metadata = meta

    def call(self, _input):  # noqa: ANN001, ANN201 - intentionally untyped
        raise RuntimeError("should never be serialized")


TOOL_A_META = _ToolMetadata("get_weather", "Get weather", {"type": "object", "properties": {"city": {"type": "string"}}})
TOOL_B_META = _ToolMetadata("search", "Search web", {"type": "object", "properties": {"q": {"type": "string"}}})
TOOL_A_ALT_META = _ToolMetadata("get_weather", "Get weather", {"type": "object", "properties": {"location": {"type": "string"}}})

TOOL_A = _FakeTool(TOOL_A_META)
TOOL_B = _FakeTool(TOOL_B_META)
TOOL_A_ALT = _FakeTool(TOOL_A_ALT_META)


# ─── Case 1: Tool sensitivity ────────────────────────────────────────────────

@pytest.mark.asyncio
async def test_different_tool_names_produce_different_keys():
    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_B])
    assert llm_cache_hash(p1) != llm_cache_hash(p2)


@pytest.mark.asyncio
async def test_same_name_different_params_produce_different_keys():
    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A_ALT])
    assert llm_cache_hash(p1) != llm_cache_hash(p2)


# ─── Case 2: Tool stability (order invariance) ───────────────────────────────

@pytest.mark.asyncio
async def test_same_tools_different_order_produce_same_key():
    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A, TOOL_B])
    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_B, TOOL_A])
    assert llm_cache_hash(p1) == llm_cache_hash(p2)


# ─── Case 3: Tools-absent baseline ───────────────────────────────────────────

@pytest.mark.asyncio
async def test_no_tools_vs_with_tools_produce_different_keys():
    p_no = await prepare_params(MSGS, model="gpt-4o")
    p_yes = await prepare_params(MSGS, model="gpt-4o", tools=[TOOL_A])
    assert llm_cache_hash(p_no) != llm_cache_hash(p_yes)


@pytest.mark.asyncio
async def test_no_tools_both_calls_produce_same_key():
    p1 = await prepare_params(MSGS, model="gpt-4o")
    p2 = await prepare_params(MSGS, model="gpt-4o")
    assert llm_cache_hash(p1) == llm_cache_hash(p2)


# ─── Case 6: Closure safety ──────────────────────────────────────────────────

@pytest.mark.asyncio
async def test_tool_with_closure_produces_same_key_as_plain_metadata():
    """A tool carrying a non-serializable call closure must not throw and
    must produce a key derived only from its metadata."""
    tool_with_closure = _FakeTool(TOOL_A_META)
    tool_plain = {"metadata": {"name": "get_weather", "description": "Get weather",
                               "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}

    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[tool_with_closure])
    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[tool_plain])
    assert llm_cache_hash(p1) == llm_cache_hash(p2)


@pytest.mark.asyncio
async def test_closure_key_is_deterministic():
    tool = _FakeTool(TOOL_A_META)
    p1 = await prepare_params(MSGS, model="gpt-4o", tools=[tool])
    p2 = await prepare_params(MSGS, model="gpt-4o", tools=[tool])
    assert llm_cache_hash(p1) == llm_cache_hash(p2)
16 changes: 16 additions & 0 deletions packages/agent-cache/CHANGELOG.md
@@ -5,6 +5,22 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.7.0] - 2026-06-08

### Fixed

- **Vercel AI SDK adapter: tool definitions now included in cache key.** Previously, the adapter only keyed on model, messages, temperature, topP, and maxTokens. Requests with identical messages but different tools could return the same cached response. `seed`, `stopSequences`, `responseFormat`, and `toolChoice` are now also part of the key.
- **LlamaIndex adapter: tool definitions now included in cache key.** When `tools` is passed to `prepareParams()`, tool metadata (name, description, parameters) is extracted and included in the cache key. Only serializable metadata is used; the `call` closure is never serialized.

### Changed

- **Cache keys changed for tool-using requests on Vercel and LlamaIndex adapters.** Entries cached before the upgrade no longer match, so each affected request misses once after upgrade. This is intended: the prior entries were keyed without tool information and are not safe to reuse across differing tool sets.
- **LlamaIndex `prepareParams()` now accepts a `tools` option.** Callers must pass `tools` to get tool-schema safety. Omitting it falls back to messages-only keying (prior behavior).

### Known limitations

- **LangChain adapter: tool-schema drift is not reflected in the cache key.** The framework's `BaseCache` interface exposes only `(prompt, llm_string)` to the cache layer, so tool definitions are structurally unreachable. Unchanged in this release; documented as a known limitation.

## [0.6.0] - 2026-05-04

### Added