prompt-cache

A small PHP library for caching LLM responses so you don't keep paying to get the same answer back. It does two things: exact match caching (hash the prompt, look it up) and semantic caching (compare embeddings when the exact match misses).

$reply = PromptCache::remember($prompt, function () use ($client, $prompt) {
    return $client->chat($prompt);
});

First call runs the closure. Second call gives you the saved reply. That's it.
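
If the remember pattern is new to you, here's a rough sketch of the idea using a plain in-memory array. `TinyPromptCache` is a made-up class for illustration, not the package's internals, and it skips normalisation and persistence entirely.

```php
<?php

// Hypothetical in-memory stand-in for the cache; not the library's code.
final class TinyPromptCache
{
    /** @var array<string, string> */
    private array $rows = [];

    public function remember(string $prompt, callable $compute): string
    {
        $key = hash('sha256', $prompt);

        if (!array_key_exists($key, $this->rows)) {
            // Miss: run the closure once and keep its result.
            $this->rows[$key] = (string) $compute();
        }

        return $this->rows[$key];
    }
}

$calls = 0;
$cache = new TinyPromptCache();

// Stand-in for an LLM call; counts how often "upstream" is actually hit.
$ask = function () use (&$calls): string {
    $calls++;
    return 'Paris';
};

$first  = $cache->remember('Capital of France?', $ask);
$second = $cache->remember('Capital of France?', $ask);
```

After both calls `$calls` is still 1, because the second call never reaches the closure.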

Why I wrote this

I had a side project that talked to OpenAI a lot, and the bill was getting silly considering most of the prompts were variations of the same handful of questions. I wanted something like Cache::remember() but for LLM stuff. I couldn't find anything that wasn't tied to a specific SDK or buried inside some big agent framework, so I made one.

Requirements

PHP 8.2 or newer
PDO with the sqlite driver (for the default storage)
predis/predis if you want to use Redis instead
cURL if you use the OpenAI or Ollama embedding providers

Install

composer require prompt-cache/prompt-cache

The first time you call it, it creates a sqlite file at storage/prompt-cache.sqlite and you're done. No config to write unless you want to.

Basic usage

use PromptCache\PromptCache;

$prompt = 'Summarise this article in three bullet points: ...';

$summary = PromptCache::remember($prompt, function () use ($myOpenAi, $prompt) {
    return $myOpenAi->chat($prompt);  // or anthropic, mistral, whatever
});

The prompt gets normalised (extra whitespace flattened, timestamps and UUIDs replaced with placeholders) before it's hashed, so two prompts that only differ by formatting still hit the same cache row.
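
To make that concrete, here's a minimal sketch of what normalise-then-hash could look like. The function names, the regexes and the `{uuid}` / `{timestamp}` placeholders are all assumptions for illustration; the package's real normaliser may use different rules.

```php
<?php

// Hypothetical normaliser: the patterns below are guesses, not the library's rules.
function normalisePrompt(string $prompt): string
{
    // Replace UUIDs (8-4-4-4-12 hex digits) with a placeholder.
    $out = preg_replace(
        '/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/i',
        '{uuid}',
        $prompt
    );

    // Replace ISO 8601-style timestamps such as 2024-05-01T12:30:00Z.
    $out = preg_replace(
        '/\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?(\.\d+)?(Z|[+-]\d{2}:?\d{2})?/',
        '{timestamp}',
        $out
    );

    // Collapse runs of whitespace (including newlines) and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $out));
}

// The cache key is a hash of the normalised text, not of the raw prompt.
function promptKey(string $prompt): string
{
    return hash('sha256', normalisePrompt($prompt));
}

$a = "Summarise   order 123e4567-e89b-12d3-a456-426614174000\n at 2024-05-01T12:30:00Z";
$b = 'Summarise order 9f1c2b3a-0d4e-4f5a-8b6c-7d8e9f0a1b2c at 2025-01-15 08:00:00';
```

Both `$a` and `$b` normalise to the same string, so `promptKey($a)` and `promptKey($b)` come out identical.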

Semantic cache

When the exact hash doesn't match, semantic() embeds the prompt, compares it against the embeddings of previously stored prompts and picks the closest one. If the similarity is above the threshold (0.92 by default), you get the old answer back.

$reply = PromptCache::semantic(
    $prompt,
    fn () => $client->chat($prompt)
);

Tighten or loosen the threshold per call if you want:

$reply = PromptCache::semantic(
    $prompt,
    fn () => $client->chat($prompt),
    0.88
);
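
For a feel of what the threshold does, here's a small sketch of a nearest-match lookup using cosine similarity. `cosineSimilarity()`, `closestReply()` and the row shape are hypothetical names made up for this example, and treating the threshold as a strict "greater than" is an assumption too.

```php
<?php

// Cosine similarity of two equal-length vectors; returns 0.0 if either is all zeros.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $x) {
        $dot   += $x * $b[$i];
        $normA += $x * $x;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Hypothetical lookup. $rows is a list of ['embedding' => float[], 'reply' => string].
 * Returns the most similar reply if it scores above $threshold, otherwise null.
 */
function closestReply(array $query, array $rows, float $threshold = 0.92): ?string
{
    $best = null;
    $bestScore = -INF;

    foreach ($rows as $row) {
        $score = cosineSimilarity($query, $row['embedding']);
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $row['reply'];
        }
    }

    return $bestScore > $threshold ? $best : null;
}

$rows = [
    ['embedding' => [1.0, 0.0, 0.0], 'reply' => 'answer about cats'],
    ['embedding' => [0.0, 1.0, 0.0], 'reply' => 'answer about dogs'],
];

$hit  = closestReply([0.9, 0.1, 0.0], $rows); // close to the first row
$miss = closestReply([0.7, 0.7, 0.0], $rows); // halfway between both rows
```

The first query scores about 0.99 against the cats row and is a hit. The second scores about 0.71 against both rows, which is under 0.92, so it's a miss and the closure would run.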

I deliberately didn't depend on any LLM SDK. You hand me a closure, I either run it or I don't. Use whatever client you like.

Streaming

For streaming responses you get a Generator back. First call streams from upstream while quietly stitching the chunks together for the cache. Second call replays from the cache, same chunked interface, no upstream traffic.

foreach (PromptCache::stream($prompt, fn () => $client->stream($prompt)) as $chunk) {
    echo $chunk;
}
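
The tee-and-replay idea can be sketched with a plain generator. `StreamCacheSketch` is a made-up class that keeps chunks in an array keyed by the raw prompt; how the package actually stores and re-chunks a stream is not shown here, so treat this as one possible shape.

```php
<?php

// Hypothetical in-memory version; the real package persists through a Store.
final class StreamCacheSketch
{
    /** @var array<string, string[]> */
    private array $chunks = [];

    public function stream(string $prompt, callable $upstream): Generator
    {
        if (isset($this->chunks[$prompt])) {
            // Hit: replay the saved chunks, upstream is never called.
            yield from $this->chunks[$prompt];
            return;
        }

        $buffer = [];
        foreach ($upstream() as $chunk) {
            $buffer[] = $chunk; // keep a copy for the cache
            yield $chunk;       // pass it through to the caller right away
        }

        // Only store once the upstream stream has finished.
        $this->chunks[$prompt] = $buffer;
    }
}

$calls = 0;

// Stand-in for a streaming LLM client.
$upstream = function () use (&$calls): Generator {
    $calls++;
    yield 'Hel';
    yield 'lo';
};

$cache  = new StreamCacheSketch();
$first  = implode('', iterator_to_array($cache->stream('hi', $upstream), false));
$second = implode('', iterator_to_array($cache->stream('hi', $upstream), false));
```

One thing this sketch makes visible: if the caller stops iterating early, the generator never reaches the store step, so a half-finished stream isn't cached.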

Stats

I added counters mostly because I wanted to see for myself how much I was actually saving. There's a stats() method that gives you a running total:

print_r(PromptCache::stats());
/*
Array (
    [requests]            => 1200
    [exact_hits]          => 400
    [semantic_hits]       => 300
    [misses]              => 500
    [tokens_saved]        => 2838282
    [estimated_usd_saved] => 482.22
)
*/

The token count is a back-of-the-envelope figure (4 chars per token), not a real tokeniser, so treat the dollar number as a rough sanity check, not a billing reconciliation.
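
Here's roughly what that estimate amounts to. Both function names and the `$usdPerMillionTokens` parameter are hypothetical, and the price in the example is a placeholder, not any provider's real rate.

```php
<?php

// Rule of thumb from above: about 4 characters per token.
// strlen() counts bytes, so multibyte text will be overestimated a little.
function estimateTokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

// $usdPerMillionTokens is a made-up parameter; plug in your model's real price.
function estimateUsdSaved(int $tokens, float $usdPerMillionTokens): float
{
    return round($tokens / 1_000_000 * $usdPerMillionTokens, 2);
}

$tokens = estimateTokens(str_repeat('a', 4000)); // 4000 chars / 4
$usd    = estimateUsdSaved(2_000_000, 0.50);     // placeholder price
```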

Debug

If you want to see what it's actually doing:

PromptCache::debug(true);

You'll see hits, misses, similarity scores and embedding timings written to STDERR (or error_log() outside of CLI). Or pass a callable and route the events yourself:

PromptCache::debug(function ($event, $data) {
    Log::info("prompt-cache.$event", $data);
});

Storage drivers

Three drivers ship with the package. Pick whichever fits.

| Driver | Class | When to use it |
| --- | --- | --- |
| SQLite | `PromptCache\Stores\SqliteStore` | Default. No setup. Good for most apps. |
| File | `PromptCache\Stores\FileStore` | One JSON file. Handy for shipping a warm cache. |
| Redis | `PromptCache\Stores\RedisStore` | When you have multiple workers sharing a cache. |

Want a different backend? Implement PromptCache\Contracts\Store. It has eight methods, none of them surprising.

Embedding providers

| Provider | Class | Notes |
| --- | --- | --- |
| Null | `PromptCache\Embeddings\NullEmbeddingProvider` | Local CRC32 thing. No keys needed. Rough quality. |
| OpenAI | `PromptCache\Embeddings\OpenAIEmbeddingProvider` | Uses `text-embedding-3-small` by default. |
| Ollama | `PromptCache\Embeddings\OllamaEmbeddingProvider` | Hits a local Ollama server. |

The Null provider exists so the semantic API works out of the box without anyone having to wire up an API key. It's not great. Use one of the real ones when it matters.
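
To show why a hash-based embedding is only rough, here's a guess at the general technique (feature hashing with `crc32()`). This is not the code in `NullEmbeddingProvider`; `hashedEmbedding()` and the 64-dimension default are invented for the example.

```php
<?php

// Hypothetical feature-hashing embedding; an illustration, not the package's provider.
function hashedEmbedding(string $text, int $dimensions = 64): array
{
    $vector = array_fill(0, $dimensions, 0.0);

    // Lowercase and split on non-word characters, dropping empty pieces.
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $word) {
        // Mask to a non-negative value, then pick a bucket with the modulo.
        $bucket = (crc32($word) & 0x7FFFFFFF) % $dimensions;
        $vector[$bucket] += 1.0;
    }

    return $vector;
}

$a = hashedEmbedding('Hello world');
$b = hashedEmbedding('world   HELLO!');
```

`$a` and `$b` are identical because only the set of words counts. Word order is thrown away and a synonym lands in an unrelated bucket, which is the kind of weakness a real embedding model doesn't have.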

Laravel

The service provider auto-discovers. If you want to tweak settings, publish the config:

php artisan vendor:publish --tag=prompt-cache-config

Then use the facade wherever:

use PromptCache;

$reply = PromptCache::semantic($prompt, fn () => $client->chat($prompt));

Or set things from .env:

PROMPT_CACHE_DRIVER=redis
PROMPT_CACHE_EMBEDDINGS=openai
OPENAI_API_KEY=sk-...
PROMPT_CACHE_THRESHOLD=0.9

Examples

There's a folder of runnable scripts under examples/:

01_exact_cache.php - the most basic case
02_semantic_cache.php - rephrased prompt that still hits
03_streaming.php - stream, cache, replay
04_stats_and_debug.php - counters and the debug logger
05_openai_real.php - real OpenAI embeddings, needs OPENAI_API_KEY

Run them with php examples/01_exact_cache.php from the package root. They use a tiny autoloader so they work without composer install.

Tests

composer install
vendor/bin/pest

The suite covers the exact cache, semantic cache, cosine similarity, streaming, sqlite persistence, the file store, the normaliser and the stats counters.

What this isn't

It's a cache. There's no agent framework here, no RAG helpers, no chain-of-thought thing. Bring your own LLM library and put this in front of it.

License

MIT. See LICENSE.
