Generic caching helpers for Pulp pipelines.
- Remember expensive sources: Wrap any source pipeline and reuse cached output on fresh cache hits.
- Pass-through caching: Cache any file that flows through a pipeline, independent of where it came from.
- Stale fallback: Keep jobs working with the last cached result if an upstream source fails.
- Multiple file support: Cache and restore one or more Pulp File objects via a manifest.
- Raw or serialized payloads: String payloads are stored as raw files; decoded arrays/objects are serialized.
composer require mapsight/pulp-cache

Use remember() when you want cache hits to skip an expensive upstream operation, such as a large HTTP download.
use OpenMapsight\Pulp;
use OpenMapsight\PulpCache;

$source = Pulp::start()
    ->pipe(Pulp::srcHttp(
        'GET',
        'https://example.com/data.zip',
        ['timeout' => 120],
        'data.zip'
    ));

$files = Pulp::start()
    ->pipe(PulpCache::remember($source, __DIR__ . '/cache', [
        'key' => 'example-data',
        'ttl' => 86400,
        'fallbackToStale' => true,
    ]))
    ->run();

On a fresh cache hit, the wrapped $source pipeline is not executed.
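The control flow described above (skip the source on a fresh hit, fall back to stale data when the source fails) can be sketched in plain PHP. This is only an illustration under stated assumptions: `rememberSketch()` and its parameters are hypothetical and not part of the package, the cache is passed in as a plain array, and writing the refreshed result back to disk is left out.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of remember()-style control flow; not the library's code.
 *
 * @param callable(): array $source  Expensive producer (e.g. a download).
 * @param array|null        $cached  Previously cached files, or null if none.
 * @param bool              $isFresh Whether the cached entry is still within its TTL.
 */
function rememberSketch(callable $source, ?array $cached, bool $isFresh, bool $fallbackToStale = true): array
{
    // Fresh hit: return the cached files without touching the source.
    if ($cached !== null && $isFresh) {
        return $cached;
    }

    try {
        // Miss or expired entry: run the expensive source.
        return $source();
    } catch (Throwable $e) {
        // Source failed: serve the stale entry if allowed and available.
        if ($fallbackToStale && $cached !== null) {
            return $cached;
        }
        throw $e;
    }
}

$calls = 0;
$ok = function () use (&$calls): array {
    $calls++;
    return ['data.zip' => 'new'];
};
$broken = function (): array {
    throw new RuntimeException('upstream down');
};

$fresh     = rememberSketch($ok, ['data.zip' => 'old'], true);      // cached copy, source not called
$refreshed = rememberSketch($ok, ['data.zip' => 'old'], false);     // expired entry, source runs
$stale     = rememberSketch($broken, ['data.zip' => 'old'], false); // failure, stale copy returned
```

With `$fallbackToStale` set to `false`, the last call would rethrow the `RuntimeException` instead of returning the old data.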
Use cache() as a regular pass-through handler when the upstream pipeline should still be part of the flow.
Pulp::start()
    ->pipe(Pulp::src('*.geojson', __DIR__ . '/input'))
    ->pipe(PulpCache::cache(__DIR__ . '/cache', [
        'ttl' => 3600,
    ]))
    ->pipe(Pulp::dest(__DIR__ . '/result'))
    ->run();

- key (string): Stable cache key. Defaults to 'remember' for remember() and the current file name for cache().
- ttl (int): Cache lifetime in seconds. Defaults to 86400.
  - ttl < 0: Cache never expires.
  - ttl = 0: Always refresh/write the cache.
- fallbackToStale (bool): For remember(), return stale cached files if the wrapped source fails. Defaults to true.
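The ttl rules listed above can be read as a small freshness check. The helper below is a minimal sketch, not the package's implementation: the name `isFreshSketch()`, its signature, and the choice to treat an entry that is exactly ttl seconds old as expired are all assumptions.

```php
<?php
declare(strict_types=1);

// Hypothetical helper illustrating the documented ttl rules.
function isFreshSketch(int $ttl, int $writtenAt, int $now): bool
{
    if ($ttl < 0) {
        return true;   // negative ttl: the entry never expires
    }
    if ($ttl === 0) {
        return false;  // zero ttl: always refresh and rewrite the cache
    }
    // Assumption: an entry exactly $ttl seconds old counts as expired.
    return ($now - $writtenAt) < $ttl;
}

$writtenAt = 1700000000; // arbitrary example timestamp

$never   = isFreshSketch(-1, $writtenAt, $writtenAt + 10000000); // true
$always  = isFreshSketch(0, $writtenAt, $writtenAt);             // false
$within  = isFreshSketch(3600, $writtenAt, $writtenAt + 1800);   // true
$expired = isFreshSketch(3600, $writtenAt, $writtenAt + 7200);   // false
```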
Each cache key gets its own directory:
cache/
  example-data/
    manifest.json
    0000.zip
    0001.json
manifest.json stores the original Pulp file names, payload encoding, and cache file paths.
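To make the role of the manifest concrete, here is a rough sketch of how such a file could map original names to cache files. The structure is purely illustrative: the field names (`files`, `name`, `encoding`, `path`), the encoding labels, and the file name `meta.json` are assumptions, and the real manifest.json may be laid out differently.

```php
<?php
declare(strict_types=1);

// Hypothetical manifest layout; the package's actual field names may differ.
$manifest = [
    'files' => [
        ['name' => 'data.zip',  'encoding' => 'raw',        'path' => '0000.zip'],
        ['name' => 'meta.json', 'encoding' => 'serialized', 'path' => '0001.json'],
    ],
];

// Writing: encode the manifest as JSON, as it might be stored on disk.
$json = json_encode($manifest, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);

// Restoring: decode the manifest and map each original name to its cache file.
$decoded = json_decode($json, true);
$byName  = array_column($decoded['files'], 'path', 'name');
```

Keeping the original names in the manifest rather than in the cache file names is what lets the numbered files (0000.zip, 0001.json) stay independent of whatever the upstream files are called.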
- Keep cache keys stable and independent of secrets.
- Do not use a public web directory as the cache directory if cached input contains private URLs, credentials, or paid data.
- remember() is best for source/subpipeline caching.
- cache() is best for caching an intermediate file in an already-running pipeline.