
Implement cross-file persistent chunk cache #118

Draft
Copilot wants to merge 3 commits into master from copilot/implement-cross-file-chunk-cache

Conversation

Copilot AI commented Apr 17, 2026

Contributor

Add a client-level, disk-backed chunk cache keyed by (xorbHash, absoluteChunkIndex) that is reused across all download methods and survives process restarts. No LRU or capacity limit.

Cache layer (download/persistent_cache.go)

  • ChunkCache stores decoded chunks as files under <dir>/<xorbHash>/<chunkIndex>
  • Atomic writes (temp + rename) for safe concurrent/cross-process access
  • Get, Put, HasRange API
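
The layout and atomic-write behaviour listed above can be sketched roughly as follows. This is not the code from download/persistent_cache.go: the type name diskChunkCache, the string-typed xorbHash, the half-open [start, end) range in HasRange, and the exact method signatures are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// diskChunkCache is a hypothetical stand-in for the PR's ChunkCache.
type diskChunkCache struct{ dir string }

// chunkPath maps a key to <dir>/<xorbHash>/<chunkIndex>.
func (c *diskChunkCache) chunkPath(xorbHash string, idx uint32) string {
	return filepath.Join(c.dir, xorbHash, strconv.FormatUint(uint64(idx), 10))
}

// Get returns the cached chunk; the bool is false when it is missing or unreadable.
func (c *diskChunkCache) Get(xorbHash string, idx uint32) ([]byte, bool) {
	data, err := os.ReadFile(c.chunkPath(xorbHash, idx))
	if err != nil {
		return nil, false
	}
	return data, true
}

// Put writes to a temp file in the destination directory and then renames it,
// so readers never observe a partially written chunk.
func (c *diskChunkCache) Put(xorbHash string, idx uint32, data []byte) error {
	final := c.chunkPath(xorbHash, idx)
	if err := os.MkdirAll(filepath.Dir(final), 0o755); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(final), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; fails harmlessly after a rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), final)
}

// HasRange reports whether every chunk in [start, end) is present (assumed half-open).
func (c *diskChunkCache) HasRange(xorbHash string, start, end uint32) bool {
	for i := start; i < end; i++ {
		if _, err := os.Stat(c.chunkPath(xorbHash, i)); err != nil {
			return false
		}
	}
	return true
}

// demo stores two chunks in a throwaway directory and queries them back.
func demo() (bool, bool, string, error) {
	dir, err := os.MkdirTemp("", "chunk-cache-demo")
	if err != nil {
		return false, false, "", err
	}
	defer os.RemoveAll(dir)
	cache := &diskChunkCache{dir: dir}
	for i, s := range []string{"aa", "bb"} {
		if err := cache.Put("abc123", uint32(i), []byte(s)); err != nil {
			return false, false, "", err
		}
	}
	data, _ := cache.Get("abc123", 1)
	return cache.HasRange("abc123", 0, 2), cache.HasRange("abc123", 0, 3), string(data), nil
}

func main() {
	full, partial, got, err := demo()
	fmt.Println(full, partial, got, err)
}
```

Creating the temp file in the same directory as the final path keeps the rename on one filesystem, which is what makes the temp + rename pattern safe for concurrent writers.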

Prefetcher integration (download/prefetcher.go)

  • Each prefetchEntry now tracks absolute chunk range (chunkStart/chunkEnd) derived from termFetches
  • Hit path: tryLoadFromCache checks HasRange; on hit, creates a chunkCache backed by persistentRead — no network fetch, no decoding
  • Miss path: saveToCache writes decoded chunks after LoadAll() in both single-range and multipart (stream) paths
  • Best-effort writes; failures don't affect downloads
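
The hit and miss paths described above might look roughly like the sketch below. It is a simplified model, not the prefetcher code: memStore is an in-memory stand-in for the persistent cache, loadChunks and its fetch callback are hypothetical names, and the chunk range is assumed to be half-open.

```go
package main

import "fmt"

// memStore is a hypothetical in-memory stand-in for the persistent cache.
type memStore map[string][]byte

func storeKey(xorbHash string, idx uint32) string {
	return fmt.Sprintf("%s/%d", xorbHash, idx)
}

func (m memStore) Get(xorbHash string, idx uint32) ([]byte, bool) {
	data, ok := m[storeKey(xorbHash, idx)]
	return data, ok
}

func (m memStore) Put(xorbHash string, idx uint32, data []byte) error {
	m[storeKey(xorbHash, idx)] = data
	return nil
}

func (m memStore) HasRange(xorbHash string, start, end uint32) bool {
	for i := start; i < end; i++ {
		if _, ok := m[storeKey(xorbHash, i)]; !ok {
			return false
		}
	}
	return true
}

// loadChunks returns the chunks for [start, end) and whether they came from the cache.
// fetch stands in for the network download plus decoding step.
func loadChunks(store memStore, xorbHash string, start, end uint32,
	fetch func() ([][]byte, error)) ([][]byte, bool, error) {
	// Hit path: the whole range is cached, so skip the fetch entirely.
	if store != nil && store.HasRange(xorbHash, start, end) {
		chunks := make([][]byte, 0, end-start)
		for i := start; i < end; i++ {
			data, ok := store.Get(xorbHash, i)
			if !ok {
				chunks = nil // entry vanished; fall back to the miss path
				break
			}
			chunks = append(chunks, data)
		}
		if chunks != nil {
			return chunks, true, nil
		}
	}
	// Miss path: fetch and decode as usual, then populate the cache.
	chunks, err := fetch()
	if err != nil {
		return nil, false, err
	}
	if store != nil {
		for i, data := range chunks {
			_ = store.Put(xorbHash, start+uint32(i), data) // best effort: errors are ignored
		}
	}
	return chunks, false, nil
}

// demo loads the same range twice and counts how often fetch runs.
func demo() (bool, bool, int, error) {
	store := memStore{}
	fetches := 0
	fetch := func() ([][]byte, error) {
		fetches++
		return [][]byte{[]byte("c0"), []byte("c1")}, nil
	}
	_, first, err := loadChunks(store, "abc123", 4, 6, fetch)
	if err != nil {
		return false, false, fetches, err
	}
	_, second, err := loadChunks(store, "abc123", 4, 6, fetch)
	return first, second, fetches, err
}

func main() {
	first, second, fetches, err := demo()
	fmt.Println(first, second, fetches, err)
}
```

Because the write-back ignores errors, a full disk or permission problem degrades to an uncached download rather than a failed one, matching the best-effort behaviour stated in the list.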

chunkCache extension (download/chunk_cache.go)

  • Added persistentRead func(idx uint32) ([]byte, error) — when set, Chunk() bypasses the decoder/temp-file store entirely
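
A minimal sketch of that bypass, assuming a much smaller chunkCache than the real one: only the persistentRead field and its signature come from the description above, while the decode field is a hypothetical placeholder for the decoder/temp-file store path.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkCache is a simplified stand-in for the type in download/chunk_cache.go.
type chunkCache struct {
	// persistentRead, when non-nil, serves chunks straight from the persistent cache.
	persistentRead func(idx uint32) ([]byte, error)
	// decode is a hypothetical placeholder for the decoder/temp-file store path.
	decode func(idx uint32) ([]byte, error)
}

// Chunk prefers persistentRead and only falls back to decode when it is unset.
func (c *chunkCache) Chunk(idx uint32) ([]byte, error) {
	if c.persistentRead != nil {
		return c.persistentRead(idx)
	}
	if c.decode == nil {
		return nil, errors.New("chunkCache: no chunk source configured")
	}
	return c.decode(idx)
}

// demo reads chunk 3 once through the decoder and once through persistentRead.
func demo() (string, string, error) {
	decoded := &chunkCache{
		decode: func(idx uint32) ([]byte, error) {
			return []byte(fmt.Sprintf("decoded-%d", idx)), nil
		},
	}
	persisted := &chunkCache{
		persistentRead: func(idx uint32) ([]byte, error) {
			return []byte(fmt.Sprintf("disk-%d", idx)), nil
		},
		decode: func(idx uint32) ([]byte, error) {
			return nil, errors.New("decoder must not run when persistentRead is set")
		},
	}
	a, err := decoded.Chunk(3)
	if err != nil {
		return "", "", err
	}
	b, err := persisted.Chunk(3)
	if err != nil {
		return "", "", err
	}
	return string(a), string(b), nil
}

func main() {
	a, b, err := demo()
	fmt.Println(a, b, err)
}
```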

Client wiring

  • client.WithChunkCache(*download.ChunkCache) option, plumbed through download.WithChunkCache() to all four download entry points (DownloadFile, DownloadFileV1, DownloadFileV2, DownloadFiles)
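
One way such plumbing is commonly done in Go is with functional options, sketched below. Everything here is illustrative: the two packages are collapsed into one file, WithDownloadChunkCache stands in for download.WithChunkCache(), the resolve helper is hypothetical, and the entry points merely return the cache they would use instead of downloading anything.

```go
package main

import "fmt"

// ChunkCache is a placeholder for download.ChunkCache.
type ChunkCache struct{ Dir string }

// downloadConfig and DownloadOption model the download-level options.
type downloadConfig struct{ cache *ChunkCache }
type DownloadOption func(*downloadConfig)

// WithDownloadChunkCache stands in for download.WithChunkCache().
func WithDownloadChunkCache(c *ChunkCache) DownloadOption {
	return func(cfg *downloadConfig) { cfg.cache = c }
}

// Client and ClientOption model the client-level options.
type Client struct {
	baseURL string
	cache   *ChunkCache
}
type ClientOption func(*Client)

func WithBaseURL(u string) ClientOption {
	return func(c *Client) { c.baseURL = u }
}

func WithChunkCache(cc *ChunkCache) ClientOption {
	return func(c *Client) { c.cache = cc }
}

func NewClient(opts ...ClientOption) *Client {
	c := &Client{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// resolve (hypothetical) builds the download configuration shared by all entry points.
func (c *Client) resolve(extra ...DownloadOption) downloadConfig {
	cfg := downloadConfig{}
	if c.cache != nil {
		WithDownloadChunkCache(c.cache)(&cfg)
	}
	for _, opt := range extra {
		opt(&cfg)
	}
	return cfg
}

// The entry points below only report which cache they would use.
func (c *Client) DownloadFile() *ChunkCache  { return c.resolve().cache }
func (c *Client) DownloadFiles() *ChunkCache { return c.resolve().cache }

// demo checks that both entry points see the client's cache, and that a
// client built without the option sees none.
func demo() (bool, bool) {
	cache := &ChunkCache{Dir: "/var/cache/xet/chunks"}
	withCache := NewClient(WithBaseURL("https://example.invalid"), WithChunkCache(cache))
	without := NewClient(WithBaseURL("https://example.invalid"))
	shared := withCache.DownloadFile() == cache && withCache.DownloadFiles() == cache
	none := without.DownloadFile() == nil
	return shared, none
}

func main() {
	shared, none := demo()
	fmt.Println(shared, none)
}
```

Routing every entry point through one helper is what keeps the cache consistent across download methods; a cache left unset simply leaves the download configuration empty.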

Usage

cache, err := download.NewChunkCache("/var/cache/xet/chunks")
if err != nil {
    log.Fatal(err)
}
c := client.NewClient(
    client.WithBaseURL("https://..."),
    client.WithChunkCache(cache),
)
// All subsequent downloads share and populate the same persistent cache.
r, size, err := c.DownloadFile(ctx, hash, nil)
if err != nil {
    log.Fatal(err)
}

Copilot AI linked an issue Apr 17, 2026 that may be closed by this pull request
Copilot AI and others added 2 commits April 17, 2026 10:29
Add a persistent, disk-backed chunk cache keyed by (xorbHash, absoluteChunkIndex)
that is owned by the client and shared across all download methods
(DownloadFile, DownloadFileV1, DownloadFileV2, DownloadFiles).

New types and options:
- download.ChunkCache: persistent cache backed by filesystem
- download.WithChunkCache(): download option
- client.WithChunkCache(): client option

Integration in prefetcher:
- On cache hit: skip network download and decoding entirely
- On cache miss: download and decode as normal, then write to cache
- Cache is recoverable after process restart (plain files on disk)
- No LRU or capacity limit

Agent-Logs-Url: https://github.com/wzshiming/xet/sessions/2519e980-2e44-447e-9f63-f4a696a5d0ae

Co-authored-by: wzshiming <6565744+wzshiming@users.noreply.github.com>
…x test edge case

Agent-Logs-Url: https://github.com/wzshiming/xet/sessions/2519e980-2e44-447e-9f63-f4a696a5d0ae

Co-authored-by: wzshiming <6565744+wzshiming@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement a cross-file chunk cache for reusable downloads" to "Implement cross-file persistent chunk cache" Apr 17, 2026
Copilot AI requested a review from wzshiming April 17, 2026 10:32

Development

Successfully merging this pull request may close these issues.

Implement a cross-file chunk cache

2 participants