Skip to content

Question about delta-based encoding missing in implementation #6

@Zedong-Liu

Description

@Zedong-Liu

I’m reading both the SIGCOMM’24 paper and the codebase. The paper (§5.2 and Fig. 6) describes a change-based / delta encoding strategy: grouping tokens and storing the KV of the first token in each group as an anchor, then encoding only the deltas relative to that anchor.

However, in the current implementation, I could not locate such delta computation or storage. It seems the KV tensors are directly quantized and compressed without this differential step.

Is delta-based encoding still part of the latest method?
Was it removed/changed for practical reasons (accuracy / complexity / GPU optimization)?
If removed — does the reported performance in the paper include delta encoding or reflect the current code?

If I missed where it’s implemented, could you please point me to the right place?

Thanks again for the excellent work and for open-sourcing CacheGen.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions