docs: sync README benchmark table to published paper (385-case) #4

Merged
WaylandYang merged 1 commit into main from dev on Jun 16, 2026
@WaylandYang

Contributor

Updates the public README ForgetEval-Adv table from stale v0.4
numbers (112-case) to the published 385-case v0.5.1 numbers
matching arXiv:2606.15903, so the repo and paper agree.

🤖 Generated with Claude Code

The adversarial table cited stale v0.4 numbers (112-case, Lethe+LLM
96.4%, $0.05) that disagreed with the now-public paper
(arXiv:2606.15903, 385-case v0.5.1). Updated to the canonical
in-house 385 reference:

  Lethe 244/385 (63.4%), Mem0 263/385 (68.3%),
  LangGraph 242/385 (62.9%), MemPalace 0/385,
  Lethe+LLM 353/385 (91.7%), LangGraph+LLM 359/385 (93.2%),
  cost ~$0.17.

Also reframes the deterministic cluster honestly (63-68% band,
overlapping Wilson CIs) and notes that the LLM-hook lift (about
+28pt on Lethe, +30pt on LangGraph) travels across backends,
matching the paper's control-plane-placement thesis. LongMemEval
table left as-is (its "raw" setup differs from the paper's
session-granularity appendix).
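
The "overlapping Wilson CIs" and hook-lift claims above can be checked directly from the pass counts in this commit message. The following is a minimal sketch, not the repository's evaluation code: it assumes a 95% Wilson score interval (z = 1.96, since the confidence level is not stated here), and the names `wilson_interval`, `overlaps`, `deterministic`, and `with_llm` are hypothetical.

```python
from math import sqrt

def wilson_interval(passed, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is an assumed 95% level)."""
    p = passed / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

def overlaps(a, b):
    """True if two closed intervals (lo, hi) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

N = 385  # ForgetEval-Adv case count quoted in the commit message

# Pass counts copied from the commit message above.
deterministic = {"Lethe": 244, "Mem0": 263, "LangGraph": 242}
with_llm = {"Lethe": 353, "LangGraph": 359}

intervals = {name: wilson_interval(k, N) for name, k in deterministic.items()}

# Percentage-point lift from adding the LLM hook to the same backend.
lifts = {name: 100 * (with_llm[name] - deterministic[name]) / N for name in with_llm}

for name, (lo, hi) in intervals.items():
    print(f"{name}: {deterministic[name] / N:.1%}  CI [{lo:.1%}, {hi:.1%}]")
for name, lift in lifts.items():
    print(f"{name} LLM-hook lift: {lift:+.1f} pt")
```

Under these assumptions the three deterministic intervals overlap pairwise, and the lift comes out to roughly +28 points for Lethe and +30 points for LangGraph.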

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@WaylandYang merged commit b6053b7 into main on Jun 16, 2026
5 checks passed