Fix/libs telemetry #229
Merged
…ache

Mirrors the Python benchmark harness with TS-native tooling:
- BetterDB adapter wrapping @betterdb/semantic-cache (bare/local/full/autotune modes)
- Upstash adapter wrapping @upstash/semantic-cache for competitive comparison
- HuggingFace dataset loaders (STSb, SICK, PAWS-Wiki, vCache LM Arena) with local JSONL caching
- Local embedding via @huggingface/transformers (bge-small-en-v1.5, all-MiniLM-L6-v2)
- F1/precision/recall/FPR metrics with latency percentiles
- snake_case JSON output compatible with Python harness report tools

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
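The F1/precision/recall/FPR metrics listed above follow from the four confusion counts of a hit/miss classifier. The sketch below shows one way to compute them. The type names, the `computeMetrics` function, the snake_case key names, and the choice to report 0 when a denominator is zero are all assumptions made for illustration, not the harness's actual code or schema.

```typescript
// Confusion counts, treating "cache hit" as the positive class.
interface ConfusionCounts {
  tp: number; // hit returned, and it was a correct match
  fp: number; // hit returned, but it was not a correct match
  fn: number; // miss returned, although a correct match existed
  tn: number; // miss returned, and no correct match existed
}

// Hypothetical snake_case report shape; the real schema is not shown in this PR.
interface MetricsReport {
  precision: number;
  recall: number;
  f1: number;
  false_positive_rate: number;
}

// Assumed convention: an undefined ratio (zero denominator) is reported as 0.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function computeMetrics(c: ConfusionCounts): MetricsReport {
  return {
    precision: ratio(c.tp, c.tp + c.fp),
    recall: ratio(c.tp, c.tp + c.fn),
    // Equivalent to 2PR / (P + R), written directly in terms of counts.
    f1: ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    false_positive_rate: ratio(c.fp, c.fp + c.tn),
  };
}
```

Writing F1 in terms of counts avoids a second division by zero when precision and recall are both 0.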
…ws than limit

The cache key was (dataset, config, split) with no limit component. A run with limit=500 would cache 500 rows, then a subsequent run with limit=5000 would get a cache hit and silently return only 500 rows.

Fix: if the cached file has fewer rows than the requested limit, treat it as stale and re-download.
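The staleness rule described in this commit can be sketched as follows. The helper names (`countJsonlRows`, `isCacheStale`, `resolveRows`) and the `download` callback are hypothetical stand-ins; the loader's real code is not part of this page. The sketch works on in-memory JSONL strings so it stays free of file-system details.

```typescript
// Count non-empty lines in a JSONL payload (one JSON object per line).
function countJsonlRows(content: string): number {
  return content.split("\n").filter((line) => line.trim().length > 0).length;
}

// A cached file is stale when it holds fewer rows than the caller asked for.
function isCacheStale(cachedRowCount: number, requestedLimit: number): boolean {
  return cachedRowCount < requestedLimit;
}

// Return cached content when it can satisfy the limit; otherwise re-download.
// `cachedContent` is null when nothing has been cached yet.
function resolveRows(
  cachedContent: string | null,
  limit: number,
  download: (limit: number) => string,
): string {
  if (cachedContent !== null && !isCacheStale(countJsonlRows(cachedContent), limit)) {
    return cachedContent;
  }
  return download(limit);
}
```

Under this rule a cache written with limit=500 still serves a later limit=500 or smaller request, but a limit=5000 request triggers a fresh download instead of silently returning 500 rows.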
…he_lmarena

String(undefined) produces the literal string "undefined", which is truthy and therefore slips past the !prompt check. Check for null/undefined before the String() conversion so rows with missing prompt fields are skipped.
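A minimal sketch of the ordering issue, assuming rows arrive as loosely typed objects with a `prompt` field. The `RawRow` type and both function names are illustrative, not taken from the loader.

```typescript
type RawRow = Record<string, unknown>;

// Buggy ordering: converts first, so a missing field becomes the string "undefined".
function extractPromptBuggy(row: RawRow): string | null {
  const prompt = String(row["prompt"]);
  if (!prompt) return null; // never fires for a missing field: "undefined" is truthy
  return prompt;
}

// Fixed ordering: reject null/undefined before calling String().
function extractPrompt(row: RawRow): string | null {
  const raw = row["prompt"];
  if (raw === null || raw === undefined) return null;
  const prompt = String(raw);
  return prompt ? prompt : null; // still skips empty strings
}
```

With the fixed version, a caller can filter rows on a `null` result, whereas the buggy version would pass a prompt of `"undefined"` into the benchmark.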
…mark-ts-harness

# Conflicts:
#	pnpm-lock.yaml
posthog was behind an optional `analytics` extra that no user ever installed, resulting in 0 telemetry despite increasing downloads. Making it a core dependency ensures the PostHog client is always available when the baked API key is present.
Summary
Changes
Checklist
`roborev review --branch` or `/roborev-review-branch` in Claude Code (internal)
Note
Low Risk
Changes are mostly additive tooling and dependency wiring; default installs gain PostHog unless users opt out via existing telemetry flags.
Overview
Adds a new `packages/cache-benchmark-ts` workspace package: a TypeScript semantic-cache benchmark CLI that mirrors the Python harness, with `betterdb` and `upstash` adapters, HuggingFace dataset loaders, threshold sweeps, and snake_case JSON output for cross-tool compatibility.

The BetterDB adapter exercises `@betterdb/semantic-cache` in modes from bare thresholding through rerank, LLM judge, and Monitor API autotune; benchmarks disable package analytics during runs.

Python packaging: `posthog` moves from the optional `analytics` extra into core dependencies on `betterdb-agent-cache` and `betterdb-semantic-cache`, so telemetry works without installing extras. `pnpm-lock.yaml` picks up the new package and related transitive deps (e.g. transformers, Upstash).

Reviewed by Cursor Bugbot for commit e1b0c28.