feat(gen): batch LLM annotation with --annotation-batch flag #70

Merged
yuchou87 merged 3 commits into main from feat/batch-annotation
May 7, 2026

@yuchou87 yuchou87 commented May 6, 2026

Summary

  • Adds --annotation-batch N flag to caseforge gen
  • With N=10: 36 ops → 4 LLM calls, roughly 1–2 min (was ~9 min sequential)
  • Default 0 preserves existing sequential behavior (no breaking change)

How it works

Each batch sends N operations in one prompt. The LLM returns a JSON array whose entries are keyed by operation_id (not array index), so reordered or missing entries are handled safely, with no index-alignment bugs.
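To make the call-count arithmetic concrete, here is a minimal sketch of splitting operations into batches. The `chunk` helper and the `op1`…`op36` IDs are hypothetical and not taken from the PR; treating a size below 1 as one operation per batch is an assumption that only mirrors the call count of the sequential default.

```go
package main

import "fmt"

// chunk splits ids into consecutive batches of at most size elements.
// Hypothetical helper: a size below 1 is treated as 1 (one op per batch),
// which matches the call count of sequential mode, not its code path.
func chunk(ids []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var batches [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

func main() {
	// 36 operations with a batch size of 10, as in the summary.
	ids := make([]string, 36)
	for i := range ids {
		ids[i] = fmt.Sprintf("op%d", i+1)
	}
	batches := chunk(ids, 10)
	fmt.Println(len(batches), len(batches[3])) // prints "4 6"
}
```

The last batch is simply shorter (6 ops here), so no padding or special casing is needed.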

Failure modes (all graceful):

  • Partial response: ops with no matching operation_id get no annotation, generation continues
  • Malformed JSON: all ops in that batch skip annotation, warn logged, generation continues
  • Annotation is best-effort by design — failure never blocks case generation
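The failure modes above can be sketched as a small parser plus key-based lookup. This is an illustration under assumptions, not the PR's code: `batchEntry`, its `annotation` field, `parseBatch`, and the operation IDs are hypothetical; only the `operation_id` key and the described behavior (nil map on invalid JSON, entries without an ID dropped) come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// batchEntry is a hypothetical shape for one element of the LLM's JSON array.
type batchEntry struct {
	OperationID string `json:"operation_id"`
	Annotation  string `json:"annotation"` // assumed field name
}

// parseBatch returns annotations keyed by operation_id.
// Invalid JSON yields a nil map; entries without an operation_id are dropped.
func parseBatch(raw string) map[string]string {
	var entries []batchEntry
	if err := json.Unmarshal([]byte(raw), &entries); err != nil {
		return nil
	}
	out := make(map[string]string, len(entries))
	for _, e := range entries {
		if e.OperationID == "" {
			continue
		}
		out[e.OperationID] = e.Annotation
	}
	return out
}

func main() {
	// Reordered response: "listUsers" is missing and one entry has no ID.
	raw := `[{"operation_id":"getUser","annotation":"read"},` +
		`{"annotation":"orphan"},` +
		`{"operation_id":"createUser","annotation":"write"}]`
	got := parseBatch(raw)
	for _, id := range []string{"createUser", "listUsers", "getUser"} {
		if a, ok := got[id]; ok {
			fmt.Printf("%s: %s\n", id, a)
		} else {
			fmt.Printf("%s: no annotation\n", id)
		}
	}
	// Reading from a nil map is safe in Go, so a malformed batch
	// simply annotates nothing.
	fmt.Println(parseBatch("not json") == nil) // prints "true"
}
```

Because lookups go through the map rather than array position, a reordered or short response cannot shift annotations onto the wrong operation.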

Test plan

  • go test ./internal/methodology/ -run "TestEngine_Batch|TestParseBatch" — 7 new tests pass
  • ./scripts/acceptance.sh — AT-252 passes, pre-existing failures unchanged
  • Manual: caseforge gen --spec examples/speculo-api.yaml --annotation-batch 10 — completes in ~1–2 min vs ~9 min

yuchou87 added 3 commits May 6, 2026 21:58
Add configurable batch annotation to reduce LLM round-trips during the
semantic annotation pre-pass.

Previously: 36 operations × ~15s per call = ~9 minutes sequential.
With --annotation-batch=10: 4 batches × ~20s per call ≈ 1–2 minutes.

Design:
- --annotation-batch N (default 0 = sequential, one call per op)
- Each batch sends N operations in one prompt; LLM returns a JSON array
  keyed by operation_id (not array index) for reliable matching
- Partial failure: if a batch returns fewer entries than requested, ops
  without a matching operation_id get no annotation — generation continues
- Batch JSON error: ops in that batch get no annotation, warn logged
- Engine.SetAnnotationBatch(n) for programmatic control

Reliability:
- operation_id-keyed responses survive LLM reordering or omissions
- parseBatchAnnotations silently drops entries missing operation_id
- Invalid JSON returns nil map → all ops in batch get no annotation
- No cascading retry per op (annotation is best-effort by design)

Tests:
- TestEngine_BatchAnnotation_EmitsOneEventPerOp: TUI progress unaffected
- TestEngine_BatchAnnotation_AnnotatesOpsCorrectly: key-based matching
- TestEngine_BatchAnnotation_BatchFailureIsGraceful: invalid JSON safe
- TestEngine_BatchAnnotation_SplitsIntoBatches: 5 ops / batch=3 → 2 calls
- TestParseBatchAnnotations_*: unit tests for the parser
- AT-252: --annotation-batch flag registered in gen --help
- Fix off-by-one: annotationBatch >= 1 now dispatches batch path
  (previously batch=1 fell through to sequential mode silently)
- Add 200ms inter-batch throttle to reduce rate-limit pressure
- Cap MaxTokens at min(256*n, 8192) — stays within provider output limits
- Include op.Description in batch prompt alongside Summary for richer LLM signal
- Remove unused n variable from batchLLMProvider.Complete in tests
- Strengthen AT-252: also verify --annotation-batch flag runs gen to completion
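
For reference, the token cap min(256*n, 8192) from the commit notes above works out as follows. This is a sketch only; `maxTokensFor` is a hypothetical name, not a function from the PR.

```go
package main

import "fmt"

// maxTokensFor returns the output-token budget for a batch of n operations:
// 256 tokens per operation, capped at 8192. Hypothetical helper name.
func maxTokensFor(n int) int {
	budget := 256 * n
	if budget > 8192 {
		return 8192
	}
	return budget
}

func main() {
	for _, n := range []int{1, 10, 32, 50} {
		fmt.Println(n, maxTokensFor(n))
	}
	// prints:
	// 1 256
	// 10 2560
	// 32 8192
	// 50 8192
}
```

Under this formula the cap takes effect from n = 32 upward, so larger batches get fewer than 256 output tokens per operation.
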
@yuchou87 yuchou87 merged commit 1531e7c into main May 7, 2026
1 check passed
@yuchou87 yuchou87 deleted the feat/batch-annotation branch May 7, 2026 13:14