AiGenePublicationCommentService implementation #301

Draft
bobular wants to merge 32 commits into master from feature-ai-user-comments

bobular commented May 13, 2026

bobular commented Jun 1, 2026

Oops the commit I just made should have been "Deliverable 0" - an LLM wouldn't have made that mistake!


⏸ Pausing here for review — as the plan directs

"Module builds clean; the three endpoints return 501. Pause here for user review of the shape before any pipeline logic lands."

Deliverable 0 (scaffolding) is complete and verified. Both modules compile clean, every new class is on disk and emits bytecode, and the service is registered. No pipeline logic has landed — every method body is either a 501 response (the three endpoints) or throw new UnsupportedOperationException("… deliverable N"), each tagged with the deliverable that will fill it in.

What I'd like you to eyeball before I start Deliverable 1

The shape is in these files (all under services/ai/ + the Model comment/pojo & comment/repo packages). Worth a look:

  • AiGenePublicationCommentService — the JAX-RS surface: path /user-comments/ai-gene-publication, extends AbstractUserCommentService (so it inherits fetchUser()/getCommentFactory()).
  • JobState / JobStatus — the wire-contract value types (stages enum, terminal-status enum, follower list).
  • migration_comment_b22.sql — the DDL going to Mustafa/Steve.

Two decisions I want your call on

  1. AiGenePublicationRequest DTO — I added this POST-body POJO; it wasn't in the plan's file table but the contract implies it. OK to keep?

  2. JobRegistry as a process-wide singleton — simplest for v1 (one bounded pool + one eviction scheduler per JVM). Fine, or would you rather it hang off the WdkModel/application lifecycle?

Note on deeper verification

Plan verification #2/#3 (501 over HTTP, 404 for unknown id, 401 for guests) need a deployed instance and only become meaningful once Deliverable 1 adds real routing/auth behavior. If you want, I can deploy and smoke-test the 501s now — otherwise I'd fold those into the Deliverable 1 review.

Want me to proceed to Deliverable 1 (sync prelude), or adjust the scaffolding first? Nothing is committed to git yet either — say the word if you'd like a checkpoint commit before moving on.

bobular commented Jun 2, 2026

It wasn't clear how synonyms and the jobId were wired from SyncPrelude to the main job/pipeline. Hopefully it's clearer now with the latest commit.


What changed

| File | Change |
| --- | --- |
| JobSubmission.java (new) | Immutable carrier of the resolved inputs. Built from (request, jobId, synonyms, modelName, promptVersion, optionsJson); derives sourceKind/pubmedId/uploadedPaperText/etc. from the request so call sites stay small. |
| JobState.java | Dropped the per-follower Submitter-with-full-request. Now holds one JobSubmission (shared) + List<Long> followerUserIds + a userId→commentId map filled at persist. getJobId() delegates to the submission. |
| AiGenePublicationPipeline.java | Constructor is now (JobState, WdkModel); reads inputs via _job.getSubmission(). Added the transient stage-output fields (_articleText, _mentionCounts, _summaryJson, _validatedJson, _aiHeadline, _aiContent) so the ①→⑥ threading is visible. |
| JobRegistry.java | submit(JobSubmission, long userId, Function<JobState,Runnable> pipelineFactory) and attach(String jobId, long userId). The factory resolves the construction order (pipeline needs the JobState the registry creates). |

So, to directly answer your question — synonyms and jobId now flow like this:

SyncPrelude.resolveSynonyms()/computeJobId()
   └─► new JobSubmission(request, jobId, synonyms, model, promptVersion, optionsJson)   [immutable]
         └─► JobRegistry.submit(submission, userId, jobState -> new Pipeline(jobState, wdkModel))
               └─► JobState holds submission; Pipeline reads jobState.getSubmission().getSynonyms()
                     └─► stages thread article text / scan counts / summary via instance fields
                           └─► persist loops followerUserIds → createComment, recordComment(userId, commentId)
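
The flow above can be sketched in plain Java roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: the `*Sketch` names are hypothetical, the submission is trimmed to three fields, a plain `java.util.concurrent.Executor` stands in for the registry's bounded pool, and re-submitting an existing jobId is assumed to attach the caller as a follower.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Immutable carrier of the resolved inputs (trimmed to three fields; hypothetical name).
record JobSubmissionSketch(String jobId, String geneId, List<String> synonyms) {
  JobSubmissionSketch {
    synonyms = List.copyOf(synonyms); // defensive copy keeps the carrier immutable
  }
}

// Shared per-job state: one submission, any number of followers.
final class JobStateSketch {
  private final JobSubmissionSketch submission;
  private final List<Long> followerUserIds = new CopyOnWriteArrayList<>();

  JobStateSketch(JobSubmissionSketch submission) { this.submission = submission; }

  JobSubmissionSketch getSubmission() { return submission; }
  String getJobId() { return submission.jobId(); } // delegates to the submission
  List<Long> getFollowerUserIds() { return followerUserIds; }
  void addFollower(long userId) { followerUserIds.add(userId); }
}

final class JobRegistrySketch {
  private final Map<String, JobStateSketch> jobs = new ConcurrentHashMap<>();
  private final Executor pool; // assumption: stands in for the bounded pool

  JobRegistrySketch(Executor pool) { this.pool = pool; }

  // The registry creates the JobState first, then asks the factory for the
  // pipeline, because the pipeline's constructor needs that JobState.
  JobStateSketch submit(JobSubmissionSketch submission, long userId,
                        Function<JobStateSketch, Runnable> pipelineFactory) {
    JobStateSketch fresh = new JobStateSketch(submission);
    JobStateSketch existing = jobs.putIfAbsent(submission.jobId(), fresh);
    if (existing != null) {       // assumed behaviour: same jobId means follow it
      existing.addFollower(userId);
      return existing;
    }
    fresh.addFollower(userId);
    pool.execute(pipelineFactory.apply(fresh));
    return fresh;
  }

  // Returns false when the job is unknown (for example, already evicted).
  boolean attach(String jobId, long userId) {
    JobStateSketch state = jobs.get(jobId);
    if (state == null) return false;
    state.addFollower(userId);
    return true;
  }
}
```

Passing the factory instead of a ready-made pipeline keeps a single owner (the registry) for JobState creation while still letting the caller decide which pipeline class to build.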

bobular and others added 13 commits June 2, 2026 13:57
Port _count_substrings + aliases_mentioned_in_paper from helpers.py into
GeneMentionScanner (countSubstrings + namesMentioned), and wire the
scanning-gene-mentions pipeline stage:

- countSubstrings: letters+digits optional-hyphen tolerance (Nd6 <-> Nd-6),
  separator-run collapse to [-_\s]+ (PF3D7_1133400 <-> PF3D7-1133400),
  non-alphanumeric boundary anchoring, case-insensitive, literal escaping.
- namesMentioned: gene id first iff mentioned (underscore->hyphen fallback),
  then aliases by count desc / name asc, capped at top-3 THEN de-duped
  case-insensitively (Python order; no 4th-alias backfill).
- scanGeneMentions stores the names list for the D4 prompt, or marks the job
  terminal gene-not-mentioned with synonyms_checked = [geneId, ...synonyms].
- TerminalResult.geneNotMentioned renders synonyms_checked (sibling_summary
  aggregate deferred to D6/D7 pending DB tables).

22 new unit tests (20 scanner + 2 pipeline stage); full Service build green,
53 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
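
For reviewers, a minimal sketch of how the countSubstrings rules listed above could map onto a `java.util.regex` pattern. It is one reading of the commit message, not the actual GeneMentionScanner: `MentionCountSketch` and `toPattern` are hypothetical names, and only the hyphen-free name matching hyphenated text (Nd6 matching Nd-6) is handled, not the reverse.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MentionCountSketch {
  private MentionCountSketch() {}

  // Builds a case-insensitive pattern for one gene name or alias.
  static Pattern toPattern(String name) {
    StringBuilder re = new StringBuilder("(?<![A-Za-z0-9])"); // left boundary
    int prevKind = 0; // 0 = start/separator, 1 = letter, 2 = digit, 3 = other
    int i = 0;
    while (i < name.length()) {
      char c = name.charAt(i);
      if (isSeparator(c)) {
        // Collapse a whole run of separators into one tolerant class.
        while (i < name.length() && isSeparator(name.charAt(i))) i++;
        re.append("[-_\\s]+");
        prevKind = 0;
        continue;
      }
      int kind = Character.isLetter(c) ? 1 : Character.isDigit(c) ? 2 : 3;
      // Allow an optional hyphen at letter/digit transitions (Nd6 vs Nd-6).
      if ((prevKind == 1 && kind == 2) || (prevKind == 2 && kind == 1)) {
        re.append("-?");
      }
      re.append(Pattern.quote(String.valueOf(c))); // literal escaping
      prevKind = kind;
      i++;
    }
    re.append("(?![A-Za-z0-9])"); // right boundary
    return Pattern.compile(re.toString(), Pattern.CASE_INSENSITIVE);
  }

  private static boolean isSeparator(char c) {
    return c == '-' || c == '_' || Character.isWhitespace(c);
  }

  // Counts boundary-anchored occurrences of name in text.
  static int countSubstrings(String name, String text) {
    if (name.trim().isEmpty()) return 0; // an empty pattern would match everywhere
    Matcher m = toPattern(name).matcher(text);
    int count = 0;
    while (m.find()) count++;
    return count;
  }
}
```

The lookbehind/lookahead pair is what keeps an alias such as AMA1 from being counted inside a longer token such as XAMA1.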
Port the getGeneSummary LLM stage from the Python pipeline:

- Prompt resources ported into ai/prompts/getGeneSummary/{system,user,schema}.
  Faithful-port quirks reproduced verbatim (Python string-concat typos
  "geneor", "measurements   - Be", "FORMAT COMPLIANCEStructure") so output
  matches what the prompts were tuned on — flagged upstream.
- PromptLoader: classpath load + cache, split user.txt into turns on blank
  lines BEFORE substitution, naive [PH] String.replace (port of
  get_prompt_and_replace). 6 tests.
- AnthropicJsonClient (implements JsonPromptClient): extractJson fence-strip +
  parse (port of extract_json), formatter-retry loop MAX_RETRY=3 (port of the
  STEP_1 loop, formatter system prompt verbatim incl. its typo), and a real
  LlmCompleter mirroring ClaudeSummarizer (AnthropicOkHttpClientAsync, temp 0,
  max_tokens 20000, system + per-turn user messages + "{" assistant prefill,
  prefill prepended). 8 tests via injected completer; the network path is not
  live-tested.
- Pipeline generateSummary: builds [N_QUOTES]/[GENE]/[PAPER_TEXT]/[JSON_SCHEMA],
  stores _summaryJson, and on only_in_passing=true short-circuits to terminal
  mentioned-in-passing (persisted; synonyms_checked). geneForPrompt ported from
  gene_for_prompt. TerminalResult.mentionedInPassing rendering. 6 tests.

Full Service build green, 74 tests pass (+21 new).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
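
A small sketch of two of the ideas above: splitting the user template into turns before placeholder substitution, and stripping a Markdown fence from a model reply. Names (`PromptTurnsSketch`, `buildTurns`, `stripJsonFence`) are hypothetical and the real PromptLoader / AnthropicJsonClient may differ; JSON parsing and the retry loop are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PromptTurnsSketch {
  // Three backticks, built at runtime so this listing contains no literal fence.
  static final String FENCE = "`".repeat(3);

  private PromptTurnsSketch() {}

  // Split on blank lines FIRST, then substitute. Substituted values (such as
  // the paper text) often contain blank lines and must not create extra turns.
  static List<String> buildTurns(String userTemplate, Map<String, String> placeholders) {
    List<String> turns = new ArrayList<>();
    for (String rawTurn : userTemplate.split("\\R(?:[ \\t]*\\R)+")) {
      if (rawTurn.trim().isEmpty()) continue;
      String filled = rawTurn;
      for (Map.Entry<String, String> e : placeholders.entrySet()) {
        filled = filled.replace(e.getKey(), e.getValue()); // naive literal replace
      }
      turns.add(filled);
    }
    return turns;
  }

  // Removes a leading fence line (with optional language tag) and the trailing fence.
  static String stripJsonFence(String reply) {
    String s = reply.trim();
    if (s.startsWith(FENCE)) {
      int firstNewline = s.indexOf('\n');
      int lastFence = s.lastIndexOf(FENCE);
      if (firstNewline >= 0 && lastFence > firstNewline) {
        s = s.substring(firstNewline + 1, lastFence).trim();
      }
    }
    return s;
  }
}
```

If substitution ran before the split, a two-paragraph [PAPER_TEXT] would turn one user turn into two, which is presumably why the ordering is called out in the commit message.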
Gather the scattered per-deliverable quirk notes into a single "Port quirks &
deviations" section: (A) upstream Python bugs reproduced verbatim with an
explicit not-yet-reported checklist, (B) deliberate Java deviations — now incl.
the previously-unlogged skipped Anthropic prompt caching, [JSON_SCHEMA]
stringify formatting, and dropped trailing whitespace — and (C) subtle
faithful-match decisions. Add a coordination-item pointer to report the
upstream bugs to the Python team.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the Python authors, the verifyGeneSummary second pass didn't materially
improve results and wasn't worth the tokens — remove it from the back-end
entirely. D5 collapses from "validation + flatten" to flatten-only.

Validation removal:
- drop `validate` from Options (canonical-options JSON now
  {generate_product_description}; JobDigestTest updated)
- drop JobStatus.VALIDATION_ERROR, JobState.Stage.VALIDATING, the run()
  validate block, validateSummary(), _validatedJson
- renumber pipeline stage markers (flatten ⑤→④, persist ⑥→⑤)
- delete the verifyGeneSummary prompt resource dir
- drop validate/validating/validation-error/errors from post-request.json
  and status-response.json
- update CLAUDE-ai-user-comments.md throughout (context, diagram, contract,
  stage table, order, decisions, verification, quirks, reference facts)

Flatten-to-comment (TDD):
- flattenHeadline = ShortSummary; flattenContent ports the structure of
  Python build_extended_summary_html to plain-text markdown (- bullets with
  indented Evidence:/> quote lines, optional Additional inferences section,
  Aliases mentioned line). flattenToComment stores _aiHeadline/_aiContent.
- 7 new tests (6 pure render + 1 wiring).

PD generation (deferred) will derive from the first-pass summary. FE plan is
intentionally left untouched — to be reconciled separately by the user.

Full Service build green, 81 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implement the pipeline's persisting stage (⑤): write one comment_ai_run
cache row per cacheable terminal (success / gene-not-mentioned /
mentioned-in-passing) and, on the success path, mark the job terminal with
the flattened ai_output.

- AiGenePublicationPipeline: implement persist() + buildRun(); add injectable
  AiRunStore seam (default = CommentFactoryManager...::persistAiRun).
- Restructure run() to skip-ahead instead of early-return, so the single tail
  persist() runs for the gene-not-mentioned / mentioned-in-passing
  short-circuits too. Fixes a latent bug: those terminals were never persisted.
- TerminalResult: add success(headline, content) + ai_output rendering,
  matching the cache-hit wire shape.
- InsertCommentAiRunQuery: bind all 16 columns; bind synonyms_used TEXT[] as a
  real PG array via con.createArrayOf, capturing the connection in a run()
  override (the ArgumentBatch seam has no connection).
- CommentFactory.persistAiRun(CommentAiRun), mirroring findAiRun.

86 ai-package tests pass (+6). SQL INSERT path not yet live-tested (deferred
to dev-server deploy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
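
To illustrate the skip-ahead restructuring of run() described above, here is a toy model. It is a sketch only: the class, the boolean inputs and the stage log are hypothetical stand-ins for the real stages, and error handling is omitted.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipAheadPipelineSketch {
  enum Terminal { SUCCESS, GENE_NOT_MENTIONED, MENTIONED_IN_PASSING }

  final List<String> log = new ArrayList<>(); // records which stages ran
  private final boolean geneMentioned;
  private final boolean onlyInPassing;
  private Terminal terminal; // null while the job is still running

  SkipAheadPipelineSketch(boolean geneMentioned, boolean onlyInPassing) {
    this.geneMentioned = geneMentioned;
    this.onlyInPassing = onlyInPassing;
  }

  Terminal run() {
    scan();
    if (terminal == null) summarize(); // skipped once a terminal is set
    if (terminal == null) flatten();
    persist(); // single tail: also runs for the short-circuit terminals
    return terminal;
  }

  private void scan() {
    log.add("scan");
    if (!geneMentioned) terminal = Terminal.GENE_NOT_MENTIONED;
  }

  private void summarize() {
    log.add("summarize");
    if (onlyInPassing) terminal = Terminal.MENTIONED_IN_PASSING;
  }

  private void flatten() {
    log.add("flatten");
  }

  private void persist() {
    if (terminal == null) terminal = Terminal.SUCCESS; // success is marked here
    log.add("persist:" + terminal);
  }
}
```

With early returns in run(), the two short-circuit outcomes would leave before persist(), which is the latent bug the commit message mentions.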
…edited

Add POST /user-comments/ai-gene-publication/{job_id}/publish: creates the user
comment on approval from the cached run row plus the submitted (possibly edited)
headline/content.

- PublishRequest DTO ({headline, content} only; job_id in the path).
- AiGenePublicationCommentService.publish: 401 guests, 400 if blank,
  findAiRun -> 404 if no run row (non-publishable outcomes were never cached),
  buildPublishComment, createComment, 201 {comment_id}.
- buildPublishComment (pure, 5 tests): run row -> CommentRequest with gene
  Target{type="gene",id=geneId} + AiProvenance{runJobId,createdAt,isEdited}.
  is_edited = submitted text != run's ai_headline/ai_content (null original =>
  edited), matching the gene-not-mentioned / mentioned-in-passing cases.
- CommentRequest gains an aiProvenance field; CommentFactory.createComment
  inserts the comment_ai_provenance row in the same transaction before commit.
- Implement InsertCommentAiProvenanceQuery.getArguments
  (comment_id, run_job_id, is_edited, created_at).

91 ai-package tests pass (+5). SQL/tx not yet live-tested (dev-server deploy).
Organism left null (post-impl TODO); sibling_summary split out as D7b.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
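
The is_edited rule above is small enough to show in full. A minimal sketch, assuming exact string comparison with no trimming or normalisation (the real buildPublishComment may differ); `IsEditedSketch` is a hypothetical name.

```java
final class IsEditedSketch {
  private IsEditedSketch() {}

  // True when the submitted text differs from what the AI run produced.
  // A run with no AI original (for example gene-not-mentioned) counts as
  // edited, because any published text must have been written by the user.
  static boolean isEdited(String submittedHeadline, String submittedContent,
                          String aiHeadline, String aiContent) {
    if (aiHeadline == null || aiContent == null) {
      return true; // null original => edited
    }
    return !aiHeadline.equals(submittedHeadline)
        || !aiContent.equals(submittedContent);
  }
}
```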
bobular and others added 4 commits June 5, 2026 18:33
…onses

Anonymous counts over comment_ai_provenance rows for a run_job_id, surfaced so
the review form can show "N others have published this combination".

- SiblingSummary POJO (reviewed, edited, latestAt).
- GetSiblingSummaryQuery: COUNT(CASE WHEN [NOT] is_edited)/MAX(created_at)
  WHERE run_job_id=? (portable CASE, not PG-only FILTER); always returns one
  row, so no siblings -> (0, 0, null).
- CommentFactory.getSiblingSummary(runJobId).
- JobStatus.isPublishable() = the three persisted/publishable terminals.
- Service rendering (preludeJson/jobStateJson/cacheHitJson) now instance +
  throws; attaches sibling_summary to the cache-hit response (always) and the
  live terminal response (publishable terminals only). Pure static
  siblingSummaryJson -> {reviewed, edited, latest_at}; latest_at is ISO-8601
  UTC (Instant.toString()) or JSON null.

93 ai-package tests pass (+2). Aggregate SQL not yet live-tested (dev deploy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
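
A sketch of the aggregate and its JSON rendering as described above. Assumptions are flagged in the comments: the SQL text is a guess at the shape (no schema qualifier, and "reviewed" is assumed to mean rows with is_edited false), and the JSON is built by hand only to keep the example free of dependencies.

```java
import java.time.Instant;

final class SiblingSummarySketch {
  private SiblingSummarySketch() {}

  // Portable aggregate: COUNT ignores NULLs, so a CASE with no ELSE counts
  // only matching rows. An aggregate without GROUP BY always yields one row,
  // so "no siblings" comes back as (0, 0, NULL).
  // Assumption: column/table names follow the commit message; schema omitted.
  static final String SQL =
      "SELECT COUNT(CASE WHEN NOT is_edited THEN 1 END) AS reviewed, "
    + "COUNT(CASE WHEN is_edited THEN 1 END) AS edited, "
    + "MAX(created_at) AS latest_at "
    + "FROM comment_ai_provenance WHERE run_job_id = ?";

  // Renders {reviewed, edited, latest_at}; latest_at is ISO-8601 UTC or JSON null.
  static String toJson(long reviewed, long edited, Instant latestAt) {
    String latest = latestAt == null ? "null" : "\"" + latestAt + "\"";
    return "{\"reviewed\":" + reviewed
        + ",\"edited\":" + edited
        + ",\"latest_at\":" + latest + "}";
  }
}
```

`Instant.toString()` already produces the ISO-8601 UTC form, so no formatter is needed for latest_at.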
- JobRegistry.cancel(jobId): no-op if unknown/evicted/terminal; otherwise marks
  the job CANCELLED, then future.cancel(true) to interrupt in-flight work.
- JobState.markTerminal is now synchronized + first-terminal-wins (returns
  boolean), so the cancel sticks: the interrupted pipeline thread's
  catch(Throwable) -> internal-error is ignored once the job is terminal. A
  cancelled job is not persisted (persist() skips non-publishable terminals).
- DELETE /user-comments/ai-gene-publication/{job_id}: fetchUser (401 guests),
  cancel, 204. v1 cancels for all attached followers.

3 new tests (running job -> CANCELLED + future cancelled; unknown-job no-op;
finished job not overwritten by a late cancel); 96 ai-package tests pass.

Known limitation: the Anthropic .join() may not be promptly interruptible, so
an in-flight LLM call can keep running in the background; the job is already
cancelled and its result discarded/uncached.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
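
The first-terminal-wins behaviour above can be sketched as below. `TerminalLatchSketch` and its `Status` values are hypothetical and reduced to the minimum; the real JobState carries more state and a richer JobStatus.

```java
final class TerminalLatchSketch {
  enum Status { RUNNING, SUCCESS, CANCELLED, INTERNAL_ERROR }

  private Status status = Status.RUNNING;

  // Returns true only for the first caller; later terminal writes are ignored.
  // synchronized makes the check-then-set atomic across the request thread
  // (cancel) and the pipeline thread (success or error).
  synchronized boolean markTerminal(Status terminal) {
    if (terminal == Status.RUNNING) {
      throw new IllegalArgumentException("RUNNING is not a terminal status");
    }
    if (status != Status.RUNNING) {
      return false; // already terminal: first writer wins
    }
    status = terminal;
    return true;
  }

  synchronized Status getStatus() {
    return status;
  }
}
```

In the cancel path this means the registry marks CANCELLED first, and the interrupted pipeline thread's later attempt to record an internal error returns false and changes nothing.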
JobRegistry constructor now schedules sweepQuietly() on the injected
_evictor every EVICTION_PERIOD_SECONDS. sweep(now) is pure/time-injected:
removeIf jobs where JobState.isExpiredAt(now, TTL_MILLIS) (terminal-only;
running jobs never evicted). sweepQuietly wraps it with the wall clock +
try/catch so a failure can't kill the recurring schedule. Eviction only
frees memory -- durable comment_ai_run rows still satisfy late cache hits.

+4 tests (100 ai-package tests pass). All 9 deliverables code-complete.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
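
A minimal sketch of the time-injected sweep described above. The TTL value, the `Job` class and the field names are hypothetical; only the shape (pure sweep(now), terminal-only expiry, a quiet wrapper around the wall clock) follows the commit message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EvictionSketch {
  static final long TTL_MILLIS = 60_000L; // hypothetical value

  static final class Job {
    private final Long terminalAtMillis; // null while the job is running

    Job(Long terminalAtMillis) { this.terminalAtMillis = terminalAtMillis; }

    // Running jobs never expire; terminal jobs expire once the TTL has elapsed.
    boolean isExpiredAt(long nowMillis, long ttlMillis) {
      return terminalAtMillis != null && nowMillis - terminalAtMillis >= ttlMillis;
    }
  }

  final Map<String, Job> jobs = new ConcurrentHashMap<>();

  // Pure with respect to time: the caller supplies "now", so tests need no clock.
  int sweep(long nowMillis) {
    int before = jobs.size();
    jobs.values().removeIf(job -> job.isExpiredAt(nowMillis, TTL_MILLIS));
    return before - jobs.size();
  }

  // Wrapper for the scheduler: an exception escaping a task passed to
  // scheduleAtFixedRate suppresses all later runs, so failures are swallowed.
  void sweepQuietly() {
    try {
      sweep(System.currentTimeMillis());
    } catch (RuntimeException e) {
      System.err.println("eviction sweep failed: " + e);
    }
  }
}
```

Keeping the clock out of sweep(now) is what lets the unit tests assert eviction behaviour with fixed timestamps instead of sleeping.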