Persuasion-master scoring: deterministic TRIBE×research synthesis, ranked top moves, DeepSeek V4 refiner #18
Analysis:
- Blend the band-clamped LLM semantic score into the final score with a confidence- and prediction-quality-weighted contribution, so persona and channel fit genuinely move the score while staying anchored to the TRIBE neural calibration band (PITCHCHECK_SEMANTIC_BLEND_WEIGHT, default 0.45).
- Add a temporal segment map that ties each TRIBE trace segment to the approximate text span of the pitch, with strongest/weakest callouts, so analysis and rewrites localize to exact sentences.
- Inject per-platform channel norms (email, LinkedIn, cold call, landing page, ad copy) into the analysis prompt and judge structure against them.
- Add a semantic analysis protocol (persona decision model, argument quality, persuasion route, channel fit, CTA friction) and a structured context_fit block (pain alignment, objection coverage, proof credibility, CTA ease, channel fit, decision driver, open objection), validated defensively and surfaced in desktop and web report panels.

Refinement:
- Upgrade the refine prompt with channel norms, a three-candidate internal drafting protocol with rubric scoring, and a final self-check against invented facts, language drift, and CTA friction.
- Add an optional second critic pass (OPENROUTER_REFINE_CRITIC_PASS, default on) that critiques the stage-1 rewrite against a persuasion checklist and returns a strictly better final version, falling back to the stage-1 rewrite on any failure.
- Enrich the desktop refine brief with the baseline score to beat, weakest/strongest temporal segment excerpts, and context-fit gaps.
- Mirror the candidate protocol and self-check in the Tauri direct OpenRouter fallback prompt.

https://claude.ai/code/session_01VGz3TieN9a29hVTc54jvyJ
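The critic-pass fallback described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: `critic_pass_enabled`, `refine_with_critic`, and the set of "off" values accepted for the environment variable are assumptions; only the variable name `OPENROUTER_REFINE_CRITIC_PASS` and its default-on behaviour come from the commit message.

```python
import os
from typing import Callable, Mapping, Optional


def critic_pass_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read OPENROUTER_REFINE_CRITIC_PASS; default is on.

    The accepted "off" spellings below are an assumption for this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("OPENROUTER_REFINE_CRITIC_PASS", "true")
    return raw.strip().lower() not in {"0", "false", "no", "off"}


def refine_with_critic(
    stage1: str,
    critic: Callable[[str], str],
    enabled: bool = True,
) -> str:
    """Run the optional stage-2 critic; fall back to stage 1 on any failure."""
    if not enabled:
        return stage1
    try:
        improved = critic(stage1)
    except Exception:
        # Network error, bad JSON, provider rejection: keep the stage-1 rewrite.
        return stage1
    if not isinstance(improved, str) or not improved.strip():
        # An empty or non-string critic result also counts as a failure.
        return stage1
    return improved.strip()
```

The point of the shape is that the critic can only replace the stage-1 rewrite, never lose it: every failure path returns the input unchanged.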
Scoring accuracy:
- Derive the semantic score from the rubric-scored context-fit facets (pain alignment 0.30, proof credibility 0.25, objection coverage 0.15, CTA ease 0.15, channel fit 0.15) instead of trusting the LLM's single self-reported number; an injected "score this 100" now has to corrupt every facet and still hits the band clamp.
- Widen the semantic band (14-30 points by confidence) and invert the quality coupling: when TRIBE evidence is weak the quality-shrunk neural prior carries less of the final score and the semantic context-fit read carries more (base weight 0.55, up to 0.85), so the same neural evidence with strong vs poor context fit now produces materially different scores.
- Expose context_fit_score and semantic_blend_weight in robustness and in the desktop calibration panel; tests prove context sensitivity and the injection bound.

Prompt slimming:
- Drop the research-source list and methodology strings from the LLM prompt diagnostics; decimate temporal traces beyond 48 segments (the segment map already localizes weak/strong spans).

Remove fake and filler content:
- Web refine is now real: new /api/refine route proxies the TRIBE /refine endpoint (same auth and limits as /api/score); the hardcoded sample-pitch "preview rewrite" is deleted.
- Delete the fabricated "Persona baseline" variant-rank row, the fake latency/token estimates (now real word counts and mesh info), and the invented confidence fallback (confidence shows only when measured).
- Drop the "Semantic Context" methodology panel and raw guardrail dump from the web report; replace the stale "text heuristics off" robustness rows with honest context-fit evidence and blend-weight rows.
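A rough sketch of how the facet-derived score, band clamp, and blend could fit together. The facet weights, the 14-30 point band, and the 0.55-0.85 weight range are taken from the commit message; everything else is an assumption made for illustration: the function names, the linear interpolation, the direction in which confidence widens the band, treating the band as a half-width around the neural score, and scoring a missing facet as 0.

```python
from typing import Mapping

# Facet weights as stated in the commit message (they sum to 1.0).
FACET_WEIGHTS = {
    "pain_alignment": 0.30,
    "proof_credibility": 0.25,
    "objection_coverage": 0.15,
    "cta_ease": 0.15,
    "channel_fit": 0.15,
}


def _clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def semantic_score(facets: Mapping[str, float]) -> float:
    """Weighted mean of 0-100 facet scores (missing facet = 0, an assumption)."""
    total = sum(
        w * _clamp(float(facets.get(name, 0.0)), 0.0, 100.0)
        for name, w in FACET_WEIGHTS.items()
    )
    return _clamp(total, 0.0, 100.0)


def band_half_width(confidence: float) -> float:
    """Assumed: band grows linearly from 14 to 30 points with confidence."""
    return 14.0 + 16.0 * _clamp(confidence, 0.0, 1.0)


def blend_weight(quality: float) -> float:
    """Assumed: semantic weight is 0.55 at full TRIBE quality, 0.85 at none."""
    return 0.55 + 0.30 * (1.0 - _clamp(quality, 0.0, 1.0))


def final_score(neural: float, facets: Mapping[str, float],
                confidence: float, quality: float) -> float:
    half = band_half_width(confidence)
    # The semantic read may not leave the band around the neural score.
    clamped = _clamp(semantic_score(facets), neural - half, neural + half)
    w = blend_weight(quality)
    return round((1.0 - w) * neural + w * clamped, 1)
```

Under these assumptions, a response that reports 100 on every facet against a neural score of 50 is clamped to 80 before blending, so the final score cannot exceed the top of the band.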
- Default refiner model is now deepseek/deepseek-v4-pro via OpenRouter (verified current flagship id; evaluator stays Claude Sonnet); desktop, Tauri fallback chains, service env, and settings UI all updated, with a model-suggestion datalist (DeepSeek V4 Pro/Flash, Claude Sonnet/Opus).
- Robust handling for reasoning models everywhere: <think>/<reasoning> blocks are stripped before JSON parsing and in plain-text fallbacks (Python service and Rust direct-OpenRouter path), and system prompts forbid visible chain-of-thought.
- Optional OPENROUTER_REASONING_EFFORT (high/xhigh for DeepSeek V4) is forwarded as reasoning.effort and dropped automatically when a provider rejects it; JSON-mode fallback retained.
- Model-aware sampling: DeepSeek rewrites at temperature 0.7 (its scale runs flat at low temps), critic at 0.25; other models keep 0.35/0.2.
- Tests: think-block parsing on both analysis and refine paths, reasoning effort forwarding, DeepSeek temperature, Rust strip/clean unit test.
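For reference, stripping reasoning blocks before JSON parsing might look roughly like the sketch below. The helper names `strip_reasoning` and `parse_model_json` are hypothetical, and the "first `{` to last `}`" extraction is an assumption of this sketch rather than the service's confirmed behaviour. Unclosed blocks are not handled here.

```python
import json
import re

# Matches <think>...</think> and <reasoning>...</reasoning>, across newlines.
_REASONING_RE = re.compile(
    r"<(think|reasoning)\b[^>]*>.*?</\1\s*>",
    re.DOTALL | re.IGNORECASE,
)


def strip_reasoning(text: str) -> str:
    """Remove visible chain-of-thought blocks from a model response."""
    return _REASONING_RE.sub("", text).strip()


def parse_model_json(text: str) -> dict:
    """Strip reasoning blocks, then parse the outermost JSON object."""
    cleaned = strip_reasoning(text)
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    return json.loads(cleaned[start:end + 1])
```

Stripping first matters because a reasoning block can itself contain braces or draft JSON, which would otherwise confuse the object extraction.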
- Embed a ten-rule persuasion doctrine (reader-first openers, specificity over adjectives, earn-the-ask, objection pre-emption, honest proof hierarchy, one-message-one-idea, status-safe CTAs, lead with strength) into the evaluator system prompt, the refine prompt, the critic prompt, and the Tauri fallback prompt, so every judgment and rewrite is held to the same expert bar.
- Evaluator now returns ranked "top_moves" (1-3 highest-leverage changes, each with paste-ready guidance and a plain-language reason); validated defensively, generated from the weakest evidence in the neural-only fallback (Turkish + English), exposed through schema/API/types, and fed to the refiner at the top of the repair brief.
- All user-facing strings (verdict, narrative, strengths, risks, moves) are now required to be plain decisive language with the neuroscience jargon kept in the structured diagnostics.
- Declutter the desktop report: score + verdict + narrative + top moves + auto-refine up front; writing facets, context fit, neural signals, robustness, and variant rank fold into a collapsible "Deep dive" section. Web report gains the same Top Moves panel.
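A minimal sketch of what "validated defensively" could mean for `top_moves`. The field names `move`, `guidance`, and `reason` are hypothetical placeholders (the commit only says each move carries paste-ready guidance and a plain-language reason), and `normalise_top_moves` is not a name taken from the codebase.

```python
from typing import Any, Dict, List


def _text(value: Any) -> str:
    """Accept only strings; anything else becomes an empty string."""
    return value.strip() if isinstance(value, str) else ""


def normalise_top_moves(raw: Any, limit: int = 3) -> List[Dict[str, str]]:
    """Keep at most `limit` well-formed moves, preserving the model's ranking."""
    if not isinstance(raw, list):
        return []
    moves: List[Dict[str, str]] = []
    for item in raw:
        if not isinstance(item, dict):
            continue
        move = _text(item.get("move"))  # hypothetical field name
        if not move:
            continue  # a move without its headline change is unusable
        moves.append({
            "move": move,
            "guidance": _text(item.get("guidance")),  # hypothetical field name
            "reason": _text(item.get("reason")),      # hypothetical field name
        })
        if len(moves) == limit:
            break
    return moves
```

Returning an empty list rather than raising lets the caller decide whether to fall back to moves generated from the neural evidence.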
- Add an evidence annex to the evaluator, refiner, and critic prompts with
applied findings from the literature: self-relevance and neural message
effectiveness (Falk et al. 2010/2016; Scholz, Chan & Falk 2025), route
matching (Petty & Cacioppo ELM), loss aversion and framing with
regulatory fit (Tversky & Kahneman; Higgins), similar-other social proof
(Goldstein, Cialdini & Griskevicius 2008), reactance and the
but-you-are-free effect (Carpenter 2013 meta-analysis), processing
fluency (Alter & Oppenheimer), precise-number credibility (Janiszewski &
Uy 2008), psychological targeting (Matz et al. 2017), and the commitment
gradient (Freedman & Fraser 1966).
- The rewrite process now explicitly selects the persuasion route
(argument-led vs cue-led) and frame (gain vs avoided-loss) for the
persona before drafting; the critic checks route, frame, and reactance
triggers; the analysis protocol gains a framing-fit step.
- Each top move now carries the research principle it rests on
("principle" field end-to-end: prompt schema, validation, neural
fallback in Turkish and English, Pydantic/TS types, both UIs).
- Add the behavioral-science citations to the report's research-source
metadata; tighten the critic JSON template; drop two unused locals.
- New tribe_service/research_synthesis.py links each pitch's TRIBE geometry to published findings without any LLM: weak/strong self-value maps to the Falk et al. / Scholz-Chan-Falk behavior-change lever, low processing fluency to Alter & Oppenheimer, weak encoding/attention to Chan et al. 2024, and reward-vs-social axis dominance to Cohen et al. 2024 with a route hint (reward_led / social_led / balanced).
- The temporal trace is classified into citation-anchored archetypes (flat, strong-open-fade, late peak, buried lede, sustained), each with a concrete rewrite lever.
- The synthesis feeds three places: a "Neural × Research Synthesis" section in the evaluator prompt (with the instruction that top moves should normally execute its strongest levers), the report robustness payload rendered as a deep-dive panel (FIX/KEEP/USE), and the desktop refine brief as "Research lever" lines.
- Add Alter & Oppenheimer 2009 to the research-source metadata; new unit tests cover gap ordering, route dominance, every archetype, and garbage-input safety (108 Python tests total).
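The route hint can be illustrated with a small deterministic rule. This is only a sketch: the three labels come from the commit message, but the function name `route_hint`, the 0-1 axis scale, and the 0.1 dominance margin are assumptions, not values from research_synthesis.py.

```python
import math
from typing import Any


def route_hint(reward: Any, social: Any, margin: float = 0.1) -> str:
    """Classify reward-vs-social axis dominance.

    `margin` is an assumed threshold; unusable input falls back to "balanced"
    so garbage scores never raise.
    """
    try:
        r, s = float(reward), float(social)
    except (TypeError, ValueError):
        return "balanced"
    if not (math.isfinite(r) and math.isfinite(s)):
        return "balanced"
    diff = r - s
    if diff > margin:
        return "reward_led"
    if diff < -margin:
        return "social_led"
    return "balanced"
```

Because the rule is a pure function of two numbers, it can be unit-tested exhaustively, which is the main advantage of moving this step out of the LLM.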
Previously the LLM had to find the weak spans itself from a raw segment list. Now a deterministic layer extracts that from the TRIBE output and hands it over pre-digested.
- localize_pitch_segments() maps the per-segment temporal trace back onto the actual words to pinpoint the opener, the strongest moment, the single weakest span, the close/CTA strength (as percentiles of the pitch's own distribution), and the "attention cliff" (the adjacent segment pair with the largest engagement drop), each tied to real text.
- Ground synthesis findings in raw TRIBE features: low sustain_ratio adds a citation-anchored "engagement holds for only N% of the pitch" gap.
- build_tribe_synthesis() is the shared entry point feeding the evaluator prompt (a directive "Deterministic Segment Localization — trust these spans" block, instructing the LLM to localize to them rather than re-derive), the report deep dive (a "Where TRIBE reacts" panel showing PEAK/WEAK/DROP spans on the user's own text), and the refine brief (the rewrite targets the located weakest span and attention cliff first).
- The neural-only fallback now anchors its first top move to the real weakest span too (Turkish + English), so TRIBE is used fully even without an LLM.
- 18 new synthesis/localization unit tests; 114 Python tests total.
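To make the localization idea concrete, here is a small sketch of the three pieces named above: mapping segments onto words, finding the attention cliff, and ranking a segment within the pitch's own distribution. It is not the code of localize_pitch_segments(); the helper names and the proportional word split are assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def split_spans(text: str, n_segments: int) -> List[str]:
    """Assign words to segments proportionally by position (an assumption)."""
    words = text.split()
    return [
        " ".join(words[i * len(words) // n_segments:
                       (i + 1) * len(words) // n_segments])
        for i in range(n_segments)
    ]


def attention_cliff(trace: Sequence[float]) -> Optional[Tuple[int, float]]:
    """Return (index, drop) for the largest fall between adjacent segments.

    `index` is the segment before the fall; None if engagement never drops.
    """
    best: Optional[Tuple[int, float]] = None
    for i in range(len(trace) - 1):
        drop = trace[i] - trace[i + 1]
        if drop > 0 and (best is None or drop > best[1]):
            best = (i, drop)
    return best


def percentile_rank(trace: Sequence[float], index: int) -> float:
    """Percent of segments at or below trace[index], within this pitch only."""
    return 100.0 * sum(v <= trace[index] for v in trace) / len(trace)


def weakest_span(text: str, trace: Sequence[float]) -> str:
    """Text of the single lowest-engagement segment."""
    spans = split_spans(text, len(trace))
    return spans[min(range(len(trace)), key=lambda i: trace[i])]
```

For example, with the trace `[0.6, 0.9, 0.3, 0.5]` the cliff sits between the second and third segments, and the third segment is also the weakest span.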
Caution: Review failed. Pull request was closed or merged during review.
Walkthrough
This PR adds a deterministic TRIBE research synthesis layer that converts neural axis scores and fMRI trace data into citation-anchored evidence objects. It introduces blended neural+semantic calibration via configurable
Changes
Research Synthesis, Scoring Blend & Refine Pipeline
Sequence Diagram(s)
sequenceDiagram
participant Frontend as Next.js / Desktop
participant RefineRoute as /api/refine route
participant TribeClient as lib/tribe-client refinePitch
participant TribeService as tribe_service refine_pitch_message
participant ResearchSynth as research_synthesis.build_tribe_synthesis
participant OpenRouter as OpenRouter API
Frontend->>RefineRoute: POST /api/refine (message, persona, platform, model)
RefineRoute->>RefineRoute: Auth check + input validation
RefineRoute->>TribeClient: refinePitch(request)
TribeClient->>TribeService: POST /refine
TribeService->>ResearchSynth: build_tribe_synthesis(message, neuro_axes, fmri)
ResearchSynth-->>TribeService: localization + research items
TribeService->>OpenRouter: _post_refine_chat stage-1 (REFINE_SYSTEM_PROMPT + doctrine)
OpenRouter-->>TribeService: refined_message JSON (strips think blocks)
alt OPENROUTER_REFINE_CRITIC_PASS=true
TribeService->>OpenRouter: _run_refine_critic_pass stage-2
OpenRouter-->>TribeService: improved rewrite or error → fallback stage-1
end
TribeService-->>TribeClient: PitchRefineResponse (refined_message, critic_notes)
TribeClient-->>RefineRoute: {ok, data, status}
RefineRoute-->>Frontend: 200 refined_message or error
sequenceDiagram
participant Frontend as Next.js / Desktop
participant TribeService as tribe_service interpret_persuasion
participant ResearchSynth as research_synthesis
participant OpenRouter as OpenRouter API (evaluator)
Frontend->>TribeService: POST /score (message, persona, platform, fmri_summary)
TribeService->>ResearchSynth: build_tribe_synthesis(message, neuro_axes, fmri_summary)
ResearchSynth-->>TribeService: localization + items + temporal_archetype
TribeService->>OpenRouter: chat completion with localization section + JSON schema
OpenRouter-->>TribeService: JSON with score, top_moves, context_fit (strips think blocks)
TribeService->>TribeService: _normalise_context_fit → _semantic_score_from_context_fit
TribeService->>TribeService: _calibrate_result (band-clamp + SEMANTIC_BLEND_WEIGHT)
TribeService-->>Frontend: PitchScoreReport (top_moves, context_fit, robustness with blend metadata)
Estimated code review effort: 5 (Critical), ~120 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
What this does
Turns the report from a panel-and-jargon dump into a decisive persuasion-master verdict, and makes the score genuinely reflect whether this message will move this reader on this channel — while doing far more of the interpretive work deterministically from the TRIBE output instead of leaving it to the LLM.
Scoring accuracy
Deterministic TRIBE × research synthesis (less work left to the LLM)
sustain_ratio) linked to published findings (Falk et al., Alter & Oppenheimer, Cohen et al. 2024), with a reward-vs-social route hint.
Persuasion doctrine + evidence base
Report UX
/api/refine route).
DeepSeek V4 Pro
deepseek/deepseek-v4-pro via OpenRouter (verified current flagship), evaluator stays Claude Sonnet. First-class reasoning-model handling: <think> stripping, optional reasoning.effort, JSON-mode fallback, model-aware sampling temperatures.
Tests
tsc, and Next.js build all green. Rust desktop changes are string/prompt-only (CI verifies the cargo build).
Generated by Claude Code