Content safety hardening: pre-generation topic moderation for abusive and zero-gibberish entries #16

Description

@LynnColeArt

Why this is necessary

Automated abuse attempts are creating synthetic or offensive topic slugs and getting them close to publicly visible in production. This puts junk in public index surfaces, degrades trust, and can burn LLM tokens before post-write moderation has a chance to clean up.

Screenshot evidence

/home/lynn/Downloads/screenship-All-entries-Halupedia-2026-05-14T07-55-13.png

What we changed

1) Pre-generation content-policy gate

  • Added isTitleModerationApproved(...) in /api/page/:slug before any article generation call.
  • If a topic is not approved, the request is blocked up front with a refusal response.
  • User-facing refusal message: "I'm sorry Dave, I can't do that."
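
A minimal sketch of how the gate can sit in front of generation. The issue names isTitleModerationApproved(...) but does not show its signature or rules, so the stand-in check, the guardGeneration helper, the GateResult type, the slug-to-title conversion, and the 403 status are all assumptions made for illustration:

```typescript
// Hypothetical result type: either the generated value or a refusal.
type GateResult<T> =
  | { ok: true; value: T }
  | { ok: false; status: number; message: string };

const REFUSAL_MESSAGE = "I'm sorry Dave, I can't do that.";

// Stand-in policy check. The real rules are not shown in the issue;
// this version only rejects empty and all-zero titles.
function isTitleModerationApproved(title: string): boolean {
  const trimmed = title.trim();
  return trimmed.length > 0 && !/^0+$/.test(trimmed);
}

// Runs the gate first and only calls `generate` for approved titles,
// so a refused topic never reaches the article generator.
function guardGeneration<T>(
  slug: string,
  generate: (title: string) => T
): GateResult<T> {
  const title = slug.replace(/-/g, " "); // assumed slug-to-title mapping
  if (!isTitleModerationApproved(title)) {
    return { ok: false, status: 403, message: REFUSAL_MESSAGE }; // 403 is an assumption
  }
  return { ok: true, value: generate(title) };
}
```

In a handler for /api/page/:slug, `generate` would wrap the LLM call (returning a promise is fine, since `T` is generic), and the refusal branch would be serialized into the response body.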

2) Deterministic gibberish / spam detection

  • Added isObviousGibberishTitle(...) with guards for:
    • very long zero strings,
    • long mostly-numeric titles,
    • repetitive scripted token patterns.
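
The three guards above could look roughly like the sketch below. The function name comes from the issue. Every threshold, the tokenization on spaces, hyphens, and underscores, and the treatment of empty titles are illustrative guesses, not the values used in the repository:

```typescript
// Illustrative thresholds (assumptions, not taken from the repository).
const MAX_ZERO_RUN = 12;               // zeros in a row before a title is flagged
const MIN_NUMERIC_LENGTH = 16;         // minimum length for the digit-ratio guard
const MAX_DIGIT_RATIO = 0.8;           // share of digits that counts as "mostly numeric"
const MIN_TOKENS_FOR_REPEAT_CHECK = 6; // short titles skip the repetition guard
const MAX_UNIQUE_TOKEN_RATIO = 0.34;   // unique/total tokens at or below this is repetitive

function isObviousGibberishTitle(title: string): boolean {
  const text = title.trim().toLowerCase();
  if (text.length === 0) return false; // empty titles assumed to be handled elsewhere

  // Guard 1: a very long run of zeros.
  if (new RegExp(`0{${MAX_ZERO_RUN},}`).test(text)) return true;

  // Guard 2: long titles that are mostly digits (separators ignored).
  const compact = text.replace(/[\s_-]+/g, "");
  const digitCount = compact.replace(/[^0-9]/g, "").length;
  if (
    compact.length >= MIN_NUMERIC_LENGTH &&
    digitCount / compact.length >= MAX_DIGIT_RATIO
  ) {
    return true;
  }

  // Guard 3: the same few tokens repeated many times.
  const tokens = text.split(/[\s_-]+/).filter((t) => t.length > 0);
  if (tokens.length >= MIN_TOKENS_FOR_REPEAT_CHECK) {
    const uniqueCount = new Set(tokens).size;
    if (uniqueCount / tokens.length <= MAX_UNIQUE_TOKEN_RATIO) return true;
  }

  return false;
}
```

Because every guard is a pure string check, the function is deterministic and costs no API calls, which is what lets it run before any generation request.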

3) Permanent blocklist hardening

  • Expanded src/worker/blocklist.ts to catch zero-heavy and numeric-pattern abuse before generation and before moderation logging.
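
For illustration, a pattern-based blocklist of this kind might be shaped as follows. The actual contents of src/worker/blocklist.ts are not shown in the issue, so the pattern list, the normalizeSlug helper, and the isPermanentlyBlocked name are hypothetical:

```typescript
// Hypothetical patterns; the real blocklist entries are not shown in the issue.
const PERMANENT_BLOCK_PATTERNS: readonly RegExp[] = [
  /^0+$/,                // slug made only of zeros
  /^[0-9-]{20,}$/,       // long slug made only of digits and hyphens
  /(?:^|-)0{8,}(?:-|$)/, // a long zero-only token anywhere in the slug
];

// Normalizes case and whitespace so trivial variations cannot bypass the list.
function normalizeSlug(slug: string): string {
  return slug.trim().toLowerCase();
}

// True when the slug matches any permanent block pattern.
function isPermanentlyBlocked(slug: string): boolean {
  const normalized = normalizeSlug(slug);
  return PERMANENT_BLOCK_PATTERNS.some((pattern) => pattern.test(normalized));
}
```

None of the patterns use the `g` flag, so `test` keeps no state between calls and the check stays deterministic.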

4) Index hardening

  • Updated /api/index filtering so permanently blocked slugs are excluded from live listings too.
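
The index-side filter can be sketched as a single pass over the stored entries. The IndexEntry shape and the filterLiveIndex name are assumptions, and the blocklist check is passed in as a predicate so the sketch does not depend on how the real blocklist is implemented:

```typescript
// Hypothetical shape of one row in the public index.
interface IndexEntry {
  slug: string;
  title: string;
}

// Returns only the entries whose slug is not permanently blocked.
// `isBlocked` stands in for the blocklist check in src/worker/blocklist.ts.
function filterLiveIndex(
  entries: readonly IndexEntry[],
  isBlocked: (slug: string) => boolean
): IndexEntry[] {
  return entries.filter((entry) => !isBlocked(entry.slug));
}
```

Filtering at read time means entries written before the blocklist was expanded also disappear from listings, without requiring a data migration.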

5) Friendly UI signal

  • Updated banned-topic presentation in src/client/App.tsx so blocked requests show the actual refusal message.

Why this is the right approach

This moves safety from "clean up after write" to "reject before write" for clearly abusive/nonsensical topics, reducing abuse surface, improving UX clarity, and saving API budget.

Related PR
