Content safety hardening: pre-generation topic moderation for abusive and zero-gibberish entries

## Why this is necessary

Automated abuse attempts are creating synthetic or offensive topic slugs and getting them close to publicly visible in production. This puts junk in public index surfaces, degrades trust, and can burn LLM tokens before post-write moderation has a chance to clean up.

## Screenshot evidence

`/home/lynn/Downloads/screenship-All-entries-Halupedia-2026-05-14T07-55-13.png`

## What we changed

### 1) Pre-generation content-policy gate

- Added `isTitleModerationApproved(...)` in `/api/page/:slug` before any article generation call.
- If a topic is not approved, the request is blocked up front with a refusal response.
- User-facing refusal message: `I'm sorry Dave, I can't do that.`
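
As a rough illustration of the ordering described above, the gate can be sketched as a wrapper that only calls the generator for approved topics. This is a minimal sketch, not the code from the PR: the source only names `isTitleModerationApproved(...)` and the refusal message, so the signature, the `Refusal` type, the `gatePageRequest` helper, and the placeholder policy rule are all assumptions.

```typescript
// Sketch only. The source names isTitleModerationApproved(...) and the refusal
// message; the signature, types, and policy rule below are assumptions.
const REFUSAL_MESSAGE = "I'm sorry Dave, I can't do that.";

interface Refusal {
  refused: true;
  message: string;
}

// Placeholder policy (assumption): reject empty titles and one sample term.
function isTitleModerationApproved(title: string): boolean {
  const normalized = title.trim().toLowerCase();
  return normalized.length > 0 && !normalized.includes("forbidden-term");
}

// Hypothetical helper: run the gate first, so `generate` is only invoked
// for approved topics and no generation cost is paid for rejected ones.
function gatePageRequest<T>(
  slug: string,
  generate: (slug: string) => T,
): T | Refusal {
  if (!isTitleModerationApproved(slug)) {
    return { refused: true, message: REFUSAL_MESSAGE };
  }
  return generate(slug);
}
```

Because the check happens before `generate` is called, a rejected slug never reaches the article generation step, which is the property the issue is after.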

### 2) Deterministic gibberish / spam detection

- Added `isObviousGibberishTitle(...)` with guards for:
  - very long zero strings,
  - long mostly-numeric titles,
  - repetitive scripted token patterns.
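
The three guards above could look roughly like the following. This is a sketch under stated assumptions: only the function name `isObviousGibberishTitle` comes from the source, while the thresholds, the tokenization, and the exact definition of "repetitive" are illustrative choices and may not match the PR.

```typescript
// Thresholds are illustrative assumptions, not values taken from the PR.
const MAX_ZERO_RUN = 8;         // a run of this many zeros counts as abuse
const MIN_NUMERIC_LENGTH = 12;  // only long titles are checked for digit ratio
const NUMERIC_RATIO = 0.7;      // share of digits that counts as "mostly numeric"
const MIN_TOKEN_REPEATS = 4;    // same token this many times counts as scripted

function isObviousGibberishTitle(title: string): boolean {
  const text = title.trim().toLowerCase();

  // Guard 1: very long runs of zeros.
  if (new RegExp("0{" + MAX_ZERO_RUN + ",}").test(text)) return true;

  // Guard 2: long titles made up mostly of digits (separators ignored).
  const compact = text.replace(/[\s\-_]/g, "");
  const digitCount = (compact.match(/\d/g) ?? []).length;
  if (
    compact.length >= MIN_NUMERIC_LENGTH &&
    digitCount / compact.length >= NUMERIC_RATIO
  ) {
    return true;
  }

  // Guard 3: the same token repeated many times.
  const tokens = text.split(/[\s\-_]+/).filter((t) => t.length > 0);
  const counts = new Map<string, number>();
  for (const token of tokens) {
    const next = (counts.get(token) ?? 0) + 1;
    if (next >= MIN_TOKEN_REPEATS) return true;
    counts.set(token, next);
  }

  return false;
}
```

All three checks are deterministic and cheap, so they can run before any model call. Ordinary titles that happen to contain numbers, such as `Apollo 11`, pass because they are short and not digit-dominated.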

### 3) Permanent blocklist hardening

- Expanded `src/worker/blocklist.ts` to catch zero-heavy and numeric-pattern abuse before generation and before moderation logging.

### 4) Index hardening

- Updated `/api/index` filtering so permanently blocked slugs are excluded from live listings too.
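
To make the blocklist and index changes concrete, here is one way the two could fit together. This is a hedged sketch: the real contents of `src/worker/blocklist.ts` are not shown in this issue, so `BLOCKED_SLUGS`, `BLOCKED_PATTERNS`, `isPermanentlyBlocked`, `IndexEntry`, and `filterIndexEntries` are hypothetical names, and the patterns are examples only.

```typescript
// Hypothetical blocklist shape; the real src/worker/blocklist.ts may differ.
const BLOCKED_SLUGS = new Set<string>(["example-banned-topic"]);

// Example patterns (assumptions): all-zero slugs and long all-digit slugs.
const BLOCKED_PATTERNS: RegExp[] = [/^0{8,}$/, /^\d{12,}$/];

function isPermanentlyBlocked(slug: string): boolean {
  const normalized = slug.trim().toLowerCase();
  return (
    BLOCKED_SLUGS.has(normalized) ||
    BLOCKED_PATTERNS.some((pattern) => pattern.test(normalized))
  );
}

// Hypothetical shape of one row returned by the index endpoint.
interface IndexEntry {
  slug: string;
  title: string;
}

// Drop permanently blocked slugs so entries written before the gate existed
// do not keep showing up in live listings.
function filterIndexEntries(entries: IndexEntry[]): IndexEntry[] {
  return entries.filter((entry) => !isPermanentlyBlocked(entry.slug));
}
```

Sharing one `isPermanentlyBlocked` predicate between the pre-generation path and the index filter keeps the two from drifting apart.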

### 5) Friendly UI signal

- Updated banned-topic presentation in `src/client/App.tsx` so blocked requests show the actual refusal message.

## Why this is the right approach

This moves safety from "clean up after write" to "reject before write" for clearly abusive/nonsensical topics, reducing abuse surface, improving UX clarity, and saving API budget.

## Related PR
- https://github.com/BaderBC/halupedia/pull/15

