Scheduling quality improvements: use stronger topic signals + reduce 'random feeling' plans #6

Description

@szverev

Problem

Scheduling currently produces plan items, but the output can feel arbitrary to users: it is not visibly grounded in the analyzed topics, and it may not match what they want to publish next. Scheduling should feel intentional, explainable, and clearly derived from analyzed topics and source articles.

Goal

Improve scheduling decisions and explainability using embedding/topic signals and explicit constraints.

Proposed approach

  • Add richer scoring features per topic:
    • topic article count
    • cluster coherence score
    • recency / momentum
    • diversity constraints (avoid many near-duplicate topics in one schedule batch)
  • Improve explainability:
    • include top linked article titles in the scheduling rationale
    • persist the scheduler output decision JSON
  • Add deterministic guardrails:
    • validate topic IDs
    • configurable number of plan items
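
The scoring and diversity ideas above could look roughly like the sketch below. Everything in it is an assumption for illustration: the `TopicStats` fields, the weighting inside `score`, the 14-day half-life, and the 0.85 cosine threshold are placeholders, not values taken from the current scheduler. Ties are broken on `topic_id` so the ranking does not depend on input order.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicStats:
    """Hypothetical per-topic features; field names are illustrative."""
    topic_id: str
    article_count: int
    coherence: float          # assumed to be normalized to [0, 1]
    days_since_last: float    # age of the newest linked article, in days
    embedding: tuple          # topic centroid embedding


def score(topic: TopicStats, half_life_days: float = 14.0) -> float:
    """Combine volume, coherence, and recency into one number (assumed weights)."""
    volume = math.log1p(topic.article_count)            # dampen very large topics
    recency = 0.5 ** (topic.days_since_last / half_life_days)
    return volume * (0.5 + 0.5 * topic.coherence) * (0.5 + 0.5 * recency)


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_topics(topics, max_items: int, max_similarity: float = 0.85):
    """Greedy selection: best score first, skipping near-duplicates of earlier picks."""
    # Sort by score descending, then topic_id, so ties resolve deterministically.
    ranked = sorted(topics, key=lambda t: (-score(t), t.topic_id))
    chosen = []
    for topic in ranked:
        if len(chosen) >= max_items:
            break
        if all(cosine(topic.embedding, c.embedding) <= max_similarity for c in chosen):
            chosen.append(topic)
    return chosen


topics = [
    TopicStats("a", 10, 0.9, 1.0, (1.0, 0.0)),
    TopicStats("b", 9, 0.9, 1.0, (0.99, 0.05)),   # near-duplicate of "a"
    TopicStats("c", 3, 0.5, 10.0, (0.0, 1.0)),
]
selected = pick_topics(topics, max_items=2)
```

In this example `b` scores higher than `c`, but it is skipped because its embedding is almost identical to `a`. A greedy filter is only one option; a weighted relevance/novelty trade-off would be a reasonable alternative if hard thresholds turn out to be too blunt.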

Acceptance criteria

  • Schedule output includes: topic label + 2-3 representative source titles
  • Reduced duplication across scheduled items
  • Re-running schedule with unchanged DB yields stable results (within reason)
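
As a rough illustration of these criteria, the sketch below builds a decision record with a topic label and up to three source titles per item, rejects unknown topic IDs, and serializes with sorted keys so that identical input yields byte-identical JSON. The record layout, the `build_decision` and `serialize_decision` names, and the assumption that titles arrive already ranked by representativeness are all hypothetical.

```python
import json


def build_decision(selected_ids, topics_by_id, max_titles=3):
    """Build an auditable decision record (hypothetical schema)."""
    items = []
    rejected = []
    for topic_id in selected_ids:
        topic = topics_by_id.get(topic_id)
        if topic is None:
            # Guardrail: never schedule a topic ID that is not in the DB.
            rejected.append(topic_id)
            continue
        items.append({
            "topic_id": topic_id,
            "topic_label": topic["label"],
            # Assumes titles are pre-ranked by representativeness.
            "source_titles": list(topic["titles"][:max_titles]),
        })
    return {"items": items, "rejected_topic_ids": rejected}


def serialize_decision(decision) -> str:
    """Stable JSON: sorted keys and fixed separators give identical output on re-runs."""
    return json.dumps(decision, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


topics_by_id = {
    "t1": {"label": "Example topic", "titles": ["Title A", "Title B", "Title C", "Title D"]},
}
decision = build_decision(["t1", "ghost"], topics_by_id)
payload = serialize_decision(decision)
```

Persisting `payload` alongside each schedule run would give an audit trail, and comparing payloads between runs is a cheap way to check the stability criterion.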

Tasks

  • Implement topic scoring + diversity constraints
  • Persist scheduler decisions for audit
  • Update CLI output