feat: filter .md filename false positives from tweet URL processing #25

Open
rymalia wants to merge 1 commit into alexknowshtml:main from rymalia:feat/md-filename-filter

rymalia commented Feb 8, 2026

Detect when Twitter auto-links bare filenames like CLAUDE.md or plan.md to Moldova's .md ccTLD and classify them as 'filename-reference' instead of fetching content from parked/unrelated domains (which leads to junk in the knowledge collection).

Detection logic: bare .md root domain that isn't obsidian.md (the only known legitimate .md domain in tech Twitter), has no subdomain, and has no meaningful path. Integrated as a branch in the existing URL type classification chain.
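
For readers of this thread, the check described above could look roughly like the following. This is a minimal TypeScript sketch under stated assumptions, not the code in this PR: only the name `isLikelyMdFilename`, the 'filename-reference' type, and the obsidian.md exception come from this conversation. The `LEGIT_MD_DOMAINS` constant, the `classifyUrl` function, the other type names, and the choice to treat a query string as a "meaningful path" are hypothetical.

```typescript
// Hypothetical allowlist. The PR description names obsidian.md as the only
// known legitimate .md domain; further entries could be added here.
const LEGIT_MD_DOMAINS: ReadonlySet<string> = new Set(["obsidian.md"]);

// Only "filename-reference" comes from the PR; the other names are placeholders.
type UrlType = "tweet" | "filename-reference" | "link";

// Pure check: true when the URL looks like an auto-linked filename
// (for example CLAUDE.md) rather than a real site on the .md ccTLD.
function isLikelyMdFilename(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is left to the other classifiers
  }

  const host = url.hostname.toLowerCase();
  if (!host.endsWith(".md")) return false;
  if (LEGIT_MD_DOMAINS.has(host)) return false;

  // Exactly two non-empty labels ("claude" + "md") means no subdomain.
  const labels = host.split(".");
  if (labels.length !== 2 || labels[0] === "") return false;

  // Assumption: any path beyond "/" or any query string counts as meaningful,
  // so only bare root domains are filtered.
  const hasPath = url.pathname !== "" && url.pathname !== "/";
  return !hasPath && url.search === "";
}

// Hypothetical classification chain showing where the new branch could sit.
function classifyUrl(rawUrl: string): UrlType {
  if (/^https?:\/\/(www\.)?(twitter|x)\.com\//i.test(rawUrl)) return "tweet";
  if (isLikelyMdFilename(rawUrl)) return "filename-reference";
  return "link";
}
```

In this sketch the filename branch runs before the generic fallback, so a URL such as `http://CLAUDE.md` is tagged 'filename-reference' and never fetched, while `https://obsidian.md` and anything with a subdomain or path falls through to normal handling.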

blu3dot left a comment

Banneker Architecture Review ✅

Clean, well-scoped fix:

  • Pure isLikelyMdFilename() function — testable, no side effects
  • Properly integrates into existing URL type classification chain
  • Allowlist pattern (obsidian.md) is extensible for future legitimate .md domains
  • Good test coverage of edge cases (subdomains, paths, query params)
  • Safe defaults: only filters bare root domains

Ready to merge.

2 participants