Modular, multi-source research pipeline with citation tier classification. A drop-in Claude Code skill plus a pip-installable Python CLI.
Status: v0.1.0 — alpha. The CLI flags and Python API may change before v1.0. Pin
polysearch==0.1.0 in production.
polysearch runs a single research topic through up to four parallel layers — vector search over a personal corpus, decomposed sub-question research (Perplexity), web grounding (Firecrawl), and community signal (last30days). It synthesizes the results with claims-level extraction, verifies the cited sources against scraped pages, and writes a tiered, auditable report.
It's modular by design: pick which providers you want, pay only for those, and swap implementations behind the same protocol. A user with only a Perplexity key gets a thin pipeline; a user with the full stack gets all four layers blended.
pip install polysearch
echo "PERPLEXITY_API_KEY=your_key_here" > .env
polysearch --topic "What is the current US federal funds rate?" --depth quick

Result: a markdown report in ./reports/ with citations from Perplexity. Cost: ~$0.10–$0.50.
To get the same thing as a Claude Code skill:
git clone https://github.com/milock/polysearch.git
cd polysearch && ./install.sh
# Then in Claude Code: "research the current federal funds rate"

                    ┌─────────────────────┐
                    │   polysearch CLI    │
                    └──────────┬──────────┘
                               │
                    ┌──────────┴──────────┐
                    │    Orchestrator     │
                    └──────────┬──────────┘
                               │
         ┌──────────┬──────────┼──────────┬──────────┐
         ▼          ▼          ▼          ▼          ▼
    ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐
    │ Vector │ │  Web   │ │ Decomp │ │ Comm.  │ │  Source  │
    │ Search │ │ Ground │ │ Sub-Q  │ │ Signal │ │ Tiering  │
    │(Qdrant)│ │(Firecr)│ │(Perplx)│ │ (L30D) │ │(YAML map)│
    └────┬───┘ └────┬───┘ └────┬───┘ └────┬───┘ └────┬─────┘
         └──────────┴──────┬───┴──────────┘          │
                           │                         │
                   ┌───────┴────────┐                │
                   │   Synthesis    │◄───────────────┘
                   │ (OpenAI/Claude)│
                   └────────┬───────┘
                            │
                   ┌────────┴───────┐
                   │    Citation    │
                   │  Verification  │
                   └────────┬───────┘
                            │
                   ┌────────┴───────┐
                   │ Report Writer  │
                   │  (md + json)   │
                   └────────────────┘
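
The fan-out from the orchestrator to the layers could look roughly like the sketch below. It assumes an asyncio-based orchestrator, which this README does not confirm; `run_layer` and `run_pipeline` are hypothetical names, and dropping layers that raise is an illustrative choice rather than documented behavior.

```python
import asyncio


async def run_layer(name: str, topic: str) -> dict:
    """Stand-in for one research layer; a real layer would call its provider."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return {"layer": name, "topic": topic, "findings": []}


async def run_pipeline(topic: str, layers: list[str]) -> list[dict]:
    """Run every layer concurrently and return results in input order."""
    results = await asyncio.gather(
        *(run_layer(name, topic) for name in layers),
        return_exceptions=True,
    )
    # Assumption: a layer that raised is skipped instead of failing the run.
    return [r for r in results if not isinstance(r, BaseException)]


if __name__ == "__main__":
    layers = ["vector", "web", "decomp", "community"]
    for result in asyncio.run(run_pipeline("federal funds rate", layers)):
        print(result["layer"], len(result["findings"]))
```

`asyncio.gather` preserves the order of its inputs, so the synthesis step can rely on a stable layer ordering regardless of which layer finishes first.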
Each layer has a Protocol (ResearchProvider, VectorStore, WebGrounder, Synthesizer, CitationVerifier). When credentials are missing, the pipeline substitutes null implementations rather than failing, so a tier downgrade is non-fatal.
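
A minimal sketch of the protocol-plus-null-implementation pattern described above, using `typing.Protocol`. Only the name `ResearchProvider` comes from this README; the `research` method signature, `NullResearchProvider`, `PerplexityProvider`, and `make_research_provider` are assumptions made for illustration.

```python
from typing import Protocol


class ResearchProvider(Protocol):
    """Protocol name is from the README; this method signature is assumed."""

    def research(self, topic: str) -> list[dict]: ...


class NullResearchProvider:
    """No-op stand-in used when credentials are missing."""

    def research(self, topic: str) -> list[dict]:
        return []


class PerplexityProvider:
    """Hypothetical concrete provider; the network call is omitted here."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def research(self, topic: str) -> list[dict]:
        raise NotImplementedError("network call omitted in this sketch")


def make_research_provider(env: dict[str, str]) -> ResearchProvider:
    """Return a real provider if the key is set, else the null implementation."""
    key = env.get("PERPLEXITY_API_KEY")
    return PerplexityProvider(key) if key else NullResearchProvider()
```

Because both classes satisfy the same structural protocol, the caller never branches on whether a layer is configured; an unconfigured layer simply contributes no findings.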
| Tier | What you need | What you get | Cost per query |
|---|---|---|---|
| 0 | PERPLEXITY_API_KEY | Decomposed sub-question research, no web grounding, no verification | ~$0.10–$0.50 |
| 1 | Tier 0 + FIRECRAWL_API_KEY + (OPENAI_API_KEY or ANTHROPIC_API_KEY) | 3 layers, synthesis, citation verification | ~$0.50–$2.00 |
| 2 | Tier 1 + QDRANT_URL + QDRANT_API_KEY + a personal corpus | Personal context blended into research | ~$0.50–$5.00 + Qdrant |
| 3 | Tier 2 + last30days-skill installed | Community signal layer (Reddit/X/HN/Polymarket) | + last30days API costs |
If a tier's credentials aren't present, that layer gets a null provider and the pipeline runs with what it has. You don't have to wire everything to start.
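
The tier table can be read as a cumulative checklist. The sketch below encodes that reading; `detect_tier` is a hypothetical helper, not part of the polysearch API, and the corpus and last30days checks are passed in as plain booleans because this README does not say how they are detected.

```python
from typing import Mapping, Optional


def detect_tier(
    env: Mapping[str, str],
    has_corpus: bool = False,
    has_last30days: bool = False,
) -> Optional[int]:
    """Return the highest tier whose requirements are all met, or None below tier 0."""
    if not env.get("PERPLEXITY_API_KEY"):
        return None
    has_llm_key = bool(env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY"))
    if not (env.get("FIRECRAWL_API_KEY") and has_llm_key):
        return 0
    if not (env.get("QDRANT_URL") and env.get("QDRANT_API_KEY") and has_corpus):
        return 1
    if not has_last30days:
        return 2
    return 3
```

Passing `os.environ` as `env` works because it is a mapping of strings. Note that the pipeline itself nulls out layers individually, so this number is only a convenient label for what a given environment unlocks.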
| Path | What it is |
|---|---|
| src/polysearch/ | The Python package. Provider abstractions, orchestrator, CLI. |
| src/polysearch/providers/ | Concrete implementations of each provider protocol. |
| config/ | Default domain_tiers.yaml (gov, academic, major news) plus annotated example. |
| skills/research/ | Drop-in for Claude Code at ~/.claude/skills/research/. |
| agents/ | Thin agent template — drop in if you want a slash-command-driven research agent. |
| examples/ | Quickstart scripts, tier setup walkthroughs, sample outputs. |
| tests/ | Unit, integration (mocked providers), and opt-in live tests. |
| docs/ | Architecture, providers, modes, citation tiers, cost modeling, migration. |
| install.sh | One-line installer for the Claude Code skill. |
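
To illustrate what a domain-to-tier map such as config/domain_tiers.yaml might drive, here is a small suffix-matching sketch. The schema of the real YAML file is not shown in this README, so the `DOMAIN_TIERS` entries, the tier labels, and `classify_source` are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical entries; the real schema of config/domain_tiers.yaml may differ.
DOMAIN_TIERS = {
    "gov": "government",
    "edu": "academic",
    "reuters.com": "major_news",
}
UNCLASSIFIED = "unclassified"


def classify_source(url: str) -> str:
    """Return the label of the first domain suffix that matches the URL's host."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, label in DOMAIN_TIERS.items():
        # Match the suffix exactly or as a dot-separated parent domain.
        if host == suffix or host.endswith("." + suffix):
            return label
    return UNCLASSIFIED
```

Matching on a leading dot keeps look-alike hosts out: `notreuters.com` does not end with `.reuters.com`, so it stays unclassified.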
polysearch --topic "..." [options]
| Flag | Default | Notes |
|---|---|---|
| --topic | required | Research subject |
| --depth | standard | quick (45–60s) / standard (90–180s) / deep (3–5m) |
| --output-dir | ./reports/ | Markdown + JSON written here |
| --providers | auto | Comma-list to override: perplexity,firecrawl,qdrant,community |
| --synthesizer | auto | openai / anthropic / none |
| --verify-budget | 5.00 | Max USD for citation verification scrapes |
| --no-verify | off | Skip citation verification entirely |
Auto-resolution: if both OPENAI_API_KEY and ANTHROPIC_API_KEY are set, OpenAI wins (cheaper). If only one is set, that one is used. If neither, synthesis is skipped and the report contains raw layer outputs with a note.
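
The auto-resolution rule above fits in a few lines. This is a sketch of the stated precedence only; `resolve_synthesizer` is a hypothetical name and the actual implementation may be organized differently.

```python
from typing import Mapping

VALID_SYNTHESIZERS = {"auto", "openai", "anthropic", "none"}


def resolve_synthesizer(env: Mapping[str, str], requested: str = "auto") -> str:
    """Pick a synthesizer: an explicit choice wins, then OpenAI, then Anthropic."""
    if requested not in VALID_SYNTHESIZERS:
        raise ValueError(f"unknown synthesizer: {requested!r}")
    if requested != "auto":
        return requested
    if env.get("OPENAI_API_KEY"):
        return "openai"  # preferred when both keys are set
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    return "none"  # synthesis skipped; report carries raw layer outputs
```

For example, `resolve_synthesizer({"ANTHROPIC_API_KEY": "a"})` returns `"anthropic"`, while setting both keys returns `"openai"`.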
Per query, by tier:
| Tier | Quick | Standard | Deep |
|---|---|---|---|
| 0 (Perplexity only) | ~$0.10 | ~$0.30 | ~$1.00 |
| 1 (+Firecrawl, synthesis) | ~$0.30 | ~$0.80 | ~$2.50 |
| 2 (+Qdrant) | ~$0.30 | ~$0.85 | ~$2.60 |
| 3 (+community signal) | ~$0.40 | ~$1.10 | ~$3.50 |
Citation verification adds ~$0.50–$5.00 depending on scrape volume. Use --verify-budget to cap.
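
As a worked example of the arithmetic, the sketch below combines the per-tier table with the verification range and the --verify-budget cap. The figures are copied from this section; `estimate_cost_range` is a hypothetical helper, and treating the budget as a hard ceiling on the ~$0.50–$5.00 range is an assumption about how the cap behaves.

```python
# Approximate base cost per query in USD, copied from the table above.
BASE_COST_USD = {
    0: {"quick": 0.10, "standard": 0.30, "deep": 1.00},
    1: {"quick": 0.30, "standard": 0.80, "deep": 2.50},
    2: {"quick": 0.30, "standard": 0.85, "deep": 2.60},
    3: {"quick": 0.40, "standard": 1.10, "deep": 3.50},
}
VERIFY_RANGE_USD = (0.50, 5.00)


def estimate_cost_range(
    tier: int, depth: str, verify: bool = True, verify_budget: float = 5.00
) -> tuple[float, float]:
    """Return a rough (low, high) cost estimate in USD for one query."""
    base = BASE_COST_USD[tier][depth]
    # Tier 0 has no citation verification, so nothing is added there.
    if not verify or tier == 0:
        return (base, base)
    low = base + min(VERIFY_RANGE_USD[0], verify_budget)
    high = base + min(VERIFY_RANGE_USD[1], verify_budget)
    return (round(low, 2), round(high, 2))
```

Under these assumptions a tier 1 standard query lands between roughly $1.30 and $5.80 with the default budget, and between $1.30 and $1.80 with `verify_budget=1.00`.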
This is not OpenAI's deep research, Perplexity-as-a-service, or gpt-researcher. The closest analogies and how polysearch differs:
- vs. gpt-researcher / autoresearch frameworks: polysearch ships a fixed, opinionated 4-layer architecture rather than an agentic loop. No tool-calling chain to debug; the orchestrator runs all layers in parallel and synthesis is one pass with a self-audit. More predictable, less flexible.
- vs. Perplexity directly: Perplexity is one of polysearch's layers. polysearch adds web grounding, citation verification against scraped pages, optional vector context, and a tiered report format. Use Perplexity directly if you want the conversational interface; use polysearch if you want a self-hosted, modular, batch-friendly pipeline.
- vs. OpenAI's deep research API: when that API is publicly available, polysearch will likely add it as an alternate ResearchProvider. Until then, Perplexity Sonar is the equivalent layer.
- vs. AI detector-evasion / paraphrase tools: unrelated. polysearch is for original research with verifiable citations, not for laundering AI-generated text.
- More provider implementations: Tavily, Brave Search, Weaviate, Pinecone, Chroma
- OpenAI deep research API as an alternate ResearchProvider (when GA)
- Configurable depth profiles (cost caps per phase)
- Native Slack / Discord bot wrappers
- Long-form report mode (multi-section briefings)
PRs welcome. The provider protocols (src/polysearch/providers/base.py) are the easiest place to contribute a new implementation.
MIT — see LICENSE.
This pipeline composes work from several upstreams. Full credits in ATTRIBUTION.md.