A Model Context Protocol (MCP) server that provides safe, high-signal web browsing and content fetching for LLM agents.
- Multi-format support: HTML, JavaScript-rendered pages (SPA), Markdown, PDF, JSON, RSS/Atom/XML feeds
- Intelligent extraction: Uses Readability + Turndown for clean markdown output
- Security first: SSRF protection, prompt injection detection, rate limiting
- LLM-optimized output: Structured packets with citations, outlines, and metadata
- Context management: Semantic chunking and intelligent compaction
- AI Search: Conversation-scoped knowledge base via Cloudflare R2 + AI Search
- AI Gateway: Optional LLM-powered compaction via Cloudflare AI Gateway
┌─────────────────────────────────────────────────────────────────┐
│ MCP Client (LLM) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ web-fetch-mcp Server │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ MCP Tools │ │
│ │ fetch │ extract │ chunk │ compact │ ai_search_query │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────┐ │
│ │ Security Layer │ │
│ │ SSRF Guard │ Rate Limiter │ Injection Detector │ │
│ └───────────────────────────┼───────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────┐ │
│ │ Fetcher Layer │ │
│ │ HTTP (undici) │ Browser (Playwright) │ robots.txt │ │
│ └───────────────────────────┼───────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────┐ │
│ │ Extractor Layer │ │
│ │ HTML │ Markdown │ PDF │ JSON │ XML/RSS │ Text │ │
│ └───────────────────────────┼───────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────┐ │
│ │ Processing Layer │ │
│ │ Normalizer │ Chunker │ Compactor │ Outline Generator │ │
│ └───────────────────────────┼───────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────┐ │
│ │ AI Search / AI Gateway │ │
│ │ R2 Upload │ Scoped Queries │ LLM Compaction │ │
│ └───────────────────────────┴───────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
npm install
npm run build
# Optional: Install Playwright for JS-rendered pages
npx playwright install chromium

Copy .env.example to .env and customize. All settings have sensible defaults.
See .env.example for the full annotated list. Key sections:
| Section | Variables | Purpose |
|---|---|---|
| Fetch limits | MAX_BYTES, TIMEOUT_MS, MAX_REDIRECTS | Control request size, timing, redirects |
| Security | BLOCK_PRIVATE_IP, ALLOWLIST_DOMAINS, RATE_LIMIT_PER_HOST | SSRF protection, domain filtering, rate limiting |
| Processing | DEFAULT_MAX_TOKENS, CHUNK_MARGIN_RATIO, RESPECT_ROBOTS | Chunking/compaction defaults |
| Features | PLAYWRIGHT_ENABLED, PDF_ENABLED | Toggle optional capabilities |
| Caching | CACHE_TTL_S | In-memory resource cache TTL (default: 300s) |
| Rendering | RENDER_BLOCK_THIRD_PARTY, RENDER_TIMEOUT_MS, USER_AGENT | Browser rendering settings |
| AI Gateway | CF_AI_GATEWAY_ENDPOINT, CF_AIG_TOKEN, CF_AI_GATEWAY_MODEL | LLM-powered compaction |
| AI Search | AI_SEARCH_ENABLED, CF_ACCOUNT_ID, CF_R2_BUCKET, ... | Cloudflare AI Search ingestion |
| AI Search Scoping | AI_SEARCH_SCOPE, WEB_FETCH_THREAD_KEY, AI_SEARCH_STATE_DIR | Conversation/workspace isolation |
Fetch and extract content from a URL or raw bytes.
Top-level input:
- url: URL to fetch
- raw_bytes: Base64-encoded raw content to normalize instead of fetching
- content_type: MIME type for raw_bytes
- canonical_url: Canonical URL for raw_bytes
Options:
- mode: "auto" | "http" | "render". Fetch mode (render uses Playwright)
- headers: Custom HTTP headers
- timeout_ms: Request timeout
- max_bytes: Maximum response size
- render: Browser rendering options
  - wait_until: "load" | "domcontentloaded" | "networkidle"
  - wait_ms: Additional wait time
  - block_third_party: Block tracking requests
  - screenshot: Capture screenshot
  - selector: Wait for specific element
- extraction: Content extraction options
  - prefer_readability: Use Mozilla Readability
  - keep_tables: Preserve table content
  - keep_code_blocks: Preserve code blocks
  - remove_selectors: CSS selectors to remove
- cache_ttl_s: Cache TTL in seconds for the HTTP fetch (set to 0 to disable)
- format: Output format options
  - output: "llm_packet" | "raw" | "normalized"
  - include_raw_excerpt: Include raw HTML snippet
- ai_search: Cloudflare AI Search ingestion options
  - enabled: Upload extracted content to R2 for AI Search indexing
  - thread_key: Conversation/thread identifier for scoping
  - prefix: Optional R2 key prefix
  - max_file_bytes: Per-file byte cap before splitting
  - wait_ms: Delay before running AI Search query (indexes asynchronously)
  - skip_if_exists: Skip upload if content already exists in R2
  - require_success: Fail the fetch tool if upload or query fails
  - context: Custom context string for R2 metadata (auto-generated from title/URL/summary if omitted)
  - metadata: Custom key-value metadata for R2 objects (see Metadata Schema)
  - query: Optional AI Search query after upload
    - messages: Array of { content, role } objects for multi-turn queries (primary format)
    - query: Single query string (backward-compatible alias; sent as a user message)
    - mode: "search" (retrieval only) or "ai_search" (retrieval + generation)
    - stream: Boolean; stream the response as SSE (only for ai_search mode; see Streaming)
    - ai_search_options: Per-request overrides
      - retrieval.filters: Vectorize-style field filters (see Filter Format)
      - retrieval.max_num_results: Max chunks returned
      - retrieval.retrieval_type: "vector" | "keyword" | "hybrid"
      - retrieval.match_threshold: Minimum similarity score (0–1)
      - cache.enabled: Override instance cache setting
      - reranking.enabled: Override instance reranking setting
    - model: Custom model name (only for ai_search mode)
    - system_prompt: Custom system prompt
    - Backward-compatible flat aliases: max_num_results, retrieval_type, match_threshold, reranking, filters, cache, ranking_options
  - Note: ai_search is skipped when format.output is raw
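As a rough illustration of how these options nest, the sketch below builds an MCP tools/call request for the fetch tool. The buildFetchCall helper and the FetchOptions and ToolCallRequest types are hypothetical names introduced here, and only a subset of the documented options is typed.

```typescript
// Hypothetical subset of the fetch tool's options, typed from the list above.
interface FetchOptions {
  mode?: "auto" | "http" | "render";
  timeout_ms?: number;
  max_bytes?: number;
  cache_ttl_s?: number;
  render?: { wait_until?: "load" | "domcontentloaded" | "networkidle"; wait_ms?: number };
  format?: { output?: "llm_packet" | "raw" | "normalized" };
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { url: string; options?: FetchOptions } };
}

// Build a JSON-RPC request that invokes the fetch tool (helper name is illustrative).
function buildFetchCall(id: number, url: string, options?: FetchOptions): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch", arguments: options ? { url, options } : { url } },
  };
}

// Example: render a JavaScript-heavy page and ask for the LLM packet format.
const fetchRequest = buildFetchCall(1, "https://example.com/docs/setup", {
  mode: "render",
  render: { wait_until: "networkidle" },
  format: { output: "llm_packet" },
});
```

Serializing fetchRequest with JSON.stringify gives a payload an MCP client could send over its transport.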
Response diagnostics:
- request_id: Unique ID for the fetch request
- duration_ms: End-to-end request duration in milliseconds
- retry_count: Number of HTTP retries performed
When success is false, error.details may include url, status_code, and retry_after (from 429 responses).
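A caller might use these error details to decide how long to wait before retrying. The sketch below is one possible client-side policy, not the server's own retry logic: it assumes retry_after is expressed in seconds (mirroring the Retry-After header), and the base delay and cap are illustrative values.

```typescript
// Hypothetical shape of the error details described above.
interface FetchErrorDetails {
  url?: string;
  status_code?: number;
  retry_after?: number; // assumed to be seconds
}

// Choose a wait time before retrying: honor retry_after when present,
// otherwise fall back to capped exponential backoff (base and cap are illustrative).
function retryDelayMs(
  attempt: number,
  details: FetchErrorDetails,
  baseMs: number = 500,
  capMs: number = 30000
): number {
  if (details.retry_after !== undefined && details.retry_after >= 0) {
    return details.retry_after * 1000;
  }
  return Math.min(baseMs * Math.pow(2, attempt), capMs);
}
```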
Extract content from raw bytes or URL. Consider using fetch instead for the full pipeline (caching, AI Search, diagnostics).
Input:
- url: URL to fetch and extract
- raw_bytes: Base64-encoded raw content
- content_type: MIME type of raw_bytes
- canonical_url: Canonical URL for citations
Split content into semantic chunks.
Options:
- max_tokens: Maximum tokens per chunk
- margin_ratio: Safety margin (0-0.5)
- overlap_tokens: Approximate token overlap between adjacent chunks
- strategy: "headings_first" | "balanced"
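To make the interaction between max_tokens and margin_ratio concrete, here is a minimal sketch of a heading-oriented chunker. It is not the server's implementation: the 4-characters-per-token estimate, the way the margin shrinks the budget, and the function names are all assumptions, and overlap_tokens and the balanced strategy are not modeled.

```typescript
// Rough token estimate: about 4 characters per token (an assumption, not the server's tokenizer).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Split markdown into sections at heading lines, then pack sections into chunks
// whose estimated size stays within max_tokens reduced by the safety margin.
// A single section larger than the budget is kept whole in this sketch.
function chunkByHeadings(markdown: string, maxTokens: number, marginRatio: number = 0.1): string[] {
  const budget = Math.floor(maxTokens * (1 - marginRatio));
  const sections: string[] = [];
  let current: string[] = [];
  for (const line of markdown.split("\n")) {
    // A heading starts a new section once something has been collected.
    if (/^#{1,6}\s/.test(line) && current.length > 0) {
      sections.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) sections.push(current.join("\n"));

  const chunks: string[] = [];
  let buffer = "";
  for (const section of sections) {
    const candidate = buffer ? buffer + "\n" + section : section;
    if (buffer && estimateTokens(candidate) > budget) {
      chunks.push(buffer); // adding this section would overflow, so close the chunk
      buffer = section;
    } else {
      buffer = candidate;
    }
  }
  if (buffer) chunks.push(buffer);
  return chunks;
}
```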
Intelligently compress content. When AI Gateway is configured, map_reduce and question_focused modes use LLM-powered summarization.
Options:
- max_tokens: Target output size
- mode: Compaction strategy
  - "structural": Remove boilerplate
  - "salience": Keep high-density content
  - "map_reduce": Summarize per-chunk
  - "question_focused": Relevant to question
- question: Focus question (for question_focused mode)
- preserve: Content types to keep: ["numbers", "dates", "names", "definitions", "procedures"]
Query the conversation-scoped knowledge base built by fetch(). Results are automatically scoped based on AI_SEARCH_SCOPE configuration.
Input:
- query: AI Search query options (auto-scoped). Either messages or query is required.
  - messages: Array of { content: string, role: "user"|"system"|"assistant" } (primary format)
  - query: Single query string (backward-compatible alias)
  - mode: "search" (chunks only) or "ai_search" (chunks + generated answer)
  - stream: Boolean; stream the response as SSE events (only for ai_search mode)
  - ai_search_options: Per-request overrides
    - retrieval.filters: Vectorize-style metadata filters (see Filter Format)
    - retrieval.max_num_results: Max chunks
    - retrieval.retrieval_type: "vector" | "keyword" | "hybrid"
    - retrieval.match_threshold: Minimum similarity (0–1)
    - cache.enabled: Boolean
    - reranking.enabled: Boolean
  - model: Custom model name (for ai_search mode)
  - system_prompt: Custom system prompt
  - Backward-compatible flat aliases: max_num_results, retrieval_type, match_threshold, reranking, filters, cache, ranking_options
- thread_key: Override the conversation thread key for this request
When AI Search is enabled, uploaded content and queries are scoped to prevent cross-conversation leakage. This is configured via environment variables and persisted across server restarts.
| Mode | Env Value | Behavior |
|---|---|---|
| Conversation (default) | AI_SEARCH_SCOPE=conversation | Each conversation gets its own isolated namespace via thread_key |
| Workspace | AI_SEARCH_SCOPE=workspace | Shared per workspace (git repo root or AI_SEARCH_WORKSPACE_ROOT) |
| Global | AI_SEARCH_SCOPE=global | No isolation; all content is shared |
The thread key is resolved in order:
- Per-request thread_key parameter (in ai_search options)
- WEB_FETCH_THREAD_KEY env var (aliases: AI_SEARCH_THREAD_KEY, MCP_THREAD_KEY)
- Auto-generated from workspace if no key is provided
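The resolution order above can be sketched as a small function. This is an illustration only: the precedence among the three env var names and the "auto:" format of the generated key are assumptions, since the source states only that aliases exist and that a key is derived from the workspace.

```typescript
type Env = { [name: string]: string | undefined };

// Resolve the thread key: per-request value first, then env vars, then a workspace-derived key.
// The alias precedence and the "auto:" prefix are assumptions made for this sketch.
function resolveThreadKey(requestKey: string | undefined, env: Env, workspaceId: string): string {
  if (requestKey) return requestKey;
  const envNames = ["WEB_FETCH_THREAD_KEY", "AI_SEARCH_THREAD_KEY", "MCP_THREAD_KEY"];
  for (const name of envNames) {
    const value = env[name];
    if (value) return value;
  }
  return "auto:" + workspaceId;
}
```

Passing the environment in as a plain object keeps the function easy to test without touching the real process environment.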
The mapping from (workspace_id, thread_key) → conversation_id is persisted to ~/.config/web-fetch-mcp/ai-search-state.json (configurable via AI_SEARCH_STATE_DIR). This means conversation scoping survives server restarts.
Uploads are automatically prefixed based on scope:
- Global: {prefix}/
- Workspace: {prefix}/workspaces/{workspace_id}/
- Conversation: {prefix}/workspaces/{workspace_id}/conversations/{conversation_id}/
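The prefix layout above maps directly onto a small helper. The function name and the trailing-slash handling are illustrative choices, not taken from the server's code.

```typescript
type Scope = "global" | "workspace" | "conversation";

// Build the R2 key prefix for a scope, following the layout listed above.
function scopedPrefix(scope: Scope, prefix: string, workspaceId: string, conversationId: string): string {
  const base = prefix.replace(/\/+$/, ""); // tolerate a trailing slash on the configured prefix
  switch (scope) {
    case "global":
      return base + "/";
    case "workspace":
      return base + "/workspaces/" + workspaceId + "/";
    case "conversation":
      return base + "/workspaces/" + workspaceId + "/conversations/" + conversationId + "/";
  }
}
```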
Fetch a page and query AI Search with the new message format:
{
"url": "https://example.com/docs/setup",
"options": {
"ai_search": {
"enabled": true,
"query": {
"messages": [{ "content": "How do I configure authentication?", "role": "user" }],
"mode": "ai_search",
"ai_search_options": {
"retrieval": { "max_num_results": 5, "retrieval_type": "hybrid" }
}
}
}
}
}

Standalone query with backward-compatible query string:
{
"query": { "query": "authentication setup", "mode": "search" }
}

Multi-turn conversation with streaming:
{
"query": {
"messages": [
{ "role": "system", "content": "Use only indexed material." },
{ "role": "user", "content": "Summarize the setup docs" }
],
"mode": "ai_search",
"stream": true
}
}

Filters use Vectorize-style field constraints. Multiple top-level keys are AND-combined.
| Operator | Example | Meaning |
|---|---|---|
| Implicit $eq | { "tag": "docs" } | tag equals "docs" |
| $ne | { "tag": { "$ne": "draft" } } | tag is not "draft" |
| $in | { "tag": { "$in": ["docs", "guides"] } } | tag is "docs" or "guides" |
| $nin | { "tag": { "$nin": ["draft"] } } | tag is not in list |
| $gt / $gte | { "score": { "$gte": 0.8 } } | score >= 0.8 |
| $lt / $lte | { "score": { "$lt": 1.0 } } | score < 1.0 |
Backward compatibility: Legacy { type: "and", filters: [{ type: "eq", key: "tag", value: "docs" }] } filter trees are automatically translated to the Vectorize format.
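The translation from a legacy filter tree to the Vectorize-style format might look roughly like the sketch below. Only the "and" node with an "eq" leaf appears in the source; the other leaf types (ne, in, nin, gt, gte, lt, lte) and all type and function names here are assumptions that mirror the operator table.

```typescript
type Scalar = string | number | boolean;
type OperatorMap = { [op: string]: Scalar | Scalar[] };
type VectorizeFilter = { [field: string]: Scalar | Scalar[] | OperatorMap };

// Leaf types other than "eq" are assumed to mirror the Vectorize operators.
interface LegacyLeaf {
  type: "eq" | "ne" | "in" | "nin" | "gt" | "gte" | "lt" | "lte";
  key: string;
  value: Scalar | Scalar[];
}
interface LegacyTree {
  type: "and";
  filters: LegacyLeaf[];
}

function translateLegacyFilter(tree: LegacyTree): VectorizeFilter {
  const out: VectorizeFilter = {};
  const opsByField: { [field: string]: OperatorMap } = {};
  for (const leaf of tree.filters) {
    if (leaf.type === "eq") {
      out[leaf.key] = leaf.value; // implicit $eq
    } else {
      // Collect several operators on the same field (e.g. a range) into one object.
      const ops = opsByField[leaf.key] || (opsByField[leaf.key] = {});
      ops["$" + leaf.type] = leaf.value;
      out[leaf.key] = ops;
    }
  }
  return out;
}
```

Because top-level keys are AND-combined, flattening an "and" tree into one object preserves its meaning.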
The Cloudflare AI Search REST API only supports AND-combined filters. OR is not available at the API level.
Workarounds:
- Same-field OR: Use $in, e.g., { "folder": { "$in": ["docs/", "guides/"] } }
- Cross-field OR: Make separate ai_search_query calls with different filters and merge results client-side
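For the cross-field case, the client-side merge could be as simple as the sketch below. The Chunk shape (id, score, text) is hypothetical; the actual fields returned by ai_search_query may differ, so treat the field names as placeholders.

```typescript
// Hypothetical chunk shape; adjust to the fields the tool actually returns.
interface Chunk {
  id: string;
  score: number;
  text: string;
}

// Merge result lists from separate filtered queries:
// dedupe by id, keep the highest-scoring copy, and sort by score descending.
function mergeResults(resultSets: Chunk[][]): Chunk[] {
  const best: { [id: string]: Chunk } = Object.create(null);
  for (const results of resultSets) {
    for (const chunk of results) {
      const seen = best[chunk.id];
      if (!seen || chunk.score > seen.score) best[chunk.id] = chunk;
    }
  }
  return Object.keys(best)
    .map((id) => best[id])
    .sort((a, b) => b.score - a.score);
}
```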
R2 uploads include metadata that AI Search can use for filtering and context generation.
- context is a built-in AI Search field. It is always recognized and used to guide response generation, and it is auto-generated from the page title, URL, and summary when not provided explicitly.
- Custom metadata fields (e.g., { "category": "docs" }) are written as R2 object metadata (x-amz-meta-* headers). To filter on these fields in AI Search queries, they must be declared in the AI Search instance's custom_metadata schema. Maximum 5 custom fields per instance. Reserved names (timestamp, folder, filename) are skipped.
When stream: true is set for ai_search mode, the server processes the SSE stream incrementally from Cloudflare but returns the assembled result as a single MCP tool response (MCP protocol constraint). The assembled result includes the concatenated text, retrieved chunks, and the full events array.
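The assembly step can be pictured with the sketch below, which folds a raw SSE body into a single result. The event format is an assumption: it presumes each event is a "data:" line carrying JSON with an optional response string and that "[DONE]" marks the end of the stream. Retrieved chunks are left out because their event shape is not described here.

```typescript
interface AssembledStream {
  text: string;
  events: unknown[];
}

// Parse SSE lines, collect every JSON event, and concatenate the text fragments.
function assembleSse(raw: string): AssembledStream {
  const events: unknown[] = [];
  let text = "";
  for (const line of raw.split(/\r?\n/)) {
    if (line.indexOf("data:") !== 0) continue; // ignore comments, event names, blank lines
    const payload = line.slice(5).trim();
    if (payload === "" || payload === "[DONE]") continue;
    let event: unknown;
    try {
      event = JSON.parse(payload);
    } catch {
      continue; // skip malformed payloads rather than failing the whole response
    }
    events.push(event);
    if (typeof event === "object" && event !== null) {
      const fragment = (event as { response?: unknown }).response;
      if (typeof fragment === "string") text += fragment;
    }
  }
  return { text, events };
}
```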
When configured, compact modes map_reduce and question_focused use an LLM via Cloudflare AI Gateway for higher-quality summarization.
CF_AI_GATEWAY_ENDPOINT=https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions
CF_AIG_TOKEN=your_token
CF_AI_GATEWAY_MODEL=your_model
CF_AI_GATEWAY_TIMEOUT_MS=60000

The gateway uses OpenAI-compatible chat completions. Without AI Gateway configured, compaction falls back to local heuristic-based processing.
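Since the gateway speaks the OpenAI-compatible chat completions format, a compaction request body might be shaped like the sketch below. The prompt wording, the function name, and the use of max_tokens to bound the summary are assumptions; authentication headers are omitted because the source does not specify how CF_AIG_TOKEN is sent.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface ChatCompletionBody {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
}

// Build a chat completions body that asks the model to summarize one chunk.
// When a question is given, the instruction is focused on it (question_focused style).
function buildCompactionRequest(model: string, chunk: string, maxTokens: number, question?: string): ChatCompletionBody {
  const task = question
    ? "Summarize the text, keeping only what helps answer: " + question
    : "Summarize the text, preserving numbers, dates, and names.";
  return {
    model,
    max_tokens: maxTokens,
    messages: [
      { role: "system", content: task },
      { role: "user", content: chunk },
    ],
  };
}
```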
This server exposes prompt templates for user-invoked workflows:
- fetch_url: args url (required), mode, extraction (JSON string)
- fetch_and_chunk: args url (required), max_tokens, strategy
- fetch_and_compact: args url (required), max_tokens, mode, question
- fetch_ai_search: args url (required), query (required), wait_ms, mode
- ai_search_query: args query (required), mode, thread_key
- resources_tips: no args; guidance on reusing fetched content via MCP resources
Prompts are discoverable via prompts/list and retrievable via prompts/get.
This server supports completion/complete to provide argument suggestions for prompts and resource URIs.
Example prompt argument completion:
{
"jsonrpc": "2.0",
"id": 1,
"method": "completion/complete",
"params": {
"ref": { "type": "ref/prompt", "name": "fetch_url" },
"argument": { "name": "mode", "value": "re" }
}
}

Example resource URI completion (source_id suggestions):
{
"jsonrpc": "2.0",
"id": 2,
"method": "completion/complete",
"params": {
"ref": { "type": "ref/resource", "uri": "webfetch://packet/{source_id}" },
"argument": { "name": "source_id", "value": "" }
}
}

Recently fetched packets are exposed as MCP resources using a custom webfetch:// URI scheme. Resources are stored in-memory with TTL controlled by CACHE_TTL_S (default: 300s) and are not persisted across restarts.
Resource list entries use webfetch://packet/{source_id} and include metadata like title, lastModified, and size. Reads support:
- webfetch://packet/{source_id}: Full LLMPacket JSON (application/json)
- webfetch://content/{source_id}: Markdown content (text/markdown)
- webfetch://normalized/{source_id}: NormalizedContent JSON (application/json)
- webfetch://screenshot/{source_id}: Screenshot blob (image/png, only if captured)
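A client that wants to route these URIs could parse them along the lines below. The parser is a sketch written for illustration (the function and type names are not from the server's uri.ts); the four resource kinds and MIME types come from the list above.

```typescript
type ResourceKind = "packet" | "content" | "normalized" | "screenshot";

const MIME_TYPES: { [K in ResourceKind]: string } = {
  packet: "application/json",
  content: "text/markdown",
  normalized: "application/json",
  screenshot: "image/png",
};

interface ParsedResourceUri {
  kind: ResourceKind;
  sourceId: string;
  mimeType: string;
}

// Parse a webfetch:// URI into its kind and source_id, or return null if it does not match.
function parseWebfetchUri(uri: string): ParsedResourceUri | null {
  const match = /^webfetch:\/\/(packet|content|normalized|screenshot)\/([^\/?#]+)$/.exec(uri);
  if (!match) return null;
  const kind = match[1] as ResourceKind;
  return { kind, sourceId: match[2], mimeType: MIME_TYPES[kind] };
}
```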
The server emits notifications/resources/list_changed when new resources are stored. Tool, prompt, and resource lists support cursor-based pagination (page size: 50).
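Cursor-based pagination with a fixed page size can be sketched as follows. The page size of 50 and the nextCursor field name match the text and the MCP list responses; encoding the cursor as a decimal offset is an assumption, since real cursors are opaque to clients.

```typescript
const PAGE_SIZE = 50;

interface Page<T> {
  items: T[];
  nextCursor?: string;
}

// Return one page of items starting at the offset encoded in the cursor.
// The cursor format (a decimal offset) is illustrative only.
function paginate<T>(all: T[], cursor?: string): Page<T> {
  let start = 0;
  if (cursor !== undefined) {
    if (!/^\d+$/.test(cursor)) throw new Error("Invalid cursor");
    start = Number(cursor);
    if (start > all.length) throw new Error("Invalid cursor");
  }
  const end = Math.min(start + PAGE_SIZE, all.length);
  const page: Page<T> = { items: all.slice(start, end) };
  if (end < all.length) page.nextCursor = String(end); // omitted on the last page
  return page;
}
```

A client keeps calling the list method with the returned nextCursor until the field is absent.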
{
"source_id": "abc123...",
"original_url": "https://example.com/page",
"canonical_url": "https://example.com/page",
"retrieved_at": "2024-01-15T10:30:00Z",
"status": 200,
"content_type": "text/html",
"metadata": {
"title": "Page Title",
"author": "John Doe",
"published_at": "2024-01-10T00:00:00Z",
"estimated_reading_time_min": 5
},
"outline": [
{"level": 1, "text": "Introduction", "path": "Introduction"},
{"level": 2, "text": "Background", "path": "Introduction > Background"}
],
"key_blocks": [
{"block_id": "b0", "kind": "heading", "text": "# Introduction", "char_len": 14},
{"block_id": "b1", "kind": "paragraph", "text": "Content...", "char_len": 500}
],
"content": "# Introduction\n\nContent in markdown format...",
"source_summary": [
"Main topics: Introduction, Background, Results",
"Key figures: $53m, 700 jobs",
"Content length: ~2500 words"
],
"unsafe_instructions_detected": [],
"warnings": [],
"hashes": {
"content_hash": "sha256...",
"raw_hash": "sha256..."
}
}

- Blocks localhost and loopback addresses
- Blocks private IP ranges (10.x, 172.16-31.x, 192.168.x)
- Validates DNS resolution to prevent rebinding attacks
- Only allows http:// and https:// protocols
- Scans for instruction override patterns
- Detects role reassignment attempts
- Identifies fake message delimiters
- Quarantines suspicious content in unsafe_instructions_detected
- Per-host request limits
- Exponential backoff on errors
- Respects Retry-After headers
- Removes scripts and styles
- Strips event handlers
- Removes hidden content
- Detects paywalled pages
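To give a feel for the prompt injection checks listed above, here is a deliberately small pattern-based detector. The three regular expressions and the Finding shape are hypothetical examples chosen for this sketch; the server's actual patterns and the structure of unsafe_instructions_detected entries are not documented here.

```typescript
interface Finding {
  kind: string;
  excerpt: string;
}

// Illustrative patterns only: one per category named in the list above.
const INJECTION_PATTERNS: { kind: string; regex: RegExp }[] = [
  { kind: "instruction_override", regex: /ignore (all |any )?(previous|prior|above) instructions/i },
  { kind: "role_reassignment", regex: /you are now (a|an|the) /i },
  { kind: "fake_delimiter", regex: /^\s*(system|assistant)\s*:/im },
];

// Scan content and report the first match for each pattern.
function detectInjection(content: string): Finding[] {
  const findings: Finding[] = [];
  for (const pattern of INJECTION_PATTERNS) {
    const match = pattern.regex.exec(content);
    if (match) findings.push({ kind: pattern.kind, excerpt: match[0].trim() });
  }
  return findings;
}
```

Pattern lists like this catch only known phrasings, so they work best as one signal among several rather than a complete defense.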
| Threat | Mitigation |
|---|---|
| SSRF to internal services | Block private IPs, validate DNS resolution |
| DNS rebinding | Re-validate IPs after each redirect |
| Prompt injection in content | Detect and quarantine suspicious patterns |
| Malicious scripts in HTML | Remove all scripts, styles, event handlers |
| Hidden content attacks | Remove display:none and aria-hidden content |
| Resource exhaustion | Enforce max_bytes, timeouts, rate limits |
| Redirect loops | Limit redirect count, track visited URLs |
| Paywall bypass | Detect paywalls, do not attempt bypass |
| CAPTCHA bypass | Return error, do not attempt bypass |
| Browser sandbox escape | Use isolated contexts, block file:// protocol |
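The first rows of the table rest on two simple checks, sketched below: rejecting non-HTTP protocols and rejecting loopback or private IPv4 addresses. This covers only the IPv4 ranges named in this document; a real guard would also need IPv6 handling and would apply the check to every address a hostname resolves to. Function names are illustrative.

```typescript
// Return true when a dotted-quad IPv4 address is loopback or in a private range.
function isBlockedIPv4(address: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(address);
  if (!match) return false; // not an IPv4 literal; hostnames must be resolved before checking
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 127 || // loopback
    a === 10 || // 10.x
    (a === 172 && b >= 16 && b <= 31) || // 172.16-31.x
    (a === 192 && b === 168) // 192.168.x
  );
}

// Only http:// and https:// URLs are allowed; file:// and other schemes are rejected.
function isAllowedProtocol(url: string): boolean {
  return /^https?:\/\//i.test(url);
}
```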
# Unit tests
npm test
# Watch mode
npm run test:watch
# Coverage
npm run test:coverage

web-fetch-mcp/
├── src/
│ ├── index.ts # MCP server entry, tool/prompt/resource registration
│ ├── config.ts # Configuration from env vars
│ ├── types.ts # TypeScript types
│ ├── completions.ts # MCP completion/complete handler
│ ├── pagination.ts # Cursor-based pagination for list endpoints
│ ├── ai-gateway/
│ │ └── client.ts # Cloudflare AI Gateway (LLM compaction)
│ ├── ai-search/
│ │ ├── index.ts # R2 upload, AI Search query, content quality gating
│ │ └── state.ts # Conversation-scope persistence
│ ├── tools/
│ │ ├── fetch.ts # fetch tool
│ │ ├── fetch-contract.ts # Fetch input parsing/validation
│ │ ├── extract.ts # extract tool
│ │ ├── chunk.ts # chunk tool
│ │ ├── compact.ts # compact tool
│ │ └── ai-search-query.ts # ai_search_query tool
│ ├── fetcher/
│ │ ├── http-fetcher.ts # HTTP fetching (undici)
│ │ ├── browser-renderer.ts # Playwright rendering
│ │ └── robots.ts # robots.txt handling
│ ├── extractors/
│ │ ├── html-extractor.ts
│ │ ├── markdown-extractor.ts
│ │ ├── pdf-extractor.ts
│ │ ├── json-extractor.ts
│ │ ├── xml-extractor.ts
│ │ └── text-extractor.ts
│ ├── processing/
│ │ ├── normalizer.ts # Content normalization, key_blocks, source_summary
│ │ ├── chunker.ts # Semantic chunking
│ │ ├── compactor.ts # Content compaction (4 modes)
│ │ ├── outline.ts # Document outline generation
│ │ └── synonyms.ts # Synonym expansion for question-focused compaction
│ ├── security/
│ │ ├── ssrf-guard.ts
│ │ ├── injection-detector.ts
│ │ ├── content-sanitizer.ts
│ │ └── rate-limiter.ts
│ ├── resources/
│ │ ├── handlers.ts # MCP resource read/list handlers
│ │ ├── store.ts # In-memory resource store with TTL
│ │ └── uri.ts # webfetch:// URI parsing
│ └── utils/
│ ├── hash.ts
│ ├── url.ts
│ ├── tokens.ts
│ └── cache.ts
├── tests/
│ ├── unit/
│ └── integration/
├── scripts/
│ └── batch-fetch-wp-docs.ts # Batch document fetcher utility
├── .env.example
├── package.json
├── tsconfig.json
└── README.md
MIT