This Cloudflare Worker acts as a reverse proxy and bot traffic handler for your web server. It detects whether an incoming request comes from a known AI bot or scraper (ChatGPT, GPTBot, Claude, Perplexity, Google-Extended, BingPreview) and routes it accordingly:
- Human visitors: Served normally from your primary origin server.
- AI bot visitors: Served from an alternate origin (an Amazon S3 bucket), which lets you control what AI crawlers and automated tools can access.
Additionally, it logs bot traffic events to an external analytics API for tracking.
- AI Bot Detection: Detects ChatGPT, GPTBot, Google Extended, Bing Preview, and Perplexity bots
- Proxy Functionality: Routes requests to different origins based on bot detection
- Event Tracking: Sends events to external API when AI bots are detected
- Fallback Handling: Graceful fallback to original webserver when needed
- Redirect Support: Handles redirects from the original webserver
- Node.js 18+ installed
- Cloudflare account with Workers enabled
- Wrangler CLI installed globally
- Install Wrangler CLI (if not already installed):
  npm install -g wrangler
- Install project dependencies:
  npm install
- Log in to Cloudflare:
  wrangler login
The worker uses several configuration constants that you may want to customize:
- ORGANIZATION_ID: Your organization identifier
- ALT_ORIGIN: Alternative origin for serving AI bot requests
- EXTERNAL_API_URL: API endpoint for event tracking
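As a rough illustration, the three constants might be declared near the top of index.js as shown here. All three values are hypothetical placeholders, not the project's real settings, and the actual declarations in the worker may differ.

```javascript
// Hypothetical placeholder values; replace each one with your own.
const ORGANIZATION_ID = "your-organization-id";

// Assumed to be a full origin URL, for example an S3 bucket endpoint.
const ALT_ORIGIN = "https://your-bucket.s3.amazonaws.com";

// Assumed to be the HTTPS endpoint that receives bot traffic events.
const EXTERNAL_API_URL = "https://analytics.example.com/events";
```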
Edit wrangler.toml to configure your deployment:
name = "your-worker-name"
main = "index.js"
compatibility_date = "2024-01-01"
compatibility_flags = ["nodejs_compat"]
[env.production]
name = "your-worker-name-prod"
[env.staging]
name = "your-worker-name-staging"
Run the worker locally for development:
npm run dev
This will start a local development server at http://localhost:8787.
You can test the bot detection by setting the user-agent query parameter:
# Test ChatGPT detection
curl "http://localhost:8787/?user-agent=chatgpt"
# Test with actual ChatGPT User-Agent
curl -H "User-Agent: ChatGPT-User/1.0" "http://localhost:8787/"
Deploy to staging:
npm run deploy:staging
Deploy to production:
npm run deploy:production
Stream live logs:
npm run tail
- Request Analysis: The worker analyzes incoming requests for AI bot user agents
- Bot Detection: Detects various AI bots using regex patterns
- Event Tracking: For AI bots, sends events to external API
- Routing Logic:
  - Non-AI visitors: Served directly from the current webserver
  - AI visitors: First tries ALT_ORIGIN, then falls back to the current webserver
- Response Handling: Returns appropriate responses with proper headers
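The routing step above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the worker's actual handler: the function names `toAltOriginUrl` and `routeRequest`, the injectable `fetchImpl` parameter, and the placeholder `ALT_ORIGIN` value are all hypothetical, and treating any non-2xx response from the alternate origin as a reason to fall back is an assumption too.

```javascript
// Hypothetical placeholder; the real value lives in the worker's constants.
const ALT_ORIGIN = "https://alt-origin.example.com";

// Rewrites the request URL onto the alternate origin, keeping path and query.
function toAltOriginUrl(requestUrl, altOrigin = ALT_ORIGIN) {
  const src = new URL(requestUrl);
  const dst = new URL(altOrigin);
  dst.pathname = src.pathname;
  dst.search = src.search;
  return dst.toString();
}

// Non-AI visitors go straight to the current webserver. AI visitors try
// ALT_ORIGIN first and fall back to the current webserver when the alternate
// origin errors out or returns a non-2xx status (an assumption of this sketch).
async function routeRequest(request, isAIVisitor, fetchImpl = fetch) {
  if (!isAIVisitor) {
    return fetchImpl(request);
  }
  try {
    const altResponse = await fetchImpl(toAltOriginUrl(request.url), {
      method: request.method,
      headers: request.headers,
    });
    if (altResponse.ok) {
      return altResponse;
    }
  } catch (err) {
    // Network failure on the alternate origin: fall through to the fallback.
  }
  return fetchImpl(request);
}
```

Passing `fetchImpl` as a parameter is only a convenience for testing the routing decision without network access; in a deployed worker the global `fetch` would be used.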
The worker detects the following AI bots:
- ChatGPT-User: ChatGPT-User/1.0
- GPTBot: GPTBot/1.0
- Google Extended: Google-Extended
- Bing Preview: bingpreview
- PerplexityBot: PerplexityBot
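One way the detection for these bots could be arranged is sketched below. The pattern table, the function names `getUserAgent` and `detectAIBot`, and the exact regexes are assumptions modeled on the list above; the worker itself may use separate regex constants instead. The ChatGPT pattern is deliberately loose so that the `?user-agent=chatgpt` test shown earlier would match.

```javascript
// Hypothetical pattern table; the exact regexes in the worker may differ.
const AI_BOT_PATTERNS = {
  "ChatGPT-User": /chatgpt/i,
  "GPTBot": /gptbot/i,
  "Google-Extended": /google-extended/i,
  "BingPreview": /bingpreview/i,
  "PerplexityBot": /perplexitybot/i,
};

// Returns the user agent to test, letting a ?user-agent= query parameter
// override the request header so detection can be exercised with plain curl.
function getUserAgent(requestUrl, headers) {
  const override = new URL(requestUrl).searchParams.get("user-agent");
  return override || headers.get("user-agent") || "";
}

// Returns the name of the first matching bot, or null for non-AI visitors.
function detectAIBot(ua) {
  for (const [name, pattern] of Object.entries(AI_BOT_PATTERNS)) {
    if (pattern.test(ua)) {
      return name;
    }
  }
  return null;
}
```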
To add detection for a new AI bot, add a new regex pattern:
const NEW_BOT_RE = /NewBotPattern/i;
Then update the detection logic:
const isNewBot = NEW_BOT_RE.test(ua);
const isAIVisitor = isChatGPT || isGPTBot /* || ...other bots */ || isNewBot;
Edit the postPayload object in the main handler to customize the event data sent to your API.
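For orientation, a customized event payload and the call that sends it might look roughly like this. The field names in the payload, the helper names `buildPostPayload` and `sendEvent`, and both constant values are hypothetical; the real `postPayload` shape depends on what your API expects.

```javascript
// Hypothetical placeholders standing in for the worker's constants.
const ORGANIZATION_ID = "your-organization-id";
const EXTERNAL_API_URL = "https://analytics.example.com/events";

// Builds the event body. Every field name here is an illustrative assumption.
function buildPostPayload(requestUrl, ua, botName) {
  return {
    organizationId: ORGANIZATION_ID,
    url: requestUrl,
    userAgent: ua,
    bot: botName,
    timestamp: new Date().toISOString(),
  };
}

// Posts the event as JSON. Failures are logged and swallowed so that a
// tracking outage cannot break the response served to the visitor.
async function sendEvent(postPayload, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(EXTERNAL_API_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(postPayload),
    });
    return res.ok;
  } catch (err) {
    console.log("event tracking failed:", err.message);
    return false;
  }
}
```

In a Worker, the promise returned by a function like `sendEvent` could be handed to `ctx.waitUntil(...)` so that tracking does not delay the response, assuming the handler receives the execution context.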
- Deployment Fails: Ensure you're logged in to Cloudflare and have proper permissions
- Local Development Issues: Check that Node.js version is 18+ and Wrangler is properly installed
- Bot Detection Not Working: Verify the user-agent patterns match your test cases
Use the console logs to debug issues:
npm run tail
The worker includes extensive logging with emojis for easy identification of different request types.
MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
For issues and questions, please check the Cloudflare Workers documentation or create an issue in this repository.