JohnsonL111/botbouncer

botbouncer · v1.0.0 ✨

Abstract

As AI companies increasingly rely on large-scale web scraping to train their models, traditional mechanisms for controlling data access (such as robots.txt) are no longer reliable. Many modern crawlers ignore or evade these rules, leaving website owners without meaningful ways to protect their content or understand how it is being used. BotBouncer addresses this gap by introducing an active, user-controlled system for managing and monitoring bot traffic. Our project lets website owners generate custom access policies through a simple interface, then enforces these policies at the network edge by blocking unauthorized bots before they reach the site. We also provide a real-time analytics pipeline that reveals which bots visited, what they attempted to access, and whether they complied with the rules. Together, these tools restore data agency to users and offer a clearer picture of how automated agents interact with the web, making BotBouncer a practical and human-centered step toward modern data governance.

Purpose

  • A platform for managing, previewing, and publishing robots.txt rules, with observability and analytics for the resulting bot traffic.
  • Intended for testing and experimenting with bot blocking, rule composition, and how different user-agents are affected by a robots.txt configuration.

What the frontend does

  • Loads a live robots.txt (or a custom set of rules) and parses rules into a simple UI.
  • Lets you add / remove rules, preview which paths are disallowed for specific user-agents, and publish rule changes to the configured publish endpoint.
  • Polished UI built with Vite + React + TypeScript and Tailwind CSS for styling.
  • The frontend is client-only; publishing and analytics are performed by external APIs hosted behind AWS API Gateway.
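
To make the "preview which paths are disallowed" step concrete, here is a minimal sketch of parsing robots.txt text and checking one path for one user-agent. It is not taken from frontend/src/App.tsx: the names RuleGroup, parseRobots, and isDisallowed are hypothetical, and the matcher handles plain path prefixes only (no * or $ wildcards).

```typescript
// Hypothetical types and helpers for illustration; not the project's actual code.
interface Rule {
  type: "allow" | "disallow";
  path: string;
}

interface RuleGroup {
  userAgents: string[]; // lowercased User-agent values for this group
  rules: Rule[];
}

// Parse robots.txt text into rule groups. Comments and unknown directives are skipped.
function parseRobots(text: string): RuleGroup[] {
  const groups: RuleGroup[] = [];
  let current: RuleGroup | null = null;
  let lastWasAgent = false;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim();
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (key === "user-agent") {
      // Consecutive User-agent lines share one group; otherwise start a new one.
      if (!current || !lastWasAgent) {
        current = { userAgents: [], rules: [] };
        groups.push(current);
      }
      if (value !== "") current.userAgents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((key === "allow" || key === "disallow") && current) {
      // An empty Disallow value means "allow everything", so no rule is stored.
      if (value !== "") current.rules.push({ type: key, path: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// Decide whether `path` is disallowed for `userAgent`.
// Groups naming the agent take priority over the "*" group; within the chosen
// groups the longest matching prefix wins, and Allow wins a tie.
function isDisallowed(groups: RuleGroup[], userAgent: string, path: string): boolean {
  const ua = userAgent.toLowerCase();
  const specific = groups.filter((g) => g.userAgents.some((a) => a !== "*" && ua.includes(a)));
  const chosen = specific.length > 0 ? specific : groups.filter((g) => g.userAgents.includes("*"));
  let best: Rule | null = null;
  for (const group of chosen) {
    for (const rule of group.rules) {
      if (!path.startsWith(rule.path)) continue;
      const longer = best === null || rule.path.length > best.path.length;
      const tieAllow = best !== null && rule.path.length === best.path.length && rule.type === "allow";
      if (longer || tieAllow) best = rule;
    }
  }
  return best !== null && best.type === "disallow";
}
```

A preview UI could call isDisallowed once per (user-agent, path) pair whenever the rule list changes. A production matcher would also need the wildcard handling described in the Robots Exclusion Protocol.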

Where to look

  • Main app logic: frontend/src/App.tsx
  • Styling: frontend/src/index.css (Tailwind)
  • Example env template: frontend/.env.example

Quick start (frontend)

  1. Install deps: cd frontend && npm install

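  2. Provide env values: copy frontend/.env.example to a real env file (for example .env) and fill in the keys listed under "Environment variables". Vite only loads real env files; .env.example is a template.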

  3. Run dev server: npm run dev, then open the app (Vite default: http://localhost:5173).

Environment variables (keys only)

  • VITE_LIVE_ROBOTS_URL
  • VITE_ANALYTICS_API_URL
  • VITE_PUBLISH_API_URL
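
As a sketch of how these keys might be consumed, the helper below validates that all three are present and returns a typed config object. AppConfig and readConfig are hypothetical names, not part of the repository; in a Vite app the argument would typically be import.meta.env, which exposes variables prefixed with VITE_ to client code.

```typescript
// Hypothetical config shape for illustration; the real app may read these keys differently.
interface AppConfig {
  liveRobotsUrl: string;
  analyticsApiUrl: string;
  publishApiUrl: string;
}

// Validate that every required key is set and non-empty, then return a typed config.
function readConfig(env: Record<string, string | undefined>): AppConfig {
  const required = [
    "VITE_LIVE_ROBOTS_URL",
    "VITE_ANALYTICS_API_URL",
    "VITE_PUBLISH_API_URL",
  ];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing env vars: ${missing.join(", ")}`);
  }
  return {
    liveRobotsUrl: env["VITE_LIVE_ROBOTS_URL"] as string,
    analyticsApiUrl: env["VITE_ANALYTICS_API_URL"] as string,
    publishApiUrl: env["VITE_PUBLISH_API_URL"] as string,
  };
}

// In a Vite app (assumption): const config = readConfig(import.meta.env);
```

Failing fast at startup makes a missing or misnamed key visible immediately, rather than as a failed request later.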

Architecture diagram

Below is the system architecture for the project.

BotBouncer architecture diagram

Data flow: CloudFront → S3 logs → SQS → Lambda → DynamoDB. The frontend reads robots.txt and calls the analytics and publish endpoints through API Gateway.
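
The Lambda step in this pipeline presumably turns CloudFront log lines into records before writing them to DynamoDB. The sketch below shows only that parsing step as a pure function, under stated assumptions: the logs use the CloudFront standard log format (a "#Fields:" header followed by tab-separated rows), a 403 status is treated as "blocked at the edge", and BotHit and parseCloudFrontLog are hypothetical names. It makes no AWS SDK calls and is not the project's actual handler.

```typescript
// Hypothetical record shape for one logged request; not the project's real schema.
interface BotHit {
  timestamp: string; // ISO-like UTC timestamp built from the date and time columns
  userAgent: string;
  path: string;
  status: number;
  blocked: boolean; // assumption: an edge block shows up as HTTP 403
}

// CloudFront URL-encodes some characters in the user-agent column; decode defensively.
function safeDecode(value: string): string {
  try {
    return decodeURIComponent(value);
  } catch {
    return value; // leave malformed input unchanged
  }
}

// Parse the text of one standard log file into BotHit records.
// Column positions come from the "#Fields:" header, so column order is not hard-coded.
function parseCloudFrontLog(text: string): BotHit[] {
  let fields: string[] = [];
  const hits: BotHit[] = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith("#Fields:")) {
      fields = line.slice("#Fields:".length).trim().split(/\s+/);
      continue;
    }
    // Skip blanks, other comment lines, and rows seen before any header.
    if (line === "" || line.startsWith("#") || fields.length === 0) continue;
    const cols = line.split("\t");
    const get = (name: string): string => {
      const i = fields.indexOf(name);
      return i >= 0 && i < cols.length ? cols[i] : "";
    };
    const status = Number(get("sc-status"));
    hits.push({
      timestamp: `${get("date")}T${get("time")}Z`,
      userAgent: safeDecode(get("cs(User-Agent)")),
      path: get("cs-uri-stem"),
      status,
      blocked: status === 403,
    });
  }
  return hits;
}
```

Keeping the parser free of I/O makes it easy to unit test; a handler would wrap it with the S3 read and the DynamoDB write.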

About

AI Bot Traffic edge enforcement & Observability
