A high-performance asynchronous HTTP tarpit for monitoring automated scanning campaigns, collecting behavioral data, and implementing active defense mechanisms.
This tool complements the Multi-threaded SSH Honeypot by providing a lightweight active-defense layer. While the SSH honeypot captures post-authentication payload delivery, the tarpit focuses on large-scale automated scanning and reconnaissance, offering a broader view of the botnet lifecycle.
- Asynchronous Engine — built with `aiohttp` to handle high-concurrency connections with minimal resource overhead.
- Active Defense — implements resource exhaustion (tarpit) by draining attacker connections with slow, drip-feed responses.
- Forensic Logging — captures detailed HTTP metadata including full request headers, User-Agent strings, and session duration in structured JSON format.
- GeoIP Enrichment — integrated MaxMind GeoLite2 support for real-time ASN and geographic mapping of source IPs.
- Structured Storage — SQLite-backed persistence for reliable event storage and post-hoc query analysis.
- AbuseIPDB Integration — optional automated reporting of detected malicious IPs with configurable rate-limiting.
Internet ──► [Reverse Proxy / iptables REDIRECT]
                              │
                              ▼
                    HTTP Tarpit (aiohttp)
                              │
                ┌─────────────┼─────────────┐
                ▼             ▼             ▼
            SQLite DB   GeoIP Lookup  AbuseIPDB API
          (events log)  (MaxMind DB)   (optional)
The tarpit listens on a configurable address and port. All incoming requests are accepted and held open while the server drip-feeds a slow response, exhausting scanner thread pools. Every connection is fully logged before and after the tarpit cycle.
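The drip-feed cycle described above can be sketched with plain `asyncio`. This is a minimal illustration, not the project's actual handler: `drip_feed` and the `send` callable are hypothetical names (in an aiohttp handler, `send` could wrap `StreamResponse.write`), and the defaults simply mirror the values documented for `config.py`.

```python
import asyncio

# Defaults mirror the values documented for src/http_tarpit/config.py.
RESPONSE_DELAY_SECONDS = 1.5
RESPONSE_CHUNK = b"."
MAX_RESPONSE_BYTES = 1200


async def drip_feed(send, chunk=RESPONSE_CHUNK, delay=RESPONSE_DELAY_SECONDS,
                    max_bytes=MAX_RESPONSE_BYTES):
    """Send `chunk` repeatedly, sleeping `delay` seconds after each write.

    `send` is a hypothetical async callable that writes bytes to the client.
    Returns the number of bytes sent before the byte budget was reached.
    """
    if not chunk:
        raise ValueError("chunk must not be empty")
    sent = 0
    while sent + len(chunk) <= max_bytes:
        await send(chunk)
        sent += len(chunk)
        # asyncio.sleep yields to the event loop, so other connections
        # keep being served while this one is stalled.
        await asyncio.sleep(delay)
    return sent


async def demo():
    """Run the loop against an in-memory sink with no delay."""
    received = bytearray()

    async def fake_send(data):
        received.extend(data)

    total = await drip_feed(fake_send, delay=0, max_bytes=5)
    return total, bytes(received)


if __name__ == "__main__":
    print(asyncio.run(demo()))  # (5, b'.....')
```

Because each stalled connection is only a suspended coroutine, holding many scanners open costs little on the server side, while each scanner keeps a worker or socket occupied for the full cycle.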
- Python 3.11 or newer
- Poetry dependency manager
- (Optional) MaxMind GeoLite2 databases (`GeoLite2-City.mmdb`, `GeoLite2-ASN.mmdb`)
- (Optional) AbuseIPDB API key
git clone https://github.com/t1a0/http-tarpit.git
cd http-tarpit
poetry install --without analysis

To include the data analysis toolset (pandas, matplotlib, seaborn, folium):

poetry install

Create a `.env` file in the project root:

# Optional: enable AbuseIPDB reporting (leave unset to disable)
ABUSEIPDB_API_KEY=your_api_key_here

All other parameters are configured directly in `src/http_tarpit/config.py`.
Edit src/http_tarpit/config.py to adjust the core settings:
| Parameter | Default | Description |
|---|---|---|
| `HOST` | `127.0.0.1` | Bind address |
| `PORT` | `8080` | Bind port |
| `RESPONSE_DELAY_SECONDS` | `1.5` | Delay between drip-feed chunks |
| `RESPONSE_CHUNK` | `b'.'` | Payload sent per chunk |
| `MAX_RESPONSE_BYTES` | `1200` | Maximum bytes per connection before close |
| `ABUSEIPDB_REPORT_INTERVAL_MINUTES` | `40` | Minimum interval between reports for the same IP |
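
As a rough worked example, the three drip-feed parameters together bound how long a single connection is held. The helper below is hypothetical and assumes exactly one delay per chunk sent; the real timing also depends on network and client behavior.

```python
RESPONSE_DELAY_SECONDS = 1.5
RESPONSE_CHUNK = b"."
MAX_RESPONSE_BYTES = 1200


def max_hold_seconds(delay, chunk, max_bytes):
    """Upper bound on hold time, assuming one `delay` per chunk sent."""
    chunks = max_bytes // len(chunk)
    return chunks * delay


hold = max_hold_seconds(RESPONSE_DELAY_SECONDS, RESPONSE_CHUNK, MAX_RESPONSE_BYTES)
print(f"{hold:.0f} s (~{hold / 60:.0f} min)")  # 1800 s (~30 min)
```

Under that assumption, the defaults hold a patient client for about 30 minutes. Raising `RESPONSE_DELAY_SECONDS` lengthens the hold, but clients with short read timeouts may disconnect sooner.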
- Register for a free MaxMind account at maxmind.com.
- Download `GeoLite2-City.mmdb` and `GeoLite2-ASN.mmdb`.
- Place both files in the `data/` directory (created automatically on first run).
GeoIP enrichment is automatically disabled if the database files are not present.
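
One way such a graceful fallback could look is sketched below. The function names and the returned field names are assumptions for illustration, not the project's code. The `geoip2` calls used (`geoip2.database.Reader`, `.city()`, `.asn()`) belong to the official MaxMind client.

```python
from pathlib import Path

CITY_DB = "GeoLite2-City.mmdb"
ASN_DB = "GeoLite2-ASN.mmdb"


def open_geoip_readers(data_dir="data"):
    """Return (city_reader, asn_reader), or None if either file is missing."""
    city_path = Path(data_dir) / CITY_DB
    asn_path = Path(data_dir) / ASN_DB
    if not (city_path.is_file() and asn_path.is_file()):
        return None  # enrichment stays disabled
    # Imported lazily so this module loads even without geoip2 installed.
    import geoip2.database
    return (geoip2.database.Reader(str(city_path)),
            geoip2.database.Reader(str(asn_path)))


def enrich(ip, readers):
    """Return GeoIP fields for `ip`, or an empty dict when unavailable."""
    if readers is None:
        return {}
    import geoip2.errors
    city_reader, asn_reader = readers
    try:
        city = city_reader.city(ip)
        asn = asn_reader.asn(ip)
    except geoip2.errors.AddressNotFoundError:
        return {}  # address not present in the databases
    return {
        "country": city.country.iso_code,
        "city": city.city.name,
        "asn": asn.autonomous_system_number,
        "asn_org": asn.autonomous_system_organization,
    }
```

Returning an empty dict rather than raising keeps the logging path identical whether or not the databases are installed.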
poetry run python main.py

Create `/etc/systemd/system/http-tarpit.service`:

[Unit]
Description=HTTP Tarpit & Bot Analyzer
After=network.target
[Service]
Type=simple
User=tarpit
WorkingDirectory=/opt/http-tarpit
ExecStart=/opt/http-tarpit/.venv/bin/python main.py
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable --now http-tarpit
sudo systemctl status http-tarpit

To redirect traffic from common scan targets (e.g., port 80, 8888) to the tarpit without running as root:

# Redirect port 80 to tarpit on port 8080
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
# Redirect additional ports
sudo iptables -t nat -A PREROUTING -p tcp --dport 8888 -j REDIRECT --to-port 8080

The tarpit reads the `X-Tarpit-Target-Port` header to record the originally targeted port in the database. Set this header in your reverse proxy configuration if using nginx or HAProxy as a front-end.
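
On the application side, reading that header might look like the sketch below. `target_port` is a hypothetical helper, and the fallback to the listen port is an assumption about what happens when no proxy sets the header (for example with plain iptables redirection). The nginx configuration that sets the header follows.

```python
def target_port(headers, listen_port=8080):
    """Return the originally targeted port, or `listen_port` as a fallback.

    `headers` is any mapping of header names to values. aiohttp's
    request.headers is case-insensitive; a plain dict, as used here,
    is not.
    """
    raw = headers.get("X-Tarpit-Target-Port", "")
    try:
        port = int(raw)
    except (TypeError, ValueError):
        return listen_port  # header missing or not a number
    # Reject values outside the valid TCP port range.
    return port if 1 <= port <= 65535 else listen_port


print(target_port({"X-Tarpit-Target-Port": "80"}))  # 80
print(target_port({}))                              # 8080
```

The value is only trustworthy when a front-end proxy overwrites it, since a client that reaches the tarpit directly can send any header it likes.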
server {
listen 80 default_server;
location / {
proxy_pass http://127.0.0.1:8080;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Tarpit-Target-Port $server_port;
proxy_read_timeout 3600s;
}
}

| Path | Contents |
|---|---|
| `logs/tarpit.log` | Structured JSON log of all events |
| `data/tarpit_events.db` | SQLite database with full event records |
The SQLite database is created automatically on first run. The schema includes GeoIP fields, AbuseIPDB reporting status, full request metadata, and session timing.
This project generates structured datasets suited for threat intelligence research, botnet activity analysis, and behavioral classification studies. The tarpit_events.db database supports complex SQL queries for identifying scanning trends, injection attempts, and geographic distribution of malicious actors.
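
As an illustration of the kind of query this enables, the sketch below ranks User-Agent strings by hit count. The table name `events`, its columns, and the sample rows are all made up for the example. The real schema in `tarpit_events.db` may differ, so inspect it first (for instance with `.schema` in the `sqlite3` shell).

```python
import sqlite3

# In-memory stand-in for tarpit_events.db with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "source_ip TEXT, user_agent TEXT, country TEXT, duration_seconds REAL)"
)
# Made-up sample rows using documentation-reserved IP ranges.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("198.51.100.1", "zgrab/0.x", "US", 12.0),
        ("198.51.100.2", "zgrab/0.x", "US", 30.0),
        ("203.0.113.9", "curl/8.0", "DE", 3.0),
    ],
)


def top_user_agents(conn, limit=10):
    """Return (user_agent, hits, avg_duration) rows, most frequent first."""
    return conn.execute(
        "SELECT user_agent, COUNT(*) AS hits, "
        "AVG(duration_seconds) AS avg_duration "
        "FROM events GROUP BY user_agent "
        "ORDER BY hits DESC, user_agent LIMIT ?",
        (limit,),
    ).fetchall()


print(top_user_agents(conn))
# [('zgrab/0.x', 2, 21.0), ('curl/8.0', 1, 3.0)]
```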
The dataset generated by this tarpit is available on Zenodo:
All published datasets are pre-processed for forensic fidelity, including IP address sanitization and exclusion of operational reporting metadata.
The integrated AbuseIPDB module operates without ASN-based whitelisting. This provides utility for internal security monitoring but can produce false positives for high-frequency legitimate crawlers (e.g., search engines, security research scanners).
ASN-based filtering or User-Agent whitelisting is strongly recommended before enabling ABUSEIPDB_API_KEY in production.
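
A pre-report filter along these lines could be sketched as follows. `ReportGate`, the allowlist contents, and the matching rules are assumptions for illustration; the project itself ships no such filter. Only the 40-minute interval comes from the documented `ABUSEIPDB_REPORT_INTERVAL_MINUTES` default.

```python
import time

ABUSEIPDB_REPORT_INTERVAL_MINUTES = 40

# Hypothetical allowlists; tune these for your own environment.
ALLOWED_ASNS = {15169}  # AS15169 is Google
ALLOWED_UA_SUBSTRINGS = ("Googlebot", "bingbot")


class ReportGate:
    """Decide whether an IP should be reported right now."""

    def __init__(self, interval_minutes=ABUSEIPDB_REPORT_INTERVAL_MINUTES,
                 clock=time.monotonic):
        self._interval = interval_minutes * 60
        self._clock = clock
        self._last_report = {}  # ip -> time of the last accepted report

    def should_report(self, ip, asn=None, user_agent=""):
        if asn in ALLOWED_ASNS:
            return False
        ua = user_agent.lower()
        if any(s.lower() in ua for s in ALLOWED_UA_SUBSTRINGS):
            return False
        now = self._clock()
        last = self._last_report.get(ip)
        if last is not None and now - last < self._interval:
            return False  # same IP was reported too recently
        self._last_report[ip] = now
        return True


gate = ReportGate()
print(gate.should_report("203.0.113.5"))             # True
print(gate.should_report("203.0.113.5"))             # False
print(gate.should_report("203.0.113.6", asn=15169))  # False
```

User-Agent strings are trivially spoofed, so a User-Agent allowlist on its own reduces false positives but also lets impostors through. Combining it with ASN checks is the safer choice.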
If generating datasets for publication, reporting metadata should be excluded. Automated reports may propagate inaccurate reputation signals to the community.
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Async runtime | asyncio, aiohttp |
| Dependency management | Poetry |
| Configuration | python-dotenv |
| Geolocation | geoip2 (MaxMind GeoLite2) |
| Database | SQLite (sqlite3) |
If you use this framework or the associated dataset in your research, please cite:
@software{boiko_2026_tarpit_git,
author = {Boiko, Viktor and Spesivtsev, Mykola},
title = {HTTP Tarpit \& Bot Analyzer},
month = jun,
year = 2026,
publisher = {GitHub},
version = {v1.0.0},
url = {https://github.com/boykoatwork/http-tarpit}
}

This project is licensed under the MIT License. See the LICENSE file for details.