ssupshub/Telegram-Deployment-Automation-Bot

🤖 Telegram Deployment Automation Bot

A production-ready, secure Telegram bot for triggering deployments to staging and production environments, with role-based access control, audit logging, real-time log streaming, concurrent health checks, deploy locking, subprocess timeouts, and auto-rollback.


Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                        TELEGRAM DEPLOYMENT BOT                          │
│                                                                         │
│  Developer (Telegram)                                                   │
│       │                                                                 │
│       │  /deploy production                                             │
│       ▼                                                                 │
│  ┌─────────────┐    RBAC     ┌──────────────────┐   Audit Log           │
│  │  Bot Handler│ ──────────► │  Role Check      │ ──────────► File/S3   │
│  │  (PTB)      │             │  (admin_ids list)│                       │
│  └─────────────┘             └────────┬─────────┘                       │
│                                       │ ✅ Authorized                   │
│                              ┌────────▼─────────┐                       │
│                              │ Deploy Lock      │ ← prevents double-    │
│                              │ (_deploying set) │   deploy race cond.   │
│                              └────────┬─────────┘                       │
│                                       │ ✅ Lock acquired                │
│                              ┌────────▼─────────┐                       │
│                              │ Inline Confirm   │                       │
│                              │ (commit hash)    │                       │
│                              └────────┬─────────┘                       │
│                                       │ ✅ Confirmed                    │
│                              ┌────────▼─────────┐                       │
│                              │ DeploymentManager│                       │
│                              │ subprocess exec  │                       │
│                              │ + timeout guard  │                       │
│                              └────────┬─────────┘                       │
│                    ┌──────────────────┼──────────────────┐              │
│                    ▼                  ▼                  ▼              │
│             ┌───────────┐  ┌──────────────────┐  ┌────────────┐         │
│             │ Git Pull  │  │  Docker Build    │  │ Push to ECR│         │
│             └───────────┘  └──────────────────┘  └─────┬──────┘         │
│                                                        │                │
│                    ┌───────────────────────────────────┘                │
│                    ▼                                                    │
│             ┌─────────────────┐                                         │
│             │  Health Check   │ ← state files written only              │
│             │  (retry loop)   │   AFTER this passes                     │
│             └────────┬────────┘                                         │
│              ✅ Pass │  ❌ Fail                                         │
│        ┌─────────────┴──────────────┐                                   │
│        ▼                            ▼                                   │
│  ┌───────────────┐         ┌─────────────────┐                          │
│  │ Notify user ✅│         │  Auto-Rollback  │                          │
│  │ Release lock  │         │  Notify user ❌ │                          │
│  └───────────────┘         │  Release lock   │                          │
│                            └─────────────────┘                          │
└─────────────────────────────────────────────────────────────────────────┘

Project Structure

telegram-deploy-bot/
│
├── bot/                        # Python bot source
│   ├── bot.py                  # Entry point, command handlers, deploy lock
│   ├── config.py               # Lazy classmethod config (all values read at call time)
│   ├── rbac.py                 # Role-based access control decorator
│   ├── audit_logger.py         # Structured audit log (JSON Lines)
│   ├── deployment.py           # Deployment orchestration + subprocess timeout
│   └── requirements.txt        # Runtime Python dependencies
│
├── scripts/                    # Shell scripts (the actual deploy work)
│   ├── deploy.sh               # Full deployment pipeline
│   └── rollback.sh             # Rollback to previous image
│
├── terraform/                  # AWS infrastructure as code
│   ├── main.tf                 # EC2 + ECR + IAM + VPC + OIDC
│   └── destroy.sh              # Safe teardown of all AWS resources
│
├── docs/                       # Documentation
│   ├── INSTALLATION.md         # Step-by-step installation guide
│   └── BENEFITS.md             # Why use this bot
│
├── nginx/                      # Reverse proxy (webhook mode)
│   └── nginx.conf
│
├── monitoring/                 # Prometheus config
│   └── prometheus.yml
│
├── .github/
│   └── workflows/
│       └── ci-cd.yml           # GitHub Actions CI/CD pipeline
│
├── Dockerfile                  # Multi-stage Docker build for the bot
├── docker-compose.yml          # Run the bot + supporting services
├── requirements-dev.txt        # Pinned dev + test dependencies
├── .env.example                # Environment variable template
├── .secrets.baseline           # detect-secrets baseline (committed)
├── pytest.ini                  # Pytest configuration
└── README.md

Getting Started

📖 Full step-by-step installation instructions are in docs/INSTALLATION.md.

Prerequisites:

GitHub Actions secrets required (set these under Settings → Secrets and variables → Actions):

| Secret | Source |
| --- | --- |
| TELEGRAM_BOT_TOKEN | From @BotFather |
| TELEGRAM_CHAT_ID | Your Telegram user ID (from @userinfobot) |
| ECR_REGISTRY | terraform output ecr_registry |
| AWS_DEPLOY_ROLE_ARN | terraform output deploy_role_arn |
| STAGING_SSH_KEY | Contents of ~/.ssh/deploy_key |
| PRODUCTION_SSH_KEY | Contents of ~/.ssh/deploy_key (same file) |
| STAGING_HOST | terraform output staging_ip |
| PRODUCTION_HOST | terraform output production_ip |
| STAGING_HEALTH_URL | http://<staging-ip>/health |
| PRODUCTION_HEALTH_URL | http://<production-ip>/health |

Bot Commands

| Command | Role Required | Description |
| --- | --- | --- |
| /start or /help | Any authorized | Show available commands |
| /deploy staging | Staging | Deploy develop branch to staging |
| /deploy production | Admin | Deploy main branch to production (requires confirmation) |
| /rollback staging | Admin | Rollback staging to the previous image |
| /rollback production | Admin | Rollback production to the previous image |
| /status | Staging | Show health and deployed commit for all environments |

Environment Variables

All configuration is read from environment variables at call time and is never frozen at import time. Copy .env.example to .env to get started.

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| TELEGRAM_BOT_TOKEN | ✅ | - | Bot token from @BotFather |
| ADMIN_TELEGRAM_IDS | ✅ | - | Comma-separated admin user IDs |
| STAGING_TELEGRAM_IDS | - | - | Comma-separated staging user IDs |
| REGISTRY_URL | ✅ | - | ECR registry URL |
| REGISTRY_IMAGE | - | myapp | Docker image name |
| AWS_REGION | - | us-east-1 | AWS region for ECR auth |
| STAGING_HOST | - | - | Staging server IP/hostname |
| PRODUCTION_HOST | - | - | Production server IP/hostname |
| DEPLOY_USER | - | deploy | SSH user on target servers |
| SSH_KEY_PATH | - | /app/secrets/deploy_key | Path to SSH deploy key |
| STAGING_HEALTH_URL | - | - | Health check endpoint for staging |
| PRODUCTION_HEALTH_URL | - | - | Health check endpoint for production |
| HEALTH_CHECK_TIMEOUT | - | 30 | Seconds per health check request |
| HEALTH_CHECK_RETRIES | - | 5 | Number of health check retries |
| DEPLOY_TIMEOUT_SECONDS | - | 600 | Max seconds before deploy is killed |
| USE_KUBERNETES | - | false | Use kubectl instead of Docker Compose |
| KUBE_NAMESPACE | - | default | Kubernetes namespace |
| AUDIT_LOG_PATH | - | /var/log/deploybot/audit.log | Audit log file path |
| GITHUB_BRANCH_STAGING | - | develop | Branch deployed to staging |
| GITHUB_BRANCH_PRODUCTION | - | main | Branch deployed to production |
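
To illustrate what "read at call time" means in practice, here is a minimal sketch of a lazy classmethod config. It is not the contents of bot/config.py: the class body and the method names `deploy_timeout_seconds` and `admin_ids` are assumptions made for the example. Only the variable names and the default of 600 come from the table above.

```python
import os


class Config:
    """Hypothetical stand-in for bot/config.py: every value is read on call."""

    @classmethod
    def deploy_timeout_seconds(cls) -> int:
        # Looked up on each call, so a changed environment is picked up.
        return int(os.environ.get("DEPLOY_TIMEOUT_SECONDS", "600"))

    @classmethod
    def admin_ids(cls) -> set[int]:
        # "123, 456" -> {123, 456}; blank entries are ignored.
        raw = os.environ.get("ADMIN_TELEGRAM_IDS", "")
        return {int(part) for part in raw.split(",") if part.strip()}
```

Because nothing is cached on the class, a test can change an environment variable and see the new value on the next call without reloading the module.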

Security Architecture

Role-Based Access Control (RBAC)

ADMIN   → full access: production deploy, rollback, staging, /status
          set via: ADMIN_TELEGRAM_IDS=123456789,987654321

STAGING → limited access: staging deploy + /status only
          set via: STAGING_TELEGRAM_IDS=111222333

Roles are enforced by the @require_role decorator on every handler. Admin role is re-verified on every callback button press, so buttons cannot be replayed by unauthorized users.
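
A role-checking decorator of this kind could look roughly like the sketch below. It is an illustration, not the code in bot/rbac.py: the helpers `_ids` and `roles_for`, the role strings, and the denial text are assumptions. The handler signature `(update, context)` and the `effective_user` / `effective_message` attributes follow python-telegram-bot conventions. Audit logging of denials is left out for brevity.

```python
import functools
import os


def _ids(var: str) -> set[int]:
    # Parse a comma-separated ID list from the environment at call time.
    return {int(p) for p in os.environ.get(var, "").split(",") if p.strip()}


def roles_for(user_id: int) -> set[str]:
    # Assumption: admins implicitly hold the staging role as well.
    if user_id in _ids("ADMIN_TELEGRAM_IDS"):
        return {"admin", "staging"}
    if user_id in _ids("STAGING_TELEGRAM_IDS"):
        return {"staging"}
    return set()


def require_role(role: str):
    """Hypothetical decorator: run the handler only if the caller holds `role`."""
    def decorator(handler):
        @functools.wraps(handler)
        async def wrapper(update, context):
            user_id = update.effective_user.id
            if role not in roles_for(user_id):
                await update.effective_message.reply_text("Access denied.")
                return None
            return await handler(update, context)
        return wrapper
    return decorator
```

Since the role lookup happens inside `wrapper`, the check runs on every invocation, which is the property that makes re-verification on callbacks possible.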

Deploy Lock

A module-level _deploying: set[str] prevents two concurrent deploys to the same environment. If an admin double-taps "Confirm" or a callback is replayed while a deploy is running, the second request is rejected immediately. The lock is released in a try/finally block so it is always freed, even if an unexpected exception occurs.

Command Injection Prevention

# ❌ DANGEROUS: shell injection possible
subprocess.run(f"deploy.sh {user_input}", shell=True)

# ✅ SAFE: fixed argument list, no shell interpolation
asyncio.create_subprocess_exec("/app/scripts/deploy.sh", environment, commit)

Environment and commit hash are additionally validated against strict allow-lists before reaching the subprocess call.
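
As a rough idea of what such allow-list validation might involve, consider the sketch below. The function name `validate_deploy_args` and the exact commit pattern (7 to 40 lowercase hex characters) are assumptions; the project's real rules may be stricter or different.

```python
import re

ALLOWED_ENVIRONMENTS = frozenset({"staging", "production"})
# Assumption: commits are 7 to 40 lowercase hex characters (short or full SHA-1).
COMMIT_RE = re.compile(r"[0-9a-f]{7,40}")


def validate_deploy_args(environment: str, commit: str) -> tuple[str, str]:
    """Return the arguments unchanged, or raise ValueError if either is unsafe."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    if COMMIT_RE.fullmatch(commit) is None:
        raise ValueError(f"malformed commit hash: {commit!r}")
    return environment, commit
```

`fullmatch` is used rather than `match` so that trailing characters such as `; ls` or a newline cannot ride along after a valid-looking prefix.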

Subprocess Timeout

Every deploy and rollback subprocess is wrapped in asyncio.timeout(DEPLOY_TIMEOUT_SECONDS). If deploy.sh hangs (SSH timeout, docker build stall, network issue), the process is killed and an error is streamed back to the user. The bot never hangs indefinitely.

Audit Log Integrity

The audit log writes core fields (timestamp, user_id, action) after spreading arbitrary metadata, so no metadata key can silently overwrite the forensic trail. Every action (deploy started, deploy success, deploy failed, rollback, denial) is recorded with user identity, environment, commit, and UTC timestamp.

SSH Key Cleanup

CI/CD deploy steps use trap 'rm -f /tmp/deploy_key' EXIT to guarantee the private key is deleted from the runner filesystem even if the SSH command fails.


Deployment Flow

User → /deploy production
         │
         ▼
1. RBAC check → not admin? 🚫 Denied + audited
         │ admin ✅
         ▼
2. Check deploy lock → env already deploying? ⏳ Rejected
         │ lock free ✅
         ▼
3. Fetch latest commit from Config.github_branch_production()
         │
         ▼
4. Confirmation dialog (commit hash shown)
         │ Confirm clicked
         ▼
5. Re-verify admin role on callback
         │
         ▼
6. Acquire deploy lock for environment
         │
         ▼
7. Audit log: { user, action=deploy_started, env, commit, timestamp }
         │
         ▼
8. Run deploy.sh production <commit> (timeout: DEPLOY_TIMEOUT_SECONDS)
   ├── Validate inputs (whitelist env, validate commit SHA format)
   ├── git fetch + checkout + pull origin main
   ├── docker build --no-cache (image tagged with exact commit)
   ├── aws ecr get-login-password | docker login
   ├── docker push → ECR
   ├── Save previous image ref for rollback
   ├── ssh deploy@host → docker compose up -d
   └── Health check (10 retries × 10s)
             │
             ├── ✅ PASS → write state files (commit + timestamp)
             │            audit log deploy_success
             │            notify user ✅
             │            release deploy lock
             │
             └── ❌ FAIL → audit log deploy_failed
                           notify user ❌
                           run rollback.sh (with timeout + streaming)
                           audit log auto_rollback_completed/failed
                           notify user with rollback result
                           release deploy lock

Why state files are written after the health check: If deploy.sh exits with code 1 (health check failed) and the bot triggers rollback, rollback.sh reads the previous image ref to revert to. Writing state files before the health check would record a broken deployment as the last known-good state, so the rollback would restore the broken image. State files are therefore written only after a successful health check confirms the deployment is live and healthy.
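
In this project that ordering lives in deploy.sh. The Python sketch below only illustrates the principle, using a hypothetical `finish_deploy` helper, an injected `check` coroutine in place of a real HTTP request, and a single state file holding the commit.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable


async def finish_deploy(
    check: Callable[[], Awaitable[bool]],
    state_file: Path,
    commit: str,
    retries: int = 5,
    delay_seconds: float = 10.0,
) -> bool:
    """Poll `check`; record `commit` as last known-good only once it passes."""
    for attempt in range(retries):
        if await check():
            # Written only after the deployment is confirmed healthy.
            state_file.write_text(commit + "\n", encoding="utf-8")
            return True
        if attempt < retries - 1:
            await asyncio.sleep(delay_seconds)
    # Every attempt failed: the previous state file is left untouched,
    # so a rollback still sees the last known-good commit.
    return False
```

If every attempt fails, the function returns without touching `state_file`, which is exactly what a subsequent rollback depends on.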


Running Tests

# Install runtime + dev dependencies
pip install -r bot/requirements.txt
pip install -r requirements-dev.txt

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=bot --cov-report=term-missing

95 tests across 5 test files, covering:

  • Config lazy evaluation and env-change reflection
  • RBAC allow/deny logic and HTML parse mode on denial messages
  • Deploy lock acquisition, rejection, and guaranteed release
  • Deployment streaming, error detection, and subprocess timeout
  • Concurrent health checks via asyncio.gather()
  • Audit log integrity (metadata cannot overwrite core fields)
  • Auto-rollback triggering on deploy failure
  • Callback security (re-verification, double-confirm rejection)
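
For readers unfamiliar with the asyncio.gather() pattern mentioned in the list above, here is a small sketch of checking several environments concurrently. The `check_all` helper and the injected check coroutines are hypothetical. The real bot presumably issues HTTP requests to the configured health URLs.

```python
import asyncio
from typing import Awaitable, Callable

HealthCheck = Callable[[], Awaitable[bool]]


async def check_all(checks: dict[str, HealthCheck]) -> dict[str, bool]:
    """Run every environment's check concurrently; an exception counts as unhealthy."""
    names = list(checks)
    results = await asyncio.gather(
        *(checks[name]() for name in names), return_exceptions=True
    )
    return {name: result is True for name, result in zip(names, results)}
```

With `return_exceptions=True`, one unreachable environment does not abort the others, and the total wait is bounded by the slowest check rather than the sum of all of them.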

CI/CD Pipeline

Push to develop → test → build → push to ECR → deploy to staging → health check
Push to main    → test → build → push to ECR → [approval gate] → deploy to production → health check → notify Telegram
Pull request    → test only

All GitHub Actions are pinned to specific versions (no @master tags). The security scan (detect-secrets) runs against a committed .secrets.baseline so it produces stable, reproducible results.


Teardown

cd terraform/
bash destroy.sh            # interactive: prompts "type DESTROY to confirm"
bash destroy.sh --dry-run  # preview all commands without executing
bash destroy.sh --force    # skip confirmation (CI use)
bash destroy.sh --region eu-west-1  # override region

Tears down EC2 instances, ECR repository and all images, IAM roles, VPC, subnets, internet gateway, security group, and SSH key pair.


Production Hardening Checklist

Infrastructure:
[ ] SSH: disable password auth and root login (key-only)
[ ] Security group: restrict port 22 to your IP, not 0.0.0.0/0
[ ] Rotate the SSH deploy key every 90 days
[ ] ECR: scan images on push, fail CI on CRITICAL CVEs (Trivy configured)
[ ] Add SSH server fingerprints to known_hosts instead of StrictHostKeyChecking=no

Bot Security:
[ ] Whitelist only known Telegram user IDs; never run as a public bot
[ ] Permissions re-verified on every callback (already implemented)
[ ] Deploy lock prevents concurrent deploys (already implemented)
[ ] Subprocess timeout prevents hangs (already implemented)
[ ] Never log secrets (TELEGRAM_BOT_TOKEN excluded from safe_env)

Deployment:
[ ] Require PR review before merging to main
[ ] GitHub Environment protection rules with required reviewers for production
[ ] Add post-deploy smoke tests on top of the health check
[ ] Ship audit logs to immutable storage (S3 with Object Lock, CloudWatch Logs)
[ ] Set DEPLOY_TIMEOUT_SECONDS to match your slowest expected build time

Built with Python 3.12 · python-telegram-bot 21 · Runs on AWS EC2 · Deployed via Docker · Controlled via Telegram
