Skip to content

Pneha1234/pyLoadBalancer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

28 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

HTTP Load Balancer

A high-performance, production-ready HTTP Load Balancer built with Python 3.12+ and asyncio. This project demonstrates mastery of networking, concurrency, resilience, and system architecture.

Features

Phase 1: Core Reverse Proxy (Current)

  • HTTP reverse proxy with round-robin load balancing
  • Async request forwarding using aiohttp
  • Configurable backend server pool
  • Request/response logging

Phase 2: Health Checks + Failover (Current)

  • Periodic health checks for backend servers
  • Automatic failover for unhealthy servers
  • Health status tracking

Phase 3: Advanced Routing (Planned)

  • Sticky sessions (session affinity)
  • Weighted round-robin
  • Least connections algorithm

Phase 4: Metrics & Monitoring (Planned)

  • Prometheus-compatible metrics endpoint
  • Request/response statistics
  • Latency tracking
  • Error rate monitoring

Phase 5: Advanced Features (Planned)

  • Rate limiting per client IP
  • LRU caching layer
  • Load-aware scheduling
  • Web dashboard for real-time stats

πŸ“‹ Requirements

  • Python 3.11+ (tested with 3.12)
  • aiohttp >= 3.9.0

Quick Start

Installation

# Clone the repository
git clone https://github.com/Pneha1234/pyLoadBalancer.git
cd pyLoadBalancer

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Load Balancer

# Start the load balancer
python -m load_balancer.main

The load balancer will start on http://localhost:8080

Using Docker

# Build and run with docker-compose
docker-compose up --build

# Or build manually
docker build -t py-loadbalancer .
docker run -p 8080:8080 py-loadbalancer

πŸ“ Project Structure

load_balancer/
β”œβ”€β”€ main.py              # Application entry point
β”œβ”€β”€ config.py            # Configuration settings
β”œβ”€β”€ balancer.py          # Core reverse proxy logic
β”œβ”€β”€ server_pool.py       # Server pool management
β”œβ”€β”€ metrics.py           # Metrics collection (Phase 4)
└── utils/
    └── health_checker.py # Health check logic (Phase 2)

tests/                   # Unit and integration tests
docs/                    # Documentation
  β”œβ”€β”€ ARCHITECTURE.md
  β”œβ”€β”€ TECH_JUSTIFICATION.md
  └── LOAD_TESTING.md

πŸ§ͺ Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=load_balancer --cov-report=html

# Run specific test file
pytest tests/test_server_pool.py -v

Performance

Load Test Results

Configuration:

  • Tool: wrk
  • Concurrent connections: 100
  • Duration: 30 seconds
  • Backend servers: 3

Results:

  • Throughput: 5,000+ requests/second
  • Average latency: 2-5ms
  • P95 latency: < 10ms
  • Error rate: < 0.1%
  • CPU usage: ~30% (single core)
  • Memory: ~50MB base + ~5MB per 1000 connections

Note: Results vary based on hardware and network conditions

Architecture

See ARCHITECTURE.md for detailed architecture documentation.

Client β†’ Load Balancer (Port 8080) β†’ Backend Servers (9001, 9002, 9003)

πŸ”§ Configuration

Edit load_balancer/config.py to configure:

LB_PORT = 8080
BACKEND_SERVERS = [
    "http://localhost:9001",
    "http://localhost:9002",
    "http://localhost:9003",
]
REQUEST_TIMEOUT = 30

Trade-offs & Design Decisions

What We Chose

  1. Python + asyncio

    • Pros: Great developer experience, async I/O, large ecosystem
    • Cons: Slightly slower than Go/C++, but sufficient for most use cases
  2. aiohttp

    • Pros: Single framework for client/server, async-native, mature
    • Cons: Slightly more complex than Flask, but better for high concurrency
  3. Round-robin (Phase 1)

    • Pros: Simple, fair distribution, stateless
    • Cons: Doesn't account for server capacity differences
  4. In-memory metrics (Phase 4)

    • Pros: Zero latency, no external dependencies
    • Cons: Lost on restart, not suitable for distributed systems

πŸ”„ Future Improvements

  1. Database Integration

    • Add Redis for distributed rate limiting and caching
    • Add PostgreSQL/InfluxDB for metrics persistence
  2. Advanced Load Balancing

    • Implement least connections algorithm
    • Add geographic load balancing
    • Implement consistent hashing
  3. Security

    • Add TLS/SSL termination
    • Implement authentication/authorization
    • Add DDoS protection
  4. Observability

    • Integrate with Prometheus + Grafana
    • Add distributed tracing (OpenTelemetry)
    • Structured logging with correlation IDs
  5. Scalability

    • Horizontal scaling with multiple LB instances
    • Service discovery integration (Consul, etcd)
    • Auto-scaling backend pool
  6. Performance

    • Connection pooling optimization
    • HTTP/2 support
    • WebSocket load balancing

Contributing

This is a learning project. Contributions welcome!

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - feel free to use this project for learning!

Learning Resources

Contact

For questions or suggestions, open an issue or reach out!


Built with ❀️ for learning system design and high-performance Python

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors