- Overview
- Architecture
- Prerequisites
- Getting Started
- Deployment Modes
- Setup Process
- Workflow Scripts
- Examples and Demo
- Troubleshooting
- Cleanup
- Security Considerations
- Contributing
- License and Disclaimer
Amazon EKS Blue/Green Upgrade Automation provides a solution for implementing zero-downtime upgrades for critical Kubernetes workloads. This workshop demonstrates a deployment strategy that ensures:
✨ Zero-Downtime Upgrades
- Seamless cluster version upgrades without service interruption
- Traffic switching between blue and green clusters
- Rollback capabilities for quick recovery
✨ Automation
- GitLab CI/CD pipeline orchestration
- Infrastructure as Code with Terraform
- GitOps workflow with ArgoCD integration
✨ Enterprise Features
- Internal testing capabilities before full promotion
- Automated load balancer configuration
- Comprehensive monitoring and validation
✨ Flexible Deployment Options
- Manual mode for learning and debugging
- Automated CI/CD mode for production deployments
- Step-by-step workflow for understanding the process
This repository serves two purposes: it teaches blue/green deployment patterns, and it provides an automated upgrade setup that enterprise Kubernetes environments can build on.
The blue/green deployment strategy uses two identical production environments:
┌─────────────────┐     ┌─────────────────┐
│  Blue Cluster   │     │  Green Cluster  │
│   (Version N)   │     │  (Version N+1)  │
│                 │     │                 │
│  ┌───────────┐  │     │  ┌───────────┐  │
│  │  ArgoCD   │  │     │  │  ArgoCD   │  │
│  │ (main br) │  │     │  │ (N+1 br)  │  │
│  └───────────┘  │     │  └───────────┘  │
└─────────────────┘     └─────────────────┘
         │                       │
         └───────────┬───────────┘
                     │
            ┌─────────────────┐
            │   Application   │
            │  Load Balancer  │
            │                 │
            │ Traffic Weight  │
            │   Blue: 100%    │
            │   Green: 0%     │
            └─────────────────┘
Network Configuration:
- Public Subnets: 10.0.0.0/20
- Blue Cluster Private Subnets: 10.0.16.0/20
- Green Cluster Private Subnets: 10.0.32.0/20
- Database Private Subnets: 10.0.48.0/20
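The four ranges above are consecutive /20 blocks. As a minimal sketch, assuming they are carved from a 10.0.0.0/16 VPC (the VPC range is not stated here, and `carveSubnets` is a hypothetical helper rather than part of the repository), the layout can be reproduced like this:

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Convert an unsigned 32-bit integer back to dotted-quad notation.
function intToIp(n) {
  return [24, 16, 8, 0].map((shift) => Math.floor(n / 2 ** shift) % 256).join(".");
}

// Return the first `count` subnets of size /newPrefix starting at baseIp.
function carveSubnets(baseIp, newPrefix, count) {
  const blockSize = 2 ** (32 - newPrefix); // addresses per subnet (4096 for a /20)
  const base = ipToInt(baseIp);
  return Array.from(
    { length: count },
    (_, i) => `${intToIp(base + i * blockSize)}/${newPrefix}`
  );
}

// Assumed base address; the tier names are illustrative labels only.
const tierNames = ["public", "bluePrivate", "greenPrivate", "databasePrivate"];
const cidrs = carveSubnets("10.0.0.0", 20, tierNames.length);
const subnetPlan = Object.fromEntries(tierNames.map((name, i) => [name, cidrs[i]]));
console.log(subnetPlan);
```

Each /20 holds 4,096 addresses, so the blue and green clusters get equally sized, non-overlapping private ranges.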
Key Components:
- Application Load Balancer (ALB): Routes traffic between clusters
- Target Groups: Separate groups for blue and green clusters
- ArgoCD: GitOps deployment automation
- GitLab CI/CD: Pipeline orchestration and automation
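To make the traffic split concrete, here is a hedged sketch of the weighted forward action that an ALB listener accepts, following the Elastic Load Balancing v2 `ForwardConfig` shape. The function name and placeholder ARNs are hypothetical, and the repository's scripts may build this configuration differently (for example through Terraform).

```javascript
// Build a "forward" action that splits traffic between two target groups.
// greenPercent is the share (0-100) of traffic sent to the green cluster.
function buildWeightedForwardAction(blueTargetGroupArn, greenTargetGroupArn, greenPercent) {
  if (!Number.isInteger(greenPercent) || greenPercent < 0 || greenPercent > 100) {
    throw new RangeError("greenPercent must be an integer between 0 and 100");
  }
  return {
    Type: "forward",
    ForwardConfig: {
      TargetGroups: [
        { TargetGroupArn: blueTargetGroupArn, Weight: 100 - greenPercent },
        { TargetGroupArn: greenTargetGroupArn, Weight: greenPercent },
      ],
    },
  };
}

// Placeholder ARNs for illustration only.
const blueArn = "arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:targetgroup/blue/PLACEHOLDER";
const greenArn = "arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:targetgroup/green/PLACEHOLDER";

// Initial state from the diagram: Blue 100%, Green 0%.
const initialAction = buildWeightedForwardAction(blueArn, greenArn, 0);
console.log(JSON.stringify(initialAction, null, 2));
```

Keeping both target groups attached to the listener at all times means that promotion and rollback only change weights, never the set of targets.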
Ensure you have the following tools and permissions in place:
- AWS CLI configured with appropriate credentials
- IAM permissions for:
  - EKS cluster creation and management
  - EC2 instance management
  - VPC and networking resources
  - Application Load Balancer management
- Clone the Repository:
git clone https://github.com/aws-samples/eks-blue-green-upgrade-automation.git # Replace this with actual repo
cd eks-blue-green-upgrade-automation # Replace this with actual repo
- Install Dependencies:
# Install Node.js dependencies
npm install
# Install zx for script execution
npm install -g zx
Manual mode is best for:
- Learning and understanding the workflow
- Development and testing environments
- Step-by-step debugging and validation
- Educational workshops and demonstrations
Characteristics:
- All scripts run locally on your machine
- Manual execution of each deployment step
- Full visibility into each operation
- Easy troubleshooting and customization
Automated CI/CD mode is best for:
- Production deployments
- Repeatable and consistent operations
- Enterprise CI/CD integration
- Reduced human error and operational overhead
Characteristics:
- GitLab CI/CD pipelines orchestrate deployment
- Only bootstrap scripts run locally
- Automated validation and rollback capabilities
- Audit trail and deployment history
Configure your environment by creating a .env.local file with your specific values:
# Get your AWS Account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# Create your local environment configuration
cat > .env.local << EOF
ENVIRONMENT_NAME=your-environment-name
KUBERNETES_VERSION=1.30 # can change this to whatever version you wish to start from
REGION=your-aws-region
ACCOUNT_ID=$ACCOUNT_ID
EKS_ADMIN_ROLE=your-admin-role-name
CI=false # Set to 'true' for automated mode
SLACK_CHANNEL="#your-slack-channel" # can leave the slack variables empty if no slack integration
SLACK_BOT_TOKEN=your-slack-bot-token
EOF
Required Configuration Values:
- ENVIRONMENT_NAME: Unique name for your deployment environment
- KUBERNETES_VERSION: Target EKS version for green cluster (1.30 or higher)
- REGION: AWS region for deployment (e.g., us-west-2, eu-west-1)
- ACCOUNT_ID: Your AWS Account ID (auto-populated above)
- EKS_ADMIN_ROLE: IAM role name for cluster administration
- CI: Toggle between manual (false) and automated (true) modes
- SLACK_CHANNEL: Optional Slack channel for notifications
- SLACK_BOT_TOKEN: Optional Slack bot token for notifications
Note: The .env.local file will override default values in .env and should contain your specific configuration. This file is typically excluded from version control.
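Before running any script, it can help to sanity-check the configuration. The following is a minimal sketch, not the repository's actual loader: `parseEnv` and `validateConfig` are hypothetical helpers, and treating the six non-Slack keys as required is an assumption based on the list above.

```javascript
// Parse KEY=value lines. Blank lines and full-line comments are skipped;
// trailing " # comment" text and surrounding quotes are stripped.
function parseEnv(text) {
  const config = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = value.match(/^(["'])(.*?)\1/);
    value = quoted ? quoted[2] : value.replace(/\s+#.*$/, "");
    config[key] = value;
  }
  return config;
}

// Assumption: every key except the Slack settings must be present.
const REQUIRED_KEYS = [
  "ENVIRONMENT_NAME", "KUBERNETES_VERSION", "REGION",
  "ACCOUNT_ID", "EKS_ADMIN_ROLE", "CI",
];

// Return a list of human-readable problems (empty when the config looks fine).
function validateConfig(config) {
  const errors = REQUIRED_KEYS.filter((k) => !config[k]).map((k) => `${k} is missing`);
  if (config.ACCOUNT_ID && !/^\d{12}$/.test(config.ACCOUNT_ID)) {
    errors.push("ACCOUNT_ID must be a 12-digit AWS account ID");
  }
  if (config.CI && !["true", "false"].includes(config.CI)) {
    errors.push("CI must be 'true' or 'false'");
  }
  return errors;
}

// Sample content with made-up values, mirroring the layout of .env.local.
const sampleEnv = [
  "ENVIRONMENT_NAME=demo",
  "KUBERNETES_VERSION=1.30 # starting version",
  "REGION=us-west-2",
  "ACCOUNT_ID=123456789012",
  "EKS_ADMIN_ROLE=Admin",
  "CI=false # Set to 'true' for automated mode",
  'SLACK_CHANNEL="#alerts" # optional',
].join("\n");

const parsedConfig = parseEnv(sampleEnv);
console.log(parsedConfig, validateConfig(parsedConfig));
```

The quote handling matters for values such as SLACK_CHANNEL, where the leading `#` is part of the value rather than the start of a comment.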
Set up the GitLab environment for CI/CD operations:
# Make scripts executable
chmod +x scripts/*.mjs
chmod +x scripts/utils/*.mjs
# Initialize the environment
./setup.mjs
# Deploy GitLab infrastructure
./1-setup-gitlab.mjs
This will:
- Deploy GitLab on an EC2 instance
- Configure GitLab projects and repositories
- Set up CI/CD variables and runners (if CI=true)
- Generate access credentials
Execute the complete workflow step by step:
# 1. Create base infrastructure
./2-create-base-infra.mjs
# 2. Deploy blue cluster with ArgoCD
./3-create-blue-cluster.mjs
# 3. Prepare for green cluster deployment
./4-setup-next-version-branch.mjs
# 4. Deploy green cluster
./5-create-green-cluster.mjs
# 5. Enable internal testing
./6-enable-internal-test.mjs
# 6. Promote green cluster to production
./7-promote-green-cluster.mjs
# Optional: Rollback if needed
./8-rollback-blue-cluster.mjs
# 7. Merge version branches
./9-merge-next-version-branch.mjs
# 8. Clean up old cluster
./10-delete-green-cluster.mjs
After GitLab setup, use the GitLab web interface to trigger pipeline stages:
- Navigate to your GitLab instance
- Access the project pipeline section
- Trigger each stage manually or configure automatic triggers
- Monitor progress through GitLab CI/CD interface
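One practical detail when wrapping the numbered scripts in your own tooling: a plain lexicographic sort places `10-delete-green-cluster.mjs` before `2-create-base-infra.mjs`. The sketch below orders files by their numeric prefix instead. `orderSteps` is a hypothetical helper for illustration and is not part of the repository.

```javascript
// Extract the leading step number from names like "10-delete-green-cluster.mjs".
function stepNumber(fileName) {
  const match = /^(\d+)-/.exec(fileName);
  return match ? Number(match[1]) : Number.POSITIVE_INFINITY; // unnumbered scripts sort last
}

// Sort by numeric prefix, falling back to name order for ties and unnumbered files.
function orderSteps(fileNames) {
  return [...fileNames].sort(
    (a, b) => stepNumber(a) - stepNumber(b) || a.localeCompare(b)
  );
}

const unordered = [
  "10-delete-green-cluster.mjs",
  "2-create-base-infra.mjs",
  "cleanup-everything.mjs",
  "1-setup-gitlab.mjs",
  "9-merge-next-version-branch.mjs",
];

const orderedSteps = orderSteps(unordered);
console.log(orderedSteps);
```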
| Script | Purpose | Components |
|---|---|---|
| 1-setup-gitlab.mjs | GitLab Infrastructure | EC2, Docker, GitLab Runner, CI/CD Variables |
| 2-create-base-infra.mjs | Base Infrastructure | VPC, ALB, Target Groups, Networking |
| 3-create-blue-cluster.mjs | Blue Cluster Setup | EKS Cluster, Node Groups, ArgoCD, Core Add-ons |
| 4-setup-next-version-branch.mjs | Version Branch Setup | Git Branch Creation, Repository Preparation |
| 5-create-green-cluster.mjs | Green Cluster Setup | EKS Cluster (N+1), ArgoCD Configuration |
| 6-enable-internal-test.mjs | Internal Testing | ALB Rules, Header-based Routing |
| 7-promote-green-cluster.mjs | Production Promotion | Traffic Weight Adjustment, Cluster Role Swap |
| 8-rollback-blue-cluster.mjs | Rollback Capability | Traffic Reversion, State Management |
| 9-merge-next-version-branch.mjs | GitOps Management | Branch Merging, Version Tagging |
| 10-delete-green-cluster.mjs | Resource Cleanup | Cluster Deletion, State Cleanup |
| cleanup-everything.mjs | Complete Cleanup | All Workload Resources |
| cleanup-gitlab.mjs | GitLab Cleanup | GitLab Infrastructure |
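The internal testing step relies on header-based ALB routing. As a rough sketch of what such a listener rule looks like in the Elastic Load Balancing v2 `CreateRule` input shape, the helper below forwards requests carrying a specific header to the green target group. The header name `X-Internal-Test`, the priority, the ARNs, and both function names are assumptions for illustration; the actual script may use different values.

```javascript
// Build the input for a listener rule that sends matching requests to green.
function buildInternalTestRule(listenerArn, greenTargetGroupArn, headerName, headerValue, priority) {
  return {
    ListenerArn: listenerArn,
    Priority: priority,
    Conditions: [
      {
        Field: "http-header",
        HttpHeaderConfig: { HttpHeaderName: headerName, Values: [headerValue] },
      },
    ],
    Actions: [{ Type: "forward", TargetGroupArn: greenTargetGroupArn }],
  };
}

// Simplified local model of rule matching: header names compare case-insensitively,
// values must match exactly (real ALB matching also supports wildcards).
function matchesRule(rule, requestHeaders) {
  const lowered = Object.fromEntries(
    Object.entries(requestHeaders).map(([k, v]) => [k.toLowerCase(), v])
  );
  return rule.Conditions.every((c) => {
    if (c.Field !== "http-header") return false; // only header conditions are modeled
    const actual = lowered[c.HttpHeaderConfig.HttpHeaderName.toLowerCase()];
    return actual !== undefined && c.HttpHeaderConfig.Values.includes(actual);
  });
}

// Hypothetical values for illustration.
const testRule = buildInternalTestRule(
  "arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:listener/app/PLACEHOLDER",
  "arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:targetgroup/green/PLACEHOLDER",
  "X-Internal-Test",
  "true",
  10
);

console.log(matchesRule(testRule, { "x-internal-test": "true" })); // tester traffic goes to green
console.log(matchesRule(testRule, { accept: "text/html" }));       // regular traffic uses the default action
```

Because only requests with the header match the rule, testers can exercise the green cluster while the default weighted action keeps regular users on blue.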
The workshop includes a sample application (nginxdemos/nginx-hello) that demonstrates:
- Initial Blue Deployment: Application running on cluster version N
- Green Cluster Creation: New cluster with version N+1
- Internal Testing: Validate green cluster with test traffic
- Traffic Switching: Gradual or immediate traffic migration
- Rollback Testing: Quick reversion to blue cluster if needed
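The traffic switching and rollback steps can be modeled as a small loop over green weights. This is a simplified sketch under stated assumptions: `promoteGradually` and the health-check callback are hypothetical, the step values are arbitrary, and the real scripts apply weights to the ALB rather than to an in-memory list.

```javascript
// Walk through increasing green weights. If a health check fails at any step,
// record a rollback to 100% blue and stop.
function promoteGradually(greenSteps, isHealthy) {
  const history = [];
  for (const green of greenSteps) {
    history.push({ blue: 100 - green, green });
    if (!isHealthy(green)) {
      history.push({ blue: 100, green: 0 }); // rollback: all traffic back to blue
      return { promoted: false, history };
    }
  }
  return { promoted: true, history };
}

// Gradual migration with every check passing.
const gradualRun = promoteGradually([10, 50, 100], () => true);

// Immediate migration is the single-step case.
const immediateRun = promoteGradually([100], () => true);

// Simulated failure once green receives half of the traffic.
const failedRun = promoteGradually([10, 50, 100], (green) => green < 50);

console.log(gradualRun, immediateRun, failedRun);
```

Since the blue cluster stays running until cleanup, reverting is only a weight change, which is what makes the rollback quick.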
Remove workload infrastructure while preserving GitLab for future deployments:
./cleanup-everything.mjs
This removes:
- Both blue and green EKS clusters
- Application Load Balancer and target groups
- VPC and networking resources
- Cluster state files
Remove all infrastructure including GitLab:
# Clean up workloads first
./cleanup-everything.mjs
# Then remove GitLab infrastructure
./cleanup-gitlab.mjs
⚠️ Warning: Complete cleanup will destroy all resources. Ensure you have backups of any important data or configurations.
IMPORTANT SECURITY WARNING: The GitLab setup in this workshop is designed for demonstration and learning purposes only and includes several security configurations that are NOT suitable for production environments.
Current GitLab Configuration:
- EC2 Instance: Deployed in a public subnet with public IP address
- GitLab Access: Runs as a Docker container accessible via HTTP (port 80)
- Security Groups: Allow access from anywhere (0.0.0.0/0) on multiple ports:
  - Port 22 (SSH) - Open to the internet
  - Port 80 (HTTP) - Open to the internet
  - Port 443 (HTTPS) - Open to the internet
  - Port 2222 (GitLab SSH) - Open to the internet
- Authentication: Uses hardcoded password (eks12345) for easy demonstration
- Encryption: HTTP traffic is unencrypted in transit
Security Risks:
- ❌ Public Internet Exposure: GitLab instance is directly accessible from the internet
- ❌ Weak Authentication: Hardcoded, well-known password
- ❌ Unencrypted Traffic: HTTP communication without TLS/SSL
- ❌ Overly Permissive Access: Security groups allow global access
- ❌ No Network Segmentation: No VPN or bastion host protection
For Production Deployments, Implement:
- ✅ Private Subnets: Deploy GitLab in private subnets behind NAT Gateway
- ✅ VPN/Bastion Access: Use VPN or bastion hosts for secure access
- ✅ HTTPS/TLS: Enable SSL certificates and force HTTPS
- ✅ Strong Authentication: Implement strong passwords, MFA, and LDAP/SSO
- ✅ Restricted Security Groups: Limit access to specific IP ranges/VPCs
- ✅ Network Segmentation: Use private networking with proper firewall rules
- ✅ Regular Updates: Keep GitLab and underlying OS updated
- ✅ Backup Strategy: Implement automated backups and disaster recovery
- ✅ Monitoring: Enable logging, monitoring, and alerting
- ✅ Secrets Management: Use AWS Secrets Manager or similar for credentials
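As one way to check the "Restricted Security Groups" item, the sketch below scans a security group description for ingress rules open to 0.0.0.0/0. The input follows the `IpPermissions` shape returned by the EC2 `DescribeSecurityGroups` API. The sample data is made up to mirror the workshop's GitLab configuration, and `findWorldOpenPorts` is a hypothetical helper, not something the repository ships.

```javascript
// Return every ingress permission that allows traffic from any IPv4 address.
function findWorldOpenPorts(securityGroup) {
  const findings = [];
  for (const perm of securityGroup.IpPermissions ?? []) {
    const openToWorld = (perm.IpRanges ?? []).some((r) => r.CidrIp === "0.0.0.0/0");
    if (openToWorld) {
      findings.push({ protocol: perm.IpProtocol, fromPort: perm.FromPort, toPort: perm.ToPort });
    }
  }
  return findings;
}

// Helper to build one TCP ingress permission for the sample below.
function tcpRule(port, cidr) {
  return { IpProtocol: "tcp", FromPort: port, ToPort: port, IpRanges: [{ CidrIp: cidr }] };
}

// Illustrative data: the four world-open ports listed above, plus one
// rule restricted to an assumed internal range.
const demoGroup = {
  GroupId: "sg-PLACEHOLDER",
  IpPermissions: [
    tcpRule(22, "0.0.0.0/0"),
    tcpRule(80, "0.0.0.0/0"),
    tcpRule(443, "0.0.0.0/0"),
    tcpRule(2222, "0.0.0.0/0"),
    tcpRule(5432, "10.0.0.0/16"),
  ],
};

const openPorts = findWorldOpenPorts(demoGroup);
console.log(openPorts.map((f) => f.fromPort));
```

A check like this could run in a pipeline stage and fail the job when any finding is returned. It covers IPv4 ranges only, so `Ipv6Ranges` would need the same treatment.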
We welcome contributions! Please read our Contributing Guidelines and Code of Conduct for more information.
This project is licensed under the MIT License - see LICENSE file.
Use at your own risk. The authors are not responsible for any issues, damages, or losses that may result from using this code in production.
Check Security Considerations for more information on the security scans.
