██████╗ ██████╗ ██╗ ██╗███████╗██████╗ ██████╗ ██████╗ ████████╗███████╗
██╔══██╗██╔═══██╗██║ ██╔╝██╔════╝██╔══██╗██╔══██╗██╔═══██╗╚══██╔══╝██╔════╝
██████╔╝██║ ██║█████╔╝ █████╗ ██████╔╝██████╔╝██║ ██║ ██║ ███████╗
██╔═══╝ ██║ ██║██╔═██╗ ██╔══╝ ██╔══██╗██╔══██╗██║ ██║ ██║ ╚════██║
██║ ╚██████╔╝██║ ██╗███████╗██║ ██║██████╔╝╚██████╔╝ ██║ ███████║
╚═╝ ╚═════╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═════╝ ╚═════╝ ╚═╝ ╚══════╝
MIT POKERBOTS 2026
A sophisticated poker AI competition framework and research platform.
The game is Toss Hold'em, a poker variant with a strategic twist:
┌─────────────────────────────────────────────────────────────────────────────┐
│ TOSS HOLD'EM RULES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ PRE-FLOP: You receive 3 hole cards (not 2!) │
│ ┌───┐ ┌───┐ ┌───┐ │
│ │ A♠│ │ K♠│ │ Q♠│ │
│ └───┘ └───┘ └───┘ │
│ │
│ FLOP: 3 community cards dealt │
│ ┌───┐ ┌───┐ ┌───┐ │
│ │ J♠│ │10♠│ │ 2♥│ │
│ └───┘ └───┘ └───┘ │
│ │
│ THE TOSS: Discard 1 card FACE UP (opponent sees it!) │
│ ┌───┐ ┌───┐ ┌───┐ │
│ │ A♠│ │ K♠│ ───► │ Q♠│ (discarded) │
│ └───┘ └───┘ └───┘ │
│ │
│ TURN/RIVER: Standard Texas Hold'em from here │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Strategic Implications:
- The discard reveals information to your opponent
- Bluffing becomes more complex (you can mislead with your toss)
- Hand reading incorporates the discarded card
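To make the hand-reading point concrete, here is a minimal sketch of how a face-up discard narrows the set of hands an opponent can hold. The card notation (`"Ts"` for the ten of spades) and the function name `possible_opponent_hands` are assumptions for illustration and are not taken from the engine.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "shdc"
DECK = [rank + suit for rank in RANKS for suit in SUITS]

def possible_opponent_hands(my_known_cards, board, opponent_discard):
    """Enumerate every two-card hand the opponent could still hold.

    my_known_cards: the two cards we kept plus our own discard (hypothetical input shape).
    board: the community cards dealt so far.
    opponent_discard: the card the opponent tossed face up.
    """
    seen = set(my_known_cards) | set(board) | {opponent_discard}
    unseen = [card for card in DECK if card not in seen]
    return list(combinations(unseen, 2))

# We kept 9c 9d and tossed 2c; the flop is Js Ts 2h; the opponent tossed Qs.
hands = possible_opponent_hands(("9c", "9d", "2c"), ("Js", "Ts", "2h"), "Qs")
print(len(hands))
```

Because the tossed card can no longer be in the opponent's hand, every combination containing it drops out of their range. A real hand reader would also weight the remaining combinations by how plausible the toss was for each one.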
Counterfactual Regret Minimization (CFR) is the foundation of our poker AI strategy.
┌──────────────────────────────────────────────────────────────────┐
│ CFR TRAINING PIPELINE │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────────┐ ┌───────────────┐ │
│ │ Game │ ───► │ Build Info │ ───► │ Calculate │ │
│ │ Tree │ │ Sets │ │ Regrets │ │
│ └─────────┘ └─────────────┘ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ ┌─────────────┐ ┌───────────────┐ │
│ │ Nash │ ◄─── │ Average │ ◄─── │ Update │ │
│ │ Equilib │ │ Strategy │ │ Strategy │ │
│ └─────────┘ └─────────────┘ └───────────────┘ │
│ │
│ Iterations: 1,000,000+ │ Info Sets: 150k - 500k+ │
└──────────────────────────────────────────────────────────────────┘
Key Features:
- CFR+ variant with linear weighting for faster convergence
- Explicit board abstraction (e.g., "K732" vs lossy "B4NFC")
- Hand abstraction categories:
  - 169 high card variations
  - 13 pair types
  - 78 two-pair combinations
  - Trips, straights, flushes, etc.
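The regret update and linear weighting named above can be illustrated on a toy game. The sketch below runs regret matching with the CFR+ floor (regrets clipped at zero) and iteration-weighted averaging on rock-paper-scissors, not on Toss Hold'em; it is not the repository's trainer, and all names in it are hypothetical.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors (a toy zero-sum game, not poker).
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])

def regret_matching(regrets):
    """Play each action in proportion to its positive regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regrets), 1.0 / len(regrets))

def train(iterations=20000):
    # Asymmetric start so the players do not simply sit at the uniform strategy.
    regrets = [np.array([1.0, 0.0, 0.0]), np.zeros(3)]
    strategy_sum = [np.zeros(3), np.zeros(3)]
    for t in range(1, iterations + 1):
        for player in (0, 1):  # alternating updates
            own = regret_matching(regrets[player])
            opp = regret_matching(regrets[1 - player])
            # Expected value of each of this player's actions against opp.
            values = PAYOFF @ opp if player == 0 else -(PAYOFF.T @ opp)
            instant_regret = values - own @ values
            # CFR+ floor: accumulated regrets never go below zero.
            regrets[player] = np.maximum(regrets[player] + instant_regret, 0.0)
            # Linear weighting: later iterations count more in the average.
            strategy_sum[player] += t * own
    return [total / total.sum() for total in strategy_sum]

average = train()
print(np.round(average[0], 3), np.round(average[1], 3))
```

Both average strategies should approach the uniform mix, which is the Nash equilibrium of rock-paper-scissors. In a full CFR implementation the same update is applied at every information set, with action values weighted by counterfactual reach probabilities.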
Bayesian opponent modeling provides real-time classification and adaptation.
┌──────────────────────────────────────────────────────────────────┐
│ OPPONENT CLASSIFICATION │
├──────────────────────────────────────────────────────────────────┤
│ │
│ Prior Distribution: │
│ ├── Fish ██████████████████████████ 25% │
│ ├── Random Weak ██████████████████████████████ 30% │
│ ├── TAG (Tight Aggressive) ██████████ 10% │
│ ├── LAG (Loose Aggressive) ██████████ 10% │
│ ├── GTO ██████████ 10% │
│ ├── Nit ██████████ 10% │
│ └── Maniac █████ 5% │
│ │
│ Update: P(type|action) ∝ P(action|type) × P(type) │
│ │
└──────────────────────────────────────────────────────────────────┘
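The update rule in the box can be sketched as follows. The prior is copied from the diagram; the likelihood table `RAISE_LIKELIHOOD` is a made-up illustration, not the values used by `hybrid_cfr`, and `bayes_update` is a hypothetical helper name.

```python
# Prior over opponent types, taken from the diagram above.
PRIOR = {
    "Fish": 0.25, "Random Weak": 0.30, "TAG": 0.10, "LAG": 0.10,
    "GTO": 0.10, "Nit": 0.10, "Maniac": 0.05,
}

# Hypothetical P(raise | type); illustrative numbers only.
RAISE_LIKELIHOOD = {
    "Fish": 0.15, "Random Weak": 0.25, "TAG": 0.30, "LAG": 0.50,
    "GTO": 0.30, "Nit": 0.05, "Maniac": 0.70,
}

def bayes_update(belief, likelihood):
    """Return P(type | action), proportional to P(action | type) * P(type)."""
    unnormalized = {t: likelihood[t] * p for t, p in belief.items()}
    total = sum(unnormalized.values())
    return {t: value / total for t, value in unnormalized.items()}

# Observing one raise shifts belief toward aggressive types.
posterior = bayes_update(PRIOR, RAISE_LIKELIHOOD)
for opponent_type, probability in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{opponent_type:12s} {probability:.3f}")
```

Feeding each posterior back in as the next prior gives the running classification over a match.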
Exploitation Strategies Per Type:
| Opponent Type | Strategy |
|---|---|
| Fish | Value bet heavily, avoid bluffing |
| Random Weak | Play tight, exploit with premiums |
| TAG | 3-bet light, apply pressure |
| LAG | Trap with strong hands |
| GTO | Play GTO back, minimize exploits |
| Nit | Steal blinds aggressively |
| Maniac | Call down light, let them bluff |
┌──────────────────────────────────────────────────────────────────┐
│ DEEP CFR ARCHITECTURE │
├──────────────────────────────────────────────────────────────────┤
│ │
│ Input Layer │
│ ┌─────────────────────────────────────────────┐ │
│ │ Hand strength, board texture, pot odds, │ │
│ │ position, opponent actions, stack sizes │ │
│ └─────────────────────┬───────────────────────┘ │
│ │ │
│ ▼ │
│ Hidden Layers [64] ─► [48] ─► [32] │
│ │ │
│ ▼ │
│ Output Layer [Fold, Check, Call, Bet sizes...] │
│ │
│ Training: 100k iterations │
│ Use: Bet sizing hints │
└──────────────────────────────────────────────────────────────────┘
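A network with the hidden sizes shown above could be written in PyTorch roughly like this. The feature count, the number of output actions, and the ReLU activations are assumptions for illustration; the diagram specifies only the hidden layer widths.

```python
import torch
from torch import nn

NUM_FEATURES = 16  # hypothetical size of the input feature vector
NUM_ACTIONS = 6    # hypothetical: fold, check, call, plus three bet sizes

def build_network(num_features=NUM_FEATURES, num_actions=NUM_ACTIONS):
    """Feed-forward net with hidden layers of 64, 48 and 32 units."""
    return nn.Sequential(
        nn.Linear(num_features, 64), nn.ReLU(),
        nn.Linear(64, 48), nn.ReLU(),
        nn.Linear(48, 32), nn.ReLU(),
        nn.Linear(32, num_actions),  # one output per action
    )

net = build_network()
batch = torch.zeros(4, NUM_FEATURES)  # four dummy game states
print(net(batch).shape)  # torch.Size([4, 6])
```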
Test Setup: 1000 hands per matchup (5 matches × 200 hands)
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERFORMANCE METRICS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ hybrid_cfr vs python_skeleton │
│ ████████████████████████████████████████████████████ +1,006 BB/100 │
│ Chips: +4,023 avg [CRUSHING] │
│ │
│ hybrid_cfr vs new_cfr │
│ ██████████████████████████████████████████ +823 BB/100 │
│ Chips: +3,291 avg [DOMINANT] │
│ │
│ hybrid_cfr vs cfr_implementation │
│ ████████████████████████████████████████ +782 BB/100 │
│ Chips: +3,128 avg [DOMINANT] │
│ │
│ hybrid_cfr vs tuff_model │
│ ██████████████████ +362 BB/100 │
│ Chips: +1,449 avg [STRONG] │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ BB/100 = Big blinds won per 100 hands │
│ Values > 100 indicate crushing performance │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
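For reference, BB/100 can be computed from a chip total as in the sketch below. The big blind of 2 chips and the reading of "Chips ... avg" as the average per 200-hand match are assumptions; they are chosen because they reproduce the figures in the box above.

```python
def bb_per_100(chips_won, hands, big_blind=2):
    """Big blinds won per 100 hands (big_blind=2 is an assumed value)."""
    return chips_won / big_blind / hands * 100

# Average chips per 200-hand match, as listed in the metrics box.
for opponent, chips in [("python_skeleton", 4023), ("new_cfr", 3291),
                        ("cfr_implementation", 3128), ("tuff_model", 1449)]:
    print(f"{opponent:20s} {bb_per_100(chips, hands=200):+.0f} BB/100")
```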
| Opponent | Avg Chips Won | BB/100 | Verdict |
|---|---|---|---|
| python_skeleton | +14,157 | +3,539 | Dominating |
| new_cfr | +13,604 | +3,401 | Dominating |
| tuff_model | +4,842 | +1,211 | Strong |
| Self (mirror) | +1,952 | +488 | Balanced |
engine-2026/
│
├── engine.py # Main game engine
├── config.py # Bot configuration
├── README.md # You are here!
│
├── BOT IMPLEMENTATIONS/
│ │
│ ├── hybrid_cfr/ # BEST PERFORMER
│ │ ├── cfr_trainer.py # CFR training (1M+ iterations)
│ │ ├── player.py # Bayesian opponent modeling
│ │ ├── deep_cfr/ # Neural network integration
│ │ │ ├── trainer.py
│ │ │ ├── networks.py
│ │ │ └── models/100k/ # Trained model
│ │ └── results/ # Test results & gamelogs
│ │
│ ├── new_cfr/ # Improved CFR
│ │ ├── cfr_trainer.py
│ │ ├── player.py
│ │ └── cfr_1M.pkl # 1M iteration trained model
│ │
│ ├── test_suite/ultron/ # IN DEVELOPMENT
│ │ ├── player.py # Main bot (1033 lines)
│ │ ├── opponent_modeling/ # Advanced tracking
│ │ ├── layers/ # Strategy layers
│ │ ├── toss_strategy/ # Discard optimization
│ │ └── training/ # Training infrastructure
│ │
│ └── cfr_implementation/ # Earlier CFR version
│
├── SKELETON CODE/
│ ├── python_skeleton/ # Python template
│ ├── java_skeleton/ # Java template
│ └── cpp_skeleton/ # C++ template
│
└── TEST OPPONENTS/
├── nemesis/ # Counter-strategy
├── tier4_adaptive/ # Adaptive exploitation
├── tier4_gto/ # GTO baseline
├── tight_aggressive/ # TAG strategy
├── ultra_aggressive/ # All-in maniac
├── ultra_passive/ # Check-call station
├── trappy/ # Slow-play specialist
└── ... (12+ bots total)
┌──────────────────────────────────────────────────────────────────┐
│ HYBRID CFR ARCHITECTURE │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ Game State │ │
│ └────────┬────────┘ │
│ │ │
│ ┌──────────────┼──────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌─────────────┐ ┌───────────────┐ │
│ │ Opponent │ │ CFR Table │ │ Deep CFR │ │
│ │ Classifier │ │ Lookup │ │ Network │ │
│ └──────┬───────┘ └──────┬──────┘ └───────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Strategy Selector │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ IF opponent = Fish/Weak → Exploit Strategy │ │ │
│ │ │ IF opponent = GTO/TAG → CFR Strategy │ │ │
│ │ │ Bet sizing hints from Deep CFR │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Final Action │ │
│ └─────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
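The selector rules in the diagram can be sketched as a small function. The diagram only covers Fish/Weak and GTO/TAG; defaulting every other type to the CFR strategy is an assumption here, and `select_strategy` is a hypothetical name rather than a function from `player.py`.

```python
# Types the diagram routes to the exploit strategy.
EXPLOITABLE_TYPES = {"Fish", "Random Weak"}

def select_strategy(posterior):
    """Pick 'exploit' or 'cfr' from the most likely opponent type.

    posterior: dict mapping opponent type to probability.
    Types not named in the diagram fall back to 'cfr' (an assumption).
    """
    most_likely = max(posterior, key=posterior.get)
    return "exploit" if most_likely in EXPLOITABLE_TYPES else "cfr"

print(select_strategy({"Fish": 0.6, "GTO": 0.3, "TAG": 0.1}))  # exploit
print(select_strategy({"Fish": 0.2, "GTO": 0.5, "TAG": 0.3}))  # cfr
```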
ULTRON (in development) is the most sophisticated architecture, with:
- Advanced opponent modeling (1033 lines)
- TossEngine for optimal discard decisions
- Multi-layer strategy system
- PPO reinforcement learning (planned)
| Component | Technology |
|---|---|
| Language | Python 3.8+ |
| Hand Evaluation | pkrbot (custom Cython library) |
| Neural Networks | PyTorch |
| Numerical Computing | NumPy |
| Communication | TCP sockets |
| Package Management | uv (recommended) |
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS/Linux
# OR
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
# Setup environment
uv venv
uv sync
# Run a match
.venv/bin/python engine.py
Edit config.py to select which bots compete:
# Example configuration
PLAYER_1 = "hybrid_cfr"
PLAYER_2 = "python_skeleton"
NUM_HANDS = 1000
C++ Setup
Requires: C++17, cmake>=3.8, boost
# Linux
sudo apt-get install -y libboost-all-dev
# macOS
brew install boost
# Windows (use WSL)
wsl --install
sudo apt install -y libboost-all-dev
Java Setup
Requires: Java 8+
# macOS
brew install --cask temurin
# Linux
sudo apt install -y openjdk-17-jdk
| Document | Description |
|---|---|
| hybrid_cfr/STRATEGY_DEEP_DIVE.md | Complete poker AI theory & implementation |
| test_suite/ULTRON_TRAINING_GUIDE.md | Training methodology & hyperparameters |
| new_cfr/DIAGNOSTICS_README.md | CFR training diagnostics & debugging |
| new_cfr/ABSTRACTION_CHANGES.md | Hand/board abstraction improvements |
[✓] CFR implementation with 1M+ iterations trained
[✓] Bayesian opponent modeling (7 archetypes)
[✓] Deep CFR neural network integration
[✓] Comprehensive test suite (12+ opponent bots)
[✓] Crushing performance: +1,000 BB/100 vs baseline
[✓] Extensive documentation & training guides
[~] ULTRON bot refinement
[~] Deep CFR retraining (larger architecture)
[~] PPO-based exploitation learning
- Deep CFR network needs larger architecture for full convergence
- ULTRON occasionally overcalls with marginal hands
- Response formatting edge cases under investigation
Part of the annual MIT Pokerbots student programming competition, where teams implement poker AIs competing in tournament-style matches. Students can use any algorithm—from simple heuristics to advanced game theory and machine learning.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ MIT POKERBOTS 2026 │
│ │
│ "Building the next generation of game-playing AI" │
│ │
└─────────────────────────────────────────────────────────────────┘
Built with CFR, neural networks, and a lot of poker theory.