jeffelin/CFR_pokerbot

MIT Pokerbots Engine 2026

    ██████╗  ██████╗ ██╗  ██╗███████╗██████╗ ██████╗  ██████╗ ████████╗███████╗
    ██╔══██╗██╔═══██╗██║ ██╔╝██╔════╝██╔══██╗██╔══██╗██╔═══██╗╚══██╔══╝██╔════╝
    ██████╔╝██║   ██║█████╔╝ █████╗  ██████╔╝██████╔╝██║   ██║   ██║   ███████╗
    ██╔═══╝ ██║   ██║██╔═██╗ ██╔══╝  ██╔══██╗██╔══██╗██║   ██║   ██║   ╚════██║
    ██║     ╚██████╔╝██║  ██╗███████╗██║  ██║██████╔╝╚██████╔╝   ██║   ███████║
    ╚═╝      ╚═════╝ ╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝╚═════╝  ╚═════╝    ╚═╝   ╚══════╝
                              MIT POKERBOTS 2026

A poker AI competition framework and research platform for MIT Pokerbots 2026, centered on CFR-based bots with Bayesian opponent modeling.

Game Variant: Toss Hold'em

A unique poker variant with a strategic twist:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           TOSS HOLD'EM RULES                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   PRE-FLOP:     You receive 3 hole cards (not 2!)                          │
│                      ┌───┐ ┌───┐ ┌───┐                                      │
│                      │ A♠│ │ K♠│ │ Q♠│                                      │
│                      └───┘ └───┘ └───┘                                      │
│                                                                             │
│   FLOP:         3 community cards dealt                                    │
│                      ┌───┐ ┌───┐ ┌───┐                                      │
│                      │ J♠│ │10♠│ │ 2♥│                                      │
│                      └───┘ └───┘ └───┘                                      │
│                                                                             │
│   THE TOSS:     Discard 1 card FACE UP (opponent sees it!)                 │
│                      ┌───┐ ┌───┐       ┌───┐                                │
│                      │ A♠│ │ K♠│  ───► │ Q♠│ (discarded)                    │
│                      └───┘ └───┘       └───┘                                │
│                                                                             │
│   TURN/RIVER:   Standard Texas Hold'em from here                           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Strategic Implications:

  • The discard reveals information to your opponent
  • Bluffing becomes more complex (you can mislead with your toss)
  • Hand reading incorporates the discarded card
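
As a minimal sketch of the toss decision space: with 3 hole cards there are exactly 3 ways to choose which card to reveal. The card notation (`"As"` for the ace of spades) and the function name `toss_options` are illustrative assumptions, not taken from the engine.

```python
from itertools import combinations


def toss_options(hole_cards):
    """Return every (kept_pair, discarded_card) split of a 3-card hand."""
    if len(hole_cards) != 3:
        raise ValueError("Toss Hold'em deals exactly 3 hole cards")
    options = []
    for kept in combinations(hole_cards, 2):
        # The discarded card is the one not in the kept pair.
        discarded = next(c for c in hole_cards if c not in kept)
        options.append((kept, discarded))
    return options


if __name__ == "__main__":
    for kept, discarded in toss_options(["As", "Ks", "Qs"]):
        print(f"keep {kept[0]} {kept[1]}, toss {discarded}")
```

A bot would score each of the three options, weighing the strength of the kept pair against what the face-up discard tells the opponent.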

Algorithms & Approaches

1. Counterfactual Regret Minimization (CFR)

The foundation of our poker AI strategy.

┌──────────────────────────────────────────────────────────────────┐
│                    CFR TRAINING PIPELINE                         │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌─────────┐      ┌─────────────┐      ┌───────────────┐       │
│   │ Game    │ ───► │ Build Info  │ ───► │ Calculate     │       │
│   │ Tree    │      │ Sets        │      │ Regrets       │       │
│   └─────────┘      └─────────────┘      └───────┬───────┘       │
│                                                  │               │
│                                                  ▼               │
│   ┌─────────┐      ┌─────────────┐      ┌───────────────┐       │
│   │ Nash    │ ◄─── │ Average     │ ◄─── │ Update        │       │
│   │ Equilib │      │ Strategy    │      │ Strategy      │       │
│   └─────────┘      └─────────────┘      └───────────────┘       │
│                                                                  │
│   Iterations: 1,000,000+  │  Info Sets: 150k - 500k+            │
└──────────────────────────────────────────────────────────────────┘
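
The "Calculate Regrets → Update Strategy" step in the pipeline above is usually done with regret matching. The following is a minimal sketch of that rule only, not the repository's trainer; the function name and the list-based representation are assumptions.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets into a strategy (one probability per action).

    Actions with positive regret are played in proportion to that regret.
    If no action has positive regret, fall back to a uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


if __name__ == "__main__":
    # Regrets for three actions at one information set, e.g. fold / call / raise.
    print(regret_matching([3.0, -1.0, 1.0]))  # [0.75, 0.0, 0.25]
```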

Key Features:

  • CFR+ variant with linear weighting for faster convergence
  • Explicit board abstraction (e.g., "K732" vs lossy "B4NFC")
  • Hand abstraction categories:
    • 169 high card variations
    • 13 pair types
    • 78 two-pair combinations
    • Trips, straights, flushes, etc.
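
To illustrate the first bullet, here is a hedged sketch of a per-information-set update in the CFR+ style: cumulative regrets are clipped at zero after every update, and the average strategy weights iteration `t` by `t`. The class `InfoSetNode` and its fields are hypothetical, and reach-probability weighting is left out for brevity, so this is not the code in `cfr_trainer.py`.

```python
class InfoSetNode:
    """Regret and strategy accumulators for one information set (sketch)."""

    def __init__(self, num_actions):
        self.regret_sum = [0.0] * num_actions
        self.strategy_sum = [0.0] * num_actions

    def current_strategy(self):
        # Under CFR+ every stored regret is already >= 0.
        total = sum(self.regret_sum)
        if total > 0:
            return [r / total for r in self.regret_sum]
        return [1.0 / len(self.regret_sum)] * len(self.regret_sum)

    def update(self, action_values, iteration):
        """Apply one update given each action's counterfactual value."""
        strategy = self.current_strategy()
        node_value = sum(p * v for p, v in zip(strategy, action_values))
        for a, value in enumerate(action_values):
            # Regret matching+: never let cumulative regret go negative.
            self.regret_sum[a] = max(self.regret_sum[a] + value - node_value, 0.0)
            # Linear averaging: later iterations count more.
            self.strategy_sum[a] += iteration * strategy[a]

    def average_strategy(self):
        total = sum(self.strategy_sum)
        if total > 0:
            return [s / total for s in self.strategy_sum]
        return [1.0 / len(self.strategy_sum)] * len(self.strategy_sum)


if __name__ == "__main__":
    node = InfoSetNode(num_actions=2)
    for t in (1, 2):
        node.update(action_values=[1.0, 0.0], iteration=t)
    print(node.average_strategy())
```

Clipping keeps a temporarily bad action from accumulating a large negative regret that it would have to work off later, which is a commonly cited reason CFR+ tends to converge faster than vanilla CFR.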

2. Bayesian Opponent Modeling

Real-time classification and adaptation.

┌──────────────────────────────────────────────────────────────────┐
│                  OPPONENT CLASSIFICATION                         │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Prior Distribution:                                            │
│   ├── Fish ██████████████████████████ 25%                       │
│   ├── Random Weak ██████████████████████████████ 30%            │
│   ├── TAG (Tight Aggressive) ██████████ 10%                     │
│   ├── LAG (Loose Aggressive) ██████████ 10%                     │
│   ├── GTO ██████████ 10%                                        │
│   ├── Nit ██████████ 10%                                        │
│   └── Maniac █████ 5%                                           │
│                                                                  │
│   Update: P(type|action) ∝ P(action|type) × P(type)             │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
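
A minimal sketch of the update rule in the box, starting from the prior shown above. The likelihood values in `RAISE_LIKELIHOOD` are made-up placeholders for illustration; the real per-type action probabilities live in the bot and are not reproduced here.

```python
PRIOR = {
    "Fish": 0.25, "Random Weak": 0.30, "TAG": 0.10, "LAG": 0.10,
    "GTO": 0.10, "Nit": 0.10, "Maniac": 0.05,
}

# Hypothetical P(raise | type); illustrative numbers only.
RAISE_LIKELIHOOD = {
    "Fish": 0.10, "Random Weak": 0.20, "TAG": 0.30, "LAG": 0.50,
    "GTO": 0.30, "Nit": 0.05, "Maniac": 0.80,
}


def bayes_update(belief, likelihood):
    """Return P(type | action), proportional to P(action | type) * P(type)."""
    unnormalized = {t: likelihood[t] * p for t, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0:
        return dict(belief)  # action impossible under every type; keep belief
    return {t: w / total for t, w in unnormalized.items()}


if __name__ == "__main__":
    posterior = bayes_update(PRIOR, RAISE_LIKELIHOOD)
    for opponent_type, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
        print(f"{opponent_type:12s} {p:.3f}")
```

Feeding each posterior back in as the next belief lets the classification sharpen as more opponent actions are observed.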

Exploitation Strategies Per Type:

| Opponent Type | Strategy                          |
|---------------|-----------------------------------|
| Fish          | Value bet heavily, avoid bluffing |
| Random Weak   | Play tight, exploit with premiums |
| TAG           | 3-bet light, apply pressure       |
| LAG           | Trap with strong hands            |
| GTO           | Play GTO back, minimize exploits  |
| Nit           | Steal blinds aggressively         |
| Maniac        | Call down light, let them bluff   |

3. Deep CFR (Neural Network)

┌──────────────────────────────────────────────────────────────────┐
│                  DEEP CFR ARCHITECTURE                           │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Input Layer                                                    │
│   ┌─────────────────────────────────────────────┐               │
│   │ Hand strength, board texture, pot odds,     │               │
│   │ position, opponent actions, stack sizes     │               │
│   └─────────────────────┬───────────────────────┘               │
│                         │                                        │
│                         ▼                                        │
│   Hidden Layers    [64] ─► [48] ─► [32]                         │
│                         │                                        │
│                         ▼                                        │
│   Output Layer     [Fold, Check, Call, Bet sizes...]            │
│                                                                  │
│   Training: 100k iterations                                      │
│   Use: Bet sizing hints                                          │
└──────────────────────────────────────────────────────────────────┘
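
The layer widths above can be sketched as a plain forward pass. The repository uses PyTorch; this version uses only the standard library so it runs without dependencies. The input size (6, one scalar per listed feature), the output size (5 actions), the ReLU activations, and the random weights are all assumptions made for illustration.

```python
import random

# input -> [64] -> [48] -> [32] -> one score per action
LAYER_SIZES = [6, 64, 48, 32, 5]


def init_layers(sizes, seed=0):
    """Create (weights, biases) for each layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, features):
    """Run the feature vector through every layer; ReLU on hidden layers."""
    x = list(features)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(v, 0.0) for v in x]
    return x


if __name__ == "__main__":
    net = init_layers(LAYER_SIZES)
    scores = forward(net, [0.5] * 6)
    print(len(scores), "action scores")
```

In Deep CFR the outputs are typically per-action advantage estimates, which are then turned into a strategy with regret matching rather than used as probabilities directly.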

Tournament Results

Hybrid CFR Performance

Test Setup: 1000 hands per matchup (5 matches × 200 hands)

┌─────────────────────────────────────────────────────────────────────────────┐
│                         PERFORMANCE METRICS                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   hybrid_cfr vs python_skeleton                                             │
│   ████████████████████████████████████████████████████ +1,006 BB/100       │
│   Chips: +4,023 avg                                          [CRUSHING]     │
│                                                                             │
│   hybrid_cfr vs new_cfr                                                     │
│   ██████████████████████████████████████████ +823 BB/100                   │
│   Chips: +3,291 avg                                          [DOMINANT]     │
│                                                                             │
│   hybrid_cfr vs cfr_implementation                                          │
│   ████████████████████████████████████████ +782 BB/100                     │
│   Chips: +3,128 avg                                          [DOMINANT]     │
│                                                                             │
│   hybrid_cfr vs tuff_model                                                  │
│   ██████████████████ +362 BB/100                                           │
│   Chips: +1,449 avg                                          [STRONG]       │
│                                                                             │
│   ─────────────────────────────────────────────────────────────────────    │
│   BB/100 = Big blinds won per 100 hands                                     │
│   Values > 100 indicate crushing performance                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
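
For reference, a small sketch of the BB/100 calculation. It assumes a big blind of 2 chips and that the "Chips" figure is the average per 200-hand match; under those assumptions it reproduces the numbers in the box above. The function name is illustrative.

```python
def bb_per_100(chips_won, hands_played, big_blind=2):
    """Big blinds won per 100 hands."""
    if hands_played <= 0 or big_blind <= 0:
        raise ValueError("hands_played and big_blind must be positive")
    return chips_won * 100 / (big_blind * hands_played)


if __name__ == "__main__":
    # hybrid_cfr vs python_skeleton: +4,023 chips per 200-hand match
    print(bb_per_100(4023, 200))  # 1005.75, reported as +1,006
```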

Extended Results (500 hands × 3 matches)

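| Opponent        | Avg Chips Won | BB/100 | Verdict    |
|-----------------|---------------|--------|------------|
| python_skeleton | +14,157       | +3,539 | Dominating |
| new_cfr         | +13,604       | +3,401 | Dominating |
| tuff_model      | +4,842        | +1,211 | Strong     |
| Self (mirror)   | +1,952        | +488   | Balanced   |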

Project Structure

engine-2026/
│
├── engine.py                      # Main game engine
├── config.py                      # Bot configuration
├── README.md                      # You are here!
│
├── BOT IMPLEMENTATIONS/
│   │
│   ├── hybrid_cfr/                # BEST PERFORMER
│   │   ├── cfr_trainer.py         # CFR training (1M+ iterations)
│   │   ├── player.py              # Bayesian opponent modeling
│   │   ├── deep_cfr/              # Neural network integration
│   │   │   ├── trainer.py
│   │   │   ├── networks.py
│   │   │   └── models/100k/       # Trained model
│   │   └── results/               # Test results & gamelogs
│   │
│   ├── new_cfr/                   # Improved CFR
│   │   ├── cfr_trainer.py
│   │   ├── player.py
│   │   └── cfr_1M.pkl             # 1M iteration trained model
│   │
│   ├── test_suite/ultron/         # IN DEVELOPMENT
│   │   ├── player.py              # Main bot (1033 lines)
│   │   ├── opponent_modeling/     # Advanced tracking
│   │   ├── layers/                # Strategy layers
│   │   ├── toss_strategy/         # Discard optimization
│   │   └── training/              # Training infrastructure
│   │
│   └── cfr_implementation/        # Earlier CFR version
│
├── SKELETON CODE/
│   ├── python_skeleton/           # Python template
│   ├── java_skeleton/             # Java template
│   └── cpp_skeleton/              # C++ template
│
└── TEST OPPONENTS/
    ├── nemesis/                   # Counter-strategy
    ├── tier4_adaptive/            # Adaptive exploitation
    ├── tier4_gto/                 # GTO baseline
    ├── tight_aggressive/          # TAG strategy
    ├── ultra_aggressive/          # All-in maniac
    ├── ultra_passive/             # Check-call station
    ├── trappy/                    # Slow-play specialist
    └── ... (12+ bots total)

Key Bot Implementations

hybrid_cfr (Best Performer)

┌──────────────────────────────────────────────────────────────────┐
│                      HYBRID CFR ARCHITECTURE                     │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│                    ┌─────────────────┐                          │
│                    │  Game State     │                          │
│                    └────────┬────────┘                          │
│                             │                                    │
│              ┌──────────────┼──────────────┐                    │
│              ▼              ▼              ▼                    │
│   ┌──────────────┐  ┌─────────────┐  ┌───────────────┐         │
│   │ Opponent     │  │ CFR Table   │  │ Deep CFR      │         │
│   │ Classifier   │  │ Lookup      │  │ Network       │         │
│   └──────┬───────┘  └──────┬──────┘  └───────┬───────┘         │
│          │                 │                 │                  │
│          ▼                 ▼                 ▼                  │
│   ┌──────────────────────────────────────────────────┐         │
│   │            Strategy Selector                      │         │
│   │  ┌─────────────────────────────────────────────┐ │         │
│   │  │ IF opponent = Fish/Weak → Exploit Strategy  │ │         │
│   │  │ IF opponent = GTO/TAG  → CFR Strategy       │ │         │
│   │  │ Bet sizing hints from Deep CFR              │ │         │
│   │  └─────────────────────────────────────────────┘ │         │
│   └──────────────────────────────────────────────────┘         │
│                             │                                    │
│                             ▼                                    │
│                    ┌─────────────────┐                          │
│                    │  Final Action   │                          │
│                    └─────────────────┘                          │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
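
The selector logic in the diagram can be sketched as follows. The diagram only names Fish/Weak and GTO/TAG; falling back to the CFR strategy for every other type is an assumption of this sketch, as are the function names and the dictionary-based strategy format. Deep CFR bet sizing hints are omitted.

```python
import random

EXPLOIT_TYPES = {"Fish", "Random Weak"}


def select_strategy(belief, cfr_strategy, exploit_strategy):
    """Pick a strategy based on the most likely opponent type."""
    likely_type = max(belief, key=belief.get)
    if likely_type in EXPLOIT_TYPES:
        return "exploit", exploit_strategy
    # GTO/TAG per the diagram; other types default to CFR here (assumption).
    return "cfr", cfr_strategy


def sample_action(strategy, rng=random):
    """Sample one action from an {action: probability} mapping."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    belief = {"Fish": 0.6, "GTO": 0.3, "TAG": 0.1}
    cfr = {"fold": 0.2, "call": 0.5, "raise": 0.3}
    exploit = {"fold": 0.0, "call": 0.3, "raise": 0.7}
    name, strategy = select_strategy(belief, cfr, exploit)
    print(name, sample_action(strategy))
```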

ULTRON (Development Bot)

The most sophisticated architecture in the repository, still under development. It includes:

  • Advanced opponent modeling (1033 lines)
  • TossEngine for optimal discard decisions
  • Multi-layer strategy system
  • PPO reinforcement learning (planned)

Technology Stack

| Component           | Technology                     |
|---------------------|--------------------------------|
| Language            | Python 3.8+                    |
| Hand Evaluation     | pkrbot (custom Cython library) |
| Neural Networks     | PyTorch                        |
| Numerical Computing | NumPy                          |
| Communication       | TCP sockets                    |
| Package Management  | uv (recommended)               |

Setup Instructions

Quick Start with uv (Recommended)

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh  # macOS/Linux
# OR
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"  # Windows

# Setup environment
uv venv
uv sync

# Run a match (on Windows, use .venv\Scripts\python instead)
.venv/bin/python engine.py

Configure Bots

Edit config.py to select which bots compete:

# Example configuration
PLAYER_1 = "hybrid_cfr"
PLAYER_2 = "python_skeleton"
NUM_HANDS = 1000

Language-Specific Setup

C++ Setup

Requires: C++17, cmake>=3.8, boost

# Linux
sudo apt-get install -y libboost-all-dev

# macOS
brew install boost

# Windows (use WSL)
wsl --install
# Then, inside the WSL shell:
sudo apt install -y libboost-all-dev

Java Setup

Requires: Java 8+

# macOS
brew install --cask temurin

# Linux
sudo apt install -y openjdk-17-jdk

Documentation

| Document                            | Description                               |
|-------------------------------------|-------------------------------------------|
| hybrid_cfr/STRATEGY_DEEP_DIVE.md    | Complete poker AI theory & implementation |
| test_suite/ULTRON_TRAINING_GUIDE.md | Training methodology & hyperparameters    |
| new_cfr/DIAGNOSTICS_README.md       | CFR training diagnostics & debugging      |
| new_cfr/ABSTRACTION_CHANGES.md      | Hand/board abstraction improvements       |

Current Status

Achievements

[✓] CFR implementation with 1M+ iterations trained
[✓] Bayesian opponent modeling (7 archetypes)
[✓] Deep CFR neural network integration
[✓] Comprehensive test suite (12+ opponent bots)
[✓] Crushing performance: +1,000 BB/100 vs baseline
[✓] Extensive documentation & training guides

In Progress

[~] ULTRON bot refinement
[~] Deep CFR retraining (larger architecture)
[~] PPO-based exploitation learning

Known Challenges

  • Deep CFR network needs larger architecture for full convergence
  • ULTRON occasionally overcalls with marginal hands
  • Response formatting edge cases under investigation

Competition Context

This project is part of the annual MIT Pokerbots student programming competition, in which teams implement poker AIs that compete in tournament-style matches. Students can use any algorithm, from simple heuristics to advanced game theory and machine learning.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│                    MIT POKERBOTS 2026                           │
│                                                                 │
│    "Building the next generation of game-playing AI"           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Built with CFR, neural networks, and a lot of poker theory.

About

CFR-based poker bot with Bayesian opponent modeling, trained over 1M+ iterations.
