graygill5/BlackJack

Blackjack RL (Reinforcement Learning) Tutorial Project

This project teaches you the basics of Reinforcement Learning (RL) using Blackjack as an example game! It's totally beginner-friendly, and every file and piece of code comes with thorough comments and explanations.

Overview

We implement a simple Q-Learning agent that learns how to play Blackjack by playing thousands of games against the house. We provide our own step-by-step Blackjack environment, a thoroughly commented agent, and a visible, easy-to-understand training loop!

Project Structure

blackjack_env.py      # The Blackjack game environment (like OpenAI Gym's)
qlearning_agent.py    # The RL agent that learns how to play
train_blackjack.py    # Script to train the agent (extremely verbose)
README.md             # This file!
  • blackjack_env.py:

    • Implements the Blackjack game logic from scratch.
    • Compatible with RL code like OpenAI Gym environments.
    • Super commented for learning.
  • qlearning_agent.py:

    • Contains a tabular Q-Learning RL agent, the simplest RL method.
    • All the key RL learning math is implemented and explained in comments.
  • train_blackjack.py:

    • Main script to train the agent!
    • Uses a progress bar and prints stats so you can see learning as it happens.
    • Every line has a comment and every step is broken down to help you learn RL (and practice Python).

How to Run

Requirements: Python 3.7 or newer (3.13+ is fine), plus the numpy and tqdm packages installed in step 1 below.

  1. Create a virtual environment and install the dependencies (skip this if you already have):

    python3 -m venv venv
    source venv/bin/activate
    pip install numpy tqdm
  2. Train the agent by running:

    python train_blackjack.py
  3. (Optional) Tweak the files and parameters to see how the agent and learning process changes. Try smaller or larger episode counts!

How it Works

  • The BlackjackEnv lets an agent play against a simulated dealer.
  • The QLearningAgent starts with no idea how to play, and slowly improves using Q-learning:
    1. It explores by sometimes picking a random action and otherwise picking the best action it knows so far (epsilon-greedy)
    2. It builds a Q-table of state-action values over time
    3. Using these values, it learns which actions lead to winning in each state
  • The training script prints updates as the agent gets better!
  • You can explore the Q-table after training, or add code for an evaluation loop.
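
The three steps above can be sketched in a few lines of Python. This is a minimal illustration of tabular Q-learning with epsilon-greedy exploration, not the code in qlearning_agent.py. The function names, the action encoding (0 = stand, 1 = hit), the state tuple layout, and the default alpha and gamma values are all assumptions made for this example.

```python
import random
from collections import defaultdict

# Assumed action encoding for this sketch: 0 = stand, 1 = hit.
ACTIONS = (0, 1)


def make_q_table():
    """Q-table mapping state -> list of action values, all starting at 0."""
    return defaultdict(lambda: [0.0 for _ in ACTIONS])


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q_table[state]
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.1, gamma=1.0):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    # A finished hand has no future value, so the target is just the reward.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]
```

For example, with a hypothetical state of (player total, dealer card, usable ace), calling `q_update(q, (20, 10, False), 0, 1.0, None, True)` after a winning stand nudges the value of standing on 20 upward, and `choose_action(q, (20, 10, False), epsilon=0.0)` then returns the greedy action for that state.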

Ideas for Exploration

  • Display the learned Q-table or visualize the policy!
  • Try different hyperparameters (alpha, gamma, epsilon)
  • Add learning curve plots
  • Implement "double down" or "split" actions to make things trickier
  • Rewrite the environment with OpenAI Gym's interface
  • Try deep Q-learning (DQN) with neural networks for fun!
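
As a starting point for the first idea, here is a small sketch that turns a Q-table into a readable policy by picking the highest-valued action in each state. It assumes the Q-table is a dict mapping each state to a list of action values, with index 0 meaning "stand" and index 1 meaning "hit". The agent in this project may store its table differently, so adapt the lookup as needed.

```python
def greedy_policy(q_table, action_names=("stand", "hit")):
    """Return {state: name of the action with the highest Q-value}."""
    policy = {}
    for state, values in q_table.items():
        # Index of the largest value; ties go to the lowest index.
        best_index = max(range(len(values)), key=lambda i: values[i])
        policy[state] = action_names[best_index]
    return policy


def print_policy(policy):
    """Print one line per state, sorted so similar states appear together."""
    for state in sorted(policy):
        print(f"{state}: {policy[state]}")


if __name__ == "__main__":
    # Made-up Q-values for illustration only; these are not training results.
    example_q = {
        (20, 10, False): [0.6, -0.4],
        (12, 10, False): [-0.5, -0.2],
    }
    print_policy(greedy_policy(example_q))
```

The same dictionary can be fed into a plotting library later if you want a heatmap instead of printed text.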

Feel free to experiment and modify anything! Learning happens best by doing :)

About

Reinforcement Learning Fall 25
