Blackjack RL (Reinforcement Learning) Tutorial Project

This project teaches you the basics of Reinforcement Learning (RL) using Blackjack as an example game! It's totally beginner-friendly, and every file and piece of code comes with thorough comments and explanations.

Overview

We implement a simple Q-Learning agent that learns how to play Blackjack by playing thousands of games against the house. We provide our own step-by-step Blackjack environment, a thoroughly commented agent, and a visible, easy-to-understand training loop!

Project Structure

blackjack_env.py      # The Blackjack game environment (like OpenAI Gym's)
qlearning_agent.py    # The RL agent that learns how to play
train_blackjack.py    # Script to train the agent (extremely verbose)
README.md             # This file!

blackjack_env.py:
- Implements the Blackjack game logic from scratch.
- Modeled loosely on OpenAI Gym environments, so it plugs into typical RL training code.
- Super commented for learning.
qlearning_agent.py:
- Contains a tabular Q-Learning RL agent, one of the simplest RL methods.
- All the key RL learning math is implemented and explained in comments.
train_blackjack.py:
- Main script to train the agent!
- Uses a progress bar and prints stats so you can see learning as it happens.
- Every line has a comment and every step is broken down to help you learn RL (and practice Python).
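
To give a feel for what a Gym-style Blackjack environment looks like, here is a minimal sketch. It is not the code in blackjack_env.py: the class name TinyBlackjackEnv, the reset/step method names, the (player total, dealer card, usable ace) state, the action numbering, and the infinite-deck simplification are all assumptions made for illustration.

```python
import random


class TinyBlackjackEnv:
    """Hypothetical, simplified stand-in for the project's BlackjackEnv."""

    STICK, HIT = 0, 1  # assumed action numbering

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def _draw(self):
        # Infinite deck: ranks 1-13, where 10, J, Q and K all count as 10.
        return min(self.rng.randint(1, 13), 10)

    @staticmethod
    def _total(cards):
        # Count one ace as 11 when doing so does not bust the hand.
        total = sum(cards)
        usable_ace = 1 in cards and total + 10 <= 21
        return (total + 10 if usable_ace else total), usable_ace

    def _state(self):
        total, usable_ace = self._total(self.player)
        return (total, self.dealer[0], usable_ace)

    def reset(self):
        # Deal two cards each; only the dealer's first card is visible.
        self.player = [self._draw(), self._draw()]
        self.dealer = [self._draw(), self._draw()]
        return self._state()

    def step(self, action):
        """Return (next_state, reward, done) for the chosen action."""
        if action == self.HIT:
            self.player.append(self._draw())
            if self._total(self.player)[0] > 21:
                return self._state(), -1.0, True  # player busts
            return self._state(), 0.0, False
        # STICK: the dealer draws until reaching 17 or more.
        while self._total(self.dealer)[0] < 17:
            self.dealer.append(self._draw())
        player_total = self._total(self.player)[0]
        dealer_total = self._total(self.dealer)[0]
        if dealer_total > 21 or player_total > dealer_total:
            reward = 1.0
        elif player_total < dealer_total:
            reward = -1.0
        else:
            reward = 0.0
        return self._state(), reward, True


# Play one episode with random actions.
env = TinyBlackjackEnv(seed=0)
state, done = env.reset(), False
while not done:
    state, reward, done = env.step(random.choice([env.STICK, env.HIT]))
print("final state:", state, "reward:", reward)
```

The reward is +1 for a win, -1 for a loss, and 0 for a draw or an unfinished hand, which is a common convention but may differ from the one the project uses.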

How to Run

Requirements: Python 3.7 or newer (Python 3.13 and later also work), and the packages listed below.

Create a virtual environment and install the dependencies (if you have not already):

python3 -m venv venv
source venv/bin/activate
pip install numpy tqdm

Train the agent by running:
```
python train_blackjack.py
```
(Optional) Tweak the files and parameters to see how the agent and the learning process change. Try smaller or larger episode counts!

How it Works

The BlackjackEnv lets an agent play against a simulated dealer.
The QLearningAgent starts with no idea how to play, and slowly improves using Q-learning:
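1. It explores with an epsilon-greedy policy: with probability epsilon it picks a random action, otherwise it picks the action with the highest current Q-value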
2. It builds a Q-table of state-action values over time
3. Using these values, it learns which actions lead to winning in each state
The training script prints updates as the agent gets better!
You can explore the Q-table after training, or add code for an evaluation loop.
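
The three steps above can be sketched as a small tabular agent. This is an illustration under assumptions, not the contents of qlearning_agent.py: the class name TinyQAgent, its method names, the dictionary-based Q-table, and the default hyperparameter values are all hypothetical. The update it applies is the standard Q-learning rule, Q(s, a) += alpha * (reward + gamma * max Q(s', a') - Q(s, a)).

```python
import random
from collections import defaultdict


class TinyQAgent:
    """Hypothetical minimal tabular Q-learning agent."""

    def __init__(self, n_actions=2, alpha=0.1, gamma=1.0, epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Q-table: maps a state to a list of action values, all starting at 0.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def choose_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state, done):
        # A terminal state has no future value to bootstrap from.
        best_next = 0.0 if done else max(self.q[next_state])
        target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# One hand-fed transition: sticking (action 0) on 20 wins the hand.
agent = TinyQAgent(alpha=0.5, epsilon=0.0)
state = (20, 10, False)  # assumed state: (player total, dealer card, usable ace)
agent.update(state, action=0, reward=1.0, next_state=state, done=True)
print(agent.q[state])  # [0.5, 0.0]
```

With alpha set to 0.5, a single winning transition moves the value of that action halfway from 0 toward the reward of 1. Repeating such updates over thousands of games is what gradually fills in the Q-table.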

Ideas for Exploration

Display the learned Q-table or visualize the policy!
Try different hyperparameters (alpha, gamma, epsilon)
Add learning curve plots
Implement "double down" or "split" actions to make things trickier
Rewrite the environment with OpenAI Gym's interface
Try deep Q-learning (DQN) with neural networks for fun!
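
As a starting point for the first idea, the sketch below prints the greedy policy stored in a Q-table as a text grid. It assumes a layout the project may not use: a dictionary keyed by (player total, dealer card, usable ace) whose values are lists of action values, with index 0 meaning stick and index 1 meaning hit. The function name greedy_policy_grid and the toy table are hypothetical.

```python
def greedy_policy_grid(q, usable_ace=False, action_names=("S", "H")):
    """Return one text row per player total (21 down to 12).

    Each row has one cell per dealer card (1 to 10) showing the
    highest-valued action, or "?" if the state was never visited.
    """
    rows = []
    for total in range(21, 11, -1):
        cells = []
        for dealer in range(1, 11):
            values = q.get((total, dealer, usable_ace))
            if values is None:
                cells.append("?")
            else:
                best = max(range(len(values)), key=values.__getitem__)
                cells.append(action_names[best])
        rows.append(f"{total:>2} " + " ".join(cells))
    return rows


# Toy Q-table with two visited states, for illustration only.
q = {
    (20, 10, False): [0.6, -0.8],   # sticking looks better on 20
    (12, 10, False): [-0.5, -0.3],  # hitting looks better on 12
}
for row in greedy_policy_grid(q):
    print(row)
```

A Q-table stored in another form, such as a NumPy array, would need its own lookup in place of q.get.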

Feel free to experiment and modify anything! Learning happens best by doing :)
