This repository contains implementations of foundational algorithms and experiments from “Reinforcement Learning: An Introduction” by Richard S. Sutton and Andrew G. Barto, recreated and explored in Jupyter Notebooks for educational purposes.
The goal of this project is to understand reinforcement learning deeply through code, progressing from basic value estimation to more advanced policy-based methods.
| File | Description |
|---|---|
| `rl.ipynb` | Core reinforcement learning implementations, covering basic algorithms such as Monte Carlo methods, Temporal Difference (TD) learning, and tabular Q-learning. |
| `rl2.ipynb` | Extended experiments exploring policy gradients, actor-critic methods, and environment simulations using OpenAI Gym. |
- Markov Decision Processes (MDPs)
- Monte Carlo Prediction & Control
- Temporal Difference Learning (TD(0), SARSA, Q-learning)
- Exploration vs. Exploitation (ε-greedy, Softmax policies)
- Policy Gradient Methods (REINFORCE)
- Actor-Critic Architectures
- Value Function Approximation
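
To give a flavor of the tabular topics listed above, here is a minimal sketch of Q-learning with an ε-greedy policy. It is not taken from the notebooks: the five-state chain environment, the `step` helper, and all hyperparameter values are assumptions chosen for illustration, and it uses only the Python standard library so it runs without any of the project's dependencies.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def step(state, action, n_states):
    """Hypothetical chain MDP: action 0 moves left, action 1 moves right.

    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action], initialized to zero
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            # Q-learning target: bootstrap from the best next action (off-policy)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy action in each non-terminal state
greedy_policy = [max(range(2), key=lambda a: row[a]) for row in q[:-1]]
print(greedy_policy)
```

With these settings the greedy policy should pick action 1 (move right) in every non-terminal state, since the only reward sits at the right end of the chain. Swapping `max(q[next_state])` for the value of the action actually taken next would turn this sketch into SARSA, the on-policy counterpart mentioned in the list.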
- Install the required libraries: `pip install numpy matplotlib gym torch jupyter`