Snake RL — Reinforcement Learning Sandbox

A research-oriented Snake reinforcement-learning environment built with PyTorch.

Demo

This demo shows a trained agent exhibiting stable navigation and late-game risk avoidance.

This project implements a reinforcement-learning Snake agent designed to study learning behavior, reward shaping, and control dynamics under delayed consequences. The agent learns purely from numerical rewards and penalties, without scripted rules or hard-coded strategies.

The repository is intended as an experimental sandbox for observing emergent behavior and failure modes in reinforcement learning.

What This Is

- A reinforcement-learning Snake environment
- A testbed for studying:
  - delayed reward effects
  - policy collapse
  - reward exploitation (looping, stalling)
  - survival vs. reward tradeoffs
- A compact environment where small reward changes produce large behavioral shifts
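
The sensitivity to small reward changes can be illustrated with a little arithmetic. The sketch below compares two fixed behaviors under a discounted return: circling safely forever, and going for food at the cost of an eventual collision. All numbers (discount factor, food reward, death penalty, step rewards, horizon) and all function names are hypothetical and chosen for illustration; they are not taken from this repository's config.py.

```python
GAMMA = 0.9  # hypothetical discount factor


def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def loop_rewards(step_reward, horizon=50):
    # Circle in open space: collect only the per-step survival reward.
    return [step_reward] * horizon


def seek_rewards(step_reward, food=1.0, death=-1.0):
    # Four steps, eat, four steps, eat, then collide on the next step.
    leg = [step_reward] * 4 + [food]
    return leg + leg + [death]


def preferred(step_reward):
    """Name the behavior with the higher discounted return."""
    loop = discounted_return(loop_rewards(step_reward))
    seek = discounted_return(seek_rewards(step_reward))
    return "loop" if loop > seek else "seek_food"


if __name__ == "__main__":
    for s in (0.1, 0.2):
        print(s, preferred(s))
```

Under these assumed values, raising the per-step reward from 0.1 to 0.2 is enough to make looping pay more than eating, which is the kind of incentive flip that can show up as stalling behavior.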

What This Is Not

❌ A scripted or rule-based Snake bot
❌ A shortest-path solver
❌ A benchmark or “perfect” Snake agent

The agent does not know explicit rules like “avoid walls.”
It updates its policy solely through gradient-based learning from outcomes.

Learning Loop

1. Observe environment state
2. Predict action values
3. Select an action (with exploration)
4. Receive reward or penalty
5. Compute prediction error
6. Update network weights
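
The six steps above can be sketched end to end in a few lines. To stay runnable with the standard library alone, this sketch swaps the PyTorch network for a lookup table and uses a Q-learning update, which is an assumption: the text says the agent predicts action values but does not name the algorithm. The five-cell corridor environment, the hyperparameters, and every name below are hypothetical.

```python
import random

N_CELLS = 5          # corridor cells 0..4; food sits in the last cell
ACTIONS = (-1, +1)   # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # hypothetical hyperparameters


def step(state, action):
    """Toy environment: the left wall is fatal, the right end holds food."""
    nxt = state + action
    if nxt < 0:
        return state, -1.0, True   # hit the wall
    if nxt == N_CELLS - 1:
        return nxt, 1.0, True      # reached the food
    return nxt, 0.0, False


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # q[state][action_index]
    for _ in range(episodes):
        state, done, steps = 1, False, 0      # 1. observe state
        while not done and steps < 100:
            values = q[state]                 # 2. predict action values
            if rng.random() < EPSILON:        # 3. select action (explore)
                a = rng.randrange(2)
            else:                             #    or exploit the best value
                a = 0 if values[0] > values[1] else 1
            nxt, reward, done = step(state, ACTIONS[a])  # 4. receive reward
            target = reward if done else reward + GAMMA * max(q[nxt])
            error = target - q[state][a]      # 5. compute prediction error
            q[state][a] += ALPHA * error      # 6. update (table, not weights)
            state = nxt
            steps += 1
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(train()):
        print(s, round(left, 3), round(right, 3))
```

With a neural network, step 6 would instead backpropagate a loss built from the same error term and let an optimizer adjust the weights.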

This process allows the agent to:

- learn from failure
- propagate consequences backward
- trade short-term reward for long-term survival
- exploit reward structures when misaligned
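
As a rough illustration of what propagating consequences backward means, the sketch below computes the discounted return at every step of one episode, working from the final step toward the first. The episode, the discount factor, and the function name are hypothetical.

```python
def returns_backward(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    g = 0.0
    out = []
    for r in reversed(rewards):   # start at the last step and walk backward
        g = r + gamma * g
        out.append(g)
    out.reverse()                 # restore chronological order
    return out


if __name__ == "__main__":
    # Three uneventful steps followed by a collision penalty.
    episode = [0.0, 0.0, 0.0, -1.0]
    print(returns_backward(episode))
```

Every earlier step receives a share of the final penalty, shrunk by one factor of gamma per step of distance, so moves made well before a collision still receive a negative signal.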

Why Snake?

Snake is well-suited for reinforcement learning research because it combines:

- simple mechanics
- a large state space
- delayed consequences
- clear and observable failure modes

This makes it an effective environment for studying learning dynamics and control behavior.
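
To give a feel for how quickly the state space grows even on tiny boards, the sketch below counts by brute force the ordered head-to-tail placements of a snake of a given length. The board sizes and the function name are illustrative only; the grid size used by this repository is not stated here.

```python
def count_snakes(width, height, length):
    """Count head-to-tail placements of a snake occupying `length` cells."""

    def extend(cell, visited, remaining):
        # Grow the body one cell at a time without crossing itself.
        if remaining == 0:
            return 1
        x, y = cell
        total = 0
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    total = 0
    for x in range(width):
        for y in range(height):
            total += extend((x, y), {(x, y)}, length - 1)
    return total


if __name__ == "__main__":
    for length in range(1, 9):
        print(length, count_snakes(4, 4, length))
```

Each counted placement is further multiplied by the number of free cells where food could sit, so the full count of distinct game states is larger still.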

Key Observations

- Learning behavior emerges from reward structure
- Optimization alone does not guarantee stability
- Control mechanisms influence long-term performance
- Misaligned incentives produce predictable failure patterns

Running

python main.py
