Hardread

Pokemon AI research across three domains: competitive battling, card-game strategy, and RPG speedrunning. Each sub-project targets a major competition and shares a common agent-architecture philosophy — structured state, validated actions, shaped rewards, and clear training pipelines.

Domains

| Domain | Competition | Status |
| --- | --- | --- |
| VGC/ | PokéAgent Challenge — Track 1 (NeurIPS 2025) | Mature: GRPO-trained LoRA on Pokemon Showdown |
| TCG/ | Pokémon TCG AI Battle (Kaggle, 2026) | In progress: behavior cloning pipeline with deck matching + Elo filtering |
| RPG/ | PokéAgent Challenge — Track 2 (NeurIPS 2025) | Scaffolding: Pokemon Emerald speedrun agent |

Shared Architecture

Each domain uses the same scaffolding pattern:

State → Featurizer → Policy → Validated Action → Reward
  • Structured state: raw observations normalized into a fixed, per-domain representation (VGC: markdown state blocks; TCG: board scalars + card-ID embeddings; RPG: pixels or parsed game state)
  • Legal-action validation: actions filtered to legal moves only, no illegal-action noise in training
  • Shaped rewards: dense per-step signals tied to game progress, not just terminal win/loss
  • Episode-level evaluation: train/val split by episode ID, not random shuffle — no information leakage inflating metrics
  • Early stopping with patience: per-epoch val metrics, best checkpoint saved, diverging runs auto-killed
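
As an illustration of the episode-level split described above, here is a minimal sketch, not the repository's actual code. It assumes each sample is a tuple whose first element is an episode ID; the names `split_of` and `split_by_episode` and the `val_fraction` parameter are hypothetical. Hashing the ID keeps every step of an episode on the same side of the split and makes the assignment reproducible across runs.

```python
import hashlib


def split_of(episode_id: str, val_fraction: float = 0.1) -> str:
    """Assign an episode to 'train' or 'val' from a stable hash of its ID."""
    digest = hashlib.sha256(episode_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "val" if bucket < val_fraction else "train"


def split_by_episode(samples, val_fraction=0.1):
    """Split (episode_id, obs, action) samples so no episode spans both sets."""
    train, val = [], []
    for sample in samples:
        target = val if split_of(sample[0], val_fraction) == "val" else train
        target.append(sample)
    return train, val


# Hypothetical data: 50 episodes with 20 steps each.
samples = [(f"ep{e}", step, "noop") for e in range(50) for step in range(20)]
train, val = split_by_episode(samples, val_fraction=0.2)
shared = {s[0] for s in train} & {s[0] for s in val}
print(f"train steps: {len(train)}, val steps: {len(val)}, shared episodes: {len(shared)}")
```

A random shuffle over individual steps would instead place neighbouring, highly correlated steps of one episode in both sets, which is the leakage the bullet above warns about.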

VGC/ — Competitive Battling

A GRPO-trained LoRA adapter on Qwen3-4B-Instruct that plays Pokemon Showdown battles through the poke-env WebSocket client. The reward is dense and shaped: damage dealt/taken, knockouts, type effectiveness, and an anti-stall step penalty. Deployed as a Hugging Face Space with a Gradio battle replay viewer.
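
To make the reward terms listed above concrete, here is a rough sketch of how a per-turn shaped reward could combine them. It is not the project's reward function: the `TurnDelta` fields, the weights (1.0, 0.5, 0.1), and the `step_penalty` default are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnDelta:
    """Hypothetical summary of what changed during one battle turn."""
    damage_dealt: float   # fraction of opposing HP removed this turn
    damage_taken: float   # fraction of own HP lost this turn
    kos_scored: int       # opposing Pokemon knocked out
    kos_suffered: int     # own Pokemon knocked out
    effectiveness: float  # type multiplier of the move used (1.0 = neutral)


def shaped_reward(d: TurnDelta, step_penalty: float = 0.01) -> float:
    """Dense per-turn reward; all weights here are illustrative assumptions."""
    reward = 1.0 * (d.damage_dealt - d.damage_taken)
    reward += 0.5 * (d.kos_scored - d.kos_suffered)
    if d.effectiveness > 1.0:
        reward += 0.1  # small bonus for a super-effective move
    elif d.effectiveness < 1.0:
        reward -= 0.1  # small penalty for a resisted or immune move
    # A constant cost per turn discourages stalling.
    reward -= step_penalty
    return reward


good_turn = TurnDelta(0.6, 0.1, 1, 0, 2.0)
stall_turn = TurnDelta(0.0, 0.0, 0, 0, 1.0)
print(shaped_reward(good_turn) > shaped_reward(stall_turn))
```

The step penalty is what makes a do-nothing turn strictly negative, so long stalling sequences accumulate cost even when no damage is exchanged.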

Live demo → | Model weights →

TCG/ — Trading Card Game

Local-first agent workflow for the Pokemon TCG AI Battle challenge. Uses the official cabt simulator through Kaggle environments. Pipeline:

Replay JSON → extract (obs, action) pairs → featurize → BC train → eval vs bots

Current capabilities:

  • Deck identity recovery from visible card IDs (Lucario/Crustle/Alakazam/unknown)
  • Agent Elo filtering via a two-pass scan over empirical win rates
  • Episode-level train/val split (no leakage)
  • Behavior cloning with dropout + gradient clipping
  • Stratified deck sampling (upsample rare decks)
  • Per-matchup evaluation harness (BC vs heuristic bots)
  • Voluntary kill switch for long Kaggle runs
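
One plausible reading of the two-pass win-rate filter listed above is sketched below; the repository's actual logic may differ. The first pass tallies games and wins per agent, and the second keeps only episodes won by agents that clear a threshold. The episode schema (`agents`, `winner` keys) and the `min_win_rate` and `min_games` parameters are assumptions made for this example.

```python
from collections import defaultdict


def filter_by_win_rate(episodes, min_win_rate=0.6, min_games=5):
    """Keep episodes whose winner has a high empirical win rate."""
    # Pass 1: count games played and games won for every agent.
    games = defaultdict(int)
    wins = defaultdict(int)
    for ep in episodes:
        for agent in ep["agents"]:
            games[agent] += 1
        if ep["winner"] is not None:
            wins[ep["winner"]] += 1

    # Agents with too few games are excluded: their win rate is too noisy.
    strong = {
        agent
        for agent, n in games.items()
        if n >= min_games and wins[agent] / n >= min_win_rate
    }

    # Pass 2: keep only episodes won by a strong agent.
    return [ep for ep in episodes if ep["winner"] in strong]


# Hypothetical replay metadata.
episodes = (
    [{"agents": ["A", "B"], "winner": "A"}] * 6
    + [{"agents": ["A", "B"], "winner": "B"}]
    + [{"agents": ["C", "D"], "winner": "C"}]
)
kept = filter_by_win_rate(episodes)
print(len(kept), {ep["winner"] for ep in kept})
```

Two passes are needed because an agent's win rate is only known after every replay has been seen once. The `min_games` floor keeps an agent with a single lucky win from passing the filter.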

Training runs: see TCG/experiments/runs.csv

RPG/ — RPG Speedrunning

Planned agent for Pokemon Emerald playthrough (thousands of timesteps, partial observability, heterogeneous actions). Targets the PokéAgent Track 2 starter kit with a VLM-driven baseline before attempting RL. Five scaffolding dimensions: State, Tools, Memory, Feedback, Fine-tuning.

Docs

Shared documentation covering agent rules, orchestration patterns, and cross-domain design principles lives in docs/.

Quick Start

git clone https://github.com/Atharva2099/Hardread.git
cd Hardread

# VGC: needs a local Pokemon Showdown server on port 8000
# (subshells keep you in the repo root between the two sets of commands)
(cd VGC && pip install -e . && python examples/run_single_episode.py)

# TCG: needs the Kaggle CLI and Docker
(cd TCG && make setup && make smoke)

License

MIT

About

AI game engine for each format of Pokemon
