PythonInteractiveRobotics


Robots observe, act, fail, retry, update beliefs, and replan. This repo shows that loop in small, readable Python — no ROS, no GPU, no simulator. Just numpy + matplotlib.

Open the example gallery, try the live playground, open a shareable live trace, or jump straight into the first runnable loop below. You can also run the flagship loops directly in Colab: pick and retry, safety filter, and human correction replanning. For language ambiguity, try the clarifying question example, or run the integrated household task agent. If the project helps you teach, prototype, or explain robotics loops, a GitHub star helps others find it.

  • Avoiding: A point robot's naive go-to-goal velocity is projected onto a CBF safe set at every step. The policy itself never knows the obstacles exist; a separate runtime safety filter slides it around them.
  • Reaching under occlusion: A 2-link arm predicts a briefly occluded moving target, keeps servoing through the occlusion, and reaches the intercept point when the target reappears.
  • Mapping while uncertain: A toy active-SLAM agent shrinks pose belief and occupancy belief at the same time, by picking moves that maximize expected entropy drop.

Try it

git clone https://github.com/rsasaki0109/PythonInteractiveRobotics.git
cd PythonInteractiveRobotics
python3 -m pip install -e .
python3 examples/manipulation/01_pick_and_retry.py

A tiny tabletop robot misses a grasp, updates its belief, and retries — in under 5 seconds. Core dependencies are numpy and matplotlib only.
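The miss, belief update, retry cycle can be sketched without any of the repository's classes. What follows is a minimal illustration, not the code in examples/manipulation/01_pick_and_retry.py: the three candidate slots, the grasp success probability `P_GRASP`, and the names `update_after_miss` and `pick_and_retry` are all assumptions made for this sketch.

```python
import random

P_GRASP = 0.8  # assumed chance that a grasp aimed at the true slot succeeds


def update_after_miss(belief, tried):
    """Bayes update after a miss at slot `tried`.

    A miss is certain if the object is elsewhere, and has probability
    1 - P_GRASP if the object really is at `tried`.
    """
    likelihood = [(1.0 - P_GRASP) if i == tried else 1.0 for i in range(len(belief))]
    posterior = [l * b for l, b in zip(likelihood, belief)]
    total = sum(posterior)
    return [p / total for p in posterior]


def pick_and_retry(true_slot, belief, rng, max_attempts=10):
    """Grasp at the most likely slot, update the belief on a miss, and retry."""
    history = []
    for _ in range(max_attempts):
        target = max(range(len(belief)), key=lambda i: belief[i])
        success = target == true_slot and rng.random() < P_GRASP
        history.append((target, success))
        if success:
            return True, history, belief
        belief = update_after_miss(belief, target)
    return False, history, belief


if __name__ == "__main__":
    ok, history, belief = pick_and_retry(
        true_slot=2, belief=[0.5, 0.3, 0.2], rng=random.Random(0)
    )
    print(ok, history, [round(b, 3) for b in belief])
```

Each miss lowers the probability of the slot that was tried and renormalizes the rest, so the next attempt may move to a different slot instead of repeating the same grasp.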

For an even smaller first loop:

python3 examples/runtime/01_sense_act_loop.py

Start Here

| If you want to see | Run | What it teaches |
| --- | --- | --- |
| Failure recovery | python3 examples/manipulation/01_pick_and_retry.py | grasp miss -> belief update -> retry |
| Runtime safety | python3 examples/navigation/29_safety_filter_cbf.py | nominal controller -> CBF projection -> safe motion |
| Active perception | python3 examples/navigation/07_active_slam_toy.py | map and pose uncertainty -> information-seeking action |
| Shareable live trace | Try live | belief entropy, compare mode, and failure timeline |
| Human correction | Open in Colab | shortcut -> human correction -> cost update -> replan |
| Language ambiguity | Open in Colab | ambiguous command -> ask question -> answer -> act |
| Integrated household task | Open in Colab | clarify -> plan -> safety check -> retry -> human replan |

Status

39 runnable examples · 38 README GIFs · 111 smoke / regression tests · 5 Gymnasium-style adapters · CI green on Python 3.10, 3.11, and 3.12.

See docs/status.md for the implementation snapshot, docs/plan.md for the working execution plan, and examples/README.md for the complete example index. The GitHub Pages gallery is generated from docs/index.html, and docs/public_launch.md keeps the public launch checklist.

Why this project?

Modern robotics is not just planning a path or running a controller once. Robots observe, act, fail, retry, update beliefs, and replan in partially observable environments. This repository teaches those loops with small, readable, runnable Python examples.

Design goals

Run in 5 seconds · minimal dependencies · no ROS / Docker / GPU / heavy simulator required · notebook friendly · interactive · closed-loop · failure-aware · educational.

Install

git clone https://github.com/rsasaki0109/PythonInteractiveRobotics.git
cd PythonInteractiveRobotics
python3 -m pip install -e .

For contributors and GIF regeneration: python3 -m pip install -e ".[dev]".

See The Loops

These GIFs are generated from the runnable examples, not separate animations.

Runtime and first manipulation loop

  • Sense-act loop: A point robot repeatedly observes noisy pose, acts, and observes again.
  • Pick and retry: A tabletop robot misses grasps, updates belief, and retries.

Manipulation

  • Reactive grasping: A gripper servos toward an updated object belief, misses because of visual bias, corrects, and grasps.
  • Closed-loop IK: A 2-link arm observes a noisy moving target and repeatedly servos with Jacobian IK until tracking stabilizes.
  • Moving target reaching: A 2-link arm predicts a briefly occluded moving target and keeps servoing until it reaches the target.
  • Object search and pick: A tabletop agent searches viewpoints, stores object memory, misses a low-confidence pick, then reobserves and succeeds.
  • Push then grasp: A target starts under a shelf, the robot detects a blocked grasp, pushes it into open space, and then picks it.
  • Probabilistic suction sorting: A suction sorter estimates per-object success probabilities, recovers from a suction miss, prepares the seal, retries, and sorts into bins.
  • Belief-guided grasp selection: A grasp agent keeps a belief over three pose hypotheses, picks the grasp with highest expected success, misses, runs a Bayes update, and tries a different grasp.
  • Active viewpoint for grasp: A grasp agent looks from the viewpoint that maximally reduces occlusion under its pose belief, updates the belief from each observation, then grasps with the type that maximizes expected success.
  • Clear path before pick: A tabletop agent tries to pick the target, gets a precondition failure because an obstacle blocks the gripper path, picks the obstacle, places it in the clear zone, and retries the original pick.
  • Conformal ask-for-help: A sorter calibrates a conformal prediction set offline, then places items when the prediction set is a singleton and asks a toy oracle for help when it is ambiguous.
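The conformal ask-for-help idea can be sketched with split conformal prediction. This is a hedged illustration rather than the repository's sorter: the nonconformity score `1 - p(label)`, the calibration scores, the class probabilities, and the names `conformal_threshold`, `prediction_set`, and `decide` are all assumptions chosen for the example.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(cal_scores)[rank - 1]


def prediction_set(probs, threshold):
    """Keep every label whose nonconformity score 1 - p is within the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= threshold]


def decide(probs, threshold):
    """Place the item when exactly one label survives; otherwise ask for help."""
    labels = prediction_set(probs, threshold)
    if len(labels) == 1:
        return ("place", labels[0])
    return ("ask", labels)


if __name__ == "__main__":
    # Illustrative calibration scores: 1 - p(true label) on held-out items.
    cal = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70]
    q = conformal_threshold(cal, alpha=0.25)
    print(decide({"red": 0.90, "blue": 0.06, "green": 0.04}, q))
    print(decide({"red": 0.48, "blue": 0.46, "green": 0.06}, q))
```

The threshold is computed once, offline. At run time the agent only checks how many labels fall inside the prediction set, which keeps the ask-or-act decision cheap.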

Navigation and recovery

  • Reactive obstacle avoidance: A grid robot uses fake lidar to avoid observed obstacles.
  • Dynamic obstacle avoidance: A grid robot avoids a moving obstacle with one-step prediction.
  • Online A* replanning: A grid robot plans through unknown space, observes a hidden wall, and replans.
  • Frontier exploration: A grid robot selects frontier cells to reveal unknown map space.
  • Belief-based navigation: A grid robot maintains a belief heatmap, estimated pose, and true pose while navigating.
  • Active SLAM toy: A grid robot reduces pose and map uncertainty with active sensing.
  • Interactive MPC: A point robot repeatedly replans short-horizon controls around a moving obstacle.
  • Blocked path recovery: A grid robot detects a newly blocked path, steps back, marks the blocked cell, and replans.
  • Localization uncertainty recovery: A grid robot starts with a bimodal pose belief, drives toward a landmark to break the symmetry, then navigates to the goal.
  • Information-gain navigation: A grid robot scouts an observation point to reveal an unknown gate state, then runs A* with full information to either the short route or the long detour.
  • Multi-agent avoidance: A grid robot shares the grid with two other goal-seeking agents, predicts each agent's next step, and plans with A* around the predicted cells to reach its own goal.
  • Safety filter (CBF): A point robot's naive go-to-goal nominal velocity is projected at each step onto a control-barrier-function half-space for each obstacle, sliding around them without the policy itself ever knowing they exist.
  • Options with interrupts: A battery-aware robot runs a go-to-goal option, gets interrupted mid-task when the battery drops below threshold, switches to dock-and-charge, then resumes go-to-goal once the battery is full.
  • Human correction replanning: A grid robot starts on a shortcut, receives a human correction before entering an unwanted zone, raises that zone's traversal cost, replans, and reaches the goal by a longer route.
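The CBF safety filter can be sketched for one circular obstacle and a single-integrator point robot (x' = u). This is an illustration under stated assumptions, not the code in examples/navigation/29_safety_filter_cbf.py: the barrier h(x) = ||x - c||^2 - r^2, the gain `alpha`, and the names `cbf_filter`, `go_to_goal`, and `simulate` are choices made for this sketch. With several obstacles, satisfying all half-spaces jointly generally needs a small QP, which is omitted here.

```python
import math


def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Project u_nom onto the half-space {u : grad_h . u + alpha * h >= 0}."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    h = dx * dx + dy * dy - radius * radius   # h >= 0 means outside the obstacle
    ax, ay = 2.0 * dx, 2.0 * dy               # gradient of h at x
    slack = ax * u_nom[0] + ay * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom                          # nominal command is already safe
    norm_sq = ax * ax + ay * ay
    if norm_sq == 0.0:
        return (0.0, 0.0)                     # degenerate: x is at the center
    scale = -slack / norm_sq                  # smallest correction along grad_h
    return (u_nom[0] + scale * ax, u_nom[1] + scale * ay)


def go_to_goal(x, goal, speed=1.0):
    """Naive nominal policy: head straight for the goal, unaware of obstacles."""
    gx, gy = goal[0] - x[0], goal[1] - x[1]
    dist = math.hypot(gx, gy)
    if dist < 1e-9:
        return (0.0, 0.0)
    k = min(speed, dist) / dist
    return (k * gx, k * gy)


def simulate(start, goal, center, radius, dt=0.05, steps=400):
    """Euler-integrate the filtered policy; return final state and the minimum h seen."""
    x = start
    min_h = float("inf")
    for _ in range(steps):
        u = cbf_filter(x, go_to_goal(x, goal), center, radius)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
        h = (x[0] - center[0]) ** 2 + (x[1] - center[1]) ** 2 - radius ** 2
        min_h = min(min_h, h)
    return x, min_h


if __name__ == "__main__":
    print(simulate(start=(0.0, 0.05), goal=(4.0, 0.0), center=(2.0, 0.0), radius=0.5))
```

For this barrier and Euler integration with alpha * dt <= 1, each step should keep h non-negative, so the robot stays outside the circle. A start exactly on the line through the obstacle center and the goal can stall at the boundary, which is why the demo start is offset slightly.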

Embodied AI

  • Goal command pick: A controlled language goal is parsed, then a tabletop robot searches, updates belief, misses grasps, and retries.
  • Door search POMDP: A room-search agent updates key-location belief after a locked door and an empty container, then finds the key.
  • Goal-conditioned minikitchen: A kitchen agent parses a bring goal, searches containers, handles a closed cabinet, picks a mug, and places it on the table.
  • Tiny VLA loop: A toy VLA loop parses a language goal, reads visual tokens, picks from low confidence, recovers with a close view, and places the block.
  • Clarifying question: A tabletop robot receives the ambiguous command "pick the block", asks which block, receives the answer "red", resolves the goal, and picks the red block.
  • Household task agent: A household robot asks which block to put away, plans through a room, rejects an unsafe floor step, retries a missed grasp, accepts human correction, replans, and stores the block.
  • Object permanence toy: An embodied agent sees an object, watches it go behind an occluder, persists its memory, walks to the remembered position, and peeks behind the occluder to recover the object.
  • Curiosity grid exploration: A grid robot keeps a visit-count map, picks the least-visited reachable cell as an intrinsic curiosity target, walks to it on an A* path, and repeats until the visited coverage of free cells crosses a threshold.
  • Empowerment navigation: A grid robot prefers cells with many reachable successors by adding a k-step empowerment shaping term to its A* edge cost, sliding around narrow corridors even when the detour is slightly longer.
  • Inverse reward from demo: A grid robot watches one demo trajectory that detours through hidden scenic zones, learns linear reward weights from the demo's feature expectation versus a uniform random walk, then plans to a new goal with a shaped A* that reproduces the demonstrator's scenic preference.
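The target-selection step of the curiosity example can be sketched as follows. This is an assumption-laden illustration, not the repository's implementation: the grid encoding ("." free, "#" wall), the tie-breaking rule, and the names `reachable_distances` and `curiosity_target` are invented here, and plain breadth-first search stands in for A* to keep the sketch short.

```python
from collections import deque


def reachable_distances(grid, start):
    """BFS over 4-connected free cells ('.'); returns {cell: steps from start}."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def curiosity_target(grid, visits, start):
    """Least-visited reachable cell; ties go to the nearest, then to the smallest (row, col)."""
    dist = reachable_distances(grid, start)
    return min(dist, key=lambda cell: (visits.get(cell, 0), dist[cell], cell))


if __name__ == "__main__":
    grid = ["..#.", "..#."]  # the right-hand column is walled off
    visits = {(0, 0): 3, (0, 1): 1, (1, 0): 1, (1, 1): 0}
    print(curiosity_target(grid, visits, start=(0, 0)))
```

Restricting the candidates to reachable cells matters: an unvisited cell behind a wall would otherwise win every time and the agent would never make progress.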

World models

  • Tiny world-model planning: A point robot predicts action-conditioned dynamics, observes drift model error, updates a residual model, and replans to the goal.
  • Model error recovery: A point robot detects a sudden dynamics shift, switches to a short system-identification probe phase, updates the learned offset, and resumes goal navigation.
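The residual-model idea can be sketched in one dimension. This is a toy under stated assumptions, not the repository's world-model example: the nominal model x_next = x + u, the constant unknown `drift`, the learning rate `lr`, and the function name `run_with_residual` are all made up for illustration.

```python
def run_with_residual(goal, drift, steps=30, lr=0.5, max_step=0.5, learn=True):
    """Drive a 1D point robot to `goal` while learning a constant model residual."""
    x, residual = 0.0, 0.0
    for _ in range(steps):
        # Plan one step with the current model: x_next = x + u + residual.
        u = max(-max_step, min(max_step, goal - x - residual))
        predicted = x + u + residual
        x_next = x + u + drift           # true dynamics, unknown to the agent
        error = x_next - predicted       # the part the model failed to predict
        if learn:
            residual += lr * error       # move the residual toward the observed error
        x = x_next
    return x, residual


if __name__ == "__main__":
    print(run_with_residual(goal=3.0, drift=0.2))               # with learning
    print(run_with_residual(goal=3.0, drift=0.2, learn=False))  # nominal model only
```

Without the update, the robot in this toy settles at an offset of roughly `drift` from the goal. With the update, the residual approaches `drift` and the planner compensates for it.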

Regenerate the GIFs with:

python3 scripts/make_gifs.py

Run the smoke suite and GIF checks with:

python3 scripts/run_all_smoke_tests.py --gifs --check-gifs

CI runs the same smoke suite and GIF checks on Python 3.10, 3.11, and 3.12.

Core idea

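obs = env.reset(seed=0)   # first observation
agent.reset()             # clear the agent's internal state

for t in range(max_steps):
    action = agent.act(obs)                      # decide from the latest observation
    obs, reward, done, info = env.step(action)   # act in the world
    agent.update(obs, reward, info)              # update the agent from the outcome
    env.render()

    if done:
        break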
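To run the loop above end to end, `env` and `agent` need concrete definitions. The sketch below supplies hypothetical stand-ins: `LineWorld`, `RetryAgent`, and `run_episode` are not classes from this repository, and the slip probability is an arbitrary illustrative value.

```python
import random


class LineWorld:
    """Hypothetical 1D world: start at cell 0, reach `goal`; a move may slip."""

    def __init__(self, goal=5, slip=0.2):
        self.goal = goal
        self.slip = slip

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.pos = 0
        return self.pos

    def step(self, action):
        slipped = self.rng.random() < self.slip  # failed move: position unchanged
        if not slipped:
            self.pos += action
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01
        return self.pos, reward, done, {"slipped": slipped}

    def render(self):
        pass  # headless placeholder; a real example would draw here


class RetryAgent:
    """Hypothetical agent: steps toward the goal and counts observed slips."""

    def __init__(self, goal=5):
        self.goal = goal

    def reset(self):
        self.slips = 0

    def act(self, obs):
        return 1 if obs < self.goal else -1

    def update(self, obs, reward, info):
        if info["slipped"]:
            self.slips += 1  # the retry happens on the next act() call


def run_episode(env, agent, max_steps=100, seed=0):
    """Same loop shape as above; returns (steps taken, done flag, slips seen)."""
    obs = env.reset(seed=seed)
    agent.reset()
    done = False
    steps = 0
    for steps in range(1, max_steps + 1):
        action = agent.act(obs)
        obs, reward, done, info = env.step(action)
        agent.update(obs, reward, info)
        env.render()
        if done:
            break
    return steps, done, agent.slips


if __name__ == "__main__":
    print(run_episode(LineWorld(), RetryAgent()))
```

In this toy, a failed move shows up only through the `info` dictionary, and the agent recovers simply by acting again on the next observation.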

The goal is not photorealism. The goal is to understand the perception-action loop.

Every example returns a Trace, so headless runs can be inspected without rendering. See docs/trace.md for the full trace contract.

trace = run(seed=0, render=False)
summary = trace.summary()
print(summary.steps, summary.success, summary.failure_counts, summary.counters)
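To make the idea of a headless, inspectable trace concrete, here is a hypothetical stand-in. It is not the repository's `Trace` class, and docs/trace.md remains the authority on the real contract. `ToyTrace`, `ToySummary`, the `log` method, and the event fields are invented for this sketch; only the four summary field names mirror the snippet above.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToySummary:
    steps: int
    success: bool
    failure_counts: dict
    counters: dict


@dataclass
class ToyTrace:
    """Hypothetical event log that can be summarized without rendering."""

    events: list = field(default_factory=list)

    def log(self, step, kind, failure=None):
        self.events.append({"step": step, "kind": kind, "failure": failure})

    def summary(self):
        failures = Counter(e["failure"] for e in self.events if e["failure"])
        kinds = Counter(e["kind"] for e in self.events)
        success = bool(self.events) and self.events[-1]["kind"] == "success"
        steps = len({e["step"] for e in self.events})  # distinct step indices
        return ToySummary(steps, success, dict(failures), dict(kinds))


if __name__ == "__main__":
    trace = ToyTrace()
    trace.log(0, "grasp", failure="grasp_miss")
    trace.log(1, "grasp", failure="grasp_miss")
    trace.log(2, "success")
    print(trace.summary())
```

Because the summary is plain data, a smoke test can assert on failure counts and success directly instead of inspecting rendered frames.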

Example categories

  • Manipulation
  • Navigation
  • Active perception
  • Failure recovery
  • Belief-based decision making
  • Embodied AI
  • Tiny world models
  • Robot runtime loops

What this is not

This is not a production robotics framework. This is not a replacement for ROS2, Nav2, MoveIt, MuJoCo, Isaac Sim, or Habitat. This is a lightweight educational bridge toward them.

Bridge direction is documented separately:

  • docs/plan.md
  • docs/trace.md
  • docs/ros2_bridge_strategy.md
  • docs/simulator_integration_strategy.md

Philosophy

Toy world, real concept.

A simplified 2D world is enough to teach:

  • partial observability
  • online replanning
  • active perception
  • retry
  • collision
  • uncertainty
  • manipulation failure
  • closed-loop intelligence

Dependency policy

Core dependencies are intentionally small:

  • Python >= 3.10
  • numpy
  • matplotlib

Optional extras are used for everything heavier:

pip install -e ".[dev]"      # pytest and GIF checks
pip install -e ".[viz]"      # GIF export only
pip install -e ".[pygame]"
pip install -e ".[rl]"
pip install -e ".[mujoco]"
pip install -e ".[pybullet]"

ROS2 and simulator integrations are optional bridges, not core dependencies.

GridWorld2D, DynamicObstacleGridWorld, BlockedPathWorld, MovingObstacleWorld, and Tabletop2D also have lightweight Gymnasium-style adapters:

import numpy as np

from pir.adapters import (
    BlockedPathWorldGymnasiumAdapter,
    DynamicObstacleGridWorldGymnasiumAdapter,
    GridWorldGymnasiumAdapter,
    MovingObstacleWorldGymnasiumAdapter,
    Tabletop2DGymnasiumAdapter,
)

env = GridWorldGymnasiumAdapter(seed=0)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(1)  # north

dynamic = DynamicObstacleGridWorldGymnasiumAdapter(seed=0)
obs, info = dynamic.reset(seed=0)
obs, reward, terminated, truncated, info = dynamic.step(2)  # east

blocked = BlockedPathWorldGymnasiumAdapter()
obs, info = blocked.reset(seed=0)
obs, reward, terminated, truncated, info = blocked.step(2)  # east

moving = MovingObstacleWorldGymnasiumAdapter(seed=0)
obs, info = moving.reset(seed=0)
obs, reward, terminated, truncated, info = moving.step(
    np.asarray([0.30, 0.10], dtype=np.float32)
)  # continuous velocity

tabletop = Tabletop2DGymnasiumAdapter(seed=0)
obs, info = tabletop.reset(seed=0)
obs, reward, terminated, truncated, info = tabletop.step(
    {"action_type": 0, "target": obs["camera"], "position": obs["detection_position"]}
)

Run pip install -e ".[rl]" when you want Gymnasium spaces for RL tooling.
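The calls above return `(obs, info)` from `reset` and a five-element tuple from `step`, whereas the core loop earlier uses a four-element `step`. A minimal sketch of how such a mapping could work is shown below. It does not import Gymnasium and is not the implementation in `pir.adapters`; `TinyEnv`, `FiveTupleAdapter`, and the `max_steps` truncation rule are assumptions for illustration.

```python
class TinyEnv:
    """Hypothetical 4-tuple environment: reach position 3 on a line."""

    def reset(self, seed=None):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos += action
        done = self.pos >= 3
        return self.pos, (1.0 if done else 0.0), done, {}


class FiveTupleAdapter:
    """Wrap a 4-tuple env so step() returns (obs, reward, terminated, truncated, info)."""

    def __init__(self, env, max_steps=10):
        self.env = env
        self.max_steps = max_steps
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        obs = self.env.reset(seed=seed)
        return obs, {}

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.t += 1
        terminated = bool(done)                                # task-defined end
        truncated = not terminated and self.t >= self.max_steps  # time limit hit
        return obs, reward, terminated, truncated, info


if __name__ == "__main__":
    env = FiveTupleAdapter(TinyEnv(), max_steps=10)
    obs, info = env.reset(seed=0)
    for _ in range(3):
        print(env.step(1))
```

Separating `terminated` from `truncated` lets downstream tooling tell a finished task apart from an episode that simply ran out of steps.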

Contributing

See CONTRIBUTING.md and docs/example_authoring.md before adding examples. Contributions should keep the loop readable, failure-aware, headless-testable, and fast to run.

License

MIT.
