GreyViperTooth/RLRoboticsFinalProject2026

 
 

Robotic Broom Sweeping with Reinforcement Learning

A robotics and machine learning project that teaches a simulated Franka Panda robot to autonomously detect objects, grasp a broom, and sweep the objects to a target location. It is built on MuJoCo physics, a fine-tuned YOLO vision model, and a custom RL environment.

Demo

image

Overview

The system combines computer vision, inverse kinematics, and reinforcement learning into a single end-to-end pipeline:

  1. Synthetic data generation: MuJoCo renders annotated training frames automatically, so no manual labeling is needed
  2. Vision: YOLOv8-OBB is fine-tuned to detect the broom and target objects with oriented bounding boxes, which preserve rotation angle (critical for grasping)
  3. 3D pose estimation: 2D detections are combined with depth maps to recover 6-DOF object poses using quaternion math
  4. Motion control: Damped Least-Squares inverse kinematics plus PD torque control moves the arm
  5. RL environment: a Gymnasium-compatible environment with a multi-metric reward that balances object displacement, tool alignment, grasp stability, and drop penalties
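
As an illustration of the Damped Least-Squares update named in step 4, here is a minimal sketch on a planar two-link arm. It is not the project's controller: the two-link geometry, the link lengths, the damping value, and all function names are assumptions chosen to keep the example small, whereas the actual Panda arm has seven joints. The update itself is the standard DLS rule, Δq = Jᵀ (J Jᵀ + λ²I)⁻¹ e.

```python
import math

def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar two-link arm (hypothetical stand-in for the Panda)."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y

def jacobian(q, l1=1.0, l2=1.0):
    """2x2 Jacobian of the end-effector position with respect to the joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def dls_step(q, target, damping=0.1):
    """One DLS update: dq = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward_kinematics(q)
    ex, ey = target[0] - x, target[1] - y
    J = jacobian(q)
    # A = J J^T + damping^2 I is symmetric 2x2: [[a, b], [b, d]]
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b  # always positive because damping > 0
    # v = A^-1 e, using the closed-form 2x2 inverse
    vx = (d * ex - b * ey) / det
    vy = (-b * ex + a * ey) / det
    # dq = J^T v
    dq0 = J[0][0] * vx + J[1][0] * vy
    dq1 = J[0][1] * vx + J[1][1] * vy
    return [q[0] + dq0, q[1] + dq1]

def solve_ik(q0, target, iters=100, tol=1e-4):
    """Iterate DLS steps until the position error drops below tol."""
    q = list(q0)
    for _ in range(iters):
        x, y = forward_kinematics(q)
        if math.hypot(target[0] - x, target[1] - y) < tol:
            break
        q = dls_step(q, target)
    return q
```

The damping term keeps the matrix being inverted well-conditioned near singular configurations (for example, a fully stretched arm), at the cost of slightly slower convergence than an undamped pseudoinverse.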

Tech Stack

| Component | Tool |
| --- | --- |
| Physics simulation | MuJoCo 3.4.0 |
| Robot model | Franka Panda (MuJoCo Menagerie) |
| Object detection | YOLOv8-OBB (Ultralytics) |
| Deep learning | PyTorch |
| RL environment | Gymnasium |
| Vision utilities | OpenCV |
| Rendering | MediaPy |

Project Structure

RLRoboticsFinalProject2026/
├── project.ipynb                    # Full end-to-end pipeline (stages 1–7)
├── project_Model.ipynb              # RL policy training and evaluation
├── Dylan_reward_2dsim_pass.ipynb    # Reward function prototyping in 2D
├── yolov8n-obb.pt                   # Pre-trained YOLO base weights
├── mujoco_obb/                      # Synthetic training dataset (objects only)
├── mujoco_obb_with_broom/           # Synthetic training dataset (with broom)
└── *.obj                            # 3D assets: broom, mug, can opener, shoe, action figure

Pipeline Walkthrough

The main notebook (project.ipynb) runs through 7 self-contained stages:

| Stage | Description |
| --- | --- |
| 1 | Environment setup and imports |
| 2 | Synthetic training data generation via MuJoCo rendering |
| 3 | YOLOv8-OBB fine-tuning on generated data |
| 4 | Perception module validation (2D → 3D pose) |
| 5 | Full scene composition: robot, broom, and objects |
| 6 | Motion planning and torque control |
| 7 | Reward function definition and RL environment construction |
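
To make the 2D → 3D step in stage 4 more concrete, the sketch below shows one common way to lift a detection into 3D. It is an illustration under stated assumptions rather than the notebook's code: it assumes a pinhole camera with a vertical field of view given in degrees, metric depth measured along the optical axis, an OpenCV-style camera frame (x right, y down, z forward), and a roughly top-down view so that the box's in-image angle can stand in for the object's yaw. All function names are hypothetical.

```python
import math

def focal_length_px(fovy_deg, image_height):
    """Focal length in pixels from a vertical field of view in degrees."""
    return 0.5 * image_height / math.tan(math.radians(fovy_deg) / 2)

def backproject(u, v, depth, fovy_deg, width, height):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    f = focal_length_px(fovy_deg, height)
    cx, cy = width / 2, height / 2  # assume the principal point is the image center
    return ((u - cx) * depth / f, (v - cy) * depth / f, depth)

def obb_yaw(corners):
    """In-image angle of the box's long axis, wrapped to [-pi/2, pi/2).

    corners: four (x, y) points in order around the box.
    """
    (x0, y0), (x1, y1), (x2, y2) = corners[0], corners[1], corners[2]
    edge_a = (x1 - x0, y1 - y0)
    edge_b = (x2 - x1, y2 - y1)
    # The long axis is whichever of the two adjacent edges is longer
    dx, dy = max(edge_a, edge_b, key=lambda e: math.hypot(e[0], e[1]))
    angle = math.atan2(dy, dx)
    # An axis has no direction, so angles that differ by pi are equivalent
    return (angle + math.pi / 2) % math.pi - math.pi / 2

def yaw_to_quat(theta):
    """Unit quaternion (w, x, y, z) for a rotation of theta about the z axis."""
    return (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))
```

A full 6-DOF pose would pair the back-projected position with an orientation quaternion and then transform both from the camera frame into the world frame; that transform depends on the camera's placement in the scene and is left out here.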

Getting Started

Prerequisites: Python 3.8+, GPU recommended

All dependencies are installed automatically when you run the first cell of project.ipynb.

git clone https://github.com/GreyViperTooth/RLRoboticsFinalProject2026.git
cd RLRoboticsFinalProject2026
jupyter notebook project.ipynb

Run cells top to bottom. The notebook will install MuJoCo, Ultralytics, PyTorch, Gymnasium, OpenCV, and MediaPy as needed.
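
If an install step fails partway, a quick check such as the following can show which packages are still missing. This is a hedged helper sketch, not part of the repository: the function name is hypothetical, and the list holds the usual import names for the packages mentioned above (OpenCV is imported as `cv2`).

```python
import importlib.util

# Usual import names for the packages the notebook is described as installing
REQUIRED_MODULES = ["mujoco", "ultralytics", "torch", "gymnasium", "cv2", "mediapy"]

def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found")
```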

To train or evaluate the RL policy separately:

jupyter notebook project_Model.ipynb

Key Design Decisions

  • Synthetic data over manual annotation: all training images are rendered directly from the MuJoCo scene, so the dataset is free to regenerate and matches the simulation domain by construction
  • Oriented bounding boxes: standard axis-aligned boxes lose the broom's angle; OBB detection preserves it, which is essential for computing a correct grasp pose
  • Custom gripper: enlarged fingertip plates with friction ridges keep the broom handle from slipping during sweeping motions
  • Multi-metric reward: a single-term reward (e.g., distance to goal) is insufficient for a contact-rich task, so the reward combines object displacement, tool-object alignment, grasp stability, and a drop penalty
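
The multi-metric reward can be pictured as a weighted sum of the four terms listed above. The sketch below is one possible shape for such a function, not the project's actual reward: the weights, the argument names, and the choice of "progress toward the target" as the displacement term are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class RewardWeights:
    """Hypothetical weights; real values would need tuning."""
    displacement: float = 1.0
    alignment: float = 0.5
    grasp: float = 0.25
    drop_penalty: float = 5.0

def sweep_reward(prev_dist, curr_dist, alignment, grasp_held, broom_dropped, weights=None):
    """Combine the four reward terms into a single scalar.

    prev_dist, curr_dist: object-to-target distance before and after the step.
    alignment: score in [-1, 1], e.g. cosine between the sweep direction
               and the direction from the object to the target.
    grasp_held: True while the gripper still holds the broom.
    broom_dropped: True if the broom was dropped during this step.
    """
    w = weights or RewardWeights()
    progress = prev_dist - curr_dist  # positive when the object moved closer
    reward = w.displacement * progress + w.alignment * alignment
    if grasp_held:
        reward += w.grasp
    if broom_dropped:
        reward -= w.drop_penalty
    return reward
```

For example, a step that moves the object 0.2 closer with perfect alignment and a stable grasp scores about 0.95 under these weights, while dropping the broom without making progress scores -5.0. Making the drop penalty large relative to the per-step terms discourages policies that trade grasp stability for short-term progress.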

Contributors

  • Jeff Helzner
  • Maanav Anand Kumar
  • Dylan

License

MIT
