Robotic Broom Sweeping with Reinforcement Learning

A robotics and machine learning project that teaches a simulated Franka Panda robot to autonomously detect objects, grasp a broom, and sweep them to a target location — built on MuJoCo physics, a fine-tuned YOLO vision model, and a custom RL environment.

Demo

Overview

The system combines computer vision, inverse kinematics, and reinforcement learning into a single end-to-end pipeline:

Synthetic data generation — MuJoCo renders annotated training frames automatically, no manual labeling
Vision — YOLOv8-OBB fine-tuned to detect the broom and target objects with oriented bounding boxes (preserving rotation angle, critical for grasping)
3D pose estimation — 2D detections + depth maps → 6-DOF object poses via quaternion math
Motion control — Damped Least-Squares inverse kinematics + PD torque control to move the arm
RL environment — Gymnasium-compatible environment with a multi-metric reward that balances object displacement, tool alignment, grasp stability, and drop penalties
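The 3D pose estimation step above can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), metric depth at the box center, and an object lying flat so that the OBB angle maps to a pure yaw about the camera's optical axis. All function names and numbers are hypothetical. The quaternion uses the (w, x, y, z) ordering that MuJoCo uses.

```python
import math


def deproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera-frame XYZ (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def yaw_to_quat(angle_rad):
    """Unit quaternion (w, x, y, z) for a rotation of angle_rad about the z-axis."""
    half = 0.5 * angle_rad
    return (math.cos(half), 0.0, 0.0, math.sin(half))


def obb_to_pose(center_px, angle_rad, depth, intrinsics):
    """Combine an OBB center, its angle, and a depth sample into (position, quaternion).

    Assumption: roll and pitch are zero, so only yaw is recovered from the image.
    """
    fx, fy, cx, cy = intrinsics
    position = deproject(center_px[0], center_px[1], depth, fx, fy, cx, cy)
    return position, yaw_to_quat(angle_rad)


# Hypothetical detection: box centered 100 px right of the principal point, 2 m away.
position, quat = obb_to_pose((420, 240), math.pi / 2, 2.0, (500.0, 500.0, 320.0, 240.0))
print(position, quat)
```

The result is expressed in the camera frame; mapping it into the world frame would additionally require the camera's extrinsic transform.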

Tech Stack

Component	Tool
Physics simulation	MuJoCo 3.4.0
Robot model	Franka Panda (MuJoCo Menagerie)
Object detection	YOLOv8-OBB (Ultralytics)
Deep learning	PyTorch
RL environment	Gymnasium
Vision utilities	OpenCV
Rendering	MediaPy

Project Structure

RLRoboticsFinalProject2026/
├── project.ipynb                    # Full end-to-end pipeline (stages 1–7)
├── project_Model.ipynb              # RL policy training and evaluation
├── Dylan_reward_2dsim_pass.ipynb    # Reward function prototyping in 2D
├── yolov8n-obb.pt                   # Pre-trained YOLO base weights
├── mujoco_obb/                      # Synthetic training dataset (objects only)
├── mujoco_obb_with_broom/           # Synthetic training dataset (with broom)
└── *.obj                            # 3D assets: broom, mug, can opener, shoe, action figure

Pipeline Walkthrough

The main notebook (project.ipynb) runs through 7 self-contained stages:

Stage	Description
1	Environment setup and imports
2	Synthetic training data generation via MuJoCo rendering
3	YOLOv8-OBB fine-tuning on generated data
4	Perception module validation (2D → 3D pose)
5	Full scene composition — robot, broom, and objects
6	Motion planning and torque control
7	Reward function definition and RL environment construction
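As a rough illustration of stage 6, the sketch below shows one damped least-squares (DLS) update and a PD torque law. It runs on a toy two-link planar arm rather than the Panda, so `fk`, `jac`, the link lengths, the gains, and the damping value are all assumptions made for the example. In the MuJoCo setting the Jacobian would typically come from `mujoco.mj_jacSite` or `mujoco.mj_jacBody` instead of a closed-form expression.

```python
import numpy as np


def dls_step(jacobian, error, damping=0.05):
    """Damped least-squares update: dq = J^T (J J^T + lambda^2 I)^-1 e."""
    J = np.asarray(jacobian, dtype=float)
    e = np.asarray(error, dtype=float)
    JJt = J @ J.T
    return J.T @ np.linalg.solve(JJt + damping**2 * np.eye(JJt.shape[0]), e)


def pd_torque(q, qd, q_target, kp=50.0, kd=5.0):
    """PD law: torque = Kp * (q_target - q) - Kd * qd."""
    q, qd, q_target = (np.asarray(a, dtype=float) for a in (q, qd, q_target))
    return kp * (q_target - q) - kd * qd


def fk(q, l1=1.0, l2=1.0):
    """End-effector position of a toy two-link planar arm (illustrative stand-in)."""
    return np.array([
        l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
        l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1]),
    ])


def jac(q, l1=1.0, l2=1.0):
    """Analytic position Jacobian of the toy arm."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ])


def solve_ik(target, q0, iters=100, damping=0.05):
    """Iterate DLS steps from q0 until the iteration budget is spent."""
    target = np.asarray(target, dtype=float)
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        q = q + dls_step(jac(q), target - fk(q), damping)
    return q


q_goal = solve_ik([1.0, 1.0], [0.0, 1.0])
tau = pd_torque(q=[0.0, 1.0], qd=[0.0, 0.0], q_target=q_goal)
print(q_goal, tau)
```

The damping term keeps the update bounded near singular configurations, at the cost of slightly slower convergence than a plain pseudo-inverse.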

Getting Started

Prerequisites: Python 3.8+, GPU recommended

All dependencies are installed automatically when you run the first cell of project.ipynb.

git clone https://github.com/GreyViperTooth/RLRoboticsFinalProject2026.git
cd RLRoboticsFinalProject2026
jupyter notebook project.ipynb

Run the cells from top to bottom. The first cell installs MuJoCo, Ultralytics, PyTorch, Gymnasium, OpenCV, and MediaPy if they are not already present.
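To check an existing environment before opening the notebook, a small helper along these lines can report which packages are missing. It is an optional convenience sketch, not part of the repository; the module list is an assumption based on the usual import names of the tools in the Tech Stack.

```python
import importlib.util

# Import names assumed for the packages listed in the Tech Stack.
REQUIRED_MODULES = ["mujoco", "ultralytics", "torch", "gymnasium", "cv2", "mediapy"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```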

To train or evaluate the RL policy separately:

jupyter notebook project_Model.ipynb

Key Design Decisions

Synthetic data over manual annotation — all training images are rendered directly from the MuJoCo scene, making the dataset free to regenerate and inherently aligned with the simulation domain
Oriented bounding boxes — standard axis-aligned boxes lose the broom's angle; OBB detection preserves it, which is essential for computing a correct grasp pose
Custom gripper — enlarged fingertip plates with friction ridges prevent the broom handle from slipping during sweeping motions
Multi-metric reward — a single reward signal (e.g., distance to goal) is insufficient for a contact-rich task; the reward combines object displacement, tool-object alignment, grasp stability, and a drop penalty

Contributors

Jeff Helzner
Maanav Anand Kumar
Dylan

License

MIT
