sahithchada/innate-os

 
 

Team Rocky

RoboHacks Hackathon Submission

Our project turns the base Innate robot stack into a household assistant that can:

  • understand spoken commands through ElevenLabs Conversational AI
  • orchestrate robot actions through a LangGraph-based execution layer
  • pick up a Red Bull can
  • search for a green dustbin and move toward it
  • remember where objects were seen and navigate back to them
  • take a wrist-camera selfie and email it
  • check whether a refrigerator appears open from the robot's main camera

The agent identity for this submission is Rocky.

What We Built

We added a full application layer on top of Innate OS rather than changing the low-level robot platform itself.

1. Voice-first robot control

We integrated ElevenLabs Conversational AI so the robot can listen, respond, and trigger tools from natural speech.

Core files:

  • elevenlabs_agent.py
  • langgraph_orchestrator.py
  • agents/rocky_agent.py

This gives Rocky a conversational interface instead of a purely app- or script-driven workflow.

2. Multi-step orchestration

We added a LangGraph orchestrator that sits between the voice agent and Innate's ROS2 skill execution server. This layer handles:

  • routing tool calls to the correct robot skill
  • retrying long-running or transiently failing skills
  • running some behaviors in the background so the conversation stays responsive
  • composing multi-step actions like "step back, point wrist camera, then email the photo"

3. Task-specific robot capabilities

We built and wired multiple new agents and skills for common household demo tasks:

  • agents/rocky_agent.py: household assistant persona focused on Red Bull pickup
  • agents/photo_agent.py: photo-taking workflow
  • agents/green_dustbin_agent.py: green dustbin search
  • agents/human_search_agent.py: arm-up human search workflow
  • agents/spatial_memory_agent.py: memory-driven object finding
  • agents/mapping_test_agent.py: mapping plus recall demo flow

We also added custom skills, including:

  • skills/send_arm_picture_via_email.py
  • skills/check_if_refrigerator_open.py
  • skills/scan_for_green_dustbin.py
  • skills/scan_for_green_dustbin_360.py
  • skills/map_and_remember.py
  • skills/recall_location.py
  • skills/go_to_remembered.py
  • skills/forget_location.py
  • skills/spin.py

4. Cross-training with egocentric data

We used egocentric data from Scale and mapped positions into a unified representation for cross-training between the egocentric data and the MARS robot.

Submission Highlights

Red Bull pickup assistant

Rocky can act as a fetch assistant for a Red Bull can. The custom Rocky agent prompt is tuned around:

  • visually locating the can
  • navigating closer if it is too far away
  • estimating grasp coordinates
  • calling the existing pickup skill only when the can is actually visible
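
The decision order above can be pictured as a small gating function. The sketch below is an illustration only: in this submission the behavior lives in the Rocky agent prompt, and `CanObservation`, `next_action`, the action names, and the 0.5 m threshold are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed threshold for "close enough to grasp"; not taken from the project.
MAX_GRASP_DISTANCE_M = 0.5


@dataclass
class CanObservation:
    """Hypothetical summary of what the camera currently shows."""
    visible: bool
    distance_m: Optional[float] = None
    grasp_xyz: Optional[Tuple[float, float, float]] = None


def next_action(obs: CanObservation) -> str:
    """Pick the next step, calling pickup only when the can is visible."""
    if not obs.visible:
        return "search"
    if obs.distance_m is None or obs.distance_m > MAX_GRASP_DISTANCE_M:
        return "navigate_closer"
    if obs.grasp_xyz is None:
        return "estimate_grasp"
    return "pickup"


far_can = next_action(CanObservation(visible=True, distance_m=1.8))
ready_can = next_action(
    CanObservation(visible=True, distance_m=0.3, grasp_xyz=(0.25, 0.0, 0.05))
)
```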

Dustbin search and approach

We added a Gemini-powered dustbin search skill that rotates in increments, analyzes camera frames, and estimates:

  • whether the green dustbin is present
  • whether it is left, center, or right in view
  • how centered it is
  • rough proximity

This is used by the ElevenLabs flow to repeatedly search, orient, move forward, and stop near the dustbin before dropping the can.
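
One way to picture the search, orient, move forward, and stop loop is the sketch below. It is not the project's skill code: `DustbinEstimate`, `search_and_approach`, and the action names are hypothetical, and the Gemini frame analysis is replaced by a scripted sequence of estimates so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DustbinEstimate:
    """Hypothetical per-frame result mirroring the four estimates above."""
    present: bool
    side: str = "none"          # "left", "center", or "right"
    centeredness: float = 0.0   # 0.0 to 1.0; carried along, unused below
    proximity: str = "unknown"  # "far" or "near"


def search_and_approach(analyze_frame: Callable[[], DustbinEstimate],
                        max_steps: int = 12) -> List[str]:
    """Return the actions chosen until the robot stops or steps run out."""
    actions: List[str] = []
    for _ in range(max_steps):
        est = analyze_frame()
        if not est.present:
            actions.append("rotate_increment")  # keep scanning
        elif est.side == "left":
            actions.append("turn_left")         # orient toward the dustbin
        elif est.side == "right":
            actions.append("turn_right")
        elif est.proximity == "near":
            actions.append("stop")              # centered and close enough
            break
        else:
            actions.append("move_forward")      # centered but still far
    return actions


# Scripted frames stand in for the camera and vision model (assumption).
frames = iter([
    DustbinEstimate(present=False),
    DustbinEstimate(True, "left", 0.3, "far"),
    DustbinEstimate(True, "center", 0.9, "far"),
    DustbinEstimate(True, "center", 0.9, "near"),
])
plan = search_and_approach(lambda: next(frames))
# plan == ["rotate_increment", "turn_left", "move_forward", "stop"]
```

The `max_steps` bound keeps the loop from running forever if the dustbin is never seen.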

Spatial memory

One of our larger additions is a lightweight memory layer for demo navigation:

  • while the user drives the robot during mapping, map_and_remember watches the camera
  • when target objects are detected, their positions are stored against odometry
  • later, Rocky can recall what it has seen and drive back to remembered objects

This enables commands like:

  • "What do you remember?"
  • "Where is the redbull?"
  • "Go to the dustbin."
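
A memory layer of this kind can be as small as a dictionary from object label to a pose in the odometry frame. The sketch below is an assumption about shape, not the contents of map_and_remember.py or recall_location.py: `Pose2D`, `SpatialMemory`, and its methods are hypothetical names, and the poses are invented sample values.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pose2D:
    """Planar pose in the odometry frame (meters, radians)."""
    x: float
    y: float
    theta: float = 0.0


class SpatialMemory:
    """Hypothetical label-to-pose store; the latest sighting wins."""

    def __init__(self) -> None:
        self._poses: Dict[str, Pose2D] = {}

    def remember(self, label: str, pose: Pose2D) -> None:
        self._poses[label.lower()] = pose

    def recall(self, label: str) -> Optional[Pose2D]:
        return self._poses.get(label.lower())

    def forget(self, label: str) -> bool:
        return self._poses.pop(label.lower(), None) is not None

    def labels(self) -> List[str]:
        return sorted(self._poses)


def distance_to(current: Pose2D, target: Pose2D) -> float:
    """Straight-line distance between two poses, ignoring obstacles."""
    return math.hypot(target.x - current.x, target.y - current.y)


memory = SpatialMemory()
memory.remember("redbull", Pose2D(3.0, 4.0))
memory.remember("dustbin", Pose2D(-0.5, 3.0, 1.57))

seen = memory.labels()                        # "What do you remember?"
target = memory.recall("Redbull")             # "Where is the redbull?"
meters = distance_to(Pose2D(0.0, 0.0), target)
```

Because the poses come from odometry, they drift as the robot moves, so a store like this suits short demo sessions better than long-term mapping.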

Photo and email workflow

We added a wrist-camera photo flow that:

  • moves the robot back for framing
  • repositions the arm
  • captures the wrist camera image already available in robot state
  • emails the result

This turns a raw robot capability into a demo-friendly end-to-end interaction.
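
The four steps can be composed as an ordered sequence that stops at the first failure, so that no email is sent when the capture did not succeed. The sketch below uses stub steps and hypothetical names (`run_sequence` and the step labels); it does not reproduce skills/send_arm_picture_via_email.py.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_sequence(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; stop at the first one that reports failure."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return False, completed
        completed.append(name)
    return True, completed


# Stub actions (assumption): each returns True on success.
photo_flow: List[Step] = [
    ("step_back", lambda: True),
    ("position_arm", lambda: True),
    ("capture_wrist_image", lambda: True),
    ("send_email", lambda: True),
]
ok, done = run_sequence(photo_flow)

# Same flow with a failing capture: the email step is never reached.
failing_flow: List[Step] = [
    ("step_back", lambda: True),
    ("position_arm", lambda: True),
    ("capture_wrist_image", lambda: False),
    ("send_email", lambda: True),
]
failed_ok, failed_done = run_sequence(failing_flow)
```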

Running Rocky

python3 elevenlabs_agent.py

Mock mode for testing the conversational path without ROS2:

python3 elevenlabs_agent.py --mock

Deploy updated skills and agents to the robot:

./deploy.sh rocky_agent

The deploy script syncs local agents/ and skills/ to the robot and can optionally restart the selected agent.

Demo Commands

Examples of the interactions this submission is designed around:

  • "Pick up the Red Bull."
  • "Come here."
  • "Find the green dustbin."
  • "Take my photo."
  • "What do you remember?"
  • "Where is the dustbin?"
  • "Go to the redbull."
  • "Is the refrigerator open?"

Acknowledgements

This project is built on top of the upstream Innate OS repository and runtime provided by Innate. Our hackathon work extends that base with Team Rocky's custom agents, skills, orchestration, and submission-specific workflows.

