Our project turns the base Innate robot stack into a household assistant that can:
- understand spoken commands through ElevenLabs Conversational AI
- orchestrate robot actions through a LangGraph-based execution layer
- pick up a Red Bull can
- search for a green dustbin and move toward it
- remember where objects were seen and navigate back to them
- take a wrist-camera selfie and email it
- check whether a refrigerator appears open from the robot's main camera
The agent identity for this submission is Rocky.
We added a full application layer on top of Innate OS rather than changing the low-level robot platform itself.
We integrated ElevenLabs Conversational AI so the robot can listen, respond, and trigger tools from natural speech.
Core files:
- elevenlabs_agent.py
- langgraph_orchestrator.py
- agents/rocky_agent.py
This gives Rocky a conversational interface instead of a purely app- or script-driven workflow.
We added a LangGraph orchestrator that sits between the voice agent and Innate's ROS2 skill execution server. This layer handles:
- routing tool calls to the correct robot skill
- retrying long-running or transiently failing skills
- running some behaviors in the background so the conversation stays responsive
- composing multi-step actions like "step back, point wrist camera, then email the photo"
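The responsibilities listed above can be illustrated with a minimal sketch in plain Python. It does not use LangGraph or ROS2, and every name in it (`SkillRouter`, `run_in_background`, `run_sequence`, the retry defaults) is hypothetical rather than taken from langgraph_orchestrator.py. It assumes each skill is a callable that returns True on success.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, Tuple

# Assumption: a skill is any callable that returns True on success.
SkillFn = Callable[..., bool]


class SkillRouter:
    """Hypothetical router: name lookup, retries, background runs, sequences."""

    def __init__(self, max_attempts: int = 3, delay_s: float = 0.0) -> None:
        self._skills: Dict[str, SkillFn] = {}
        self._max_attempts = max_attempts
        self._delay_s = delay_s
        self._pool = ThreadPoolExecutor(max_workers=2)

    def register(self, name: str, fn: SkillFn) -> None:
        self._skills[name] = fn

    def run(self, name: str, **kwargs: Any) -> bool:
        """Route a tool call to its skill, retrying failures and exceptions."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        for attempt in range(1, self._max_attempts + 1):
            try:
                if self._skills[name](**kwargs):
                    return True
            except Exception:
                pass  # this sketch treats every exception as transient
            if attempt < self._max_attempts:
                time.sleep(self._delay_s)
        return False

    def run_in_background(self, name: str, **kwargs: Any) -> Future:
        """Start a skill on a worker thread so the caller is not blocked."""
        return self._pool.submit(self.run, name, **kwargs)

    def run_sequence(self, steps: Iterable[Tuple[str, Dict[str, Any]]]) -> bool:
        """Run (name, kwargs) steps in order and stop at the first failure."""
        for name, kwargs in steps:
            if not self.run(name, **kwargs):
                return False
        return True
```

In this sketch a composed action such as "step back, point wrist camera, then email the photo" would be a three-element list passed to `run_sequence`, and a long search could be started with `run_in_background` while the conversation continues.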
We built and wired multiple new agents and skills for common household demo tasks:
- agents/rocky_agent.py: household assistant persona focused on Red Bull pickup
- agents/photo_agent.py: photo-taking workflow
- agents/green_dustbin_agent.py: green dustbin search
- agents/human_search_agent.py: arm-up human search workflow
- agents/spatial_memory_agent.py: memory-driven object finding
- agents/mapping_test_agent.py: mapping plus recall demo flow
And custom skills including:
- skills/send_arm_picture_via_email.py
- skills/check_if_refrigerator_open.py
- skills/scan_for_green_dustbin.py
- skills/scan_for_green_dustbin_360.py
- skills/map_and_remember.py
- skills/recall_location.py
- skills/go_to_remembered.py
- skills/forget_location.py
- skills/spin.py
We used egocentric data from Scale and mapped positions into a unified representation for cross-training between the egocentric data and the MARS robot.
Rocky can act as a fetch assistant for a Red Bull can. The custom Rocky agent prompt is tuned around:
- visually locating the can
- navigating closer if it is too far away
- estimating grasp coordinates
- calling the existing pickup skill only when the can is actually visible
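The prompt's priorities can be read as a small decision rule. The sketch below is only an illustration of that ordering: the real behavior lives in the agent prompt, and the `CanObservation` fields, the action names, and the 0.5 m threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed threshold; the actual reachable distance is not stated in this README.
MAX_GRASP_DISTANCE_M = 0.5


@dataclass
class CanObservation:
    """Hypothetical summary of what the vision step reports about the can."""
    visible: bool
    distance_m: Optional[float] = None
    grasp_xyz: Optional[Tuple[float, float, float]] = None


def next_pickup_action(obs: CanObservation) -> str:
    """Return the next step, calling pickup only when the can is visible."""
    if not obs.visible:
        return "search"
    if obs.distance_m is None or obs.distance_m > MAX_GRASP_DISTANCE_M:
        return "navigate_closer"
    if obs.grasp_xyz is None:
        return "estimate_grasp"
    return "pickup"
```

The ordering matters: visibility is checked first, so the pickup skill is never reached from an observation in which the can is not seen.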
We added a Gemini-powered dustbin search skill that rotates in increments, analyzes camera frames, and estimates:
- whether the green dustbin is present
- whether it is left, center, or right in view
- how centered it is
- rough proximity
This is used by the ElevenLabs flow to repeatedly search, orient, move forward, and stop near the dustbin before dropping the can.
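A rough sketch of how the four estimates might drive the search, orient, move, and stop loop is shown below. It is not the code in scan_for_green_dustbin.py: the `DustbinEstimate` fields, the proximity labels, the step sizes, and the sign convention (positive rotation turns left) are assumptions chosen for illustration, and the Gemini call itself is left out.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed step sizes for illustration only.
SEARCH_STEP_DEG = 30.0
ALIGN_STEP_DEG = 10.0
FORWARD_STEP_M = 0.4


@dataclass
class DustbinEstimate:
    """Hypothetical parsed result of one camera-frame analysis."""
    present: bool
    position: str = "center"   # "left", "center", or "right"
    centered: float = 0.0      # 0.0 (at the edge) to 1.0 (fully centered)
    proximity: str = "far"     # assumed labels: "far", "medium", "near"


def next_search_action(est: DustbinEstimate) -> Tuple[str, float]:
    """Map one estimate to (action, amount) for the next step of the loop."""
    if not est.present:
        return ("rotate", SEARCH_STEP_DEG)    # keep scanning in increments
    if est.position == "left":
        return ("rotate", ALIGN_STEP_DEG)     # positive = turn left (assumed)
    if est.position == "right":
        return ("rotate", -ALIGN_STEP_DEG)
    if est.proximity == "near":
        return ("stop", 0.0)                  # close enough to drop the can
    # Take shorter forward steps when the bin is poorly centered.
    return ("forward", FORWARD_STEP_M * max(est.centered, 0.25))
```

Calling this after every new frame gives the repeated search, orient, move forward, and stop behavior described above, with each step small enough that a fresh estimate can correct the previous one.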
One of our larger additions is a lightweight memory layer for demo navigation:
- while the user drives the robot during mapping, map_and_remember watches the camera
- when target objects are detected, their positions are stored against odometry
- later, Rocky can recall what it has seen and drive back to remembered objects
This enables commands like:
- "What do you remember?"
- "Where is the redbull?"
- "Go to the dustbin."
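The memory layer can be pictured as a label-to-pose table plus a little geometry for driving back. The sketch below is an assumption-laden illustration, not the contents of map_and_remember.py or go_to_remembered.py: `Pose2D`, `SpatialMemory`, and `turn_and_distance` are hypothetical names, poses are taken to be planar odometry poses in meters and radians, and the latest sighting of a label simply overwrites the previous one.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Pose2D:
    """Planar odometry pose: meters for x and y, radians for theta."""
    x: float
    y: float
    theta: float = 0.0


class SpatialMemory:
    """Hypothetical store mapping object labels to where they were seen."""

    def __init__(self) -> None:
        self._places: Dict[str, Pose2D] = {}

    def remember(self, label: str, pose: Pose2D) -> None:
        self._places[label.lower()] = pose   # latest sighting wins

    def recall(self, label: str) -> Optional[Pose2D]:
        return self._places.get(label.lower())

    def forget(self, label: str) -> bool:
        return self._places.pop(label.lower(), None) is not None

    def known_labels(self) -> List[str]:
        return sorted(self._places)


def turn_and_distance(current: Pose2D, target: Pose2D) -> Tuple[float, float]:
    """Return (turn in radians within [-pi, pi], straight-line distance in meters)."""
    dx, dy = target.x - current.x, target.y - current.y
    bearing = math.atan2(dy, dx)
    delta = bearing - current.theta
    turn = math.atan2(math.sin(delta), math.cos(delta))  # wrap the angle
    return turn, math.hypot(dx, dy)
```

Under these assumptions, "What do you remember?" corresponds to `known_labels()`, "Where is the redbull?" to `recall("redbull")`, and "Go to the dustbin." to recalling the pose and then turning and driving by the values from `turn_and_distance`. Because positions come from odometry, accumulated drift limits how precisely the robot can return.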
We added a wrist-camera photo flow that:
- moves the robot back for framing
- repositions the arm
- captures the wrist camera image already available in robot state
- emails the result
This turns a raw robot capability into a demo-friendly end-to-end interaction.
    python3 elevenlabs_agent.py

Mock mode for testing the conversational path without ROS2:

    python3 elevenlabs_agent.py --mock

    ./deploy.sh rocky_agent

The deploy script syncs local agents/ and skills/ to the robot and can optionally restart the selected agent.
Examples of the interactions this submission is designed around:
- "Pick up the Red Bull."
- "Come here."
- "Find the green dustbin."
- "Take my photo."
- "What do you remember?"
- "Where is the dustbin?"
- "Go to the redbull."
- "Is the refrigerator open?"
This project is built on top of the upstream Innate OS repository and runtime provided by Innate. Our hackathon work extends that base with Team Rocky's custom agents, skills, orchestration, and submission-specific workflows.