👁️ Vision AI Guide

Empowering the Visually Impaired with Hybrid AI.

An accessibility tool powered by a dual-architecture system: Gemini Flash for speed and Gemini Pro for reasoning.

💡 What is this?

This prototype helps visually impaired users understand their environment in real time. It uses the device's camera to narrate the surroundings, acting as a digital guide.

🚀 The Innovation: Hybrid Architecture

The project features a multi-faceted approach to accessibility:

Gemini 2.5 Flash (Speed): Handles rapid object detection and immediate voice feedback for fluid navigation.
Gemini 3 Pro (Reasoning): Used for complex tasks, decision-making, and contextual analysis.
🛒 Voice Commerce & Memory Module: The system supports voice-activated e-commerce (buying recognized items) and uses a persistent memory module to securely store personal data (medications, clothing sizes) and set voice reminders.
🚨 Autonomous Safety Protocol: The AI constantly monitors the user for accidents (falls, impacts) and autonomously initiates an emergency call (911/contacts) if the user is incapacitated.
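The split between the fast model and the reasoning model can be pictured as a small routing table. The following is only a sketch under stated assumptions: the task categories, the `chooseModel` helper, and the exact model identifier strings are hypothetical and are not taken from this repository.

```typescript
// Hypothetical task categories. The README only distinguishes
// "speed" work from "reasoning" work.
type TaskKind =
  | "object_detection"
  | "voice_feedback"
  | "contextual_analysis"
  | "decision_making";

// Assumed model identifiers; confirm the names available to your API key.
const FAST_MODEL = "gemini-2.5-flash";
const REASONING_MODEL = "gemini-3-pro-preview";

// Each task kind maps to exactly one model, so adding a new kind
// without choosing a model for it is a compile-time error.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  object_detection: FAST_MODEL,
  voice_feedback: FAST_MODEL,
  contextual_analysis: REASONING_MODEL,
  decision_making: REASONING_MODEL,
};

function chooseModel(task: TaskKind): string {
  return MODEL_FOR_TASK[task];
}
```

Keeping the mapping in one table means latency-sensitive paths never reach the slower model by accident.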

🕹️ How to Interact (Usage Guide)

The Vision AI Guide is designed for a seamless, hands-free experience:

Launch: After completing the setup steps below, open your browser and grant permissions for the Camera, Microphone, and Location.
Core Feature: Precision Context. Geolocation data (latitude/longitude) is used to tailor guidance to the user's current location.
Activation: The application stays on standby and is ready to be activated at any time.
Core Command: Press the central activation button on the screen and clearly ask your question (e.g., "What is in front of me?" or "Read the sign").
Response: Gemini 2.5 Flash provides immediate voice feedback for quick guidance.
Safety Override: The AI maintains continuous monitoring; the emergency protocol is always active, even if the user is unable to speak.
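This README does not describe how accidents are detected. One common heuristic, shown here purely as an illustration, looks for a short free-fall phase followed by a hard impact in accelerometer readings. The `AccelSample` shape, the thresholds, and the `looksLikeFall` helper are all assumptions, not the project's actual safety logic.

```typescript
// One accelerometer reading: t in milliseconds, axes in m/s^2 (assumed units).
interface AccelSample {
  t: number;
  x: number;
  y: number;
  z: number;
}

// Illustrative thresholds only; real values need tuning on real devices.
const FREE_FALL_MAX = 3; // magnitude below this suggests free fall
const IMPACT_MIN = 25; // magnitude above this suggests an impact
const WINDOW_MS = 1000; // impact must follow free fall within this window

function magnitude(s: AccelSample): number {
  return Math.hypot(s.x, s.y, s.z);
}

// Returns true when a free-fall reading is followed by an impact
// reading no more than WINDOW_MS later.
function looksLikeFall(samples: AccelSample[]): boolean {
  let freeFallAt: number | null = null;
  for (const s of samples) {
    const m = magnitude(s);
    if (m < FREE_FALL_MAX) {
      freeFallAt = s.t; // remember the most recent free-fall reading
    } else if (
      freeFallAt !== null &&
      m > IMPACT_MIN &&
      s.t - freeFallAt <= WINDOW_MS
    ) {
      return true;
    }
  }
  return false;
}
```

A detector like this would normally ask the user for confirmation before placing any call, since a dropped phone produces a similar signature.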

🛠️ Tech Stack

Frontend: React + Vite (Smooth UX)
Backend: Node.js (Secure API handling)
AI: Google Gemini 2.5 Flash and Gemini 3 Pro models
Audio: Web Speech API / TTS
Live: WebSocket integration for real-time interaction.
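To make the real-time channel concrete, here is a minimal sketch of a JSON message a client might send over the WebSocket, carrying the spoken question and optional coordinates. The `QueryMessage` shape and the `encodeQuery` and `parseQuery` helpers are hypothetical; the repository's actual wire format is not documented here.

```typescript
// Hypothetical message shapes; not the project's actual protocol.
interface GeoContext {
  latitude: number;
  longitude: number;
}

interface QueryMessage {
  type: "query";
  text: string;
  location?: GeoContext;
}

// Serialize a question, attaching coordinates only when available.
function encodeQuery(text: string, location?: GeoContext): string {
  const msg: QueryMessage = { type: "query", text: text.trim() };
  if (location) msg.location = location;
  return JSON.stringify(msg);
}

// Parse and validate an incoming message; returns null for anything malformed.
function parseQuery(raw: string): QueryMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const text = d.text;
  if (d.type !== "query" || typeof text !== "string") return null;

  const rawLoc = d.location;
  if (rawLoc === undefined) return { type: "query", text };
  if (typeof rawLoc !== "object" || rawLoc === null) return null;

  const loc = rawLoc as Record<string, unknown>;
  const lat = loc.latitude;
  const lon = loc.longitude;
  if (typeof lat !== "number" || typeof lon !== "number") return null;
  // Reject coordinates outside the valid geographic range.
  if (Math.abs(lat) > 90 || Math.abs(lon) > 180) return null;
  return { type: "query", text, location: { latitude: lat, longitude: lon } };
}
```

Validating on the server side keeps malformed or out-of-range coordinates from ever reaching the model prompt.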

Run and deploy your AI Studio app

This repository contains everything you need to run the app locally.

View your app in AI Studio: https://ai.studio/apps/drive/1BOnfawH6K2yFgRE7MMjN6vHvHIJHIWU_

Run Locally

Prerequisites: Node.js

Setup Backend:

cd server
npm install
# Ensure .env file exists in server/ with GEMINI_API_KEY
npm run dev
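Since the backend depends on GEMINI_API_KEY being present, it can help to fail fast at startup with a clear message. The `requireEnv` helper below is a hypothetical sketch and is not part of this repository; it takes the environment as a parameter so it can be checked without touching the real process environment.

```typescript
// Hypothetical helper: returns the named variable or throws a readable error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}: add it to server/.env before starting the server`);
  }
  return value;
}

// Assumed usage in the server entry point, after the .env file is loaded:
//   const apiKey = requireEnv("GEMINI_API_KEY", process.env);
```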

Setup Frontend (New Terminal):

# In the root directory
npm install
npm run dev

Repository contents:

.github/workflows
components
hooks
server
services
.gitignore
App.tsx
LICENSE (MIT License)
README.md
index.html
index.tsx
metadata.json
package-lock.json
package.json
tsconfig.json
types.ts
vite.config.ts
