This Python project is a fully voice-controlled AI assistant, powered by:
- Google Gemini 1.5 Flash for natural language generation
- ElevenLabs for high-quality, customizable text-to-speech
- SpeechRecognition for capturing your voice input
- Pygame for audio playback

Features:

- 🎙️ Voice-based input using microphone
- 🧠 AI-generated responses with customizable tone/personality
- 🔊 Realistic speech synthesis with ElevenLabs
- 🔁 Continuous conversation loop
- 🔧 Highly customizable via a `.env` file

Install the dependencies with `pip install -r requirements.txt`.

Required packages:
SpeechRecognition
pygame
python-dotenv
google-generativeai
elevenlabs

Create a `.env` file in your project directory with the following:
GEMINI_API_KEY=your_google_gemini_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
AI_NAME=Misa
AI_PERSONALITY=You are a helpful AI assistant.
VOICE_ID=21m00Tcm4TlvDq8ikWAM
MODEL_ID=eleven_multilingual_v2
EXIT_COMMANDS=exit,quit,stop,goodbye,bye,exit misa
MAX_RETRIES=5

You can find available voice IDs by logging into ElevenLabs and checking your voice dashboard.
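
As a rough illustration of how these settings might be read at startup, here is a minimal sketch. The `AssistantConfig` class and `load_config` function are hypothetical names, not taken from `main.py`. Treating the two API keys as required and falling back to the sample values shown above for everything else is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class AssistantConfig:
    """Hypothetical container for the settings listed in the .env example."""
    gemini_api_key: str
    elevenlabs_api_key: str
    ai_name: str
    ai_personality: str
    voice_id: str
    model_id: str
    exit_commands: Tuple[str, ...]
    max_retries: int


def load_config(env: Optional[Mapping[str, str]] = None) -> AssistantConfig:
    """Build the config from a mapping of environment variables."""
    if env is None:
        try:
            # python-dotenv copies the values from .env into os.environ.
            from dotenv import load_dotenv
            load_dotenv()
        except ImportError:
            pass  # fall back to whatever is already in the environment
        env = os.environ

    # Split "exit,quit,stop" into lowercase phrases, dropping empty entries.
    raw_exit = env.get("EXIT_COMMANDS", "exit,quit,stop")
    exit_commands = tuple(
        phrase.strip().lower() for phrase in raw_exit.split(",") if phrase.strip()
    )

    return AssistantConfig(
        # Assumption: both API keys are required, so a missing key raises KeyError.
        gemini_api_key=env["GEMINI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        # Assumption: the remaining defaults mirror the sample .env values.
        ai_name=env.get("AI_NAME", "Misa"),
        ai_personality=env.get("AI_PERSONALITY", "You are a helpful AI assistant."),
        voice_id=env.get("VOICE_ID", "21m00Tcm4TlvDq8ikWAM"),
        model_id=env.get("MODEL_ID", "eleven_multilingual_v2"),
        exit_commands=exit_commands,
        max_retries=int(env.get("MAX_RETRIES", "5")),
    )
```

Passing an explicit mapping instead of relying on `os.environ` keeps the parsing easy to check in isolation.
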
Run the assistant with `python main.py`.

- The assistant will listen to your voice input via microphone.
- The input is converted to text using Google Speech Recognition.
- Gemini generates a custom AI response based on your conversation history and personality settings.
- ElevenLabs converts the AI text into spoken audio and plays it.
- The loop continues until you say any of the exit commands (e.g., "exit", "stop").
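
The steps above can be sketched as a loop with the three stages passed in as plain callables. This is an illustrative outline, not the code in `main.py`: `run_conversation`, `is_exit_command`, and the callable signatures are hypothetical, and ending the loop when `listen` returns `None` is an assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Turn = Tuple[str, str]  # (speaker, text), where speaker is "user" or "assistant"


def is_exit_command(text: str, exit_commands: Sequence[str]) -> bool:
    """Return True when the whole utterance matches one of the exit phrases."""
    return text.strip().lower() in exit_commands


def run_conversation(
    listen: Callable[[], Optional[str]],
    generate: Callable[[List[Turn]], str],
    speak: Callable[[str], None],
    exit_commands: Sequence[str],
) -> List[Turn]:
    """Listen, generate a reply, speak it, and repeat until an exit phrase."""
    history: List[Turn] = []
    while True:
        heard = listen()
        # Assumption: None means recognition gave up, so the loop ends.
        if heard is None or is_exit_command(heard, exit_commands):
            break
        history.append(("user", heard))
        reply = generate(history)  # the model sees the full history so far
        history.append(("assistant", reply))
        speak(reply)
    return history
```

In the real assistant, `listen` would wrap the microphone and speech recognition, `generate` would call Gemini, and `speak` would call ElevenLabs and play the audio with Pygame. Injecting them as callables lets the loop run without any hardware or API keys.
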
All customization is controlled via environment variables:
| Variable | Description |
|---|---|
| `AI_NAME` | Name of your assistant (used in printed and spoken text) |
| `AI_PERSONALITY` | Prompt that defines your assistant's personality and tone |
| `VOICE_ID` | ElevenLabs voice ID (the ID of a voice such as Rachel or Bella) |
| `MODEL_ID` | ElevenLabs voice model ID (default: `eleven_multilingual_v2`) |
| `EXIT_COMMANDS` | Comma-separated phrases to stop the assistant |
| `MAX_RETRIES` | Number of retries for failed voice recognition attempts |
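
To illustrate how `MAX_RETRIES` could bound failed recognition attempts, here is a minimal sketch. The helper `listen_with_retries` and its parameters are hypothetical. It treats `max_retries` as the total number of attempts, which is an assumption about how the project counts them.

```python
from typing import Callable, Optional, Tuple, Type


def listen_with_retries(
    recognize_once: Callable[[], str],
    max_retries: int,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> Optional[str]:
    """Call recognize_once until it returns text or the attempts run out."""
    for attempt in range(1, max_retries + 1):
        try:
            text = recognize_once()
        except retry_on as error:
            # A failed attempt is reported and then retried.
            print(f"Attempt {attempt}/{max_retries} failed: {error}")
            continue
        if text.strip():
            return text
        print(f"Attempt {attempt}/{max_retries} returned no text")
    return None  # every attempt failed
```

With the SpeechRecognition package, `retry_on` would likely be `(sr.UnknownValueError, sr.RequestError)`, so that unrelated errors still surface instead of being retried.
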
Your voice → [SpeechRecognition] → Text
→ [Google Gemini] → AI response (text)
→ [ElevenLabs] → Audio (MP3)
→ [Pygame] → Plays the voice
Conversation context is preserved to maintain continuity and sarcasm 😎.
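
One way context could be carried between turns is to fold the personality prompt and recent history into a single text prompt. The sketch below assumes that approach. `build_prompt`, the `User:` label, and the `max_turns` cap are hypothetical, and the project may instead rely on the Gemini SDK's own chat history.

```python
from typing import Iterable, Tuple


def build_prompt(
    personality: str,
    ai_name: str,
    history: Iterable[Tuple[str, str]],
    max_turns: int = 20,
) -> str:
    """Combine the personality prompt and recent turns into one prompt string."""
    turns = list(history)
    # Keep only the most recent turns so the prompt does not grow without bound.
    recent = turns[-max_turns:] if max_turns > 0 else []
    lines = [personality.strip(), ""]
    for speaker, text in recent:
        label = ai_name if speaker == "assistant" else "User"
        lines.append(f"{label}: {text}")
    # End with the assistant's label so the model continues in that voice.
    lines.append(f"{ai_name}:")
    return "\n".join(lines)
```
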
- 🔇 Microphone not working? Make sure your default input device is enabled and accessible.
- 🔑 API key errors? Double-check the keys in your `.env` file and ensure the services are active.
- 🧠 Want a polite assistant? Just replace `AI_PERSONALITY` with a friendly or formal description.

├── main.py
├── .env
└── requirements.txt

MIT License — feel free to use, fork, and personalize.