A modern web application that generates creative and diverse captions for your images using Google's Gemini AI. Upload any image and get captions tailored for different vibes, from accessibility descriptions to Gen Z slang, hashtags, and more.
- Image Upload: Drag & drop or click to upload images
- AI-Powered Captions: Leverages Google's Gemini 2.5 Flash model for high-quality caption generation
- Multiple Caption Styles:
  - 🔍 Accessibility Description (for visually impaired users)
  - 📝 Base Caption (natural description)
  - 🌸 Aesthetic (poetic vibe)
  - 😂 Funny (humorous take)
  - 💙 Emotional (heartfelt)
  - ⚡ Gen Z (trendy slang)
  - ◾ Minimal (2-5 words only)
  - #️⃣ Hashtags (10 relevant tags)
  - 🕐 Posting Vibe (best time/mood to post)
  - 😊 Emoji Combo (matching emojis)
- Responsive Design: Works on desktop and mobile devices
- Dark Theme: Modern UI with gradient accents
- Error Handling: User-friendly error messages and retry options
- Frontend: HTML5, CSS3, JavaScript (ES6+)
- AI API: Google Generative AI (Gemini 2.5 Flash)
- Fonts: Syne (headings), DM Sans (body)
- Styling: Custom CSS with gradients and animations
- Clone the repository:

      git clone https://github.com/yourusername/captionai.git
      cd captionai

- Get a Google AI API Key:
  - Visit Google AI Studio
  - Create a new API key
  - Copy the API key
- Configure the API Key:
  - Open script.js
  - Replace the empty string in const API_KEY = ""; with your actual API key: const API_KEY = "your-api-key-here";
- Open the application:
  - Open index.html in your web browser
  - Or serve it using a local server (recommended for better functionality):

        # Using Python (serves on port 8000)
        python -m http.server 8000
        # Using Node.js (serve prints its own URL, port 3000 by default)
        npx serve .
        # Then open http://localhost:8000, or the URL printed by serve
- Upload an Image:
  - Click the upload zone or drag & drop an image file
  - Supported formats: JPEG, PNG, GIF, WebP, etc.
- Preview & Generate:
  - Review your uploaded image
  - Click "✦ Generate Captions" to process
- View Results:
  - Browse through different caption styles
  - Copy any caption you like
- Change Image:
  - Use the "↩ Change Image" button to upload a different image
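For readers curious about what happens between upload and generation, here is a minimal sketch of how an uploaded file might be checked and prepared before it is sent to the API. The helper names `isImageFile` and `splitDataUrl` are hypothetical and are not taken from script.js; the actual implementation may differ.

```javascript
// Hypothetical helpers; the names are illustrative, not the ones used in script.js.

// Accept any file whose MIME type marks it as an image (JPEG, PNG, GIF, WebP, ...).
function isImageFile(file) {
  return Boolean(file) && typeof file.type === "string" && file.type.startsWith("image/");
}

// Split a base64 data URL such as "data:image/png;base64,iVBOR..." into its parts.
// FileReader.readAsDataURL() produces strings in this form.
function splitDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  return { mimeType: match[1], base64: match[2] };
}
```

In a browser, one would typically call `reader.readAsDataURL(file)` on a `FileReader` and pass `reader.result` to `splitDataUrl` once the `load` event fires.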
The app uses Google's Gemini 2.5 Flash model with the following settings:
- Temperature: 0.8 (for creative variety)
- Max Output Tokens: 3048
- Response Format: Structured JSON with predefined caption categories
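As a rough illustration of how these settings could map onto a request, the sketch below builds a request body and parses a JSON reply. The field names follow the public Gemini REST API (`generateContent`); the function names and the `minimal` key in the usage example are assumptions, and script.js may structure this differently.

```javascript
// Sketch only: function names are hypothetical, not taken from script.js.

// Build a generateContent request body with the settings listed above.
function buildRequestBody(promptText, mimeType, base64Image) {
  return {
    contents: [
      {
        parts: [
          { text: promptText },
          { inline_data: { mime_type: mimeType, data: base64Image } },
        ],
      },
    ],
    generationConfig: {
      temperature: 0.8,      // creative variety
      maxOutputTokens: 3048, // upper bound on reply length
    },
  };
}

// Extract the JSON object from the model's reply text.
// Models sometimes surround the object with extra text, so take the span
// from the first "{" to the last "}" before parsing.
function parseCaptionJson(replyText) {
  const start = replyText.indexOf("{");
  const end = replyText.lastIndexOf("}");
  if (start === -1 || end < start) {
    throw new Error("No JSON object found in reply");
  }
  return JSON.parse(replyText.slice(start, end + 1));
}
```

A body like this would be sent as a JSON `POST` to `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent` with the API key attached. For example, `parseCaptionJson('Sure:\n{"minimal": "Golden hour"}')` yields an object whose `minimal` property is `"Golden hour"`.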
- Colors: Dark theme with purple/blue gradients
- Fonts: Syne for headings, DM Sans for body text
- Animations: Smooth transitions and hover effects
Modify the prompt in script.js to customize caption generation:

    text: `Analyze this image and respond ONLY with a JSON object...`

- Fork the repository
- Create a feature branch:

      git checkout -b feature-name

- Make your changes
- Test thoroughly
- Submit a pull request
This project is open source and available under the MIT License.
- Google AI Studio for the powerful Gemini API
- Google Fonts for the beautiful typography
- Inspired by the need for diverse, inclusive caption generation
If you encounter any issues:
- Check that your API key is correctly set in script.js
- Ensure you're using a modern web browser
- Verify your internet connection for API calls
- Check the browser console for error messages
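When reading errors in the console, it can help to know what the HTTP status of a failed API call usually means. The sketch below is a hypothetical helper (`describeApiError` is not part of script.js) that maps common Gemini API status codes to hints; the wording of the messages is illustrative.

```javascript
// Hypothetical helper: turns an HTTP status from a failed API call into a hint.
function describeApiError(status) {
  switch (status) {
    case 400:
      // An invalid or missing API key is often reported as a 400.
      return "The request was rejected. Check that API_KEY in script.js is set and valid.";
    case 403:
      return "The API key does not have permission for this request.";
    case 429:
      return "Rate limit or quota reached. Wait a moment and try again.";
    case 500:
    case 503:
      return "The service is temporarily unavailable. Try again shortly.";
    default:
      return `Unexpected error (HTTP ${status}). See the browser console for details.`;
  }
}
```

For example, `describeApiError(response.status)` could be shown to the user whenever `response.ok` is false after a `fetch` call.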
Made with ❤️ using Google's Gemini AI