shrey363/Captionai

CaptionAI

A modern web application that generates creative and diverse captions for your images using Google's Gemini AI. Upload any image and get captions tailored for different vibes, from accessibility descriptions to Gen Z slang, hashtags, and more.

✨ Features

  • Image Upload: Drag & drop or click to upload images
  • AI-Powered Captions: Leverages Google's Gemini 2.5 Flash model for high-quality caption generation
  • Multiple Caption Styles:
    • 🔍 Accessibility Description (for visually impaired users)
    • 📝 Base Caption (natural description)
    • 🌸 Aesthetic (poetic vibe)
    • 😂 Funny (humorous take)
    • 💙 Emotional (heartfelt)
    • ⚡ Gen Z (trendy slang)
    • ◾ Minimal (2-5 words only)
    • #️⃣ Hashtags (10 relevant tags)
    • 🕐 Posting Vibe (best time/mood to post)
    • 😊 Emoji Combo (matching emojis)
  • Responsive Design: Works on desktop and mobile devices
  • Dark Theme: Modern UI with gradient accents
  • Error Handling: User-friendly error messages and retry options

🚀 Tech Stack

  • Frontend: HTML5, CSS3, JavaScript (ES6+)
  • AI API: Google Generative AI (Gemini 2.5 Flash)
  • Fonts: Syne (headings), DM Sans (body)
  • Styling: Custom CSS with gradients and animations

🛠️ Setup & Installation

  1. Clone the repository:

    git clone https://github.com/shrey363/Captionai.git
    cd Captionai
  2. Get a Google AI API Key:

    • Sign in to Google AI Studio and create an API key
  3. Configure the API Key:

    • Open script.js
    • Replace the empty string in const API_KEY = ""; with your actual API key:
      const API_KEY = "your-api-key-here";
  4. Open the application:

    • Open index.html in your web browser
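    • Or serve it using a local server (recommended, since some browsers restrict features on pages opened from file://):
      # Using Python (serves on port 8000)
      python -m http.server 8000
      
      # Using Node.js (serve prints its URL; port 3000 by default)
      npx serve .
      
      # Then open http://localhost:8000 (Python) or the URL printed by serve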

📖 Usage

  1. Upload an Image:

    • Click the upload zone or drag & drop an image file
    • Supported formats: JPEG, PNG, GIF, WebP, etc.
  2. Preview & Generate:

    • Review your uploaded image
    • Click "✦ Generate Captions" to process
  3. View Results:

    • Browse through different caption styles
    • Copy any caption you like
  4. Change Image:

    • Use the "↩ Change Image" button to upload a different image
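The upload step has to turn the chosen file into something the Gemini API accepts, which is a MIME type plus a base64 payload. The following is a minimal sketch of one way to do that, not the code in script.js: `splitDataUrl` and `readImageFile` are hypothetical names, and the sketch assumes the image is read with the browser's `FileReader.readAsDataURL`.

```javascript
// Hypothetical helper (not from script.js): split a data URL, as produced by
// FileReader.readAsDataURL, into its MIME type and raw base64 payload.
function splitDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL such as data:image/png;base64,...");
  }
  return { mimeType: match[1], data: match[2] };
}

// Hypothetical browser-only wrapper: read a File taken from an
// <input type="file"> element or a drop event. FileReader does not exist in Node.js.
function readImageFile(file) {
  return new Promise((resolve, reject) => {
    if (!file.type.startsWith("image/")) {
      reject(new Error(`Not an image: ${file.type || "unknown type"}`));
      return;
    }
    const reader = new FileReader();
    reader.onload = () => resolve(splitDataUrl(reader.result));
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}
```

Rejecting non-image files before reading them keeps the error close to the upload zone, where a retry is cheapest.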

🔧 API Configuration

The app uses Google's Gemini 2.5 Flash model with the following settings:

  • Temperature: 0.8 (for creative variety)
  • Max Output Tokens: 3048
  • Response Format: Structured JSON with predefined caption categories
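As a rough illustration of how these settings map onto a Gemini REST request, here is a sketch under stated assumptions. The function names `buildCaptionRequest` and `requestCaptions` are hypothetical, and setting `responseMimeType` to request JSON output is an assumption: the app may instead rely on the prompt alone to get JSON back.

```javascript
const MODEL = "gemini-2.5-flash";
const ENDPOINT =
  `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent`;

// Hypothetical helper: build the JSON body for one image plus one text prompt.
// `image` is expected to look like { mimeType: "image/png", data: "<base64>" }.
function buildCaptionRequest(image, prompt) {
  return {
    contents: [
      {
        parts: [
          { inline_data: { mime_type: image.mimeType, data: image.data } },
          { text: prompt },
        ],
      },
    ],
    generationConfig: {
      temperature: 0.8,        // creative variety, as listed above
      maxOutputTokens: 3048,   // as listed above
      responseMimeType: "application/json", // assumption, see lead-in
    },
  };
}

// Hypothetical helper: send the request and return the model's raw text reply.
async function requestCaptions(apiKey, image, prompt) {
  const response = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildCaptionRequest(image, prompt)),
  });
  if (!response.ok) {
    throw new Error(`Gemini API returned HTTP ${response.status}`);
  }
  const payload = await response.json();
  const text = payload.candidates?.[0]?.content?.parts?.[0]?.text;
  if (typeof text !== "string") {
    throw new Error("Gemini API response contained no text part");
  }
  return text;
}
```

Keeping the body builder separate from the network call makes the settings easy to inspect or adjust without sending a request.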

🎨 Customization

Styling

  • Colors: Dark theme with purple/blue gradients
  • Fonts: Syne for headings, DM Sans for body text
  • Animations: Smooth transitions and hover effects

Caption Prompts

Modify the prompt in script.js to customize caption generation:

text: `Analyze this image and respond ONLY with a JSON object...`
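If you change the prompt, the code that reads the reply usually has to change with it. The sketch below shows one defensive way to parse the reply; it is not taken from script.js. The key names in `EXPECTED_KEYS` are hypothetical placeholders for whatever keys your prompt requests, and the fence stripping assumes the model may occasionally wrap its JSON in a Markdown code fence.

```javascript
// Hypothetical key names: replace with the keys your prompt actually asks for.
const EXPECTED_KEYS = [
  "accessibility", "base", "aesthetic", "funny", "emotional",
  "genz", "minimal", "hashtags", "postingVibe", "emojiCombo",
];

const FENCE = "`".repeat(3); // a Markdown code fence marker

// Remove a surrounding Markdown code fence, if the model added one.
function stripFence(text) {
  let cleaned = text.trim();
  const firstNewline = cleaned.indexOf("\n");
  if (cleaned.startsWith(FENCE) && firstNewline !== -1) {
    cleaned = cleaned.slice(firstNewline + 1); // drop the opening fence line
  }
  if (cleaned.endsWith(FENCE)) {
    cleaned = cleaned.slice(0, -FENCE.length);
  }
  return cleaned.trim();
}

// Parse the model's reply and report which expected keys are missing,
// so the UI can show partial results instead of failing outright.
function parseCaptions(rawText) {
  const parsed = JSON.parse(stripFence(rawText));
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object of captions");
  }
  const missing = EXPECTED_KEYS.filter((key) => !(key in parsed));
  return { captions: parsed, missing };
}
```

Returning the missing keys rather than throwing lets the page render the styles that did come back and flag the rest for a retry.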

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

  • Google AI Studio for the powerful Gemini API
  • Google Fonts for the beautiful typography
  • Inspired by the need for diverse, inclusive caption generation

📞 Support

If you encounter any issues:

  1. Check that your API key is correctly set in script.js
  2. Ensure you're using a modern web browser
  3. Verify your internet connection for API calls
  4. Check the browser console for error messages
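The checks above can also be surfaced in the UI by translating the HTTP status of a failed API call into a readable message. This is an illustrative sketch only: `describeApiError` and `isRetryable` are hypothetical helpers, and the status-to-cause mapping is a simplification of the error codes the Gemini API documents.

```javascript
// Hypothetical helper: map an HTTP status from the Gemini API to a user-facing hint.
function describeApiError(status) {
  if (status === 400) {
    return "The request was rejected. Check that the API key in script.js is valid and the image format is supported.";
  }
  if (status === 403) {
    return "The API key does not have permission to use this model.";
  }
  if (status === 429) {
    return "Rate limit or quota reached. Wait a moment and try again.";
  }
  if (status >= 500) {
    return "The Gemini service is temporarily unavailable. Try again shortly.";
  }
  return `Unexpected error (HTTP ${status}). Check the browser console for details.`;
}

// Hypothetical helper: only rate limits and server-side failures are worth
// an automatic retry; a bad key or bad request will fail again.
function isRetryable(status) {
  return status === 429 || status >= 500;
}
```

A retry button could be shown only when `isRetryable` returns true, so users are not invited to repeat a request that cannot succeed.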

Made with ❤️ using Google's Gemini AI
