Your personal Instagram knowledge base. Paste any Instagram URL into Telegram — InstaIntel downloads it, extracts text via OCR, identifies topics/people/brands, and indexes everything into a searchable knowledge graph. Then ask questions in plain English and get answers powered by Claude.
You (Telegram) InstaIntel
│ │
├── paste IG URL ──────► download (instaloader + yt-dlp)
│ │ extract text (Gemini Vision)
│ │ identify entities (Claude)
│ │ index (ChromaDB + NetworkX)
│ │
◄── summary + images ──┤
│ │
├── "skincare tips?" ──► semantic search + RAG
◄── Claude answers ────┤
| Service | Where to get it | What it does |
|---|---|---|
| Telegram Bot Token | Message @BotFather → /newbot | Your bot's identity |
| Your Telegram User ID | Message @userinfobot | Restricts bot access to you |
| Gemini API Key | aistudio.google.com/apikey | Vision OCR + video analysis |
| Anthropic API Key | console.anthropic.com | Entity extraction + RAG chat |
git clone <your-repo> && cd insta-intel
chmod +x setup.sh && ./setup.sh

The script will:
- Install Python dependencies + ffmpeg
- Ask for your 4 keys interactively
- Create your .env file
- Start the bot
That's it. Paste an Instagram link and watch it work.
Paste any Instagram URL into the chat:
https://www.instagram.com/p/ABC123/
https://www.instagram.com/reel/XYZ789/
The bot downloads, analyzes, and replies with a summary + the images.
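As a rough illustration of how a pasted message could be recognized as a post or reel link, here is a minimal sketch. The regex, `ParsedLink`, and `parse_instagram_url` are hypothetical names for this example, not the bot's actual code, and only the two URL shapes shown above are handled.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: matches /p/<shortcode>/ and /reel/<shortcode>/ links only.
_IG_URL = re.compile(
    r"^https?://(?:www\.)?instagram\.com/(?P<kind>p|reel)/(?P<shortcode>[A-Za-z0-9_-]+)/?"
)


class ParsedLink(NamedTuple):
    kind: str        # "post" for /p/ links, "reel" for /reel/ links
    shortcode: str   # the ID segment of the URL, e.g. "ABC123"


def parse_instagram_url(text: str) -> Optional[ParsedLink]:
    """Return the link type and shortcode, or None if text is not a supported URL."""
    match = _IG_URL.match(text.strip())
    if match is None:
        return None
    kind = "post" if match.group("kind") == "p" else "reel"
    return ParsedLink(kind, match.group("shortcode"))


print(parse_instagram_url("https://www.instagram.com/p/ABC123/"))
print(parse_instagram_url("https://www.instagram.com/reel/XYZ789/"))
```

Anything that does not match (for example a plain-English question) returns `None`, which is one way a bot could decide to treat the message as a chat query instead of a download request.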
Just type naturally:
What skincare tips did I save?
Summarize that fitness post
Which posts mention protein?
| Command | What it does |
|---|---|
| /start | Welcome message + library overview |
| /stats | Post counts, categories, index size |
| /topics | Your most-saved topics |
| /recent | Last 5 saved posts |
| /category | Browse by category (e.g. /category fitness) |
| /graph | Download interactive knowledge graph (HTML) |
| /cost | AI API usage & costs (this session) |
| /flush confirm | Delete all saved data and start fresh |
Every night at midnight IST, the bot sends you a summary of everything you saved that day — grouped by category with trending topics. Media files are cleaned up automatically.
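To make the "midnight IST" timing concrete, here is a small sketch of computing how long to wait until the next digest. It assumes IST is modelled as a fixed UTC+05:30 offset (India does not observe daylight saving time); `next_midnight_ist` and `seconds_until_digest` are illustrative names, and the bot's real scheduler may work differently.

```python
from datetime import datetime, time, timedelta, timezone

# IST as a fixed UTC+05:30 offset (assumption: no tz database lookup needed).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def next_midnight_ist(now: datetime) -> datetime:
    """Return the first midnight IST strictly after `now` (must be timezone-aware)."""
    local = now.astimezone(IST)
    tomorrow = local.date() + timedelta(days=1)
    return datetime.combine(tomorrow, time.min, tzinfo=IST)


def seconds_until_digest(now: datetime) -> float:
    """Seconds to sleep before the next daily digest should be sent."""
    return (next_midnight_ist(now) - now).total_seconds()


# 18:00 UTC is 23:30 IST, so the next digest is 30 minutes away.
example = datetime(2025, 1, 1, 18, 0, tzinfo=timezone.utc)
print(seconds_until_digest(example))
```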
Instagram URL
│
├─ Image/Carousel ──► instaloader (no login needed)
│ │
└─ Reel/Video ───────► yt-dlp
│
Download media
│
┌───────────┴───────────┐
│ │
Images/Slides Video (mp4)
│ │
Gemini Vision Gemini Video
(batched OCR) (keyframe analysis)
│ │
└───────────┬───────────┘
│
Claude Entity Extraction
(topics, people, brands,
products, locations, tips)
│
┌────────────┼────────────┐
│ │ │
SQLite ChromaDB NetworkX
metadata vector search knowledge graph
│ │ │
└────────────┼────────────┘
│
Claude RAG Chat
(answers your questions)
| Type | Download | OCR | Entity Extraction | Graph |
|---|---|---|---|---|
| Single image | instaloader | Gemini Vision | Claude | yes |
| Carousel (all slides) | instaloader (each slide) | Gemini batched | Claude | yes |
| Reel (video) | yt-dlp + keyframes | Gemini Video | Claude | yes |
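For reels, keyframes are sampled at a fixed interval (the `REEL_KEYFRAME_INTERVAL` setting, 3 seconds by default). The sketch below only shows which timestamps such sampling would pick for a clip of a given length. `keyframe_timestamps` is a hypothetical helper; the actual frame extraction, done with ffmpeg in the project, is not shown.

```python
from typing import List


def keyframe_timestamps(duration_s: float, interval_s: float = 3.0) -> List[float]:
    """Timestamps (in seconds) at which a frame would be sampled, starting at 0."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if duration_s < 0:
        return []
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]


# A 10-second reel sampled every 3 seconds yields frames at 0, 3, 6 and 9 seconds.
print(keyframe_timestamps(10))
```

A smaller interval gives Gemini more frames to analyze at the cost of more image tokens, so the default is a trade-off rather than a fixed rule.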
| Priority | Provider | Cost | Quality |
|---|---|---|---|
| 1 | Gemini 2.5 Flash | ~$0.001/image | excellent |
| 2 | Claude Vision | ~$0.004/image | excellent |
| 3 | Tesseract OCR | free | text only |
Override with VISION_PROVIDER=claude in .env.
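One way to read the priority table: an explicit `VISION_PROVIDER` wins, and `auto` walks the list until it finds a provider whose API key is set. The sketch below encodes that reading. The behavior of `auto` and the function name `pick_vision_provider` are assumptions made for illustration, not confirmed details of `core/vision.py`.

```python
from typing import Mapping, Optional

# Priority order from the table above; key names match the .env settings.
PRIORITY = ("gemini", "claude", "tesseract")
REQUIRED_KEY: Mapping[str, Optional[str]] = {
    "gemini": "GEMINI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
    "tesseract": None,  # local OCR, no API key needed
}


def pick_vision_provider(env: Mapping[str, str]) -> str:
    """Honor an explicit override; for "auto", fall back by priority (assumed behavior)."""
    choice = env.get("VISION_PROVIDER", "gemini").strip().lower()
    if choice != "auto":
        if choice not in PRIORITY:
            raise ValueError(f"unknown VISION_PROVIDER: {choice!r}")
        return choice
    for name in PRIORITY:
        key = REQUIRED_KEY[name]
        if key is None or env.get(key):
            return name
    raise RuntimeError("no vision provider available")


print(pick_vision_provider({"VISION_PROVIDER": "auto", "ANTHROPIC_API_KEY": "sk-example"}))
```

In practice you would pass `os.environ` (after loading `.env`) instead of a literal dict.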
insta-intel/
├── main.py # Entry point — starts Telegram bot
├── config.py # Configuration from .env
├── query.py # CLI search interface
├── setup.sh # Interactive setup script
├── requirements.txt
├── core/
│ ├── models.py # MediaType, MediaItem dataclasses
│ ├── downloader.py # instaloader + yt-dlp
│ ├── vision.py # Gemini/Claude/Tesseract OCR
│ ├── gemini_video.py # Reel video analysis
│ ├── entity_extractor.py # Claude entity extraction
│ └── pipeline.py # Orchestrates the full flow
├── storage/
│ ├── database.py # SQLite metadata + dedup
│ ├── vector_store.py # ChromaDB semantic search
│ └── knowledge_graph.py # NetworkX entity graph + pyvis export
├── bot/
│ └── telegram_bot.py # Telegram bot + daily digest
└── data/ # Created at runtime
├── instaintel.db
├── chroma/
├── media/
└── knowledge_graph.json
Posts are connected to extracted entities:
post:ABC123 ──has_topic──► topic:skincare
│ │
├──mentions_brand──► brand:cerave
│
├──authored_by──► person:@dermatologist
│
└──in_category──► category:beauty
Export as interactive HTML with /graph in Telegram or python query.py --graph from CLI.
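To show what those edges make possible, here is a dependency-free sketch that stores them as (source, relation, target) triples and looks up every post attached to an entity. The project itself keeps the graph in NetworkX; plain dicts are swapped in here so the example runs without extra packages. `post:XYZ789` and the helper names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Edges from the diagram above, plus one made-up post for a non-trivial lookup.
EDGES: List[Edge] = [
    ("post:ABC123", "has_topic", "topic:skincare"),
    ("post:ABC123", "mentions_brand", "brand:cerave"),
    ("post:ABC123", "authored_by", "person:@dermatologist"),
    ("post:ABC123", "in_category", "category:beauty"),
    ("post:XYZ789", "has_topic", "topic:skincare"),  # hypothetical second post
]


def build_index(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Map each entity node to the set of posts that point at it."""
    posts_by_entity: Dict[str, Set[str]] = defaultdict(set)
    for source, _relation, target in edges:
        posts_by_entity[target].add(source)
    return posts_by_entity


def posts_for(entity: str, index: Dict[str, Set[str]]) -> List[str]:
    """Sorted list of posts linked to `entity`; empty if the entity is unknown."""
    return sorted(index.get(entity, ()))


index = build_index(EDGES)
print(posts_for("topic:skincare", index))
```

Two posts that share an entity node are implicitly related, which is the property a graph view makes visible.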
All settings in .env:
# Required
TELEGRAM_BOT_TOKEN=... # From @BotFather
TELEGRAM_ALLOWED_USERS=123456 # Your Telegram user ID
# Recommended
GEMINI_API_KEY=... # Vision + video analysis
ANTHROPIC_API_KEY=... # Entity extraction + RAG chat
# Optional (defaults shown)
VISION_PROVIDER=gemini # auto | gemini | claude | tesseract
ANTHROPIC_CHAT_MODEL=claude-sonnet-4-6
GEMINI_VIDEO_MODEL=gemini-2.5-flash
REEL_KEYFRAME_INTERVAL=3 # Seconds between keyframes
EMBEDDING_MODEL=all-MiniLM-L6-v2
LOG_LEVEL=INFO

| Usage | Images/month | Cost/month |
|---|---|---|
| Light (3 posts/day) | ~100 | < $0.10 |
| Moderate (10 posts/day) | ~400 | ~$0.40 |
| Heavy (30 posts/day) | ~2000 | ~$2.00 |
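The monthly figures appear consistent with the approximate Gemini rate of $0.001 per image from the vision provider table. A quick sketch of that arithmetic follows; it covers vision calls only, and `monthly_vision_cost` is an illustrative helper rather than how `/cost` is computed.

```python
# Approximate per-image prices from the vision provider table (USD).
COST_PER_IMAGE_USD = {"gemini": 0.001, "claude": 0.004, "tesseract": 0.0}


def monthly_vision_cost(images_per_month: int, provider: str = "gemini") -> float:
    """Estimated monthly vision spend in USD, rounded to cents."""
    return round(images_per_month * COST_PER_IMAGE_USD[provider], 2)


for label, images in [("Light", 100), ("Moderate", 400), ("Heavy", 2000)]:
    print(f"{label}: ~${monthly_vision_cost(images):.2f}/month")
```

Entity extraction and chat calls to Claude are billed separately and are not included in this estimate.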
For always-on deployment (Linux):
# Copy service file
sudo cp instaintel.service /etc/systemd/system/
sudo systemctl daemon-reload
# Start
sudo systemctl start instaintel
sudo systemctl enable instaintel # auto-start on boot
# Logs
journalctl -u instaintel -f

Go live in one command:
cd terraform
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your API keys + SSH key name
terraform init
terraform apply

Bot is live in ~3 minutes. Check status:
terraform output ssh_command # SSH in
terraform output bot_logs # Stream logs
terraform output setup_log # Check first-boot progress

Tear down:
terraform destroy # deletes everything

What it creates:
- 1 EC2 instance (t3.small, ~$15/mo)
- Security group (SSH only, all outbound)
- Auto-installs Python, ffmpeg, clones repo, starts bot via systemd
MIT