You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
📊 InsightIQ — AI-Powered Business Data Analysis Platform
An intelligent data analysis platform that allows businesses to upload CSV/Excel files, automatically generate visual insights, and interact with their data through a conversational AI assistant powered by Hugging Face — built with Flask, PostgreSQL, and a full ML/AI stack.
# 1. Clone the repository
git clone <repo-url>cd insightiq-backend
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate# 3. Install dependencies
pip install -r requirements.txt
# 4. Set up environment variables
cp .env.example .env
# Edit .env with your credentials# 5. Start Redis (required for Celery)
redis-server
# 6. Initialize database
flask db init
flask db migrate -m "initial migration"
flask db upgrade
# 7. Start Celery worker (separate terminal)
celery -A server.celery worker --loglevel=info
# 8. Start Celery beat scheduler (separate terminal)
celery -A server.celery beat --loglevel=info
# 9. Run the Flask server
python server.py
📡 API Overview
🔐 Authentication — /auth
Method
Endpoint
Description
POST
/auth/register
Register new user/organisation
POST
/auth/login
Login
POST
/auth/send-otp
Send OTP to email
POST
/auth/verify-otp
Verify OTP
POST
/auth/reset-password
Reset password
POST
/auth/logout
Logout
GET
/auth/profile
Get own profile
PUT
/auth/profile
Update profile
📂 File Upload — /upload
Method
Endpoint
Description
POST
/upload/csv
Upload CSV file
POST
/upload/excel
Upload Excel (.xlsx) file
GET
/upload/files
List all uploaded files
GET
/upload/file/:id
File detail + column metadata
DELETE
/upload/file/:id
Delete file
GET
/upload/file/:id/preview
Preview first 50 rows
GET
/upload/file/:id/columns
Get column names & data types
Supported formats:.csv, .xlsx, .xls
🔍 Analysis — /analysis
Method
Endpoint
Description
POST
/analysis/run/:file_id
Run full analysis on a file
GET
/analysis/:file_id
Get analysis results
GET
/analysis/:file_id/summary
Statistical summary
GET
/analysis/:file_id/charts
All generated charts
GET
/analysis/:file_id/insights
Text insights
GET
/analysis/:file_id/correlations
Correlation matrix
GET
/analysis/:file_id/trends
Trend detection
GET
/analysis/:file_id/anomalies
Anomaly detection
GET
/analysis/:file_id/forecast
Future value predictions
GET
/analysis/:file_id/clusters
Customer/data clustering
🤖 AI Chat — /ai
Method
Endpoint
Description
POST
/ai/chat/:file_id
Ask AI about your data
GET
/ai/chat/:file_id/history
Get chat history
DELETE
/ai/chat/:file_id/history
Clear chat history
POST
/ai/summarize/:file_id
AI text summary of data
POST
/ai/recommend/:file_id
AI business recommendations
Example queries:
"What is the total revenue for Q3?"
"Which product has the highest return rate?"
"Show me the top 5 customers by sales"
"What are the main trends in this data?"
"Give me 3 business recommendations based on this data"
📊 Dashboard — /dashboard
Method
Endpoint
Description
GET
/dashboard/overview
KPI cards & summary
GET
/dashboard/recent
Recent uploads & analyses
GET
/dashboard/charts
Saved charts
GET
/dashboard/activity
User activity log
📄 Reports — /report
Method
Endpoint
Description
POST
/report/generate/:file_id
Generate PDF report
GET
/report/list
All generated reports
GET
/report/:id
Download PDF report
DELETE
/report/:id
Delete report
POST
/report/schedule
Schedule auto report
GET
/report/scheduled
List scheduled reports
DELETE
/report/schedule/:id
Cancel scheduled report
🔔 Notifications — /notifications
Method
Endpoint
Description
GET
/notifications
All notifications
PUT
/notifications/:id/read
Mark as read
PUT
/notifications/read-all
Mark all as read
DELETE
/notifications/:id
Delete notification
👑 Admin — /admin
Method
Endpoint
Description
GET
/admin/dashboard
Platform-wide stats
GET
/admin/users
All users
PUT
/admin/user/:id
Update user
DELETE
/admin/user/:id
Delete user
GET
/admin/usage
API & storage usage
GET
/admin/subscriptions
Billing & plans
🧠 AI & ML Pipeline
1. File Upload & Parsing
CSV/Excel uploaded
→ Pandas reads file
→ Auto-detect column types (numeric, categorical, datetime, text)
→ Store column metadata in DB
→ Preview first 50 rows cached
2. Automatic Analysis (Celery async task)
Analysis triggered
→ Statistical summary (mean, median, std, min, max, quartiles)
→ Missing value detection & report
→ Correlation matrix (NumPy/Pandas)
→ Trend analysis (SciPy regression)
→ Anomaly detection (scikit-learn IsolationForest)
→ Customer segmentation (scikit-learn KMeans clustering)
→ Time-series forecasting (TensorFlow / scikit-learn)
→ Chart generation (Matplotlib + Plotly)
→ Upload charts to Cloudinary
→ Save insights to DB
→ Notify user via SocketIO
3. Chart Types Generated
Chart
When Generated
Bar Chart
Categorical comparisons
Line Chart
Time-series / trends
Scatter Plot
Correlation between columns
Pie / Donut
Distribution / proportions
Heatmap
Correlation matrix
Histogram
Numeric distribution
Box Plot
Outlier detection
Cluster Plot
Segmentation results
Forecast Chart
Future predictions
4. AI Chat (Hugging Face)
User asks question
→ Context built from:
- Column names & types
- Statistical summary
- Top 100 rows as sample
- Previous conversation history
→ Sent to Hugging Face model (Mistral / Llama / etc.)
→ Response returned to user
→ Stored in chat history
Reports are generated by Celery Beat and emailed via Flask-Mail as PDF attachments.
💳 Subscription & Billing (Razorpay)
Plan
File Uploads
AI Queries
Storage
Free
5/month
20/month
100 MB
Pro
50/month
500/month
5 GB
Business
Unlimited
Unlimited
50 GB
Payments processed via Razorpay integration.
🚦 Rate Limiting
default_limits= ["500 per day", "100 per hour"]
# AI endpoints (heavier — limited separately)limiter.limit("50 per day;10 per hour") # AI chatlimiter.limit("20 per day;5 per hour") # Report generation
🚀 Deployment
Production with Gunicorn
# Install production deps
pip install gunicorn eventlet
# Run with Gunicorn + Eventlet (for SocketIO support)
gunicorn --worker-class eventlet -w 1 \
--bind 0.0.0.0:5000 \
--timeout 300 \
server:app
⚠️ Use --timeout 300 since ML analysis jobs can take longer than default 30s.
With Nginx
server{listen80;server_name your-domain.com; # Increase timeout for ML processingproxy_read_timeout300;proxy_connect_timeout300;location / {proxy_passhttp://127.0.0.1:5000;proxy_http_version 1.1;proxy_set_header Upgrade $http_upgrade;proxy_set_header Connection "upgrade";proxy_set_header Host $host;client_max_body_size50M; # allow large file uploads
}}
Celery in Production
# Worker
celery -A server.celery worker --loglevel=info --concurrency=4
# Scheduler
celery -A server.celery beat --loglevel=info
🔒 Security
JWT tokens with 7-day expiry
OTP-based email verification & password reset
Role-based access (Admin / User)
File type validation on upload (only CSV/Excel allowed)
File size limits enforced
Rate limiting on all endpoints (stricter on AI/ML routes)
Cloudinary signed URLs for secure file access
All credentials in .env (never committed)
📊 Key Dependencies Summary
# Web Framework
Flask==3.1.2
Flask-RESTful==0.3.10
Flask-JWT-Extended==4.7.1
Flask-SocketIO==5.6.0
Flask-Limiter==4.1.1
Flask-Mail==0.10.0
# Database
Flask-SQLAlchemy==3.1.1
Flask-Migrate==4.1.0
psycopg2-binary==2.9.11
alembic==1.18.1
# ML / AI
numpy==2.4.3
pandas==3.0.1
scikit-learn==1.8.0
scipy==1.17.1
matplotlib==3.10.8
plotly==6.6.0
huggingface_hub==1.7.1
# Background Jobs
celery==5.6.2
redis==7.1.1
APScheduler==3.11.1
# File & Storage
cloudinary==1.44.1
openpyxl==3.1.5
reportlab==4.4.5
Pillow==12.1.0
# Production
gunicorn==25.0.3
eventlet==0.40.4
# Payments
razorpay==2.0.0
An AI-powered business data analysis platform where users upload CSV/Excel files to get automated visual insights, statistical summaries, anomaly detection, and forecasting — with a conversational AI assistant powered by Hugging Face for natural language data queries.