An end-to-end multi-agent framework for automated business analytics β built with AutoGen, connected to Google BigQuery, and served through a real-time group chat UI. Supports any monthly dataset through a simple configuration layer.
Demo
Agents taking in user feedback and master agent assign tasks to analyst agent:

Directly Query Data from BigQuery Global Ads Performance Dataset (Kaggle)
Given a reference date and a dataset, the system automatically:
- Pulls live data from Google BigQuery (any monthly dataset)
- Runs a structured analytical pipeline across 4 specialist AI agents
- Generates 12-month trend charts (YoY line + CTG stacked bar) rendered inline in the chat UI
- Searches the web for external context explaining the key driver
- Writes an executive narrative combining quantitative findings with external context
- Generates a PowerPoint slide with chart, table, and key insights
- Streams everything in real time to a dark-themed group chat UI
- Accepts user feedback at checkpoints β ask questions, request drill-downs, edit the narrative β Master routes to the right agent automatically and loops back until you approve
| Agent | Role | Tools |
|---|---|---|
| π― Master | Orchestrates pipeline, synthesizes findings, routes user feedback | None (coordinates only) |
| π Analyst | Runs all quantitative analysis β decomposition, drill-downs, trend data | BigQuery β pandas tools |
| π WebSearch | Finds external factors explaining trends | Serper.dev search API |
| π¨ Visualization | Generates the final PowerPoint slide | pptxgenjs (Node.js) |
Reference Date + Dataset Input
β
βΌ
Phase 1 β Steps 1-3 (Analyst)
Step 1: Schema & data quality check
Step 2: Overall MoM + YoY metric summary
Step 3: CTG decomposition by all configured dimensions
+ 12-month YoY line charts (one per dimension)
+ 12-month CTG stacked bar charts (one per dimension)
β
βΌ
β Checkpoint 1 β Interactive feedback loop
User can: ask questions | request drill-downs | continue
Master routes to Analyst or WebSearch as needed
Loops until user approves
β
βΌ
Phase 2 β Step 4 (Analyst)
Filter to top primary dimension driver
β Secondary dimension decomposition within that segment
β Top sub-driver identification + analytical observation
β
βΌ
Round 2 β Master identifies search queries (using actual dimension names)
β
βΌ
Round 3 β WebSearch runs queries, returns external context
β
βΌ
Round 4 β Master writes narrative + slide JSON spec
β
βΌ
β Checkpoint 2 β Interactive feedback loop
User can: edit narrative | request more analysis | search for more context | continue
Master routes to Analyst or WebSearch as needed, re-writes spec
Loops until user approves
β
βΌ
Round 5 β Visualization generates PowerPoint slide + download link
- YoY % =
(segment_current / segment_prior_year) β 1 - CTG % =
(segment_current β segment_prior_year) / total_prior_year_value- All CTGs for a dimension sum to total portfolio YoY β
- Trend charts = last 12 completed months, month-over-same-month prior year
The system works with any monthly dataset via a DatasetConfig β no code changes needed to switch datasets.
credit_card (default)
DatasetConfig(
date_col="Date", value_col="Amount", value_label="Spend",
dimensions=["Card Type", "Exp Type"],
primary_dim="Card Type", secondary_dim="Exp Type",
)global_ads
DatasetConfig(
date_col="month", value_col="total_revenue", value_label="Revenue",
dimensions=["platform", "campaign_type", "industry", "country"],
primary_dim="platform", secondary_dim="campaign_type",
)Just change two lines in .env:
DATASET_PRESET=global_ads
BQ_TABLE=your-project.your_dataset.Global_Ads_monthlyAdd one entry to DATASET_PRESETS in tools.py:
"my_dataset": DatasetConfig(
date_col="transaction_date",
value_col="revenue",
value_label="Revenue",
dimensions=["region", "product_category", "channel"],
primary_dim="region",
secondary_dim="product_category",
),- Agents: AutoGen AgentChat with GPT-4o
- Data: Google BigQuery (with local CSV fallback)
- Analytics: pandas
- Web Search: Serper.dev API
- Slide Generation: pptxgenjs (Node.js)
- Server: Flask with Server-Sent Events (SSE) broadcast queue
- UI: Vanilla HTML/CSS/JS + Chart.js β dark group chat interface
Multi-Agent-Analytics-Automation/
βββ multi_agent_code/
β βββ server.py # Flask server β SSE broadcast, checkpoint routing
β βββ pipeline.py # Pipeline orchestration β phases, checkpoints, feedback loops
β βββ agents_multi.py # Agent definitions (Master, Analyst, WebSearch, Viz)
β βββ prompts_multi.py # Dataset-agnostic system prompts for all 4 agents
β βββ tools.py # DatasetConfig, BigQuery loader, analytics tools, search, slide gen
β βββ generate_slide.js # Node.js PowerPoint builder (pptxgenjs)
β βββ main_multi.py # Terminal mode entry point
β βββ ui/
β βββ index.html # Group chat UI with inline Chart.js charts
βββ data/ # Local CSV fallback (not committed)
βββ output/ # Generated .pptx files (not committed)
βββ .env # API keys and config (never commit)
βββ requirements.txt
- Python 3.11+
- Node.js 18+ (for slide generation)
- Google Cloud account (free tier works)
- OpenAI API key
- Serper.dev API key (free tier: 2,500 searches/month)
git clone https://github.com/BFvandy/Multi-Agent-Analytics-Automation.git
cd Multi-Agent-Analytics-Automation
python -m venv venv
source venv/bin/activatepip install -r requirements.txt
pip install google-cloud-bigquery db-dtypes pyarrowcd multi_agent_code
npm install pptxgenjs# Install gcloud CLI (macOS)
brew install --cask google-cloud-sdk
# Authenticate
gcloud auth login
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID
gcloud auth application-default set-quota-project YOUR_PROJECT_IDUpload your CSV to BigQuery. The table needs a date/month column, a numeric value column, and one or more categorical dimension columns.
OPENAI_API_KEY=sk-...
SERPER_API_KEY=...
BQ_PROJECT=your-gcp-project-id
BQ_TABLE=your-project.your_dataset.your_table
DATASET_PRESET=credit_card # or global_ads, or any custom preset name
# Optional β use local CSV instead of BigQuery
# USE_CSV=true
# DATA_FILE=India_cc_transactions.csv
# Optional β change server port (default 8080)
# PORT=8080# Web UI (recommended)
python server.py
# β open http://localhost:8080
# Terminal mode
python main_multi.py- Open
http://localhost:8080 - Enter a reference date (
YYYY-MM-01) β e.g.2025-03-01analyses February 2025 - Watch the 4 agents work in real time in the group chat
- At Checkpoint 1 β review the analysis. You can:
- Ask a question: "what was January revenue?"
- Request a drill-down: "drill into Google Ads by country"
- Leave blank or type
okto continue
- At Checkpoint 2 β review the narrative and slide spec. You can:
- Request an edit: "change the headline to focus on the decline"
- Request more research: "search for TikTok ad spend trends Q4 2025"
- Leave blank or type
okto generate the slide
- Download the generated
.pptxat the end
Both checkpoints loop β Master handles your request, routes to the right agent, and shows the checkpoint again until you approve.
.env variable |
Default | Description |
|---|---|---|
OPENAI_API_KEY |
β | OpenAI API key |
SERPER_API_KEY |
β | Serper.dev API key for web search |
BQ_PROJECT |
β | GCP project ID |
BQ_TABLE |
β | Full BigQuery table path (project.dataset.table) |
DATASET_PRESET |
credit_card |
Which DatasetConfig preset to use |
USE_CSV |
false |
Set to true to bypass BigQuery and use local CSV |
DATA_FILE |
India_cc_transactions.csv |
CSV filename under data/ |
PORT |
8080 |
Flask server port |
macOS note: Port 5000 is used by AirPlay Receiver. Default port is 8080. To use 5000, disable AirPlay in System Settings β AirDrop & Handoff.
Charts bypass the LLM entirely. get_trend_charts is called directly from pipeline.py in Python, and chart events are emitted straight to the UI. LLMs silently normalize percentage values to decimals when serializing JSON arrays β bypassing them ensures the numbers are always correct.
Two-phase analyst. Steps 1-3 and Step 4 are separate run_until_complete calls with distinct trigger words (STEPS 1-3 COMPLETE / ANALYSIS COMPLETE). This guarantees Step 4 always runs regardless of how long Step 3 takes.
Fresh agent for additional analysis. When a user requests a drill-down at a checkpoint, a new analyst agent instance is created with no conversation history. The original agent's history ends with "ANALYSIS COMPLETE" which causes it to skip tool calls. The fresh agent gets all context via the task prompt instead.
Master holds all memory. Every Master call receives the full accumulated context β original analysis, all user-requested drill-downs, web research, and the current narrative. Specialist agents are stateless workers; Master is the single source of truth.
SSE broadcast queue. Each browser connection gets its own queue.Queue. _push() writes every event to all queues simultaneously, so reconnects and multiple tabs both see the complete event stream without missing messages.
Dataset-agnostic prompts. Master's system prompt contains no hardcoded column names. Dataset context (primary_dim, secondary_dim, value_label) is injected into every Master task message at runtime from DatasetConfig, preventing the model from defaulting to terminology from its training data.