Senior Data Analyst & Analytics Engineer | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP
Senior Data Analyst & Analytics Engineer with 10+ years across consultative ERP sales (Omie — Brazil's leading cloud ERP), business consulting, and analytics. I diagnose where SaaS, DTC, and E-Commerce companies lose revenue and build the governed Agentic BI systems to capture it.
I've sat in the revenue meetings. I know what the VP of Sales needs by Monday morning — and I build the analytics to answer it before they ask.
Python SQL dbt ELT Pipelines BigQuery Databricks MCP Servers LangChain Claude/Gemini APIs FastAPI Looker Studio Streamlit React/NextJS
Building governed AI analytics systems where natural language queries return deterministic, dbt-governed insights in 15 seconds — no hallucinations, no metric drift.
BI Architecture · Revenue Analytics · Marketing & Funnel Analytics · SaaS Unit Economics (CAC, LTV, MRR, Churn, ROAS) · Financial Modeling · ELT Architecture
🎓 TripleTen Data Science Residency (700h+) · Databricks Certified · Six Sigma Green Belt · Final-round candidate — Epic Games 2026
| Category | Project | Stack |
|---|---|---|
| 🏆 AI / Analytics Eng | Full-Funnel AI Analytics Platform | dbt · MCP · BigQuery · XGBoost · Claude/Gemini |
| 🧠 AI Tooling / OSS | Unified AI Data Framework | Claude Code · 12 Playbooks · 33 Skills · 9 Personas |
| RevOps / AI | RevOps Lead Engine: AI-Powered B2B Command Center | Python · Streamlit · Plotly · XAI |
| Product Strategy / UXR | Epic Games Store: 2026 Ecosystem Intelligence Audit | Python · Streamlit · Random Forest · NLP |
| Stats / Edu | PunkSQL — Mobile SQL Learning Platform | Next.js · SQLite/WASM · Vercel · Google OAuth |
| Infra / AI Chatbot | Portfolio Website + AI Chatbot Infrastructure | React · Gemini · Cloud Run · LangFuse · Docker |
| GenAI / BI | Conversational BI & Generative Analytics (Music Trends) | Streamlit · LLMs · LangChain |
Natural language queries · dbt Semantic Layer · 7 MCP Servers · ML Lead Scoring · $0/month base cost
Ask in natural language:
"Show me the complete marketing funnel for Q1 2025: ad spend across Google and Meta, website sessions by channel, lead conversion rates, and final revenue. Calculate blended CAC and ROAS."
Business Context: Companies run ads across Google, Meta, and organic channels. Marketing claims leads. Sales says they're low quality. The CEO asks: "Where should we spend next quarter?"
Answering this requires joining data from 5+ platforms, building attribution models, scoring leads, and making it all accessible to non-technical stakeholders. This project builds the production system — at $0/month base cost.
Architecture: Three Heads, One Spine
- AI Layer — 7 MCP servers + Claude Desktop, OpenCode, Gemini CLI, Antigravity. Ask questions in plain English, get production-grade React dashboards in 15 seconds.
- BI Layer — Looker Studio + Streamlit + Claude React artifacts. Same governed metrics across every output.
- ML Layer: XGBoost trained on 93K rows + FastAPI `/score` endpoint + n8n auto-routing. Hot leads bypass the queue instantly.
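The "hot leads bypass the queue" rule in the ML Layer can be sketched as a small routing function. This is a minimal illustration in plain Python, not the project's FastAPI endpoint or its n8n workflow: the `route_lead` function, the `RoutingDecision` type, the queue names, and the 0.8 cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed cutoff for a "hot" lead; the real threshold is not stated in this README.
HOT_THRESHOLD = 0.8


@dataclass(frozen=True)
class RoutingDecision:
    lead_id: str
    score: float
    queue: str  # "priority" (bypasses the queue) or "standard"


def route_lead(lead_id: str, score: float,
               hot_threshold: float = HOT_THRESHOLD) -> RoutingDecision:
    """Map a model score in [0, 1] to a hypothetical routing queue."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    queue = "priority" if score >= hot_threshold else "standard"
    return RoutingDecision(lead_id=lead_id, score=score, queue=queue)


if __name__ == "__main__":
    for lead_id, score in [("lead-001", 0.91), ("lead-002", 0.42)]:
        print(route_lead(lead_id, score))
```

In a deployed setup, a function like this would typically sit behind the scoring endpoint, with the automation tool acting on the returned queue.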
Key Numbers:
- 29 dbt models — full Medallion Architecture (Bronze → Silver → Gold)
- 2.2M+ rows of real Olist ecommerce data + synthetic marketing data
- 7 MCP servers — Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
- 4 attribution models — First-Touch, Last-Touch, Linear, Time-Decay
- 5 warehouses — BigQuery, DuckDB, Snowflake, Databricks, Supabase
- 4 AI clients — Claude Desktop, OpenCode, Gemini CLI, Antigravity
- $0/month base infrastructure cost
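To make the four attribution models listed above concrete, here is a minimal Python sketch of how each one could split credit for a single conversion across a customer's touchpoints. It is an illustration under stated assumptions, not the project's dbt implementation: the function names, the `(channel, timestamp)` journey shape, and the 7-day half-life for Time-Decay are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# A journey is a time-ordered list of (channel, timestamp) touches.
Touch = tuple[str, datetime]


def first_touch(touches: list[Touch]) -> dict[str, float]:
    """All credit goes to the first channel in the journey."""
    return {touches[0][0]: 1.0}


def last_touch(touches: list[Touch]) -> dict[str, float]:
    """All credit goes to the last channel before conversion."""
    return {touches[-1][0]: 1.0}


def linear(touches: list[Touch]) -> dict[str, float]:
    """Credit is split evenly across every touch."""
    credit: dict[str, float] = defaultdict(float)
    for channel, _ in touches:
        credit[channel] += 1.0 / len(touches)
    return dict(credit)


def time_decay(touches: list[Touch], conversion_time: datetime,
               half_life_days: float = 7.0) -> dict[str, float]:
    """Touches closer to conversion get more credit (assumed 7-day half-life)."""
    weights = []
    for channel, ts in touches:
        age_days = (conversion_time - ts).total_seconds() / 86400
        weights.append((channel, 0.5 ** (age_days / half_life_days)))
    total = sum(w for _, w in weights)
    credit: dict[str, float] = defaultdict(float)
    for channel, w in weights:
        credit[channel] += w / total
    return dict(credit)


if __name__ == "__main__":
    journey = [
        ("google_ads", datetime(2025, 1, 1)),
        ("meta_ads", datetime(2025, 1, 8)),
        ("organic", datetime(2025, 1, 15)),
    ]
    converted_at = datetime(2025, 1, 15)
    print(first_touch(journey))
    print(last_touch(journey))
    print(linear(journey))
    print(time_decay(journey, converted_at))
```

Each function returns credit shares that sum to 1, so multiplying by the conversion's revenue gives attributed revenue per channel.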
The Core Insight — Why Governance Matters: Most AI-to-SQL tools fail because they lack a source of truth. This project solves that with the dbt Semantic Layer (MetricFlow): define "ROAS" once in YAML, and every AI client, dashboard, and ML pipeline consumes the exact same definition — forever. No hallucinations. No metric drift.
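The "define once, consume everywhere" idea can be sketched in a few lines of Python. This is only a stand-in for the concept: the project itself defines metrics in dbt Semantic Layer YAML, which is not reproduced here. The `Metric` class, the `METRICS` registry, `query_metric`, and the exact formulas (ROAS as revenue over ad spend, blended CAC as total ad spend over new customers) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[dict[str, float]], float]


# Single source of truth: every consumer looks metrics up here
# instead of re-deriving the formula in its own SQL or prompt.
METRICS: dict[str, Metric] = {
    "roas": Metric(
        name="roas",
        description="Revenue divided by ad spend (assumed definition)",
        compute=lambda t: t["revenue"] / t["ad_spend"],
    ),
    "blended_cac": Metric(
        name="blended_cac",
        description="Total ad spend across channels per new customer (assumed)",
        compute=lambda t: t["ad_spend"] / t["new_customers"],
    ),
}


def query_metric(name: str, totals: dict[str, float]) -> float:
    """Resolve a metric by name; unknown names fail loudly instead of being guessed."""
    if name not in METRICS:
        raise KeyError(f"unknown metric: {name!r}")
    return METRICS[name].compute(totals)


if __name__ == "__main__":
    # Illustrative totals only, not project data.
    totals = {"revenue": 50_000.0, "ad_spend": 10_000.0, "new_customers": 200.0}
    print("ROAS:", query_metric("roas", totals))
    print("Blended CAC:", query_metric("blended_cac", totals))
```

The design point is the failure mode: a client asking for a metric that is not in the registry gets an error rather than an improvised formula, which is what keeps definitions from drifting between consumers.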
Turn Claude Code into a senior data science team — 12 playbooks · 33 skill templates · 9 LLM personas · $0/month
The Problem: You ask Claude to analyze a dataset. It writes a notebook — with no assumption checks, no baseline comparison, no experiment tracking, inconsistent structure every time. Not because Claude is bad. Because Claude has no structure to follow.
The Solution: This framework installs a library of playbooks, personas, skills, and guardrails directly into Claude Code. Claude reads them. Claude follows them. You get senior-quality output — every time.
| Without this framework | With this framework |
|---|---|
| Hallucinated analytical structures | 33 vetted skill templates with scripts |
| No assumption checks | Full statistical diagnostics before every test |
| Inconsistent quality across projects | Same playbook, every time |
| No stakeholder communication plan | Executive summaries + impact quantification |
| Data leakage goes undetected | DS Reviewer persona audits the full pipeline |
Key Numbers:
- 12 analytical playbooks — full DS lifecycle: Problem Framing → EDA → Feature Engineering → Model Training → Stakeholder Communication
- 33 skill templates — cohort analysis, A/B testing, funnel analysis, metric reconciliation, root cause investigation, and more
- 9 LLM personas, including Data Analyst, Analytics Engineer, ML Engineer, DS Reviewer, Product Manager, and UX Researcher, deployed as Claude Code `/slash` commands
- $0/month: pure markdown, zero runtime dependencies
- Reference implementation included — fully executed end-to-end pipeline on 10K ad-click records (EDA → Feature Engineering → Model Training → Inferencing)
Preview: Databricks Serverless — Live Run Output
Live run output from the framework's EDA playbook on Databricks Serverless — bivariate analysis, multivariate scatter matrix, and temporal click-rate patterns. Status: Succeeded in 2m 29s. One of 10+ notebooks in the reference implementation pipeline.
Architecture:
Hub (this repo) — Pure markdown. Playbooks, personas, skills, templates.
├── Engine (ai-analyst) — Python execution backend
└── full-funnel-ai-analytics — Production deployment using this framework
This is the methodology layer behind the Full-Funnel AI Analytics Platform above.
Full-cycle autonomous lead generation · Predictive scenario modeling · Post-sales retention analytics
Business Context: Most B2B sales teams burn 60–70% of SDR time on manual prospecting. This project builds a fully autonomous RevOps platform — from ICP-driven lead discovery to AI-powered scoring, predictive revenue modeling, and post-sales retention tracking.
Key Features:
- AI RevOps Copilot: Natural-language chat for pipeline risk analysis and quota pacing
- Revenue Scenario Modeler: 4 levers (Volume, Win Rate, ACV, Cycle Time) → 90-day S-curve revenue projections
- Post-Sales NDR Module: Net Dollar Retention, Account Health scoring, ARR Waterfall charts
- Explainable AI (XAI): Every lead score includes transparent reasoning — no black-box models
- 10 Integrated Modules: Revenue Dashboard, Lead Intelligence, Sales Navigator, Pipeline Analytics, and more
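One way to read the Revenue Scenario Modeler's "4 levers → 90-day S-curve" is as a logistic curve whose ceiling is set by Volume × Win Rate × ACV and whose midpoint is set by Cycle Time. The sketch below is an assumption about the shape of such a model, not the project's actual formula; `projected_revenue` and the `steepness` parameter are hypothetical.

```python
import math


def projected_revenue(volume: float, win_rate: float, acv: float,
                      cycle_days: float, horizon_days: int = 90,
                      steepness: float = 6.0) -> list[float]:
    """Cumulative revenue per day along a logistic S-curve.

    The ceiling is volume * win_rate * acv; half of it is assumed to be
    recognised by the time one sales cycle (cycle_days) has elapsed.
    """
    if cycle_days <= 0:
        raise ValueError("cycle_days must be positive")
    ceiling = volume * win_rate * acv
    curve = []
    for day in range(horizon_days + 1):
        x = steepness * (day - cycle_days) / cycle_days
        curve.append(ceiling / (1.0 + math.exp(-x)))
    return curve


if __name__ == "__main__":
    # Illustrative lever values only, not project data.
    base = projected_revenue(volume=200, win_rate=0.25, acv=12_000, cycle_days=45)
    faster = projected_revenue(volume=200, win_rate=0.25, acv=12_000, cycle_days=30)
    print(f"Day 90, 45-day cycle: {base[90]:,.0f}")
    print(f"Day 90, 30-day cycle: {faster[90]:,.0f}")
```

Shortening the cycle shifts the curve left, so more of the same ceiling lands inside the 90-day window; changing Volume, Win Rate, or ACV scales the ceiling instead.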
Business Context: A strategic audit transitioning the Epic Games Store from a digital storefront to an "Ecosystem of Intelligence." Random Forest Regression (R²=0.392) and K-Means Clustering decode the "UX Alpha" — proving 60% of player satisfaction is driven by intangible factors beyond price and specs.
Key Findings:
- The "Hardware Wall": High system requirements correlate negatively (-0.133) with user ratings — a critical churn zone
- Behavioral Segmentation: 4 Product Personas mapping the "Premium Friction" risk in high-cost Indie titles
- SHAP values revealing that review velocity outweighs review volume in driving satisfaction scores
- Final-round candidate for Epic Games Data Analyst role (2026)
What it is: A mobile-first SQL learning platform with a cyberpunk terminal aesthetic. Write real SQL queries that execute in your browser using SQLite compiled to WebAssembly. No backend, no signup required to play.
Key Features:
- 80 SQL challenges across 8 modules — SELECT → CTEs, sequential unlocking
- Real SQLite execution via WASM — queries validated against expected output
- Gamified — 20 levels, XP system, 10 achievements, sound effects
- Google OAuth — persistent progress across sessions
- Bilingual — full EN/PT-BR support
- $0/month infrastructure — fully client-side
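The "queries validated against expected output" step can be sketched with Python's built-in `sqlite3` module. PunkSQL itself runs SQLite in the browser via WebAssembly, so this is only an analogy for the checking logic: the `check_answer` helper, the sample `tracks` table, and the choice to ignore row order by default are assumptions.

```python
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute a query and return all rows."""
    return conn.execute(sql).fetchall()


def check_answer(conn: sqlite3.Connection, user_sql: str,
                 expected_sql: str, ordered: bool = False) -> bool:
    """Compare the learner's result set with the reference query's result set."""
    try:
        got = run_query(conn, user_sql)
    except sqlite3.Error:
        return False  # invalid SQL counts as a wrong answer
    want = run_query(conn, expected_sql)
    if ordered:
        return got == want
    # Order-insensitive comparison; repr keeps NULLs and mixed types sortable.
    return sorted(got, key=repr) == sorted(want, key=repr)


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database for one hypothetical challenge."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tracks (id INTEGER, title TEXT, plays INTEGER)")
    conn.executemany(
        "INSERT INTO tracks VALUES (?, ?, ?)",
        [(1, "Neon Rain", 120), (2, "Chrome Sky", 80), (3, "Night City", 200)],
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    expected = "SELECT title FROM tracks WHERE plays > 100"
    print(check_answer(db, "SELECT title FROM tracks WHERE plays >= 120", expected))
    print(check_answer(db, "SELECT title FROM tracks", expected))
```

Comparing result sets rather than query text lets learners pass with any query that returns the right rows, which suits challenges that have more than one valid solution.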
Streaming AI chatbot · Google Cloud Run · GitHub Actions CI/CD · LangFuse LLMOps · 4-layer security · $0/month
What it is: The production infrastructure powering this portfolio — not a side project, but the live system you're looking at right now. Built to be a proof of work in itself: a streaming AI chatbot, containerized via multi-stage Docker, deployed on Google Cloud Run through a zero-touch GitHub Actions pipeline, with every conversation traced in LangFuse for cost, latency, and security analysis.
Architecture:
GitHub push → Actions: Docker build → Artifact Registry → Cloud Run deploy
Browser → React 19 + Vite (SSE streaming) → Express proxy → Gemini 2.5 Flash
→ LangFuse (trace every token)
Key Features:
- Streaming AI chatbot — Server-Sent Events deliver token-by-token responses; no waiting for the full reply
- 4-layer security model — client sanitization, IP rate-limiting, canary token injection, intent classification (jailbreak detection)
- LLMOps with LangFuse — every conversation traced with cost, latency, and safety flag
- Zero-touch CI/CD: push to `main`, site is live in ~3 minutes via OIDC-authenticated GitHub Actions
- $0/month at rest: Cloud Run scales to zero when idle
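Two of the four security layers listed above, IP rate-limiting and canary tokens, can be sketched in plain Python. The live system uses an Express proxy, so this is a language-neutral illustration rather than its code: the `SlidingWindowRateLimiter` class, the `leaked_canary` helper, the limits, and the canary string are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within a rolling time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Hypothetical marker planted in the system prompt. If it ever shows up
# in a model reply, the prompt has probably been extracted.
CANARY = "CANARY-7f3a"


def leaked_canary(model_reply: str, canary: str = CANARY) -> bool:
    """Return True when the reply contains the planted canary token."""
    return canary in model_reply


if __name__ == "__main__":
    limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=60)
    print([limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 61)])
    print(leaked_canary("Sure! My instructions begin with CANARY-7f3a ..."))
```

A flagged reply would normally be blocked before it reaches the browser and recorded as a safety event in the trace.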
Business Context: Stakeholders need quick answers but lack SQL skills. This solves that by integrating an LLM directly into the dashboard — ask "What was the most popular genre in the 80s?" and get an instant data-backed answer. Analysis of 160k+ tracks spanning 100 years.
Senior Data Analyst & Analytics Engineer | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP
Senior Data Analyst & Analytics Engineer with more than 10 years across consultative ERP sales (Omie, Brazil's largest cloud ERP), business consulting, and analytics. I identify where SaaS, DTC, and E-Commerce companies lose revenue and build the governed Agentic BI systems to capture it.
I've been in the revenue meetings. I know what the VP of Sales needs on Monday morning, and I build the analytics to answer it before being asked.
Python SQL dbt ELT Pipelines BigQuery Databricks MCP Servers LangChain Claude/Gemini APIs FastAPI Looker Studio Streamlit React/NextJS
Building governed AI analytics systems where natural language queries return deterministic, dbt-governed insights in 15 seconds. No hallucinations. No metric drift.
BI Architecture · Revenue Analytics · Marketing & Funnel Analytics · SaaS Unit Economics (CAC, LTV, MRR, Churn, ROAS) · Financial Modeling · ELT Architecture
🎓 Data Science Residency, TripleTen (700h+) · Databricks Certified · Lean Six Sigma Green Belt · Finalist, Epic Games 2026
| Category | Project | Stack |
|---|---|---|
| 🏆 AI / Analytics Eng | Full-Funnel AI Analytics Platform | dbt · MCP · BigQuery · XGBoost · Claude/Gemini |
| 🧠 AI Tooling / OSS | Unified AI Data Framework | Claude Code · 12 Playbooks · 33 Skills · 9 Personas |
| RevOps / AI | RevOps Lead Engine: B2B Command Center | Python · Streamlit · Plotly · XAI |
| Product Strategy / UXR | Epic Games Store: Intelligence Audit (2026) | Python · Streamlit · Random Forest · NLP |
| Edu / SQL | PunkSQL: SQL Learning Platform | Next.js · SQLite/WASM · Vercel · Google OAuth |
| Infra / AI Chatbot | Portfolio Website + AI Chatbot Infrastructure | React · Gemini · Cloud Run · LangFuse · Docker |
| GenAI / BI | Conversational BI & Generative Analytics | Streamlit · LLMs · LangChain |
Business Context: Companies run ads across Google, Meta, and organic channels. Marketing claims the leads. Sales says the quality is low. The CEO asks: "Where should we invest next quarter?"
Answering this requires joining data from 5+ platforms, building attribution models, scoring leads, and making it all accessible to non-technical stakeholders. This project builds the production system, at a base cost of $0/month.
Key Numbers:
- 29 dbt models: full Medallion Architecture (Bronze → Silver → Gold)
- 2.2M+ rows of real Olist data + synthetic marketing data
- 7 MCP servers: Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
- 4 attribution models: First-Touch, Last-Touch, Linear, Time-Decay
- 5 warehouses: BigQuery, DuckDB, Snowflake, Databricks, Supabase
- $0/month base infrastructure cost
Turn Claude Code into a senior data science team: 12 playbooks · 33 templates · 9 personas · $0/month
The Problem: You ask Claude to analyze a dataset. It writes a notebook with no assumption checks, no comparison baseline, no experiment tracking, and an inconsistent structure every time. Not because Claude is bad. Because Claude has no structure to follow.
The Solution: This framework installs a library of playbooks, personas, skills, and guardrails directly into Claude Code. Claude reads them. Claude follows them. You get senior-quality output, every time.
Key Numbers:
- 12 analytical playbooks covering the full DS lifecycle: Problem Framing → EDA → Feature Engineering → Model Training → Stakeholder Communication
- 33 skill templates: cohort analysis, A/B testing, funnel analysis, metric reconciliation, and more
- 9 LLM personas, including Data Analyst, Analytics Engineer, ML Engineer, DS Reviewer, and Product Manager, deployed as `/slash` commands in Claude Code
- $0/month: pure markdown, zero runtime dependencies
- Reference implementation included: end-to-end pipeline executed on 10K ad-click records
Preview: Databricks Serverless — Live Run Output
Live output from the EDA playbook on Databricks Serverless: bivariate analysis, multivariate scatter matrix, and temporal click-rate patterns. Status: Succeeded in 2m 29s. One of 10+ notebooks in the reference implementation pipeline.
Context: A fully autonomous RevOps platform, from ICP-driven lead discovery to AI-powered scoring, predictive revenue modeling, and post-sales retention tracking. Explainable AI (XAI) in every lead score.
Highlights:
- AI RevOps Copilot: Natural-language chat for pipeline risk analysis and quota pacing
- Revenue Scenario Modeler: 4 levers (Volume, Win Rate, ACV, Cycle Time) → 90-day S-curve revenue projections
- Post-Sales NDR Module: Net Dollar Retention, Account Health scoring, ARR Waterfall charts
- XAI: Every lead score includes transparent reasoning, with no black-box models
Key findings: Hardware Wall (-0.133 correlation with satisfaction), 4 Product Personas, 60% of player satisfaction driven by intangible UX factors. SHAP values reveal that review velocity outweighs review volume in predicting satisfaction.
Final-round candidate for the Data Analyst role at Epic Games (2026).
A mobile-first platform for learning SQL with a cyberpunk aesthetic. 80 challenges, 8 modules (SELECT → CTEs), real SQL execution in the browser via SQLite/WASM, full gamification (20 levels, XP, achievements), Google OAuth, bilingual EN/PT-BR. $0/month infrastructure.
What it is: The production infrastructure that runs this portfolio. A streaming AI chatbot, containerized via multi-stage Docker, deployed on Google Cloud Run through zero-touch GitHub Actions, with every conversation traced in LangFuse for cost, latency, and security.
Architecture:
GitHub push → Actions: Docker build → Artifact Registry → Cloud Run deploy
Browser → React 19 + Vite (SSE streaming) → Express proxy → Gemini 2.5 Flash
→ LangFuse (trace every token)
Highlights:
- 4-layer security: client sanitization, IP rate-limiting, canary token, intent classification (jailbreak detection)
- LLMOps with LangFuse: every conversation traced with cost, latency, and safety flag
- Zero-touch CI/CD: push to `main`, site is live in ~3 minutes via GitHub Actions with OIDC
- $0/month at rest: Cloud Run scales to zero when idle
Interactive dashboard with an AI Consultant: managers ask in natural language and get answers based on data from 160k+ tracks covering 100 years of musical evolution.
From: 24 October 2025 - To: 14 May 2026
Total Time: 342 hrs 32 mins
Python 249 hrs 33 mins █████████████████▓░░░░░░░ 70.91 %
Markdown 46 hrs 26 mins ███▒░░░░░░░░░░░░░░░░░░░░░ 13.20 %
TypeScript 12 hrs 59 mins █░░░░░░░░░░░░░░░░░░░░░░░░ 03.69 %
HTML 10 hrs 44 mins ▓░░░░░░░░░░░░░░░░░░░░░░░░ 03.05 %
Other 9 hrs 24 mins ▓░░░░░░░░░░░░░░░░░░░░░░░░ 02.67 %
© 2026 Eduardo Cornelsen · Senior Data Analyst & Analytics Engineer
Diagnosing revenue leakage. Building governed Agentic BI systems to capture it.



