🚀
Building Agentic BI systems | dbt · MCP · Python · Claude API
eduardocornelsen/README.md
Data Science Cover

Eduardo Cornelsen's Applied Data Portfolio

   

Total Time Coded

 

🇺🇸 English Version

👋 About Me

Senior Data Analyst & Analytics Engineer | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP

Senior Data Analyst & Analytics Engineer with 10+ years across consultative ERP sales (Omie — Brazil's leading cloud ERP), business consulting, and analytics. I diagnose where SaaS, DTC, and E-Commerce companies lose revenue and build the governed Agentic BI systems to capture it.

I've sat in the revenue meetings. I know what the VP of Sales needs by Monday morning — and I build the analytics to answer it before they ask.

🛠️ Tech Stack

Python SQL dbt ELT Pipelines BigQuery Databricks MCP Servers LangChain Claude/Gemini APIs FastAPI Looker Studio Streamlit React/NextJS

🧠 AI & Agentic BI

Building governed AI analytics systems where natural language queries resolve to dbt-governed metric definitions and return deterministic answers in about 15 seconds, with no hallucinated metrics and no metric drift.

📊 Business Stack

BI Architecture · Revenue Analytics · Marketing & Funnel Analytics · SaaS Unit Economics (CAC, LTV, MRR, Churn, ROAS) · Financial Modeling · ELT Architecture

🎓 TripleTen Data Science Residency (700h+) · Databricks Certified · Six Sigma Green Belt · Final-round candidate — Epic Games 2026


🛠️ Full Tech & Business Stack

🧠 AI & Agentic BI

MCP Claude LangChain Generative AI n8n FastAPI

💻 Data Engineering & Analytics Engineering

Python SQL dbt ELT Pipelines BigQuery Databricks Snowflake

📊 BI & Visualization

Looker Studio Tableau Power BI Streamlit

🏢 Business Stack

RevOps Business Intelligence Product Analytics


🚀 Featured Projects

| Category | Project | Stack |
| --- | --- | --- |
| 🏆 AI / Analytics Eng | Full-Funnel AI Analytics Platform | dbt · MCP · BigQuery · XGBoost · Claude/Gemini |
| 🧠 AI Tooling / OSS | Unified AI Data Framework | Claude Code · 12 Playbooks · 33 Skills · 9 Personas |
| RevOps / AI | RevOps Lead Engine: AI-Powered B2B Command Center | Python · Streamlit · Plotly · XAI |
| Product Strategy / UXR | Epic Games Store: 2026 Ecosystem Intelligence Audit | Python · Streamlit · Random Forest · NLP |
| Stats / Edu | PunkSQL: Mobile SQL Learning Platform | Next.js · SQLite/WASM · Vercel · Google OAuth |
| Infra / AI Chatbot | Portfolio Website + AI Chatbot Infrastructure | React · Gemini · Cloud Run · LangFuse · Docker |
| GenAI / BI | Conversational BI & Generative Analytics (Music Trends) | Streamlit · LLMs · LangChain |


🏆 Full-Funnel AI Marketing Analytics Platform

Natural language queries · dbt Semantic Layer · 7 MCP Servers · ML Lead Scoring · $0/month base cost

Ask in natural language:

"Show me the complete marketing funnel for Q1 2025: ad spend across Google and Meta, website sessions by channel, lead conversion rates, and final revenue. Calculate blended CAC and ROAS."

Full-Funnel AI Analytics — /marketing command live demo

Python dbt BigQuery Claude XGBoost FastAPI MLflow

Business Context: Companies run ads across Google, Meta, and organic channels. Marketing claims leads. Sales says they're low quality. The CEO asks: "Where should we spend next quarter?"

Answering this requires joining data from 5+ platforms, building attribution models, scoring leads, and making it all accessible to non-technical stakeholders. This project builds the production system — at $0/month base cost.

Architecture: Three Heads, One Spine

  • AI Layer — 7 MCP servers + Claude Desktop, OpenCode, Gemini CLI, Antigravity. Ask questions in plain English, get production-grade React dashboards in 15 seconds.
  • BI Layer — Looker Studio + Streamlit + Claude React artifacts. Same governed metrics across every output.
  • ML Layer — XGBoost trained on 93K rows + FastAPI /score endpoint + n8n auto-routing. Hot leads bypass the queue instantly.
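
The routing step in the ML layer can be pictured as a threshold on the model score. The sketch below is a minimal illustration, not the project's code: the `0.8` cutoff, the `Lead` type, the `route_lead` function, and the queue names are all hypothetical, and the real system is described as using XGBoost, a FastAPI `/score` endpoint, and n8n.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the README does not state what score makes a lead "hot".
HOT_THRESHOLD = 0.8


@dataclass
class Lead:
    lead_id: str
    score: float  # model-predicted conversion probability in [0, 1]


def route_lead(lead: Lead, hot_threshold: float = HOT_THRESHOLD) -> str:
    """Return the (hypothetical) destination queue for a scored lead."""
    if not 0.0 <= lead.score <= 1.0:
        raise ValueError(f"score out of range: {lead.score}")
    # Hot leads skip the standard queue; everything else waits in line.
    return "fast_lane" if lead.score >= hot_threshold else "standard_queue"
```

Keeping the threshold as a parameter makes it easy to tune the cutoff against sales capacity without retraining the model.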

Key Numbers:

  • 29 dbt models — full Medallion Architecture (Bronze → Silver → Gold)
  • 2.2M+ rows of real Olist ecommerce data + synthetic marketing data
  • 7 MCP servers — Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
  • 4 attribution models — First-Touch, Last-Touch, Linear, Time-Decay
  • 5 warehouses — BigQuery, DuckDB, Snowflake, Databricks, Supabase
  • 4 AI clients — Claude Desktop, OpenCode, Gemini CLI, Antigravity
  • $0/month base infrastructure cost
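
The four attribution models named above differ only in how they split credit across the touchpoints of a converting journey. The following is a minimal sketch under stated assumptions: the function names, the 7-day half-life, and the channel labels are illustrative and are not taken from the project's dbt models.

```python
from typing import Dict, List


def first_touch(n: int) -> List[float]:
    """All credit to the first of n touchpoints."""
    if n < 1:
        raise ValueError("need at least one touchpoint")
    return [1.0] + [0.0] * (n - 1)


def last_touch(n: int) -> List[float]:
    """All credit to the last of n touchpoints."""
    if n < 1:
        raise ValueError("need at least one touchpoint")
    return [0.0] * (n - 1) + [1.0]


def linear(n: int) -> List[float]:
    """Equal credit to each of n touchpoints."""
    if n < 1:
        raise ValueError("need at least one touchpoint")
    return [1.0 / n] * n


def time_decay(days_before_conversion: List[float],
               half_life_days: float = 7.0) -> List[float]:
    """Credit halves for every half_life_days between touch and conversion."""
    if not days_before_conversion or half_life_days <= 0:
        raise ValueError("need touchpoints and a positive half-life")
    raw = [0.5 ** (d / half_life_days) for d in days_before_conversion]
    total = sum(raw)
    return [w / total for w in raw]  # normalize so weights sum to 1


def attribute(revenue: float, channels: List[str],
              weights: List[float]) -> Dict[str, float]:
    """Split revenue across channels; repeated channels accumulate credit."""
    credit: Dict[str, float] = {}
    for channel, weight in zip(channels, weights):
        credit[channel] = credit.get(channel, 0.0) + revenue * weight
    return credit
```

For a journey with touches 14, 7, and 0 days before conversion, `time_decay` yields weights in a 1:2:4 ratio, so the most recent touch receives four sevenths of the revenue.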

The Core Insight — Why Governance Matters: Most AI-to-SQL tools fail because they lack a source of truth. This project solves that with the dbt Semantic Layer (MetricFlow): define "ROAS" once in YAML, and every AI client, dashboard, and ML pipeline consumes the exact same definition — forever. No hallucinations. No metric drift.
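
The "define once, consume everywhere" idea can be sketched in plain Python. This is an analogy only: the project defines metrics in MetricFlow YAML, which is not reproduced here, and the `Metric` type, `METRICS` registry, and `query_metric` helper are hypothetical names. The formulas used are the common definitions (ROAS as revenue over ad spend, blended CAC as total ad spend over new customers) and may differ from the project's exact YAML.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[Mapping[str, float]], float]


def _safe_div(numerator: float, denominator: float) -> float:
    # Return NaN instead of raising when the denominator is zero.
    return numerator / denominator if denominator else float("nan")


# Single source of truth: every consumer looks metrics up here by name.
METRICS: Dict[str, Metric] = {
    "roas": Metric(
        "roas",
        "Revenue divided by ad spend",
        lambda t: _safe_div(t["revenue"], t["ad_spend"]),
    ),
    "blended_cac": Metric(
        "blended_cac",
        "Total ad spend across channels divided by new customers",
        lambda t: _safe_div(t["ad_spend"], t["new_customers"]),
    ),
}


def query_metric(name: str, totals: Mapping[str, float]) -> float:
    """Entry point shared by dashboards, AI clients, and ML pipelines."""
    return METRICS[name].compute(totals)
```

Because a dashboard and an AI client would both call `query_metric("roas", totals)`, neither can carry its own slightly different formula.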

View Repo Architecture Demo Live Demo



🧠 Unified AI Data Framework

Turn Claude Code into a senior data science team — 12 playbooks · 33 skill templates · 9 LLM personas · $0/month

Unified AI Data Framework

Claude Code Python dbt MLflow Databricks

The Problem: You ask Claude to analyze a dataset. It writes a notebook — with no assumption checks, no baseline comparison, no experiment tracking, inconsistent structure every time. Not because Claude is bad. Because Claude has no structure to follow.

The Solution: This framework installs a library of playbooks, personas, skills, and guardrails directly into Claude Code. Claude reads them. Claude follows them. You get senior-quality output — every time.

| Without this framework | With this framework |
| --- | --- |
| Hallucinated analytical structures | 33 vetted skill templates with scripts |
| No assumption checks | Full statistical diagnostics before every test |
| Inconsistent quality across projects | Same playbook, every time |
| No stakeholder communication plan | Executive summaries + impact quantification |
| Data leakage goes undetected | DS Reviewer persona audits the full pipeline |

Key Numbers:

  • 12 analytical playbooks — full DS lifecycle: Problem Framing → EDA → Feature Engineering → Model Training → Stakeholder Communication
  • 33 skill templates — cohort analysis, A/B testing, funnel analysis, metric reconciliation, root cause investigation, and more
  • 9 LLM personas, including Data Analyst, Analytics Engineer, ML Engineer, DS Reviewer, Product Manager, and UX Researcher, deployed as Claude Code slash commands
  • $0/month — pure markdown, zero runtime dependencies
  • Reference implementation included — fully executed end-to-end pipeline on 10K ad-click records (EDA → Feature Engineering → Model Training → Inferencing)

Preview: Databricks Serverless — Live Run Output

Live run output from the framework's EDA playbook on Databricks Serverless — bivariate analysis, multivariate scatter matrix, and temporal click-rate patterns. Status: Succeeded in 2m 29s. One of 10+ notebooks in the reference implementation pipeline.


Architecture:

Hub (this repo) — Pure markdown. Playbooks, personas, skills, templates.
 ├── Engine (ai-analyst) — Python execution backend
 └── full-funnel-ai-analytics — Production deployment using this framework

This is the methodology layer behind the Full-Funnel AI Analytics Platform above.

View Repo



🚀 RevOps Lead Engine: AI-Powered B2B Command Center

Full-cycle autonomous lead generation · Predictive scenario modeling · Post-sales retention analytics

RevOps Lead Engine Dashboard

Python Streamlit Plotly Pydantic

Business Context: Most B2B sales teams burn 60–70% of SDR time on manual prospecting. This project builds a fully autonomous RevOps platform — from ICP-driven lead discovery to AI-powered scoring, predictive revenue modeling, and post-sales retention tracking.

Key Features:

  • AI RevOps Copilot: Natural-language chat for pipeline risk analysis and quota pacing
  • Revenue Scenario Modeler: 4 levers (Volume, Win Rate, ACV, Cycle Time) → 90-day S-curve revenue projections
  • Post-Sales NDR Module: Net Dollar Retention, Account Health scoring, ARR Waterfall charts
  • Explainable AI (XAI): Every lead score includes transparent reasoning — no black-box models
  • 10 Integrated Modules: Revenue Dashboard, Lead Intelligence, Sales Navigator, Pipeline Analytics, and more
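
One way to read the Revenue Scenario Modeler is as a logistic ramp scaled by the first three levers and centred on the fourth. The sketch below is an assumption-laden illustration, not the project's model: the logistic form, the `steepness` value, and the function names are all hypothetical.

```python
import math
from typing import List


def s_curve_share(day: float, cycle_time_days: float,
                  steepness: float = 0.1) -> float:
    """Share of potential revenue closed by `day`, on a logistic S-curve.

    The curve is shifted and rescaled so it is exactly 0 at day 0 and
    approaches 1 as the day count grows.
    """
    def logistic(t: float) -> float:
        return 1.0 / (1.0 + math.exp(-steepness * (t - cycle_time_days)))

    base = logistic(0.0)
    return (logistic(day) - base) / (1.0 - base)


def project_revenue(volume: int, win_rate: float, acv: float,
                    cycle_time_days: float,
                    horizon_days: int = 90) -> List[float]:
    """Cumulative expected revenue for each day from 0 to horizon_days."""
    potential = volume * win_rate * acv  # revenue if every won deal closes
    return [potential * s_curve_share(d, cycle_time_days)
            for d in range(horizon_days + 1)]
```

Shortening `cycle_time_days` moves the inflection point earlier, so more of the same potential revenue lands inside the 90-day window, which is the kind of trade-off a scenario tool is meant to expose.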

Live Demo View Repo



⚡ Epic Games Store Ecosystem Intelligence: Strategic Audit (2026)

Epic Games Store Strategic Audit Cover

Python Streamlit Scikit-Learn NotebookLM

Business Context: A strategic audit of how the Epic Games Store could move from a digital storefront to an "Ecosystem of Intelligence." Random Forest Regression (R²=0.392) and K-Means Clustering are used to decode the "UX Alpha": the model explains about 39% of the variance in player satisfaction, which suggests that roughly 60% is driven by intangible factors beyond price and specs.

Key Findings:

  • The "Hardware Wall": High system requirements correlate weakly but negatively (r = -0.133) with user ratings, flagging a potential churn zone
  • Behavioral Segmentation: 4 Product Personas mapping the "Premium Friction" risk in high-cost Indie titles
  • SHAP values revealing that review velocity outweighs review volume in driving satisfaction scores
  • Final-round candidate for Epic Games Data Analyst role (2026)
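
The headline figures can be related to each other with two lines of arithmetic. This is a reading aid under the usual linear-model interpretation, not a reproduction of the notebook: an R² of 0.392 leaves 1 - 0.392 of the variance unexplained, and a Pearson correlation of -0.133 corresponds to a shared variance of r². The function names are illustrative.

```python
def unexplained_share(r_squared: float) -> float:
    """Fraction of target variance the model does not explain."""
    return 1.0 - r_squared


def shared_variance(pearson_r: float) -> float:
    """Fraction of variance two variables share under a linear fit."""
    return pearson_r ** 2


print(f"{unexplained_share(0.392):.1%}")  # prints 60.8%
print(f"{shared_variance(-0.133):.1%}")   # prints 1.8%
```

Unexplained variance also includes noise and any features missing from the dataset, so the 60% figure is best read as an upper bound on what intangible factors could account for.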

View UXR Slides View Dashboard View Repo View Notebook



>_ PunkSQL — Mobile SQL Learning Platform

Duolingo meets LeetCode · 80 challenges · Real in-browser SQL execution · Cyberpunk CLI aesthetic

PunkSQL — Mobile SQL Learning Platform Demo

Next.js SQLite Supabase Vercel

What it is: A mobile-first SQL learning platform with a cyberpunk terminal aesthetic. Write real SQL queries that execute in your browser using SQLite compiled to WebAssembly. No backend, no signup required to play.

Key Features:

  • 80 SQL challenges across 8 modules — SELECT → CTEs, sequential unlocking
  • Real SQLite execution via WASM — queries validated against expected output
  • Gamified — 20 levels, XP system, 10 achievements, sound effects
  • Google OAuth — persistent progress across sessions
  • Bilingual — full EN/PT-BR support
  • $0/month infrastructure — fully client-side
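
Validating a query "against expected output" can be sketched with Python's standard `sqlite3` module. This is a stand-in for illustration: PunkSQL runs SQLite compiled to WebAssembly in the browser, and the `check_answer` helper, its parameters, and the sample schema are assumptions rather than the app's actual code.

```python
import sqlite3
from typing import List, Tuple


def check_answer(setup_sql: str, user_sql: str, expected_sql: str,
                 ordered: bool = False) -> bool:
    """Run both queries on a fresh in-memory database and compare rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)
        # Run the reference query first so a mutating user query
        # cannot change the expected rows.
        expected: List[Tuple] = conn.execute(expected_sql).fetchall()
        try:
            actual: List[Tuple] = conn.execute(user_sql).fetchall()
        except (sqlite3.Error, sqlite3.Warning):
            return False  # invalid SQL counts as a wrong answer
    finally:
        conn.close()
    if ordered:
        return actual == expected
    # Without ORDER BY, row order is not guaranteed, so compare as multisets.
    return sorted(actual, key=repr) == sorted(expected, key=repr)
```

The `ordered` flag matters for challenges that teach `ORDER BY`: there the row sequence is part of the answer, while for most other challenges it is not.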

Play Now View Repo



⚙️ Portfolio Website + AI Chatbot Infrastructure

Streaming AI chatbot · Google Cloud Run · GitHub Actions CI/CD · LangFuse LLMOps · 4-layer security · $0/month

Eduardo Cornelsen Portfolio — Live Site

React TypeScript Gemini Docker Cloud Run GitHub Actions LangFuse

What it is: The production infrastructure powering this portfolio — not a side project, but the live system you're looking at right now. Built to be a proof of work in itself: a streaming AI chatbot, containerized via multi-stage Docker, deployed on Google Cloud Run through a zero-touch GitHub Actions pipeline, with every conversation traced in LangFuse for cost, latency, and security analysis.

Architecture:

GitHub push → Actions: Docker build → Artifact Registry → Cloud Run deploy
Browser → React 19 + Vite (SSE streaming) → Express proxy → Gemini 2.5 Flash
                                                           → LangFuse (trace every token)

Key Features:

  • Streaming AI chatbot — Server-Sent Events deliver token-by-token responses; no waiting for the full reply
  • 4-layer security model — client sanitization, IP rate-limiting, canary token injection, intent classification (jailbreak detection)
  • LLMOps with LangFuse: every conversation is traced with cost, latency, and safety flags
  • Zero-touch CI/CD — push to main, site is live in ~3 minutes via OIDC-authenticated GitHub Actions
  • $0/month at rest — Cloud Run scales to zero when idle
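
Two of the four security layers, IP rate limiting and canary tokens, are simple enough to sketch. The code below is a generic illustration of those techniques and assumes nothing about the site's Express proxy: the class name, the limits, and the `CANARY-` prefix are hypothetical.

```python
import secrets
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within a rolling time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def make_canary() -> str:
    """Random marker to embed in the system prompt."""
    return f"CANARY-{secrets.token_hex(8)}"


def leaks_canary(model_output: str, canary: str) -> bool:
    """True if the model echoed the hidden marker, i.e. the prompt leaked."""
    return canary in model_output
```

The canary check runs on the model's output: if a jailbreak persuades the model to reveal its instructions, the marker appears in the reply and the response can be blocked and flagged in tracing.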

Live Site View Repo



🎵 Conversational BI & Generative Analytics

"Chat with Data" Agent — 100 Years of Music History

Streamlit App

MusicInsights AI — Conversational BI Dashboard

Python LangChain Gemini Streamlit

Business Context: Stakeholders need quick answers but lack SQL skills. This solves that by integrating an LLM directly into the dashboard — ask "What was the most popular genre in the 80s?" and get an instant data-backed answer. Analysis of 160k+ tracks spanning 100 years.

View Repo Launch App


🇧🇷 Portuguese Version

👋 About Me

Senior Data Analyst & Analytics Engineer | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP

Senior Data Analyst & Analytics Engineer with more than 10 years across consultative ERP sales (Omie, Brazil's largest cloud ERP), business consulting, and analytics. I identify where SaaS, DTC, and E-Commerce companies lose revenue and build the governed Agentic BI systems to capture it.

I have sat in the revenue meetings. I know what the VP of Sales needs on Monday morning, and I build the analytics to answer it before being asked.

🛠️ Tech Stack

Python SQL dbt ELT Pipelines BigQuery Databricks MCP Servers LangChain Claude/Gemini APIs FastAPI Looker Studio Streamlit React/NextJS

🧠 AI & Agentic BI

Building governed AI analytics systems where natural language queries return deterministic, dbt-governed insights in 15 seconds. No hallucinations. No metric drift.

📊 Business Stack

BI Architecture · Revenue Analytics · Marketing & Funnel Analytics · SaaS Unit Economics (CAC, LTV, MRR, Churn, ROAS) · Financial Modeling · ELT Architecture

🎓 TripleTen Data Science Residency (700h+) · Databricks Certified · Lean Six Sigma Green Belt · Finalist, Epic Games 2026


🚀 Featured Projects

| Category | Project | Stack |
| --- | --- | --- |
| 🏆 AI / Analytics Eng | Full-Funnel AI Analytics Platform | dbt · MCP · BigQuery · XGBoost · Claude/Gemini |
| 🧠 AI Tooling / OSS | Unified AI Data Framework | Claude Code · 12 Playbooks · 33 Skills · 9 Personas |
| RevOps / AI | RevOps Lead Engine: B2B Command Center | Python · Streamlit · Plotly · XAI |
| Product Strategy / UXR | Epic Games Store: Intelligence Audit (2026) | Python · Streamlit · Random Forest · NLP |
| Edu / SQL | PunkSQL: SQL Learning Platform | Next.js · SQLite/WASM · Vercel · Google OAuth |
| Infra / AI Chatbot | Portfolio Website + AI Chatbot Infrastructure | React · Gemini · Cloud Run · LangFuse · Docker |
| GenAI / BI | Conversational BI & Generative Analytics | Streamlit · LLMs · LangChain |


🏆 Full-Funnel AI Marketing Analytics Platform

Natural language queries · dbt Semantic Layer · 7 MCP Servers · ML Lead Scoring · $0/month

Full-Funnel AI Analytics: natural language query demo

Business Context: Companies run ads across Google, Meta, and organic channels. Marketing claims the leads. Sales says the quality is low. The CEO asks: "Where should we invest next quarter?"

Answering this requires joining data from 5+ platforms, building attribution models, scoring leads, and making it all accessible to non-technical stakeholders. This project builds the production system, at a $0/month base cost.

Key Numbers:

  • 29 dbt models: full Medallion Architecture (Bronze → Silver → Gold)
  • 2.2M+ rows of real Olist data + synthetic marketing data
  • 7 MCP Servers: Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
  • 4 attribution models: First-Touch, Last-Touch, Linear, Time-Decay
  • 5 warehouses: BigQuery, DuckDB, Snowflake, Databricks, Supabase
  • $0/month base infrastructure cost

View Repo Architecture Demo Live Demo



🧠 Unified AI Data Framework

Turn Claude Code into a senior data science team: 12 playbooks · 33 templates · 9 personas · $0/month

Unified AI Data Framework

Claude Code Python dbt MLflow Databricks

The Problem: You ask Claude to analyze a dataset. It writes a notebook with no assumption checks, no baseline comparison, no experiment tracking, and a different structure every time. Not because Claude is bad. Because Claude has no structure to follow.

The Solution: This framework installs a library of playbooks, personas, skills, and guardrails directly into Claude Code. Claude reads them. Claude follows them. You get senior-quality output, every time.

Key Numbers:

  • 12 analytical playbooks: full DS lifecycle, Problem Framing → EDA → Feature Engineering → Model Training → Stakeholder Communication
  • 33 skill templates: cohort analysis, A/B testing, funnel analysis, metric reconciliation, and more
  • 9 LLM personas, including Data Analyst, Analytics Engineer, ML Engineer, DS Reviewer, and Product Manager, available as /slash commands in Claude Code
  • $0/month: pure markdown, zero runtime dependencies
  • Reference implementation included: end-to-end pipeline executed on 10K ad-click records

Preview: Databricks Serverless - Live Run Output

Live run output from the EDA playbook on Databricks Serverless: bivariate analysis, multivariate scatter matrix, and temporal click-rate patterns. Status: Succeeded in 2m 29s. One of 10+ notebooks in the reference implementation pipeline.


View Repo



🚀 RevOps Lead Engine: AI-Powered B2B Command Center

Autonomous lead generation · Predictive revenue modeling · Post-sales retention analytics

RevOps Lead Engine Dashboard

Python Streamlit Plotly Pydantic

Context: A fully autonomous RevOps platform, from ICP-driven lead discovery to AI-powered scoring, predictive revenue modeling, and post-sales retention tracking. Explainable AI (XAI) on every lead score.

Highlights:

  • AI RevOps Copilot: Natural-language chat for pipeline risk analysis and quota pacing
  • Revenue Scenario Modeler: 4 levers (Volume, Win Rate, ACV, Cycle Time) → 90-day S-curve revenue projections
  • Post-Sales NDR Module: Net Dollar Retention, Account Health scoring, ARR Waterfall charts
  • XAI: Every lead score includes transparent reasoning, with no black-box models

Live Demo View Repo



⚡ Epic Games Store: Strategic Ecosystem Audit (2026)



Python Streamlit Scikit-Learn NotebookLM

Key findings: Hardware Wall (-0.133 correlation with satisfaction), 4 Product Personas, and roughly 60% of player satisfaction driven by intangible UX factors. SHAP values show that review velocity outweighs review volume in predicting satisfaction.

Finalist for the Data Analyst role at Epic Games (2026).

View Presentation Open Dashboard View Repo View Notebook



>_ PunkSQL: SQL Learning Platform

Duolingo meets LeetCode · 80 challenges · Real SQL execution in the browser · Cyberpunk aesthetic

PunkSQL: SQL Learning Platform Demo

Next.js SQLite Supabase Vercel

A mobile-first platform for learning SQL with a cyberpunk aesthetic. 80 challenges across 8 modules (SELECT → CTEs), real in-browser SQL execution via SQLite/WASM, full gamification (20 levels, XP, achievements), Google OAuth, and bilingual EN/PT-BR support. $0/month infrastructure.

Play Now View Repo



⚙️ Portfolio Website + AI Chatbot Infrastructure

Streaming AI chatbot · Google Cloud Run · GitHub Actions CI/CD · LLMOps with LangFuse · 4-layer security · $0/month

Eduardo Cornelsen Portfolio: Live Site

React TypeScript Gemini Docker Cloud Run GitHub Actions LangFuse

What it is: The production infrastructure that runs this portfolio. A streaming AI chatbot, containerized via multi-stage Docker, deployed on Google Cloud Run through zero-touch GitHub Actions, with every conversation traced in LangFuse for cost, latency, and security.

Architecture:

GitHub push → Actions: Docker build → Artifact Registry → Cloud Run deploy
Browser → React 19 + Vite (SSE streaming) → Express proxy → Gemini 2.5 Flash
                                                           → LangFuse (trace every token)

Highlights:

  • 4-layer security: client sanitization, IP rate-limiting, canary token, intent classification (jailbreak detection)
  • LLMOps with LangFuse: every conversation traced with cost, latency, and a safety flag
  • Zero-touch CI/CD: push to main, site is live in ~3 minutes via OIDC-authenticated GitHub Actions
  • $0/month at rest: Cloud Run scales to zero when idle

Live Site View Repo


🎵 Conversational BI & Generative Analytics

"Chat with Data" Agent: 100 Years of Music History

Streamlit App

MusicInsights AI: Conversational BI Dashboard

Python LangChain Gemini Streamlit

An interactive dashboard with an AI Consultant: managers ask in natural language and get answers grounded in data from 160k+ tracks covering 100 years of musical evolution.

View Repo Launch App


🏆 Certifications & Education

Data Science - TripleTen · Databricks Certified · Business Administration - Insper · Green Belt Falconi



⚡ Neural Link Activity

From: 24 October 2025 - To: 14 May 2026

Total Time: 342 hrs 32 mins

Python                     249 hrs 33 mins       █████████████████▓░░░░░░░   70.91 %
Markdown                   46 hrs 26 mins        ███▒░░░░░░░░░░░░░░░░░░░░░   13.20 %
TypeScript                 12 hrs 59 mins        █░░░░░░░░░░░░░░░░░░░░░░░░   03.69 %
HTML                       10 hrs 44 mins        ▓░░░░░░░░░░░░░░░░░░░░░░░░   03.05 %
Other                      9 hrs 24 mins         ▓░░░░░░░░░░░░░░░░░░░░░░░░   02.67 %

📊 GitHub Stats




© 2026 Eduardo Cornelsen — Senior Data Analyst & Analytics Engineer

Diagnosing revenue leakage. Building governed Agentic BI systems to capture it.

Pinned

  1. full-funnel-ai-analytics (Public)

    Full-Funnel AI Marketing Analytics. A modern data stack powered by dbt MetricFlow and MCP. Natural language insights across Google/Meta Ads, CRM, and 5 data warehouses. Includes XGBoost lead scorin…

    Python 7 2

  2. cv-educornelsen (Public)

    Production portfolio with a streaming AI chatbot (Gemini + SSE), containerized via Docker, deployed to Google Cloud Run through GitHub Actions CI/CD, with LangFuse LLMOps tracing every conversation…

    TypeScript 6

  3. unified-ai-data-framework (Public)

    Turn Claude Code into a senior data science team. Playbooks, personas, and guardrails that guide Claude through every phase of a data science project — from problem framing to production monitoring…

    HTML 1

  4. punksql (Public)

    Mobile-first SQL learning platform (Next.js + SQLite/WASM + Vercel) - 80 challenges, 8 modules (SELECT → CTEs), in-browser SQL execution, Google OAuth, XP/levels/achievements. Bilingual EN/PT-BR.

    JavaScript 1

  5. revops_lead_engine (Public)

    🤖 Autonomous B2B RevOps Command Center. End-to-end SDR automation: from programmatic lead discovery & enrichment to AI lead scoring (XAI), outreach, and 90-day revenue scenario modeling. Features a…

    Python

  6. epic-store-analysis (Public)

    ⚡ Epic Game Store (EGS) Ecosystem Intelligence: A strategic Data Science & UXR audit of the Epic Games Store. Uses K-Means Clustering, NLP, and Predictive Modeling to identify UX friction points, h…

    HTML 1