    class ShaanSatsangi:
        def __init__(self):
            self.name = "Shaan Satsangi"
            self.role = "B.Tech CSE @ JECRC, Jaipur (Class of 2026)"
            self.focus = ["Data Engineering", "AI Systems", "Analytics", "Full Stack"]
            self.stack = ["Python", "SQL", "PySpark", "Databricks", "dbt", "Airflow", "FastAPI", "Next.js"]
            self.currently = "Building production-grade data platforms, local-first AI tools, and developer intelligence platforms"
            self.open_to = "Data Engineering / Analytics / AI Systems roles"

        def say_hi(self):
            print("Thanks for stopping by! Let's build something useful.")


    me = ShaanSatsangi()
    me.say_hi()

- 📝 Medallion Lakehouse Architecture in Practice: Building Bronze → Silver → Gold pipelines on Delta Lake.
- ⚙️ Containerized Orchestration: Scheduling dbt data marts with Apache Airflow and Docker.
- 🧠 Offline RAG without Cloud APIs: fastembed ONNX embeddings on CPU with FAISS vector search.
💡 Community Discussions: Want to collaborate, ask questions, or share your own builds? Join our active developer community on GitHub Discussions across my flagship repos!
Have a question about Data Engineering, local AI models, or my career journey? Ask away!
💬 How do you build local RAG pipelines in jarvis-py? (asked by @Shaan-alpha)
We use fastembed ONNX models to compute embeddings locally on the CPU and FAISS for vector similarity search. No cloud APIs or torch dependencies required!
Read full repository →
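
To make the answer above concrete, here is a minimal sketch of the embed, index, and retrieve steps. It is not the jarvis-py code. So that it runs with only the Python standard library, it swaps fastembed for a hashed bag-of-words `embed` function and FAISS for a brute-force `FlatIndex`. Every name in it (`embed`, `FlatIndex`, `build_prompt`, `DIM`) is hypothetical and chosen only for illustration.

```python
import hashlib
import math

DIM = 256  # toy embedding size, an assumption for this sketch


def embed(text: str) -> list[float]:
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # A stable hash keeps bucket assignment identical across runs.
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class FlatIndex:
    """Stand-in for a flat inner-product vector index: brute-force search."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.chunks: list[str] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self.vectors.append(embed(chunk))
            self.chunks.append(chunk)

    def search(self, query: str, k: int = 2) -> list[tuple[float, str]]:
        q = embed(query)
        # Vectors are unit length, so the inner product is cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk)
            for vec, chunk in zip(self.vectors, self.chunks)
        ]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question: str, index: FlatIndex, k: int = 2) -> str:
    """Assemble the retrieved chunks into a prompt for a local LLM."""
    context = "\n".join(chunk for _, chunk in index.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    index = FlatIndex()
    index.add([
        "reminders are stored in sqlite",
        "wake word detection runs offline",
        "pdf pages are split into chunks",
    ])
    print(build_prompt("how does wake word detection work", index, k=1))
```

In a real pipeline the `embed` function would be replaced by a local embedding model and `FlatIndex` by a proper vector index. The overall flow of chunking, embedding, indexing, searching, and prompt assembly stays the same.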
| Project | Stack | What it does |
|---|---|---|
| CRM Sales Warehouse | Python · PostgreSQL · Docker · dbt · Airflow · Power BI | End-to-end CRM analytics platform with ETL/ELT pipeline, star-schema warehouse, dbt tests, and 5-page executive dashboard |
| YouTube Wrapped | Databricks · PySpark · Delta Lake · FastAPI · Next.js | Personal YouTube watch-history product using medallion lakehouse architecture, Neon Postgres, deployed API, and analytics dashboard |
| JARVIS-PY | Python · Ollama · Vosk · openWakeWord · FAISS · fastembed | Local-first AI voice assistant with wake-word barge-in, online/offline STT, semantic memory, PDF RAG, tool-agent routing, reminders, and interruptible TTS |
| Face Sort Studio | Python · Flask · OpenCV DNN · SQLite · SSE | Local face-recognition photo organizer using YuNet + SFace embeddings, real-time progress streaming, and privacy-first local processing |
| Sahaara | Next.js · TypeScript · Supabase · Tailwind · Twilio · MapLibre | Safety-focused full-stack app with emergency SOS workflows, live location sharing, trusted contacts, and Twilio alerts |
| Skill Issue | FastAPI · Next.js · GitHub API · AI · Analytics | Upcoming GitHub intelligence platform for developer identity analysis, OSS signals, recruiter readiness, and personality-driven diagnostics |
Data Engineering & Analytics
AI / ML / Local AI Systems
Full Stack & Tools
- 📚 B.Tech CSE @ JECRC Jaipur (2022 – Present) · CGPA: 7.39
- 🧠 Upgraded JARVIS-PY into a local-first AI assistant with RAG, semantic memory, wake-word barge-in, and tool-agent routing
- 🏗️ Building Skill Issue — a GitHub intelligence platform that analyzes developer identity, OSS activity, engineering maturity, and recruiter readiness
- 🌱 Studying for the Databricks Certified Data Engineer Associate exam
- 🤝 Open to Data Engineering, Data Analytics, and AI Systems roles
Welcome to my Visitor Wall! Click below to sign my guestbook. An automated GitHub Action will instantly grab your GitHub avatar and greeting and add you to the wall below!
| Visitor | Message | Date |
|---|---|---|
| Shaan Satsangi (Maintainer) | "Welcome to my GitHub! Drop a note and connect with me." | May 16, 2026 |
A huge thank you to everyone supporting my research in local AI assistants and open-source data tools! Sponsoring helps cover testing hardware and model hosting.
Explore my Sponsor Tier Architecture & Perks (SPONSORS.md) to see rewards like priority PR reviews, monthly 1-on-1 architecture advisory sessions, and premium logo placement!
"I build data platforms, AI systems, and full-stack products that turn messy inputs into useful outcomes."
