I kept running into the same problem when looking at e-commerce analytics projects online — they'd show you a dashboard with customer counts and revenue bars, but never actually answer the question that matters: which customers should the business care about right now, and why?
That's what PulseIQ is built to answer.
It takes 540,000 transactions from a real UK online retailer, runs RFM scoring across every customer, and segments them into actionable groups — not just "high value" and "low value", but specific behavioural profiles like Champions, At Risk, Cannot Lose Them, and Lost. Each segment tells you something different about what to do next.
---
RFM stands for Recency, Frequency, Monetary. Three dimensions, each scored 1–5 using quintiles:
- Recency — how many days since last purchase (lower = better, so scored in reverse)
- Frequency — how many distinct orders placed
- Monetary — total lifetime spend
A customer scoring 555 is your Champion. They bought recently, buy often, and spend a lot. A customer scoring 111 bought once, a long time ago, and spent almost nothing. Everything in between tells a different story.
The RFM scoring happens entirely in SQL using CTEs and NTILE window functions. The Python notebook validates the results visually and builds the cohort retention matrix.
---
The At Risk segment is the biggest opportunity. These customers were historically high-value — good frequency, decent spend — but haven't purchased in 60–90 days. They're not lost yet. A targeted win-back email with a relevant offer would recover a meaningful chunk of them. Ignoring them means watching them slide into the Lost bucket over the next 30 days.
Third purchase is the loyalty threshold. The cohort analysis showed retention stabilises sharply after a customer's third order. Getting someone from 1 order to 2 is hard. Getting them from 2 to 3 is the unlock. After that, they're likely to keep buying. That's where retention effort should be concentrated.
UK dominates, but that's also a risk. Over 90% of transactions are from UK customers. The international tail is small but has higher average order values in some countries. That's an expansion angle worth exploring.
Cannot Lose Them needs personal attention, not a mass email. These are former VIPs who've gone quiet. Sending them the same discount code as everyone else is a missed opportunity. They need a different approach entirely.
---
- PostgreSQL — RFM scoring, cohort analysis, LTV calculation, all in SQL
- Python (pandas, matplotlib, seaborn, squarify) — EDA, cohort heatmap, segment treemap, export
- Power BI — 4-page interactive dashboard with segment treemap, RFM heatmap, cohort matrix
---
PulseIQ/
│
├── README.md
├── requirements.txt
│
├── data/
│ ├── online\_retail\_sample.csv ← 200-row sample
│ └── processed/ ← CSVs exported from Python for Power BI
│
├── sql/
│ ├── 00\_setup.sql ← table creation + data loading
│ ├── 01\_rfm\_scoring.sql ← core RFM calculation with CTEs + NTILE
│ ├── 02\_segment\_analysis.sql ← segment sizes and revenue breakdown
│ ├── 03\_cohort\_retention.sql ← monthly cohort retention matrix
│ ├── 04\_ltv\_and\_products.sql ← lifetime value tiers + top products
│ ├── 05\_purchase\_behaviour.sql ← day/hour analysis + repeat gap
│ └── 06\_master\_view.sql ← final view that powers Power BI
│
├── notebooks/
│ └── rfm\_analysis.ipynb ← full EDA + RFM + cohort in Python
│
├── dashboard/
│ ├── PulseIQ.pbix ← Power BI file
│ └── POWERBI\_GUIDE.md
│
├── screenshots/
│ ├── page1\_intelligence\_hub.png
│ ├── page2\_rfm\_deep\_dive.png
│ ├── page3\_cohort\_retention.png
│ └── page4\_geo\_products.png
│
└── docs/
├── SETUP.md
└── INTERVIEW\_PREP.md
---
*(Screenshots go here after building the dashboard)*
---
Full instructions in docs/SETUP.md. Short version:
# 1. Install Python dependencies
pip install -r requirements.txt
# 2. Download dataset
# UCI: https://archive.ics.uci.edu/dataset/352/online+retail
# OR Kaggle: https://www.kaggle.com/datasets/mashlyn/online-retail-ii-uci
# 3. Set up PostgreSQL
# Create DB 'pulseiq', run sql/00\_setup.sql, load the CSV
# 4. Run SQL files in order (00 through 06)
# 5. Run notebook
jupyter notebook notebooks/rfm\_analysis.ipynb
# 6. Open dashboard/PulseIQ.pbix in Power BI Desktop---
- Churn probability score using logistic regression on top of the RFM features
- Automated segment migration tracking — how many customers moved from At Risk to Lost this month?
- Email campaign ROI simulator — given a win-back budget, which segments should you target first?
The natural next project from here is ChurnRadar, which takes the RFM features and feeds them into a proper ML churn model.
---
Dataset: UCI Machine Learning Repository — Online Retail Dataset (Chen et al., 2012)