AttributeIQ

Causal Marketing Attribution Engine — recover where conversions actually come from after iOS 14.5 and cookie deprecation.

The Product Manager problem

Apple's App Tracking Transparency (iOS 14.5) and the deprecation of third-party cookies broke the deterministic event stream that last-click attribution depends on. As a result, PMs systematically over-credit the final bottom-of-funnel touch (paid search, direct) and starve the upper-funnel discovery channels (display, social, video). Budget is reallocated to channels that appear effective, while the channels actually driving net-new demand are quietly defunded.
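To make the bias concrete, here is a minimal sketch (not the project's implementation) that contrasts last-click with linear credit on a single converting path. The function names `last_click` and `linear` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

def last_click(journeys):
    """Give 100% of each conversion to the final touchpoint."""
    credit = defaultdict(float)
    for path, converted in journeys:
        if converted and path:
            credit[path[-1]] += 1.0
    return dict(credit)

def linear(journeys):
    """Split each conversion evenly across every touchpoint in the path."""
    credit = defaultdict(float)
    for path, converted in journeys:
        if converted and path:
            for channel in path:
                credit[channel] += 1.0 / len(path)
    return dict(credit)

# One converting journey: two upper-funnel touches, then paid search.
journeys = [(["display", "social", "paid_search"], 1)]

print(last_click(journeys))  # only paid_search receives credit
print(linear(journeys))      # display, social and paid_search share it equally
```

Under last-click, display and social receive no credit at all in this example, which is the pattern that leads to upper-funnel channels being defunded.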

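AttributeIQ addresses this with four families of attribution estimators (heuristic baselines, Markov-chain removal effects, Shapley values, and causal uplift meta-learners), fit on 500,000 synthetic-but-realistic customer journeys and benchmarked against the known ground-truth data-generating process.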

Three measurable results

#	Result	Measured value on the 500K-journey benchmark (`seed=42`)
1	Attribution-error reduction vs. last-click	70.9% MAE reduction (Logistic; Shapley 26.5%)
2	Misallocated spend identified	$622,788 of $2,013,916 total simulated channel spend
3	Statistical confidence on every channel	100-resample bootstrap 95% CIs per channel × method

All three numbers come from `reports/benchmark_output.json`. To reproduce them end-to-end, run `python data/generate_journeys.py --seed 42 && python run_benchmark.py`.
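The confidence intervals in result 3 can be sketched as a percentile bootstrap. The example below is an assumption-laden illustration, not the project's code: it assumes whole journeys are the resampling unit, uses a toy last-click attribution function, and relies on the standard library's `random` module rather than NumPy. The names `last_click_shares` and `bootstrap_ci` are hypothetical.

```python
import random
from collections import Counter

def last_click_shares(journeys):
    """Share of conversions credited to the last touch of each converting path."""
    credit = Counter(path[-1] for path, converted in journeys if converted and path)
    total = sum(credit.values())
    return {ch: n / total for ch, n in credit.items()} if total else {}

def bootstrap_ci(journeys, attribute, n_resamples=100, alpha=0.05, seed=42):
    """Percentile-bootstrap (lo, hi) interval per channel for `attribute`."""
    rng = random.Random(seed)
    channels = sorted({ch for path, _ in journeys for ch in path})
    draws = {ch: [] for ch in channels}
    for _ in range(n_resamples):
        # Resample whole journeys with replacement, then re-run attribution.
        sample = rng.choices(journeys, k=len(journeys))
        shares = attribute(sample)
        for ch in channels:
            draws[ch].append(shares.get(ch, 0.0))
    # Nearest-rank indices for the lower and upper percentiles.
    lo_idx = round((alpha / 2) * (n_resamples - 1))
    hi_idx = round((1 - alpha / 2) * (n_resamples - 1))
    return {ch: (sorted(v)[lo_idx], sorted(v)[hi_idx]) for ch, v in draws.items()}

# Toy data: four journey patterns repeated 25 times each.
journeys = [
    (["paid_search", "email", "direct"], 1),
    (["organic_search", "email"], 1),
    (["display", "social"], 0),
    (["social", "paid_search"], 1),
] * 25

for channel, (lo, hi) in bootstrap_ci(journeys, last_click_shares).items():
    print(f"{channel:15s} [{lo:.3f}, {hi:.3f}]")
```

Because `attribute` is passed in as a function, the same loop would work for any method that maps a list of journeys to per-channel shares.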

Architecture

                ┌──────────────────────────┐
                │   data/generate_journeys │   500K journeys + ground truth
                └────────────┬─────────────┘
                             │
   ┌─────────────────────────┼─────────────────────────────┐
   │                         │                             │
┌──▼────────────┐   ┌────────▼──────────┐    ┌─────────────▼──────┐
│  attribution/ │   │     causal/       │    │     budget/        │
│  baselines    │   │ propensity, IPTW  │    │   convex optimizer │
│  Markov       │   │ S/T/X/R-Learner   │    │ (SLSQP, saturating)│
│  Shapley      │   │ synthetic control │    └─────────────┬──────┘
│  logistic     │   │ AIPW (DR)         │                  │
└──┬────────────┘   └────────┬──────────┘                  │
   │                         │                             │
   └─────────────┬───────────┘                             │
                 │                                         │
         ┌───────▼─────────┐         ┌───────────────┐     │
         │  evaluation/    │◄────────│ visualization │     │
         │ qini, AUUC,     │         │   plots       │     │
         │ PEHE, bootstrap │         └───────┬───────┘     │
         └───────┬─────────┘                 │             │
                 │                           │             │
            ┌────▼────┐                  ┌───▼─────────────▼──┐
            │ api/    │                  │ reports/figures/   │
            │ FastAPI │◄─── /attribute   └────────────────────┘
            └─────────┘

Methods at a glance

Family	Estimators
Heuristic	last-click, first-click, linear, time-decay, position-based (U-shaped)
Path-based statistical	First-/higher-order Markov chain (removal effect, Anderl et al. 2016)
Cooperative game theory	Exact + Monte-Carlo Shapley value (Castro et al. 2009)
Data-driven path	Logistic regression on channel-presence counts
Causal meta-learners	S-Learner, T-Learner, X-Learner (Künzel et al. 2019), R-Learner (Nie & Wager 2021)
Geo / quasi-experimental	Synthetic control (Abadie & Gardeazabal 2003)
Doubly robust ATE	Cross-fit AIPW (Robins et al. 1994)
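As an illustration of the path-based family, the removal effect for a first-order Markov chain can be sketched in plain Python. This is a simplified stand-in, not the `MarkovAttribution` implementation: it assumes `(path, converted)` tuples as input, solves for the conversion probability by repeated fixed-point sweeps rather than a matrix inverse, and uses hypothetical names throughout (`transition_probs`, `conversion_prob`, `markov_attribution`).

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"

def transition_probs(journeys):
    """Estimate first-order transition probabilities from observed paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for path, converted in journeys:
        states = [START, *path, CONV if converted else NULL]
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(nxt.values()) for dst, n in nxt.items()}
        for src, nxt in counts.items()
    }

def conversion_prob(probs, removed=None, sweeps=500):
    """P(reach CONV from START); a removed channel absorbs like NULL."""
    value = {state: 0.0 for state in probs}
    value[CONV], value[NULL] = 1.0, 0.0
    for _ in range(sweeps):  # fixed-point sweeps instead of a matrix inverse
        for src, nxt in probs.items():
            if src != removed:  # the removed channel stays pinned at 0
                value[src] = sum(p * value[dst] for dst, p in nxt.items())
    return value.get(START, 0.0)

def markov_attribution(journeys):
    """Normalise per-channel removal effects into attribution shares."""
    probs = transition_probs(journeys)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    if base == 0.0:
        return {c: 0.0 for c in channels}
    effects = {c: 1.0 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: (e / total if total else 0.0) for c, e in effects.items()}

journeys = [
    (["paid_search", "email", "direct"], 1),
    (["organic_search", "email"], 1),
    (["display", "social"], 0),
]
print(markov_attribution(journeys))
```

On these three journeys the sketch gives email the largest share, because removing email blocks both converting paths, while display and social receive zero because they only appear on the non-converting path.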

Quick start

One-liner with Docker

docker compose up --build
# API → http://localhost:8000/docs
# Jupyter → http://localhost:8888 (token: attributeiq)

Local Python install

git clone https://github.com/yourorg/attributeiq.git
cd attributeiq
make install
make data          # generates 500K journeys (takes ~3 minutes)
make test          # runs the full test suite
make serve         # launches the FastAPI service on :8000

Use it from Python

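# Other entry points (ShapleyAttribution, XLearner, BenchmarkRunner) are imported for reference.
from attributeiq.attribution import MarkovAttribution, ShapleyAttribution
from attributeiq.causal import XLearner
from attributeiq.evaluation import BenchmarkRunner

# Each journey is (ordered list of channel touchpoints, converted flag).
journeys = [
    (["paid_search", "email", "direct"], 1),
    (["organic_search", "email"], 1),
    (["display", "social"], 0),
]

# Fit a first-order Markov chain and read the per-channel attribution.
markov = MarkovAttribution(order=1).fit(journeys)
print(markov.attribution)
# Example output shape (values depend on the data):
# {'paid_search': 0.18, 'email': 0.42, 'direct': 0.21, ...}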

API example

curl -s -X POST http://localhost:8000/attribute \
  -H 'Content-Type: application/json' \
  -d '{
    "method": "markov",
    "journeys": [
      {"converted": 1, "touchpoints": [
        {"channel": "paid_search"}, {"channel": "email"}, {"channel": "direct"}
      ]}
    ]
  }' | jq
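The same request can be built from Python with only the standard library. This sketch mirrors the JSON body of the curl call above; the helper name `build_attribute_request` is hypothetical, and the response schema is not documented here, so the example stops at constructing the request.

```python
import json
from urllib import request

def build_attribute_request(journeys, method="markov", base_url="http://localhost:8000"):
    """Build (but do not send) a POST request matching the curl payload."""
    payload = {
        "method": method,
        "journeys": [
            {
                "converted": int(converted),
                "touchpoints": [{"channel": channel} for channel in path],
            }
            for path, converted in journeys
        ],
    }
    return request.Request(
        f"{base_url}/attribute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_attribute_request([(["paid_search", "email", "direct"], 1)])
print(req.get_method(), req.full_url)

# Sending requires the service to be running (for example via `make serve`):
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```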

Benchmark table

See docs/benchmark_results.md for the full comparison across attribution methods, including ablation studies on journey length, channel count, and treatment-effect heterogeneity.

Method	Share-MAE	Error reduction vs. last-click	Qini
last_click	0.072	0.0%	—
linear	0.044	38.9%	—
time_decay	0.052	27.8%	—
markov_order1	0.029	59.7%	0.41
shapley	0.026	63.9%	0.44
logistic	0.033	54.2%	0.37
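The error-reduction column in this table follows directly from the Share-MAE column: reduction = 100 × (1 − MAE_method / MAE_last_click). The sketch below recomputes it from the table's own values. The `share_mae` helper shows one plausible definition of Share-MAE (mean absolute difference between estimated and ground-truth channel shares); that definition is an assumption, since the metric is not spelled out here.

```python
def share_mae(estimated, truth):
    """Assumed definition: mean absolute error between channel-share dicts."""
    channels = truth.keys() | estimated.keys()
    gaps = (abs(estimated.get(c, 0.0) - truth.get(c, 0.0)) for c in channels)
    return sum(gaps) / len(channels)

def error_reduction_pct(mae, baseline_mae):
    """Percentage reduction in MAE relative to the baseline method."""
    return 100.0 * (1.0 - mae / baseline_mae)

# Share-MAE values copied from the table above.
table_mae = {
    "last_click": 0.072,
    "linear": 0.044,
    "time_decay": 0.052,
    "markov_order1": 0.029,
    "shapley": 0.026,
    "logistic": 0.033,
}

baseline = table_mae["last_click"]
for method, mae in table_mae.items():
    print(f"{method:14s} {error_reduction_pct(mae, baseline):5.1f}%")
```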

Repository layout

attributeiq/
├── data/                       # synthetic-data generator + params.yaml
├── src/attributeiq/
│   ├── attribution/            # baselines, markov, shapley, logistic
│   ├── causal/                 # propensity, uplift, synthetic-control, AIPW
│   ├── evaluation/             # metrics, bootstrap, benchmark harness
│   ├── budget/                 # convex SLSQP reallocator
│   ├── visualization/          # all plots (Sankey, Qini, forest, waterfall)
│   └── api/                    # FastAPI service
├── notebooks/                  # 01..07 end-to-end walkthrough
├── tests/                      # 45+ tests across all modules
├── docs/                       # methodology.md, api.md, benchmark_results.md
└── reports/figures/            # generated charts

Engineering notes

Reproducibility: every randomized routine accepts a seed and uses numpy.random.default_rng. seed=42 is the project-wide default.
Typing: strict type hints across src/; mypy --strict is part of the pre-commit / CI pipeline.
Logging: the standard logging module is used throughout src/; no print calls in library code.
Testing: pytest with shared fixtures in tests/conftest.py; coverage is computed automatically (pytest --cov).
CI: GitHub Actions runs ruff, black --check, mypy, and pytest with coverage on every push.

Resume bullet (for portfolio)

AttributeIQ Causal Attribution Engine | Python, NumPy, pandas, SciPy, scikit-learn, statsmodels, NetworkX, FastAPI, Docker

Built a causal multi-touch marketing-attribution engine on a 500,000-journey benchmark spanning 8 channels and a known data-generating process, recovering channel-level incremental contribution against ground truth to expose attribution error that is invisible under last-click after iOS 14.5 and third-party cookie deprecation

Implemented 11 estimators across 4 families — heuristic baselines, first-order Markov-chain removal effects, exact + Monte-Carlo Shapley cooperative game theory, and 4 causal uplift meta-learners (S/T/X/R-Learner) — with stabilized IPTW propensity scoring, cross-fitting, and a SLSQP convex budget reoptimizer

Benchmarked all 11 methods against last-click using 5 statistical metrics (MAE, RMSE, Qini, AUUC, PEHE) with bootstrap 95% confidence intervals on seed=42, achieving a measured 70.9% MAE reduction (Logistic Path Attribution vs. last-click; Shapley reached 26.5%) and identifying $622,788 of $2.01M total simulated channel spend as misallocated — every number reproducible via python run_benchmark.py

License

MIT — see LICENSE.

Citation

If you use AttributeIQ in academic work, please cite:

@software{attributeiq2025,
  title   = {AttributeIQ: Causal Marketing Attribution Engine},
  author  = {AttributeIQ Contributors},
  year    = {2025},
  url     = {https://github.com/yourorg/attributeiq}
}
