AI-Induced Distribution Shift in Google Trends: Implications for GDP Nowcasting

What is this project about?

The OECD Weekly Tracker uses Google Trends (GTrends) search data to nowcast weekly GDP growth for 46 countries. It relies on the intuition that search intensity for economically relevant terms, such as bankruptcy, unemployment benefits, and mortgage, reflects real economic conditions.

This project investigates whether AI-powered search tools are systematically changing how people use Google, and whether this is degrading the quality of GDP nowcasts that depend on GTrends data.

The core problem

AI chatbots are increasingly replacing Google as the first place people go for information. If this shift is concentrated in the same categories the OECD tracker relies on, the tracker's signal erodes, not because the economy changed, but because the measurement instrument changed.

This is a distribution shift: the data-generating process for GTrends is being disrupted by technology independent of (or not fully dependent on) the economy.
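The mechanism can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the "economy" series is a synthetic seasonal signal, `BREAK` is an arbitrary month, and `leak` is a hypothetical share of queries diverted to AI tools. None of the numbers come from the project data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

MONTHS = 72   # assumed sample length
BREAK = 36    # assumed month at which AI substitution starts

# Synthetic "true" economic interest: a level plus a 12-month cycle.
economy = [100 + 10 * math.sin(2 * math.pi * t / 12) for t in range(MONTHS)]

# Hypothetical leakage: zero before the break, then rising 1.5 points per month.
leak = [0.0 if t < BREAK else 0.015 * (t - BREAK) for t in range(MONTHS)]

# Observed searches: only the non-diverted share still reaches Google.
searches = [e * (1 - lk) for e, lk in zip(economy, leak)]

corr_pre = pearson(economy[:BREAK], searches[:BREAK])
corr_post = pearson(economy[BREAK:], searches[BREAK:])
print(f"pre-break correlation:  {corr_pre:.2f}")
print(f"post-break correlation: {corr_post:.2f}")
```

The economy series follows the same cycle in both halves of the sample. Only the mapping from economic interest to observed searches changes, and that alone is enough to weaken the relationship a nowcasting model would rely on.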

Research design

The project has five main phases:

Phase	Question
0	Does the AI adoption proxy behave as expected? (validation)
1	Are economic GTrends series actually shifting? (descriptive)
2	Does substitute AI predict shifts while placebo AI does not? (identification)
3	Which categories and search types are most affected? (heterogeneity)
4	Does the shift degrade GDP nowcasting performance? (nowcasting)

Identification strategy

Three orthogonal contrasts separate AI-induced measurement shift from confounders:

Platform (substitute vs. placebo AI): ChatGPT, Claude, Perplexity, etc. route queries away from Google, whereas Gemini (post-Feb 2024) routes queries through Google. (We may be able to extend the placebo time series with earlier Bard data, though data quality is likely to make this difficult.) If substitute AI erodes GTrends but placebo AI does not, the effect is specific to measurement disruption rather than general changes in economic behavior.
Content (GDP-relevant vs. non-relevant): Some categories carry strong GDP signal (bankruptcy, unemployment). Others have high AI substitutability but low GDP relevance (roleplaying, creative writing, cooking recipes). If erosion is concentrated in GDP-relevant categories, the tracker is specifically threatened.
Timing (lagged treatment + pre-trends): We further restrict alternative channels of causation by using lagged AI adoption as the treatment. We also test for parallel pre-trends before November 2022 to limit the impact of selection bias.
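One way to combine the three contrasts in a single estimating equation is sketched below. The notation is assumed for illustration and is not the project's final specification: $y_{g,c,t}$ is the GTrends index for geography $g$, category $c$, and period $t$; $S_{g,t-1}$ and $P_{g,t-1}$ are lagged substitute and placebo AI adoption proxies; $R_c$ is an indicator for GDP-relevant categories; $\alpha_{g,c}$ and $\gamma_t$ are unit and time fixed effects.

```latex
y_{g,c,t} = \alpha_{g,c} + \gamma_t
          + \beta_S \, S_{g,t-1} + \beta_P \, P_{g,t-1}
          + \delta_S \, (S_{g,t-1} \times R_c) + \delta_P \, (P_{g,t-1} \times R_c)
          + \varepsilon_{g,c,t}
```

Under this sketch, the platform contrast corresponds to $\beta_S < 0$ with $\beta_P \approx 0$, and the content contrast to the sign and size of $\delta_S$. The timing contrast enters through the lagged regressors, together with a check that geographies with high and low later adoption followed similar trends before November 2022.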

Key data sources

GTrends: approximately 215 economic categories, MSA-level (US) and country-level, monthly and weekly, 2020 to 2026
OECD Quarterly National Accounts: Real GDP growth (y-o-y) for 38 OECD members
BLS LAUS: Monthly state-level unemployment
Census ACS: MSA-level demographics (median age, education)
EF English Proficiency Index: Cross-country moderator. AI tools work better in English, so higher proficiency is expected to produce stronger substitution effects.
Shapley values: Category-level importance scores from the OECD Weekly Tracker repo (weekly_shap.pkl), used to prioritize the top-12 economic categories and compute vulnerability scores.
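A minimal sketch of the vulnerability score follows, taking it to be the estimated substitute-AI coefficient multiplied by the category's Shapley importance. The function name, the dictionary inputs, and all numbers are hypothetical. The internal structure of weekly_shap.pkl is not assumed here, and importance is assumed to be a non-negative score, so that more negative products indicate more exposed categories.

```python
def vulnerability_scores(betas, shap_importance):
    """Return {category: beta * importance}, most negative score first.

    betas: estimated effect of substitute AI on each category's GTrends series.
    shap_importance: non-negative importance of each category in the tracker.
    Categories missing from either input are skipped.
    """
    shared = betas.keys() & shap_importance.keys()
    scores = {cat: betas[cat] * shap_importance[cat] for cat in shared}
    return dict(sorted(scores.items(), key=lambda kv: kv[1]))

# Made-up numbers, for illustration only.
example_betas = {"bankruptcy": -0.30, "mortgage": -0.10, "unemployment benefits": -0.25}
example_shap = {"bankruptcy": 0.05, "mortgage": 0.20, "unemployment benefits": 0.10}

for category, score in vulnerability_scores(example_betas, example_shap).items():
    print(f"{category:<22} {score:+.3f}")
```

The product form means a category with a large coefficient but little weight in the tracker can rank below a category with a modest coefficient and high importance.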

Expected outputs

Proxy validation showing AI adoption patterns are consistent across time and geography
Time series figures showing structural breaks in economic GTrends series
Main result: negative effect of substitute AI on economic GTrends, with placebo near zero
Category-level vulnerability scores (β coefficient × Shapley importance)
Rolling-origin nowcasting evaluation: RMSE comparison pre-AI (2019 to 2022) versus post-AI (2023 to 2026)
Potential policy recommendation: a reweighted tracker that down-weights vulnerable categories
2×2 diagnostic table (economic vs. non-economic categories × substitute vs. placebo AI)
Pre-trend figure and parallel trends test before November 2022
Collinearity diagnostic (partial correlation between substitute and placebo AI series)
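The rolling-origin evaluation listed above could look roughly like the sketch below. It is an illustration under stated assumptions: a one-regressor least-squares fit stands in for the actual nowcasting model, the training window is expanding, and the data are synthetic, with a search index that tracks growth exactly until a hypothetical break and then drifts for non-economic reasons.

```python
import math

def fit_ols(x, y):
    """Closed-form simple regression y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rolling_origin_errors(x, y, min_train):
    """One-step-ahead errors: refit on periods [0, t), then predict period t."""
    errors = []
    for t in range(min_train, len(y)):
        a, b = fit_ols(x[:t], y[:t])
        errors.append(y[t] - (a + b * x[t]))
    return errors

def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic example with assumed sizes and an assumed break period.
T, BREAK, MIN_TRAIN = 80, 40, 12
growth = [2.0 + math.sin(t / 4) for t in range(T)]
index = [50 + 10 * (g - 2.0) - (0.5 * (t - BREAK) if t >= BREAK else 0.0)
         for t, g in enumerate(growth)]

errors = rolling_origin_errors(index, growth, MIN_TRAIN)
pre = errors[: BREAK - MIN_TRAIN]    # forecast targets before the break
post = errors[BREAK - MIN_TRAIN:]    # forecast targets from the break onward
print(f"RMSE pre:  {rmse(pre):.3f}")
print(f"RMSE post: {rmse(post):.3f}")
```

Each forecast uses only data available before its target period, so the pre and post RMSE figures are comparable out-of-sample measures.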

Technical stack

Data: pytrends, pickle, OECD SDMX API, BLS LAUS, Census ACS
Analysis: linearmodels (panel regression with two-way clustered SEs), statsmodels (HAC SEs, mixed effects), pygam (nonlinearity diagnostics), lightgbm (gradient boosting for nowcasting), strucchange (structural break detection)
Visualization: matplotlib, seaborn

Why this matters beyond the OECD tracker

I see this project less as a study confined to GTrends and more as a template for evaluating the measurement stability of alternative-data nowcasting inputs in general. Any instrument whose data-generating process is susceptible to technological disruption faces the same class of problem: satellite imagery when sensor technology changes, payments data when crypto adoption shifts transaction channels, and shipping data when supply chains restructure.

GTrends is an interesting case because it is widely used in nowcasting and because the impact appears first-order, with AI tools replacing (and to a large extent having already replaced) search. GTrends and AI search substitution are the case study; the methodology is the portable contribution.

Reference

Woloszko, N. (2020). "Tracking Economic Growth in Real Time with Weekly Indicators." OECD Economics Department Working Papers No. 1634.
