In-context learning of representations can be explained by induction circuits

This repository contains minimal reproduction scripts for the blog post:

In-context learning of representations can be explained by induction circuits, by Andy Arditi (Northeastern University)

The blog post responds to Park et al., 2025, who find that when LLMs process random walks on a graph in-context, their token representations come to mirror the graph's connectivity structure. We offer a simpler mechanistic explanation: the task can be solved by induction circuits, and the geometric structure of representations is a byproduct of previous-token mixing within those circuits.

Setup

Requirements: Python 3.10+ and uv. Scripts 01 and 02 require a CUDA-capable GPU with at least 48 GB of memory (for Llama-3.1-8B). 03_neighbor_mixing.py runs on CPU and does not load a model.

# 1. Create and activate a virtual environment
uv venv
source .venv/bin/activate

# 2. Install dependencies
uv pip install -r requirements.txt

# 3. Log in to Hugging Face (required for Llama-3.1-8B access)
huggingface-cli login

Llama-3.1-8B is a gated model, so you must accept its license on Hugging Face before running scripts 01 and 02.

Scripts

`01_reproduce.py` — Figures 2 and 6

Reproduces the core results from Park et al.: a language model performing in-context learning on a grid random walk task.

python 01_reproduce.py

Outputs (in results/reproduce/plots/):

| File | Description |
| --- | --- |
| `accuracy_curve.{pdf,png,html}` | Fig 2 left. Accuracy (probability on valid next tokens) as a function of context length. |
| `pca_class_means.{pdf,png,html}` | Fig 2 right. PCA of the 16 class-mean activations at layer 26 (last 200 positions). |
| `bigram_pca.{pdf,png,html}` | Fig 6. Individual activations projected onto the same PCA directions. Fill color = current token, border color = previous token. |
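The class-mean step behind the Fig 2 PCA can be sketched as follows. This is an illustrative sketch only, not the code in `01_reproduce.py`: it assumes one activation vector per sequence position, aligned with a list of token labels, and the function name `class_means` and its `lookback` parameter are hypothetical.

```python
from collections import defaultdict


def class_means(tokens, activations, lookback=200):
    """Mean activation per token class over the last `lookback` positions.

    tokens:      list of token labels, one per position
    activations: list of equal-length float vectors, one per position
    """
    sums = {}
    counts = defaultdict(int)
    # Only the tail of the sequence contributes, mirroring the "last 200
    # positions" setting described in this README.
    for tok, act in zip(tokens[-lookback:], activations[-lookback:]):
        if tok not in sums:
            sums[tok] = [0.0] * len(act)
        sums[tok] = [s + a for s, a in zip(sums[tok], act)]
        counts[tok] += 1
    return {tok: [s / counts[tok] for s in vec] for tok, vec in sums.items()}
```

With 16 grid words this yields up to 16 mean vectors, which would then be passed to PCA.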

`02_ablation.py` — Figures 3 and 4

Tests the induction circuit hypothesis by ablating attention heads.

python 02_ablation.py

This script first identifies induction heads and previous-token heads using repeated random token sequences (Appendix A), then ablates the top-k heads of each type and measures the effect on accuracy and representations.
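One common way to score heads on a repeated random sequence is sketched below. It is an assumption that this matches the definitions in Appendix A. The names `head_scores`, `top_k_heads`, `one_hot_pattern`, `pattern`, and `period` are hypothetical, and the attention patterns here are toy inputs rather than model outputs.

```python
def head_scores(pattern, period):
    """Score one head on a random block of `period` tokens repeated twice.

    pattern[q][k] is the attention weight from query position q to key k.
    Returns (induction_score, prev_token_score), each a mean attention weight.
    """
    second_half = range(period, 2 * period)
    # Induction: attend to the token that followed the previous occurrence
    # of the current token, i.e. key position q - period + 1.
    induction = sum(pattern[q][q - period + 1] for q in second_half) / period
    # Previous-token: attend to the immediately preceding position.
    prev_token = sum(pattern[q][q - 1] for q in range(1, 2 * period)) / (2 * period - 1)
    return induction, prev_token


def top_k_heads(scores, k):
    """Return the k (layer, head) keys with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def one_hot_pattern(length, target):
    """Toy pattern: each query q puts all its weight on key target(q)."""
    rows = []
    for q in range(length):
        row = [0.0] * length
        row[max(0, min(q, target(q)))] = 1.0  # clamp to causal, in-range keys
        rows.append(row)
    return rows


period = 5
ideal_induction = one_hot_pattern(2 * period, lambda q: q - period + 1)
ideal_prev = one_hot_pattern(2 * period, lambda q: q - 1)

print(head_scores(ideal_induction, period))
print(head_scores(ideal_prev, period))
```

Ranking every (layer, head) pair by each score and passing the result to `top_k_heads` gives candidate sets for ablation at k = 1, 2, 4, and so on. How the selected heads are ablated (for example zero versus mean ablation) is not specified here, so the sketch leaves it out.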

Outputs (in results/ablation/plots/):

| File | Description |
| --- | --- |
| `ablation_induction.{pdf,png,html}` | Fig 3 left. Accuracy curves when ablating top-k induction heads (k = 1, 2, 4, 8, 16, 32). |
| `ablation_prev_token.{pdf,png,html}` | Fig 3 right. Accuracy curves when ablating top-k previous-token heads. |
| `pca_baseline.{pdf,png,html}` | Fig 4 left. Class-mean PCA with no ablation (baseline). |
| `pca_induction_ablated.{pdf,png,html}` | Fig 4 center. Class-mean PCA with top-32 induction heads ablated. |
| `pca_prev_token_ablated.{pdf,png,html}` | Fig 4 right. Class-mean PCA with top-32 previous-token heads ablated. |

`03_neighbor_mixing.py` — Figure 5

Demonstrates that a single round of previous-token (neighbor) mixing can explain the emergent grid structure in representations. No model or GPU required.

python 03_neighbor_mixing.py

Outputs (in results/neighbor_mixing/plots/):

| File | Description |
| --- | --- |
| `before_mixing.{pdf,png,html}` | Fig 5 left. PCA of 16 random Gaussian vectors in R^4096. |
| `after_mixing.{pdf,png,html}` | Fig 5 right. PCA after applying one round of neighbor mixing: each embedding is updated by adding the mean of its grid neighbors' embeddings. |
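The mixing rule described above (add the mean of the grid neighbors' embeddings) can be illustrated with a standard-library sketch. This is not the code in `03_neighbor_mixing.py`: instead of PCA it compares cosine similarities, and all function names here are hypothetical. Independent Gaussian vectors in high dimension are nearly orthogonal, so any similarity between neighbors after mixing comes from the mixing step itself.

```python
import math
import random

SIZE = 4      # 4x4 grid, 16 nodes
DIM = 4096    # embedding dimension used for Fig 5


def grid_neighbors(index):
    """Indices of the up/down/left/right neighbors of a node on the grid."""
    row, col = divmod(index, SIZE)
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [r * SIZE + c for r, c in candidates if 0 <= r < SIZE and 0 <= c < SIZE]


def mix_once(embeddings):
    """One round of mixing: add the mean of each node's neighbor embeddings."""
    mixed = []
    for i, vec in enumerate(embeddings):
        nbrs = [embeddings[j] for j in grid_neighbors(i)]
        mean = [sum(vals) / len(nbrs) for vals in zip(*nbrs)]
        mixed.append([v + m for v, m in zip(vec, mean)])
    return mixed


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_cosine(embeddings, adjacent):
    """Average cosine similarity over adjacent (or non-adjacent) node pairs."""
    sims = [
        cosine(embeddings[i], embeddings[j])
        for i in range(len(embeddings))
        for j in range(i + 1, len(embeddings))
        if (j in grid_neighbors(i)) == adjacent
    ]
    return sum(sims) / len(sims)


rng = random.Random(42)
before = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(SIZE * SIZE)]
after = mix_once(before)

print("neighbors, before:", round(mean_cosine(before, True), 3))
print("neighbors, after: ", round(mean_cosine(after, True), 3))
print("others, after:    ", round(mean_cosine(after, False), 3))
```

After mixing, grid neighbors should be noticeably more similar to each other than non-neighbors are, which is the kind of structure a PCA plot would pick up.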

The Grid Task

The task uses a 4×4 grid of common English words:

 apple  bird   car    egg
 house  milk   plane  opera
 box    sand   sun    mango
 rock   math   code   phone

A random walk produces a sequence of words by moving to adjacent cells (up/down/left/right). The model is given this sequence and must predict valid next words at each position. A prediction counts as correct to the extent that it places probability on tokens corresponding to grid neighbors of the current word, so accuracy at a position is the total probability assigned to those tokens.
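A minimal sketch of the task, using only the standard library. It assumes each step picks a neighbor uniformly at random and that each word maps to a single prediction target. Neither assumption is confirmed by this README, and the helper names are hypothetical rather than taken from `utils.py`.

```python
import random

# The 4x4 word grid from this README.
GRID = [
    ["apple", "bird", "car", "egg"],
    ["house", "milk", "plane", "opera"],
    ["box", "sand", "sun", "mango"],
    ["rock", "math", "code", "phone"],
]
SIZE = len(GRID)


def neighbors(row, col):
    """Return the in-bounds up/down/left/right neighbors of a cell."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [
        (row + dr, col + dc)
        for dr, dc in steps
        if 0 <= row + dr < SIZE and 0 <= col + dc < SIZE
    ]


def random_walk(length, start=(0, 0), seed=42):
    """Generate `length` words by stepping to a uniformly random neighbor."""
    rng = random.Random(seed)
    row, col = start
    words = [GRID[row][col]]
    while len(words) < length:
        row, col = rng.choice(neighbors(row, col))
        words.append(GRID[row][col])
    return words


def valid_next_words(word):
    """Words that are grid neighbors of `word`, i.e. the valid next tokens."""
    for r in range(SIZE):
        for c in range(SIZE):
            if GRID[r][c] == word:
                return {GRID[nr][nc] for nr, nc in neighbors(r, c)}
    raise ValueError(f"{word!r} is not on the grid")


def accuracy(current_word, next_word_probs):
    """Total probability that a predicted distribution puts on valid next words."""
    valid = valid_next_words(current_word)
    return sum(p for w, p in next_word_probs.items() if w in valid)


print(" ".join(random_walk(8)))
print(accuracy("apple", {"bird": 0.5, "house": 0.25, "car": 0.25}))
```

In the second call, "bird" and "house" are neighbors of "apple" but "car" is not, so only the first two probabilities count toward accuracy.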

Project Structure

.
├── README.md
├── requirements.txt
├── utils.py                 # Shared utilities (Grid, model loading, PCA, plotting)
├── 01_reproduce.py          # Fig 2 (accuracy + PCA) and Fig 6 (bigram PCA)
├── 02_ablation.py           # Fig 3 (ablation accuracy) and Fig 4 (ablation PCA)
├── 03_neighbor_mixing.py    # Fig 5 (toy model of neighbor mixing)
└── results/
    ├── reproduce/
    │   ├── data/            # Cached activations, accuracies, sequences
    │   └── plots/           # PDF, PNG, and interactive HTML figures
    ├── ablation/
    │   ├── data/            # Cached head scores, ablation accuracies, PCA data
    │   └── plots/
    └── neighbor_mixing/
        ├── data/            # Cached mixing projections
        └── plots/

Reproducibility

All scripts use set_seed(42) for deterministic results. The 16 accuracy curves use uniform initialization (one sequence starting at each grid position) to ensure all positions are represented.

Model: meta-llama/Llama-3.1-8B via TransformerLens
Analysis layer: 26 (residual stream pre-attention)
Sequence length: 1,400 tokens
PCA lookback: last 200 positions
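For readers who want to see what a seeding helper typically does, here is a hedged sketch. It is not the `set_seed` defined in `utils.py`, whose contents are not shown here. The name `set_seed_sketch` is hypothetical, and NumPy and PyTorch are seeded only if they happen to be installed.

```python
import random


def set_seed_sketch(seed):
    """Seed the common random number generators (illustrative only)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


set_seed_sketch(42)
first = random.random()
set_seed_sketch(42)
second = random.random()
print(first == second)
```

Seeding makes the sampled walks and random vectors repeatable. Bitwise-identical GPU results can additionally depend on hardware and library versions.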

Citation

If you use this code, please cite:

TODO

License

MIT
