GitHub - ynklab/shared_syntactic_mechanism

Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models

This repository contains code for the paper "Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models".

The dataset used for the analysis is available at kumoryo9/shared-mech.

Setup

Set up the environment with uv (e.g., uv sync). To log results with wandb, set the following environment variable.

export WANDB_API_KEY=<your_api_key>

Analysis

Run activation patching on the residual stream, attention output, and MLP output for filler-gap dependencies. Set --categories to control for the control pattern or to npi for NPI licensing. For NPI licensing, also add --src_base_pattern label2.

uv run python main.py \
    --patch_type activation \
    --model_name pythia \
    --num_param 1b \
    --num_steps 143000 \
    --log_dir /path/to/dir/for/logging \
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 \
    --categories filler_gap \
    --proj_method vanilla \
    --interv_component resid attn mlp \
    --do_leave_one_out \
    --batch_size 10
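The per-category variations described above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper name build_patching_command is hypothetical, and it assumes main.py accepts its flags in any order. All flag names and values are copied from the command above.

```python
import shlex

# Flags shared by all three categories, copied from the README command.
BASE_ARGS = [
    "uv", "run", "python", "main.py",
    "--patch_type", "activation",
    "--model_name", "pythia",
    "--num_param", "1b",
    "--num_steps", "143000",
    "--log_dir", "/path/to/dir/for/logging",
    "--layer_idx", *[str(i) for i in range(16)],
    "--proj_method", "vanilla",
    "--interv_component", "resid", "attn", "mlp",
    "--do_leave_one_out",
    "--batch_size", "10",
]


def build_patching_command(category: str) -> list[str]:
    """Return the argument list for one category (hypothetical helper)."""
    cmd = [*BASE_ARGS, "--categories", category]
    # The README asks for this extra flag only for NPI licensing.
    if category == "npi":
        cmd += ["--src_base_pattern", "label2"]
    return cmd


if __name__ == "__main__":
    for category in ("filler_gap", "control", "npi"):
        # Print a copy-pasteable shell command for each category.
        print(shlex.join(build_patching_command(category)))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run.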

Run activation patching on individual attention heads at the last token for filler-gap dependencies. Other categories can be analyzed by changing --categories in the same way as in the previous command.

uv run python main.py \
    --patch_type activation \
    --model_name pythia \
    --num_param 1b \
    --num_steps 143000 \
    --log_dir /path/to/dir/for/logging \
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 \
    --categories filler_gap \
    --proj_method vanilla \
    --interv_component attn-head \
    --do_leave_one_out \
    --batch_size 10 \
    --only_last_token
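For readers new to the technique, activation patching at the last token can be illustrated on a toy model. This is a sketch under stated assumptions and not the repository's implementation: run_toy_model, its two stages, and the "recovered fraction" metric are all made up for illustration. The idea is to cache an activation from a run on a source input, overwrite the same activation at the last position during a run on the base input, and measure how far the output moves toward the source output.

```python
def run_toy_model(tokens, patch=None):
    """Run a toy two-stage model, optionally overwriting one activation.

    tokens: list of floats, one per position.
    patch: optional (position, value) pair that replaces the hidden
           activation at that position before the readout.
    Returns (output, hidden).
    """
    # Stage 1: per-position hidden activation (stand-in for a component output).
    hidden = [2.0 * t + 1.0 for t in tokens]
    if patch is not None:
        position, value = patch
        hidden[position] = value
    # Stage 2: readout that depends on the last and the first position.
    output = hidden[-1] + 0.5 * hidden[0]
    return output, hidden


base = [1.0, 2.0, 3.0]
source = [3.0, 2.0, 5.0]

base_out, _ = run_toy_model(base)
source_out, source_hidden = run_toy_model(source)

# Patch only the last position with the activation cached from the source run.
patched_out, _ = run_toy_model(base, patch=(-1, source_hidden[-1]))

# Fraction of the base-to-source output gap recovered by the patch.
recovered = (patched_out - base_out) / (source_out - base_out)
print(f"base={base_out} source={source_out} patched={patched_out}")
print(f"recovered fraction={recovered:.3f}")
```

Because the toy readout also depends on the first position, patching the last position alone recovers only part of the gap, which is the kind of localization signal patching is used to measure.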

Run DAS (distributed alignment search) on the residual stream, attention output, and MLP output for filler-gap dependencies. Other categories and attention heads can be analyzed by changing --categories and --interv_component in the same way as in the commands above.

uv run python main.py \
    --patch_type activation \
    --model_name pythia \
    --num_param 1b \
    --num_steps 143000 \
    --log_dir /path/to/dir/for/logging \
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 \
    --categories filler_gap \
    --proj_method das \
    --label_methods None label-ood \
    --das_lr 5e-3 \
    --das_steps 100 \
    --interv_component resid attn mlp \
    --do_leave_one_out \
    --batch_size 10
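DAS differs from vanilla patching in that it swaps only a learned subspace of the activation rather than the whole vector. The sketch below shows the one-dimensional interchange step only, assuming a fixed unit direction. In DAS the direction is learned by gradient descent (presumably what --das_lr and --das_steps control here), and that training loop is omitted. The function names and the example vectors are hypothetical, and the repository's implementation may differ.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def interchange_1d(base, source, direction):
    """Replace the component of `base` along `direction` with that of `source`.

    Everything orthogonal to `direction` is left as in `base`.
    """
    d = normalize(direction)
    shift = dot(d, source) - dot(d, base)
    return [b + shift * di for b, di in zip(base, d)]


base = [1.0, 0.0, 2.0]
source = [4.0, 3.0, -1.0]
direction = [1.0, 1.0, 0.0]  # hypothetical; DAS would learn this direction

patched = interchange_1d(base, source, direction)
print(patched)
```

After the interchange, the patched vector agrees with the source along the chosen direction and with the base everywhere else, so any change in model output can be attributed to that one direction.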

Steering

Steer attention heads and evaluate performance on BLiMP.

uv run python blimp.py \
    --model_name pythia \
    --num_param 1b \
    --num_steps 143000 \
    --log_dir /path/to/dir/for/logging \
    --steer_heads 7.5 7.6 9.2 \
    --steer_strength 0.8 1.0 1.2 1.5 \
    --categories all \
    --batch_size 10
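The --steer_heads values look like layer.head pairs (for example, 7.5 as layer 7, head 5). That reading is an assumption and is not confirmed by this README. Under that assumption, a minimal sketch of parsing such values is shown below. The function names are hypothetical. The values are split as strings rather than parsed as floats, since a float would conflate 7.1 and 7.10.

```python
from collections import defaultdict


def parse_head(spec: str) -> tuple[int, int]:
    """Split an assumed 'layer.head' string into integer indices."""
    layer, head = spec.split(".")
    return int(layer), int(head)


def group_by_layer(specs):
    """Collect head indices per layer, preserving the given order."""
    by_layer = defaultdict(list)
    for spec in specs:
        layer, head = parse_head(spec)
        by_layer[layer].append(head)
    return dict(by_layer)


heads_by_layer = group_by_layer(["7.5", "7.6", "9.2"])
print(heads_by_layer)  # {7: [5, 6], 9: [2]}
```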

Steer attention heads and evaluate performance on SyntaxGym. Data can be downloaded from here. You may need to modify the predictions/formula attribute in the JSON files so that they parse correctly.

uv run python syntaxgym.py \
    --model_name pythia \
    --num_param 1b \
    --num_steps 143000 \
    --log_dir /path/to/dir/for/logging \
    --steer_heads 7.5 7.6 9.2 \
    --steer_strength 0.8 1.0 1.2 1.5 \
    --categories all \
    --batch_size 10
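Before editing the predictions/formula attribute, it can help to list the formulas a suite contains. The sketch below assumes each suite is a JSON object with a top-level predictions list whose entries carry a formula string. The helper name list_formulas and the sample suite are made up for illustration, and the edit that syntaxgym.py actually needs is not specified here.

```python
import json


def list_formulas(suite: dict) -> list[str]:
    """Return every formula string found under the assumed 'predictions' key."""
    return [p["formula"] for p in suite.get("predictions", []) if "formula" in p]


# Made-up miniature suite in the assumed layout, not real SyntaxGym data.
sample_json = """
{
  "meta": {"name": "toy_suite"},
  "predictions": [
    {"type": "formula", "formula": "(4;%match%) < (4;%mismatch%)"}
  ],
  "items": []
}
"""

suite = json.loads(sample_json)
for formula in list_formulas(suite):
    print(formula)
```

For a downloaded suite, replace json.loads(sample_json) with json.load on the opened file.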

Steer attention heads and evaluate performance on HANS. Data can be downloaded from here.

uv run python nli.py \
    --steer_strength 1.0 \
    --train_path /path/to/heuristics_train_set.txt \
    --test_path /path/to/heuristics_test_set.txt
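To inspect the HANS files before running nli.py, a small reader can help. This is a sketch that assumes the files are tab-separated with a header row containing the columns gold_label, sentence1, sentence2, and heuristic. The function name read_hans and the sample row are made up for illustration, and nli.py may load the data differently.

```python
import csv
import io
from collections import Counter


def read_hans(handle):
    """Read tab-separated rows into dicts, keeping the fields needed for NLI."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {
            "premise": row["sentence1"],
            "hypothesis": row["sentence2"],
            "label": row["gold_label"],
            "heuristic": row["heuristic"],
        }
        for row in reader
    ]


# Made-up row in the assumed column layout, standing in for a real file.
sample = io.StringIO(
    "gold_label\tsentence1\tsentence2\theuristic\n"
    "non-entailment\tThe doctor saw the lawyer .\t"
    "The lawyer saw the doctor .\tlexical_overlap\n"
)

examples = read_hans(sample)
counts = Counter(example["heuristic"] for example in examples)
print(counts)
```

For a real file, pass open(path, encoding="utf-8", newline="") in place of the in-memory sample.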

Data Generation

Generate data for patterns pattern_A and pattern_B in data/patterns/.

cd data
uv run python builder.py --patterns pattern_A pattern_B

Acknowledgement

We used code from CausalGym and Data Generation.

