ynklab/shared_syntactic_mechanism

Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models

This repository contains code for the paper "Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models".

The dataset used for the analysis is available at kumoryo9/shared-mech.

Setup

Set up the environment with uv (uv sync etc.). To log results with wandb, set the following environment variable.

export WANDB_API_KEY=<your_api_key>

Analysis

  • Activation patching on the residual stream, attention output, and MLP output for filler-gap dependencies. Set --categories to control for the control pattern or to npi for NPI licensing. For NPI licensing, also add --src_base_pattern label2.
uv run python main.py\
    --patch_type activation\
    --model_name pythia\
    --num_param 1b\
    --num_steps 143000\
    --log_dir /path/to/dir/for/logging\
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15\
    --categories filler_gap\
    --proj_method vanilla\
    --interv_component resid attn mlp\
    --do_leave_one_out \
    --batch_size 10
  • Activation patching on the attention heads at the last token for filler-gap dependencies. Analysis on other categories can be done in a similar manner to the above.
uv run python main.py\
    --patch_type activation\
    --model_name pythia\
    --num_param 1b\
    --num_steps 143000\
    --log_dir /path/to/dir/for/logging\
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15\
    --categories filler_gap\
    --proj_method vanilla\
    --interv_component attn-head\
    --do_leave_one_out \
    --batch_size 10\
    --only_last_token
  • DAS on the residual stream, attention output, and MLP output for filler-gap dependencies. Analysis on other categories or on attention heads can be done in a similar manner to the above.
uv run python main.py\
    --patch_type activation\
    --model_name pythia\
    --num_param 1b\
    --num_steps 143000\
    --log_dir /path/to/dir/for/logging\
    --layer_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15\
    --categories filler_gap\
    --proj_method das\
    --label_methods None label-ood\
    --das_lr 5e-3\
    --das_steps 100\
    --interv_component resid attn mlp\
    --do_leave_one_out \
    --batch_size 10
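
For readers new to the technique, the two projection methods used above (vanilla and das) can be illustrated on a toy model. The sketch below is not the repository's implementation: the two-layer "model", the function names (interchange, forward), and the fixed direction vector are all hypothetical. Vanilla patching overwrites a base activation with the one cached from a source run; a DAS-style intervention swaps only the component along a direction, which real DAS learns by gradient descent (omitted here).

```python
from typing import Callable, Dict, List, Optional

Hidden = List[float]  # toy hidden state at a single token position

# Hypothetical toy layers standing in for transformer blocks.
LAYERS: List[Callable[[Hidden], Hidden]] = [
    lambda h: [2.0 * v for v in h],
    lambda h: [v + 1.0 for v in h],
]


def interchange(base_h: Hidden, source_h: Hidden,
                direction: Optional[Hidden] = None) -> Hidden:
    """Return the base activation with source information patched in."""
    if direction is None:
        # Vanilla patching: replace the whole activation.
        return list(source_h)
    # DAS-style patching along one unit-norm direction: swap only the
    # component of the activation that lies along `direction`.
    coeff = sum((s - b) * d for s, b, d in zip(source_h, base_h, direction))
    return [b + coeff * d for b, d in zip(base_h, direction)]


def forward(x: Hidden,
            cache: Optional[Dict[int, Hidden]] = None,
            patch: Optional[Dict[int, Hidden]] = None,
            direction: Optional[Hidden] = None) -> float:
    """Run the toy model, optionally caching or patching layer outputs."""
    h = list(x)
    for idx, layer in enumerate(LAYERS):
        h = layer(h)
        if patch is not None and idx in patch:
            h = interchange(h, patch[idx], direction)
        if cache is not None:
            cache[idx] = list(h)
    return sum(h)  # scalar stand-in for a logit difference


base, source = [1.0, 2.0], [3.0, 1.0]

# 1. Run the source input and cache its activations.
source_cache: Dict[int, Hidden] = {}
source_out = forward(source, cache=source_cache)

# 2. Run the base input unmodified.
base_out = forward(base)

# 3. Re-run the base input with layer 0 patched from the source run.
vanilla_out = forward(base, patch={0: source_cache[0]})
das_out = forward(base, patch={0: source_cache[0]}, direction=[1.0, 0.0])

# The effect of the intervention is the change in the output.
vanilla_effect = vanilla_out - base_out
das_effect = das_out - base_out
```

Patching the full layer-0 activation makes everything downstream identical to the source run, so vanilla_out equals source_out in this toy; the directional patch changes only one coordinate and therefore yields a different output.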

Steering

  • Steer attention heads and evaluate performance on BLiMP.
uv run python blimp.py\
    --model_name pythia\
    --num_param 1b\
    --num_steps 143000\
    --log_dir /path/to/dir/for/logging\
    --steer_heads 7.5 7.6 9.2\
    --steer_strength 0.8 1.0 1.2 1.5\
    --categories all \
    --batch_size 10
  • Steer attention heads and evaluate performance on SyntaxGym. Data can be downloaded from here. You may have to modify the predictions/formula attribute in the JSON files so that they parse correctly.
uv run python syntaxgym.py\
    --model_name pythia\
    --num_param 1b\
    --num_steps 143000\
    --log_dir /path/to/dir/for/logging\
    --steer_heads 7.5 7.6 9.2\
    --steer_strength 0.8 1.0 1.2 1.5\
    --categories all \
    --batch_size 10
  • Steer attention heads and evaluate performance on HANS. Data can be downloaded from here.
uv run python nli.py\
    --steer_strength 1.0\
    --train_path /path/to/heuristics_train_set.txt \
    --test_path /path/to/heuristics_test_set.txt
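
As a rough illustration of what the steering flags above might correspond to, the sketch below parses head specifications and rescales the selected heads' outputs. It rests on two assumptions that this README does not confirm: that a value such as 7.5 in --steer_heads means layer 7, head 5, and that --steer_strength acts as a multiplicative scale on a head's output. The names parse_head and steer are hypothetical and are not taken from the repository.

```python
from typing import Dict, Iterable, List, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def parse_head(spec: str) -> Head:
    """Parse an assumed 'layer.head' string such as '7.5' into (7, 5)."""
    layer, head = spec.split(".")
    return int(layer), int(head)


def steer(head_outputs: Dict[Head, List[float]],
          heads: Iterable[Head],
          strength: float) -> Dict[Head, List[float]]:
    """Scale the outputs of the selected heads; leave the others unchanged.

    Assumption: steering is a multiplicative scaling of the head output.
    """
    targets = set(heads)
    return {
        key: [strength * v for v in vec] if key in targets else list(vec)
        for key, vec in head_outputs.items()
    }


# Head specifications as they appear on the command line.
heads = [parse_head(s) for s in "7.5 7.6 9.2".split()]

# Toy per-head outputs; a real model would provide tensors instead.
outputs = {(7, 5): [2.0, -4.0], (0, 0): [1.0, 1.0]}
steered = steer(outputs, heads, strength=1.5)
```

Under these assumptions, a strength of 1.0 leaves the model unchanged, which makes it a natural baseline among the strengths swept in the commands above.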

Data Generation

Generate data for patterns pattern_A and pattern_B in data/patterns/.

cd data
uv run python builder.py --patterns pattern_A pattern_B

Acknowledgement

We used code from CausalGym and Data Generation.
