
InferPilot

This is the official repository of the ACL 2026 Findings paper InferPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents.

InferPilot is an agentic benchmark for evaluating LLM-driven machine learning privacy attacks. It provides a structured environment where an LLM-based agent autonomously plans and executes inference-time attacks against black-box ML models.
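At a high level, such an agent runs a plan-act-observe loop: the LLM proposes the next action, the environment executes it, and the observation is fed back into the context. The sketch below is generic and illustrative only; InferPilot's real agent classes live under agents/ and are more elaborate, and the action names here are made up.

```python
# Generic plan-act-observe agent loop (illustrative sketch, not InferPilot's
# actual agent code; action names are hypothetical).

def run_agent(propose_action, execute, max_steps=5):
    """Drive an agent: ask the LLM for an action, execute it, feed back the result."""
    history = []
    for _ in range(max_steps):
        action = propose_action(history)   # LLM picks the next tool call
        if action == "finish":
            break
        observation = execute(action)      # environment runs the action
        history.append((action, observation))
    return history

# Toy stand-ins for the LLM policy and the environment:
script = iter(["query_target", "train_shadow", "finish"])
trace = run_agent(lambda h: next(script), lambda a: f"ok:{a}")
print([a for a, _ in trace])  # ['query_target', 'train_shadow']
```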

Overview

InferPilot covers five attack tasks:

  • Membership Inference (MIA): Determine whether a data sample was used to train the target model
  • Attribute Inference: Infer sensitive attributes of input data from model predictions
  • Data Reconstruction: Reconstruct training data from model outputs via inversion attacks
  • Model Stealing: Extract a functionally equivalent surrogate model from query access
  • All-in-One: A meta-task where a ControllerAgent autonomously selects and coordinates multiple attacks
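As a concrete illustration of the first task, the simplest membership inference baseline thresholds the target model's top predicted probability: models tend to be more confident on samples they were trained on. This is a generic sketch with made-up scores and threshold, not InferPilot's implementation.

```python
# Minimal confidence-threshold membership inference baseline (illustrative,
# not InferPilot's attack code). Intuition: models are usually more confident
# on their own training samples.

def mia_predict(confidences, threshold=0.9):
    """Label each sample as member (1) or non-member (0) by top confidence."""
    return [1 if c > threshold else 0 for c in confidences]

# Hypothetical top-class probabilities returned by a target model:
member_scores = [0.99, 0.97, 0.95]      # training samples: high confidence
nonmember_scores = [0.62, 0.85, 0.71]   # unseen samples: lower confidence

preds = mia_predict(member_scores + nonmember_scores)
accuracy = sum(p == t for p, t in zip(preds, [1, 1, 1, 0, 0, 0])) / 6
print(accuracy)  # 1.0 on this toy data
```

In practice the threshold is calibrated on shadow models trained from the shadow splits described below, rather than fixed by hand.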

Repository Structure

inferpilot/
├── runner.py                        # Main entry point
├── run_exp.sh                       # Experiment runner script
├── env.py                           # Benchmark environment
├── LLM.py                           # LLM API wrappers (OpenAI, Claude, etc.)
├── schema.py                        # Data schemas (Action, Step, Trace, ...)
├── low_level_actions.py             # Primitive environment actions (read, write, execute)
├── high_level_actions.py            # High-level agent tools (edit script, understand file, ...)
├── logger.py                        # Logging utility for target model training
├── prepare_dataset.py               # Prepare raw datasets into target/shadow .pt splits
├── agents/
│   ├── agent.py                     # Base agent class
│   ├── attack_agent.py              # AttackAgent for individual attack tasks
│   └── controller_agent.py          # ControllerAgent for all-in-one coordination
├── targets/
│   ├── target_service.py            # Flask server exposing the target model API
│   ├── train_target_model.py        # Script to train target models
│   ├── prepare_target_dataset.py    # Script to prepare target dataset splits
│   ├── target_dataset.py            # Dataset utilities
│   ├── custom_datasets/             # Dataset loaders for AFAD, CelebA, UTKFace
│   ├── model_pool/                  # Model architecture definitions
│   └── assets/                      # (generated) trained models and dataset splits
├── benchmarks/
│   ├── mia/                         # Membership inference task
│   │   ├── env/                     # Attack scripts and resources
│   │   └── scripts/                 # Prompt template and config
│   ├── attr_infer/                  # Attribute inference task
│   ├── data_recon/                  # Data reconstruction task
│   ├── model_steal/                 # Model stealing task
│   └── all_in_one/                  # Combined multi-attack task
└── task_configs/                    # Task configuration JSON files
    ├── mia/
    ├── attr_infer/
    ├── data_recon/
    ├── model_steal/
    └── all_in_one/

Requirements

pip install anthropic openai tiktoken torch==2.5.1 torchvision==0.20.1 dacite flask tqdm timm pandas pillow nltk rouge-score bert-score numpy matplotlib scikit-learn

Set up your API keys:

echo "YOUR_OPENAI_KEY" > openai_api_key.txt        # format: org_id:api_key
echo "YOUR_ANTHROPIC_KEY" > claude_api_key.txt
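The comment above notes that the OpenAI key file uses an org_id:api_key format. A small helper to split it might look like the sketch below; how LLM.py actually parses the file may differ.

```python
# Sketch of parsing openai_api_key.txt in the "org_id:api_key" format noted
# above (illustrative; LLM.py's real parsing may differ).

def parse_openai_key(text):
    """Split 'org_id:api_key' into its two parts; tolerate plain keys too."""
    text = text.strip()
    if ":" in text:
        org_id, api_key = text.split(":", 1)
        return org_id, api_key
    return None, text  # no org id present

org, key = parse_openai_key("org-abc123:sk-xyz789")
print(org, key)  # org-abc123 sk-xyz789
```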

Target Service Setup

Each experiment requires a running target model service. Setup involves two steps.

Step 1 — Prepare datasets

CIFAR-10 and STL-10 are downloaded automatically by prepare_dataset.py. The three face datasets must be downloaded manually first:

CelebA

  1. Download from the official page (or Kaggle mirror):
    • img_align_celeba.zip — aligned face images
  2. Unzip and place the image folder at data/celeba/img_align_celeba/
  3. Place attribute/partition CSV files (list_attr_celeba.csv, list_eval_partition.csv) at data/celeba/

UTKFace

  1. Download from the official page — get the aligned image archive
  2. Place all .jpg images at data/utkface/UTKFace/
  3. Place utkface_attr.csv file at data/utkface/

AFAD

  1. Download from GitHub — get AFAD-Full
  2. Place the image folder at data/afad/AFAD-Full/
  3. Place afad_attr.csv at data/afad/

Once the face datasets are in place, run:

python prepare_dataset.py

This splits raw data into target .pt files under data/target/ and shadow .pt files under data/shadow/. The shadow datasets are used by the agent to train shadow models during attacks.
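The target/shadow split follows the standard shadow-model setup: the two pools must be disjoint so that shadow models imitate the target's behavior without sharing any of its training data. A generic sketch of such a split is below; the exact sizes, seeding, and file layout in prepare_dataset.py may differ.

```python
# Sketch of a deterministic, disjoint target/shadow split (illustrative;
# prepare_dataset.py's actual sizes and seeding may differ).
import random

def split_target_shadow(indices, target_frac=0.5, seed=0):
    """Deterministically split sample indices into disjoint target/shadow pools."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)   # fixed seed -> reproducible split
    cut = int(len(idx) * target_frac)
    return idx[:cut], idx[cut:]

target_idx, shadow_idx = split_target_shadow(range(10))
assert set(target_idx).isdisjoint(shadow_idx)  # no overlap between pools
```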

Step 2 — Train a target model

python targets/train_target_model.py \
    --dataset_name utkface \
    --model_name resnet18 \
    --size 5000 \
    --save_dir targets/assets \
    --num_epochs 300

This produces:

  • targets/assets/models/<dataset>_<model>_<size>_target_model_final.pth
  • targets/assets/datasets/<dataset>_<size>_target_train.pt
  • targets/assets/datasets/<dataset>_<size>_target_test.pt
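The naming scheme above can be captured with a small path-building helper; the fields correspond to the <dataset>, <model>, and <size> placeholders in the listed paths. This helper is illustrative, not code from the repository.

```python
# Illustrative helper mirroring the asset naming scheme listed above
# (not code from the repository).

def asset_paths(dataset, model, size, root="targets/assets"):
    """Build the asset file paths produced by train_target_model.py."""
    return {
        "model": f"{root}/models/{dataset}_{model}_{size}_target_model_final.pth",
        "train": f"{root}/datasets/{dataset}_{size}_target_train.pt",
        "test":  f"{root}/datasets/{dataset}_{size}_target_test.pt",
    }

paths = asset_paths("utkface", "resnet18", 5000)
print(paths["model"])
# targets/assets/models/utkface_resnet18_5000_target_model_final.pth
```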

Once assets are ready, run_exp.sh will copy them into the workspace and start the target service automatically.

Supported datasets: cifar10, stl10, utkface, celeba, afad
Supported models: cnn, resnet18, resnet50, xception

Notes:

  • attr_infer is only supported for face datasets (utkface, celeba, afad).
  • The model_steal task does not use the model argument (the target architecture is unknown in black-box stealing), but run_exp.sh still expects a placeholder in that position. Run it as: bash run_exp.sh <dataset> cnn model_steal.

Running Experiments

# Single task experiment
bash run_exp.sh <dataset> <model> <task>

# Examples
bash run_exp.sh cifar10 cnn mia
bash run_exp.sh celeba resnet18 attr_infer
bash run_exp.sh cifar10 resnet50 data_recon
bash run_exp.sh cifar10 cnn model_steal
bash run_exp.sh cifar10 resnet18 all_in_one
