Quantifying Aleatoric Uncertainty of In-Context Learning via Self-Function Vectors

Official code for the ACL 2026 paper "Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence" by Jinseok Chung, Minkyoung Song*, Hyunji Jung*, and Namhoon Lee (POSTECH). (*Equal contribution.)

📄 Paper: arXiv:2606.19353

Setup

conda env create -f environment.yml
conda activate fv

Data

python -c "import nltk; nltk.download('wordnet')"   # once
python prepare_data.py                              # build all datasets
# or a subset:
python prepare_data.py --only wordnet moons ag_news emotion gsm8k hellaswag

WordNetMCQ and moons are constructed locally; AG News / Emotion / GSM8K / HellaSwag are downloaded from the Hugging Face Hub and reformatted. Sources and licenses are listed in dataset_files/README.md.

Reproducing the experiments

The pipeline has two stages: select causal heads once, then run the self-function-vector experiments on top of them.

# 1) Causal head selection: CIE analysis -> top_heads.json + mean_head_act.pt
bash scripts/01_compute_cie.sh

# 2) Self-function-vector uncertainty decomposition (also runs the
#    Total-entropy and function-vector baselines)
bash scripts/02_run_self_fv.sh
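For orientation, the total-entropy baseline named in stage 2 can be pictured as the Shannon entropy of the model's predictive distribution over candidate answers. The sketch below is a generic illustration under that assumption, not the code in main_icl/; the helper names softmax and total_entropy are hypothetical.

```python
import math

def softmax(logits):
    """Turn a list of answer logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def total_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical logits for a four-way multiple-choice query.
probs = softmax([2.0, 0.5, 0.5, -1.0])
entropy = total_entropy(probs)
```

Entropy is 0 when all mass sits on one answer and log(K) when K answers are equally likely, so larger values indicate a less confident prediction.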

Each script exposes MODEL, DATA, ANSWER_TYPE, NUM_SHOTS, etc. as environment variables, e.g.:

MODEL=meta-llama/Llama-2-13b-hf DATA=ag_news ANSWER_TYPE=generation NUM_LAYERS=40 \
  bash scripts/01_compute_cie.sh

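Models in the paper: LLaMA2-7B/13B/70B, Qwen2.5-7B, Mistral-7B. Set NUM_LAYERS to the model's depth (32 / 40 / 80 for LLaMA2-7B / 13B / 70B; for the other models, use the layer count given in the model's configuration). The intervention layer defaults to roughly one third of the depth.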
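The environment-variable overrides can also be assembled from Python when sweeping over models. The sketch below is a minimal illustration: the depths 32 / 40 / 80 are those of LLaMA2-7B / 13B / 70B, the 7B and 70B Hub IDs are filled in by analogy with the 13B ID used above, and the helpers cie_env, run_cie, and approx_intervention_layer are hypothetical. In particular, floor division is only one plausible reading of "~1/3 of the depth"; the scripts may round differently.

```python
import os
import subprocess

# Decoder depths of the LLaMA2 checkpoints (the 32 / 40 / 80 from the text).
LLAMA2_DEPTH = {
    "meta-llama/Llama-2-7b-hf": 32,
    "meta-llama/Llama-2-13b-hf": 40,
    "meta-llama/Llama-2-70b-hf": 80,
}

def cie_env(model, data, answer_type, base_env=None):
    """Return a copy of the environment with the script variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        MODEL=model,
        DATA=data,
        ANSWER_TYPE=answer_type,
        NUM_LAYERS=str(LLAMA2_DEPTH[model]),  # env values must be strings
    )
    return env

def approx_intervention_layer(num_layers):
    """Assumption: '~1/3 of the depth' read as floor division."""
    return num_layers // 3

def run_cie(env):
    """Launch stage 1 with the given environment (run from the repo root)."""
    subprocess.run(["bash", "scripts/01_compute_cie.sh"], env=env, check=True)
```

Calling run_cie(cie_env("meta-llama/Llama-2-13b-hf", "ag_news", "generation")) from the repository root would then be equivalent to the shell example above.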

Repository layout

main_icl/                 Core paper code
  main.py                 Main entry point: CIE head scoring (--cie 1) and the self-FV
                          experiment (--use_self_fv 1) with baselines (FV / total entropy)
  instruction.py          Prompt templates / instructions
  utils/                  prompt, data, model, inference, intervention, metrics
select_top_heads.py       Aggregate per-layer CIE tensors -> top_heads.json (head selection)
prepare_data.py           Reconstruct all task datasets into dataset_files/icl/
generate_ood_wordnet.py   OOD-query dataset generation (used by prepare_data.py)
dataset_files/            Dataset documentation; icl/ is generated, not committed
scripts/                  Reproduction entry points
environment.yml           Conda environment

Task data (dataset_files/icl/) and experiment outputs (results*/, wandb/, *.pt, figures) are git-ignored and regenerated by prepare_data.py / the pipeline.
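As a rough picture of the head-selection step (select_top_heads.py in the layout above), the sketch below ranks (layer, head) pairs by a CIE score and keeps the k highest. It is an illustration only: the nested-list input, the function names, and the JSON fields are assumptions, and the actual aggregation and the schema of top_heads.json may differ.

```python
import json

def select_top_heads(cie, k):
    """cie[layer][head] is a score; return the k best (layer, head, score) triples."""
    scored = [
        (layer, head, score)
        for layer, row in enumerate(cie)
        for head, score in enumerate(row)
    ]
    # Highest score first; ties keep their original (layer, head) order.
    scored.sort(key=lambda t: t[2], reverse=True)
    return scored[:k]

def to_json(top):
    """Serialize the selection; the field names here are hypothetical."""
    return json.dumps(
        [{"layer": l, "head": h, "cie": s} for l, h, s in top], indent=2
    )

# Toy scores for a 2-layer, 2-head model.
toy_cie = [[0.1, 0.5], [0.9, 0.2]]
top = select_top_heads(toy_cie, k=2)
```

With the toy scores, the selection contains head 0 of layer 1 followed by head 1 of layer 0.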

Citation

@inproceedings{chung2026quantifying,
  title     = {Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence},
  author    = {Chung, Jinseok and Song, Minkyoung and Jung, Hyunji and Lee, Namhoon},
  booktitle = {Proceedings of the 2026 Annual Meeting of the Association for Computational Linguistics (ACL)},
  year      = {2026},
  eprint    = {2606.19353},
  archivePrefix = {arXiv},
  url       = {https://arxiv.org/abs/2606.19353}
}

Acknowledgments

We are grateful to the authors of the following repositories, which we referred to while developing this work:
