Normalizing flows for jet topic modelling

Paper: M. J. Dolan and A. Ore, TopicFlow: Disentangling quark and gluon jets with normalizing flows, arXiv:2211.16053 [hep-ph]

Dependencies

The code has been tested with the following package versions:

python 3.9
tensorflow 2.10.1
tensorflow-probability 0.18.0
numpy 1.23.0
scikit-learn 1.1.2
scipy 1.8.1
energyflow 1.3.2

Usage

Producing EFP datasets

First download the EnergyFlow Quark/Gluon dataset, either using energyflow.qg_jets.load, or from Zenodo.
The script write-efps.py can then be used to calculate EFP sets from the data:

$ ./write-efps.py [--max_degree 3] [--paths /path/to/dataset/*.npz]

It may take a long time to convert the entire dataset, so it is recommended to parallelize across files if possible.

Training a flow model

The script train-flow.py trains quark and gluon jet topic flows on mixed datasets with given quark fractions:

$ ./train-flow.py --purities 0.3 0.7 --savedir outputs

The default settings run the FFJORD model. The spline architecture used for generative classification in the paper can be run with:

$ ./train-flow.py --purities 0.3 0.7 --arch rqs --num_transform_layers 16 --num_residual_blocks 1 --hidden_dim 128 --learning_rate 1e-4 --warmup_epochs 5 --dropout 0.2

The script produces the following outputs:
- model_config.p: Pickled dictionary containing the flow config.
- qweights, gweights: Network weights for each flow, which can be restored with topicflow.utils.load_topic_flows.
- wassersteins.p: Pickled Wasserstein distances to truth distributions for each EFP, stored as [quark_dists, gluon_dists].
- metrics.p: Pickled generative classification metrics, stored as {'acc': acc, 'auc': auc}.
- summary: Tensorboard summary directory.

Training a classifier

A simple fully-connected classifier can be trained on mixed datasets using train-dnn.py:

$ ./train-dnn.py --purities 0.3 0.7 --savedir outputs

The script produces the following outputs:
- weights: Network weights for the model, which can be restored with tensorflow.keras.Model.load_weights.
- metrics.p: Pickled generative classification metrics, stored as [loss, auc, acc].
- summary: Tensorboard summary directory.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
topicflow		topicflow
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
train-dnn.py		train-dnn.py
train-flow.py		train-flow.py
write-efps.py		write-efps.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Normalizing flows for jet topic modelling

Dependencies

Usage

Producing EFP datasets

Training a flow model

Training a classifier

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Normalizing flows for jet topic modelling

Dependencies

Usage

Producing EFP datasets

Training a flow model

Training a classifier

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages