HBLM

Hierarchical Bi-Directional Linear Feature Modulation for Hierarchical Text Classification.

This repository trains and evaluates FiLM-based hierarchical text classifiers. The easiest way to use it is to edit one of the run scripts in scripts/ and run it with bash.

What Is Included

Single-branch FiLM cascades:
- top_down: predicts from root levels toward leaf levels.
- bottom_up: predicts from leaf levels toward root levels.

Joint FiLM models:
- dual: trains top-down and bottom-up branches together.
- tri_head_diffcat: trains top-down, bottom-up, and DiffCat mix heads.

Evaluation scripts for:
- one single-branch checkpoint,
- an ensemble of separate top-down and bottom-up checkpoints,
- one joint checkpoint.

Main entrypoint: hblm_experiments.py

Run scripts: scripts/

Install

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Prepare Data

Datasets are not stored in this repository. Download each dataset from its original source, then convert it with the scripts in datasets/.

Prepared datasets are expected under:

data/<dataset_name>/...

Supported dataset_name values:

nyt
aapd
rcv1v2
bgc
wos

See DATA.md for dataset sources, conversion commands, and validation steps.
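
Before launching a run, it can help to confirm that the prepared directory is where the scripts expect it. The following is a minimal sketch that assumes only the data/<dataset_name>/ layout described above. The helper name dataset_dir is hypothetical and not part of this repository, and it does not inspect the files inside the directory (DATA.md covers validation).

```python
from pathlib import Path

# Dataset names listed as supported in this README.
SUPPORTED_DATASETS = ("nyt", "aapd", "rcv1v2", "bgc", "wos")


def dataset_dir(dataset_name: str, data_dir: str = "data") -> Path:
    """Return data_dir/dataset_name, or raise if the name or directory is invalid.

    Hypothetical helper for illustration; not part of the repository.
    """
    if dataset_name not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset_name: {dataset_name!r}")
    path = Path(data_dir) / dataset_name
    if not path.is_dir():
        raise FileNotFoundError(f"prepared dataset not found: {path}")
    return path
```

For example, dataset_dir("aapd") returns Path("data/aapd") once the converted AAPD files are in place, and raises FileNotFoundError otherwise.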

Run Scripts

Each script has a configuration block at the top. In normal use, edit that block, save the file, and run the script.

bash scripts/train_single.sh
bash scripts/train_joint.sh
bash scripts/eval_single.sh
bash scripts/eval_ensemble.sh
bash scripts/eval_joint.sh

Training writes checkpoints and result JSON files under outputs/ by default. Evaluation writes timestamped result JSON files under outputs/results/.

Train A Single Branch

Use scripts/train_single.sh to train one FiLM cascade direction.

Edit these settings first:

datasets=("aapd")
seeds=(1)
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"

Then run:

bash scripts/train_single.sh

Use cascade_direction="top_down" for root-to-leaf prediction, or cascade_direction="bottom_up" for leaf-to-root prediction.

To run several datasets or seeds in one script call, add them to the arrays:

datasets=("aapd" "wos")
seeds=(1 2 3)

The script will run every dataset/seed combination.
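
To see how many runs a given configuration implies, the dataset/seed grid can be enumerated in a few lines of Python. This is only an illustration of the combinations; the order in which the shell script actually visits them is an assumption.

```python
from itertools import product

datasets = ["aapd", "wos"]
seeds = [1, 2, 3]

# One training run per (dataset, seed) pair.
runs = list(product(datasets, seeds))
for dataset, seed in runs:
    print(f"dataset={dataset} seed={seed}")
```

With two datasets and three seeds this yields six runs, so training time scales with the product of the two array lengths.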

Train A Joint Model

Use scripts/train_joint.sh to train both hierarchy directions in one model.

Edit these settings first:

datasets=("aapd")
seeds=(1)
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"

Then run:

bash scripts/train_joint.sh

joint_variant controls the joint architecture:

dual: top-down + bottom-up branches.
tri_head_diffcat: top-down + bottom-up + DiffCat mix head.

joint_objective controls which logits receive training loss:

full: use the fused objective plus auxiliary branch losses.
aux_only: use only branch losses. This is only valid for dual.

The script forces joint_objective="full" when joint_variant="tri_head_diffcat", because full is the only supported objective for that variant.
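
The variant/objective rule above can be written as a small check. This is a sketch of the rule as described in this README, not the script's actual code; the function name resolve_joint_objective is hypothetical.

```python
VARIANTS = ("dual", "tri_head_diffcat")
OBJECTIVES = ("full", "aux_only")


def resolve_joint_objective(joint_variant: str, joint_objective: str) -> str:
    """Return the objective that would be used for a given variant.

    Hypothetical helper mirroring the rule stated in the README:
    tri_head_diffcat always trains with the full objective.
    """
    if joint_variant not in VARIANTS:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    if joint_objective not in OBJECTIVES:
        raise ValueError(f"unknown joint_objective: {joint_objective!r}")
    if joint_variant == "tri_head_diffcat":
        return "full"
    return joint_objective
```

Under this rule, requesting aux_only together with tri_head_diffcat silently becomes full, while dual keeps whichever objective was requested.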

Configure Training

The most important shared training settings are:

learning_rate=0.00003
epochs=65
batch_size=32
threshold=0.5
model_name="FacebookAI/roberta-large"

learning_rate: optimizer learning rate.
epochs: maximum number of training epochs. Early stopping may stop sooner.
batch_size: documents per training batch.
threshold: probability/logit threshold used for validation and test metrics.
model_name: Hugging Face encoder name or local model path.

For quick debugging, you can add --dataset_fraction directly to the Python command inside the script, for example:

--dataset_fraction=0.1 \

Configure FiLM

The training and evaluation scripts expose the same core FiLM settings:

classification_head_type="mlp"
propagation_method="sigmoid"
propagation_space="child_space"
film_scale=0.2
pooling="cls"

classification_head_type: per-level classification head. Options: mlp, wide_mlp, residual_mlp.
propagation_method: how previous hierarchy logits are transformed before FiLM conditioning. Options: sigmoid, multi_hot, logit, none.
propagation_space: where the conditioning signal is represented. Options: child_space, parent_space.
film_scale: strength of FiLM modulation.
pooling: encoder pooling strategy. Options: cls, mean, last, max.
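
As a rough illustration of what the propagation_method options could mean, the sketch below transforms the logits of the previous hierarchy level before they are used as a conditioning signal. The semantics are inferred from the option names only and are an assumption, as are the function names and the 0.5 cut-off for multi_hot; the repository's implementation may differ.

```python
import math
from typing import List, Optional


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def propagate(logits: List[float], method: str, cutoff: float = 0.5) -> Optional[List[float]]:
    """Transform previous-level logits into a conditioning signal (assumed semantics)."""
    if method == "sigmoid":
        # Soft signal: per-label probabilities.
        return [sigmoid(z) for z in logits]
    if method == "multi_hot":
        # Hard signal: 1.0 for labels at or above the cut-off, else 0.0.
        return [1.0 if sigmoid(z) >= cutoff else 0.0 for z in logits]
    if method == "logit":
        # Raw logits passed through unchanged.
        return list(logits)
    if method == "none":
        # No conditioning signal at all.
        return None
    raise ValueError(f"unknown propagation_method: {method!r}")
```

The practical difference is how much uncertainty the next level sees: sigmoid keeps graded confidence, multi_hot commits to a hard decision, and logit leaves the scores unbounded.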

The scripts also pass these flags by default:

--project_embedding
--detach_parent_signal

Keep them enabled unless you are intentionally reproducing a different ablation.

Configure Joint Losses

scripts/train_joint.sh includes these joint-specific settings:

joint_eval_view="fused"
joint_loss_fused=0
joint_loss_td=0.5
joint_loss_bu=0.5
joint_loss_mix=1.0

joint_eval_view: which view to evaluate after training. Options: fused, td, bu, mix, all.
joint_loss_fused: loss weight for fused logits.
joint_loss_td: loss weight for top-down branch logits.
joint_loss_bu: loss weight for bottom-up branch logits.
joint_loss_mix: loss weight for the DiffCat mix head. Used by tri_head_diffcat.

For dual, the mix view and joint_loss_mix do not apply, because there is no DiffCat mix head. For tri_head_diffcat, at least one loss weight must be non-zero.
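
One way to read these weights is as coefficients of a weighted sum over per-view losses. The sketch below assumes exactly that; whether the repository combines the losses as a plain weighted sum is an assumption, and combine_joint_losses and the example loss values are hypothetical.

```python
from typing import Dict


def combine_joint_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-view losses (assumed combination rule).

    Views with a zero weight, or without a computed loss, are skipped.
    """
    active = {view: w for view, w in weights.items() if view in losses and w != 0}
    if not active:
        raise ValueError("at least one loss weight must be non-zero")
    return sum(w * losses[view] for view, w in active.items())


# Default weights from scripts/train_joint.sh; the loss values are made up.
weights = {"fused": 0.0, "td": 0.5, "bu": 0.5, "mix": 1.0}
losses = {"fused": 0.9, "td": 0.8, "bu": 0.6, "mix": 0.7}
total = combine_joint_losses(losses, weights)
```

With the default weights, the fused logits contribute nothing to the total, and the mix head carries as much weight as the two branch losses combined.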

Evaluate One Single-Branch Checkpoint

Use scripts/eval_single.sh.

Edit:

checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
dataset="aapd"
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"

Then run:

bash scripts/eval_single.sh

Important: cascade_direction, model_name, and the FiLM settings in the evaluation script must match the checkpoint you are loading.

Evaluate A Top-Down + Bottom-Up Ensemble

Use scripts/eval_ensemble.sh.

Edit:

top_down_checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
bottom_up_checkpoint="outputs/checkpoints/aapd_single_bottom_up_roberta-large_seed1.pt"
dataset="aapd"
model_name="FacebookAI/roberta-large"
ensemble_weight=0.5

Then run:

bash scripts/eval_ensemble.sh

ensemble_weight is the top-down logit weight. The bottom-up weight is 1 - ensemble_weight.
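
The weighting rule can be written out directly. The sketch below applies it to two plain lists of per-label logits; the function name ensemble_logits is hypothetical, and restricting ensemble_weight to the range [0, 1] is an assumption.

```python
from typing import List


def ensemble_logits(
    td_logits: List[float],
    bu_logits: List[float],
    ensemble_weight: float = 0.5,
) -> List[float]:
    """Blend per-label logits: weight w on top-down, 1 - w on bottom-up."""
    if not 0.0 <= ensemble_weight <= 1.0:
        raise ValueError("ensemble_weight is assumed to lie in [0, 1]")
    if len(td_logits) != len(bu_logits):
        raise ValueError("both branches must score the same label set")
    w = ensemble_weight
    return [w * td + (1.0 - w) * bu for td, bu in zip(td_logits, bu_logits)]
```

Setting ensemble_weight=1.0 reproduces the top-down logits and 0.0 reproduces the bottom-up logits, which is a quick way to sanity-check an ensemble run against the two single-branch evaluations.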

Evaluate A Joint Checkpoint

Use scripts/eval_joint.sh.

Edit:

checkpoint="outputs/checkpoints/aapd_joint_tri_head_diffcat_roberta-large_seed1.pt"
dataset="aapd"
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
posthoc_repath=false

Then run:

bash scripts/eval_joint.sh

Important: joint_variant, joint_objective, model_name, and the FiLM settings must match the checkpoint.

For tri_head_diffcat, the script also exposes fusion weights used during post-hoc evaluation:

tri_fuse_td=1.0
tri_fuse_bu=1.0
tri_fuse_mix=1.0

Set posthoc_repath=true to additionally report ancestor-closed threshold metrics.
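
"Ancestor-closed" is read here as: whenever a label is predicted, all of its ancestors in the hierarchy are predicted too. The sketch below shows that closure on a toy hierarchy. It illustrates the concept only; how posthoc_repath is implemented in the repository is not documented here, and the function name and label names are hypothetical.

```python
from typing import Dict, Optional, Set


def ancestor_closure(predicted: Set[str], parent: Dict[str, Optional[str]]) -> Set[str]:
    """Add every ancestor of each predicted label to the prediction set."""
    closed = set(predicted)
    for label in predicted:
        node = parent.get(label)
        # Walk toward the root; stop early once an ancestor is already present,
        # since its own ancestors are handled by the walk that added or owns it.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed


# Toy hierarchy: child -> parent, with None marking a root label.
parent = {"cs": None, "cs.LG": "cs", "math": None, "math.ST": "math"}
closed = ancestor_closure({"cs.LG"}, parent)
```

Here a prediction of cs.LG alone becomes {cs, cs.LG} after closure, so metrics computed on the closed set never credit a child without its parent.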

Outputs

Training creates:

outputs/checkpoints/
outputs/results/

Single-branch checkpoint names follow:

outputs/checkpoints/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.json

Joint checkpoint names follow:

outputs/checkpoints/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.pt
outputs/checkpoints/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.json
outputs/results/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.json
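
When wiring training output into the evaluation scripts, the checkpoint paths can be built from the same fields. The sketch below follows the patterns above and assumes that model_tag is the last path component of model_name (consistent with roberta-large in the example paths of this README); the helper names are hypothetical.

```python
def model_tag(model_name: str) -> str:
    """Assumed rule: the tag is the last path component of the model name."""
    return model_name.rstrip("/").split("/")[-1]


def single_checkpoint_path(dataset: str, cascade_direction: str, model_name: str,
                           seed: int, output_dir: str = "outputs") -> str:
    """Checkpoint path for a single-branch run, following the README pattern."""
    tag = model_tag(model_name)
    return f"{output_dir}/checkpoints/{dataset}_single_{cascade_direction}_{tag}_seed{seed}.pt"


def joint_checkpoint_path(dataset: str, joint_variant: str, joint_objective: str,
                          model_name: str, seed: int, output_dir: str = "outputs") -> str:
    """Checkpoint path for a joint run; only dual includes the objective."""
    tag = model_tag(model_name)
    if joint_variant == "dual":
        stem = f"{dataset}_joint_dual_{joint_objective}_{tag}_seed{seed}"
    elif joint_variant == "tri_head_diffcat":
        stem = f"{dataset}_joint_tri_head_diffcat_{tag}_seed{seed}"
    else:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    return f"{output_dir}/checkpoints/{stem}.pt"
```

Note that the objective appears in the file name only for dual, so a dual model trained with full and one trained with aux_only do not overwrite each other.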

Explicit evaluation creates timestamped files:

outputs/results/eval_single_{dataset}_{cascade_direction}_{model_tag}_{timestamp}.json
outputs/results/eval_ensemble_{dataset}_{timestamp}.json
outputs/results/eval_joint_{dataset}_{joint_variant}_{model_tag}_{timestamp}.json

Joint result JSON files also receive a compact .csv sidecar with macro/micro F1 rows.

Short CLI Usage

The scripts are the recommended interface. For direct CLI usage, inspect all available options with:

python3 hblm_experiments.py --help

Minimal single-branch example:

python3 hblm_experiments.py \
  --train_mode single \
  --cascade_direction top_down \
  --dataset_name aapd \
  --model_name FacebookAI/roberta-large \
  --project_embedding \
  --detach_parent_signal

Minimal joint example:

python3 hblm_experiments.py \
  --train_mode joint \
  --joint_variant tri_head_diffcat \
  --joint_objective full \
  --dataset_name aapd \
  --model_name FacebookAI/roberta-large \
  --project_embedding \
  --detach_parent_signal
