FloHauss/HBLM

HBLM

Hierarchical Bi-Directional Linear Feature Modulation for Hierarchical Text Classification.

This repository trains and evaluates FiLM-based hierarchical text classifiers. The easiest way to use it is to edit one of the run scripts in scripts/ and run it with bash.

What Is Included

  • Single-branch FiLM cascades:
    • top_down: predicts from root levels toward leaf levels.
    • bottom_up: predicts from leaf levels toward root levels.
  • Joint FiLM models:
    • dual: trains top-down and bottom-up branches together.
    • tri_head_diffcat: trains top-down, bottom-up, and DiffCat mix heads.
  • Evaluation scripts for:
    • one single-branch checkpoint,
    • an ensemble of separate top-down and bottom-up checkpoints,
    • one joint checkpoint.

Main entrypoint: hblm_experiments.py

Run scripts: scripts/

Install

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Prepare Data

Datasets are not stored in this repository. Download each dataset from its original source, then convert it with the scripts in datasets/.

Prepared datasets are expected under:

data/<dataset_name>/...

Supported dataset_name values:

nyt
aapd
rcv1v2
bgc
wos

See DATA.md for dataset sources, conversion commands, and validation steps.
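Before launching a run, it can help to confirm that a prepared dataset directory is where the scripts expect it. The following is a minimal sketch and not part of the repository: the helper name `dataset_dir` is hypothetical, and it relies only on the `data/<dataset_name>/` layout and the dataset names listed in this section.

```python
from pathlib import Path

# Dataset names listed in this README.
SUPPORTED_DATASETS = ("nyt", "aapd", "rcv1v2", "bgc", "wos")


def dataset_dir(dataset_name: str, data_dir: str = "data") -> Path:
    """Return data_dir/<dataset_name>, rejecting names this README does not list."""
    if dataset_name not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset_name: {dataset_name!r}")
    return Path(data_dir) / dataset_name


if __name__ == "__main__":
    # Report which prepared datasets are present under data/.
    for name in SUPPORTED_DATASETS:
        path = dataset_dir(name)
        status = "found" if path.is_dir() else "missing"
        print(f"{path}: {status}")
```

This only checks that the directory exists. The validation steps in DATA.md remain the reference for checking the converted files themselves.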

Run Scripts

Each script has a configuration block at the top. In normal use, edit that block, save the file, and run the script.

bash scripts/train_single.sh
bash scripts/train_joint.sh
bash scripts/eval_single.sh
bash scripts/eval_ensemble.sh
bash scripts/eval_joint.sh

Training writes checkpoints and result JSON files under outputs/ by default. Evaluation writes timestamped result JSON files under outputs/results/.

Train A Single Branch

Use scripts/train_single.sh to train one FiLM cascade direction.

Edit these settings first:

datasets=("aapd")
seeds=(1)
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"

Then run:

bash scripts/train_single.sh

Use cascade_direction="top_down" for root-to-leaf prediction, or cascade_direction="bottom_up" for leaf-to-root prediction.

To run several datasets or seeds in one script call, add them to the arrays:

datasets=("aapd" "wos")
seeds=(1 2 3)

The script will run every dataset/seed combination.
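To estimate how many runs a configuration will launch, the dataset/seed grid can be enumerated as a Cartesian product. This is an illustrative sketch in Python rather than the script's own bash loop, and the iteration order shown (datasets outermost) is an assumption.

```python
from itertools import product

datasets = ["aapd", "wos"]
seeds = [1, 2, 3]

# One run per (dataset, seed) pair; 2 datasets x 3 seeds gives 6 runs.
runs = list(product(datasets, seeds))

for dataset, seed in runs:
    print(f"dataset={dataset} seed={seed}")
```

Because each run trains a full model, the total grows multiplicatively with the length of both arrays.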

Train A Joint Model

Use scripts/train_joint.sh to train both hierarchy directions in one model.

Edit these settings first:

datasets=("aapd")
seeds=(1)
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"

Then run:

bash scripts/train_joint.sh

joint_variant controls the joint architecture:

  • dual: top-down + bottom-up branches.
  • tri_head_diffcat: top-down + bottom-up + DiffCat mix head.

joint_objective controls which logits receive training loss:

  • full: use the fused objective plus auxiliary branch losses.
  • aux_only: use only branch losses. This is only valid for dual.

The script forces joint_objective="full" when joint_variant="tri_head_diffcat", because full is the only supported objective for that variant.
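The compatibility rule between joint_variant and joint_objective can be summarized as a small resolution function. This is a sketch of the documented behavior only: the function name `resolve_joint_objective` is hypothetical, and the actual check lives in the bash script.

```python
VALID_VARIANTS = ("dual", "tri_head_diffcat")
VALID_OBJECTIVES = ("full", "aux_only")


def resolve_joint_objective(joint_variant: str, joint_objective: str) -> str:
    """Return the objective that would actually be used for a given variant."""
    if joint_variant not in VALID_VARIANTS:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    if joint_objective not in VALID_OBJECTIVES:
        raise ValueError(f"unknown joint_objective: {joint_objective!r}")
    # aux_only is only valid for dual, so tri_head_diffcat always resolves to full.
    if joint_variant == "tri_head_diffcat":
        return "full"
    return joint_objective


print(resolve_joint_objective("dual", "aux_only"))
print(resolve_joint_objective("tri_head_diffcat", "aux_only"))
```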

Configure Training

The most important shared training settings are:

learning_rate=0.00003
epochs=65
batch_size=32
threshold=0.5
model_name="FacebookAI/roberta-large"

  • learning_rate: optimizer learning rate.
  • epochs: maximum number of training epochs. Early stopping may stop sooner.
  • batch_size: documents per training batch.
  • threshold: probability/logit threshold used for validation and test metrics.
  • model_name: Hugging Face encoder name or local model path.

For quick debugging, you can add --dataset_fraction directly to the Python command inside the script, for example:

--dataset_fraction=0.1 \

Configure FiLM

The training and evaluation scripts expose the same core FiLM settings:

classification_head_type="mlp"
propagation_method="sigmoid"
propagation_space="child_space"
film_scale=0.2
pooling="cls"

  • classification_head_type: per-level classification head. Options: mlp, wide_mlp, residual_mlp.
  • propagation_method: how previous hierarchy logits are transformed before FiLM conditioning. Options: sigmoid, multi_hot, logit, none.
  • propagation_space: where the conditioning signal is represented. Options: child_space, parent_space.
  • film_scale: strength of FiLM modulation.
  • pooling: encoder pooling strategy. Options: cls, mean, last, max.
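This README names the propagation_method options without defining them. The sketch below shows one plausible reading of how the logits of the previous hierarchy level could be turned into a conditioning signal. Every mapping here is an assumption, in particular the 0.5 probability cutoff for multi_hot and the use of `None` to mean "no conditioning"; the function `propagate` is hypothetical and is not taken from hblm_experiments.py.

```python
import math


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def propagate(logits, method="sigmoid"):
    """Transform previous-level logits into a conditioning signal (assumed semantics)."""
    if method == "sigmoid":
        # Soft signal: per-label probabilities.
        return [_sigmoid(z) for z in logits]
    if method == "multi_hot":
        # Hard signal: 1.0 where the probability exceeds 0.5, i.e. where the logit is positive.
        return [1.0 if z > 0.0 else 0.0 for z in logits]
    if method == "logit":
        # Raw logits passed through unchanged.
        return list(logits)
    if method == "none":
        # No conditioning signal at all.
        return None
    raise ValueError(f"unknown propagation_method: {method!r}")


previous_level_logits = [-2.0, 0.0, 3.0]
for method in ("sigmoid", "multi_hot", "logit", "none"):
    print(method, propagate(previous_level_logits, method))
```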

The scripts also pass these flags by default:

--project_embedding
--detach_parent_signal

Keep them enabled unless you are intentionally reproducing a different ablation.

Configure Joint Losses

scripts/train_joint.sh includes these joint-specific settings:

joint_eval_view="fused"
joint_loss_fused=0
joint_loss_td=0.5
joint_loss_bu=0.5
joint_loss_mix=1.0

  • joint_eval_view: which view to evaluate after training. Options: fused, td, bu, mix, all.
  • joint_loss_fused: loss weight for fused logits.
  • joint_loss_td: loss weight for top-down branch logits.
  • joint_loss_bu: loss weight for bottom-up branch logits.
  • joint_loss_mix: loss weight for the DiffCat mix head. Used by tri_head_diffcat.

For dual, the mix view and joint_loss_mix are not available. For tri_head_diffcat, at least one loss weight must be nonzero.
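The README does not spell out how the per-view losses are combined. Assuming a plain weighted sum, the role of the four weights can be sketched as follows. The function `combine_joint_losses`, the dictionary keys, and the loss values are all hypothetical and chosen only for illustration.

```python
def combine_joint_losses(losses, weights):
    """Weighted sum of per-view losses (assumed combination rule)."""
    active = {view: weights.get(view, 0.0) for view in losses}
    # At least one weight must be nonzero, otherwise there is nothing to optimize.
    if all(w == 0 for w in active.values()):
        raise ValueError("at least one loss weight must be nonzero")
    return sum(active[view] * losses[view] for view in losses)


# Default weights from the settings block in this section.
weights = {"fused": 0.0, "td": 0.5, "bu": 0.5, "mix": 1.0}

# Made-up per-view loss values for one training step.
losses = {"fused": 2.0, "td": 1.0, "bu": 0.5, "mix": 0.25}

total = combine_joint_losses(losses, weights)
print(total)
```

With the default weights the fused term contributes nothing, so under this assumption training is driven by the two branch losses and the mix head.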

Evaluate One Single-Branch Checkpoint

Use scripts/eval_single.sh.

Edit:

checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
dataset="aapd"
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"

Then run:

bash scripts/eval_single.sh

Important: cascade_direction, model_name, and the FiLM settings in the evaluation script must match the checkpoint you are loading.

Evaluate A Top-Down + Bottom-Up Ensemble

Use scripts/eval_ensemble.sh.

Edit:

top_down_checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
bottom_up_checkpoint="outputs/checkpoints/aapd_single_bottom_up_roberta-large_seed1.pt"
dataset="aapd"
model_name="FacebookAI/roberta-large"
ensemble_weight=0.5

Then run:

bash scripts/eval_ensemble.sh

ensemble_weight is the top-down logit weight. The bottom-up weight is 1 - ensemble_weight.
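The weighting described here is a convex combination of the two branches' logits. A minimal sketch on plain Python lists follows; the function name `ensemble_logits` is hypothetical, and restricting ensemble_weight to the range [0, 1] is an assumption.

```python
def ensemble_logits(td_logits, bu_logits, ensemble_weight=0.5):
    """Blend top-down and bottom-up logits label by label."""
    if not 0.0 <= ensemble_weight <= 1.0:
        raise ValueError("ensemble_weight is assumed to lie in [0, 1]")
    if len(td_logits) != len(bu_logits):
        raise ValueError("both branches must score the same labels")
    w = ensemble_weight
    # ensemble_weight scales the top-down branch; the bottom-up branch gets the remainder.
    return [w * td + (1.0 - w) * bu for td, bu in zip(td_logits, bu_logits)]


td = [2.0, -1.0]
bu = [0.0, 1.0]
print(ensemble_logits(td, bu, ensemble_weight=0.5))
print(ensemble_logits(td, bu, ensemble_weight=0.75))
```

Setting ensemble_weight to 1.0 reduces the ensemble to the top-down checkpoint alone, and 0.0 to the bottom-up checkpoint alone.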

Evaluate A Joint Checkpoint

Use scripts/eval_joint.sh.

Edit:

checkpoint="outputs/checkpoints/aapd_joint_tri_head_diffcat_roberta-large_seed1.pt"
dataset="aapd"
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
posthoc_repath=false

Then run:

bash scripts/eval_joint.sh

Important: joint_variant, joint_objective, model_name, and the FiLM settings must match the checkpoint.

For tri_head_diffcat, the script also exposes fusion weights used during post-hoc evaluation:

tri_fuse_td=1.0
tri_fuse_bu=1.0
tri_fuse_mix=1.0

Set posthoc_repath=true to additionally report ancestor-closed threshold metrics.
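A prediction set is ancestor-closed when every predicted label also has all of its ancestors predicted. The README does not describe how posthoc_repath computes this, so the sketch below only illustrates the concept. The function `ancestor_closure`, the `parent` mapping, and the label names are hypothetical.

```python
def ancestor_closure(predicted, parent):
    """Add every ancestor of each predicted label to the prediction set."""
    closed = set()
    for label in predicted:
        node = label
        # Walk up toward the root; stop early once we reach an already-closed path.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed


# Toy two-level hierarchy: each label maps to its parent, roots map to None.
parent = {"cs": None, "math": None, "cs.LG": "cs", "cs.CL": "cs", "math.ST": "math"}

thresholded = {"cs.LG", "math.ST"}
print(sorted(ancestor_closure(thresholded, parent)))
```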

Outputs

Training creates:

outputs/checkpoints/
outputs/results/

Single-branch checkpoint names follow:

outputs/checkpoints/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.json

Joint checkpoint names follow:

outputs/checkpoints/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.pt
outputs/checkpoints/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.json
outputs/results/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.json
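When filling in the checkpoint setting of an evaluation script, the path can be derived from the training settings. The sketch below follows the naming patterns listed in this section. It assumes that model_tag is the last path component of model_name, which matches the example paths in this README (FacebookAI/roberta-large becomes roberta-large) but is not confirmed for other models. All function names are hypothetical.

```python
def model_tag(model_name: str) -> str:
    """Assumed rule: keep only the last path component of the model name."""
    return model_name.rstrip("/").split("/")[-1]


def single_checkpoint_path(dataset, cascade_direction, model_name, seed,
                           output_dir="outputs"):
    """Path of a single-branch checkpoint, following the documented pattern."""
    tag = model_tag(model_name)
    stem = f"{dataset}_single_{cascade_direction}_{tag}_seed{seed}"
    return f"{output_dir}/checkpoints/{stem}.pt"


def joint_checkpoint_path(dataset, joint_variant, joint_objective, model_name,
                          seed, output_dir="outputs"):
    """Path of a joint checkpoint; only dual includes the objective in the name."""
    tag = model_tag(model_name)
    if joint_variant == "dual":
        stem = f"{dataset}_joint_dual_{joint_objective}_{tag}_seed{seed}"
    elif joint_variant == "tri_head_diffcat":
        stem = f"{dataset}_joint_tri_head_diffcat_{tag}_seed{seed}"
    else:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    return f"{output_dir}/checkpoints/{stem}.pt"


print(single_checkpoint_path("aapd", "top_down", "FacebookAI/roberta-large", 1))
print(joint_checkpoint_path("aapd", "tri_head_diffcat", "full",
                            "FacebookAI/roberta-large", 1))
```

The matching result JSON files use the same stem under outputs/results/ with a .json extension.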

Explicit evaluation creates timestamped files:

outputs/results/eval_single_{dataset}_{cascade_direction}_{model_tag}_{timestamp}.json
outputs/results/eval_ensemble_{dataset}_{timestamp}.json
outputs/results/eval_joint_{dataset}_{joint_variant}_{model_tag}_{timestamp}.json

Joint result JSON files also receive a compact .csv sidecar with macro/micro F1 rows.

Short CLI Usage

The scripts are the recommended interface. For direct CLI usage, inspect all available options with:

python3 hblm_experiments.py --help

Minimal single-branch example:

python3 hblm_experiments.py \
  --train_mode single \
  --cascade_direction top_down \
  --dataset_name aapd \
  --model_name FacebookAI/roberta-large \
  --project_embedding \
  --detach_parent_signal

Minimal joint example:

python3 hblm_experiments.py \
  --train_mode joint \
  --joint_variant tri_head_diffcat \
  --joint_objective full \
  --dataset_name aapd \
  --model_name FacebookAI/roberta-large \
  --project_embedding \
  --detach_parent_signal

About

Official implementation of HBLM: Hierarchical Bidirectional Level-Wise Feature Modulation for Hierarchical Text Classification.
