Hierarchical Bi-Directional Linear Feature Modulation for Hierarchical Text Classification.
This repository trains and evaluates FiLM-based hierarchical text classifiers.
The easiest way to use it is to edit one of the run scripts in scripts/ and
run it with bash.
- Single-branch FiLM cascades:
- top_down: predicts from root levels toward leaf levels.
- bottom_up: predicts from leaf levels toward root levels.
- Joint FiLM models:
- dual: trains top-down and bottom-up branches together.
- tri_head_diffcat: trains top-down, bottom-up, and DiffCat mix heads.
- Evaluation scripts for:
- one single-branch checkpoint,
- an ensemble of separate top-down and bottom-up checkpoints,
- one joint checkpoint.
Main entrypoint: hblm_experiments.py
Run scripts: scripts/
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Datasets are not stored in this repository. Download each dataset from its
original source, then convert it with the scripts in datasets/.
Prepared datasets are expected under:
data/<dataset_name>/...
Supported dataset_name values:
nyt
aapd
rcv1v2
bgc
wos
See DATA.md for dataset sources, conversion commands, and validation steps.
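As a quick sanity check before training, the expected layout can be verified with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, and it only checks that a directory exists for each supported name, not that its contents are valid.

```python
from pathlib import Path

# Dataset names listed as supported in this README.
SUPPORTED_DATASETS = ("nyt", "aapd", "rcv1v2", "bgc", "wos")


def missing_datasets(data_dir="data", names=SUPPORTED_DATASETS):
    """Return the dataset names that have no directory under data_dir.

    Hypothetical helper: it checks only for data/<dataset_name>/ and does
    not inspect the files inside.
    """
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_dir()]


# Lists the datasets that still need to be downloaded and converted.
print(missing_datasets())
```

The validation steps in DATA.md remain the authoritative check; this only catches a missing or misnamed directory early.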
Each script has a configuration block at the top. In normal use, edit that block, save the file, and run the script.
bash scripts/train_single.sh
bash scripts/train_joint.sh
bash scripts/eval_single.sh
bash scripts/eval_ensemble.sh
bash scripts/eval_joint.sh
Training writes checkpoints and result JSON files under outputs/ by default.
Evaluation writes timestamped result JSON files under outputs/results/.
Use scripts/train_single.sh to train one FiLM cascade direction.
Edit these settings first:
datasets=("aapd")
seeds=(1)
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"
Then run:
bash scripts/train_single.sh
Use cascade_direction="top_down" for root-to-leaf prediction, or
cascade_direction="bottom_up" for leaf-to-root prediction.
To run several datasets or seeds in one script call, add them to the arrays:
datasets=("aapd" "wos")
seeds=(1 2 3)
The script runs every dataset/seed combination.
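To see how many runs a given configuration produces, the dataset/seed grid can be enumerated in Python. This is only an illustration of the combinations; the loop order (datasets outer, seeds inner) is an assumption and may differ from what the script does.

```python
from itertools import product

datasets = ["aapd", "wos"]
seeds = [1, 2, 3]

# One training run per (dataset, seed) pair.
# Assumption: datasets vary in the outer loop, seeds in the inner loop.
runs = list(product(datasets, seeds))

for dataset, seed in runs:
    print(f"dataset={dataset} seed={seed}")

print(f"total runs: {len(runs)}")
```

With two datasets and three seeds this gives six runs, so training time grows with the product of the two array lengths.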
Use scripts/train_joint.sh to train both hierarchy directions in one model.
Edit these settings first:
datasets=("aapd")
seeds=(1)
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
data_dir="data"
output_dir="outputs"
Then run:
bash scripts/train_joint.sh
joint_variant controls the joint architecture:
- dual: top-down + bottom-up branches.
- tri_head_diffcat: top-down + bottom-up + DiffCat mix head.
joint_objective controls which logits receive training loss:
- full: use the fused objective plus auxiliary branch losses.
- aux_only: use only branch losses. This is only valid for dual.
The script forces joint_objective="full" when
joint_variant="tri_head_diffcat", because that is the supported configuration.
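The rule can be written out as a small function. This is a sketch of the behaviour described above, not the script's actual code; `resolve_joint_objective` is a hypothetical name.

```python
VALID_VARIANTS = ("dual", "tri_head_diffcat")
VALID_OBJECTIVES = ("full", "aux_only")


def resolve_joint_objective(joint_variant, joint_objective):
    """Return the objective that would actually be used for a variant.

    Hypothetical helper mirroring the README: tri_head_diffcat always
    runs with "full"; dual accepts either "full" or "aux_only".
    """
    if joint_variant not in VALID_VARIANTS:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    if joint_objective not in VALID_OBJECTIVES:
        raise ValueError(f"unknown joint_objective: {joint_objective!r}")
    if joint_variant == "tri_head_diffcat":
        return "full"
    return joint_objective


print(resolve_joint_objective("tri_head_diffcat", "aux_only"))
print(resolve_joint_objective("dual", "aux_only"))
```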
The most important shared training settings are:
learning_rate=0.00003
epochs=65
batch_size=32
threshold=0.5
model_name="FacebookAI/roberta-large"
- learning_rate: optimizer learning rate.
- epochs: maximum number of training epochs. Early stopping may stop sooner.
- batch_size: documents per training batch.
- threshold: probability/logit threshold used for validation and test metrics.
- model_name: Hugging Face encoder name or local model path.
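To make the role of threshold concrete, here is a minimal sketch of multi-label decoding. It assumes the threshold is compared against sigmoid probabilities; the README also mentions logits, so the repository may apply it differently. The function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Return the indices whose probability reaches the threshold.

    Assumption: the threshold applies to sigmoid(logit), one label at a time.
    """
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


logits = [2.0, -1.0, 0.3]
print(predict_labels(logits, threshold=0.5))
print(predict_labels(logits, threshold=0.7))
```

Raising the threshold keeps fewer labels per document, which typically trades recall for precision.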
For quick debugging, you can add --dataset_fraction directly to the Python
command inside the script, for example:
--dataset_fraction=0.1 \
The training and evaluation scripts expose the same core FiLM settings:
classification_head_type="mlp"
propagation_method="sigmoid"
propagation_space="child_space"
film_scale=0.2
pooling="cls"
- classification_head_type: per-level classification head. Options: mlp, wide_mlp, residual_mlp.
- propagation_method: how previous hierarchy logits are transformed before FiLM conditioning. Options: sigmoid, multi_hot, logit, none.
- propagation_space: where the conditioning signal is represented. Options: child_space, parent_space.
- film_scale: strength of FiLM modulation.
- pooling: encoder pooling strategy. Options: cls, mean, last, max.
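The following sketch illustrates how propagation_method and film_scale could interact. It is an interpretation, not the repository's implementation: the meaning given to each propagation option, the 0.5 cut-off for multi_hot, and the way film_scale enters the modulation are all assumptions. In a real model, gamma and beta would come from learned layers applied to the conditioning signal; here they are passed in directly.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def propagate(prev_logits, method="sigmoid", cutoff=0.5):
    """Turn previous-level logits into a conditioning signal.

    Assumed semantics for each option:
      sigmoid   -> per-label probabilities
      multi_hot -> hard 0/1 decisions at `cutoff` (hypothetical parameter)
      logit     -> raw logits, unchanged
      none      -> no conditioning signal
    """
    if method == "sigmoid":
        return [sigmoid(z) for z in prev_logits]
    if method == "multi_hot":
        return [1.0 if sigmoid(z) >= cutoff else 0.0 for z in prev_logits]
    if method == "logit":
        return list(prev_logits)
    if method == "none":
        return None
    raise ValueError(f"unknown propagation_method: {method!r}")


def film(features, gamma, beta, film_scale=0.2):
    """Scaled FiLM, elementwise: h * (1 + s * gamma) + s * beta.

    Assumption: film_scale damps both the multiplicative and additive
    terms, so film_scale=0 leaves the features unchanged.
    """
    return [
        h * (1.0 + film_scale * g) + film_scale * b
        for h, g, b in zip(features, gamma, beta)
    ]


signal = propagate([3.0, -3.0], method="multi_hot")
modulated = film([1.0, -2.0], gamma=[0.5, 0.0], beta=[0.0, 1.0])
print(signal, modulated)
```

Under this reading, a small film_scale keeps each level's features close to the plain encoder output, while a larger value lets the neighbouring level's predictions reshape them more strongly.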
The scripts also pass these flags by default:
--project_embedding
--detach_parent_signal
Keep them enabled unless you are intentionally reproducing a different ablation.
scripts/train_joint.sh includes these joint-specific settings:
joint_eval_view="fused"
joint_loss_fused=0
joint_loss_td=0.5
joint_loss_bu=0.5
joint_loss_mix=1.0
- joint_eval_view: which view to evaluate after training. Options: fused, td, bu, mix, all.
- joint_loss_fused: loss weight for fused logits.
- joint_loss_td: loss weight for top-down branch logits.
- joint_loss_bu: loss weight for bottom-up branch logits.
- joint_loss_mix: loss weight for the DiffCat mix head. Used by tri_head_diffcat.
For dual, mix is not available. For tri_head_diffcat, at least one loss
weight must be nonzero.
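One plausible way these weights combine is a weighted sum of per-view losses. The sketch below assumes exactly that; the function name is hypothetical, and the nonzero check is applied to both variants here even though the README states it only for tri_head_diffcat.

```python
def combine_joint_losses(losses, weights, joint_variant="tri_head_diffcat"):
    """Weighted sum of per-view losses (assumed form of the joint loss).

    `losses` and `weights` are dicts keyed by view: fused, td, bu, mix.
    The mix view is only considered for tri_head_diffcat.
    """
    views = ["fused", "td", "bu"]
    if joint_variant == "tri_head_diffcat":
        views.append("mix")
    active = {view: weights.get(view, 0.0) for view in views}
    if all(w == 0 for w in active.values()):
        raise ValueError("at least one loss weight must be nonzero")
    # Views with a zero weight are skipped and need no loss entry.
    return sum(w * losses[view] for view, w in active.items() if w != 0)


# Default weights from scripts/train_joint.sh; the loss values are made up.
weights = {"fused": 0, "td": 0.5, "bu": 0.5, "mix": 1.0}
losses = {"fused": 0.9, "td": 0.4, "bu": 0.6, "mix": 0.5}
print(combine_joint_losses(losses, weights))
```

With the default weights the fused logits contribute nothing to the sum, so training is driven by the two branch losses and the mix head.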
Use scripts/eval_single.sh.
Edit:
checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
dataset="aapd"
cascade_direction="top_down"
model_name="FacebookAI/roberta-large"
Then run:
bash scripts/eval_single.sh
Important: cascade_direction, model_name, and the FiLM settings in the
evaluation script must match the checkpoint you are loading.
Use scripts/eval_ensemble.sh.
Edit:
top_down_checkpoint="outputs/checkpoints/aapd_single_top_down_roberta-large_seed1.pt"
bottom_up_checkpoint="outputs/checkpoints/aapd_single_bottom_up_roberta-large_seed1.pt"
dataset="aapd"
model_name="FacebookAI/roberta-large"
ensemble_weight=0.5
Then run:
bash scripts/eval_ensemble.sh
ensemble_weight is the top-down logit weight. The bottom-up weight is
1 - ensemble_weight.
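The weighting above amounts to a convex combination of the two checkpoints' logits. A minimal sketch follows; the function name is hypothetical, and restricting the weight to the range 0 to 1 is an assumption.

```python
def ensemble_logits(td_logits, bu_logits, ensemble_weight=0.5):
    """Blend top-down and bottom-up logits label by label.

    ensemble_weight scales the top-down logits; the bottom-up logits get
    1 - ensemble_weight. The [0, 1] range check is an assumption.
    """
    if not 0.0 <= ensemble_weight <= 1.0:
        raise ValueError("ensemble_weight must be between 0 and 1")
    if len(td_logits) != len(bu_logits):
        raise ValueError("both models must score the same label set")
    w = ensemble_weight
    return [w * td + (1.0 - w) * bu for td, bu in zip(td_logits, bu_logits)]


print(ensemble_logits([2.0, -1.0], [0.0, 3.0], ensemble_weight=0.75))
```

Setting ensemble_weight to 1.0 or 0.0 reduces the ensemble to the top-down or bottom-up model alone, which is a convenient consistency check against the single-branch results.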
Use scripts/eval_joint.sh.
Edit:
checkpoint="outputs/checkpoints/aapd_joint_tri_head_diffcat_roberta-large_seed1.pt"
dataset="aapd"
joint_variant="tri_head_diffcat"
joint_objective="full"
model_name="FacebookAI/roberta-large"
posthoc_repath=false
Then run:
bash scripts/eval_joint.sh
Important: joint_variant, joint_objective, model_name, and the FiLM
settings must match the checkpoint.
For tri_head_diffcat, the script also exposes fusion weights used during
post-hoc evaluation:
tri_fuse_td=1.0
tri_fuse_bu=1.0
tri_fuse_mix=1.0
Set posthoc_repath=true to additionally report ancestor-closed threshold
metrics.
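One common reading of "ancestor-closed" is that every predicted label brings all of its ancestors into the prediction set before metrics are computed. The sketch below illustrates that reading only; it is not taken from the repository, and the label names and the `parent` mapping are hypothetical.

```python
def ancestor_closed(predicted, parent):
    """Return the predicted labels plus all of their ancestors.

    `parent` maps each label to its parent label, or to None for a
    top-level label. Assumed interpretation of "ancestor-closed".
    """
    closed = set(predicted)
    for label in predicted:
        node = parent.get(label)
        # Walk upward until the root or an already included ancestor.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed


# Hypothetical two-level hierarchy.
parent = {
    "cs": None,
    "cs.LG": "cs",
    "cs.CL": "cs",
    "math": None,
    "math.ST": "math",
}
print(sorted(ancestor_closed({"cs.LG", "math.ST"}, parent)))
```

A prediction set closed this way is always a consistent set of paths in the hierarchy, so the resulting metrics show how the model scores once hierarchy violations are repaired.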
Training creates:
outputs/checkpoints/
outputs/results/
Single-branch checkpoint names follow:
outputs/checkpoints/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_single_{cascade_direction}_{model_tag}_seed{seed}.json
Joint checkpoint names follow:
outputs/checkpoints/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.pt
outputs/checkpoints/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.pt
outputs/results/{dataset}_joint_dual_{joint_objective}_{model_tag}_seed{seed}.json
outputs/results/{dataset}_joint_tri_head_diffcat_{model_tag}_seed{seed}.json
Explicit evaluation creates timestamped files:
outputs/results/eval_single_{dataset}_{cascade_direction}_{model_tag}_{timestamp}.json
outputs/results/eval_ensemble_{dataset}_{timestamp}.json
outputs/results/eval_joint_{dataset}_{joint_variant}_{model_tag}_{timestamp}.json
Joint result JSON files also receive a compact .csv sidecar with macro/micro
F1 rows.
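When filling in the checkpoint variables of the evaluation scripts, it can help to generate the path from the same fields used at training time. The sketch below follows the patterns listed above. The helper names are hypothetical, and deriving model_tag as the last component of model_name is an assumption inferred from the example paths in this README.

```python
from pathlib import Path


def model_tag(model_name):
    """Assumption: the tag is the last path component of model_name."""
    return model_name.rstrip("/").split("/")[-1]


def single_checkpoint_path(dataset, cascade_direction, model_name, seed,
                           output_dir="outputs"):
    """Checkpoint path for a single-branch run, per the pattern above."""
    name = f"{dataset}_single_{cascade_direction}_{model_tag(model_name)}_seed{seed}.pt"
    return Path(output_dir) / "checkpoints" / name


def joint_checkpoint_path(dataset, joint_variant, joint_objective, model_name,
                          seed, output_dir="outputs"):
    """Checkpoint path for a joint run; only dual embeds the objective."""
    tag = model_tag(model_name)
    if joint_variant == "dual":
        name = f"{dataset}_joint_dual_{joint_objective}_{tag}_seed{seed}.pt"
    elif joint_variant == "tri_head_diffcat":
        name = f"{dataset}_joint_tri_head_diffcat_{tag}_seed{seed}.pt"
    else:
        raise ValueError(f"unknown joint_variant: {joint_variant!r}")
    return Path(output_dir) / "checkpoints" / name


print(single_checkpoint_path("aapd", "top_down", "FacebookAI/roberta-large", 1))
print(joint_checkpoint_path("aapd", "dual", "aux_only", "FacebookAI/roberta-large", 1))
```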
The scripts are the recommended interface. For direct CLI usage, inspect all available options with:
python3 hblm_experiments.py --help
Minimal single-branch example:
python3 hblm_experiments.py \
--train_mode single \
--cascade_direction top_down \
--dataset_name aapd \
--model_name FacebookAI/roberta-large \
--project_embedding \
--detach_parent_signal
Minimal joint example:
python3 hblm_experiments.py \
--train_mode joint \
--joint_variant tri_head_diffcat \
--joint_objective full \
--dataset_name aapd \
--model_name FacebookAI/roberta-large \
--project_embedding \
--detach_parent_signal