avp-neelam/GraphHeatFieldSignatures


GHFS Experiments

Code for reproducing all experiments in Graph Heat Field Signatures: Multiscale Feature-Structure Alignment for Graph Learning.


Structure

code/
├── ghfs/               # Core GHFS encoding
│   ├── encoding.py     # GHFSEncoder class (main entry point)
│   ├── kernels.py      # Graph diffusion & feature kernel computations
│   └── utils.py        # Preprocessing, scale selection
├── datasets/
│   └── loaders.py      # All 10 dataset loaders with correct split protocols
├── models/
│   ├── mlp.py          # MLP baseline
│   └── gnn.py          # GCN, GraphSAGE, GAT, APPNP, GPS
├── pe/
│   ├── lape.py         # Laplacian Positional Encoding
│   └── rwse.py         # Random Walk Structural Encoding
├── experiments/
│   ├── trainer.py      # Training loop with early stopping
│   ├── main.py         # Main experiment runner (all methods × all datasets)
│   └── ablations.py    # Ablation studies (channels, scales, hop radius, kernels)
├── analysis/
│   └── alignment.py    # Alignment spectrum computation & figures
└── scripts/
    ├── run_main_experiments.sh
    ├── run_ablations.sh
    ├── run_analysis.sh
    └── summarize_results.py

Setup

# Create environment
conda create -n ghfs python=3.10
conda activate ghfs

# Install PyTorch (adjust CUDA version as needed)
pip install torch --index-url https://download.pytorch.org/whl/cu118

# Install PyG and extensions
# (the torch/CUDA version in the wheel URL below must match the PyTorch installed above)
pip install torch_geometric
pip install torch_scatter torch_sparse torch_cluster torch_spline_conv \
    -f https://data.pyg.org/whl/torch-2.0.0+cu118.html

# Install remaining dependencies
pip install -r requirements.txt

Running Experiments

All commands should be run from the code/ directory.

Table 2: Main results (all methods × all datasets)

# Full run (~hours on GPU, ~days on CPU)
bash scripts/run_main_experiments.sh

# Single dataset, all methods
bash scripts/run_main_experiments.sh cora

# Single method on single dataset
bash scripts/run_main_experiments.sh texas mlp+ghfs

Results written to ./results/results.csv and ./results/results_full.json.

Print / export results table

# Console table
python scripts/summarize_results.py

# Also write LaTeX
python scripts/summarize_results.py --latex

Tables 3-6: Ablation studies

bash scripts/run_ablations.sh

Results in ./results/ablations/ablations.json.

Alignment figures (Section 4.4)

Run after main experiments so the accuracy-vs-alignment scatter is populated:

bash scripts/run_analysis.sh

Figures saved to ./results/figures/.


Individual method examples

# MLP baseline
python -m experiments.main --dataset cora --method mlp

# GCN
python -m experiments.main --dataset texas --method gcn

# MLP + GHFS (no message passing, GHFS as fixed encoding)
python -m experiments.main --dataset roman-empire --method mlp+ghfs

# GCN + GHFS
python -m experiments.main --dataset roman-empire --method gcn+ghfs

# GPS + GHFS
python -m experiments.main --dataset cora --method gps+ghfs

# MLP + LapPE (structural encoding baseline)
python -m experiments.main --dataset texas --method mlp+lape

# MLP + RWSE (structural encoding baseline)
python -m experiments.main --dataset texas --method mlp+rwse

Datasets

| Dataset | Protocol | Splits |
|---|---|---|
| Cora, CiteSeer, PubMed | Standard Planetoid public split | 1 |
| Texas, Cornell, Wisconsin, Actor | Pei et al. (2019) 10-split | 10 |
| Roman-Empire, Amazon-Ratings, Tolokers | Platonov et al. (2023) fixed 10-split | 10 |

All datasets are downloaded automatically by PyG on first use.
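The protocol table can be read as a small lookup from dataset name to PyG dataset class and number of splits. The sketch below is illustrative only and is not the code in datasets/loaders.py. The class names are the PyG dataset classes that cover these benchmarks. The lowercase keys follow the CLI examples (cora, texas, roman-empire); the remaining keys and the helper num_splits are assumptions.

```python
# Hypothetical registry: dataset key -> (PyG dataset class name, number of splits).
# Keys other than "cora", "texas" and "roman-empire" are assumed spellings.
DATASET_PROTOCOLS = {
    "cora": ("Planetoid", 1),
    "citeseer": ("Planetoid", 1),
    "pubmed": ("Planetoid", 1),
    "texas": ("WebKB", 10),
    "cornell": ("WebKB", 10),
    "wisconsin": ("WebKB", 10),
    "actor": ("Actor", 10),
    "roman-empire": ("HeterophilousGraphDataset", 10),
    "amazon-ratings": ("HeterophilousGraphDataset", 10),
    "tolokers": ("HeterophilousGraphDataset", 10),
}


def num_splits(name: str) -> int:
    """Return how many train/val/test splits are evaluated for a dataset."""
    try:
        return DATASET_PROTOCOLS[name.lower()][1]
    except KeyError:
        raise ValueError(f"unknown dataset: {name!r}") from None
```

With this layout, a run over a Planetoid dataset evaluates one split, while WebKB, Actor and the Platonov et al. datasets are evaluated over 10 splits.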


Baselines NOT implemented here

The following heterophily-aware methods are referenced from their original implementations:

Numbers for these can be taken from the original papers or Platonov et al. (2023).


GHFS hyperparameters

| Parameter | Default | Meaning |
|---|---|---|
| T | 4 | Number of feature scales |
| S | 4 | Number of graph diffusion scales |
| k | 2 | Diffusion truncation radius |
| pca_dim | 64 | PCA target dimension (applied when d > 500) |
| cmin | 0.05 | Feature scale lower multiplier |
| cmax | 0.25 | Feature scale upper multiplier |

Override via CLI: --ghfs_T 8 --ghfs_S 8 --ghfs_k 3
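A minimal sketch of how these overrides could be parsed, assuming argparse. Only the three flags shown above and their defaults come from this README; the function build_ghfs_parser is hypothetical and is not the parser in experiments/main.py.

```python
import argparse


def build_ghfs_parser() -> argparse.ArgumentParser:
    """Illustrative parser for the three GHFS flags documented above."""
    parser = argparse.ArgumentParser(description="GHFS hyperparameter overrides (sketch)")
    parser.add_argument("--ghfs_T", type=int, default=4, help="number of feature scales")
    parser.add_argument("--ghfs_S", type=int, default=4, help="number of graph diffusion scales")
    parser.add_argument("--ghfs_k", type=int, default=2, help="diffusion truncation radius")
    return parser


# Parse the override example from the line above; omitted flags keep their defaults.
args = build_ghfs_parser().parse_args(["--ghfs_T", "8", "--ghfs_S", "8", "--ghfs_k", "3"])
print(args.ghfs_T, args.ghfs_S, args.ghfs_k)
```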


Encoding dimension

With default T=S=4: 64 dimensions (4 channels × 4 × 4 scales). With density channel enabled: 80 dimensions.
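The arithmetic can be written as a short helper. This is a sketch: the function name and arguments are hypothetical, and it assumes the density channel adds exactly one channel over the same T × S scale grid, which is consistent with the 64 and 80 figures above.

```python
def ghfs_encoding_dim(T: int = 4, S: int = 4, base_channels: int = 4,
                      density_channel: bool = False) -> int:
    """Encoding size = channels x feature scales x diffusion scales."""
    channels = base_channels + (1 if density_channel else 0)
    return channels * T * S


print(ghfs_encoding_dim())                      # defaults: 4 * 4 * 4
print(ghfs_encoding_dim(density_channel=True))  # density enabled: 5 * 4 * 4
```

Under the same assumption, --ghfs_T 8 --ghfs_S 8 would give 4 × 8 × 8 = 256 dimensions without the density channel.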


Scalability (Table 7)

Preprocessing timings are logged automatically during fit_transform. GHFS is computed once and cached to ./cache/ghfs/. Training time after preprocessing is the same as for the base model.
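The compute-once-and-cache pattern can be sketched as follows. This is an assumption-laden illustration, not the repository's caching code: the helper cached_encoding, the file naming scheme and the use of pickle are all hypothetical. Only the ./cache/ghfs/ location comes from this README.

```python
import hashlib
import json
import pickle
from pathlib import Path


def cached_encoding(compute_fn, dataset, params, cache_dir="./cache/ghfs"):
    """Return a cached encoding if present; otherwise compute and store it.

    compute_fn: zero-argument callable that produces the encoding.
    params: JSON-serializable hyperparameters (e.g. T, S, k) used in the cache key.
    """
    # Key the cache on dataset and hyperparameters so that changing either recomputes.
    blob = json.dumps({"dataset": dataset, "params": params}, sort_keys=True)
    key = hashlib.sha256(blob.encode("utf-8")).hexdigest()[:16]
    path = Path(cache_dir) / f"{dataset}_{key}.pkl"

    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)

    result = compute_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

A second call with the same dataset and hyperparameters reads the stored file instead of recomputing, so the preprocessing cost is paid only once per configuration.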


Reproducibility

All experiments use --seed 42 by default. The random seed is set for both torch and numpy at the start of each experiment run. Dataset splits for Planetoid and HeterophilousGraphDataset are fixed by PyG; WebKB / Actor splits are the pre-computed masks stored in PyG's dataset objects.
