Most machine learning systems for drug–drug interaction (DDI) prediction assume a simple structure: interactions occur between pairs of drugs. This assumption enables the use of standard graph-based models such as GCNs and GATs, but it may not fully reflect how drug combinations behave in real biomedical settings.
In practice, many clinically relevant interactions involve multiple drugs simultaneously, where the effect emerges from the combination rather than any single pairwise relationship. This observation motivates a shift from graph-based modeling to hypergraph-based representations, where each interaction event can connect more than two drugs.
This is a solo exploratory research project.
- Synthetic benchmark: completed
- Unified training/evaluation pipeline: completed
- HGNN, GCN, GAT, GraphSAGE, and MLP baselines: completed
- Multi-seed metric reporting: completed
- Real-world DDI benchmarking: planned
- Stronger molecular feature representations: planned
The current results are based on a synthetic smoke-test benchmark and should be interpreted as pipeline validation, not as pharmacological or clinical evidence.
Instead of representing drug interactions as edges between two nodes, I model them as hyperedges in a hypergraph:
- Nodes represent drugs
- Hyperedges represent multi-drug interaction events
This formulation preserves higher-order structure that is typically lost when interactions are decomposed into pairs.
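The formulation above can be made concrete with an incidence matrix, where rows are drugs and columns are interaction events. The following is a minimal sketch, not the repository's loader: the drug names, the events, and the helper `build_incidence` are all hypothetical.

```python
# Hypothetical toy data: the drug names and interaction events are made up.
drugs = ["A", "B", "C", "D"]
events = [("A", "B", "C"), ("C", "D")]


def build_incidence(drugs, events):
    """Return H as a list of rows, where H[i][j] = 1 if drug i is in event j."""
    index = {name: i for i, name in enumerate(drugs)}
    H = [[0] * len(events) for _ in drugs]
    for j, members in enumerate(events):
        for name in members:
            H[index[name]][j] = 1
    return H


H = build_incidence(drugs, events)
for name, row in zip(drugs, H):
    print(name, row)
```

Each column holds one whole event, so the three-drug interaction stays a single object instead of becoming three separate edges.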
I built an experimental framework to study higher-order drug interaction modeling using Hypergraph Neural Networks (HGNNs). The repository includes:
- Construction of hypergraph representations from drug interaction datasets
- A Hypergraph Neural Network for higher-order message passing
- Standard graph-based baselines, including GCN, GAT, and GraphSAGE
- A non-graph baseline (MLP) for reference
- A unified evaluation pipeline for fair comparison
The downstream task is formulated as binary drug–drug interaction prediction, evaluated using standard classification metrics.
Traditional graph neural networks rely on pairwise edges, which implicitly assume that interactions are decomposable into binary relationships. This can lead to:
- Loss of multi-drug contextual information
- Redundant representation of higher-order interactions
- Structural bias toward pairwise dependencies
Hypergraphs address this limitation by allowing a single interaction event to directly connect multiple drugs, preserving its full structure.
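A small illustration of the information loss, assuming the pairwise decomposition is a clique expansion (every hyperedge replaced by all of its drug pairs). The helper `clique_expand` and the toy events are hypothetical and are not taken from the codebase.

```python
from itertools import combinations


def clique_expand(hyperedges):
    """Decompose each hyperedge into all of its unordered drug pairs."""
    pairs = set()
    for members in hyperedges:
        pairs.update(combinations(sorted(members), 2))
    return pairs


# One three-drug event versus three independent two-drug events.
one_triple = [{"A", "B", "C"}]
three_pairs = [{"A", "B"}, {"A", "C"}, {"B", "C"}]

same_graph = clique_expand(one_triple) == clique_expand(three_pairs)
print(same_graph)  # True
```

Both inputs collapse to the same pair graph, so a model that only sees pairs cannot tell a joint three-drug event apart from three unrelated pairwise ones.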
The Hypergraph Neural Network operates through two stages of message passing:
- Aggregation from drug nodes to hyperedges
- Propagation from hyperedges back to nodes
This enables each drug representation to incorporate both local and group-level interaction context. The learned embeddings are then used for pairwise interaction scoring.
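The two stages can be sketched in plain Python with mean aggregation. This is a simplified illustration under stated assumptions: it has no learned weights, normalization terms, or nonlinearity, and the dot-product scorer is one common choice assumed here, not necessarily what the code in src/models/ uses. All function names are hypothetical.

```python
def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def hypergraph_layer(node_feats, hyperedges):
    """One round of node -> hyperedge -> node mean aggregation."""
    # Stage 1: each hyperedge averages the features of its member drugs.
    edge_feats = [
        mean_vectors([node_feats[i] for i in members]) for members in hyperedges
    ]
    # Stage 2: each drug averages the features of the hyperedges it belongs to.
    updated = []
    for i, own in enumerate(node_feats):
        incident = [
            edge_feats[j] for j, members in enumerate(hyperedges) if i in members
        ]
        # Drugs with no hyperedge keep their own features.
        updated.append(mean_vectors(incident) if incident else list(own))
    return updated


def pair_score(emb, i, j):
    """Dot-product score for a drug pair (assumed scoring function)."""
    return sum(a * b for a, b in zip(emb[i], emb[j]))


node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
hyperedges = [(0, 1, 2), (2, 3)]
emb = hypergraph_layer(node_feats, hyperedges)
print(emb[2], pair_score(emb, 0, 1))
```

Drug 2 sits in both hyperedges, so its updated vector blends the context of both groups, while drugs 0 and 1 only see the three-drug event.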
The codebase supports evaluation on standard biomedical interaction datasets, including DrugBank and TWOSIDES-style data. Models are compared under a consistent pipeline with identical preprocessing, negative sampling strategy, and train/test splits.
Performance can be measured using:
- ROC-AUC
- PR-AUC
- Precision@K
- F1-score
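Of these metrics, Precision@K is the least standardized, so a short sketch may help. The function below is a hypothetical illustration, not the implementation in src/evaluation/metrics.py; it assumes binary labels and that a higher score means a more likely interaction.

```python
def precision_at_k(labels, scores, k):
    """Fraction of true positives among the k highest-scoring pairs."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not labels:
        raise ValueError("labels must be non-empty")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]  # if k exceeds the number of pairs, all pairs are used
    return sum(label for _, label in top) / len(top)


labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
p_at_3 = precision_at_k(labels, scores, 3)
print(round(p_at_3, 4))
```

Dividing by the number of pairs actually retrieved, not by `k`, is a design choice made here for small test sets; other definitions divide by `k` regardless.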
Synthetic demo benchmark (5 seeds: 42–46; CPU; pipeline validation only — not pharmacological evidence). Full per-seed logs and JSON: RESULTS.md.
| Model | ROC-AUC | PR-AUC | F1 |
|---|---|---|---|
| HGNN | 0.4893 ± 0.1034 | 0.7168 ± 0.0750 | 0.8235 ± 0.0000 |
| GCN | 0.5693 ± 0.0991 | 0.7549 ± 0.0722 | 0.3283 ± 0.4496 |
| GAT | 0.3824 ± 0.0850 | 0.6477 ± 0.0478 | 0.8235 ± 0.0000 |
| GraphSAGE | 0.4923 ± 0.0704 | 0.6909 ± 0.0419 | 0.8235 ± 0.0000 |
| MLP | 0.4970 ± 0.0998 | 0.7069 ± 0.0503 | 0.8235 ± 0.0000 |
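One observation about the F1 column: a value of 0.8235 with zero variance across seeds is exactly what a classifier produces when it labels every test pair positive on a split with 70% positives. This is an inference from the reported numbers, not something stated in RESULTS.md; the per-seed precision and recall would confirm or rule it out. The arithmetic, using hypothetical counts:

```python
def f1_score(tp, fp, fn):
    """F1 from confusion counts; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test set: 7 positive and 3 negative pairs, all predicted positive.
f1_all_positive = f1_score(tp=7, fp=3, fn=0)
print(round(f1_all_positive, 4))  # 0.8235
```

If that reading is right, the F1 column reflects the 0.5 decision threshold and class balance, not ranking quality, and ROC-AUC and PR-AUC are the more informative columns.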
Reproduce the full table:
export HYPERGRAPH_DDI_ROOT="$(pwd)"
python scripts/run_synthetic_benchmark.py --seeds 42 43 44 45 46
DrugBank / TWOSIDES numbers are not bundled; run the configs under config/ on your licensed data.
The main objective of this work is not only to improve predictive performance, but to investigate how structural assumptions in data representation influence model behavior.
By moving from pairwise graphs to hypergraphs, interaction events are encoded as higher-order objects rather than decomposed prematurely.
This study remains exploratory in nature:
- Node features are limited unless externally engineered
- Final prediction is still reduced to pairwise scoring
- Hypergraph construction depends on dataset quality and formatting
- No clinical validation is performed
Results on real datasets, once available, should be interpreted as evidence about representation choices, not as clinical applicability. The current synthetic results validate the pipeline only.
This project explores a simple but important question in representation learning:
What happens if we model drug interactions as higher-order structures instead of pairwise relationships?
Hypergraph neural networks provide one possible answer by preserving interaction-level structure that standard graph models inherently discard.
hypergraph-ddi — research codebase for modeling drug–drug interactions (DDI) with hypergraph neural networks and graph/MLP baselines.
Repository: github.com/meolen07/hypergraph-ddi
Author: Huynh Mai Linh Nguyen — research implementation; feedback and issues are welcome on GitHub.
Requirements: Python ≥ 3.9 (see pyproject.toml).
git clone https://github.com/meolen07/hypergraph-ddi.git
cd hypergraph-ddi
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip install -e .
export HYPERGRAPH_DDI_ROOT="$(pwd)"
PyTorch Geometric: torch-geometric must match your PyTorch and CUDA/CPU build. See the official PyG installation guide if pip install fails.
Optional sanity check (requires dependencies):
python scripts/verify_imports.py
hypergraph-ddi/
├── config/ # YAML experiment configs
│ ├── demo_synthetic.yaml
│ ├── demo_mlp.yaml
│ ├── drugbank_hgnn.yaml
│ └── twosides_gcn.yaml
├── data/
│ ├── raw/ # User-provided DrugBank / TWOSIDES files
│ ├── processed/ # Parquet + hypergraph artifacts
│ └── splits/ # Train/val/test JSON (hyperedge-level)
├── src/
│ ├── preprocessing/ # Loaders, hypergraph, splits, negative sampling
│ ├── datasets/ # PyTorch Dataset + PyG graph builder
│ ├── models/ # HGNN, GCN, GAT, GraphSAGE, MLP
│ ├── training/ # Trainer
│ ├── evaluation/ # Metrics and plots
│ ├── experiments/ # ExperimentRunner
│ ├── inference/ # Checkpoint loading / prediction helpers
│ └── baselines/ # Model factory and forward wrappers
├── scripts/ # CLI entry points
├── tests/ # Smoke tests
└── experiments/ # Run outputs (logs, checkpoints, plots)
The synthetic config (source: synthetic_demo) generates random drugs and hyperedges for pipeline smoke tests only. It is labeled in config and code; do not use it for publication or benchmark claims.
# 1. Preprocess (writes data/processed and data/splits)
python scripts/preprocess.py --config config/demo_synthetic.yaml
# 2. Train (writes experiments/demo_hgnn_<run_id>/)
python scripts/train.py --config config/demo_synthetic.yaml
# Optional: fixed run id (e.g. for tests)
python scripts/train.py --config config/demo_synthetic.yaml --run-id smoke
Artifacts under experiments/<experiment.name>_<run_id>/ include:
- checkpoints/best.pt
- train.log
- test_metrics.json
- history.json
- training_curves.png
- roc_curve.png
MLP baseline on the same synthetic data:
python scripts/preprocess.py --config config/demo_mlp.yaml
python scripts/train.py --config config/demo_mlp.yaml
You must obtain and license data yourself. This repository does not redistribute DrugBank or TWOSIDES files.
- Download the DrugBank full database XML under their license terms.
- Place the file at data/raw/drugbank.xml.
- Preprocess and train:
python scripts/preprocess.py --config config/drugbank_hgnn.yaml
python scripts/train.py --config config/drugbank_hgnn.yaml
Adjust paths and hyperparameters in config/drugbank_hgnn.yaml as needed. DrugBank XML structure can vary by export version, so verify parsed interaction counts after preprocessing.
- Obtain a TWOSIDES-style pair file from the original data release (format varies by source).
- Place it at data/raw/twosides.csv. The default column names are drug1_name and drug2_name.
- Preprocess and train:
python scripts/preprocess.py --config config/twosides_gcn.yaml
python scripts/train.py --config config/twosides_gcn.yaml
Update drug_a_col and drug_b_col in the config if your CSV uses different column names.
- Splits are at the hyperedge level: the same interaction does not appear in both train and test.
- The hypergraph (and pair-expanded graph for GNN baselines) is built from training hyperedges only, so test interactions do not leak into structure used at train time.
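The two rules above can be sketched as follows. This is an illustration only, not the code in src/preprocessing/: the function names, the toy hyperedges, and the 20% test fraction are hypothetical.

```python
import random
from itertools import combinations


def split_hyperedges(hyperedges, test_fraction, seed):
    """Shuffle hyperedges and hold out a fraction of them as whole units."""
    order = list(range(len(hyperedges)))
    random.Random(seed).shuffle(order)
    n_test = max(1, int(round(test_fraction * len(order))))
    test_ids = set(order[:n_test])
    train = [e for i, e in enumerate(hyperedges) if i not in test_ids]
    test = [e for i, e in enumerate(hyperedges) if i in test_ids]
    return train, test


def pair_expand(hyperedges):
    """All unordered drug pairs contained in the given hyperedges."""
    pairs = set()
    for members in hyperedges:
        pairs.update(combinations(sorted(members), 2))
    return pairs


hyperedges = [(0, 1, 2), (2, 3), (1, 4), (0, 3, 4), (2, 4)]
train, test = split_hyperedges(hyperedges, test_fraction=0.2, seed=42)

# Structure visible to the model is built from training hyperedges only.
train_pairs = pair_expand(train)
leaked = pair_expand(test) & train_pairs
print(len(train), len(test), leaked)
```

In this toy data no two hyperedges share a pair, so `leaked` is empty. With overlapping hyperedges a drug pair could sit inside both a training and a test event, so checking `leaked` on real data is a reasonable extra safeguard.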
python scripts/train.py --config <path/to/config.yaml> [--run-id <id>]
Training reads processed data from data/ (run preprocess.py first unless using run_experiment.py, which can preprocess automatically).
python scripts/evaluate.py \
--config config/demo_synthetic.yaml \
--checkpoint experiments/demo_hgnn_smoke/checkpoints/best.pt \
--output experiments/eval_smoke
Writes metrics.json and roc_curve.png under --output when provided.
Runs preprocessing once (unless skipped), then trains one run per seed:
python scripts/run_experiment.py \
--config config/demo_synthetic.yaml \
--seeds 42 43 44
Summary written to experiments/multi_seed_summary.json. Use --skip-preprocess if data are already processed.
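For reference, a mean ± standard deviation summary over seeds can be computed as below. The per-seed values are hypothetical and are not taken from any run, and the choice of the sample standard deviation (n - 1 in the denominator) is an assumption; the repository may use the population form.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) over per-seed metric values."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


# Hypothetical per-seed ROC-AUC values keyed by seed.
per_seed = {42: 0.50, 43: 0.60, 44: 0.40}
mean, std = summarize(list(per_seed.values()))
print(f"{mean:.4f} ± {std:.4f}")
```

With only a handful of seeds the two forms of standard deviation differ noticeably, so it is worth stating which one a reported table uses.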
All models (HGNN + baselines) on synthetic data:
python scripts/run_synthetic_benchmark.py --seeds 42 43 44 45 46
Writes RESULTS.md and experiments/synthetic_benchmark_summary.json.
pytest tests/test_smoke.py -v
Set model.name in your YAML config:
| Model | model.name | Structure used |
|---|---|---|
| HGNN | hgnn | Hypergraph incidence matrix H (training only) |
| MLP | mlp | Node features only |
| GCN | gcn | Pair-expanded graph (PyG), training edges |
| GAT | gat | Same as GCN |
| GraphSAGE | graphsage | Same as GCN |
Implementation: src/models/. HGNN supports optional attention (use_attention in config).
Node features: By default, features are random or identity vectors keyed by seed (build_node_features). For stronger results, plug in external descriptors (e.g., molecular fingerprints) in the dataset layer—this is left to the user.
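To show what placeholder features of this kind look like, here is a hedged sketch. It does not reproduce build_node_features; both helpers below are hypothetical, and the Gaussian distribution for the random variant is an assumption.

```python
import random


def identity_features(num_nodes):
    """One-hot row per drug: encodes node identity only, no chemistry."""
    return [
        [1.0 if i == j else 0.0 for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def random_features(num_nodes, dim, seed):
    """Gaussian features from a seeded generator, so runs are repeatable."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_nodes)]


X_id = identity_features(3)
X_a = random_features(3, dim=4, seed=42)
X_b = random_features(3, dim=4, seed=42)
print(X_id)
print(X_a == X_b)  # same seed, same features
```

Neither variant carries molecular information, which is one reason near-chance ROC-AUC on the synthetic benchmark is unsurprising.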
Computed in src/evaluation/metrics.py on held-out pairs:
| Metric | Key in JSON |
|---|---|
| ROC-AUC | roc_auc |
| PR-AUC | pr_auc |
| Precision | precision (threshold 0.5 on sigmoid of logits) |
| Recall | recall |
| F1 | f1 |
| Precision@K | precision_at_10, precision_at_50, precision_at_100 |
If a split has only one class, some metrics return nan.
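The single-class case is easy to see with ROC-AUC, which compares positives against negatives and is undefined when either group is empty. The function below is a minimal pairwise sketch for illustration, not the code in src/evaluation/metrics.py, and it is quadratic in the number of pairs.

```python
import math


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Returns nan when only one class is present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return math.nan  # undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


auc_mixed = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
auc_single = roc_auc([1, 1], [0.3, 0.4])
print(auc_mixed, auc_single)
```

When averaging over seeds, a nan from one split will propagate into the mean unless it is filtered out or reported separately.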
- Set seed in the YAML config; src/utils/seed.py fixes Python/NumPy/PyTorch seeds where applicable.
- Use the same HYPERGRAPH_DDI_ROOT and config file across preprocess, train, and evaluate.
- For variance estimates, use scripts/run_experiment.py with multiple --seeds.
- Save test_metrics.json, config copies, and checkpoints under experiments/ for each run.
Document your data version, preprocessing choices, and hardware when reporting results.
Additional scope notes: The synthetic demo is not representative of real pharmacology; default node features are placeholders; DrugBank parsing depends on XML export format; TWOSIDES layouts vary by release; higher-order hyperedges are expanded to pairs for link prediction.
This project is licensed under the MIT License (see also LICENSE on GitHub).
DrugBank, TWOSIDES, and any third-party datasets remain under their own licenses and are not included in or covered by this repository’s MIT license.
If you use this codebase in published work, please cite it appropriately. Until a formal publication is available, the following BibTeX entry can be used:
@misc{hypergraph-ddi,
author = {Nguyen, Huynh Mai Linh},
title = {hypergraph-ddi: Hypergraph Neural Networks for Drug--Drug Interaction Modeling},
year = {2026},
howpublished = {\url{https://github.com/meolen07/hypergraph-ddi}},
note = {Software. Replace with article citation when applicable.}
}
Please also cite DrugBank and TWOSIDES (or your data sources) according to their terms.
Do not fabricate or copy benchmark numbers into papers, slides, or portfolios without running this code on your data. Metrics printed during a demo run (including synthetic smoke tests) are for debugging the pipeline only, not evidence of clinical or predictive utility.
This software is provided for research and education. It is not medical advice, not validated for patient care, and not a substitute for licensed databases, expert review, or regulatory processes. Always reproduce experiments locally and report limitations honestly.