peng-lab/phoenix

Phoenix 🐦‍🔥

Pan-cancer virtual spatial transcriptomics from routine histology with Phoenix

[preprint] [weights] [notebook]

Phoenix is a (latent) flow matching generative model that predicts spatially resolved single-cell gene expression directly from routine H&E-stained histology images. It generalizes across cohorts, donors, organs, and tissues, which enables in silico analysis of tissue organization and treatment response at population scale.


Getting started

Install Phoenix with the following command:

pip install git+https://github.com/peng-lab/phoenix

To load the 224x224 patches saved in an H5 file, together with the gene list and the statistics table, use:

import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms import v2
from torchvision.transforms import InterpolationMode
from github.datasets.h5py_dataset import H5PYDataset

gene_path = './xenium_human_multi.npy'
gene_list = list(np.load(gene_path))

stats_path = "./stats_table.npz"
statistics = np.load(stats_path)

bicubic = InterpolationMode.BICUBIC
image_transform = v2.Compose(
    [
        v2.Resize((224, 224), bicubic),
        v2.CenterCrop((224, 224)),
        v2.ToTensor(),
        v2.Normalize(
            (0.707223, 0.578729, 0.703617),
            (0.211883, 0.230117, 0.177517),
        ),
    ]
)

image_path = "./demo_patch.h5"
dataset = H5PYDataset(
    image_path=image_path,
    transform=image_transform,
)
dataloader = DataLoader(
    dataset,
    batch_size=128,
    shuffle=False,
    num_workers=4,
    pin_memory=True,
)
print('Length dataset & dataloader:', (len(dataset), len(dataloader)))
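
The v2.Normalize step in the transform above standardizes each RGB channel as (x - mean) / std. The sketch below reproduces that arithmetic in plain Python for a single pixel, assuming the patch is stored as 8-bit RGB and scaled to [0, 1] first. The helper name normalize_pixel is hypothetical and is not part of Phoenix or torchvision.

```python
# Per-channel statistics copied from the transform above.
MEAN = (0.707223, 0.578729, 0.703617)
STD = (0.211883, 0.230117, 0.177517)


def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255.0 for value in rgb_uint8]
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]


# A white pixel (typical slide background) lands above zero in every channel,
# while a black pixel lands below zero.
white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
print(white, black)
```

Values near zero after this step correspond to pixels close to the mean color given in the transform.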

The model weights are hosted on HuggingFace at the URLs below.
(We recommend the model trained on the Nest, which is the second, uncommented URL.)

#https://huggingface.co/peng-lab/phoenix/resolve/main/weights/flow/tenx/multi/cell/20x/discrete/flow_model.pth
https://huggingface.co/peng-lab/phoenix/resolve/main/weights/flow/nest/multi/cell/20x/discrete/flow_model.pth
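
One way to fetch the checkpoint is a plain HTTP download with the Python standard library. This is a minimal sketch, not the authors' tooling: the helper names weights_url and download_weights, the cohort argument, and the local file name ./flow_model.pth are all assumptions made for illustration. Only the two URLs themselves come from this README.

```python
import urllib.request

BASE_URL = "https://huggingface.co/peng-lab/phoenix/resolve/main"


def weights_url(cohort="nest"):
    """Build the checkpoint URL; cohort is "nest" or "tenx" as in the URLs above."""
    if cohort not in ("nest", "tenx"):
        raise ValueError(f"unknown cohort: {cohort!r}")
    return f"{BASE_URL}/weights/flow/{cohort}/multi/cell/20x/discrete/flow_model.pth"


def download_weights(cohort="nest", state_path="./flow_model.pth"):
    """Download the checkpoint to state_path (an assumed local file name)."""
    urllib.request.urlretrieve(weights_url(cohort), state_path)
    return state_path


# Building the URL needs no network access; call download_weights() to fetch the file.
print(weights_url())
```

The returned path can then be used as state_path when loading the state dict.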

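To build the vision encoder and flow transformer and load the downloaded weights, use the code below.
(We recommend the optimized implementation in flow_llama3; the flow_simple import is left commented out as the alternative.)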

import timm
import torch

from phoenix.models.flow_llama3 import FlowTransformerModel, FlowTransformerConfig
#from phoenix.models.flow_simple import FlowTransformerModel, FlowTransformerConfig

vision_model = timm.create_model(
    "vit_giant_patch14_reg4_dinov2",
    pretrained=False,
    img_size=224,
    num_classes=0,
    global_pool="token",
    init_values=1e-5,
    dynamic_img_size=False,
)

flow_model = FlowTransformerModel(
    FlowTransformerConfig(
        d_genes=1,
        d_image=1536,
        d_model=512,
        d_cross=512,
        n_heads=8,
        n_layers=8,
        qkv_bias=False,
        ffn_bias=False,
        ffn_mult=4,
        attn_drop=0.0,
        proj_drop=0.0,
        n_classes=0,
        cls_drop=0.1,
        checkpoint=False,
    ),
    vision_model=vision_model
)

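# Assumed local path to the downloaded flow_model.pth; adjust as needed.
state_path = "./flow_model.pth"
state_dict = torch.load(state_path, map_location='cuda:0')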
flow_model.load_state_dict(state_dict, strict=True)
flow_model = flow_model.eval().cuda()

To check that the model works, run a forward pass on random inputs:

x = torch.rand(1, 377, 1).cuda()
t = torch.rand(x.shape[0]).cuda()
c = torch.rand(1, 256, 1536).cuda()

output = flow_model(x, t, c)
print("Output:", output.size())
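
The shapes in this check line up with the configuration above. The conditioning tensor c has 256 tokens of width 1536: a 224x224 image cut into 14x14 patches gives 16 x 16 = 256 patch tokens, and 1536 equals d_image. The sketch below only verifies the patch count; the function name is hypothetical. Reading 377 as the number of genes in gene_list, and 256 as patch tokens only (without class or register tokens), are assumptions that this README does not state.

```python
def num_patch_tokens(img_size=224, patch_size=14):
    """Number of non-overlapping square patches for a square input image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    per_side = img_size // patch_size
    return per_side * per_side


print(num_patch_tokens())  # 256
```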

To predict gene expression from histology images, use:

from github.helpers.inference import FlowPipeline

pipeline = FlowPipeline(
    model=flow_model,
    stats=statistics,
    t_0=0.0,
    t_1=1.0,
    atol=1e-1,
    rtol=1e-1,
)

gex_pred, coords_list = pipeline(gene_list, dataloader)
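
Phoenix is described above as a flow matching model, and t_0, t_1, atol, and rtol read like the settings of an ODE solve over the time interval [0, 1]. The internals of FlowPipeline are not shown in this README, so the following is only a sketch of the general idea: integrate dx/dt = v(x, t) from a starting sample to a final sample. It uses fixed-step Euler integration and a toy constant velocity field in place of the trained model and whatever solver the pipeline actually uses. All names here are hypothetical.

```python
def euler_integrate(velocity, x0, t_0=0.0, t_1=1.0, n_steps=10):
    """Integrate dx/dt = velocity(x, t) from t_0 to t_1 with fixed Euler steps."""
    dt = (t_1 - t_0) / n_steps
    x = list(x0)
    for i in range(n_steps):
        t = t_0 + i * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for the learned model: a constant velocity that carries the
# starting sample x0 to a fixed target along a straight line.
x0 = [0.5, -1.0, 2.0]
target = [1.0, 0.0, 3.0]


def toy_velocity(x, t):
    return [b - a for a, b in zip(x0, target)]


x1 = euler_integrate(toy_velocity, x0)
print(x1)  # close to target, up to floating-point rounding
```

With a constant velocity field Euler steps are exact up to rounding. A learned velocity field generally varies with x and t, which is where a solver's error tolerances come into play.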

Citation

If you find our work useful, please consider citing us:

@article{tran/gindra2026.04.25.720812,
	author = {Tran, Manuel and Gindra, Rushin H. and Putze, Philipp and Senbai, Kang and Palla, Giovanni and Kos, Tina and Falcomat{\`a}, Chiara and Wang, Chen and Guo, Ruifeng (Ray) and Boxberg, Melanie and Berclaz, Luc M. and Lindner, Lars H. and Bergmayr, Linda and Kn{\"o}sel, Thomas and Jurmeister, Philipp and Klauschen, Frederick and Homicsko, Krisztian and Gottardo, Raphael and Eckstein, Markus and Matek, Christian and Mock, Andreas and Theis, Fabian J. and Saur, Dieter and Peng, Tingying},
	title = {Pan-cancer virtual spatial transcriptomics from routine histology with Phoenix},
	year = {2026},
	journal = {bioRxiv},
	doi = {10.64898/2026.04.25.720812},
}

About

Highly accurate prediction of single-cell spatial transcriptomics from histology images
