
Common-gene filtering can misalign spatial and reference expression columns #2

Description

@leotenshii

Hi again! I opened a separate issue about the spot-wise cosine similarity, but I think there may also be an independent issue in the gene alignment.

R implementation

In the R implementation, both matrices are indexed using the same ordered vector of common genes:

cg <- intersect(colnames(ref$X), colnames(srt$X))

srt$X <- srt$X[, cg]
ref$X <- ref$X[, cg]

This ensures that both matrices have the same gene ordering before downstream computations.

Python implementation

In the Python implementation, the common genes are identified first:

common_genes = np.intersect1d(spatial['genes'], ref['genes'])

but each matrix is then filtered independently:

sp_idx = np.where(np.isin(spatial['genes'], common_genes))[0]
rf_idx = np.where(np.isin(ref['genes'], common_genes))[0]

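np.isin() returns a boolean mask in the order of the array being filtered, so sp_idx and rf_idx each follow their own matrix's gene order rather than the (sorted) order of common_genes. The filtered matrices therefore line up column by column only if the common genes already appear in the same relative order in the spatial and reference matrices.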
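A minimal toy reproduction of the concern, using made-up gene names rather than anything from the real datasets. It only shows that the two isin-based index vectors can select the same genes in different orders:

```python
import numpy as np

# Hypothetical gene lists: same common genes, different relative order.
spatial_genes = np.array(["A", "B", "C", "D"])
ref_genes = np.array(["C", "A", "E", "B"])

# np.intersect1d returns the sorted unique common values.
common_genes = np.intersect1d(spatial_genes, ref_genes)

# Same filtering pattern as in the issue: each array keeps its own order.
sp_idx = np.where(np.isin(spatial_genes, common_genes))[0]
rf_idx = np.where(np.isin(ref_genes, common_genes))[0]

sp_order = spatial_genes[sp_idx]
rf_order = ref_genes[rf_idx]

# True when column j of the filtered spatial matrix and column j of the
# filtered reference matrix would refer to different genes.
misaligned = not np.array_equal(sp_order, rf_order)
print(sp_order, rf_order, misaligned)
```

Here both selections contain A, B and C, but the reference side keeps its original C, A, B order, so the columns no longer correspond.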

This matters because the least-squares initialization, cosine similarities, reconstructed expression, and gradients all assume column-wise correspondence between the two expression matrices.
Different ordering can occur when the datasets have undergone different preprocessing. For example, if reference differential-expression selection returns genes in score-ranked order while the spatial preprocessing preserves the original feature order, the resulting matrices may no longer share the same relative gene ordering.

Unless I'm missing something, it would be safer to index both matrices using the same ordered vector of common genes (as in the R implementation), rather than relying on the existing ordering being identical.
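One possible way to do this in NumPy, as a sketch rather than a proposed patch: np.intersect1d can return the positions of each common gene in both inputs via return_indices=True, so both matrices can be indexed in the same order. The helper name align_common_genes and the toy matrices are hypothetical, and I'm assuming genes are stored along the columns and that gene names are unique within each dataset (with duplicates, NumPy uses the first occurrence).

```python
import numpy as np

def align_common_genes(sp_genes, rf_genes):
    """Return common genes plus index vectors that put both inputs in that order.

    Hypothetical helper, not part of the package.
    """
    common, sp_idx, rf_idx = np.intersect1d(
        np.asarray(sp_genes), np.asarray(rf_genes), return_indices=True
    )
    return common, sp_idx, rf_idx

# Toy data (made up): rows are spots / cell types, columns are genes.
spatial_genes = np.array(["A", "B", "C", "D"])
ref_genes = np.array(["C", "A", "E", "B"])
spatial_X = np.arange(8).reshape(2, 4)
ref_X = np.arange(12).reshape(3, 4)

common, sp_idx, rf_idx = align_common_genes(spatial_genes, ref_genes)

# Both matrices are now indexed by the same ordered vector of common genes.
spatial_aligned = spatial_X[:, sp_idx]
ref_aligned = ref_X[:, rf_idx]

# Sanity check that column j refers to the same gene in both matrices.
assert np.array_equal(spatial_genes[sp_idx], ref_genes[rf_idx])
```

This mirrors the R behaviour of indexing both matrices with cg, except that the shared order is sorted rather than the order of the first argument to intersect(); any fixed shared order should work for the downstream computations.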
