Incorrect fold_change semantics in pdex >=0.2.0 break cell-eval metrics #232

@LeonHafner

Hi,

We identified a bug in cell-eval caused by a change in pdex, starting with version 0.2.0.

Previously, the fold_change column in pdex output contained linear fold changes computed as target_mean / ref_mean. Newer versions of pdex output log2 fold changes instead, but the column name is still fold_change.

cell-eval expects linear fold changes unless a log2_fold_change column is present. If that column is missing, it computes log2 values from fold_change. Because pdex now already provides log2-transformed values under the same column name, this results in applying log2 twice:

    # Add log2 fold change columns if not present
    if self.log2_fold_change_col not in self.data.columns:
        self.data = self.data.with_columns(
            pl.col(self.fold_change_col)
            .log(base=2)
            .alias(self.log2_fold_change_col)
            .fill_nan(0.0)
        ).with_columns(
            pl.col(self.log2_fold_change_col)
            .abs()
            .alias(self.abs_log2_fold_change_col)
        )

  • First transformation (in pdex): FC → log2(FC)
  • Second transformation (in cell-eval): log2(FC) → log2(log2(FC))

This leads to:

  • Linear fold changes between 0 and 1 becoming negative after the first log2
  • Those negative values becoming invalid (NaN) after the second log2
  • NaNs being replaced with 0.0 in cell-eval (by the .fill_nan(0.0) call in the snippet above)
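
The effect can be reproduced without pdex or cell-eval. The following is a minimal sketch in plain Python, not the actual cell-eval code: `log2_nan_on_negative` and `fill_nan` are hypothetical helpers that assume the log step yields NaN for negative input (as described above) and -inf for zero.

```python
import math

def log2_nan_on_negative(x):
    # Assumption: mimic a floating-point log that returns NaN for negative
    # input and -inf for zero instead of raising an exception.
    if x < 0:
        return math.nan
    if x == 0:
        return -math.inf
    return math.log2(x)

def fill_nan(x, value=0.0):
    # Hypothetical stand-in for the NaN replacement step.
    return value if math.isnan(x) else x

# Linear fold changes: two down-regulated, one unchanged, two up-regulated.
linear_fc = [0.25, 0.5, 1.0, 2.0, 4.0]

# First log2, as emitted under the fold_change column by pdex >= 0.2.0.
pdex_output = [math.log2(fc) for fc in linear_fc]

# Second log2 plus NaN replacement, as applied when log2_fold_change is missing.
double_logged = [fill_nan(log2_nan_on_negative(v)) for v in pdex_output]

print(pdex_output)
print(double_logged)
```

In this toy input every down-regulated entry collapses to 0.0, and the unchanged entry becomes -inf because its log2 fold change is exactly 0.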

As a result, roughly 50% of fold change values are set to zero, which significantly affects downstream metrics, including:

  • overlap_at_*
  • precision_at_*
  • de_direction_match

There are two ways to fix this:

  1. Revert pdex to output linear fold changes in the fold_change column, keeping the current name
  2. Alternatively, rename the pdex output column from fold_change to log2_fold_change

The second option may introduce compatibility issues with downstream tools expecting the fold_change column.
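
To make the difference between the two options concrete, here is a rough sketch on a toy table stored as a plain dict of columns. The helper names `restore_linear` and `rename_to_log2` are hypothetical and do not exist in pdex or cell-eval; the sketch only illustrates what each option would change in the output.

```python
import math

def restore_linear(log2_fc):
    # Option 1: undo the log2 so fold_change holds linear values again.
    return [2.0 ** v for v in log2_fc]

def rename_to_log2(table):
    # Option 2: keep the log2 values but store them under log2_fold_change.
    renamed = dict(table)
    renamed["log2_fold_change"] = renamed.pop("fold_change")
    return renamed

# Toy stand-in for pdex >= 0.2.0 output: fold_change already holds log2 values.
pdex_like = {"gene": ["A", "B", "C"], "fold_change": [-2.0, 0.0, 1.0]}

option1 = {**pdex_like, "fold_change": restore_linear(pdex_like["fold_change"])}
option2 = rename_to_log2(pdex_like)

# With option 1, a single downstream log2 recovers the intended values.
recomputed = [math.log2(v) for v in option1["fold_change"]]

print(option1["fold_change"])
print(sorted(option2))
print(recomputed)
```

With option 1 the downstream log2 step stays correct as written. With option 2 the log2_fold_change column is already present, so the branch that computes it from fold_change would be skipped.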

Thanks
