Proseg v3 writes its output directly as a SpatialData zarr store. The Points dataframe that holds the transcripts (stored as Parquet inside the zarr store) has a column called `assignment` that stores the cell assignment as an integer. For transcripts assigned to background, however, this value is null. When the zarr store is read with spatialdata, dask/pandas converts this column to float because of the null values.
In theory this could be fixed easily by changing the `dtype_backend` in the `read_parquet` call for the points. However, this currently fails the validation logic (apparently only numpy dtypes are allowed?) and may have further implications.
The issue does not occur when the zarr store is written via spatialdata itself, because pandas stores a bunch of pandas-specific metadata in the Parquet file, including the data type backend for each column. Since Proseg writes the dataframe directly from Rust with an Arrow writer, this metadata is not available, and integer columns containing nulls are converted to float on load.