_onclass.py:71 uses literal string 'self.labels_key' instead of attribute self.labels_key #114

Description

@joschkahey

Summary

popv/algorithms/_onclass.py:71 uses the literal string "self.labels_key" as a pandas column key instead of the attribute self.labels_key. This silently creates a column named "self.labels_key" in adata.obs and masks the real intent of the line. The user-facing crash (a TypeError: Cannot setitem on a Categorical with a new category (unknown), raised by the follow-up write at line 191) is already fixed on main post-0.6.1 by casting self.result_key and self.seen_result_key to str before the relabel write. The literal-string typo at line 71 is still present and is worth fixing as a code-quality cleanup.

Reproduction

Run the OnClass voter in prediction_mode="inference" on a query that contains cells tagged with unknown_celltype_label:

from popv.hub import HubModel
hub = HubModel.from_pretrained("popv/tabula_sapiens_All_Cells")
hub.annotate_data(
    query, save_folder="popv_out",
    prediction_mode="inference",
    methods_list=["OnClass", "KNN_SCVI", "Support_Vector"],
)
# On 0.6.0:
#   TypeError: Cannot setitem on a Categorical with a new category (unknown), set the categories first
# On main:
#   Runs to completion, but a phantom column "self.labels_key" appears on the query.

Root cause

# popv/algorithms/_onclass.py:71 — both 0.6.0 and main
adata.obs.loc[adata.obs["_dataset"] == "query", "self.labels_key"] = adata.uns["unknown_celltype_label"]

The column indexer passed to .loc is the literal string "self.labels_key" (in quotes). pandas treats it as a new column label and creates that column, filling the query rows with the unknown label and leaving the other rows as NaN. The intended indexer is the attribute self.labels_key (no quotes). The phantom column is functionally harmless, but it is a clear typo and obscures the line's intent for reviewers.
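The difference between the quoted and unquoted indexer can be reproduced with plain pandas. This is a minimal sketch on a toy DataFrame standing in for adata.obs; the column name "cell_type", the label "unknown", and the variable labels_key (playing the role of self.labels_key) are illustrative assumptions, not values taken from popV.

```python
import pandas as pd

# Toy stand-in for adata.obs; names and values are illustrative only.
obs = pd.DataFrame(
    {
        "_dataset": ["ref", "query", "query"],
        "cell_type": ["T cell", "B cell", "NK cell"],
    }
)
labels_key = "cell_type"  # hypothetical value of self.labels_key
is_query = obs["_dataset"] == "query"

# Quoted: pandas sees an unknown column label and adds a new column.
buggy = obs.copy()
buggy.loc[is_query, "self.labels_key"] = "unknown"

# Unquoted: the existing labels column is overwritten for the query rows.
fixed = obs.copy()
fixed.loc[is_query, labels_key] = "unknown"

print(list(buggy.columns))          # includes the phantom "self.labels_key"
print(fixed[labels_key].tolist())   # query rows now read "unknown"
```

In the quoted case the real labels column is left untouched, which is why the typo goes unnoticed at runtime.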

Suggested patch

- adata.obs.loc[adata.obs["_dataset"] == "query", "self.labels_key"] = adata.uns["unknown_celltype_label"]
+ adata.obs.loc[adata.obs["_dataset"] == "query", self.labels_key] = adata.uns["unknown_celltype_label"]

Related issues

  • Closed issue #28 describes the same broader root pattern (Categorical setitem with a new category). The user-facing crash on 0.6.0 is now fixed on main, but the literal-string typo at line 71 is independent and remains.
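For reference, the Categorical setitem pattern mentioned above can be shown in isolation. This is a sketch on a toy Series, not popV's code: the values are made up, the cast to str mirrors the fix described in the summary, and add_categories is shown only as an alternative that is not claimed to be what main does.

```python
import pandas as pd

# Toy categorical labels; "unknown" is not among the categories.
labels = pd.Series(["T cell", "B cell"], dtype="category")

# Writing a value outside the categories raises a TypeError.
raised = False
try:
    trial = labels.copy()
    trial[trial == "B cell"] = "unknown"
except TypeError as err:
    raised = True
    print(err)

# Option 1: cast to str first, as the summary describes for main.
as_str = labels.astype(str)
as_str[as_str == "B cell"] = "unknown"

# Option 2 (alternative, an assumption): register the category first.
with_cat = labels.cat.add_categories("unknown")
with_cat[with_cat == "B cell"] = "unknown"
```

Casting to str drops the categorical dtype, while add_categories keeps it; which one suits a given column depends on how downstream code consumes it.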

Affected releases

Verified on 0.6.0 and main (commit at time of writing).

Context

We carry an in-template monkey-patch for the 0.6.0-only user-facing crash in our downstream pipeline (Cytoreason nf-core-scdownstream) at modules/local/popv_ensemble/templates/popv_patches.py. The literal-string fix is a small code-quality cleanup; happy to send a PR if useful.
