fix: hide non-assembly-scope workflows from the assistant (#1321) #1333
dannon wants to merge 5 commits into
The assistant's catalog tools were still doing an exact taxon-id match when checking workflow/organism compatibility, so a workflow annotated for Bacteria (taxon 2) got rejected on an actual bacterium like E. coli (562). The frontend, the catalog build, and the MCP server were all already fixed to check the organism's full lineage -- this just brings the assistant's copy in tools/catalog_data.py into line. It builds a lineage index from each genome's lineageTaxonomyIds and matches when the workflow's taxon is anywhere in that lineage. Fixes galaxyproject#1319.
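A minimal sketch of the lineage check described in this commit message, not the actual code in `tools/catalog_data.py`. It assumes each genome record is a dict carrying the `lineageTaxonomyIds` list named above plus its own taxon id under `ncbiTaxonomyId` (a hypothetical field name); `build_lineage_index` and `taxon_matches` are illustrative names, and the lineage shown is abbreviated.

```python
from typing import Dict, Iterable, Set


def build_lineage_index(genomes: Iterable[dict]) -> Dict[str, Set[str]]:
    """Map each genome's taxon id to the set of taxon ids in its lineage."""
    index: Dict[str, Set[str]] = {}
    for genome in genomes:
        taxon_id = str(genome["ncbiTaxonomyId"])  # hypothetical field name
        lineage = {str(t) for t in genome.get("lineageTaxonomyIds", [])}
        lineage.add(taxon_id)  # an organism always matches its own taxon
        index.setdefault(taxon_id, set()).update(lineage)
    return index


def taxon_matches(workflow_taxon: str, organism_taxon: str,
                  index: Dict[str, Set[str]]) -> bool:
    """True when the workflow's taxon is the organism or one of its ancestors."""
    # Unknown organisms fall back to the old exact-match behaviour.
    lineage = index.get(str(organism_taxon), {str(organism_taxon)})
    return str(workflow_taxon) in lineage


# E. coli (562) with an abbreviated lineage that includes Bacteria (2).
genomes = [{"ncbiTaxonomyId": "562",
            "lineageTaxonomyIds": ["2", "1224", "543", "561", "562"]}]
index = build_lineage_index(genomes)
bacteria_on_ecoli = taxon_matches("2", "562", index)
archaea_on_ecoli = taxon_matches("2157", "562", index)
```

With an exact-id comparison the Bacteria-annotated workflow would be rejected for taxon 562; with the lineage set it is accepted, while a workflow annotated for an unrelated taxon still is not.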
The workflow/organism compatibility rule lives in four copies that drift apart -- which is what caused galaxyproject#1319. Adds a NOTE in the assistant's catalog_data.py pointing at the other three, and rewrites the existing (stale, only-listed-two) note in build-workflow-mappings.ts to name all four and point at the consolidation follow-up. Tracked in galaxyproject#1327.
Pull request overview
This PR updates the assistant agent’s in-memory catalog access to (a) hide workflows that aren’t scope === "ASSEMBLY" (so the assistant doesn’t surface ORGANISM-scoped workflows it can’t drive) and (b) use lineage-based taxonomy matching for workflow compatibility checks (stacked from #1328 / #1319 fix).
Changes:
- Add an `_is_assembly_scope` helper and apply it across assistant catalog enumeration methods (category counts, category listing, compatible workflow search, and workflow detail lookup).
- Build a taxonomy lineage index from `lineageTaxonomyIds` and use it to match ancestor-targeted workflows to descendant organisms/assemblies.
- Extend fixtures and add tests validating scope filtering behavior (including “missing scope defaults to ASSEMBLY”).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| catalog/build/ts/build-workflow-mappings.ts | Updates internal NOTE about duplicated compatibility logic locations. |
| backend/api/tests/test_catalog_data.py | Adds scope to fixtures and introduces TestScopeFiltering coverage. |
| backend/api/app/services/tools/catalog_data.py | Implements scope filtering and lineage-based taxonomy compatibility for assistant tools. |
galaxyproject#1321) The MCP server uses a separate CatalogData (app/services/catalog_data.py) from the assistant's (app/services/tools/catalog_data.py), so the scope filter has to be applied in both -- otherwise ORGANISM-scoped workflows still leak through the MCP list/get/compatible/details tools. Caught in Copilot review on galaxyproject#1333.
Thanks @copilot-pull-request-reviewer -- good catch, that was a real gap.
Real bug, fixed: the MCP server uses a separate
Also corrected the PR description -- the "wraps the same CatalogData" line was wrong; they're two implementations. (The lineage half was fine: the MCP
One judgment call: I extended the scope filter to the MCP tools on the assumption they share the assistant's single-assembly model. If you'd rather MCP stay a broad catalog interface that still exposes organism-scoped workflows, I'll revert that and just fix the description instead.
…t schema updates (galaxyproject#1321) Codex adversarial review on galaxyproject#1333 found two paths that still reached hidden ORGANISM-scope workflows by IWC id even though the listing tools filter them: MCP resolve_workflow_inputs (resolved name/TRS/params before any scope check) and the assistant's _apply_schema_updates / _find_workflow_trs_id (matched against the raw workflows_by_category, bypassing the scoped get_workflow_details). Both now fail closed via _is_assembly_scope, with regression tests.
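The "fail closed" behaviour for id-based lookups can be sketched as follows. This is an illustration under assumptions, not the code in the PR: `resolve_workflow_inputs_sketch`, the flat `workflows_by_id` mapping, and the `name` and `trsId` fields are hypothetical; only the `_is_assembly_scope` name and the "missing scope means ASSEMBLY" rule come from the PR text.

```python
from typing import Dict, Optional


def _is_assembly_scope(workflow: dict) -> bool:
    """A missing scope is treated as ASSEMBLY, as the PR describes."""
    return (workflow.get("scope") or "ASSEMBLY") == "ASSEMBLY"


def resolve_workflow_inputs_sketch(workflows_by_id: Dict[str, dict],
                                   iwc_id: str) -> Optional[dict]:
    """Resolve a workflow by id, refusing hidden (non-ASSEMBLY) workflows."""
    workflow = workflows_by_id.get(iwc_id)
    # The scope check runs before any name/TRS/parameter resolution, so a
    # hidden workflow looks exactly like an unknown id to the caller.
    if workflow is None or not _is_assembly_scope(workflow):
        return None
    return {"name": workflow["name"], "trs_id": workflow["trsId"]}


workflows_by_id = {
    # Hypothetical records; the scope values are the only part that matters.
    "example-assembly-workflow": {"name": "Example", "trsId": "trs-1",
                                  "scope": "ASSEMBLY"},
    "bacterial-genome-assembly-main": {"name": "Bacterial assembly",
                                       "trsId": "trs-2", "scope": "ORGANISM"},
}
visible = resolve_workflow_inputs_sketch(workflows_by_id,
                                         "example-assembly-workflow")
hidden = resolve_workflow_inputs_sketch(workflows_by_id,
                                        "bacterial-genome-assembly-main")
```

Returning the same result for hidden and unknown ids means no caller can distinguish the two, which is what closes the id-based bypass.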
"""The assistant only drives the single-organism/single-assembly flow, so it
only sees ASSEMBLY-scope workflows (matching the frontend's default view).
Organism- and comparative-scoped workflows are hidden from it."""
Stacks on #1328 (the taxonomy-lineage fix) -- this branch is based on `assistant-workflow-taxonomy-lineage`, so it should merge after #1328. Until then the diff below also includes #1328's commits.
Problem
The assistant exposes every workflow in the catalog, including the 4 ORGANISM-scoped ones (`influenza-isolates-consensus-and-subtyping-main`, `assembly-with-flye-main`, `bacterial-genome-assembly-main`, `hyphy-capheine-core-and-compare`). Those are assembly-building / comparative-genomics workflows whose 0- or 2+-assembly flow the assistant's single-organism/single-assembly model can't drive, so it lists them and then gets confused about what to do with them (#1321).
Fix
Mirror the frontend's default view (which filters to `scope === "ASSEMBLY"`): hide non-ASSEMBLY workflows from every catalog tool. A module-level `_is_assembly_scope` helper (treating a missing scope as ASSEMBLY, like the frontend default) is applied in the four enumerating methods -- category counts, category listings, compatible-workflow search, and detail lookup.
The assistant and the MCP server use separate `CatalogData` implementations (`app/services/tools/catalog_data.py` and `app/services/catalog_data.py`), so the filter is applied in both (the MCP copy also guards `check_workflow_assembly_compatibility`). Thanks to @copilot-pull-request-reviewer for catching that the MCP path was initially missed.
The #1319 taxonomy/lineage logic is untouched.
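The scope filter described in this section can be sketched roughly as below. It is a simplified stand-in, not either real `CatalogData`: the `CatalogSketch` class, its method names, and the `iwcId` field are assumptions made for illustration, and the sample records are invented.

```python
from typing import Dict, List, Optional


def _is_assembly_scope(workflow: dict) -> bool:
    """A missing scope defaults to ASSEMBLY, like the frontend's default view."""
    return (workflow.get("scope") or "ASSEMBLY") == "ASSEMBLY"


class CatalogSketch:
    """Hypothetical in-memory catalog that hides non-ASSEMBLY workflows."""

    def __init__(self, workflows_by_category: Dict[str, List[dict]]) -> None:
        self._by_category = workflows_by_category

    def category_counts(self) -> Dict[str, int]:
        # Hidden workflows must not inflate the counts either.
        return {category: sum(1 for wf in workflows if _is_assembly_scope(wf))
                for category, workflows in self._by_category.items()}

    def list_category(self, category: str) -> List[dict]:
        return [wf for wf in self._by_category.get(category, [])
                if _is_assembly_scope(wf)]

    def get_workflow_details(self, iwc_id: str) -> Optional[dict]:
        for workflows in self._by_category.values():
            for wf in workflows:
                if wf.get("iwcId") == iwc_id and _is_assembly_scope(wf):
                    return wf
        return None


catalog = CatalogSketch({
    "ASSEMBLY_TOOLS": [  # invented category and records
        {"iwcId": "wf-a", "scope": "ASSEMBLY"},
        {"iwcId": "wf-b"},  # no scope field: treated as ASSEMBLY
        {"iwcId": "assembly-with-flye-main", "scope": "ORGANISM"},
    ],
})
counts = catalog.category_counts()
listed_ids = [wf["iwcId"] for wf in catalog.list_category("ASSEMBLY_TOOLS")]
```

Keeping the rule in one module-level helper means every enumerating method applies the same definition of "visible", including the missing-scope default.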
Tests
Added `scope` to the test fixtures and a `TestScopeFiltering` class asserting an ORGANISM-scoped workflow is absent from category listings and compatible-workflow results, returns `None` from detail lookup, and isn't counted in category counts -- plus a missing-scope-defaults-to-ASSEMBLY case. The assistant `CatalogData` has 45 passing tests (40 existing incl. the #1319 lineage tests + 5 new); a new `test_mcp_catalog_scope.py` adds 4 more for the MCP `CatalogData`.