fix: hide non-assembly-scope workflows from the assistant (#1321) #1333

Open
dannon wants to merge 5 commits into galaxyproject:main from dannon:fix/assistant-hide-nonassembly-workflows

Conversation

dannon commented Jun 8, 2026

Member

Stacks on #1328 (the taxonomy-lineage fix) -- this branch is based on assistant-workflow-taxonomy-lineage, so it should merge after #1328. Until then the diff below also includes #1328's commits.

Problem

The assistant exposes every workflow in the catalog, including the 4 ORGANISM-scoped ones (influenza-isolates-consensus-and-subtyping-main, assembly-with-flye-main, bacterial-genome-assembly-main, hyphy-capheine-core-and-compare). Those are assembly-building or comparative-genomics workflows that take either zero assemblies or two or more, which the assistant's single-organism/single-assembly model can't drive. The assistant therefore lists them and then gets confused about what to do with them (#1321).

Fix

Mirror the frontend's default view (which filters to scope === "ASSEMBLY"): hide non-ASSEMBLY workflows from every catalog tool. A module-level _is_assembly_scope helper (treating a missing scope as ASSEMBLY, like the frontend default) is applied in the four enumerating methods -- category counts, category listings, compatible-workflow search, and detail lookup.

The assistant and the MCP server use separate CatalogData implementations (app/services/tools/catalog_data.py and app/services/catalog_data.py), so the filter is applied in both (the MCP copy also guards check_workflow_assembly_compatibility). Thanks to @copilot-pull-request-reviewer for catching that the MCP path was initially missed.

The #1319 taxonomy/lineage logic is untouched.

Tests

Added scope to the test fixtures and a TestScopeFiltering class asserting an ORGANISM-scoped workflow is absent from category listings and compatible-workflow results, returns None from detail lookup, and isn't counted in category counts -- plus a missing-scope-defaults-to-ASSEMBLY case. The assistant CatalogData has 45 passing tests (40 existing incl. the #1319 lineage tests + 5 new); a new test_mcp_catalog_scope.py adds 4 more for the MCP CatalogData.
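The shape of those assertions might look roughly like the sketch below. It runs against a hypothetical `FakeCatalog` stand-in rather than the real `CatalogData`, and the fixture contents, the `iwcId` key, and the category name are assumptions for illustration; only the `TestScopeFiltering` class name and the behaviours being asserted come from the description.

```python
from typing import Any, Optional


def _is_assembly_scope(workflow: dict[str, Any]) -> bool:
    # Missing scope defaults to ASSEMBLY, as described in this PR.
    return workflow.get("scope", "ASSEMBLY") == "ASSEMBLY"


class FakeCatalog:
    """Hypothetical stand-in for CatalogData, just enough to test the filter."""

    def __init__(self, workflows_by_category: dict[str, list[dict[str, Any]]]):
        self._by_category = workflows_by_category

    def get_workflows_in_category(self, category: str) -> list[dict[str, Any]]:
        workflows = self._by_category.get(category, [])
        return [wf for wf in workflows if _is_assembly_scope(wf)]

    def get_workflow_details(self, iwc_id: str) -> Optional[dict[str, Any]]:
        for workflows in self._by_category.values():
            for wf in workflows:
                if wf["iwcId"] == iwc_id and _is_assembly_scope(wf):
                    return wf
        return None


# Illustrative fixture: one hidden workflow, one with no scope at all.
FIXTURE = {
    "EXAMPLE_CATEGORY": [
        {"iwcId": "bacterial-genome-assembly-main", "scope": "ORGANISM"},
        {"iwcId": "hypothetical-unscoped-main"},
    ],
}


class TestScopeFiltering:
    def setup_method(self) -> None:
        self.catalog = FakeCatalog(FIXTURE)

    def test_organism_scoped_absent_from_listing(self) -> None:
        listed = self.catalog.get_workflows_in_category("EXAMPLE_CATEGORY")
        assert "bacterial-genome-assembly-main" not in [w["iwcId"] for w in listed]

    def test_organism_scoped_details_return_none(self) -> None:
        assert self.catalog.get_workflow_details("bacterial-genome-assembly-main") is None

    def test_missing_scope_defaults_to_assembly(self) -> None:
        assert self.catalog.get_workflow_details("hypothetical-unscoped-main") is not None
```

The class follows pytest's naming conventions, so it would be collected by a plain `pytest` run if saved in a `test_*.py` file.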

dannon added 3 commits June 5, 2026 12:34
The assistant's catalog tools were still doing an exact taxon-id match when
checking workflow/organism compatibility, so a workflow annotated for Bacteria
(taxon 2) got rejected on an actual bacterium like E. coli (562). The frontend,
the catalog build, and the MCP server were all already fixed to check the
organism's full lineage -- this just brings the assistant's copy in
tools/catalog_data.py into line. It builds a lineage index from each genome's
lineageTaxonomyIds and matches when the workflow's taxon is anywhere in that
lineage. Fixes galaxyproject#1319.

The workflow/organism compatibility rule lives in four copies that drift apart
-- which is what caused galaxyproject#1319. Adds a NOTE in the assistant's catalog_data.py
pointing at the other three, and rewrites the existing (stale, only-listed-two)
note in build-workflow-mappings.ts to name all four and point at the
consolidation follow-up. Tracked in galaxyproject#1327.

Copilot AI review requested due to automatic review settings June 8, 2026 16:41
github-actions bot added the fix label Jun 8, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the assistant agent’s in-memory catalog access to (a) hide workflows that aren’t scope === "ASSEMBLY" (so the assistant doesn’t surface ORGANISM-scoped workflows it can’t drive) and (b) use lineage-based taxonomy matching for workflow compatibility checks (stacked from #1328 / #1319 fix).

Changes:

  • Add an _is_assembly_scope helper and apply it across assistant catalog enumeration methods (category counts, category listing, compatible workflow search, and workflow detail lookup).
  • Build a taxonomy lineage index from lineageTaxonomyIds and use it to match ancestor-targeted workflows to descendant organisms/assemblies.
  • Extend fixtures and add tests validating scope filtering behavior (including “missing scope defaults to ASSEMBLY”).
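The lineage matching in the second bullet can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `lineageTaxonomyIds` is the field named above, while the `ncbiTaxonomyId` key, the function names, and the abbreviated E. coli lineage in the sample data are assumptions for the example.

```python
from typing import Any


def build_lineage_index(genomes: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each genome's taxonomy id to the set of ids in its lineage."""
    index: dict[str, set[str]] = {}
    for genome in genomes:
        taxon = str(genome["ncbiTaxonomyId"])  # hypothetical key name
        lineage = {str(t) for t in genome.get("lineageTaxonomyIds", [])}
        index.setdefault(taxon, set()).update(lineage)
    return index


def is_taxonomy_compatible(
    workflow_taxon: Any, organism_taxon: Any, index: dict[str, set[str]]
) -> bool:
    """Match when the workflow's taxon is the organism or any of its ancestors."""
    workflow_id = str(workflow_taxon)
    organism_id = str(organism_taxon)
    if workflow_id == organism_id:
        return True
    return workflow_id in index.get(organism_id, set())


# Illustrative data: an abbreviated lineage for E. coli (taxon 562).
GENOMES = [
    {
        "ncbiTaxonomyId": 562,
        "lineageTaxonomyIds": [2, 1224, 1236, 91347, 543, 561, 562],
    },
]

LINEAGE_INDEX = build_lineage_index(GENOMES)
```

With this index, a workflow annotated for Bacteria (taxon 2) matches E. coli (562), which an exact taxon-id comparison would reject.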

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| catalog/build/ts/build-workflow-mappings.ts | Updates internal NOTE about duplicated compatibility logic locations. |
| backend/api/tests/test_catalog_data.py | Adds scope to fixtures and introduces TestScopeFiltering coverage. |
| backend/api/app/services/tools/catalog_data.py | Implements scope filtering and lineage-based taxonomy compatibility for assistant tools. |

Comment thread backend/api/app/services/tools/catalog_data.py
galaxyproject#1321)

The MCP server uses a separate CatalogData (app/services/catalog_data.py) from
the assistant's (app/services/tools/catalog_data.py), so the scope filter has to
be applied in both -- otherwise ORGANISM-scoped workflows still leak through the
MCP list/get/compatible/details tools. Caught in Copilot review on galaxyproject#1333.

dannon commented Jun 8, 2026

Member Author

Thanks @copilot-pull-request-reviewer -- good catch, that was a real gap.

Real bug, fixed: the MCP server uses a separate CatalogData (app/services/catalog_data.py) from the assistant's (app/services/tools/catalog_data.py), so my scope filter only covered the assistant path -- ORGANISM-scoped workflows still leaked through the MCP list_workflow_categories / get_workflows_in_category / get_compatible_workflows / get_workflow_details tools. Applied the same _is_assembly_scope filter to the MCP CatalogData (plus check_workflow_assembly_compatibility), with direct unit tests. (93bb1a4d)

Also corrected the PR description -- the "wraps the same CatalogData" line was wrong; they're two implementations. (The lineage half was fine: the MCP CatalogData already does lineage-based taxonomy matching.)

One judgment call: I extended the scope filter to the MCP tools on the assumption they share the assistant's single-assembly model. If you'd rather MCP stay a broad catalog interface that still exposes organism-scoped workflows, I'll revert that and just fix the description instead.

…t schema updates (galaxyproject#1321)

Codex adversarial review on galaxyproject#1333 found two paths that still reached hidden
ORGANISM-scope workflows by IWC id even though the listing tools filter them:
MCP resolve_workflow_inputs (resolved name/TRS/params before any scope check)
and the assistant's _apply_schema_updates / _find_workflow_trs_id (matched
against the raw workflows_by_category, bypassing the scoped get_workflow_details).
Both now fail closed via _is_assembly_scope, with regression tests.
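
A fail-closed lookup by IWC id, of the kind this commit message describes, might look like the sketch below. It is a minimal illustration, assuming a flat list of raw workflow dicts; `resolve_inputs_fail_closed`, the `iwcId`, `name`, and `trsId` keys, and the sample values are hypothetical and not taken from the codebase.

```python
from typing import Any, Optional


def _is_assembly_scope(workflow: dict[str, Any]) -> bool:
    # Missing scope defaults to ASSEMBLY, as described in this PR.
    return workflow.get("scope", "ASSEMBLY") == "ASSEMBLY"


def resolve_inputs_fail_closed(
    iwc_id: str, raw_workflows: list[dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Resolve a workflow by id, checking scope before exposing any field.

    Hidden workflows return None exactly like unknown ids, so a caller
    cannot reach an ORGANISM-scope workflow by guessing its id.
    """
    for workflow in raw_workflows:
        if workflow.get("iwcId") != iwc_id:
            continue
        if not _is_assembly_scope(workflow):
            return None  # fail closed: scope check precedes name/TRS resolution
        return {"name": workflow.get("name"), "trsId": workflow.get("trsId")}
    return None


# Illustrative data with placeholder names and TRS ids.
RAW_WORKFLOWS = [
    {
        "iwcId": "hyphy-capheine-core-and-compare",
        "scope": "ORGANISM",
        "name": "placeholder-hidden-name",
        "trsId": "placeholder-hidden-trs-id",
    },
    {
        "iwcId": "hypothetical-assembly-scoped-main",
        "scope": "ASSEMBLY",
        "name": "placeholder-visible-name",
        "trsId": "placeholder-visible-trs-id",
    },
]
```

Running the scope check before any field is read is what keeps the id-based path consistent with the listing tools.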
Copilot AI review requested due to automatic review settings June 8, 2026 18:09

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment on lines +26 to +28
"""The assistant only drives the single-organism/single-assembly flow, so it
only sees ASSEMBLY-scope workflows (matching the frontend's default view).
Organism- and comparative-scoped workflows are hidden from it."""

dannon marked this pull request as ready for review June 9, 2026 01:17