Skip to content

Reduce excessive path indexing#4915

Merged
adamnovak merged 14 commits into
masterfrom
plan-path-indexing
Jun 11, 2026
Merged

Reduce excessive path indexing#4915
adamnovak merged 14 commits into
masterfrom
plan-path-indexing

Conversation

@adamnovak

@adamnovak adamnovak commented May 26, 2026

Copy link
Copy Markdown
Member

Changelog Entry

To be copied to the draft changelog by merger:

  • vg find -Q/--paths-named is now deprecated due to its partial-Protobuf output
  • vg find will now index its target paths but not other haplotype paths.
  • vg should no longer position-index haplotype paths unnecessarily in commands using the PathPositionOverlayHelper.

Description

This should fix cases where using the PathPositionOverlayHelper was making us index all haplotype paths.

I'm not sure if more commands need to be patched like vg paths to send along the names of haplotype paths they're actually interested in, to let queries on haplotype paths work. But we've only been indexing all the paths for a few months, and before that I think the behavior before that was for those queries to mysteriously give wrong answers.

@adamnovak adamnovak changed the title Plan path indexing Reduce excessive path indexing May 26, 2026
@adamnovak

Copy link
Copy Markdown
Member Author

I think we might need to fix vg stats --snarl-sample; I'm not sure it's under test with non-reference samples.

@adamnovak

Copy link
Copy Markdown
Member Author

This uses vgteam/libbdsg#241 in libbdsg.

@adamnovak

adamnovak commented Jun 10, 2026

Copy link
Copy Markdown
Member Author

OK I went through all the subcommands using PathPositionOverlayHelper and I think I've fixed them all to ask to index the right paths.

The one I couldn't do yet was vg inject; now it can only inject into reference paths, since we don't have a great way to load the BAM header without the graph being ready.

We already have a lockout for that:

vg/src/alignment.cpp

Lines 590 to 600 in 1d90c0b

// TODO: Decide we need to positional-index this path? Make
// PackedReferencePathOverlay take a collection of paths to
// index and use this one?
#pragma omp critical (cerr)
std::cerr << "error[vg::parse_tid_path_handle_map] Path " << target_name
<< " referenced in header exists in graph, but as a haplotype."
<< " It is probably not indexed for positional lookup. Make the"
<< " path a reference path"
<< " <https://github.com/vgteam/vg/wiki/Changing-References>"
<< " and try again." << std::endl;
exit(1);

This should be ready to merge when CI passes.

@adamnovak adamnovak merged commit 9788761 into master Jun 11, 2026
2 checks passed
@faithokamoto faithokamoto deleted the plan-path-indexing branch June 11, 2026 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant