This repository contains the code used to produce the results in the article "Efficient algorithms for pangenome personalization".
To compile the source code, the following tools should be installed:
- KMC API headers, which can be downloaded with ./download_include.sh. Run the script from the root of the repo after cloning.
- OpenMP library for C++ multithreading
- OR-TOOLS library for min-cost flow.
OR-TOOLS can be downloaded from its official website.
You can use CMake to compile the code with the rules from CMakeLists.txt.
In that case, pass the argument -DCMAKE_PREFIX_PATH="<path/to/or-tools/dir>" to cmake when configuring the build.
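The build steps above can be sketched as follows. This is a minimal sketch, not a verified recipe: it assumes CMake 3.13 or newer (for the -S and -B options), that the commands are run from the root of the repo, and that the build directory is named build. The variable name ORTOOLS_DIR and the directory name build are illustrative choices, not something the repository defines.

```shell
# Placeholder: replace with the directory where OR-TOOLS is installed.
# ORTOOLS_DIR is a hypothetical variable name used only in this sketch.
ORTOOLS_DIR="<path/to/or-tools/dir>"

# Fetch the KMC API headers (run from the root of the repo).
./download_include.sh

# Configure an out-of-source build, pointing CMake at OR-TOOLS.
cmake -S . -B build -DCMAKE_PREFIX_PATH="$ORTOOLS_DIR"

# Compile the targets defined in CMakeLists.txt.
cmake --build build
```

Keeping the build out of the source tree means the build directory can be deleted and regenerated without touching the sources.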
There are three executables that can be compiled:
- gfa_scorer is an executable that, for a given graph in GFA and a $k$-mer database, outputs the scores of vertices in pgf format on standard output.
- paths_finder is an executable that, for a given graph in PGF, runs the 2-paths algorithm and returns two paths in the specified format.
- table_producer is an executable that, for a given set of GFAs and walks, returns tables with stats that are further used in the section "Recovery of true vertices".
Important
To run paths_finder properly, the header of the graph in PGF must be changed, which can be done with exps/scripts/change_header.py:
python change_header.py graph.gfa score_graph.pgf > scored_graph_with_header.pgf