Skip to content

T0chka/MLPermutationSolver

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MLPermutationSolver

MLPermutationSolver is a GPU-native framework for solving permutation sorting problems on Cayley graphs with beam search, learned scoring models, and puzzle-specific heuristics. It currently supports:

  • LRX puzzle: permutation sorting on a Cayley graph generated by three moves — L (left shift), R (right shift), and X (swap of the first two elements).
  • Pancake puzzle: prefix-reversal pancake sorting with moves Rk, which reverse the first k elements.

This repo is a spinoff from CayleyPy by Alexander Chervov and collaborators (applying artificial intelligence methods to the mathematical problem of Cayley graph pathfinding). It represents an independent implementation of the Beam search solvers for testing, the results of which were reported in academic publications e.g. here.

Key Features

  • Random walk data generation algorithms for training ML models
  • Beam search solver BeamSolver with ML guidance and backward modes (backward_mode="off" | "bfs" | "beam")
  • Multiple model architectures (XGBoost, CatBoost, MLP)
  • GPU-optimized implementation: the search hot path, state hashing, meet-in-the-middle lookups, and model inference are designed to stay on GPU
  • Benchmarking, profiling, and evaluation tools

Installation

CUDA-capable GPU is required; CPU execution is not supported. uv is used for environment management.

Setup

git clone https://github.com/T0chka/MLPermutationSolver.git
cd MLPermutationSolver
uv venv
uv sync

Quick Start

To run the full pipeline end-to-end (generate random-walk training data, train an ML model, and then solve target permutations with BeamSolver using model-guided scoring) use solve_puzzle.py — configurable in the script header via InputConfig:

  • input.state = [1,0,2,...] — solve this specific permutation
  • input.state = None — generate a random permutation (use input.random_state_size and input.random_seed)
uv run solve_puzzle.py

Project Structure

MLPermutationSolver/
├── src/
│   ├── data_gen/       # Random walk data generation, BFS distances
│   ├── models/         # ML model implementations
│   └── solvers/        # Beam search solvers
├── experiments/        # Benchmarking and analysis scripts
├── pyproject.toml      # Project config and dependencies (uv)
├── uv.lock             # Locked dependency versions
└── README.md           # This file

Experiments

The experiments/ directory contains scripts for:

  • Benchmarking random walk implementations
  • Optimizing hyperparameters
  • Profiling model and solver performance

See experiments/README_exp.md for detailed descriptions of each script and their parameters.

Algorithms

BFS Distances (src/data_gen/bfs_distances.py)

Numba-optimized BFS over the full permutation Cayley graph for any puzzle (generators passed in). Computes exact distances from the identity to every reachable state. Memory-bound for LRX puzzle: max n=13 on RTX 3090 with 24 GB VRAM / 124 GB RAM (~70 GiB for n=13). Typical runtimes: n=12 ~2 min, n=13 ~35 min.

Random Walk Data Generation (src/data_gen/random_walks.py)

All dataset generators simulate sequences of states (permutations) starting from the identity permutation. They output (X, y), where X is a batch of states and y is a step index used as a training target / proxy for "search depth".

  1. first_visit_random_walks Unconstrained random walks. Repeats are allowed. After sampling, each unique state is assigned y = the earliest step at which the state appeared in the whole sample ("first-visit time"). Therefore y is a function of state only, and (state, y) pairs collapse to unique states. This generator is very fast, but can waste a large fraction of the sampling budget on duplicates for larger state_size.

  2. nbt_random_walks Per-walk self-avoiding (non-backtracking) random walks. Each walk tracks the set of states it has visited and forbids transitions to already visited states within that walk. If a walk has no valid move, it terminates early. The same state may appear at different steps across different walks, so y is not a pure function of state. This generator is significantly slower due to per-walk history checks.

  3. random_walks_beam_nbt Beam-style sampling with global non-backtracking in a sliding history window. At each step, expand the current beam of states by all moves, discard states present in the recent global hash history, then randomly select a new beam of size n_walks from the remaining candidates. The target uses an "effective step" (counter of successful expansions); when the state-space saturates, effective step may stop increasing even if the loop continues. This variant typically maximizes unique-state coverage per fixed sampling budget and is much faster than per-walk self-avoidance.

ML Models (src/models/)

The solver treats a permutation state as a feature vector (the raw sequence of integers) and learns a scalar score y used as a proxy for "how deep / hard" a state is in the search graph.

All models implement BaseModel with four methods:

  • train(X, y): fit on a batch of states X and scalar targets y
  • predict(X): return a 1D tensor of scores for states X
  • save(path), load(path): persist and restore a trained model
  1. XGBoostModel (src/models/xgboost_model.py) Gradient-boosted trees trained and inferred fully on GPU. PyTorch CUDA tensors are converted to CuPy via DLPack, XGBoost uses a GPU DMatrix for training, and inference uses inplace_predict(). Predictions are returned as a CUDA torch.Tensor via DLPack zero-copy.

  2. CatBoostModel (src/models/catboost_model.py) CatBoostRegressor configured for GPU training. However, this wrapper currently moves tensors to CPU (X.cpu().numpy(), y.cpu().numpy()) for both training and prediction, then wraps predictions back into a torch tensor.

  3. MLPModel (src/models/mlp_model.py) A feed-forward neural network (Linear + BatchNorm + ReLU + Dropout blocks) trained with Adam and MSE loss. This is a straightforward regression baseline that runs on the same device as X. The checkpoint includes both model and optimizer state, plus the architecture/training hyperparameters needed to reconstruct the network on load.

Beam Search Solvers and Exact Solver (src/solvers/)

  • Unified approximate solver BeamSolver
    • Forward beam search guided by ML scores
    • backward_mode="off" — pure forward beam
    • backward_mode="bfs" — tensor-only backward BFS archive on GPU, then forward beam with meet lookups in the frozen archive and path improvement when a better meeting path is found
    • backward_mode="beam" — full bidirectional beam with live backward frontier and meet-in-the-middle path improvement on both sides
    • Solution objective (SolutionObjective) tracks the best known solution and is designed to support future objectives beyond shortest path
  • Pruning and bounds
    • Lower-bound heuristic from the adapter combined with a dynamic path-length limit (max_steps and current best solution length) aggressively prunes branches that provably cannot beat the current best solution
  • Hash-based infrastructure
    • HashHistory: sliding-window hash history on GPU for global non-backtracking
    • MeetArchive + MeetLookup: cached, sorted hash index for fast meet-in-the-middle lookups
  • Path reconstruction
    • PathAssembler: pure-tensor path reconstruction from parent/move archives
  • Search scheduling
    • SearchScheduler / IndependentFrontierScheduler layer that chooses which frontier (forward / backward) to expand, ready for future NBS / pair-based schedulers
  • Exact solver (pancake only)
    • PancakeExactSolver with pluggable incumbent solver selected via solver registry

All hot-path pieces (expansion kernel, hash history, meet index, path assembly, lower bounds, beam pruning) are implemented as GPU tensor operations.

Profiling and Solver Instrumentation

  • Solver runtime profiling (experiments/profile_solver.py)
    • Coarse metrics: total search time, CUDA memory, states explored, solution length
    • Stage-level timings for key solver stages: expand, unique, history_filter, lower_bound, beam_prune, meet_lookup, bfs_prebuild
    • Per-stage counters: number of calls and total input states processed
  • _SolverProfiler helper inside BeamSolver
    • GPU synchronization only around total search_time (for accurate wall-clock timings), with optional stage profiling guarded by a runtime flag
    • Clean context-manager API (with profiler.section("expand", ...)) so that solve() and BFS prebuild code do not contain manual timer bookkeeping

TODO / Ideas

  • Additional solution objectives: Generalize SolutionObjective beyond objective="shortest" to support alternative criteria such as preferring longer or lexicographically-structured solutions, while keeping the same API (solution_better_than, update_best).
  • Faster LRX lower bound computation: The current admissible lower bound for LRX uses a max of two terms: an inversion-based bound and a cyclic displacement bound. The inversion term is still computed in O(n^2) time via temporary comparison tensors. Replace it with a more efficient exact implementation (e.g. merge-sort or Fenwick-tree based) without changing the resulting lower-bound values.
  • Hybrid ML + heuristic distance estimates: unify model-based scores with analytic lower bounds (gap / inversions) into a single calibrated estimate of distance-to-go, to improve ranking quality and pruning.
  • Diversity buckets for tie-break: When pruning the beam, many states can have the same score (tie). Currently ties are broken by lookahead (productive chain) and/or random. An idea: use diversity buckets to spread selection among states with the same score, so the beam keeps more varied states instead of clustering.

Research

This work builds upon and contributes to research in AI-based approaches to Cayley graph pathfinding. Results using these methods have been reported in:

  • Chervov, A., et al. (2025). "CayleyPy RL: Pathfinding and Reinforcement Learning on Cayley Graphs." arXiv:2502.18663

License

This project is licensed under the MIT License. See the LICENSE file for details.

Citation

If you use this work in your research, please cite:

@software{mlpermutationsolver,
  title={MLPermutationSolver: Machine Learning Approach to Permutation Sorting},
  author={Dolgorukova, Antonina},
  year={2024},
  url={https://github.com/T0chka/MLPermutationSolver}
}

@article{chervov2025cayleypy,
  title={CayleyPy RL: Pathfinding and Reinforcement Learning on Cayley Graphs},
  author={Chervov, A. and Soibelman, A. and Lytkin, S. and others},
  journal={arXiv preprint arXiv:2502.18663},
  year={2025}
}

Contact

For questions and collaboration:

About

GPU-native framework for solving permutation sorting problems on Cayley graphs

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages