GLEAM (Galaxy Learning and Modeling) is a maintained suite of Galaxy tools for no-code and low-code machine learning workflows. The repository is the software-maintenance home for the GLEAM workbench: it contains Galaxy wrappers, Python entrypoints, test assets, and container build definitions for tool development and deployment in Galaxy.
This repository is not intended to be a manuscript-specific analysis archive. Paper-specific benchmark datasets, figure-generation notebooks, and result tables should live in separate companion repositories or public data archives referenced by the corresponding publication.
- Backend: PyCaret
- Tasks: classification and regression on structured tabular data
- Outputs: trained model artifact, best-model parameters, HTML evaluation report
- Docs: tools/tabularlearner/README.md
- Backend: Ludwig with TorchVision and MetaFormer model support
- Tasks: image classification and regression from image ZIP archives plus metadata CSV files
- Outputs: trained model artifact, HTML report, metrics/prediction assets
- Docs: tools/imagelearner/README.md
- Backend: AutoGluon Multimodal
- Tasks: classification and regression using tabular, text, and image inputs
- Outputs: HTML report, metrics JSON, training config YAML
- Docs: tools/multimodallearner/README.md
- Backend: Ludwig
- Tasks: general-purpose model configuration, training, evaluation, prediction, hyperparameter search, and visualization
- Outputs: Ludwig model artifacts, metrics, reports, plots, and configuration files
- Docs: tools/galaxy-ludwig/README.md
- Image tiling with PyHIST
- Embedding extraction with TorchVision and pathology-oriented backbones
- MIL bag construction from embedding tables
- Docs:
GLEAM tools are published for Galaxy administrators through the Galaxy ToolShed.
- Sign in to your Galaxy instance as an administrator.
- Open Admin and then Install and Uninstall or Manage Tools.
- Search for tool suites published by the goeckslab owner.
- Install the suites you need, for example:
  - suite_tabular_learner
  - suite_imagelearner
  - suite_ludwig
  - suite_tiler
  - suite_embedding_extractor
  - suite_mil_bag
- Let Galaxy resolve the declared dependencies and restart the server if your deployment requires it.
This is the recommended path for production Galaxy instances because it tracks released tool definitions rather than an arbitrary development snapshot.
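Administrators who script their installs may want the suite names from the steps above in a tool list file. The sketch below assumes the ephemeris shed-tools utility is used for that purpose and that a panel section labelled "GLEAM" is wanted. Neither is stated in this repository's instructions, and the function name build_tool_list is illustrative. The code only generates YAML text and does not contact Galaxy or the ToolShed.

```python
# Sketch: build an ephemeris-style tool list for the GLEAM suites.
# Assumption: installs are scripted with ephemeris (shed-tools); the
# section label "GLEAM" is a placeholder, not a project requirement.
SUITES = [
    "suite_tabular_learner",
    "suite_imagelearner",
    "suite_ludwig",
    "suite_tiler",
    "suite_embedding_extractor",
    "suite_mil_bag",
]


def build_tool_list(suites, owner="goeckslab", section_label="GLEAM"):
    """Return YAML text with one entry per suite under the given owner."""
    lines = ["tools:"]
    for name in suites:
        lines.append(f"  - name: {name}")
        lines.append(f"    owner: {owner}")
        lines.append(f"    tool_panel_section_label: {section_label}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the YAML so it can be redirected into a file.
    print(build_tool_list(SUITES), end="")
```

Redirect the output to a file and pass it to shed-tools install. Check shed-tools install --help for the exact flags in your ephemeris version.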
Use this path if you are developing GLEAM itself, testing local modifications, or validating wrapper behavior before a ToolShed release.
- Clone the repository:

      git clone https://github.com/goeckslab/gleam.git
      cd gleam

- Copy or symlink the tool directories you want into your Galaxy tools/ tree.
- Register the desired wrappers in your Galaxy tool panel configuration. For example:

      <section id="gleam" name="GLEAM">
        <tool file="gleam/tools/tabularlearner/tabular_learner.xml" />
        <tool file="gleam/tools/tabularlearner/pycaret_predict.xml" />
        <tool file="gleam/tools/imagelearner/image_learner.xml" />
        <tool file="gleam/tools/multimodallearner/multimodal_learner.xml" />
        <tool file="gleam/tools/galaxy-ludwig/ludwig_train.xml" />
        <tool file="gleam/tools/galaxy-ludwig/ludwig_evaluate.xml" />
        <tool file="gleam/tools/galaxy-ludwig/ludwig_predict.xml" />
        <tool file="gleam/tools/galaxy-tiler/tiling_pyhist.xml" />
        <tool file="gleam/tools/galaxy-embedding_extractor/pytorch_embedding.xml" />
        <tool file="gleam/tools/galaxy-mil_bag/mil_bag.xml" />
      </section>

- Ensure your Galaxy deployment can execute the containers referenced by the wrappers. Most GLEAM tools expect Docker or another Galaxy-supported container backend.
- Restart Galaxy and verify that the tools load without wrapper errors.
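Before restarting Galaxy, it can help to confirm that every wrapper path registered in the tool panel section exists on disk. The following is a minimal sketch, not part of GLEAM. The function name missing_wrappers is hypothetical, and it assumes the file attributes are relative to your Galaxy tools/ directory, as in the example section above.

```python
# Sketch: report <tool file="..."> entries that do not resolve to a file.
# Assumption: paths in the section are relative to the Galaxy tools/ tree.
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_wrappers(section_xml, tools_root):
    """Return the file attributes of <tool> elements missing under tools_root."""
    section = ET.fromstring(section_xml)
    root = Path(tools_root)
    return [
        tool.get("file")
        for tool in section.iter("tool")
        if tool.get("file") and not (root / tool.get("file")).is_file()
    ]
```

Call it with the section text and the path to your Galaxy tools/ directory. An empty list means every registered wrapper file was found. It does not validate the wrapper XML itself.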
- Several tools use prebuilt images from quay.io/goeckslab/....
- GPU-backed tools require compatible CUDA drivers and a Galaxy job configuration that permits GPU/container execution.
- Some models download pretrained weights at runtime on first use. For reproducible production deployments, pre-populate caches or pin the corresponding container image and model source.
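To pin or pre-pull images, you first need to know which container each wrapper declares. Galaxy tool XML declares containers in container elements under requirements. The sketch below reads them from a wrapper's text. The function name container_images and the example wrapper are hypothetical, and the image name is a placeholder rather than a real GLEAM image.

```python
# Sketch: list container images declared in a Galaxy tool wrapper.
import xml.etree.ElementTree as ET


def container_images(wrapper_xml):
    """Return (type, image) pairs from <requirements><container> elements."""
    tool = ET.fromstring(wrapper_xml)
    return [
        (container.get("type"), (container.text or "").strip())
        for container in tool.findall("./requirements/container")
    ]


# Hypothetical wrapper fragment; the image name is a placeholder.
EXAMPLE = """
<tool id="example_tool" name="Example" version="0.1.0">
  <requirements>
    <container type="docker">registry.example.org/hypothetical/image:1.0</container>
  </requirements>
</tool>
"""

if __name__ == "__main__":
    for kind, image in container_images(EXAMPLE):
        print(kind, image)
```

This only sees containers written directly in the wrapper file. If a wrapper pulls its requirements from a shared macros file, the macro has to be inspected separately.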
The repository includes Galaxy wrapper tests and CI workflows under .github/workflows. Local development typically relies on planemo plus wrapper-specific test data already versioned in tools/*/test-data.
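As a rough illustration of a local check, the sketch below builds planemo lint and planemo test command lines for a list of wrapper files. The wrapper path is taken from the tool panel example in this document, relative to the repository root. The function name planemo_commands is hypothetical, and the repository's CI may invoke planemo differently.

```python
# Sketch: build planemo command lines for a set of wrapper files.
# Assumption: planemo is installed and run from the repository root.
import shlex


def planemo_commands(wrapper_paths):
    """Return a lint command and a test command for each wrapper path."""
    commands = []
    for path in wrapper_paths:
        commands.append(["planemo", "lint", path])
        commands.append(["planemo", "test", path])
    return commands


if __name__ == "__main__":
    wrappers = ["tools/tabularlearner/tabular_learner.xml"]
    for command in planemo_commands(wrappers):
        # Print shell-safe command lines; run them manually or via subprocess.
        print(shlex.join(command))
```

The commands are printed rather than executed so they can be reviewed first. Container-backed tests may need extra planemo options that depend on your environment.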
- Citation metadata is provided in CITATION.cff and codemeta.json.
- Release history is tracked in CHANGELOG.md.
- Maintainers and author credit are listed in AUTHORS.md and MAINTAINERS.md.
- Third-party software, model, and container provenance is summarized in THIRD_PARTY.md.
- The archival release workflow is documented in RELEASE.md.
Contributions that improve Galaxy wrapper quality, testing, documentation, and container reproducibility are welcome.
- Fork the repository.
- Create a feature branch.
- Run the relevant wrapper and CI tests.
- Open a pull request with a clear description of the user-facing impact.