| title | SpreadGL |
|---|---|
| emoji | ๐ |
| colorFrom | blue |
| colorTo | green |
| sdk | docker |
| pinned | false |
spread.gl v2.0-beta is an integrated data pipeline designed to visualize pathogen dispersal over geographic space and time. It unifies complex backend data processing (the extraction, transformation, and loading [ETL] of phylogenetic trees and environmental rasters) with a highly interactive, GPU-accelerated web rendering engine into a single, user-friendly graphical interface.
- Universal File Support & Frictionless UX: Supports workspace-wide and global drag-and-drop upload. The platform natively parses raw spatial files (`.geojson`, `.csv`) and instantly restores saved session states (`.json`), allowing users to transition seamlessly between dataset processing and visual exploration.
- Tianditu (天地图) Official Basemaps: Integrates China's official Tianditu basemaps natively into the mapping engine. This ensures strict geographic mapping compliance and seamless accessibility for researchers within Chinese academia.
- Interactive Visual Analytics: Supports real-time, exploratory analysis of phylogeographic structures. Users can dynamically filter transmission networks by Bayes Factor thresholds, play animations on a synchronized 4D spacetime timeline, and inspect raw geospatial data layers without dropping records.
spread.gl is designed with a privacy-first architecture.
Unlike conventional web visualization tools that upload your sequences to third-party cloud servers, spread.gl processes and renders all datasets locally. This keeps sensitive, unpublished molecular epidemiology sequences on your own machine, which reduces security and compliance concerns during active outbreaks.
The most secure and reliable way to run spread.gl locally, without installing Python or Node.js or compiling dependencies, is to use Docker.

Step 1: Install Docker

Download and install Docker Desktop for your operating system (Mac, Windows, or Linux). Ensure the Docker application is open and running in the background.

Step 2: Start the Processing Engine (Backend)

Open your terminal (or Command Prompt) and run the following command to download and start the data extraction toolkit. This engine handles the parsing of BEAST log files and Bayes Factor computations.

    docker run -d -p 8000:7860 --name spreadgl_backend florentlee/spread.gl.processing.toolkit:2.0-beta

Step 3: Start the Frontend (Visualization)

Run the following command to launch the browser-based visualization environment:

    docker run -d -p 3000:7860 --name spreadgl_frontend florentlee/spread.gl.web.page:2.0-beta

Step 4: Launch the App

Open your web browser and navigate to: http://localhost:3000

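
If the page does not load, it can help to confirm that both containers are accepting connections. The following is a minimal sketch, not part of spread.gl; it only assumes the host ports `8000` and `3000` published by the two `docker run` commands, and the function names are illustrative.

```python
import socket

# Host ports published by the two `docker run` commands (-p 8000:7860, -p 3000:7860).
SERVICES = {"backend": 8000, "frontend": 3000}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its published port is reachable."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

A reachable port only shows that something is listening; it does not prove the application inside the container has finished starting.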
The application is structured around a top navigation bar dividing the workflow into three sequential stages: Setup, Workspace, and Map.
graph LR
A[Step 1: Setup] --> B[Step 2: Workspace]
B --> C[Step 3: Map]
In this initial step, you load your raw phylogenetic trees, location mappings, and environmental data layers to run them through the processing bridge.
- Analysis Type: Choose Discrete (for discrete spatial traits modeled on fixed location points) or Continuous (for continuous latitude/longitude traits representing geographic coordinates).
- Upload Files:
  - Tree File: Upload your NEXUS-format tree file (e.g., a `.tree` or `.trees` file from BEAST).
  - Location File (Discrete only): Upload a CSV containing location names and coordinates. The CSV must use a comma separator and have the header `location,latitude,longitude`.
  - BEAST Log File (Optional, Discrete only): Upload the BEAST `.log` file containing rate indicators to compute Bayes factors for the migration network.
- Metadata Traits:
  - Most Recent Tip Time: Specify the calendar date (e.g., `2021-06-01`) or decimal year (e.g., `2021.42`) of the latest sampled taxon to calibrate the timeline.
  - Location Trait: Enter the name of the annotation trait representing geography (e.g., `region` or `coordinates`). If the analysis is continuous and uses separate lat/lon annotations, separate them with a comma (e.g., `location1,location2`).
- Reprojection Options (Advanced): Convert coordinates on-the-fly from a local reference system (e.g., British National Grid EPSG:27700) to World Geodetic System 1984 (WGS84 EPSG:4326).
- Trimming Options (Advanced): Exclude geographic outliers by referencing an external CSV file (e.g., checking for empty locations or null attributes).
- Geo-Contextual Data Layers:
  - Regions: Upload a CSV with environmental values and a GeoJSON boundary map to color state/provincial polygons based on variables (e.g., swine density).
  - Rasters: Upload a set of monthly climate `.tif` rasters, a GeoJSON boundary mask, and a list of target locations to clip and plot environmental grids.
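
Because the Location File has a fixed header and separator, it can be checked before upload. The sketch below is a standalone pre-flight check under stated assumptions: the exact-match header comparison and the WGS84 range test are choices made for this example, and the backend may validate differently.

```python
import csv
import io

# Header required by the Setup panel for the discrete Location File.
REQUIRED_HEADER = ["location", "latitude", "longitude"]


def validate_location_csv(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the file looks valid."""
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=",")
    header = next(reader, None)
    if header is None or [h.strip() for h in header] != REQUIRED_HEADER:
        return [f"header must be {','.join(REQUIRED_HEADER)}, got {header}"]
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            problems.append(f"line {line_no}: expected 3 columns, got {len(row)}")
            continue
        name, lat, lon = (cell.strip() for cell in row)
        if not name:
            problems.append(f"line {line_no}: empty location name")
        try:
            lat_f, lon_f = float(lat), float(lon)
        except ValueError:
            problems.append(f"line {line_no}: non-numeric coordinate")
            continue
        if not (-90 <= lat_f <= 90 and -180 <= lon_f <= 180):
            problems.append(f"line {line_no}: coordinate out of WGS84 range")
    return problems
```

A file saved with semicolons instead of commas fails at the header check, which is the most common cause of a rejected location list in spreadsheet exports.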
Once the processing completes, the application automatically routes you to the Workspace panel. This stage allows you to review the outputs before loading them onto the map.
- Summary Metrics: Check how many features, branches, and polygons were successfully parsed.
- Download Outputs: Download the clean spatial outputs directly to your computer:
  - `dynamic_pathway.geojson`: Dynamic trajectories (trips) tracking migrations over time. You can create point, line, and arc layers based on this file.
  - `aggregated_migration_network.geojson` (Discrete only): Aggregated Markov jumps between discrete locations with computed weights.
  - `geo_contextual_data.geojson` / `.csv`: Clipped rasters or populated region boundary maps.
- Verification: Verify that no coordinates are out of bounds or missing. Click the Apply to Map button to load the datasets into the Kepler engine.
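
The verification step can also be done offline on a downloaded output such as `dynamic_pathway.geojson`. This is a minimal sketch that assumes only standard GeoJSON structure (a FeatureCollection whose positions start with longitude, latitude); the function names are illustrative and it does not reproduce the Workspace panel's own metrics.

```python
import json


def iter_positions(coords):
    """Yield [lon, lat, ...] positions from arbitrarily nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords or []:
            yield from iter_positions(part)


def summarize_geojson(text: str) -> dict:
    """Count features, features without geometry, and out-of-range positions."""
    collection = json.loads(text)
    features = collection.get("features", [])
    missing, out_of_bounds = 0, 0
    for feature in features:
        geometry = feature.get("geometry")
        if not geometry or not geometry.get("coordinates"):
            missing += 1
            continue
        for pos in iter_positions(geometry["coordinates"]):
            lon, lat = pos[0], pos[1]
            if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                out_of_bounds += 1
    return {
        "features": len(features),
        "missing_geometry": missing,
        "out_of_bounds_positions": out_of_bounds,
    }
```

Positions may carry extra elements (for example altitude or a timestamp on trip geometries); only the first two values are checked here.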
The Map view launches the interactive, GPU-accelerated map engine.
- Arc Layer (Migration Network): Displays arcs showing discrete dispersal patterns. Arcs are colored from source (greenish blue) to target (red).
- Trip Layer (Pathways): Renders phylogenetic branches as moving light trails. You can adjust the trail length and thickness.
- Kepler Timebar: Displays a timeline at the bottom of the map. Play the animation to watch the virus disperse across the globe chronologically, or drag the time window handles to see specific intervals.
- Geo-Contextual Layer: Shows overlay grid points (raster data) or styled boundary polygons (regional data) matching the map backdrop.
- In Step 1: Setup, locate the upload inputs.
- Drag and drop your NEXUS tree file into the Tree File dropzone.
- If doing a discrete analysis, drop your coordinate reference list into the Location CSV dropzone.
- Click Run Pipeline at the bottom of the panel and wait for the Workspace view to appear.
To overlay climate grid cells or demographic polygons onto your dispersal map:
- For Regional Polygons (e.g., PEDV example discussed in spread.gl v1.0):
  - Set Environmental Type to `Regions`.
  - Upload your environmental spreadsheet (`Environmental_variables.csv`).
  - Upload your map boundary file (`China_map.geojson`).
  - Specify the Location Column inside the CSV (e.g., `location`) and the Location Variable property inside the GeoJSON (e.g., `name`).
- For Raster Grids (e.g., Climate Rasters used in YFV example):
  - Set Environmental Type to `Rasters`.
  - Upload one or more `.tif` files.
  - Upload the geographic mask boundaries (`geoBoundaries-BRA-ADM1.geojson`).
  - Upload a text file listing the regions of interest (`Involved_brazilian_states.txt`).
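
For the Regions case, the Location Column and Location Variable act as the two sides of a join between the spreadsheet and the boundary polygons. The sketch below illustrates that idea with the standard library only. It is an assumption about the general approach, not the toolkit's actual code, and the column name `swine_density` in the usage note is hypothetical.

```python
import csv
import io
import json


def join_values_to_polygons(csv_text, geojson_text,
                            location_column="location",
                            location_variable="name"):
    """Copy each CSV row's columns into the properties of the matching polygon.

    Returns the updated FeatureCollection and the list of polygon names
    that had no matching CSV row.
    """
    rows = {row[location_column].strip(): row
            for row in csv.DictReader(io.StringIO(csv_text))}
    collection = json.loads(geojson_text)
    unmatched = []
    for feature in collection["features"]:
        key = str(feature["properties"].get(location_variable, "")).strip()
        row = rows.get(key)
        if row is None:
            unmatched.append(key)
            continue
        for column, value in row.items():
            if column != location_column:
                feature["properties"][column] = value
    return collection, unmatched
```

Calling it with a CSV containing `location,swine_density` and a GeoJSON whose features have a `name` property returns the enriched polygons plus any names that did not match. A non-empty unmatched list usually points to spelling differences between the two files.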
In discrete phylogeographic studies, many migration routes are tested, but only a few are statistically significant.
- In the Setup tab, if you uploaded a BEAST log file, the Bayes Factor threshold slider will be active.
- The default threshold is `3.0` (indicating positive support).
- Slide the value higher (e.g., `10.0` for strong support, or `30.0` for very strong support) to filter out weakly supported paths.
- The map will instantly update, hiding weaker routes and adjusting the width of the remaining arcs based on their jump weights (number of estimated transitions).
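
The effect of the slider can be reproduced on the features of a downloaded migration network. This is a minimal sketch of threshold filtering; the property name `bayes_factor` is a hypothetical placeholder, so check the actual property name in your `aggregated_migration_network.geojson` before using it.

```python
def filter_by_bayes_factor(features, threshold=3.0, bf_key="bayes_factor"):
    """Keep only network edges whose Bayes factor meets the threshold.

    `bf_key` is an assumed property name, not one confirmed by spread.gl.
    """
    kept = []
    for feature in features:
        bf = feature.get("properties", {}).get(bf_key)
        if bf is not None and float(bf) >= threshold:
            kept.append(feature)
    return kept


# Three illustrative edges with made-up Bayes factors.
edges = [
    {"properties": {"bayes_factor": 2.1}},
    {"properties": {"bayes_factor": 12.0}},
    {"properties": {"bayes_factor": 45.0}},
]
for cutoff in (3.0, 10.0, 30.0):
    print(cutoff, len(filter_by_bayes_factor(edges, cutoff)))
```

Raising the cutoff from 3.0 to 30.0 in this toy input reduces the retained edges from two to one, mirroring how weaker routes disappear from the map.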
This section walks through continuous and discrete spatial analysis use cases using the React GUI and backend processing toolkit.
Continuous phylogeography traces the inferred latitude and longitude of viral lineages over time. By syncing this trajectory with environmental variables (e.g., temperature grids), researchers can correlate climate variations with dispersal speed.
- Dataset: Yellow Fever Virus (YFV) in Brazil
  - Tree File: `inputdata/YFV_Brazil/YFV.MCC.tree`
  - Rasters: Directory of `.tif` rasters, `inputdata/YFV_Brazil/wc2.1_5m_tmax_2015-2019/`
  - Mask Boundary: `inputdata/YFV_Brazil/geoBoundaries-BRA-ADM1.geojson`
  - Location List: `inputdata/YFV_Brazil/Involved_brazilian_states.txt`
- Select the Setup tab and set Analysis Type to Continuous.
- Upload `YFV.MCC.tree` in the Tree File field. Your tree file must strictly adhere to the standard `#NEXUS` format.
- Enter `location1,location2` under Location Trait and set Most Recent Tip Time to `2019-04-16`.
- Turn on the Environmental Data Layer and select Rasters as the type.
- Upload the folder containing the `.tif` rasters, select `geoBoundaries-BRA-ADM1.geojson` as the mask boundary, and choose `Involved_brazilian_states.txt` as the location list. Set Location Variable to `shapeName`.
- Click Run Pipeline. The backend maps the viral trajectories and clips the temperature rasters to the target states.
- Click Apply to Map. The visualization engine natively binds the moving Trip Layer (which renders viral lineage trails set to 1/10th of the outbreak duration) and the shifting Geo-Contextual Data Layer (displaying temperature grid points) to the shared Kepler.gl Timebar.
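
Most Recent Tip Time accepts either a calendar date or a decimal year. The sketch below shows one common way to convert between the two, using the fraction of the year elapsed at the start of the given day. This convention is an assumption; spread.gl's internal conversion may differ slightly (for example by counting the day itself).

```python
from datetime import date


def to_decimal_year(d: date) -> float:
    """Convert a calendar date to a decimal year (fraction of the year elapsed)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


# Most Recent Tip Time used in the YFV example.
print(round(to_decimal_year(date(2019, 4, 16)), 3))  # 2019.288
```

Dividing by the actual length of the year keeps the conversion consistent across leap years.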
A core feature of the Map tab is the synchronized 4D timeline. When you press play on the time slider, Kepler.gl keeps all spatial layers in sync: the viral lineages moving along the `dynamic_pathway` (Trip Layer), the shifting credible intervals of the `hpd_polygons`, and the fading dynamic environmental temperature rasters. This allows researchers to visually correlate pathogen spread directly with changing ecological conditions.
Figures: YFV Spread in 2015, YFV Spread in 2016, YFV Spread in 2017, and YFV Spread in 2018.
Discrete phylogeographic reconstructions represent viral spread as transitions among discrete locations. This example demonstrates how to run a discrete phylogeographic analysis with Bayes Factor filtering to resolve primary transmission hubs.
- Dataset: SARS-CoV-2 Eta Variant (B.1.525)
  - Tree File: `inputdata/SARS_CoV-2_B.1.525_Global/Analysis2.joint.phylogeo.HIPSTR.tree`
  - Locations: `inputdata/SARS_CoV-2_B.1.525_Global/full.dataset.2910.region.coordinates.csv`
  - BEAST Log: `inputdata/SARS_CoV-2_B.1.525_Global/Analysis2.thorney.joint.phylogeo.burnin.removed.log`
- Select the Setup tab and set Analysis Type to Discrete.
- Upload the B.1.525 `.tree` file.
- Upload the `.csv` Location List (containing location, latitude, longitude) and enter the corresponding Location Trait, `region`. Set Most Recent Tip Time to `2021-07-03`.
- In the Bayes Factors section, upload the BEAST `.log` file and set the Burn-in fraction (e.g., `0.1`, or 10%).
- Click Run Pipeline.
- Click Apply to Map. The map will load the animated `dynamic_pathway` by default.
- Click the layer visibility icon to turn on the Aggregated Migration Network. Use the auto-generated Bayes Factor filter in the left panel to dynamically threshold the network (e.g., slide to `>150` for decisive evidence, isolating the primary export hubs).
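
To make the role of the log file and burn-in fraction concrete, the sketch below computes a Bayes factor for one migration rate from its sampled 0/1 rate indicators. It uses a formulation commonly applied to BSSVS output, with an assumed prior expectation of ln 2 + K − 1 non-zero rates among K locations. Both the prior and the function name are assumptions for illustration; the toolkit's own calculation may differ.

```python
import math


def bayes_factor(indicator_samples, n_locations, burn_in=0.1, symmetric=True):
    """Bayes factor for one rate indicator, from its sampled 0/1 values.

    Assumes a prior expectation of ln(2) + K - 1 non-zero rates (an
    assumption of this sketch, not confirmed by spread.gl).
    """
    start = int(len(indicator_samples) * burn_in)
    kept = indicator_samples[start:]  # discard the burn-in portion
    if not kept:
        raise ValueError("no samples left after burn-in")
    n_rates = n_locations * (n_locations - 1)
    if symmetric:
        n_rates //= 2
    q = (math.log(2) + n_locations - 1) / n_rates  # prior inclusion probability
    if not 0 < q < 1:
        raise ValueError("prior inclusion probability must lie in (0, 1)")
    p = sum(kept) / len(kept)  # posterior inclusion probability
    if p >= 1.0:
        # Every retained sample included the rate; cap p to avoid dividing by zero.
        p = 1.0 - 1.0 / (len(kept) + 1)
    prior_odds = q / (1 - q)
    return (p / (1 - p)) / prior_odds
```

With four locations and a symmetric model, a rate that is switched on in 90% of the post-burn-in samples yields a Bayes factor of roughly 5.6 under these assumptions, which would pass the default threshold of 3.0 but not a cutoff of 10.0.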
Figures: B.1.525 Dispersal Path and B.1.525 Migration Flow.





