Also available in: 🇧🇷 Português (Brasil)
2PathTerpenes is a bioinformatics and chemical modeling tool based on graph grammar to reconstruct and explore metabolic networks of plant terpene biosynthesis (C10 monoterpenes and C15 sesquiterpenes). The project utilizes the MedØlDatschgerl (MØD) simulator and the Double Pushout (DPO) formalism to generate synthesis pathways for complex molecules (such as
- Molecular Topological Definition: Modeling of chemical precursors and carbocations using SMILES linear representations and DFS in graphs.
- GML Reaction Rule Grammar: Application of realistic reaction mechanisms (cyclizations, Wagner-Meerwein rearrangements, hydride shifts, water addition and elimination).
- Automatic Chemical Network Generation: Combinatorial exploration and search for reachability pathways solved by Integer Linear Programming (ILP).
- Visual Plotting and Export: PDF report generation showing the structural chemical derivation graph.
- Cyclization Constraint Analysis: Proposals for thermodynamic and 3D geometric filters (ring strain) on chemical routes.
The simulation workflow is divided into components for graph specification, DPO grammar application, derivation graph generation, and database persistence.

    graph TD
    UI[Web Interface: index.html] -->|Generates parameters| SimPy[simulation.py]
    SimPy -->|Imports molecules| MolPy[molecules.py]
    SimPy -->|Loads GML rules| RulesGML[rules/*.gml]
    SimPy -->|Invokes| MOD[MØD Wrapper/Library]
    MOD -->|Builds| DG[Derivation Graph]
    DG -->|Formats Report| PrintPy[printer.py]
    PrintPy -->|Generates| PDF[PDF Report]
    DG -->|Writes data| Neo4jStore[Neo4j Storer]
    Neo4jStore -->|Saves| Neo4j[Neo4j Database]

The web interface (https://waldeyr.github.io/2PathTerpenes) provides a responsive dashboard for selecting reaction rules and defining simulation parameters.
| Technology | Version | Main Function |
|---|---|---|
| MedØlDatschgerl (MØD) | v0.8.0 or v1.0.0+ | Chemical graph transformation kernel (DPO) and ILP solver. |
| Python | v3.8+ | Simulation automation scripts (molecules.py, simulation.py, printer.py). |
| Open Babel | v3.0+ | 3D conformation generation and energy calculation of carbocations. |
| Docker | v19.x+ | Full Linux environment packaging for agnostic execution of MØD. |
| Feature | Form Field | Database Field | Applied Rules |
|---|---|---|---|
| Molecular Definition | N/A (Script file) | Compound.smiles, Compound.modName | SMILES or DFS syntax to initialize the simulation reagents multiset. |
| Chemical Network Generation | N/A (Script file) | Rule.modName, Compound.id | Repetitive application of DPO graph rewriting rules (addSubset >> repeat). |
| PDF Generation | N/A (Final report) | N/A | Rendering of intermediates with collapsed hydrogens and coloring (red for rings, blue for charges). |
| Scenario Saving | N/A (Load script) | Scenario.scenarioID, Scenario.ncbiAccession, Scenario.pubmedAccession | Relational mapping of molecules and physical reactions with in vitro assays described in the literature. |
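
The repetitive rule application in the "Chemical Network Generation" row can be pictured as a fixed-point expansion. The following is only a loose plain-Python analogy under stated assumptions: it does not use MØD, the names `expand_until_fixpoint` and `toy_rules` are hypothetical, and the "rules" act on abstract labels rather than on molecular graphs.

```python
def expand_until_fixpoint(start, rules, max_rounds=100):
    """Apply every rule to the newest items until a round yields nothing new."""
    universe = set(start)      # everything discovered so far
    frontier = set(start)      # items discovered in the previous round
    derivations = []           # (educt, rule name, product) triples
    rounds = 0
    while frontier and rounds < max_rounds:
        produced = set()
        for item in sorted(frontier):
            for name, rule in rules:
                for product in rule(item):
                    derivations.append((item, name, product))
                    if product not in universe:
                        produced.add(product)
        universe |= produced
        frontier = produced
        rounds += 1
    return universe, derivations


# Toy "rules" on abstract labels (assumption: stand-ins for GML rules);
# each callable returns the list of products for one input label.
toy_rules = [
    ("r1", lambda m: ["B"] if m == "A" else []),
    ("r2", lambda m: ["C", "A"] if m == "B" else []),
]

universe, derivations = expand_until_fixpoint({"A"}, toy_rules)
print(sorted(universe))
print(derivations)
```

The `max_rounds` cap plays the role of a safety limit, since a combinatorial expansion is not guaranteed to terminate on its own.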
During the generation of sesquiterpene synthesis pathways, electrophilic cyclization reactions occur with high molecular reactivity. Since MØD by itself operates only at the topological level of discrete graphs, the simulation may generate cyclizations that are impossible in real 3D space because of high conformational strain.
We have identified four potential architectural improvements to implement cyclization constraints in the simulations:
- Conformational Energy Filter (via Open Babel): With MØD v1.0.0+, Open Babel calculates 3D coordinates and estimates the energy of each carbocation using the MMFF94 force field (`Graph.energy`). A validation script in Python can be implemented to discard cyclization intermediates whose conformational energy delta relative to the precursor is excessive (unviable ring strain).
- Constraints in the GML Rules Context: Addition of rigid paths and preventing topology in the `context` of the GML rule. This prevents the rule from being applied if the molecule already contains rigid adjacent ring systems that physically prevent chain folding.
- Heuristics with Custom Derivation Strategy (DGStrat in Python): Use a derivation strategy written in Python to intercept ring formation and block reactions that generate incompatible strained rings (e.g., complex bridged rings of 3 or 4 carbons in inappropriate positions).
- Hyperflow and Linear Programming with Costs: In MØD v1.0, the Integer Linear Programming (ILP) solver can assign costs and capacities based on thermodynamic constraints in the overall network flow, minimizing energetically unfavorable pathways.
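
The energy filter and the small-ring heuristic can be prototyped outside MØD. The sketch below is a minimal plain-Python illustration, not the project's implementation. The function names and the thresholds `MIN_RING_SIZE` and `MAX_DELTA_ENERGY` are hypothetical. It assumes the molecule is available as an adjacency mapping and that energies (for example, values read from `Graph.energy`) are passed in as plain floats.

```python
from collections import deque

# Hypothetical thresholds; real values would need calibration.
MIN_RING_SIZE = 5          # reject 3- and 4-membered rings
MAX_DELTA_ENERGY = 50.0    # maximum tolerated increase, same unit as the inputs


def ring_size_on_closure(adjacency, a, b):
    """Return the size of the smallest ring closed by a new bond between a and b.

    adjacency maps each atom id to the set of its bonded neighbours.
    Returns None if no ring is formed (same atom, already bonded, or disconnected).
    """
    if a == b or b in adjacency.get(a, ()):
        return None
    distance = {a: 0}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        if current == b:
            # The shortest path has distance[b] bonds, hence distance[b] + 1 atoms;
            # the new a-b bond closes a ring containing exactly those atoms.
            return distance[current] + 1
        for neighbour in adjacency.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return None


def is_viable_cyclization(adjacency, a, b, precursor_energy, product_energy):
    """Combine the ring-size heuristic with the energy-delta filter."""
    size = ring_size_on_closure(adjacency, a, b)
    if size is not None and size < MIN_RING_SIZE:
        return False
    return (product_energy - precursor_energy) <= MAX_DELTA_ENERGY


# Usage: a linear chain of six atoms, 0-1-2-3-4-5.
chain = {i: set() for i in range(6)}
for i in range(5):
    chain[i].add(i + 1)
    chain[i + 1].add(i)

print(ring_size_on_closure(chain, 0, 5))                # six-membered ring
print(is_viable_cyclization(chain, 0, 2, 10.0, 30.0))   # three-membered ring is rejected
```

A purely topological check like this cannot replace a 3D strain estimate; it only removes the most obviously strained closures before any energy calculation is attempted.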
- GML Rules (`rules/`): All official GML reaction rules are maintained and updated directly in the `rules/` directory. The temporary/legacy `rules/novas/` directory has been removed to avoid duplication.
- Resource Images (`docs/img/`): Only images actively used or dynamically referenced (like rule previews) by `docs/index.html` are kept in version control. Temporary or redundant image files are cleaned up prior to committing.
To generate the SVG preview images for the chemical reaction rules shown in the web interface:
- Run the generator inside Docker (this compiles the GML rules and outputs `.pdf` files in `out/`):

      docker run --rm --volume "$(pwd):/home/shared" --workdir /home/shared 2path-terpenes-mod:latest -f /home/shared/generate_rules_svg.py

- Convert the generated PDFs to SVGs (running the loop inside the MØD container in one go):

      docker run --rm --entrypoint /bin/bash --volume "$(pwd):/home/shared" --workdir /home/shared 2path-terpenes-mod:latest -c "for f in out/*_{L,K,R}.pdf; do mod_post --mode pdfToSvg \${f%.pdf} \${f%.pdf}; done"

- Copy generated SVGs to the assets folder:

      python organize_svgs.py

  This helper script copies the `_L.svg`, `_K.svg`, and `_R.svg` components from `out/` directly to `docs/img/`, preserving their original names.
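
For readers who want to see what the copy step amounts to, here is a minimal sketch of the behavior described for the helper. It is not the actual `organize_svgs.py`; the function name `copy_rule_svgs` and its parameters are hypothetical, and only the suffixes and directories come from the description above.

```python
import shutil
from pathlib import Path

# Suffixes of the rule components named in the description above.
SUFFIXES = ("_L.svg", "_K.svg", "_R.svg")


def copy_rule_svgs(source_dir="out", target_dir="docs/img"):
    """Copy rule component SVGs from source_dir to target_dir, keeping file names.

    Returns the list of copied file names in sorted order.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for svg in sorted(source.glob("*.svg")):
        # str.endswith accepts a tuple, so one test covers all three suffixes.
        if svg.name.endswith(SUFFIXES):
            shutil.copy2(svg, target / svg.name)
            copied.append(svg.name)
    return copied
```

Calling `copy_rule_svgs()` from the repository root would use the default `out/` and `docs/img/` locations; other SVG or PDF files in `out/` are left untouched.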
- Rebuild the Docker image:

      docker build -t 2path-terpenes-mod:latest .

  Note: The Docker image installs the LaTeX compiler with Latin Modern font support (`texlive-lmodern`), which fixes potential compilation issues for the summary report `summary.pdf`. To support minimal LaTeX environments or the legacy image, the workspace also includes a fallback mock `lmodern.sty`, so compilation automatically falls back to the default LaTeX fonts if `lmodern` is missing.

- Run the main project simulation:

      docker run --rm --volume "$(pwd):/home/shared/" --workdir /home/shared/ 2path-terpenes-mod:latest -f /home/shared/molecules.py -f /home/shared/simulation.py -f /home/shared/printer.py

If you prefer to use the existing image pre-built on Docker Hub without building locally:

    docker run --rm --volume "$(pwd):/home/shared/" --workdir /home/shared/ waldeyr/mod_v0.8.0:v1.0 /home/mod-v0.8.0/bin/mod -f /home/shared/molecules.py -f /home/shared/simulation.py -f /home/shared/printer.py

The simulation files contain dynamic compatibility helpers. On newer MØD versions (1.0+), they use the new build interface (`DG.build().execute()`), which resolves the deprecation warnings for `dgRuleComp` and `DG.calc()`. The deprecations of `pushVertexColour` and `postSection` in `printer.py` are handled as well, and the scripts remain fully backwards-compatible with legacy versions (0.8.0).
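
One common way to write such a compatibility helper is feature detection with `hasattr`. The sketch below illustrates the idea with stand-in classes so it runs without MØD installed. `run_strategy`, `FakeModernDG`, and `FakeLegacyDG` are hypothetical names, and the assumption that the legacy object lacks a `build` attribute is ours, not something taken from the project scripts.

```python
def run_strategy(dg, strategy):
    """Execute a strategy through whichever interface the object offers."""
    if hasattr(dg, "build"):
        # Newer interface: obtain a builder and execute the strategy through it.
        with dg.build() as builder:
            builder.execute(strategy)
        return "build"
    # Legacy interface (assumption: the strategy was supplied at construction),
    # so only the calc() call remains to be made.
    dg.calc()
    return "calc"


# Stand-in classes that only record calls; they are NOT the MØD classes.
class _FakeBuilder:
    def __init__(self, log):
        self.log = log

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False  # do not swallow exceptions

    def execute(self, strategy):
        self.log.append(("execute", strategy))


class FakeModernDG:
    def __init__(self):
        self.log = []

    def build(self):
        return _FakeBuilder(self.log)


class FakeLegacyDG:
    def __init__(self):
        self.log = []

    def calc(self):
        self.log.append(("calc", None))


modern, legacy = FakeModernDG(), FakeLegacyDG()
print(run_strategy(modern, "my_strategy"), modern.log)
print(run_strategy(legacy, "my_strategy"), legacy.log)
```

Detecting the attribute rather than parsing a version string keeps the helper working on intermediate releases, at the cost of relying on attribute names staying stable.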
