Grammar-Constrained Chemical Space Exploration for Isomers using SMILES
SmilX is a software tool for exploring the chemical space of isomers using the SMILES language under grammar constraints.
Given a molecular formula, SmilX systematically constructs chemically valid molecular graphs by applying a grammar-driven generation process, ensuring consistency with chemical rules such as:
- valence constraints
- connectivity
- unsaturation patterns
- ring closures
This approach enables the systematic enumeration of candidate molecular isomers.
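As a rough illustration of the first two rules listed above (valence and connectivity), the sketch below checks a small molecular graph in plain Python. It is not taken from the SmilX source: the function name `check_graph`, the `MAX_VALENCE` table, and the `(i, j, order)` bond representation are all hypothetical, and the valences are the usual textbook values, which may differ from the tables SmilX uses.

```python
from collections import defaultdict

# Assumed standard valences (illustrative; not read from SmilX).
MAX_VALENCE = {"H": 1, "B": 3, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2,
               "F": 1, "Cl": 1, "Br": 1, "I": 1}


def check_graph(atoms, bonds):
    """Check valence and connectivity of a heavy-atom graph.

    atoms: list of element symbols, indexed from 0.
    bonds: list of (i, j, order) tuples with integer bond orders.
    Returns (valence_ok, connected, implicit_h), where implicit_h lists the
    hydrogens needed to saturate each atom, or None if a valence is exceeded.
    """
    used = defaultdict(int)        # total bond order consumed per atom
    neighbours = defaultdict(set)  # adjacency sets for the traversal
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
        neighbours[i].add(j)
        neighbours[j].add(i)

    valence_ok = all(used[i] <= MAX_VALENCE[sym] for i, sym in enumerate(atoms))

    # Depth-first traversal from atom 0; connected if every atom is reached.
    connected = False
    if atoms:
        seen, stack = {0}, [0]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        connected = len(seen) == len(atoms)

    implicit_h = None
    if valence_ok:
        implicit_h = [MAX_VALENCE[sym] - used[i] for i, sym in enumerate(atoms)]
    return valence_ok, connected, implicit_h


# Benzene written as a Kekule ring: alternating double and single bonds.
benzene_atoms = ["C"] * 6
benzene_bonds = [(0, 1, 2), (1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 0, 1)]
print(check_graph(benzene_atoms, benzene_bonds))
```

For the benzene ring every carbon uses three of its four valences, so each atom is left with one implicit hydrogen, which matches the formula C6H6.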
This repository contains the official implementation of the SmilX / TokenSMILES framework, developed at:
Centro de Investigación y Estudios Avanzados (CINVESTAV) Mérida
- Molecular formula parsing (e.g., C6H6, C2H5NO2)
- Hydrogen Deficiency Index (HDI) computation
- Enumeration of valid unsaturation patterns (double bonds, triple bonds, rings)
- Grammar-based construction of carbon skeletons and branching
- Cycle formation (ring closures) with validity checks
- Heteroatom substitution under valence constraints (supported elements: N, O, S, P, B, F, Cl, Br, I)
- Interactive exploration via Streamlit
- RDKit-friendly workflow for downstream cheminformatics tasks
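The first three features above (formula parsing, HDI computation, and enumeration of unsaturation patterns) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SmilX implementation: the helper names `parse_formula`, `hdi`, and `unsaturation_patterns` are hypothetical, the valences are the common textbook values, and the HDI is computed with the general formula HDI = 1 + Σ n_i (v_i − 2) / 2. The pattern enumeration is purely arithmetic (double bonds + 2 × triple bonds + rings = HDI) and does not check whether a given skeleton can actually realize each pattern.

```python
import re
from collections import Counter

# Assumed standard valences (illustrative; not read from SmilX).
VALENCE = {"H": 1, "B": 3, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2,
           "F": 1, "Cl": 1, "Br": 1, "I": 1}


def parse_formula(formula):
    """Parse a simple formula such as 'C2H5NO2' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in VALENCE:
            raise ValueError(f"unsupported element: {symbol}")
        counts[symbol] += int(number) if number else 1
    return counts


def hdi(counts):
    """Hydrogen Deficiency Index from element counts."""
    # Work with 2 * HDI so that all arithmetic stays in integers.
    twice = 2 + sum(n * (VALENCE[sym] - 2) for sym, n in counts.items())
    if twice < 0 or twice % 2:
        raise ValueError("formula has no closed-shell structure at these valences")
    return twice // 2


def unsaturation_patterns(index):
    """All (double_bonds, triple_bonds, rings) with d + 2*t + r == index."""
    patterns = []
    for triple in range(index // 2 + 1):
        remaining = index - 2 * triple
        for double in range(remaining + 1):
            patterns.append((double, triple, remaining - double))
    return patterns


for formula in ("C6H6", "C2H5NO2"):
    index = hdi(parse_formula(formula))
    print(formula, index, len(unsaturation_patterns(index)))
```

With these valences, C6H6 has an HDI of 4 and C2H5NO2 has an HDI of 1, so the latter admits either one double bond or one ring.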
SmilX can be used through the online web application or by running it locally.
Use the web interface to explore molecular isomers directly from a molecular formula:
https://smilx-isogenerator.streamlit.app/
No installation is required.
Clone the repository and install dependencies:
git clone https://github.com/LuisOrz/SmilX.git
cd SmilX
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
SmilX
│
├── logo_smilx.png
├── requirements.txt
├── packages.txt
│
├── smilx_parameters.py
├── smilx_chemistry_tools.py
│
└── app.py
Developed by: Luis Armando Gonzalez-Ortiz, Lisset Noriega, Filiberto Ortiz, Gabriela Vidales-Ayala, Emmanuel Soberanis, Amilcar Meneses, Alan Aspuru-Guzik, and Gabriel Merino.
Centro de Investigación y Estudios Avanzados (CINVESTAV) Mérida
GNU General Public License v3.0 (GPL-3.0)
If you use SmilX in scientific research, please cite:
Gonzalez-Ortiz, L. A.; Noriega, L.; Ortiz-Chi, F.; Vidales-Ayala, G.; Soberanis-Cáceres, E.; Meneses-Viveros, A.; Aspuru-Guzik, A.; Merino, G. Grammar-driven SMILES standardization with TokenSMILES. Chemical Science, 2026, 17, 1666–1675. DOI: https://doi.org/10.1039/D5SC05004A
