
[Proposal] Add a high-level declarative syntax for diagrams#20

Open
anko9801 wants to merge 36 commits into Typsium:master from anko9801:master

anko9801 commented Sep 8, 2025


This pull request introduces a proposal for a new molecule module, intended to offer a more declarative and concise way to write chemical diagrams.

The goal is a syntax that is more intuitive and concise than established tools like ChemFig, so that writing a structure the natural way yields an IUPAC-preferred diagram.

This module is designed to integrate with core Alchemist functions, enabling a powerful hybrid workflow. While core Alchemist provides ultimate flexibility through direct cetz integration, this new module, at the cost of that flexibility, offers exceptional conciseness for the most common use cases.

Proposed Features

  • Concise Molecule Generation: A new #molecule command is introduced that can take a simple string (e.g., #molecule("CH3-CH2-OH")) and parse it into a diagram.
  • Automatic IUPAC-Preferred Orientation: The parser is designed to automatically orient chemical structures according to IUPAC recommendations to produce aesthetically pleasing and standardized diagrams.
  • Labeling: A labeling system (:label for points, ::label for lines, and a label: "..." argument) is included to allow for the creation of more complex structures with non-sequential bonds or mechanism arrows.

Example Usage

#skeletize(molecule("NH2-CH(-CH3)-C(=O)-OH"))

#skeletize(molecule((
  "CH2:a-CH2-CH2",
  "CH2-CH2-CH2:b",
  ":a=:b"
)))

#skeletize({
  molecule(
    "E-C(=O(lewis: (dots(0), dots(180))))-O:to(lewis: (dots(-45), dots(-135)))-::from-H:H + B:base <=> ",
    "[R-C(=O(lewis: (dots(0), dots(180))))-O(lewis: (dots(0), dots(-90), dots(90))) <-> ",
    "R-C(-O(lewis: (dots(0), dots(90), dots(180))))-O(lewis: (dots(0), dots(-90), dots(90))) <->",
    "R-C(-O(lewis: (dots(0), dots(90), dots(180))))-O(lewis: (dots(-135), dots(45)))]",
    "+ BH",
    ":base(lewis: (dots(180)))",
  )

  arrow("->", from: "from", to: "to.north", style: (paint: red))
  arrow("->", from: "base.west", to: "H.east", style: (paint: red))
})

What's implemented

  • Transform system (input -> Node Graph -> Alchemist structure)
    • parser combinator
    • IUPAC-compliant angle calculation
    • Connecting points
    • Resolving labels
  • Atoms (CH3 -> $C$ $H_3$)
    • Charge (NH3+, COO-)
    • Isotope (^13C, ^2H)
  • Bonds (- = # > < :> <: |> <|)
  • Rings and substituents (@6(-=-=-(-CH3)=)-CH3)
  • Option (O(lewis: (dots(0), dots(180))) -(angle: 60deg))
  • Labels (O:O1 =::bond)
  • Error and validations
  • Edge case tests
  • Optimize
  • Manual entry
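The first stage of the transform system above (input -> Node Graph) can be sketched as follows. This is a minimal illustration in Rust, not the PR's parser: the `NodeGraph`, `BondKind` and `parse_chain` names are hypothetical, only unbranched chains with `-`, `=` and `#` bonds are handled, and each string is assumed to be an independent chain whose `:label` suffixes can later be referenced as `:label`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BondKind {
    Single,
    Double,
    Triple,
}

/// Hypothetical node graph: atom texts, bonds between atom indices, named atoms.
#[derive(Debug, Default, PartialEq)]
pub struct NodeGraph {
    pub atoms: Vec<String>,
    pub bonds: Vec<(usize, usize, BondKind)>,
    pub labels: HashMap<String, usize>,
}

fn bond_kind(c: char) -> Option<BondKind> {
    match c {
        '-' => Some(BondKind::Single),
        '=' => Some(BondKind::Double),
        '#' => Some(BondKind::Triple),
        _ => None,
    }
}

/// Parse one unbranched chain such as "CH2:a-CH2-CH2" or ":a=:b" into `graph`.
/// A token starting with ':' refers to an existing label instead of adding an atom.
pub fn parse_chain(graph: &mut NodeGraph, input: &str) -> Result<(), String> {
    let mut prev: Option<usize> = None;
    let mut pending: Option<BondKind> = None;
    let mut rest = input;
    while let Some(first) = rest.chars().next() {
        if let Some(kind) = bond_kind(first) {
            if prev.is_none() || pending.is_some() {
                return Err(format!("unexpected bond '{}'", first));
            }
            pending = Some(kind);
            rest = &rest[1..]; // bond characters are single-byte ASCII
            continue;
        }
        // An atom token runs up to the next bond character.
        let end = rest
            .find(|c: char| bond_kind(c).is_some())
            .unwrap_or(rest.len());
        let token = &rest[..end];
        rest = &rest[end..];
        let index = if let Some(name) = token.strip_prefix(':') {
            *graph
                .labels
                .get(name)
                .ok_or_else(|| format!("unknown label '{}'", name))?
        } else {
            let (symbol, label) = match token.split_once(':') {
                Some((s, l)) => (s, Some(l)),
                None => (token, None),
            };
            graph.atoms.push(symbol.to_string());
            let index = graph.atoms.len() - 1;
            if let Some(l) = label {
                graph.labels.insert(l.to_string(), index);
            }
            index
        };
        if let (Some(p), Some(kind)) = (prev, pending.take()) {
            graph.bonds.push((p, index, kind));
        }
        prev = Some(index);
    }
    if pending.is_some() {
        return Err("chain ends with a dangling bond".to_string());
    }
    Ok(())
}
```

Feeding the three strings of the second usage example ("CH2:a-CH2-CH2", "CH2-CH2-CH2:b", ":a=:b") through `parse_chain` in order yields six atoms and five bonds, the last being a double bond between the atoms labelled `a` and `b`. Charges such as COO-, branches, rings and options would need a real grammar, which is presumably what the parser combinator provides.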

References

  1. Brecher, J. Graphical Representation Standards for Chemical Structure Diagrams (IUPAC Recommendations 2008). Pure and Applied Chemistry 2008, 80 (2), 277–410. DOI: 10.1351/pac200880020277

This is an initial proposal, and I would be very grateful for any feedback, suggestions, or critiques to help improve it. Thank you for your consideration.

Comment thread lib.typ Outdated
Comment thread src/elements/molecule/parser.typ Outdated
@Robotechnic

Collaborator

I'll take time to look at it in more depth. It seems like awesome work. Thanks!
Before merging, it would be a good idea to add tests and manual entries.

Also, maybe skeletize should detect a single string directly, so that instead of #skeletize(molecule(...)) we could write something like #skeletize(...). I have to think about it.

Comment thread tests/molecule-edge-cases/test.typ Outdated
lt25106 commented Jun 9, 2026

Contributor

Doesn't the typed-smiles package already exist? I feel like I usually use alchemist for better control over the angles, so maybe a chemfig-like syntax would be better.

lt25106 commented Jun 21, 2026

Contributor

Just found the molchemist package, which supports SMILES and uses alchemist under the hood. For fine control, the molchemist documentation says:

"If you need to manually fine-tune a molecule, add a specific Lewis structure, or integrate the structure into a larger custom alchemist drawing, you can use the dump parameter.
When dump: true is passed, molchemist will not render the molecule. Instead, it will output the generated native alchemist code block into your document. You can then copy, paste, and modify this code directly."

anko9801 force-pushed the master branch 2 times, most recently from b4479c9 to 3f64e10 (June 23, 2026 08:32)
anko9801 added 3 commits June 23, 2026 18:01
Replace the recursion-limited pure-Typst molecule parser with a declarative engine
under src/elements/chem/. A DSL/SMILES string is parsed by a Rust engine
(engine/ -> chem.wasm), laid out by CoordgenLibs (coordgen/ -> coordgen.wasm), then
drawn by render.typ with IUPAC GR-2008 orientation, labels, stereo and the GR
feature set. One public entry point, chem(), returns stand-alone content;
reaction()/rxn-arrow()/curly-arrow() build schemes; electron-pushing arrows are
addressed by atom id via chem's `arrows` option. `molecule` stays the hand-drawn
fragment alias.

DSL ring bodies list vertices with inferred bonds — @6 benzene, @6(N) pyridine,
@6(C(-CH3)CCCCC) toluene — auto-aromatised via a graph-level kekulize shared with
the SMILES front-end; explicit bonds override (@6(------) cyclohexane). The build
sources live next to the chem Typst code and are excluded from the package.

Assisted-by: Claude Code (model: Opus 4.8)

Five visual-regression galleries covering the engine pipeline (chem-parse,
chem-layout, chem-render, chem-compose, chem-gallery), alongside the hand-drawn
fragment/link tests (cetz-skeleton-anchors, resonance, molecules-*) kept intact
now that `molecule` aliases `fragment`.

Assisted-by: Claude Code (model: Opus 4.8)

Add a Part III "Molecule engine" chapter covering chem(), the DSL (incl. inferred-bond
rings), SMILES, stereochemistry, render options, electron-pushing arrows,
reactions and formula, plus a command reference.

Assisted-by: Claude Code (model: Opus 4.8)
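
The first commit message above mentions rings that are auto-aromatised via a graph-level kekulize. As a rough sketch of what kekulization means, the Rust snippet below assigns double bonds by backtracking so that every atom takes part in exactly one double bond. The `kekulize` and `ring_bonds` names are hypothetical, and the sketch assumes every atom passed in needs a double bond (true for benzene and pyridine, not for pyrrole-type nitrogens); the engine's real implementation may work quite differently.

```rust
/// Bonds of a simple n-membered ring: atom i is bonded to atom (i + 1) % n.
pub fn ring_bonds(n: usize) -> Vec<(usize, usize)> {
    (0..n).map(|i| (i, (i + 1) % n)).collect()
}

/// Try to mark bonds as double so that each atom has exactly one double bond.
/// Returns one flag per bond, or None if no such assignment exists.
pub fn kekulize(atom_count: usize, bonds: &[(usize, usize)]) -> Option<Vec<bool>> {
    let mut matched = vec![false; atom_count];
    let mut double = vec![false; bonds.len()];
    if assign(0, bonds, &mut matched, &mut double) {
        Some(double)
    } else {
        None
    }
}

/// Backtracking step: every atom below `start` is already matched.
fn assign(
    start: usize,
    bonds: &[(usize, usize)],
    matched: &mut [bool],
    double: &mut [bool],
) -> bool {
    // Pick the first atom that still lacks a double bond.
    let atom = match (start..matched.len()).find(|&a| !matched[a]) {
        Some(a) => a,
        None => return true, // every atom is matched
    };
    for (i, &(a, b)) in bonds.iter().enumerate() {
        let other = if a == atom {
            b
        } else if b == atom {
            a
        } else {
            continue;
        };
        if other == atom || matched[other] {
            continue;
        }
        // Tentatively make bond i double, then recurse.
        matched[atom] = true;
        matched[other] = true;
        double[i] = true;
        if assign(atom + 1, bonds, matched, double) {
            return true;
        }
        // Undo and try the next candidate bond.
        matched[atom] = false;
        matched[other] = false;
        double[i] = false;
    }
    false
}
```

For a six-membered ring, `kekulize(6, &ring_bonds(6))` returns alternating double and single bonds, while a five-membered ring in which every atom demands a double bond has no solution and returns `None`. Backtracking is exponential in the worst case, so this is only reasonable for small ring systems.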
Comment thread src/elements/chem/render.typ Outdated

Collaborator


Why are you building a new renderer? I don't think it's a good idea. If the current renderer is missing features you need, I can add them if you don't have the time or don't want to do it yourself. This duplicates the logic and, as far as I can see, is not compatible with the current renderer.

@Robotechnic

Collaborator

In reply to @lt25106's comment above about molchemist and its dump parameter:

I just talked with the author of this module. Apparently, molchemist's SMILES support was only added because someone requested it (rice8y/molchemist#1). If I understood correctly, the author is quite OK with making this part of alchemist instead of keeping it in the molchemist module.

@Ants-Aare


In reply to @Robotechnic's comment above about moving molchemist's SMILES support into alchemist:

Yeah, I think this is the right approach. I'm just very slowly working on the SMILES implementation, but it's going to have InChI support as well as some other smaller file types. I was thinking of making a new renderer for it, since it's going to output more coordinate-based information. I even started working on it, but I kind of want to keep alchemist as the go-to solution. I feel kind of guilty that my SMILES implementation is taking so long. There are already packages trying to fill the vacuum, creating more fragmentation. Even though typed-smiles is useful, it'd be cooler if there was just one package and styling system to use for everything. The diagrams do look very different, and it would be hard to use both in the same document.

rice8y commented Jun 24, 2026


Hi everyone,

I’m the author of molchemist. molchemist was originally created for rendering Molfile/SDF data. SMILES support began as experimental work in response to rice8y/molchemist#1, and v0.1.2 was the first release to include it. I’m therefore positive about consolidating the overlapping SMILES functionality into Alchemist rather than maintaining multiple partially overlapping implementations.

I checked out the current PR head (1f61b47) and compared it locally with molchemist 0.1.2. This is not an exhaustive review, so there may still be details I have missed.

For the SMILES path specifically, both implementations use CoordgenLibs for 2D layout, but their parser and intermediate-representation boundaries differ. molchemist uses a vendored copy of opensmiles with a small local parser fix, then converts the resulting graph into native Alchemist elements for draw-skeleton. The current PR uses its handwritten front/smiles.rs, maps both the DSL and SMILES into a shared graph, applies its own orientation and stereo processing, and renders through draw-chem.

If the current shared graph and layout/depiction pipeline are intended to remain, one option would be to evaluate opensmiles as the SMILES frontend. An adapter could map opensmiles::Molecule directly into that graph while preserving the metadata required by the downstream pipeline. This could retain the PR’s DSL and chemical capabilities while avoiding two independently maintained SMILES parsers.
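
To make the adapter idea concrete, here is a minimal Rust sketch. Every type in it is hypothetical: `ParsedMolecule` only stands in for whatever a SMILES parser such as opensmiles returns (its real API is not reproduced here), and `SharedGraph` stands in for the PR's shared graph. The point is only that the adapter copies per-atom metadata (charge, isotope, aromaticity) across the boundary and rejects input it cannot represent.

```rust
/// Hypothetical parser output; NOT the actual opensmiles types.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedAtom {
    pub element: String,
    pub charge: i8,
    pub isotope: Option<u16>,
    pub aromatic: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ParsedBond {
    pub from: usize,
    pub to: usize,
    pub order: u8,
    pub aromatic: bool,
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct ParsedMolecule {
    pub atoms: Vec<ParsedAtom>,
    pub bonds: Vec<ParsedBond>,
}

/// Hypothetical shared graph consumed by layout and rendering.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BondOrder {
    Single,
    Double,
    Triple,
    Aromatic,
}

#[derive(Debug, Clone, PartialEq)]
pub struct GraphAtom {
    pub label: String,
    pub charge: i8,
    pub isotope: Option<u16>,
    pub aromatic: bool,
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct SharedGraph {
    pub atoms: Vec<GraphAtom>,
    pub bonds: Vec<(usize, usize, BondOrder)>,
}

/// Map parser output into the shared graph, keeping the metadata
/// that later stages (orientation, stereo, labels) would rely on.
pub fn adapt(molecule: &ParsedMolecule) -> Result<SharedGraph, String> {
    let atoms: Vec<GraphAtom> = molecule
        .atoms
        .iter()
        .map(|a| GraphAtom {
            label: a.element.clone(),
            charge: a.charge,
            isotope: a.isotope,
            aromatic: a.aromatic,
        })
        .collect();
    let mut bonds = Vec::with_capacity(molecule.bonds.len());
    for b in &molecule.bonds {
        if b.from >= atoms.len() || b.to >= atoms.len() {
            return Err(format!("bond {}-{} references a missing atom", b.from, b.to));
        }
        let order = match (b.aromatic, b.order) {
            (true, _) => BondOrder::Aromatic,
            (false, 1) => BondOrder::Single,
            (false, 2) => BondOrder::Double,
            (false, 3) => BondOrder::Triple,
            (false, other) => return Err(format!("unsupported bond order {}", other)),
        };
        bonds.push((b.from, b.to, order));
    }
    Ok(SharedGraph { atoms, bonds })
}
```

Keeping the conversion in one fallible function means the downstream pipeline never sees a bond that points at a missing atom, and a change in the parser's types only touches this adapter.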

Regarding rendering, I share Robotechnic’s concern about duplicating behavior that already exists in draw-skeleton. The PR history confirms that the earlier pure-Typst implementation produced native Alchemist drawables for skeletize, while commit f3ace27 replaced that transformer with the dedicated draw-chem path.

At the same time, I understand Ants-Aare’s point that coordinate-based output may benefit from a different rendering layer. I do not yet have a strong conclusion about whether reusing draw-skeleton directly is practical for all the new features. However, I think the important goal is to keep one Alchemist styling system: either by extending the existing renderer, or by making a coordinate-aware layer share the same primitives, spacing rules, and configuration rather than becoming an independent parallel renderer.

Robotechnic commented Jun 25, 2026

Collaborator

I have to say that the current rendering engine fully supports coordinate-based rendering: it works by first placing all the fragments and anchors and then drawing the links between them. The API is just not exposed. We can discuss it a bit, but I can provide a more computer-friendly API that takes fragments, their coordinates and the links between them and does the rendering. It's just a matter of connecting the right pieces together.

My main concern with a separate renderer is that if it's not fully compatible with the current one, you can't really mix the two ways of rendering a molecule, and the cetz anchors won't be the same if you want to integrate a molecule in cetz. This would be more confusing for the end user.

https://github.com/Typsium/alchemist/blob/master/src%2Fdrawer.typ#L434

This call is where all the links are drawn; it's just a matter of giving this function the right context.

@anko9801

Author

Thanks — and you're right that those pieces are addable. I think we actually converge here.

The way I see it, the renderer's job is "draw a molecule whose atoms already have coordinates" — that part I want to share, not fork. The hard part is upstream: parsing + IUPAC 2D layout (fused/bridged/macrocycles), which has no counterpart in the current renderer to improve. It's a coordinate solver (CoordgenLibs, same as molchemist) + orientation/stereo, and it has to run as a compiled WASM plugin, not in Typst itself — the earlier pure-Typst version here hit Typst's recursion limit and had to fake rings with invisible bonds. So the WASM engine isn't a reimplementation of your renderer; it's the missing layer.

And what I'd need from you is exactly what you offered. Reading draw-link-decoration / calculate-mol-mol-link-anchors, they already work from coordinates (angle from angle-between, endpoints from the fragment ellipse — no turtle in the path), so a "place fragments at coords, then draw the links" entry — skeletize minus the turtle front-end — looks like all that's needed, reusing your link/lewis/fragment drawing. The engine already emits your link names and one label per atom, so it maps ~1:1.
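
The geometry described above (a link angle taken between two fragment centres, endpoints taken on each fragment's ellipse) can be sketched in Rust as follows. This is an illustration of the idea only, not alchemist's code: the `Fragment` struct and the function names are hypothetical, and each fragment is assumed to be bounded by an axis-aligned ellipse with positive half-width and half-height.

```rust
use std::f64::consts::PI;

/// Hypothetical placed fragment: a centre plus the half-axes of its bounding ellipse.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Fragment {
    pub center: (f64, f64),
    pub half_width: f64,
    pub half_height: f64,
}

/// Angle in radians of the direction from `a` to `b`.
pub fn angle_between(a: (f64, f64), b: (f64, f64)) -> f64 {
    (b.1 - a.1).atan2(b.0 - a.0)
}

/// Point where the ray leaving the fragment centre at `angle` crosses its ellipse.
pub fn ellipse_anchor(fragment: &Fragment, angle: f64) -> (f64, f64) {
    let (c, s) = (angle.cos(), angle.sin());
    // Polar form of an axis-aligned ellipse: r = 1 / sqrt((cos/a)^2 + (sin/b)^2).
    let r = 1.0
        / ((c / fragment.half_width).powi(2) + (s / fragment.half_height).powi(2)).sqrt();
    (fragment.center.0 + r * c, fragment.center.1 + r * s)
}

/// Start and end points of a link drawn between two placed fragments.
pub fn link_endpoints(from: &Fragment, to: &Fragment) -> ((f64, f64), (f64, f64)) {
    let angle = angle_between(from.center, to.center);
    // The link leaves `from` along `angle` and enters `to` from the opposite side.
    (ellipse_anchor(from, angle), ellipse_anchor(to, angle + PI))
}
```

With two fragments of half-width 0.5 centred at (0, 0) and (2, 0), the link runs from roughly (0.5, 0) to (1.5, 0). Nothing here depends on the order in which atoms were drawn, which is why a coordinate entry point needs no turtle state.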

In return I'd drop render.typ's own bond/label drawing, adopt your anchor convention (so cetz anchors match by construction — your main concern), and use alchemist's config as the single styling system.

@rice8y — following your suggestion, I'll use opensmiles as the SMILES frontend in this PR. @Ants-Aare — I don't want to step on your InChI/Molfile work; it can feed the same graph + renderer, and I'm happy to coordinate however you prefer.

So the PR would be DSL + opensmiles → engine → that coordinate entry, with render.typ's core removed and anchors/config unified. Want to design the entry's signature together?

anko9801 added 2 commits June 27, 2026 23:02
Replace the dedicated coordinate renderer (`render.typ`) with a pipeline that
drives alchemist's own drawing, so molecules share one rendering path — and
therefore identical CeTZ anchors, spacing and styling — with hand-drawn
`skeletize` molecules instead of duplicating the logic:

- `skeleton.typ`: place each engine atom as a native fragment at its absolute
  coordinate, then draw bonds turtle-free through the existing
  `draw-link-decoration`. Reuses link/lewis/fragment drawing; one bond-length
  unit maps to `atom-sep` (resolved to canvas units via `convert-length` so the
  manual geometry stays float-valued), keeping engine and hand-drawn molecules
  at the same scale.
- `decorations.typ`: thin overlay for the non-bond GR features that have no
  counterpart in alchemist (ionic dots, brackets, aromatic circles, curly
  arrows, partial charges, hapto/Markush, delocalized/bent/wavy bonds).
- `labels.typ`: shared GR label/colour helpers (subscripts, isotope/charge/
  oxidation superscripts, Jmol CPK colours).
- styling now flows from alchemist's `default` config (single styling system).

`draw-chem` is exposed alongside `chem` so several molecules, reaction arrows
and cross-molecule electron-pushing arrows compose in one shared canvas.

Assisted-by: Claude Code (model: claude-opus-4-8)

The alchemist-primitive rendering and atom-sep scaling change the output, so
refresh the golden images for the chem test suite.

Assisted-by: Claude Code (model: claude-opus-4-8)

5 participants