
Add inline LaTeX math support via OMML injection #225

Open
troutrot wants to merge 1 commit into MartinPacker:main from troutrot:feature/inline-math-support

Conversation

@troutrot
Contributor

Summary

  • Adds $...$ inline math syntax that renders as native PowerPoint math objects (OMML)
  • New mathxsl metadata option specifies the path to MML2OMML.XSL (included with Microsoft Office)
  • New pptx_math.py module handles LaTeX → MathML → OMML conversion via latex2mathml and lxml
  • Falls back to literal $...$ text if the option is unset or conversion fails

Usage

Add to your Markdown metadata:

mathxsl: /path/to/MML2OMML.XSL

Then use inline math in bullet text:

* The Euler identity $e^{i\pi}+1=0$ is beautiful.

Requirements

pip3 install latex2mathml lxml

MML2OMML.XSL is included with a local Microsoft Office installation.
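
The conversion and fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in pptx_math.py: the function names `latex_to_omml` and `render_inline` are hypothetical, and the sketch assumes that `latex2mathml.converter.convert` and lxml's `etree.XSLT` are the calls involved.

```python
from typing import Optional


def latex_to_omml(latex: str, xsl_path: Optional[str]) -> Optional[str]:
    """Return OMML XML for a LaTeX expression, or None if it cannot be produced.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if not xsl_path:
        return None  # mathxsl option unset: the caller keeps the literal text
    try:
        # Imported lazily so both packages remain optional dependencies.
        import latex2mathml.converter
        from lxml import etree

        mathml = latex2mathml.converter.convert(latex)   # LaTeX -> MathML string
        transform = etree.XSLT(etree.parse(xsl_path))    # load MML2OMML.XSL
        omml_tree = transform(etree.fromstring(mathml))  # MathML -> OMML
        return etree.tostring(omml_tree.getroot(), encoding="unicode")
    except Exception:
        # Missing package, missing stylesheet, or a conversion error:
        # signal failure so the caller can fall back.
        return None


def render_inline(latex: str, xsl_path: Optional[str]) -> str:
    """Return OMML if available, otherwise the original $...$ text."""
    omml = latex_to_omml(latex, xsl_path)
    return omml if omml is not None else f"${latex}$"
```

Catching every exception is a deliberately blunt choice here so that a bad expression or a wrong stylesheet path never aborts the whole deck; a real implementation would probably want to log the reason.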

Introduces $...$ inline math syntax that converts LaTeX expressions to
native PowerPoint math objects (OMML) using MML2OMML.XSL. Activated by
the new `mathxsl` metadata option. Falls back to literal text if the
option is unset or conversion fails.

- pptx_math.py: new module for LaTeX -> MathML -> OMML conversion
- paragraph.py: sentinel-based inline math parser and OMML injector
- md2pptx: `mathxsl` processing option and math inserter initialisation
- docs/user-guide.md: document `mathxsl` option and inline math syntax
- README.md: note latex2mathml and lxml as optional dependencies
@MartinPacker
Owner

I like this - but I need to check it out in detail.

As this requires a fair amount of documentation, I intend to address it after your others, @troutrot.

@MartinPacker
Owner

Question: Is there a python package to pull in to support all this - whose presence md2pptx can test for (preferably with the level being returned)?

@troutrot
Contributor Author

troutrot commented May 20, 2026

I take it you're suggesting I publish this as a standalone PyPI
package (something like pptx-math) with __version__ exposed,
so md2pptx can do an optional import pptx_math check at startup
— same pattern as Pillow/CairoSVG. Let me know if I'm reading
that wrong.

If so, happy to do it. I checked PyPI and didn't find anything
existing that does the LaTeX→OMML→pptx pipeline end-to-end
(latex2mathml stops at MathML), so a new package seems warranted.

One caveat to flag: the MathML→OMML step uses Microsoft's
MML2OMML.XSL, which ships with Office and can't be
redistributed under its EULA. Users will need to supply their
local copy. I'll document this clearly in the README.

Will update the PR once pptx-math is published.
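
For reference, a startup check that reports whether an optional package is present, and at what version, could look something like the sketch below. It uses only the standard library's `importlib.metadata`. The helper names are hypothetical, and it checks the two packages named in this PR (latex2mathml and lxml) rather than the not-yet-published pptx-math.

```python
from importlib import metadata
from typing import Dict, Optional


def optional_package_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def math_support_report() -> Dict[str, Optional[str]]:
    """Map each optional math dependency to its version (None when missing)."""
    return {name: optional_package_version(name) for name in ("latex2mathml", "lxml")}


if __name__ == "__main__":
    for name, version in math_support_report().items():
        print(f"{name}: {version if version else 'not installed'}")
```

Querying the installed metadata avoids importing the packages just to read a `__version__` attribute, which not every package exposes.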

@MartinPacker
Owner

Actually we need md2pptx more actively involved: The idea is to have the user code something in a md2pptx input file that creates the OMML. Your code probably does that. All I was asking was if it used a python shim to get to the underlying code. And does that code run on all architectures python-pptx runs on?

@MartinPacker
Owner

Again, don't take this as a negative but we will need to document this deviation from standard markdown. There is a section for that in the User Guide.

If there IS a dialect of Markdown that supports this let me know - and I can nod to it when documenting it as a deviation.

I also need to see how this renders. Can you post a screenshot?

@troutrot
Contributor Author

troutrot commented May 20, 2026

Thanks for the clarification! Let me address your points directly:

  1. Dependencies and Architecture (The "shim" question)
    No, it does not use a shim to call an external LaTeX installation (like TeX Live). It is not dependent on any heavy external software.
    The code relies strictly on two Python packages: latex2mathml (a pure Python package) and lxml.

    Regarding architectures: Yes, the Python code itself runs on all architectures that python-pptx supports (Windows, Mac, Linux). However, there is one caveat: the MathML-to-OMML conversion currently requires the MML2OMML.XSL stylesheet. Since this file ships with Microsoft Office, Mac or Linux users would need to manually supply this file and specify its path in the markdown header (e.g., mathxsl: /path/to/MML2OMML.XSL).
    (If this XSL dependency is a concern for true cross-platform portability in the future, I can look into writing a pure-Python mapping rule to replace it, but this is how the current code operates.)

  2. Markdown Dialect
    I completely understand the need for documentation! I will add a section to the User Guide.
    As for the dialect, using $$ ... $$ for block math and $ ... $ for inline math is actually the de facto standard in modern Markdown. It is natively supported by:

      • GitHub Flavored Markdown (GFM) (via MathJax)
      • Jupyter Notebooks
      • Pandoc Markdown
      • Obsidian / Notion

  3. Rendering Screenshot
    Because this converts directly to OMML, it renders as a native PowerPoint Equation object (not an image). It is fully editable within PowerPoint and scales perfectly.

Could you please wait until tomorrow for the screenshot? I am currently working on a Mac that doesn't have the MML2OMML.XSL file, and I don't have the previously generated PPTX file at hand. I tested the code on another machine where the environment is fully set up. I will have access to that machine tomorrow, so I will generate the file and post the screenshot here as soon as I can!

Let me know if the current dependency setup (requiring the XSL file) is acceptable for now.

@MartinPacker
Owner

Thanks for all this. Please don't touch the User Guide - particularly as I am currently editing it. And it needs care in getting from .mdp to .md plus .log and then to .html. (mdpre is an important step in this.)

@troutrot
Contributor Author

[Screenshot 2026-05-21 14 02 03] [Screenshot 2026-05-21 14 00 16]

The path is not a real one.

@MartinPacker
Owner

I'm also bothered by a simple $ bracketing. That would complicate parsing versus other uses of the character.

Would something like <math> ... </math> be OK? I'm assuming many use cases would involve automating wrapping the LaTeX with whatever syntax.

Also what would an incantation that put the equation on its own line look like? And would we like some kind of equation reference number? (Perhaps these are stretch objectives.)

@troutrot
Contributor Author

troutrot commented May 21, 2026

Thanks for raising this, Martin! You make a great point about a simple $ colliding with currency (like "costs $5 to $10").

To avoid custom tags, there are two established Markdown standards to handle this safely. I can implement whichever you prefer:

Option 1: Spacing Rules (Jupyter / Pandoc)
We use the standard $...$ delimiters, but the parser enforces strict spacing (e.g., no spaces immediately inside the $). This safely ignores normal text like $5 to $10.

Option 2: Backticks (GitHub / GitLab)
We use $`...`$ for inline math. This guarantees zero collisions with currency, and also stops Markdown from accidentally turning math symbols (like _ or *) into italics/bold.
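
To make Option 1 concrete, here is a rough sketch of a Pandoc-style spacing rule written as a regular expression. The pattern and the name `find_inline_math` are illustrative assumptions, not code from this PR: the opening $ must be followed by a non-space character, and the closing $ must be preceded by a non-space character and not followed by a digit.

```python
import re

INLINE_MATH = re.compile(
    r"(?<!\\)\$(?=\S)"      # opening $: not backslash-escaped, next char is non-space
    r"((?:\\.|[^$\\])+?)"   # body: backslash escapes, or anything except $ and backslash
    r"(?<=\S)\$(?!\d)"      # closing $: previous char is non-space, next char is not a digit
)


def find_inline_math(text: str) -> list:
    """Return the LaTeX bodies of all $...$ spans that satisfy the spacing rule."""
    return [match.group(1) for match in INLINE_MATH.finditer(text)]


if __name__ == "__main__":
    print(find_inline_math("costs $5 to $10"))
    print(find_inline_math(r"The Euler identity $e^{i\pi}+1=0$ is beautiful."))
```

This is only a heuristic. Text such as "$5, or 10$" would still be picked up as math, which is part of the parsing concern raised above.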

Which inline math approach do you prefer for md2pptx?

Regarding your other questions:

Equations on their own line ($$...$$): This is the standard for block math. I haven't implemented this in the current PR yet, but I plan to add it!

Equation reference numbers: As you guessed, true cross-referencing (linking) is definitely a stretch objective. As for simply displaying an equation number (like using \tag{1}), I haven't fully verified how the OMML conversion handles that specific layout yet, so I will investigate and test it!

Let me know your thoughts on the inline syntax!

@MartinPacker
Owner

MartinPacker commented May 21, 2026

Just to answer the "equation in a block" question I have used the following for "alien syntax" elsewhere in md2pptx:

Triple backtick on a line with eg "mathml" as the second word on the line.

Ending with another triple backtick line.

A slightly weird way of doing equation numbers might be a third parameter so...

 ''' mathml 1.2.4
...

'''

where ''' is really triple backtick.

If the inline equation use case isn't wanted the above would be all that needs implementing.
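
A parser for this kind of block might be sketched as below. None of the names come from md2pptx. The set of accepted tag names is an assumption, passed in as a parameter, and matching is case-insensitive. The optional third word on the opening line is treated as the equation number.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

FENCE = "`" * 3  # triple backtick, built this way to keep this example readable


class MathBlock(NamedTuple):
    number: Optional[str]  # e.g. "1.2.4", or None when no number was given
    latex: str             # block body, with lines joined by newlines


def extract_math_blocks(
    lines: Iterable[str],
    tags: Iterable[str] = ("math", "maths", "mathml"),  # assumed tag names
) -> List[MathBlock]:
    """Collect fenced math blocks; an unterminated block is silently dropped."""
    accepted = {tag.lower() for tag in tags}
    opener = re.compile(r"^\s*" + re.escape(FENCE) + r"\s*(\w+)(?:\s+(\S+))?\s*$")
    blocks: List[MathBlock] = []
    body: Optional[List[str]] = None  # None means "not inside a math block"
    number: Optional[str] = None
    for line in lines:
        if body is None:
            match = opener.match(line)
            if match and match.group(1).lower() in accepted:
                body, number = [], match.group(2)
        elif line.strip() == FENCE:
            blocks.append(MathBlock(number, "\n".join(body)))
            body = None
        else:
            body.append(line)
    return blocks
```

For example, an opening line of triple backtick followed by `mathml 1.2.4`, then `E = mc^2`, then a closing triple-backtick line would yield one block numbered 1.2.4. Fences with other tags (such as a Python block) are ignored by this sketch.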

@troutrot
Contributor Author

troutrot commented May 21, 2026

You are absolutely right. After thinking about it, I completely agree that trying to parse inline math can introduce unwanted errors and edge cases. Keeping the parser safe and simple should be the top priority.

As a compromise for the future:

Let's leave the inline math support (like the GitHub-style $`...`$) and the $$...$$ block syntax as a future enhancement or a separate PR down the road. For now, getting this solid code block implementation merged is a great first step.

@MartinPacker
Owner

Great. So what syntax are you going for? Triple backtick with mathml (not case sensitive) to open the bracket and triple backtick to close it?

A nice benefit of that is you get multiple lines supported.

I assume this will add XML into the tree, rather than a rendered image. The advantage of the latter is controllable dimensions.

@troutrot
Contributor Author

Thanks for the feedback.

  1. Syntax:
    Yes, I will definitely use the triple backtick format (```). However, I prefer using math instead of mathml. Since the user is writing LaTeX (not raw MathML XML tags), math is semantically correct and aligns perfectly with the GitHub standard.

  2. XML vs Image:
    I'd definitely prefer going with XML (OMML). Images make precise layout positioning difficult and would make future inline math support much harder to implement.

I will update the PR with this code block implementation in the coming days!

@MartinPacker
Owner

I have no problem with math. It might appeal to those who don't know the contraction maths. Perhaps accept either.

I hope you are able to make the generated stuff fit within a rendering rectangle - as that is the primitive that enables multiple blocks on a slide. (Or even one without colliding with a title.)

@troutrot
Contributor Author

  1. math vs maths
    I see. I will definitely make it accept both math and maths (case-insensitive) to be universally friendly!
  2. Rendering Rectangle (Dimensions & Alignment)
    You don't need to worry about collisions! Since OMML acts exactly like native text in PowerPoint, we don't need to do any complex boundary calculations.
    The plan is to simply add a new Paragraph within the current rendering rectangle (TextFrame), inject the OMML there, and apply Center Alignment. Because it is just text, PowerPoint will naturally confine it within the existing rectangle and prevent it from colliding with the title, scaling the font size if necessary.
    Does simply appending a centered paragraph inside the current text block sound like the right approach to you?

@MartinPacker
Owner

One thought that occurred to me:

RunPython might be a good place to prototype much of this. It might also be a good model - as it also does the "triple backtick followed by a name" thing.
