Page cache key omits PageRendererConfig; rw.toml diagram-config changes serve stale HTML #401

@yumike

Summary

PageRenderer::render_content (crates/rw-site/src/page.rs:305) computes the
page cache etag as:

let etag = format!("{source_mtime}:{}", ctx.resolution_fingerprint);

That covers two of the three inputs the rendered HTML depends on — the page's
own markdown (source_mtime) and cross-page snapshot state
(resolution_fingerprint) — but not PageRendererConfig. Because the
on-disk FileCache is otherwise only wiped on a CARGO_PKG_VERSION mismatch
(crates/rw-cache/src/file.rs:136; the version comes from CARGO_PKG_VERSION
at crates/rw-server/src/lib.rs:149), an rw.toml edit that neither coincides
with a binary version bump nor touches any page's markdown mtime is
invisible to the page cache, which silently serves stale HTML.

The affected PageRendererConfig fields are kroki_url, dpi,
include_dirs, and extract_title (crates/rw-site/src/page.rs:63). They're
read into the renderer at PageRenderer::new (page.rs:249) and feed
create_renderer / create_diagram_processor (page.rs:412, page.rs:474),
but no representation of them reaches the etag.

Recovery requires manually deleting .rw/cache/pages/ — undiscoverable, with
no log, warning, or error.

History: This issue originally reported two classes of stale-HTML bugs.
The second — cross-page state (wikilink display titles, data-section-ref
attributes resolved from the target page) — was fixed in 649627e by
folding a resolution_fingerprint into the etag. That case is gone. This
issue now tracks only the remaining PageRendererConfig case.

Reproduction

Verified on main (head 649627e).

dpi change is ignored until cache wipe

mkdir -p docs && cat > docs/index.md <<'MD'
# Home

```plantuml
@startuml
Alice -> Bob: hi
@enduml
```
MD

cat > rw.toml <<'TOML'
[docs]
source_dir = "docs"
[diagrams]
kroki_url = "https://kroki.io"
dpi = 96
TOML

rw serve &  # first render caches HTML with dpi=96 SVG
curl -s localhost:7979/_api/pages/ | grep -o 'width="[^"]*"'   # observe dpi=96 width
kill %1

sed -i.bak 's/dpi = 96/dpi = 300/' rw.toml   # -i.bak works with both GNU and BSD sed
rw serve &
curl -s localhost:7979/_api/pages/ | grep -o 'width="[^"]*"'   # SAME width — cache hit

The same applies to toggling kroki_url between unset and https://kroki.io:
whichever output was cached first (plain code blocks or rendered SVGs) keeps
being served as long as the markdown's mtime hasn't moved. The diagrams
bucket keys on diagram source and would re-render, but the page HTML is
served from the pages bucket before that ever runs, so the freshly rendered
SVG never gets embedded.

Root cause

crates/rw-site/src/page.rs:289-319:

fn render_content(
    &self,
    path: &str,
    page: &Page,
    breadcrumbs: Vec<BreadcrumbItem>,
    ctx: &RenderContext,
) -> Result<PageRenderResult, RenderError> {
    let source_mtime = self.storage.mtime(path).map_err(RenderError::from)?;
    let metadata = self.load_metadata(path);

    let etag = format!("{source_mtime}:{}", ctx.resolution_fingerprint);

    if let Some(cached) = self.page_bucket.get_json::<CachedPage>(path, &etag) {
        return Ok(/* cached html, title, toc */);
    }
    ...
}

The rendered HTML is a function of
(markdown source, SiteState, PageRendererConfig). The etag now covers the
first two (source_mtime and resolution_fingerprint, the latter built in
SiteState::new at crates/rw-site/src/site_state.rs:231) but never the
third. The FileCache VERSION guard catches binary upgrades only.
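
The stale hit can be modelled in a few lines. This is a toy sketch, not the
real code: ToyPageCache, current_etag, and toy_render are hypothetical
stand-ins for the pages bucket and render_content, and the u64 mtime and
string fingerprint are assumed types. It only shows that an etag built from
(source_mtime, resolution_fingerprint) cannot distinguish two renders that
differ in dpi alone.

```rust
use std::collections::HashMap;

/// Toy stand-in for the pages bucket: path -> (etag, cached html).
/// Hypothetical; the real FileCache is on disk and has a different API.
struct ToyPageCache {
    entries: HashMap<String, (String, String)>,
}

impl ToyPageCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Returns the cached HTML only when the stored etag matches.
    fn get(&self, path: &str, etag: &str) -> Option<&str> {
        match self.entries.get(path) {
            Some((stored, html)) if stored.as_str() == etag => Some(html.as_str()),
            _ => None,
        }
    }

    fn set(&mut self, path: &str, etag: &str, html: String) {
        self.entries.insert(path.to_string(), (etag.to_string(), html));
    }
}

/// Same shape as the current etag: the config is not an input.
fn current_etag(source_mtime: u64, resolution_fingerprint: &str) -> String {
    format!("{source_mtime}:{resolution_fingerprint}")
}

/// Hypothetical render step whose output depends on dpi, unlike the etag.
fn toy_render(
    cache: &mut ToyPageCache,
    path: &str,
    source_mtime: u64,
    resolution_fingerprint: &str,
    dpi: u32,
) -> String {
    let etag = current_etag(source_mtime, resolution_fingerprint);
    if let Some(html) = cache.get(path, &etag) {
        return html.to_string(); // cache hit: dpi is never consulted
    }
    let html = format!("<svg data-dpi=\"{dpi}\"></svg>");
    cache.set(path, &etag, html.clone());
    html
}
```

Calling toy_render twice for the same path, mtime, and fingerprint, first
with dpi 96 and then with dpi 300, returns the dpi=96 HTML both times.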

Suggested fix

Hash a stable representation of PageRendererConfig once at
PageRenderer::new and store the hash on the renderer, then mix it into the
etag:

let etag = format!("{source_mtime}:{}:{}", ctx.resolution_fingerprint, self.config_hash);

kroki_url, dpi, include_dirs, and extract_title are all cheaply
hashable. This mirrors the already-merged resolution_fingerprint approach
(commit 649627e) — one more component on the same etag string. The
expensive part (diagram rendering) lives in the separately content-keyed
diagrams bucket and is unaffected, so a config change just replays the
markdown pass plus syntax highlighting.
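
One possible shape for the hash, as a minimal sketch under stated
assumptions: the struct below is a hypothetical stand-in for the real
PageRendererConfig, and its field types (Option<String>, u32, Vec<PathBuf>,
bool) are guesses, as is the u64 mtime. FNV-1a is used here only because it
is dependency-free and produces the same value on every toolchain, which
suits an on-disk cache key; any stable hash would do. config_hash and
build_etag are illustrative names, not existing functions.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for PageRendererConfig; field types are assumed.
#[derive(Debug, Clone)]
struct PageRendererConfig {
    kroki_url: Option<String>,
    dpi: u32,
    include_dirs: Vec<PathBuf>,
    extract_title: bool,
}

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds `bytes` into a running 64-bit FNV-1a hash.
fn fnv1a(bytes: &[u8], mut hash: u64) -> u64 {
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hashes a length-prefixed string so adjacent fields cannot run together.
fn fnv1a_str(s: &str, hash: u64) -> u64 {
    let hash = fnv1a(&(s.len() as u64).to_le_bytes(), hash);
    fnv1a(s.as_bytes(), hash)
}

/// Computed once (e.g. in the constructor) and stored on the renderer.
fn config_hash(cfg: &PageRendererConfig) -> String {
    let mut h = FNV_OFFSET;
    // Tag byte distinguishes None from Some("").
    match &cfg.kroki_url {
        Some(url) => h = fnv1a_str(url, fnv1a(&[1], h)),
        None => h = fnv1a(&[0], h),
    }
    h = fnv1a(&cfg.dpi.to_le_bytes(), h);
    h = fnv1a(&(cfg.include_dirs.len() as u64).to_le_bytes(), h);
    for dir in &cfg.include_dirs {
        h = fnv1a_str(&dir.to_string_lossy(), h);
    }
    h = fnv1a(&[u8::from(cfg.extract_title)], h);
    format!("{h:016x}")
}

/// Etag with the config hash appended as a third component.
fn build_etag(source_mtime: u64, resolution_fingerprint: &str, config_hash: &str) -> String {
    format!("{source_mtime}:{resolution_fingerprint}:{config_hash}")
}
```

With this shape, changing dpi from 96 to 300 changes the third etag
component, so the pages bucket misses and the page is re-rendered, while an
unchanged config keeps producing the same etag across restarts.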

Why it matters

  • Routine operator actions (tuning dpi, switching kroki_url) silently have
    no effect. Today's only fix is deleting .rw/cache/pages/, which isn't
    discoverable.
  • No log, no warning, no error — the failure mode is "my config change didn't
    take effect".
