fix: use ~ as URL list separator so shared links survive linkifiers (#672) #686

Open
grzanka wants to merge 2 commits into master from claude/bold-gauss-jec0x

@grzanka (Contributor) commented Jun 1, 2026

Problem (#672)

Pasting a shareable dEdx link into Signal/iMessage/email auto-links only the part up to the first comma; the rest renders as plain text. Our shareable URLs used a literal comma as the list-item separator for energies, particles, materials, programs, lookups, mat_elements, and series. Messenger/email linkifiers are heuristic and treat a comma as sentence punctuation, so they terminate the auto-link there — breaking almost every multi-row shared link (even the default two-row energies=100,200).

Fix

Switch the canonical list separator to ~. It is an RFC 3986 unreserved character, so it never needs percent-encoding in a URL, it is not dropped by linkifiers, and it is unused elsewhere in our token grammar. Decoders accept both ~ and the legacy comma, so every previously shared or bookmarked link keeps working.

  • Encoders join with ~ (calculator/plot/entity-id/mat_elements codecs).
  • Decoders split on /[,~]/ (backward compatible).
  • Serializers restore %7E → ~ (URLSearchParams.toString() encodes ~); : stays literal, so URLs remain human-readable.
  • isUrlSafeNumeric also rejects ~ so a bad row can never inject the separator.
  • Grammar gains list-sep = "~" / ","; LookupUnitToken excludes ~.
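
The encode/decode/serialize steps listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `LIST_SEP`, `LIST_SPLIT_RE`, `joinList`, `splitList`, and `toReadableQuery` are hypothetical names (the PR's own constants live in src/lib/utils/url-shared.ts), and a global regex replace stands in for `replaceAll` only to avoid depending on a specific compile target.

```typescript
// Hypothetical helper names; only the separator and split pattern come from the PR.
const LIST_SEP = "~";
const LIST_SPLIT_RE = /[,~]/;

// Encoder side: always emit the canonical separator.
function joinList(items: readonly string[]): string {
  return items.join(LIST_SEP);
}

// Decoder side: accept canonical `~` and the legacy `,` (even mixed).
function splitList(raw: string): string[] {
  return raw.split(LIST_SPLIT_RE);
}

// URLSearchParams.toString() percent-encodes `:` as %3A and `~` as %7E,
// so the writer restores both to keep the URL readable.
function toReadableQuery(params: URLSearchParams): string {
  return params.toString().replace(/%3A/g, ":").replace(/%7E/g, "~");
}

const params = new URLSearchParams();
params.set("urlv", "3");
params.set("energies", joinList(["100", "200", "500"]));
params.set("lookups", joinList(["7.72:cm", "45:um"]));

const query = toReadableQuery(params);
const legacyDecoded = splitList("100,200,500");
const mixedDecoded = splitList("100,200~500");
console.log(query, legacyDecoded, mixedDecoded);
```

Splitting on a character class rather than a single separator is what keeps legacy comma links decodable without a separate migration step.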

Version bump → urlv=3

The canonical form changed, so the schema is bumped to v3. A naive bump would have made negotiateVersion reject every existing v2 link (it accepted only v === CURRENT), so negotiateVersion now accepts the [MIN_SUPPORTED_URL_MAJOR (2), CURRENT_URL_MAJOR (3)] range. v2 links hydrate identically (decoders read both separators) and are rewritten to canonical v3 ~ form on load via the existing replaceState. migrateUrl stays the identity; v1 remains retired.
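
The range check described above could look something like the sketch below. The constant names come from the PR description; the function name `negotiateVersionSketch`, its return type, and the treatment of a missing or non-integer `urlv` as unsupported are assumptions made for illustration and may differ from the real `negotiateVersion`.

```typescript
const MIN_SUPPORTED_URL_MAJOR = 2;
const CURRENT_URL_MAJOR = 3;

// Hypothetical verdict type; the real function's return shape is not shown in the PR text.
type VersionVerdict = "current" | "legacy-supported" | "unsupported";

function negotiateVersionSketch(rawUrlv: string | null): VersionVerdict {
  // Assumption: a missing or non-integer urlv is treated as unsupported here.
  if (rawUrlv === null || !/^\d+$/.test(rawUrlv)) return "unsupported";
  const v = Number(rawUrlv);
  if (v === CURRENT_URL_MAJOR) return "current";
  // Accept the whole [MIN_SUPPORTED_URL_MAJOR, CURRENT_URL_MAJOR] range
  // instead of only v === CURRENT_URL_MAJOR.
  if (v >= MIN_SUPPORTED_URL_MAJOR && v < CURRENT_URL_MAJOR) return "legacy-supported";
  return "unsupported";
}

console.log(negotiateVersionSketch("3")); // current
console.log(negotiateVersionSketch("2")); // legacy-supported
console.log(negotiateVersionSketch("1")); // unsupported (v1 stays retired)
```

With a strict equality check, bumping `CURRENT_URL_MAJOR` to 3 would move every v2 link into the last branch, which is the regression the range avoids.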

Examples

| | Before (v2) | After (v3) |
| --- | --- | --- |
| Calculator | …&energies=100,200,500 | …&energies=100~200~500 |
| Multi-program | programs=9,2,101 | programs=9~2~101 |
| Inverse lookups | lookups=7.72:cm,45:um | lookups=7.72:cm~45:um |
| Plot series | series=9.1.276,2.1.276 | series=9.1.276~2.1.276 |

Tests

  • New src/tests/unit/url-separator-672.test.ts: no-comma encoder guards (the regression lock), legacy-comma + mixed-separator decode, encode→decode round-trips, a linkifier regression (a comma-terminating autolink captures the whole ~ URL but truncates the comma form), and the v2→v3 upgrade.
  • New ~ parse cases in url-parse.test.ts; new E2E in calculator-url.spec.ts (canonical ~, v2→v3 upgrade) and url-parser.spec.ts (v3 current / v2-in-range no banner).
  • Existing fixtures updated: encoder output , → ~ and urlv=2 → urlv=3; input URLs kept with commas to prove backward compatibility.
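
The linkifier regression mentioned in the test list can be approximated with a toy model. The sketch below is not the test from url-separator-672.test.ts; `naiveAutolink` is a hypothetical stand-in for a comma-terminating linkifier, and the example.org URL is made up.

```typescript
// Toy linkifier: a link runs from the scheme up to the first whitespace or comma.
// Real messengers use more elaborate heuristics; this only models the failure in #672.
function naiveAutolink(text: string): string | null {
  const match = text.match(/https?:\/\/[^\s,]+/);
  return match ? match[0] : null;
}

// Hypothetical host and path, used only for illustration.
const base = "https://example.org/calculator?urlv=3&energies=";
const commaUrl = base + "100,200,500";
const tildeUrl = base + "100~200~500";

const linkedComma = naiveAutolink(`see ${commaUrl} for details`);
const linkedTilde = naiveAutolink(`see ${tildeUrl} for details`);

console.log(linkedComma === commaUrl); // false: truncated at the first comma
console.log(linkedTilde === tildeUrl); // true: the whole URL is captured
```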

Static checks

pnpm lint, pnpm run check (svelte-check + tsc), and pnpm run format:check all pass. Full unit suite green except 3 pre-existing guard-forbidden-files tests that fail on a clean tree too (they need git base refs unavailable in this environment).

Docs

  • shareable-urls.md → v8, shareable-urls-formal.md → v9: ABNF list-sep, version-detection range, canonicalization, "linkifier-safe" rationale, and examples updated.

https://claude.ai/code/session_01P21YxiM9UmhQe1R1fS1htJ


Generated by Claude Code

…672)

Shareable URLs joined list items (energies, particles, materials, programs,
lookups, mat_elements, series) with a literal comma. Messenger/email
auto-linkifiers terminate an auto-link at the first comma, so pasted
multi-row links were truncated (Signal screenshot in #672).

Switch the canonical list separator to `~` (RFC 3986 unreserved, never
dropped by linkifiers). Decoders accept both `~` and the legacy `,`, so
every previously shared/bookmarked link keeps working. Serializers restore
`%7E` -> `~` (URLSearchParams encodes `~`); `:` stays literal.

Bump the URL schema to v3 (urlv=3). To preserve backward compatibility,
negotiateVersion now accepts the [MIN_SUPPORTED_URL_MAJOR, CURRENT_URL_MAJOR]
= [2, 3] range: v2 links hydrate identically and are rewritten to canonical
v3 `~` form on load. migrateUrl stays the identity. The grammar gains a
`list-sep = "~" / ","` rule.

Docs (shareable-urls.md v8, shareable-urls-formal.md v9) and a dedicated
unit + E2E regression battery added; existing fixtures updated.
@grzanka grzanka self-assigned this Jun 1, 2026
@grzanka grzanka requested a review from Copilot June 1, 2026 21:03
Copilot AI (Contributor) left a comment
Pull request overview

This PR updates the shareable-URL schema to prevent messenger/email linkifiers from truncating links at commas by switching list-valued query parameters to use ~ as the canonical separator while keeping legacy comma decoding for backward compatibility.

Changes:

  • Canonical URL encoding now joins list params with ~ and decoders split on /[,~]/ to accept both new and legacy links.
  • URL major version is bumped to urlv=3, and version negotiation now accepts the supported range [2..3] so existing urlv=2 links still hydrate and upgrade to v3 canonical form.
  • Adds/updates unit + E2E regression coverage plus spec/docs updates to lock in the linkifier-safe contract.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/e2e/url-parser.spec.ts | Updates URL-version warning E2E to treat urlv=3 as current and accept legacy urlv=2 without banner. |
| tests/e2e/calculator-url.spec.ts | Adds E2E regression coverage for comma→tilde canonicalization and linkifier-safe URLs. |
| src/tests/unit/url-version.test.ts | Updates unit tests for [2..3] negotiateVersion behavior and CURRENT_URL_MAJOR=3. |
| src/tests/unit/url-shared.test.ts | Updates mat_elements encoding expectation to use ~. |
| src/tests/unit/url-separator-672.test.ts | New dedicated regression battery for issue #672 covering encoding, decoding, linkifier heuristic, and v2→v3 upgrade. |
| src/tests/unit/url-parse.test.ts | Adds parser coverage for ~ and mixed ,/~ separators in list params. |
| src/tests/unit/plot-url.test.ts | Updates plot series encoding expectation to use ~. |
| src/tests/unit/multi-program-state.test.ts | Updates multi-program URL encoding expectation to use ~. |
| src/tests/unit/external-data-url.test.ts | Updates entity-id list expectations and adds explicit legacy-comma acceptance + canonical ~ re-emit assertions. |
| src/tests/unit/custom-compound-url.test.ts | Updates custom-compound mat_elements expectations to use ~. |
| src/tests/unit/custom-compound-plot-url.test.ts | Updates plot custom-compound mat_elements expectations to use ~. |
| src/tests/unit/calculator-url.test.ts | Updates calculator URL encoding expectations (energies/lookups/program lists) to use ~ and urlv=3. |
| src/tests/contracts/url-codec.contract.test.ts | Updates contract test asserting encoded calculator URLs always include urlv=3. |
| src/lib/utils/url-version.ts | Bumps CURRENT_URL_MAJOR to 3 and allows negotiateVersion to accept majors in [MIN..CURRENT]. |
| src/lib/utils/url-shared.ts | Introduces shared URL_LIST_SEPARATOR="~" and URL_LIST_SPLIT_RE=/[,~]/; updates mat_elements encode/decode helpers accordingly. |
| src/lib/utils/url-grammar.peggy | Updates PEG grammar to accept ~ and legacy , via ListSep and excludes ~ from lookup-unit tokens. |
| src/lib/utils/plot-url.ts | Joins series with ~, splits on /[,~]/, and restores %7E → ~ in the query-string writer. |
| src/lib/utils/calculator-url.ts | Bumps calculator URL version to 3, joins list params with ~, decodes lists with /[,~]/, restores %7E → ~, and hardens numeric safety against ~. |
| src/lib/state/multi-program.svelte.ts | Updates multi-program URL param comment to reflect ~ separation (implementation uses shared formatter). |
| src/lib/external-data/ids.ts | Parses entity-id lists with /[,~]/ and formats canonical lists using ~. |
| docs/ai-logs/README.md | Adds index entry for the 2026-06-01 issue #672 AI session log. |
| docs/ai-logs/2026-06-01-issue-672-url-list-separator.md | Adds detailed AI session log for the separator/schema change. |
| docs/04-feature-specs/shareable-urls.md | Updates human-facing shareable URL spec to v3 (urlv=3) and ~ separator rationale/examples. |
| docs/04-feature-specs/shareable-urls-formal.md | Updates formal ABNF contract to v3 with list-sep = "~" / "," and version-range semantics. |
| CHANGELOG-AI.md | Adds a changelog entry for the issue #672 work and links the session log. |

Comment on lines 436 to 444
for (const row of state.lookups) {
  const trimmed = row.rawInput.trim();
  if (trimmed === "") continue;
  // Encode as `rawInput:unit` when unitFromSuffix, else bare `rawInput`
  if (row.unitFromSuffix) {
    encodedLookups.push(`${trimmed}:${row.unit}`);
  } else {
    encodedLookups.push(trimmed);
  }
Comment thread src/lib/utils/url-shared.ts Outdated
Comment on lines +17 to +21
* We emit `~` (RFC 3986 *unreserved*) rather than the `,` used through `urlv=2`:
* messenger/email auto-linkifiers are heuristic and terminate a link at the
* first comma (sentence punctuation), so multi-item shared links were truncated
* (issue #672). `~` is never percent-encoded and is reliably kept inside
* auto-links, keeping URLs both human-readable and paste-safe.
Comment thread src/lib/utils/calculator-url.ts Outdated
   }
 }
-const restStr = remaining.toString().replaceAll("%3A", ":").replaceAll("%2C", ",");
+const restStr = remaining.toString().replaceAll("%3A", ":").replaceAll("%7E", "~");
Comment thread src/lib/utils/plot-url.ts Outdated
// Keep `:` literal (per-row/triplet sub-separator) and restore `~` (the list
// separator, which `URLSearchParams` percent-encodes as `%7E`) so shared URLs
// stay human-readable and survive auto-linkification (issue #672).
const restStr = paramsNoExtdata.toString().replaceAll("%3A", ":").replaceAll("%7E", "~");
Comment thread src/lib/utils/plot-url.ts Outdated
Comment on lines 189 to 196
   if (key !== "extdata") paramsNoExtdata.append(key, value);
 }

-const restStr = paramsNoExtdata.toString().replaceAll("%3A", ":").replaceAll("%2C", ",");
+// Keep `:` literal (per-row/triplet sub-separator) and restore `~` (the list
+// separator, which `URLSearchParams` percent-encodes as `%7E`) so shared URLs
+// stay human-readable and survive auto-linkification (issue #672).
+const restStr = paramsNoExtdata.toString().replaceAll("%3A", ":").replaceAll("%7E", "~");
 if (restStr) parts.push(restStr);
Review follow-up on PR #686:

- Apply the same URL-safety guard to inverse-lookup rows as energies, so a
  value containing a list separator (1,000 / 100~200) is dropped instead of
  corrupting tokenization or reintroducing the linkifier truncation.
- Plot URLs now carry the urlv version signal: encodePlotUrl sets urlv,
  plotUrlQueryString emits it first, and plot-url-sync uses plotUrlQueryString
  (readable literal : and ~, ordered extdata) instead of
  encodePlotUrl().toString(). This lets an older client show the
  unsupported-link banner for a v3 ~-series plot link rather than silently
  dropping the series.
- Replace hard-coded "~"/"%7E" in both query-string writers with a derived
  URL_LIST_SEPARATOR_ENCODED constant (computed via URLSearchParams from
  URL_LIST_SEPARATOR so they cannot drift). Fixes a latent bug: encodeURIComponent
  leaves ~ untouched, so it would not have restored %7E.
- Correct the URL_LIST_SEPARATOR doc comment (URLSearchParams does percent-encode
  ~ as %7E, hence the restore step).
- Update in-app user docs: /docs/technical grammar shows list-sep = "~" / "," and
  v3 prose/example; /docs/user-guide bumped to urlv=3 with a separator note;
  example URLs use urlv=3 and ~.

Adds tests for the lookups guard and the plot urlv signal.
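
Two of the follow-up items lend themselves to a short sketch: deriving the encoded separator from the literal one, and dropping rows that contain a list separator. `URL_LIST_SEPARATOR` and `URL_LIST_SEPARATOR_ENCODED` are names from the commit message; the derivation expression, `restoreSeparator`, and `isListSafeToken` are assumptions for illustration and may not match the repository's code.

```typescript
const URL_LIST_SEPARATOR = "~";

// Derive the encoded form through URLSearchParams itself, so the two constants
// cannot drift apart. The throwaway key "s" is arbitrary.
const URL_LIST_SEPARATOR_ENCODED = new URLSearchParams({ s: URL_LIST_SEPARATOR })
  .toString()
  .slice("s=".length);

// encodeURIComponent leaves `~` untouched, so it cannot be used for this derivation.
const viaEncodeURIComponent = encodeURIComponent(URL_LIST_SEPARATOR);

// Hypothetical writer step: restore the literal separator after serialization.
function restoreSeparator(serialized: string): string {
  return serialized.split(URL_LIST_SEPARATOR_ENCODED).join(URL_LIST_SEPARATOR);
}

// Hypothetical guard: a row containing either separator is not safe to emit.
function isListSafeToken(raw: string): boolean {
  return !/[,~]/.test(raw);
}

const rows = ["7.72", "1,000", "100~200", "45"];
const safeRows = rows.filter(isListSafeToken);

console.log(URL_LIST_SEPARATOR_ENCODED, viaEncodeURIComponent, safeRows);
```

Filtering before joining means a malformed row is dropped rather than silently splitting into two list items on decode.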