perf(stdlib): defer string.format width padding to iolist #316
Open
davydog187 wants to merge 3 commits into
apply_width_flags/3 built String.duplicate/2 padding and concatenated it onto the formatted value with <>, allocating a fresh padded binary for every width-flagged specifier. Return an iolist ([pad, str] / [str, pad]) instead and let it thread through the existing format_directive/3 accumulator, so the padded result is materialised exactly once at the format_string/3 base case via IO.iodata_to_binary/1. Output is byte-identical; the width-flagged benchmark (n=1000) improves ~13%.

Plan: B13
Closes #310
davydog187 commented Jun 1, 2026 (Contributor, Author)
davydog187 left a comment
Automated code review — verdict: clean (perf-only, correct)
Reviewed the diff against main. This is a sound performance-only change; output is byte-identical and the data flow is iolist-safe end to end.
Correctness (the one real risk: does anything downstream treat the result as a binary?) — no.
- `apply_width_flags/3` calls `byte_size(str)`, `binary_part(str, …)`, and `String.starts_with?(str, "-")` only on the raw input binary, then returns an iolist. No binary operation is ever applied to the iolist it produces.
- Its sole consumer is `apply_format_spec/2` → `format_directive/3`, which appends `[acc, str]`. An iolist `str` is a valid iolist element, so it threads through untouched and is materialised exactly once at the `format_string/3` base case (`IO.iodata_to_binary/1`). No intermediate flatten and no per-specifier binary allocation, which is the whole point.
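The threading described above can be sketched in isolation. This is a minimal illustration, not the code under review: `AccSketch` and `render/1` are hypothetical names, and the only behaviour assumed from the PR is the `[acc, piece]` append and the single `IO.iodata_to_binary/1` call at the base case.

```elixir
defmodule AccSketch do
  # Hypothetical miniature of the accumulator pattern: each piece, whether a
  # binary or a nested iolist, is appended as [acc, piece]. Nothing is
  # flattened until the input is exhausted.
  def render(pieces), do: render(pieces, [])

  # Base case: the single place where iodata becomes a binary.
  defp render([], acc), do: IO.iodata_to_binary(acc)
  defp render([piece | rest], acc), do: render(rest, [acc, piece])
end

# An iolist piece, shaped like what a width-flagged specifier would return.
padded = [String.duplicate(" ", 3), "42"]

# prints "x=   42;"
IO.inspect(AccSketch.render(["x=", padded, ";"]))
```

Because nested lists are valid iodata, the padded piece needs no special handling on the way to the base case.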
Edge cases verified:
- Zero-pad-with-sign `["-", pad, binary_part(str, 1, byte_size(str) - 1)]` is byte-for-byte equal to the old `"-" <> pad <> binary_part(...)`; the sign stays leftmost (`%05d` of `-7` → `-0007`).
- `0` + `-` flag combo: `pad_char` correctly falls back to space (printf has `-` override `0`); left-justify path `[str, pad]`.
- Multibyte `%s` width stays byte-measured (`%6s` of `"café"` → `" café"`).
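The three edge cases can be reproduced with a small stand-alone helper. This is a sketch under stated assumptions: `EdgeSketch.pad/3` and its `zero:` / `left:` keyword flags are hypothetical and only mirror the behaviour listed above; the real `apply_width_flags/3` signature and flag representation are not shown on this page.

```elixir
defmodule EdgeSketch do
  # Hypothetical helper, not the PR's code: pads `str` to `width` bytes and
  # returns iodata. Flags: `zero: true` for the `0` flag, `left: true` for `-`.
  def pad(str, width, flags \\ []) do
    deficit = width - byte_size(str)
    left? = Keyword.get(flags, :left, false)
    # `-` overrides `0`: left-justified output is always space-padded.
    pad_char = if Keyword.get(flags, :zero, false) and not left?, do: "0", else: " "

    cond do
      deficit <= 0 ->
        # No padding needed: return the input binary unchanged.
        str

      left? ->
        [str, String.duplicate(" ", deficit)]

      pad_char == "0" and String.starts_with?(str, "-") ->
        # Keep the sign leftmost: "-", then the zeros, then the digits.
        ["-", String.duplicate("0", deficit), binary_part(str, 1, byte_size(str) - 1)]

      true ->
        [String.duplicate(pad_char, deficit), str]
    end
  end
end

# prints "-0007"
IO.inspect(IO.iodata_to_binary(EdgeSketch.pad("-7", 5, zero: true)))
# prints "-7   "
IO.inspect(IO.iodata_to_binary(EdgeSketch.pad("-7", 5, zero: true, left: true)))
# prints " café" ("café" is 5 bytes in UTF-8, so width 6 adds one fill byte)
IO.inspect(IO.iodata_to_binary(EdgeSketch.pad("café", 6)))
```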
Conventions: scope is stdlib (not the plan id); no plan-id leakage in source; comment explains the iolist rationale well.
Tests/bench: full suite byte-identical (2114 passed / 19 skipped, unchanged); width-flagged n=1000 ~3.88ms → ~3.39ms (~13% faster).
No blocking or actionable findings. Note for the merge train: #317 and #319 also edit lib/lua/vm/stdlib/string.ex, so whichever of these merges later will need a trivial rebase.
defer string.format width padding to iolist
Plan: `.agents/plans/B13-string-format-width-iolist.md`
Closes #310
Goal
Eliminate the per-specifier binary allocation in `string.format` width padding. `apply_width_flags/3` built `String.duplicate(pad_char, deficit)` and concatenated the padding onto the formatted value with `<>`, producing a fresh binary for every width-flagged specifier. It now returns an iolist (`[pad, str]` for right-justify, `[str, pad]` for left-justify) that flows through the iolist accumulator introduced in PR #299, so the padded result is materialised exactly once at the top level via `IO.iodata_to_binary/1`.

This is a performance-only change. Output is byte-identical to the previous implementation for every input.
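As a rough illustration of the before and after shapes, the sketch below contrasts concatenation with an iolist return. `WidthPadSketch`, `pad_binary/3`, and `pad_iodata/3` are hypothetical names; the space-only padding and the boolean `left?` argument are simplifying assumptions, not the actual `apply_width_flags/3` interface.

```elixir
defmodule WidthPadSketch do
  # Hypothetical stand-ins for the old and new shapes of the padding step.
  # `left?` selects left-justify; width is measured in bytes.

  # Old shape: `<>` builds a fresh padded binary on every call.
  def pad_binary(str, width, left?) do
    deficit = width - byte_size(str)

    cond do
      deficit <= 0 -> str
      left? -> str <> String.duplicate(" ", deficit)
      true -> String.duplicate(" ", deficit) <> str
    end
  end

  # New shape: returns iodata and leaves the final copy to the caller.
  def pad_iodata(str, width, left?) do
    deficit = width - byte_size(str)

    cond do
      deficit <= 0 -> str
      left? -> [str, String.duplicate(" ", deficit)]
      true -> [String.duplicate(" ", deficit), str]
    end
  end
end

# Both shapes flatten to the same bytes; the pinned match raises otherwise.
for {str, width, left?} <- [{"42", 5, false}, {"42", 5, true}, {"hello", 3, false}] do
  old = WidthPadSketch.pad_binary(str, width, left?)
  ^old = IO.iodata_to_binary(WidthPadSketch.pad_iodata(str, width, left?))
end
```

The iolist version still allocates the padding itself; what it avoids is copying `str` and the padding into a new binary per specifier.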
Success criteria
- `apply_width_flags/3` returns an iolist (`[pad, str]` / `[str, pad]`) instead of a concatenated binary; the no-padding branch still returns `str` unchanged.
- The iolist flows through `format_string/3` / `format_directive/3` with no intermediate `IO.iodata_to_binary` per specifier; materialisation happens only at the `format_string/3` base case. Verified: single call site at `apply_format_spec/2`, which feeds the existing `[acc, str]` append.
- Zero-pad-with-sign keeps the sign leftmost (`%05d` of `-7` -> `-0007`). Verified with direct `Lua.eval!` checks and the full property suite.
- Byte-measured width for multibyte `%s` preserved (`format("%6s", "café")` -> `" café"`, one fill byte). Verified.
- `test/lua/vm/string_test.exs`: 152 passed (41 properties, 111 tests). (Note: the path named in the plan, `test/lua/vm/stdlib/string_test.exs`, does not exist; the actual string.format coverage lives in `test/lua/vm/string_test.exs`, as recorded in Discoveries.)
- `mix test` passes: 2114 passed, 19 skipped, 1 excluded, identical to pre-change counts.
- `mix compile --warnings-as-errors` passes.
- `mix run benchmarks/string_format.exs` width-flagged (n=1000) case improved (see below).

Benchmark (width-flagged, n=1000)
Focused timing (200 runs after 50 warmup runs, this machine): ~3.88ms before, ~3.39ms after.
~13% faster on the width-flagged path by removing the per-specifier `<>` copy of `str` + `pad`.

Changes
Discoveries
- The plan names `test/lua/vm/stdlib/string_test.exs`, which does not exist. The string.format unit + property coverage is in `test/lua/vm/string_test.exs`; that file was run instead (152 passed). No scope change.

Verification
Out of scope (intentional)
- The per-specifier formatters (`format_spec_integer`, `format_spec_float`, `format_spec_hex`, ...): they keep returning binaries.
- `format_string/3` / `format_directive/3` parsing, the literal fast path, and the `:binary.split/2` chunking from PR #299 (perf(stdlib): iolist string.format and plain-table sort/concat fast paths).
- Other edits to `string.ex`. #309 (perf(stdlib): parse string.format flags into bitmask, integer specifier) and #311 (perf(stdlib): use io_lib.format for string.format float conversion) also edit this file; this PR stays strictly inside `apply_width_flags/3` and its single call site.