i18n: keep messages.pot location comments on one line to prevent merge conflicts#12903
Open
lokesh wants to merge 1 commit into
Open
i18n: keep messages.pot location comments on one line to prevent merge conflicts#12903lokesh wants to merge 1 commit into
lokesh wants to merge 1 commit into
Conversation
Babel wraps the #: location comments at 76 chars, so adding or removing one file from a string's location list re-flows the whole block and rewrites filenames that didn't change. That turns unrelated PRs into overlapping diffs and causes spurious messages.pot merge conflicts (internetarchive#12837). The issue proposed write_po(..., width=0), but Babel mirrors xgettext and always wraps comments regardless of width, so width=0 only unwraps msgid text (the downside) without unwrapping the location comments (the actual fix). Instead, post-process write_po's output to collapse each #: block onto a single line. Adding a file now just extends that line; message text stays wrapped and readable. This first regeneration reflows the whole file as a one-time change. Closes internetarchive#12837
Member
|
Important: Are there other types of comments (like fuzzy) that might break? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #12837
Refactor. Stops
messages.potfrom generating spurious merge conflicts between unrelated PRs.The problem, in plain terms
openlibrary/i18n/messages.potis a generated file we commit. Every translatable string carries a#:"location comment" listing the files that use it:Babel wraps those lines at 76 characters. So when a PR adds one file to a string, the line tips over the limit and Babel re-flows the whole block — rewriting filenames that didn't actually change:
subjects.htmlandwork_search.htmlgot pushed onto a new line even though nothing about them changed. Now if a second in-flight PR also touches one of those templates, both branches have rewritten the same physical lines → git reports a conflict, even though the two PRs added completely unrelated files. The wrapping is what turns a non-overlapping change into an overlapping one.Why the originally-proposed fix doesn't work
The issue suggested
write_po(..., width=0). I tried it against Babel 2.18 and it does the opposite of what we want. Babel deliberately copiesxgettext's behaviour — it always wraps comments, even when wrapping is otherwise off. From Babel's source:So
width=0leaves the#:location comments wrapped at 76 (the actual problem — untouched) and only unwraps longmsgid/msgstrtext into single long lines (the cosmetic downside — applied). It ships the cost without the benefit. There is no Babel option that single-lines the locations while keeping message text wrapped.What this PR does
After Babel writes the file, a small post-processing pass (
_unwrap_location_comments) joins each message's wrapped#:block back onto a single line. Adding a file now just extends that one line:Unrelated string changes no longer share any lines, so they stop conflicting. Message text (
msgid/msgstr) is left wrapped at Babel's default width, so it stays readable in diffs.For developers
messages.potconflicting again?" when rebasing or merging concurrent PRs that touch templates.which templates use this string) are kept — we didn't drop them to solve this..pot-touching PRs are mid-flight.For translators
.pothas the exact same set ofmsgids and the exact same locations; only how the location lines are wrapped differs. No string was added, removed, or altered.#:"where is this string used" hints translators rely on for context are preserved.Technical
openlibrary/i18n/__init__.py— writewrite_po's output to a buffer, run it through_unwrap_location_comments, then write to disk. The helper walks the file line by line and joins any run of consecutive#:lines (Babel emits a message's location comment as a contiguous block) into one. It's a pure string transform with no dependency on Babel internals beyond the stable#:gettext convention.width=0keeps comments wrapped (shown above);no_location=Truewould fix conflicts but drops the location breadcrumbs, which the issue explicitly wanted to keep.Testing
openlibrary/tests/core/test_i18n.py(Test_unwrap_location_comments): collapsing a wrapped block, leaving single-line comments untouched, not touching message text or#.translator notes, and the "add one file → clean one-line diff" scenario from feat(search): add /search/editions.json API + /search/editions UI (Phase 1 & 2 of #7451) #12663.messages.potround-trips throughread_poto an identical set of msgids and locations vs. the old file (semantic no-op).git diffthat the.potchange touches only#:lines — zero message-content lines changed.pre-commitpasses on the changed files, includingmypyand theGenerate POThook (confirming the committed.potmatches what the hook regenerates).pytest openlibrary/tests/core/test_i18n.py→ 6 passed.Possible follow-ups (out of scope)
#:line per file would make even two PRs editing the same string's location list merge cleanly. It's the most conflict-proof but unconventional and makes the file taller; this PR keeps the conventional space-joined single line..gitattributesmerge driver that regeneratesmessages.poton conflict, or having CI own the file outright (both noted in the issue).Stakeholders
@cdrini (i18n lead)