Skip to content

update from Gluejar#94

Open
eshellman wants to merge 1111 commits into
EbookFoundation:masterfrom
Gluejar:master
Open

update from Gluejar#94
eshellman wants to merge 1111 commits into
EbookFoundation:masterfrom
Gluejar:master

Conversation

@eshellman

Copy link
Copy Markdown

No description provided.

rdhyee and others added 30 commits April 28, 2026 10:09
Closes part of #1131.

The SEND_TEST_EMAIL_JOB is a dev-era artifact that's been hardcoded to
mail `unglueit@ebookfoundation.org` 72 times/day whenever a deploy is
tagged `deploy_type=test`. On production it's never registered
(`'prod' == 'test'` is False), but on test.unglue.it it fired 1,796 times
through 2026-02-24, polluting Eric's inbox with "hi there 20,25,30" /
"testing 1, 2, 3" messages. See #1131 for the full analysis.

The companion change removing the `if '{{ deploy_type }}' == 'test':`
branch from `prod.py.j2` is in EbookFoundation/regluit-provisioning.

Also removes the commented-out reference in settings/dev.py so the
symbol name doesn't dangle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Catch ValueError/TypeError/KeyError from int(self.data["notarobot"])
and raise forms.ValidationError instead, so malformed submissions
render as a normal form error rather than a 500.

Fixes #1136

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The legacy `auth_logout` name was provided by old django-registration;
django_registration 3.4 no longer defines it. `django.contrib.auth.urls`
(already included in libraryauth/urls.py) provides the equivalent name
`logout`.

Without this, any authenticated request that renders base.html crashes
with NoReverseMatch.

Surfaced on dj42.unglue.it after a successful Google sign-in.

Note: Django 4.2's LogoutView still accepts GET with a deprecation
warning; the GET-via-link → POST-only migration is a separate follow-up
needed before Django 5.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Django 4.2's LogoutView, with no `next` param and no
LOGOUT_REDIRECT_URL configured, falls through to rendering
`registration/logged_out.html`. This template does not exist in the
repo (the legacy template was `registration/logout.html`, served by
old django-registration's auth_logout view).

The navbar Sign Out link has no `?next=`, so without this setting
clicking Sign Out would produce TemplateDoesNotExist.

Caught by Codex on PR #1145.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fix auth_logout URL name in templates (Django 4.2)
The 429 handler in load_doab_oai logs to the Django logger
(unglue.it.log) but otherwise returns normally — so the cron's
doab-harvest.log records "loaded 0 records, ending at 2000-01-01" or a
truncated success line, indistinguishable from a clean run.

Add a 4th return value (`error`) from load_doab_oai, set when 429 is
caught. The management command echoes "ERROR: DOAB OAI rate-limited..."
to stdout when present, so 429s appear in doab-harvest.log alongside the
regular summary line.

Bitstream-API 429s in doab_utils.get_streamdata are out of scope for
this patch — they fire per-record rather than terminating the harvest,
and surfacing them well needs aggregation. Filed as a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When DOAB's OAI endpoint returns HTTP 429, the previous behavior was
to log and continue. Cron's next nightly run would re-hit DOAB inside
the ban window, which refreshes the 24h ban. The harvest never
escaped on its own.

This commit:

- Persists the Retry-After deadline to a sentinel file across cron
  invocations.
- Skips subsequent runs while now < deadline (and clears the file
  once the deadline passes).
- Honors RFC 9110 Retry-After (delta-seconds or HTTP-date) via the
  parser in core.loaders.doab.
- Falls back to a conservative 1-hour deadline if Retry-After is
  missing or unparseable (handled in core.loaders.doab).
- Uses timezone-aware UTC throughout; persisted ISO strings include
  the offset.
- Logs sentinel write failures to stderr with a traceback; silent
  swallowing here would mask the very ban-extension loop this patch
  prevents.
- Adds --ignore-retry-after operator override.

Sentinel path defaults to /var/log/regluit/doab-oai-retry-after.state
(durable, in a dir already writable by the cron user, not touched by
the existing log-cleanup cron). Override via DOAB_OAI_SENTINEL_PATH
env var for dev/test environments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surface DOAB OAI 429 errors in load_doab cron output
…after

Respect DOAB OAI Retry-After across cron runs
The DOAB bitstream/REST endpoints (used by get_streamdata and the
cover download in store_doab_cover) get rate-limited independently
of the OAI endpoint. The OAI fix (#1143 + #1144) is a cross-cron
sentinel; the bitstream API is called per-record inside the harvest
loop, so it needs an in-process circuit breaker that also persists
across runs.

This commit:

- Adds an in-process circuit breaker in core.loaders.doab_utils that
  opens when either bitstream endpoint returns 429. While open, calls
  return None without making HTTP requests.
- Persists the deadline to a sentinel file
  (/var/log/regluit/doab-bitstream-retry-after.state by default;
  DOAB_BITSTREAM_SENTINEL_PATH env override) so subsequent cron runs
  honor the same window without an extra hit.
- Reuses parse_retry_after() (RFC 9110 delta-seconds OR HTTP-date)
  by moving it from doab.py into doab_utils.py.
- Single ERROR log per trip — no more per-record 429 spam in the
  Django log (was ~14/run on May 8).
- Patches both endpoints (get_streamdata + the cover requests.get
  call inside store_doab_cover, including the post-redirect retry).
- Test-only reset hook for unit tests.

Doesn't stall OAI harvesting — only cover work is skipped while the
bitstream breaker is open. Each cover lookup returns None (= no
cover), and the harvest continues processing the record.

doab-check has the same get_streamdata code but no callers (dead
code inherited from regluit) — patch confined to this repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…breaker

Add DOAB bitstream API circuit breaker (HTTP 429)
Switches the pyoai pin from infrae/pyoai (last release March 2022,
explicitly unmaintained per its own README) to our fork carrying the
RateLimitedError patch:

  EbookFoundation/pyoai @ bf709d26ae6e4b34b9b0ca726e0f032d97f0bd38
  Branch: fix/expose-429-as-rate-limited-error

The fork raises a structured RateLimitedError on HTTP 429 with
Retry-After parsed per RFC 9110, instead of leaving callers to parse
HTTPError.headers themselves. An upstream PR to eth-library/oaipmh
(the maintained fork; infrae/pyoai is no longer accepting changes)
is pending; we'll un-pin to that fork when it lands.

load_doab_oai now catches RateLimitedError first; the existing
HTTPError-with-code-429 path is kept as a runtime fallback for
deployments that revert the pin or run against stock pyoai. A
never-raised placeholder class handles the ImportError case so the
except clause is always syntactically valid.

Step C of today's DOAB rate-limit plan (after #1143/#1144 OAI
sentinel and #1147 bitstream breaker).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pin pyoai to EbookFoundation fork; consume RateLimitedError
Resolve settings/dev.py conflict: keep Celery 5.x setting renames from this
branch (CELERY_BEAT_SCHEDULE, CELERY_WORKER_HIJACK_ROOT_LOGGER) plus master's
ADMINS email update; both sides drop the send_test_email schedule line.

Brings in 16 master commits since fork (DOAB 429 handling, pyoai EbookFoundation
fork pin, notarobot int-guard, SEND_TEST_EMAIL_JOB removal, admin-email rate limit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Compute the publication year range from a single evaluated edition
queryset instead of two separate (asc + desc) queries. The previous
code initialized `latest_publication` to None and set it only from the
second query; when the dated-edition set changed between the two reads
(e.g. concurrent edition loading) the second query could find no
truthy dates, leaving `latest_publication` None and crashing on
`earliest_publication + "-" + latest_publication`.

Also switch to save(update_fields=['publication_range']) so this
nominally-read property doesn't persist other (possibly stale)
in-memory Work fields.

Adds WorkTests regression coverage: mixed null/blank/valid dates,
single-year vs range, all-blank, and a query-count guard proving the
single-query shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The api `widget` view treated any non-"featured", non-10/13-char token
as a work id and called safe_get_work() without catching
Work.DoesNotExist, so /api/widget/<bad-id>/ returned 500 instead of the
existing empty-widget response. Two related defects fixed at the same
time:

- convert_10_to_13() returns None for an invalid ISBN-10, after which
  `len(isbn)` raised TypeError. Guard with `if isbn and len(isbn)==13`.
- widget.html renders "...ISBN {{isbn}}..." but the work=None paths did
  not pass `isbn`. Pass it into every render path.

Wrap safe_get_work() in try/except Work.DoesNotExist -> render empty
widget, matching the existing Identifier.DoesNotExist handling.

Adds api ApiTests regression coverage: non-numeric token, numeric
unknown id, and invalid ISBN-10 all return 200.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Keeps the Python 'if date' guard as belt-and-suspenders: the structural
invariant (years contains only truthy strings) stays enforced locally,
independent of the queryset.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Fix #1155: TypeError in Work.publication_date from two-query race
Fix #1156: widget endpoint 500s on unknown/non-numeric/invalid ids
Staging boxes restored from a prod snapshot keep prod's Site row
(domain='unglue.it'), so every emailed link (password-reset, notices,
etc.) points at prod instead of the staging box's own host.

This command updates Site.objects.get_current() (the SITE_ID row) to
the supplied domain (and optional name).  It is idempotent: if the row
already matches, it prints a no-op message and exits cleanly.

Used by the provisioning repo's post-deploy Ansible task to localise
the Site to the box's own server_name on every deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the
CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym
fixes, subject-verb agreement, and site-name casing — nothing factual.

- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"

Factual/staleness/voice items (fees, payouts, sender email, campaign
retirement, etc.) are handled separately in the judgment-call PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FAQ copy: objective typo & grammar fixes (#1165)
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the
CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym
fixes, subject-verb agreement, and site-name casing — nothing factual.

- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"

Factual/staleness/voice items (fees, payouts, sender email, campaign
retirement, etc.) are handled separately in the judgment-call PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 0c43233)
… not a hard fail

Codex review of #1164 fix: a fresh/scrubbed DB without a Site row for SITE_ID
would crash the post-deploy task with Site.DoesNotExist. get_or_create makes it
self-healing while staying idempotent on existing rows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add set_site_domain management command (fix #1164)
Staging boxes restored from a prod snapshot keep prod's Site row
(domain='unglue.it'), so every emailed link (password-reset, notices,
etc.) points at prod instead of the staging box's own host.

This command updates Site.objects.get_current() (the SITE_ID row) to
the supplied domain (and optional name).  It is idempotent: if the row
already matches, it prints a no-op message and exits cleanly.

Used by the provisioning repo's post-deploy Ansible task to localise
the Site to the box's own server_name on every deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… not a hard fail

Codex review of #1164 fix: a fresh/scrubbed DB without a Site row for SITE_ID
would crash the post-deploy task with Site.DoesNotExist. get_or_create makes it
self-healing while staying idempotent on existing rows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1170 (#1171)

Align master with deployed Django 4.2 (prod-green @ 751f781) — refs #1170
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants