Bug: 404 legacy website link can point to an external site #3013

@e-q

Describe the bug

The custom 404 page includes a link to try the same path on
legacy.python.org. For a missing URL whose decoded path starts with //, the
legacy-link helper currently treats the path as a network-path reference
(a scheme-relative URL that carries its own host) and generates a link to
that external host.

For example, a path such as /%2Fevil.test/x can become a 404 page whose
"same page on the legacy website" link points to http://evil.test/x instead
of staying under http://legacy.python.org.
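
To make the decoding step concrete, here is a minimal sketch using only the standard library. It assumes the web stack percent-decodes the request path before the 404 view sees it; `urllib.parse.unquote` stands in for that step and is not necessarily what the site itself calls.

```python
from urllib.parse import unquote

# The raw request path as it appears in the URL.
raw_path = "/%2Fevil.test/x"

# %2F is the percent-encoding of "/", so decoding yields a leading "//".
decoded_path = unquote(raw_path)

print(decoded_path)
print(decoded_path.startswith("//"))
```

The raw path has a single leading slash, so it looks like an ordinary site-relative path until it is decoded.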

To Reproduce

  1. Use a missing python.org path whose decoded path starts with two slashes,
    such as /%2Fevil.test/x.
  2. Render the custom 404 page.
  3. Inspect the "same page on the legacy website" link.
  4. The rendered href can use the host-like first segment of the decoded
    path (evil.test in the example) as the URL host.

Local source-level reproduction from the current helper:

>>> from apps.cms.views import legacy_path
>>> legacy_path("//evil.test/x")
'http://evil.test/x'

Expected behavior

The legacy-site link should always stay under the legacy.python.org origin.
For the decoded path above, the link should be
http://legacy.python.org//evil.test/x.

Additional context

Local source-level evidence:

legacy_path('//evil.test/x') = http://evil.test/x

The helper currently uses URL joining to combine a fixed absolute base URL with
the request path. URL joining allows a path beginning with // to replace the
base URL's host.
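
The joining behavior can be shown with the standard library alone. This is a sketch under assumptions, not the project's code: `LEGACY_BASE`, `legacy_path_joined`, and `legacy_path_pinned` are hypothetical names, and I am assuming the helper's joining behaves like `urllib.parse.urljoin`. The second function is one possible fix (plain string concatenation onto the fixed origin), which produces the link described under "Expected behavior".

```python
from urllib.parse import urljoin, urlsplit

# Assumed fixed base for the legacy site (hypothetical constant name).
LEGACY_BASE = "http://legacy.python.org"


def legacy_path_joined(path):
    """Stand-in for the current behavior: URL joining.

    A reference starting with "//" is a network-path reference, so
    urljoin keeps only the base scheme and takes the host from `path`.
    """
    return urljoin(LEGACY_BASE, path)


def legacy_path_pinned(path):
    """Hypothetical fix: append the path to the fixed origin.

    Plain concatenation never reinterprets the path, so the host
    cannot be replaced by anything inside it.
    """
    if not path.startswith("/"):
        path = "/" + path
    return LEGACY_BASE + path


bad = legacy_path_joined("//evil.test/x")
good = legacy_path_pinned("//evil.test/x")

print(bad, urlsplit(bad).netloc)
print(good, urlsplit(good).netloc)
```

Ordinary paths such as `/about/` give the same result from both functions; they differ only when the path begins with `//`.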
