
Fix path join#2138

Open
alastair wants to merge 7 commits into master from fix-path-join

Conversation

@alastair
Member

Issue(s)
Inspired by #2135

Description
We had a number of places in the codebase where we used string concatenation to construct paths on disk. This has a few issues:

  • We have to ensure that we always add a trailing slash to config items that represent directories.
  • We had two different ways to remove a common prefix from a path, which can be done directly with os.path.relpath.
  • In some cases, malicious user input might be able to escape the intended directory and reach files it should not have access to.

Here we replace all uses of string concatenation/truncation with functions from os.path. Additionally, we fix the one case we found where a user-provided path was unsafely concatenated (this would have been a problem with os.path.join too).
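
As a rough illustration of the first and third points, the sketch below compares concatenation with `os.path.join` on POSIX-style paths. The directory and file names are made up for the example and are not taken from the project's settings.

```python
import os.path

# Hypothetical setting values; the real config names are not shown in this PR.
base_with_slash = "/data/sounds/"
base_without_slash = "/data/sounds"

# String concatenation only works if the trailing slash is present.
concat_ok = base_with_slash + "12/file.wav"          # /data/sounds/12/file.wav
concat_broken = base_without_slash + "12/file.wav"   # /data/sounds12/file.wav

# os.path.join adds the separator only when it is needed.
joined_a = os.path.join(base_with_slash, "12", "file.wav")
joined_b = os.path.join(base_without_slash, "12", "file.wav")

# os.path.join does not sanitise untrusted input: ".." components survive,
# and an absolute second argument discards the base entirely.
escaped = os.path.normpath(os.path.join(base_without_slash, "../other/file.wav"))
absolute = os.path.join(base_without_slash, "/etc/passwd.wav")

print(joined_a, joined_b, escaped, absolute, sep="\n")
```

The last two values are why switching to `os.path.join` alone would not have closed the user-input case.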

Deployment steps:
After this is deployed, we can remove the trailing / from all location settings.

alastair and others added 7 commits May 28, 2026 10:58
`str.replace` could potentially replace text that occurs in the middle of
the string. This probably cannot happen in this specific case, but
`os.path.relpath` is more correct.
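
A contrived sketch of the difference, using made-up paths in which the base directory string happens to repeat inside the path (an assumption chosen to trigger the problem, not a case seen in the project):

```python
import os.path

base = "/data/sounds"                             # hypothetical base directory
full = "/data/sounds/user/data/sounds/clip.wav"   # base string repeats inside the path

# str.replace removes every occurrence of the prefix text, not just the leading one.
stripped = full.replace(base + "/", "")

# os.path.relpath only computes the path relative to the base.
relative = os.path.relpath(full, base)

print(stripped)   # userclip.wav
print(relative)   # user/data/sounds/clip.wav
```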
`**` implies that we want to do a recursive lookup, but we only have one
level of subdirs in this directory
End result is the same but it shows intent a bit better
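
To make the `**` versus `*` point concrete, here is a small sketch that assumes the lookup uses the standard `glob` module (the commit message does not say which API is involved). The temporary directory layout is invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build exactly one level of subdirectories, each holding one file.
    for sub in ("a", "b"):
        os.makedirs(os.path.join(root, sub))
        open(os.path.join(root, sub, "clip.wav"), "w").close()

    # "**" with recursive=True walks any depth below root.
    recursive = sorted(glob.glob(os.path.join(root, "**", "*.wav"), recursive=True))
    # "*" matches exactly one directory level, which is all this layout has.
    one_level = sorted(glob.glob(os.path.join(root, "*", "*.wav")))

    same_result = recursive == one_level

print(same_result)
```

With a single level of subdirectories both patterns return the same files, so the single `*` states the intent without suggesting a deeper tree.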
`validate_input_csv_file` joined the user-supplied `audio_filename`
field directly with `sounds_base_dir` and only checked the resulting
path's extension and existence. A CSV row with `audio_filename` like
`../789/something.wav` resolved outside the submitting user's upload
directory, which then fed `bulk_describe_from_csv` and allowed copying
the resolved file into the attacker's account.

Resolve the candidate path with `os.path.realpath` and require it to
share a common ancestor with the (resolved) `sounds_base_dir` before
accepting it. Downstream code consumes the validated, dedup-filtered
set, so this single guard is sufficient.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
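
The guard described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from the PR: `is_within_base` is a hypothetical helper name, the paths are invented, and `os.path.commonpath` is one assumed way to express the "common ancestor" check.

```python
import os.path


def is_within_base(base_dir: str, user_path: str) -> bool:
    """Return True if user_path, joined onto base_dir, stays inside base_dir.

    Hypothetical helper for illustration only.
    """
    base = os.path.realpath(base_dir)
    # realpath collapses ".." components and resolves symlinks; an absolute
    # user_path makes os.path.join discard the base, which is caught below.
    candidate = os.path.realpath(os.path.join(base, user_path))
    try:
        return os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised for paths that cannot be compared (e.g. different drives on Windows).
        return False


base = "/srv/uploads/123"  # hypothetical per-user upload directory
print(is_within_base(base, "something.wav"))         # True
print(is_within_base(base, "../789/something.wav"))  # False
print(is_within_base(base, "/etc/passwd.wav"))       # False
```

Comparing with `commonpath` rather than `str.startswith` avoids accepting a sibling such as `/srv/uploads/1234` as being inside `/srv/uploads/123`.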
Covers both relative-escape (`../somewhere/file.wav`) and absolute-path
(`/etc/passwd.wav`) variants for the validation added in the previous
commit. The CSV row goes through the normal validator and both rows
end up with `audio_filename` in `line_errors`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
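
For readers who want to see the shape of such tests, here is a self-contained sketch. `validate_row` is a hypothetical stand-in that checks a single row; it is not the project's `validate_input_csv_file`, and the base directory and error message are invented. Only the `audio_filename` field and the `line_errors` name come from the commit message.

```python
import os.path


def validate_row(sounds_base_dir: str, row: dict) -> dict:
    """Hypothetical single-row validator; returns a mapping of field -> error."""
    line_errors = {}
    base = os.path.realpath(sounds_base_dir)
    candidate = os.path.realpath(os.path.join(base, row["audio_filename"]))
    try:
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        inside = False
    if not inside:
        line_errors["audio_filename"] = "File is outside the upload directory."
    return line_errors


BASE = "/srv/uploads/123"  # hypothetical upload directory


def test_relative_escape_is_rejected():
    errors = validate_row(BASE, {"audio_filename": "../somewhere/file.wav"})
    assert "audio_filename" in errors


def test_absolute_path_is_rejected():
    errors = validate_row(BASE, {"audio_filename": "/etc/passwd.wav"})
    assert "audio_filename" in errors


def test_plain_filename_is_accepted():
    assert validate_row(BASE, {"audio_filename": "file.wav"}) == {}
```

The functions follow the pytest naming convention, so a runner such as pytest would collect them as written.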
1 participant