fix(frames): apply display-matrix rotation when decoding frames#3
Conversation
PyAV decodes frames in their stored orientation and ignores the container's rotation side data (unlike the ffmpeg CLI, which autorotates). Phone-recorded vertical videos store landscape frames with a 90° display matrix, so frames and metadata came out sideways — and so did every downstream consumer (e.g. pose estimation). - rotate decoded frames to display orientation via VideoFrame.rotation (np.rot90 k=1 for rotation=90 matches ffmpeg autorotate pixel-exactly) - report display-oriented width/height on VideoMetadata and expose a new `rotation` field (default 0, backward compatible) - read_frames_from_stream decodes the first frame eagerly to learn the rotation on non-seekable input (pipes) and replays it through the generator - pin av>=14.1 (VideoFrame.rotation was added in 14.1) - add tests/assets/rotated90.mp4 (rotation=90 clip) and regression tests comparing pixels against ffmpeg autorotate output Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR adds complete support for display-matrix rotation in video frames: ChangesDisplay-Matrix Rotation Support
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
np.rot90 yields a non-contiguous view (negative strides), which consumers like MediaPipe and OpenCV reject. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@simple_video_utils/frames.py`:
- Around line 236-246: In read_frames_from_stream the PyAV container created as
container = av.open(...) can leak if an exception occurs during setup (setting
s.thread_type, decoding first_frame, or computing meta via
video_metadata_from_container) because container.close() only lives in
frame_generator()'s finally; wrap the setup steps that touch container (the for
s in container.streams.video loop, first_frame = next(...), rotation assignment
and meta = video_metadata_from_container(...)) in a try/except or try/finally
that calls container.close() on error and re-raises, or use a context manager to
ensure container.close() is invoked if returning the generator fails, so the
container is always closed on setup-time failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e7e7e2cf-950e-4bb8-a7a9-eb7842205b19
⛔ Files ignored due to path filters (1)
tests/assets/rotated90.mp4is excluded by!**/*.mp4
📒 Files selected for processing (4)
pyproject.tomlsimple_video_utils/frames.pysimple_video_utils/metadata.pytests/test_rotation.py
PyAV 14.1 (which added VideoFrame.rotation) dropped Python 3.8 support; 3.8 has been EOL since October 2024. Drop it from the CI matrix, bump requires-python, and apply the pyupgrade fixes ruff now flags for the 3.9 target (typing.Tuple/Generator -> builtins/collections.abc). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The eager first-frame decode made setup-time exceptions likelier (e.g. corrupted input); previously container.close() only ran in the generator's finally, leaking the container if setup raised. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
BytesIO is seekable, so the existing stream tests didn't exercise the no-rewind path that real pipe input (e.g. an HTTP upload) takes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The README has advertised 3.10+ all along; 3.9 went EOL in October 2025 and limits av to <16. Drop 3.9 from CI, bump requires-python, and apply the pyupgrade/bugbear fixes ruff enables for the 3.10 target (Optional[X] -> X | None, zip strict=). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Vertical (phone-recorded) videos came out 90° rotated: frames were decoded in their stored landscape orientation and metadata reported coded dimensions. PyAV does not apply the container's display-matrix rotation side data (unlike the ffmpeg CLI, which autorotates by default). Downstream, this produced sideways pose estimations in pose-estimation.
Changes
frames.py: new_frame_to_rgb()rotates decoded frames to display orientation usingVideoFrame.rotation. Direction verified empirically:rotation=90+np.rot90(k=1)matches ffmpeg autorotate pixel-exactly (the opposite direction differs by a mean of ~68).metadata.py:VideoMetadata.width/heightare now display-oriented (swapped for 90°/270°), consistent with the frames the package yields. Newrotationfield (default0, appended last — backward compatible with positional unpacking). Rotation is probed by decoding one frame and rewinding, since PyAV only exposes it per-frame.read_frames_from_stream: input may be non-seekable (e.g. an OS pipe in pose-estimation's/streamendpoint), so the first frame is decoded eagerly for the rotation and replayed through the generator;skip_framessemantics unchanged.pyproject.toml: pinav>=14.1(VideoFrame.rotationwas added in PyAV 14.1).Tests
New
tests/test_rotation.py(7 tests) with a 78KB assettests/assets/rotated90.mp4— a downscaled clip of a real vertical sign-language recording (stored 640×360,rotation=90):rotation == 0read_frames_exactskip_frameswith the eagerly-decoded first frameruff checkclean, 45 tests pass locally. (3 pre-existingtest_regression.pyfailures on my machine also fail on cleanmain— local ffmpeg 8 vs PyAV's bundled FFmpeg differ by ±3 in YUV→RGB; unrelated.)Note for consumers
video_metadata()now reports display dimensions rather than rawffprobe-style coded dimensions. For pose-estimation this is the desired behavior: the pose header and landmark unnormalization stay consistent with the rotated frames. Poses previously extracted from rotated videos are sideways and should be re-extracted.🤖 Generated with Claude Code
Note
Medium Risk
Behavior change for rotated videos (frame pixels and metadata dimensions) and a breaking semantic shift for consumers expecting raw ffprobe-coded sizes; stream path now decodes one frame up front.
Overview
Fixes phone-recorded / display-matrix rotated videos where PyAV decoded frames in stored orientation while ffmpeg autorotates by default.
Frame decoding now routes through
_frame_to_rgb(), applyingVideoFrame.rotationwithnp.rot90and a C-contiguous copy so OpenCV/MediaPipe accept the arrays.VideoMetadatareports display width/height (swapped for 90°/270°), adds arotationfield, and probes rotation by decoding the first frame when needed.read_frames_from_streameagerly decodes the first frame for rotation on non-seekable inputs (pipes), replays it through the generator, and preservesskip_framesbehavior.Tooling: requires Python ≥3.10, pins
av>=14.1, drops 3.8/3.9 from CI, and addstests/test_rotation.py(ffmpeg pixel checks, pipe streaming).Reviewed by Cursor Bugbot for commit 5a7a53e. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit
New Features
Chores
avlibrary dependency requirement to version 14.1+