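Related issues: #2083 (ByteBuffer copy/range behavior), #2085 (cio-to-nio-writer exception cannot be caught by caller)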
Problem
On RK3588, the Android H264 encoder may output one encoded frame as an Annex B access unit containing multiple NAL units.
For example, a single output buffer may hold an SEI NAL unit followed by an IDR slice.
The current H264Packet logic appears to assume that one input ByteBuffer contains only one NAL unit. It removes only one Annex B start code, then treats the remaining data as a single NAL unit.
For example, this input:
00 00 00 01 06 ... 00 00 00 01 65 ...
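is actually two NAL units, an SEI (0x06, type 6) followed by an IDR slice (0x65, type 5):
[start code][SEI][start code][IDR]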
After removing only the first start code, the remaining data becomes:
06 ... 00 00 00 01 65 ...
The current logic may then package this as one FLV AVC NAL unit:
[length][SEI + startCode + IDR]
This is not a valid AVC-in-FLV payload. An FLV AVC payload should use length-prefixed NAL units and should not contain Annex B start codes inside the payload.
The correct output should be either:
[SEI length][SEI data][IDR length][IDR data]
or, if SEI is intentionally ignored:
[IDR length][IDR data]
Impact
This can produce malformed RTMP video packets.
In our case, the RTMP server closes the connection after receiving the invalid video data. After the socket is closed, Ktor may throw from its internal cio-to-nio-writer coroutine, which is tracked separately in #2085.
We also tried to work around this at the upper layer by slicing the input ByteBuffer or adjusting MediaFrame.Info to remove SEI before passing the frame to H264Packet.
However, that workaround is currently unreliable because of the ByteBuffer copy/range behavior described in #2083. A sliced heap ByteBuffer still shares the original backing array, and directly using array() does not return the visible sliced range.
Because of that, the only safe workaround right now is to create a new clean ByteBuffer that contains only the NAL units we want to send.
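The slice behavior can be reproduced with plain java.nio, and the workaround amounts to copying only the visible range. A minimal sketch in Java; the class and method names (VisibleRange, copyVisible) are hypothetical and are not part of the library:

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class VisibleRange {
    // Copies only the bytes between position() and limit() into a fresh array.
    // Works for heap and direct buffers and leaves the caller's position untouched.
    static byte[] copyVisible(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // shares content, has its own position/limit
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }
}
```

For a heap buffer wrapping 12 bytes and sliced at position 10, slice.array() still returns the full 12-byte backing array and slice.arrayOffset() is 10, while copyVisible(slice) returns just the 2 visible bytes. Wrapping that copy with ByteBuffer.wrap gives the clean buffer described above.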
Expected behavior
H264Packet should not assume that one encoded input buffer contains exactly one NAL unit.
It should first parse the Annex B input buffer into individual NAL units, then process each NAL unit according to its type.
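A minimal sketch of that parsing step, in Java. This is not the current H264Packet code; AnnexBSplitter and its methods are hypothetical names. It assumes the buffer holds whole NAL units, each preceded by a start code, and it returns an empty list when no start code is present:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class AnnexBSplitter {
    // Returns the NAL units between position() and limit(), with start codes removed.
    static List<byte[]> split(ByteBuffer input) {
        ByteBuffer view = input.duplicate(); // do not disturb the caller's position
        byte[] data = new byte[view.remaining()];
        view.get(data);

        List<byte[]> nals = new ArrayList<>();
        int nalStart = -1; // first payload byte of the current NAL, -1 before any start code
        int i = 0;
        while (i + 3 <= data.length) {
            // Matching 00 00 01 also covers 00 00 00 01: the extra leading zero
            // is trimmed from the end of the previous NAL unit below.
            if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
                if (nalStart >= 0) addTrimmed(nals, data, nalStart, i);
                nalStart = i + 3;
                i += 3;
            } else {
                i++;
            }
        }
        if (nalStart >= 0) addTrimmed(nals, data, nalStart, data.length);
        return nals;
    }

    // Adds data[start, end) after dropping trailing zero bytes. A NAL unit never
    // ends in 0x00, so trailing zeros belong to the byte stream, not the NAL unit.
    private static void addTrimmed(List<byte[]> nals, byte[] data, int start, int end) {
        while (end > start && data[end - 1] == 0) end--;
        if (end > start) nals.add(Arrays.copyOfRange(data, start, end));
    }
}
```

Reading through a duplicate and copying into a fresh array avoids the array()/slice pitfall from #2083, at the cost of one extra copy per frame.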
Expected handling:
- split input by Annex B start codes: 00 00 01 and 00 00 00 01
- remove all start codes before writing FLV AVC payload
- write each NAL unit with its own 4-byte big-endian length prefix
- handle SPS/PPS through the AVC sequence header
- handle SEI explicitly, either by keeping it as a separate NAL unit or dropping it intentionally
- detect keyframe status by scanning all NAL units in the access unit, not only the first NAL unit
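The packaging steps in this list could look roughly like the following Java sketch. It is not the library's implementation: AvcPayloadSketch and its methods are hypothetical, the input is assumed to be a list of non-empty NAL units already stripped of start codes, and the FLV video tag header (frame type, AVC packet type, composition time) is left out. SPS and PPS are skipped here on the assumption that they are sent separately in the AVC sequence header.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper, for illustration only.
final class AvcPayloadSketch {
    static final int NAL_IDR = 5, NAL_SEI = 6, NAL_SPS = 7, NAL_PPS = 8;

    // The NAL unit type is the low 5 bits of the first byte.
    static int nalType(byte[] nal) {
        return nal[0] & 0x1F;
    }

    // Scans every NAL unit in the access unit, not only the first one.
    static boolean isKeyframe(List<byte[]> nals) {
        for (byte[] nal : nals) {
            if (nalType(nal) == NAL_IDR) return true;
        }
        return false;
    }

    // Writes each kept NAL unit as [4-byte big-endian length][NAL data].
    static byte[] toLengthPrefixed(List<byte[]> nals, boolean dropSei) {
        int size = 0;
        for (byte[] nal : nals) {
            if (keep(nal, dropSei)) size += 4 + nal.length;
        }
        ByteBuffer out = ByteBuffer.allocate(size); // ByteBuffer is big-endian by default
        for (byte[] nal : nals) {
            if (!keep(nal, dropSei)) continue;
            out.putInt(nal.length);
            out.put(nal);
        }
        return out.array(); // freshly allocated, so array() is exactly the payload
    }

    private static boolean keep(byte[] nal, boolean dropSei) {
        int type = nalType(nal);
        if (type == NAL_SPS || type == NAL_PPS) return false; // assumed to go in the sequence header
        return !(dropSei && type == NAL_SEI);
    }
}
```

With the SEI + IDR access unit from the example, dropSei = false yields [SEI length][SEI data][IDR length][IDR data], and dropSei = true yields [IDR length][IDR data]; isKeyframe returns true in both cases because the IDR is found even though it is not the first NAL unit.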
Root cause chain
RK3588 encoder outputs H264 buffer with SEI + IDR
↓
H264Packet removes only one start code
↓
SEI + startCode + IDR is packed as one FLV NAL unit
↓
RTMP server receives malformed AVC payload
↓
server closes socket
↓
Ktor internal writer coroutine throws
So #2085 is about safely handling the socket failure after the server closes the connection.
#2083 is about the ByteBuffer issue that blocks a simple upper-layer workaround.
This issue is about the actual RTMP/H264 packaging root cause: H264Packet should correctly parse Annex B access units containing multiple NAL units.