Fix Basic.Return content frame leak#2
Open
erikshestopal wants to merge 2 commits into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a connection-wide deadlock caused by leaked
Basic.Returncontent frames.When RabbitMQ returns an unroutable mandatory publish, it sends a content-bearing return:
Basic.ReturnBasic.Ack/Basic.Nackwhen confirms are enabledSerena handled the
Basic.Returnmethod frame, but routed the following header/body frames into the channel delivery buffer. A publish-only channel never drains that buffer. Repeated returned publishes can fill it, causing the single connection-wide reader to block while trying to enqueue another frame. Once that happens, every channel on the connection stops receiving frames.How to reproduce the bug
Run RabbitMQ locally:
Then publish mandatory messages to a missing queue on one channel with a small channel buffer:
Before this fix,
channel.current_buffer_sizegrows as returned content frames accumulate. With a small buffer, the connection reader eventually logs:At that point, later operations on any channel can hang because frame processing is blocked at the connection reader.
This is reachable with Serena's default
channel_buffer_size=48: a non-empty returned publish leaks at least two frames, so one publish-only channel can deadlock after roughly 24 returned publishes.Root cause
The connection reader split one returned message across two internal queues:
Basic.Returnwent to the regular method stream.The delivery stream is bounded and is only drained by consume /
basic_getpaths. A publisher-only channel has no delivery consumer, so those frames remain buffered forever.Fix
This PR adds a small shared content assembly state machine for content-bearing AMQP methods.
The same method/header/body shape is used by:
Basic.DeliverBasic.GetOkBasic.ReturnThe new assembler validates the header class id, handles empty bodies, and combines multi-frame bodies. Existing delivery/get code now uses that assembler instead of maintaining its own inline reconstruction loop.
For
Basic.Return, the connection reader now tracks pending return content by channel id. It consumes the return header/body frames before they can enter the delivery buffer. Once the returned message is complete, the originalBasic.Returnis surfaced tobasic_publishas before, butMessageReturnedErrornow also includes:headerbodyThat preserves the existing error contract while exposing the returned message content and preventing the deadlock.
Tests
Added regression and characterization coverage for:
Basic.GetEmptyBasic.GetOkwith empty and multi-frame bodiesBasic.Deliverwith empty and multi-frame bodiesMessageReturnedErrorValidated locally against RabbitMQ:
Results: