Skip to content

[Data] Fix streaming-generator hang on plasma-resident return object#64386

Open
goutamvenkat-anyscale wants to merge 1 commit into
ray-project:masterfrom
goutamvenkat-anyscale:fix-streaming-gen-eof-plasma-hang
Open

[Data] Fix streaming-generator hang on plasma-resident return object#64386
goutamvenkat-anyscale wants to merge 1 commit into
ray-project:masterfrom
goutamvenkat-anyscale:fix-streaming-gen-eof-plasma-hang

Conversation

@goutamvenkat-anyscale

@goutamvenkat-anyscale goutamvenkat-anyscale commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Summary

test_parquet_read_spread (and any read whose generator task return object lands in plasma) hangs until the 180s test timeout. The streaming executor reaches 200/200 rows but the consumer starves forever, blocked in core_worker.get_objects via ObjectRefGenerator._next_sync.

Culprit: #64014

#64014 changed ObjectRefGenerator._next_sync to honor the caller's timeout_s for the end-of-stream ray.get(generator_ref) (previously unbounded). That get is what distinguishes "task finished cleanly" (StopIteration) from "task errored", once all yielded blocks are consumed.

Ray Data's DataOpTask.on_data_ready polls the generator non-blocking with timeout_s=0. After #64014, once the stream is exhausted that same call runs ray.get(generator_ref, timeout=0).

Why timeout=0 deadlocks on a plasma return object

test_parquet_read_spread sets _system_config={"max_direct_call_object_size": 0}, which forces the generator's return object into plasma (it isn't inlined with the owner). With timeout=0, CoreWorkerPlasmaStoreProvider::Get:

  1. issues an async plasma pull (AsyncGetObjects), returning a ScopedResponse cleanup handler,
  2. immediately times out (remaining_timeout=0batch_timeout=0), and
  3. cancels the in-flight pull when the function returns — the ScopedResponse destructor fires CancelGetRequest.

So every poll re-issues and instantly cancels the pull; the return object never lands locally, _next_sync always returns a nil ref, the task is never observed as finished, and the executor spins forever.

In normal runs the return object is tiny and inlined (in-memory store), where timeout=0 resolves instantly — which is why only configs that push the return object into plasma (this test's max_direct_call_object_size=0, or the dead-node case #64014 targeted) expose it.

Fix

Keep _next_sync's single timeout_s and decide the value at the Ray Data call site based on stream phase:

  • Task still producingtimeout_s=0. The scheduling thread must not block waiting for a block that's still in flight; if it isn't ready, move on and service other operators. A blocking timeout here would stall the whole scheduling loop on every poll.
  • Stream exhausted → bounded METADATA_GET_TIMEOUT_S (1.0s). The call's purpose flips from "peek for more data" to "finalize this task": the only remaining work is the one-time terminal ray.get of the return object, which needs a brief, bounded wait so a plasma-resident object can actually be pulled.

The bounded value is needed specifically when the stream is exhausted and the generator return object is not yet local (e.g. resides in plasma on another node). On timeout — e.g. the object was lost to a dead node and needs reconstruction — it returns nil and the scheduler retries on a later loop, preserving #64014's "don't block the scheduler indefinitely" guarantee.

PR ray-project#64014 made ObjectRefGenerator._next_sync honor the caller's timeout_s
for the end-of-stream ray.get(generator_ref). Ray Data's
DataOpTask.on_data_ready polls with timeout_s=0, so once the stream is
exhausted that get runs with a 0 timeout. When the generator's return
object lives in plasma (e.g. max_direct_call_object_size=0), a 0-timeout
get issues an async plasma pull and then immediately cancels it on return,
so the object never arrives, the task is never observed as finished, and
the streaming executor spins forever (e.g. test_parquet_read_spread timed
out at 180s).

Fix at the Ray Data call site: poll with timeout_s=0 while the task is
still producing, but once the stream is exhausted use a bounded blocking
timeout (METADATA_GET_TIMEOUT_S) so the return object can be pulled. On
timeout (e.g. object lost to a dead node) it falls through and retries,
preserving PR ray-project#64014's intent of not blocking the scheduler indefinitely.

Adds ObjectRefGenerator._stream_exhausted(), a non-blocking in-memory
end-of-stream check used to select the timeout.

Signed-off-by: goutam <goutam@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale requested review from a team as code owners June 27, 2026 00:03

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a non-blocking _stream_exhausted check on the object reference generator to determine if the end-of-stream marker has been reached. In physical_operator.py, this check is used to conditionally apply a bounded blocking timeout (METADATA_GET_TIMEOUT_S) instead of a zero timeout when pulling the generator's return object, preventing immediate cancellation of the pull. The review feedback suggests defensively checking for the existence of _stream_exhausted using getattr to avoid potential AttributeErrors if the generator is mocked or wrapped.

Comment on lines +223 to +227
next_timeout_s = (
METADATA_GET_TIMEOUT_S
if self._streaming_gen._stream_exhausted()
else 0
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To ensure backward compatibility and prevent potential AttributeErrors in unit tests or other environments where self._streaming_gen might be mocked or wrapped (and thus may not implement the newly added _stream_exhausted method), it is safer to use getattr to defensively check for the method's existence before calling it.

                stream_exhausted_fn = getattr(self._streaming_gen, "_stream_exhausted", None)
                next_timeout_s = (
                    METADATA_GET_TIMEOUT_S
                    if stream_exhausted_fn and stream_exhausted_fn()
                    else 0
                )

@goutamvenkat-anyscale goutamvenkat-anyscale added data Ray Data-related issues go add ONLY when ready to merge, run all tests labels Jun 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data Ray Data-related issues go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants