Docs: protocol.rst overstates scheduler unpickling protection (contradicts ToPickle docstring) #9283

@EunhoKim98

What

docs/source/protocol.rst lists, as an advantage of the scheduler dealing in MsgPack, that "The Scheduler is protected from unpickling unsafe code," and says a pickled function sent to the scheduler will not be unpacked but kept as bytes.

This is inaccurate. The scheduler does unpickle __Pickled__ / ToPickle-wrapped frames during frame decode in distributed/protocol/core.py::loads(). The deserialize=False flag gates only the __Serialized__ branch; the __Pickled__ branch calls pickle.loads() unconditionally.

This is by design. On update-graph, the client wraps control-plane fields (code, annotations, span_metadata) in ToPickle, and the scheduler must unpack them. The ToPickle docstring states this directly: both the scheduler and workers automatically unpickle the object on arrival.

So the docs contradict the code and the docstring. deserialize=False protects the Serialized (forwarded task data) path, not the Pickled (control-plane) path.
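
To make the distinction concrete, here is a minimal toy model of the two decode branches. It is a sketch, not the actual code in distributed/protocol/core.py: the `decode` function, the message layout, and the `Serialized` stand-in class are simplified assumptions made for illustration only.

```python
import pickle


class Serialized:
    """Hypothetical stand-in: an opaque payload forwarded without decoding."""

    def __init__(self, frames):
        self.frames = frames


def decode(msg, deserialize=True):
    """Toy model of the two branches (assumed message layout)."""
    if "__Serialized__" in msg:
        if not deserialize:
            # Scheduler-style path: keep the frames as bytes and pass them on.
            return Serialized(msg["frames"])
        return pickle.loads(msg["frames"][0])
    if "__Pickled__" in msg:
        # No deserialize check in this branch: the payload is always unpickled.
        return pickle.loads(msg["frames"][0])
    return msg


payload = pickle.dumps({"code": "x + 1"})

# Same bytes, same deserialize=False flag, different wrapper key.
forwarded = decode({"__Serialized__": True, "frames": [payload]}, deserialize=False)
unpacked = decode({"__Pickled__": True, "frames": [payload]}, deserialize=False)

assert isinstance(forwarded, Serialized)  # still raw bytes
assert unpacked == {"code": "x + 1"}      # unpickled despite deserialize=False
```

In this model, passing deserialize=False changes the outcome only for the `__Serialized__` message, which matches the behavior described above.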

Why it matters

The inaccuracy is more than cosmetic given the default posture: the scheduler binds 0.0.0.0:8786, distributed.comm.require-encryption defaults to null, and there is no shared secret. An operator relying on the documented "protected from unpickling" guarantee is misled about the actual trust boundary.

Suggested fix

Replace the "protected from unpickling unsafe code" claim with a security note rather than silently deleting it. For example, the note could state that the scheduler unpickles control-plane (ToPickle) frames during decode, that access to the scheduler port must therefore be treated as trusted, and that network-level controls (and TLS via require-encryption) are the recommended mitigation. This mirrors the established threat model (cf. the withdrawn CVE-2024-10096).

Context

A Sonar maintainer reviewing a private GitHub Security Advisory confirmed the behavior is intended and that the protocol.rst line is a documentation bug, and asked that this issue be filed. Happy to open a PR with the corrected wording.
