From b59e13f5387b77bdbc5ea8daf2c9302568f2ae0d Mon Sep 17 00:00:00 2001
From: crusaderky
Date: Tue, 26 May 2026 22:25:52 +0100
Subject: [PATCH 1/2] Add AGENTS.md / CLAUDE.md

---
 AGENTS.md               | 102 ++++++++++++++++++++++++++++++++++++++++
 CLAUDE.md               |   1 +
 docs/source/develop.rst |  28 +++++++++++
 3 files changed, 131 insertions(+)
 create mode 100644 AGENTS.md
 create mode 120000 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000000..4d9d000dc79
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,102 @@
+# AGENTS.md
+
+This file provides guidance to AI coding agents when working with code in this repository.
+
+## Overview
+
+Dask Distributed is the distributed scheduler for the Dask framework, enabling parallel computing across multiple machines. It implements multi-machine task scheduling, fault tolerance, work stealing, memory management, and network communication.
+
+## Environment & Commands
+
+The project uses **Pixi** for environment management.
+
+```bash
+# Run tests
+pixi run test
+
+# Run tests (CI mode with coverage, leak detection, slow tests)
+pixi run test-ci
+
+# Lint
+pixi run lint
+
+# Run a single test file
+pixi run test distributed/tests/test_client.py
+
+# Run a single test
+pixi run test distributed/tests/test_client.py::test_client_submit
+
+# Run tests matching a pattern
+pixi run test distributed/tests/test_client.py -k "submit"
+
+# Run tests in a specific environment
+pixi run -e py312 test
+```
+
+Key pytest options:
+- `--runslow` — include slow tests (omitted by default)
+- `-m ci1` / `-m "not ci1"` — run first/second CI partition (tests split for parallelism)
+- `--leaks=fds,processes,threads` — enable resource leak detection
+
+## Architecture
+
+### Core modules (all in `distributed/`)
+
+| File | Purpose |
+|------|---------|
+| `scheduler.py` | Main scheduler — task graph, work stealing, fault tolerance |
+| `client.py` | User-facing API — submit tasks, gather futures |
+| `worker.py` | Worker process — executes tasks, manages memory |
+| `worker_state_machine.py` | Worker state transitions (separate from I/O logic) |
+| `core.py` | RPC infrastructure, connection handling |
+| `utils_test.py` | Test fixtures and helpers used across all tests |
+
+### Subdirectories
+
+- `comm/` — Communication backends (TCP, UCX, compression)
+- `deploy/` — Cluster types: `LocalCluster`, `SSHCluster`, `SpecCluster`, adaptive scaling
+- `dashboard/` — Bokeh-based web UI for monitoring
+- `diagnostics/` — Task streams, memory sampling, profiling
+- `shuffle/` — Distributed shuffle for large data movement
+- `protocol/` — Message serialization
+- `cli/` — Entry points: `dask scheduler`, `dask worker`, `dask ssh`, `dask spec`
+
+### Key classes
+
+- `Client` — entry point for submitting work to a cluster
+- `Scheduler` — coordinates all workers and task execution
+- `Worker` — executes tasks; state tracked separately in `WorkerState`/`worker_state_machine.py`
+- `LocalCluster` — single-machine cluster for testing/development
+- `TaskState` — tracks task lifecycle on both scheduler and worker sides
+
+## Testing
+
+Tests live in `distributed/tests/` (67 files) and each submodule has its own `tests/` subdirectory. The global `conftest.py` at the repo root and `distributed/utils_test.py` provide shared fixtures.
+
+Tests are partitioned by the `ci1` marker for parallel CI execution. Resource leak detection (fds, processes, threads) runs in CI via `distributed/pytest_resourceleaks.py`.
+
+Timeout: 300 seconds per test (signal-based on Unix, thread-based on Windows).
+
+## Code Style
+
+- Line length: **120 characters**
+- Formatter: Black
+- Linter: Ruff (rules: B, TID, I, UP, RUF)
+- Type checking: MyPy — strict on newer modules (`scheduler`, `worker`, `active_memory_manager`, `config`, `shuffle`), loose on older ones
+
+## Type Checking Notes
+
+MyPy is configured per-module in `pyproject.toml`. Newer modules have strict settings; older modules like `client.py` use `allow_incomplete_defs = true`. Platform target is Linux.
+
+## Contributing
+
+You must never think or speak instead of the user in discussions, code reviews, or any
+other interactions with other humans.
+
+Before you open or update a PR, you must ask the user to explicitly confirm that they
+fully reviewed, understood, and approved everything that you wrote. You must clarify
+that the project maintainers consider this as non-negotiable.
+
+## Releasing
+
+A coding agent must NEVER create a new release.
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 00000000000..47dc3e3d863
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/docs/source/develop.rst b/docs/source/develop.rst
index ea729e6180a..be9bf0c69b0 100644
--- a/docs/source/develop.rst
+++ b/docs/source/develop.rst
@@ -185,3 +185,31 @@ run automatically when you make a git commit. This can be done by running::
 from the root of the distributed repository. Now the code linters will be run
 each time you commit changes. You can skip these checks with
 ``git commit --no-verify`` or with the short version ``git commit -n``.
+
+Making Pull Requests
+--------------------
+
+Pull Request Etiquette
+~~~~~~~~~~~~~~~~~~~~~~
+
+When opening a Pull Request you are beginning a dialog with maintainers. This is a bidirectional
+relationship where you are asking for the reviewer's time to look at your contribution, and
+the reviewer will likely ask for your input and engage you in discussion around the changes.
+
+Please do not propose code that you are not willing to stand behind and discuss.
+Be prepared to respond to review feedback, apply critical thinking and iterate on your contributions.
+
+We ask that you fill out all sections of PR templates and provide reasoning behind your changes,
+ideally with a linked issue that has been discussed by the community.
+
+Automated Contributions and AI Policy
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We encourage the use of AI and automated tools to assist in code development,
+documentation, and testing. However, we ask that contributors disclose these tools and
+use them in a way that aligns with Dask's community guidelines. In particular:
+
+- do not use tools to think or speak for you in discussions, code reviews, or any other
+  interactions within the Dask community.
+- Before you open a PR, you (the human) must fully review, understand, and approve
+  everything that the AI agent wrote.

From 802f1b59662bfa696ade218b1e42390feee90490 Mon Sep 17 00:00:00 2001
From: crusaderky
Date: Wed, 27 May 2026 16:11:48 +0100
Subject: [PATCH 2/2] AGENTS.md

---
 AGENTS.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index 4d9d000dc79..e8f2faf57bb 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -11,6 +11,9 @@ Dask Distributed is the distributed scheduler for the Dask framework, enabling p
 The project uses **Pixi** for environment management.
 
 ```bash
+# Run arbitrary Python commands
+pixi run -- python -c 'print("Hello world!")'
+
 # Run tests
 pixi run test
 
@@ -77,6 +80,14 @@ Tests are partitioned by the `ci1` marker for parallel CI execution. Resource le
 
 Timeout: 300 seconds per test (signal-based on Unix, thread-based on Windows).
 
+## Key Patterns for Contributors
+
+**IMPORTANT**: never call .compute() or .persist() in the middle of graph definition
+(e.g. in all methods of Array, Series, DataFrame, Bag, Delayed). The only place when the
+graph is materialized should be where the end user explicitly calls .compute() or
+.persist(). When you are defining the graph, you must work with available metadata to
+infer the outputs.
+
 ## Code Style
 
 - Line length: **120 characters**
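
The rule that PATCH 2/2 adds under "Key Patterns for Contributors" can be illustrated with a toy sketch. This is not Dask code and does not use any Dask API: `LazyArray`, `from_list`, `concat_bad`, `concat_good` and the `MATERIALIZED` log are hypothetical names invented for this illustration, assuming only that a lazy collection pairs a deferred task with metadata such as its shape.

```python
# Toy illustration only: every name below is hypothetical, none of it is Dask API.
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Log of explicit materializations, so the two approaches can be told apart.
MATERIALIZED: list[str] = []


@dataclass(frozen=True)
class LazyArray:
    """A lazy 1-D collection: a deferred task plus shape metadata."""

    name: str
    task: Callable[[], list[float]]
    shape: tuple[int]

    def compute(self) -> list[float]:
        # The only place where the deferred task is actually run.
        MATERIALIZED.append(self.name)
        return self.task()


def from_list(name: str, values: list[float]) -> LazyArray:
    data = list(values)
    return LazyArray(name=name, task=lambda: data, shape=(len(data),))


def concat_bad(a: LazyArray, b: LazyArray) -> LazyArray:
    # Anti-pattern: materializes both inputs while the graph is still being defined.
    return from_list("bad", a.compute() + b.compute())


def concat_good(a: LazyArray, b: LazyArray) -> LazyArray:
    # The output shape is inferred from input metadata; no task runs here.
    return LazyArray(
        name="good",
        task=lambda: a.task() + b.task(),
        shape=(a.shape[0] + b.shape[0],),
    )
```

In this sketch `concat_good` knows its output shape before any task runs, and the work happens only when the caller invokes `compute()` on the result. `concat_bad` runs both inputs as a side effect of merely defining the operation, which is the behavior the patch asks contributors to avoid.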