From 982d826f718543cc0981fd94746730414494a7da Mon Sep 17 00:00:00 2001
From: Maren Mahsereci
Date: Mon, 27 Apr 2026 22:54:05 +0200
Subject: [PATCH 1/3] Add AGENTS.md as cross-agent source of truth, CLAUDE.md imports it

AGENTS.md is the agent-agnostic instruction file for all AI coding tools.
CLAUDE.md is a thin stub that imports AGENTS.md via @AGENTS.md for Claude Code.

Co-Authored-By: Claude Sonnet 4.6
---
 AGENTS.md | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 CLAUDE.md |  5 ++++
 2 files changed, 78 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100644 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..692646f6
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,73 @@
+# AGENTS.md
+
+This file provides guidance to AI coding agents when working with code in this repository.
+
+## Development Commands
+
+```bash
+# Install for development
+pip install -e .[tests] # core + test tooling
+pip install -e .[dev] # everything (tests, docs, examples, optional ML backends)
+
+# Run tests
+pytest tests/ # unit tests
+pytest integration_tests/ # integration tests
+pytest --cov emukit --cov-report term-missing tests/ # with coverage
+pytest -m 'not (gpy or pybnn or sklearn or notebooks)' # skip optional-dependency tests
+pytest -m gpy # only GPy tests
+
+# Lint and format (enforced in CI)
+black .
+isort .
+flake8 .
+```
+
+**Line length:** 120 characters. **Exceptions:** E731, E127 in flake8.
+
+## Architecture
+
+Emukit is a modular, framework-agnostic library for emulation-based decision-making (Bayesian optimization, experimental design, Bayesian quadrature, sensitivity analysis). The central design is the **OuterLoop**:
+
+```
+while stopping_condition not met:
+    candidate_point_calculator → next points to evaluate
+    user_function(points) → evaluations
+    model_updater → update model with new data
+```
+
+All loop components are swappable, enabling model-agnostic algorithms.
+
+### Key Packages
+
+- **`emukit/core/`** — All shared abstractions:
+  - `interfaces/` — Model interfaces (`IModel`, `IDifferentiable`, `IJointlyDifferentiable`, `IPriorHyperparameters`, `IModelWithNoise`)
+  - `loop/` — `OuterLoop`, `LoopState`, `CandidatePointCalculator`, `ModelUpdater`, `StoppingCondition`, `UserFunction`, `EventHandler`
+  - `acquisition/` — `Acquisition` base class; supports `+`, `*`, `/` operator overloading for composing acquisitions
+  - `optimization/` — `AcquisitionOptimizer` (maximizes acquisition over parameter space)
+  - `parameter_space.py` — `ParameterSpace` composed of `ContinuousParameter`, `DiscreteParameter`, `CategoricalParameter`, `BanditParameter`
+  - `initial_designs/` — Sampling strategies for initialization
+  - `encodings.py` — `OneHotEncoding`, `OrdinalEncoding`
+
+- **`emukit/bayesian_optimization/`** — `BayesianOptimizationLoop` (wraps OuterLoop with sensible defaults), acquisitions (EI, EI-MCMC, entropy search, max-value entropy search, local penalization, NegativeLowerConfidenceBound, PoF, PoI)
+
+- **`emukit/experimental_design/`** — `ExperimentalDesignLoop`, design-specific acquisitions
+
+- **`emukit/quadrature/`** — Bayesian quadrature: specialized kernels, loop, and `WarpedBayesianQuadratureModel`
+
+- **`emukit/multi_fidelity/`** — Multi-fidelity GP models built on GPy
+
+- **`emukit/sensitivity/`** — Monte Carlo sensitivity analysis (Sobol indices)
+
+- **`emukit/model_wrappers/`** — Bridges external ML libraries to emukit interfaces: `GPyModelWrapper`, `GPyMultiOutputWrapper`, `SklearnModelWrapper`, `SimpleGaussianProcessModel`
+
+- **`emukit/samplers/`** — MCMC and other samplers
+
+- **`emukit/test_functions/`** — Benchmark functions (Branin, Forrester, etc.)
+
+### Interface Conventions
+
+Interface names are prefixed with `I` (e.g., `IModel`, `IDifferentiable`). Models only need to implement the interfaces required by the algorithms they are used with — there is no single monolithic model class.
Type hints are required on all public functions.
+
+### Optional Dependencies
+
+Optional backends (GPy, pybnn/torch, sklearn) are guarded by `pytest.importorskip()` in tests and declared as optional extras in `pyproject.toml`. Tests for these backends are marked with `@pytest.mark.gpy`, `@pytest.mark.pybnn`, `@pytest.mark.sklearn`, or `@pytest.mark.notebooks`.
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 00000000..078c29c4
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,5 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+@AGENTS.md

From ffe4c43ae00db433eca2359d2b044e3ee8cf821e Mon Sep 17 00:00:00 2001
From: Maren Mahsereci
Date: Fri, 8 May 2026 17:21:10 +0200
Subject: [PATCH 2/3] Expand AGENTS.md with PR preparation guidelines and license header rules

Adds a "Preparing a Pull Request" section covering target branch, PR scope,
pre-PR checklist (unit + integration tests, linting, license headers), and
detailed license header rules for new and existing files.

Co-Authored-By: Claude Sonnet 4.6
---
 AGENTS.md | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index 692646f6..d4cd67fc 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -12,6 +12,7 @@ pip install -e .[dev] # everything (tests, docs, examples, optional M
 # Run tests
 pytest tests/ # unit tests
 pytest integration_tests/ # integration tests
+pytest tests/.../test_file.py::test_name # single test (replace path and test name)
 pytest --cov emukit --cov-report term-missing tests/ # with coverage
 pytest -m 'not (gpy or pybnn or sklearn or notebooks)' # skip optional-dependency tests
 pytest -m gpy # only GPy tests
@@ -71,3 +72,30 @@ Interface names are prefixed with `I` (e.g., `IModel`, `IDifferentiable`). Model
 ### Optional Dependencies
 
 Optional backends (GPy, pybnn/torch, sklearn) are guarded by `pytest.importorskip()` in tests and declared as optional extras in `pyproject.toml`.
Tests for these backends are marked with `@pytest.mark.gpy`, `@pytest.mark.pybnn`, `@pytest.mark.sklearn`, or `@pytest.mark.notebooks`.
+
+## Preparing a Pull Request
+
+**Target branch:** `main` on the upstream remote.
+
+**PR scope:** One PR per functional change. Large changes must be split into multiple PRs with clear, independent scope — do not mix refactoring with new features or bundle unrelated fixes.
+
+**Pre-PR checklist:**
+- [ ] All unit tests pass (`pytest tests/`)
+- [ ] Integration tests pass (`pytest integration_tests/`) — run these unless the developer has indicated they will verify manually
+- [ ] Linting clean (`black .`, `isort .`, `flake8 .`)
+- [ ] License headers present and up to date on all meaningfully changed and new files (see below)
+
+### License Headers
+
+**New files** get only the Emukit Authors header (new files are not covered by the Amazon or Opsani copyrights):
+
+```python
+# Copyright 2020-2026 The Emukit Authors. All Rights Reserved.
+# SPDX-License-Identifier: Apache-2.0
+```
+
+Replace the end year with the current year.
+
+**Existing files** already have an Emukit Authors header, and may also have an Amazon or Opsani header below it. Only update the end year in the Emukit Authors line if it is behind the current year. Never modify the Amazon or Opsani headers.
+
+**Year update rule:** Use `2020` as the fixed start year. Update the end year to the current year only for files where meaningful changes were made — not for reformatting-only edits.

From b31833032f9b05e6d33b22ce06c377e81d43a0dc Mon Sep 17 00:00:00 2001
From: Maren Mahsereci
Date: Tue, 26 May 2026 15:14:26 +0200
Subject: [PATCH 3/3] Expand AGENTS.md with coding conventions, doc update rules, and PR guidelines

Adds Coding Conventions section covering interface naming, docstring style
(Sphinx/reST with example), optional dependencies, and documentation update
rules for doc/api/ .rst files.
Expands PR checklist with doc build verification, license header precision,
and requirement to disclose AI agent involvement.

Co-Authored-By: Claude Sonnet 4.6
---
 AGENTS.md | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 5 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index d4cd67fc..c3b6d19c 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -17,14 +17,12 @@ pytest --cov emukit --cov-report term-missing tests/ # with coverage
 pytest -m 'not (gpy or pybnn or sklearn or notebooks)' # skip optional-dependency tests
 pytest -m gpy # only GPy tests
 
-# Lint and format (enforced in CI)
+# Lint and format (enforced in CI) — line length: 120 chars, flake8 exceptions: E731, E127
 black .
 isort .
 flake8 .
 ```
 
-**Line length:** 120 characters. **Exceptions:** E731, E127 in flake8.
-
 ## Architecture
 
 Emukit is a modular, framework-agnostic library for emulation-based decision-making (Bayesian optimization, experimental design, Bayesian quadrature, sensitivity analysis). The central design is the **OuterLoop**:
@@ -65,14 +63,58 @@ All loop components are swappable, enabling model-agnostic algorithms.
 
 - **`emukit/test_functions/`** — Benchmark functions (Branin, Forrester, etc.)
 
+## Coding Conventions
+
 ### Interface Conventions
 
 Interface names are prefixed with `I` (e.g., `IModel`, `IDifferentiable`). Models only need to implement the interfaces required by the algorithms they are used with — there is no single monolithic model class. Type hints are required on all public functions.
 
+### Docstring Style
+
+Use **Sphinx/reStructuredText (reST)** style. Do not use Google style (`Args:`, `Returns:`) or NumPy style (section headers with underlines).
+
+- Parameters: `:param name: description`
+- Return value: `:return: description`
+- Do not add `:type:` or `:rtype:` tags — types belong in the function signature via type hints only
+- Document array shapes inline in the parameter description, e.g.
`(n_points x n_dims) array`
+
+```python
+def sample_uniform(self, point_count: int) -> np.ndarray:
+    """
+    Generates multiple uniformly distributed random parameter points.
+
+    :param point_count: number of data points to generate
+    :return: Generated points with shape (point_count, num_features)
+    """
+```
+
 ### Optional Dependencies
 
 Optional backends (GPy, pybnn/torch, sklearn) are guarded by `pytest.importorskip()` in tests and declared as optional extras in `pyproject.toml`. Tests for these backends are marked with `@pytest.mark.gpy`, `@pytest.mark.pybnn`, `@pytest.mark.sklearn`, or `@pytest.mark.notebooks`.
 
+### Documentation
+
+API docs are Sphinx-based and live in `doc/`. Each package has a hand-maintained `.rst` file in `doc/api/` that lists its modules via `.. automodule::` directives. Sphinx pulls docstrings from source automatically — but the `.rst` files must be kept in sync with the code structure.
+
+Edit `.rst` files manually — do not use automated tools to regenerate them.
+
+**When to update `doc/api/` `.rst` files:**
+- **New file in an existing package**: add a `.. automodule::` block to the corresponding `doc/api/emukit..rst`:
+  ```rst
+  .. automodule:: emukit.package.new_module
+     :members:
+     :undoc-members:
+     :show-inheritance:
+  ```
+- **New subpackage**: create a new `doc/api/emukit..rst` and add it to the `toctree` of the parent `.rst`
+- **Deleted or renamed module**: remove or update the corresponding entry in the relevant `.rst`
+
+**Verify the docs build locally** whenever files under `doc/` were changed or docstrings in source files were modified. Install dependencies first if needed (`pip install -e .[dev]`), then from inside the `doc/` directory run:
+```bash
+make html
+```
+No need to run this if neither `doc/` files nor any docstrings were touched.
+
 ## Preparing a Pull Request
 
 **Target branch:** `main` on the upstream remote.
@@ -83,7 +125,10 @@ Optional backends (GPy, pybnn/torch, sklearn) are guarded by `pytest.importorski
 - [ ] All unit tests pass (`pytest tests/`)
 - [ ] Integration tests pass (`pytest integration_tests/`) — run these unless the developer has indicated they will verify manually
 - [ ] Linting clean (`black .`, `isort .`, `flake8 .`)
-- [ ] License headers present and up to date on all meaningfully changed and new files (see below)
+- [ ] License headers present and up to date on all new files and files where logic or behaviour was changed (see below)
+- [ ] If code structure changed (new, deleted, or renamed modules or subpackages): `doc/api/` `.rst` files updated accordingly
+- [ ] If `doc/` files or docstrings in source were changed: `make html` passes (run from `doc/`)
+- [ ] PR description explicitly states that an AI agent was involved in the development
 
 ### License Headers
 
@@ -98,4 +143,4 @@ Replace the end year with the current year.
 
 **Existing files** already have an Emukit Authors header, and may also have an Amazon or Opsani header below it. Only update the end year in the Emukit Authors line if it is behind the current year. Never modify the Amazon or Opsani headers.
 
-**Year update rule:** Use `2020` as the fixed start year. Update the end year to the current year only for files where meaningful changes were made — not for reformatting-only edits.
+**Year update rule:** Use `2020` as the fixed start year. Update the end year to the current year only for files where logic or behaviour was changed — not for whitespace, import reordering, or comment-only edits.
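
Note outside the patch: the year update rule in the final hunk can be sketched as a small helper. This is a minimal illustration under stated assumptions, not code from the Emukit repository. The function name `bump_emukit_year` and the regular expression are hypothetical, and the header text is assumed to match the two-line example in the License Headers section exactly. The helper only rewrites the end year; deciding whether a file had a logic or behaviour change is left to the author.

```python
import re

# Hypothetical pattern (an assumption, not Emukit tooling): matches only the
# Emukit Authors line, so any other copyright header is never touched.
EMUKIT_HEADER = re.compile(
    r"^# Copyright 2020-(\d{4}) The Emukit Authors\. All Rights Reserved\.$",
    re.MULTILINE,
)


def bump_emukit_year(source: str, current_year: int) -> str:
    """
    Updates the end year of the Emukit Authors header if it is behind.

    :param source: full text of a source file
    :param current_year: year to write as the end year
    :return: text with the Emukit Authors end year updated, other lines unchanged
    """

    def _replace(match: re.Match) -> str:
        end_year = int(match.group(1))
        if end_year >= current_year:
            # Already up to date: return the matched line unchanged.
            return match.group(0)
        return f"# Copyright 2020-{current_year} The Emukit Authors. All Rights Reserved."

    # Only the first matching line is considered the file's header.
    return EMUKIT_HEADER.sub(_replace, source, count=1)
```

Keeping the start year fixed at `2020` inside the pattern mirrors the rule as written; a file whose header deviates from that form is simply left unchanged rather than guessed at.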