Merged
8 changes: 8 additions & 0 deletions CHECKS.md
@@ -64,6 +64,14 @@ Confirm that the execution phase creates a sandbox for each check and runs the a

Inspect the execution phase. Confirm that checks are dispatched concurrently — for example via a `JoinSet`, `FuturesUnordered`, or per-check `tokio::spawn` — and awaited together, rather than run inside a blocking loop that starts and awaits one check before beginning the next. The check fails if check execution is strictly sequential.
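As a rough illustration of the dispatch shape this check looks for, the sketch below starts all checks against a shared work queue and joins them together, instead of awaiting one check before starting the next. It is a minimal sketch and not the project's implementation: it swaps in `std::thread::scope` for the tokio-based options named above so that it needs no external crates, and `run_checks_concurrently`, its `cap` parameter, and the `bool` result type are hypothetical names chosen for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: runs every check, with at most `cap` checks in flight
/// at once, and returns the results in input order.
fn run_checks_concurrently<F>(checks: &[F], cap: usize) -> Vec<bool>
where
    F: Fn() -> bool + Sync,
{
    // Shared cursor: each worker claims the next unstarted check.
    let next = AtomicUsize::new(0);
    // Never start more workers than there are checks, and always at least one.
    let workers = cap.max(1).min(checks.len().max(1));

    let collected: Vec<Vec<(usize, bool)>> = thread::scope(|scope| {
        let mut handles = Vec::with_capacity(workers);
        for _ in 0..workers {
            handles.push(scope.spawn(|| {
                let mut local = Vec::new();
                loop {
                    let i = next.fetch_add(1, Ordering::Relaxed);
                    if i >= checks.len() {
                        break;
                    }
                    local.push((i, checks[i]()));
                }
                local
            }));
        }
        // All workers are awaited together, after every one has been started.
        handles
            .into_iter()
            .map(|h| h.join().expect("check worker panicked"))
            .collect()
    });

    // Put the results back into input order.
    let mut results = vec![false; checks.len()];
    for (i, passed) in collected.into_iter().flatten() {
        results[i] = passed;
    }
    results
}
```

With `cap` set to `2` and three checks, two checks run at once and the third starts as soon as either worker is free. A strictly sequential loop would correspond to `cap` being `1`.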

# Requirement Concurrency Defaults To The Host's Core Count

The concurrency cap is user-configurable (a `--concurrency` flag, layered the same way as `--provider`/`--model`/`--effort`/`--executor`), but absent any override its default must equal the number of CPU cores available on the machine running `multi check` — not a hardcoded constant. A fixed default either strands cores on big machines or overcommits small ones.

## Check Default Concurrency Equals Available Parallelism

Inspect how the default check concurrency is computed. Confirm that, with no `--concurrency` flag, no `MULTI_CHECKS_CONCURRENCY` environment variable, and no `checks.concurrency` config-file value set, the resolved concurrency is derived from the host's available parallelism (for example via `std::thread::available_parallelism`) rather than a fixed literal such as `2`. The check fails if the default concurrency is a hardcoded number instead of a value computed from the running machine's core count.
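The resolution order described here can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's code: the real layering goes through the configuration loader, whereas `resolve_concurrency` and its three `Option` parameters are hypothetical stand-ins for the flag, environment, and config-file layers. Only `std::thread::available_parallelism` and the error message text are taken from the change itself.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Default cap: the host's available parallelism, or 1 if it cannot be
/// determined.
fn default_concurrency() -> usize {
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}

/// Hypothetical helper: picks the highest-precedence value that is set
/// (flag, then environment, then config file) and falls back to the
/// computed default. A resolved value of 0 is rejected.
fn resolve_concurrency(
    flag: Option<usize>,
    env: Option<usize>,
    file: Option<usize>,
) -> Result<usize, String> {
    let resolved = flag.or(env).or(file).unwrap_or_else(default_concurrency);
    if resolved == 0 {
        return Err("checks.concurrency must be greater than 0".to_string());
    }
    Ok(resolved)
}
```

The case this check targets is the one where all three layers are `None`: the result must then equal `default_concurrency()`, which varies by machine, and not a literal.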

# Requirement Authoring Errors Are Actionable

When a `CHECKS.md` file is malformed, the tool must tell the author exactly what is wrong and where, instead of failing opaquely. Clear diagnostics are what make the format usable.
24 changes: 16 additions & 8 deletions guides/checks.md
@@ -159,22 +159,25 @@ Glob, plus the judge tool) — a verification agent observes, it does not mutate

## ⚙️ Configuration

The default **provider**, **model**, **effort**, and **executor** are resolved
from three sources, in order of precedence (highest wins):
The default **provider**, **model**, **effort**, **executor**, and
**concurrency** are resolved from three sources, in order of precedence
(highest wins):

1. **Flags** — `--provider`, `--model`, `--effort`, `--executor` on `multi check`.
1. **Flags** — `--provider`, `--model`, `--effort`, `--executor`,
`--concurrency` on `multi check`.
2. **Environment** — `MULTI_`-prefixed vars mapped into the `checks` namespace,
e.g. `MULTI_CHECKS_MODEL`, `MULTI_CHECKS_PROVIDER`, `MULTI_CHECKS_EFFORT`,
`MULTI_CHECKS_EXECUTOR`.
`MULTI_CHECKS_EXECUTOR`, `MULTI_CHECKS_CONCURRENCY`.
3. **Config file** — the `[checks]` table of `MultiTool.toml` (or `.json` /
`.jsonc`), discovered up the directory tree like any MultiTool manifest.

```toml
[checks]
provider = "anthropic" # anthropic | openai | gemini
model = "claude-sonnet-4-6" # must be a known model ID for the provider
effort = "low" # low | medium | high → thinking-token budget
executor = "cersei" # cersei (in-process, default) | claude (fallback)
provider = "anthropic" # anthropic | openai | gemini
model = "claude-sonnet-4-6" # must be a known model ID for the provider
effort = "low" # low | medium | high → thinking-token budget
executor = "cersei" # cersei (in-process, default) | claude (fallback)
concurrency = 8 # checks run at once; must be > 0 (default: CPU core count)

# optional, non-secret base-URL overrides per provider
[checks.providers.anthropic]
@@ -194,6 +197,11 @@ CLI). `claude` is the legacy `claude -p` shell-out fallback, kept selectable for
migration while the in-process path is validated; it requires the `claude` CLI on
your `PATH` and will be removed once cersei is proven out.

The **`concurrency`** setting caps how many checks run at once; it must be a
positive integer (`0` is rejected with a clear error). Its default matches the
number of CPU cores available on the machine running `multi check`, so a suite
fans out to use the whole machine rather than leaving cores idle.

**Credentials are environment-only.** API keys are read directly from each
provider's native variable and never live in the config file or under the
`MULTI_` prefix:
33 changes: 26 additions & 7 deletions src/checks/config/mod.rs
@@ -17,6 +17,7 @@ mod models;
mod providers;
mod schema;

use std::num::NonZeroUsize;
use std::time::Duration;

use figment::{
@@ -32,15 +33,22 @@ use crate::checks::executor::claude::ClaudeExecutor;
pub use providers::{ProviderFactory, ProviderRegistry};
pub use schema::{CliOverrides, Effort, ExecutorKind, ProviderKind};

/// Maximum number of checks executed concurrently. A small fan-out gives each
/// (CPU-heavy) reasoning agent enough cores to finish promptly.
const DEFAULT_CONCURRENCY: usize = 2;
/// Per-agent wall-clock timeout. Generous: the heaviest reasoning checks can
/// take a few minutes under contention before they report.
const DEFAULT_AGENT_TIMEOUT: Duration = Duration::from_secs(240);
/// How many times to (re)run a check whose agent fails to report.
const DEFAULT_MAX_ATTEMPTS: usize = 3;

/// Default number of checks executed concurrently: one per available CPU core,
/// so a check suite fans out to use the whole machine rather than leaving cores
/// idle. Falls back to `1` on the rare platform where the count can't be
/// determined.
fn default_concurrency() -> usize {
    std::thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}

/// The resolved configuration for a `multi check` run.
#[derive(Debug, Clone)]
pub struct Config {
@@ -56,7 +64,8 @@ pub struct Config {
pub effort: Effort,
/// Which execution engine runs each check (default: in-process cersei).
pub executor: ExecutorKind,
/// Maximum number of checks executed concurrently.
/// Maximum number of checks executed concurrently (default: the number of
/// available CPU cores; see [`default_concurrency`]).
pub concurrency: usize,
/// Per-agent wall-clock timeout (reaps an agent that hangs before reporting).
pub agent_timeout: Duration,
@@ -148,6 +157,7 @@ pub fn load(overrides: CliOverrides) -> Result<Resolved> {
.unwrap_or_else(|| models::default_model(provider).to_string());
let effort = checks.effort.unwrap_or(Effort::Low);
let executor = checks.executor.unwrap_or(ExecutorKind::Cersei);
let concurrency = checks.concurrency.unwrap_or_else(default_concurrency);

if !models::is_valid_model(provider, &model) {
return Err(miette!(
@@ -157,6 +167,10 @@ pub fn load(overrides: CliOverrides) -> Result<Resolved> {
));
}

if concurrency == 0 {
return Err(miette!("checks.concurrency must be greater than 0"));
}

// Build one handle per provider whose credential is present, then require
// that the *selected* provider actually resolved to an available handle.
let registry = providers::build_registry(&checks.providers)?;
@@ -181,7 +195,7 @@ pub fn load(overrides: CliOverrides) -> Result<Resolved> {
model,
effort,
executor,
concurrency: DEFAULT_CONCURRENCY,
concurrency,
agent_timeout: DEFAULT_AGENT_TIMEOUT,
max_attempts: DEFAULT_MAX_ATTEMPTS,
};
@@ -207,7 +221,7 @@ pub fn configuration() -> Config {
model: models::default_model(provider).to_string(),
effort: Effort::Low,
executor: ExecutorKind::Cersei,
concurrency: DEFAULT_CONCURRENCY,
concurrency: default_concurrency(),
agent_timeout: DEFAULT_AGENT_TIMEOUT,
max_attempts: DEFAULT_MAX_ATTEMPTS,
}
@@ -230,6 +244,7 @@ mod tests {
model: Some(model.to_string()),
effort: Some(Effort::Low),
executor: None,
concurrency: None,
providers: ProvidersSection::default(),
},
}
@@ -242,6 +257,8 @@ mod tests {
assert_eq!(cfg.model, "claude-sonnet-4-6");
assert_eq!(cfg.executor, ExecutorKind::Cersei);
assert!(cfg.concurrency >= 1);
// The default must track the machine's core count, not a hardcoded value.
assert_eq!(cfg.concurrency, default_concurrency());
// The fallback executor is constructible from config alone (DI seam works).
let _exec = cfg.build_claude_executor();
}
@@ -255,6 +272,7 @@ mod tests {
Some("gpt-4o".into()),
None,
None,
None,
);
let checks = resolve_layers(file, overrides).unwrap();
assert_eq!(checks.provider, Some(ProviderKind::OpenAi));
@@ -288,7 +306,8 @@ mod tests {
assert_eq!(checks.model.as_deref(), Some("claude-haiku-4-5"));

// ...and a flag outranks env.
let overrides = CliOverrides::new(None, Some("claude-opus-4-8".into()), None, None);
let overrides =
CliOverrides::new(None, Some("claude-opus-4-8".into()), None, None, None);
let checks = resolve_layers(file, overrides).unwrap();
assert_eq!(checks.model.as_deref(), Some("claude-opus-4-8"));
Ok(())
8 changes: 8 additions & 0 deletions src/checks/config/schema.rs
@@ -77,6 +77,10 @@ pub struct ChecksSection {
/// Which execution engine runs each check (`cersei` by default).
#[serde(default, skip_serializing_if = "Option::is_none")]
pub executor: Option<ExecutorKind>,
/// Maximum number of checks executed concurrently (default: the number of
/// available CPU cores).
#[serde(default, skip_serializing_if = "Option::is_none")]
pub concurrency: Option<usize>,
/// Optional, non-secret per-provider base-URL overrides.
#[serde(default)]
pub providers: ProvidersSection,
@@ -134,6 +138,8 @@ pub struct CliChecksOverrides {
pub effort: Option<Effort>,
#[serde(skip_serializing_if = "Option::is_none")]
pub executor: Option<ExecutorKind>,
#[serde(skip_serializing_if = "Option::is_none")]
pub concurrency: Option<usize>,
}

impl CliOverrides {
@@ -143,13 +149,15 @@ impl CliOverrides {
model: Option<String>,
effort: Option<Effort>,
executor: Option<ExecutorKind>,
concurrency: Option<usize>,
) -> Self {
Self {
checks: CliChecksOverrides {
provider,
model,
effort,
executor,
concurrency,
},
}
}
13 changes: 8 additions & 5 deletions src/checks/e2e.rs
@@ -16,7 +16,7 @@ use miette::Result;
use tempfile::TempDir;
use tokio::sync::Barrier;

use crate::checks::config::configuration;
use crate::checks::config::{Config, configuration};
use crate::checks::discovery::discover;
use crate::checks::executor::{
AgentOutcome, AgentRunRequest, CheckExecutor, CheckReport, FakeExecutor,
@@ -207,8 +207,9 @@ impl CheckExecutor for InterleavingExecutor {

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn checks_execute_concurrently_not_in_a_barrier() {
// Two independent requirements (two checks total). The default concurrency
// is 2, so both should run at once.
// Two independent requirements (two checks total). Pin concurrency to 2
// explicitly so both run at once regardless of the host's core count (the
// default now tracks available parallelism, not a fixed value).
let dir = TempDir::new().unwrap();
fs::write(
dir.path().join("CHECKS.md"),
@@ -221,8 +222,10 @@ async fn checks_execute_concurrently_not_in_a_barrier() {
let executor = Arc::new(InterleavingExecutor {
barrier: Arc::new(Barrier::new(2)),
});
let cfg = configuration();
assert_eq!(cfg.concurrency, 2, "test assumes a concurrency of 2");
let cfg = Config {
concurrency: 2,
..configuration()
};

let outcomes = run_to_outcomes(
&cfg,
1 change: 1 addition & 0 deletions src/checks/mod.rs
@@ -70,6 +70,7 @@ pub async fn run(terminal: &Terminal, working_dir: &Path, overrides: CliOverride
provider = resolved.config.provider.as_str(),
model = %resolved.config.model,
executor = ?resolved.config.executor,
concurrency = resolved.config.concurrency,
available_providers = ?resolved.providers.keys().collect::<Vec<_>>(),
"resolved checks configuration and provider registry",
);
8 changes: 8 additions & 0 deletions src/config/check/mod.rs
@@ -1,3 +1,4 @@
use std::num::NonZeroUsize;
use std::path::{Path, PathBuf};

use clap::Args;
@@ -32,6 +33,12 @@ pub struct CheckSubcommand {
/// legacy `claude -p` fallback). Overrides `checks.executor` from env/file.
#[arg(long, value_enum)]
executor: Option<ExecutorKind>,

/// Maximum number of checks to run concurrently. Must be greater than 0.
/// Overrides `checks.concurrency` from env/file. Defaults to the number of
/// available CPU cores.
#[arg(long)]
concurrency: Option<NonZeroUsize>,
}

impl CheckSubcommand {
@@ -48,6 +55,7 @@ impl CheckSubcommand {
self.model.clone(),
self.effort,
self.executor,
self.concurrency.map(NonZeroUsize::get),
)
}
}