isty2e · Jul 2, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,7 +8,22 @@ format. Stability guarantees for the public surface are documented in the
 
 ## [Unreleased]
 
-No unreleased changes yet.
+### Breaking
+
+- `Study.run(...)` and `Study.optimize(...)` now default to
+  `count_evaluation_cost=True`. Evaluation budgets are charged against reported
+  logical evaluation cost, including inner local-search evaluations, rather than
+  only the number of returned records. Code that intentionally wants outer-record
+  counting must pass `count_evaluation_cost=False`.
+- Study execution now raises `EvaluationBudgetExhausted` instead of silently
+  assimilating a step whose reported evaluation cost exceeds the remaining hard
+  budget.
+
+### Added
+
+- Added `stop_at_checkpoint_boundary=True` for `Study.run(...)` and
+  `Study.optimize(...)` so CSA runs can return the latest checkpoint-safe state
+  when the budget ends inside an unsafe generation segment.
 
 ## [0.1.0] - 2026-06-15
 

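Note on the second breaking change: the sketch below illustrates what "raise instead of silently assimilating" means for a hard budget. It is a toy model, not the library's implementation. `ToyBudget`, `ToyBudgetExhausted`, and `charge` are hypothetical names introduced here; only `can_consume` mirrors a method name that appears later in this diff.

```python
class ToyBudgetExhausted(RuntimeError):
    """Hypothetical stand-in for EvaluationBudgetExhausted."""


class ToyBudget:
    """Hypothetical hard evaluation budget; not the variopt EvaluationBudget."""

    def __init__(self, max_evaluations: int) -> None:
        self.remaining = max_evaluations

    def can_consume(self, count: int) -> bool:
        # A step fits only if its full reported cost fits in what is left.
        return count <= self.remaining

    def charge(self, count: int) -> None:
        # Refuse the whole step rather than letting the budget go negative.
        if not self.can_consume(count):
            msg = f"step cost {count} exceeds remaining budget {self.remaining}"
            raise ToyBudgetExhausted(msg)
        self.remaining -= count


budget = ToyBudget(max_evaluations=10)
budget.charge(7)  # a step that reported 7 logical evaluations

try:
    budget.charge(5)  # would overrun the remaining 3 evaluations
    overrun_rejected = False
except ToyBudgetExhausted:
    overrun_rejected = True
```

Under this model the caller learns about the overrun through the exception, and the remaining budget is left untouched by the rejected step.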
diff --git a/docs/concepts/candidate-refinement.md b/docs/concepts/candidate-refinement.md
@@ -55,10 +55,11 @@ problem's semantic evaluation rule.
 Refinement metadata does not count evaluations by itself. Logical evaluation cost
 is carried by `EvaluationOutcome.evaluation_count`.
 
-`Study.optimize(..., count_evaluation_cost=True)` charges the reported
-`evaluation_count` instead of only counting returned records. This matters when
-a local-search kernel evaluates several inner candidates before returning one
-refined result.
+By default, `Study.optimize(...)` charges the reported `evaluation_count` instead
+of only counting returned records. This matters when a local-search kernel
+evaluates several inner candidates before returning one refined result. Set
+`count_evaluation_cost=False` only when you deliberately want outer-record
+counting.
 
 Terminal surfaces preserve provenance only as aligned metadata:
 

diff --git a/docs/guides/local-optimization-methods.md b/docs/guides/local-optimization-methods.md
@@ -90,10 +90,9 @@ Practical trade-off:
 - but it may consume many inner objective evaluations due to finite-difference
   gradient estimation
 
-If you care about actual objective-evaluation cost rather than just the number
-of outer proposals, enable cost-aware budgeting in
-[`Study.optimize(...)`](../reference/api/study.md)
-with `count_evaluation_cost=True`.
+`Study.optimize(...)` budgets by actual objective-evaluation cost by default, so
+SciPy inner evaluations are charged against `max_evaluations` rather than hidden
+behind one outer proposal.
 
 ### `Powell`
 
@@ -177,15 +176,17 @@ The kernel path reports that cost through
 [`EvaluationOutcome.evaluation_count`](../reference/api/variopt.md).
 `Study.optimize(...)` then offers two modes:
 
-- default: budget decreases by the number of returned observations
-- `count_evaluation_cost=True`: budget decreases by the sum of inner objective
-  evaluations reported by the kernel/evaluator path
+- default: budget decreases by the sum of objective evaluations reported by the
+  kernel/evaluator path
+- `count_evaluation_cost=False`: budget decreases by the number of returned
+  observations
 
 Practical guidance:
 
-- use default counting when you only care about outer search steps
-- use `count_evaluation_cost=True` when comparing methods with and without local
-  optimization, or when the kernel itself is expensive
+- keep the default when comparing methods with and without local optimization,
+  or when the objective itself is expensive
+- use `count_evaluation_cost=False` only when you deliberately want an outer-step
+  budget rather than an objective-cost budget
 
 If a custom kernel already computed the objective value, it should return both
 that value and the true `evaluation_count` so that `Study` can reuse the value
@@ -257,7 +258,7 @@ execution configuration, not as free throughput toggles.
 | All-discrete structured space that already has a justified staged neighborhood-widening story | `StructuredVariableNeighborhoodKernel(max_steps=..., stages=(...))` |
 | All-discrete structured space with a fixed stage sequence | `StructuredScheduledLocalSearchKernel(stages=(...))` |
 | Mixed real/integer/categorical space | no built-in generic mixed adapter yet; use a custom kernel, split the local search cleanly by domain, or skip local optimization |
-| Comparing methods with and without local optimization | enable `count_evaluation_cost=True` |
+| Comparing methods with and without local optimization | use the default objective-cost budget |
 | Batch-parallel evaluation with joblib | keep the kernel serial and let the evaluator own parallelism |
 | Early debugging or correctness validation | start with `SequentialEvaluator` |
 
@@ -269,7 +270,7 @@ If you are unsure, start here:
 2. If the space is continuous and local improvement is clearly valuable, try
    `L-BFGS-B`.
 3. If `L-BFGS-B` behaves poorly on a rough objective, switch to `Powell`.
-4. Turn on `count_evaluation_cost=True` before making any fairness claims about
+4. Keep the default objective-cost budgeting when making any fairness claims about
    efficiency.
 5. Add `JoblibEvaluator` only after the kernel itself is behaving well in
    sequential execution.

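Note on the two budgeting modes described in this guide: the following is a minimal sketch of how the charged cost differs between them, assuming each returned outcome reports an `evaluation_count`. `ToyOutcome` and `charged_cost` are hypothetical helpers written for this note and are not part of the variopt API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyOutcome:
    """Hypothetical stand-in for an outcome that reports its logical cost."""

    score: float
    evaluation_count: int


def charged_cost(outcomes, count_evaluation_cost=True):
    """Return how much budget a batch of outcomes would consume."""
    if count_evaluation_cost:
        # Objective-cost budgeting: sum the reported inner evaluations.
        return sum(outcome.evaluation_count for outcome in outcomes)
    # Outer-record budgeting: one unit per returned outcome.
    return len(outcomes)


outcomes = [
    ToyOutcome(score=1.0, evaluation_count=7),   # local search used 7 evaluations
    ToyOutcome(score=0.5, evaluation_count=12),  # local search used 12 evaluations
    ToyOutcome(score=2.0, evaluation_count=1),   # plain evaluation, no local search
]

cost_default = charged_cost(outcomes)
cost_outer = charged_cost(outcomes, count_evaluation_cost=False)
```

With these made-up numbers the same three outcomes cost 20 units under objective-cost budgeting and 3 under outer-record counting, which is why comparisons between methods with and without local optimization are only fair in the first mode.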
diff --git a/docs/reference/checkpointing.md b/docs/reference/checkpointing.md
@@ -41,8 +41,11 @@ study = Study(
     evaluator=SequentialEvaluator[int, int](),
 )
 
-# Run partway and save.
-result, state = study.optimize(max_evaluations=20)
+# Run partway to a checkpoint-safe boundary and save.
+result, state = study.optimize(
+    max_evaluations=20,
+    stop_at_checkpoint_boundary=True,
+)
 checkpoint = optimizer.state_to_dict(state)
 
 with open("checkpoint.json", "w") as f:

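Note on `stop_at_checkpoint_boundary=True`: the sketch below shows one way to picture "return the latest checkpoint-safe state when the budget ends inside an unsafe generation segment". It assumes, purely for illustration, that a state is checkpoint-safe only after a whole generation of fixed size has completed. `run_generations` and `generation_size` are hypothetical and do not describe how the optimizer actually tracks safe boundaries.

```python
def run_generations(max_evaluations, generation_size, stop_at_checkpoint_boundary):
    """Return the evaluation index of the state that would be handed back."""
    evaluations_done = 0
    last_safe_point = 0
    while evaluations_done < max_evaluations:
        evaluations_done += 1
        # Assumption: only a completed generation is safe to checkpoint.
        if evaluations_done % generation_size == 0:
            last_safe_point = evaluations_done
    if stop_at_checkpoint_boundary:
        # Roll back to the last completed generation.
        return last_safe_point
    # Otherwise hand back whatever state the budget ended on.
    return evaluations_done


safe_stop = run_generations(20, generation_size=8, stop_at_checkpoint_boundary=True)
raw_stop = run_generations(20, generation_size=8, stop_at_checkpoint_boundary=False)
```

In this toy setting a budget of 20 with generations of 8 ends mid-generation, so the boundary-aware call reports the state after 16 evaluations while the plain call reports the state after 20.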
diff --git a/src/variopt/__init__.py b/src/variopt/__init__.py
@@ -19,6 +19,8 @@
     SEQUENTIAL_EXECUTION_MODEL,
     STALE_ASYNC_EXECUTION_MODEL,
     SYNC_BATCH_EXECUTION_MODEL,
+    EvaluationBudget,
+    EvaluationBudgetExhausted,
     ExecutionAssimilationMode,
     ExecutionCompletionMode,
     ExecutionModel,
@@ -64,6 +66,8 @@
     "CategoricalSpace",
     "DiversityMetric",
     "EvaluationOutcome",
+    "EvaluationBudget",
+    "EvaluationBudgetExhausted",
     "EvaluationProtocol",
     "EvaluationRecord",
     "EvaluationRequest",

diff --git a/src/variopt/algorithms/local_search/scipy/kernel.py b/src/variopt/algorithms/local_search/scipy/kernel.py
@@ -14,6 +14,7 @@
     Proposal,
     ProposalEvaluationSpec,
 )
+from ....execution import EvaluationBudgetExhausted
 from ....kernel import (
     Kernel,
     KernelDiagnostics,
@@ -89,7 +90,8 @@ def _as_local_search_context(
 
 
 @dataclass(frozen=True, slots=True)
-class ScipyMinimizeKernel(FrozenGenericSlotsCompat,
+class ScipyMinimizeKernel(
+    FrozenGenericSlotsCompat,
     Kernel[
         ProposalBatchQuery[
             BoundaryT,
@@ -228,6 +230,7 @@ def _evaluate_proposal(
                     if proposal_evaluation_spec is None
                     else (proposal_evaluation_spec,)
                 ),
+                evaluation_budget=query.evaluation_budget,
             ),
         )
         if len(local_outcomes) != 1:
@@ -272,6 +275,7 @@ def _evaluate_candidate(
                     if proposal_evaluation_spec is None
                     else (proposal_evaluation_spec,)
                 ),
+                evaluation_budget=query.evaluation_budget,
             ),
         )
         if len(local_outcomes) != 1:
@@ -293,6 +297,7 @@ def _optimize_proposal(
             [ProposalBatchQuery[BoundaryT, ContinuousCandidateT]],
             tuple[EvaluationOutcome[ContinuousCandidateT], ...],
         ],
+        reserved_count: int,
     ) -> EvaluationOutcome[ContinuousCandidateT]:
         """Run one local descent episode for one original proposal."""
         context = self._proposal_context(query=query, proposal_index=proposal_index)
@@ -317,10 +322,46 @@ def _optimize_proposal(
             EvaluationOutcome[ContinuousCandidateT],
         ] = {}
 
+        def can_evaluate_local_candidate() -> bool:
+            budget = query.evaluation_budget
+            return budget is None or budget.can_consume(1 + reserved_count)
+
+        def budget_exhausted_outcome(
+            optimized_outcome: EvaluationOutcome[ContinuousCandidateT],
+        ) -> EvaluationOutcome[ContinuousCandidateT]:
+            optimized_candidate = optimized_outcome.record.candidate
+            refinement = _candidate_refinement_from_codec(
+                codec=codec,
+                source_candidate=proposal.candidate,
+                refined_candidate=optimized_candidate,
+            )
+            return EvaluationOutcome(
+                record=Observation(
+                    proposal=proposal,
+                    proposal_evaluation_spec=proposal_evaluation_spec,
+                    candidate=optimized_candidate,
+                    value=optimized_outcome.record.value,
+                    score=optimized_outcome.record.score,
+                    elapsed_seconds=optimized_outcome.record.elapsed_seconds,
+                ),
+                evaluation_count=evaluation_count,
+                kernel_diagnostics=KernelDiagnostics(
+                    backend="scipy.optimize.minimize",
+                    method=self.method,
+                    status=KernelStatus.STOPPED,
+                    message="evaluation budget exhausted before local convergence",
+                ),
+                refinement=refinement,
+                candidate_equal=query.problem.space.candidates_equal,
+            )
+
         def objective_in_coordinate_space(
             coordinates: Sequence[float],
         ) -> float:
             nonlocal evaluation_count
+            if not can_evaluate_local_candidate():
+                msg = "evaluation budget exhausted"
+                raise EvaluationBudgetExhausted(msg)
             coordinate_key = tuple(float(coordinate) for coordinate in coordinates)
             local_candidate = codec.candidate_from_coordinates(
                 proposal.candidate,
@@ -339,41 +380,71 @@ def objective_in_coordinate_space(
             evaluated_outcomes_by_coordinates[coordinate_key] = local_outcome
             return local_outcome.record.score
 
-        scipy_result = ScipyMinimizeResult.from_optimize_result(
-            run_scipy_minimize(
-                objective_in_coordinate_space=objective_in_coordinate_space,
-                initial_coordinates=initial_coordinates,
-                method=self.method,
-                coordinate_bounds=codec.coordinate_bounds,
-                tolerance=self.tolerance,
-                options=self._scipy_options(context=context),
+        try:
+            scipy_result = ScipyMinimizeResult.from_optimize_result(
+                run_scipy_minimize(
+                    objective_in_coordinate_space=objective_in_coordinate_space,
+                    initial_coordinates=initial_coordinates,
+                    method=self.method,
+                    coordinate_bounds=codec.coordinate_bounds,
+                    tolerance=self.tolerance,
+                    options=self._scipy_options(context=context),
+                )
             )
-        )
+        except EvaluationBudgetExhausted:
+            if len(evaluated_outcomes_by_coordinates) == 0:
+                raise
+            optimized_outcome = min(
+                evaluated_outcomes_by_coordinates.values(),
+                key=lambda outcome: outcome.record.score,
+            )
+            return budget_exhausted_outcome(optimized_outcome)
         if not scipy_result.has_finite_solution:
             original_outcome = evaluated_outcomes_by_coordinates.get(
                 initial_coordinates,
             )
             if original_outcome is None:
-                original_outcome = self._evaluate_proposal(
-                    query=query,
-                    proposal=proposal,
-                    proposal_evaluation_spec=proposal_evaluation_spec,
-                    runner=runner,
+                if (
+                    query.evaluation_budget is not None
+                    and not can_evaluate_local_candidate()
+                    and len(evaluated_outcomes_by_coordinates) > 0
+                ):
+                    original_outcome = min(
+                        evaluated_outcomes_by_coordinates.values(),
+                        key=lambda outcome: outcome.record.score,
+                    )
+                else:
+                    original_outcome = self._evaluate_proposal(
+                        query=query,
+                        proposal=proposal,
+                        proposal_evaluation_spec=proposal_evaluation_spec,
+                        runner=runner,
+                    )
+                    evaluation_count += original_outcome.evaluation_count
+
+            fallback_candidate = original_outcome.record.candidate
+            refinement = None
+            if not query.problem.space.candidates_equal(
+                proposal.candidate,
+                fallback_candidate,
+            ):
+                refinement = _candidate_refinement_from_codec(
+                    codec=codec,
+                    source_candidate=proposal.candidate,
+                    refined_candidate=fallback_candidate,
                 )
-                evaluation_count += original_outcome.evaluation_count
-
             return EvaluationOutcome(
                 record=Observation(
                     proposal=proposal,
                     proposal_evaluation_spec=proposal_evaluation_spec,
-                    candidate=proposal.candidate,
+                    candidate=fallback_candidate,
                     value=original_outcome.record.value,
                     score=original_outcome.record.score,
                     elapsed_seconds=original_outcome.record.elapsed_seconds,
                 ),
                 evaluation_count=evaluation_count,
                 kernel_diagnostics=scipy_result.diagnostics(method=self.method),
-                refinement=None,
+                refinement=refinement,
                 candidate_equal=query.problem.space.candidates_equal,
             )
 
@@ -382,15 +453,28 @@ def objective_in_coordinate_space(
             proposal.candidate,
             optimized_coordinates,
         )
-        optimized_outcome = evaluated_outcomes_by_coordinates.get(optimized_coordinates)
-        if optimized_outcome is None:
-            optimized_outcome = self._evaluate_candidate(
+        cached_optimized_outcome = evaluated_outcomes_by_coordinates.get(
+            optimized_coordinates,
+        )
+        if cached_optimized_outcome is None:
+            if (
+                query.evaluation_budget is not None
+                and not can_evaluate_local_candidate()
+                and len(evaluated_outcomes_by_coordinates) > 0
+            ):
+                best_seen_outcome = min(
+                    evaluated_outcomes_by_coordinates.values(),
+                    key=lambda outcome: outcome.record.score,
+                )
+                return budget_exhausted_outcome(best_seen_outcome)
+
+            cached_optimized_outcome = self._evaluate_candidate(
                 query=query,
                 candidate=optimized_candidate,
                 proposal_evaluation_spec=proposal_evaluation_spec,
                 runner=runner,
             )
-            evaluation_count += optimized_outcome.evaluation_count
+            evaluation_count += cached_optimized_outcome.evaluation_count
         refinement = _candidate_refinement_from_codec(
             codec=codec,
             source_candidate=proposal.candidate,
@@ -401,9 +485,9 @@ def objective_in_coordinate_space(
                 proposal=proposal,
                 proposal_evaluation_spec=proposal_evaluation_spec,
                 candidate=optimized_candidate,
-                value=optimized_outcome.record.value,
-                score=optimized_outcome.record.score,
-                elapsed_seconds=optimized_outcome.record.elapsed_seconds,
+                value=cached_optimized_outcome.record.value,
+                score=cached_optimized_outcome.record.score,
+                elapsed_seconds=cached_optimized_outcome.record.elapsed_seconds,
             ),
             evaluation_count=evaluation_count,
             kernel_diagnostics=scipy_result.diagnostics(method=self.method),
@@ -434,10 +518,13 @@ def run(
         tuple[EvaluationOutcome[ContinuousCandidateT], ...]
             Locally improved outcomes aligned to ``query.proposals``.
         """
-        prepared_codec: ContinuousStructuredSpaceCodec[
-            BoundaryT,
-            ContinuousCandidateT,
-        ] | None = None
+        prepared_codec: (
+            ContinuousStructuredSpaceCodec[
+                BoundaryT,
+                ContinuousCandidateT,
+            ]
+            | None
+        ) = None
 
         def codec_provider() -> ContinuousStructuredSpaceCodec[
             BoundaryT,
@@ -458,6 +545,7 @@ def codec_provider() -> ContinuousStructuredSpaceCodec[
                 proposal=proposal,
                 codec_provider=codec_provider,
                 runner=runner,
+                reserved_count=len(query.proposals) - proposal_index - 1,
             )
             for proposal_index, proposal in enumerate(query.proposals)
         )
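Note on `reserved_count`: as far as this hunk shows, each proposal's local search is allowed another inner evaluation only while the budget can still cover that evaluation plus one evaluation for every later proposal in the batch. The sketch below reproduces that rule with a plain integer budget. It assumes the budget is decremented immediately after each inner evaluation, which this diff does not confirm, and `allocate_local_search_evaluations` is a hypothetical helper.

```python
def allocate_local_search_evaluations(max_evaluations, wanted_per_proposal):
    """Return how many inner evaluations each proposal gets under the reserve rule."""
    remaining = max_evaluations
    spent_per_proposal = []
    for proposal_index, wanted in enumerate(wanted_per_proposal):
        # Keep one evaluation in reserve for every proposal still to come.
        reserved_count = len(wanted_per_proposal) - proposal_index - 1
        spent = 0
        # Mirrors budget.can_consume(1 + reserved_count) from the diff.
        while spent < wanted and 1 + reserved_count <= remaining:
            remaining -= 1  # assumption: cost is charged per inner evaluation
            spent += 1
        spent_per_proposal.append(spent)
    return spent_per_proposal


# Three proposals, each local search would like up to 8 evaluations.
usage = allocate_local_search_evaluations(10, [8, 8, 8])
```

The first proposal can run its full local search, and the reserve still leaves each later proposal one evaluation, so no proposal in the batch is starved by an earlier one.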