Implementation plan: Thierry-Biegler ℓ₁-exact penalty-barrier wrapper + MPCC paper reproduction

## Goal

Port the Thierry & Biegler (2020) ℓ₁-exact penalty-barrier wrapper from ripopt#23 / commit `7847bba9` into pounce as `pounce-l1penalty`, and ship the **MPCC paper reproduction** that ripopt explicitly deferred (their probe was on the adjacent CUTEst NE-suffix slice). End state: a default-off opt-in wrapper plus an opt-in auto-fallback, validated on (a) pounce's own CUTEst sweep, (b) the same 42-problem NE-suffix probe ripopt ran, and (c) the MPCC benchmarks Thierry & Biegler 2020 report.

## Background

See pounce#5 for the algorithmic background. Algorithmic reference: [`L1PenaltyBarrierNlp` in ripopt#23 commit `7847bba9`](https://github.com/jkitchin/ripopt/commit/7847bba9), 550 LoC wrapper. The ripopt closing comment on issue #23 documents 2 hard rescues (`DENSCHNENE`, `HATFLDFLNE`), 26 honest-infeasibility upgrades (`RestorationFailed` / `MaxIterations` → `LocalInfeasibility`, often turning 3000-iter budget exhausts into 50–200-iter certificates), and 0 regressions on a 42-problem CUTEst NE-suffix probe; flag-off byte-identical on a 100-problem comparison (99/100; sole diff is wall-time non-determinism on `CRESC132`).

## Architecture decisions

**Crate placement.** New crate `crates/pounce-l1penalty/`. Depends on `pounce-common`, `pounce-nlp`. Used by `pounce-algorithm` (for the IPM driver hook) and `pounce-cli` (for option wiring). Mirrors how `pounce-restoration` is structured.

**Wrapper level.** Wrap at the `TNLP` level — `crates/pounce-l1penalty/src/wrapper.rs` exposes `L1PenaltyBarrierTnlp<'a>(inner: Rc<RefCell<dyn TNLP>>, opts)` implementing `TNLP`. Rationale: the augmented NLP feeds straight into the existing `TNLPAdapter → OrigIpoptNlp` chain unchanged, which is the cleanest separation. ripopt wraps at `NlpProblem` (its analog), not at the lower algorithm-side `Nlp` trait — same call here.

**Variable layout.** `[x(n_orig), p(m_eq), n(m_eq)]`, total `n_orig + 2·m_eq` variables. Constraints unchanged in row order.

**Detection of equality rows.** `g_l[i] == g_u[i]`. Cached in `eq_rows: Vec<usize>` at construction time.

## Phased plan

### Phase 0 — Audit & baseline (1 day, no code shipped)

**Goal.** Quantify how much of pounce's current failure tail this targets, before writing any wrapper code.

**Tasks.**
1. Walk `benchmarks/cutest/results_feral_v030fma_2026-05-13.json`, classify pounce's non-`Solve_Succeeded` outcomes by status. Extract the `RestorationFailed`, `LocalInfeasibility`, `Acceptable`, `MaxIterations` problems — these are the auto-fallback candidates.
2. Walk the same set, find the subset where `m_eq > 0` (equality rows present). Augment with rank deficiency hints: rerun those problems with `print_level=8` and grep for "perturbation" / "Wrong Inertia" patterns. Those are the wrapper's primary targets.
3. Pull the 42 problems on ripopt's `benchmarks/cutest/l1_rescue/problems.txt` — record pounce's current outcome for each so we can compare like-for-like.
4. Pull MPCC benchmark sources (see Phase 4 below) and confirm we have the test problems offline.

**Deliverable.** `benchmarks/cutest/l1_audit/REPORT.md` with: pounce-side rescue-candidate count, ripopt 42-problem comparison table (current pounce status), Phase 4 MPCC corpus inventory.

**Stop-gate.** If fewer than ~5 pounce problems are rescue-candidates, downgrade the priority and revisit. (ripopt found 28 of 42 actionable; we expect a similar ratio on our corpus given shared algorithmic lineage.)

### Phase 1 — TNLP wrapper, fixed ρ, default-off byte-identical (3–5 days)

**Goal.** Land the wrapper and the option wiring; prove it doesn't change any solve trajectory when the flag is off.

**Tasks.**
1. New crate `crates/pounce-l1penalty/`. Cargo.toml + README + workspace registration.
2. `src/wrapper.rs`: `L1PenaltyBarrierTnlp<'a>` implementing the `TNLP` trait. Closely follows ripopt's `L1PenaltyBarrierNlp` — algorithmic body is essentially a translation. Key changes from ripopt: pounce uses `Rc<RefCell<dyn TNLP>>` for the inner pointer; sparsity is reported via `SparsityRequest<'_>` (pounce's enum) rather than separate row/col returns.
3. `src/options.rs`: register `l1_exact_penalty_barrier` (bool, default false), `l1_penalty_init` (f64, default 1.0). Plumbed into `pounce-cli`'s option-table loader.
4. Wire the wrapper into `pounce-algorithm/src/application.rs::optimize_tnlp`: when the option is set, replace the user TNLP with `Rc::new(RefCell::new(L1PenaltyBarrierTnlp::new(user_tnlp, opts)))` *before* the existing chain. No back-projection yet — Phase 2.
5. Tests in `crates/pounce-l1penalty/tests/wrapper.rs` (~10 unit tests, mirroring ripopt's `tests/l1_penalty_barrier.rs::wrapper_*`):
   - Variable-layout invariants (n_orig + 2·m_eq, constraint count unchanged).
   - Equality-row detection on `g_l[i] == g_u[i]`.
   - `eval_f` augments by `ρ·Σ(p+n)` correctly.
   - `eval_g` rewrites equality rows to `c_i(x) − p_i + n_i = g_i`.
   - `eval_jac_g` includes the `+/-I` blocks for slacks.
   - `eval_h` does **not** add Hessian contribution from the linear penalty term (slacks have zero Hessian).
   - Bounds-info lifts `(x_L, x_U)` and adds `(0, +∞)` for `(p, n)`.
   - Starting point lifts `x_0` and seeds `(p, n)` from `max(0, ±c(x_0))`.
6. Flag-off regression: `benchmarks/cutest/l1_audit/regression_pre_l1.jsonl` (current run) vs `regression_post_l1_flag_off.jsonl` (post-Phase-1, flag off). Acceptance: ≥99/100 byte-identical (matches ripopt's threshold).

**Deliverable.** Crate compiles, tests pass, flag-off byte-identical on 100-problem CUTEst probe. Flag-on does not yet back-project the solution — that's Phase 2.

### Phase 2 — Augmented IPM driver + back-projection (2–3 days)

**Goal.** Reported `(x*, λ*)` and reported objective are the **original** problem's, not the augmented problem's.

**Tasks.**
1. In `pounce-algorithm/src/application.rs::optimize_tnlp`, after the augmented IPM solve completes:
   - Truncate `x` from length `n_orig + 2·m_eq` back to `n_orig`.
   - Re-evaluate `f(x*)` on the user TNLP (strip the `ρ·Σ(p+n)` term from the reported objective).
   - Re-evaluate `c(x*)` on the user TNLP (strip the slack contribution from constraint values).
   - Map the equality-row multipliers back: `λ_eq[i]` is the same scalar in both spaces (the slack rows don't introduce new dual variables, only new primals).
2. `Solution<'_>` callback: ensure the user receives original-space `(x*, λ*, z_L*, z_U*)`, not augmented-space.
3. Tests:
   - `flag_on_objective_excludes_penalty_term` — solve a small problem, assert reported `obj == f(x*)` not `f(x*) + ρ·Σ(p+n)`.
   - `flag_on_solution_x_truncated` — assert `x.len() == n_orig`.
   - `flag_on_lambda_eq_passes_through` — solve a problem with one equality, assert `λ_eq` matches a known reference within 1e-8.
4. Probe one of the audit's rescue candidates: solve with flag on, fixed ρ = 1e3, confirm convergence on at least one problem that fails with flag off.

**Deliverable.** End-to-end solve in augmented space with original-space reporting. At least one audit candidate transitions failure → success at fixed ρ.

### Phase 3 — Byrd–Nocedal–Waltz dynamic ρ + honest-infeasibility upgrade (3–4 days)

**Goal.** Replace the fixed ρ with the BNW steering rule. Detect non-collapsing slacks and upgrade `Optimal` / `Acceptable` → `LocalInfeasibility` when the original problem is genuinely infeasible at the ℓ₁ stationary point.

**Tasks.**
1. Add an outer ρ-loop in `application.rs` around the inner IPM solve. After each inner solve:
   - Read the equality-row multipliers `y_eq` from the `Solution<'_>` callback.
   - Compute `ρ_new = max(ρ · l1_penalty_increase_factor, l1_steering_factor · ‖y_eq‖∞ + ε)`, capped at `l1_penalty_max`.
   - Stop when ρ stops increasing (saturated) or `Σ(p+n) < l1_slack_tol` (slacks collapsed).
2. Honest-infeasibility upgrade: if the inner solve returned `Solve_Succeeded` or `Solved_To_Acceptable_Level` but `Σ(p+n) ≥ l1_slack_tol`, override `ApplicationReturnStatus` to `Infeasible_Problem_Detected`. The point is BNW ℓ₁-stationary but the original problem has no feasible solution at this iterate.
3. New options: `l1_penalty_max`, `l1_penalty_increase_factor`, `l1_penalty_max_outer_iter`, `l1_slack_tol`, `l1_steering_factor` — defaults match ripopt's choices.
4. Tests:
   - `bnw_rho_escalates_with_lambda` — solve a problem where the equality multipliers grow, assert ρ tracks `τ·‖y_eq‖∞`.
   - `infeasible_certificate_burke_han` — minimize subject to `x₁²+x₂²=0` with `x₁ ≥ 1` (Burke-Han classic infeasible). Assert status → `Infeasible_Problem_Detected` with collapsed slack mass corresponding to the ℓ₁-best infeasibility distance.
   - `wachter_biegler_anti_case` — `x₁ = 1, x₁² − x₂² = 0`. Assert flag-on hurts (this is the documented anti-case from ripopt). Pin the behavior so we know if upstream feral / IPM changes affect it.
5. Re-run audit's 42-problem set: count rescues + honest-infeasibility upgrades. Acceptance: ≥1 hard rescue, ≥10 honest-infeasibility upgrades on our corpus (ripopt got 2 + 26).

**Deliverable.** `benchmarks/cutest/l1_audit/REPORT.md` updated with Phase-3 numbers; tests green; comparison table vs ripopt.

### Phase 3.5 — Auto-fallback option (1–2 days)

**Goal.** Make the wrapper invisible-but-effective for users who don't know to opt in.

**Tasks.**
1. New option `l1_fallback_on_restoration_failure: bool` (default `false`). When set, an `optimize_tnlp` call ending in `Restoration_Failed` / `Infeasible_Problem_Detected` / `Solved_To_Acceptable_Level` triggers an automatic retry with `l1_exact_penalty_barrier = true`.
2. Promotion rule: only replace the original result if the retry returns `Solve_Succeeded`. Otherwise return the original result with the retry's iteration count folded into `SolveStatistics`.
3. Tests:
   - `auto_fallback_promotes_resto_failure_to_optimal` — pick one of the Phase-3 hard rescues; with vanilla flag set, assert promotion fires and we get `Solve_Succeeded`.
   - `auto_fallback_keeps_original_when_retry_also_fails` — Wächter-Biegler counterexample; vanilla path returns `Solve_Succeeded` so fallback should not fire. Pick a problem where vanilla returns `Restoration_Failed` and the wrapper also fails; assert original status is preserved.
4. Document the option in `crates/pounce-cli/README.md` and the workspace README.

**Deliverable.** `cargo test --workspace` green; auto-fallback option documented.

### Phase 4 — MPCC paper reproduction (1–2 weeks, the deliverable ripopt deferred)

**Goal.** Reproduce the specific MPCC benchmarks Thierry & Biegler 2020 report, comparing pounce-flag-on against pounce-flag-off and against the paper's reported numbers. This is the empirical case for the paper's central claim that the wrapper handles MPCC-class degeneracy where stock IPM thrashes.

**Tasks.**
1. **Acquire MPCC benchmark sources.** Two candidate corpora:
   - **MacMPEC** ([Sven Leyffer's collection](https://wiki.mcs.anl.gov/leyffer/index.php/MacMPEC)) — 145 AMPL `.mod` files of MPCC test problems. The community standard.
   - **MPECLib** (Ferris et al.). Older but cited by the paper.
   The Thierry-Biegler 2020 paper reports on a curated subset of MacMPEC. If we can find the exact subset they used (their Table 1 names), reproduce on those; otherwise reproduce on the full MacMPEC corpus and report both shapes.
2. **Translate to `.nl`** via the AMPL CE binary already used in `benchmarks/mittelmann/Makefile`. New harness `benchmarks/mpcc/`:
   - `Makefile` mirroring `benchmarks/mittelmann/Makefile` (fetch / translate / run-pounce / run-ipopt / report).
   - `problems.txt` listing the corpus.
   - `run_solver.sh` parametrized like the Mittelmann version.
3. **Run four configurations** on the corpus:
   - pounce, flag off (baseline)
   - pounce, `l1_exact_penalty_barrier = true`
   - pounce, `l1_fallback_on_restoration_failure = true`
   - ipopt 3.14 + MA57 (reference; the paper compares to this)
4. **Comparison report** `benchmarks/mpcc/REPORT.md`:
   - Per-problem table: status, iters, wall, objective for all four configurations.
   - Aggregate: solve-rate, iter geomean, wall geomean.
   - Per-problem comparison vs the paper's Table 1 numbers where available.
   - Categorical breakdown: NLP-degenerate vs MPCC-strict-complementarity-violating vs other.
5. **Acceptance criterion.** The wrapper improves solve-rate on the MacMPEC corpus by ≥X% over flag-off pounce, where X is calibrated against the paper's reported delta (read the paper, set the threshold accordingly — placeholder ≥10% pending paper read).
6. **Stretch: write up findings.** Section in `benchmarks/mpcc/REPORT.md` summarizing where the wrapper helps, where it doesn't, and how our numbers compare to the paper. This is the MPCC reproduction the field is missing publicly.

**Deliverable.** `benchmarks/mpcc/` harness + report + acceptance criterion met.

**Open questions for Phase 4 (resolve in Phase 0 audit).**
- Does Thierry-Biegler 2020 list the exact MacMPEC subset in their paper? (PDF links in pounce#5.) If not, default to the full MacMPEC corpus.
- AMPL CE may not translate every MacMPEC `.mod` file — confirm coverage before locking the harness in.
- The paper's per-step BNW variant vs our per-outer BNW: if the gap is large we may need to revisit Phase 3 implementation choice.

### Phase 5 — Documentation, options reference, release (1–2 days)

1. Per-crate README for `pounce-l1penalty`.
2. New section in workspace README: "Degenerate / MPCC NLPs" with worked example.
3. Option reference: extend `crates/pounce-algorithm/src/upstream_options.rs` registration list.
4. Note in `crates/pounce-cli/README.md` and `pounce --help` covering `l1_exact_penalty_barrier` and `l1_fallback_on_restoration_failure`.
5. Reference the MPCC reproduction report from the new section.
6. Cross-link this issue's eventual closure comment from pounce#5 and from `benchmarks/mpcc/REPORT.md`.

## Cumulative timeline estimate

| Phase | Effort | Cumulative |
|-------|-------:|-----------:|
| 0 audit              |   1 d   |   1 d   |
| 1 wrapper + flag-off |  3–5 d  |  6 d    |
| 2 driver + back-proj |  2–3 d  |  9 d    |
| 3 BNW + infeasibility|  3–4 d  | 13 d    |
| 3.5 auto-fallback    |  1–2 d  | 15 d    |
| 4 MPCC reproduction  | 7–14 d  | 29 d    |
| 5 docs + release     |  1–2 d  | 31 d    |

≈4 weeks engineering for the full deliverable. Phase 4 dominates and is the part that can be deferred or split off if calendar pressure demands.

## Acceptance criteria (whole-issue)

- [ ] `cargo test --workspace` green with `pounce-l1penalty` included.
- [ ] Flag-off byte-identical on 100-problem CUTEst probe (≥99/100, mirroring ripopt's threshold).
- [ ] On the 42-problem ripopt l1_rescue set: ≥1 hard rescue, ≥10 honest-infeasibility upgrades.
- [ ] On the MacMPEC corpus: solve-rate improvement vs flag-off pounce calibrated to the paper.
- [ ] Auto-fallback option promotes at least one currently-failing CUTEst problem to `Solve_Succeeded` without regressing any `Solve_Succeeded` problem.
- [ ] `pounce-l1penalty` README, workspace README section, options reference all published.

## Out of scope (file separately if pursued)

- Per-step BNW (the paper's variant). v1 ships per-outer-iter BNW per ripopt's design choice.
- Replacing pounce's restoration phase with the wrapper. Restoration stays as today; the wrapper is an opt-in alternative for the user.
- Sparse-aware optimization of the slack-block Jacobian assembly — the slack blocks are `±I`, so a dense expansion is fine for the volume of problems where the wrapper helps; revisit if `2·m_eq` grows past 100k for a real user.

## Cross-link

- pounce#5 — research / decision (lead-in to this implementation issue).
- ripopt#23 — algorithmic reference; commit `7847bba9` is the canonical port source.
- Thierry & Biegler 2020 IFAC paper — primary algorithmic reference for the MPCC reproduction in Phase 4.
- pounce#9 — restoration-phase TRON (independent; does not block this).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation plan: Thierry-Biegler ℓ₁-exact penalty-barrier wrapper + MPCC paper reproduction #10

Goal

Background

Architecture decisions

Phased plan

Phase 0 — Audit & baseline (1 day, no code shipped)

Phase 1 — TNLP wrapper, fixed ρ, default-off byte-identical (3–5 days)

Phase 2 — Augmented IPM driver + back-projection (2–3 days)

Phase 3 — Byrd–Nocedal–Waltz dynamic ρ + honest-infeasibility upgrade (3–4 days)

Phase 3.5 — Auto-fallback option (1–2 days)

Phase 4 — MPCC paper reproduction (1–2 weeks, the deliverable ripopt deferred)

Phase 5 — Documentation, options reference, release (1–2 days)

Cumulative timeline estimate

Acceptance criteria (whole-issue)

Out of scope (file separately if pursued)

Cross-link

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Phase	Effort	Cumulative
0 audit	1 d	1 d
1 wrapper + flag-off	3–5 d	6 d
2 driver + back-proj	2–3 d	9 d
3 BNW + infeasibility	3–4 d	13 d
3.5 auto-fallback	1–2 d	15 d
4 MPCC reproduction	7–14 d	29 d
5 docs + release	1–2 d	31 d

Implementation plan: Thierry-Biegler ℓ₁-exact penalty-barrier wrapper + MPCC paper reproduction #10

Description

Goal

Background

Architecture decisions

Phased plan

Phase 0 — Audit & baseline (1 day, no code shipped)

Phase 1 — TNLP wrapper, fixed ρ, default-off byte-identical (3–5 days)

Phase 2 — Augmented IPM driver + back-projection (2–3 days)

Phase 3 — Byrd–Nocedal–Waltz dynamic ρ + honest-infeasibility upgrade (3–4 days)

Phase 3.5 — Auto-fallback option (1–2 days)

Phase 4 — MPCC paper reproduction (1–2 weeks, the deliverable ripopt deferred)

Phase 5 — Documentation, options reference, release (1–2 days)

Cumulative timeline estimate

Acceptance criteria (whole-issue)

Out of scope (file separately if pursued)

Cross-link

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions