HugoMachadoRodrigues · Jun 12, 2026
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -898,6 +898,7 @@ This benchmark is the empirical core of the paper (candidates: *SoftwareX*, *Geo
 | **v0.9.2.B** | **Specifier infrastructure** (Ano- / Epi- / Endo- / Bathy- / Panto- via prefix dispatch in the resolver). `.detect_specifier()` recognises the prefix, `.apply_specifier()` calls the base `qual_*` and filters layers by depth band. No need to define one function per (specifier × base) -- the system is generic. CH gains Endogleyic / Endostagnic / Endocalcic in canonical positions. Specifiers Kato- / Poly- / Supra- / Thapto- / Amphi- deferred to v0.9.3 (require buried-horizon flags / chains of designations). Tests: +55 expectations. | **shipped** |
 | **v0.9.2.C** | **v0.3.x diagnostic corrections** -- false-positive reduction. **cambic** gains a depth gate (`min_top_cm = 5`) and a structural-development gate (`structure_grade ∈ {weak, moderate, strong}` AND `structure_type ∉ {massive, single grain}`); A horizons and massive-C no longer pass. **plaggic** gains an anthropogenic-evidence gate directly in the diagnostic (P >= 50 mg/kg OR artefacts > 0 OR designation Apl/Aplg/Apk); the v0.9.1 gate in `qual_plaggic` was removed (now direct delegation). **sombric** gains a humus-illuviation gate (the candidate layer must have OC ≥ OC_layer_above + 0.1%); the v0.3.3 permissiveness is eliminated. Resulting canonical classification change: DU → "Duric Skeletic Durisol" (loses Cambic from the massive BC1). FR (Latossolo) and the other 30 fixtures unchanged. Tests: +43 expectations. | **shipped** |
 | **v0.9.3.A** | **Remaining specifiers (Kato/Amphi/Poly/Supra/Thapto) + supplementary engine**. Refactor of `.wrb_specifiers` to support two `kind`s -- `depth` (simple depth band; reuses the v0.9.2.B path) and `filter` (custom function). Helpers: `.kato_filter` (top_cm >= 50), `.amphi_filter` (Epi AND Endo), `.poly_filter` (>= 2 disjoint runs), `.supra_filter` (above a barrier: continuous_rock / petric / technic_hard), `.thapto_filter` (designation ending in `b`). Engine: `resolve_wrb_qualifiers` now also processes the `supplementary:` slot of the YAML, returning `principal` + `supplementary` with families suppressed in both. `classify_wrb2022` renders the full WRB Ch 6 name with parenthesised tags. Tests: +66 expectations. | **shipped** |
+| **v0.9.112** | **"An argic horizon is never a Regosol" (accuracy front B2, engine).** The honest B1 benchmark exposed a correctness bug in the WRB key: a profile with a confirmed argic (clay-illuvial B) horizon dropped to the Regosol catch-all when the eutric/alic split (BS/Al-sat) was unmeasured (`luvisol()` returned NA → key skips → Regosol). Fix in `luvisol()` (R/diagnostics-rsg-argic-derived.R): a graceful Al-sat default mirroring the Acrisol BS-fallback. When `argic()` passes, CEC/clay ≥ 24, and Al-sat is unmeasured on a **B master horizon**, the profile defaults to Luvisol (the generic high-activity argic; Alisol needs positive Al-sat ≥ 50). Fires only on `is.na()` (a measured Luvisol/Alisol is never overridden); the B-horizon guard excludes a Fluvisol's stratified C-layer clay jump; `al_sat_pct` stays in `missing_data` so the assumption is transparent. **FEBR-WRB +9 Luvisols (Regosol→Luvisol), 0 regressions (17.8% → 21.9% order accuracy); all 44 canonical fixtures byte-identical.** Scope note: the dominant FEBR-WRB ceiling is missing data (most argic-RSG pedons lack measured clay), not the discriminator the audit imagined, so this is a targeted correctness fix. | **shipped** |
 | **v0.9.110** | **Benchmark methodology (accuracy front B1).** Harness-only, engine unchanged. (1) **Sampling fix**: `.benchmark_one_dataset_one_system()` now filters each dataset to pedons carrying the requested system's reference label (`.benchmark_has_reference`) BEFORE the `max_n` cap (`.benchmark_filter_then_cap`); FEBR loads with `require_classification="any"`. Fixes the cap-before-filter bug that starved sparse labels (FEBR-USDA n=3 → hundreds). FEBR + BDsolos branches; KSSL/LUCAS/Redape documented as-is. (2) **Metrics**: `.benchmark_metrics_from_confusion()` (NIR majority baseline, balanced accuracy, macro-F1, Cohen's kappa, per-class P/R/F1) + `.benchmark_bootstrap_metrics()` (seed-42, RNG-preserving 95% CIs); attached to `pool_one()` and every report row via a uniform `.suite_row()`. (3) **Honest report**: new columns + `n<30` flag; LUCAS WRB labelled topsoil-only lower-bound (honest WRB rests on offline FEBR + AfSP, the latter now carrying the full metric set); SoilGrids subsoil-fill documented as opt-in. The accuracy-raising B2 (argic/ferralic/nitic discriminator, which moves fixtures) is a separate follow-up. | **shipped** |
 | **v0.9.109** | **CRAN release hardening.** Documentation-only; engine byte-identical. `R CMD check --as-cran` flagged 545 exported function topics missing `\value` (CI never ran `--as-cran`). ~600 atomic engine predicates (`qual_*`, `*_usda` gates, `carater_*`/`horizonte_*`) marked `@keywords internal` (still exported/callable, out of the public reference) → documented API ~910→~195; the 85 genuinely-public no-value topics gained `@return`. Runnable `@examples` on the entry points; `_pkgdown.yml` gains a `has_keyword("internal")` section; CI runs explicit `--as-cran` + `check_pkgdown()`; dead `SOILKEY_SKIP_*` vars removed; `LazyDataCompression: xz`; lifecycle → maturing. Result: `--as-cran` 0/0/0; suite 5038/0. | **shipped** |
 | **v0.9.108** | **Pro app polish (front 3 of 3).** A UX pass on `classify_app_pro/`: a soil-science `bs_theme()` palette + slim `www/soilkey.css` (warmer cards, navbar wordmark, CSS-only busy spinner); a global **pedon ribbon** (`page_navbar(header=)`, rendered from `rv$pedon`) and a **"Getting started" modal** with a one-click **Load example & classify** that builds the canonical Ferralsol through the real Pedon flow (`rv$example_request` → `mod_pedon` observer). New visualisations: a Vis-NIR spectrum plot (`pro_spectrum_plot()`, one trace per horizon) in **Spectra** and an uploaded-photo preview + VLM-confidence badge in **Photo**. lat/lon range validation in `mod_pedon`; the **USDA-family** / **WRB-specifier** toggles surfaced in the Classify sidebar, two-way-synced with Settings through a shared-`rv` single source of truth. **Package change (additive):** `report()` / `report_html()` / `report_pdf()` gain `include_family` / `specifiers` (forwarded to `classify_usda()` / `classify_wrb2022()` when a `PedonRecord` is passed); both default `FALSE` → report output byte-identical (regression test). No new dependencies. | **shipped** |

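The cap-before-filter problem described in the v0.9.110 row can be illustrated with a minimal sketch. It assumes a toy data frame in which only a few pedons carry a reference label; `cap_then_filter()` and `filter_then_cap()` are hypothetical helpers written for this note, not the package's `.benchmark_filter_then_cap()`.

```r
# Toy dataset (an assumption for illustration): 1000 pedons, of which only
# every 100th carries a reference label for the requested system.
pedons <- data.frame(id = 1:1000)
pedons$has_label <- pedons$id %% 100 == 0

# Old order: cap first, then keep labelled pedons. Sparse labels are starved.
cap_then_filter <- function(d, max_n) {
  capped <- head(d, max_n)
  capped[capped$has_label, , drop = FALSE]
}

# New order: keep labelled pedons first, then apply the cap.
filter_then_cap <- function(d, max_n) {
  labelled <- d[d$has_label, , drop = FALSE]
  head(labelled, max_n)
}

n_old <- nrow(cap_then_filter(pedons, max_n = 200))
n_new <- nrow(filter_then_cap(pedons, max_n = 200))
```

With these toy numbers the old order keeps 2 labelled pedons and the new order keeps all 10, which is the same kind of starvation the row reports for FEBR-USDA.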
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: soilKey
 Type: Package
 Title: Automated Soil Profile Classification per WRB 2022, SiBCS 5 and USDA Soil Taxonomy 13
-Version: 0.9.110
+Version: 0.9.112
 Date: 2026-06-11
 Authors@R:
     person("Hugo", "Rodrigues",

diff --git a/NEWS.md b/NEWS.md
@@ -1,3 +1,44 @@
+# soilKey 0.9.112 (2026-06-11)
+
+The "**an argic horizon is never a Regosol**" release (accuracy front B2,
+engine). The honest B1 benchmark exposed a correctness bug in the WRB key:
+a profile with a CONFIRMED argic (clay-illuvial B) horizon could drop to the
+Regosol catch-all -- the gate for soils with NO diagnostic subsurface horizon
+-- purely because the eutric/alic split (base saturation / Al-saturation) was
+unmeasured, leaving the Luvisol gate at \code{NA}.
+
+## The fix (surgical, in the key)
+
+\itemize{
+  \item \code{luvisol()} (R/diagnostics-rsg-argic-derived.R) gains a graceful
+        Al-saturation default, mirroring the Acrisol BS-fallback: when
+        \code{argic()} passes, the clay is high-activity (CEC/clay >= 24), and
+        Al-saturation is \strong{unmeasured} on a \strong{B master horizon},
+        the profile defaults to \strong{Luvisol} (the generic high-activity
+        argic RSG; Alisol is the high-Al special case that requires positive
+        Al-sat >= 50 evidence). It fires only on \code{is.na()}, so a measured
+        Luvisol (Al-sat < 50) or Alisol (Al-sat >= 50) is never overridden, and
+        a B-horizon guard keeps it off a Fluvisol's stratified C-layer clay
+        jump (a sedimentary, not pedogenic, increase). \code{al_sat_pct} stays
+        in the result's \code{missing_data}, and Alisol surfaces as an
+        ambiguity, so the assumption is transparent.
+}
+
+## Impact
+
+\itemize{
+  \item Measured on the FEBR WRB benchmark: \strong{+9 Luvisols recovered
+        (Regosol -> Luvisol), 0 regressions} (17.8\% -> 21.9\% order accuracy).
+        All \strong{44 canonical fixtures classify byte-identically} (the
+        fallback only fires on missing data, which the fixtures never have).
+  \item Scope note from the B1 measurement: the dominant FEBR-WRB ceiling is
+        \emph{missing data} (most argic-RSG reference pedons carry no measured
+        clay at all), which no key change can address -- so this is a targeted
+        correctness fix, not the broad "discriminator" the earlier audit
+        imagined.
+}
+
+
 # soilKey 0.9.110 (2026-06-11)
 
 The "**benchmark methodology**" release (front B1 of the accuracy work). A

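The fall-through described in the 0.9.112 notes can be sketched with three-valued gates. This is a minimal illustration, assuming a key that assigns the first gate returning `TRUE` and skips both `FALSE` and `NA`; `walk_key()`, `luvisol_gate()`, and the list-based toy profiles are hypothetical names, not soilKey's API, and the toy key has only two gates before the catch-all.

```r
# Gate outcomes: TRUE = assign, FALSE = rejected, NA = undeterminable.
# All names below are illustrative assumptions, not soilKey functions.
luvisol_gate <- function(p, fallback = FALSE) {
  if (!isTRUE(p$argic) || !isTRUE(p$cec_high)) return(FALSE)
  if (!is.na(p$al_sat)) return(p$al_sat < 50)       # measured: never overridden
  if (fallback && isTRUE(p$b_master)) return(TRUE)  # unmeasured: default Luvisol
  NA                                                # unmeasured, no default
}

walk_key <- function(p, fallback = FALSE) {
  gates <- list(
    # Alisol needs positive evidence; with al_sat = NA this evaluates to NA.
    Alisols  = function(p) isTRUE(p$argic) && isTRUE(p$cec_high) && p$al_sat >= 50,
    Luvisols = function(p) luvisol_gate(p, fallback)
  )
  for (rsg in names(gates)) {
    if (isTRUE(gates[[rsg]](p))) return(rsg)  # FALSE and NA both skip
  }
  "Regosols"                                  # catch-all
}

unmeasured <- list(argic = TRUE, cec_high = TRUE, al_sat = NA_real_, b_master = TRUE)
alic       <- list(argic = TRUE, cec_high = TRUE, al_sat = 60,       b_master = TRUE)

before <- walk_key(unmeasured)                   # no fallback
after  <- walk_key(unmeasured, fallback = TRUE)  # with the default
```

In this toy key the unmeasured profile lands on "Regosols" without the fallback and on "Luvisols" with it, while the measured high-Al profile stays "Alisols" either way.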
diff --git a/R/diagnostics-rsg-argic-derived.R b/R/diagnostics-rsg-argic-derived.R
@@ -189,6 +189,35 @@ luvisol <- function(pedon, min_cec = 24, max_al_sat = 50) {
                                                   max_pct = max_al_sat,
                                                   candidate_layers = layers)
 
+  # v0.9.111: graceful Al-saturation fallback. A confirmed argic horizon with
+  # high-activity clay (CEC/clay >= 24) but NO Al-saturation measurement is a
+  # Luvisol by default -- Alisol is the special high-Al case that requires
+  # POSITIVE al_sat >= 50 evidence. Without this, an undeterminable eutric/alic
+  # split leaves al_sat_low = NA, the gate returns NA, and the profile drops to
+  # the Regosol catch-all -- but an argic horizon is never a Regosol. Fires only
+  # on is.na(): a measured Luvisol (al_sat < 50) already passes and a measured
+  # Alisol (al_sat >= 50) gives FALSE, so neither is overridden. The promoted
+  # layers come from the argic/cec_high intersection that the aggregate
+  # re-intersects below; al_sat_pct stays in $missing (-> $missing_data).
+  # The B-horizon guard rejects a known false-positive: argic's clay-increase
+  # test can fire on a STRATIFIED (sedimentary) clay jump between a Fluvisol's C
+  # layers. A genuine argic horizon is an illuvial B (Bt); a clay increase into
+  # a C layer is depositional, not pedogenic. Only default to Luvisol when the
+  # promoted, CEC-high argic layer is a B master horizon -- this keeps the
+  # Fluvisol (its argic sits on a C) keyed to Fluvisols, not Luvisol.
+  if (isTRUE(tests$cec_high$passed) && is.na(tests$al_sat_low$passed)) {
+    promoted <- intersect(arg$layers, tests$cec_high$layers)
+    desig    <- as.character(pedon$horizons$designation)[promoted]
+    promoted <- promoted[grepl("B", desig)]
+    if (length(promoted) > 0L) {
+      tests$al_sat_low$passed  <- TRUE
+      tests$al_sat_low$layers  <- promoted
+      tests$al_sat_low$details <- c(tests$al_sat_low$details %||% list(),
+        list(al_sat_low_default =
+          "no Al-saturation measured; high-activity argic defaults to Luvisol"))
+    }
+  }
+
   agg <- .argic_derived_aggregate(tests,
                                     layer_keys = c("cec_high", "al_sat_low"))
 

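The B-horizon guard in this hunk is a substring test, so it is worth seeing which designations it accepts. The sketch below runs the same `grepl("B", ...)` call on a handful of example designations and compares it with a stricter anchored pattern. The stricter pattern is a hypothetical alternative offered for comparison only; the source does not say which behaviour is intended for transitional horizons such as `AB` or `CB`.

```r
# Example designations; "AB", "CB" and NA are included to probe edge cases.
desig <- c("A", "Bt1", "C1", "BC", "2Bt", "AB", "CB", NA)

# The test used in the hunk above: any designation containing an upper-case "B".
# grepl() returns FALSE for NA input, so a missing designation is rejected.
contains_b <- grepl("B", desig)

# Hypothetical stricter variant (an assumption, not soilKey's code): an optional
# numeric discontinuity prefix followed by a leading "B".
leads_with_b <- grepl("^[0-9]*B", desig)

comparison <- data.frame(desig, contains_b, leads_with_b)
```

Both variants accept `Bt1`, `BC`, and `2Bt` and reject `A`, `C1`, and `NA`; they differ only on `AB` and `CB`, which the substring test accepts and the anchored pattern rejects.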
diff --git a/inst/benchmarks/reports/benchmark_suite_v09112.md b/inst/benchmarks/reports/benchmark_suite_v09112.md
@@ -0,0 +1,26 @@
+# soilKey benchmark suite -- v0.9.112
+
+Generated by `run_all_benchmarks()` (max_n = 200, level = order).
+
+## Accuracy by dataset x system
+
+Headline metric for imbalanced classes is **balanced accuracy / macro-F1**, read against the **NIR** (no-information-rate) majority-class baseline. Point accuracy carries a bootstrap 95% CI.
+
+| Dataset | System | n | Accuracy [95% CI] | Bal. acc | Macro-F1 | Kappa | NIR | Flag |
+|---------|--------|--:|-------------------|---------:|---------:|------:|----:|------|
+| canonical | all | 132 | 100.0% | n/a | n/a | n/a | n/a |  |
+| febr | sibcs | 200 | 38.0% (31.5%-45.0%) | 23.1% | 17.3% | 0.17 | 28.0% |  |
+| febr | usda | 194 | 45.4% (38.1%-52.6%) | 28.1% | 25.5% | 0.34 | 32.5% |  |
+| febr | wrb2022 | 199 | 22.6% (17.6%-28.6%) | 15.4% | 10.3% | 0.19 | 28.1% |  |
+| redape | sibcs | 94 | 59.6% (50.0%-69.1%) | 61.6% | 60.4% | 0.54 | 26.6% |  |
+
+## Zero-recall classes (improvement targets)
+
+- **redape/sibcs**: nitossolos, unknowns
+
+## Notes
+
+- The **canonical** row is an offline fixture sanity check (coverage, not field accuracy); it has no confusion matrix, so its per-class metrics are blank.
+- Rows flagged **n<30** are statistically indicative only. External-dataset rows reflect the local data snapshot and `max_n`.
+- **lucas_esdb/wrb2022** is a topsoil-only **lower bound** (LUCAS ships 0-20 cm chemistry only); the honest WRB-at-scale number is the morphologically-complete **FEBR** row. For a LUCAS estimate with a synthetic subsoil, run the opt-in (network, ~1 h): `benchmark_lucas_2018(pedons, fill_subsoil_from = "soilgrids")`.
+- **kssl/usda** uses a head-N (not random) sample of the gpkg; **bdsolos** accumulates leading (state-clustered) CSVs until the label cap is met. Both are documented samples, not full random draws.
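The columns of the report table can be read more easily with a worked example. The sketch below computes accuracy, the NIR baseline, balanced accuracy, macro-F1, and Cohen's kappa from a toy confusion matrix using their standard definitions. `metrics_from_confusion()` is a hypothetical helper written for this note; it is not the package's `.benchmark_metrics_from_confusion()`, whose handling of empty classes may differ.

```r
# Rows = reference class, columns = predicted class (toy counts, not soilKey data).
cm <- matrix(c(8, 2, 0,
               1, 3, 1,
               1, 0, 4),
             nrow = 3, byrow = TRUE,
             dimnames = list(ref = c("a", "b", "c"), pred = c("a", "b", "c")))

metrics_from_confusion <- function(cm) {
  n   <- sum(cm)
  tp  <- diag(cm)
  ref <- rowSums(cm)   # reference totals per class
  prd <- colSums(cm)   # predicted totals per class
  # Assumption: a class absent from the reference is dropped from the macro means.
  recall    <- ifelse(ref > 0, tp / ref, NA_real_)
  precision <- ifelse(prd > 0, tp / prd, 0)
  f1 <- ifelse(precision + recall > 0,
               2 * precision * recall / (precision + recall), 0)
  acc <- sum(tp) / n
  pe  <- sum(ref * prd) / n^2          # chance agreement for Cohen's kappa
  list(accuracy          = acc,
       nir               = max(ref) / n,   # majority-class baseline
       balanced_accuracy = mean(recall, na.rm = TRUE),
       macro_f1          = mean(f1, na.rm = TRUE),
       kappa             = (acc - pe) / (1 - pe))
}

m <- metrics_from_confusion(cm)
```

For this matrix the accuracy is 0.75 against an NIR of 0.50, balanced accuracy and macro-F1 are both about 0.733, and kappa is 0.60. Reading accuracy against the NIR is what makes a row such as febr/wrb2022 (22.6% accuracy, 28.1% NIR) interpretable.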
diff --git a/tests/testthat/test-b2-argic-never-regosol.R b/tests/testthat/test-b2-argic-never-regosol.R
@@ -0,0 +1,88 @@
+# Tests for v0.9.111 "an argic horizon is never a Regosol": the Luvisol
+# graceful-default fallback in luvisol(). A confirmed argic horizon with
+# high-activity clay (CEC >= 24) but no Al-saturation measurement defaults to
+# Luvisol instead of dropping to the Regosol catch-all -- guarded so a measured
+# Alisol/Luvisol is never overridden and the argic must sit on a B master
+# horizon (not a stratified Fluvisol C layer).
+
+# A 3-horizon argic profile: clean clay increase into a high-activity Bt.
+# al_sat / base cations control whether the eutric/alic split is determinable.
+.b2_argic_pedon <- function(al_sat = NA_real_, ca = NA_real_, mg = NA_real_,
+                            k = NA_real_, na = NA_real_, al_cmol = NA_real_,
+                            bt_designation = c("Bt1", "Bt2")) {
+  h <- data.frame(
+    designation = c("A", bt_designation[1], bt_designation[2]),
+    top_cm = c(0, 25, 60), bottom_cm = c(25, 60, 120),
+    clay_pct = c(15, 38, 40), silt_pct = c(20, 17, 15),
+    sand_pct = c(65, 45, 45), cec_cmol = c(8, 16, 16),
+    ph_h2o = c(5.5, 5.6, 5.7),
+    clay_films_amount = c(NA, "common", "common"),
+    al_sat_pct = c(NA, al_sat, al_sat),
+    ca_cmol = c(NA, ca, ca), mg_cmol = c(NA, mg, mg),
+    k_cmol = c(NA, k, k), na_cmol = c(NA, na, na),
+    al_cmol = c(NA, al_cmol, al_cmol),
+    stringsAsFactors = FALSE)
+  soilKey::PedonRecord$new(site = list(id = "b2"), horizons = h)
+}
+
+test_that("a high-activity argic with no Al-sat defaults to Luvisol, not Regosol", {
+  p  <- .b2_argic_pedon()                     # al_sat + all bases NA
+  lv <- soilKey:::luvisol(p)
+  expect_true(isTRUE(soilKey:::argic(p)$passed))
+  expect_true(isTRUE(lv$passed))              # promoted
+  expect_gt(length(lv$layers), 0L)            # non-empty layers (load-bearing)
+  expect_true(any(grepl("al_sat", lv$missing)))   # al_sat still flagged
+  expect_true(!is.null(lv$evidence$al_sat_low$details$al_sat_low_default))
+  res <- classify_wrb2022(p, on_missing = "silent")
+  expect_equal(res$rsg_or_order, "Luvisols")  # was "Regosols" pre-v0.9.111
+  expect_true(any(grepl("al_sat", res$missing_data)))  # assumption surfaced
+})
+
+test_that("a measured Alisol (al_sat >= 50) is not overridden by the default", {
+  p <- .b2_argic_pedon(al_sat = 60, ca = 1, mg = 1, k = 0.2, na = 0.1,
+                       al_cmol = 6)
+  expect_true(isTRUE(soilKey:::alisol(p)$passed))
+  # Luvisol must be FALSE (measured high Al), NOT NA and NOT promoted-TRUE
+  expect_false(isTRUE(soilKey:::luvisol(p)$passed))
+  expect_equal(classify_wrb2022(p, on_missing = "silent")$rsg_or_order,
+               "Alisols")
+})
+
+test_that("a measured Luvisol (al_sat < 50) passes the canonical path, not the default", {
+  p  <- .b2_argic_pedon(al_sat = 20, ca = 4, mg = 2, k = 0.3, na = 0.1,
+                        al_cmol = 1)
+  lv <- soilKey:::luvisol(p)
+  expect_true(isTRUE(lv$passed))
+  # canonical pass -> the default note must NOT be present
+  expect_null(lv$evidence$al_sat_low$details$al_sat_low_default)
+  expect_equal(classify_wrb2022(p, on_missing = "silent")$rsg_or_order,
+               "Luvisols")
+})
+
+test_that("Alisol abstains (NA) when Al-sat is unmeasured, ceding to the promoted Luvisol", {
+  # Guards the key-ordering reasoning: Alisol (tested before Luvisol) must
+  # return NA (skip), not FALSE, so the engine continues to the Luvisol gate.
+  p <- .b2_argic_pedon()
+  expect_true(is.na(soilKey:::alisol(p)$passed))
+})
+
+test_that("the default does NOT fire on a stratified clay increase in a C layer", {
+  # Mirrors the make_fluvisol_canonical pattern: argic's clay-increase test
+  # fires on a sedimentary jump between C layers; that is a Fluvisol, not a
+  # default Luvisol. The B-horizon guard keeps it out of the Luvisol gate.
+  p <- .b2_argic_pedon(bt_designation = c("C1", "C2"))   # argic layer is a C
+  expect_false(isTRUE(soilKey:::luvisol(p)$passed))      # NA or FALSE, not TRUE
+})
+
+test_that("canonical fixtures with measured chemistry are byte-identical", {
+  # The fallback fires only on is.na(al_sat); every argic-derived fixture
+  # carries measured or computable al_sat/BS, so none flips.
+  expect_equal(classify_wrb2022(make_luvisol_canonical())$rsg_or_order, "Luvisols")
+  expect_equal(classify_wrb2022(make_alisol_canonical())$rsg_or_order,  "Alisols")
+  expect_equal(classify_wrb2022(make_acrisol_canonical())$rsg_or_order, "Acrisols")
+  expect_equal(classify_wrb2022(make_lixisol_canonical())$rsg_or_order, "Lixisols")
+  expect_equal(classify_wrb2022(make_fluvisol_canonical())$rsg_or_order, "Fluvisols")
+  # the SiBCS argic fixtures' WRB landings (previously unasserted) are pinned
+  expect_equal(classify_wrb2022(make_argissolo_canonical())$rsg_or_order, "Acrisols")
+  expect_equal(classify_wrb2022(make_luvissolo_canonical())$rsg_or_order, "Luvisols")
+})