Branch coverage push (waves 2,7,8) + 5 src/ bug fixes surfaced by tests #223

Merged
singaraiona merged 7 commits into master from fix/bool-cast-nulls-and-cleanups on Jun 3, 2026

@ser-vasilich
Collaborator

Summary

Branch coverage push across 30+ src/ files, surfacing and fixing 5 latent
bugs along the way.

Coverage delta (src/, excl. test/):

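| Metric    | Master | This PR | Δ      |
|-----------|--------|---------|--------|
| Branches  | ~66.1% | 68.58%  | +2.5pp |
| Regions   | ~82.8% | 84.91%  | +2.1pp |
| Lines     | ~87.2% | 89.07%  | +1.9pp |
| Functions | ~97.3% | 98.50%  | +1.2pp |
| Tests     | 2855   | 3231    | +376   |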

Bug fixes (6 entries, all found by coverage tests; the two fix(env) entries appear to be counted together in the title's 5)

  1. fix(system): qsort(NULL, 0, ...) UB in .os.list. ray_os_list_fn
    called qsort on an empty-directory result; C11 §7.22.5/1 makes that
    UB even with nmemb == 0 (UBSan trips on the nonnull attr). Guard
    with if (count > 0).
  2. fix(env): ray_setenv_fn under-retained its val arg → caller
    over-released → freed slot reused by next allocation → spurious "type"
    error on a later unrelated op (manifested as (set X (table …))
    failing after setenv + getenv). Add ray_retain(val) to match the
    ray_quote_fn convention.
  3. fix(env): pooled ray strings passed unterminated to libc
    getenv/setenv. Ray strings longer than 12 chars are not
    NUL-terminated, so libc reads out of bounds. Copy into a stack
    buffer first.
  4. fix(expr): (int64_t)1 << 63 UB in the i64 DIV/MOD overflow guard
    (INT64_MIN / -1). Replace with INT64_MIN.
  5. fix(query): int32 overflow in xbar near INT32_MIN. The q * b32
    transient overflowed during the floored-bucket computation. Hoist to
    int64, truncate on store.
  6. fix(vec): memcpy(dst, NULL, 0) UB in ray_vec_from_raw — guard
    if (data_size) and reject NULL with non-zero size.

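The env NUL-termination and from_raw fixes in the list above both come down to guarding a (pointer, length) pair before handing it to libc. The following is a minimal sketch of that pattern, not the project's code: `slice_to_cstr`, `getenv_slice`, `copy_raw`, and `ENV_NAME_CAP` are hypothetical names introduced for illustration, and the real ray string layout is not shown in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative cap for environment variable names; not a project constant. */
#define ENV_NAME_CAP 256

/* Copy a (pointer, length) slice that may lack a trailing NUL into buf and
 * terminate it. Returns false when the slice does not fit. */
static bool slice_to_cstr(const char *src, size_t len, char *buf, size_t cap) {
    assert(buf != NULL && cap > 0);
    if (len >= cap)
        return false;            /* no room for the terminator */
    if (len > 0)
        memcpy(buf, src, len);   /* skipped for len == 0, so src may be NULL then */
    buf[len] = '\0';
    return true;
}

/* getenv on an unterminated name: libc only ever sees the terminated copy. */
static const char *getenv_slice(const char *name, size_t len) {
    char buf[ENV_NAME_CAP];
    if (!slice_to_cstr(name, len, buf, sizeof buf))
        return NULL;
    return getenv(buf);
}

/* from_raw-style copy: a zero size is a no-op, and NULL with a non-zero
 * size is rejected instead of being passed to memcpy. */
static bool copy_raw(void *dst, const void *data, size_t data_size) {
    if (data_size == 0)
        return true;             /* nothing to copy; data may be NULL */
    if (dst == NULL || data == NULL)
        return false;
    memcpy(dst, data, data_size);
    return true;
}
```

With these helpers, `getenv_slice("PATHxyz", 4)` looks up `PATH` without reading past the fourth byte, and `copy_raw(dst, NULL, 0)` succeeds without ever calling memcpy.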
Test additions (≈30k+ lines across waves)

  • Wave 2: 13 src/ops/ files (strop, filter, idiom, tblop, pivot, window,
    collection, join, datalog, fused_group, graph, idxop, fvec)
  • Wave 3-5 (committed earlier together with the bug fixes): query, group,
    expr, traverse, linkop, rowsel, journal, fuse, eval, csv, vec, dict, col,
    heap, parse, csr, part, table
  • Wave 7: push hll.c / timer.c / idxop.c / fused_group.c / group.c to
    ≥80% region / function / line:
    • hll.c: region 21 → 90.75%, function 60 → 100%, line 33.5 → 96.56%
    • timer.c: region 53.3 → 95.83%, function 61.5 → 100%, line 59.1 → 100%
    • idxop.c: region 68.3 → 89.37%, function 85 → 100%, line 69.6 → 94.86%
    • fused_group.c: region 73.7 → 80.10%, function 92.2 → 100%, line 76.7 → 84.99%
    • group.c: region 78.83 → 80.16%, line 86.05 → 87.28%
  • Wave 8 (2nd pass on remaining branch-coverage gaps):
    query +1.0pp, fused_group +1.57pp, group +0.45pp, traverse +0.52pp, eval +0.14pp.

Post-rebase test-rename fixups: timer → .time.now, args → .sys.args.

Test plan

  • make test (ASan+UBSan debug): 3231 of 3233 pass, 2 skipped, 0 failed, 0 sanitizer errors
  • Each src/ fix has a regression test (sum-of-error-prop / sel_compact MAPCOMMON / setenv refcount / 1<<63 mod overflow / xbar near INT32_MIN / from_raw NULL / .os.list empty dir)
  • No src/ de-static, no internal-header exposure for tests
  • All static inline instantiations / inline accounting artifacts documented inline

Commits

b19edab1 fix(system): skip qsort on empty directory in .os.list
f9d3b1b6 test(branches): wave 8 — 2nd pass on query/group/fused_group/eval/traverse
ce018965 test(system): update args→.sys.args after master rename
283dfcc6 test(branches): wave 7 — push hll/timer/idxop/fused_group/group to ≥80% all metrics
3ff3120c test(system): swap `timer` → `.time.now` post-rebase
93a1fe20 fix(env/expr/query/vec): 4 bugs surfaced by wave 3-5 coverage push
a7fd1039 test(branches): wave 2 — coverage push across 13 more src/ops/ files

🤖 Generated with Claude Code

ser-vasilich and others added 7 commits June 3, 2026 15:15
Parallel agent sweep targeting:
  agg.c          +8.80pp  (78.49→87.29%)
  graph_builtin.c +10.42pp (67.88→78.30%)
  cmp.c          +9.82pp  (71.07→80.89%)
  temporal.c     208 assertions (extract/trunc all types × null)
  system.c       182 assertions (ser/de, splayed, mount, guid, meta)
  string.c       ILIKE cache, replace/concat >8192, substr types
  sort.c         radix reverse-detect, STR topk fallback
  opt.c          111 expressions (IF/SUBSTR/REPLACE/CONCAT ext-walk)
  exec.c         165 RFL + 399 C lines (reductions, pivot+sel, prod)
  idxop.c        56 C tests + 100 RFL (type×kind matrix, F32, nulls)
  embedding.c    30 C tests + 120 RFL (metric/hnsw/knn error paths)
  arith.c        785 lines (temporal null propagation, mod, unary)
  fused_topk.c   23 C tests + 50 RFL (I16/U8/BOOL/temporal keys)

Also: remove dead-by-link block_alloc_stub.c (weak fallback never
called when buddy allocator is present), bump RFL_THUNK_CAPACITY
320→384.

3001/3003 pass (2 skipped, 0 failed) under ASan+UBSan.
Coverage: 68.02% branches / 89.24% lines (src/ only, excl test/).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Branch-coverage tests across query/group/expr/traverse/eval/csv/vec/
dict/col/heap/parse/csr/part/table/linkop/rowsel/journal/fuse exposed
four real src/ bugs, all fixed here:

* setenv refcount (system.c): ray_setenv_fn returned its `val` arg
  without ray_retain → caller over-releases → string freed early →
  freed slot reused by the next allocation → spurious "type" error on
  a later unrelated op (e.g. second `(set X (table …))`).  Mirrors the
  retain convention in ray_quote_fn.
* env NUL-termination (system.c): getenv/setenv passed raw ray_str_ptr
  to libc, but pooled ray strings (len > 12) are not NUL-terminated →
  OOB read.  Copy into a NUL-terminated buffer first.
* expr.c i64 DIV/MOD: `(int64_t)1<<63` is signed-shift UB; use
  INT64_MIN for the INT64_MIN/-1 overflow guard.
* query.c xbar i32: `q * b32` overflowed int32 near INT32_MIN; compute
  the floored bucket in int64, truncate on store.
* vec.c from_raw: memcpy(dst, NULL, 0) is UB; guard on data_size and
  reject NULL data with a non-zero size.
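The two integer fixes in this commit (expr.c and query.c) can be illustrated with a small sketch. It is written under assumptions: the helper names are hypothetical, the real guard's behaviour on overflow is not stated in this PR (here the checked modulo simply reports failure), and the bucket is assumed positive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for the single i64 division that overflows. The minimum is spelled
 * INT64_MIN because (int64_t)1 << 63 shifts into the sign bit, which is UB. */
static bool i64_div_overflows(int64_t a, int64_t b) {
    return a == INT64_MIN && b == -1;
}

/* Checked modulo using C's truncating %; returns false instead of
 * evaluating a zero divisor or the INT64_MIN / -1 case. */
static bool i64_mod_checked(int64_t a, int64_t b, int64_t *out) {
    assert(out != NULL);
    if (b == 0 || i64_div_overflows(a, b))
        return false;
    *out = a % b;
    return true;
}

/* Floored bucket start for an int32 value, computed entirely in int64 so
 * the q * bucket product cannot overflow near INT32_MIN. */
static int64_t xbar_wide_i32(int32_t x, int32_t bucket) {
    assert(bucket > 0);
    int64_t q = (int64_t)x / bucket;      /* C division truncates toward zero */
    if ((int64_t)x % bucket < 0)
        q -= 1;                           /* step down to the floor for negatives */
    return q * (int64_t)bucket;
}
```

For `x = INT32_MIN + 1` and a bucket of 1000 the floored result is -2147484000, which does not fit in int32; that is the transient the commit describes. Narrowing the int64 result back to int32 on store is implementation-defined for out-of-range values, so the sketch returns the wide value and leaves the store to the caller.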

Tests: 11 new RFL files + 7 extended C/RFL test files (~1900 new
assertions/cases).  Per-file branch coverage gains include
graph_builtin +10.4pp, cmp +9.8pp, agg +8.8pp, linkop +25.8pp,
journal +26.1pp; many others +2–4pp.

3111/3113 pass (2 skipped, 0 failed), 0 sanitizer errors under ASan+UBSan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Master replaced the legacy `(timer N)` builtin with the `.time.*`
namespace (`.time.now`, `.time.timer.set`, `.time.timer.del`).  Two
assertions in system_branch_cov2.rfl from the wave-2 push referenced
the old name; update to `.time.now` so the file passes on the new
master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…0% all metrics

Targeted the 5 files where region, function, or line coverage was
still under 80% on the rebased master (after PR #219). Goal was 100%;
all 5 cleared ≥80% on every metric, and 4 of 5 reach 100% function
coverage.

  hll.c          region 21.0 → 90.75% | fn 60 → 100% | line 33.5 → 96.56%
  timer.c        region 53.3 → 95.83% | fn 61.5 → 100% | line 59.1 → 100%
  idxop.c        region 68.3 → 89.37% | fn 85 → 100%   | line 69.6 → 94.86%
  fused_group.c  region 73.7 → 80.10% | fn 92.2 → 100% | line 76.7 → 84.99%
  group.c        region 78.83 → 80.16% (fn 98.17%, line 86.0 → 87.28%)

New test work (no src/ changes):
- test_group_extra.c: 5 HLL kernels (direct API + production routing), ~950 lines
- test/rfl/ops/hll_coverage.rfl: 1.05M-row streaming + per-group HLL paths
- test_runtime.c: 8 timer tests (heap grow/sift, tie-break, forever rearm,
  callback errors, pump-for guards, destroy paths)
- test_index.c: 34 tests (20 chunk_zone × all numeric/temporal + nulls,
  14 hash_eq_rowsel including multi-segment + grow buffer + type matrix)
- test_fused_group.c: 19 tests (v2 per-(worker,partition) shards for
  COUNT/SUM/AVG/wide multi-key, hash-index dispatch, MG top-K rebuild
  for I64/I32/TIMESTAMP, emit-filter top-N by SUM/MIN)
- group/group_branch_cov.rfl: 7 new sections (§14-§20) — v2 fused-radix,
  DA-path early-abort, exec_group_sum_count_rowform N=3..8, parted
  SUM/AVG, multi-key composite matrix, accum_from_entry skip, v2_emit
  topn_filter MIN/MAX/COUNT

Unreachable branches documented inline per file (OOM-injection paths,
v2 gate exclusions, defensive guards, dead-by-construction switches).

Suite: 3182 of 3184 pass (2 skipped, 0 failed) under ASan+UBSan.
Coverage src/ (excl test/): branches 66.12 → 68.43% (+2.31pp);
regions 82.84 → 84.85%; lines 87.21 → 89.04%; functions 97.31 → 98.44%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`args` builtin was renamed/replaced; current binding is `.sys.args`
returning a DICT (formerly a LIST keyed by N).  Update the
ray_args_fn coverage assertion to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…verse

Following the rebase onto PR #218/#219/#220 master (new attribute
system, asof fast-path, RAY_IDX_PART, HLL routing, MG top-K for
TIMESTAMP), targeted the still-large branch-coverage gaps:

  query.c         62.54 → 63.54% (+1.00pp, -107 missed)
  fused_group.c   65.69 → 67.26% (+1.57pp, -55 missed)
  group.c         67.50 → 67.95% (+0.45pp, -39 missed)
  traverse.c      60.16 → 60.68% (+0.52pp, -12 missed)
  eval.c          60.73 → 60.87% (+0.14pp, -4 missed)

Additions:
- query_branch_cov.rfl    +670 lines (§19-§63: 2-stage count-distinct
  rewrite for I64/I32/TIMESTAMP, match_group_desc_count_take per-op,
  wide-key fused, asof wrapper, narrow_known_small_extract, HLL
  inner-type cascade, prefilter computed-by + WHERE + desc:count)
- fused_group_branch_cov.rfl  +190 lines + 1156 C lines (chunk_zone
  fast path EQ/GT/LT/NE/LE/GE, IN/EQ masked dispatch, BOOL/SYM key
  topk, U8/I16 hash-eq kbits, strlen agg input)
- group_branch_cov.rfl    +488 lines §21-§38 (maxmin/pearson rowform
  with null x/y/k, per-partition STDDEV/VAR/FIRST/LAST, multi-key
  heavy-hitter, v2 multi-key TIMESTAMP+I64 / DATE+TIME+I64,
  count_distinct STR/GUID/LIST, accum_from_entry skip path)
- eval_branch_cov.rfl     +300 lines §9-§30 (OP_STOREGLOBAL error,
  lambda dispatch errors, try handler dispatch, try_sum_affine bail
  paths, nested-try depth, raise vec/dict/table payload survival)
- test_traverse.c         +760 lines / 18 C tests (A* relax fail,
  cluster_coeff parallel/asym, SIP dir2 neg/oob src, betweenness/
  closeness sample-clamp)
- traverse_branch_cov.rfl +311 lines (bidirectional cliques, parallel
  edges, K4, disjoint comps, diamond/2-cycle/back-edge fixtures)

Suite: 3231 of 3233 pass under ASan+UBSan. Unreachable branches
documented inline per file (OOM-injection, VM trap stack, restricted-
mode, MAPCOMMON/PARTED I/O-only, CSR invariants).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

C11 §7.22.5/1 makes passing NULL to qsort undefined behaviour even
when nmemb == 0 (UBSan flags it via qsort's nonnull attribute).
ray_os_list_fn invoked qsort(NULL, 0, ...) when readdir produced no
entries — a bug the system.c wave-2 coverage agent flagged but could
not test ("cannot test without crash").

Guard the call with `if (count > 0)`.  An empty array is already
sorted by definition, so semantics are unchanged.

Regression: an empty-directory case is added to system_branch_cov.rfl
covering the previously-untestable arm under ASan+UBSan.
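The guard described above can be sketched as follows. This is a stand-in rather than the actual ray_os_list_fn: `sort_names` and `cmp_names` are hypothetical, and the entries are assumed to be plain C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C-string pointers. */
static int cmp_names(const void *a, const void *b) {
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Sort directory entry names. An empty listing may arrive as
 * (NULL, 0); qsort must not be called with a NULL base even then. */
static void sort_names(const char **names, size_t count) {
    assert(names != NULL || count == 0);
    if (count > 0)
        qsort(names, count, sizeof *names, cmp_names);
}
```

Calling `sort_names(NULL, 0)` is then a no-op, which matches the commit's point that an empty array is already sorted.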

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

singaraiona merged commit 23a88f3 into master on Jun 3, 2026
3 of 4 checks passed
ser-vasilich added a commit that referenced this pull request Jun 3, 2026
A test hang in the suite currently stalls the whole CI job and we
learn nothing about which test caused it (macOS+ASan on PR #223 hung
26+ min vs 150 s on master).  Install a SIGALRM-based watchdog: when
a test exceeds the timeout the handler writes its name to stderr
using async-signal-safe write(2) + _exit(124), so the CI log
captures the culprit and the job actually finishes.

Default 90 s comfortably exceeds the slowest legitimate tests
(1.05M-row HLL, splayed I/O round-trips); override via
RAY_TEST_TIMEOUT_S env var (set to 0 to disable).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
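
A watchdog of the kind this commit describes might look like the sketch below. Only the 90 s default, the RAY_TEST_TIMEOUT_S variable, the use of write(2), and exit code 124 come from the commit message; the function names, message format, and the 86400 s sanity cap are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* sigaction, alarm, write, _exit */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WD_DEFAULT_TIMEOUT_S 90u

/* The message is formatted before the alarm is armed, because snprintf
 * is not async-signal-safe and must not run inside the handler. */
static char wd_msg[256];
static size_t wd_msg_len;

/* SIGALRM handler: only async-signal-safe calls (write, _exit). */
static void wd_on_alarm(int signo) {
    (void)signo;
    if (write(STDERR_FILENO, wd_msg, wd_msg_len) < 0) {
        /* nothing safe to do about a failed write; exit anyway */
    }
    _exit(124);
}

/* Read RAY_TEST_TIMEOUT_S: 0 disables the watchdog; unset, malformed, or
 * implausibly large values fall back to the default. */
static unsigned wd_timeout_seconds(void) {
    const char *s = getenv("RAY_TEST_TIMEOUT_S");
    if (s == NULL || *s == '\0')
        return WD_DEFAULT_TIMEOUT_S;
    char *end = NULL;
    unsigned long v = strtoul(s, &end, 10);
    if (*end != '\0' || v > 86400ul)
        return WD_DEFAULT_TIMEOUT_S;
    return (unsigned)v;
}

/* Arm the watchdog before running one test. */
static void wd_arm(const char *test_name) {
    assert(test_name != NULL);
    unsigned secs = wd_timeout_seconds();
    if (secs == 0)
        return;
    int n = snprintf(wd_msg, sizeof wd_msg, "TIMEOUT after %us: %s\n",
                     secs, test_name);
    if (n < 0)
        n = 0;
    wd_msg_len = (size_t)n < sizeof wd_msg ? (size_t)n : sizeof wd_msg - 1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wd_on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);
    alarm(secs);
}

/* Cancel the pending alarm once the test has finished. */
static void wd_disarm(void) {
    alarm(0);
}
```

A test runner would call `wd_arm(name)` before each test and `wd_disarm()` after it, so a hung test ends the process with status 124 and its name in the log.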