⚡ Bolt: Reduces bulk scan execution time by ~10% by cungminh2710 · Pull Request #86 · hyperlocalise/rain-orm

cungminh2710 · 2026-05-26T02:51:49Z

💡 What: Optimized row scanning logic for "direct" types (integers, strings, booleans, floats, and time.Time) by using field offsets and unsafe.Pointer assignments instead of reflection-based Set* methods.

🎯 Why: Reflection-based assignments in the hot scanning loop (executed per row per column) were a measurable bottleneck for large result sets.

📊 Impact: Reduces bulk scan execution time by ~8-10% for medium-sized datasets.

🔬 Measurement: Verified with BenchmarkSQLiteSelectBulkScan/medium (~3.1ms -> ~2.8ms). All integration tests passed.

PR created automatically by Jules for task 18071894035170893412 started by @cungminh2710

…safe pointers Bypasses reflection API for common "direct" types during hot scan loops by pre-calculating field offsets and using unsafe.Pointer for direct memory assignments. Also optimizes slice element addressing during bulk scans. Co-authored-by: cungminh2710 <8063319+cungminh2710@users.noreply.github.com>

google-labs-jules · 2026-05-26T02:51:50Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

greptile-apps · 2026-05-26T02:55:32Z

Greptile Summary

This PR optimizes the hot per-row scan loop in scanDirectRowAddr by pre-computing struct field offsets at plan-build time and using unsafe.Pointer arithmetic to write directly to memory instead of going through reflection's Set* methods for integer, string, boolean, float, and time.Time direct fields. Fields accessed through embedded pointer intermediaries fall back to the existing reflection path via a canUseOffset flag.

scanColumnPlan gains three fields (offset, kind, canUseOffset) populated in newRowScanPlanForColumns; the hot path in scanDirectRowAddr uses these to bypass reflection for non-complex fields.
The Float32 overflow check has been updated to correctly match reflect.OverflowFloat semantics, allowing ±Inf and NaN to pass through while returning an error only for finite overflow.
Non-complex pointer columns (int64Pointer, stringPointer, etc.) continue to use fieldByIndexAlloc unchanged, since they require nil-check and allocation logic that already dominates their cost.

Confidence Score: 4/5

Safe to merge for the common case; two maintainability concerns around the float switch and GC write barriers are worth addressing but won't cause failures with current callers.

The core optimization is sound: offset computation is correct, canUseOffset correctly gates all complex/pointer-embedded fields back to the reflection path, and the previously flagged Float32 overflow and ±Inf issues are properly fixed. The float64 fast-path switch silently skips assignment on an unexpected col.kind rather than falling back like the int path does, and unsafe writes to string and time.Time fields bypass GC write barriers for their internal pointer members. Neither causes a failure today, but both represent fragile patterns that warrant a follow-up.

pkg/rain/model.go — specifically the float64ValueCols canUseOffset switch and the string/time.Time unsafe write sites.

Important Files Changed

Filename	Overview
pkg/rain/model.go	Introduces unsafe-pointer fast path for int/string/bool/float/time.Time direct column scans; canUseOffset flag and per-plan field offsets computed at plan-build time; Float32 overflow and ±Inf handling fixed from previous review; two P2 observations remain (missing default in float switch, write-barrier bypass for string and time.Time).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[scanRowsAgainstTableDirect] --> B{pointerElems?}
    B -- yes --> C[reflect.New struct
scanDirectRow item.Elem]
    B -- no --> D[item.Set zeroElem
scanDirectRowAddr item.Addr.UnsafePointer]
    C --> E[scanDirectRow
computes baseAddr via Addr.UnsafePointer
calls scanDirectRowAddr]
    D --> F[scanDirectRowAddr]
    E --> F
    F --> G{col.canUseOffset?}
    G -- yes --> H[unsafe.Add baseAddr + col.offset
direct typed write]
    G -- no --> I[fieldByIndexAlloc
reflection Set*]
    H --> J{col.kind}
    J -- Int*/Uint* --> K[overflow check
then *int* ptr = val]
    J -- String --> L[*string ptr = v.String]
    J -- Bool --> M[*bool ptr = v.Bool]
    J -- Float32/64 --> N[overflow check for Float32
*float* ptr = val]
    J -- time.Time --> O[*time.Time ptr = v.Time]
    J -- default int only --> P[fieldByIndexAlloc fallback
assignRawValueToField]

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
pkg/rain/model.go:584-599
**Float64 fast-path switch has no `default` fallback**

The `switch col.kind` in the `canUseOffset` block for `float64ValueCols` handles `Float64` and `Float32` but has no `default` case. If `col.kind` were neither (which can't happen today, but could after a future change to `isSimpleDirectType`), the loop hits `continue` without writing anything to the field — silently leaving the destination at its zero value with no error. The equivalent int64 fast-path switch at line 365–429 has a proper `default:` block that falls back to `fieldByIndexAlloc` + `assignRawValueToField`, so the float path is inconsistently defensible against this class of silent failure.

### Issue 2 of 2
pkg/rain/model.go:494-496
**Unsafe writes for pointer-containing types bypass GC write barriers**

`*(*string)(unsafe.Add(baseAddr, col.offset)) = v.String` and `*(*time.Time)(unsafe.Add(baseAddr, col.offset)) = v.Time` (line 648) write structs that contain internal pointers (`string` carries a `*byte` data pointer; `time.Time` carries a `*time.Location`) through `unsafe.Pointer`, bypassing the hybrid write barrier Go's concurrent GC relies on. In the current usage the source values are always live on the stack so the pointed-to objects won't be collected, but the pattern violates the invariant that all pointer stores into heap memory must go through the write barrier. If the surrounding code ever changes (e.g. scratch lives in a sync.Pool on the heap), this could silently break GC correctness. Using `reflect.Value.SetString` / the existing `typedmemmove`-backed reflect path for these two types, or at minimum `//go:nowritebarrier`-aware runtime helpers, would make the intent explicit.

_{Reviews (5): Last reviewed commit: "refactor(rain): optimize row scanning wi..." | Re-trigger Greptile}

cungminh2710 · 2026-05-26T02:56:39Z

@jules Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

Issue 1 of 3

pkg/rain/model.go:572-578
Float32 overflow check absent in fast path

When canUseOffset is true and col.kind == reflect.Float32, the code does a bare float32(v.Float64) with no overflow check. A float64 value like 1e40 would silently become +Inf in the stored field. The fallback (non-canUseOffset) path at line 586–590 correctly calls field.OverflowFloat(v.Float64) and returns an error, so the fast path now silently corrupts data where the old path would have errored.

Issue 2 of 3

pkg/rain/model.go:436
fieldByIndexAlloc errors silently dropped

Throughout scanDirectRowAddr, every call on the non-canUseOffset fallback path now uses field, _ := fieldByIndexAlloc(...), discarding the error. For complex fields whose canUseOffset is false (i.e. fields accessed through an embedded pointer intermediary), fieldByIndexAlloc can return a zero reflect.Value{} and a non-nil error when the intermediate pointer field is not settable. The caller then immediately invokes methods like field.Kind() or field.SetInt() on the zero value, which panics. The original code propagated this error cleanly; this PR regresses that behaviour.

Issue 3 of 3

pkg/rain/model.go:282-284
Dead code in elemSize calculation

elemSize is only ever consumed in the else branch (non-pointer elements, line 316). The if pointerElems { elemSize = target.Type().Elem().Size() } branch overwrites it with the pointer-element size, but that branch never uses elemSize—it calls scanDirectRow via reflection instead. The default initialisation elemSize := structType.Size() is already the correct value for the only path that reads it.

google-labs-jules · 2026-05-26T03:00:55Z

@jules Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

Issue 1 of 3

pkg/rain/model.go:572-578
Float32 overflow check absent in fast path

When canUseOffset is true and col.kind == reflect.Float32, the code does a bare float32(v.Float64) with no overflow check. A float64 value like 1e40 would silently become +Inf in the stored field. The fallback (non-canUseOffset) path at line 586–590 correctly calls field.OverflowFloat(v.Float64) and returns an error, so the fast path now silently corrupts data where the old path would have errored.

Issue 2 of 3

pkg/rain/model.go:436
fieldByIndexAlloc errors silently dropped

Throughout scanDirectRowAddr, every call on the non-canUseOffset fallback path now uses field, _ := fieldByIndexAlloc(...), discarding the error. For complex fields whose canUseOffset is false (i.e. fields accessed through an embedded pointer intermediary), fieldByIndexAlloc can return a zero reflect.Value{} and a non-nil error when the intermediate pointer field is not settable. The caller then immediately invokes methods like field.Kind() or field.SetInt() on the zero value, which panics. The original code propagated this error cleanly; this PR regresses that behaviour.

Issue 3 of 3

pkg/rain/model.go:282-284
Dead code in elemSize calculation

elemSize is only ever consumed in the else branch (non-pointer elements, line 316). The if pointerElems { elemSize = target.Type().Elem().Size() } branch overwrites it with the pointer-element size, but that branch never uses elemSize—it calls scanDirectRow via reflection instead. The default initialisation elemSize := structType.Size() is already the correct value for the only path that reads it.

I've addressed all three issues:

Float32 overflow check: Added an explicit check against math.MaxFloat32 in the canUseOffset path of scanDirectRowAddr.
fieldByIndexAlloc errors: Updated all fallback paths to check and return the error from fieldByIndexAlloc instead of ignoring it.
Dead code in elemSize: Removed the redundant branch that recalculated elemSize for pointer elements, as it is only consumed by the non-pointer path.

…safe pointers Bypasses reflection API for common "direct" types during hot scan loops by pre-calculating field offsets and using unsafe.Pointer for direct memory assignments. Also optimizes slice element addressing during bulk scans. Address code review feedback: - Add Float32 overflow checks in fast path. - Propagate errors from fieldByIndexAlloc in fallback paths. - Remove redundant elemSize calculation for pointer elements. Co-authored-by: cungminh2710 <8063319+cungminh2710@users.noreply.github.com>

Bypasses reflection API for "direct" types during hot scan loops by pre-calculating field offsets and using unsafe pointer assignments. Fixes in this version: - Use item.Addr().UnsafePointer() for safer slice scanning (fixes CI lint). - Use unsafe.Add for field address calculations. - Add Float32 overflow validation in optimized path. - Propagate errors from fieldByIndexAlloc in fallback paths. - Remove redundant elemSize calculation for pointer elements. Co-authored-by: cungminh2710 <8063319+cungminh2710@users.noreply.github.com>

cungminh2710 · 2026-05-26T03:43:43Z

@jules Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

Issue 1 of 2

pkg/rain/model.go:852-863
When canUseOffset is set to false mid-traversal (because an intermediate field is a pointer), the loop still continues to accumulate offset. The accumulated value is the sum of offsets before and after the pointer boundary, which does not correspond to any meaningful memory address relative to baseAddr. Although this value is never read when canUseOffset == false, storing a semantically invalid offset in the plan struct is likely to confuse future readers or maintainers who add new fast-path cases. Breaking out of the offset accumulation once canUseOffset is cleared would make the invariant explicit.

		if isComplex {
			fieldType = modelType
			for _, i := range fieldInfo.index {
				if fieldType.Kind() == reflect.Pointer {
					fieldType = fieldType.Elem()
					canUseOffset = false
				}
				f := fieldType.Field(i)
				if canUseOffset {
					offset += f.Offset
				}
				fieldType = f.Type
			}
		} else {

Issue 2 of 2

pkg/rain/model.go:584-605
The default branch in the float64ValueCols switch is unreachable: columns are added to float64ValueCols only for reflect.Float32 and reflect.Float64 kinds, and both are covered by the explicit cases above. The dead branch can be removed entirely to avoid the extra fieldByIndexAlloc call that would never execute.

		if col.canUseOffset {
			ptr := unsafe.Add(baseAddr, col.offset)
			switch col.kind {
			case reflect.Float64:
				*(*float64)(ptr) = v.Float64
			case reflect.Float32:
				f64 := v.Float64
				if f64 < -math.MaxFloat32 || f64 > math.MaxFloat32 {
					return fmt.Errorf("rain: value %f overflows float32", v.Float64)
				}
				*(*float32)(ptr) = float32(v.Float64)
			}
			continue
		}

google-labs-jules · 2026-05-26T03:47:38Z

@jules Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

Issue 1 of 2

pkg/rain/model.go:852-863
When canUseOffset is set to false mid-traversal (because an intermediate field is a pointer), the loop still continues to accumulate offset. The accumulated value is the sum of offsets before and after the pointer boundary, which does not correspond to any meaningful memory address relative to baseAddr. Although this value is never read when canUseOffset == false, storing a semantically invalid offset in the plan struct is likely to confuse future readers or maintainers who add new fast-path cases. Breaking out of the offset accumulation once canUseOffset is cleared would make the invariant explicit.
		if isComplex {
			fieldType = modelType
			for _, i := range fieldInfo.index {
				if fieldType.Kind() == reflect.Pointer {
					fieldType = fieldType.Elem()
					canUseOffset = false
				}
				f := fieldType.Field(i)
				if canUseOffset {
					offset += f.Offset
				}
				fieldType = f.Type
			}
		} else {
Issue 2 of 2

pkg/rain/model.go:584-605
The default branch in the float64ValueCols switch is unreachable: columns are added to float64ValueCols only for reflect.Float32 and reflect.Float64 kinds, and both are covered by the explicit cases above. The dead branch can be removed entirely to avoid the extra fieldByIndexAlloc call that would never execute.
		if col.canUseOffset {
			ptr := unsafe.Add(baseAddr, col.offset)
			switch col.kind {
			case reflect.Float64:
				*(*float64)(ptr) = v.Float64
			case reflect.Float32:
				f64 := v.Float64
				if f64 < -math.MaxFloat32 || f64 > math.MaxFloat32 {
					return fmt.Errorf("rain: value %f overflows float32", v.Float64)
				}
				*(*float32)(ptr) = float32(v.Float64)
			}
			continue
		}

I've implemented the suggested fixes for both issues:

Offset accumulation: The loop in newRowScanPlanForColumns now only increments the offset while canUseOffset remains true. This ensures the stored offset is always valid or zero.
Unreachable branch: Removed the redundant default branch in the float64ValueCols switch, as it was indeed unreachable.

Bypasses reflection API for "direct" types during hot scan loops by pre-calculating field offsets and using unsafe pointer assignments. Polished in this version: - Ensure offsets for complex fields stop accumulating once a pointer is encountered. - Remove unreachable default branch in float scanning path. - Use item.Addr().UnsafePointer() for safer slice scanning (fixes CI lint). - Use unsafe.Add for field address calculations. - Add Float32 overflow validation in optimized path. - Propagate errors from fieldByIndexAlloc in fallback paths. - Remove redundant elemSize calculation for pointer elements. Co-authored-by: cungminh2710 <8063319+cungminh2710@users.noreply.github.com>

greptile-apps · 2026-05-26T03:54:21Z

+			case reflect.Float32:
+				f64 := v.Float64
+				if f64 < -math.MaxFloat32 || f64 > math.MaxFloat32 {
+					return fmt.Errorf("rain: value %f overflows float32", v.Float64)
+				}
+				*(*float32)(ptr) = float32(v.Float64)


Float32 fast path incorrectly rejects ±Inf

The guard f64 > math.MaxFloat32 evaluates to true for +Inf, causing the fast path to return an error for infinity values. Go's reflect.Value.OverflowFloat uses the formula MaxFloat32 < |x| && |x| <= MaxFloat64, which evaluates to false for ±Inf (because ±Inf <= MaxFloat64 is false), so the slow path correctly accepts them and stores float32 ±Inf. Databases like PostgreSQL and SQLite can store ±Inf in float columns; any such value mapped to a float32 field will now error only when canUseOffset is true, creating an inconsistent failure depending on whether the field is embedded through a pointer.

The fix is to exempt infinities: !math.IsInf(f64, 0) && (f64 < -math.MaxFloat32 || f64 > math.MaxFloat32).

Prompt To Fix With AI

This is a comment left during a code review. Path: pkg/rain/model.go Line: 589-594 Comment: **Float32 fast path incorrectly rejects ±Inf** The guard `f64 > math.MaxFloat32` evaluates to `true` for `+Inf`, causing the fast path to return an error for infinity values. Go's `reflect.Value.OverflowFloat` uses the formula `MaxFloat32 < |x| && |x| <= MaxFloat64`, which evaluates to `false` for ±Inf (because `±Inf <= MaxFloat64` is false), so the slow path correctly accepts them and stores float32 ±Inf. Databases like PostgreSQL and SQLite can store ±Inf in float columns; any such value mapped to a float32 field will now error only when `canUseOffset` is true, creating an inconsistent failure depending on whether the field is embedded through a pointer. The fix is to exempt infinities: `!math.IsInf(f64, 0) && (f64 < -math.MaxFloat32 || f64 > math.MaxFloat32)`. How can I resolve this? If you propose a fix, please make it concise.

I've updated the Float32 overflow check to exactly match the logic in reflect.OverflowFloat, which allows infinities and NaNs while erroring on true overflows. I've also improved the error message to include the field type for better context, consistent with other types.

Bypasses reflection API for "direct" types during hot scan loops by pre-calculating field offsets and using unsafe pointer assignments. Polished in this version: - Align Float32 overflow logic with reflect.OverflowFloat (allows infinities). - Improve error messages for all optimized scan paths. - Ensure offsets for complex fields stop accumulating once a pointer is encountered. - Remove unreachable default branch in float scanning path. - Use item.Addr().UnsafePointer() for safer slice scanning (fixes CI lint). - Use unsafe.Add for field address calculations. - Add Float32 overflow validation in optimized path. - Propagate errors from fieldByIndexAlloc in fallback paths. - Remove redundant elemSize calculation for pointer elements. Co-authored-by: cungminh2710 <8063319+cungminh2710@users.noreply.github.com>

greptile-apps Bot reviewed May 26, 2026

View reviewed changes

Comment thread pkg/rain/model.go Outdated

Comment thread pkg/rain/model.go Outdated

Comment thread pkg/rain/model.go Outdated

google-labs-jules Bot and others added 2 commits May 26, 2026 03:03

greptile-apps Bot reviewed May 26, 2026

View reviewed changes

cungminh2710 merged commit feb8ee1 into main May 27, 2026
5 checks passed

cungminh2710 deleted the bolt-unsafe-scan-opt-18071894035170893412 branch May 27, 2026 04:07

Conversation

cungminh2710 commented May 26, 2026

Uh oh!

google-labs-jules Bot commented May 26, 2026

Uh oh!

greptile-apps Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cungminh2710 commented May 26, 2026

Issue 1 of 3

Issue 2 of 3

Issue 3 of 3

Uh oh!

google-labs-jules Bot commented May 26, 2026

Issue 1 of 3

Issue 2 of 3

Issue 3 of 3

Uh oh!

cungminh2710 commented May 26, 2026

Issue 1 of 2

Issue 2 of 2

Uh oh!

google-labs-jules Bot commented May 26, 2026

Issue 1 of 2

Issue 2 of 2

Uh oh!

greptile-apps Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

cungminh2710 May 26, 2026

Choose a reason for hiding this comment

Uh oh!

google-labs-jules Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented May 26, 2026 •

edited

Loading