Skip to content

FastByteBuffer: tiered grow strategy to avoid O(n²) reallocation at the extremes #270

@num0001

Description

@num0001

Summary

The current auto-resize policy in FastByteBuffer.ensureCapacity() causes excessive reallocations at both ends of the buffer-size spectrum. This issue proposes a tiered growth strategy, picking up the existing TODO at FastByteBuffer.java:182.

Current behaviour

// FastByteBuffer.java:183
final int addCapacity = Math.min(
        Math.max(DEFAULT_MIN_CAPACITY_INCREASE, newCapacity >> 3),
        DEFAULT_MAX_CAPACITY_INCREASE); // min 1 KiB, +12.5%, max 100 KiB

Effectively: addCapacity = clamp(newCapacity / 8, 1 KiB, 100 KiB).

Problems

1. Small buffers grow too slowly

Below ~8 KiB, newCapacity >> 3 is below the 1 KiB floor, so growth is fixed at +1 KiB regardless of buffer size. Growing from 1 KiB to 100 KiB takes ~99 reallocations + memcpys. ArrayList and StringBuilder both double in this regime to amortise allocation cost.

2. Large buffers grow too slowly

Above ~800 KiB, the +12.5% step gets clamped to the 100 KiB ceiling. A 10 MiB buffer that needs to roughly double requires ~100 reallocations — classic O(n²) cost on the hot serialisation path the README benchmarks rely on.

The midrange (~8 KiB – 800 KiB) at +12.5% works well and should be preserved.

Proposal

A three-tier policy matching the existing TODO comment ("increase fast for small arrays, +n% for medium, byte-by-byte for large"):

Regime Range Growth strategy Rationale
Small < 8 KiB Double (addCapacity = newCapacity) Avoid many tiny reallocations during ramp-up
Medium 8 KiB – 8 MiB +12.5%, floored at 1 KiB Unchanged from today; works well in practice
Large ≥ 8 MiB Fixed slab (+1 MiB) Avoid runaway memory waste on huge buffers
private static final int SMALL_THRESHOLD  = 1 << 13;   // 8 KiB
private static final int LARGE_THRESHOLD  = 1 << 23;   // 8 MiB
private static final int LARGE_ADD_CAPACITY = 1 << 20; // 1 MiB

final int addCapacity;
if (newCapacity < SMALL_THRESHOLD) {
    addCapacity = newCapacity; // double
} else if (newCapacity < LARGE_THRESHOLD) {
    addCapacity = Math.max(DEFAULT_MIN_CAPACITY_INCREASE, newCapacity >> 3); // +12.5%
} else {
    addCapacity = LARGE_ADD_CAPACITY; // fixed 1 MiB slab
}
forceCapacity(newCapacity + addCapacity, limit());

Expected impact

Scenario Reallocations (current) Reallocations (proposed)
1 KiB → 100 KiB ~99 ~7
10 MiB → 20 MiB ~100 ~10
64 KiB → 80 KiB (midrange) 2 2 (unchanged)

Memory overhead stays bounded: ≤ 1 MiB unused slab at large sizes, ≤ 2× allocation at small sizes (same as ArrayList).

Open questions for maintainers

  1. Threshold values — are 8 KiB / 8 MiB acceptable defaults, or do you want them tuned to a known workload (e.g. typical MDP message sizes)?
  2. Tunability — should the thresholds be exposed as constructor parameters / system properties, or hard-coded constants? I'd suggest keeping them private static final for now and revisiting only if a real use case needs override.
  3. Large-tier policy — fixed 1 MiB slab vs. capped percentage? Fixed slab is simpler and bounds worst-case waste; percentage with a high cap (e.g. min(25%, 16 MiB)) would also work.

Proposed scope

  • Modify ensureCapacity() in FastByteBuffer.java — ~15 LOC.
  • Add tests covering each regime + data-preservation across grows in IoBufferTests.java. The existing testFastByteBufferResizing() uses Matchers.greaterThan and remains compatible.
  • (Optional) JMH microbench measuring time-to-fill a 50 MiB buffer via 1 KiB writes, old vs. new.

No API change; IoBuffer.ensureCapacity() semantics are unchanged.

Happy to open a PR once thresholds are confirmed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions