Foundations for BList by loladenney · Pull Request #825 · typelevel/cats-collections

loladenney · 2026-05-28T15:41:37Z

building out the framework for the new datatype

johnynek · 2026-05-28T17:18:50Z

This is the same issue as #809 and #634 right?

A lot of the same discussion from #809 would be relevant here I think.

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

johnynek · 2026-06-02T21:26:50Z

+  }
+
+  // (maybe impl will be covariant or not)
+  private case class Impl[A](offset: Int, block: Array[A], tail: BList[A]) extends NonEmpty[A] {


I think you can make this covariant if block: Array[_ <: A] that said, since we don't have a classtag for the type A and we may not want to require it, we may want to instead use Array[Any] and cast when we get out, which would require boxing primitives, but List also boxes primitives.

It used to be Array[Any], but it ended up requiring so many uses of .asInstanceOf, and I didn't like how cluttered and confusing it made everything. Is there a good reason why we would want Impl to be covariant? If its worth it then I will try to find ways to contain the typecasts in a maintainable way.

johnynek · 2026-06-02T21:29:05Z

+    }
+    def getUnsafe(idx: Long): A = {
+      if (idx < 0)
+        throw new IndexOutOfBoundsException


we are throwing a different kind of exception for < 0 vs too large. Does List also do that? I would comment and follow List behavior here so it is fully compliant.

Just checked and it seems to throw the same index out of bounds exception for index too small or too large!

johnynek · 2026-06-02T21:33:08Z

+  }
+
+  def empty[A]: BList[A] = Empty
+  def unapply[A](l: BList[A]): Option[(A, BList[A])] = l.uncons


I think this should be on the object NonEmpty inside BList and we should also have an apply method that builds a NonEmpty by prepending onto a BList. So BList.NonEmpty(h, t) can be used in construction and deconstruction.

Yes, makes sense. I think something like this should work:

object BList { object NonEmpty { def apply[A](h: A, t: BList[A]): NonEmpty[B] = ... def unapply[A](l: BList.NonEmpty[A]): Some[(A, BList[A])] = l.uncons } }

What kinds of things do in the object NonEmpty vs the object Impl?

Also, should the return type of unapply still keep the Option?

What kinds of things do in the object NonEmpty vs the object Impl?

NonEmpty is a part of the public API, whereas Impl hosts implementation details which we don't want to reveal.

Also, should the return type of unapply still keep the Option?

That is a part of the trick: if we returned Option[...], then Scala compiler might struggle on exhausiveness checks in some cases – because it doesn't know whether it resolves to Some or None at runtime.

However, when we return Some, then we reassure the compiler that the result does always exist:

val fooBList: BList[Int] = ... fooBList match { case BList.NonEmpty(head, tail) => // only matches if our `fooBList` is a `NonEmpty` instance. case BList.Empty => // or `case _ : BList.Empty.type =>` or just `case _ =>` – the `match` should be exhaustive in any case. }

Minimal-ish example: https://scastie.scala-lang.org/smsfk232TM6C01PJonKThQ

Co-authored-by: Arman Bilge <arman@typelevel.org>

satorg · 2026-06-17T07:07:11Z

+
+  private class Impl[A](val offset: Int, val block: Array[A], val tailBList: BList[A]) extends NonEmpty[A] {
+
+    override def equals(other: Any): Boolean = {


While this implementation would probably work in general, it makes a lot of temporary heap allocations (for tuples and options). I’d suggest implementing it block‑by‑block rather than element‑by‑element, and avoiding allocations as much as possible. Something like this:

// Helper recursive class-level function. @tailrec private def isEqualTo(that: BList[A]): Boolean = { // We always know that `this` is `Impl` already. // Therefore we can safely short-circut to `false` if `that` is not. that match { case _: Empty.type => false case thatImpl: Impl[A] => if (offset != thatImpl.offset || block.lengh != thatImpl.block.length) false else { // The offsets and block sizes are the same. Consequently, // we can use a boring yet efficient `while` loop over the array // with just a single var i = offset; // ..or `i = block.length - 1` if you will while { ... } // exits with `false` on the first mistmatch // If both blocks are equal, we can try to proceed with the next block: tailBList match { case _: Empty.type => thatImpl.isEmpty case nextImpl: Impl[A] => nextImpl.isEqualTo(thatImpl.tailBList) } } } }

Now, override def equals(other: Any) just needs to call our isEqualTo properly.

NOTES:

Instead of multi-level nesting, short-circuiting with return false would probably look better and cleaner.

case _: Empty.type => can be more efficient compared to case Empty =>

Checks for block.length aren’t strictly necessary right now, but they will be if we ever decide to implement some kind of “adaptive” block sizes.

do ... while would fit even better our case, because every Impl is always non-empty. However, Scala 3 dropped do ... while in favor of while { do side effects here then emit true or false }() kind of thing.

Ok, so the issue I have with this implementation is that it checks for equality of blocks, which is stricter than the equality we want on ordered elements. Corresponding blocks of two BLists do not need to have the same offsets for the two lists to be equal.
Consider the lists BList(Block(1,2,3), Block(4,5), Empty) and BList(Block(1,2), Block(3,4,5), Empty). We want to say these are equal and I don't think isEqualTo will help with that, unless while checking equality we shift all the element to the left most spots as we go. That could be nice for reducing fragmentation, but can get expensive.

I could make isEqualTo compare arrays from a start index to an end index and do that where the blocks overlap...

Update: I made a new equals that greedily compares within blocks. I think once I have some benchmarking up I might try to compare the performance between the two, but before then if anyone has thoughts on which might be faster I'd love to hear them.

the issue I have with this implementation is that it checks for equality of blocks, which is stricter than the equality we want on ordered elements.

Good point. In that case I'd personally go with the Iterator-based approach.

I believe we do need to get Iterator here anyway, and having that, the equals and hashCode can be implemented in a very simple and straightforward way.

satorg · 2026-06-17T07:13:43Z

+      for (i <- this.offset until BlockSize) {
+        res = res * 31 + this.block(i).hashCode()
+      }


while would probably be more efficient that for:

var res = 31 var i = offset while ( ... )

Besides, you already have var res here anyway.

satorg · 2026-06-23T16:59:22Z

+import org.scalacheck.Gen
+
+trait ArbitraryBList {
+  def bListGen[A](implicit arb: Arbitrary[A], cogen: Cogen[A]): Gen[BList[A]] = {


Usually, Gen functions take Gen parameters, not Arbitrary. Moreover, Cogen param here looks a little strange – for a list-like structure it should not be necessary.

I based my implementation off the other arbitrary datatype implementations in the same folder, and it seems like none of their Gen functions have a Gen parameter, but they do have implicit Arbitrary parameters.
As for the Cogen param, I will remove it.

Thanks for the clarification. It's somewhat surprising that *Gen functions are implemented that way in CatsCollections. The ScalaCheck itself follows a slightly different approach.

For example,
stringOf takes Gen and returns Gen, whereas arbString is an implicit function that takes implicit Arbitrary (and uses stringOf under the hood).

Likewise,
buildableOf takes Gen[T] explicitly and returns Gen as well, whereas arbContainer is an implicit that takes Arbitrary[T] implicitly and returns Arbitrary.

So I'd say – that was the original idea of ScalaCheck, whereas the current implementation in CatsCollections deviates from there quite a bit.

satorg · 2026-06-23T17:03:06Z

+          Gen.resize(n / 2, bListGen[A]).flatMap { l =>
+            arbitrary[A => A].map(fn => l.map(fn))
+          }


This is the only place that requires Cogen instance, but I don't think we really need it:

// map Gen.resize(n / 2, bListGen[A]).map { l => l.map(identity) }

We can simply pass identity here, because it won't affect how map function works.

Thank you! I made this change.

This reverts commit f18b0e5. this reverses changes to toList

This reverts commit cb70945. the last revert I did on the wrong commit. this is the correct one, to undo changes to toList

satorg · 2026-06-28T00:11:59Z

+  def concat[B >: A](l2: BList[B]): BList[B]
+  def toList: List[A]
+  def isEmpty: Boolean
+  def toIterator: Iterator[A]


Actually, .toIterator is deprecated in Scala Library in favor of .iterator starting with v2.13:
https://github.com/scala/scala/blob/b5e60a0853bbd53f9975050240199483d41bda92/src/library/scala/collection/IterableOnce.scala#L204-L205

So perhaps, BList should use name iterator here as well.

made this change!

satorg · 2026-06-28T05:15:23Z

+          while (iterX.hasNext && iterY.hasNext) {
+            if (iterX.next() != iterY.next()) { // not sure if i could be comparing with != here...
+              return false
+            }
+          }
+          iterX.hasNext == iterY.hasNext


Perhaps, it can simply call iterX.sameElements(iterY) (after casting to the same base element type):
https://github.com/scala/scala/blob/b5e60a0853bbd53f9975050240199483d41bda92/src/library/scala/collection/Iterator.scala#L841-L855

This seems cleaner, thanks!

satorg · 2026-06-28T05:36:27Z

+  final class BListIterator[A](from: BList[A]) extends Iterator[A] {
+    private var curNode: BList[A] = from
+    private var curOffset: Int = if (!from.isEmpty) curNode.asInstanceOf[Impl[A]].offset else BlockSize
+


BListIterator can be private. In that case, no need to copy init params into private variables. Moreover, since Empty.iterator can simply return Iterator.empty, perhaps no need to pass common base BList here and simply work with Impl all along:

private final class BListIterator[A]( var curNode: Impl[A], var curOffset: Int) extends Iterator[A] { def hasNext: Boolean = // Leave `curOffset` at `curNode.block.length` if `curNode` is the last one. curOffset < curNode.block.length def next: A = ???

Thanks! Will do.

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

foundations for BList

116687a

satorg reviewed May 28, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

loladenney and others added 2 commits May 28, 2026 13:27

Update core/src/main/scala/cats/collections/BList.scala

f8b1a16

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

some more methods

877d3d6

satorg reviewed May 29, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed May 29, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed May 29, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed May 29, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

loladenney and others added 3 commits June 2, 2026 10:13

Update core/src/main/scala/cats/collections/BList.scala

9172ebe

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

Update core/src/main/scala/cats/collections/BList.scala

7224246

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

fixes to drop and toList

971943b

johnynek reviewed Jun 2, 2026

View reviewed changes

loladenney added 2 commits June 4, 2026 09:53

removed explicit typecasts and changed tail to tailBList

a1a5d9a

made nonempty object, and other changes addressing comments

f9e6c8b

satorg reviewed Jun 5, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed Jun 5, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed Jun 5, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala

loladenney added 4 commits June 5, 2026 11:59

starting on test suite with individual tests

089d854

addressing Sergey's commments from june 4th

7888e92

commit to check in on CI

e12726d

more individual tests

2a7df73

satorg reviewed Jun 9, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala

loladenney added 3 commits June 10, 2026 10:26

cleaned up some Impl methods

8b02e8a

forgot to format on previous commit

bdf9d36

fixed concat and lastOption

d9a0dee

armanbilge reviewed Jun 10, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

loladenney and others added 3 commits June 11, 2026 09:21

Update core/src/main/scala/cats/collections/BList.scala

69865f6

Co-authored-by: Arman Bilge <arman@typelevel.org>

property tests

2c30668

trying to get past ci again

dba9c9a

loladenney added 2 commits June 11, 2026 12:07

banging my head against the ci wall

34a91f2

stack overflow in blist generation for property tests addressed

62c530e

satorg reviewed Jun 11, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

loladenney added 5 commits June 12, 2026 09:44

fixed uncons and tail to prevent blocking GC

d67a55c

make Impl class, removed case

f8a1d26

overrode equals and hashcode

2e56cfb

fixed for scala 3

d623b83

fixed equals

aefdb7b

satorg reviewed Jun 17, 2026

View reviewed changes

loladenney added 3 commits June 18, 2026 11:58

made another version of equals

db06cc4

forgot to run formatter

ffde5b5

changed hashcode to match other scala list-like datastructures

0bf49bd

satorg reviewed Jun 23, 2026

View reviewed changes

loladenney added 6 commits June 23, 2026 13:35

removed usedless cogen param

f18b0e5

fixing issues with toList caused by mutable builder

cb70945

Revert "removed usedless cogen param"

ce76352

This reverts commit f18b0e5. this reverses changes to toList

removed usedless cogen param

0c35b05

Revert "fixing issues with toList caused by mutable builder"

8659fea

This reverts commit cb70945. the last revert I did on the wrong commit. this is the correct one, to undo changes to toList

iterator, and using it for hashcode and equals

7325bf8

satorg reviewed Jun 28, 2026

View reviewed changes

Comment thread core/src/main/scala/cats/collections/BList.scala Outdated

satorg reviewed Jun 28, 2026

View reviewed changes

loladenney and others added 2 commits June 29, 2026 14:48

Update core/src/main/scala/cats/collections/BList.scala

62bee70

Co-authored-by: Sergey Torgashov <satorg@users.noreply.github.com>

changes to iterator

12d460b


		private class Impl[A](val offset: Int, val block: Array[A], val tailBList: BList[A]) extends NonEmpty[A] {

		override def equals(other: Any): Boolean = {

Uh oh!

Uh oh!

Conversation

loladenney commented May 28, 2026

Uh oh!

Uh oh!

johnynek commented May 28, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

loladenney Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

loladenney Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

satorg Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

loladenney Jun 18, 2026 •

edited

Loading

loladenney Jun 18, 2026 •

edited

Loading

satorg Jun 23, 2026 •

edited

Loading