Skip to content

penemue/keap

Repository files navigation

Keap

Build Maven Central JitPack Apache License 2.0 Pure Kotlin

Keap is a heap data structure written in Kotlin, similar to a binary heap. It separately maintains the queue of elements and a tournament tree atop the queue.

Keap is stable, that is, it keeps initial order of equal elements.

It is faster than binary heap in terms of the number of comparisons. For any kind of input (random or ordered) of size n, the heapify procedure requires exactly n - 1 comparisons. Binary heap reaches this number of comparisons only for ordered input. For random input, compared to binary heap, keap does approximately 1.9 times fewer comparisons to heapify, 2.7 times fewer to offer, and 3.5 times fewer to poll. However, an array-backed keap on average consumes 1.5 times more memory than an array-backed binary heap.

Performance summary of keap and binary heap (both array-backed) is as follows:

Keap average Keap worst case Binary heap average Binary heap worst case
Heapify exactly n - 1 exactly n - 1 Θ(n) Θ(n)
Peek Θ(1) Θ(1) Θ(1) Θ(1)
Poll Θ(log n) Θ(log n) Θ(log n) Θ(log n)
Offer Θ(1) Θ(log n) Θ(1) Θ(log n)
Memory used Θ(n) Θ(n) exactly n exactly n

Here are two applications: a keap-based PriorityQueue as a replacement for java.util.PriorityQueue and the Keapsort sorting algorithm. Both might be useful in two cases:

  1. stability is a must;
  2. comparing elements is a rather heavyweight operation (e.g., it requires a database access).

PriorityQueue

PriorityQueue is a keap-based stable replacement of java.util.PriorityQueue.

Keapsort

Keapsort is a Heapsort with a keap used instead of a binary heap. It is a comparison-based sorting algorithm with a worst-case O(n log n) runtime. Since keap is a stable priority queue, Keapsort is stable as well. Unlike Heapsort, Keapsort does not have an in-place version.

Like Heapsort, Keapsort produces its first output after Θ(n) comparisons. For Keapsort, this Θ(n) estimate is exactly n - 1. To sort random input completely, Keapsort does almost half as many comparisons as Heapsort, roughly 1% fewer than Mergesort, and 1–2% more than Timsort.

Like Heapsort, Keapsort is not a modern-CPU-friendly algorithm because it has poor data-cache performance.

Like Heapsort, Keapsort does not seem to parallelize well.

Benchmarks

The JMH Gradle Plugin (plugin id me.champeau.jmh) is used to build and run benchmarks. The benchmark scores below are historical, originally captured on a PC running Windows 7 with an Intel(R) Core(TM) i7-3770 3.4 GHz CPU and 64-bit JRE 1.8.0_112-b15 with java parameters -Xms1g -Xmx1g. To get results in your environment, run:

./gradlew clean jar jmh

For both java.util.PriorityQueue and the keap-based PriorityQueue, there are two types of benchmarks:

  1. Benchmarks measuring operations on random elements: heapify (building the queue from a collection), offer, peek, and poll. The queue elements are random strings about 30 characters long with a constant 10-character prefix. The queue size is 10000.
  2. Benchmarks measuring offering of ordered elements: offerIncreasing (successively increasing elements) and offerDecreasing (successively decreasing elements). The queue elements are strings about 30 characters long with a constant 10-character prefix. The queue size is 10000.

Results grouped for easy review are as follows. Building the queue from collections of random elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueRandomBenchmark.heapify           thrpt   20   1.810 ± 0.039  ops/ms
KeapQueueRandomBenchmark.heapify           thrpt   20   3.487 ± 0.013  ops/ms

Basic queue operations with random elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueRandomBenchmark.offer             thrpt   20   3.649 ± 0.041  ops/us
JavaQueueRandomBenchmark.peek              thrpt   20  79.458 ± 0.626  ops/us
JavaQueueRandomBenchmark.poll              thrpt   20   4.250 ± 0.070  ops/us
KeapQueueRandomBenchmark.offer             thrpt   20   3.727 ± 0.073  ops/us
KeapQueueRandomBenchmark.peek              thrpt   20  84.371 ± 0.499  ops/us
KeapQueueRandomBenchmark.poll              thrpt   20  11.993 ± 0.239  ops/us

Offering ordered elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueOrderedBenchmark.offerDecreasing  thrpt   20   5.432 ± 0.177  ops/us
JavaQueueOrderedBenchmark.offerIncreasing  thrpt   20  30.795 ± 0.588  ops/us
KeapQueueOrderedBenchmark.offerDecreasing  thrpt   20   7.959 ± 0.033  ops/us
KeapQueueOrderedBenchmark.offerIncreasing  thrpt   20   8.860 ± 0.044  ops/us

The scores above are operations per microsecond, or per millisecond for heapify. Higher is better.

Download

Maven Central

Maven Central

Versions up to and including 0.3.0 are available from Maven Central. Newer releases are published via JitPack (see below).

<!-- in Maven project -->
<dependency>
    <groupId>com.github.penemue</groupId>
    <artifactId>keap</artifactId>
    <version>0.3.0</version>
</dependency>
// in Gradle project (Kotlin DSL)
dependencies {
    implementation("com.github.penemue:keap:0.3.0")
}

JitPack

JitPack

Any commit, branch, or tag of this repository can be consumed as a Maven artifact via JitPack.

// in Gradle project (Kotlin DSL)
repositories {
    maven(url = "https://jitpack.io")
}
dependencies {
    implementation("com.github.penemue:keap:0.4.1")
}
<!-- in Maven project -->
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
<dependency>
    <groupId>com.github.penemue</groupId>
    <artifactId>keap</artifactId>
    <version>0.4.1</version>
</dependency>

Building from Source

Gradle is used to build, test, and run benchmarks. JDK 1.8+ is required to run the produced artifact; the project itself is built with Kotlin 2.3.21. To build the project, run:

./gradlew

ToDo

  • Compare Keapsort to Quicksort.

  • A tree-backed version of keap could be exposed as an immutable/persistent/lock-free heap data structure. In addition, it could support a heap-merge operation in Θ(1) time.

About

Keap is a heap data structure presenting stable PriorityQueue and stable Keapsort sorting algorithm

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors