Performance issues?

Hi - thanks for creating a library for Bloom filters (with a great API!)

I have run some experiments using your scalable Bloom filters, but they do not seem to perform very well :(

I created a [fork](https://github.com/jnysteen/Maybe) containing code for benchmarking the use cases I want to use your library for, as well as experiments with optimizing some of your code.

One performance bottleneck can be found in your choice of method for converting objects to bytes, in order to hash the bytes. I have created [a project](https://github.com/jnysteen/ByteSerializationBenchmark) for benchmarking various methods for the conversion, showing that the `BinaryFormatter` used in your library performs horribly compared to the alternatives.

The benchmarks in my fork shows the performance improvements achievable by replacing `BinaryFormatter` with alternatives (I have included the benchmarking results at the bottom of this issue - notice the memory usage column on the far right)

Despite the new optimizations, the memory usage of your Bloom filters (esspecially the scalable version, which I really want to use) is _very_ high compared to an alternative like a `HashSet`.

Is this just the nature of the implementation, or can it be improved?


(The table below is a part of the output of the benchmarking code. In the tabl, your original implementation of a scalable Bloom filter is called `ScalableBloomFilter`.)

``` ini

BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.5 (18F203) [Darwin 18.6.0]
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=2.2.300
  [Host] : .NET Core 2.2.5 (CoreCLR 4.6.27617.05, CoreFX 4.6.27618.01), 64bit RyuJIT
  Core   : .NET Core 2.2.5 (CoreCLR 4.6.27617.05, CoreFX 4.6.27618.01), 64bit RyuJIT

Job=Core  Runtime=Core  

```
|                              Method | ItemsToInsert | MaximumErrorRate |         Mean |         Error |        StdDev | Ratio | RatioSD | Rank |       Gen 0 | Gen 1 | Gen 2 |     Allocated |
|------------------------------------ |-------------- |----------------- |-------------:|--------------:|--------------:|------:|--------:|-----:|------------:|------:|------:|--------------:|
|                             HashSet |          1000 |             0.02 |     146.3 μs |     0.1939 μs |     0.1620 μs |  1.00 |    0.00 |    1 |     15.1367 |     - |     - |      70.31 KB |
|  StringOptimizedScalableBloomFilter |          1000 |             0.02 |     909.7 μs |     5.6858 μs |     5.0404 μs |  6.21 |    0.03 |    2 |    234.3750 |     - |     - |    1083.49 KB |
| GenericOptimizedScalableBloomFilter |          1000 |             0.02 |   5,701.6 μs |    42.8845 μs |    40.1142 μs | 39.00 |    0.27 |    3 |   2414.0625 |     - |     - |   11144.79 KB |
|                 ScalableBloomFilter |          1000 |             0.02 |   6,271.6 μs |    56.0022 μs |    52.3845 μs | 42.89 |    0.41 |    4 |   2390.6250 |     - |     - |   11029.23 KB |
|                                     |               |                  |              |               |               |       |         |      |             |       |       |               |
|                             HashSet |         10000 |             0.02 |   1,492.3 μs |     3.4257 μs |     3.0368 μs |  1.00 |    0.00 |    1 |    152.3438 |     - |     - |     703.13 KB |
|  StringOptimizedScalableBloomFilter |         10000 |             0.02 |  17,605.3 μs |    45.3247 μs |    37.8482 μs | 11.80 |    0.04 |    2 |   4062.5000 |     - |     - |   18814.31 KB |
| GenericOptimizedScalableBloomFilter |         10000 |             0.02 | 106,426.8 μs | 1,297.8431 μs | 1,214.0032 μs | 71.35 |    0.88 |    3 |  43600.0000 |     - |     - |  201469.91 KB |
|                 ScalableBloomFilter |         10000 |             0.02 | 114,675.1 μs |   952.2085 μs |   844.1081 μs | 76.84 |    0.62 |    4 |  43200.0000 |     - |     - |   199364.2 KB |
|                                     |               |                  |              |               |               |       |         |      |             |       |       |               |
|                             HashSet |         50000 |             0.02 |   8,018.3 μs |    23.7326 μs |    21.0383 μs |  1.00 |    0.00 |    1 |    828.1250 |     - |     - |    3828.13 KB |
|  StringOptimizedScalableBloomFilter |         50000 |             0.02 | 103,353.9 μs | 1,100.2632 μs |   975.3547 μs | 12.89 |    0.11 |    2 |  26400.0000 |     - |     - |  122239.81 KB |
| GenericOptimizedScalableBloomFilter |         50000 |             0.02 | 693,657.8 μs | 2,892.3871 μs | 2,564.0259 μs | 86.51 |    0.44 |    3 | 285000.0000 |     - |     - | 1316991.46 KB |
|                 ScalableBloomFilter |         50000 |             0.02 | 734,364.0 μs | 9,817.7735 μs | 9,183.5514 μs | 91.65 |    1.16 |    4 | 282000.0000 |     - |     - | 1303198.18 KB |


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance issues? #9

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Method	ItemsToInsert	MaximumErrorRate	Mean	Error	StdDev	Ratio	RatioSD	Rank	Gen 0	Gen 1	Gen 2	Allocated
HashSet	1000	0.02	146.3 μs	0.1939 μs	0.1620 μs	1.00	0.00	1	15.1367	-	-	70.31 KB
StringOptimizedScalableBloomFilter	1000	0.02	909.7 μs	5.6858 μs	5.0404 μs	6.21	0.03	2	234.3750	-	-	1083.49 KB
GenericOptimizedScalableBloomFilter	1000	0.02	5,701.6 μs	42.8845 μs	40.1142 μs	39.00	0.27	3	2414.0625	-	-	11144.79 KB
ScalableBloomFilter	1000	0.02	6,271.6 μs	56.0022 μs	52.3845 μs	42.89	0.41	4	2390.6250	-	-	11029.23 KB

HashSet	10000	0.02	1,492.3 μs	3.4257 μs	3.0368 μs	1.00	0.00	1	152.3438	-	-	703.13 KB
StringOptimizedScalableBloomFilter	10000	0.02	17,605.3 μs	45.3247 μs	37.8482 μs	11.80	0.04	2	4062.5000	-	-	18814.31 KB
GenericOptimizedScalableBloomFilter	10000	0.02	106,426.8 μs	1,297.8431 μs	1,214.0032 μs	71.35	0.88	3	43600.0000	-	-	201469.91 KB
ScalableBloomFilter	10000	0.02	114,675.1 μs	952.2085 μs	844.1081 μs	76.84	0.62	4	43200.0000	-	-	199364.2 KB

HashSet	50000	0.02	8,018.3 μs	23.7326 μs	21.0383 μs	1.00	0.00	1	828.1250	-	-	3828.13 KB
StringOptimizedScalableBloomFilter	50000	0.02	103,353.9 μs	1,100.2632 μs	975.3547 μs	12.89	0.11	2	26400.0000	-	-	122239.81 KB
GenericOptimizedScalableBloomFilter	50000	0.02	693,657.8 μs	2,892.3871 μs	2,564.0259 μs	86.51	0.44	3	285000.0000	-	-	1316991.46 KB
ScalableBloomFilter	50000	0.02	734,364.0 μs	9,817.7735 μs	9,183.5514 μs	91.65	1.16	4	282000.0000	-	-	1303198.18 KB

Performance issues? #9

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions