remove random subchunk ordering from sharding codec #4008

@d-v-b

#3826 introduced an option for writing subchunks in a random order, using a random number generator, as an interpretation of the "unordered" subchunk write order. I think "unordered" should be interpreted as "no guaranteed order", which allows an implementation to write chunks in whatever order is most performant, including plain lexicographic or Morton order.

So I don't think zarr-python needs any random number generation in the sharding codec to satisfy the contract of "unordered". For that mode we can arbitrarily use a concrete subchunk order for now, but we are free to change it when more performant options become available.

That means we should remove the random number generation logic from the sharding codec. This has the advantage of resolving some of the sticky issues created by the rng machinery, like #4005 and #4004.
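To make the proposal concrete, here is a minimal sketch of what "unordered" could look like without any rng. The names `SubchunkWriteOrder`, `lexicographic_order`, and `subchunk_order` are hypothetical and are not zarr-python's actual API; the sketch also assumes, purely for illustration, that "unordered" currently aliases lexicographic order.

```python
from itertools import product
from typing import Iterator, Literal

# Hypothetical names for illustration only; not zarr-python's real API.
SubchunkWriteOrder = Literal["lexicographic", "unordered"]


def lexicographic_order(chunks_per_shard: tuple[int, ...]) -> Iterator[tuple[int, ...]]:
    """Yield subchunk coordinates in row-major (C) order."""
    return product(*(range(n) for n in chunks_per_shard))


def subchunk_order(
    chunks_per_shard: tuple[int, ...], order: SubchunkWriteOrder
) -> list[tuple[int, ...]]:
    """Return the order in which subchunks would be written.

    "unordered" promises nothing about the order, so any concrete,
    deterministic choice satisfies it. Lexicographic is picked here
    arbitrarily (an assumption) and could be swapped for something
    faster later without breaking the contract.
    """
    if order in ("lexicographic", "unordered"):
        return list(lexicographic_order(chunks_per_shard))
    raise ValueError(f"unknown subchunk write order: {order!r}")
```

With this shape, no seed or generator state needs to be threaded through the codec, and readers can never depend on the "unordered" layout because it is documented as unspecified.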

@ilan-gold @LDeakin does this make sense? I'm sorry for not catching this in the review for #3826; I don't think I fully grasped the intent of the unordered mode at that time.
