Is there any way for a codec (as applied to encoding/decoding within zarr) to be reliably provided the shape of the chunk it is decoding?
My use-case here is to write codecs that apply dynamic scaling and quantization (normalizing by local min/max within a chunk, operating on planes of an array with three or more dimensions) and/or two-dimensional linear prediction (essentially extending `numcodecs.Delta`).
When calling `Codec.encode()` this is not a problem; the buffer supplied is a full array-like unless an earlier filter stage has changed it. On decoding, however, the codec is only reliably supplied a byte stream without shape information; the `out` parameter to `decode()` is supplied inconsistently.
Obviously, Zarr knows the shape of the chunk it is trying to fill. Without that information, I'll have to encode the array shape in the output data stream. That's unnecessary redundancy, and more importantly it is aesthetically displeasing.
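For concreteness, here is a minimal sketch of the workaround described above: a numcodecs-style codec that prepends a small shape header to its encoded bytes so that `decode()` can recover the chunk shape on its own. The header layout (a `uint8` ndim followed by `uint32` dims) and the class name `ShapeAwareDelta` are my own inventions for illustration, not numcodecs conventions, and the dtype is hardcoded to `int64` for brevity (a real codec would record it too).

```python
import struct
import numpy as np

class ShapeAwareDelta:
    """Illustrative delta codec that embeds the chunk shape in the stream."""

    codec_id = "shape_aware_delta"  # hypothetical id

    def encode(self, buf):
        arr = np.ascontiguousarray(buf)
        # Header: ndim as uint8, then each dimension as uint32 (assumed layout).
        header = struct.pack("<B", arr.ndim) + struct.pack(f"<{arr.ndim}I", *arr.shape)
        # Keep the first element of each row verbatim, store differences after it.
        enc = arr.copy()
        enc[..., 1:] = np.diff(arr, axis=-1)
        return header + enc.tobytes()

    def decode(self, buf, out=None):
        buf = memoryview(buf)
        ndim = struct.unpack_from("<B", buf, 0)[0]
        shape = struct.unpack_from(f"<{ndim}I", buf, 1)
        # dtype hardcoded to int64 for this sketch; a real codec would store it.
        data = np.frombuffer(buf, dtype=np.int64, offset=1 + 4 * ndim).reshape(shape)
        dec = np.cumsum(data, axis=-1)  # inverts the delta along the last axis
        if out is not None:
            out[...] = dec
            return out
        return dec
```

This works, but it duplicates information zarr already holds, which is exactly the redundancy the issue is about.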
I have previously pushed for the concept of a "context", which zarr would pass to both the codec's encode/decode methods and to the storage layer, specifying where in the array we are, the shape, the key, and other useful information available at call time. Currently the context (`zarr.context.Context`) only has `meta_array: NDArrayLike`; I see no reason not to populate it further.
The chunk's position within the larger super-array would also be interesting, since it would allow special-case encoders that apply data transforms along the way.
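To make the proposal concrete, here is a hedged sketch of what a more fully populated context might look like and how a decode path could use it. Only `meta_array` exists on `zarr.context.Context` today; `chunk_shape`, `chunk_key`, `chunk_origin`, the `RichContext` name, and `decode_with_context` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
import numpy as np

@dataclass
class RichContext:
    """Hypothetical extension of zarr.context.Context."""
    meta_array: np.ndarray                          # the only field that exists today
    chunk_shape: Optional[Tuple[int, ...]] = None   # shape of the chunk being decoded
    chunk_key: Optional[str] = None                 # store key, e.g. "0.3.1"
    chunk_origin: Optional[Tuple[int, ...]] = None  # chunk offset within the full array

def decode_with_context(buf, context):
    # With the shape available from context, decode can reshape the raw
    # byte stream directly, with no shape header embedded in the stream.
    arr = np.frombuffer(buf, dtype=context.meta_array.dtype)
    if context.chunk_shape is not None:
        arr = arr.reshape(context.chunk_shape)
    return arr
```

The `chunk_origin` field is what the position-dependent encoders mentioned above would need: a codec could, for example, vary its prediction or scaling parameters based on where the chunk sits in the super-array.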