
Contributor

@jsignell jsignell commented Dec 12, 2025

This PR makes the handling of chunks="auto" consistent between open_zarr and open_dataset(..., engine="zarr").

The two entry points still differ in their defaults: open_zarr defaults to chunks="auto" and uses a chunk manager (aka dask) when one is available in your env, while open_dataset defaults to chunks=None (aka no chunks).
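
As a rough illustration of the remaining difference in defaults, the sketch below opens one store through both entry points. The function name `open_with_defaults` and the `store` argument are hypothetical and exist only for this example; the xarray calls are the ones named above.

```python
def open_with_defaults(store):
    """Open the same Zarr store through both entry points (illustrative only)."""
    # Imported lazily so the sketch can be read and imported without xarray installed.
    import xarray as xr

    # open_zarr: chunks defaults to "auto", so variables are backed by a
    # chunk manager (dask) when one is available in the environment.
    ds_zarr = xr.open_zarr(store)

    # open_dataset: chunks defaults to None, so no chunk manager is used.
    ds_plain = xr.open_dataset(store, engine="zarr")

    # Passing chunks="auto" explicitly is the case this PR aims to make
    # consistent between the two entry points (requires a chunk manager).
    ds_auto = xr.open_dataset(store, engine="zarr", chunks="auto")

    return ds_zarr, ds_plain, ds_auto
```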

@github-actions github-actions bot added topic-backends topic-zarr Related to zarr storage library io labels Dec 12, 2025
@jsignell jsignell self-assigned this Dec 12, 2025
@jsignell jsignell marked this pull request as ready for review December 12, 2025 19:10
Breaking Changes
~~~~~~~~~~~~~~~~

- Remove special mapping of ``"auto"`` to ``{}`` in ``open_zarr``. This matches the behavior of ``open_dataset(..., engine="zarr")``.
Contributor

Someone is going to complain but I think this is an improvement.

  • it removes one source of confusion.
  • in general you do want a multiple of the on-disk chunks.
  • users can still pass {} to get previous behaviour.

^ perhaps we can call that third point out in the release note.

Also, since this could be pretty breaking I'd like us to get a few more thumbs up on it.
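
For reference, the opt-out mentioned in the third bullet could look roughly like this. The helper name `open_on_disk_chunks` and the `store` argument are hypothetical; `chunks={}` is the existing xarray spelling for "use the engine's preferred (on-disk) chunks".

```python
def open_on_disk_chunks(store):
    """Open a Zarr store using exactly the on-disk chunking (illustrative only)."""
    # Imported lazily so the sketch can be read and imported without xarray installed.
    import xarray as xr

    # chunks={} asks for the engine's preferred chunks, which for Zarr are
    # generally the on-disk chunks. Under this PR, chunks="auto" would no
    # longer be silently mapped to this in open_zarr.
    return xr.open_zarr(store, chunks={})
```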

Contributor Author

> since this could be pretty breaking I'd like us to get a few more thumbs up on it.

Yeah absolutely.

I am wondering if it's worth trying to improve the auto logic to make it less willing to split chunks before we merge this. If auto never split chunks then I think this is a much less breaking change. As it currently stands, auto prefers to keep existing chunk boundaries, but it isn't strict about it. I would kind of like something that is logically max({}, "auto"), but maybe that should be a different name?
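
To make the max({}, "auto") idea concrete, here is a minimal pure-Python sketch of a strategy that never splits on-disk chunks. It is not the current dask or xarray auto logic; `grow_chunks`, its parameters, and the byte limit are assumptions made up for illustration.

```python
from math import prod


def grow_chunks(shape, disk_chunks, itemsize, limit_bytes):
    """Pick chunk sizes that are whole multiples of the on-disk chunks.

    Hypothetical helper, not part of xarray or dask. It starts from the
    on-disk chunks and grows one on-disk step at a time along each axis
    while the chunk stays within limit_bytes. It never returns less than
    one on-disk chunk per axis, even if that already exceeds the limit.
    """
    # Start from the on-disk chunking, clamped to the array shape.
    chunks = [min(c, s) for c, s in zip(disk_chunks, shape)]
    grew = True
    while grew:
        grew = False
        for axis, (size, step) in enumerate(zip(shape, disk_chunks)):
            if chunks[axis] >= size:
                continue  # this axis already spans the whole dimension
            candidate = list(chunks)
            # Grow by one on-disk chunk; the last chunk may be clamped to the edge.
            candidate[axis] = min(chunks[axis] + step, size)
            if prod(candidate) * itemsize <= limit_bytes:
                chunks = candidate
                grew = True
    return tuple(chunks)
```

With a (1000, 1000) float64 array stored in (100, 100) chunks and a 320,000-byte limit, `grow_chunks((1000, 1000), (100, 100), 8, 320_000)` gives `(200, 200)`; with a limit smaller than a single on-disk chunk it falls back to `(100, 100)` rather than splitting.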


Development

Successfully merging this pull request may close these issues.

Inconsistent chunking between xr.open_zarr and xr.open_dataset(..., engine='zarr') with chunks="auto"

2 participants