-
Notifications
You must be signed in to change notification settings - Fork 52
feat: zarr3 #220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: zarr3 #220
Conversation
8371f18 to
07edf73
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #220 +/- ##
==========================================
+ Coverage 72.96% 73.85% +0.88%
==========================================
Files 10 10
Lines 825 872 +47
==========================================
+ Hits 602 644 +42
- Misses 223 228 +5 ☔ View full report in Codecov by Sentry. |
7c0f187 to
1459e83
Compare
|
@floriankrb what was the conclusion in terms of adding/updating to zarr 3? |
|
@anaprietonem I updated the description of the PR |
tjhunter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@floriankrb @b8raoult this is fantastic, thank you for the hard work! We will try it this week or next and get back to you.
|
|
||
| def __delitem__(self, key: str) -> None: | ||
| """Prevent deletion of items.""" | ||
| raise NotImplementedError() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this operation is illegale, it is not a not implementer error
| """Prevent deletion of items.""" | ||
| raise NotImplementedError() | ||
|
|
||
| def __setitem__(self, key: str, value: bytes) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| def __getitem__(self, key: str) -> bytes: | ||
| """Retrieve an item from the store and print debug information.""" | ||
| # print() | ||
| print("GET", key, self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am putting linter rules in wegen to disable all print statements
| from typing import Any | ||
| from typing import Optional | ||
|
|
||
| import zarr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you could move this import into create_array and put a version check there (you already have it in zarr_2_or_3)
| def create_array(zarr_root, *args, **kwargs): | ||
| if "compressor" in kwargs and kwargs["compressor"] is None: | ||
| # compressor is deprecated, use compressors instead | ||
| kwargs.pop("compressor") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no need for checking if the key is there. you can do kwargs.pop("x", None)
| import numpy as np | ||
|
|
||
| if dtype == "datetime64[s]": | ||
| dtype = np.dtype("int64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.int64?
| else: | ||
| LOG.warning("⚠️" * 80) | ||
| LOG.warning( | ||
| f"Only Zarr version 2 is supported when creating datasets, found version: {zarr.__version__}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not support writing in zarr3 format for the time being, only reading. Is there a use case for it?
cd9231b to
c95780e
Compare
for more information, see https://pre-commit.ci
Description
Anemoi-datasets should be agnostic to the version of zarr. It should run with zarr3 installed or with zarr2 installed.
We should also take into account that zarr2 code cannot read datasets created by zarr3. So, ideally, we should
1 - update anemoi-datasets to work with zarr2 and zarr3
2 - keep using zarr2 to build datasets (dependency
anemoi-datasets[create]on zarr<=2, but have zarr2 or 3 when reading (dependency ofanemoi-datasetson zarr3 - when user have updated their environment (6 months?), start building zarr3 datasets
This PR mostly addresses the first point, making anemoi-datasets detect the version of zarr and adapt to it.
What is still missing is:
📚 Documentation preview 📚: https://anemoi-datasets--220.org.readthedocs.build/en/220/