A quick sketch of how we can couple zarr with async code. This is aimed slightly at pyscript, but can be useful in its own right: for instance, I asked a while ago what it would take to fetch chunks concurrently not just from one array but, say, one chunk each from multiple arrays in a dataset.
This sketch is only for reading...
Outline:
- we subclass Group, so that `__getitem__` produces an AsyncArray (see the sketches after this list)
- we subclass Array as AsyncArray and override everything from its `_get_selection` (which does the IO) up to `__getitem__` (which is the user-facing API)
- we have three stores (sketched in code below):
  - a synchronous HTTP one for the dataset metadata. This can be based on `requests` for standard Python or pyfetch under pyodide. Note that sync calls in pyodide are limited to text, which is perfect for this use case.
  - a fake synchronous store which merely records the paths that are attempted, but returns FileNotFound for all of them
  - a fake synchronous store in which we have prefilled all the keys it will ever need, i.e., this can be a simple dict
- The flow goes as follows (see the end-to-end sketch below):
  - a zarr AsyncGroup is made by reading the JSON metadata files synchronously
  - when we attempt to get data, we make a coroutine in which we first use the fake store and zarr's existing machinery to record all the keys that will be needed (this temporarily produces an array of NaN); we then fetch all of those keys concurrently, populate a dict with them, and have the existing zarr machinery read from that dict
- For interest, this is an fsspec async filesystem for pyodide. We don't need it to be this verbose for zarr.
- Note that in the browser, no fetches can ever be done without considering CORS, but any dataset known to work with zarr.js will work for this case too.
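
For concreteness, here is a minimal sketch of the three stores, assuming zarr-python 2.x and its mapping-style store interface (string keys to bytes, KeyError for anything missing). The class names `HTTPJSONStore` and `RecordingStore` are invented for illustration.

```python
from collections.abc import Mapping

import requests  # plain CPython; under pyodide this role is played by pyfetch/open_url


class HTTPJSONStore(Mapping):
    """Synchronous, read-only store used only for the small JSON metadata files."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def __getitem__(self, key):
        resp = requests.get(f"{self.base_url}/{key}")
        if resp.status_code != 200:
            raise KeyError(key)
        return resp.content

    # zarr only needs __getitem__/__contains__ for reading; these keep Mapping happy
    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


class RecordingStore(dict):
    """A dict pre-filled with metadata that notes every missing key it is asked for.

    zarr treats a missing chunk key as fill_value, so reading a selection through
    this store completes (yielding a NaN/fill-value array) and leaves behind the
    list of chunk keys a real read would need.
    """

    def __init__(self, metadata):
        super().__init__(metadata)
        self.requested = []

    def __missing__(self, key):
        self.requested.append(key)
        raise KeyError(key)


# The third store needs no class at all: once the chunk bytes are fetched, a plain
# dict holding {metadata keys: JSON bytes, chunk keys: raw bytes} is itself a
# valid zarr store.
```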
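And a hedged sketch of the array half, reusing `RecordingStore` from above: rather than overriding every method in the `_get_selection` → `__getitem__` chain, it exposes a single `getitem` coroutine that performs the record-then-fetch-then-read steps; the Group subclass, whose `__getitem__` would simply hand back an `AsyncArray` over the already-fetched metadata, is left out. `fetch_chunks` and aiohttp stand in here for pyfetch in the browser.

```python
import asyncio

import aiohttp
import zarr


async def fetch_chunks(base_url, keys):
    """Fetch all requested keys concurrently; keys that 404 simply stay missing."""
    async with aiohttp.ClientSession() as session:

        async def one(key):
            async with session.get(f"{base_url}/{key}") as resp:
                if resp.status != 200:
                    return key, None
                return key, await resp.read()

        pairs = await asyncio.gather(*(one(k) for k in keys))
    return {k: v for k, v in pairs if v is not None}


class AsyncArray(zarr.Array):
    """A read-only zarr Array whose user-facing read is a coroutine.

    `store` is assumed to be a plain dict already holding the JSON metadata
    (".zarray" etc.), fetched synchronously beforehand; chunk bytes are pulled
    in concurrently on demand and merged into that same dict.
    """

    def __init__(self, store, base_url):
        super().__init__(store, read_only=True)
        self.base_url = base_url

    async def getitem(self, selection):
        # 1) dry run over a recording store: zarr walks the selection and asks
        #    for every chunk key it needs, getting fill_value (NaN) back
        recorder = RecordingStore(dict(self.store))
        zarr.Array(recorder, read_only=True)[selection]

        # 2) fetch all of the recorded keys concurrently
        self.store.update(await fetch_chunks(self.base_url, recorder.requested))

        # 3) real read: zarr's ordinary synchronous machinery now finds every
        #    chunk in the dict, so no further IO happens here
        return super().__getitem__(selection)
```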
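Putting the flow together end to end (the URL, array shape, and selection are placeholders; under pyodide/pyscript this coroutine would be awaited on the page's event loop rather than via `asyncio.run`):

```python
async def main():
    base_url = "https://example.com/data.zarr/temperature"  # hypothetical, CORS-enabled
    # step 1: synchronous metadata read
    meta = HTTPJSONStore(base_url)
    store = {".zarray": meta[".zarray"]}
    arr = AsyncArray(store, base_url)
    # steps 2-3: chunk keys recorded, fetched concurrently, then read from the dict
    block = await arr.getitem((slice(0, 100), slice(None)))  # assumes a 2-D array
    print(block.shape)


asyncio.run(main())
```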