Grouping by multiple arrays with Xarray

Monday, September 2nd, 2024



TLDR#

Xarray now supports grouping by multiple variables (docs). 🎉 😱 🤯 🥳. Try it out!

How do I use it?#

Install xarray>=2024.09.0 and optionally flox for better performance with reductions.

Simple example#

Simple grouping by multiple categorical variables is easy:

import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

da = xr.DataArray(
    np.array([1, 2, 3, 0, 2, np.nan]),
    dims="d",
    coords=dict(
        labels1=("d", np.array(["a", "b", "c", "c", "b", "a"])),
        labels2=("d", np.array(["x", "y", "z", "z", "y", "x"])),
    ),
)

gb = da.groupby(["labels1", "labels2"])
gb

<DataArrayGroupBy, grouped over 2 grouper(s), 9 groups in total:
	'labels1': 3 groups with labels 'a', 'b', 'c'
	'labels2': 3 groups with labels 'x', 'y', 'z'>

Reductions work as usual:

gb.mean()

So does map:

gb.map(lambda x: x[0])

More complex time grouping#

Grouping by multiple "virtual" variables such as "time.year" and "time.month" is also supported:

import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds.groupby(["time.year", "time.month"]).mean()

Multiple Grouper types#

The above syntax da.groupby(["labels1", "labels2"]) is a shortcut for using Grouper objects:

da.groupby(labels1=UniqueGrouper(), labels2=UniqueGrouper())

Grouper objects allow you to express more complicated GroupBy problems. For example, you can combine different grouper types in a single call: categorical grouping with UniqueGrouper, binning with BinGrouper, and resampling with TimeResampler.

from xarray.groupers import BinGrouper

ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)
gb = ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper())
gb

<DatasetGroupBy, grouped over 2 grouper(s), 4 groups in total:
	'x_bins': 2 groups with labels (5,, 15], (15,, 25]
	'letters': 2 groups with labels 'a', 'b'>

Now reduce as usual:

gb.mean()