Hi, I'm looking for an efficient way of joining on subsets of coordinates without going through pandas. I have tried xr.merge, xr.align, and xr.combine_by_coords with little success, but perhaps I'm missing something. Below is an example where I use pandas to achieve the desired result. How could I get the same result using xarray directly?

import xarray as xr, numpy as np, pandas as pd
d1 = xr.DataArray(np.random.rand(5, 100), dims=["x", "y"])
d2 = xr.DataArray(np.random.rand(4), dims=["x"])
d1["c1"] = xr.DataArray([0, 0, 0, 2, 3], dims="x")
d1["c2"] = xr.DataArray([0, 0, 1, 1, 1], dims="x")
d1["c3"] = xr.DataArray([2, 0, 0, 0, 0], dims="x")
d2["c1"] = xr.DataArray([0, 0, 2, 3], dims="x")
d2["c2"] = xr.DataArray([0, 1, 1, 1], dims="x")
d1 = d1.to_dataset(name="d1")
d2 = d2.to_dataset(name="d2")
expected_result = d1.to_dataframe().reset_index().merge(d2.to_dataframe().reset_index(drop=True), on=["c1", "c2"], how="left")\
.set_index(["x", "y"]).to_xarray().set_coords(["c1", "c2", "c3"]).drop_vars(["x", "y"])
# Edit: added the following to remove broadcasting
for a in ["c1", "c2", "c3", "d2"]:
    expected_result[a] = expected_result[a].isel(y=0)

Note: I believe I have gone through the combine docs without finding a similar use case.
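To make the target concrete, here is a minimal sketch of the join that the pandas round trip above performs, reduced to the per-`x` columns. The `d2` values (10, 20, 30, 40) are made-up stand-ins for the random data, and the frame names `left` and `right` are only for illustration.

```python
import pandas as pd

# Per-x coordinates of d1 (the y dimension is left out of this sketch).
left = pd.DataFrame({
    "c1": [0, 0, 0, 2, 3],
    "c2": [0, 0, 1, 1, 1],
    "c3": [2, 0, 0, 0, 0],
})

# d2 with made-up values instead of np.random.rand(4).
right = pd.DataFrame({
    "c1": [0, 0, 2, 3],
    "c2": [0, 1, 1, 1],
    "d2": [10.0, 20.0, 30.0, 40.0],
})

# Left join on the shared coordinates: every row of d1 picks up the d2
# value whose (c1, c2) pair matches, so the duplicated (0, 0) pair in d1
# receives the same d2 value twice.
joined = left.merge(right, on=["c1", "c2"], how="left")
print(joined)
```

The goal is to get this lookup by `(c1, c2)` along `x` without leaving xarray.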
Answered by
dcherian
Jun 18, 2025
Is this right? It's not broadcasted out like your expected_result though.

import xarray.indexes
d1i = d1.set_xindex(("c1", "c2"), xr.indexes.PandasMultiIndex)
d2i = d2.set_xindex(("c1", "c2"), xr.indexes.PandasMultiIndex)
xr.merge([
    d1i,
    # Automatic alignment does not work because of the duplicated (c1, c2) = (0, 0) in d1,
    # so we reindex manually.
    d2i.reindex_like(d1i),
]).drop_indexes(("x", "c1", "c2"))
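As a quick check of the approach above, the following sketch reruns it on small deterministic data. The array values and the name `joined` are assumptions chosen for illustration, and it assumes an xarray version that provides `set_xindex` and `drop_indexes`.

```python
import numpy as np
import xarray as xr
import xarray.indexes

# Same coordinate layout as in the question, with made-up data values.
d1 = xr.Dataset(
    {"d1": (("x", "y"), np.arange(10.0).reshape(5, 2))},
    coords={
        "c1": ("x", [0, 0, 0, 2, 3]),
        "c2": ("x", [0, 0, 1, 1, 1]),
        "c3": ("x", [2, 0, 0, 0, 0]),
    },
)
d2 = xr.Dataset(
    {"d2": ("x", [10.0, 20.0, 30.0, 40.0])},
    coords={"c1": ("x", [0, 0, 2, 3]), "c2": ("x", [0, 1, 1, 1])},
)

# Build a (c1, c2) multi-index along x on both datasets.
d1i = d1.set_xindex(("c1", "c2"), xr.indexes.PandasMultiIndex)
d2i = d2.set_xindex(("c1", "c2"), xr.indexes.PandasMultiIndex)

# d2's index is unique, so it can be reindexed onto d1's index even though
# d1 contains the (0, 0) pair twice.
joined = xr.merge([d1i, d2i.reindex_like(d1i)]).drop_indexes(("x", "c1", "c2"))

print(joined["d2"].values)
```

With these inputs, `d2` stays one-dimensional along `x`, and both rows of `d1` with `(c1, c2) = (0, 0)` receive the same `d2` value.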
Answer selected by
JulienBrn