attributes that are lists of strings with a single member don't survive a round-trip #4798
Comments
There is an explicit condition for the size to be xarray/xarray/backends/netCDF4_.py Lines 274 to 275 in f52a95c
However, even when setting the condition to |
Actually, this problem is specific to the netcdf4 backend. Using the h5netcdf engine, the following runs without problems:
import xarray as xr
ds = xr.Dataset(attrs={"foo": ["bar"]})
ds.to_netcdf("ds.nc", engine="h5netcdf")
rd = xr.open_dataset("ds.nc", engine="h5netcdf")
assert ds.attrs["foo"] == rd.attrs["foo"]
As soon as I involve netcdf4 in reading or writing, this fails. I don't know enough about netcdf's on-disk format to really debug what is going on. The |
@mikapfl That's by design in netCDF4:
if len(result) == 1:
    return result[0]
else:
    return result |
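To make the consequence of that rule concrete, here is a minimal pure-Python sketch. It does not call netCDF4 at all: `unwrap_single` only imitates the snippet quoted above, and `as_string_list` is a hypothetical helper (not part of xarray or netCDF4) showing one way a caller could normalise the attribute after reading.

```python
def unwrap_single(values):
    """Imitate the quoted rule: a one-element list comes back as its element."""
    if len(values) == 1:
        return values[0]
    return values


def as_string_list(attr):
    """Hypothetical helper: always return the attribute as a list of strings."""
    if isinstance(attr, str):
        return [attr]
    return list(attr)


written = ["bar"]
read_back = unwrap_single(written)  # stands in for what the reader hands back

assert read_back == "bar"                     # the list wrapper is gone
assert as_string_list(read_back) == written   # normalising restores it
assert as_string_list(unwrap_single(["a", "b"])) == ["a", "b"]
```

Normalising on read only works if the caller already knows the attribute is meant to be a list; the file itself no longer distinguishes the two cases under this rule.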
@kmuehlbauer thanks for finding that out! Now I'm wondering whether that should be documented or whether there is some other solution; certainly, for me it was surprising, but maybe my expectations were just wrong. |
@mikapfl This was added with Unidata/netcdf4-python#597. But I can't say anything profound about why the single element is extracted from the list. |
Could add a note to the docstring... |
h5netcdf aligned with netCDF4 in h5netcdf/h5netcdf#116. |
As this recently surfaced in #9699, it would be great to at least mention this in the docs. Those are the locations where
Should we add this to the data structures or the faq section? We could then link to there from the three API docstrings. |
What happened:
If I create a Dataset with an attr that is a list of strings with only a single member, write it to netcdf and open it again, the attr is a single string instead of a list of strings.
What you expected to happen:
Writing the Dataset to netcdf and opening it should not change the attrs.
Minimal Complete Verifiable Example:
Anything else we need to know?:
This was implemented in PR2045, including the test test_setncattr_string. However, that test uses assert_array_equal to compare the attrs after a roundtrip (line 1454), which unfortunately compares element-wise when one of the arguments is a scalar, so a single string compares equal to a one-element list containing it. Changing the test like this:
corrects it (the test then fails currently).
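For illustration, the comparison pitfall can be reproduced with NumPy alone. This is only a sketch, not the actual change to the xarray test: `assert_attr_identical` is a hypothetical stricter check, and the two values stand in for the attribute before and after a netCDF4 round-trip.

```python
import numpy as np

original = ["bar"]    # attribute as written
roundtripped = "bar"  # attribute as read back via netCDF4

# Passes: the scalar is compared against every element of the list,
# so the lost list wrapper goes unnoticed.
np.testing.assert_array_equal(original, roundtripped)


def assert_attr_identical(expected, actual):
    """Hypothetical stricter check: same container type and same value."""
    assert type(expected) is type(actual), (type(expected), type(actual))
    assert expected == actual


try:
    assert_attr_identical(original, roundtripped)
except AssertionError:
    strict_check_failed = True   # list vs. str is detected here
else:
    strict_check_failed = False
```

A check along these lines distinguishes `["bar"]` from `"bar"`, which is why a stricter test fails against the current behaviour.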
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: f52a95c
python: 3.8.6 (default, Sep 25 2020, 09:36:53)
[GCC 10.2.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.12.0
libnetcdf: 4.7.4
xarray: 0.16.3.dev75+gf52a95cb
pandas: 1.2.0
numpy: 1.19.5
scipy: None
netCDF4: 1.5.5.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.3.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 51.1.2
pip: 20.3.3
conda: None
pytest: 6.2.1
IPython: None
sphinx: None