This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

time operations where time_bounds span multiple averaging periods #55

Open
matt-long opened this issue Feb 9, 2019 · 13 comments
Labels
help wanted Extra attention is needed

Comments

@matt-long
Contributor

The functions in climatology.py assume that each sample's time_bound fits entirely within the averaging period being applied; this assumption is violated when computing, say, monthly averages on 5-day data. A more appropriate approach would be to compute averaging weights based on the portion of each time_bound interval that falls within the target averaging period.
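
A minimal sketch of the overlap-weighting idea, not esmlab's implementation. It assumes time is expressed in float days and that each sample carries a (lower, upper) bound pair; the names `overlap_weights` and `weighted_period_mean` and all data values are hypothetical.

```python
def overlap_weights(time_bounds, period_start, period_end):
    """For each (lower, upper) bound pair, return the length of its overlap
    with the target period [period_start, period_end)."""
    weights = []
    for lower, upper in time_bounds:
        overlap = min(upper, period_end) - max(lower, period_start)
        weights.append(max(overlap, 0.0))  # no overlap -> zero weight
    return weights


def weighted_period_mean(values, time_bounds, period_start, period_end):
    """Average `values` over the period, weighting by overlap length."""
    weights = overlap_weights(time_bounds, period_start, period_end)
    total = sum(weights)
    if total == 0:
        raise ValueError("no data overlaps the target period")
    return sum(w * v for w, v in zip(weights, values)) / total


# Eight 5-day means covering days 0-40 (made-up values 0.0 .. 7.0).
time_bounds = [(5.0 * i, 5.0 * (i + 1)) for i in range(8)]
values = [float(i) for i in range(8)]

# A 31-day "month" spanning days 0-31: the interval (30, 35) straddles the
# month boundary and contributes only one of its five days.
jan_weights = overlap_weights(time_bounds, 0.0, 31.0)
jan_mean = weighted_period_mean(values, time_bounds, 0.0, 31.0)
```

In this toy case the straddling interval gets weight 1 instead of 5, so it no longer counts fully toward a month it mostly lies outside of.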

@matt-long added the bug label Feb 9, 2019
@andersy005 added this to the sprint-feb18-mar03 milestone Feb 12, 2019
@alperaltuntas
Member

One solution for this issue is to interpolate from, say, 5-day data to 1-day data (using 'zero', i.e., piecewise-constant interpolation), and then compute monthly averages on the daily data. This would be less efficient than an approach based on computing weights, but more general and easier to implement. The problem, however, is that xarray does not support interpolation over a chunked dimension.

When I try to interpolate a dataset that's read in using open_mfdataset, I get the following:

>>> da.interp(time=new_time_dim).compute()
...
NotImplementedError: Chunking along the dimension to be interpolated (2) is not yet supported.

Eliminating the chunking over the time dimension avoids this error, but that would not be a feasible option for practical use.
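
For illustration, the interpolate-then-average idea can be sketched in plain Python, without xarray. This assumes the 5-day intervals start and end on day boundaries, so that expanding each value to five identical daily values is equivalent to a piecewise-constant interpolation; the variable names and data are hypothetical.

```python
# Eight 5-day means covering days 0-40 (made-up values 0.0 .. 7.0).
values_5day = [float(i) for i in range(8)]

# Piecewise-constant expansion: repeat each 5-day mean for each of its 5 days.
daily = [v for v in values_5day for _ in range(5)]

# Average over a 31-day "month" spanning days 0-31.
jan_mean_daily = sum(daily[:31]) / 31
```

When the bounds align with day boundaries, as here, this gives the same answer as weighting each 5-day value by its overlap with the month, at the cost of a fivefold larger intermediate array.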

@matt-long
Contributor Author

@alperaltuntas, what if we "unfix time" and use the resample on the float time-axis, then "refix time" to compute the monthly climatology?

@alperaltuntas
Member

> @alperaltuntas, what if we "unfix time" and use the resample on the float time-axis, then "refix time" to compute the monthly climatology?

I'll try this.

@matt-long
Contributor Author

On second thought, I think resample only works on datetime-like time axes, not on a float axis.

@alperaltuntas
Member

alperaltuntas commented Feb 15, 2019

Can't we convert the time axis from cftime to a datetime type that pandas accepts, instead of unfixing the time?

@matt-long
Contributor Author

I think pandas is too restrictive for our data:

> When decoding/encoding datetimes for non-standard calendars or for dates before year 1678 or after year 2262, xarray uses the cftime library. It was previously packaged with the netcdf4-python package under the name netcdftime but is now distributed separately. cftime is an optional dependency of xarray.
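
The 1678 and 2262 limits quoted above come from storing timestamps as signed 64-bit counts of nanoseconds since 1970-01-01. A small standard-library sketch of that arithmetic (the variable names are illustrative; Python's datetime only resolves microseconds, so the nanosecond count is truncated):

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Largest and smallest values of a signed 64-bit integer, read as nanoseconds.
max_ns = 2**63 - 1
min_ns = -(2**63)

# Convert to whole microseconds, the finest unit timedelta supports.
latest = EPOCH + timedelta(microseconds=max_ns // 1000)
earliest = EPOCH - timedelta(microseconds=(-min_ns) // 1000)

print(earliest.year, latest.year)
```

The representable range ends partway through 1677 and 2262, which is why long or paleo-style model runs, as well as non-standard calendars, fall outside what the pandas datetime type can hold.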

@kmpaul

kmpaul commented Feb 15, 2019 via email

@matt-long
Contributor Author

Yes. The issue is that cftime doesn’t work with resample and pandas time is too restrictive.

@kmpaul

kmpaul commented Feb 15, 2019 via email

@alperaltuntas
Member

Actually, this issue still applies to compute_mon_climatology. I'm not sure whether it applies to other functions in climatology.py. I am planning to update compute_mon_climatology based on the new function that computes means (compute_mon_mean).

Relatedly, I am wondering what the best way is to distinguish functions that compute a climatology from functions that compute means. I added compute_mon_mean (which computes monthly means, not a climatology) to the climatology module, but I'm not sure whether placing it there will cause confusion. Another potential source of confusion is that the function that computes the annual climatology is named compute_ann_mean. Should it be named compute_ann_climatology?

@matt-long ?
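
For readers following the naming question, the distinction can be illustrated with a toy example. This is a sketch with made-up data and hypothetical variable names, not esmlab code: monthly means keep one value per (year, month), while a monthly climatology averages each calendar month across years.

```python
from collections import defaultdict

# Hypothetical monthly means keyed by (year, month); values are made up.
monthly_means = {
    (2000, 1): 1.0, (2000, 2): 2.0,
    (2001, 1): 3.0, (2001, 2): 4.0,
}

# Group the monthly means by calendar month, discarding the year.
by_month = defaultdict(list)
for (year, month), value in monthly_means.items():
    by_month[month].append(value)

# Monthly climatology: one value per calendar month, averaged over all years.
monthly_climatology = {m: sum(v) / len(v) for m, v in sorted(by_month.items())}
```

Here the four monthly means collapse to two climatological values, one for January and one for February.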

@andersy005
Contributor

> Relatedly, I am wondering what's the best way of distinguishing functions that compute climatology vs functions that compute means.

@matt-long, I presume @alperaltuntas's concern would be solved by the nomenclature suggestion you made in our conversation today.

@andersy005
Contributor

Should we imitate NCL's nomenclature to some extent? See https://www.ncl.ucar.edu/Document/Functions/climo.shtml

@andersy005
Contributor

andersy005 commented Apr 4, 2019

@alperaltuntas,

> I added compute_mon_mean (which computes monthly means, not climatology) to climatology module, but not sure if placing it to climatology module will cause confusion.

In #109, I am removing the climatology.py module, and most of the utility functions in utils will be moved to an EsmlabAccessor class in a new module, core.py.

I'm not sure that this completely resolves the confusion, but I've also moved most functions to the top level of esmlab; e.g., you can now call esmlab.compute_ann_mean() instead of esmlab.climatology.compute_ann_mean().

@andersy005 added the help wanted label and removed the bug label Jul 31, 2019