Add heuristic for ERA5 download chunk sizes #252

Open · 1 task

euronion opened this issue Sep 6, 2022 · 3 comments

@euronion
Collaborator

euronion commented Sep 6, 2022

  • Think about a heuristic to download in smaller or larger chunks depending on the geographical scope of the data to be downloaded

ERA5 cutouts are currently downloaded as time=yearly slices (time=monthly slices after #236) to avoid requesting overly large pieces of data from the ERA5 backend. Monthly retrieval could, in theory, slow down cutout preparation. We could employ a heuristic that checks the request size and then decides, based on that size, whether to use monthly or yearly retrieval.
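
For illustration, such a size-based switch might look roughly like this (a sketch only; `MAX_FIELDS` and the function name are placeholders, not existing atlite code):

```python
# Sketch of a size-based switch between monthly and yearly retrieval;
# MAX_FIELDS is an assumed backend limit (see the comments below) and
# neither name exists in atlite.

MAX_FIELDS = 120_000

def choose_time_chunking(n_timesteps_per_year, n_variables):
    """Use yearly slices when a full year fits into one request."""
    if n_timesteps_per_year * n_variables <= MAX_FIELDS:
        return "yearly"
    return "monthly"

# e.g. one hourly year (8760 steps) with 15 variables gives 131,400
# fields, which would exceed the assumed limit and select "monthly".
```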

See the discussion here: #236 (comment)

@johhartm

@euronion I stumbled upon the same issue and adapted the timeframe to optimise for my use case (a small cutout but a long timeframe). I added a heuristic to make the requests as large as possible while staying within the 120,000-fields limit. However, I don't know how to account for the size limit with cutouts covering large areas. If someone could help me with this information, I might be able to implement this feature.

@euronion
Collaborator Author

Hi @johhartm,
Thanks for taking the initiative. I would assume that estimating the number of fields via

resolution * latitude range * longitude range * number of time steps * number of variables within the request

should be good for a heuristic.
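
For illustration, that estimate translates into something like the following (a sketch; the function name and signature are made up here, and 0.25° is ERA5's native grid spacing):

```python
# Direct translation of the estimate above; 0.25 degrees is ERA5's
# native grid spacing, everything else (names, signature) is made up.

def estimate_fields(lat_range, lon_range, n_timesteps, n_variables,
                    resolution=0.25):
    """Grid points x time steps x variables for one request."""
    n_lat = int(lat_range / resolution) + 1
    n_lon = int(lon_range / resolution) + 1
    return n_lat * n_lon * n_timesteps * n_variables
```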

Where did you get the 120,000-fields number from? This is the first time I have heard a concrete number, and it seems a bit small, but that might depend on how a "field" is defined.

@johhartm

I got this number from experimenting with larger requests and having them fail with an error message stating that the request was too large and that the maximum request size is 120,000 fields. For me, the heuristic of number of time steps * variables within the request worked, but it only downloaded data for a fairly limited spatial frame. However, I am starting to think that the spatial extent does not affect the "field" count, although it might still have to be taken into account to keep the file size per request from getting too large.
I will test this hypothesis with some larger cutouts and get back when I have results.
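
Under that hypothesis, the safe request length follows directly (a sketch; the helper name and the interpretation of the limit are assumptions, not a documented CDS interface):

```python
# Splitting a long timeframe under the hypothesis stated above, i.e.
# fields = time steps x variables; the limit and names are assumptions.

def max_timesteps_per_request(n_variables, max_fields=120_000):
    """Longest time slice (in steps) that stays within the field limit."""
    return max_fields // n_variables

# e.g. 10 hourly variables -> 12,000 hours, roughly 16 months per request
```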
