Pre-processing of ICESat-2 ATL11 data to cloud-optimized format #3

Open
1 of 3 tasks
weiji14 opened this issue Nov 4, 2022 · 0 comments
Labels
enhancement New feature or request

weiji14 commented Nov 4, 2022

To enable fast reads of ice surface elevation time-series data for analytical and/or visualization purposes!

Current state

ICESat-2 ATL11 data is stored in HDF5 (see https://nsidc.org/data/atl11/versions/5), in a cumbersome nested hierarchy (one 'dataset' per laser pair track). The ICESat-2 HDF5 files are now on AWS S3 object storage (https://nsidc.org/data/user-resources/data-announcements/data-set-updates-new-earthdata-cloud-access-option-icesat-2-and-icesat-data-sets) as of 29 Sep 2022, which speeds up data reads (as long as compute sits next to the data in AWS us-west-2), but the nested HDF5 structure of ATL11 can still be a pain to handle.

Desired state

Remove the nested pair-track data structure (pt1, pt2, pt3), i.e. flatten it into a single non-nested structure. There has been some work on this before to convert the ICESat-2 HDF5 format to:

There's a good blog post about cloud-native vector formats at https://cholmes.medium.com/an-overview-of-cloud-native-vector-c223845638e0. It is not point-cloud specific, but it discusses the analytics/visualization trade-offs. Also check out new developments on cloud-optimized ICESat-2 data, for example what @tsnow03 is doing at https://github.com/CryoInTheCloud/IS2CloudOptimizedData
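To make the flattening idea under "Desired state" concrete, here is a rough sketch (not a tested implementation). It uses plain nested dicts as a stand-in for the HDF5 groups, so it runs without any file access. The variable names inside each pair-track group (`ref_pt`, `cycle_number`, `latitude`, `longitude`, `h_corr`) and the toy values are assumptions for illustration. With a real granule the groups would be read with something like h5py, and the flat rows written to whatever cloud-optimized format gets chosen.

```python
from typing import Any, Dict, List

PAIR_TRACKS = ("pt1", "pt2", "pt3")


def flatten_pair_tracks(granule: Dict[str, Dict[str, list]]) -> List[Dict[str, Any]]:
    """Flatten nested pair-track groups into one list of row records.

    Each row is one (reference point, cycle) elevation sample, tagged with
    the pair track it came from so no information is lost by un-nesting.
    """
    rows: List[Dict[str, Any]] = []
    for pair_track in PAIR_TRACKS:
        group = granule.get(pair_track)
        if group is None:  # a granule may lack a pair track; skip it
            continue
        for i, ref_pt in enumerate(group["ref_pt"]):
            for j, cycle in enumerate(group["cycle_number"]):
                rows.append(
                    {
                        "pair_track": pair_track,
                        "ref_pt": ref_pt,
                        "latitude": group["latitude"][i],
                        "longitude": group["longitude"][i],
                        "cycle_number": cycle,
                        # h_corr is assumed to be 2D: [reference point][cycle]
                        "h_corr": group["h_corr"][i][j],
                    }
                )
    return rows


# Toy stand-in for one granule (made-up values, pt3 deliberately missing)
toy_granule = {
    "pt1": {
        "ref_pt": [100, 101],
        "cycle_number": [3, 4],
        "latitude": [-83.0, -83.1],
        "longitude": [-150.0, -150.2],
        "h_corr": [[90.5, 90.7], [91.0, 91.4]],
    },
    "pt2": {
        "ref_pt": [200],
        "cycle_number": [3, 4],
        "latitude": [-83.5],
        "longitude": [-151.0],
        "h_corr": [[120.0, 119.8]],
    },
}

rows = flatten_pair_tracks(toy_granule)
print(len(rows))  # 6 rows: 2 points x 2 cycles from pt1, 1 point x 2 cycles from pt2
```

Keeping `pair_track` as an ordinary column means the flat table can still be grouped back into the original tracks, while a single table is much easier to hand to columnar or tiled formats.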

Action points

  • Chat with people from NSIDC/NASA who are currently thinking about cloud-optimized formats, and see if anyone is keen on getting a pangeo-forge recipe set up (and then use it!)
  • Focus on a small geographic region of Antarctica as a test case - e.g. Siple Coast
  • Get icepyx to read ATL11 properly, for both local and cloud-hosted files (see icesat2py/icepyx#398, "update read-in module for ATL11"; and for cloud?)

References
