Combining data from GEFS (FV3) tiled restart files with 2d-coords #276

Open
StevePny opened this issue Feb 10, 2024 · 2 comments

StevePny commented Feb 10, 2024

Hi, I'm trying to apply mppnccombine to produce a combined file from the 6-tiled restart data from GEFS, which I assume is using the GFDL FMS:
https://noaa-gefs-pds.s3.amazonaws.com/index.html#gefs.20240208/00/atmos/init/p01/

[EDIT: @spencerkclark mentioned here: https://github.com/pydata/xarray/discussions/8730 that mppnccombine is the wrong tool for this purpose. Which FRE-NCtools tool should be used instead: fregrid? combine_restarts?]

[Screenshot: Screen Shot 2024-02-09 at 6 58 45 PM]

I assume mppnccombine is the appropriate tool to use here. I've tried building FRE-NCtools in a Docker image both on my own laptop (an M1 Mac) and on an AWS EC2 instance running Ubuntu.

In both cases, I run something like:

docker run -v /home/ubuntu/GEFS/c00:/rundir -it sofarocean/fretools:0.0.2 mppnccombine -v -M gfs_data.nc gfs_data.tile?.nc

And yet the resulting gfs_data.nc file looks like:
[Screenshot: Screen Shot 2024-02-09 at 6 51 22 PM]

which suggests to me that the final file contains only the first tile and that the combine operation has failed.

Am I using the tool incorrectly, or is there another tool that is more appropriate for this dataset?

@StevePny StevePny changed the title Loading data from GEFS (FV3) tiled restart files with 2d-coords Combining data from GEFS (FV3) tiled restart files with 2d-coords Feb 10, 2024
spencerkclark (Member) commented Feb 10, 2024

fregrid may be what you are looking for. Raw diagnostics and restart files from FV3-based models are output on the native cubed-sphere grid, which is logically rectangular on each tile but not globally. fregrid can regrid fields[1][2] in those files to a regular latitude-longitude grid, which is globally rectangular, so a global horizontal field can be stored in a simple 2D array. This can be viewed as "combining" data from the tiles, though keep in mind that it is also a grid transformation (perfectly valid and frequently used, of course!).

mppnccombine is a lower-level tool that can be useful for preprocessing files before passing them to a tool like fregrid. It is only needed if the data were produced with fv_core_nml.io_layout set to something other than 1, 1.

For example, if an I/O layout of 2, 2 were used, you would have a set of 24 files per restart category, where each tile was broken up into four subdomains. For tile one of gfs_data you would start from:

gfs_data.tile1.nc.0000
gfs_data.tile1.nc.0001
gfs_data.tile1.nc.0002
gfs_data.tile1.nc.0003

and then use mppnccombine to produce the "combined" gfs_data.tile1.nc file:

$ mppnccombine gfs_data.tile1.nc gfs_data.tile1.nc.*

You would do the same for tiles two through six. Your data is already combined in this sense, so the mppnccombine step is not needed.
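
As a rough illustration of the bookkeeping described above, the sketch below counts the files implied by an I/O layout and groups distributed files by tile into one mppnccombine invocation each. The helper names (expected_file_count, build_combine_commands) are hypothetical, and the `.NNNN` four-digit suffix is assumed from the file names listed above. It only builds argument lists and does not run anything.

```python
import re
from collections import defaultdict


def expected_file_count(io_layout, n_tiles=6):
    """Files per restart category: one per I/O subdomain per tile."""
    nx, ny = io_layout
    return n_tiles * nx * ny


def build_combine_commands(filenames):
    """Group distributed files (name.nc.NNNN) by their combined target
    and return one mppnccombine argument list per target."""
    # Assumption: distributed pieces end in a four-digit suffix.
    pattern = re.compile(r"^(?P<target>.+\.nc)\.(?P<rank>\d{4})$")
    groups = defaultdict(list)
    for name in filenames:
        match = pattern.match(name)
        if match:  # names without the suffix are already combined
            groups[match.group("target")].append(name)
    return [
        ["mppnccombine", target, *sorted(parts)]
        for target, parts in sorted(groups.items())
    ]


# Hypothetical listing for an I/O layout of 2, 2.
files = [f"gfs_data.tile{t}.nc.{r:04d}" for t in range(1, 7) for r in range(4)]
commands = build_combine_commands(files)

print(expected_file_count((2, 2)))
print(" ".join(commands[0]))
```

With an I/O layout of 1, 1 the file names carry no numeric suffix, so the helper returns an empty list, which matches the point that no mppnccombine step is needed in that case.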

Footnotes

  1. You may run into trouble with this approach for the horizontal winds in restart files, however: they are on a staggered grid, which I do not believe fregrid supports. That may not be a concern if your interest is in other fields.

  2. Some fields in the surface restart files are categorical, e.g. land surface type, and so may not be amenable to regridding, which involves averaging or interpolation. Again, this may not be a concern depending on the variables of interest.

StevePny (Author) commented

Hi @spencerkclark, thanks for your suggestions. And also thank you for the clarification on mppnccombine, which I have used before with MOM6 and incorrectly assumed it could be used similarly for this objective with FV3 as well.

We are currently using fregrid with FV3-SHiELD to interpolate the output to a regular global grid for our own runs. This works OK for viewing output globally and calculating basic statistics.

Unfortunately this does not work for us when using the GEFS archive data. Below I'm using the UFS/GFS mosaic and remap_weights files provided at:
https://ftp.emc.ncep.noaa.gov/static_files/public/UFS/GFS/fix/fix_fv3/C384

When I attempt to run fregrid, it fails while looking for a variable named lon, which it cannot find:

docker run -v /Users/spenny/Data/GEFS/p01:/rundir -it gfdl/fretools:latest fregrid --input_mosaic C384_mosaic.nc --nlon 1440 --nlat 720 --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file gfs_data --output_file gfs_data.nc --input_dir . --scalar_field t --interp_method conserve_order1 --remap_file remap_weights_C384_1deg.nc
****fregrid: first order conservative scheme will be used for regridding.
Error from pe 0: mpp_io(mpp_get_varid): error in get field_id of variable lon from file ./gfs_data.tile1.nc: NetCDF: Variable not found
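
For readability, the same invocation can be assembled from a mapping of options. This is only a sketch: the build_fregrid_args helper is hypothetical, and every flag and value is copied from the command above rather than taken from the fregrid documentation.

```python
import shlex


def build_fregrid_args(options):
    """Turn an ordered mapping of option name -> value into an argv list."""
    args = ["fregrid"]
    for name, value in options.items():
        args.extend([f"--{name}", str(value)])
    return args


# Options exactly as used in the command above.
options = {
    "input_mosaic": "C384_mosaic.nc",
    "nlon": 1440,
    "nlat": 720,
    "lonBegin": 0,
    "lonEnd": 360,
    "latBegin": -90,
    "latEnd": 90,
    "input_file": "gfs_data",
    "output_file": "gfs_data.nc",
    "input_dir": ".",
    "scalar_field": "t",
    "interp_method": "conserve_order1",
    "remap_file": "remap_weights_C384_1deg.nc",
}

args = build_fregrid_args(options)
# Print a shell-safe version of the command for inspection.
print(shlex.join(args))
```

Keeping the options in one place makes it easier to vary a single setting, such as the mosaic or remap file, while testing.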

I'm guessing that either a different set of mosaic files is needed specifically for GEFS (though I'm not sure where to find these), or fregrid needs a setting that tells it where to find the lat/lon information.
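
One way to narrow this down is to compare the variable names in a tile file against the names the tool asks for. The sketch below is a minimal check under stated assumptions: missing_variables and list_variables are hypothetical helpers, the only name known to be required is lon (from the error message above), and the example variable set is invented for illustration rather than read from the GEFS file.

```python
def missing_variables(available, required=("lon",)):
    """Return the required variable names that are not in the file."""
    return [name for name in required if name not in available]


def list_variables(path):
    """Read variable names from a NetCDF file.

    Needs the third-party netCDF4 package, imported lazily so the
    pure check above can be used without it.
    """
    from netCDF4 import Dataset

    with Dataset(path) as ds:
        return set(ds.variables)


# Hypothetical variable names, NOT taken from the actual GEFS header.
example_vars = {"geolon", "geolat", "t"}
print(missing_variables(example_vars))
```

In practice the set of names would come from list_variables("gfs_data.tile1.nc") or from the output of ncdump -h.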

Here is the GEFS file header:
[Screenshot: Screen Shot 2024-02-12 at 3 00 03 PM]

Here is the C384 mosaic grid tile 1 header:
[Screenshot: Screen Shot 2024-02-12 at 3 00 48 PM]
