fregrid error: FATAL Error: nxgrid is greater than MAXXGRID/nthreads, increase MAXXGRID, decrease nthreads, or increase number of MPI ranks #208

Open
StevePny opened this issue Mar 3, 2023 · 3 comments

StevePny commented Mar 3, 2023

Hi, I'm getting this error when trying to combine the diagnostic output files from fv3-shield run at C768 resolution (this works for us at the other resolutions):

FATAL Error: nxgrid is greater than MAXXGRID/nthreads, increase MAXXGRID, decrease nthreads, or increase number of MPI ranks

We've tried moving to a bigger memory machine (an aws ec2 instance with ~1000GB of memory), we've tried running with the max number of processors (64), and we've also tried rebuilding the package with MAXXGRID set to 1e10 in create_xgrid.h. We've also tried using the remap files generated by UFS_UTILS. We're only able to combine the files to a 1-degree resolution. We'd like to be able to combine at the original or at least at 1/4-degree resolution.
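For a sense of scale, the cell counts on each side can be worked out directly. This is only a rough sketch: the function names are made up for illustration, and the actual nxgrid value that fregrid compares against MAXXGRID/nthreads is not computed here. The counts only show how much larger the 1/4-degree target is than the 1-degree one.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of FRE-NCtools).

# Cells in a cubed-sphere grid CN: 6 tiles of N x N cells each.
cubed_sphere_cells() {
    echo $(( 6 * $1 * $1 ))
}

# Cells in a regular lat-lon target grid of nlon x nlat.
latlon_cells() {
    echo $(( $1 * $2 ))
}

echo "C768 source cells:       $(cubed_sphere_cells 768)"
echo "1 deg target cells:      $(latlon_cells 360 180)"
echo "0.5 deg target cells:    $(latlon_cells 720 360)"
echo "0.25 deg target cells:   $(latlon_cells 1440 720)"
```

The 1/4-degree target has 16 times as many cells as the 1-degree target, which is consistent with the 1-degree case fitting within the limit while the finer one does not.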

Here are some examples of our run commands:

export res=768
export outname=C${res}
export expdir=`pwd`
export outputdir=OUTPUT_${outname}
export root=~
export mosaicdir=${root}/UFS_setup/out
ln -f ${mosaicdir}/C${res}/C${res}_mosaic.nc ${expdir}/${outputdir}
ln -f ${mosaicdir}/C${res}/C${res}_grid.tile?.nc ${expdir}/${outputdir}

# 1 deg
export nlon=360
export nlat=180
docker run --cap-add=SYS_PTRACE -v ${expdir}/${outputdir}:/rundir -it sofarocean/fretools:0.0.2 mpirun --allow-run-as-root -np 64 fregrid --input_mosaic  C${res}_mosaic.nc --nlon ${nlon} --nlat ${nlat} --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file atmos_sos --output_file atmos_sos_${outname}.nc --input_dir ./ --scalar_field  UGRD10m,VGRD10m --interp_method conserve_order1 --remap_file remap_weights_C768_1deg.nc

# 0p5 deg
export nlon=720
export nlat=360
docker run --cap-add=SYS_PTRACE -v ${expdir}/${outputdir}:/rundir -it sofarocean/fretools:0.0.2 mpirun --allow-run-as-root -np 64 fregrid --input_mosaic  C${res}_mosaic.nc --nlon ${nlon} --nlat ${nlat} --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file atmos_sos --output_file atmos_sos_${outname}.nc --input_dir ./ --scalar_field  UGRD10m,VGRD10m --interp_method conserve_order1 --remap_file remap_weights_C768_0p5deg.nc

# 0p25 deg
export nlon=1440 #720 #360
export nlat=720  #360 #180
#docker run --cap-add=SYS_PTRACE -v ${expdir}/${outputdir}:/rundir -it sofarocean/fretools:0.0.2 mpirun --allow-run-as-root -np 64 fregrid --input_mosaic  C${res}_mosaic.nc --nlon ${nlon} --nlat ${nlat} --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file atmos_3d_8xdaily --output_file atmos_3d_8xdaily_${outname}.nc --input_dir ./ --scalar_field  TMP,HGT,Q,UCOMP,VCOMP --interp_method conserve_order1 --remap_file remap_weights_C768_0p25deg.nc

We'd like a workable solution for combining these files at the C768 resolution into a single global netcdf file.

Contributor

ngs333 commented Mar 3, 2023

@StevePny
Right now, regridding of high-resolution grids is done with fregrid_parallel instead of fregrid. There is an example runscript in the directory FRE-NCtools/docs, in the file extreme_fregrid_sample_runscript.txt. fregrid_parallel is built by compiling FRE-NCtools with the option --with-mpi. Let us know if you are able to use fregrid_parallel as in the example.
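Since fregrid_parallel is only built when FRE-NCtools is compiled with --with-mpi, a quick check that it is actually on the PATH can save a failed run. A minimal sketch, assuming it is run in the same environment (for example, inside the container) where the tools are installed; the function name pick_regridder is hypothetical:

```shell
#!/bin/sh
# Print the regridder to use: fregrid_parallel when it is found on PATH,
# otherwise fall back to the serial fregrid.
pick_regridder() {
    if command -v fregrid_parallel >/dev/null 2>&1; then
        echo fregrid_parallel
    else
        echo fregrid
    fi
}

echo "Using: $(pick_regridder)"
```

If this prints fregrid, the installation was presumably built without MPI support and would need to be rebuilt with --with-mpi.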

Author

StevePny commented Mar 4, 2023

Thanks, with that simple change (fregrid to fregrid_parallel, see below) this works for us. Is it safe to make this our default choice for all resolutions?

docker run --cap-add=SYS_PTRACE -v ${expdir}/${outputdir}:/rundir -it sofarocean/fretools:0.0.2 mpirun --allow-run-as-root -np 64 fregrid_parallel --input_mosaic  C${res}_mosaic.nc --nlon ${nlon} --nlat ${nlat} --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file atmos_sos --output_file atmos_sos_${outname}.nc --input_dir ./ --scalar_field  UGRD10m,VGRD10m --interp_method conserve_order1 --remap_file remap_weights_C768_0p25deg.nc
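The command above can be parameterized so that each target resolution gets its own remap and output file. This is a sketch only: the function name regrid_args and the file-name label (1deg, 0p5deg, 0p25deg) are assumptions added for illustration, and the output file name differs from the one above by including that label so runs do not overwrite each other. Every flag is taken from the commands in this thread. The loop prints the commands rather than running them.

```shell
#!/bin/sh
# Build the regridder argument string for one target resolution.
# $1 = cubed-sphere resolution (e.g. 768), $2 = nlon, $3 = nlat,
# $4 = label used in file names (e.g. 0p25deg; hypothetical convention)
regrid_args() {
    res=$1; nlon=$2; nlat=$3; label=$4
    args="--input_mosaic C${res}_mosaic.nc"
    args="$args --nlon ${nlon} --nlat ${nlat}"
    args="$args --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90"
    args="$args --input_file atmos_sos"
    args="$args --output_file atmos_sos_C${res}_${label}.nc"
    args="$args --input_dir ./ --scalar_field UGRD10m,VGRD10m"
    args="$args --interp_method conserve_order1"
    args="$args --remap_file remap_weights_C${res}_${label}.nc"
    printf '%s\n' "$args"
}

# Dry run: print one command per target grid instead of executing it.
for spec in "360 180 1deg" "720 360 0p5deg" "1440 720 0p25deg"; do
    set -- $spec   # split "nlon nlat label" into $1 $2 $3
    echo "mpirun -np 64 fregrid_parallel $(regrid_args 768 "$1" "$2" "$3")"
done
```

To run inside the container, the printed command would be placed after the same docker run prefix used above.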

Contributor

ngs333 commented Dec 21, 2023

@StevePny
Steve, I have had a chance to experiment some more with this on our own hardware. I think that if it works at a given resolution, and you stick with that resolution, then it should continue to work. But if your team increases resolution at some future time, you may have to change the parameters.

Last week there was an update to master that was mostly a documentation update. You may want to clone it (or just read the docs online), as it has a couple of items related to this, mostly in the extreme fregrid PDF in the docs directory. Given the new docs, and assuming you don't have any issues with that, I am closing this question. Please let me know if you disagree. Thanks.
