GCC 14 leads to compile errors in WRF/WPS compilation #2047

Open
SettRaziel opened this issue May 12, 2024 · 5 comments

@SettRaziel

Hej there.
I maintain a project that provides WRF/WPS binaries for deployment in an Arch Linux environment.

Describe the bug
The update to GCC 14 changed the default behavior of several diagnostics.
This results in stricter error handling, since several diagnostics that previously only produced warnings now produce compile errors:

Looking at the changes in GCC 14, it seems the default for this flag has changed: GCC 14

Implicit function declarations (-Werror=implicit-function-declaration)
It is no longer possible to call a function that has not been declared. In general, the solution is to include a header file with an appropriate function prototype. Note that GCC will perform further type checks based on the function prototype, which can reveal further type errors that require additional changes.

For well-known functions declared in standard headers, GCC provides fix-it hints with the appropriate #include directives:

error: implicit declaration of function ‘strlen’ [-Wimplicit-function-declaration]
    5 |   return strlen (s);
      |          ^~~~~~
note: include ‘<string.h>’ or provide a declaration of ‘strlen’
  +++ |+#include <string.h>
    1 |

When compiling the WRF model, these new restrictions now lead to compile errors such as:

gcc  -I. -w -O3 -c   -DDM_PARALLEL -DLANDREAD_STUB=1 -DMAX_HISTORY=25 -DNMM_CORE=0   -c get_region_center.c
get_region_center.c: In function ‘get_region_center_’:
get_region_center.c:40:3: error: implicit declaration of function ‘memcpy’ [-Wimplicit-function-declaration]
   40 |   memcpy(MemoryOrder,MemoryOrderIn,strlen1);
      |   ^~~~~~
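
A minimal sketch of the kind of fix this diagnostic asks for, assuming the missing declaration is the only problem in the file: including <string.h> declares memcpy before it is used. The helper copy_memory_order and its parameters are hypothetical and only mimic the call shown in the log; this is not the actual WRF source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* declares memcpy; without it GCC 14 rejects the call */

/* Hypothetical helper modelled on the call in the log above.
 * Copies a Fortran-style (not NUL-terminated) string of length len
 * into dest and terminates it. dest must hold at least len + 1 bytes. */
void copy_memory_order(char *dest, size_t dest_size,
                       const char *src, size_t len)
{
    assert(dest != NULL && src != NULL);
    assert(len < dest_size);   /* leave room for the terminator */
    memcpy(dest, src, len);
    dest[len] = '\0';
}
```

With the header in place the compiler also checks the argument types against the real prototype, which may surface further type errors, as the GCC notes quoted above point out.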

To Reproduce
Steps to reproduce the behavior:
Testing environment is an ArchLinux VM with Linux Kernel 6.8.9.
Compiler is gcc/gfortran with version 14.1.1.
Current compile flags: Env. Variables
Additional parameter for compilation: 35 gfortran dm+sm
Precondition: Successful compilation of netcdf, netcdf-fortran, mpich, hdf5
Running my compile routines reproducibly leads to around 140 of these implicit-declaration errors.

Workaround
Adding to the compile flags: -Wimplicit-function-declaration (issue tracked in: wrf_archlinux)

Expected behavior
The code base contains no implicit function declarations, so the workaround is no longer needed. The workaround uses a flag to treat these issues as warnings instead of errors, which might lead to other side effects during compilation.

Additional context
Up to gcc/gfortran 13.2.2 these implicit declarations only led to warnings, so up to that point the compilation ran successfully.
Are there plans to refactor the code base?

@weiwangncar
Collaborator

@SettRaziel Which version of the WRF code have you tried?

@SettRaziel
Author

SettRaziel commented May 12, 2024

I recompiled 4.5.0.
I just saw this morning that 4.6 has been released, but I have not tried that one yet.
Edit: I started a 4.6 build in my testing environment. I will update this when the job is finished.
I cannot yet confirm whether 4.6.0 compiles. The new and inconsistent versioning scheme (WRF 4.6.0, noahmp 4.6, ...) breaks most of my script logic. I need more time to handle the versioning scheme.

@SettRaziel
Author

SettRaziel commented May 13, 2024

I stand partially corrected. The errors regarding -Wimplicit-function-declaration are gone with the changes provided in #1823.
Sorry about that part.
However, the compile process for 4.6 runs into other errors. In my opinion these are also related to GCC 14 changes:

Type checking on pointer types (-Werror=incompatible-pointer-types)
GCC no longer allows implicitly casting all pointer types to all other pointer types. This behavior is now restricted to the void * type and its qualified variations.

To fix compilation errors resulting from that, you can add the appropriate casts, and maybe consider using void * in more places (particularly for old programs that predate the introduction of void * into the C language).

Programs that do not carefully track pointer types are likely to contain aliasing violations, so consider building with -fno-strict-aliasing. (Whether casts are written manually or performed by GCC automatically does not make a difference in terms of strict aliasing violations.)

A frequent source of incompatible function pointer types involves callback functions that have more specific argument types (or less specific return types) than the function pointer they are assigned to. For example, old code which attempts to sort an array of strings might look like this:

#include <stddef.h>
#include <stdlib.h>
#include <string.h>

int
compare (char **a, char **b)
{
  return strcmp (*a, *b);
}

void
sort (char **array, size_t length)
{
  qsort (array, length, sizeof (*array), compare);
}
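
For comparison, a sketch of how the example above can be brought in line with the prototype qsort expects: the comparator takes const void * parameters and converts them inside the function. The names compare_strings and sort_strings are chosen here for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort expects int (*)(const void *, const void *), so the comparator
 * accepts const void * and converts to the element type internally. */
static int compare_strings(const void *a1, const void *b1)
{
    char *const *a = a1;
    char *const *b = b1;
    return strcmp(*a, *b);
}

void sort_strings(char **array, size_t length)
{
    assert(array != NULL || length == 0);
    qsort(array, length, sizeof *array, compare_strings);
}
```

The conversion from const void * to char *const * needs no cast in C, so the function pointer passed to qsort now has exactly the expected type.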

I now get around 80 errors like this:

c_code.c: In function ‘rsl_lite_pack_’:
c_code.c:487:27: error: passing argument 1 of ‘f_pack_lint_’ from incompatible pointer type [-Wincompatible-pointer-types]
  487 |             F_PACK_LINT ( buf, p+yp_curs, imemord, &js, &je, &ks, &ke, &is, &ie,
      |                           ^~~
      |                           |
      |                           char *
In file included from c_code.c:31:
rsl_lite.h:200:25: note: expected ‘long int *’ but argument is of type ‘char *’
  200 | void F_PACK_LINT (long *inbuf, long *outbuf, int* memorder, int* js, int* je, int* ks, int* ke, int* is, int* ie, int* jms, int* jme, int* kms, int* kme, int* ims, int* ime, int* curs);
      |                   ~~~~~~^~~~~

The error is an incompatible pointer type: the function expects long int * but is passed char *. My compiler language was set to German, so the messages above are translated. If you need a complete log, I can change the language, rerun the build, and add the log in English.
I will also try adding the flag temporarily to check whether this change is the only source of these errors.
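
One possible shape of a source-level fix, shown as a sketch only: spell out the pointer conversion at the call site, as the GCC notes quoted above suggest. The functions pack_lint and pack_from_bytes are hypothetical stand-ins loosely modelled on the F_PACK_LINT prototype in the log; they are not the WRF code, and changing the prototype to take void * would be an alternative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a routine declared with long * parameters. */
void pack_lint(long *inbuf, long *outbuf, int *count)
{
    assert(inbuf != NULL && outbuf != NULL && count != NULL);
    for (int i = 0; i < *count; i++)
        outbuf[i] = inbuf[i];
}

/* The caller tracks its buffers as char *, as in the log above.
 * GCC 14 no longer converts char * to long * implicitly, so the
 * conversion is written explicitly. The casts assume both pointers are
 * suitably aligned for long; they do not make a misaligned or
 * aliasing-violating access valid. */
void pack_from_bytes(char *buf, char *dest, int count)
{
    pack_lint((long *) buf, (long *) dest, &count);
}
```

A cast only silences the diagnostic. Whether the underlying memory really holds long values still has to be checked for each call.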

@weiwangncar
Collaborator

@SettRaziel Thanks for testing the latest version of the model. We will take a look at this issue very soon.

@weiwangncar
Collaborator

@islas When you get a chance, can you review this report?
