fix: use Linux's value for sysctl_nr_open_max
#32740
Open
johnrichardrinehart
wants to merge
9
commits into
systemd:main
from
johnrichardrinehart:jrinehart/match-kernel-ulimit-fs.nr_open
Commits on May 11, 2024
-
fix: use Linux's value for sysctl_nr_open_max
While Lennart moaned, a bit, in the comments about vendoring this expression and about this value being unknowable, it hasn't changed in over 10 years. Even then, the change was a refactor to define the value statically instead of via delayed execution (cf. 7f4b36f9bb930b3b2105a9a2cb0121fa7028c432 of the kernel source). So it could be argued that the value is pretty knowable, even if it isn't available through public kernel headers as it arguably ought to be.

The previous implementation of `bump_file_max_and_nr_open` isn't great: its initial value is slightly higher than the maximum the kernel allows, so the first write fails and the value is cut down to half of the initial value.

We ran into this in production while trying to increase system resource limits and found that PID 1 wasn't respecting our sysctl.nr_file values. Enabling debug logs using `systemd.log_level=debug`, we observed the following with this change:

```
May 10 02:46:45 localhost systemd[1]: systemd 251.16 running in system mode (+PAM +AUDIT -SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 >
May 10 02:46:45 localhost systemd[1]: Detected virtualization amazon.
May 10 02:46:45 localhost systemd[1]: Detected architecture x86-64.
May 10 02:46:45 localhost systemd[1]: Detected initialized system, this is not the first boot.
May 10 02:46:45 localhost systemd[1]: Kernel version 5.15.119, our baseline is 4.15
May 10 02:46:45 localhost systemd[1]: No hostname configured, using default hostname.
May 10 02:46:45 localhost systemd[1]: Hostname set to <localhost>.
May 10 02:46:45 localhost systemd[1]: Successfully added address 127.0.0.1 to loopback interface
May 10 02:46:45 localhost systemd[1]: Successfully added address ::1 to loopback interface
May 10 02:46:45 localhost systemd[1]: Successfully brought loopback interface up
May 10 02:46:45 localhost systemd[1]: Setting '/proc/sys/fs/file-max' to '9223372036854775807'
May 10 02:46:45 localhost systemd[1]: Initial value of v is 2147483584.
May 10 02:46:45 localhost systemd[1]: Setting '/proc/sys/fs/nr_open' to '2147483584'
May 10 02:46:45 localhost systemd[1]: Successfully bumped fs.nr_open to 2147483584
May 10 02:46:45 localhost systemd[1]: No credentials passed via fw_cfg.
```

Without this patch, we observed:

```
May 10 03:31:37 localhost systemd[1]: systemd 251.16 running in system mode (+PAM +AUDIT -SELINUX +APPARMOR +IMA >
May 10 03:31:37 localhost systemd[1]: Detected virtualization amazon.
May 10 03:31:37 localhost systemd[1]: Detected architecture x86-64.
May 10 03:31:37 localhost systemd[1]: Detected initialized system, this is not the first boot.
May 10 03:31:37 localhost systemd[1]: Kernel version 5.15.119, our baseline is 4.15
May 10 03:31:37 localhost systemd[1]: No hostname configured, using default hostname.
May 10 03:31:37 localhost systemd[1]: Hostname set to <localhost>.
May 10 03:31:37 localhost systemd[1]: Successfully added address 127.0.0.1 to loopback interface
May 10 03:31:37 localhost systemd[1]: Successfully added address ::1 to loopback interface
May 10 03:31:37 localhost systemd[1]: Successfully brought loopback interface up
May 10 03:31:37 localhost systemd[1]: Setting '/proc/sys/fs/file-max' to '9223372036854775807'
May 10 03:31:37 localhost systemd[1]: Setting '/proc/sys/fs/nr_open' to '2147483640'
May 10 03:31:37 localhost systemd[1]: Couldn't write fs.nr_open as 2147483640, halving it.
May 10 03:31:37 localhost systemd[1]: Setting '/proc/sys/fs/nr_open' to '1073741816'
May 10 03:31:37 localhost systemd[1]: Successfully bumped fs.nr_open to 1073741816
May 10 03:31:37 localhost systemd[1]: No credentials passed via fw_cfg.
```

We also observed, with this patch:

```
$ cat /proc/1/limits | grep files
Max open files 2147483584 2147483584 files
```

but, without this patch:

```
$ cat /proc/1/limits | grep files
Max open files 1073741816 1073741816 files
```

Below is a short program that anyone can use to determine what the value of `sysctl_nr_open_max` should be for their architecture.

```cpp
/*
sysctl_nr_open_max defines the largest value for which you can do something
like

    $ sysctl -w fs.nr_open=10000
    fs.nr_open = 10000

The value is apparently architecture-dependent and is located at
https://github.com/torvalds/linux/blob/448b3fe5a0eab5b625a7e15c67c7972169e47ff8/fs/file.c#L27-L32

This little snippet, below, yields this value for different architectures. It's
mostly an experiment with using Godbolt (C++ not C) to confirm certain Linux
results.
*/

#include <climits>
#include <cstddef>
#include <cstdio>
#include <iostream>

using std::size_t;

// cf. https://stackoverflow.com/a/66249936
extern "C" {
const char *getBuild() { // Get current architecture, detects nearly every
                         // architecture. Coded by Freak
#if defined(__x86_64__) || defined(_M_X64)
  return "x86_64";
#elif defined(i386) || defined(__i386__) || defined(__i386) || defined(_M_IX86)
  return "x86_32";
#elif defined(__ARM_ARCH_2__)
  return "ARM2";
#elif defined(__ARM_ARCH_3__) || defined(__ARM_ARCH_3M__)
  return "ARM3";
#elif defined(__ARM_ARCH_4T__) || defined(__TARGET_ARM_4T)
  return "ARM4T";
#elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__)
  return "ARM5";
#elif defined(__ARM_ARCH_6T2__)
  return "ARM6T2";
#elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || \
    defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || \
    defined(__ARM_ARCH_6ZK__)
  return "ARM6";
#elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
    defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || \
    defined(__ARM_ARCH_7S__)
  return "ARM7";
#elif defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || \
    defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__)
  return "ARM7A";
#elif defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || \
    defined(__ARM_ARCH_7S__)
  return "ARM7R";
#elif defined(__ARM_ARCH_7M__)
  return "ARM7M";
#elif defined(__ARM_ARCH_7S__)
  return "ARM7S";
#elif defined(__aarch64__) || defined(_M_ARM64)
  return "ARM64";
#elif defined(mips) || defined(__mips__) || defined(__mips)
  return "MIPS";
#elif defined(__sh__)
  return "SUPERH";
#elif defined(__powerpc) || defined(__powerpc__) || defined(__powerpc64__) || \
    defined(__POWERPC__) || defined(__ppc__) || defined(__PPC__) || \
    defined(_ARCH_PPC)
  return "POWERPC";
#elif defined(__PPC64__) || defined(__ppc64__) || defined(_ARCH_PPC64)
  return "POWERPC64";
#elif defined(__sparc__) || defined(__sparc)
  return "SPARC";
#elif defined(__m68k__)
  return "M68K";
#else
  return "UNKNOWN";
#endif
}
}

// __const_min mirrors the kernel's helper in fs/file.c. BITS_PER_LONG is
// assumed to equal __WORDSIZE, which is expected to come from the toolchain
// headers (GCC/LLVM with glibc).
#define __const_min(x, y) ((x) < (y) ? (x) : (y))
#define BITS_PER_LONG __WORDSIZE

void fn() {
  const char *system = getBuild();
  std::printf("==== Results for %s ====\n\n", system);
  std::printf("INT_MAX (%s): %x (%d)\n", system, INT_MAX, INT_MAX);

  unsigned long native_size = ~(size_t)0 / sizeof(void *);
  std::printf("~(size_t)0/sizeof(void *): %lx (%lu)\n", native_size,
              native_size);

  unsigned long min = __const_min(INT_MAX, ~(size_t)0 / sizeof(void *));
  std::printf("Minimum of those two things: %lx (%lu)\n", min, min);

  std::cout << "BITS_PER_LONG: " << BITS_PER_LONG << std::endl;

  // Same expression as the kernel: clamp to INT_MAX, then round down to a
  // multiple of BITS_PER_LONG.
  unsigned int sysctl_nr_open_max =
      __const_min(INT_MAX, ~(size_t)0 / sizeof(void *)) & -BITS_PER_LONG;
  std::printf("sysctl_nr_open_max: %u\n", sysctl_nr_open_max);
}

int main() {
  fn();
  return 0;
}
```
Commits in this pull request (abbreviated SHAs): ee5e9e1, e7ca7a0, a989acf, 9cbbfab, d35e1ed, b4ed665, 4673cac, 19e7859, 09700f2