The wget binary leaks defunct ssl_client processes #158
Comments
The registry@ services become unhealthy and get killed some time after they start. This is due to distribution/distribution-library-image#158. The healthchecks are based on wget, and wget leaks a defunct process every time the healthcheck runs. Eventually we reach the ulimit, no more forks are allowed, and the container is restarted. This commit changes the healthcheck to use curl, which the healthcheck itself installs if it is not already available. This is a workaround, to be reverted once a fix for distribution/distribution-library-image#158 lands and a new image is published.
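For illustration only: a health probe just needs to issue one HTTP request and look at the status code. Below is a minimal Python sketch of that logic, done in-process so that no helper binary is spawned. The function name `registry_is_healthy` and the choice to treat 401 as "up" are assumptions made for this sketch; the commit referenced above uses curl, not Python, and this assumes an interpreter is available wherever the probe runs.

```python
import urllib.error
import urllib.request

# Ignore proxy environment variables: a probe should talk to the service directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def registry_is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if `url` answers with HTTP 200 or 401 (hypothetical policy)."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        # Assumption: a 401 still means the service is up, auth is just required.
        return err.code == 401
    except (urllib.error.URLError, OSError):
        # Connection refused, TLS failure, timeout, DNS error, and so on.
        return False
```

A probe script could call this with whatever endpoint the deployment exposes and exit non-zero on `False`.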
I'm a bit confused, you report that the issue affects
The example shows you've reproduced it on the Can you grab the latest
Yeah, it reproduces on the 3.0.0-alpha.1 I pulled too:
I got confused too, actually, but I had to go work on other issues and wanted to at least raise the issue.
I'll double-check this too as soon as I can.
Ok, I was able to reproduce in the base image by having
This is also reproducible on the ubuntu image when using the busybox version of wget, actually (as long as the PID 1 process cannot collect the defunct process?). I'll check where I can open an issue for busybox. In the meantime, would it be OK to include curl or wget (the non-busybox version) in this image? Proposed change in #159.
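On the PID 1 question: an exited child stays defunct until its parent waits on it. Here is a minimal Python sketch of a non-blocking reaper, assuming a Unix-like system. `leak_one_zombie` and `reap_finished_children` are hypothetical names for illustration; this is not what busybox or the registry actually does.

```python
import os
from typing import List


def leak_one_zombie() -> int:
    """Fork a child that exits at once; it stays defunct until someone waits on it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers
    return pid


def reap_finished_children() -> List[int]:
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A parent that never makes a call like `reap_finished_children()` accumulates one defunct entry per exited child, which is the pattern described in this thread.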
Opened an upstream bug for busybox at https://bugs.busybox.net/show_bug.cgi?id=15967
A couple of questions -- sorry if they come across as dumb, but I'm still confused and would like to understand the problem better:
The wget available in the alpine image is a busybox applet, affected by the above-mentioned bug. If we use that binary to run a healthcheck on the registry's availability (e.g.,
When running busybox wget v1.31.x to request an SSL page, there is a leak of defunct processes that could eventually hit the limit for fork syscalls, making the main registry process unavailable. Installing wget from the alpine repository resolves the issue, as it only affects the busybox distribution. Moreover, this should transparently resolve the issue for deployments that use wget as a healthcheck/readiness/liveness probe.
xref: https://bugs.busybox.net/show_bug.cgi?id=15967
Closes distribution#158
Signed-off-by: aleskandro <[email protected]>
FYI: we bumped the base
The included wget binary (from busybox) seems to leak defunct ssl_client processes.
When using wget for health checking a registry serving on SSL, the host's ulimit for forking is eventually reached, and the registry becomes nonfunctional.
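One way to watch the leak grow is to count processes in state `Z` (defunct) under `/proc`, grouped by command name. This is a rough Python sketch, assuming a Linux `/proc` filesystem; the helper names `parse_stat`, `count_defunct`, and `read_stat_lines` are hypothetical and not part of any tool mentioned here.

```python
import glob
from collections import Counter
from typing import Iterable, Iterator, Tuple


def parse_stat(line: str) -> Tuple[int, str, str]:
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so cut at the last ')' rather than splitting on spaces.
    head, tail = line.rsplit(")", 1)
    pid, comm = head.split(" (", 1)
    return int(pid), comm, tail.split()[0]


def count_defunct(stat_lines: Iterable[str]) -> Counter:
    """Count defunct (state 'Z') processes per command name."""
    counts = Counter()
    for line in stat_lines:
        _pid, comm, state = parse_stat(line)
        if state == "Z":
            counts[comm] += 1
    return counts


def read_stat_lines() -> Iterator[str]:
    """Yield the stat line of every process currently visible in /proc."""
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                yield f.read()
        except OSError:
            continue  # the process exited between listing and reading
```

Running `count_defunct(read_stat_lines())` inside the container at intervals should show whether the defunct count for a given command keeps climbing between healthcheck runs.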
Affected versions and env
Server: Docker Engine - Community
Engine:
Version: 25.0.3
API version: 1.44 (minimum version 1.24)
Go version: go1.21.6
Git commit: f417435
Built: Tue Feb 6 21:14:27 2024
OS/Arch: linux/amd64
Kernel: Linux seraph 6.8.0-0.rc5.41.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Feb 19 14:05:40 UTC 2024 x86_64 GNU/Linux
Host OS: Fedora CoreOS
Steps to reproduce
Other info
I tried it on the base image, alpine:3.18.6, and it didn't reproduce.