Standardize forwarding crashes to containers #102
Comments
Citation needed. In most cases, that's actually not true, the container might not even exist anymore when the crashdump is received. |
One example: The test case mentioned in the bug description in https://bugs.launchpad.net/ubuntu/+source/apport/+bug/2063349. Another example: autopkgtest runners on Ubuntu armhf. Do you have examples where containers are destroyed when one process crashes inside? |
Anything single process, and anything that is closed or upgraded after the crash. Containers are ephemeral and volatile by definition. What's the point in doing this forwarding at all? |
Without getting into the weeds of why this is useful, one could just note that there are at least two crash dump handlers that grew that capability independently. I agree that the single-process container is a common pattern, but it's not the only use case for containers. So, assuming the container survives the crash and has a crash handler installed, it will get much more out of the crash dump than the host's handler, since it knows the details of the container. Think of running an Ubuntu container on a Fedora host. |
Have you seen https://systemd.io/ELF_PACKAGE_METADATA/ ? I should probably move that here. I've already mentioned this to @enr0n: please consider enabling that spec distro-wide in Ubuntu, so that the host can get all the information from a crash in the guest without any need for communication, simply by parsing the core file. Fedora already implements it, so if you try the opposite (crash a Fedora guest on an Ubuntu host), coredumpctl on the host will give you a lot of info. In fact several packages in Debian/Ubuntu already use it, including all systemd ones, so if you crash any of those they'll already contain the info. This is done on a package-by-package opt-in basis; for a distro-wide debhelper change see: https://salsa.debian.org/debian/debhelper/-/merge_requests/98 (unfortunately going nowhere in Debian due to dpkg politics, but this shouldn't be a problem for you) |
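For readers unfamiliar with the spec, here is a minimal sketch of what "parsing the core file" boils down to once the raw bytes of a `.note.package` section have been located. It assumes little-endian ELF notes and uses the owner `FDO` and note type `0xcafe1a7e` from the spec; the helper names (`build_note`, `parse_package_note`) and the sample metadata are made up for illustration, and locating the section inside a real core file is not shown.

```python
import json
import struct

# Owner and note type as defined by the ELF package metadata spec.
PACKAGE_NOTE_OWNER = b"FDO\x00"
PACKAGE_NOTE_TYPE = 0xCAFE1A7E


def _pad4(size):
    """Round size up to the 4-byte alignment used inside ELF notes."""
    return (size + 3) & ~3


def build_note(metadata):
    """Build a little-endian note blob (hypothetical helper, used for the demo)."""
    desc = json.dumps(metadata).encode("utf-8") + b"\x00"
    header = struct.pack("<III", len(PACKAGE_NOTE_OWNER), len(desc), PACKAGE_NOTE_TYPE)
    owner = PACKAGE_NOTE_OWNER.ljust(_pad4(len(PACKAGE_NOTE_OWNER)), b"\x00")
    return header + owner + desc.ljust(_pad4(len(desc)), b"\x00")


def parse_package_note(blob):
    """Return the package metadata dict from a notes blob, or None if absent."""
    offset = 0
    while offset + 12 <= len(blob):
        namesz, descsz, note_type = struct.unpack_from("<III", blob, offset)
        offset += 12
        name = blob[offset:offset + namesz]
        offset += _pad4(namesz)
        desc = blob[offset:offset + descsz]
        offset += _pad4(descsz)
        if name == PACKAGE_NOTE_OWNER and note_type == PACKAGE_NOTE_TYPE:
            # The payload is a JSON object, possibly NUL-padded.
            return json.loads(desc.rstrip(b"\x00").decode("utf-8"))
    return None


# Illustrative values only; not taken from a real package.
sample = {"type": "deb", "name": "example", "version": "1.0-1", "osCpe": "cpe:/o:canonical:ubuntu"}
parsed = parse_package_note(build_note(sample))
```

Because the metadata travels inside the binary itself, the host handler needs no channel into the container to learn which package and distro the crashed program came from.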
I have seen https://systemd.io/ELF_PACKAGE_METADATA/ and it has been on my todo wish list for a long time. Thanks for the pointer to https://salsa.debian.org/debian/debhelper/-/merge_requests/98. I'll read the discussion there. If there are no technical reasons against the proposed implementation, we could carry this delta in Ubuntu to add the ELF metadata by default. |
That would be very nice, thanks |
I read enough for today. @bluca since you submitted https://salsa.debian.org/debian/debhelper/-/merge_requests/98 are you willing to submit against |
Yes I can look into that in the next few days |
@bdrung here's a PR: https://code.launchpad.net/~bluca/ubuntu/+source/dpkg/+git/dpkg/+merge/465957 |
Let me ask one question: wouldn't a nicer option be to switch to systemd-coredump as the backend for Apport? I mean, that's what everyone else ended up doing, for example rh's abrt: they let systemd-coredump do the initial dirty work and then hook into it at a later step. Is there anything that the coredump collection logic in Apport can do that systemd-coredump cannot do anyway? I mean, systemd-coredump has really nice features, such as the sandboxing and backtrace extraction and stuff, or the container forwarding. To me it appears like a much simpler approach. I mean, I can very much understand why you want that, i.e. in particular for closing the loop in CIs and suchlike, so that they can get access to their own crashes. But I am a bit reluctant to commit to a generic API for this, as we tend to interpret certain things (rlimit_core) a bit differently from others, and hence our handlers get called differently from others. I think we can commit to compat between differently versioned containers and hosts to some degree, but I am a bit conservative in committing to more than that on this interface. |
@bdrung any update on that MR? |
Apport in Ubuntu 24.04 gained support for using systemd-coredump as backend: https://discourse.ubuntu.com/t/apport-2-28-0-gained-systemd-coredump-integration/44910
The problem with switching Apport's backend from Apport to systemd-coredump is backward/forward compatibility: Let's assume we have the following basic installations:
So we need a solution for a new host with old containers. Changing the default setup of the old containers to include systemd-coredump is too invasive. We could modify Apport to be able to read the crash information from systemd-coredump on the host via a socket. That's what this ticket is about. |
Different crash dump handlers, such as systemd-coredump and Apport, are available. When a process crashes inside a container, the crash dump handler on the host receives the crash and needs to forward it into the container. This crash forwarding works if the same handler is present on the host and in the container (e.g. systemd-coredump on the host and systemd-coredump in the container, or Apport on the host and Apport in the container). If the crash dump handler in the container differs from the handler on the host, the forwarding will not work (see "systemd-coredump handler does not forward the crash to the container" for an example).
To make forwarding crashes work in all of these scenarios, please standardize the way crashes are forwarded to containers. I suggest specifying the location of a socket in the container and how the needed information (like the crashed process ID) is sent to the socket.
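To make the request concrete, here is a rough sketch of what such an interface could look like. Everything in it is an assumption for illustration: the socket path, the length-prefixed JSON framing, and the field names are hypothetical and do not describe how systemd-coredump or Apport forward crashes today.

```python
import json
import socket
import struct

# Hypothetical location; this issue asks for a standardized path but does not name one.
CRASH_SOCKET_PATH = "/run/crash-forward.socket"


def encode_crash(pid, signal_number, executable):
    """Serialize crash metadata as a 4-byte length prefix followed by JSON."""
    payload = json.dumps(
        {"pid": pid, "signal": signal_number, "executable": executable}
    ).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload


def _recv_exact(conn, size):
    """Read exactly size bytes from a stream socket."""
    chunks = []
    remaining = size
    while remaining:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_crash(conn, pid, signal_number, executable):
    """Host side: forward the crash metadata over a connected socket."""
    conn.sendall(encode_crash(pid, signal_number, executable))


def receive_crash(conn):
    """Container side: read one crash record from the socket."""
    (length,) = struct.unpack("!I", _recv_exact(conn, 4))
    return json.loads(_recv_exact(conn, length).decode("utf-8"))


def demo():
    """Simulate host and container ends with a connected socket pair."""
    host_side, container_side = socket.socketpair()
    with host_side, container_side:
        send_crash(host_side, 1234, 11, "/usr/bin/example")
        return receive_crash(container_side)
```

In a real setup the host handler would connect to the agreed path inside the container's filesystem rather than use a socket pair, and the core dump itself would still have to be transferred, for example as a byte stream after the metadata or as a passed file descriptor.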