Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix runc kill and runc delete for containers with no init and no private PID namespace #4102

Merged
merged 8 commits into from
Nov 28, 2023

Commits on Nov 27, 2023

  1. libct: replace runType with hasInit

    The semantics of runType is slightly complicated, and the only place
    where we need to distinguish between Created and Running is
    refreshState.
    
    Replace runType with simpler hasInit, simplifying its users (except the
    refreshState, which now figures out on its own whether the container is
    Created or Running).
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    d9f2a24 View commit details
    Browse the repository at this point in the history
  2. libct: Signal: slight refactor

    Let's use c.hasInit and c.isPaused where needed instead of
    c.curentStatus for simplicity.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    542cce0 View commit details
    Browse the repository at this point in the history
  3. runc kill: fix sending KILL to non-pidns container

    Commit f8ad20f made it impossible to kill leftover processes in a
    stopped container that does not have its own PID namespace. In other
    words, if a container init is gone, it is no longer possible to use
    `runc kill` to kill the leftover processes.
    
    Fix this by moving the check if container init exists to after the
    special case of handling the container without own PID namespace.
    
    While at it, fix the minor issue introduced by commit 9583b3d:
    if signalAllProcesses is used, there is no need to thaw the
    container (as freeze/thaw is either done in signalAllProcesses already,
    or not needed at all).
    
    Also, make signalAllProcesses return an error early if the container
    cgroup does not exist (as it relies on it to do its job). This way, the
    error message returned is more generic and easier to understand
    ("container not running" instead of "can't open file").
    
    Finally, add a test case.
    
    Fixes: f8ad20f
    Fixes: 9583b3d
    Co-authored-by: lifubang <[email protected]>
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin and lifubang committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    dcf1b73 View commit details
    Browse the repository at this point in the history
  4. runc delete -f: fix for no pidns + no init case

    Commit f8ad20f moved the kill logic from container destroy to container
    kill (which is the right thing to do).
    
    Alas, it broke the use case of doing "runc delete -f" for a container
    which does not have its own private PID namespace, when its init process
    is gone. In this case, some processes may still be running, and runc
    delete -f should kill them (the same way as "runc kill" does).
    
    It does not do that because the container status is "stopped" (as runc
    considers the container with no init process as stopped), and so we only
    call "destroy" (which was doing the killing before).
    
    The fix is easy: if --force is set, call killContainer no matter what.
    
    Add a test case, similar to the one in the previous commit.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    29283bb View commit details
    Browse the repository at this point in the history
  5. libct/cg: improve cgroup removal logic

    The current code is only doing retries in RemovePaths, which is only
    used for cgroup v1 (cgroup v2 uses RemovePath, which makes no retries).
    
    Let's remove all retry logic and logging from RemovePaths, together
    with:
    
     - os.Stat check from RemovePaths (its usage probably made sense before
       commit 19be8e5 but not after);
    
     - error/warning logging from RemovePaths (this was added by commit
       19be8e5 in 2020 and so far we've seen no errors other
       than EBUSY, so reporting the actual error proved to be useless).
    
    Add the retry logic to rmdir, and the second retry bool argument.
    Decrease the initial delay and increase the number of retries from the
    old implementation so it can take up to ~1 sec before returning EBUSY
    (was about 0.3 sec).
    
    Hopefully, as a result, we'll have less "failed to remove cgroup paths"
    errors.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    d3d7f7d View commit details
    Browse the repository at this point in the history
  6. runc delete: do not ignore error from destroy

    If container.Destroy() has failed, runc destroy still return 0, which is
    wrong and can result in other issues down the line.
    
    Let's always return error from destroy in runc delete.
    
    For runc checkpoint and runc run, we still treat it as a warning.
    
    Co-authored-by: Zhang Tianyang <[email protected]>
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin and Burning1020 committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    7396ca9 View commit details
    Browse the repository at this point in the history
  7. runc delete, container.Destroy: kill all processes

    (For a container with no private PID namespace, that is).
    
    When runc delete (or container.Destroy) is called on a stopped
    container without private PID namespace and there are processes
    in its cgroup, kill those.
    
    Add a test case.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    ab3cd8d View commit details
    Browse the repository at this point in the history
  8. libct: Destroy: don't proceed in case of errors

    For some reason, container destroy operation removes container's state
    directory even if cgroup removal fails (and then still returns an
    error). It has been that way since commit 5c246d0, which added
    cgroup removal.
    
    This is problematic because once the container state dir is removed, we
    no longer know container's cgroup and thus can't remove it.
    
    Let's return the error early and fail if cgroup can't be removed.
    
    Same for other operations: do not proceed if we fail.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Nov 27, 2023
    Configuration menu
    Copy the full SHA
    a6f4081 View commit details
    Browse the repository at this point in the history