Example crictl run/runp fail on a machine with a running k8s CP #1696

Open
RonBarkan opened this issue Nov 22, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@RonBarkan

RonBarkan commented Nov 22, 2024

What happened:

On a Linux system with a healthy single-node Kubernetes control plane running on containerd, I tried the example run/runp commands here and here, and I got the following errors:

$ sudo crictl -r unix:///run/containerd/containerd.sock runp /tmp/nginx-pod.json 
E1122 19:08:31.584796 3158795 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: 
runc create failed: expected cgroupsPath to be of format \"slice:prefix:name\" for systemd cgroups, got \"/k8s.io/e5a83c8255cf21db9fa18c1999cb571db2139e87ed0c592324e851117eefc9f6\" instead: unknown"

and

$ sudo crictl -r unix:///run/containerd/containerd.sock run /tmp/container.json /tmp/nginx-pod.json
E1122 19:12:17.887097 3159492 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: 
runc create failed: expected cgroupsPath to be of format \"slice:prefix:name\" for systemd cgroups, got \"/k8s.io/7f31c4319bc73ca556da493fee2f7c28abef514e0103e7277f766556da9c0d8f\" instead: unknown"
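For reference, the runc message above distinguishes two shapes of cgroupsPath: a cgroupfs-style path (which is what was passed) and the colon-separated "slice:prefix:name" form it expects with systemd cgroups. The following is only a sketch of that distinction; the helper name `is_systemd_cgroups_path` and the sample values are made up for illustration, and the check looks at the shape of the string only, not at whether the slice exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of crictl, containerd, or runc).
# Succeeds when the argument has exactly three colon-separated fields,
# i.e. the "slice:prefix:name" shape named in the runc error message.
is_systemd_cgroups_path() {
    case "$1" in
        /*)      return 1 ;;  # cgroupfs-style absolute path
        *:*:*:*) return 1 ;;  # more than three fields
        *:*:*)   return 0 ;;  # slice:prefix:name
        *)       return 1 ;;
    esac
}

# Illustrative values only; "abc123" stands in for a real sandbox ID.
for p in "/k8s.io/abc123" "system.slice:cri-containerd:abc123"; do
    if is_systemd_cgroups_path "$p"; then
        echo "$p -> systemd format"
    else
        echo "$p -> not systemd format"
    fi
done
```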

Content of the files (copied from above links):

$ cat /tmp/container.json 
{
  "metadata": {
      "name": "busybox"
  },
  "image":{
      "image": "busybox"
  },
  "command": [
      "top"
  ],
  "log_path":"busybox.0.log",
  "linux": {
  }
}
$ cat /tmp/nginx-pod.json 
{
    "metadata": {
        "name": "nginx-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
    }
}

What you expected to happen:

The examples to work.

How to reproduce it (as minimally and precisely as possible):

Installed containerd version 1.6.12 through apt. crictl is v1.31.1 and v1.28.0.

The config.toml was generated using:

containerd config default | sed "s/SystemdCgroup *= *false/SystemdCgroup = true/" | sudo tee /etc/containerd/config.toml

Which means it uses SystemdCgroup = true.
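To show what that sed expression does without touching the real config, here is a small sketch that applies it to a sample line. The sample line is an assumption about how the setting appears in the generated file, not copied from actual `containerd config default` output.

```shell
#!/bin/sh
# Assumed sample line; the indentation in the real file may differ.
sample='            SystemdCgroup = false'

# Same substitution as in the command used to generate config.toml.
result=$(printf '%s\n' "$sample" | sed "s/SystemdCgroup *= *false/SystemdCgroup = true/")
printf '%s\n' "$result"

# To check the live file instead (needs read access to the file):
#   grep -n 'SystemdCgroup' /etc/containerd/config.toml
```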

Anything else we need to know?:

Cilium with kube-proxy is installed on the healthy Kubernetes control plane.

In case this is important:

sudo cat /var/lib/kubelet/config.yaml | grep cgroup
cgroupDriver: systemd

Environment:

  • Container runtime or hardware configuration:
    • containerd 1.6.12
    • crictl v1.31.1 and v1.28.0
    • kubelet (presumably not relevant): v1.29.6
  • OS (e.g: cat /etc/os-release): Debian GNU/Linux rodete
  • Kernel (e.g. uname -a): 6.9.10-1rodete5-amd64
  • Others:
@RonBarkan RonBarkan added kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 22, 2024
@kannon92
Contributor

Reading this I don’t think this is a bug with crictl but with containerd. Your version is pretty old so I’d maybe ask containerd on this one.

@akhilerm
Contributor

Can you also provide the contents of the nginx-pod.json file? Are you setting the "cgroup_parent" field? I ask because the change to get the cgroup driver from the container runtime was added in crictl 1.29.0 (ref: #1302). In crictl 1.28.0, you have to pass the cgroup_parent value; otherwise it defaults to cgroupfs-style syntax.

@RonBarkan
Author

RonBarkan commented Dec 2, 2024

@akhilerm @kannon92

I have updated the description to show the content of the json files. I've also corrected the 1st link to the correct runp example.

I have downloaded crictl version 1.31.1, which results in an identical error message. It looks like the doc already showed the same examples at the time #1302 was merged (see here).

I was not setting cgroup_parent and could not find any information about how to set it. If you think it is needed for version 1.31.1, please let me know how to configure it.

@youwalther65

youwalther65 commented Jan 14, 2025

@akhilerm @kannon92 I got the same error using the runp example from crictl GitHub here.
It would be helpful to update this with a working example for containerd using SystemdCgroup=true and kubelet using "cgroupDriver": "systemd".
I am running:

# containerd --version
containerd github.com/containerd/containerd 1.7.23 57f17b0a6295a39009d861b89e3b3b87b005ca27

@youwalther65

youwalther65 commented Jan 14, 2025

I got one step further by using the following pod json

# cat pod-config.json
{
    "metadata": {
        "name": "nginx-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
        "cgroup_parent": "system.slice"
    }
}

# crictl  runp pod-config.json
469d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6

But this sandbox ID wasn't visible and couldn't be used for create:

# crictl pods --id 469d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6
POD ID              CREATED             STATE               NAME                NAMESPACE           ATTEMPT             RUNTIME

# crictl create 69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6 ctr.json pod-config.json
E0114 09:50:18.822738 3887719 remote_runtime.go:319] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = NotFound desc = failed to find sandbox id \"69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6\": not found" podSandboxID="69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6"
FATA[0000] creating container: rpc error: code = NotFound desc = failed to find sandbox id "69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6": not found
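Note that the ID in the create command above is one character shorter than the one runp printed (the leading 4 is missing), although the `crictl pods --id` call with the full ID also returned nothing. A sketch of a guard against that kind of copy/paste slip follows; the helper name `is_full_sandbox_id` is hypothetical, and the 64-hex-character length is an assumption taken from the ID shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for exactly 64 lowercase hex characters,
# which is the shape of the sandbox ID printed by runp in this thread.
is_full_sandbox_id() {
    [ "${#1}" -eq 64 ] || return 1
    case "$1" in
        *[!0-9a-f]*) return 1 ;;  # contains a non-hex character
    esac
    return 0
}

# Intended use (left commented out because it needs a running runtime):
#   POD_ID=$(crictl runp pod-config.json)
#   is_full_sandbox_id "$POD_ID" && crictl create "$POD_ID" ctr.json pod-config.json
```

Capturing the ID in a variable, as in the commented usage, avoids retyping it at all.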

Sometimes I see the pod in NotReady state for a few seconds, then it disappears.

This happens regardless of whether the OS is Amazon Linux 2 (cgroup v1 based) or AL2023 (cgroup v2 based).
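For anyone comparing hosts, a common way to tell which cgroup version a node uses is to look at the filesystem type mounted at /sys/fs/cgroup: `cgroup2fs` indicates v2 and `tmpfs` indicates v1. A minimal sketch, assuming GNU `stat` is available; the function name `cgroup_version_from_fstype` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: maps the filesystem type of /sys/fs/cgroup
# to a cgroup version label.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# GNU stat: -f queries the filesystem, %T prints its type by name.
fstype=$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || echo "unavailable")
echo "cgroup $(cgroup_version_from_fstype "$fstype") (filesystem type: $fstype)"
```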

@youwalther65

Using "cgroup_parent": "kubepods.slice" showed the same result.
