
k8s: guest-pull: Kill all processes in container test fails when pulling the image inside the guest #9664

Open

fidencio opened this issue May 19, 2024 · 1 comment

Labels: area/guest-pull, bug (Incorrect behaviour), needs-review (Needs to be assessed by the team.)

@fidencio (Member)

# Events:
#   Type     Reason     Age                From               Message
#   ----     ------     ----               ----               -------
#   Normal   Scheduled  90s                default-scheduler  Successfully assigned kata-containers-k8s-tests/busybox to 984fee00bd70.jf.intel.com
#   Normal   Pulling    82s                kubelet            Pulling image "quay.io/prometheus/busybox:latest"
#   Normal   Pulled     82s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 373ms (373ms including waiting)
#   Normal   Created    82s                kubelet            Created container first
#   Normal   Started    80s                kubelet            Started container first
#   Normal   Pulled     79s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 420ms (420ms including waiting)
#   Warning  Failed     78s                kubelet            Error: failed to create containerd task: failed to create shim task: the file sleep was not found: unknown
#   Warning  Failed     76s                kubelet            Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/2ccc250bd74a81ef8c525b6acecaf3ff9c00cc8a2700c3ef17446c75374e32d7/rootfs to /run/kata-containers/first-test-container/rootfs, with error: ENOENT: No such file or directory: unknown
#   Normal   Pulled     59s (x2 over 77s)  kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 364ms (364ms including waiting)
#   Normal   Pulling    29s (x4 over 80s)  kubelet            Pulling image "quay.io/prometheus/busybox:latest"
#   Normal   Pulled     29s                kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 357ms (357ms including waiting)
#   Normal   Created    29s (x4 over 79s)  kubelet            Created container first-test-container
#   Warning  Failed     29s (x2 over 59s)  kubelet            Error: failed to create containerd task: failed to create shim task: load image bundle

And then taking a look at the test itself, we see:

setup() {
...
yaml_file="${pod_config_dir}/initcontainer-shareprocesspid.yaml"
...
}

...
@test "Kill all processes in container" {
        # Create the pod
        kubectl create -f "${yaml_file}"
...
}
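For reference, a minimal sketch of what initcontainer-shareprocesspid.yaml presumably looks like, based on the container names and image in the events above (the field values and the sleep commands are assumptions, not the actual file):

```yaml
# Hypothetical reconstruction -- the real file lives in the kata-containers
# test suite; the commands and timings below are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  shareProcessNamespace: true
  initContainers:
    - name: first
      image: quay.io/prometheus/busybox:latest
      command: ["sleep", "1"]
  containers:
    - name: first-test-container
      image: quay.io/prometheus/busybox:latest
      command: ["sleep", "30"]
```

Note that both the initContainer and the application container reference the same image, which is exactly the pattern discussed below.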

Looking at the error, it makes me think that nydus is not being used properly for initContainers.

@ChengyuZhu6, would you mind verifying whether it works on your end?

@fidencio fidencio added bug Incorrect behaviour needs-review Needs to be assessed by the team. labels May 19, 2024
fidencio added a commit to fidencio/kata-containers that referenced this issue May 19, 2024
This test fails when using `shared_fs=none` with the nydus snapshotter,
and we're tracking the issue here:
kata-containers#9664

For now, let's have it skipped.

Signed-off-by: Fabiano Fidêncio <[email protected]>
@wainersm (Contributor)

Hi @fidencio! One question: did you get this failure when setting shared_fs=none together with the runtime handler annotation?

Asking because I'm getting this error even with shared_fs=9p (i.e. using 9p). I noticed that because I'm running a set of tests for qemu-coco-dev with the runtime handler annotation set ("io.containerd.cri.runtime-handler").

Another test that's failing, and which I haven't seen you report, is k8s-empty-dirs.bats. Wondering why.
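For context, setting the runtime handler annotation wainersm mentions looks roughly like this (the annotation key is from the comment above; the runtime class name and pod layout are assumptions for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  annotations:
    # Tells containerd's CRI plugin which runtime handler serves this pod;
    # the value must match the pod's RuntimeClass (name assumed here).
    io.containerd.cri.runtime-handler: kata-qemu-coco-dev
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: first-test-container
      image: quay.io/prometheus/busybox:latest
```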

ChengyuZhu6 added a commit to ChengyuZhu6/kata-containers that referenced this issue May 22, 2024
Fix tests:
- k8s-credentials-secrets.bats
- k8s-file-volume.bats
- k8s-nested-configmap-secret.bats
- k8s-projected-volume.bats
- k8s-volume.bats

Fixes: kata-containers#9664 kata-containers#9666 kata-containers#9667 kata-containers#9668

Signed-off-by: ChengyuZhu6 <[email protected]>
ChengyuZhu6 added a commit to ChengyuZhu6/kata-containers that referenced this issue May 22, 2024
Revert

Fix tests:
- k8s-credentials-secrets.bats
- k8s-file-volume.bats
- k8s-nested-configmap-secret.bats
- k8s-projected-volume.bats
- k8s-volume.bats
- k8s-shared-volume.bats
- k8s-kill-all-process-in-container.bats
- k8s-sysctls.bats

Fixes: kata-containers#9664 kata-containers#9666 kata-containers#9667 kata-containers#9668

Signed-off-by: ChengyuZhu6 <[email protected]>
ChengyuZhu6 added a commit to ChengyuZhu6/kata-containers that referenced this issue May 22, 2024
Revert code logic in 462051b

Let me explain why:

In our previous approach, we implemented guest pull by passing PullImageRequest to the guest.
However, this method resulted in the loss of specifications essential for running the container,
such as commands specified in the YAML, during the CreateContainer stage. To address this,
it is necessary to integrate the OCI specifications and process information from the image's configuration with the container in guest pull.

The snapshotter method is not affected by this issue. Nevertheless, a problem arises when two containers in the same pod attempt
to pull the same image, as with an initContainer. This is because the system searches for the existing configuration, which resides in the
guest. The configuration, associated with <image name, cid>, is stored in the directory /run/kata-containers/<cid>. Consequently, when
the initContainer finishes its task and terminates, the directory ceases to exist. As a result, during the creation of the application container,
the OCI spec and process information cannot be merged due to the absence of the expected configuration file.

Fix tests:
- k8s-credentials-secrets.bats
- k8s-file-volume.bats
- k8s-nested-configmap-secret.bats
- k8s-projected-volume.bats
- k8s-volume.bats
- k8s-shared-volume.bats
- k8s-kill-all-process-in-container.bats
- k8s-sysctls.bats

Fixes: kata-containers#9664 kata-containers#9666 kata-containers#9667 kata-containers#9668

Signed-off-by: ChengyuZhu6 <[email protected]>
@katacontainersbot katacontainersbot moved this from To do to In progress in Issue backlog May 22, 2024
wainersm added a commit to wainersm/kata-containers that referenced this issue May 29, 2024
This test fails with qemu-coco-dev configuration and guest-pull image pull.

Issue: kata-containers#9664
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
ryansavino added a commit to AdithyaKrishnan/kata-containers that referenced this issue Jun 1, 2024
This test fails when using `shared_fs=none` with the nydus snapshotter;
the issue is tracked here: kata-containers#9664.
Skipping for now.

Signed-Off-By: Ryan Savino <[email protected]>