Image fetching does not work #238

Open
madhosoi opened this issue Apr 6, 2021 · 3 comments

Comments

madhosoi commented Apr 6, 2021

Hi all,
I am testing a Concourse deployment via Helm on an on-premise cluster deployed with RKE (K8s, on CentOS 7 nodes with kernel 5.4). Everything seems to work fine, but when I run the first pipeline (helloworld), which tries to fetch a busybox image, I get this error:

selected worker: concourse-worker-0
fetching busybox@sha256:b175d1e57360807d83cb951c876c8104a5b49403e5e2e5605bfd0c4d76a34d8d
0d282f5b5b27 [======================================] 748.5KiB/748.7KiB
ERRO[0004] download failed: save image: write rootfs: extract image: lchown /tmp/build/get/rootfs/home: operation not permitted 

Other useful information:
The worker Persistent Volume is served over NFS3.
I can set up Concourse on the same node type with Docker Compose and with HashiCorp Nomad, and both run as expected.

Could you help me understand why this happens?

Thanks!!
Miguel

madhosoi commented Apr 6, 2021

The pipeline uses the registry-image resource type. If I change it to docker-image, the pipeline works as expected.

taylorsilva (Member) commented

Looking at the lchown man page: https://linux.die.net/man/2/lchown

It looks like the CAP_CHOWN capability is required for it to run successfully. Looking at the code, we only attempt chowning if we're the root user: https://github.com/concourse/registry-image-resource/blob/7e65e92c87d7dcc8a83508a0cf771526e38a2299/commands/unpack.go#L28

This is probably a problem specific to RKE or CentOS 7. I'm not familiar with either system. You could try adding the CAP_CHOWN capability to the worker pods and see if that fixes the issue. I'm not convinced that will fix it, though. The registry-image resource pulling the image should be running under a non-root uid and therefore shouldn't go down that chowning code path at all... Weird indeed.
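
Before changing the pod spec, it may help to check whether the worker process already holds CAP_CHOWN. The sketch below is one way to do that from inside the worker container. It assumes a Linux /proc filesystem, and hasCapChown is a hypothetical helper written for this thread, not part of Concourse or the registry-image resource. CAP_CHOWN is capability number 0, so it is the lowest bit of the CapEff mask.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capChown is the bit index of CAP_CHOWN in Linux capability masks.
const capChown = 0

// hasCapChown reports whether a hex capability mask, as printed on the
// CapEff line of /proc/self/status, has the CAP_CHOWN bit set.
// Hypothetical helper for illustration only.
func hasCapChown(maskHex string) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(maskHex), 16, 64)
	if err != nil {
		return false, err
	}
	return mask&(1<<capChown) != 0, nil
}

func main() {
	f, err := os.Open("/proc/self/status") // Linux only
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read process status:", err)
		os.Exit(1)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "CapEff:") {
			continue
		}
		ok, err := hasCapChown(strings.TrimPrefix(line, "CapEff:"))
		if err != nil {
			fmt.Fprintln(os.Stderr, "cannot parse CapEff:", err)
			return
		}
		fmt.Printf("euid=%d CAP_CHOWN=%v\n", os.Geteuid(), ok)
		return
	}
	fmt.Fprintln(os.Stderr, "no CapEff line found")
}
```

If this reports that CAP_CHOWN is present and the process runs as root, the capability is probably not the missing piece and the storage layer is worth a look instead.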

serverwentdown commented May 10, 2021

@madhosoi Your PersistentVolume likely does not support chown operations. If you're using NFS like me, ensure that no_root_squash is enabled on the export.
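
A quick way to test this theory is to try an lchown on the volume directly. The following is a minimal sketch, not anything shipped with Concourse: probeChown is a hypothetical helper, and the 1000:1000 owner is an arbitrary choice. It is only meaningful when run as root (as the worker does) against a directory on the NFS-backed volume, since an unprivileged user would get "operation not permitted" on any filesystem.

```go
package main

import (
	"fmt"
	"os"
)

// probeChown creates a scratch directory under dir, tries to lchown it to
// uid:gid, and removes it again. An empty dir means the system temp
// directory. A uid or gid of -1 leaves that ID unchanged.
// Hypothetical helper for illustration only.
func probeChown(dir string, uid, gid int) error {
	scratch, err := os.MkdirTemp(dir, "chown-probe-")
	if err != nil {
		return fmt.Errorf("create scratch dir: %w", err)
	}
	defer os.RemoveAll(scratch)

	if err := os.Lchown(scratch, uid, gid); err != nil {
		return fmt.Errorf("lchown: %w", err)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: chownprobe <directory-on-the-volume>")
		os.Exit(2)
	}
	// 1000:1000 is an arbitrary non-root owner chosen for the probe.
	if err := probeChown(os.Args[1], 1000, 1000); err != nil {
		fmt.Println("chown failed:", err)
		os.Exit(1)
	}
	fmt.Println("chown works on", os.Args[1])
}
```

If the probe fails as root on the NFS mount but succeeds on a local directory, that points at the export options rather than at Concourse or pod capabilities.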
