Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove user privilege checks from preflight tests #128799

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

srivastav-abhishek
Copy link
Contributor

@srivastav-abhishek srivastav-abhishek commented Nov 14, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR removes user-privilege checks for some preflight tests to make sure UT behaviour is consistent and shouldn't get skipped for non-root users.

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#3124

Special notes for your reviewer:

Does this PR introduce a user-facing change?

None

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubeadm labels Nov 14, 2024
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 14, 2024
@srivastav-abhishek
Copy link
Contributor Author

manifestsDir := filepath.Join(kubeadmconstants.KubernetesDir, kubeadmconstants.ManifestsSubDirName)
checks := []Checker{
IsPrivilegedUserCheck{},
Copy link
Member

@neolit123 neolit123 Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be added only when this condition is true

if !isSecondaryControlPlane {

just create the checks variable earlier:

var checks []Checker

and keep the condition.

later checks must be appended.

@neolit123
Copy link
Member

thanks for the PR, i've added one comment.
/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 14, 2024
@srivastav-abhishek
Copy link
Contributor Author

/retest

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, running e2e on this PR here:
https://github.com/neolit123/kubeadm-test/actions/runs/11853015645

will also test locally as it's a behavior change.

we might still consider the idea to calling RunRootCheckOnly before Init|Join checks are called and exiting early.

@neolit123
Copy link
Member

neolit123 commented Nov 15, 2024

@SataQiu @pacoxu @carlory

before this change kubeadm exits fast on init/join when the user is not root

$ kubeadm init
[init] Using Kubernetes version: v1.31.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR IsPrivilegedUser]: user is not running as root
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

with this change it runs some more checks before exiting with an error:

$ kubeadm init
[init] Using Kubernetes version: v1.31.2
[preflight] Running pre-flight checks
W1115 10:47:13.259101   12639 checks.go:1075] [preflight] WARNING: Couldn't create the interface used for talking to the container runtime: failed to create new CRI runtime service: validate service connection: validate CRI v1 runtime API for endpoint "unix:///var/run/containerd/containerd.sock": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/run/containerd/containerd.sock: connect: permission denied"
        [WARNING FileExisting-ethtool]: ethtool not found in system path
        [WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR IsPrivilegedUser]: user is not running as root
        [ERROR KubeletVersion]: couldn't get kubelet version: cannot execute 'kubelet --version': executable file not found in $PATH
        [ERROR DirAvailable--var-lib-etcd]: unable to check if /var/lib/etcd is empty: open /var/lib/etcd: permission denied
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

do you think we should preserve exactly the old behavior by calling the root check before any init/join checks?
we are doing that on upgrade node/apply too, so might be better to be consistent.

@pacoxu
Copy link
Member

pacoxu commented Nov 15, 2024

+1 for this change. A consistent behavior makes sense.

BTW, if we can show the [ERROR IsPrivilegedUser]: user is not running as root more clear(already the first ERROR) in the list, it would be better and easier for users to notice the root cause. Should the errors be printed before the warnings?

@neolit123
Copy link
Member

neolit123 commented Nov 15, 2024

BTW, if we can show the [ERROR IsPrivilegedUser]: user is not running as root more clear(already the first ERROR) in the list, it would be better and easier for users to notice the root cause.

maybe the best to do that is to run this check with RunRootCheckOnly first, like it was before this PR.

like in upgrade
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/cmd/phases/upgrade/apply/preflight.go#L71-L85

. Should the errors be printed before the warnings?

i'm +0 on this one, we could log a ticket about it.

@srivastav-abhishek
Copy link
Contributor Author

srivastav-abhishek commented Nov 15, 2024

maybe the best to do that is to run this check with RunRootCheckOnly first, like it was before this PR.

I shall revert the changes in checks.go then.

@neolit123
Copy link
Member

maybe the best to do that is to run this check with RunRootCheckOnly first, like it was before this PR.

I shall revert the changes in checks.go then.

you can do the following:

  • remove anything about the priv check in the init|join check functions
  • call the runrootchecksonly function in all locations before the init|join functions. but also consider the isSecondaryControlPlane flag.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 15, 2024
if err := RunRootCheckOnly(ignorePreflightErrors); err != nil {
return err
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of calling the root check inside the runinit|joinnodechecks we should call RunRootCheckOnly in these places:

grep -E 'RunJoinNodeChecks|RunInitNodeChecks' * -rnI | grep -v chec
ks.go
app/cmd/phases/join/preflight.go:94:    if err := preflight.RunJoinNodeChecks(utilsexec.New(), j.Cfg(), j.IgnorePreflightErrors()); err != nil {
app/cmd/phases/join/preflight.go:123:           if err := preflight.RunInitNodeChecks(utilsexec.New(), initCfg, j.IgnorePreflightErrors(), true, hasCertificateKey); err != nil {
app/cmd/phases/init/preflight.go:65:    if err := preflight.RunInitNodeChecks(utilsexec.New(), data.Cfg(), data.IgnorePreflightErrors(), false, false); err != nil {

before the run init/join checks are called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the flag isSecondaryControlPlane should be removed from RunInitNodeChecks

actually
app/cmd/phases/join/preflight.go:123 - should not call RunRootCheck
app/cmd/phases/init/preflight.go:65 - should call RunRootCheck

this is confirmed by the flag values which are passed ( isSecondaryControlPlane)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I called in the sub-functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way you did it is also OK, but i think we should be more explicit about the runned root checks, similarly to how kubeadm upgrade and reset does it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. It seems it is not required for

app/cmd/phases/join/preflight.go:123:           if err := preflight.RunInitNodeChecks(utilsexec.New(), initCfg, j.IgnorePreflightErrors(), true, hasCertificateKey); err != nil {

as isSecondaryControlPlane is true here.

@@ -933,7 +926,6 @@ func InitNodeChecks(execer utilsexec.Interface, cfg *kubeadmapi.InitConfiguratio
FileAvailableCheck{Path: kubeadmconstants.GetStaticPodFilepath(kubeadmconstants.Etcd, manifestsDir)},
HTTPProxyCheck{Proto: "https", Host: cfg.LocalAPIEndpoint.AdvertiseAddress},
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't remove line.

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
just one more minor line nit.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123, srivastav-abhishek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2024
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is passing
https://github.com/neolit123/kubeadm-test/actions/runs/11853015645

/lgtm

we are in code freeze for 1.32 so this will automatically merge after CF is lifted.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: c739bf0fc621486c89ec9ea1176e4d8a76129358

@neolit123
Copy link
Member

thanks for the PR @srivastav-abhishek

@srivastav-abhishek
Copy link
Contributor Author

/retest

@pacoxu
Copy link
Member

pacoxu commented Nov 18, 2024

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

don't skip unit tests in preflight/checks_test.go when not running as root
4 participants