-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update SeccompDefault KEP upgrade strategy #2773
Conversation
This patch reflects the latest changes from the implementation PR: - Adding the kubelet configuration and CLI flag. - Making the rollout strategy more verbose and add mitigations. Signed-off-by: Sascha Grunert <[email protected]>
lgtm, default behavior is unchanged from previous releases; kubelet gains a configuration option so the node owner can define what seccomp profile should be used for pods/containers which do not specify a seccomp profile |
@derekwaynecarr please take a look too and approve if this change is fine |
/lgtm |
/assign @derekwaynecarr |
@saschagrunert thank you for the changes! I think the new approach is a good balance that lets cluster operators implement more aggressive security features without posing an acute threat to compatibility. I'm still a bit worried we're deferring a lot of the complexity of real-world rollouts based on the idea this is "off by default". Once this feature is in, there will be pressure to turn it on by default in a future release, and we'll face the same problems around deployment and compatibility. We can avoid future headaches by thinking through this stuff now. Since this is implemented as a node flag, responsibility for rolling it out is likely to fall to platform teams (cluster admins) rather than application owners. It will need to be enabled for all applications in the cluster at once unless admins play scheduling tricks to put only "blessed" applications on new nodes with the flag enabled. The existence of even one non-compliant application within a cluster will pose a significant rollout hurdle. The current rollout guidance assumes that cluster admins have access to full and comprehensive test suites for all deployed applications, including high coverage for all third-party dependencies, and can run them on demand. That doesn't match the reality I've seen. So most cluster admins will have to go down the path we call "optional" where they run applications under a permissive/complain seccomp filter for a while -- but, for that, they have to write their own report-mode seccomp filter. Maybe we can write that filter for them? There's also a ton of cluster-specific complexity hiding behind "[i]f it's possible to monitor audit logs within the cluster". Do we think that is possible in many/most customer clusters? If so, we should gesture at how it can be done. And if not, we shouldn't be relying on it as a rollout risk mitigation. I'm sorry I'm joining this conversation late -- for the original KEP, was there any discussion of applying these policies to workloads rather than nodes? Something like upgrading the |
We followed the idea of having container runtimes define the profiles and not kubernetes. Sure, we could add documentation about how to modify the default profile to "log instead of block". We could also add more docs about how to investigate the audit logs on a node.
I definitely see your point. I think we also can enhance the docs (or write a blog post and link it to the docs) about how this can be done in technical detail. Sure, we will not cover every platform/setup, but for example most systems ship
I could imagine doing this by using a selector, but how would that be specified? We could add it to the kubelet configuration, but I'm not in favor of mixing application and cluster logic into a single config. We could also add a new annotation/field, but this would require API changes. I think we still can elaborate on adding such a feature on top of the graduation. For example:
The point of not changing the API is already part of the KEP, but I see no blocker in revisiting on graduation. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, saschagrunert The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@mmdriley i appreciate your summary on the distinction between platform owner and application owner with the outlined approach. i agree with @saschagrunert that a workload level choice can be explored later as well. |
This patch reflects the latest changes from the implementation PR:
Refers to: #2413
Triggered by: kubernetes/kubernetes#101943
cc @pjbgf @liggitt @mmdriley PTAL