Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to use etcd managed by cilium-etcd-operator as kvstore #8629

Merged
merged 4 commits into from
Mar 15, 2020

Conversation

olemarkus
Copy link
Member

This PR adds the possibility to use etcd as kvstore for cilium agent state. See https://docs.cilium.io/en/v1.7/gettingstarted/k8s-install-etcd-operator/ for the details.

Fixes #8465

Note that coredns 1.6+ currently has a bug that prevents etcd-operator from creating clusters, so if one wants to test this, downgrade coredns first. See coredns/coredns#3686

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 25, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @olemarkus. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 25, 2020
@hakman
Copy link
Member

hakman commented Feb 25, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 25, 2020
hostNetwork: true
{{- if .EtcdManaged }}
# In managed etcd mode, Cilium must be able to resolve the DNS name of
# the etcd service
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to set the dnsPolicy here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I removed that part on purpose. Changing the DNS policy breaks cilium because it can no longer look up the internal cluster DNS (since we use the public DNS entry for that). The change on line 14 ( "etcd.operator": "true") removes the need to change the DNS policy.
That particular change hasn't made it into the cilium helm templates yet, if you are comparing to what that one outputs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a conditional wrapped around nothing but comments here. Perhaps remove the conditional and comments as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thanks. I was certain I removed that one. 🤦‍♂️

image: "cilium/cilium-etcd-operator:v2.0.7"
name: cilium-etcd-operator
dnsPolicy: ClusterFirst
hostNetwork: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  priorityClassName: system-cluster-critical

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure this should qualify for system-cluster-critical. It technically isn't very critical even though the pods it spawns is critical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it gets preempted then it can't repair any problems with the Cilium etcds, such as their losing quorum.

@johngmyers
Copy link
Member

Does cilium-etcd-operator run the etcd pods with host network? We forked etcd-operator to include coreos/etcd-operator#2094 and suspect we'll have to fork cilium-etcd-operator too.

Also, does cilium-etcd-operator run the etcd pods with the system-cluster-critical priority class?

Since etcd-operator appears to be unmaintained, I'm wondering if it would be practical to add a Cilium mode to etcd-manager.

@olemarkus
Copy link
Member Author

Does cilium-etcd-operator run the etcd pods with host network? We forked etcd-operator to include coreos/etcd-operator#2094 and suspect we'll have to fork cilium-etcd-operator too.

No, it doens't. It is unclear to me why that would be necessary.

Also, does cilium-etcd-operator run the etcd pods with the system-cluster-critical priority class?

No, I and I am not sure it is correct to do so either. The consequence of losing etcd is that state propagation will be slower, but it won't lead to immediate critical failure

Since etcd-operator appears to be unmaintained, I'm wondering if it would be practical to add a Cilium mode to etcd-manager.

That could be interesting. Everything that talks to etcd uses hostNetworking, so etcd-manager's etcd should be reachable too. It would certainly the concerns above. Would be a pretty big change though.

@johngmyers
Copy link
Member

With hundreds of nodes, "slower" may well be "completely unstable". If someone doesn't need etcd for state propagation it's likely they wouldn't enable it.

I suppose in the crd identity allocation mode an agent could bootstrap off of the apiserver, so running etcd on host network wouldn't be as important as for kvstore mode. It might even be able to do so without killing the apiserver when in a large cluster.

@johngmyers
Copy link
Member

Shouldn't we bump the version in bootstrapchannelbuilder.go?

@olemarkus
Copy link
Member Author

Cilium's stance on this is still that larger environments should use external etcd (https://docs.cilium.io/en/v1.7/gettingstarted/k8s-install-external-etcd/)
I think etcd-manager would qualify as that better be pretty damn stable.

Forking cilium-etcd-operator/etcd-operator etc isn't an option for me at least.

@olemarkus
Copy link
Member Author

Shouldn't we bump the version in bootstrapchannelbuilder.go?

Yep. bumped.

@johngmyers
Copy link
Member

External etcd would be expensive to manage. We run in-cluster etcd on host network with forked etcd-operator. When we are able to upgrade to 1.6 we'll look at crd identity allocation mode with etcd for propagation, assuming we're willing to stay under 250 nodes.

Elsewhere they've mentioned that pure crd is only good for up to 50 nodes.

Host network is not a blocker for this PR. I'm trying to decide whether I want to hold out for priorityClassName on cilium-etcd-operator. If cilium-etcd-operator happens to not put them on the etcd pods then it'd be a moot point unless/until that could be fixed upstream.

@olemarkus
Copy link
Member Author

I think we should push this in without and then I can see if I can push cilium to propagate the priority class.

I agree that external etcd is too expensive, but I want to see how hard it is to add another etcd-manager cluster. That will have to come later though

@johngmyers
Copy link
Member

/lgtm
/retest

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 2, 2020
@johngmyers
Copy link
Member

/retest

1 similar comment
@johngmyers
Copy link
Member

/retest

@johngmyers
Copy link
Member

It looks like cilium/cilium-etcd-operator#67 is going to be a significant defect now that kops rolling update can drain nodes in parallel.

@olemarkus
Copy link
Member Author

Yep. I think I am going to have a look at etcd-manager sooner than I thought.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2020
@k8s-ci-robot k8s-ci-robot added area/api and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 13, 2020
path: etcd.config
name: cilium-config
name: etcd-config-path
# To read the k8s etcd secrets in case the user might want to use TLS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the Cilium etcd secrets, not the k8s ones, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. etcd-operator creates the secret cilium-etcd-secret.

…1.12.yaml.template

Co-Authored-By: John Gardiner Myers <[email protected]>
@johngmyers
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 14, 2020
@rifelpet
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: olemarkus, rifelpet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 14, 2020
@k8s-ci-robot k8s-ci-robot merged commit 8860040 into kubernetes:master Mar 15, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Mar 15, 2020
@olemarkus olemarkus deleted the cilium-etcd-operator branch March 25, 2020 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CNI Improvement: Cilium KV-Store Backed Cluster Support
5 participants