This repository has been archived by the owner on Nov 29, 2024. It is now read-only.

[leader election] master shouldn't give up its leadership easily #17

Open
BlueBlue-Lee opened this issue Jun 19, 2017 · 1 comment

Comments

@BlueBlue-Lee

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
feature request

What happened:
The doorman server uses leader election, implemented on top of etcd. The master sets a key with a TTL of delay and refreshes it every 1/3 of the delay interval.

When the leader goes down, this etcd key expires, and a new leader is elected.

See the source code:

	go func() {
		for {
			log.V(2).Infof("trying to become master at %v", e.lock)
			if _, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
				TTL:       e.delay,
				PrevExist: client.PrevNoExist,
			}); err != nil {
				log.V(2).Infof("failed becoming the master, retrying in %v: %v", e.delay, err)
				time.Sleep(e.delay)
				continue
			}
			e.isMaster <- true
			log.V(2).Infof("Became master at %v as %v.", e.lock, id)
			for {
				time.Sleep(e.delay / 3)
				log.V(2).Infof("Renewing mastership lease at %v as %v", e.lock, id)
				_, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
					TTL:       e.delay,
					PrevExist: client.PrevExist,
					PrevValue: id,
				})

				if err != nil {
					log.V(2).Info("lost mastership")
					e.isMaster <- false
					break
				}
			}
		}
	}()

When the master fails to renew the lease for some temporary reason, for example network jitter, it loses leadership immediately. But if the master simply tried again, the renewal would most likely succeed.

This results in an unnecessary learning mode, and it takes time to converge.

What you expected to happen or what your proposal is:

I think we should add a retry mechanism when renewing the lease. The master should give up its leadership only if the renewal fails twice (or some other configurable number of retries).

@BlueBlue-Lee
Author

@ryszard @josvisser
