This repository has been archived by the owner on Nov 29, 2024. It is now read-only.
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
feature request
What happened:
The doorman server performs leader election through etcd: a candidate sets a key with a TTL of delay, and the elected master refreshes that key every delay / 3.
When the leader goes down, the etcd key expires and a new leader is elected.
See the source code:
go func() {
    for {
        // Try to take the lock; PrevNoExist makes the Set fail if another server holds it.
        log.V(2).Infof("trying to become master at %v", e.lock)
        if _, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
            TTL:       e.delay,
            PrevExist: client.PrevNoExist,
        }); err != nil {
            log.V(2).Infof("failed becoming the master, retrying in %v: %v", e.delay, err)
            time.Sleep(e.delay)
            continue
        }
        e.isMaster <- true
        log.V(2).Infof("Became master at %v as %v.", e.lock, id)
        for {
            // Renew the lease every delay / 3, but only while the key still holds our id.
            time.Sleep(e.delay / 3)
            log.V(2).Infof("Renewing mastership lease at %v as %v", e.lock, id)
            _, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
                TTL:       e.delay,
                PrevExist: client.PrevExist,
                PrevValue: id,
            })
            if err != nil {
                // A single failed renewal gives up mastership immediately.
                log.V(2).Info("lost mastership")
                e.isMaster <- false
                break
            }
        }
    }
}()
When the master fails to renew the lease for a transient reason, such as network jitter, it loses mastership immediately, even though a second attempt would probably have succeeded.
This causes an unnecessary switch into learning mode, which takes time to converge.
What you expected to happen or what your proposal is:
I think we should add a retry mechanism to lease renewal: the master gives up leadership only after renewal has failed twice, or some other retry count.