Fix: Flower should not use the logical clock to manage events #831
fixed #613
Since Celery v3.1, event state has been ordered using a logical clock; see the changelog:
http://docs.celeryproject.org/en/3.0/whatsnew-3.1.html#events-are-now-ordered-using-logical-time
For the detailed logic, see:
https://github.com/celery/celery/blob/master/celery/events/state.py#L609-L618
The logical clock syncs automatically within a worker cluster, but it does not sync with Flower. That is the root cause.
Let's assume the State is:
If a worker cluster is launched without the
--state-db
option and is then restarted, the logical clock resets to 0. When Flower receives a new event (C.create) with a smaller clock value from the Worker while the task count has already reached its limit, the State class removes the first event and inserts the new, smaller-clock event in first place. The State is now:
Then another new event arrives, and the same thing happens again and again. The State will look like this:
Now you can see that the heap contains only one live task; the others are already dead. Until the logical clock advances past the value it had at the moment the Worker restarted, the Flower task list view can only show the latest task item.
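The eviction pattern described above can be reproduced with a small standalone model. This is only a sketch, not Celery's actual code: the `add_event` helper, the `MAX_TASKS` limit, and the `(clock, task_name)` tuples are hypothetical names chosen for illustration, and the model assumes the State keeps a list sorted by clock and drops the first entry when full.

```python
from bisect import insort

MAX_TASKS = 3  # hypothetical limit on tasks kept in memory


def add_event(heap, event, limit=MAX_TASKS):
    """Keep `heap` sorted by clock; when full, evict the first (smallest) entry."""
    if len(heap) >= limit:
        heap.pop(0)        # remove the entry with the smallest clock
    insort(heap, event)    # insert the new entry at its sorted position
    return heap


# State before the restart: three tasks with large clock values.
heap = [(100, "A"), (101, "B"), (102, "C")]

# After the restart the worker clock starts again from a small value.
for event in [(1, "D"), (2, "E"), (3, "F")]:
    add_event(heap, event)

print(heap)  # [(3, 'F'), (101, 'B'), (102, 'C')]
```

Each new event lands in first place and is evicted by the next one, so only the most recent task survives next to the stale entries B and C.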
As I said, the root cause is that Flower cannot automatically sync its logical clock with the worker cluster, so I monkey patched the calculate method to make the clock always equal 0.
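To illustrate why a constant clock helps, here is a minimal standalone sketch. It is not the patch itself: `add_event`, `MAX_TASKS`, and the `(clock, timestamp, task_name)` tuples are hypothetical, and it assumes that entries with equal clocks fall back to ordering by timestamp.

```python
from bisect import insort

MAX_TASKS = 3  # hypothetical limit on tasks kept in memory


def add_event(heap, event, limit=MAX_TASKS):
    """Keep `heap` sorted; when full, evict the first (smallest) entry."""
    if len(heap) >= limit:
        heap.pop(0)
    insort(heap, event)
    return heap


# With the clock pinned to 0, ordering is decided by the timestamp alone
# (an assumption of this sketch).
heap = [(0, 10.0, "A"), (0, 11.0, "B"), (0, 12.0, "C")]

# Events arriving after a worker restart still carry increasing timestamps.
for event in [(0, 13.0, "D"), (0, 14.0, "E"), (0, 15.0, "F")]:
    add_event(heap, event)

print(heap)  # [(0, 13.0, 'D'), (0, 14.0, 'E'), (0, 15.0, 'F')]
```

In this model the oldest entries are evicted first and every new task stays visible, regardless of when the worker restarted.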