core: fix missed register_callsite
error
#2938
Draft
+239
−5
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Note: This is a change against the
v0.1.x
branch.Motivation
There are 2 triggers which will cause a subscriber to receive a call to
Subscriber::register_callsite
for a specific callsite.set_default
orwith_default
)It is trigger (2) that will cause a new subscriber to receive
Subscriber::register_callsite
for all the callsites which had alreadybeen registered before it became active.
When a callsite is registered for trigger (1), the callsite starts in
state
UNREGISTERED
.The first thread to encounter the callsite will transition it to
REGISTERING
and determine the overall interest for the callsite byregistering with all known dispatchers (which will call into
Subscriber::register_callsite
).Once that is complete, the callsite is added to the list of all known
callsites and its state is transitioned to
REGISTERED
.is (re)built for all known dispatchers. The callsite starts in state
UNREGISTERED
. The This calls down intoSubscriber::register_callsite
for each subscriber. Once that iscomplete, the callsite is added to the global list of known callsites.
While the callsite interest is being rebuilt, other threads that
encounter the callsite will be given
Interest::sometimes()
until theregistration is complete. However, if a new subscriber is added during
this window, all the interest for all callsites will be rebuilt, but
because the new callsite (in state
REGISTERING
) won't be includedbecause it isn't yet in the global list of callsites.
This can cause a case where that new subscriber being added won't
receive
Subscriber::register_callsite
before it receives the subsequentcall to
Subscriber::event
orSubscriber::new_span
.The documentation on Registering Callsites is not very explicit on
this point, but it does suggest that
Subscriber::register_callsite
will be called before the call to either
Subscriber::event
orSubscriber::new_span
, and the current behavior can break this implicitcontract.
Solution
This change swaps the order of rebuilding the callsite interest and
adding the callsite to the global list so that the callsite gets pushed
first, avoiding this window in which a subscriber won't get a call to
register_callsite
.As such, a callsite may have its interest read before it is set. In this
case, the existing implementation will return
Interest::sometimes()
for the
DefaultCallsite
implementation. Other implementations (outsideof the
tracing
project) may perform this differently, but in thiscase, there is no documented guarantee regarding the ordering.
A regression test is included which provokes the race condition 100% of
the time before the changes in this fix.
Fixes: #2743