Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TECH-12593:make global_context global, but just for that logger instance #72

Merged
merged 34 commits into from
Sep 14, 2023

Conversation

ColinDKelley
Copy link
Contributor

@ColinDKelley ColinDKelley commented Aug 31, 2023

While investigating missing log contexts, we found a few problems:

  1. The global_context was stored into a Thread/Fiber-local... so it would not be applied when running on a different Thread or Fiber.
  2. Both the global_context and current_context were stored in a Thread/Fiber-local... meaning that they would be shared across different instance of the Logger, if you had them.
  3. It was confusing the have the context managed by the Context::Handler rather than simply the Context.

The changes in this branch are:

  • The global_context is stored in the logger itself, so it is global to that logger, only.
  • LoggerWithContext delegates the global_context to the current_context of the base logger it was created with.
  • The current_context comes from a Thread/Fiber local where the key to the name includes the logger's object_id. So it can be different for multiple logger instances.
  • The current_context defaults to nil, and may also be set to nil. Whenever it has a nil value, the current_context defaults to the global_context for that logger.
  • Context is now cleanly separated from ContextHandler.

Note: best reviewed with Hide whitespace.

@ColinDKelley ColinDKelley requested a review from a team as a code owner August 31, 2023 00:30
@ColinDKelley ColinDKelley force-pushed the TECH-12593/make-global_context-global branch from 87e79a3 to e0f49aa Compare August 31, 2023 04:16
@ColinDKelley ColinDKelley requested review from jebentier and removed request for ishakun August 31, 2023 04:18
module Context
def thread_context_for_logger_instance
# We include the object_id here to make these thread/fiber locals unique per logger instance.
@thread_context_for_logger_instance ||= "ContextualLogger::Context.context_for_#{object_id}".to_sym
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty sneaky. It's the only way I could think to get Thread/Fiber-locals to also be unique for each enclosing Logger. Notice that they won't get cleaned up when the Logger goes out of scope. I don't think it's likely that anyone would create a huge number of Loggers, so we should be fine.

lib/contextual_logger.rb Outdated Show resolved Hide resolved
lib/contextual_logger.rb Outdated Show resolved Hide resolved
spec/lib/contextual_logger/context_handler_spec.rb Outdated Show resolved Hide resolved
Co-authored-by: James Ebentier <[email protected]>
@@ -16,7 +23,13 @@ def initialize(logger, context, level: nil)
@logger = logger
self.level = level
@context = normalize_context(context)
@merged_context_cache = {} # so we don't have to merge every time
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jebentier This is still a bug. I realized it's one of the reasons I had headed down the path of tracking the global_context instance for equals?--that would be necessary in order to fix this bug. But it would be really complicated. And this bug isn't new. The code in this branch works much better in that it handles with_context on both the original logger as well as the LoggerWithContext; it just doesn't handle if you repeat the former after having logged something. That seems very unlikely.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this seems unlikely and that the code required to fix this limitation is more complex than we want for this logger. Thank you for documenting it so we’re aware moving forward.

@ColinDKelley
Copy link
Contributor Author

@jebentier Can you review the new changes here since you last reviewed?

Copy link
Collaborator

@jebentier jebentier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes look good. Thanks for working through this. Have we been able to confirm that this resolves the bugs we were seeing when using this with Sidekiq?

@@ -16,7 +23,13 @@ def initialize(logger, context, level: nil)
@logger = logger
self.level = level
@context = normalize_context(context)
@merged_context_cache = {} # so we don't have to merge every time
end

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this seems unlikely and that the code required to fix this limitation is more complex than we want for this logger. Thank you for documenting it so we’re aware moving forward.

end

def current_context_override=(context_override)
ContextualLogger.global_context_lock_message ||= "ContextualLogger::Context.current_context_override set for #{self.class.name} #{object_id}: #{context_override.inspect}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/LineLength: Line is too long. [176/150]

@ColinDKelley
Copy link
Contributor Author

Have we been able to confirm that this resolves the bugs we were seeing when using this with Sidekiq?

Yes, the previous version did. But I'll be checking again with this latest version.

@ColinDKelley
Copy link
Contributor Author

@jebentier FYI, I ran the final QA on a kube-platform and confirmed that the Sidekiq logging always contains the full context, once we get fully bootstrapped. As before there's a small window at the start when a ServiceDiscovery thread starts logging before the LoggerMixin logger is in place. That's not new, and I think we can live with it.

Here are the first 3 examples from a task pod. Note that the inline context is winding up in the progname parameter:

{"app":{"@timestamp":"2023-09-13T11:06:02.898978-07:00","severity":"INFO","host":"task-slow-7fb96d64f5-dzgd7","tid":"1qo4","service_name":"rr_slow_task_queue","progname":"{:invoca_service_discovery=>{:metric=>\"discover.attempt\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[\"active\"], :count=>1, :timers=>[], :current_datacenter=>\"aws-us-east-1\"}}","message":"{:invoca_service_discovery=>{:metric=>\"discover.attempt\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[\"active\"], :count=>1, :timers=>[], :current_datacenter=>\"aws-us-east-1\"}}: Invoca::ServiceDiscovery.discover metric report"}}
{"app":{"@timestamp":"2023-09-13T11:06:02.899689-07:00","severity":"INFO","host":"task-slow-7fb96d64f5-dzgd7","tid":"1qo4","service_name":"rr_slow_task_queue","progname":"{:invoca_service_discovery=>{:metric=>\"discover.unexpected_cross_datacenter\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[], :count=>0, :timers=>[], :current_datacenter=>\"aws-us-east-1\"}}","message":"{:invoca_service_discovery=>{:metric=>\"discover.unexpected_cross_datacenter\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[], :count=>0, :timers=>[], :current_datacenter=>\"aws-us-east-1\"}}: Invoca::ServiceDiscovery.discover metric report"}}
{"app":{"@timestamp":"2023-09-13T11:06:02.900647-07:00","severity":"INFO","host":"task-slow-7fb96d64f5-dzgd7","tid":"1qo4","service_name":"rr_slow_task_queue","progname":"{:invoca_service_discovery=>{:metric=>\"discover.timing\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[\"active\"], :count=>0, :timers=>[60.919656000010036], :current_datacenter=>\"aws-us-east-1\"}}","message":"{:invoca_service_discovery=>{:metric=>\"discover.timing\", :datacenter=>\"aws-us-east-1\", :service=>\"vault\", :tags=>[\"active\"], :count=>0, :timers=>[60.919656000010036], :current_datacenter=>\"aws-us-east-1\"}}: Invoca::ServiceDiscovery.discover metric report"}}

@ColinDKelley ColinDKelley merged commit 4525391 into master Sep 14, 2023
28 checks passed
@ColinDKelley ColinDKelley deleted the TECH-12593/make-global_context-global branch September 14, 2023 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants