DataDog APM not receiving data #4952

Open
layerssss opened this issue May 14, 2024 · 10 comments

@layerssss

Describe the bug

I was tracking down an issue where, after upgrading ruby-graphql, DataDog APM stopped receiving data. The issue occurs in version 2.1.11, as well as in 2.2.24 and 2.3.3 (it works well in <= 2.1.10).

When it was working well (2.2.10), I could get these logs with DD_TRACE_DEBUG=true:

 Name: execute.graphql
Span ID: 306045165879513178
Parent ID: 1395647012708376529
Trace ID: 135933776842391537409026544268374695538
Type: custom
Service: ruby-graphql
Resource: execute.graphql
Error: 0
Start: 1715725477160225024
End: 1715725492656022016
Duration: 15.495798999909312
Tags: [
   env => development,
   component => graphql,
   operation => execute_query,
   selected_operation_name => cameraPreviewMotionCrop,
   selected_operation_type => mutation,
   query_string => mutation cameraPreviewMotionCrop($input: CameraPreviewMotionCropInput!) {
  cameraPreviewMotionCrop(input: $input) {
    jpgImageBase64
    previewErrorLogs
    __typename
  }
}]
Metrics: [
   ],

But in 2.2.11 (when APM is no longer receiving data), I get:

 Name: execute.graphql
Span ID: 335850196281974193
Parent ID: 2492163108850977412
Trace ID: 135933765750448785412869899199825844147
Type: custom
Service: rails
Resource: execute.graphql
Error: 0
Start: 1715725337101981952
End: 1715725353642692096
Duration: 16.540711999870837
Tags: [
   env => development,
   component => graphql,
   operation => execute_query,
   selected_operation_name => cameraPreviewMotionCrop,
   selected_operation_type => mutation,
   query_string => mutation cameraPreviewMotionCrop($input: CameraPreviewMotionCropInput!) {
  cameraPreviewMotionCrop(input: $input) {
    jpgImageBase64
    previewErrorLogs
    __typename
  }
}]
Metrics: [
   ],

The notable difference is that Service: changed from ruby-graphql to rails, so I wonder whether this patch made in 2.1.10 may have broken it. @TonyCTHsu
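
A comparison like the one above can be done mechanically. The following is a minimal sketch, assuming the dumps keep the `Key: value` layout shown in this issue; the helper names (`parse_span_header`, `diff_span_headers`, `VOLATILE_FIELDS`) are made up for illustration and are not part of ddtrace or graphql-ruby. The sample dumps are abbreviated copies of the two logs above.

```ruby
# Parse the single-line "Key: value" fields of a DD_TRACE_DEBUG span dump.
# Reading stops at "Tags:", so tags and metrics are ignored.
def parse_span_header(dump)
  fields = {}
  dump.each_line do |line|
    line = line.strip
    break if line.start_with?("Tags:")
    key, value = line.split(":", 2)
    next if value.nil? # blank line or a line without a colon
    fields[key.strip] = value.strip
  end
  fields
end

# Fields that change on every request and would only add noise.
VOLATILE_FIELDS = ["Span ID", "Parent ID", "Trace ID", "Start", "End", "Duration"].freeze

# Return { field => [value_before, value_after] } for every stable field that differs.
def diff_span_headers(before, after)
  a = parse_span_header(before)
  b = parse_span_header(after)
  (a.keys | b.keys)
    .reject { |key| VOLATILE_FIELDS.include?(key) }
    .select { |key| a[key] != b[key] }
    .to_h { |key| [key, [a[key], b[key]]] }
end

working = <<~DUMP
  Name: execute.graphql
  Span ID: 306045165879513178
  Type: custom
  Service: ruby-graphql
  Resource: execute.graphql
  Error: 0
DUMP

broken = <<~DUMP
  Name: execute.graphql
  Span ID: 335850196281974193
  Type: custom
  Service: rails
  Resource: execute.graphql
  Error: 0
DUMP

p diff_span_headers(working, broken)
```

For these two dumps, the only entry in the result is the `Service` field, with `ruby-graphql` before and `rails` after.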

Versions

  • graphql version: 2.1.11
  • rails (7.1.3.2)
  • ddtrace (1.23.0)

Steps to reproduce

  • executing a GraphQL request

Expected behavior

Data entries should show up in the DataDog ruby-graphql APM.

Actual behavior

No data arrives in the DataDog ruby-graphql APM.

Additional context

The way we configure DataDog:

...Gemfile

gem "ddtrace", require: "ddtrace/auto_instrument"

...config/initializers/datadog.rb

Datadog.configure do |c|
    c.tracing.instrument :active_model_serializers
    c.tracing.instrument :aws
    c.tracing.instrument :excon
    c.tracing.instrument :faraday
    c.tracing.instrument :http
    c.tracing.instrument :httpclient
    c.tracing.instrument :rails, service_name: "rails"
    c.tracing.instrument :redis
    c.tracing.instrument :sidekiq, service_name: "sidekiq", client_service_name: "sidekiq-client"
end

...app/graphql/some_application_schema.rb

class SomeApplicationSchema < GraphQL::Schema
  use(GraphQL::Tracing::DataDogTracing)
  ...

We have a secondary GraphQL schema also with use(GraphQL::Tracing::DataDogTracing).

Switching to the new trace_with GraphQL::Tracing::DataDogTrace after upgrading didn't solve the issue.

Switching to ddtrace 2.x (datadog) didn't solve the issue either.

@rmosolgo
Owner

Hey @layerssss, thanks for reporting this issue.

I see you spotted the difference in the Service: name. Were you able to find data in DataDog under the new service name? (I'm not sure how that's surfaced inside DataDog, but I'm wondering whether there's really no data going to DataDog, or whether data is still going there, but it's in a new place.)

Judging by the diff in that PR, it looks like you could set the Service back to ruby-graphql by passing it as an option, for example:

trace_with GraphQL::Tracing::DataDogTrace, service: "ruby-graphql" 

What happens if you add that option to your setup?

@layerssss
Author

layerssss commented May 15, 2024

Hi @rmosolgo, I think you are right: the data is sent to DataDog under the new service name rails. But it doesn't get processed, I assume because the records don't "fit" into the "rails" APM. Here is a screenshot from inside the DataDog rails APM; there are no additional entries related to GraphQL (without specifying the service: option).

image

Adding service: "ruby-graphql" does work around the issue, though. (The ruby-graphql APM did get populated.)

@rmosolgo
Owner

cc @TonyCTHsu @vpellan is this intended behavior? Should the GraphQL-Ruby plugin still be providing ruby-graphql as the default service name?

@TonyCTHsu
Contributor

👋 @layerssss @rmosolgo Thanks for reporting.

service is a field defined to be an entity that groups together endpoints, queries, or jobs for the purposes of building your application.

Generally speaking, it is your application.

Historically, it was abused for other reasons. Assigning it incorrectly would break other features such as Service Catalog.

GraphQL should be considered internal to your application unless you explicitly define it as a service separate from your application. The service for GraphQL spans will then be taken from the service defined in your configuration:

Datadog.configure do |c|
  c.service = "..."
end

I would strongly recommend NOT providing the default service ruby-graphql, because eventually this field will be deprecated from Datadog's API.
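
The behavior described in this comment can be pictured as a simple fallback: an explicit `service:` option wins, otherwise the span takes the application-wide service. The sketch below is a simplified model written for illustration only. `span_service` and `GLOBAL_SERVICE` are hypothetical names, and this is not the actual code of graphql-ruby or the Datadog gem.

```ruby
# Hypothetical stand-in for the value set via `c.service = "..."`.
GLOBAL_SERVICE = "another-rails-app"

# Simplified model (an assumption, not library code): an explicit service
# name takes priority; without one, the application's service is used.
def span_service(explicit_service = nil, global_service: GLOBAL_SERVICE)
  explicit_service || global_service
end

puts span_service                 # no option given: application service
puts span_service("ruby-graphql") # explicit option: overrides the default
```

Under this model, dropping the default `ruby-graphql` name means GraphQL spans are attributed to whatever service the application itself reports.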

@layerssss
Author

@TonyCTHsu Thanks for referring to the documentation. But if I remove the service: option, I can no longer find any stats for each executed GraphQL query in the APM section in DataDog.

It's not inside the rails service, as my screenshot above shows. Where should I look for it?

@layerssss
Author

@TonyCTHsu I've tried setting a default service option for the whole application.

Now the whole configuration becomes:

Datadog.configure do |c|
  c.service = "another-rails-app"
  c.tracing.contrib.global_default_service_name.enabled = true
  if ENV["DD_ENV"].present?
    c.tracing.instrument :active_model_serializers
    c.tracing.instrument :active_support
    c.tracing.instrument :aws
    c.tracing.instrument :excon
    c.tracing.instrument :faraday
    c.tracing.instrument :http
    c.tracing.instrument :httpclient
    c.tracing.instrument :rails
    c.tracing.instrument :redis
    c.tracing.instrument :sidekiq

    c.profiling.enabled = true if Rails.env.production?
  else
    c.tracing.enabled = false
  end
end

graphql/???_schema.rb has

  trace_with GraphQL::Tracing::DataDogTrace

(we have 2 schemas)

I've also removed require: "ddtrace/auto_instrument" from Gemfile just in case.

This does result in all integrations ending up nicely under the new "service" (another-rails-app), but GraphQL seems to be the only one missing.

image image

When I changed trace_with GraphQL::Tracing::DataDogTrace to trace_with GraphQL::Tracing::DataDogTrace, service: "another-graphql-app", GraphQL stats appeared (under the new service name).

image

@marcotc
Contributor

marcotc commented May 17, 2024

Hey @layerssss, is there any information in the execute.graphql span that you cannot get from the rack.request, scoped to your Rails controller that handles GraphQL requests in your application?

@layerssss
Author

In execute.graphql (when it works), each resource is the name of a GraphQL query (passed as operation_name from ruby-graphql). It shows me an execution span for each different query, corresponding to the React component it was triggered from.

In rack.request, each resource is the name of a controller, showing a span for each different controller. All GraphQL requests go through one controller, GraphqlController, so, for example, I won't be able to tell which React component is triggering a slow GraphQL query.
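
To make the difference concrete, here is a small sketch that groups spans by resource, first for rack.request and then for execute.graphql. Everything in it is illustrative: the `Span` struct, the `slowest_by_resource` helper, the `#execute` action name, the `listCameras` operation, and the durations are all made up and do not come from real trace data.

```ruby
# Hypothetical, minimal span record for illustration only.
Span = Struct.new(:name, :resource, :duration_s, keyword_init: true)

spans = [
  Span.new(name: "rack.request",    resource: "GraphqlController#execute", duration_s: 15.5),
  Span.new(name: "execute.graphql", resource: "cameraPreviewMotionCrop",   duration_s: 15.4),
  Span.new(name: "rack.request",    resource: "GraphqlController#execute", duration_s: 0.2),
  Span.new(name: "execute.graphql", resource: "listCameras",               duration_s: 0.1)
]

# Longest duration per resource among spans with the given name.
def slowest_by_resource(spans, span_name)
  spans.select { |span| span.name == span_name }
       .group_by(&:resource)
       .transform_values { |group| group.map(&:duration_s).max }
end

p slowest_by_resource(spans, "rack.request")    # one bucket: the controller
p slowest_by_resource(spans, "execute.graphql") # one bucket per operation
```

Grouped by rack.request, both requests collapse into the single controller resource, so the slow operation cannot be told apart from the fast one. Grouped by execute.graphql, each operation gets its own entry.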

@layerssss
Author

@marcotc

@rmosolgo
Owner

@layerssss, glad to hear that using service: ... makes the data show up again.

@marcotc or @TonyCTHsu, can either of you provide a screenshot of how GraphQL data should look in the DataDog UI? I want to make sure our default plugin makes data appear somewhere, because as @layerssss mentioned, it contains information that other spans don't have.
