0116-sentry-semantic-conventions.md

  • Start Date: 2023-10-12
  • RFC Type: feature
  • RFC PR: #116
  • RFC Status: implementation
  • RFC Driver: Abhijeet Prasad

Summary

The purpose of this RFC is to formalize a set of semantic conventions in Sentry, aligning with OpenTelemetry's semantic conventions. At the current time, these semantic conventions are meant to apply to spans, breadcrumbs, and metrics, but not to errors/transactions/replays/crons. This may change in the future.

The conventions will be a standardized naming scheme for operations and data, shared across the SDKs, ingest, and the product. We will implement them via a versioned JSON schema that is published as packages for Python, JavaScript, and Rust. This gives us a single source of truth for the semantic conventions, and also allows us to generate code for all parts of the stack (ingest, frontend, backend, data pipelines) that need to be aware of them.

This moves us closer to OpenTelemetry, helps reduce friction when creating new product features that rely on conventions around data, and helps us avoid the need to create new conventions for data that is already covered by OpenTelemetry.

We've found much success adopting the approach outlined by the RFC for span data conventions, and we believe that this approach can be extended to other parts of the product.

Motivation

We've been relying on various sources of truth for data attached to Sentry signals (errors, transactions, replays).

For errors and transactions we have:

  • Top level fields like user/release/environment.
  • Contexts, our current implementation of arbitrary contextual data attached to events.
  • Tags, string to string key value pairs that are indexed and searchable in Sentry.

Errors and transactions also have breadcrumbs which have:

  • Data, arbitrary contextual data attached to breadcrumbs (similar to contexts for errors/transactions). There is no formal schema for breadcrumb data, but some UI features in the error monitoring and replay product rely on specific keys being present.

For spans specifically we have:

  • Tags, string to string key value pairs that are indexed and searchable in Sentry. These behave exactly like tags for errors/transactions.
  • Data, arbitrary contextual data attached to spans (similar to contexts for errors/transactions). There is no formal schema for span data, but we have been maintaining a set of span data conventions that is a superset of OpenTelemetry's semantic conventions.

For replays we have:

  • Top level fields like user/release/environment.
  • Tags, string to string key value pairs that are indexed and searchable in Sentry.
  • Breadcrumbs data, which is heavily relied on by the replay product.

For crons we have:

  • Top level fields like release/environment.
  • Contexts, our current implementation of arbitrary contextual data attached to events. Only the trace context is supported for crons at the current time.

For metrics we have:

  • Tags

Given the above, we can categorize the data attached to Sentry signals as follows:

  • String key to string value pairs (tags, top level fields)
  • Structured dictionary with well defined keys and values (contexts)
  • Unstructured dictionary with arbitrary keys and values (breadcrumb data, span data)

Currently this is confusing, and it can be hard to tell whether data was generated by an SDK, added by a user, or injected at processing time.

Sentry Semantic Conventions

This RFC proposes adding semantic conventions that behave exactly like OpenTelemetry's semantic conventions. Each signal will get a new attributes field that is a dictionary of key value pairs. For backwards compatibility, this field can also be called data (which means it can be adopted immediately for breadcrumbs and spans), but new signals like crons should use attributes.

For the purposes of rollout, we recommend that this field be adopted by spans, metrics (DDM), and breadcrumbs first. There are no plans to adopt these conventions for crons/errors/transactions/replays, but we can consider it in the future.

This does not replace tags, which should still be part of their respective signal payloads. The product can make decisions on how to promote different attributes to become tags. This happens with contexts and span data today.
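As a rough illustration of promoting attributes to tags, the sketch below copies an allowlisted subset of attributes into a string-to-string tag map. The allowlist, the function name, and the choice to skip array values are all assumptions made for this example; the RFC leaves promotion rules to the product.

```typescript
type AttributeValue =
  | string
  | number
  | boolean
  | string[]
  | number[]
  | boolean[];

// Hypothetical allowlist of attribute keys the product wants to index as tags.
const PROMOTED_KEYS: string[] = [
  "sentry.release",
  "sentry.environment",
  "http.request.method",
];

// Hypothetical helper: builds a tag map from the allowlisted attributes.
function promoteToTags(
  attributes: Record<string, AttributeValue>,
  promotedKeys: string[] = PROMOTED_KEYS
): Record<string, string> {
  const tags: Record<string, string> = {};
  for (const key of promotedKeys) {
    const value = attributes[key];
    // Tags are string-to-string pairs, so scalars are stringified.
    // Arrays are skipped in this sketch (an assumption, not an RFC rule).
    if (value !== undefined && !Array.isArray(value)) {
      tags[key] = String(value);
    }
  }
  return tags;
}
```

The attributes themselves are left untouched, so tags remain a derived view and not a replacement for the attributes payload.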

Attributes Schema

Attribute keys should be unique and well-known, and should not be used for multiple purposes. In OpenTelemetry the attribute values cannot be dictionaries, only primitives and arrays of primitives. We will try to follow this convention as well, which represents a change in values for some of the usages of contexts we have today.

export interface Attributes {
  [attributeKey: string]: AttributeValue;
}

/**
 * Attribute values may be any non-nullish primitive value except an object.
 *
 * null or undefined attribute values are invalid and will result in undefined behavior.
 */
export type AttributeValue =
  | string
  | number
  | boolean
  | Array<string>
  | Array<number>
  | Array<boolean>;

{
  "http.method": "POST",
  "http.request_content_length_uncompressed": 380,
  "http.status_code": 200,
  "http.status_text": "OK",
  "http.target": "/api/checkout"
}
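Because the type above only constrains values at compile time, a consumer that receives untyped payloads would likely need a runtime check as well. The following is a minimal sketch of such a check; `isAttributeValue` and `sanitizeAttributes` are hypothetical names, and accepting empty arrays is an assumption the RFC does not address. The type is repeated so the block stands alone.

```typescript
type AttributeValue =
  | string
  | number
  | boolean
  | string[]
  | number[]
  | boolean[];

// Returns true when `value` is a string, number, or boolean.
function isPrimitive(value: unknown): value is string | number | boolean {
  return (
    typeof value === "string" ||
    typeof value === "number" ||
    typeof value === "boolean"
  );
}

// Hypothetical helper: checks a value against AttributeValue at runtime.
function isAttributeValue(value: unknown): value is AttributeValue {
  if (isPrimitive(value)) {
    return true;
  }
  if (Array.isArray(value)) {
    // Arrays must be homogeneous: every element is a primitive of the same
    // type as the first element. Empty arrays pass (an assumption).
    return value.every(
      (item) => isPrimitive(item) && typeof item === typeof value[0]
    );
  }
  // Objects, null, and undefined are rejected.
  return false;
}

// Hypothetical helper: splits a raw record into valid attributes and the
// keys whose values were rejected.
function sanitizeAttributes(raw: Record<string, unknown>): {
  attributes: Record<string, AttributeValue>;
  rejected: string[];
} {
  const attributes: Record<string, AttributeValue> = {};
  const rejected: string[] = [];
  for (const [key, value] of Object.entries(raw)) {
    if (isAttributeValue(value)) {
      attributes[key] = value;
    } else {
      rejected.push(key);
    }
  }
  return { attributes, rejected };
}
```

Returning the rejected keys, instead of silently dropping them, leaves the caller free to decide whether to log, drop, or fail on invalid values.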

Whenever possible, the meaning of attribute keys and values should be based on OpenTelemetry's semantic conventions. For example, the http.request.method attribute should have the same meaning as OpenTelemetry's http.request.method attribute. If a value does not exist in OpenTelemetry's semantic conventions, we should also aim to get it upstreamed into OpenTelemetry.

The reason for having an independent set of semantic conventions is that we have some attributes that are not covered by OpenTelemetry's semantic conventions, and we don't want to be blocked on upstreaming them. There are also some Sentry specific attributes that will never match OpenTelemetry's semantic conventions.

For Sentry-specific fields like release and environment, we propose setting them under the sentry.X namespace for the key. For example, release would be defined under the sentry.release attribute key.

Attributes Schema Versioning and Implementation

The attributes schema will be versioned and published as a JSON Schema, which will be the representation of the well-known attributes that Sentry ingestion and product relies on.

The JSON schema will also be used to generate packages for JavaScript, Python, and Rust, so that the schema can be used in Relay, the frontend, and the backend. Care is needed to ensure that the generated packages stay compatible with the JSON schema and that the JSON schema stays backwards compatible, but we can introduce tests to verify both.
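One way such a backwards-compatibility test might look is sketched below. The `AttributeSchema` shape is a hypothetical, trimmed-down stand-in for a published schema version (the RFC does not define the schema layout), and the compatibility rule used here, that no key is removed and no key changes type, is an assumption.

```typescript
// Hypothetical, simplified shape of one published schema version.
interface AttributeDefinition {
  type: "string" | "number" | "boolean" | "array";
  description: string;
}

interface AttributeSchema {
  version: string;
  properties: Record<string, AttributeDefinition>;
}

// Illustrative schema versions; keys are taken from the mappings in this RFC.
const SCHEMA_V1: AttributeSchema = {
  version: "1.0.0",
  properties: {
    "sentry.release": { type: "string", description: "Release identifier" },
    "sentry.sample_rate": { type: "number", description: "Sample rate" },
  },
};

const SCHEMA_V2: AttributeSchema = {
  version: "1.1.0",
  properties: {
    ...SCHEMA_V1.properties,
    "http.request.method": { type: "string", description: "HTTP method" },
  },
};

// Assumed rule: `next` is backwards compatible with `previous` when every
// previously known key still exists and keeps the same type.
function isBackwardsCompatible(
  previous: AttributeSchema,
  next: AttributeSchema
): boolean {
  return Object.entries(previous.properties).every(([key, definition]) => {
    const candidate = next.properties[key];
    return candidate !== undefined && candidate.type === definition.type;
  });
}
```

Under this rule, adding a key (as `SCHEMA_V2` does) is compatible, while removing or retyping a key is not.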

A versioning/packaging structure like this also makes it much easier for the SDKs to identify which parts of the product use the data they send.

Mapping Sentry Specific Fields to Attributes

Right now we have a variety of top level fields that need to be mapped to attributes, and some of them match existing OpenTelemetry semantic conventions. We should map these fields to attributes as best we can. The following represents some examples of transformations we'll be doing (the dots in the initial fields represent nested fields). This is a subset of all the involved fields, which will be documented in the JSON schema.

Top level fields:

  • release -> sentry.release
  • environment -> sentry.environment
  • origin -> sentry.origin
  • op -> sentry.op
  • source -> sentry.source
  • sample_rate -> sentry.sample_rate

User interface:

  • user.id -> enduser.id
  • user.username -> sentry.user.username
  • user.email -> sentry.user.email
  • user.ip_address -> sentry.user.ip_address
  • user.segment -> sentry.user.segment
  • user.geo.city -> sentry.user.geo.city
  • user.geo.country_code -> sentry.user.geo.country_code
  • user.geo.region -> sentry.user.geo.region

Request interface:

  • request.method -> http.request.method
  • request.url -> url.full
  • request.query_string -> url.query_string
  • request.cookies -> http.request.cookies
  • request.headers.X -> http.request.headers.X (for example, request.headers.content-type)
  • request.env -> http.request.env

SDK interface:

  • sdk.name -> sentry.sdk.name
  • sdk.version -> sentry.sdk.version
  • sdk.integrations -> sentry.sdk.integrations
  • sdk.packages -> sentry.sdk.packages
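The mappings above could be applied with a simple lookup table, as in the sketch below. The table holds only a subset of the listed fields, and `getPath` and `mapFieldsToAttributes` are hypothetical helpers; copying only scalar values is a simplification for this example.

```typescript
type AttributeValue =
  | string
  | number
  | boolean
  | string[]
  | number[]
  | boolean[];

// Subset of the mappings listed above: legacy field path -> attribute key.
const FIELD_TO_ATTRIBUTE: Record<string, string> = {
  release: "sentry.release",
  environment: "sentry.environment",
  "user.id": "enduser.id",
  "user.email": "sentry.user.email",
  "user.geo.city": "sentry.user.geo.city",
  "request.method": "http.request.method",
  "request.url": "url.full",
  "sdk.name": "sentry.sdk.name",
  "sdk.version": "sentry.sdk.version",
};

// Reads a dotted path such as "user.geo.city" from a nested object.
function getPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const segment of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Hypothetical helper: builds attributes from the legacy top-level fields.
function mapFieldsToAttributes(
  event: Record<string, unknown>
): Record<string, AttributeValue> {
  const attributes: Record<string, AttributeValue> = {};
  for (const [fieldPath, attributeKey] of Object.entries(FIELD_TO_ATTRIBUTE)) {
    const value = getPath(event, fieldPath);
    // Only scalar values are copied in this sketch; missing fields are skipped.
    if (
      typeof value === "string" ||
      typeof value === "number" ||
      typeof value === "boolean"
    ) {
      attributes[attributeKey] = value;
    }
  }
  return attributes;
}
```

For instance, an event with `release: "1.0.0"` and `user: { id: "42" }` would yield the keys `sentry.release` and `enduser.id`.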

For the contexts defined as part of our contexts payload, we can flatten them so that the context name becomes the prefix of the attribute key. For example, the os context would become os.name, os.version, os.kernel_version, etc. Some of these new keys will have to be renamed to better match OpenTelemetry's semantic conventions, but that can be handled on a case-by-case basis.
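The flattening step can be sketched as a small recursive function. `flattenContexts` is a hypothetical name, and skipping arrays and null values is a simplification made for this example; any renaming to match OpenTelemetry keys would happen in a separate step.

```typescript
type AttributeValue =
  | string
  | number
  | boolean
  | string[]
  | number[]
  | boolean[];

// Recursively writes primitive leaves of `value` into `out`, joining keys with dots.
function flattenInto(
  prefix: string,
  value: unknown,
  out: Record<string, AttributeValue>
): void {
  if (
    typeof value === "string" ||
    typeof value === "number" ||
    typeof value === "boolean"
  ) {
    out[prefix] = value;
  } else if (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value)
  ) {
    for (const [key, nested] of Object.entries(value)) {
      flattenInto(`${prefix}.${key}`, nested, out);
    }
  }
  // null, undefined, and arrays are skipped in this sketch.
}

// Hypothetical helper: turns { os: { name: "macOS" } } into { "os.name": "macOS" }.
function flattenContexts(
  contexts: Record<string, unknown>
): Record<string, AttributeValue> {
  const attributes: Record<string, AttributeValue> = {};
  for (const [contextName, context] of Object.entries(contexts)) {
    flattenInto(contextName, context, attributes);
  }
  return attributes;
}
```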

Review process

As in OpenTelemetry's semantic conventions, we can mark attributes as required or optional and group them into categories (browser, mobile, http, etc). Each category can then have its own set of owners who review and approve changes to the attributes in that category. This will help us scale the review process for the attributes.

Backwards Compatibility

We will support the data field as an alias for attributes for backwards compatibility. This means that breadcrumbs and spans, which already use the data field, can adopt the conventions immediately. There is no timeline for removing the data field, but new signals like crons or metrics should use attributes as the field name.
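A consumer that needs to handle both field names might read them as in the sketch below. `SignalPayload` and `readAttributes` are hypothetical names, and letting attributes take precedence over data when both are present is an assumption; the RFC does not specify that case.

```typescript
type AttributeValue =
  | string
  | number
  | boolean
  | string[]
  | number[]
  | boolean[];

type Attributes = Record<string, AttributeValue>;

// Hypothetical payload shape: a signal may carry either field name.
interface SignalPayload {
  attributes?: Attributes;
  data?: Attributes;
}

// Hypothetical helper: treats `data` as an alias for `attributes`.
// If both are present, keys in `attributes` win (an assumption).
function readAttributes(payload: SignalPayload): Attributes {
  return { ...(payload.data ?? {}), ...(payload.attributes ?? {}) };
}
```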

We can also have an integrated test suite throughout Relay, the frontend, and the backend to ensure that changes to the schema do not break product expectations.