0023 Improve fingerprinting for performance issues #24

smeubank · 2022-10-04T17:38:09Z

Request for ideas around improving fingerprinting to enable scaling out to more performance issue types

AbhiPrasad · 2022-10-04T20:43:34Z

text/0023-improve-fingerprinting-for-performance-issues.md

+
+# Options Considered
+
+1. Client side/SDK Includes application file name in span/transaction


I assume this is inspired by OpenTelemetry's source code attributes, but we have to be careful about the runtime costs here.

I'm not sure where the original inspiration came from. I pulled this from a Notion doc with the performance team and will try to get them involved here.

This is also inspired by our PHP SDK, which, surprisingly, sends code line numbers with its spans. We're worried about runtime costs too, but it's worth looking into since it would dramatically improve fingerprinting quality.

> runtime costs here

I'd have to check the impact again, but I think the way we handle stacktraces in some of our SDKs (serializing, etc.) adds more runtime cost than just fetching frames from the callstack. Something to keep in mind: if we're not displaying a stacktrace, there may be savings available when we collect frames solely for the sake of fingerprinting.

On Cocoa, serializing the stacktrace for a single thread would be too much overhead for multiple spans during a transaction. We could do that for selected spans, and maybe we don't need the full stacktrace, but only the top n frames. If we only want to use this for fingerprinting and don't want to display the stacktrace, we could focus on only sending the bare minimum for symbolication to reduce the payload size.
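To make the "top n frames" idea concrete, here is a minimal Python sketch (not Cocoa, and not how any Sentry SDK does it). It walks raw frame objects and keeps only file, function, and line for the first few frames instead of building and serializing a full stacktrace. `top_frames`, `start_span`, and the span dictionary shape are hypothetical names for illustration, and `sys._getframe` is a CPython implementation detail.

```python
import sys


def top_frames(n=3, skip=1):
    """Collect (filename, function, lineno) for the top n frames of the caller.

    Walks frame objects directly rather than formatting a full stacktrace.
    """
    frame = sys._getframe(skip)  # CPython-specific
    frames = []
    while frame is not None and len(frames) < n:
        code = frame.f_code
        frames.append((code.co_filename, code.co_name, frame.f_lineno))
        frame = frame.f_back
    return frames


def start_span(description):
    # Hypothetical span structure: only the fields needed for fingerprinting.
    # skip=2 skips top_frames and start_span so the first frame is the caller.
    return {"description": description, "frames": top_frames(n=3, skip=2)}


def load_user():
    return start_span("SELECT * FROM users WHERE id = %s")


span = load_user()
```

The first recorded frame is `load_user`, the function that opened the span, which is the kind of code location a fingerprint could use. How much this saves compared with full stacktrace serialization would need to be measured per SDK.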

AbhiPrasad · 2022-10-04T20:45:20Z

text/0023-improve-fingerprinting-for-performance-issues.md

+
+1. Client side/SDK Includes application file name in span/transaction
+2. Can we fetch more info on Sentry server side from profiling if available 
+3. Could SDKs detect something and create a unique identifier artificially to empower fingerprinting?


One thing we can do is inject more things at build time. For browser SDKs we have bundler plugins, on Android we have the Gradle plugin, and for web SDKs we run in CI to create a release.

We can dynamically inject code based on parsed modules to attach metadata to events. If it's expensive to generate this per event, we can send it as part of the release and then resolve it server-side.

Does Django (or other frameworks) have any useful runtime state that we could inject into the spans? e.g., when we were talking about detector evidence, there was an idea of parsing out the table names from the span description, and looking up the names of the corresponding Django models. Any relationship (even tenuous) between the span description and the code is helpful!
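A rough sketch of that table-name idea, assuming span descriptions are plain SQL strings. The regular expression is deliberately naive, and `TABLE_TO_MODEL` is a hypothetical hard-coded lookup; in a real Django app a mapping like this would have to come from the app's own model registry.

```python
import re

# Hypothetical table-to-model mapping, hard-coded for illustration only.
TABLE_TO_MODEL = {
    "auth_user": "django.contrib.auth.models.User",
    "shop_order": "shop.models.Order",
}

# Naive pattern: a table name following FROM / JOIN / UPDATE / INTO,
# optionally wrapped in double quotes or backticks.
TABLE_PATTERN = re.compile(
    r'\b(?:FROM|JOIN|UPDATE|INTO)\s+["`]?(\w+)["`]?', re.IGNORECASE
)


def models_for_span(description):
    """Return the known model names referenced by a SQL span description."""
    models = []
    for table in TABLE_PATTERN.findall(description):
        model = TABLE_TO_MODEL.get(table)
        if model and model not in models:
            models.append(model)
    return models


example = models_for_span(
    'SELECT * FROM "shop_order" JOIN auth_user ON shop_order.user_id = auth_user.id'
)
```

Here `example` holds the two model paths in the order their tables appear in the query. Subqueries, schema-qualified names, and aliases would need a real SQL parser.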

Ok so crazy idea. We could potentially walk the AST of any codebase and upload that information during CI, since Sentry often runs in CI to do things like upload release artifacts (debug symbols, sourcemaps) and create a release.

This means we can actually do this evidence mapping in Relay itself, since it can just use the release information to make better decisions about spans. For example, in Django we could upload all the models, views, controllers - and then map those automatically to certain spans/transactions based on their name. We could even upload line/col numbers during this process, and then resolve it like how we do stacktraces in the product.
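As a sketch of what the CI step could collect, the snippet below uses Python's standard `ast` module to list every class and function in a source file with its line and column. The manifest shape and the `collect_definitions` name are assumptions for illustration; nothing here reflects an existing Sentry upload format.

```python
import ast


def collect_definitions(source, filename):
    """Return name, kind, file, line and col for each class and function."""
    tree = ast.parse(source, filename=filename)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        entries.append(
            {
                "name": node.name,
                "kind": kind,
                "file": filename,
                "line": node.lineno,
                "col": node.col_offset,
            }
        )
    # ast.walk is breadth-first; sort by line for a stable manifest.
    return sorted(entries, key=lambda e: e["line"])


# Hypothetical module contents standing in for a file read during CI.
SAMPLE = '''\
class Order:
    def total(self):
        return 0

def order_list(request):
    return []
'''

manifest = collect_definitions(SAMPLE, "shop/views.py")
```

A manifest like this could be uploaded alongside the release and used server-side to resolve a span or transaction name such as `order_list` back to a file and line.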

@mitsuhiko this comes back to our discussions about uploading transaction name info during build time.

Also, @benvinegar @HazAT, this comes back to our general discussions about build-time insights: uploading information at build time that helps enhance our production data.

Doing this during CI / build time is a good idea 👍. We just have to make sure the overlap in the Venn diagram of performance / SDK / people willing to set up CI tools from us is large enough to make this worth it as a happy path.

> This means we can actually do this evidence mapping in Relay itself

I think we'd have to be careful with performance costs, since we'll potentially want to run this on processed events, and pulling data into Relay would be expensive. Potentially we could do something like the projectoption cache sync, but we'd want fairly responsive cache updates on push so that we catch performance issues properly on deploy, instead of having a lag that makes the start time of the issue not line up. Additionally, we would need to figure out whether a non-AST evidence issue and an AST evidence issue could be merged, so that we don't end up with two different issues in the window before a cache update.

We must ensure that those artificial fingerprints work across releases and multiple SDK installations. This could be a challenge.

jernejstrasner · 2022-10-21T08:14:29Z

text/0023-improve-fingerprinting-for-performance-issues.md

+# Options Considered
+
+1. Client side/SDK Includes application file name in span/transaction
+2. Can we fetch more info on Sentry server side from profiling if available 


We talked about this for a bit, and the main thing right now is that we don't have a simple way, or any infrastructure built, to leverage profiling data from other products. It's a big consideration going forward, but for now it makes it infeasible to rely on profiling for any of the work for next quarter.

Yeah, it will be nice to leverage profiling in the future, but I agree it's probably premature to plan for it in fingerprinting if we don't have the infrastructure set up yet.

gggritso · 2022-10-21T15:52:30Z

text/0023-improve-fingerprinting-for-performance-issues.md

+* Can we specify why fingerprinting is required?
+* what the threshold would be for uniqueness to enable fingerprinting for performance issues?


Fingerprinting is required for grouping problematic transactions into issues. The goal of grouping is to create exactly one issue for exactly one performance problem in the code. Fingerprints are the mechanism that accomplishes this for both error events and transaction events. Each event is "hashed" into a string fingerprint. Events with the same fingerprint are grouped and become an Issue!
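The hash-and-group mechanism can be sketched in a few lines of Python. The inputs to the hash (problem type, span op, span description) and the event shape are assumptions made for this example, not the fields Sentry actually hashes.

```python
import hashlib
from collections import defaultdict


def fingerprint(problem_type, span_op, span_description):
    """Hash the chosen fields into a string fingerprint."""
    # Which fields go into the hash is an assumption for illustration.
    raw = "|".join([problem_type, span_op, span_description])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def group_events(events):
    """Group event ids by fingerprint; each group would become one issue."""
    groups = defaultdict(list)
    for event in events:
        key = fingerprint(event["problem_type"], event["op"], event["description"])
        groups[key].append(event["id"])
    return dict(groups)


events = [
    {"id": "a", "problem_type": "n_plus_one", "op": "db",
     "description": "SELECT * FROM users WHERE id = %s"},
    {"id": "b", "problem_type": "n_plus_one", "op": "db",
     "description": "SELECT * FROM users WHERE id = %s"},
    {"id": "c", "problem_type": "n_plus_one", "op": "db",
     "description": "SELECT * FROM orders WHERE id = %s"},
]

issues = group_events(events)
```

Events `a` and `b` share a fingerprint and land in one group, while `c` forms its own, so this input yields two issues.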

Fingerprinting transaction events is hard because we don't have stack traces. We can't determine the code location precisely, so we have to infer it using spans. When fingerprinting is too strict, we end up creating lots of issues even though the transactions come from one piece of code. This causes notification spam. When fingerprinting is too loose, we end up creating just one group for multiple unrelated problems. This causes issues to become unresolved when a user fixes just one of the underlying problems.
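A toy illustration of the strict/loose trade-off, using three made-up span descriptions. The three key functions are hypothetical stand-ins for different fingerprinting strategies; replacing literals with placeholders is shown only as one possible middle ground, not as the approach this RFC settles on.

```python
import re


def strict_key(description):
    # Too strict: every distinct literal value yields a new group.
    return description


def parameterized_key(description):
    # Middle ground: replace string and numeric literals with a placeholder.
    key = re.sub(r"'[^']*'", "?", description)
    return re.sub(r"\b\d+\b", "?", key)


def loose_key(description):
    # Too loose: only the SQL verb survives, so unrelated queries merge.
    return description.split()[0].upper()


spans = [
    "SELECT * FROM users WHERE id = 1",
    "SELECT * FROM users WHERE id = 2",
    "SELECT * FROM orders WHERE status = 'open'",
]

strict_groups = {strict_key(s) for s in spans}
parameterized_groups = {parameterized_key(s) for s in spans}
loose_groups = {loose_key(s) for s in spans}
```

The strict key produces three groups for what is arguably two problems, the loose key collapses everything into one, and the parameterized key yields two.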

That's the overall threshold/heuristic we're striving towards: to create exactly one issue for one piece of code that causes the performance problems 🙏🏻

> Fingerprinting transaction events is hard because we don't have stack traces. We can't determine the code location precisely

We could add the stacktrace to spans for some performance issues, such as file I/O on the main thread. As we have to detect that on the client side, we can attach the stacktrace of the calling thread.

philipphofmann · 2022-12-21T07:54:09Z

text/0023-improve-fingerprinting-for-performance-issues.md

+1. Client side/SDK Includes application file name in span/transaction
+2. Can we fetch more info on Sentry server side from profiling if available 
+3. Could SDKs detect something and create a unique identifier artificially to empower fingerprinting?
+


Suggestion option 4: Add the stacktrace of the calling thread when the performance issue relies on SDK side detection, such as file I/O on the main thread.
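A minimal Python sketch of option 4, assuming the SDK has already decided that a file read is worth flagging when it happens on the main thread. `attach_calling_stack`, `read_settings_file`, and the span dictionary are hypothetical; the point is only that the stack is captured when, and where, the client-side detection fires.

```python
import threading
import traceback


def attach_calling_stack(span, limit=5):
    """Attach the caller's top frames to a span, but only on the main thread."""
    if threading.current_thread() is threading.main_thread():
        # extract_stack returns oldest-first and ends with this helper's own
        # frame, so request one extra frame and drop the last entry.
        stack = traceback.extract_stack(limit=limit + 1)[:-1]
        span["stacktrace"] = [(f.filename, f.name, f.lineno) for f in stack]
    return span


def read_settings_file():
    # Hypothetical span emitted by an SDK's file I/O instrumentation.
    span = {"op": "file.read", "description": "settings.json"}
    return attach_calling_stack(span)
```

Called from a worker thread, the span comes back without a `stacktrace` key, so spans that are not flagged pay nothing for stack capture.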



Create 0023-improve-fingerprinting-for-performance-issues

fe048a6

smeubank linked an issue Oct 4, 2022 that may be closed by this pull request

Improve fingerprinting for performance issues #23

Open

Rename 0023-improve-fingerprinting-for-performance-issues to 0023-imp…

bb4abc3

…rove-fingerprinting-for-performance-issues.md

smeubank changed the title ~~Create 0023-improve-fingerprinting-for-performance-issues~~ 0023 Improve fingerprinting for performance issues Oct 4, 2022

Update 0023-improve-fingerprinting-for-performance-issues.md

d031fac

