telemetry(amazonq): measure e2e latency for amazon q requests #5566

jpinkney-aws · 2024-09-10T16:05:02Z

Problem:

We can't measure e2e how long a chat request takes

Solution:

Plumb the trace id through to the webview -> vscode -> webview. This allows us to know exactly what is happening to a message and how long everything took to get there
chat e2e latency can be linked with the other telemetry events via the traceId

Previous PR description

Problem

This is just a draft of what we could potentially do to measure e2e latency in amazon q chat message requests. Feel free to leave comments on the PR

Main code level problems:

It's hard to connect a tab id to a conversation id. Tab ID's exist pretty much to send/receive messages but they don't really live along side conversation id's. Conversation id's are exclusively related to the session. The only place they meetup is in their chat session storages
Do we want to track individual events like webviewToDispatcher as metrics?
- Though there wouldn't really be a way to "listen" on these events except for if we manually allow-listed metrics since theres no reason to attach a tabID to that metric
How do we listen to events emitted from telemetry so that we can find them later?
- One solution is proposed in this PR where we listen to events being emitted and then check if we want to track them
Could we connect these with a parent/child mechanism like we want to introduce eventually?
- We kind of do this because the "listener" event is the parent and the things it finds are the children. The problem is we can really have only one "listener" for these Q events, since we don't know what tabs are related to what conversation ID's until much later (after the first request). This is why we build up the map of conversation metrics and tab metrics and join them together before we emit

Solution

This PR does a couple things:

Moves up the chat session storage for codewhisperer chat. This allows us to eventually connect conversation ids to tabs
Introduces a chat storage facades that allows us to call getConversationId on any tab type. This is only implemented for cwc right now
Introduces a listener pattern that allows telemetry events to "listen" to metrics that are being emitted
emits a sample events that can you can trace from "pressing enter" to "receives message in ui"

Sample event when sending a request in amazon q chat:

2024-09-11 10:11:05.712 [debug] telemetry: chat_roundTrip {
  Metadata: {
    duration: '14585',
    name: 'onChatPrompt',
    child_metrics: 'initialRequestTime,webviewToDispatcher,dispatcherToFeature,amazonq_enterFocusConversation,amazonq_startConversation,amazonq_addMessage,finalRequestTime',
    child_metric_durations: '{"webviewToDispatcher":1,"dispatcherToFeature":3,"amazonq_enterFocusConversation":1542,"amazonq_startConversation":2,"amazonq_addMessage":13003,"finalRequestTime":34}',
    result: 'Succeeded',
    awsAccount: 'not-set',
    awsRegion: 'us-east-1'
  },
  Value: 1,
  Unit: 'None',
  Passive: false
}

You can also get events from the right click context menu items as well (explain, refactor, etc)

2024-09-11 10:11:34.384 [debug] telemetry: chat_roundTrip {
  Metadata: {
    duration: '12885',
    name: 'aws.amazonq.explainCode',
    child_metrics: 'initialRequestTime,webviewToDispatcher,amazonq_enterFocusConversation,amazonq_addMessage,finalRequestTime',
    child_metric_durations: '{"webviewToDispatcher":17,"amazonq_enterFocusConversation":1263,"amazonq_addMessage":11572,"finalRequestTime":33}',
    result: 'Succeeded',
    awsAccount: 'not-set',
    awsRegion: 'us-east-1'
  },
  Value: 1,
  Unit: 'None',
  Passive: false
}

License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions · 2024-09-10T16:05:17Z

This pull request modifies files in src/ but no tests were added/updated. Confirm whether tests should be added or ensure the PR description explains why tests are not required.

nkomonen-amazon · 2024-09-10T20:20:22Z

packages/core/src/amazonq/util/storageFacade.ts

+
+// A common interface over existing chat session storages
+// TODO align all the chatsessionstorages so we don't have to do this :/
+export class ChatSessionStorageFacade {


packages/core/src/amazonq/util/storageFacade.ts

packages/core/src/amazonq/util/telemetryHelper.ts

packages/core/src/amazonq/webview/messages/messageDispatcher.ts

packages/core/src/shared/telemetry/spans.ts

packages/core/src/amazonq/webview/messages/messageDispatcher.ts

jpinkney-aws · 2024-09-11T18:31:27Z

I think another way we could possible go is to just instrument everything via a traceId which is more traditional for distributed tracing (rather than connecting tabIds to conversation Ids at the end of a flow like we do here). I think this might be more scalable for codewhisperer inline as well.

on chat prompt we could create a trace id, plumb that through the entire system and store various different timings in the RecordMap.

This would remove:

listening on telemetry events
remove the chatStorageFacade

we would keep:

record map
start listening/stop listening

end result:

we still get the chat message duration (but we don't necessarily correlate telemetry events with chat_roundTrip (though we totally can))
less complicated to understand/the flow is easily observable e2e
predictable metrics appearing in the end result

Problem: - We can't measure e2e how long a chat request takes Solution: - Plumb the trace id through to the webview -> vscode -> webview. This allows us to know exactly what is happening to a message and how long everything took to get there - chat e2e latency can be linked with the other telemetry events via the traceId

github-actions · 2024-09-24T18:06:52Z

This pull request modifies code in src/ but no tests were added/updated. Confirm whether tests should be added or ensure the PR description explains why tests are not required.

nkomonen-amazon reviewed Sep 10, 2024

View reviewed changes

jpinkney-aws force-pushed the jpinkney-aws/e2e-latency branch from 6ea7da1 to 22db33e Compare September 12, 2024 16:10

jpinkney-aws force-pushed the jpinkney-aws/e2e-latency branch from 68658f1 to f237a6c Compare September 20, 2024 14:41

jpinkney-aws marked this pull request as ready for review September 20, 2024 14:41

jpinkney-aws requested review from a team as code owners September 20, 2024 14:41

jpinkney-aws changed the title ~~draft: measure e2e latency for amazon q requests~~ telemetry(amazonq): measure e2e latency for amazon q requests Sep 23, 2024

jpinkney-aws mentioned this pull request Sep 24, 2024

telemetry(core): Add withTraceId function for improved tracing #5653

Merged

jpinkney-aws force-pushed the jpinkney-aws/e2e-latency branch from f237a6c to e7bd8a0 Compare September 24, 2024 18:06

nkomonen-amazon approved these changes Sep 25, 2024

View reviewed changes

improve docs

1fe948c

jpinkney-aws merged commit 359b1d8 into master Sep 25, 2024
14 of 16 checks passed

jpinkney-aws deleted the jpinkney-aws/e2e-latency branch September 25, 2024 18:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

telemetry(amazonq): measure e2e latency for amazon q requests #5566

telemetry(amazonq): measure e2e latency for amazon q requests #5566

jpinkney-aws commented Sep 10, 2024 •

edited

Loading

github-actions bot commented Sep 10, 2024

nkomonen-amazon Sep 10, 2024

jpinkney-aws commented Sep 11, 2024

github-actions bot commented Sep 24, 2024

telemetry(amazonq): measure e2e latency for amazon q requests #5566

telemetry(amazonq): measure e2e latency for amazon q requests #5566

Conversation

jpinkney-aws commented Sep 10, 2024 • edited Loading

Problem:

Solution:

Problem

Main code level problems:

Solution

Sample event when sending a request in amazon q chat:

You can also get events from the right click context menu items as well (explain, refactor, etc)

github-actions bot commented Sep 10, 2024

nkomonen-amazon Sep 10, 2024

Choose a reason for hiding this comment

jpinkney-aws commented Sep 11, 2024

github-actions bot commented Sep 24, 2024

jpinkney-aws commented Sep 10, 2024 •

edited

Loading