Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify upstream latency collector and measure gateway latency #4193

Merged

Conversation

philipp-spiess
Copy link
Member

Closes CODY-1605

This PR changes the way we track upstream latency (= RTT to the Sourcegraph instance). Instead of measuring every 10 minutes and then exporting the median to the completion metadata, I thought that it might be simpler to include the last measurement with the completion data (and then we can run an aggregation on that). While coding, completion suggestions should happen more frequent then every 10 minutes so this will give us a higher resolution while we avoid to maintain a list.

Additionally, I've added a second parameter to measure the latency to Gateway. This is only turned on for Free/Pro users right now (to avoid outgoing traffic in air-gapped instances).

Furthermore I added tracing to the upstream pings. With our current rate limit, one two two traces every ~10 minutes should be fine (again I assume this happen far less frequently than completion suggestions) but we can revisit this later.

The benefit of exporting this as a trace is that we can later forward the trace to the SG instance and convert it to a metric.

Test plan

Screenshot 2024-05-16 at 14 41 54

@philipp-spiess philipp-spiess requested review from rafax and a team May 16, 2024 12:48
@philipp-spiess philipp-spiess self-assigned this May 16, 2024
headers.set('Authorization', `Bearer ${this.fastPathAccessToken}`)
addTraceparent(headers)
addCustomUserAgent(headers)
const uri = 'https://cody-gateway.sourcegraph.com/healthz'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The right URL seems to be https://cody-gateway.sourcegraph.com/-/healthz (your URL 404s for me)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm this one also does not seem right. I wasn't able to find an endpoint that works so decided that a 404 endpoint from the same endpoint should still have the same round trip time.

vscode/src/services/UpstreamHealthProvider.ts Outdated Show resolved Hide resolved
vscode/src/services/UpstreamHealthProvider.ts Outdated Show resolved Hide resolved
@philipp-spiess
Copy link
Member Author

philipp-spiess commented May 17, 2024

Had one thought yesterday while I couldn't sleep at 3am (🙃):

  • We should run the initial ping a bit delayed. During the extension startup, a lot of i/o and network is happening so it might not be the best sample if we start a ping right away. I think something like a 10 second delay for the initial ping should be enough, just to ensure there's not too much interference of other extensions booting up?

@philipp-spiess philipp-spiess merged commit 857b548 into main May 17, 2024
18 of 19 checks passed
@philipp-spiess philipp-spiess deleted the ps/simplify-upstream-health-and-measure-gateway-latency branch May 17, 2024 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants