JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits #414

b-dean · 2020-12-10T16:30:33Z

When we have hit the rate limits for Bitbucket API it slows everything down to a crawl. Also other API requests (such as getting the Jenkinsfile for multibranch pipelines) are rate limited and slower.

Adding exponential backoff with a minimum of 100 ms and a max of 1 minute, should alleviate the constant retry loop hitting the limit every 5s.

See JENKINS-64418.

Your checklist for this pull request

Make sure you are requesting to pull a topic/feature/bugfix branch (right side) and not your master branch!
Ensure that the pull request title represents the desired changelog entry
Please describe what you did
Link to relevant issues in GitHub or in Jenkins JIRA
Link to relevant pull requests, esp. upstream and downstream changes
Did you provide a test-case? That demonstrates feature works or fixes the issue.

…mits

artheus · 2020-12-22T09:32:06Z

Looking at Atlassians documentation for "Adjusting your code for rate limiting" I would argue that, of the three alternative ways proposed there, the 3rd option makes the most sense here.

See: https://confluence.atlassian.com/adminjiraserver/adjusting-your-code-for-rate-limiting-987143384.html

The 3rd option states that one can adjust the requests sent based on the request limit http headers.

x-ratelimit-interval-seconds: The time interval in seconds. You get a batch of new tokens every time interval.

x-ratelimit-fillrate: The number of tokens you get every time interval.

retry-after: The number of seconds you need to wait for new tokens. Make sure that your rate assumes waiting longer than this value.

I argue that this would be the way to go, due to the two reasons mentioned in the documentation about exponential backoff

⛔ High impact on a Jira instance because of concurrency. We’re assuming most active users will send requests whenever they’re available. This window will be similar for all users, making spikes in Jira performance. The same applies to threads — most will either be busy at the same time or idle.

⛔ Unpredictable. If you need to make a few critical requests, you can’t be sure all of them will be successful.

bitwiseman · 2020-12-22T17:27:36Z

@artheus
Agreed, that is a better way.
If you take a look at https://github.com/jenkinsci/github-branch-source-plugin/blob/master/src/main/java/org/jenkinsci/plugins/github_branch_source/ApiRateLimitChecker.java it uses similar header info from GitHub to reason over rate limits and choose a strategy. However, that will take considerably more work. Would you be interested in submitting a PR?

Also, take a look at #187 (comment). There might be a way to mitigate the behavior in some cases by using caching.

artheus · 2020-12-31T10:32:38Z

@bitwiseman I will start working on a PR for this.
I do not really agree to the use of caching as a part of the plugin though. I would much rather suggest/recommend users to use a caching reverse-proxy between Jenkins and Bitbucket, this way users would be in much better control of the caching layer, and the plugin would work as expected.
Creating a caching layer here would make this a much larger job, as we would have to make a large amount of separate decisions, e.g:

In which scope to cache (per job, run, bitbucket project or globally)
In RAM or on disk?
- if on disk, where would be the best place to store caches?
How to invalidate caches?
What is a good timeout for caching data from which changes are expected?
If a synchronization job is run, and cached data is used, that would be a complete waste of resources (RAM, CPU...)

I think that the scope for this issue should be just to add a way to properly comply with the Bitbucket rate limits.
We could create a separate issue, PR or project for a caching layer.

bitwiseman

We need a test that exercises this.

bitwiseman · 2021-01-05T21:10:20Z

src/main/java/com/cloudbees/jenkins/plugins/bitbucket/client/BitbucketCloudApiClient.java

@@ -866,7 +869,8 @@ protected CloseableHttpResponse executeMethod(HttpHost host, HttpRequestBase htt
        httpMethod.setConfig(requestConfig.build());

        CloseableHttpResponse response = client.execute(host, httpMethod, requestContext);
-        while (response.getStatusLine().getStatusCode() == API_RATE_LIMIT_CODE) {
+        Random random = new Random();
+        for (int attempts = 0; response.getStatusLine().getStatusCode() == API_RATE_LIMIT_CODE; attempts++) {


The minimum sleep should remain 5 seconds. Once the limit is hit, the client needs to a wait at least on the order of seconds before trying again.

The maximum around 2 or 3 minutes sounds good.

We don't currently limit the number of attempts. That shouldn't change. We should hit the max and stay there.

bitwiseman · 2021-01-05T21:26:20Z

@artheus

I see your point about a caching reverse-proxy.

In the best of worlds, people would implement an additional system that gave them more control. However, most Jenkins systems do not exist in the best of worlds. Most don't have what you are describing. As such, we have to look to see if there are any simple changes that can be made in this plugin to mitigate this issue, such as turning on local caching provided by the client http library.

OkHttp makes adding local, on-disk, size constrained caching is relatively straightforward. It looks like Apache CachingHttpClients provides something similar, either in memory or on disk.

Take a look at the github-branch-source implementation. There is a cache for each credential (since different users are likely to see different results) and each cache is size-limited on-disk.

The model here is that queries are always sent to the server (with an ETag from any previous result), the server then either returns new data or informs the client that their cache is still valid. Cache invalidation is done automatically and there is no timeout: if there's a result that hasn't changed in a day or a week, as long as it is still valid it can be used.

If a synchronization job is run, and cached data is used, that would be a complete waste of resources (RAM, CPU...)

As synchronization job? I'm not sure what you mean, but the most common scenario overall is that some mixture query results are valid and caches are used and some are not and new data is retrieved.

In anycase...
I completely support the plugin properly complying with rate limits. I look forward to your PR. That would certainly help this issue as well.

In the meanwhile, I've requested a few changes to this PR and will happily merge this a first step once they are made.

Thanks for contributing!

b-dean · 2021-02-04T22:11:46Z

@artheus, that article you linked to from Atlassian is for JIRA, not Bitbucket. I logged all the headers that come back from the request that has the 429 response and there's nothing that says retry-after

Server: nginx
Vary: Authorization, cookie, user-context
Cache-Control: no-cache, no-store
Content-Type: text/plain
X-B3-TraceId: 1a4e3ec6e8f2fe21
X-Dc-Location: ash2
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Date: Thu, 04 Feb 2021 21:55:33 GMT
X-Served-By: app-3002
X-Static-Version: 6d8cc6ef88d5
X-Credential-Type: apikey
X-Render-Time: 0.0227448940277
X-Accepted-OAuth-Scopes: repository
Connection: Keep-Alive
X-Version: 6d8cc6ef88d5
X-Request-Count: 152
X-Frame-Options: SAMEORIGIN
Content-Length: 46

I thought about X-Request-Count but someone else asked Atlassian and they said it means nothing.

b-dean changed the title ~~add simple exponential backoff to the sleep for bitbucket api rate limits~~ JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits Dec 10, 2020

add simple exponential backoff to the sleep for bitbucket api rate li…

35adbab

…mits

b-dean force-pushed the exponential-backoff branch from 7ffd348 to 35adbab Compare December 16, 2020 02:08

bitwiseman requested changes Jan 5, 2021

View reviewed changes

b-dean marked this pull request as draft February 4, 2021 19:21

Zauxst mentioned this pull request May 10, 2021

Added wait and retries for rate limit response code from Bitbucket server #433

Closed

6 tasks

Merge branch 'master' into exponential-backoff

009627e

tsalzinger mentioned this pull request Mar 27, 2024

Do not endlessly retry rate limited requests to bitbucket cloud #831

Open

nfalco79 mentioned this pull request Apr 3, 2024

Change backoff strategy to exponential for BitbucketCloudApiClient when rate-limited #841

Open

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits #414

JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits #414

b-dean commented Dec 10, 2020 •

edited

Loading

artheus commented Dec 22, 2020

bitwiseman commented Dec 22, 2020

artheus commented Dec 31, 2020

bitwiseman left a comment

bitwiseman Jan 5, 2021

bitwiseman commented Jan 5, 2021

b-dean commented Feb 4, 2021

JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits #414

Are you sure you want to change the base?

JENKINS-64418: add simple exponential backoff to the sleep for bitbucket api rate limits #414

Conversation

b-dean commented Dec 10, 2020 • edited Loading

Your checklist for this pull request

artheus commented Dec 22, 2020

bitwiseman commented Dec 22, 2020

artheus commented Dec 31, 2020

bitwiseman left a comment

Choose a reason for hiding this comment

bitwiseman Jan 5, 2021

Choose a reason for hiding this comment

bitwiseman commented Jan 5, 2021

b-dean commented Feb 4, 2021

b-dean commented Dec 10, 2020 •

edited

Loading