Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recommend DistSender concurrency limit bump #19096

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rmloveland
Copy link
Contributor

Fixes DOC-11652

Summary of changes:

  • Update 'Performance Recipes' page to note that if you encounter DistSender batch throttling (distsender.batches.async.throttled is > 0), consider increasing the value of the kv.dist_sender.concurrency_limit cluster setting.

Fixes DOC-11652

Summary of changes:

- Update 'Performance Recipes' page to note that if you encounter
  DistSender batch throttling (`distsender.batches.async.throttled` is >
  0), consider increasing the value of the
  `kv.dist_sender.concurrency_limit` cluster setting.
@rmloveland rmloveland marked this pull request as draft November 5, 2024 21:07
Copy link

github-actions bot commented Nov 5, 2024

Files changed:

Copy link

netlify bot commented Nov 5, 2024

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit 917c511
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-api-docs/deploys/672a891883106500083bb7ce

Copy link

netlify bot commented Nov 5, 2024

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit 917c511
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/672a8918c673060008b5cf9a

Copy link

netlify bot commented Nov 5, 2024

Deploy Preview for cockroachdb-docs failed. Why did it fail? →

Name Link
🔨 Latest commit 917c511
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-docs/deploys/672a8918f568f4000886d005

@rmloveland
Copy link
Contributor Author

Hey @sean- @nhamlani101 @mwang1026

this is a quick rough draft to address https://cockroachlabs.atlassian.net/browse/DOC-11652 which was prompted by a message from @sean-

IMO to get this into docs, we need to answer a couple questions/make some decisions (below list of things are also marked XXX in the text of the PR, once we have answers/decisions I will update it and formally request reviews n stuff)

  1. How much to increase kv.dist_sender.concurrency_limit? 6x as in pkg/kv/kvclient: update per-vCPU concurrency limits cockroach#131226 ?
  2. What should we say about testing? What is a bad outcome of increasing this too much?
  3. If we're gonna document the above cluster setting, we should flip it to public, right now it's private AKA does not appear in docs at cluster settings (v24.2) (I can file a CRDB issue, at least in the past this was like a one-line code change)

@sean-
Copy link
Contributor

sean- commented Nov 5, 2024

  1. How much to increase kv.dist_sender.concurrency_limit? 6x as in pkg/kv/kvclient: update per-vCPU concurrency limits cockroach#131226 ?

Something like:

"If distsender.batches.async.throttled is non-zero and queries are experiencing unacceptable tail latency, try doubling this value to see if the number of throttled requests decreases. If it does, increase until see diminishing returns. If this value is non-zero after 10x'ing the value, please contact support for additional help to investigate the cause of the throttled messages."

  1. What should we say about testing? What is a bad outcome of increasing this too much?

"To validate a successful result, you can increase this value until you see no new throttled requests AND no increase in tail latency (e.g. p99.999)."

This does increase the amount of RAM consumption per node to handle the increased concurrency, but it's proportional to the load and an individual flow's memory consumption isn't "significant." Bad outcomes include increased tail latency or too much memory consumption with no decrease in the number of throttled requests.

  1. If we're gonna document the above cluster setting, we should flip it to public, right now it's private AKA does not appear in docs at cluster settings (v24.2) (I can file a CRDB issue, at least in the past this was like a one-line code change)

Heh, good to know, and yes agreed about making it public.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants