-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reactor, linux-aio: advise users in more detail on setting aio-max-nr
#2396
Conversation
Some example outputs, through scylla (all wrapped here manually, for readability):
|
i will take a look at this PR tomorrow. |
src/core/reactor.cc
Outdated
seastar_logger.info(""); | ||
log_aiocbs(log_level::info, storage_iocbs, preempt_iocbs, network_iocbs); | ||
seastar_logger.info(""); | ||
seastar_logger.info("Available AIO control blocks = aio-max-nr - aio-nr = {} - {} = {}", aio_max_nr, aio_nr, available_aio); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default logging level is set to info
, which ensures this message is always printed. i am wondering if we need this logging message if the number of aio cb is large enough. because i think, "no news is good news", which potentially reduces unnecessary verbosity.
so, maybe we can print it only when the number of aio cb is not sufficient and might hurt the network effienciency of the seastar application or it is not large enough to fulfill the minimum requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch (for seastar code) is also supposed to replace the following passage in scylladb's docs/dev/docker-hub.md
(as of scylladb commit ff52527c54e8
):
-Using multiple cores requires setting a proper value to the `/proc/sys/fs/aio-max-nr`.
-On many non-production systems, it will be equal to 65K. The formula
-to calculate the proper value is:
-
- Available AIO on the system - (request AIO per-cpu * ncpus) =
- aio_max_nr - aio_nr < (reactor::max_aio + detect_aio_poll + reactor_backend_aio::max_polls) * cpu_cores =
- aio_max_nr - aio_nr < (1024 + 2 + 10000) * cpu_cores =
- aio_max_nr - aio_nr < 11026 * cpu_cores
-
- where
-
- reactor::max_aio = max_aio_per_queue * max_queues,
- max_aio_per_queue = 128,
- max_queues = 8.
-
(See here why -- it's just too stale, and seeing actual numbers is better than purely symbolic formulae.)
Given that the static documentation is going away, I believe that the basic info should be printed even if the resources prove sufficint. However, I do agree that seeing this message at every startup, with default log level configuration, becomes boring pretty fast. I suggest: downgrade the log level from info to verbose. Would that work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lersek that'd work. thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correction: the log level just below info
is debug
, not verbose
(I remembered verbose
from another project, sorry). ... that is, I'm going to change the log level to debug
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in v2
Print a table of the AIO counts we intend to use (at "debug" level). If the intended counts cannot be allocated, reduce the network AIO count like before, but now also print an updated table (as a warning). Tell the user the specific number they should write to "/proc/sys/fs/aio-max-nr" for avoiding the reduction (assuming their current "aio-nr" does not change). If the per CPU network AIO count would have to be reduced under 1, format a clear error message with (up to) four specific recommendations: - aio-max-nr value for reaching 1 network AIOCB per CPU (always printed), - aio-max-nr value for reaching the originally requested network AIOCB per CPU (always printed), - the highest value to reduce the application's CPU count to, in order to reach 1 network AIOCB per CPU, within the current (aio-max-nr - aio-nr) budget (printed only if this SMP count is at least 1), - the value to reduce the application's CPU count to, in order to reach the originally requested network AIOCB per CPU, within the current (aio-max-nr - aio-nr) budget (printed only if this SMP count is at least 1). Motivated by <scylladb/scylladb#5981>. Signed-off-by: Laszlo Ersek <[email protected]>
v2:
example output (pass
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back to scylla commit f441202 ("docs/docker-hub.md: add quickstart section with --smp 1", 2020-09-22). Problems with the current language: - The "65K" claim as default value on non-production systems is wrong; "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K. - The section in question uses equal signs (=) incorrectly. The intent was probably to say "which means the same as", but that's not what equality means. - In the same section, the relational operator "<" is bogus. The available AIO count must be at least as high (>=) as the requested AIO count. - Clearer names should be used; adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc" sets a great example: - "reactor::max_aio" should be called "storage_iocbs"; - "detect_aio_poll" should be called "preempt_iocbs"; - "reactor_backend_aio::max_polls" should be called "network_iocbs"; - The specific value 10000 for the last one ("network_iocbs") is not correct in scylla's context. It is correct as the Seastar default, but scylla has used 50000 since commit 2cfc517 ("main, test: adjust number of networking iocbs", 2021-07-18). Rewrite the section to address these problems. See also: - scylladb#5981 - scylladb/seastar#2396 - scylladb#19921 Signed-off-by: Laszlo Ersek <[email protected]>
What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back to scylla commit f441202 ("docs/docker-hub.md: add quickstart section with --smp 1", 2020-09-22). Problems with the current language: - The "65K" claim as default value on non-production systems is wrong; "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K. - The section in question uses equal signs (=) incorrectly. The intent was probably to say "which means the same as", but that's not what equality means. - In the same section, the relational operator "<" is bogus. The available AIO count must be at least as high (>=) as the requested AIO count. - Clearer names should be used; adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc" sets a great example: - "reactor::max_aio" should be called "storage_iocbs", - "detect_aio_poll" should be called "preempt_iocbs", - "reactor_backend_aio::max_polls" should be called "network_iocbs". - The specific value 10000 for the last one ("network_iocbs") is not correct in scylla's context. It is correct as the Seastar default, but scylla has used 50000 since commit 2cfc517 ("main, test: adjust number of networking iocbs", 2021-07-18). Rewrite the section to address these problems. See also: - scylladb#5981 - scylladb/seastar#2396 - scylladb#19921 Signed-off-by: Laszlo Ersek <[email protected]>
…aszlo Ersek ~~~ What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back to scylla commit f441202 ("docs/docker-hub.md: add quickstart section with --smp 1", 2020-09-22). Problems with the current language: - The "65K" claim as default value on non-production systems is wrong; "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K. - The section in question uses equal signs (=) incorrectly. The intent was probably to say "which means the same as", but that's not what equality means. - In the same section, the relational operator "<" is bogus. The available AIO count must be at least as high (>=) as the requested AIO count. - Clearer names should be used; adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc" sets a great example: - "reactor::max_aio" should be called "storage_iocbs", - "detect_aio_poll" should be called "preempt_iocbs", - "reactor_backend_aio::max_polls" should be called "network_iocbs". - The specific value 10000 for the last one ("network_iocbs") is not correct in scylla's context. It is correct as the Seastar default, but scylla has used 50000 since commit 2cfc517 ("main, test: adjust number of networking iocbs", 2021-07-18). Rewrite the section to address these problems. See also: - #5981 - scylladb/seastar#2396 - #19921 Signed-off-by: Laszlo Ersek <[email protected]> ~~~ No need for backporting; the documentation being refreshed targets developers as audience, not end-users. Closes #20398 * github.com:scylladb/scylladb: docs/dev/docker-hub.md: refresh aio-max-nr calculation docs/dev/docker-hub.md: strip trailing whitespace
Print a table of the AIO counts we intend to use (at "debug" level).
If the intended counts cannot be allocated, reduce the network AIO count like before, but now also print an updated table (as a warning). Tell the user the specific number they should write to
/proc/sys/fs/aio-max-nr
for avoiding the reduction (assuming their current "aio-nr" does not change).If the per CPU network AIO count would have to be reduced under 1, format a clear error message with (up to) four specific recommendations:
aio-max-nr
value for reaching 1 network AIOCB per CPU (always printed),aio-max-nr
value for reaching the originally requested network AIOCB per CPU (always printed),(aio-max-nr - aio-nr)
budget (printed only if this SMP count is at least 1),(aio-max-nr - aio-nr)
budget (printed only if this SMP count is at least 1).Motivated by scylladb/scylladb#5981.