High lowres clock drift > 1s in debug mode #2238
Maybe something like:

```diff
diff --git a/include/seastar/core/preempt.hh b/include/seastar/core/preempt.hh
index 537be5b5..21882aa9 100644
--- a/include/seastar/core/preempt.hh
+++ b/include/seastar/core/preempt.hh
@@ -49,11 +49,9 @@ inline const preemption_monitor*& get_need_preempt_var() {
 void set_need_preempt_var(const preemption_monitor* pm);
-}
-
-SEASTAR_MODULE_EXPORT
-inline bool need_preempt() noexcept {
-#ifndef SEASTAR_DEBUG
+inline
+bool
+monitor_need_preempt() {
     // prevent compiler from eliminating loads in a loop
     std::atomic_signal_fence(std::memory_order_seq_cst);
     auto np = internal::get_need_preempt_var();
@@ -64,6 +62,14 @@ inline bool need_preempt() noexcept {
     // Possible optimization: read head and tail in a single 64-bit load,
     // and find a funky way to compare the two 32-bit halves.
     return __builtin_expect(head != tail, false);
+}
+
+}
+
+SEASTAR_MODULE_EXPORT
+inline bool need_preempt() noexcept {
+#ifndef SEASTAR_DEBUG
+    return internal::monitor_need_preempt();
 #else
     return true;
 #endif
@@ -71,8 +77,6 @@ inline bool need_preempt() noexcept {
 namespace internal {
-
-
 // Same as need_preempt(), but for the scheduler's use. Outside debug
 // mode they have the same meaning - the task quota expired and we need
 // to check for I/O.
@@ -88,7 +92,7 @@ scheduler_need_preempt() {
     // mode, run some random-but-bounded number of tasks instead. Latency
     // will be high if those tasks are slow, but this is debug mode anyway.
     static thread_local unsigned counter = 0;
-    return ++counter % 64 == 0;
+    return ++counter % 64 == 0 || monitor_need_preempt();
 #endif
 }
```
Adding a segment takes ~1ms. Adding multiple of them in a single batch can stall the reactor for a while. This, together with a recent seastar change[^1] caused some timeouts to fire very frequently[^2]. Fix the test by breaking the batch passed to add_segments into smaller segments so that we execute finer-grained tasks and yield more often to the scheduler. [^1]: scylladb/seastar#2238 [^2]: redpanda-data#13275
Reasonable. Does debug mode default to 10ms task quotas? Or am I misremembering? Also, your code should rely on need_preempt() rather than doing its own batching.
Adding a segment takes ~1ms. Adding multiple segments in a single batch can stall the reactor for a while. This, together with a recent seastar change[^1] caused some timeouts to fire very frequently[^2]. Fix the problematic test by splitting the batch passed to add_segments so that we execute finer-grained tasks and yield more often to the scheduler. [^1]: scylladb/seastar#2238 [^2]: redpanda-data#13275
On a quick glance I don't see any code in seastar that would change the task quota in debug mode. I see only these 2 references:
Shall I submit a PR with the patch from the comment above, or are there better suggestions?
Ah, we have this in ScyllaDB:
I don't have a better suggestion.
In redpanda we have a test which began failing very often in debug mode since the latest seastar upgrade (the CPU is busy with other processes too).
The test gets the current time using the lowres clock and sets a deadline of +1 second for establishing a TCP connection. Because of the high lowres clock drift, we hit the 1s deadline even though the tasks themselves only take milliseconds to run.
I believe this change in behavior is caused by this reactor change #2007.
We preempt only every 64 calls to scheduler_need_preempt (every 64 tasks?) under the argument that "latency is not important in debug mode", but that seems to be too much in some cases that use the lowres clock, since it is only updated when we break out of the reactor loop.
While we are admittedly guilty of having some ~100ms reactor stalls in this testing code, I don't believe that should be an excuse to let the lowres clock drift by 1s+. If we were in a seastar loop with each iteration taking 25ms (the stall warning threshold), 64 calls already add up to 1.6s.
I see a few possible options:
Appreciate any thoughts or ideas on the above 🙏