Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion `!_end && !_zc_bufs && "Was this stream properly closed?"' failed #19494

Closed
tanlosav opened this issue Jun 25, 2024 · 8 comments · Fixed by #19541
Closed

Assertion `!_end && !_zc_bufs && "Was this stream properly closed?"' failed #19494

tanlosav opened this issue Jun 25, 2024 · 8 comments · Fixed by #19541
Assignees
Labels
Backport candidate backport/5.4-done backport to 5.4 was completed

Comments

@tanlosav
Copy link

Installation details
Scylla version (or git commit hash): 5.4.7
Cluster size: 11 peers
OS (RHEL/CentOS/Ubuntu/AWS AMI): RHEL/CentOS

We observed process crash when the execution time of a REST API exceeds the client's HTTP timeout.
For instance, when the client requests a list of snapshots (we've generated approximately 5,000 snapshots during our tests), it may not receive a response
during the timeout and close the connection.
This can be reproduced with the following command:

timeout 5 curl -v -X GET --header 'Accept: application/json' 'http://localhost:10000/storage_service/snapshots/'

As a result, an assert will be triggered in ~output_stream().

2024-06-06T16:16:50+0300 lieflziu scylla[2413759]: scylla: seastar/include/seastar/core/iostream.hh:407: seastar::output_stream<char>::~output_stream()
[CharType = char]: Assertion `!_end && !_zc_bufs && "Was this stream properly closed?"' failed.
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]: Aborting on shard 0.
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]: Backtrace:
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b54d18
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b8ae92
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x3dbaf
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x8e883
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x3dafd
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x2687e
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x2679a
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x36186
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x4a57cf4
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b665cf
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b678a7
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b66c19
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b0a357
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x5b0950c
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x13334ce
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x1334f30
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x1331a3c
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x27b89
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   /opt/scylladb/libreloc/libc.so.6+0x27c4a
2024-06-06T16:16:50+0300 lieflziu scylla[2413759]:   0x132f464
2024-06-06T16:24:32+0300 lieflziu systemd[1]: Stopping Scylla Server...
2024-06-06T16:26:23+0300 lieflziu systemd[1]: scylla-server.service: Main process exited, code=dumped, status=6/ABRT
2024-06-06T16:26:23+0300 lieflziu systemd[1]: scylla-server.service: Failed with result 'core-dump'.
(gdb) bt
#0  0x00007f1859014884 in __pthread_kill_implementation () from /opt/scylladb/libreloc/libc.so.6
#1  0x00007f1858fc3afe in raise () from /opt/scylladb/libreloc/libc.so.6
#2  0x00007f1858fac87f in abort () from /opt/scylladb/libreloc/libc.so.6
#3  0x00007f1858fac79b in __assert_fail_base.cold () from /opt/scylladb/libreloc/libc.so.6
#4  0x00007f1858fbc187 in __assert_fail () from /opt/scylladb/libreloc/libc.so.6
#5  0x0000000004a57cf5 in seastar::output_stream<char>::~output_stream (this=0x60000b5773a0) at ./seastar/include/seastar/core/iostream.hh:407
#6  std::_Head_base<0ul, seastar::output_stream<char>, false>::~_Head_base (this=0x60000b5773a0) at /usr/bin/../lib/gcc/x86_64-redhat-
linux/13/../../../../include/c++/13/tuple:75
#7  std::_Tuple_impl<0ul, seastar::output_stream<char>, bool>::~_Tuple_impl (this=0x60000b577398) at /usr/bin/../lib/gcc/x86_64-redhat-
linux/13/../../../../include/c++/13/tuple:259
#8  seastar::internal::do_with_state<std::tuple<seastar::output_stream<char>, bool>, seastar::future<void> >::~do_with_state (this=0x60000b577380) at
./seastar/include/seastar/core/do_with.hh:40
#9  seastar::internal::do_with_state<std::tuple<seastar::output_stream<char>, bool>, seastar::future<void> >::run_and_dispose (this=0x60000b577380) at
./seastar/include/seastar/core/do_with.hh:48
#10 0x000060000030e000 in ?? ()
#11 0x000c435b0ffb735c in ?? ()
#12 0x00007f1856b1eac0 in ?? ()
#13 0x0000000005b665d0 in seastar::reactor::run_tasks (this=0x60000b577380, tq=...) at ./seastar/src/core/[reactor.cc:2651](http://reactor.cc:2651/)
#14 seastar::reactor::run_some_tasks (this=0x60000b577380) at ./seastar/src/core/[reactor.cc:3114](http://reactor.cc:3114/)
#15 0x0000000000a13b01 in typeinfo name for seastar::reactor::drain_cross_cpu_freelist_pollfn ()
#16 0x00007f1856b1eb00 in ?? ()
#17 0x0000000000000000 in ?? ()
(gdb) f 5
#5  0x0000000004a57cf5 in seastar::output_stream<char>::~output_stream (this=0x60000b5773a0) at ./seastar/include/seastar/core/iostream.hh:407
407                 assert(!_end && !_zc_bufs && "Was this stream properly closed?");
(gdb) list
402         output_stream& operator=(output_stream&&) noexcept = default;
403         ~output_stream() {
404             if (_batch_flushes) {
405                 assert(!_in_batch && "Was this stream properly closed?");
406             } else {
407                 assert(!_end && !_zc_bufs && "Was this stream properly closed?");
408             }
409         }
410         future<> write(const char_type* buf, size_t n) noexcept;
411         future<> write(const char_type* buf) noexcept;
(gdb) p _end
$4 = 21
(gdb) p _buf
$9 = {_buffer = 0x60000a088000 "total\": 0, \"live\": 0}", _size = 32000, _deleter = {_impl = 0x60000a088001}}
@avikivity
Copy link
Member

Is this a self-built 5.4.7? backtrace.scylladb.com doesn't decode the trace properly (and yours seems to be missing some frames).

@xemul the http timeout is a good hint though.

@tanlosav
Copy link
Author

We spent some time investigating with several versions (5.4.3, some intermediate commit of 5.4.5), and latest version was 5.4.7.
Yes, one of core dump has part of HTTP response on the request from our service and in the log we see client HTTP timeout...

@dedok
Copy link

dedok commented Jun 26, 2024

Hello,

Also I have a good question:
As I can see there is no calling of output_stream::close() inside the destructor. Does it make any reason?
It seems that calling of close make sense and it would fix the issue.

For example std respects this rule: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/fstream#L256

@tanlosav
Copy link
Author

I found a commit where the destructor was added: scylladb/scylla-seastar@78e3924
This change doesn't seem intentional... Perhaps instead of assert(), it might be better to call close() to buffer cleanup and descriptor closure?

@avikivity
Copy link
Member

In general one can't call a function returning a future from a destructor, because there is no way to wait for it.

@dedok
Copy link

dedok commented Jun 27, 2024

@avikivity thx for the reply and your clarification. But could be close() not a future but sync call? For ex: we could introduce sync version of the close(). May be that could be a good strategy to avoid such situations in the future.

Btw wondering If no close() called, then we would have fd leak at least, am I right?

@xemul
Copy link
Contributor

xemul commented Jun 28, 2024

I've a simple reproduced in scylladb/seastar#2310 (fixed issue number)

@xemul
Copy link
Contributor

xemul commented Jun 28, 2024

Ah, it's even simpler

xemul added a commit that referenced this issue Jul 1, 2024
…PI handler' from ScyllaDB

All streams used by httpd handlers are to be closed by the handler itself, caller doesn't take care of that.

fixes: #19494

(cherry picked from commit d1fd886)

(cherry picked from commit a0c1552)

(cherry picked from commit 1839030)

 Refs #19541

Closes #19563

* github.com:scylladb/scylladb:
  api: Fix indentation after previous patch
  api: Close output_stream on error
  api: Flush response output stream before closing
xemul added a commit to xemul/scylla that referenced this issue Jul 1, 2024
All streams used by httpd handlers are to be closed by the handler itself,
caller doesn't take care of that.

On master this place is coroutinized, thus the original fix doesn't fit
as is. Original commit is 0ce00eb

refs: scylladb#19494
closes: scylladb#19561

Signed-off-by: Pavel Emelyanov <[email protected]>
xemul added a commit that referenced this issue Jul 1, 2024
All streams used by httpd handlers are to be closed by the handler itself,
caller doesn't take care of that.

On master this place is coroutinized, thus the original fix doesn't fit
as is. Original commit is 0ce00eb

refs: #19494
closes: #19561

Signed-off-by: Pavel Emelyanov <[email protected]>

Closes #19570
@yaronkaikov yaronkaikov added backport/5.4-done backport to 5.4 was completed and removed backport/5.4 labels Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backport candidate backport/5.4-done backport to 5.4 was completed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants