[CORE-2814]: utils/file_io: fix double closing of ss::file in write_fully #18303

Merged

andijcr
Contributor

@andijcr andijcr commented May 8, 2024

ss::output_stream internally keeps track of in-flight exceptions. In the close() method, it will close the underlying ss::file and rethrow the exception.

ss::with_file_close_on_failure also closes the underlying ss::file if the future fails. This can result in a double close, which triggers an assertion like:

seastar-prefix/src/seastar/include/seastar/core/future.hh:1917: future<T> seastar::promise<>::get_future() [T = void]: Assertion `!this->_future && this->_state && !this->_task' failed

This unit test shows how this assert could be triggered if ss::output_stream has an active exception:

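SEASTAR_THREAD_TEST_CASE(test_with_file_close_on_failure) {
    auto flags = ss::open_flags::rw | ss::open_flags::create
                 | ss::open_flags::truncate;
    ss::with_file_close_on_failure(
      ss::open_file_dma("/tmp/tmp.YuupbuphlR", flags),
      [](ss::file f) mutable {
          // First close, then fail: this mimics ss::output_stream::close()
          // closing the file and rethrowing its stored exception.
          return f.close().then([] { throw "any value"; });
      })
      // The failed future makes with_file_close_on_failure close f again.
      .get();
}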

This commit moves out ss::output_stream::close() from ss::with_file_close_on_failure.

The method is coroutinized for clarity.
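
As a rough illustration of the ordering problem, here is a toy model in plain C++ with no Seastar dependency. The names toy_file, toy_stream and toy_with_file_close_on_failure are hypothetical stand-ins, not Seastar APIs. The model is synchronous and counts close() calls instead of asserting, so it only sketches why closing the stream inside the wrapper closes the file twice on failure.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for ss::file: counts close() calls.
struct toy_file {
    int close_calls = 0;
    void close() { ++close_calls; }
};

// Hypothetical stand-in for ss::output_stream: remembers a failed write;
// close() closes the underlying file and then rethrows that failure.
struct toy_stream {
    toy_file& file;
    bool failed = false;
    explicit toy_stream(toy_file& f) : file(f) {}
    // Simplification: a failed write is only recorded, not thrown.
    void write(bool disk_full) { if (disk_full) { failed = true; } }
    void close() {
        file.close();
        if (failed) { throw std::runtime_error("no space left on device"); }
    }
};

// Hypothetical stand-in for ss::with_file_close_on_failure:
// closes the file only when body throws, then rethrows.
template <typename Body>
auto toy_with_file_close_on_failure(toy_file& f, Body body) -> decltype(body(f)) {
    try {
        return body(f);
    } catch (...) {
        f.close();
        throw;
    }
}

// Ordering before the fix: the stream is closed inside the wrapper.
int close_calls_before_fix(bool disk_full) {
    toy_file f;
    try {
        toy_with_file_close_on_failure(f, [disk_full](toy_file& file) {
            toy_stream out(file);
            out.write(disk_full);
            out.close(); // closes the file, then throws if a write failed
        });
    } catch (const std::runtime_error&) {
        // The wrapper has already closed the file a second time here.
    }
    return f.close_calls;
}

// Ordering after the fix: only stream creation is wrapped.
int close_calls_after_fix(bool disk_full) {
    toy_file f;
    try {
        toy_stream out = toy_with_file_close_on_failure(
          f, [](toy_file& file) { return toy_stream(file); });
        out.write(disk_full);
        out.close(); // the single owner of the close
    } catch (const std::runtime_error&) {
    }
    return f.close_calls;
}
```

In this model, close_calls_before_fix(true) returns 2 (the double close), while close_calls_after_fix(true) returns 1. Both orderings close the file exactly once when nothing fails.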

Fixes CORE-2814
Fixes #18286

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • Fixed an assertion triggering in a full-disk scenario

@andijcr andijcr marked this pull request as draft May 8, 2024 10:00
@andijcr andijcr marked this pull request as ready for review May 8, 2024 10:21
@vbotbuildovich
Collaborator

vbotbuildovich commented May 8, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/48818#018f57e6-4fda-47d6-8ebe-2bb7d25c6bd8:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48818#018f57f7-26f0-4de3-b203-6259257cbe13:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/48818#018f57f7-26f4-491e-a8b4-e1f710eb6a34:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48818#018f57e6-4fd8-4008-95ad-27cc1ccc83c5:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"

@andijcr
Contributor Author

andijcr commented May 8, 2024

failure is #14139

@andijcr andijcr requested a review from a team May 9, 2024 16:09
Member

@BenPope BenPope left a comment

LGTM

Is it worth adding the test from the cover letter?

@andijcr
Contributor Author

andijcr commented May 10, 2024

> LGTM
>
> Is it worth adding the test from the cover letter?

To test write_fully? I would need a semi-reliable way to inject failures in ss::output_stream::close(). I can try to see what happens with /dev/null.

Edit: that gives "seastar - io_submit: Invalid argument". The function would need to be made more unit-test friendly.

});
co_await write_iobuf_to_output_stream(std::move(buf), out).finally([&out] {
    return out.close();
});
Member

@dotnwat dotnwat May 10, 2024

looks safe to me.

nit: I don't know if there is an idiomatic use of with_file_close_on_failure, but it seems like it would be:

auto f = co_await open();
co_await with_file_close_on_failure(
  // all the things that could fail and we don't want hand crafted error handling.
  make_file_output_stream.then(write_io_to_output_stream)...
);
co_await f.close();

Contributor Author

Yes, you just need to be sure that no path inside the lambda closes the file.
For the PR, I copied how the function is used in the unit tests under seastar.

@andijcr
Contributor Author

andijcr commented May 13, 2024

failures are #14139

@michael-redpanda michael-redpanda merged commit 45423ed into redpanda-data:dev May 13, 2024
16 of 20 checks passed
@vbotbuildovich
Collaborator

/backport v24.1.x

@vbotbuildovich
Collaborator

/backport v23.3.x

@vbotbuildovich
Collaborator

/backport v23.2.x

Successfully merging this pull request may close these issues.

utils/file_io write_fully: assertion triggered with full disk
6 participants