Seastar must listen on ALL shards, otherwise requests can be lost #2183
Comments
Your example code doesn't prove that connections are accepted only on core 0 - rather it proves that you must also listen on core 0 for the setup to work. Can you please try the full loop (with 0) and add printouts of which core gets each request, to verify that all of them - not just core 0 - get requests?
By the way, I'm not happy that the current situation is that you're not allowed to listen on only a subset of the cores. It's not because anyone might really want to do that (on a sharded application, you would definitely want to listen on all cores), but it makes the user experience frustrating: If you forget to listen on every one of the cores, you get mysterious results with only some of the requests (if any) being handled, instead of getting a clear error message or everything just working with fewer cores.
Yes, that works, as expected:
My plan was to do different things on different cores - e.g. most of the cores would only listen, but some would be tasked with some background tasks etc... Is that discouraged? But please, in any case: Please document this limitation with a fat red banner so that nobody can miss it.
I must say that my onboarding experience has indeed been quite frustrating. I filed 3 issues including this, and almost a fourth due to the unexpected |
Good. So this issue should be closed, or at least its title and description need to be rewritten (the "connection only accepted in shard id = 0" isn't true, or doesn't accurately describe the problem).
Yes - although in some cases it makes sense to do some rare operations only on one CPU - e.g., to make them easy to serialize - usually you'd want the bulk of the operations, like processing requests, to happen in parallel on all cores. If you decide that some cores only listen, other cores only do background tasks, etc., you'll need to work hard to ensure a balanced load (and not leave some shard idle while other shards are working) and you will have more communication between shards. That being said, as I also admitted above, I agree that the user experience of this really sucks. Ideally if the Seastar application only listens on core 2 and 7, then it should work - and all requests would be handled on core 2 and 7. Or, if this is NOT supported, we should abort the application with an error. Or, as you said, at least:
I agree. The situation with the Seastar tutorial (doc/tutorial.md) is not good. I started writing it a few years ago, and stopped when funding for that project stopped. In some areas we have good doxygen comments, but they are not detailed enough. If you can send a patch improving tutorial.md and/or doxygen comments in the area, it would be very welcome.
I agree with every word. I opened an issue about control-C seven years ago - #261 :-(
By the way: A plea to managers of commercial projects built on Seastar such as ScyllaDB (CC @avikivity) and RedPanda: Please consider assigning people to improve Seastar documentation, and while at it also to fix bugs that hurt usability and onboarding. While no paying customers care about these things, they are important for the productivity of new developers that you hire for these commercial projects. Wouldn't it be great to have a "Seastar book" that every new employee could read and learn Seastar? That was my original intention when I started tutorial.md.
Done.
Well yes - That was a drawback I was willing to take. One of the background tasks should be scheduled at more or less precise times (60fps "output") and not be interrupted by other things. I know the scheduler takes care of this, but I'd prefer to have more explicit control.
Yep, I agree 100%
Well the point is I'm just getting started with Seastar (I'm working on a project called Kataklysmos which is supposed to become the fastest Pixelflut implementation in the world), so any doc PRs would just be more or less educated guesses. Not sure whether this is a good idea.
Oh, I didn't know funding for Seastar was stopped
Just to clarify: Seastar development is still almost entirely done by developers paid by commercial companies (like my own employer, ScyllaDB). This hasn't stopped. But there is no one who is specifically funding, or assigning, documenting Seastar. So the result is partial documentation of varying breadth and depth. When Seastar development started, it was partially funded by an EU project (the Horizon 2020 project, https://mikelangelo-project.eu/), and one of the requirements of that funding was documentation.
TL;DR:
Currently Seastar must listen on ALL shards, otherwise requests can be lost.
This is not documented anywhere as far as I'm aware - So either this limitation should be removed or it should be documented.
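To make the failure mode concrete, here is a toy model in plain C++. It does not use Seastar and does not reproduce its real load-balancing algorithm. It only assumes, for illustration, that incoming connections are spread round-robin over all shards whether or not a shard is listening; the names shard_stats, simulate and total_stranded are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: no Seastar calls, hypothetical names throughout.
struct shard_stats {
    bool listening = false;    // did the application start listening on this shard?
    std::size_t handled = 0;   // connections that reached a listening shard
    std::size_t stranded = 0;  // connections routed to a shard that never listens
};

// ASSUMPTION: connections are distributed round-robin over ALL shards,
// regardless of which shards listen. Shards in the range
// [first_listening_shard, shard_count) listen, mirroring
// boost::irange<unsigned>(first, count) in the issue's example.
std::vector<shard_stats> simulate(std::size_t shard_count,
                                  std::size_t first_listening_shard,
                                  std::size_t connection_count) {
    assert(shard_count > 0 && first_listening_shard <= shard_count);
    std::vector<shard_stats> shards(shard_count);
    for (std::size_t id = first_listening_shard; id < shard_count; ++id) {
        shards[id].listening = true;
    }
    for (std::size_t conn = 0; conn < connection_count; ++conn) {
        shard_stats& target = shards[conn % shard_count];
        if (target.listening) {
            ++target.handled;
        } else {
            ++target.stranded;
        }
    }
    return shards;
}

// Sum of connections that never reached a listening shard.
std::size_t total_stranded(const std::vector<shard_stats>& shards) {
    std::size_t total = 0;
    for (const shard_stats& s : shards) {
        total += s.stranded;
    }
    return total;
}
```

Under this assumption, simulate(4, 1, 8) strands the connections routed to shard 0 while the other shards serve theirs, and simulate(4, 0, 8) strands none. The real distribution policy in Seastar may differ, so treat this as an intuition aid rather than a description of the implementation.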
Original Issue:
Take the following code:
(taken from https://github.com/Rjerk/seastar-tutorial/blob/master/network-write.cc)
It will not work.
and in another window:
Only once you change to boost::irange<unsigned>(0... instead of boost::irange<unsigned>(1... will it work.