-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rpc: optimize tuple deserialization when the types are default-constructible #2523
rpc: optimize tuple deserialization when the types are default-constructible #2523
Conversation
…uctible rpc deserialization cannot assume the types that make up the return tuple are default-constructuble, and cannot deserialize directly into the tuple constructor, so it is forced to construct a default-constructible tuple formed by wrapping every T with std::optional, deserializing into that, and then converting the temporary tuple into the return tuple by calling std::optional::value() for each element. This wrapping and unwrapping is wasteful, and while the compiler could theoretically fix everything up, in practice it does not. We notice that the first value can in fact be deserialized in the tuple constructor arguments, since there's no ordering problem for it. So we remove the std::optional wrapper for it unconditionally. For the rest of the elements, we wrap them with std::optional only if they are not default constructible. If they are, we leave them unchanged. Finally, the unwrapping process calls std::optional::value if the type was wrapped; and if none of the types were wrapped (which ought to be the common case), we return the temporary tuple without any unwrapping, reducing data movement considerably. The optimization is written in a way to also include the previous optimization when the tuple size was <= 1. Testing on ScyllaDB's messaging_service.o, we see another reduction in .text size: text data bss dec hex filename 6758116 48 236 6758400 672000 messaging_service.o.before 6741352 48 236 6741636 66de84 messaging_service.o.after About 17kB.
@@ -345,25 +345,86 @@ struct unmarshal_one { | |||
}; | |||
}; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We notice that the first value can in fact be deserialized in the tuple constructor arguments, since there's no ordering problem for it. So we remove the std::optional wrapper for it unconditionally.
I do not understand what ordering problem has to do with default constructability. Can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-first elements cannot be deserialized in the constructor since they suffer from the ordering problem; therefore they have to be default constructible (either naturally or by wrapping with std::optional). The first element has no ordering problem, so it can be deserialized in the constructor, so it need not be default constructible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the first element does not have ordering problem. May be I misunderstand what the "ordering problem" is. Can you define it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserializing element n+1 must happen after element n.
C++ doesn't define the order of argument evaluation, so we can't deserialize element n+1 in the constructor. So element n+1 must be default constructible.
We can however deserialize element n, for the case n=0, so we can drop the default constructible requirement for that element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserializing element n+1 must happen after element n.
OK. So I do understand the problem correctly. But why the first element does not suffer from it. You do not want to deserialize index 1 before index 0, no?
C++ doesn't define the order of argument evaluation, so we can't deserialize element n+1 in the constructor. So element n+1 must be default constructible.
We can however deserialize element n, for the case n=0, so we can drop the default constructible requirement for that element.
Ah, so first element is special because you want to optimize for the case where there is only one element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserializing element n+1 must happen after element n.
OK. So I do understand the problem correctly. But why the first element does not suffer from it. You do not want to deserialize index 1 before index 0, no?
I deserialze 0 in the constructor, and 1... in the comma expression.
C++ doesn't define the order of argument evaluation, so we can't deserialize element n+1 in the constructor. So element n+1 must be default constructible.
We can however deserialize element n, for the case n=0, so we can drop the default constructible requirement for that element.
Ah, so first element is special because you want to optimize for the case where there is only one element?
I want to remove std::optional wrappers. The first element can lose the wrapper unconditionally, the rest only if they are default constructible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserializing element n+1 must happen after element n.
OK. So I do understand the problem correctly. But why the first element does not suffer from it. You do not want to deserialize index 1 before index 0, no?
I deserialze 0 in the constructor, and 1... in the comma expression.
Ah, finally I get it. Since you de-serialize only one the order does not matter.
@xemul please review |
rpc deserialization cannot assume the types that make up the return tuple are default-constructuble, and cannot deserialize directly into the tuple constructor, so it is forced to construct a default-constructible tuple formed by wrapping every T with std::optional, deserializing into that, and then converting the temporary tuple into the return tuple by calling std::optional::value() for each element. This wrapping and unwrapping is wasteful, and while the compiler could theoretically fix everything up, in practice it does not.
We notice that the first value can in fact be deserialized in the tuple constructor arguments, since there's no ordering problem for it. So we remove the std::optional wrapper for it unconditionally.
For the rest of the elements, we wrap them with std::optional only if they are not default constructible. If they are, we leave them unchanged.
Finally, the unwrapping process calls std::optional::value if the type was wrapped; and if none of the types were wrapped (which ought to be the common case), we return the temporary tuple without any unwrapping, reducing data movement considerably.
The optimization is written in a way to also include the previous optimization when the tuple size was <= 1.
Testing on ScyllaDB's messaging_service.o, we see another reduction in .text size:
text data bss dec hex filename
6758116 48 236 6758400 672000 messaging_service.o.before
6741352 48 236 6741636 66de84 messaging_service.o.after
About 17kB.