-
-
Notifications
You must be signed in to change notification settings - Fork 777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incorrect representation of tuples with skipped fields #2520
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR and exemplary writeup.
Have you thought about how this direction relates to skip_serializing_if
? For example in the following case:
#[derive(serde_derive::Serialize)]
struct Tuple(
#[serde(skip_serializing_if = "...")]
Option<usize>,
#[serde(skip_serializing_if = "...")]
Option<usize>,
#[serde(skip_serializing_if = "...")]
Option<usize>,
);
would you consider that the correct behavior needs to serialize using serialize_newtype_struct
if exactly one of the functions returns false, and serialize_unit_struct
if all true, and serialize_tuple_struct
otherwise?
It is hard to say. I never used In any case the behavior of |
I am definitely on board with the Not yet convinced about the |
The problem that right now this changes are coupled and decoupling them requires to replace simple and straightforward code with something more complex. This will be especially noticeable on the part that fixes enum representations (coming soon) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not interested in changing the Tuple2as1
case at this time. Individual data formats can opt to handle that by treating serialize_tuple_struct(1)
the same as serialize_newtype_struct
, and deserialize_tuple_struct(1)
as deserialize_newtype_struct
, if they prefer.
Do you see any way to separate that from the Tuple1as0
fix?
Yes, that should be possible, just should introduce some checks and possible code duplication. Just to clarify: you don't like changing behavior at all, or |
|
I am not convinced about the last "this should increase performance" commit. Interleaving the skipped and non-skipped fields to construct them in the original source order still seems fine to me. Default objects which are expensive to construct seems like a very uncommon case to begin with, and the commit only makes any difference in the error case if I understand correctly, which is not something we micro-optimize like this. |
Ok, I removed that commit. I hope compiler will able to reorder operations.
I think the code is generated also because to shift to generation taking into account all these boring uncommon cases and be sure that everything will implemented efficiently. Shifting this to the user seems to me bad practice, especially since it costs nothing. |
Actually, this commit just unifies current behavior, because for structs deserialization using use serde_derive::Deserialize;
#[derive(Deserialize)]
struct Skip {
#[serde(skip_deserializing)]
skipped: i32,
#[serde(alias = "c")]
aliased: i32,
} will generate #[inline]
fn visit_seq<__A>(
self,
mut __seq: __A,
) -> _serde::__private::Result<Self::Value, __A::Error>
where
__A: _serde::de::SeqAccess<'de>,
{
let __field0 = _serde::__private::Default::default();
let __field1 = //...;
_serde::__private::Ok(Skip {
skipped: __field0,
aliased: __field1,
})
}
#[inline]
fn visit_map<__A>(
self,
mut __map: __A,
) -> _serde::__private::Result<Self::Value, __A::Error>
where
__A: _serde::de::MapAccess<'de>,
{
let mut __field1: _serde::__private::Option<i32> = _serde::__private::None;
//...
_serde::__private::Ok(Skip {
skipped: _serde::__private::Default::default(),
aliased: __field1,
})
} |
@dtolnay , all your review comments was resolved, could you look again? I just want to remind you that all changes in the generated code are easy to see if you follow the instructions in the first message with comparing the generated code |
@dtolnay, yet another gentleman's reminder... |
SeqRefDeserializer::deserialize_any has a special condition for empty sequence, which emits visit_unit. That condition assumes that type would be able to deserialized from unit, but: 1) struct variants was never able to deserialize from it (they expect only visit_map or visit_seq) 2) tuple variants even with zero fields expect visit_seq only. The suggestion to accept visit_unit instead was rejected in serde-rs#2520 Fixes (2): newtype_enum::tuple0 newtype_enum::empty_struct_from_seq
SeqRefDeserializer::deserialize_any has a special condition for empty sequence, which emits visit_unit. That condition assumes that type would be able to deserialized from unit, but: 1) struct variants was never able to deserialize from it (they expect only visit_map or visit_seq) 2) tuple variants even with zero fields expect visit_seq only. The suggestion to accept visit_unit instead was rejected in serde-rs#2520 Fixes (2): newtype_enum::tuple0 newtype_enum::empty_struct_from_seq
SeqRefDeserializer::deserialize_any has a special condition for empty sequence, which emits visit_unit. That condition assumes that type would be able to deserialized from unit, but: 1) struct variants was never able to deserialize from it (they expect only visit_map or visit_seq) 2) tuple variants even with zero fields expect visit_seq only. The suggestion to accept visit_unit instead was rejected in serde-rs#2520 Fixes (2): newtype_enum::tuple0 newtype_enum::empty_struct_from_seq
…tion (review this with whitespace changes ignored)
…ction and rename it
…n_place` Those functions too small and used only once (review this with whitespace changes ignored)
Changes in generated code (see the file attached to PR): Tuple1as0: Tuple1as0Default: Tuple1as0With: fixed visit_newtype_struct: use default value instead of deserializing it This fixes compilation error for Deserialize side of Tuple1as0(Skipped) tuple
Enums out of scope for now, because they have too many problems failures (1): tuple_struct::tuple2as1 compilation error (commented): tuple_struct::tuple1as0
…r to Tuple0() struct A side-effect: Tuple2as1(Skipped, x) now also can be deserialized using visit_newtype_struct Changes in generated code (see the file attached to PR): Tuple1as0: Tuple1as0Default: Tuple1as0With: removed visit_newtype_struct, *_newtype_struct -> *_tuple_struct(0) Tuple2as1: Tuple2as1Default: Tuple2as1With: added visit_newtype_struct This commit fixes compilation error and actually fixes the issue as it was reported in serde-rs#2105
This PR fixes #2105 and give some insights about the current situation with skipped fields.
You can see what changed in the generated code by running the following commands:
test_suite/tests
folder and change it extension to.rs
...\serde\test_suite> cargo expand --all-features --test skip > skip-N.rs
N
is some number.When commit message contains words "Changes in generated code" that means that output of the command above was changed. I recommend to diff it against the previous result and explore the difference. There will be 4 files -- baseline and three changed.
Overall changes over all commits:
Tuple1as0
,Tuple1as0Default
,Tuple1as0With
:visit_newtype_struct
*_newtype_struct
->*_tuple_struct(0)
Tuple2as1
,Tuple2as1Default
,Tuple2as1With
:visit_newtype_struct
The table below summarizes the current behavior and the new behavior, introduced in this PR. Empty cells in the last column means that the behavior wasn't changed.
The behavior columns shows which methods are used to serialize or deserialize a type. The number in parenthesis is a count of fields which passed to the corresponding method.
serialize_unit_struct
deserialize_unit_struct
visit_unit
serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq
ℹ️ It would be better to use Unit representation (see below)
serialize_newtype_struct
deserialize_newtype_struct
visit_newtype_struct
visit_seq
serialize_tuple_struct(2)
deserialize_tuple_struct(2)
visit_seq
#2105
serialize_newtype_struct
deserialize_newtype_struct
visit_newtype_struct
visit_seq
serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq
ℹ️ It would be better to use Unit representation (see below)
serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq
ℹ️ It would be better to use Unit representation (see below)
serialize_tuple_struct(1)
deserialize_tuple_struct(1)
visit_seq
serialize_tuple_struct(1)
deserialize_tuple_struct(1)
visit_newtype_struct
visit_seq
ℹ️ It would be better to use Newtype representation (see below)
serialize_struct
deserialize_struct
visit_seq
visit_map
serialize_struct
deserialize_struct
visit_seq
visit_map
serialize_struct
deserialize_struct
visit_seq
visit_map
serialize_struct
deserialize_struct
visit_seq
visit_map
Enum variant representations
The following changes also was implemented by I plan to ship them in separate PR due to some debating changes. I left my finding here to have the overall picture about skipped fields in one place.
The first debating change is to how to represent
tuple(0)
tuples? Should we represent them as a sequences with zero fields, or as a unit structs? Motivation for the unit structs is that that is the only way to skip all fields in a tuple and still represent it as a unit. This is currently done forEnum::Tuple1as0
, but only for it.Enum::Tuple2as0
no longer follows this agreement as well as all tuple structs, which introduces more confusion. I am inclined to this option, since to have an always empty sequence is somewhat strange.The second debating change, maybe we should make the behavior configurable? Introduce a new attribute like
#[serde(tuple)]
to producetuple(0)
andtuple(1)
forms fromTupleXas0
andTupleXas1
tuples?serialize_unit_variant
unit_variant
serialize_tuple_variant(0)
tuple_variant(0)
visit_seq
serialize_unit_variant
unit_variant
serialize_newtype_variant
newtype_variant
serialize_tuple_variant(2)
tuple_variant(2)
visit_seq
serialize_unit_variant
unit_variant
serialize_tuple_variant(0)
tuple_variant(0)
visit_seq
serialize_unit_variant
unit_variant
#2224
serialize_tuple_variant(1)
tuple_variant(1)
visit_seq
serialize_newtype_variant
newtype_variant
serialize_struct_variant(0)
struct_variant
visit_seq
visit_map
serialize_struct_variant(1)
struct_variant
visit_seq
visit_map
serialize_struct_variant(0)
struct_variant
visit_seq
visit_map
serialize_struct_variant(1)
struct_variant
visit_seq
visit_map
List of inconsistencies
As I said before, the current behavior contains a bunch of inconsistencies.
For example, an enum tuple variant with zero fields represented as a tuple with zero fields, but tuple variant with one skipped field represented as a unit. At the same time, tuple variant with 2 or more fields, all skipped, again represented as a tuple with zero fields. The same is applied to tuples with one effective field -- sometimes it is represented as a newtype, sometimes as a tuple with one field. I suggest to unify behavior across all cases that I summarized in the following table:
Tuple0()
Tuple1as0(Skipped)
Tuple2as0(Skipped, Skipped)
Tuple1(x)
x
Tuple2as1(Skipped, x)
Enum::Tuple0()
Enum::Tuple1as0(Skipped)
Enum::Tuple2as0(Skipped, Skipped)
Enum::Tuple1(x)
x
Enum::Tuple2as1(Skipped, x)
Legend: