Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow rename with str|bool|int type for internally tagged enums #1392

Closed
wants to merge 12 commits into from
Closed

Allow rename with str|bool|int type for internally tagged enums #1392

wants to merge 12 commits into from

Conversation

NotBad4U
Copy link

This is the WIP for #745 (my work is based on #973).
I have some troubles to understand what I have to do to complete the serde_derive/de.rs and serde_derive/se.rs modifications. I don't understand what I've to do in the method deserialize_identifier: serde_derive/src/de.rs#L2016 and if I have to modify other deserialize_* methods in serde_derive/de.rs. Does the serde_derive/de.rs part seem correct ?

For now the unit test doesn't compile and I get this error. Do I have to modify something in serde_derive/src/internals/ast.rs to pass the compilation ?

   Compiling serde_test_suite v0.0.0 (file:///home/escanor/Junk/serde/test_suite)
error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:32
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                ^^^^^^^^^
     |                                |
     |                                expected reference, found u64
     |                                help: consider borrowing here: `&Serialize`
     |
     = note: expected type `&_`
                found type `u64`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:32
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                ^^^^^^^^^
     |                                |
     |                                expected reference, found bool
     |                                help: consider borrowing here: `&Serialize`
     |
     = note: expected type `&_`
                found type `bool`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:43
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                           ^^^^^^^^^^^ expected str, found u64
     |
     = note: expected type `str`
                found type `u64`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:43
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                           ^^^^^^^^^^^ expected str, found bool
     |
     = note: expected type `str`
                found type `bool`

error: aborting due to 4 previous errors

@dtolnay
Copy link
Member

dtolnay commented Sep 19, 2018

Nice, this is on the right track. Regarding the test failure, you can use cargo-expand and the command cargo expand --test test_macros in the test_suite directory to see the generated code. I would recommend copying one of the non-working generated impls into a new minimal project, patching it up manually to compile and behave as intended, and then once the handwritten one works figuring out what modifications are needed in serde_derive to generate the modifications you had to make by hand.

@NotBad4U
Copy link
Author

@dtolnay Thx for the tips :)
And for the work on

fn deserialize_identifier(
I'm still blocking on what I have to do in deserialize_identifier method.

@dtolnay
Copy link
Member

dtolnay commented Sep 19, 2018

That deserialize_identifier function is generating the code that identifies which field of a struct or which variant of an enum we are looking at in the input. For example if the input contains a JSON string then we have a match expression like the following to map that string to a field or variant, taking into account any serde(rename = "...") attributes.

serde/serde_derive/src/de.rs

Lines 2244 to 2252 in 3f0f739

match __value {
#(
#field_strs => _serde::export::Ok(#constructors),
)*
_ => {
#value_as_str_content
#fallthrough_arm
}
}

The visit_other case is relevant when the user has something like:

#[derive(Deserialize)]
struct Demo {
    a: A,
    b: B,
    #[serde(flatten)]
    more: HashMap<Value, Value>,
}

where any map entries that are not one of the strings "a" or "b" need to be captured regardless of type and stored in more.

@NotBad4U
Copy link
Author

NotBad4U commented Oct 3, 2018

Thx @dtolnay for your cargo expand tips I was able to progress on this PR.

I set a unit test of a internally tagged enums with a lot of variant types to help me to work on this feature: test_internally_tagged_enum_renamed.
For now, the serialization pass all my assert_tokens but not the deserialization for variant with "complex" type like:

#[serde(rename=1)]
F(BTreeMap<String, String>),

Can you make a review of my deserialize_identifier ? I don't know if its the correct way to modify this method and I don't know if I have to deal with visit_u64 as I dealt with visit_bool.

Could you tell me where I have to put the unit tests for this feature (I saw that you have test_de.rs, test_ser.rs and test_annotations.rs) ?

@@ -21,6 +21,7 @@
extern crate syn;

extern crate proc_macro2;
extern crate quote;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an external crate quote needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello 😄
The quote is used in serde_derive/src/internals/attr.rs#L153 :

impl ToTokens for VariantNameType {
    fn to_tokens(&self, out: &mut TokenStream) {
        match *self {
            VariantNameType::Str(ref s) => s.to_tokens(out),
            VariantNameType::Bool(b) => {
                b.to_tokens(out);
            },
            VariantNameType::Int(i) => {
                i.to_tokens(out);
            }
        }
    }
}

serde_derive_internal/ has a link to ./serde_derive/src/internals/ so if we need to build serde_derive_internal alone that will not compile without the extern crate quote;. If the sub crate serde_derive_internal is not meant to be built alone I'll remove this dependency.

@dtolnay
Copy link
Member

dtolnay commented Nov 25, 2018

Looks good so far!

I tried this against serde_json using the following code, but got an error during deserialization.

extern crate serde;
extern crate serde_derive;
extern crate serde_json;

use serde_derive::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "success")]
enum Outcome {
    #[serde(rename = true)]
    Success,
    #[serde(rename = false)]
    Failure,
}

fn main() {
    let j = r#" {"success":true} "#;
    match serde_json::from_str::<Outcome>(j) {
        Ok(outcome) => println!("{:?}", outcome),
        Err(err) => eprintln!("{}", err),
    }
}
invalid type: boolean `true`, expected variant identifier at line 1 column 16

Some comments inline in the generated code:

    impl<'de> _serde::Deserialize<'de> for Outcome {
        fn deserialize<__D>(__deserializer: __D) -> _serde::export::Result<Self, __D::Error>
        where
            __D: _serde::Deserializer<'de>,
        {
            #[allow(non_camel_case_types)]
            enum __Field {
                __field0,
                __field1,
            }
            struct __FieldVisitor;

// I would omit visit_u64, visit_str, visit_bytes and only keep
// visit_bool. That way we get the Visitor trait's default invalid type
// error message, since those really correspond to an invalid type
// rather than an invalid value for the variant tag.
            impl<'de> _serde::de::Visitor<'de> for __FieldVisitor {
                type Value = __Field;
                fn expecting(
                    &self,
                    __formatter: &mut _serde::export::Formatter,
                ) -> _serde::export::fmt::Result {
                    _serde::export::Formatter::write_str(__formatter, "variant identifier")
                }
                fn visit_u64<__E>(self, __value: u64) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => _serde::export::Err(_serde::de::Error::invalid_value(
                            _serde::de::Unexpected::Unsigned(__value),
                            &"variant index 0 <= i < 2",
                        )),
                    }
                }
                fn visit_bool<__E>(self, __value: bool) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        true => _serde::export::Ok(__Field::__field0),
                        false => _serde::export::Ok(__Field::__field1),

// This pattern is unreachable and should be left out.
                        _ => {
                            let __value = &_serde::export::from_bool(__value);
                            _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS))
                        }
                    }
                }
                fn visit_str<__E>(self, __value: &str) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS)),
                    }
                }
                fn visit_bytes<__E>(self, __value: &[u8]) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => {
                            let __value = &_serde::export::from_utf8_lossy(__value);
                            _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS))
                        }
                    }
                }
            }
            impl<'de> _serde::Deserialize<'de> for __Field {
                #[inline]
                fn deserialize<__D>(__deserializer: __D) -> _serde::export::Result<Self, __D::Error>
                where
                    __D: _serde::Deserializer<'de>,
                {

// This line is why the serde_json snippet above is failing. JSON
// represents identifiers as strings, so we are telling it to expect a
// string but the input contains a boolean. This line should say
// Deserializer::deserialize_bool. If the user's enum mixes different
// types of tags, this should say Deserializer::deserialize_any.
                    _serde::Deserializer::deserialize_identifier(__deserializer, __FieldVisitor)
                }
            }
            const VARIANTS: &'static [&'static str] = &["true", "false"];
            let __tagged = try!(_serde::Deserializer::deserialize_any(
                __deserializer,
                _serde::private::de::TaggedContentVisitor::<__Field>::new("success"),
            ));
            match __tagged.tag {
                __Field::__field0 => {
                    try!(_serde::Deserializer::deserialize_any(
                        _serde::private::de::ContentDeserializer::<__D::Error>::new(__tagged.content,),
                        _serde::private::de::InternallyTaggedUnitVisitor::new("Outcome", "Success"),
                    ));
                    _serde::export::Ok(Outcome::Success)
                }
                __Field::__field1 => {
                    try!(_serde::Deserializer::deserialize_any(
                        _serde::private::de::ContentDeserializer::<__D::Error>::new(__tagged.content,),
                        _serde::private::de::InternallyTaggedUnitVisitor::new("Outcome", "Failure"),
                    ));
                    _serde::export::Ok(Outcome::Failure)
                }
            }
        }
    }

@dtolnay
Copy link
Member

dtolnay commented Nov 25, 2018

Could you tell me where I have to put the unit tests for this feature (I saw that you have test_de.rs, test_ser.rs and test_annotations.rs) ?

Where you currently have it is fine. Those files are a bit of a mess -- at some point we will need to tidy up and organize them in a more sensible way.

Generate the true|false arms to avoid pattern true|false not covered error when the enum has only true or false arms
}
}

fn visit_bytes<__E>(self, __value: &[u8]) -> _serde::export::Result<Self::Value, __E>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the visit_bytes use the constructors_strstoo so I put them together.

We verify if the len > 0 to avoid a runtime failure
@NotBad4U
Copy link
Author

Hi @dtolnay 😃 , the code you tried with serde_json now work fine. I made some tests on my side (deserialization + serialization on complex enum) and It worked.
Now I have to fix some tests which fails:

---- test_flatten_struct_enum stdout ----
thread 'test_flatten_struct_enum' panicked at 'tokens failed to deserialize: invalid type: string "insert_integer", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_enum_newtype stdout ----
thread 'test_flatten_enum_newtype' panicked at 'tokens failed to deserialize: invalid type: string "Q", expected field identifier', serde_test/src/assert.rs:193:19
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- test_flatten_internally_tagged stdout ----
thread 'test_flatten_internally_tagged' panicked at 'tokens failed to deserialize: invalid type: string "typeX", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_option stdout ----
thread 'test_flatten_option' panicked at 'tokens failed to deserialize: invalid type: string "inner1", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_map_twice stdout ----
thread 'test_flatten_map_twice' panicked at 'tokens failed to deserialize: invalid type: string "x", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_untagged_enum stdout ----
thread 'test_flatten_untagged_enum' panicked at 'tokens failed to deserialize: invalid type: string "a", expected field identifier', serde_test/src/assert.rs:193:19

---- test_lifetime_propagation_for_flatten stdout ----
thread 'test_lifetime_propagation_for_flatten' panicked at 'tokens failed to deserialize: invalid type: string "x", expected field identifier', serde_test/src/assert.rs:193:19

I do not understand what expected field identifier means in the error messages. Could you clarify this for me ?

My second point is on the behavior of the API. What have to be done when we have 2 variants with the same type and value ?
e.g.: 2 * rename=true

#[derive(Deserialize, Debug)]
#[serde(tag = "success")]
enum Outcome {
    #[serde(rename = true)]  // first occurs
    A,
    #[serde(rename = true)] // secund occurs
    B,
}

For know, the build pass 😞 and the first variant is select all the time (A). If we allow only one rename=true where should I put this guard ?

@hcpl
Copy link
Contributor

hcpl commented Dec 10, 2018

I do not understand what expected field identifier means in the error messages. Could you clarify this for me ?

In Serde, a field identifier is a value that can be used to distinguish fields. It's used for key-value based formats that might not keep the original order, like JSON.

If we allow only one rename=true where should I put this guard ?

The attribute-checking machinery resides in https://github.com/serde-rs/serde/blob/e1edb0282ad353fd5f4539d2456f3a0ad90f7e54/serde_derive/src/internals/attr.rs where Attr::set already handles duplicate attributes if the value field already contains one.

Adding new checks is done by adding new match arms in Container::from_ast, Variant::from_ast and/or Field::from_ast.

@NotBad4U
Copy link
Author

So I don't understand why a structure like this one (from test_flatten_internally_tagged) with a flatten variant attributes encounter a problem with deserialization

---- test_flatten_internally_tagged stdout ----
thread 'test_flatten_internally_tagged' panicked at 'tokens failed to deserialize: invalid type: string "typeX", expected field identifier', serde_test/src/assert.rs:193:19
note: Run with `RUST_BACKTRACE=1` for a backtrace.
#[derive(Serialize, Deserialize, PartialEq, Debug)]
struct S {
    #[serde(flatten)]
    x: X,
    #[serde(flatten)]
    y: Y,
}

#[derive(Serialize, Deserialize, PartialEq, Debug)]
#[serde(tag = "typeX")]
enum X {
    A { a: i32 },
    B { b: i32 },
}

#[derive(Serialize, Deserialize, PartialEq, Debug)]
#[serde(tag = "typeY")]
enum Y {
    C { c: i32 },
    D { d: i32 },
}

This is the list of tokens expected:

assert_tokens(
     &s,
     &[
         Token::Map { len: None },
         Token::Str("typeX"),
         Token::Str("B"),
         Token::Str("b"),
         Token::I32(1),
         Token::Str("typeY"),
         Token::Str("D"),
         Token::Str("d"),
         Token::I32(2),
         Token::MapEnd,
     ],
);

This tokens list match with the ouput of value.serialize() (from assert_ser_tokens) but not the ouput of T::deserialize from (assert_de_tokens). How can I debug this further ? 😄

NOTE: All the tests that do not pass are related to the flatten variant attributes.

@dtolnay
Copy link
Member

dtolnay commented Jan 13, 2019

@hcpl would you be able to help get this reviewed and ready to land?

@kpozin
Copy link

kpozin commented Sep 8, 2019

This would be a handy feature! Any chance the PR could be revived?

@jaen
Copy link

jaen commented Sep 17, 2019

I need this quite badly (API I'm writing client for uses this pattern), badly enough to volunteer for updating this PR if it would mean it getting merged. Would just need guidance on what to do, since I don't have much in the way of practical Rust experience yet.

@dtolnay
Copy link
Member

dtolnay commented Oct 5, 2019

@kpozin @jaen, I am still interested in landing this feature but I have been bottlenecked on review bandwidth for many months, and this feature has been relatively low value for how much review effort it demanded so I haven't been able to give it more attention.

Here are some things that would help drive this forward:

  • Get the PR rebased, passing tests, and rustfmt'd;
  • Assess whether it is possible for this PR to affect generated impls of any type that does not use bool or int renames; I would hope that this is not the case;
  • Assess whether the test suite is adequate or are there edge cases that are not covered;
  • A round of code review from someone who is not me but knows the Deserialize api, on whether this implementation roughly makes sense.

@jaen
Copy link

jaen commented Oct 8, 2019

@dtolnay I actually got the PR rebased (well, squashed and rebased, it was simpler that way) and compiling, but I had problems getting tests to work.

I ended up writing the impl manually instead, because I discovered #1221 will be blocking me anyway. I can publish the rebased code somewhere and/or try to fix the tests in my spare time (not much of that though, considering it's not blocking me at work for now) if that helps.

@dtolnay
Copy link
Member

dtolnay commented Oct 8, 2019

@jaen that's awesome -- could you share your rebased code in a new PR? It doesn't have to be perfect.

@jaen
Copy link

jaen commented Oct 9, 2019

@dtolnay It's what I have right now – https://github.com/jaen/serde/pull/1/files. It compiles, but has some issues with deserialization tests – as I understand the deserializer produces a stream of tokens that are used to validate the deserializations and they do not match. These are the failing tests I didn't have time to debug yet.

@dtolnay
Copy link
Member

dtolnay commented Oct 9, 2019

Thanks! I copied it over to #1644 so we don't lose the work if you decide to delete your fork later.

@mbenoukaiss
Copy link

I spent a few hours doing this so I thought it may be useful to other people in the future since this PR has not received any modification for almost a year.

I generated the code for a tagged enum and adapted it to work with integers (it can also be modified to work with booleans). However I'm using two structs from private API ContentDeserializer and TaggedContentVisitor, to achieve this, should I worry it might break in a future serde patch? Is there a better way to do it?
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ace054282891ad2e26a184f6487dc88a

@kangalio
Copy link

I think this PR is obsolete in favor of #2056

@NotBad4U NotBad4U closed this by deleting the head repository Mar 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants