Support a #[serde(validate = "some_function")] attribute on fields #939

sfackler · 2017-05-19T20:32:01Z

I occasionally run into fields in a serde-deserializable type that require some extra validation after deserialization, for example "a u32 >= 2". The way to do this now is to make a custom deserialize function that deserializes and then does the check:

use serde::{de, Deserialize};

#[derive(Deserialize)]
struct Thing {
    #[serde(deserialize_with = "validate_foo")]
    foo: u32,
}

fn validate_foo<'de, D>(d: D) -> Result<u32, D::Error>
    where D: de::Deserializer<'de>
{
    // Deserialize with the standard u32 implementation first...
    let value = u32::deserialize(d)?;

    // ...then reject anything below the minimum.
    if value < 2 {
        return Err(de::Error::invalid_value(de::Unexpected::Unsigned(value as u64),
                                            &"a value at least 2"));
    }

    Ok(value)
}

However, it'd be nice if it were possible to specifically tell serde to use the standard deserializer but apply a validation function after the field is deserialized:

#[derive(Deserialize)]
struct Thing {
    #[serde(validate = "validate_foo")]
    foo: u32,
}

fn validate_foo<E>(value: &u32) -> Result<(), E>
    where E: de::Error
{
    if *value < 2 {
        Err(de::Error::invalid_value(de::Unexpected::Unsigned(*value as u64),
                                     &"a value at least 2"))
    } else {
        Ok(())
    }
}

Keats · 2017-05-28T15:16:14Z

That's very similar to one of my crates (https://github.com/Keats/validator), with a WIP new version at Keats/validator#27.

lucab · 2017-05-30T11:30:01Z

I guess this is a duplicate of #642.

dtolnay · 2017-05-30T14:45:56Z

I don't think this is a duplicate of #642. This is for a field attribute and that one is for a container attribute. They are useful in different situations.

TedDriggs · 2017-05-31T15:43:30Z

@sfackler, how would you feel about this:

#[derive(Deserialize)]
struct Thing {
    #[serde(and_then = "validate_foo")]
    foo: u32,
}

fn validate_foo<E>(value: u32) -> Result<u32, E>
    where E: de::Error
{
    if value < 2 {
        Err(de::Error::invalid_value(de::Unexpected::Unsigned(value as u64),
                                     &"a value at least 2"))
    } else {
        Ok(value)
    }
}

That would enable both validation and post-deserialize transformation, which is something I've found myself reaching for in the past.
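To make the shape of that hook concrete without relying on an attribute that serde does not provide, here is a minimal std-only sketch. It uses `Result::and_then` from the standard library to chain a parse step with a validator of the proposed `T -> Result<T, E>` form. `parse_foo` is a hypothetical stand-in for "deserialize the field, then run the hook", and the `String` error type is an assumption chosen to keep the example free of dependencies.

```rust
/// Validator with the shape proposed above: take the value by ownership and
/// hand it back (unchanged or transformed), or return an error.
fn validate_foo(value: u32) -> Result<u32, String> {
    if value < 2 {
        Err(format!("invalid value: {}, expected a value at least 2", value))
    } else {
        Ok(value)
    }
}

/// Hypothetical stand-in for "deserialize, then run the hook": parse the
/// input first, then chain the validator with `Result::and_then`.
fn parse_foo(input: &str) -> Result<u32, String> {
    input
        .trim()
        .parse::<u32>()
        .map_err(|e| e.to_string())
        .and_then(validate_foo)
}
```

Because the validator returns the value rather than `()`, the same slot could also hold a function that normalizes or converts the value before handing it back.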

avkonst · 2021-04-19T23:18:04Z

I think this issue should be closed, as it seems there is a solution with and_then. Otherwise, it gives the impression that it still requires work.

TedDriggs · 2021-04-27T13:05:14Z

@avkonst it appears that #1858 was closed without being merged, meaning this particular issue probably will need to instead reference a documentation update.

@dtolnay your logic for individual fields in this comment makes sense; the place where I've found myself reaching for and_then is with complex objects such as the OASIS CACAO specification. That has requirements such as:

If both valid_from and valid_until are specified, then valid_from MUST be less than valid_until
All variables mentioned in a workflow step MUST be defined either in the playbook's variables or in the step's variables

If I'm also providing some other avenue for making these objects, then I likely already have a function with the signature validate(Playbook) -> Result<Playbook, AnErrSerdeCanWorkWith> in existence, and I'd like to have serde run the struct through that.
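As a rough illustration of such a function, the sketch below checks the two requirements quoted above. Every field name and type here (`valid_from` as an optional `u64`, `used_variables`, the `Step` struct, the `String` error) is an assumption made for the example and is not taken from the CACAO specification or from any real crate.

```rust
use std::collections::HashSet;

// All names and types below are illustrative assumptions.
#[derive(Debug)]
pub struct Step {
    pub variables: HashSet<String>,  // variables defined on the step itself
    pub used_variables: Vec<String>, // variables the step mentions
}

#[derive(Debug)]
pub struct Playbook {
    pub valid_from: Option<u64>, // e.g. seconds since the epoch
    pub valid_until: Option<u64>,
    pub variables: HashSet<String>, // variables defined on the playbook
    pub steps: Vec<Step>,
}

/// Takes the playbook by value and returns it only if both rules hold.
pub fn validate(playbook: Playbook) -> Result<Playbook, String> {
    // Rule 1: if both bounds are present, valid_from must precede valid_until.
    if let (Some(from), Some(until)) = (playbook.valid_from, playbook.valid_until) {
        if from >= until {
            return Err(format!(
                "valid_from ({}) must be less than valid_until ({})",
                from, until
            ));
        }
    }
    // Rule 2: every variable a step mentions must be defined on the playbook
    // or on that step.
    for step in &playbook.steps {
        for name in &step.used_variables {
            if !playbook.variables.contains(name) && !step.variables.contains(name) {
                return Err(format!("variable `{}` is not defined", name));
            }
        }
    }
    Ok(playbook)
}
```

The error type only needs to be something that can be turned into a serde error; a `String` is used here purely for brevity.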

An alternative would be a custom Deserialize impl on the Playbook struct, like this:

pub struct Playbook {
    // fields
}

impl<'de> Deserialize<'de> for Playbook {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        #[derive(Deserialize)]
        struct UncheckedPlaybook {
            // fields...
        }

        impl TryFrom<UncheckedPlaybook> for Playbook {
            type Error = AnErrorThatImplsDisplay;

            fn try_from(_: UncheckedPlaybook) -> Result<Self, Self::Error> {
                // elided...
            }
        }

        UncheckedPlaybook::deserialize(deserializer)?.try_into().map_err(D::Error::custom)
    }
}

That works, but when the struct in question has a lot of fields that's code drift waiting to happen. I've considered making a crate to solve this that would allow the generation of a replica struct with different serde attributes, but that seems like overkill for the common case.

AsafFisher · 2021-05-18T23:10:29Z

deserialize_with can be used for validation, please close this.

EqualMa · 2021-10-10T16:46:11Z

deserialize_with is not easy for me to use.
But I find try_from can be easily used to validate types and fields.

#[derive(Deserialize)]
#[serde(try_from = "String")] // Tell serde to deserialize data into a String and then try to convert it into Email
pub struct Email(String);

impl TryFrom<String> for Email {
    type Error = String;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        if validate_email(&value) {
            Ok(Self(value))
        } else {
            Err(format!("Invalid email {}", value))
        }
    }
}

Then Email can be deserialized from a string and it will be validated.

You can also use it as the type of a field. It will be validated when the struct is deserialized.

#[derive(Deserialize)]
pub struct User {
    name: String,
    email: Email,
}

For full working code, check out this repo.
I also wrote a detailed guide.

When a `Stack` struct has a `StackId` of `*` then it should have no mixins. This is validated by deserializing through a "shadow" struct, working from https://dev.to/equalma/validate-fields-and-types-in-serde-with-tryfrom-c2n, based on the comment from serde-rs/serde#939 (comment).

In my own words it works like this: We create a duplicate struct and use serde's normal `derive(Deserialize)` on it. Then we specify that the `Stack` struct should be converted using the duplicate struct with `#[serde(try_from = "StackUnchecked")]`. Finally we implement `try_from` and include our special logic that checks that we don't have both a `*` stack id and any mixins.

I also hard wrapped a few comments and put backticks around special characters in the error output to make them clearer.

Close #104

schneems · 2021-10-20T18:10:37Z

@EqualMa thank you so much for that comment and the full write-up. I really like that approach. I'm a new-ish Rust programmer (decade in Ruby). The approach seems to work well.

I'm curious what it would take to be able to close out several of these requests for validation issues. I'm wondering would adding official documentation on that approach be enough for now?

I think in the future maybe this approach could be simplified/automated even more, and could be wrapped into some more specific "validation" feature. For example, maybe automating the generation of the "shadow" unchecked struct and the core "try from" logic. But the programmer has to implement the Ok/Err logic either way. All that considered TBH it's not a ton of boilerplate needed to use this approach.

What do you all think? Document this today to be able to close out some of these feature issues? If someone wants to further add some automation in the future PRs welcome?

## Deserializing error background

When we deserialize TOML into structs there are requirements from the spec. For instance in https://github.com/Malax/libcnb.rs/pull/109 we see that this is invalid:

```
[[stacks]]
id = "*"
mixins = ["yolo"]
```

It is invalid because a stack cannot have both an id of `*` and mixins. We need a consistent place to enforce these rules, and the most straightforward approach has been to check the data at deserialize time.

## Existing approach

This deserialize validation behavior is already happening in the buildpack. For example, the `Deserialize` trait is manually implemented on `BuildpackAPI` here: https://github.com/Malax/libcnb.rs/blob/2ca9fff2d449889ae0b9893326b068e43f370238/src/data/buildpack.rs#L86-L151.

## This approach

As part of the work with #109 it was found that serde has `try_from`, which can be used to simplify this logic. Instead of having to manually implement all of Deserialize, the technique is to:

- Introduce an "unchecked" struct that holds the raw data before validation
- Implement a conversion between the "unchecked" struct and the "real" struct using TryFrom
- Implement the validation and error checking logic inside of TryFrom
- Set `#[serde(try_from=` on the "real" struct

A much longer example is demonstrated in this blog post https://dev.to/equalma/validate-fields-and-types-in-serde-with-tryfrom-c2n as pulled from serde-rs/serde#939 (comment).

Close #110
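The steps listed in that description can be sketched roughly as follows. This is a std-only illustration of the conversion and the `*`/mixins rule, not the actual libcnb.rs code: the field names and types are assumptions based on the TOML shown above, the `String` error is a placeholder, and the serde attributes are mentioned only in comments so the block compiles without dependencies.

```rust
use std::convert::TryFrom;

/// Raw data as it appears in the TOML, before any validation.
/// With serde, this struct would carry `#[derive(Deserialize)]`.
pub struct StackUnchecked {
    pub id: String,
    pub mixins: Vec<String>,
}

/// The validated "real" struct. With serde, it would carry
/// `#[derive(Deserialize)]` plus `#[serde(try_from = "StackUnchecked")]`.
#[derive(Debug, PartialEq)]
pub struct Stack {
    pub id: String,
    pub mixins: Vec<String>,
}

impl TryFrom<StackUnchecked> for Stack {
    // serde's `try_from` needs an error type that implements `Display`;
    // `String` is used here only to keep the sketch short.
    type Error = String;

    fn try_from(raw: StackUnchecked) -> Result<Self, Self::Error> {
        // The "any" stack must not declare mixins.
        if raw.id == "*" && !raw.mixins.is_empty() {
            return Err(format!(
                "stack with id `*` cannot have mixins, found {:?}",
                raw.mixins
            ));
        }
        Ok(Stack { id: raw.id, mixins: raw.mixins })
    }
}
```

All validation lives in one `try_from` body, so the same check also applies to code that builds a `Stack` by hand through `Stack::try_from`.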

rapiz1 · 2021-12-06T13:35:31Z

@dtolnay's comment #1858 (review) provides a quick example of validation using deserialize_with. I feel the need to link it here to help those who are looking for a solution.

Maybe this should be summed up and added to the documentation.

schneems · 2021-12-07T19:46:47Z

> Maybe this should be summed up and added to the documentation.

Yes, I agree. I feel there are several issues and PRs that we could close out as "good enough" with some existing solutions documented.

CobaltCause · 2022-06-17T23:50:07Z

I just realized that one can use const generics for integer validators; a bit much to declare but pretty nice to use.
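The code this comment linked to is not preserved here, so the following is only a guess at what a const-generic integer validator might look like. The `BoundedUsize` name is borrowed from a later comment in this thread; the `new`, `clamp`, and `get` methods and the `String` error type are assumptions for illustration.

```rust
use std::convert::TryFrom;

/// A `usize` guaranteed to lie in `MIN..=MAX`.
/// Name and API are illustrative assumptions, not the linked code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BoundedUsize<const MIN: usize, const MAX: usize>(usize);

impl<const MIN: usize, const MAX: usize> BoundedUsize<MIN, MAX> {
    /// Returns `None` when `value` falls outside the bounds.
    pub fn new(value: usize) -> Option<Self> {
        if (MIN..=MAX).contains(&value) {
            Some(Self(value))
        } else {
            None
        }
    }

    /// Forces `value` into the bounds. `Ord::clamp` panics if `MIN > MAX`.
    pub fn clamp(value: usize) -> Self {
        Self(value.clamp(MIN, MAX))
    }

    pub fn get(self) -> usize {
        self.0
    }
}

impl<const MIN: usize, const MAX: usize> TryFrom<usize> for BoundedUsize<MIN, MAX> {
    type Error = String;

    fn try_from(value: usize) -> Result<Self, Self::Error> {
        Self::new(value).ok_or_else(|| format!("{} is not in {}..={}", value, MIN, MAX))
    }
}
```

With a `TryFrom<usize>` impl in place, the type could presumably be hooked into serde through the existing `#[serde(try_from = "usize")]` container attribute, so a field declared as `BoundedUsize<1, 3>` would be range-checked during deserialization.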

ku1ik · 2022-08-03T11:47:07Z

I feel like conflating parsing (aka deserialization, what serde was built for) with value validation is not the right way to go. I know it's tempting to combine the two operations, and it'd be convenient, but convenience invites complexity. Validation could easily be done as a second step, on the serde-deserialized data structure. @Keats's validator crate seems nice (my understanding is that you call .validate() on the struct after serde has successfully deserialized it).
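For comparison, the two-step approach described here might look roughly like the sketch below. The `Validate` trait is a hypothetical local stand-in, not the validator crate's actual API, and `Thing` reuses the "u32 at least 2" rule from the top of the thread.

```rust
/// Hypothetical local trait standing in for a validation library's API.
pub trait Validate {
    fn validate(&self) -> Result<(), String>;
}

/// Plain data, as a `#[derive(Deserialize)]` struct would produce it.
pub struct Thing {
    pub foo: u32,
}

impl Validate for Thing {
    fn validate(&self) -> Result<(), String> {
        if self.foo < 2 {
            Err(format!("foo must be at least 2, got {}", self.foo))
        } else {
            Ok(())
        }
    }
}
```

The trade-off is that nothing forces a caller to run `validate()` before using a `Thing`, which is the gap the newtype-based replies in this thread are concerned with.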

ssokolow · 2022-08-03T12:11:52Z

Personally, I think that what people are asking for as "validation" is more a subset of parsing.

Parsing a sequence of UTF-8 bytes into an XML DOM doesn't become "validation" just because < or > or & are only valid in certain positions, and an argument can be made that one shouldn't need to newtype everything just to enforce that an integer named week_of_year only has a certain valid range.

...that said, my approach generally is to newtype everything since that both lets me push "validation" into Serde and allows the compiler to prevent conflation of values. Parse, don't validate.

CobaltCause · 2022-08-03T15:51:26Z

("Parse, don't validate" is probably one of my top 3 favorite blog posts of all time.)

Another reason I favor newtypes over a second "validation" step is that without newtypes you can write this code:

let x = Stuff {
    between_1_and_3: 200,
};

which is clearly wrong. But with my example above, you can do either of:

between_1_and_3: BoundedUsize::clamp(200),

between_1_and_3: BoundedUsize::new(200).expect("out of bounds"),

And you could feasibly write a macro like this that would cause a compile error instead of a runtime one:

between_1_and_3: bounded_usize!(200),

I think the common name for this pattern is "correct by construction", which seems more in line with the way the rest of Rust works.
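One way such a macro could be written is to run the bounds check inside a `const` item, so that an out-of-range literal fails during constant evaluation. This is a sketch under assumptions: the minimal `BoundedUsize` definition is repeated so the block stands alone, and unlike the one-argument form shown above, this macro takes the bounds explicitly because a `const` item needs a fully spelled-out type.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BoundedUsize<const MIN: usize, const MAX: usize>(usize);

impl<const MIN: usize, const MAX: usize> BoundedUsize<MIN, MAX> {
    /// `const fn`, so the check can run at compile time.
    pub const fn new(value: usize) -> Option<Self> {
        if value >= MIN && value <= MAX {
            Some(Self(value))
        } else {
            None
        }
    }

    pub const fn get(self) -> usize {
        self.0
    }
}

/// Hypothetical macro: `bounded_usize!(min, max, value)`.
/// The `const` item is evaluated by the compiler, so a `panic!` reached
/// there is reported as a compile error rather than a runtime failure.
macro_rules! bounded_usize {
    ($min:expr, $max:expr, $value:expr) => {{
        const CHECKED: BoundedUsize<{ $min }, { $max }> =
            match BoundedUsize::<{ $min }, { $max }>::new($value) {
                Some(bounded) => bounded,
                None => panic!("value is out of bounds"),
            };
        CHECKED
    }};
}
```

With this definition, `bounded_usize!(1, 3, 2)` yields a `BoundedUsize<1, 3>`, while `bounded_usize!(1, 3, 200)` is rejected when the constant is evaluated.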

kanarus · 2024-08-30T15:40:49Z

"Parse, don't validate" is ideal in many situations, but if you need serde with validation, check out SerdeV ( https://github.com/ohkami-rs/serdev ), essentially serde with #[serde(validate = "some_function")] container attribute. This allows you to validate automatically in deserialization with no boilerplate.

dtolnay added the enhancement label May 19, 2017

Libbum mentioned this issue Jun 13, 2017

Step listing in config file Libbum/Wafer#6

Closed

asymmetric mentioned this issue Feb 14, 2018

Validate gasLimitBoundDivisor != 0 openethereum/parity-ethereum#7882

Closed

ordian mentioned this issue Oct 8, 2018

EIP-712 implementation openethereum/parity-ethereum#9631

Merged

1 task

andreeaflorescu mentioned this issue Jan 10, 2019

x86_64: limit maximum number of supported CPUs firecracker-microvm/firecracker#831

Merged

jwillbold mentioned this issue Jul 16, 2020

Implemented 'and_then' struct field attribute, fixing #939 #1858

Closed

schneems mentioned this issue Oct 19, 2021

StackId should support the "any" stack ("*") heroku/libcnb.rs#104

Closed

schneems mentioned this issue Oct 19, 2021

Add support for "any" stack * in StackId heroku/libcnb.rs#109

Merged

schneems mentioned this issue Oct 20, 2021

Refactor deserializing validation checking heroku/libcnb.rs#110

Closed

sffc mentioned this issue Oct 9, 2022

Add validate_with to derive(ULE) and derive(VarULE) unicode-org/icu4x#2733

Draft

nick-mobilecoin mentioned this issue Oct 17, 2022

Initial work on supporting nested multi-sigs mobilecoinfoundation/mobilecoin#2657

Closed

winksaville mentioned this issue Jan 24, 2023

MsgHeader::id is not used to validate deserialization winksaville/exper_inter_process_channel#1

Open

yuja mentioned this issue Aug 12, 2023

Don't snapshot new files larger than 1MiB by default martinvonz/jj#1769

Merged

4 tasks

yuja mentioned this issue Apr 4, 2024

cli: improve, document, and bugfix snapshot.max-new-file-size martinvonz/jj#3444

Merged

2 tasks

brussee mentioned this issue Jun 10, 2024

Feature: Validate extracted data from request salvo-rs/salvo#168

Open

oli-obk added the docs label Sep 3, 2024
