Test encoding/decoding round-tripping using new hypothesis strategies. #2379

Hultner · 2021-02-19T15:42:12Z

Idea / background

Improve pydantic's encoding/decoding of JSON objects, and safeguard it against regressions, using property-based testing.
Encoding a value and then decoding it again without losing data is perhaps one of the most classic examples of where property-based testing fits perfectly.

Suggestion

1. Create a random pydantic model generator for hypothesis.
2. Use the hypothesis strategy from Generating Pydantic-specific types with Hypothesis #2017 to generate input for the random model and create an instance.
3. Encode the model instance to JSON.
4. Parse a new instance of the model from the encoded JSON data.
5. Ensure that the original instance is equal to the instance parsed from the JSON data.
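
For a single fixed model, the steps above can be sketched roughly as follows. The `Point` model and its fields are hypothetical, the integer bounds are a conservative assumption, and the calls use the pydantic v1-style `.json()` / `parse_raw()` API mentioned in this thread (newer pydantic releases use different method names).

```python
from hypothesis import given, strategies as st
from pydantic import BaseModel


class Point(BaseModel):
    """Hypothetical model, used only to illustrate the round trip."""

    x: int
    label: str


# Explicit strategies per field; bounds on the integers are an assumption
# made to keep the sketch conservative.
points = st.builds(
    Point,
    x=st.integers(min_value=-(2**31), max_value=2**31 - 1),
    label=st.text(),
)


@given(points)
def test_point_roundtrip_via_json(instance):
    encoded = instance.json()          # encode the instance to JSON
    decoded = Point.parse_raw(encoded)  # parse a new instance from the JSON
    assert decoded == instance          # nothing was lost on the way
```

A random model generator would replace the fixed `Point` class; the property being asserted stays the same.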

This should reasonably always hold true, or we'd have data loss in the process. If certain types are lossy by design, we'd either have to exclude them from the tests or write slightly more sophisticated assertions that compare only the non-lossy parts. This would have the added benefit of explicitly defining and documenting which parts of the conversion are lossy and which are not.
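
As an analogy for what a "lossy by design" conversion and a lossy-aware assertion might look like, here is a small sketch using only the standard library `json` module rather than pydantic. The `roundtrip` helper and the sample data are made up for illustration.

```python
import json


def roundtrip(value):
    """Encode a value to JSON text and decode it again."""
    return json.loads(json.dumps(value))


original = {"name": "a", "coords": (1, 2)}
decoded = roundtrip(original)

# JSON has no tuple type, so the tuple comes back as a list: a strict
# equality check fails even though no element data was lost.
assert decoded != original

# A lossy-aware assertion compares only what the format preserves,
# which also documents exactly where the loss happens.
assert decoded["name"] == original["name"]
assert tuple(decoded["coords"]) == original["coords"]
```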

Other discussion

I briefly discussed this idea with @Zac-HD in #2017 (comment)

Would love to hear your thoughts on this idea.

Zac-HD · 2021-02-20T00:21:49Z

I like the idea - here's some sample code to get started:

from hypothesis import given, strategies as st
from hypothesis_jsonschema import from_schema
import jsonschema


def models(...):
    """A strategy to generate Pydantic model types."""
    return ...


# These tests check that we can round-trip model instances in various ways


@given(st.data(), models())
def test_model_roundtrip_via_dict(data, model):
    instance = data.draw(st.from_type(model))

    # This is one sensible round-trip test; others would include
    #       instance.json() / model.parse_raw()
    #       pickle.dumps() / pickle.loads()
    as_dict = instance.dict()
    new = model.parse_obj(as_dict)

    assert instance == new


# These tests check that the jsonschemas accept all values which should be 
# accepted, respectively starting from the model and the schema.
# (rejecting everything which should be rejected is out of scope)


@given(st.data(), models())
def test_model_accepts_any_value_which_matches_schema(data, model):
    inputs = data.draw(st.lists(from_schema(model.schema()), min_size=1))
    for value in inputs:
        model.parse_obj(value)


@given(models().flatmap(st.from_type))
def test_schema_accepts_all_model_instances(instance):
    schema = type(instance).schema()
    obj = instance.dict()
    jsonschema.validate(obj, schema)

I honestly don't have time at the moment for much more than that (and #2365, of course), so @Hultner if you want to take this on I'd be delighted to stick to code review and suggestions 😅
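
One possible starting point for the `models()` placeholder in the sample code, offered as a rough sketch and not a finished strategy. It assumes `pydantic.create_model` and a deliberately tiny set of field types; `FIELD_STRATEGIES`, `instances_of`, and the `field_N` naming scheme are hypothetical names introduced here. Explicit per-type strategies are used instead of `st.from_type(model)` so the sketch does not depend on the pydantic Hypothesis plugin.

```python
from hypothesis import given, strategies as st
from pydantic import create_model

# Assumption: a small set of supported field types, each paired with an
# explicit strategy. Integers are bounded to stay conservative.
FIELD_STRATEGIES = {
    int: st.integers(min_value=-(2**31), max_value=2**31 - 1),
    bool: st.booleans(),
    str: st.text(),
}


@st.composite
def models(draw):
    """A strategy to generate simple Pydantic model types (sketch)."""
    field_types = draw(
        st.lists(st.sampled_from(list(FIELD_STRATEGIES)), min_size=1, max_size=4)
    )
    # Hypothetical naming scheme: field_0, field_1, ... which avoids
    # clashes with BaseModel attributes such as `json` or `dict`.
    fields = {f"field_{i}": (tp, ...) for i, tp in enumerate(field_types)}
    return create_model("GeneratedModel", **fields)


def instances_of(model):
    """Build a strategy for instances of a model produced by models()."""
    kwargs = {
        name: FIELD_STRATEGIES[tp] for name, tp in model.__annotations__.items()
    }
    return st.builds(model, **kwargs)


@given(st.data(), models())
def test_generated_model_roundtrip_via_dict(data, model):
    instance = data.draw(instances_of(model))
    assert model.parse_obj(instance.dict()) == instance
```

Nested models, optional fields, and pydantic-specific types are left out on purpose; those are where lossy conversions are most likely to surface.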

Hultner Feb 24, 2021
Author

I'd be happy to give it a go if @samuelcolvin or other maintainers see this as a desirable addition :)

However, I'll await feedback first. I don't have much extra capacity to work on these things, so I'd like to know up front.
