Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The type constraint of errors is broken #253

Open
uncle-lv opened this issue Feb 5, 2024 · 3 comments
Open

The type constraint of errors is broken #253

uncle-lv opened this issue Feb 5, 2024 · 3 comments

Comments

@uncle-lv
Copy link

uncle-lv commented Feb 5, 2024

errors should be dict[str, list[str]] in marshmallow.

But when union fields raise errors, marshmallow_dataclass adds the error objects into errors directly. The error message generated by marshmallow_dataclass will be List[TypeCheckError | ValidationError].

It breaks the type constraint of errors.

@uncle-lv
Copy link
Author

uncle-lv commented Feb 5, 2024

This bug is found in this issue.

@dairiki
Copy link
Collaborator

dairiki commented Feb 6, 2024

I started to look into fixing this, but then got confused.

If we have a union of simple types:

from dataclasses import dataclass
from typing import Union

from marshmallow_dataclass import class_schema

@dataclass
class IntOrStr:
    value: Union[int, str]

schema = class_schema(IntOrStr)()

Then, we expect schema.load({"value": object()}) to raise something like

ValidationError({"value": ["Not a valid int.", "Not a valid string."]})

(Currently, it raises ValidationError({"value": [ValidationError("Not a valid int."), ValidationError("Not a valid string.")]}), which as you point out is pretty clearly not normal marshmallow usage.)

But what if we have a union of compound types?
As an example:

@dataclass
class IntStrOrStrInt:
    value: Union[Tuple[int, str], Tuple[str, int]]

schema2 = class_schema(IntStrOrStrInt)()

What should schema2.load({"value": (1,1)}) raise?

Currently, it raises

ValidationError({"value": [
    ValidationError({1: ["Not a valid string."]}),  # fails parsing as Tuple[int, str], because the second value is not a str
    ValidationError({0: ["Not a valid string."]}),  # fails parsing as Tuple[str, int] because the first value is not a str
]})

We could merge these in the same manner that marshmallow’s merge_errors does. That would result in

ValidationError({"value": {
    0: ["Not a valid string."],
    1: ["Not a valid string."]
}})

I'm not convinced that that makes sense (but I'm at a loss for better ideas).

As an alternative, I considered treating the Union (for error reporting purposes) essentially like a sequence of the constituent types.
Then the resulting errors dict would then be one level deeper:

ValidationError({"value": {
    0: {1: ["Not a valid string."]},  # deserialization of the second element of the first type (Tuple[int, str]) failed
    1: {0: ["Not a valid string."]},  # deserialization of the first element of the second type (Tuple[str, int]) failed
}})

I'm pretty sure from poking through marshmallow docs that the keys in the messages dict are supposed to correspond to keys/indexes in the input data. So that would rule out this approach.


Then there's the issue that fixing this will break any existing code that depends on the current (admittedly broken) behavior.

@uncle-lv
Copy link
Author

I started to look into fixing this, but then got confused.

If we have a union of simple types:

from dataclasses import dataclass
from typing import Union

from marshmallow_dataclass import class_schema

@dataclass
class IntOrStr:
    value: Union[int, str]

schema = class_schema(IntOrStr)()

Then, we expect schema.load({"value": object()}) to raise something like

ValidationError({"value": ["Not a valid int.", "Not a valid string."]})

(Currently, it raises ValidationError({"value": [ValidationError("Not a valid int."), ValidationError("Not a valid string.")]}), which as you point out is pretty clearly not normal marshmallow usage.)

But what if we have a union of compound types? As an example:

@dataclass
class IntStrOrStrInt:
    value: Union[Tuple[int, str], Tuple[str, int]]

schema2 = class_schema(IntStrOrStrInt)()

What should schema2.load({"value": (1,1)}) raise?

Currently, it raises

ValidationError({"value": [
    ValidationError({1: ["Not a valid string."]}),  # fails parsing as Tuple[int, str], because the second value is not a str
    ValidationError({0: ["Not a valid string."]}),  # fails parsing as Tuple[str, int] because the first value is not a str
]})

We could merge these in the same manner that marshmallow’s merge_errors does. That would result in

ValidationError({"value": {
    0: ["Not a valid string."],
    1: ["Not a valid string."]
}})

I'm not convinced that that makes sense (but I'm at a loss for better ideas).

As an alternative, I considered treating the Union (for error reporting purposes) essentially like a sequence of the constituent types. Then the resulting errors dict would then be one level deeper:

ValidationError({"value": {
    0: {1: ["Not a valid string."]},  # deserialization of the second element of the first type (Tuple[int, str]) failed
    1: {0: ["Not a valid string."]},  # deserialization of the first element of the second type (Tuple[str, int]) failed
}})

I'm pretty sure from poking through marshmallow docs that the keys in the messages dict are supposed to correspond to keys/indexes in the input data. So that would rule out this approach.

Then there's the issue that fixing this will break any existing code that depends on the current (admittedly broken) behavior.

I found this bug for the users of apiflask. But I hardly use union type in dataclasses. Maybe we should discuss it with people who use it regularly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants