-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Support assert_type
#15194
base: main
Are you sure you want to change the base?
Conversation
|
This is largely a proof-of-concept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for working on this. I added a few initial review comments.
/// Returns true if this type and `other` are "exactly the same". | ||
/// | ||
/// This powers the `assert_type()` directive. | ||
pub(crate) fn is_equals_to(self, db: &'db dyn Db, other: Type<'db>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you go into a bit more detail in how this is different from is_equivalent_to
? The main difference seems to be that this can also handle non-fully-static types and return true
for something like Any ~ Any
? If so, could it handle those gradual types (+ intersections/unions/tuples) and then fall back to is_equivalent_to
for fully static types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's quite possible that a function like this could help fix #14899, FWIW, so it doesn't seem like an unreasonable addition to me. I'm also not sure what name would clearly distinguish it from Type::is_equivalent_to()
, though 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I was not suggesting that this function is not useful. I just think there's some overlap with is_equivalent_to
, and I'd like to understand it and possibly make use of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another notable exception is type[Any] != type != type[object]
. I'm not too sure about the former, but I think it is better to be strict.
Also, as it currently is, int | Unknown | Any != int | Any
. This is supposedly due to Unknown
not being foldable into Any
, but int | Any
is just Any
; could such union types even be emitted in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int | Any
is not just Any
; it is a gradual type with a known lower bound. It is an unknown type that must be at least as large as int
, whereas plain Any
is an unknown type that could be a smaller type than int
. It is thus intentional that we don't simplify int | Any
, and int | Any
can be treated differently in the type system than plain Any
. For instance, given an object of type int | Any
, it is an error to assign that object to a destination expecting e.g. a str
, whereas Any
is assignable to str
. Using terms from the typing spec, Any
can materialize to str
, but int | Any
cannot; at best it can materialize to int | str
.
The thing that causes more difficulty here is our choice to not simplify Any | Unknown
. Because both of these simply represent the dynamic/unknown type, just with different origin. From a type system perspective, they should simplify, but we choose to preserve the information that the type may have come from an explicit Any
annotation or may have come from a lack of annotation. But we should still ideally treat Any | Unknown
as equivalent to Any
and equivalent to Unknown
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type[Any] != type != type[object]
I'm not sure if this is describing current behavior or correct behavior. Correct behavior would be that type
== type[object]
, but both are not equivalent to type[Any]
. (In our internal representation, type
would be Type::Instance(<builtins.type>)
and type[object]
would be Type::SubclassOf(<builtins.object>)
, but these are equivalent types and we should treat them as such.)
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Outdated
Show resolved
Hide resolved
3234f80
to
08fd1a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thank you for working on it! I think if we rebase on top of #15103 we can address some of the limitations and get it ready to land.
let equal = |(first, second): (&Type<'db>, &Type<'db>)| first.is_equals_to(db, *second); | ||
|
||
match (self, other) { | ||
(Type::Todo(_), Type::Todo(_)) => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo is supposed to behave like Any
or Unknown
in the typing system, so I think our default approach should be that all three are gradually-equivalent to each other. Is there a concrete reason you felt it necessary to make Todo != Todo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought that it would be rather nonsensical to do something like this:
def _(a: tuple[int, ...]):
assert_type(a, tuple[str, ...]) # Pass because both are Todos
No strong feelings though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reasoning, but I think it is OK for that assert_type
to pass, until we actually are able to properly differentiate those types. It is more important that we consistently treat Todo
in the type system as the dynamic type, just with more provenance information.
(Type::Union(first), Type::Union(second)) => { | ||
let first_elements = first.elements(db); | ||
let second_elements = second.elements(db); | ||
|
||
if first_elements.len() != second_elements.len() { | ||
return false; | ||
} | ||
|
||
let first_elements = first_elements.iter().collect::<OrderedTypeSet>(); | ||
let second_elements = second_elements.iter().collect::<OrderedTypeSet>(); | ||
|
||
iter::zip(first_elements, second_elements).all(equal) | ||
} | ||
|
||
(Type::Intersection(first), Type::Intersection(second)) => { | ||
let first_positive = first.positive(db).iter().collect::<OrderedTypeSet>(); | ||
let first_negative = first.negative(db).iter().collect::<OrderedTypeSet>(); | ||
|
||
let second_positive = second.positive(db).iter().collect::<OrderedTypeSet>(); | ||
let second_negative = second.negative(db).iter().collect::<OrderedTypeSet>(); | ||
|
||
iter::zip(first_positive, second_positive).all(equal) | ||
&& iter::zip(first_negative, second_negative).all(equal) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice; this is a fix we need in is_equivalent_to
as well. It would be nice if there were a way to do it without allocating so many hashsets...
08fd1a5
to
7bb9545
Compare
@@ -553,6 +556,8 @@ pub enum Type<'db> { | |||
// TODO protocols, callable types, overloads, generics, type vars | |||
} | |||
|
|||
type OrderedTypeSet<'a, 'db> = HashSet<&'a Type<'db>, BuildHasherDefault<FxHasher>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What makes this set ordered? This is just a regular hash set, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I just thought its name should reflect the fact that the order of iteration is depended upon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the order isn't guaranteed. The element ordering may depend on insertion order. The IntersectionType
positive
and negative
sets are ordered (although I don't remember the details of how they're ordered)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IntersectionType
uses OrderedSet
, whose iteration order is the same as insertion order. The comparison in question needs a self-ordered set type.
I have so far been unsuccessful in finding such a type. Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could collect the types into a vec and then order them. I'm just not sure why the order is important. I suspect that we instead have to implement an O(n^2)
algorithm that, for every type in a
, searches a type in b
to which it is equivalent, disregarding ordering entirely. But I'm not sure if doing so is too naive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this was done to avoid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking this over more, I agree with Micha, in that I don't think naive Type
ordering (even if we were to find/create a self-ordering set implementation) will be able to get us the right results here. The problem is that we can have equivalent "atomic" types that are not represented as the same Type
enum bit pattern. (Examples already discussed include Any
and Unknown
, or Type::Instance(<builtins.type>)
and Type::SubclassOf(<builtins.object>)
. So our definition of equivalence for unions and intersections needs to recurse into a full understanding of atomic type equivalence, not be built solely on Type
enum identity.
Given that we already have this as an unsolved problem with the existing is_equivalent_to
implementation, I'm OK with punting on this for this PR. (In other words, I think we can, and probably should, separate the problems of "implement assert_type
" and "fully correct gradual type equivalence" into separate PRs.) In which case, I would suggest just removing the set stuff from this PR, and implementing is_gradual_equivalent_to
for now without any handling for differently-ordered unions and intersections (just like is_equivalent_to
), and a TODO
comment for improving that in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually briefly tried looking at the problem of equivalence for differently ordered unions over the Christmas break, and, yeah, it is Not Easy
} | ||
|
||
(Type::Intersection(first), Type::Intersection(second)) => { | ||
let first_positive = first.positive(db).iter().collect::<OrderedTypeSet>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to collect the positive elements into another set when they are already set in IntersectionType
? Could we instead compute the difference between a
and b
and then compare if those elements are equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int & Unknown & ~Any
is supposed to be gradual equivalent to int & Any & ~Unknown
; I did mention this in a previous comment. Since I'm not sure whether such types can exist, I took the safe route.
a90a8fd
to
bf4957f
Compare
crates/red_knot_python_semantic/resources/mdtest/directives/assert_type.md
Outdated
Show resolved
Hide resolved
d6fa64f
to
b5d8b8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thank you! A few nits here and there, but basically this looks just right.
from typing_extensions import assert_type | ||
|
||
def _(a: str | int): | ||
# TODO: Infer the second argument as a type expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# TODO: Infer the second argument as a type expression | |
# TODO: order-independent union handling in type equivalence |
if isinstance(a, B) and not isinstance(a, C) and not isinstance(a, D): | ||
reveal_type(a) # revealed: A & B & ~C & ~D | ||
|
||
# TODO: Infer the second argument as a type expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# TODO: Infer the second argument as a type expression | |
# TODO: order-independent intersection handling in type equivalence |
|
||
## Unions | ||
|
||
Unions with the same elements are the same, regardless of order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add a test showing union equivalence when the elements are actually in the same order? (This one should actually pass now.)
## Intersections | ||
|
||
Intersections are the same when their positive and negative parts are respectively the same, | ||
regardless of order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, let's also add a test for intersection equivalence when ordering is the same? It doesn't have to highlight ordering, just show two identical unions and that assert_type
considers them equivalent.
match (self, other) { | ||
(_, _) if self == other => true, | ||
|
||
(Type::Any | Type::Unknown, Type::Any | Type::Unknown) => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about Todo?
(Type::SubclassOf(first), Type::SubclassOf(second)) => { | ||
match (first.subclass_of(), second.subclass_of()) { | ||
(first, second) if first == second => true, | ||
(ClassBase::Any | ClassBase::Unknown, ClassBase::Any | ClassBase::Unknown) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about Todo?
iter::zip(first_elements, second_elements).all(equivalent) | ||
} | ||
|
||
(Type::Intersection(first), Type::Intersection(second)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Type::Intersection(first), Type::Intersection(second)) => { | |
// TODO handle equivalent intersections with items in different order | |
(Type::Intersection(first), Type::Intersection(second)) => { |
&& iter::zip(first.elements(db), second.elements(db)).all(equivalent) | ||
} | ||
|
||
(Type::Union(first), Type::Union(second)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Type::Union(first), Type::Union(second)) => { | |
// TODO handle equivalent unions with items in different order | |
(Type::Union(first), Type::Union(second)) => { |
#[test_case(Ty::Todo, Ty::Todo)] | ||
#[test_case(Ty::Any, Ty::Any)] | ||
#[test_case(Ty::Unknown, Ty::Unknown)] | ||
#[test_case(Ty::Any, Ty::Unknown)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also test Todo
vs Any
and Todo
vs Unknown
, also all equivalent
.map(|arg_or_keyword| { | ||
.enumerate() | ||
.map(|(index, arg_or_keyword)| { | ||
// TODO: Remove this once we have proper overload matching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think overload matching is the feature that will allow removing this; TypeForm
and contextual type inference might. But since the current approach is not buggy, just hardcoded, I don't think we need a TODO for that. We could make a note of it, but I'd probably put that in the doc comment for takes_type_expression_arguments
, not here.
// TODO: Remove this once we have proper overload matching |
Summary
See #15103.
Test Plan
Markdown tests and unit tests.