Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] knot_extensions Python API #15103

Merged
merged 12 commits into from
Jan 8, 2025
Merged

[red-knot] knot_extensions Python API #15103

merged 12 commits into from
Jan 8, 2025

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Dec 22, 2024

Summary

Adds a type-check-time Python API that allows us to create and manipulate types and to test various of their properties. For example, this can be used to write a Markdown test to make sure that A & B is a subtype of A and B, but not of an unrelated class C (something that requires quite a bit more code to do in Rust):

from knot_extensions import Intersection, is_subtype_of, static_assert

class A: ...
class B: ...

type AB = Intersection[A, B]

static_assert(is_subtype_of(AB, A))
static_assert(is_subtype_of(AB, B))

class C: ...
static_assert(not is_subtype_of(AB, C))

I think this functionality is also helpful for interactive debugging sessions, in order to query various properties of Red Knot's type system. Which is something that otherwise requires a custom Rust unit test, some boilerplate code and constant re-compilation.

Comparison Rust ↔ Python: is_assignable_to tests

Rust
#[test_case(Ty::BuiltinInstance("str"), Ty::BuiltinInstance("object"))]
#[test_case(Ty::BuiltinInstance("int"), Ty::BuiltinInstance("object"))]
#[test_case(Ty::Unknown, Ty::IntLiteral(1))]
#[test_case(Ty::Any, Ty::IntLiteral(1))]
#[test_case(Ty::Never, Ty::IntLiteral(1))]
#[test_case(Ty::IntLiteral(1), Ty::Unknown)]
#[test_case(Ty::IntLiteral(1), Ty::Any)]
#[test_case(Ty::IntLiteral(1), Ty::BuiltinInstance("int"))]
#[test_case(Ty::StringLiteral("foo"), Ty::BuiltinInstance("str"))]
#[test_case(Ty::StringLiteral("foo"), Ty::LiteralString)]
#[test_case(Ty::LiteralString, Ty::BuiltinInstance("str"))]
#[test_case(Ty::BytesLiteral("foo"), Ty::BuiltinInstance("bytes"))]
#[test_case(Ty::IntLiteral(1), Ty::Union(vec![Ty::BuiltinInstance("int"), Ty::BuiltinInstance("str")]))]
#[test_case(Ty::IntLiteral(1), Ty::Union(vec![Ty::Unknown, Ty::BuiltinInstance("str")]))]
#[test_case(Ty::Union(vec![Ty::IntLiteral(1), Ty::IntLiteral(2)]), Ty::Union(vec![Ty::IntLiteral(1), Ty::IntLiteral(2)]))]
#[test_case(
    Ty::Union(vec![Ty::IntLiteral(1), Ty::IntLiteral(2)]),
    Ty::BuiltinInstance("int")
)]
#[test_case(
    Ty::Union(vec![Ty::IntLiteral(1), Ty::None]),
    Ty::Union(vec![Ty::BuiltinInstance("int"), Ty::None])
)]
#[test_case(Ty::Tuple(vec![Ty::Todo]), Ty::Tuple(vec![Ty::IntLiteral(2)]))]
#[test_case(Ty::Tuple(vec![Ty::IntLiteral(2)]), Ty::Tuple(vec![Ty::Todo]))]
#[test_case(Ty::SubclassOfAny, Ty::SubclassOfAny)]
#[test_case(Ty::SubclassOfAny, Ty::SubclassOfBuiltinClass("object"))]
#[test_case(Ty::SubclassOfAny, Ty::SubclassOfBuiltinClass("str"))]
#[test_case(Ty::SubclassOfAny, Ty::BuiltinInstance("type"))]
#[test_case(Ty::SubclassOfBuiltinClass("object"), Ty::SubclassOfAny)]
#[test_case(
    Ty::SubclassOfBuiltinClass("object"),
    Ty::SubclassOfBuiltinClass("object")
)]
#[test_case(Ty::SubclassOfBuiltinClass("object"), Ty::BuiltinInstance("type"))]
#[test_case(Ty::SubclassOfBuiltinClass("str"), Ty::SubclassOfAny)]
#[test_case(
    Ty::SubclassOfBuiltinClass("str"),
    Ty::SubclassOfBuiltinClass("object")
)]
#[test_case(Ty::SubclassOfBuiltinClass("str"), Ty::SubclassOfBuiltinClass("str"))]
#[test_case(Ty::SubclassOfBuiltinClass("str"), Ty::BuiltinInstance("type"))]
#[test_case(Ty::BuiltinInstance("type"), Ty::SubclassOfAny)]
#[test_case(Ty::BuiltinInstance("type"), Ty::SubclassOfBuiltinClass("object"))]
#[test_case(Ty::BuiltinInstance("type"), Ty::BuiltinInstance("type"))]
#[test_case(Ty::BuiltinClassLiteral("str"), Ty::SubclassOfAny)]
#[test_case(Ty::SubclassOfBuiltinClass("str"), Ty::SubclassOfUnknown)]
#[test_case(Ty::SubclassOfUnknown, Ty::SubclassOfBuiltinClass("str"))]
#[test_case(Ty::SubclassOfAny, Ty::AbcInstance("ABCMeta"))]
#[test_case(Ty::SubclassOfUnknown, Ty::AbcInstance("ABCMeta"))]
fn is_assignable_to(from: Ty, to: Ty) {
    let db = setup_db();
    assert!(from.into_type(&db).is_assignable_to(&db, to.into_type(&db)));
}

#[test_case(Ty::BuiltinInstance("object"), Ty::BuiltinInstance("int"))]
#[test_case(Ty::IntLiteral(1), Ty::BuiltinInstance("str"))]
#[test_case(Ty::BuiltinInstance("int"), Ty::BuiltinInstance("str"))]
#[test_case(Ty::BuiltinInstance("int"), Ty::IntLiteral(1))]
#[test_case(
    Ty::Union(vec![Ty::IntLiteral(1), Ty::None]),
    Ty::BuiltinInstance("int")
)]
#[test_case(
    Ty::Union(vec![Ty::IntLiteral(1), Ty::None]),
    Ty::Union(vec![Ty::BuiltinInstance("str"), Ty::None])
)]
#[test_case(
    Ty::SubclassOfBuiltinClass("object"),
    Ty::SubclassOfBuiltinClass("str")
)]
#[test_case(Ty::BuiltinInstance("type"), Ty::SubclassOfBuiltinClass("str"))]
fn is_not_assignable_to(from: Ty, to: Ty) {
    let db = setup_db();
    assert!(!from.into_type(&db).is_assignable_to(&db, to.into_type(&db)));
}
Python (grouped by topic instead of assignable/not-assignable)
from knot_extensions import static_assert, is_assignable_to, Unknown, TypeOf
from typing_extensions import Never, Any, Literal, LiteralString
from abc import ABCMeta

static_assert(is_assignable_to(str, object))
static_assert(is_assignable_to(int, object))
static_assert(not is_assignable_to(object, int))
static_assert(not is_assignable_to(int, str))

static_assert(is_assignable_to(Unknown, Literal[1]))
static_assert(is_assignable_to(Any, Literal[1]))
static_assert(is_assignable_to(Never, Literal[1]))
static_assert(is_assignable_to(Literal[1], Unknown))
static_assert(is_assignable_to(Literal[1], Any))
static_assert(is_assignable_to(Literal[1], int))
static_assert(not is_assignable_to(Literal[1], str))
static_assert(not is_assignable_to(int, Literal[1]))

static_assert(is_assignable_to(Literal["foo"], str))
static_assert(is_assignable_to(Literal["foo"], LiteralString))
static_assert(is_assignable_to(LiteralString, str))
static_assert(is_assignable_to(Literal[b"foo"], bytes))

static_assert(is_assignable_to(Literal[1], int | str))
static_assert(is_assignable_to(Literal[1], Unknown | str))
static_assert(is_assignable_to(Literal[1] | Literal[2], Literal[1] | Literal[2]))
static_assert(is_assignable_to(Literal[1] | Literal[2], int))
static_assert(is_assignable_to(Literal[1] | None, int | None))
static_assert(not is_assignable_to(Literal[1] | None, int))
static_assert(not is_assignable_to(Literal[1] | None, str | None))

static_assert(is_assignable_to(type[Any], type[Any]))
static_assert(is_assignable_to(type[Any], type[object]))
static_assert(is_assignable_to(type[Any], type[str]))
static_assert(is_assignable_to(type[object], type[Any]))
static_assert(is_assignable_to(type[object], type[object]))
static_assert(is_assignable_to(type[object], type))
static_assert(is_assignable_to(type[str], type[Any]))
static_assert(is_assignable_to(type[str], type[object]))
static_assert(not is_assignable_to(type[object], type[str]))
static_assert(is_assignable_to(type[str], type[str]))
static_assert(is_assignable_to(type[str], type))
static_assert(not is_assignable_to(type, type[str]))
static_assert(is_assignable_to(type, type[Any]))
static_assert(is_assignable_to(type, type[object]))
static_assert(is_assignable_to(type, type))
static_assert(is_assignable_to(TypeOf[str], type[Any]))
static_assert(is_assignable_to(type[str], type[Unknown]))
static_assert(is_assignable_to(type[Unknown], type[str]))
static_assert(is_assignable_to(type[Any], type[Unknown]))
static_assert(is_assignable_to(type[Any], ABCMeta))
static_assert(is_assignable_to(type[Unknown], ABCMeta))

Other ideas

  • Introduce assert_true(…) to replace reveal_type(…) # revealed: Literal[True]
  • Introduce knot_extensions.Unknown as a spelling for the Unknown type.
  • Should we have a way to spell the empty intersection type? Like Intersection[()]? It's probably okay not to support this. Empty Unions or Unions with a single type are not supported either (even if they would have a well defined meaning).
  • Support typing.assert_type, and maybe alias it as knot_extensions.assert_type

To do

  • Proper error handling
  • Put the stubs for the knot_extensions module in a dedicated package
  • Think about whether or not TypeOf is needed. Maybe introduce something like knot_extensions.ClassLiteral[…] instead for that specific purpose?

Test Plan

New Markdown tests

@sharkdp sharkdp added the red-knot Multi-file analysis & type inference label Dec 22, 2024

This comment was marked as off-topic.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I'm impressed how simple the implementation is

crates/red_knot_python_semantic/src/types/type_api.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/type_api.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
@sharkdp sharkdp force-pushed the david/type-level-api branch 2 times, most recently from 9795b39 to 9943e95 Compare January 2, 2025 15:26
@sharkdp sharkdp marked this pull request as ready for review January 3, 2025 09:18
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. Considering how much more readable this makes our tests, I'm sold that this is the way to go. As such, my review here is mostly a review/commentary of your implementation rather than of the concept itself :-)

crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_vendored/vendor/typeshed/stdlib/VERSIONS Outdated Show resolved Hide resolved
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More of a question than a comment: What are your thoughts on whether we should gate this behind a feature or make it testing only?

@sharkdp
Copy link
Contributor Author

sharkdp commented Jan 3, 2025

More of a question than a comment: What are your thoughts on whether we should gate this behind a feature or make it testing only?

Seeing that mypy and pyre have similar public APIs, I could imagine (part of) this becoming a public API for red knot that isn't feature-gated? The Intersection/Not constructors and static_assert are potentially interesting for users which don't care too much about compatibility with other type checkers (cf python/typing#213). Some less-interesting parts like is_singleton could also be moved to knot_extensions.internal maybe?

Do we need to decide this right away? If not, can this stay public for now, or should we approach it very carefully and hide everything (by default)?

@MichaReiser
Copy link
Member

Do we need to decide this right away? If not, can this stay public for now, or should we approach it very carefully and hide everything (by default)?

I'm generally leaning toward keeping the API intentionally limited and only expose features that we want to support long term (and consider part of the public API) because users will using them and it's very easy that we forget to revisit this decision before the release (unless we note it down somewhere?)

@sharkdp sharkdp changed the title [red-knot] Red Knot Type API [red-knot] knot_extensions Python API Jan 3, 2025
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!!

A couple more minor comments below, and we need to sort out the open question of where the knot_extensions stub will live (or it will get deleted in the next automated typeshed-sync PR). But other than that, this now looks great to me.

crates/red_knot_python_semantic/src/types/type_api.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/type_api.rs Outdated Show resolved Hide resolved
@carljm
Copy link
Contributor

carljm commented Jan 6, 2025

I'm generally leaning toward keeping the API intentionally limited and only expose features that we want to support long term (and consider part of the public API)

For what it's worth, there's nothing in the API added in this PR that I have reservations about supporting long-term as public API, in principle. Of course it's always possible that we realize we made mistakes in details of how we defined or named the API, and we want to change these things in future and potentially have to take backward-compatibility into account with those changes. But this is inevitable.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really excellent, thank you! I look forward to using it, and I think red-knot users will also find it valuable.

crates/red_knot_python_semantic/src/types/type_api.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
Comment on lines 14 to 21
# Predicates on types
class _IsEquivalentTo[S, T]: ...
class _IsSubtypeOf[S, T]: ...
class _IsAssignableTo[S, T]: ...
class _IsDisjointFrom[S, T]: ...
class _IsFullyStatic[T]: ...
class _IsSingleton[T]: ...
class _IsSingleValued[T]: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of using these dedicated types vs just having them all return bool in the stub? Either way the stub-annotated return type isn't used as-is, we have to special-case it. But this way it seems like the stub is just lying about the actual inferred return type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting this question to come up 😄. Honestly, I didn't really know what to put here. Yes, bool is the union of both possible return types that we would could infer (either Literal[True] or Literal[False]). But it wouldn't ever be useful for a type checker to fall back to this bool return type annotation, since no static conclusions can be drawn from it, right? Or is that exactly why you want it to be annotated as bool? For other type checkers that do not special case these functions?

I think I wanted to express that these are functions on types, not on objects. Using _Xyz[S, T] as a return type was my way of saying: the return type depends on the argument types. But of course I couldn't express that the return type does not also depend on those "phantom data" objects x/y.

This is also why I implemented these predicates as generic classes like IsEquivalentTo[S, T] initially, as that seemed to imply to me that a new type (IsEquivalentTo[S, T]) is being computed from two other types (S and T). But I understand that that's not great either.

So I'm happy to change it to bool. And I can also add some documentation for readers of that stub file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes sense how you got here :) But my feeling is that neither return type is directly useful, and bool is both accurate and simple.

I wanted to express that these are functions on types

Yeah; I think the way to do this is would be via annotating the arguments as TypeForm, once PEP 747 is accepted.

Copy link
Contributor Author

@sharkdp sharkdp Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm just going to mention this one last time and then I'll be quiet 😄. The current version of the API does feel good to me and I think it's more or less intuitive. I'm personally fine with merging it as is.

But upon closer inspection, there are some things that are "special".

The type predicates like is_fully_static etc. do not work like usual Python functions: For all other functions, we call infer_expression on the arguments in a function call. But for these predicates, we call infer_type_expression.

Similarly, the special form TypeOf does not work like usual special forms (although Annotated can also contain an non-type-expression in the second argument). For all other special forms, we call infer_type_expression for the bracketed arguments to the special form. For TypeOf, we call infer_expression.

For TypeOf, we already discussed that it might be necessary to leave it as a special form, as a call expression can not occur in type annotations.

But the functions still feel a bit strange to me, and also require some special-casing in the implementation. It seems to me that even if we had PEP 747 and annotated the parameter types of these functions as TypeForms, we should still call infer_expression instead of infer_type_expression for calls to such functions (I haven't read the full PEP, but the split_union example here seems to imply that typ internally is of type UnionType, for example — not an actual union type).

Again, I'm personally fine with this special-casing, but I thought it would be worth calling out one last time. The alternative would be to make these predicates special forms that could be called via IsFullyStatic[int] or maybe these would be generic classes with a __bool__ method that would return Literal[True]/Literal[False] accordingly, such that we could write static_assert(IsFullyStatic[int]()). Or potentially static_assert(IsFullyStatic[int]) if we implement it on the meta-class(?).

(the Intersection and Not type forms, and the static_assert function act normal)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your functions aren't the first example of this: the existing typing.cast function (which we haven't yet implemented support for, but will need to) also requires its first argument to be interpreted as a type expression rather than a value expression.

PEP 747 generalizes this; it requires that an expression whose "type context" is a TypeForm type (e.g. the RHS of an annotated assignment where the annotation is a TypeForm type, or an argument matched to a parameter annotated as a TypeForm type) be interpreted as a type expression rather than a value expression. If the expression is a valid type expression, it is assignable to TypeForm; if it is a valid type expression spelling a type assignable to T, then it is assignable to TypeForm[T].

(The split_union example in the PEP is not a counter-example to this. A TypeForm type is indeed never an "actual union type"; TypeForm[T | S] represents the set of runtime objects that can result from a valid type expression for the type T | S. The type types.UnionType overlaps with TypeForm[T | S]; the type T | S is disjoint from the type TypeForm[T | S].)

This kind of contextual type evaluation is something we don't support yet, but will need to. It comes up in other cases besides TypeForm as well; Eric Traut has some good examples at https://discuss.python.org/t/pep-747-typeexpr-type-hint-for-a-type-expression/55984/67

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call-checking support will require changes to handle contextual type evaluation; we'll need to match arguments to parameters before inferring argument types. I felt that it still made sense to do the simpler implementation now, and adjust it when we add contextual inference support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your functions aren't the first example of this: the existing typing.cast function (which we haven't yet implemented support for, but will need to) also requires its first argument to be interpreted as a type expression rather than a value expression.

similar for assert_type as well

@sharkdp sharkdp force-pushed the david/type-level-api branch 2 times, most recently from 4aaacee to 209a217 Compare January 7, 2025 19:39
@sharkdp sharkdp changed the base branch from main to cjm/checksigs January 7, 2025 19:39
@sharkdp sharkdp force-pushed the david/type-level-api branch from ccb639c to 855d6db Compare January 7, 2025 19:58
Base automatically changed from cjm/checksigs to main January 7, 2025 20:39
@sharkdp sharkdp force-pushed the david/type-level-api branch from 1fee970 to 541a13c Compare January 7, 2025 21:24
crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/diagnostic.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/call.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/call.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
@carljm
Copy link
Contributor

carljm commented Jan 7, 2025

The integration/flattening looks great!

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! A couple of nitpicks on top of Carl's comments. And it still needs an edit to https://github.com/astral-sh/ruff/blob/main/.github/workflows/sync_typeshed.yaml to make sure we don't lose the knot_extensions stub file in the next automated typeshed sync

crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/call.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types.rs Show resolved Hide resolved
@sharkdp sharkdp merged commit 235fdfc into main Jan 8, 2025
21 checks passed
@sharkdp sharkdp deleted the david/type-level-api branch January 8, 2025 11:52
@sharkdp
Copy link
Contributor Author

sharkdp commented Jan 8, 2025

Decided to merge this now to unblock work on #15194. Feel free to add post-merge comments and I will address them.

dcreager added a commit that referenced this pull request Jan 8, 2025
* main:
  [`pylint`] Fix `unreachable` infinite loop (`PLW0101`) (#15278)
  fix invalid syntax in workflow file (#15357)
  [`pycodestyle`] Avoid false positives related to type aliases (`E252`) (#15356)
  [`flake8-builtins`] Disapply `A005` to stub files (#15350)
  Improve logging system using `logLevel`, avoid trace value (#15232)
  [`flake8-builtins`] Rename `A005` and improve its error message (#15348)
  Spruce up docs for pydoclint rules (#15325)
  [`flake8-type-checking`] Apply `TC008` more eagerly in `TYPE_CHECKING` blocks and disapply it in stubs (#15180)
  [red-knot] `knot_extensions` Python API (#15103)
  Display Union of Literals as a Literal (#14993)
  [red-knot] all types are assignable to object (#15332)
  [`ruff`] Parenthesize arguments to `int` when removing `int` would change semantics in `unnecessary-cast-to-int` (`RUF046`) (#15277)
  [`eradicate`] Correctly handle metadata blocks directly followed by normal blocks (`ERA001`) (#15330)
  Narrowing for class patterns in match statements (#15223)
  [red-knot] add call checking (#15200)
  Spruce up docs for `slice-to-remove-prefix-or-suffix` (`FURB188`) (#15328)
  [`internal`] Return statements in finally block point to end block for `unreachable` (`PLW0101`) (#15276)
  [`ruff`] Treat `)` as a regex metacharacter (`RUF043`, `RUF055`) (#15318)
  Use uv consistently throughout the documentation (#15302)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants