Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typing updates #2252

Merged
merged 14 commits into from
Jun 11, 2024
Merged

Typing updates #2252

merged 14 commits into from
Jun 11, 2024

Conversation

rmartin16
Copy link
Member

@rmartin16 rmartin16 commented Nov 30, 2023

Changes

  • Completed survey of typing for Toga's API making updates/corrections where appropriate
  • Notable remaining typing issues:
    • Async and Generators handlers are not accepted as appropriate types
    • ListSource and TreeSource are not generic
      • Furthermore, widgets depending on these data sources are not generic, e.g. DetailedList, Table, etc.
    • The API surface for the backend implementations is not implemented
      • Ideally, Protocols will exist for the backend implementations for each widget, window, etc.

Notes

KI


Original PR body ## Summary This is a proposal to introduce better typing ergonomics for users and ensure typing accuracy of the Toga API.

Strategy

Introduce tests that specifically exercise the types for the API's interface.

These tests are a little different than what pytest does; they aren't runnable code - instead, they simulate user code and ensure that mypy thinks the API conforms to the expected types.

For instance, this test confirms src accepts a str and size returns a tuple of two ints:

assert_type(toga.Image(src="/data").size, tuple[int, int])

Benefits

  1. The API's typing will be compatible with mypy (and potentially other type checkers)
  2. The API's typing will be accurate...at least insofar as it's typed
  3. The API's typing will be confirmed stable over time

Considerations

  • Why not just make Toga's codebase pass mypy?
    • This will be non-trivial, to say the least...especially given how strictly I want to run mypy against the API's interface
    • I'm planning to enable every typing enforcement for mypy that I can for the API tests
    • Also, this approach actually verifies the user's experience with the API
      • For instance, running mypy over the Toga's codebase won't tell you that src for toga.Image(src="...") should accept a Path....but running mypy over code using toga.Image can verify that
  • How will we know if the entire API's typing is checked?
    • I'm not sure we can programmatically measure this....but it may be possible through other means 🤔

Thoughts, feedback, criticism welcome before I move forward

PR Checklist:

  • All new features have been tested
  • All new features have been documented
  • I have read the CONTRIBUTING.md file
  • I will abide by the code of conduct

Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fundamentally, I'm in favor of improvements that systematically ensure that our type annotations remain correct and consistent with reality.

However, I'm not immediately and obviously convinced what this PR is doing is the right approach. I'm willing to be convinced otherwise, and I'll openly admit that some of my hesitation may stem from not fully understanding why/how mypy is doing what it's doing.

  1. Some of the import oddities. "import Pack as Pack"... I'm not wild about the idea of requiring syntactically weird constructions to keep the type checker happy. I might be willing to live with this if it was
  2. As flagged inline, the need to test backends as well as dummy.
  3. The fact that we need to manually maintain a "type coverage test suite" without any guidance from automated tooling. Essentially there's no automated failure criteria triggered around adding a new API that isn't accompanied by a type test.

Part of my hesitation on item (3) is that I'm not clear what this typing suite is doing that the actual unit test suite isn't doing. The combination of the core test suite and the backend test suite is invoking every API endpoint, and every branch - that doesn't necessarily guarantee that every possible type input is being tested, but it should be pretty close. Why can't we use mypy on the test suite as compliance condition?

@@ -1,2 +1,2 @@
from toga.style.applicator import TogaApplicator # noqa: F401
from toga.style.pack import Pack # noqa: F401
from toga.style.pack import Pack as Pack # noqa: F401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What am I missing here? This reads like a no-op renaming...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh...i should have just left this out until later. This is just something stricter mypy gets upset about because toga.style is not the owner of this object. So, when other projects import Pack from toga.style, it is considered an implicit re-export. I imagine the concern is that since toga.style doesn't own Pack, it could be renamed and the symbol would disappear from this namespace potentially without really understanding the consequence (although, its a bit obvious in this example). For instance, someone could have even aliased the new name of the object to Pack back over in toga.style.pack but maybe they won't here. So, doing an explicit re-export via as Pack helps explicitly avoid that situation.

This is one of mypy's stricter modes that is enabled via --no-implicit-reexport.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - Ok. That makes sense; and oddly, I'm actually more OK with it given that's the reason, as it requires us to beexplicit about thing we're re-exporting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --no-implicit-reexport docs say this can also be done using __all__, which seems like a less confusing and more explicit solution.


import toga

assert_type(toga.Image().size, tuple[int, int])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this will be sufficient. In many cases, type checks on core will result in type checking the behavior of the dummy backend, and while the dummy should definitely do the right thing, it's not what most people will be touching in practice, so checks at the runtime level would seem necessary.

Copy link
Member

@mhsmith mhsmith Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if I understand correctly, this is only making assertions about the type hints themselves, but not whether those hints are actually being followed. It seems like a lot of work for no clear benefit to add hundreds of assertions that essentially say "an argument or return type hinted as X is hinted as X", but don't add any additional confidence that an X will actually be accepted or returned by the code when it's run.

Your time might be better spent moving towards full mypy compliance in some parts of the repository, though I admit I'm ignorant about exactly what that would involve. The fact that Toga is split into semi-independent core, backend and testbed projects might make it a bit more approachable than a monolithic project like Briefcase.

In particular, a quite common class of bugs is where the public API in the core returns an object it got from the backend, but the backend doesn't comply with the public hint (e.g. #1779). It would be nice if things like that could be caught by a static type checker, rather than needing an isinstance check next to every equality check in the testbed. This would require formalizing the core / backend interface, which is mostly duck-typed at present, with a set of type-hinted base classes, which is probably a good idea even if we don't end up running mypy at all.

@rmartin16
Copy link
Member Author

These are good points. I would like to focus on the abstract goals, though, instead of my quick POC badly attempting some sort of an implementation. So, let's assume the type hints for the actual API (instead of dummy or something else) can be effectively evaluated for this discussion.

Goal

Ensure Toga provides a usable and accurate typing experience for users.

Measuring achievement

This goal would be satisfied when a user application exercising arbitrary components of Toga allows mypy to run to success regardless of strictness settings (where possible and pragmatic - # type: ignore exists for a reason).

Strategy

Toga compliance with mypy

Why can't we use mypy on the test suite as compliance condition?

full mypy compliance in some parts of the repository

These points are obviously valid; my primary contention with them, though, is the burden it bestows on Toga development. Writing Python that's compliant with mypy (even without the strict settings) can be a frustrating experience. This is especially true if you consider some of the more polymorphic code in Toga. Additionally, if you want to encourage outside developers to contribute, typing has a particularly rough learning curve and contributions may be ultimately abandoned because of it.

That said, a combination of mypy compliance in the source and in the tests could achieve the goal. And it's important that we consider running mypy against actual usage of Toga (by evaluating the tests as well) because we want to confirm the actual typing is not just internally consistent but is actually what the API should advertise. In many cases, an API's compliance with mypy with satisfy both of these....but it is also possible to satisfy mypy with typing that doesn't reflect desired usage. This is more likely in complicated typing involving Generics, for instance.

Toga usage compliance with mypy

This is effectively what I'm proposing; that when Toga is used, the APIs satisfy mypy even if Toga doesn't itself. It is essentially a subset of the solution space of "run mypy over the tests" but it wouldn't require all of the tests to satisfy mypy.

Yes, if I understand correctly, this is only making assertions about the type hints themselves, but not whether those hints are actually being followed. It seems like a lot of work for no clear benefit to add hundreds of assertions that essentially say "an argument or return type hinted as X is hinted as X", but don't add any additional confidence that an X will actually be accepted or returned by the code when it's run.

This is a good point; in some of my examples in my POC implementation, the assertion is trivial and isn't likely to provide enough value to warrant implementing hundreds of them. (Although, I will note it provoked fixing (int, int) to tuple[int, int]'.) I think what this could suggest is a strategy that would be limited to less straightforward APIs that would also usually be polymorphic in nature.

However, it is also not necessarily the goal of this type checking to ensure the code complies with it. This is part of the balance I talk about in the next section.

Implementation

I see levels, I suppose, of implementation options. They all come with their own trade offs (some mentioned above) and ultimately, this may come down to finding the right balance of achieving the fullest heights of the goal with the pragmatic situation of reality and existing tooling.

Un-automated and ad-hoc type testing

I would consider this the 80/20 solution. Basically, you write a quick Toga app implementing all (or at least the complicated) APIs and ensure it can pass mypy. This is, more or less, how I've come to proposing a solution at all; I will have a Toga app open in PyCharm and I start seeing squiggles under my variables because the typing isn't right.

If nothing else, such an approach would head off many of the issues users would create. Moreover, though, a related concern is that users don't report them and instead, Toga just has a lower reputation for them.

Automated type testing without known coverage

Incorporate the idea above in to CI. I haven't seen any tooling that would allow us to know if we're actually testing everything. However, if that was the goal, then I think the recommendation would just be "run mypy on your code". So, I would consider the intention here to make sure sensitive or error-prone code is being exercised.

Toga type checking

This is the "true" solution of just getting Toga to pass mypy...which I've discussed.

Other comments

In particular, a quite common class of bugs is where the public API in the core returns an object it got from the backend, but the backend doesn't comply with the public hint (e.g. #1779). It would be nice if things like that could be caught by a static type checker, rather than needing an isinstance check next to every equality check in the testbed. This would require formalizing the core / backend interface, which is mostly duck-typed at present, with a set of type-hinted base classes, which is probably a good idea even if we don't end up running mypy at all.

Agreed. My approach here is more of an interim solution; however, it'd only be interim if someone actually does all this work to type Toga soup to nuts at some point.

I'm mostly targeting the immediate user experience of trying to use Toga with mypy or even in a modern IDE. To that end, if you don't have Pylance enabled in VS Code, I would recommend turning it on so at least the type issues it can find are elevated at dev time.

Final Thoughts

At the end of the day, this is not a silver bullet or even a pretty stopgap. The best experience for typing with Toga would be facilitated by fully implementing typing within Toga and ensuring the code and its tests pass mypy. That is no small order, though. So, this is mostly trying to find a middle ground where at least some of the typing for the API's interface can be guaranteed and user ergonomics for implementing type checking in their projects using Toga is as painless as possible.

@freakboy3742
Copy link
Member

These are good points. I would like to focus on the abstract goals, though, instead of my quick POC badly attempting some sort of an implementation. So, let's assume the type hints for the actual API (instead of dummy or something else) can be effectively evaluated for this discussion.

To be clear - dummy is a valid backend - it's just not a backend most people will use in practice. It should match the same typing interface as a "real" backend.

Goal

Ensure Toga provides a usable and accurate typing experience for users.

Agreed - although I'd qualify that slightly by saying that while typing can be a useful mechanism for providing IDE hints and identifying some classes of bugs, it's not a panacea.

I'm interested in improving typing to the extent that it improves the lived experience of Toga developers (both those developing with Toga, and those developing Toga itself). I'm not especially interested in expending excessive effort just so we can metric of "type purity". Practicality beats purity.

Measuring achievement

This goal would be satisfied when a user application exercising arbitrary components of Toga allows mypy to run to success regardless of strictness settings (where possible and pragmatic - # type: ignore exists for a reason).

+1. +2 to the "pragmatic" part :-)

Strategy

Toga compliance with mypy

Why can't we use mypy on the test suite as compliance condition?
full mypy compliance in some parts of the repository
These points are obviously valid; my primary contention with them, though, is the burden it bestows on Toga development.

+1

Toga usage compliance with mypy

This is effectively what I'm proposing; that when Toga is used, the APIs satisfy mypy even if Toga doesn't itself. It is essentially a subset of the solution space of "run mypy over the tests" but it wouldn't require all of the tests to satisfy mypy.

It might be a lack of imagination on my part, but I'm having difficulty working out what this would look like. If Toga's test suite (core + testbed) is exercising API and every code path, I'm not sure I see how the tests could satisfy mypy if Toga doesn't.

Implementation

Un-automated and ad-hoc type testing

I would consider this the 80/20 solution. Basically, you write a quick Toga app implementing all (or at least the complicated) APIs and ensure it can pass mypy.

I can see how this would give us a quick win (or at least let us quickly identify problems that exist right now). My concern would be maintaining this sample app over time. I'm not wild about the idea of a new project that exists purely for testing purposes that isn't executed as part of automated testing except as part of mypy.

However, I wonder if there's a two-birds-one-stone solution here: We have toga-demo, but it is... not a very good demo :-) A much better demo would be a "widget gallery" - a live demonstrator of every widget that Toga contains (or, at least, a placeholder that says "this widget doesn't exist". This app could be uploaded to app stores, or made available for download. If it also gives us an opportunity to do a practical type check... ?

The reason we haven't historically done this is that we don't have a widget to use for the top-level navigation of such an app. We need the ability to display a list of widgets to demonstrate, which links to a page that demonstrates that widget.

Automated type testing without known coverage

Toga type checking

I'm not sure I understand why these two are in this order in terms of complexity. I would have thought internal type completeness in Toga would have been easier, as the external type surface isn't being tested.

Final Thoughts

At the end of the day, this is not a silver bullet or even a pretty stopgap. The best experience for typing with Toga would be facilitated by fully implementing typing within Toga and ensuring the code and its tests pass mypy. That is no small order, though. So, this is mostly trying to find a middle ground where at least some of the typing for the API's interface can be guaranteed and user ergonomics for implementing type checking in their projects using Toga is as painless as possible.

I'm definitely up for a "gradual improvement" approach. My concern is around ensuring that the approach we take is "self-sustaining" - that it doesn't involve adding something to a test suite that then starts to bitrot because there's no reason to ensure that it remains current and accurate.

@rmartin16
Copy link
Member Author

rmartin16 commented Dec 1, 2023

I'm not sure I see how the tests could satisfy mypy if Toga doesn't.

I would have thought internal type completeness in Toga would have been easier, as the external type surface isn't being tested.

I think I should clarify my position here. I am saying 1) my stated goal can be achieved with only type completeness of Toga's API surface and 2) this is far more feasible than type completeness of Toga, at large.

The primary piece of prior art for this strategy is Python's typeshed itself. It provides type completeness for the API surface of many packages and allows downstream developers to implement them with total typing compliance of their own code. This is, of course, while many of the packages represented by the typeshed are not typed themselves at all.

Therefore, as made evident by typeshed, type completeness of an API surface is an abstract subset of type completeness for the entire API (and its supporting code base). When a developer runs mypy on their code that's using Toga, mypy doesn't recursively review the internals of all function calls to check if they are internally consistent about the types they are declaring at their API surface; mypy takes the declared types and assumes they are correct.

Finally, retroactively achieving type completeness of a large code base long in to its existence is likely to be a perilous mission. When you write Python code without typing in mind from the get go, you're likely to end up with code that isn't compatible with type checking via static analysis. A good example is Briefcase; I never finished typing all of Briefcase because it strongly leverages composable classes via mix-ins, incompatible subclassing, and several of the important data structures are loosely defined. So, I would have had to create all of these awkward Protocols to structurally define the larger objects in play and likely do a bunch of refactoring. (Of course, none of this intended to reflect badly on Briefcase...it's just that type completeness inherently constrains what you can do with Python.)

At any rate, I feel confident saying achieving type completeness of Toga, in general, would be a significant task...and also one I am unlikely to take on. To get a taste of this, just run mypy on select files in Toga, e.g. core/src/toga/app.py; some of the errors are trivial...others may require refactoring....then you ask yourself "why am I refactoring working code....what am I getting". Of course, as Malcolm pointed out, targeted implementations such as abstract base classes for backends almost certainly make sense.

I'm definitely up for a "gradual improvement" approach. My concern is around ensuring that the approach we take is "self-sustaining" - that it doesn't involve adding something to a test suite that then starts to bitrot because there's no reason to ensure that it remains current and accurate.

Yeah; I understand this as a competing priority. I think the reality is, unfortunately, that there will be bitrot either way; if we don't verify or test the API surface types, they will bitrot...and if we do verify them with some sort of awkward pseudo-test app that won't have any coverage, it will bitrot. We're then full circle with "make Toga type complete" which I think is likely to be a task of dubious merit.

I do like the idea of the demo app potentially serving a dual purpose to verify Toga's API typing. However, I don't think I'm keen to take that on as part of this.

So, given all this, I can probably commit, at least, to a survey of Toga's current API surface types with the aim to:

  • confirm existing typing
  • correct current typing gaps
  • identify areas that may require more attention

@freakboy3742
Copy link
Member

I think I should clarify my position here. I am saying 1) my stated goal can be achieved with only type completeness of Toga's API surface and 2) this is far more feasible than type completeness of Toga, at large.

This all makes sense. I hadn't considered Typeshed as a point of comparison, and yeah - I can now see how an external API surface can be type complete without the internals being complete.

At any rate, I feel confident saying achieving type completeness of Toga, in general, would be a significant task...and also one I am unlikely to take on. To get a taste of this, just run mypy on select files in Toga, e.g. core/src/toga/app.py; some of the errors are trivial...others may require refactoring....then you ask yourself "why am I refactoring working code....what am I getting". Of course, as Malcolm pointed out, targeted implementations such as abstract base classes for backends almost certainly make sense.

And that's my general criticism of typing as an approach. There are times where it's very helpful. There are times when it is in direct conflict with the flexible-duck value proposition that is a core of Python's appeal.

I do like the idea of the demo app potentially serving a dual purpose to verify Toga's API typing. However, I don't think I'm keen to take that on as part of this.

So, given all this, I can probably commit, at least, to a survey of Toga's current API surface types with the aim to:

  • confirm existing typing
  • correct current typing gaps
  • identify areas that may require more attention

That seems like the best way forward to me. There's definitely stuff we can fix right now, but we don't need to swallow the full mypy elephant in one sitting; and we have a goal we can work towards in terms of a demo app (and any other areas that might be identified as part of a once-off type audit) that might provide a long term way to make mypy palatable.

@mhsmith
Copy link
Member

mhsmith commented Dec 2, 2023

So, given all this, I can probably commit, at least, to a survey of Toga's current API surface types with the aim to:

  • confirm existing typing
  • correct current typing gaps
  • identify areas that may require more attention

These all seem like worthwhile goals. You've probably worked this out already, but the type hints currently in the API were written mainly for the purpose of generating the documentation rather than being logically rigorous.

In fact, there were a couple of times where I deliberately made the hints looser because the "correct" form was unreadable in the docs. Type aliases are supposed to be the solution to this, but Sphinx doesn't seem to support them very well (e.g. #1984 (comment)).

@rmartin16
Copy link
Member Author

These all seem like worthwhile goals. You've probably worked this out already, but the type hints currently in the API were written mainly for the purpose of generating the documentation rather than being logically rigorous.

In fact, there were a couple of times where I deliberately made the hints looser because the "correct" form was unreadable in the docs. Type aliases are supposed to be the solution to this, but Sphinx doesn't seem to support them very well (e.g. #1984 (comment)).

Thank you for bringing up docs; I hadn't properly included them in my calculus. I will definitely incorporate them as part of the pragmatism of this implementation.

@rmartin16
Copy link
Member Author

Type aliases are supposed to be the solution to this, but Sphinx doesn't seem to support them very well (e.g. #1984 (comment)).

Want to mention this now that I've seen how fickle Sphinx is with rendering TypeAliases.

It turns out that Sphinx is very sensitive to how the TypeAlias is defined. Notably, a TypeAlias that's used in an API signature cannot use forward references or mixed definitions.

That is, "forward references" will not work:

ActionHandler: TypeAlias = "ActionHandlerType1 | ActionHandlerType2"

class ActionHandlerType1(Protocol): ...
class ActionHandlerType2(Protocol): ...

You need to ensure ordering to avoid the forward references:

class ActionHandlerType1(Protocol): ...
class ActionHandlerType2(Protocol): ...

ActionHandler: TypeAlias = "ActionHandlerType1 | ActionHandlerType2"

Furthermore, though, the TypeAlias definitions cannot be robust, if you will, if expressed as a string; for instance:

T = TypeVar("T")

class ActionHandlerType1(Protocol[T]): ...
class ActionHandlerType2(Protocol[T]): ...

ActionHandler: TypeAlias = "ActionHandlerType1[T] | ActionHandlerType2[T]"

I was only able to get this to work as:

T = TypeVar("T")

class ActionHandlerType1(Protocol[T]): ...
class ActionHandlerType2(Protocol[T]): ...

ActionHandler: TypeAlias = Union[ActionHandlerType1[T], ActionHandlerType2[T]]

Therefore, TypeAliases can be reliably rendered (ostensibly anyway) in the docs if avoid forward references and use explicit Unions.

P.S. You also must build the docs using Python 3.10 or later....which we always do anyway via .readthedocs.yaml.

@rmartin16 rmartin16 force-pushed the api-types-testing branch 7 times, most recently from 4df6fdf to cb569ba Compare December 3, 2023 21:08
tox.ini Outdated
skip_install = True
deps = ./core[dev]
commands = pre-commit run --all-files --show-diff-on-failure --color=always

# The leading comma generates the "py" environment.
[testenv:py{,38,39,310,311,312}]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: these tox envs can't be run in parallel...or at least not reliably. the simultaneous pip install commands might be an issue but I know that coverage combine is definitely an issue.

@freakboy3742
Copy link
Member

Therefore, TypeAliases can be reliably rendered (ostensibly anyway) in the docs if avoid forward references and use explicit Unions.

That specific exercise smells a bit like a bug to me.

However, this is also getting to the point where I'm not sure we're actually gaining anything by adding strong type declarations. If Toga's types are useful as a way to pass edge cases upstream to the people working on typing to diagnose bugs, then that's great - but I'm not wild about the idea of putting typing's bleeding edge on our critical path.

@rmartin16 rmartin16 force-pushed the api-types-testing branch 2 times, most recently from 4ac5de4 to 3eae7c1 Compare December 4, 2023 01:29
@rmartin16
Copy link
Member Author

Therefore, TypeAliases can be reliably rendered (ostensibly anyway) in the docs if avoid forward references and use explicit Unions.

That specific exercise smells a bit like a bug to me.

These are all Sphinx bugs; there are several logged issues I found related to TypeAliases, forward references, etc. However, if you avoid using all these things together and just stick to basic typing and assignments, then this all works as expected for documentation; given that, I wouldn't necessarily consider this bleeding edge, per se....perhaps just a bit rough around the edges.

However, this is also getting to the point where I'm not sure we're actually gaining anything by adding strong type declarations. If Toga's types are useful as a way to pass edge cases upstream to the people working on typing to diagnose bugs, then that's great - but I'm not wild about the idea of putting typing's bleeding edge on our critical path.

Well, I think we are gaining something (other than typing purity, that is); we are getting useful and accurate typing for handlers...which was one of the primary impetuses for me to do this (i.e. #2192). I'd also argue this use case isn't really an edge case; right now, the typing for handlers doesn't support async (and, less importantly, generators).

More ideologically, though, I would put forth that inaccurate typing is worse than no typing. Most IDEs are easily configured (some even out-of-the-box) to surface typing issues for Python to the developer. Therefore, if Toga is going to type its public API, I think it should be accurate.

Now...I do not intend this to mean that every type has to be accurate to the extent of the true strictness of the underlying code...but it should still be accurate in its permissiveness. For instance, if we consider handlers, I've implemented an accurate (and complete) typing for them (that's also not overly burdensome, IMHO). However, if we think this goes too far with typing, then we should just type them as Callable; this is as accurate as you can be at this level while still truly representing the actual permissiveness of the code. (I'd have to confirm but Callable[[App], ...] would probably work in cases where positional arguments are used.)

(meta: we're starting to enter the "pragmatism" debate of this PR :) so, with that, I would like to try to establish some guiding principles to help us balance Toga's typing. Hence, I'm putting forth that irrespective of how "true" a particular type is, its permissiveness should always be accurate.)

@freakboy3742
Copy link
Member

(meta: we're starting to enter the "pragmatism" debate of this PR :) so, with that, I would like to try to establish some guiding principles to help us balance Toga's typing.

+1

Hence, I'm putting forth that irrespective of how "true" a particular type is, its permissiveness should always be accurate.)

Agreed. Elaborating further on this, I'd give high level goals of:

  1. Clarity (i.e., you shouldn't need a 3 year degree in semiotics to be able to read the Sphinx rendition of a type)
  2. Accuracy (i.e., a type declaration shouldn't reject valid inputs)
  3. Specificity (i.e., a type declaration shouldn't accept invalid inputs)
  4. Completeness. (i.e., we'd rather a type not be declared, or be declared loosely, than declared in an incorrect or unclear way)

Balancing the relative priorities of those will be a bit of a balancing act, but they're at least principles to start with (or, at least, principles to start a discussion about on the way to a final set :-)

@rmartin16 rmartin16 force-pushed the api-types-testing branch 3 times, most recently from ff85208 to 1fad2cc Compare December 10, 2023 00:18
@rmartin16 rmartin16 changed the title [POC] Typing ergonomics and guarantees [WIP] Typing ergonomics and guarantees Dec 10, 2023
@rmartin16
Copy link
Member Author

Note about the current state:

You may notice changes that aren't particularly in line with the stated goal of "typing the API surface of Toga". I'm currently experimenting with different functionality of mypy but I will most likely target an end state that will not include functional changes to code or type: ignore comments.

Comment on lines 14 to 16
class OnPressHandler(Protocol):
def __call__(self, widget: Button, **kwargs: Any) -> None:
def __call__(self, widget: Button, **kwargs: Any) -> object:
"""A handler that will be invoked when a button is pressed.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I have all of the the handler Protocols in existence now...but they still have problems outside of not being compatible with async and generator versions.

Positional arguments

  • If we take Button here as an example, users must create a callable where the first positional argument is named widget; this is an unnecessary requirement for users
  • There are two ways to solve this:
    • Rename the variable to __widget and mypy and friends will ignore the argument name
    • Use a positional only signature: def __call__(self, widget: Button, /, **kwargs: Any) -> object:
    • This is the go-forward method in Python to denote this....but this brings up the second problem because it requires users to include **kwargs in their handler functions

Enforcement of kwargs

  • The current signatures are enforcing inclusion of kwargs in users' implementations of handlers
  • IMHO, I think this is an abuse of the typing system in this example....because it is not that there are unknown arguments, it is that there could eventually be unknown arguments
  • In this way, this is prescribing a design pattern that is subjective (albeit reasonable)
  • A similar goal could be accomplished by adding a comment to the docstring about forward compatibility via kwargs

At this point, I'm a bit past hills I'm willing to die on regarding these Protocols 🥲....but if these arguments are pervasive in some way, this is how I'd update the Protocols for now:

class OnPressHandler(Protocol):
    def __call__(self, widget: Button, /) -> object:
        """A handler that will be invoked when a button is pressed.
        
        Future versions may pass new keyword arguments; to maintain forward compatibility
        with Toga in your handlers, add ``**kwargs`` to your handler definition.
        """

Notes:

  • As noted above, the final / says there's one positional argument of type Button
  • The return type of object allows users to return whatever they want. This is changed from None because Toga doesn't actually care and users' functions can do what they want here
  • And finally, users' functions can specify arbitrary keyword arguments after the first argument while maintaining typing compliance

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I have all of the the handler Protocols in existence now...but they still have problems outside of not being compatible with async and generator versions.

Positional arguments

  • If we take Button here as an example, users must create a callable where the first positional argument is named widget; this is an unnecessary requirement for users

  • There are two ways to solve this:

    • Rename the variable to __widget and mypy and friends will ignore the argument name

Blurgh. I don't like that much.

  • Use a positional only signature: def __call__(self, widget: Button, /, **kwargs: Any) -> object:
  • This is the go-forward method in Python to denote this....but this brings up the second problem because it requires users to include **kwargs in their handler functions

Enforcement of kwargs

  • The current signatures are enforcing inclusion of kwargs in users' implementations of handlers
  • IMHO, I think this is an abuse of the typing system in this example....because it is not that there are unknown arguments, it is that there could eventually be unknown arguments
  • In this way, this is prescribing a design pattern that is subjective (albeit reasonable)

Agreed... but I'm actually OK with this. We've gently encouraged the use of **kwargs in any event handler as a forward compatibility aid; a mechanism that actually enforces this would be a good thing, IMHO.

  • A similar goal could be accomplished by adding a comment to the docstring about forward compatibility via kwargs

... which will be read by nobody, and adhered to by even less :-)

At this point, I'm a bit past hills I'm willing to die on regarding these Protocols 🥲....

Totally understood :-)

but if these arguments are pervasive in some way, this is how I'd update the Protocols for now:

FWIW: I'm +1 on the / syntax to enforce the named argument; my preference would be to preserve the **kwargs so that anyone with noisy type checking is "encouraged" to do the right thing. That gives us a little more wiggle room for future backwards compatibility discussions, because we can add a named keyword argument to the spec for a protocol without breaking backwards compatibility. My experience from Django API definition is that if you have a flexible spec, you don't tie yourself in knots trying to work out how to accomodate a necessary change.

@rmartin16
Copy link
Member Author

I spent some time with Sources today.

My goal was to add typing generics to these classes so static code analysis could know the arbitrary types of the data inside ListSource, TreeSource, etc. However....their current implementations do not allow static analysis to infer the types.

If we consider ListSource, the issue stems from the handling of the user provided data which can be:

  • Iterable[Mapping[str, T]]
  • Iterable[Iterable[T]]
  • Iterable[T]

If you try to union these types, the type of T always devolves to Any because their individual "type spaces" (if you will) overlap. That is, an iterable of mappings is a conceptual subtype of an iterable of iterables. That can be avoided, though, if we require an iterable of collections instead. However, the fact that the user-provided data can also just be an iterable of objects....messes this all up and mypy can't figure out what's going on.

That said, if we ignore the "iterable of objects" case, then using Iterable[Mapping[str, T]] | Iterable[Collection[T]] to describe the acceptable user input does work and mypy can track the data types of the values in the data. So, that would at least address some situations...

A similar issue exists for TreeSource since a simple iterable of objects can create the tree.

Alternatively, I suppose, we could re-implement their designs.....and instead, perhaps, use specific constructors based on the type of input. These would obviously be pretty disruptive as well as a worse user experience...

Comment on lines 12 to 25
if TYPE_CHECKING:
if sys.version_info < (3, 10):
from typing_extensions import TypeAlias
else:
from typing import TypeAlias

NodeDataT: TypeAlias = Union[object, Mapping[str, T], Iterable[T]]
TreeSourceDataT: TypeAlias = Union[
object,
Mapping[NodeDataT[T], "TreeSourceDataT[T]"],
Iterable[Tuple[NodeDataT[T], "TreeSourceDataT[T]"]],
]
else:
TreeSourceDataT = None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a note, we end up with weird TYPE_CHECKING blocks like this if we continue managing typing_extensions the way. That is, by including it as a dev requirement, we can't freely use TypeAlias (or anything else we'll need from it) and as a consequence, we can't assume anything defined as a TypeAlias is available elsewhere for importing.

Interestingly, I didn't know I had all these import issues until I actually tried running a real app with Python 3.9. Neither the unit tests or testbed exercise in CI caught it.

The alternative would be to add typing_extensions as a full requirement....but I definitely understand the reluctance there.

Just as a further note, the design of typing_extensions basically lets you use it exclusively. The devs simply alias the backported classes to the real deal once youre on a supported Python. Furthermore, though, they backport features of classes; so, TypeVar, for instance, has features in 3.12 that do not exist in stdlib in 3.10. But if you use the TypeVar from typing_extensions, you get those features even on 3.10.

Finally, pyupgrade supports typing_extensions and will automatically revert its use to typing when only appropriate Pythons are supported.

Mostly just wanted to lay this out in one place.

@freakboy3742
Copy link
Member

Alternatively, I suppose, we could re-implement their designs.....and instead, perhaps, use specific constructors based on the type of input. These would obviously be pretty disruptive as well as a worse user experience...

FWIW: I'm not fundamentally opposed to some redesign here if it will help - however, I suspect this is just one of those areas where we see the hard edges of strong typing. We're deliberately exposing an interface that is flexible, which is the exact opposite of what strong typing wants to do.

@rmartin16
Copy link
Member Author

Alternatively, I suppose, we could re-implement their designs.....and instead, perhaps, use specific constructors based on the type of input. These would obviously be pretty disruptive as well as a worse user experience...

FWIW: I'm not fundamentally opposed to some redesign here if it will help - however, I suspect this is just one of those areas where we see the hard edges of strong typing. We're deliberately exposing an interface that is flexible, which is the exact opposite of what strong typing wants to do.

At this point, I'm thinking I'm just going to remove the ListSource and TreeSource typing from this PR. I was hoping to get to a final design for them today....but I don't really have one. So, instead of continuing to hold all of this up, I'll try to finalize what is ready to go tomorrow.

@rmartin16
Copy link
Member Author

The typing updates to the code are complete at this point; I removed the attempt to make ListSource and TreeSource generic. I do still need to assess any impact to documentation from the new TypeAliases or TypeVars, though..

@rmartin16 rmartin16 changed the title [WIP] Typing ergonomics and guarantees Typing updates Jun 8, 2024
@rmartin16 rmartin16 force-pushed the api-types-testing branch 2 times, most recently from 9d8152a to 76da8fe Compare June 10, 2024 21:50
@rmartin16
Copy link
Member Author

I reviewed all of the TypeAlias definitions and their documentation; they should all be in alignment now based on how the existing ones were documented. As such, the PR is done....now if the docs build will just be allowed to finish....

@freakboy3742
Copy link
Member

Looks like the RTD problem is because Microsoft's web servers are either being flaky, or they've blackholed a URL we were using....

@rmartin16 rmartin16 marked this pull request as ready for review June 11, 2024 01:16
@rmartin16
Copy link
Member Author

Just saw this in that last re-run for Windows...curious to see a stacktrace in the tests....

tests/widgets/test_table.py::test_remove_all_columns Traceback (most recent call last):
PASSED  File "D:\a\toga\toga\testbed\build\testbed\windows\app\src\app_packages\toga_winforms\widgets\table.py", line 176, in _new_item
    icon = icon(self._accessors[0])
  File "D:\a\toga\toga\testbed\build\testbed\windows\app\src\app_packages\toga_winforms\widgets\table.py", line 90, in winforms_retrieve_virtual_item
    e.Item = self._new_item(e.ItemIndex)
  File "D:\a\toga\toga\testbed\build\testbed\windows\app\src\app_packages\toga_winforms\libs\wrapper.py", line 22, in __call__
    return function(*args, **kwargs)
   at Python.Runtime.PythonException.ThrowLastAsClrException() in /tmp/build-via-sdist-h3j67_wc/pythonnet-3.0.3/src/runtime/PythonException.cs:line 53
   at Python.Runtime.Dispatcher.TrueDispatch(Object[] args) in /tmp/build-via-sdist-h3j67_wc/pythonnet-3.0.3/src/runtime/DelegateManager.cs:line 341
   at Python.Runtime.Dispatcher.Dispatch(Object[] args) in /tmp/build-via-sdist-h3j67_wc/pythonnet-3.0.3/src/runtime/DelegateManager.cs:line 208
   at __System_Windows_Forms_RetrieveVirtualItemEventHandlerDispatcher.Invoke(Object , RetrieveVirtualItemEventArgs )
   at System.Windows.Forms.ListView.OnRetrieveVirtualItem(RetrieveVirtualItemEventArgs e)
   at System.Windows.Forms.ListView.WmReflectNotify(Message& m)
   at System.Windows.Forms.ListView.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
              [ 84%]
tests/widgets/test_table.py::test_cell_icon PASSED                       [ 85%]

@freakboy3742
Copy link
Member

Just saw this in that last re-run for Windows...curious to see a stacktrace in the tests....

Hrm... that's a new one for me - haven't seen that before.

Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My brain is bleeding after going through all this... but AFAICT, it's all fairly straightforward type updates (and occasional code cleanups, I'm guessing some of which triggered type checks).

I reiterate my original concern that these types will drift to being incorrect over time without automated checking, but I also acknowledge that automated checking will only serve to make the types even more incomprehensible, so I guess we'll just need to live with the occasional "type fixes" PR when someone notices a problem.

The two things of substance I noticed in my review:

  1. The documentation links for IconContentT and ImageContentT (et al) seem to be broken - the types are documented, but the hyperlinks aren't working
  2. The type annotation for SplitContainer.container is tuple... which doesn't seem right.

However, neither of these are new problems - the existing docs aren't hyperlinking IconContentT, and you've only corrected the types (and docs) for SplitContainer relative to what is there. Given how much effort it will be to keep this PR up to date, I'm going to merge it as is, and we can deal with those problems as follow ups in the future.

def _create_impl(self):
return self.factory.App(interface=self)
def _create_impl(self) -> None:
self.factory.App(interface=self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that's an interesting oddity... I've never noticed that we're not actually using the result of the factory call, relying on the factory to set the attribute on the interface.

Not something we need to fix here, but I'll remember to clean this up when I'm landing the #2244 changes.

core/src/toga/platform.py Show resolved Hide resolved
@freakboy3742 freakboy3742 merged commit 04daf6f into beeware:main Jun 11, 2024
31 of 34 checks passed
@rmartin16 rmartin16 deleted the api-types-testing branch June 11, 2024 05:57
@rmartin16
Copy link
Member Author

As far as IconContentT is concerned, my understanding from reviewing the history was that you never were able to get the type annotation in the API docs to link to your c:type:: documentation....the best you could do is link to the ctype doc in the description of the argument itself.

I think I've ultimately landed on that for a lot of this a custom sphinx extension will be necessary to really achieve the desired balance between accurate types and comprehensible documentation.

@freakboy3742
Copy link
Member

As far as IconContentT is concerned, my understanding from reviewing the history was that you never were able to get the type annotation in the API docs to link to your c:type:: documentation....the best you could do is link to the ctype doc in the description of the argument itself.

I think I've ultimately landed on that for a lot of this a custom sphinx extension will be necessary to really achieve the desired balance between accurate types and comprehensible documentation.

Yeah - looking closer at the problem refreshed my memory; I think you're right on both counts. It's still worth logging as an issue in case someone is motivated to look into it - see #2638.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants