friendlier error messages for missing chunk managers #9676

Open
wants to merge 19 commits into
base: main

Collaborator

@keewis keewis commented Oct 24, 2024

The current error message when trying to use a chunked-array related method without actually having a chunk manager available is:

unrecognized chunk manager dask - must be one of: []

That's pretty confusing, so this catches the case where no chunk manager is available and raises an error with guidance on how to fix that.
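As a rough illustration of the check described here (a minimal sketch, not the code in this PR): `get_chunkmanager` and the `available` mapping are hypothetical stand-ins for however the registered chunk managers are discovered, and the wording of the new message is invented for the example.

```python
def get_chunkmanager(name, available):
    """Hypothetical lookup; `available` maps names to chunk manager objects."""
    if not available:
        # Nothing is registered, so listing the valid names
        # ("must be one of: []") cannot help. Say what to install instead.
        raise ImportError(
            f"chunk manager {name!r} is not available because no chunk managers "
            "are installed. Install a library that provides one, e.g. dask."
        )
    if name not in available:
        # At least one manager exists, so the list of names is informative.
        raise ValueError(
            f"unrecognized chunk manager {name} - must be one of: {list(available)}"
        )
    return available[name]
```

With an empty mapping, `get_chunkmanager("dask", {})` raises the install guidance rather than the empty-list message.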

  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

@max-sixty
Collaborator

Much better!

@mathause
Collaborator

Thanks - I had a PR on this but don't mind closing mine in favor of this one. There is also #7963 which seems related.

@keewis
Collaborator Author

keewis commented Oct 25, 2024

wow, I don't know how I missed two open PRs that aim to do something similar in different ways. Which one do we take?

If we merge this one your PR might still be valuable since it also changes the error message if there are chunk managers but not the one that was requested.

@dcherian
Contributor

dcherian commented Nov 7, 2024

Shall we merge?

@TomNicholas added the topic-error reporting and topic-chunked-arrays (Managing different chunked backends, e.g. dask) labels Nov 8, 2024
@TomNicholas
Member

TomNicholas commented Nov 8, 2024

> wow, I don't know how I missed two open PRs that aim to do something similar in different ways. Which one do we take?

Sorry for dropping the ball on reviewing / merging these guys 😞

Let's merge this one.

> If we merge this one your PR might still be valuable since it also changes the error message if there are chunk managers but not the one that was requested.

This change would also be useful but is much less likely to come up.

Co-authored-by: Tom Nicholas <[email protected]>
@dcherian dcherian mentioned this pull request Nov 18, 2024
6 tasks
Member

@TomNicholas TomNicholas left a comment

While this could be further improved in cases like cubed being installed but dask not (when there would not be 0 chunkmanagers installed), this PR on its own addresses the confusing error that 99% of users are encountering, so it should be merged asap.

@TomNicholas
Member

If you change the other ValueError to ImportError on line 109 of parallelcompat.py the failing test should pass.

    def test_fail_on_nonexistent_chunkmanager(
        self, register_dummy_chunkmanager
    ) -> None:
        with pytest.raises(ImportError, match="unrecognized chunk manager foo"):

Collaborator Author

@keewis keewis Nov 18, 2024

there might be two reasons why this fails: one, there's a typo, in which case I'd say it would be better to raise a ValueError (the issue is the caller's input), while the second reason is indeed that the library that provides the requested chunk manager was not installed or fails to import.

In the case when there's no chunk manager at all, we know that the user needs to install a library and I can understand using an ImportError. However, in the case where there's at least one chunk manager I don't think we can figure out whether the issue was a user error or a missing library (at least, without maintaining a list of known chunk managers and suggesting sufficiently "close" names), so I think this should still be expecting a ValueError:

Suggested change
-        with pytest.raises(ImportError, match="unrecognized chunk manager foo"):
+        with pytest.raises(ValueError, match="unrecognized chunk manager foo"):

Member

You raise a good point about ambiguity, but I also think we could reasonably just say (imprecisely) "with this user input we were unable to import a necessary library, so there's still an ImportError underneath", which then has the advantage of consistency of error types for users.

@dcherian
Contributor

I'm planning to release on Friday US time. Would be good to wrap this up

@TomNicholas
Member

Trying to summarize what we discussed in the meeting on Wednesday:

> at least, without maintaining a list of known chunk managers and suggesting sufficiently "close" names

This would be okay, because right now there are only 3 chunkmanagers we know of, one which ships with xarray. Outside of these 3 a slightly less helpful error message is fine.

This allows our error messages to explicitly point to the correct package to install, e.g. dask/cubed-xarray/arkouda-xarray.
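A small lookup table would be enough for this. The sketch below is only an illustration: the package names are the ones mentioned above, while the keys `"cubed"` and `"arkouda"`, the helper `install_hint`, and the message wording are assumptions. "Close" names are found with the standard library's `difflib.get_close_matches`.

```python
import difflib

# Chunk manager name -> package to install. The package names come from the
# discussion; the keys "cubed" and "arkouda" are assumed for illustration.
KNOWN_CHUNKMANAGERS = {
    "dask": "dask",
    "cubed": "cubed-xarray",
    "arkouda": "arkouda-xarray",
}


def install_hint(name):
    """Return a hint for `name`, or None if nothing useful can be said."""
    if name in KNOWN_CHUNKMANAGERS:
        package = KNOWN_CHUNKMANAGERS[name]
        return f"install {package} to use the {name!r} chunk manager"
    # Probably a typo if the name is close to a known one.
    close = difflib.get_close_matches(name, list(KNOWN_CHUNKMANAGERS), n=1)
    if close:
        return f"did you mean {close[0]!r}?"
    # Outside the known names, fall back to a less helpful message.
    return None
```

For names outside the table the caller would fall back to the generic "unrecognized chunk manager" message.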

We questioned the wisdom of even creating the whole entrypoint system in the first place, but also said that removing it is a separable issue for later, and the priority should be to improve the error messages first.

I can't really remember what we decided about ValueError vs ImportError though...

@keewis
Collaborator Author

keewis commented Nov 22, 2024

Thanks for the summary.

> I can't really remember what we decided about ValueError vs ImportError though...

I don't think we decided anything, but it would make sense to me to raise an ImportError for all known-but-missing chunk managers, and raise a ValueError for everything else where we don't really know whether that's because of a missing package/one that fails to import or a typo

@keewis
Collaborator Author

keewis commented Nov 22, 2024

we probably also want to prefer the more specific error over the generic n_chunkmanagers == 0 error (which would also allow passing a chunkmanager object if no chunk manager is available via the entrypoints)
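One possible ordering of the checks, as a sketch of the rules proposed in this thread rather than the merged code: `ChunkManager`, `KNOWN_CHUNKMANAGERS`, `resolve_chunkmanager`, and all message texts are hypothetical, and the keys `"cubed"` and `"arkouda"` are assumed names.

```python
class ChunkManager:
    """Hypothetical stand-in for a chunk manager base class."""


# Assumed name -> package table for the known chunk managers.
KNOWN_CHUNKMANAGERS = {
    "dask": "dask",
    "cubed": "cubed-xarray",
    "arkouda": "arkouda-xarray",
}


def resolve_chunkmanager(manager, available):
    # 1. An object is used as-is, even when no entrypoint is registered.
    if isinstance(manager, ChunkManager):
        return manager
    # 2. A registered name resolves normally.
    if manager in available:
        return available[manager]
    # 3. Known but missing: we know which package to install, so this
    #    specific error wins over the generic "nothing installed" one.
    if manager in KNOWN_CHUNKMANAGERS:
        raise ImportError(
            f"chunk manager {manager!r} is not available. "
            f"Please install {KNOWN_CHUNKMANAGERS[manager]}."
        )
    # 4. Generic case: nothing is registered at all.
    if not available:
        raise ImportError("no chunk managers available. Try installing dask.")
    # 5. Typo or unknown missing package: we cannot tell which.
    raise ValueError(
        f"unrecognized chunk manager {manager} - must be one of: {list(available)}"
    )
```

Checking for an object first means the empty-registry error never blocks a caller who passes a chunk manager directly.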

@keewis
Collaborator Author

keewis commented Nov 22, 2024

@dcherian, @TomNicholas, this should be ready for a final review (and can hopefully still make it into the release; edit: looks like I was late by a couple minutes)

@dcherian
Contributor

we can always release more!

@keewis
Collaborator Author

keewis commented Nov 22, 2024

the failing zarr tests were because of the automatic chunking in open_zarr (chunks="auto")

@keewis changed the title from "more friendly error message in case no chunk manager is available" to "friendlier error messages for missing chunk managers" Nov 22, 2024
Labels
topic-chunked-arrays Managing different chunked backends, e.g. dask topic-error reporting

6 participants