[Core] Improve logging during accelerator auto-detection #45240

HenryZJY · 2024-05-10T05:29:54Z

Why are these changes needed?

This PR improves the user experience during accelerator auto-detection. Currently, errors raised during accelerator auto-detection are suppressed. With this PR, if a user explicitly specifies accelerators as part of the available resources (for example, ray.init(num_gpus=3)) and auto-detection for that resource fails, an error message is logged.

I also added a check in resource_spec.py that logs a message when the raylet expects a certain number of accelerators but the visible accelerator IDs are None.
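For reviewers, here is a rough, self-contained sketch of the two behaviors described above. It is not the code in this PR: `detect_accelerator_count`, `warn_if_no_visible_ids`, and the logger name are hypothetical, as are the choice of log levels and the fallback to 0 on failure. It only illustrates the idea of reporting a detection failure when the user explicitly requested the resource, and of flagging a requested count with no visible IDs.

```python
import logging
from typing import Callable, List, Optional

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("accelerator_autodetect_sketch")


def detect_accelerator_count(
    detect_fn: Callable[[], int],
    resource_name: str,
    user_requested: Optional[int],
) -> int:
    """Run a detection callable; log failures only for user-requested resources.

    Falling back to 0 on failure is an assumption made for this sketch.
    """
    try:
        return detect_fn()
    except Exception as exc:
        # Stay quiet unless the user explicitly asked for this resource.
        if user_requested:
            logger.error(
                "Auto-detection of %s failed although %d were requested: %s",
                resource_name,
                user_requested,
                exc,
            )
        return 0


def warn_if_no_visible_ids(
    resource_name: str,
    requested: Optional[int],
    visible_ids: Optional[List[str]],
) -> bool:
    """Log when accelerators are requested but no visible IDs are known."""
    if requested and visible_ids is None:
        logger.warning(
            "%d %s requested, but the visible accelerator IDs are None.",
            requested,
            resource_name,
        )
        return True
    return False
```

With this shape, a failing detector passed together with `user_requested=3` produces one error log entry, while the same failure with `user_requested=None` stays silent, which matches the current suppressed behavior for resources the user did not ask for.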

Related issue number

Closes #43328

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Henry <[email protected]>

HenryZJY added 3 commits May 11, 2024 13:52

Add user specified flag, log error

2420bb2

Signed-off-by: Henry <[email protected]>

Format python files

32e8eb4

Signed-off-by: Henry <[email protected]>

Fix format

dad8197

Signed-off-by: Henry <[email protected]>

HenryZJY force-pushed the improve-auto-detection branch from cd07342 to dad8197 May 11, 2024 17:52

HenryZJY changed the title ~~Improve logging during accelerator auto-detection~~ [Core] Improve logging during accelerator auto-detection May 11, 2024

Log problem

b8d2511

Signed-off-by: Henry <[email protected]>
