Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UX] minor optimizations for launch and introduce py-spy #4495

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,9 +50,10 @@ pytest tests/test_smoke.py --generic-cloud azure

For profiling code, use:
```
pip install tuna # Tuna is used for visualization of profiling data.
python3 -m cProfile -o sky.prof -m sky.cli status # Or some other command
tuna sky.prof
pip install py-spy # py-spy is a sampling profiler for Python programs
py-spy record -t -o sky.svg -- python -m sky.cli status # Or some other command
py-spy top -- python -m sky.cli status # Get a live top view
py-spy -h # For more options
```

#### Testing in a container
Expand Down
24 changes: 15 additions & 9 deletions sky/adaptors/common.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""Lazy import for modules to avoid import error when not used."""
import functools
import importlib
import threading
from typing import Any, Callable, Optional, Tuple


Expand All @@ -24,17 +25,22 @@ def __init__(self,
self._module = None
self._import_error_message = import_error_message
self._set_loggers = set_loggers
self._lock = threading.RLock()

def load_module(self):
if self._module is None:
try:
self._module = importlib.import_module(self._module_name)
if self._set_loggers is not None:
self._set_loggers()
except ImportError as e:
if self._import_error_message is not None:
raise ImportError(self._import_error_message) from e
raise
# Avoid extra imports when multiple threads try to import the same
# module. The overhead is minor since import can only run in serial
# due to GIL even in multi-threaded environments.
with self._lock:
if self._module is None:
try:
self._module = importlib.import_module(self._module_name)
if self._set_loggers is not None:
self._set_loggers()
except ImportError as e:
if self._import_error_message is not None:
raise ImportError(self._import_error_message) from e
raise
return self._module

def __getattr__(self, name: str) -> Any:
Expand Down
12 changes: 10 additions & 2 deletions sky/usage/usage_lib.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
import contextlib
import datetime
import enum
import inspect
import json
import os
import time
Expand All @@ -12,19 +11,28 @@
from typing import Any, Callable, Dict, List, Optional, Union

import click
import requests

import sky
from sky import sky_logging
from sky.adaptors import common as adaptors_common
from sky.usage import constants
from sky.utils import common_utils
from sky.utils import env_options
from sky.utils import ux_utils

if typing.TYPE_CHECKING:
import inspect

import requests

from sky import resources as resources_lib
from sky import status_lib
from sky import task as task_lib
else:
# requests and inspect cost ~100ms to load, which can be postponed to
# collection phase or skipped if user specifies no collection
requests = adaptors_common.LazyImport('requests')
inspect = adaptors_common.LazyImport('inspect')
Comment on lines +32 to +35
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, it might be good to have the usage being called asynchronously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, concurrency using threads/coroutines cannot leverage multiple CPU cores due to GIL and the network IO in usage_lib (_send_to_loki) happens when cli exiting. It might be okay to send the log synchronously now, since we are going to wait for _send_to_loki complete anyway.


logger = sky_logging.init_logger(__name__)

Expand Down
2 changes: 2 additions & 0 deletions sky/utils/log_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@
from typing import Callable, Iterator, List, Optional, TextIO, Type

import colorama
# slow due to https://github.com/python-pendulum/pendulum/issues/808
# FIXME(aylei): bump pendulum if it get fixed
import pendulum
Comment on lines +8 to 10
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can even get rid of this package, as we have also been observing inaccurate translation of the timestamp to readable time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any issue with this problem? If not, I'd like to open one as a followup~

import prettytable

Expand Down
Loading