
Rewrite for named tuples #852

Merged 101 commits on Nov 25, 2024
680bcb1
test
jsfreischuetz Jul 23, 2024
900aeb5
fix test cases for introducing tuple classes
jsfreischuetz Jul 23, 2024
778a718
fix linting issues
jsfreischuetz Jul 23, 2024
4347414
some more linting issues
jsfreischuetz Jul 24, 2024
ae04944
Merge branch 'main' into named-tuples
bpkroth Jul 24, 2024
d7fc6c8
Merge branch 'main' into named-tuples
jsfreischuetz Jul 24, 2024
be8125c
reformat
jsfreischuetz Jul 24, 2024
7a5ec7f
Merge branch 'named-tuples' of github.com:jsfreischuetz/MLOS into nam…
jsfreischuetz Jul 24, 2024
64faf17
significant changes to switch the types of observations and suggestions
jsfreischuetz Aug 21, 2024
a3ce096
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
d4ec442
Update mlos_core/mlos_core/spaces/adapters/adapter.py
jsfreischuetz Nov 14, 2024
6c77168
Update mlos_core/mlos_core/spaces/adapters/adapter.py
jsfreischuetz Nov 14, 2024
c3463bc
Update mlos_core/mlos_core/optimizers/optimizer.py
jsfreischuetz Nov 14, 2024
cdf48ad
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
4b479f8
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
54ee3aa
Update mlos_core/mlos_core/optimizers/bayesian_optimizers/smac_optimi…
jsfreischuetz Nov 14, 2024
fed7473
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
91ce5a8
Update mlos_core/mlos_core/optimizers/optimizer.py
jsfreischuetz Nov 14, 2024
a8a8bf5
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
29f2d53
Update mlos_bench/mlos_bench/optimizers/mlos_core_optimizer.py
jsfreischuetz Nov 14, 2024
f4dc654
Update mlos_bench/mlos_bench/tests/optimizers/toy_optimization_loop_t…
jsfreischuetz Nov 14, 2024
292495f
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
c07eb0f
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
1e5954c
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
0a4c1ec
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
f9862c6
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
d16866b
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
5d01a1b
Update mlos_core/mlos_core/optimizers/observations.py
jsfreischuetz Nov 14, 2024
3440d47
some of the suggested changes
jsfreischuetz Nov 18, 2024
a1f8d57
partial test cases
jsfreischuetz Nov 18, 2024
dfb1e6f
data tests
jsfreischuetz Nov 18, 2024
8db3b44
most tests working
jsfreischuetz Nov 18, 2024
79d84c4
fixed all tests
jsfreischuetz Nov 19, 2024
5024462
update versions, and break everything
jsfreischuetz Nov 20, 2024
c2c0288
Merge branch 'main' of github.com:microsoft/MLOS into new-types
jsfreischuetz Nov 20, 2024
4e73be1
fix test cases
jsfreischuetz Nov 20, 2024
1592c84
fix test cases
jsfreischuetz Nov 20, 2024
f6385d1
revert
jsfreischuetz Nov 20, 2024
3da56fa
Merge branch 'main' of github.com:microsoft/MLOS into new-types
jsfreischuetz Nov 20, 2024
72a70bd
fix notebook
jsfreischuetz Nov 21, 2024
81d9e06
fix linter issue
jsfreischuetz Nov 21, 2024
64e4290
fix line length error
jsfreischuetz Nov 21, 2024
f587692
fix the readme
jsfreischuetz Nov 21, 2024
ed05841
line length issues
jsfreischuetz Nov 21, 2024
623a1bd
docstring issues
jsfreischuetz Nov 21, 2024
c9551d1
more changes
jsfreischuetz Nov 21, 2024
a1e1c19
remove "unused" ignores
jsfreischuetz Nov 21, 2024
df9269c
formatting issues
jsfreischuetz Nov 21, 2024
6e927d7
comments
jsfreischuetz Nov 21, 2024
55ef17b
Update mlos_core/mlos_core/optimizers/flaml_optimizer.py
jsfreischuetz Nov 21, 2024
6724e97
resolve comments
jsfreischuetz Nov 21, 2024
5d7c5a2
Merge branch 'new-types' of github.com:jsfreischuetz/MLOS into new-types
jsfreischuetz Nov 21, 2024
1d7301a
comment
jsfreischuetz Nov 21, 2024
bbd4ec3
comment
jsfreischuetz Nov 21, 2024
944a630
line length issue
jsfreischuetz Nov 21, 2024
9e20709
Merge branch 'main' into new-types
bpkroth Nov 21, 2024
d8f5b53
fix notebook
jsfreischuetz Nov 21, 2024
1eb7811
fix issue
jsfreischuetz Nov 21, 2024
c97d4a3
fix issues that are in my code
jsfreischuetz Nov 21, 2024
1296e90
resolve issue
jsfreischuetz Nov 21, 2024
48138aa
change requests
jsfreischuetz Nov 21, 2024
56e5e22
remove "unneeded" ignore
jsfreischuetz Nov 21, 2024
81d0b7b
merge
jsfreischuetz Nov 21, 2024
0d9cbd3
fix typo
jsfreischuetz Nov 21, 2024
f560aaf
revert
jsfreischuetz Nov 21, 2024
6e5da18
avoid 'futures'
bpkroth Nov 22, 2024
24e7193
pylint: use generator expressions
bpkroth Nov 22, 2024
6ea2631
style nits
bpkroth Nov 22, 2024
d41552b
revert
bpkroth Nov 22, 2024
5c2c57c
revert
bpkroth Nov 22, 2024
24ac8a4
test improvements
bpkroth Nov 22, 2024
6b09497
convert to_list to __iter__
bpkroth Nov 22, 2024
3af4396
fix issue from changing to __iter__
jsfreischuetz Nov 22, 2024
12bdc92
Merge branch 'main' into new-types
jsfreischuetz Nov 22, 2024
0e3fdf3
Merge branch 'new-types' of github.com:jsfreischuetz/MLOS into new-types
jsfreischuetz Nov 22, 2024
1210fc4
remove _register_single
bpkroth Nov 22, 2024
d15fd47
refactor to move _register_single into wrapper classes so we can reim…
bpkroth Nov 22, 2024
483531a
remove duplicate check
bpkroth Nov 22, 2024
1feb5b9
revert
bpkroth Nov 22, 2024
3ae48ed
revert
bpkroth Nov 22, 2024
ed519a1
format
bpkroth Nov 22, 2024
58c77e0
pylint
bpkroth Nov 22, 2024
fd18ff0
start fixing type aliases
bpkroth Nov 22, 2024
726d00b
start removing those type aliases for now
bpkroth Nov 22, 2024
e3a0028
Use properties to prevent mutation, related fixups, change names for …
bpkroth Nov 22, 2024
04a4c46
whitespace
bpkroth Nov 22, 2024
10b65ef
pylint
bpkroth Nov 22, 2024
38a95ef
docstring fixups
bpkroth Nov 22, 2024
a37071f
more docstring fixups
bpkroth Nov 22, 2024
957a5ed
docstring fixups
bpkroth Nov 22, 2024
0e4652f
more
bpkroth Nov 22, 2024
bd79d4e
more fixups
bpkroth Nov 22, 2024
52cf68c
format
bpkroth Nov 22, 2024
7a4f50e
fixups
bpkroth Nov 25, 2024
c193718
fixup
bpkroth Nov 25, 2024
5180455
fix mistaken inversion of the llamatune transforms
bpkroth Nov 25, 2024
7b5d210
docstring formatting
bpkroth Nov 25, 2024
0df8afe
fixup
bpkroth Nov 25, 2024
40e9677
formatting
bpkroth Nov 25, 2024
6648a0c
fixup
bpkroth Nov 25, 2024
94feced
Merge branch 'main' into new-types
bpkroth Nov 25, 2024
1 change: 1 addition & 0 deletions .cspell.json
@@ -21,6 +21,7 @@
"discretization",
"discretize",
"drivername",
"dropna",
"dstpath",
"dtype",
"duckdb",
23 changes: 13 additions & 10 deletions mlos_bench/mlos_bench/optimizers/mlos_core_optimizer.py
@@ -29,6 +29,7 @@
OptimizerType,
SpaceAdapterType,
)
from mlos_core.optimizers.observations import Observations

_LOG = logging.getLogger(__name__)

@@ -128,7 +129,7 @@ def bulk_register(

# TODO: Specify (in the config) which metrics to pass to the optimizer.
# Issue: https://github.com/microsoft/MLOS/issues/745
self._opt.register(configs=df_configs, scores=df_scores)
self._opt.register(observations=Observations(config=df_configs, score=df_scores))

if _LOG.isEnabledFor(logging.DEBUG):
(score, _) = self.get_best_observation()
@@ -198,10 +199,10 @@ def suggest(self) -> TunableGroups:
tunables = super().suggest()
if self._start_with_defaults:
_LOG.info("Use default values for the first trial")
df_config, _metadata = self._opt.suggest(defaults=self._start_with_defaults)
suggestion = self._opt.suggest(defaults=self._start_with_defaults)
self._start_with_defaults = False
_LOG.info("Iteration %d :: Suggest:\n%s", self._iter, df_config)
return tunables.assign(configspace_data_to_tunable_values(df_config.loc[0].to_dict()))
_LOG.info("Iteration %d :: Suggest:\n%s", self._iter, suggestion.config)
return tunables.assign(configspace_data_to_tunable_values(suggestion.config.to_dict()))

def register(
self,
@@ -221,18 +222,20 @@
# TODO: Specify (in the config) which metrics to pass to the optimizer.
# Issue: https://github.com/microsoft/MLOS/issues/745
self._opt.register(
configs=df_config,
scores=pd.DataFrame([registered_score], dtype=float),
observations=Observations(
config=df_config,
score=pd.DataFrame([registered_score], dtype=float),
)
)
return registered_score

def get_best_observation(
self,
) -> Union[Tuple[Dict[str, float], TunableGroups], Tuple[None, None]]:
(df_config, df_score, _df_context) = self._opt.get_best_observations()
if len(df_config) == 0:
best_observations = self._opt.get_best_observations()
if len(best_observations.config.index) == 0:
return (None, None)
params = configspace_data_to_tunable_values(df_config.iloc[0].to_dict())
scores = self._adjust_signs_df(df_score).iloc[0].to_dict()
params = configspace_data_to_tunable_values(best_observations.config.iloc[0].to_dict())
scores = self._adjust_signs_df(best_observations.score).iloc[0].to_dict()
_LOG.debug("Best observation: %s score: %s", params, scores)
return (scores, self._tunables.copy().assign(params))
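The hunks above replace the parallel `configs=`/`scores=` DataFrame arguments with a single `Observations` object passed to `register()`. As a rough sketch of that pattern — these are hypothetical stand-ins using plain dicts, not the actual mlos_core classes, which wrap pandas objects:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Observation:
    """One evaluated configuration bundled with its score (and optional context)."""
    config: Dict[str, float]
    score: Dict[str, float]
    context: Optional[Dict[str, float]] = None

@dataclass
class Observations:
    """A collection of Observation records, replacing parallel config/score tables."""
    observations: List[Observation] = field(default_factory=list)

    def append(self, obs: Observation) -> None:
        self.observations.append(obs)

    def __len__(self) -> int:
        return len(self.observations)

    def best(self, metric: str) -> Observation:
        # Smaller is better, matching a minimization-style optimizer.
        return min(self.observations, key=lambda o: o.score[metric])

obs = Observations()
obs.append(Observation(config={"x": 1.0}, score={"loss": 0.5}))
obs.append(Observation(config={"x": 2.0}, score={"loss": 0.2}))
print(obs.best("loss").config)  # {'x': 2.0}
```

Keeping each config paired with its score in one record is what lets `get_best_observation()` above index a single `best_observations.config` instead of re-aligning two DataFrames by row position.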
@@ -17,7 +17,8 @@
from mlos_bench.optimizers.mock_optimizer import MockOptimizer
from mlos_bench.tunables.tunable_groups import TunableGroups
from mlos_core.optimizers.bayesian_optimizers.smac_optimizer import SmacOptimizer
from mlos_core.util import config_to_dataframe
from mlos_core.optimizers.observations import Suggestion
from mlos_core.util import config_to_series

# For debugging purposes output some warnings which are captured with failed tests.
DEBUG = True
@@ -40,10 +41,13 @@ def _optimize(env: Environment, opt: Optimizer) -> Tuple[float, TunableGroups]:
# pylint: disable=protected-access
if isinstance(opt, MlosCoreOptimizer) and isinstance(opt._opt, SmacOptimizer):
config = tunable_values_to_configuration(tunables)
config_df = config_to_dataframe(config)
config_df = config_to_series(config)
logger("config: %s", str(config))
try:
logger("prediction: %s", opt._opt.surrogate_predict(configs=config_df))
logger(
"prediction: %s",
opt._opt.surrogate_predict(suggestion=Suggestion(config=config_df)),
)
except RuntimeError:
pass

@@ -5,57 +5,38 @@
"""Contains the wrapper classes for base Bayesian optimizers."""

from abc import ABCMeta, abstractmethod
from typing import Optional

import numpy.typing as npt
import pandas as pd

from mlos_core.optimizers.observations import Suggestion
from mlos_core.optimizers.optimizer import BaseOptimizer


class BaseBayesianOptimizer(BaseOptimizer, metaclass=ABCMeta):
"""Abstract base class defining the interface for Bayesian optimization."""

@abstractmethod
def surrogate_predict(
self,
*,
configs: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
) -> npt.NDArray:
def surrogate_predict(self, *, suggestion: Suggestion) -> npt.NDArray:
"""
Obtain a prediction from this Bayesian optimizer's surrogate model for the given
configuration(s).

Parameters
----------
configs : pd.DataFrame
Dataframe of configs / parameters. The columns are parameter names and
the rows are the configs.

context : pd.DataFrame
Not Yet Implemented.
suggestion: Suggestion
The suggestion containing the configuration(s) to predict.
"""
pass # pylint: disable=unnecessary-pass # pragma: no cover

@abstractmethod
def acquisition_function(
self,
*,
configs: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
) -> npt.NDArray:
def acquisition_function(self, *, suggestion: Suggestion) -> npt.NDArray:
"""
Invokes the acquisition function from this Bayesian optimizer for the given
configuration.

Parameters
----------
configs : pd.DataFrame
Dataframe of configs / parameters. The columns are parameter names and
the rows are the configs.

context : pd.DataFrame
Not Yet Implemented.
suggestion: Suggestion
The suggestion containing the configuration(s) to evaluate.
"""
pass # pylint: disable=unnecessary-pass # pragma: no cover
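Note that the new abstract signatures take a single keyword-only `suggestion` parameter (the bare `*` forces keyword arguments). A minimal sketch of that interface style, again using a hypothetical dict-based `Suggestion` stand-in rather than the real mlos_core class:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Suggestion:
    """Hypothetical stand-in: a proposed config plus optional context."""
    config: Dict[str, float]
    context: Optional[Dict[str, float]] = None

class BaseBayesianOptimizer(ABC):
    @abstractmethod
    def surrogate_predict(self, *, suggestion: Suggestion) -> List[float]:
        """Return surrogate-model predictions for the suggested config(s)."""

class ConstantOptimizer(BaseBayesianOptimizer):
    """Toy subclass: predicts 0.0 for every parameter, just to exercise the API."""
    def surrogate_predict(self, *, suggestion: Suggestion) -> List[float]:
        return [0.0 for _ in suggestion.config]

opt = ConstantOptimizer()
print(opt.surrogate_predict(suggestion=Suggestion(config={"x": 1.0, "y": 2.0})))
# [0.0, 0.0]
```

The keyword-only form means call sites read `surrogate_predict(suggestion=...)` and cannot silently swap positional arguments — which matters here, since the old two-DataFrame signature made `configs`/`context` mix-ups easy.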
149 changes: 54 additions & 95 deletions mlos_core/mlos_core/optimizers/bayesian_optimizers/smac_optimizer.py
@@ -11,16 +11,19 @@
from logging import warning
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import TYPE_CHECKING, Dict, List, Optional, Tuple, Union
from typing import TYPE_CHECKING, Dict, List, Optional, Union
from warnings import warn

import ConfigSpace
import numpy as np
import numpy.typing as npt
import pandas as pd
from smac.utils.configspace import convert_configurations_to_array

from mlos_core.optimizers.bayesian_optimizers.bayesian_optimizer import (
BaseBayesianOptimizer,
)
from mlos_core.optimizers.observations import Observation, Observations, Suggestion
from mlos_core.spaces.adapters.adapter import BaseSpaceAdapter
from mlos_core.spaces.adapters.identity_adapter import IdentityAdapter

@@ -269,63 +272,49 @@ def _dummy_target_func(config: ConfigSpace.Configuration, seed: int = 0) -> None
# release: https://github.com/automl/SMAC3/issues/946
raise RuntimeError("This function should never be called.")

def _register(
self,
*,
configs: pd.DataFrame,
scores: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
metadata: Optional[pd.DataFrame] = None,
) -> None:
def _register(self, *, observation: Observation) -> None:
"""
Registers the given configs and scores.

Parameters
----------
configs : pd.DataFrame
Dataframe of configs / parameters. The columns are parameter names and
the rows are the configs.

scores : pd.DataFrame
Scores from running the configs. The index is the same as the index of
the configs.

context : pd.DataFrame
Not Yet Implemented.

metadata: pd.DataFrame
Not Yet Implemented.
observation: Observation
The observation to register.
"""
from smac.runhistory import ( # pylint: disable=import-outside-toplevel
StatusType,
TrialInfo,
TrialValue,
)

if context is not None:
warn(f"Not Implemented: Ignoring context {list(context.columns)}", UserWarning)

# Register each trial (one-by-one)
for config, (_i, score) in zip(
self._to_configspace_configs(configs=configs), scores.iterrows()
):
# Retrieve previously generated TrialInfo (returned by .ask()) or create
# new TrialInfo instance
info: TrialInfo = self.trial_info_map.get(
config,
TrialInfo(config=config, seed=self.base_optimizer.scenario.seed),
if observation.context is not None:
warn(
f"Not Implemented: Ignoring context {list(observation.context.index)}",
UserWarning,
)
value = TrialValue(cost=list(score.astype(float)), time=0.0, status=StatusType.SUCCESS)
self.base_optimizer.tell(info, value, save=False)

# Retrieve previously generated TrialInfo (returned by .ask()) or create
# new TrialInfo instance
config = ConfigSpace.Configuration(
self.optimizer_parameter_space, values=observation.config.dropna().to_dict()
)
info: TrialInfo = self.trial_info_map.get(
config,
TrialInfo(config=config, seed=self.base_optimizer.scenario.seed),
)
value = TrialValue(
cost=list(observation.score.astype(float)), time=0.0, status=StatusType.SUCCESS
)
self.base_optimizer.tell(info, value, save=False)

# Save optimizer once we register all configs
self.base_optimizer.optimizer.save()

def _suggest(
self,
*,
context: Optional[pd.DataFrame] = None,
) -> Tuple[pd.DataFrame, Optional[pd.DataFrame]]:
context: Optional[pd.Series] = None,
) -> Suggestion:
"""
Suggests a new configuration.

@@ -336,49 +325,36 @@

Returns
-------
configuration : pd.DataFrame
Pandas dataframe with a single row. Column names are the parameter names.

metadata : Optional[pd.DataFrame]
Not yet implemented.
suggestion: Suggestion
The suggestion to evaluate.
"""
if TYPE_CHECKING:
# pylint: disable=import-outside-toplevel,unused-import
from smac.runhistory import TrialInfo

if context is not None:
warn(f"Not Implemented: Ignoring context {list(context.columns)}", UserWarning)
warn(f"Not Implemented: Ignoring context {list(context.index)}", UserWarning)

trial: TrialInfo = self.base_optimizer.ask()
trial.config.is_valid_configuration()
self.optimizer_parameter_space.check_configuration(trial.config)
assert trial.config.config_space == self.optimizer_parameter_space
self.trial_info_map[trial.config] = trial
config_df = pd.DataFrame(
[trial.config], columns=list(self.optimizer_parameter_space.keys())
)
return config_df, None
config_sr = pd.Series(dict(trial.config), dtype=object)
return Suggestion(config=config_sr, context=context, metadata=None)

def register_pending(
self,
*,
configs: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
metadata: Optional[pd.DataFrame] = None,
) -> None:
def register_pending(self, *, pending: Suggestion) -> None:
raise NotImplementedError()

def surrogate_predict(
self,
*,
configs: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
) -> npt.NDArray:
def surrogate_predict(self, *, suggestion: Suggestion) -> npt.NDArray:
# pylint: disable=import-outside-toplevel
from smac.utils.configspace import convert_configurations_to_array

if context is not None:
warn(f"Not Implemented: Ignoring context {list(context.columns)}", UserWarning)
if suggestion.context is not None:
warn(
f"Not Implemented: Ignoring context {list(suggestion.context.index)}",
UserWarning,
)
if self._space_adapter and not isinstance(self._space_adapter, IdentityAdapter):
raise NotImplementedError("Space adapter not supported for surrogate_predict.")

@@ -392,55 +368,38 @@ def surrogate_predict(
if self.base_optimizer._config_selector._model is None:
raise RuntimeError("Surrogate model is not yet trained")

config_array: npt.NDArray = convert_configurations_to_array(
self._to_configspace_configs(configs=configs)
config_array = convert_configurations_to_array(
[
ConfigSpace.Configuration(
self.optimizer_parameter_space, values=suggestion.config.to_dict()
)
]
)
mean_predictions, _ = self.base_optimizer._config_selector._model.predict(config_array)
return mean_predictions.reshape(
-1,
)

def acquisition_function(
self,
*,
configs: pd.DataFrame,
context: Optional[pd.DataFrame] = None,
) -> npt.NDArray:
if context is not None:
warn(f"Not Implemented: Ignoring context {list(context.columns)}", UserWarning)
def acquisition_function(self, *, suggestion: Suggestion) -> npt.NDArray:
if suggestion.context is not None:
warn(
f"Not Implemented: Ignoring context {list(suggestion.context.index)}",
UserWarning,
)
if self._space_adapter:
raise NotImplementedError()

# pylint: disable=protected-access
if self.base_optimizer._config_selector._acquisition_function is None:
raise RuntimeError("Acquisition function is not yet initialized")

cs_configs: list = self._to_configspace_configs(configs=configs)
return self.base_optimizer._config_selector._acquisition_function(cs_configs).reshape(
return self.base_optimizer._config_selector._acquisition_function(
suggestion.config.config_to_configspace(self.optimizer_parameter_space)
).reshape(
-1,
)

def cleanup(self) -> None:
if hasattr(self, "_temp_output_directory") and self._temp_output_directory is not None:
self._temp_output_directory.cleanup()
self._temp_output_directory = None

def _to_configspace_configs(self, *, configs: pd.DataFrame) -> List[ConfigSpace.Configuration]:
"""
Convert a dataframe of configs to a list of ConfigSpace configs.

Parameters
----------
configs : pd.DataFrame
Dataframe of configs / parameters. The columns are parameter names and
the rows are the configs.

Returns
-------
configs : list
List of ConfigSpace configs.
"""
return [
ConfigSpace.Configuration(self.optimizer_parameter_space, values=config.to_dict())
for (_, config) in configs.astype("O").iterrows()
]
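A detail worth noting in the `_register` hunk above: the new code calls `observation.config.dropna().to_dict()` before building a `ConfigSpace.Configuration`, dropping unset (NaN) parameters — e.g. inactive conditional hyperparameters — so they are not passed to the optimizer. A pandas-free sketch of that filtering step (`drop_unset` is a hypothetical helper, not part of mlos_core):

```python
import math
from typing import Dict, Optional, Union

Value = Union[float, int, str, None]

def drop_unset(config: Dict[str, Value]) -> Dict[str, Value]:
    """Mimic pandas' Series.dropna(): drop keys whose value is None or NaN,
    so inactive conditional parameters are not handed to the optimizer."""
    return {
        k: v
        for k, v in config.items()
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    }

print(drop_unset({"x": 1.0, "y": float("nan"), "kernel": None, "z": "rbf"}))
# {'x': 1.0, 'z': 'rbf'}
```

With pandas, `pd.Series(config).dropna()` performs the NaN half of this directly; the sketch also drops `None` to keep the dict form well-defined for string-valued parameters.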