Introduce Named Data Classes for mlos_core #811

Closed

jsfreischuetz wants to merge 8 commits into microsoft:main from jsfreischuetz:named-tuples

Contributor

jsfreischuetz commented Jul 23, 2024 •

edited by bpkroth

Introduces @dataclass types for return values from mlos_core.Optimizers to improve code readability.

See also: #751

jsfreischuetz and others added 8 commits

July 23, 2024 16:45


          test

680bcb1


          fix test cases for introducing tuple classes

900aeb5


          fix linting issues

778a718


          some more linting issues


          Merge branch 'main' into named-tuples

ae04944


          Merge branch 'main' into named-tuples

d7fc6c8


          reformat

be8125c


          Merge branch 'named-tuples' of github.com:jsfreischuetz/MLOS into named-tuples

7a5ec7f

jsfreischuetz marked this pull request as ready for review

July 24, 2024 22:49

jsfreischuetz requested a review from a team as a code owner

July 24, 2024 22:49

bpkroth reviewed

mlos_core/mlos_core/optimizers/observations.py

bpkroth reviewed

mlos_core/mlos_core/optimizers/observations.py

+                  elif left is not None and right is not None:
+                      if not left.equals(right):
+                          return False
+                  return True

Contributor

bpkroth Jul 31, 2024

I think this can be simplified a bit:

if left is not None and right is not None and left.equals(right):
  return True
return False
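
For reference, a minimal sketch of the three cases such a helper has to cover: both sides `None`, exactly one side `None`, and both set. The function name is taken from the diff quoted later in this thread; the body is an assumption for illustration, not the PR's code. Taken in isolation, the shortened form above returns `False` when both sides are `None`, so that case is spelled out explicitly here.

```python
from typing import Optional

import pandas as pd


def compare_optional_dataframe(
    left: Optional[pd.DataFrame],
    right: Optional[pd.DataFrame],
) -> bool:
    """Return True if both are None or both hold equal DataFrames."""
    if left is None or right is None:
        # Equal only when both sides are missing.
        return left is None and right is None
    # DataFrame.equals compares shape, dtypes and values (NaNs in the
    # same position count as equal).
    return left.equals(right)


df = pd.DataFrame({"x": [1, 2]})
assert compare_optional_dataframe(None, None)
assert not compare_optional_dataframe(df, None)
assert compare_optional_dataframe(df, df.copy())
```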

bpkroth reviewed

mlos_core/mlos_core/optimizers/observations.py

jsfreischuetz mentioned this pull request

Rewrite for named tuples #852

Merged

bpkroth reviewed

mlos_core/mlos_core/optimizers/observations.py

+                  def __repr__(self) -> str:
+                      return (
+                          f"Observation(config={self.config}, score={self.score},"
+                          + " context={self.context}, metadata={self.metadata})"

Contributor

bpkroth Sep 18, 2024

same concat comment here
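
The earlier concat comment is not visible in this thread; one plausible reading is that the explicit `+` between the string literals is unnecessary. Separately, in the quoted diff only the first literal carries the `f` prefix, so `{self.context}` and `{self.metadata}` would be printed literally. A sketch of the same `__repr__` using adjacent f-string literals, on a hypothetical stand-in class (`ObservationSketch` is not the mlos_core type):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(repr=False)
class ObservationSketch:
    """Hypothetical stand-in for illustration, not the mlos_core class."""

    config: Any
    score: Any
    context: Optional[Any] = None
    metadata: Optional[Any] = None

    def __repr__(self) -> str:
        # Adjacent literals are joined by the parser, so no "+" is needed,
        # but every literal that interpolates needs its own f prefix.
        return (
            f"Observation(config={self.config}, score={self.score},"
            f" context={self.context}, metadata={self.metadata})"
        )


text = repr(ObservationSketch(config={"x": 1}, score=0.5))
assert text == "Observation(config={'x': 1}, score=0.5, context=None, metadata=None)"
```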

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

-                      scores: pd.DataFrame,
-                      context: Optional[pd.DataFrame] = None,
-                      metadata: Optional[pd.DataFrame] = None,
+                      observation: Observation,

Contributor

bpkroth Sep 18, 2024

Suggested change

-                      observation: Observation,
+                      observations: Observations,

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

		@@ -150,7 +134,7 @@ def suggest(
		*,
		context: Optional[pd.DataFrame] = None,

Contributor

bpkroth Sep 18, 2024

Suggested change

-                      context: Optional[pd.DataFrame] = None,
+                      context: Optional[pd.Series] = None,

Did this one get replaced by a Series elsewhere?

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

                       """
                       pass  # pylint: disable=unnecessary-pass # pragma: no cover
-                  def get_observations(self) -> Tuple[pd.DataFrame, pd.DataFrame, Optional[pd.DataFrame]]:
+                  def get_observations(self) -> Observation:

Contributor

bpkroth Sep 18, 2024

Suggested change

-                  def get_observations(self) -> Observation:
+                  def get_observations(self) -> Observations:

docstring needs updating too

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

                       ).reset_index(drop=True)
-                      return (configs, scores, contexts if len(contexts.columns) > 0 else None)
+                      return Observation(

Contributor

bpkroth Sep 18, 2024

Suggested change

-                      return Observation(
+                      return Observations(

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

+                      return Observation(
+                          config=configs,
+                          score=scores,
+                          context=contexts if len(contexts.columns) > 0 else None,

Contributor

bpkroth Sep 18, 2024

Suggested change

-                          context=contexts if len(contexts.columns) > 0 else None,
+                          context=contexts if len(contexts) > 0 else None,

?
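
The two checks answer different questions: `len(contexts.columns)` counts columns, while `len(contexts)` counts rows. A small sketch with made-up frames (the names and the `workload` column are illustrative only) shows where they disagree:

```python
import pandas as pd

# Rows but no columns, e.g. only an index was ever recorded.
no_columns = pd.DataFrame(index=range(3))
# A declared column but no rows yet.
no_rows = pd.DataFrame(columns=["workload"])

assert len(no_columns.columns) == 0 and len(no_columns) == 3
assert len(no_rows.columns) == 1 and len(no_rows) == 0
```

Which check is appropriate depends on whether an empty context means "no context columns were tracked" or "no observations were registered".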

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

                   def get_best_observations(
                       self,
                       *,
                       n_max: int = 1,
-                  ) -> Tuple[pd.DataFrame, pd.DataFrame, Optional[pd.DataFrame]]:
+                  ) -> Observation:

Contributor

bpkroth Sep 18, 2024

Suggested change

-                  ) -> Observation:
+                  ) -> Observations:

Contributor

bpkroth Sep 18, 2024

docstring again

bpkroth reviewed

mlos_core/mlos_core/optimizers/optimizer.py

-                      return (configs.loc[idx], scores.loc[idx], None if contexts is None else contexts.loc[idx])
+                      observations = self.get_observations()
+                      idx = observations.score.nsmallest(
+                          n_max, columns=self._optimization_targets, keep="first"

Contributor

bpkroth Sep 18, 2024

Suggested change

-                          n_max, columns=self._optimization_targets, keep="first"
+                          n_max, columns=self._optimization_targets, keep="first",

trailing comma, format fixups

bpkroth reviewed

mlos_core/mlos_core/optimizers/observations.py



		def compare_optional_dataframe(
		left: Optional[pd.DataFrame], right: Optional[pd.DataFrame]

Contributor

bpkroth Sep 20, 2024

Suggested change

-                  left: Optional[pd.DataFrame], right: Optional[pd.DataFrame]
+                  left: Optional[pd.DataFrame], right: Optional[pd.DataFrame],

nit: trailing comma and black reformat

Contributor

bpkroth commented Sep 20, 2024

Got confused and started commenting here again.

I think #852 supersedes this one, so let's close this one.

bpkroth closed this

bpkroth added a commit that referenced this pull request


          Rewrite for named tuples (#852)

b66e134

## Title

Refactor mlos_core APIs to encapsulate related data fields.

## Description

Refactors the mlos_core Optimizer APIs to accept new data types
`Observation`, `Observations` and return `Suggestion`, instead of a mess
of `Tuple[DataFrame, DataFrame, Optional[DataFrame],
Optional[DataFrame]]` that must be named and checked everywhere.

Additionally, this makes it more explicit that `_register` is a bulk
operation that the underlying optimizers do not currently support, and
leaves notes on how that could be added in the future.
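
As a rough sketch of the idea, and not the actual mlos_core definitions: grouping the frames into a dataclass lets callers use named fields instead of unpacking a positional tuple, and lets the alignment check live in one place. The class name `ObservationsSketch`, its field names, and the `latency` column are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass
class ObservationsSketch:
    """Hypothetical bundle of per-trial frames, one row per trial."""

    configs: pd.DataFrame
    scores: pd.DataFrame
    contexts: Optional[pd.DataFrame] = None
    metadata: Optional[pd.DataFrame] = None

    def __post_init__(self) -> None:
        # Validate row alignment once here rather than at every call site.
        for frame in (self.scores, self.contexts, self.metadata):
            if frame is not None and len(frame) != len(self.configs):
                raise ValueError("all frames must have one row per config")

    def __len__(self) -> int:
        return len(self.configs)


obs = ObservationsSketch(
    configs=pd.DataFrame({"x": [1, 2]}),
    scores=pd.DataFrame({"latency": [0.3, 0.1]}),
)
# Named access replaces positional unpacking of a 3- or 4-tuple.
best_idx = obs.scores["latency"].idxmin()
best_config = obs.configs.loc[best_idx]
```

The generated `__eq__` of a dataclass compares fields with `==`, which is ambiguous for DataFrames, so a real implementation would likely need a custom equality method.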

## Type of Change

- Refactor

---

## Testing

Usual CI plus some new unit tests for new data type operations.

---

## Additional Notes

A more significant rewrite of named tuple support inside mlos_core.
This is based on comments in #811 as well as conversations with @bpkroth

---

---------

Co-authored-by: Brian Kroth <[email protected]>
Co-authored-by: Brian Kroth <[email protected]>
