
Hyperparameter tuning #67

Open
ablaom opened this issue May 5, 2022 · 22 comments

ablaom commented May 5, 2022

Looking a bit into MLJ integration. For better or worse, hyper-parameter optimization (e.g., grid search) in MLJ generally works by mutating the field values of the model struct. I wonder if TableTransforms.jl would consider changing its transformer types to mutable structs? I think in ML applications, at least, any loss in performance would be pretty minimal, but perhaps there are wider use-cases to consider?
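For context, a minimal sketch of the usual MLJ tuning idiom; the DecisionTreeClassifier and the max_depth field are purely illustrative:

using MLJ

Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
tree = Tree()

# the tuning strategy mutates `max_depth` on (copies of) `tree` in place
r = range(tree, :max_depth, lower=1, upper=10)
tuned = TunedModel(model=tree, range=r, tuning=Grid(resolution=10), measure=log_loss)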

The alternative, for our use case, is for the MLJ model wrapper to be mutable (for now a wrapper is necessary anyway) and for a user wanting to do a search to write something like:

values = [Scale(low=0, high=x) for x in 1.0:0.1:10]   # <-- extra step
values = range(wrapped_transformer, :model, values=values)

However, while this might be fine for Grid search, it doesn't really work for other optimization strategies.

Thoughts?


ablaom commented May 5, 2022

I suppose with some work we could create a wrapper that essentially "mutafies" the original model (overload setproperty! to re-generate the atomic model each time it's called) but that would be kinda complicated.

juliohm added the question (Further information is requested) label May 5, 2022

juliohm commented May 5, 2022

I wonder if hyperparameter tuning could leverage Setfield.jl or its successor Accessors.jl instead of enforcing mutability of the structs? My understanding is that mutable structs in Julia are only justified when we have hot loops. In other words, mutable structs feel like an anti-pattern in the language unless you have an "array-like" struct that you need to setindex! many times.
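For illustration, a minimal sketch of that approach, assuming the Scale(low=..., high=...) keyword constructor (the values are arbitrary):

using Accessors   # Setfield.jl provides the same @set macro

t  = Scale(low=0.0, high=1.0)   # immutable transform
t2 = @set t.high = 10.0         # fresh instance with high = 10.0; t is untouched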

My main concern with enforcing mutability is that it may compromise parallelization later on. I am not an expert in compiler technology, but I know that non-mutable structs help the compiler do more aggressive optimizations. And sending copies of these tiny transforms to different threads and processes should be more performant than dealing with chunks of memory on the heap. Again, I am not an expert in compilers, but my intuition points to the non-mutable design as the more conservative and performant one.

@eliascarv do you have any comments to add?


eliascarv commented May 5, 2022

I have an idea: we could create a wrapper type that makes Transforms behave as mutable.
Something like this:

mutable struct Mut{T<:Transform}
  transform::T
end

# use getfield to reach the wrapped transform without recursing into getproperty
Base.getproperty(transf::Mut, prop::Symbol) = getproperty(getfield(transf, :transform), prop)

function Base.setproperty!(transf::Mut, prop::Symbol, val)
  # rebuild the (immutable) wrapped transform with `prop = val` and store it back
end

apply(transf::Mut, table) = apply(getfield(transf, :transform), table)

# usage
statictransf = Scale(low=0, high=1)
mutabletransf = Mut(statictransf)


juliohm commented May 5, 2022

Yes, it is always possible to wrap the structs as @ablaom mentioned.

Let's postpone this decision to a future major release when hyperparameter tuning can be considered.

juliohm changed the title from "Make transformer structs mutable?" to "Hyperparameter tuning" May 5, 2022
juliohm added the feature label and removed the question (Further information is requested) label May 5, 2022

ablaom commented May 6, 2022

I wonder if hyperparameter tuning could leverage Setfield.jl

We already have an in-house recursive setproperty!. I'm not understanding how that helps us, though, except to make the proposed wrapper easier to implement. Perhaps that's what you meant?


juliohm commented May 9, 2022

I mean that there is nothing intrinsic to hyperparameter tuning that requires types to be mutable. We can certainly provide an API that extracts the parameters of a pipeline into a "flat vector" of parameters, performs an optimization step, and then instantiates a new pipeline with improved parameters. The amount of computation saved by forcing the structs to be mutable is not very clear, especially considering that we are generally talking about a couple dozen parameters.
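As a rough sketch of that idea (the extractparams/withparams helpers below are hypothetical, not part of any package):

# hypothetical helpers, shown only to illustrate the flat-vector idea
extractparams(t::Scale) = [t.low, t.high]                   # flatten parameters into a vector
withparams(::Type{Scale}, θ) = Scale(low=θ[1], high=θ[2])   # rebuild a fresh immutable instance

t  = Scale(low=0.0, high=1.0)
θ  = extractparams(t)
θ′ = θ .+ 0.1                   # stand-in for one optimization step
t′ = withparams(Scale, θ′)      # new transform with "improved" parameters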

@CameronBieganek

This proposal for MLJTuning would allow MLJ to handle immutable pipelines.

@ParadaCarleton

@juliohm has there been any work on hyperparameter tuning with TableTransforms.jl? I'd like to add asinh (pseudolog) transformations to the package, but that would require the ability to tune 1 or 2 hyperparameters.


juliohm commented Oct 25, 2023 via email

We are starting to brainstorm an API for this feature. It would be nice to have an additional hand :)

@ParadaCarleton

We are starting to brainstorm an API for this feature. It would be nice to have an additional hand :)

Sounds good! Should we just stick to the MLJ model interface for now?


juliohm commented Oct 25, 2023

Sounds good! Should we just stick to the MLJ model interface for now?

What interface exactly? Can you share a MWE?

Also, take a look at StatsLearnModels.jl where we already have a Learn transform with statistical learning models.

The goal is to be able to tune entire transform pipelines, which may or may not include Learn components.

@ParadaCarleton

Sounds good! Should we just stick to the MLJ model interface for now?

What interface exactly? Can you share a MWE?

This interface.


juliohm commented Oct 25, 2023

This interface is already supported via package extensions in StatsLearnModels.jl. That means you can use any MLJ.jl model there directly, and we wrap it into a more general, table-based interface.
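For example, a rough sketch assuming the Learn(train, model, incols => outcols) constructor and illustrative table/column names:

using StatsLearnModels
using MLJ

# pick any MLJ.jl model; the package extension wraps it for us
Tree  = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
model = Tree()

# Learn fits `model` on the `train` table; applying the resulting transform
# to another table appends the predicted :y column
learn = Learn(train, model, [:x1, :x2] => :y)
pred  = test |> learn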

@ParadaCarleton

Ahh, sounds perfect!

That means that you can use any MLJ.jl model there directly and we will wrap the interface into a more general interface with tables.

Does the converse hold as well? That is, if I define a StatsLearnModels.jl model, will it work with MLJ.jl?


juliohm commented Oct 25, 2023

Does the converse hold as well? That is, if I define a StatsLearnModels.jl model, will it work with MLJ.jl?

Probably not, as our interface is a bit more general. For example, we don't assume that predictions are univariate, but as far as I remember MLJ.fit assumes X is a matrix and y is a vector (a single column).

@ParadaCarleton

Probably not, as our interface is a bit more general.

Hmm, if that's the way it goes, I think it should be possible to make SLM.jl models work with MLJModelInterface, so that a model defined in SLM.jl also works as an MLJ model (although not all SLM.jl features, e.g. multivariate prediction, would carry over). Alternatively, MLJModelInterface could be generalized to handle the cases covered by SLM.jl. @ablaom, any thoughts?


juliohm commented Oct 25, 2023

What features of the MLJ stack are you missing @ParadaCarleton? Perhaps that is the right question to ask?

@ParadaCarleton

Right now, model tuning. In the future, I just want to avoid duplicating work across the MLJ and JuliaML ecosystems.


juliohm commented Oct 26, 2023

Right now, model tuning.

That is one of the next steps in our roadmap. Stay tuned.

In the future, I just want to avoid duplicating work across the MLJ and JuliaML ecosystems.

These are different ecosystems with diverging design decisions. I would pick the one that fits your needs and devote your energy to it. Keep in mind that we assessed MLJ and other alternatives before starting a parallel initiative. There are some fundamental design differences that matter a lot in practice.

@ParadaCarleton

These are different ecosystems with diverging design decisions.

How/why? MLJ doesn't support multivariate targets yet, but that's a feature they've wanted for quite a while now.

That is one of the next steps in our roadmap. Stay tuned.

OK, but is there no way to use MLJTuning on a small transformation from TableTransforms.jl?


juliohm commented Oct 26, 2023

How/why? MLJ doesn't support multivariate targets yet, but that's a feature they've wanted for quite a while now.

To name a few diverging views:

OK, but is there no way to use MLJTuning on a small transformation from TableTransforms.jl?

I honestly don't know because we don't use MLJ.jl in our industrial projects anymore. I appreciate your attempt to improve collaboration, but that is very unlikely given the different application requirements.

We are happy to collaborate with specific modules that may benefit both stacks, but that is it.

@ParadaCarleton

ParadaCarleton commented Oct 26, 2023

These are implemented in MLJModels.jl, where transforms are considered a special class of unsupervised models. I would agree that the name MLJModels is very unintuitive, though: even if a transformation is technically a kind of unsupervised model in the MLJ classification, that's definitely not an intuitive way to think about it.

How about we provide a separate type for transforms and host them in a separate package (TableTransforms.jl), but have them reexported by MLJModels.jl?

Model interface that supports multivariate prediction in StatsLearnModels.jl

I think MLJModels would be ecstatic to support this interface! It's a very requested feature.
