
AbstractGP and Kriging perform badly due to lack of hyperparameter optimisation #328

Open
st-- opened this issue Mar 29, 2022 · 4 comments

Comments

st-- (Contributor) commented Mar 29, 2022

If we can't improve this outright, it would at least be good to make it clear in the documentation, as the current state is quite confusing if you don't dive into the code and realize what's missing (e.g. see #251). This might be partially resolved by #224, but to be competitive with other packages such as mogp-emulator a lot more work is needed; the package doesn't work out of the box. (E.g. beyond hyperparameter optimisation, careful initialisation of the hyperparameters and priors on the parameters would also be required.)
Happy to add a more detailed explanation if required.

vikram-s-narayan (Contributor):

Yes. I will add this info to the documentation. Thank you!

vikram-s-narayan (Contributor) commented Apr 8, 2022

@st--

I'm planning on adding the following example to the documentation.

```julia
# This is a starter example for how to find optimal initial hyperparameters
# via random search with Hyperopt.jl.

using Surrogates
using AbstractGPs
using Hyperopt

sp(x) = sum(x .^ 2)
n_samples = 50
lower_bound = [-5.12, -5.12]
upper_bound = [5.12, 5.12]

xys = sample(n_samples, lower_bound, upper_bound, SobolSample())
zs = sp.(xys)
# Only one validation point is taken in this example; more points can give better results.
true_val = sp((0.0, 0.0))

function surrogate_err_min(kernelType, Σcandidate)
    candidate_gp_surrogate = AbstractGPSurrogate(xys, zs, gp = kernelType, Σy = Σcandidate)
    return candidate_gp_surrogate((0.0, 0.0)) - true_val
end

ho = @hyperopt for i = 100,
        sampler = RandomSampler(),
        a = [GP(SqExponentialKernel()), GP(Matern32Kernel()), GP(Matern52Kernel())],
        b = LinRange(0, 1, 100)
    @show surrogate_err_min(a, b)
end
```

Is this in line with your suggestion?

st-- (Contributor, issue author) commented Apr 14, 2022

Hi @vikram-s-narayan, just throwing Hyperopt.jl at it is definitely better than not optimising at all, but if I understand your example correctly, it makes a bunch of limiting assumptions:

  • it only optimises the noise variance, not the kernel hyperparameters (e.g. signal variance, lengthscale)
  • it treats the GP model the same way you would e.g. a neural network, where you have no guarantees on anything, and simply minimises the error on some validation points (NB: shouldn't your surrogate_err_min return MAE or RMSE (always >= 0) instead of the raw difference (which can be arbitrarily negative)? see the sketch after this list)
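
For illustration, a minimal sketch of that second point, reusing xys, zs, and true_val from the example above and keeping the same AbstractGPSurrogate call (only the return value changes):

```julia
# Sketch: return a non-negative error so the random search cannot "win"
# by driving the prediction arbitrarily far below the true value.
# Reuses xys, zs, and true_val from the example above.
function surrogate_err_min(kernelType, Σcandidate)
    candidate_gp_surrogate = AbstractGPSurrogate(xys, zs, gp = kernelType, Σy = Σcandidate)
    return abs(candidate_gp_surrogate((0.0, 0.0)) - true_val)  # absolute error at the single validation point
end
```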

For GPs as a surrogate model, it'd be great to actually treat them properly: e.g. you can optimise all hyperparameters using the marginal likelihood as the objective, which doesn't require any validation points at all; it works on the training points themselves! See e.g. https://juliagaussianprocesses.github.io/AbstractGPs.jl/stable/examples/1-mauna-loa/#Hyperparameter-Optimization
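
A rough sketch of what that could look like here (assuming the same xys/zs training data from the example above, and calling AbstractGPs.jl and Optim.jl directly rather than going through the Surrogates.jl wrapper; the kernel choice and log-parametrisation are just for illustration):

```julia
# Sketch: fit lengthscale, signal variance, and noise variance by maximising
# the log marginal likelihood on the training data alone, in the spirit of
# the linked AbstractGPs "Mauna Loa" example.
using AbstractGPs
using Optim

# Reuse the training data from the example above; AbstractGPs expects a
# vector of input vectors rather than a vector of tuples.
x_train = [collect(p) for p in xys]
y_train = zs

# theta = log.([lengthscale, signal variance, noise variance]);
# the log-parametrisation keeps all three positive during optimisation.
function negative_log_marginal_likelihood(theta)
    lengthscale, signal_var, noise_var = exp.(theta)
    kernel = signal_var * with_lengthscale(SqExponentialKernel(), lengthscale)
    f = GP(kernel)
    return -logpdf(f(x_train, noise_var), y_train)
end

theta0 = zeros(3)  # start from lengthscale = signal_var = noise_var = 1
result = Optim.optimize(negative_log_marginal_likelihood, theta0, NelderMead())
lengthscale_opt, signal_var_opt, noise_var_opt = exp.(Optim.minimizer(result))
```

The optimised values could then be passed back into the surrogate (or, better, the surrogate could run such an optimisation internally when it is constructed).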

(For more background reading, see these great tutorials: https://distill.pub/2019/visual-exploration-gaussian-processes/ and http://tinyurl.com/guide2gp)

st-- (Contributor, issue author) commented Apr 14, 2022

It'd be good to make clear to readers/users what the limitations/assumptions of the examples are, so when they try it out they know that bad performance might be due to these limitations of your implementation, rather than due to any issues with the underlying method. (Then there's an incentive to improve the implementation, instead of just walking away from it thinking it's useless!)
