Problem when Sobol falls back to HitAndRunPolytopeSampler #2373

Open
parrangoiz opened this issue Apr 17, 2024 · 5 comments
@parrangoiz

I've been experimenting to understand the effect of linear inequality constraints on quasi-random sampling such as Sobol, and I'm seeing cases where the samples don't uniformly fill the polytope defined by the linear constraints (in addition to the usual box constraints) when Sobol is allowed to fall back to HitAndRunPolytopeSampler. I'm wondering if this is related to pytorch/botorch#1225, which was fixed a while back; perhaps the code path Ax follows to call this sampler doesn't pass through the normalization steps that were implemented there?

Here's what I mean. First, generate a 100-sample Sobol sequence without falling back to the polytope sampler; to make this possible, I'll crank up the max number of rejection-sampling tries to 1e5:

import matplotlib.pyplot as plt
import numpy as np

from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.modelbridge.modelbridge_utils import get_pending_observation_features

num_trials = 100
gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=num_trials,
            min_trials_observed=num_trials,
            model_kwargs={
                "seed": 0, 
                "deduplicate": True, 
                "fallback_to_sample_polytope": False
            },
            model_gen_kwargs={}
        )
    ]
)

ax_client = AxClient(generation_strategy=gs)

ax_client.create_experiment(
    name="hartmann_test_experiment",
    parameters=[
        {
            "name": "x1",
            "type": "range",
            "bounds": [0.0, 1.0]
        },
        {
            "name": "x2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x4",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x5",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x6",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    objectives={"hartmann6": ObjectiveProperties(minimize=True)}, 
    parameter_constraints=[
        "x1 + x2 <= 0.1", 
        "x4 - x3 >= 0.5",
        "x5 - x6 >= 0.0"
        ]
)

for i in range(num_trials):
    generator_run = gs.gen(
        experiment=ax_client.experiment,
        data=None,
        n=1,
        pending_observations=get_pending_observation_features(ax_client.experiment),
        model_gen_options={"max_rs_draws": 1e5}
    )
    trial = ax_client.experiment.new_trial(generator_run)

parameters = [t.arm.parameters for t in ax_client.experiment.trials.values()]

# Plot the constrained pairs of parameters: (x1, x2), (x3, x4), (x5, x6)
fig, ax = plt.subplots(1, 3, figsize=(8, 3))
xs = np.array([[p[f"x{i+1}"] for p in parameters] for i in range(6)])
ax[0].scatter(xs[0, :], xs[1, :])
ax[1].scatter(xs[2, :], xs[3, :])
ax[2].scatter(xs[4, :], xs[5, :])

for i in range(3):
    ax[i].set_xlim(0, 1)
    ax[i].set_ylim(0, 1)

Producing all 100 generator runs is slow (15 seconds on my computer), probably due to a tiny acceptance fraction during rejection sampling, but the result looks exactly as we'd expect:
[figure: scatter plots of (x1, x2), (x3, x4), and (x5, x6); the samples fill the constrained regions uniformly]
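
For what it's worth, a back-of-the-envelope estimate of that acceptance fraction, assuming independent uniform draws: P(x1 + x2 <= 0.1) = 0.1^2 / 2 = 0.005, P(x4 - x3 >= 0.5) = 0.5^2 / 2 = 0.125, and P(x5 >= x6) = 0.5, so only about 0.005 * 0.125 * 0.5 ≈ 3.1e-4 of proposals are feasible, i.e. roughly one accepted point per 3,200 draws.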

If I repeat the experiment above, but with "fallback_to_sample_polytope": True and "max_rs_draws": 0 to force the fallback to polytope sampling (the exact changes are shown after the figure), the code runs noticeably faster (4 seconds), but the results look quite biased towards the corners:
[figure: same three scatter plots with the fallback enabled; the samples cluster towards the corners of the feasible regions]
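
For reference, the only changes relative to the first script are the two settings mentioned above; everything else is identical:

# in the GenerationStep, allow the fallback to the polytope sampler
model_kwargs={
    "seed": 0,
    "deduplicate": True,
    "fallback_to_sample_polytope": True
},

# and in the gs.gen() call, give rejection sampling zero draws so the
# fallback is always taken
model_gen_options={"max_rs_draws": 0}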

As a sanity check, I called this sampler directly and got much more reasonable results, although still not as clean-looking as what you get (very slowly) through rejection sampling:

import torch
from torch import tensor
from botorch.utils.sampling import get_polytope_samples

samples = get_polytope_samples(
    n=100,
    bounds=torch.stack((torch.zeros(6), torch.ones(6))),
    inequality_constraints=[
        # x1 + x2 <= 0.1, rewritten as -x1 - x2 >= -0.1
        (torch.arange(6), tensor([-1.0, -1.0, 0, 0, 0, 0]), -0.1),
        # x4 - x3 >= 0.5
        (torch.arange(6), tensor([0, 0, -1.0, 1.0, 0, 0]), 0.5),
        # x5 - x6 >= 0.0
        (torch.arange(6), tensor([0, 0, 0, 0, 1.0, -1.0]), 0.0),
    ],
    seed=0,
)

[figure: scatter plots of the samples from get_polytope_samples; coverage is much more uniform]

I'm still pretty new to this, so I could easily be doing something wrong; feedback would be welcome!

@parrangoiz parrangoiz changed the title Problem when Sobol falls back on HitAndRunPolytopeSampler Problem when Sobol falls back to HitAndRunPolytopeSampler Apr 17, 2024
@Balandat
Contributor

Thanks for flagging this; it does seem like a bug. I'll take a closer look.

@Balandat
Contributor

OK, so the issue is that the hit-and-run sampling fallback in Ax (see here) currently neither applies the normalization you mentioned nor, more importantly, applies thinning to the samples, which results in much more correlated samples. Basically, we need to update the Ax logic to call get_polytope_samples instead. This should be straightforward; the only thing to be careful about is how to manage the seed between draws so things can be made to work deterministically. I can put up a PR for this.
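
A minimal sketch of what the seed handling might look like, assuming get_polytope_samples as the backend (the class and attribute names here are illustrative, not the actual Ax change): advance the seed by the number of points drawn so far, so repeated single-point calls are deterministic without restarting the same chain every time.

import torch
from botorch.utils.sampling import get_polytope_samples

class PolytopeFallbackSampler:
    """Hypothetical wrapper sketching deterministic seed handling."""

    def __init__(self, bounds, inequality_constraints, seed=0):
        self.bounds = bounds
        self.inequality_constraints = inequality_constraints
        self.base_seed = seed
        self.num_drawn = 0  # running offset into the seed sequence

    def draw(self, n):
        samples = get_polytope_samples(
            n=n,
            bounds=self.bounds,
            inequality_constraints=self.inequality_constraints,
            # deterministic across runs, but each call gets a fresh seed
            seed=self.base_seed + self.num_drawn,
        )
        self.num_drawn += n
        return samples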

@parrangoiz
Author

Thanks for looking into this. That sounds right; may as well call it directly since it seems to be working fine. When you say thinning, what exactly does that mean? Sorry, I'm not 100% familiar with all the jargon yet.

@Balandat
Contributor

Thinning (keeping only every k-th sample) is a process to reduce the autocorrelation of samples drawn from an MCMC chain. This happens here: https://github.com/pytorch/botorch/blob/main/botorch/utils/sampling.py#L828 (this is a poor man's implementation; you could avoid storing the discarded samples as you go to save memory, but that is typically not a concern for the sample sizes we use for Bayesian optimization).
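
Roughly, the pattern at that line is the following (a simplified sketch, not the exact botorch code):

def thinned_draw(sampler, n, thinning=32):
    """Draw n approximately decorrelated samples from an MCMC sampler.

    `sampler.draw(m)` is assumed to return m consecutive chain states.
    """
    # run the chain for n * thinning steps...
    dense = sampler.draw(n * thinning)
    # ...and keep only every `thinning`-th state, reducing autocorrelation
    return dense[thinning - 1 :: thinning]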

@parrangoiz
Author

Got it, thanks.
