Problem when Sobol falls back to HitAndRunPolytopeSampler #2373

Open
parrangoiz opened this issue Apr 17, 2024 · 5 comments
@parrangoiz

I've been experimenting to understand the effect of linear inequality constraints on quasi-random sampling such as Sobol, and I'm seeing cases where the samples don't uniformly fill the polytope defined by the linear constraints (in addition to the usual box constraints) when Sobol is allowed to fall back to HitAndRunPolytopeSampler. I'm wondering if this is related to pytorch/botorch#1225, which was fixed a while back; perhaps the code path Ax follows to call this sampler doesn't pass through the normalization steps that were implemented there?

Here's what I mean. First, generate a 100-sample Sobol sequence without falling back to the polytope sampler; to make this possible, I'll crank up the max number of rejection-sampling tries to 1e5:

import matplotlib.pyplot as plt
import numpy as np

from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.modelbridge.modelbridge_utils import get_pending_observation_features

num_trials = 100
gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=num_trials,
            min_trials_observed=num_trials,
            model_kwargs={
                "seed": 0, 
                "deduplicate": True, 
                "fallback_to_sample_polytope": False
            },
            model_gen_kwargs={}
        )
    ]
)

ax_client = AxClient(generation_strategy=gs)

ax_client.create_experiment(
    name="hartmann_test_experiment",
    parameters=[
        {
            "name": "x1",
            "type": "range",
            "bounds": [0.0, 1.0]
        },
        {
            "name": "x2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x4",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x5",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x6",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    objectives={"hartmann6": ObjectiveProperties(minimize=True)}, 
    parameter_constraints=[
        "x1 + x2 <= 0.1", 
        "x4 - x3 >= 0.5",
        "x5 - x6 >= 0.0"
        ]
)

for i in range(num_trials):
    generator_run = gs.gen(
        experiment=ax_client.experiment,
        data=None,
        n=1,
        pending_observations=get_pending_observation_features(ax_client.experiment),
        model_gen_options={"max_rs_draws": 1e5}
    )
    trial = ax_client.experiment.new_trial(generator_run)

parameters = [t.arm.parameters for t in ax_client.experiment.trials.values()]

# Plot the constrained pairs of parameters: (x1, x2), (x3, x4), (x5, x6)
fig, ax = plt.subplots(1, 3, figsize=(8, 3))
xs = np.array([[p[f"x{i+1}"] for p in parameters] for i in range(6)])
ax[0].scatter(xs[0, :], xs[1, :])
ax[1].scatter(xs[2, :], xs[3, :])
ax[2].scatter(xs[4, :], xs[5, :])

for i in range(3):
    ax[i].set_xlim(0, 1)
    ax[i].set_ylim(0, 1)

Producing all 100 generator runs is slow (15 seconds on my computer), probably due to a tiny acceptance fraction during rejection sampling, but the result looks exactly as we'd expect:
[figure: scatter plots of (x1, x2), (x3, x4), and (x5, x6); the samples fill the constrained regions uniformly]
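
For what it's worth, a back-of-the-envelope estimate of that acceptance fraction, assuming independent uniform draws: P(x1 + x2 <= 0.1) = 0.1^2 / 2 = 0.005, P(x4 - x3 >= 0.5) = 0.5^2 / 2 = 0.125, and P(x5 >= x6) = 0.5, so only about 0.005 * 0.125 * 0.5 ≈ 3.1e-4 of proposals are feasible, i.e. roughly one accepted point per 3,200 draws.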

If I repeat the experiment above, but with "fallback_to_sample_polytope": True and "max_rs_draws": 0 to force the fallback to polytope sampling (the exact changes are shown after the figure), the code runs noticeably faster (4 seconds), but the results look quite biased towards the corners:
[figure: same three scatter plots with the fallback enabled; the samples cluster towards the corners of the feasible regions]
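
For reference, the only changes relative to the first script are the two settings mentioned above; everything else is identical:

# in the GenerationStep, allow the fallback to the polytope sampler
model_kwargs={
    "seed": 0,
    "deduplicate": True,
    "fallback_to_sample_polytope": True
},

# and in the gs.gen() call, give rejection sampling zero draws so the
# fallback is always taken
model_gen_options={"max_rs_draws": 0}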

As a sanity check, I called this sampler directly and got much more reasonable results, although still not as clean-looking as what you get (very slowly) through rejection sampling:

import torch
from torch import tensor
from botorch.utils.sampling import get_polytope_samples

samples = get_polytope_samples(
    n=100,
    bounds=torch.stack((torch.zeros(6), torch.ones(6))),
    inequality_constraints=[
        # x1 + x2 <= 0.1, rewritten as -x1 - x2 >= -0.1
        (torch.arange(6), tensor([-1.0, -1.0, 0, 0, 0, 0]), -0.1),
        # x4 - x3 >= 0.5
        (torch.arange(6), tensor([0, 0, -1.0, 1.0, 0, 0]), 0.5),
        # x5 - x6 >= 0.0
        (torch.arange(6), tensor([0, 0, 0, 0, 1.0, -1.0]), 0.0),
    ],
    seed=0,
)

[figure: scatter plots of the samples from get_polytope_samples; coverage is much more uniform]

I'm still pretty new to this, so I could easily be doing something wrong; feedback would be welcome!

@parrangoiz parrangoiz changed the title Problem when Sobol falls back on HitAndRunPolytopeSampler Problem when Sobol falls back to HitAndRunPolytopeSampler Apr 17, 2024
@Balandat
Contributor

Thanks for flagging this; it does seem like a bug. I'll take a closer look.

@Balandat
Contributor

OK, so the issue is that the hit-and-run sampling fallback in Ax (see here) currently neither applies the normalization you mentioned nor, more importantly, applies thinning to the samples, which results in much more correlated samples. Basically, we need to update the Ax logic to call get_polytope_samples instead. This should be straightforward; the only thing to be careful about is how to manage the seed between draws so things can be made to work deterministically. I can put up a PR for this.
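
A minimal sketch of what the seed handling might look like, assuming get_polytope_samples as the backend (the class and attribute names here are illustrative, not the actual Ax change): advance the seed by the number of points drawn so far, so repeated single-point calls are deterministic without restarting the same chain every time.

import torch
from botorch.utils.sampling import get_polytope_samples

class PolytopeFallbackSampler:
    """Hypothetical wrapper sketching deterministic seed handling."""

    def __init__(self, bounds, inequality_constraints, seed=0):
        self.bounds = bounds
        self.inequality_constraints = inequality_constraints
        self.base_seed = seed
        self.num_drawn = 0  # running offset into the seed sequence

    def draw(self, n):
        samples = get_polytope_samples(
            n=n,
            bounds=self.bounds,
            inequality_constraints=self.inequality_constraints,
            # deterministic across runs, but each call gets a fresh seed
            seed=self.base_seed + self.num_drawn,
        )
        self.num_drawn += n
        return samples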

@parrangoiz
Author

Thanks for looking into this. That sounds right; may as well call it directly since it seems to be working fine. When you say thinning, what exactly does that mean? Sorry, I'm not 100% familiar with all the jargon yet.

@Balandat
Contributor

Thinning (keeping only every k-th sample) is a process to reduce the autocorrelation of samples drawn from an MCMC chain. This happens here: https://github.com/pytorch/botorch/blob/main/botorch/utils/sampling.py#L828 (this is a poor man's implementation; you could avoid storing the discarded samples as you go to save memory, but that is typically not a concern for the sample sizes we use for Bayesian optimization).
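
Roughly, the pattern at that line is the following (a simplified sketch, not the exact botorch code):

def thinned_draw(sampler, n, thinning=32):
    """Draw n approximately decorrelated samples from an MCMC sampler.

    `sampler.draw(m)` is assumed to return m consecutive chain states.
    """
    # run the chain for n * thinning steps...
    dense = sampler.draw(n * thinning)
    # ...and keep only every `thinning`-th state, reducing autocorrelation
    return dense[thinning - 1 :: thinning]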

@parrangoiz
Author

Got it, thanks.
