`AdversarialTrainingClassifier` may not work properly #138

vietvo89 · 2023-01-10T06:41:49Z

Hi

I have trained several adversarially trained ensembles using the TorchEnsemble implementation, and the results surprised me. I set epsilon to 8/255, 16/255 and 25/255; the clean accuracy of the resulting models is around 91%, 92% and 88% respectively.

To me, this is remarkably high compared with state-of-the-art methods for adversarially trained ensembles on CIFAR-10 (around 82%). I think the gap could be because TorchEnsemble uses FGSM to generate adversarial examples, while some state-of-the-art methods use PGD. I would expect the clean accuracy to be lower overall, and to decrease as epsilon increases.
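For reference, here is a minimal sketch of the one-step FGSM perturbation mentioned above, assuming inputs scaled to [0, 1]. It is not TorchEnsemble's actual implementation; `fgsm_example`, the toy linear model and the random data are hypothetical stand-ins used only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_example(model, x, y, epsilon):
    """One-step FGSM: x_adv = clip(x + epsilon * sign(grad_x loss), 0, 1)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # Gradient of the loss with respect to the input only.
    grad, = torch.autograd.grad(loss, x)
    x_adv = x + epsilon * grad.sign()
    # Keep the result a valid image (assumes pixel values in [0, 1]).
    return x_adv.clamp(0.0, 1.0).detach()


# Hypothetical stand-ins: a tiny linear classifier and random "images".
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm_example(toy_model, x, y, epsilon=8 / 255)
max_change = (x_adv - x).abs().max().item()
```

Note that epsilon = 8/255 only has this meaning when the inputs live in [0, 1]; if the data loader normalizes images with a mean and standard deviation, the same epsilon corresponds to a different perturbation size in pixel space.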

Secondly, I ran experiments to evaluate these models against black-box attacks. It seems that only the model trained with epsilon = 8/255 is robust, while the others are not robust at all. I do not know what is wrong with my code.

import torch.nn as nn
from torchensemble import AdversarialTrainingClassifier

# Define the ensemble
base_estimator = VGG('VGG16')
ensemble = AdversarialTrainingClassifier(
    estimator=base_estimator,               # estimator is your pytorch model
    n_estimators=10,                        # number of base estimators
    cuda=True,
)

epochs = 100
save_dir = './models/'
criterion = nn.CrossEntropyLoss()
ensemble.set_criterion(criterion)

# Set the optimizer
ensemble.set_optimizer('Adam',             # parameter optimizer
                    lr=1e-3,            # learning rate of the optimizer
                    weight_decay=5e-4)  # weight decay of the optimizer

# Set the learning rate scheduler
ensemble.set_scheduler(
    "CosineAnnealingLR",                    # type of learning rate scheduler
    T_max=epochs,                           # additional arguments on the scheduler
)

# Train and evaluate the ensemble
ensemble.fit(
    train_loader=train_loader,  # training data
    epochs=epochs,              # the number of training epochs
    epsilon=8/255,              # also tried 16/255 and 25/255
    test_loader=test_loader,    # evaluation data (if omitted, evaluation is skipped during training)
    save_dir=save_dir,
)

xuyxu · 2023-01-10T13:50:48Z

Hi @vietvo89, could you share the script you use to evaluate the adversarial robustness of the ensemble?

vietvo89 · 2023-01-11T00:44:39Z

@xuyxu, this is my main script.

def load_ensemble_model(arch):
    
    if arch =='ensemble':
        model_path = './models/VotingClassifier_VGG_10_ckpt_no_dropout.pth'
        base_estimator = VGG('VGG16')
        ensemble = VotingClassifier(
            estimator=base_estimator,               # estimator is your pytorch model
            n_estimators=10,                        # number of base estimators
            cuda = True
        )
    elif arch =='ensemble_advtrain':
        model_path = './models/AdversarialTrainingClassifier_VGG_10_ckpt_8-255.pth'
        #model_path = './models/AdversarialTrainingClassifier_VGG_10_ckpt_16-255.pth'
        #model_path = './models/AdversarialTrainingClassifier_VGG_10_ckpt_25-255.pth'
        base_estimator = VGG('VGG16')
        ensemble = AdversarialTrainingClassifier(
            estimator=base_estimator,               # estimator is your pytorch model
            n_estimators=10,                        # number of base estimators
            cuda = True
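        )
    else:
        # Fail early instead of hitting an undefined `ensemble` below
        raise ValueError(f"Unknown arch: {arch}")

    device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
    ensemble = ensemble.to(device)
    checkpoint = torch.load(model_path, map_location=device)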
    n_estimators = checkpoint["n_estimators"]
    model_params = checkpoint["model"]

    # Pre-allocate and load all base estimators
    for _ in range(n_estimators):
        ensemble.estimators_.append(ensemble._make_estimator())
    ensemble.load_state_dict(model_params)
    ensemble.eval()

    return ensemble

class PretrainedModel():
    def __init__(self,model,
                 dataset='imagenet',
                 arch='vit'):
        self.model = model
        self.dataset = dataset
        self.arch = arch

        #----------------------
        self.bounds =  [0,1]
        self.num_classes = 10 
        self.num_queries = 0        
        #----------------------
        
        # ======= non-normalized =========       
        if self.arch.startswith('ensemble_advtrain'):
            self.mu = torch.Tensor([0., 0., 0.]).float().view(1, 3, 1, 1).cuda()
            self.sigma = torch.Tensor([1., 1., 1.]).float().view(1, 3, 1, 1).cuda()
            self.num_classes = 10
        else:
            # ======= CIFAR10 ==========
            if self.dataset == 'cifar10':
                self.mu = torch.Tensor([0.4914, 0.4822, 0.4465]).float().view(1, 3, 1, 1).cuda()
                self.sigma = torch.Tensor([0.2023, 0.1994, 0.2010]).float().view(1, 3, 1, 1).cuda()
                self.num_classes = 10

    def predict_label(self, x):
        img = (x - self.mu) / self.sigma
        with torch.no_grad():
            if self.arch.startswith('ensemble'):
                outputs = []
                for i in range(self.model.n_estimators):
                    estimator = self.model.estimators_[i]
                    outputs.append(F.softmax(estimator(img), dim=1))
                out = torch.stack(outputs).mean(0)
            else:
                out = self.model(img)
            self.num_queries += x.size(0)

        out = torch.max(out,1)[1]

        return out

# ================== Main ===========================
# 1. load dataset
batch_size = 1
dataset = 'cifar10'
_, testset = load_data(dataset,batch_size=batch_size)

# 2. load model
arch = 'ensemble_advtrain'
net = load_ensemble_model(arch)

# 3. wrap model
dataset = 'cifar10'
model = PretrainedModel(net,dataset,arch)

# 4. attack setup
# I leave this as a standard one from the original repo
constraint='l2'
num_iterations=150
gamma=1.0
stepsize_search='geometric_progression'
max_num_evals = 1e4
init_num_evals=100
query_limit = 10000
targeted = False

# 5. doing attack
o = 100
oimg, olabel = testset[o]
oimg = torch.unsqueeze(oimg, 0)
tlabel = None
timg = None
y_targ = np.array([olabel])

attack = HSJA(model,constraint,num_iterations,gamma,stepsize_search,max_num_evals,init_num_evals)
adv, nqry,_ = attack.perturb(oimg.numpy(), y_targ, timg,targeted,query_limit)
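The script above attacks a single test image (index 100). As a sanity check before running attacks, a helper along the following lines could measure the clean accuracy of the loaded ensemble over a whole loader, using the same softmax averaging as `predict_label`. This is only a sketch: `ensemble_accuracy`, the toy estimators and the toy loader are hypothetical names, and in the real setting one would pass `net.estimators_` and the CIFAR-10 test loader instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


@torch.no_grad()
def ensemble_accuracy(estimators, loader, device="cpu"):
    """Fraction of samples where the mean softmax output picks the true label."""
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Average class probabilities over all base estimators.
        probs = torch.stack([F.softmax(m(x), dim=1) for m in estimators]).mean(0)
        correct += (probs.argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return correct / total


# Hypothetical stand-ins for the trained base estimators and the test loader.
torch.manual_seed(0)
toy_estimators = [
    nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
    for _ in range(3)
]
toy_loader = [
    (torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(2)
]
acc = ensemble_accuracy(toy_estimators, toy_loader)
```

Running the same helper on adversarially perturbed inputs would give a robust accuracy figure over many images rather than a single example.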

vietvo89 · 2023-01-27T01:51:58Z

@xuyxu Do you know what is going on with `AdversarialTrainingClassifier`? Have you verified its implementation?

xuyxu · 2023-01-27T08:13:47Z

Hi @vietvo89, sorry for the late response; I have been rather busy these days. I am wondering whether replacing FGSM with another method would solve this problem. Since I am not very familiar with adversarial learning, your suggestions would be much appreciated.

vietvo89 · 2023-01-31T23:38:46Z

Hi @xuyxu, I'm not sure, but in practice people usually use PGD to create stronger adversarial examples, which pushes classifiers to learn robust features and improves overall robustness. If I'm not mistaken, most recent adversarial training frameworks use PGD. I think it's worth trying as an improvement to your repo. Can anyone contribute to your repo?
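To make the PGD suggestion concrete, here is a minimal sketch of an L-infinity PGD attack with a random start, assuming inputs scaled to [0, 1]. It is an illustration only, not code from TorchEnsemble; `pgd_example`, the step size, the number of steps and the toy model are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_example(model, x, y, epsilon, step_size, n_steps):
    """L-infinity PGD with a random start, for inputs scaled to [0, 1]."""
    x = x.clone().detach()
    # Random start inside the epsilon-ball around x.
    x_adv = x + torch.empty_like(x).uniform_(-epsilon, epsilon)
    x_adv = x_adv.clamp(0.0, 1.0)
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Gradient-sign ascent step, then project back onto the ball.
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


# Hypothetical stand-ins: a tiny linear classifier and random "images".
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_pgd = pgd_example(toy_model, x, y, epsilon=8 / 255, step_size=2 / 255, n_steps=10)
max_change = (x_pgd - x).abs().max().item()
```

With `n_steps=1`, no random start and `step_size=epsilon`, this reduces to the FGSM step, so a PGD option could in principle sit alongside the existing FGSM path during training.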
