| Defense | Approach | Authors (Year) | Broken by | Authors (Year) |
|---|---|---|---|---|
| A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations | | Taghanaki et al. (2019) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | Croce and Hein (2020) |
| A New Defense Against Adversarial Images: Turning a Weakness into a Strength | Randomly perturbing the input, checking if the closest AE is further away than some threshold | Yu et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Adversarial and Clean Data Are Not Twins | Adversarial retraining (binary classifier) | Gong et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks | | Mustafa et al. (2019) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | Croce and Hein (2020) |
| Adversarial Example Detection and Classification With Asymmetrical Adversarial Training | Combine base classifier with robust, binary »class-predicate-classifiers« | Yin et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics | Hidden layer statistics (PCA on conv) | Li and Li (2016) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Adversarial Logit Pairing | Adversarial retraining (force logit similarity of benign and adversarial image pairs) | Kannan et al. (2018) | Evaluating and Understanding the Robustness of Adversarial Logit Pairing | Engstrom et al. (2018) |
| APE-GAN: Adversarial Perturbation Elimination with GAN | APE-GAN (similar to MagNet) | Shen et al. (2017) | MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples | Carlini and Wagner (2017) |
| Are Generative Classifiers More Robust to Adversarial Attacks? | | Li et al. (2018) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples | Retraining by amplifying »important« neuron weights | Tao et al. (2018) | Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples? | Carlini (2019) |
| Barrage of Random Transforms for Adversarially Robust Defense (BaRT) | »… stochastically combining a large number of individually weak defenses into a single barrage of randomized transformations to build a strong defense …« | Raff et al. (2019) | Demystifying the Adversarial Robustness of Random Transformation Defenses | Sitawarin et al. (2021) |
| Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality | Input statistics (local intrinsic dimensionality, LID) | Ma et al. (2018) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Countering Adversarial Images using Input Transformations | Input preprocessing (cropping, scaling, compression, …) | Guo et al. (2017) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning | | Papernot and McDaniel (2018) | On the Robustness of Deep K-Nearest Neighbors | Sitawarin and Wagner (2019) |
| Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser | Input preprocessing (denoising based on a network trained on latent vectors) | Liao et al. (2017) | On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses | Athalye and Carlini (2018) |
| Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models | Defense-GAN (like PixelDefend, but with a GAN instead of a PixelCNN) | Samangouei et al. (2018) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Deflecting Adversarial Attacks with Pixel Deflection | Pixel Deflection (input preprocessing) | Prakash et al. (2018) | On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses | Athalye and Carlini (2018) |
| Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain | Primal & dual classifier + sensitivity statistics | Tian et al. (2021) | Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent | Bryniarski et al. (2021) |
| Detecting Adversarial Samples from Artifacts | Distributional detection (density) | Feinman et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Detecting Adversarial Samples from Artifacts | Distributional detection (Bayesian uncertainty) | Feinman et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Detection based Defense against Adversarial Examples from the Steganalysis Point of View | Input preprocessing (analyze images for »hidden features« and train a binary classifier to detect AEs) | Liu et al. (2018) | Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent | Bryniarski et al. (2021) |
| Dimensionality Reduction as a Defense against Evasion Attacks on Machine Learning Classifiers | Dimensionality reduction | Bhagoji et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks | Retraining with soft labels | Papernot et al. (2016) | Towards Evaluating the Robustness of Neural Networks | Carlini and Wagner (2017) |
| DLA: Dense-Layer-Analysis for Adversarial Example Detection | DLA: adversarial retraining on benign/adversarial pairs, binary classifier on hidden layer activations | Sperl et al. (2019) | Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent | Bryniarski et al. (2021) |
| Early Methods for Detecting Adversarial Images | Dimensionality reduction (PCA) | Hendrycks and Gimpel (2016) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| Efficient Defenses Against Adversarial Attacks | Training data augmentation, BReLU activation | Zantedeschi et al. (2017) | MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples | Carlini and Wagner (2017) |
| EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness against Adversarial Attacks | Training an ensemble of classifiers with different precision in weights and activations | Sen et al. (2020) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Ensemble Adversarial Training: Attacks and Defenses | | Tramèr et al. (2017) | GenAttack: Practical Black-box Attacks with Gradient-Free Optimization | Alzantot et al. (2018) |
| Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks | Training a diverse ensemble of binary classifiers (on partitions of the classes) | Verma and Swami (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks | Dimensionality reduction (color space and median filter); a minimal sketch follows the table | Xu et al. (2017) | Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong | He et al. (2017) |
| Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks | Honeypot: lure attackers to generate obvious AEs | Shan et al. (2019) | Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent | Bryniarski et al. (2021) |
| Improving Adversarial Robustness via Promoting Ensemble Diversity | | Pang et al. (2019) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | Croce and Hein (2020) |
| Improving Adversarial Robustness via Promoting Ensemble Diversity | Training a diverse ensemble of classifiers via a diversity-promoting regularizer | Pang et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Jacobian Adversarially Regularized Networks for Robustness | | Chan et al. (2019) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | Croce and Hein (2020) |
| MagNet: a Two-Pronged Defense against Adversarial Examples | MagNet | Meng and Chen (2017) | MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples | Carlini and Wagner (2017) |
| ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation | Preprocess the input by randomly discarding parts of it, reconstruct the gaps via matrix estimation, and train on that input | Yang et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Mitigating Adversarial Effects Through Randomization | Input preprocessing (randomized rescaling and randomized padding) | Xie et al. (2017) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Mitigating Evasion Attacks to Deep Neural Networks via Region-based Classification | | Cao and Gong (2017) | Decision Boundary Analysis of Adversarial Examples | He et al. (2018) |
| Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks | Inject randomness into inference (interpolate the input multiple times with random samples and average the predictions) | Pang et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| On Detecting Adversarial Perturbations | Adversarial retraining (binary classifier) | Metzen et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| On the (Statistical) Detection of Adversarial Examples | Adversarial retraining (»trash class«) | Grosse et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| On the (Statistical) Detection of Adversarial Examples | Distributional detection (maximum mean discrepancy) | Grosse et al. (2017) | Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods | Carlini and Wagner (2017) |
| PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples | PixelDefend (use a generative model [PixelCNN] to project data back to the manifold) | Song et al. (2017) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Randomization matters. How to defend against strong adversarial attacks | Boosted Adversarial Training (BAT): combine an adversarially trained network (AT) with a regular network trained on adversarial examples crafted against AT | Pinot et al. (2020) | Adversarial Vulnerability of Randomized Ensembles | Dbouk and Shanbhag (2022) |
| Resisting Adversarial Attacks by k-Winners-Take-All | Retraining, k-Winners-Take-All layers (instead of ReLU) | Xiao et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness | Training using MMC loss | Pang et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Robustness to Adversarial Examples through an Ensemble of Specialists | Ensemble of classifiers operating on class subsets | Abbasi and Gagné (2017) | Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong | He et al. (2017) |
| Stochastic Activation Pruning for Robust Adversarial Defense | Inject randomness into inference (»weighted dropout«) | Dhillon et al. (2018) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| The Odds are Odd: A Statistical Test for Detecting Adversarial Examples | Inject randomness into inference (noisy input), statistical test | Roth et al. (2019) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Thermometer Encoding: One Hot Way To Resist Adversarial Examples | Retraining with discretized inputs | Buckman et al. (2018) | Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | Athalye et al. (2018) |
| Thwarting Adversarial Examples: An L0-Robust Sparse Fourier Transform | Input preprocessing (»compression« and projection onto discrete cosine transform coefficients) | Bafna et al. (2018) | On Adaptive Attacks to Adversarial Example Defenses | Tramèr et al. (2020) |
| Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One | | Grathwohl et al. (2019) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | Croce and Hein (2020) |
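
Many of the detection defenses above share the same skeleton: transform the input, re-run the classifier, and flag the sample if the prediction shifts too much. As a concrete illustration, here is a minimal sketch in the spirit of the Feature Squeezing entry (Xu et al., 2017). The `model` callable, the input layout, the squeezer parameters, and the threshold are illustrative assumptions, not values from the paper; as the table notes, He et al. (2017) bypass this style of detector with adaptive attacks.

```python
# Minimal sketch of a feature-squeezing detector in the spirit of Xu et al. (2017).
# Assumptions (not from the paper): `model(x)` returns softmax probabilities for a
# batch of images `x` with shape (N, H, W, C) and values in [0, 1]; bit depth,
# filter size, and threshold are placeholder values.
import numpy as np
from scipy.ndimage import median_filter


def reduce_bit_depth(x: np.ndarray, bits: int = 4) -> np.ndarray:
    """Squeeze the color depth to `bits` bits per channel."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels


def median_smooth(x: np.ndarray, size: int = 2) -> np.ndarray:
    """Apply a median filter over the spatial dimensions only."""
    return median_filter(x, size=(1, size, size, 1))


def detect_adversarial(model, x: np.ndarray, threshold: float = 1.0) -> np.ndarray:
    """Flag inputs whose softmax output moves too far under squeezing.

    Returns a boolean array of shape (N,); True means »likely adversarial«.
    """
    p_orig = model(x)
    l1_shifts = [
        np.abs(model(squeeze(x)) - p_orig).sum(axis=-1)  # L1 shift per sample
        for squeeze in (reduce_bit_depth, median_smooth)
    ]
    return np.max(l1_shifts, axis=0) > threshold
```

The same compare-against-a-transformed-input pattern recurs in several other rows, which is why the handful of adaptive-attack papers in the »Broken by« column cover whole families of defenses at once.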