White-box adversarial perturbations are generated via iterative optimization algorithms, most often by minimizing an adversarial loss on an $\ell_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. We demonstrate in this work that the proposed structured adversarial examples can significantly bring down the classification accuracy of adversarially trained classifiers while exhibiting a low $\ell_2$ distortion rate. For instance, on the ImageNet dataset the structured attacks drop the accuracy of the adversarially trained model to near zero with only 50\% of the $\ell_2$ distortion generated using PGD. As a byproduct, our findings can be used for adversarial regularization of models, to make them more robust or to improve their generalization performance on datasets that are structurally different.
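For context, below is a minimal sketch of the kind of iterative, $\ell_p$-constrained attack the abstract refers to: the standard $\ell_\infty$ PGD baseline it compares against, not the paper's structured attack itself. The model, perturbation budget `eps`, step size `alpha`, and step count are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """PGD attack on an l_inf ball of radius eps (the distortion set).

    Each step ascends the cross-entropy loss on the current iterate and
    projects the perturbed image back onto the eps-neighborhood of x.
    """
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Gradient ascent step, then projection onto the l_inf ball around x.
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
            x_adv = x_adv.clamp(0.0, 1.0)  # keep pixels in a valid range
    return x_adv.detach()
```

Swapping the projection step for a differently structured constraint set (e.g., a group-sparse or other non-$\ell_p$ distortion set) is what yields the differently structured adversarial examples discussed above.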

Author(s) : Ehsan Kazemi, Thomas Kerdreux, Liqiang Wang

Links : PDF - Abstract

Code :

Keywords : adversarial - distortion - attacks - accuracy - generated
