AI and Security: Adversarial Attacks and Defenses

Saerom Park
Seoul National University

Deep neural networks, such as convolutional neural networks, have recently achieved strong results and have been applied to many real-world problems such as image classification, segmentation, and object detection. These models typically do not need a separate feature extraction procedure because they learn the representation while learning the classifier. However, it has been shown that small perturbations, difficult to detect with the human eye, can mislead trained networks. Such perturbed inputs are called adversarial examples. Their existence is closely related to the thin data manifolds learned by supervised models. Several methods, such as fast gradient sign (FGS) and iterative FGS (IFGS), are used to generate adversarial examples. FGS and IFGS produce adversarial examples that depend on the original input images and the targeted network. Nevertheless, these examples remain relatively effective across neural network architectures and training sets other than those used to generate them. Therefore, adversarial attacks can hinder the deployment of neural network models in real-world systems. Many researchers have developed defense methods against adversarial attacks. Adversarial training, a widely known defense for constructing robust models, augments the training data with adversarial examples of a previous model and trains a new model. A high-level representation guided denoiser is another approach for defending against adversarial examples, in which information from the targeted network is used. These defense methods are effective but require information about the targeted network and the targeted adversarial examples. Thus, we develop an iterative defense projection method based on a denoising autoencoder (DAE). Our projection scheme corresponds to a discrete dynamical system that follows the score function of the input density, which can be estimated by a DAE trained with Gaussian noise.
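The iterative projection idea can be sketched in a toy setting. This is a minimal illustration, not the speaker's actual model: we assume the clean-data density is an isotropic Gaussian, for which the optimal Gaussian-noise DAE has a closed form, so the DAE residual r(x) - x points along the (smoothed) score function. All names, variances, and step sizes below are illustrative assumptions.

```python
import numpy as np

# Assumption for illustration: clean data ~ N(mu, s2 * I). The optimal DAE
# trained with Gaussian noise of variance sigma2 is then the posterior mean
#   r(x) = mu + s2 / (s2 + sigma2) * (x - mu),
# and r(x) - x is proportional to the score of the noise-smoothed density.
mu = np.array([1.0, -2.0])
s2, sigma2 = 1.0, 0.25  # data variance and DAE training-noise variance

def dae(x):
    """Closed-form optimal denoising autoencoder for the toy Gaussian density."""
    return mu + (s2 / (s2 + sigma2)) * (x - mu)

def project(x, eta=0.5, steps=20):
    """Iterative defense projection: repeatedly follow the DAE residual,
    which approximates the score direction, back toward the data manifold."""
    for _ in range(steps):
        x = x + eta * (dae(x) - x)
    return x

x_adv = mu + np.array([3.0, 3.0])  # a hypothetical perturbed input
x_proj = project(x_adv)            # ends up much closer to the data mean
```

In this toy case each step shrinks the offset from the data manifold by a constant factor, so the dynamical system contracts perturbed inputs back toward high-density regions; a learned DAE plays the same role for real image densities.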
Consequently, our denoiser can defend against an arbitrary adversarial attack with a small perturbation, even without access to any adversarial examples. The estimated score function can also be used to construct an efficient attack system.
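For reference, the fast gradient sign (FGS) attack mentioned above perturbs the input along the sign of the loss gradient with respect to the input. A minimal sketch follows, using a logistic-regression stand-in for a trained network; the weights, input, and epsilon are hypothetical:

```python
import numpy as np

# Hypothetical "trained network": logistic regression with fixed weights.
w = np.array([2.0, -1.0, 0.5])
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_x(x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = sigmoid(w @ x + b)
    return (p - y) * w

def fgs_attack(x, y, eps=0.1):
    """Untargeted FGS: x_adv = x + eps * sign(grad_x L(x, y))."""
    return x + eps * np.sign(loss_grad_x(x, y))

x = np.array([0.5, 1.0, -0.5])
y = 1.0                      # true label
x_adv = fgs_attack(x, y)     # bounded perturbation that increases the loss
```

The perturbation is bounded by eps in the max norm, yet it moves the prediction away from the true label; the iterative variant (IFGS) simply applies this step several times with a smaller eps, re-evaluating the gradient each time.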
