
CAAD VILLAGE - GeekPwn - The Uprising Geekpwn AI/Robotics Cybersecurity Contest U.S. 2018 - High Frequency Targeted Attacks


Formal Metadata

Title
CAAD VILLAGE - GeekPwn - The Uprising Geekpwn AI/Robotics Cybersecurity Contest U.S. 2018 - High Frequency Targeted Attacks
Alternative Title
Adversarial^2 Training
Series Title
Number of Parts
322
Author
License
CC Attribution 3.0 Unported:
You may use, adapt, copy, distribute, and make the work or content publicly available in unchanged or adapted form for any legal purpose, provided that you credit the author/rights holder in the manner specified by them.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
Targeted attacks on image classifiers are difficult to transfer from one model to another. Only strong adversarial attacks with knowledge of the classifier can bypass existing defenses. To defend against such attacks, we implement an “adversarial^2 training” method to strengthen the existing defenses. Yao Zhao is an applied scientist at Microsoft AI & Research working on natural language understanding/generation and search ranking. During his Ph.D. at Yale University, he worked in the field of computer vision and optics. Yuzhe Zhao is a software engineer in Google Research, working on natural language understanding. He recently earned his Ph.D. from Yale University. Previously, he received his undergraduate degree in mathematics and physics from Shanghai Jiao Tong University.
Transcript: English (automatically generated)
Okay, the next presentation is from Wenxin Zhao and Yao Zhao. They bring us high frequency targeted attacks. They used this method to win the CAAD CTF yesterday.
Thank you, Hi Bing. Hi, my name is Yao Zhao. This is my friend, Wenxin Zhao. We're both NLP researchers, and in our spare time we work on adversarial attacks and defenses for neural networks. Today we're going to talk about our method,
high frequency targeted attacks, which we used in yesterday's CAAD competition. In the first half, I'm going to introduce some basic concepts of adversarial attacks and defenses, and in the second part we're going to talk about our techniques in the competition.
So neural networks are becoming a lot more popular in image classification and are deployed in a lot of commercial systems. In this case, when an image is given to a neural network, the network takes in the raw pixels, calculates activations through many hidden layers,
and then outputs a final label for the image. In a popular case like ImageNet, there can be a thousand labels for an image. An adversarial attack against a neural network applies a small perturbation to the input image so that the prediction
of the neural network flips to another class. In this case, we changed the correct label from snail to fox. There are generally two types of adversarial attacks.
The first one is the non-targeted attack: basically, change the correct label to any incorrect label, without a specific target. The other one is the targeted attack: given a target class, perturb the image so that it is classified as that target.
The most popular method of constructing adversarial images is the gradient-based attack. Given an input image and the neural network, we can calculate the loss through the neural network
and back-propagate the gradients to the image. If we then perturb the image in the direction opposite to the gradients, we get an adversarial image that can fool the original neural network.
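As a rough illustration of the gradient-based attack just described, here is a minimal single-step sketch in TensorFlow 2 of a targeted variant; the function name, epsilon value, and loss choice are illustrative assumptions, not the speakers' actual code.

```python
# Minimal sketch of a single-step, targeted gradient attack (FGSM-style).
# Assumption: `model` maps a [1, H, W, 3] float image in [0, 1] to class logits.
import tensorflow as tf

def fgsm_targeted(model, image, target_label, eps=8.0 / 255.0):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        logits = model(image)
        loss = loss_fn(target_label, logits)   # loss toward the *target* class
    grad = tape.gradient(loss, image)
    # Step against the gradient so the target-class loss decreases.
    adv = image - eps * tf.sign(grad)
    return tf.clip_by_value(adv, 0.0, 1.0)
```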
A more powerful attack is the iterative attack. It applies the same gradient method again and again over many iterations. As you can see in the curve, the more iterations we apply this method, the higher the success rate of the attack.
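The iterative attack is just that same step repeated with a smaller step size while the total perturbation is kept within an epsilon bound. A sketch reusing the hypothetical fgsm_targeted function above (step size and iteration count are illustrative):

```python
import tensorflow as tf

def iterative_targeted(model, image, target_label,
                       eps=16.0 / 255.0, step=1.0 / 255.0, iters=20):
    image = tf.convert_to_tensor(image)
    adv = tf.identity(image)
    for _ in range(iters):
        adv = fgsm_targeted(model, adv, target_label, eps=step)
        # Project back into the allowed perturbation range around the original.
        adv = tf.clip_by_value(adv, image - eps, image + eps)
        adv = tf.clip_by_value(adv, 0.0, 1.0)
    return adv
```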
In realistic systems, there can be black-box attacks and white-box attacks. For white-box attacks, the attacker has access to the model weights.
In this case, the gradient attack can be applied and the gradients can be calculated exactly; the attack success rate is usually very, very high. In the black-box case, the model weights are not accessible to the attacker. So to successfully attack a neural network,
we need to either guess which neural network the defender is using, or ensemble a lot of neural networks and attack them at the same time. For those ensemble attacks,
much like the single-network attack, we add the loss functions of many different kinds of neural networks together, calculate the gradients back through all of the networks at the same time, and apply the same gradient-based attack as in the previous step.
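A minimal sketch of the ensemble attack described above, assuming a list of surrogate models and per-model weights (all names are illustrative); the gradient of the summed loss can then be used exactly like the single-model gradient:

```python
import tensorflow as tf

def ensemble_target_gradient(models, weights, image, target_label):
    """Gradient of a weighted sum of target-class losses over several models."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        total_loss = 0.0
        for model, w in zip(models, weights):
            total_loss += w * loss_fn(target_label, model(image))
    # One step against this combined gradient attacks all the models at once.
    return tape.gradient(total_loss, image)
```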
So in this competition, we focused on the targeted attack, and the targeted attack has a specific behavior: when you use the attack method on one model,
it usually doesn't transfer to a different model. In this case, we have a lot of different attack methods along the rows and columns, and they can only attack the defender using the same model; the attack images rarely apply to new defenders.
So Vincent is going to talk about the competition and the method and system we're using. Okay, thank you, Yao, for the introduction. Something I want to add is, for example here, you see that especially for the targeted attack,
the image is not transferable, which means you have to guess. The first thing is that it's really expensive to train a new model. So we think that in practice, if you work on the ImageNet data sets,
people will usually use a pre-existing, pre-trained model instead of training their own model, so there are only a couple of dozen models out there. The question is: if we can attack them all, then we can, with high probability, attack any system. So we assume that people are using ensembles
of those models, some combination of them, to do the defense. That's our assumption, and it's basically the case, actually. The other thing is that it's not transferable. That means if somebody's using Inception v3 and we don't have that model, it's really hard for us to build an attack,
an adversarial image, against that model without using that model to generate the image. So for the competition, what's important here? We are allowed to submit our attack every six seconds,
so that's the budget we have. The competition runs for about 30 minutes, which means we can try 300 times, or maybe 200 times. So the key here is that we want to try different combinations of ensembles to generate the images, but we want to do that really fast.
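As a quick sanity check of the budget just mentioned: 30 minutes at one submission every six seconds gives 30 × 60 / 6 = 300 submission slots, so "maybe 200 times" simply allows for slots lost to image-generation time.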
So how do we do that? It's basically quite simple. We run a multi-threaded program: there's one thread that does the submission, so the submission is controlled by one thread,
and the images go into a double-ended queue. We have an automatic generator that produces images, as well as a manual generator. For the automatic generator, we have some prefixed ensemble combinations, about 50 of them, and we will try them all, so it's fully automatic.
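A rough sketch of the pipeline as described: one submission thread draining a double-ended queue that the image generator feeds. The function names, the six-second pacing, and the queue discipline are assumptions, not the speakers' actual code.

```python
# Sketch: submission thread + automatic generator sharing a deque of images.
import collections
import threading
import time

queue = collections.deque()   # candidate adversarial images, newest on the right
lock = threading.Lock()

def automatic_generator(prefixed_ensembles, make_adv_image):
    """Try every prefixed ensemble combination and enqueue the resulting image."""
    for ensemble in prefixed_ensembles:      # ~50 prefixed combinations
        adv = make_adv_image(ensemble)       # GPU attack job (hypothetical helper)
        with lock:
            queue.append(adv)

def submitter(submit, period=6.0):
    """Submit at most one image per budget slot, preferring the newest one."""
    while True:
        with lock:
            adv = queue.pop() if queue else None
        if adv is not None:
            submit(adv)                      # hypothetical submission call
        time.sleep(period)
```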
To make it run fast, the technical detail is basically that we use TensorFlow, and TensorFlow is pretty slow to build a graph. Building the graph takes about 30 seconds, so you don't want to build a graph for each iteration; we want to reuse the graph
but change the ensembles. For each ensemble, you have some weights, right? So the weights are an input, not part of the graph. The bad thing about TensorFlow is that if, for this batch, you don't want to use a model, you'd rather not evaluate that model at all, right?
But TensorFlow doesn't support that, so there's some room to improve. Basically, right now, if you have five models in your ensemble, then whether you use a model or not, TensorFlow will always evaluate it, so it takes time, but it's still good enough. That's basically our automatic generator.
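A sketch of this graph-reuse trick in TensorFlow 1.x graph style, with the ensemble weights fed through a placeholder so the graph over all candidate models is built only once; the shapes, the loss, and the model interface are assumptions.

```python
import tensorflow.compat.v1 as tf1
tf1.disable_eager_execution()

def build_attack_graph(model_fns, image_shape=(None, 299, 299, 3)):
    """Build one graph over every candidate model; ensemble weights are fed at run time."""
    image_ph = tf1.placeholder(tf1.float32, image_shape)
    label_ph = tf1.placeholder(tf1.int64, [None])                # target labels
    weight_ph = tf1.placeholder(tf1.float32, [len(model_fns)])   # ensemble weights as input
    losses = []
    for model_fn in model_fns:          # each model_fn maps images to logits
        logits = model_fn(image_ph)
        losses.append(tf1.losses.sparse_softmax_cross_entropy(label_ph, logits))
    # Every model is evaluated even when its weight is zero (the limitation
    # mentioned above), but no new graph is needed when the ensemble changes.
    total_loss = tf1.reduce_sum(weight_ph * tf1.stack(losses))
    grad_op = tf1.gradients(total_loss, image_ph)[0]
    return image_ph, label_ph, weight_ph, grad_op
```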
For the manual generator, we look at the results from the feedback, come up with some combinations we think might work, and submit that job to the CPU. The automatic generator runs on the GPU, and the manual generator runs on the CPU, so they will not compete for memory.
But the manual generator is definitely slower. So that's our strategy, and what you could see yesterday is that we attacked everybody like crazy, right? The success rate is not high,
but as long as we can get some scores, that's fine. So that's what we did yesterday. That's our strategy. Thank you. Question? It's tremendous, yeah.
Yeah, with our model, our biggest ensemble uses seven models. Building the graph takes about 30 seconds, and then for the computation we compute 10 images at once, and 10 images take about 20 seconds. But on a CPU, if you do the same thing
for just one image, it takes about four minutes, something like that. So it's a different scale. So basically, in the CTF, you probably did like one, you guys did like one manual-type program? Yeah, yeah, we heavily rely on the automatic,
those predefined ensembles. We guess them, you know, we guess, okay, those people might use them. So that's our strategy. Because nowadays we still believe that for the black-box attacks, the key is to guess what model the opponent is using, right?
Cool. Thank you. Thanks, Vincent and Yao. So this is the last presentation.
Yeah, we finished this morning. Thanks, everyone.