Deep Learning Blindspots
Formal Metadata

Title: Deep Learning Blindspots
Number of Parts: 167
License: CC Attribution 4.0 International. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/34794 (DOI)
Transcript: English (auto-generated)
00:02
And I will let Catherine take the stage now. Awesome. Well, thank you so much for the introduction, and thank you so much for being here, taking
00:23
your time. I know that Congress is really exciting, so I really appreciate you spending some time with me today. It's my first ever Congress, so I'm also really excited, and I want to meet new people, so if you want to come say hi to me later, I'm somewhat friendly, so we can maybe be
00:40
friends later. Today what we're going to talk about is deep learning blind spots, or how to fool artificial intelligence. I like to put artificial intelligence in quotes because, yeah, we'll talk about that, but I think it should be in quotes, and today we're going to talk a little bit about deep learning, how it works, and how you can maybe fool it.
01:02
So I ask us, is AI becoming more intelligent? And I ask this because when I open a browser, and, of course, often it's Chrome, and Google is already prompting me for what I should look at, and it knows that I work with machine learning, right?
01:20
And these are the headlines that I see every day. Are computers already smarter than humans? If so, I think we could just pack up and go home, right? Like we fix computers, right? If computer is smarter than me, then I already fixed it, we can go home. There's no need to talk about computers anymore, let's just move on with life.
01:43
But that's not true, right? We know because we work with computers, and we know how stupid computers are sometimes. They're pretty bad. Computers do only what we tell them to do, generally. So I don't think a computer can think and be smarter than me. So with the same types of headlines that you see this, then you also see this.
02:08
And, yeah. So Apple recently released their face ID, and this unlocks your phone with your face. And it seems like a great idea, right? You have a unique face, you have a face, nobody else can take your face.
02:22
But unfortunately, what we find out about computers is that they're awful sometimes. For this Chinese woman who owned an iPhone, her coworker was able to unlock her phone. And I think Hendrik and Karen talked about it if you were here for the last talk.
02:40
We have a lot of problems in machine learning. And one of them is stereotypes and prejudice that are within our training data or within our minds that leak into our models. And perhaps they didn't have adequate training data for determining different facial features of Chinese folks. And perhaps it's other problems with their model or their training data or whatever they're
03:04
trying to do. But they clearly have some issues, right? So when somebody asks me, is AI going to take over the world, and is there a super robot that's going to come and be my new leader, or so to speak, I tell them, we can't even figure out the stuff that we already have in production.
03:21
So if we can't even figure out the stuff we already have in production, I'm a little bit less worried than of the super robot coming to kill me. That said, unfortunately, the powers that be, a lot of times they believe in this. And they believe strongly in artificial intelligence and machine learning.
03:43
And they're collecting data every day about you and me and everyone else. And they're going to use this data to build even better models. And this is because the revolution that we're seeing now in machine learning has really not much to do with new algorithms or architectures.
04:02
It has a lot more to do with heavy compute and with massive, massive data sets. And the more that we have training data of petabytes per 24 hours or even less, the more we're able to essentially fix up the parts that don't work so well.
04:21
And the companies that we see here are companies that are investing heavily in machine learning and AI. And part of how they're investing heavily is they're collecting more and more data about you and me and everyone else. Google and Facebook, more than 1 billion active users.
04:40
I was surprised to know that in Germany, the desktop search traffic for Google is higher than most of the rest of the world. And for Baidu, they're growing with the speed that broadband is available. And so what we see is these people are collecting this data and they also are using new technologies like GPUs and TPUs in new ways to
05:01
parallelize workflows. And with this, they're able to mess up less, right? They're still messing up, but they mess up slightly less. And they're not going to get uninterested in this topic. So we need to kind of start to prepare how we respond to this type of behavior.
05:22
And so one of the things that has been a big area of research, actually also for a lot of these companies, is what we'll talk about today. And that's adversarial machine learning. But the first thing that we'll start with is, what is behind what we call AI? So most of the time when you think of AI or something like Siri and so forth,
05:45
you are actually potentially talking about an old school rule based system. This is a rule, like you say a particular thing and then Siri's like, yes, I know how to respond to this. And we even hard program these types of things in, right? That is one version of AI, is essentially it's been pre-programmed to do and
06:03
understand certain things. Another form that usually, like for example, for the people that are trying to build AI robots and the people that are trying to build what we call general AI. So this is something that can maybe learn like a human. They'll use reinforcement learning. I don't specialize in reinforcement learning.
06:22
But what it does, it essentially tries to reward you for behavior that you're expected to do. So if you complete a task, you get a cookie. You complete two other tasks, you get two or three more cookies depending on how important the task is. And this will help you learn how to behave to get more points.
06:42
And it's used a lot in robots and gaming and so forth. And I'm not really gonna talk about that today because most of that is still not really something that you or I interact with. But what I am gonna talk about today is neural networks. Or as some people like to call them, deep learning, right? So deep learning won the neural network versus deep learning battle
07:02
a while ago. So here's an example neural network. We have an input layer, and that's where we essentially make a quantitative version of whatever our data is. So we need to make it into numbers. Then we have a hidden layer, and we might have multiple hidden layers. And depending on how deep our network is, or a network inside a network,
07:23
right, which is possible, we might have many different layers there. And they may even act in cyclical ways. And then that's where all the weights and the variables and the learning happens. So that holds a lot of information and data that we eventually want to train there.
07:41
And finally, we have an output layer. And depending on the network and what we're trying to do, the output layer can vary between something that looks like the input. Like for example, if we want to machine translate, then I want the output to look like the input, right? I want it to just be in a different language. Or the output could be a different class. It can be this is a car, or this is a train, and so forth.
08:05
So it really depends what you're trying to solve. But the output layer gives us the answer. And how we train this is we use back propagation. And back propagation is nothing new, and neither is one of the most popular methods to do so, which is called stochastic gradient descent.
08:22
And what we do when we go through that part of the training, is we go from the output layer and we go backwards through the network. That's why it's called back propagation, great. And as we go backwards through the network, we upvote. And in the most simple way, we upvote and downvote what's working and what's not working. So we say, you got it right, you get a little bit more importance.
08:42
Or you got it wrong, you get a little bit less importance. And eventually we hope over time that they essentially correct each other's errors enough that we get a right answer. So that's a very general overview of how it works. And the cool thing is, is because it works that way, we can fool it.
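To make that structure concrete, here is a minimal sketch, not from the talk, of a small network in Keras (the library that comes up later); the layer sizes and data are entirely made up for illustration:

```python
# Minimal sketch of the structure described above: an input layer of numbers,
# hidden layers that hold the learned weights, and an output layer of classes,
# trained with backpropagation and stochastic gradient descent.
# All sizes and data are made up purely for illustration.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_train = np.random.rand(1000, 20)                    # 1000 samples, 20 input numbers
y_train = np.eye(3)[np.random.randint(0, 3, 1000)]    # 3 possible output classes

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),  # hidden layer
    Dense(64, activation='relu'),                     # another hidden layer
    Dense(3, activation='softmax'),                   # output layer: one score per class
])
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)  # backprop adjusts the weights
```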
09:02
And people have been researching ways to fool it for quite some time. So I'll give you a brief overview of the history of this field, so we can kind of know where we're working from, and maybe hopefully then where we're going to. In 2005 was one of the first most important papers to approach
09:22
adversarial learning, and it was written by a series of researchers. And they wanted to see if they could act as an informed attacker and attack a linear classifier. So this is just a spam filter. And they're like, can I send spam to my friends? I don't know why they wanted to do this, but can I send spam to my friend if I try testing out a few ideas?
09:42
And what they were able to show is yes, rather than just trial and error, which anybody can do, or brute force attack of just send a thousand emails and see what happens, they were able to craft a few algorithms that they could use to try and find important words to change to make it go through the spam filter.
10:02
In 2007, NIPS, which is a very popular machine learning conference, had one of their first all day workshops on computer security. And when they did so, they had a bunch of different people that were working on machine learning in computer security, from malware detection to network intrusion detection to, of course, spam.
10:21
And they also had a few talks on this type of adversarial learning. So how do you act as an adversary to your own model? And then how do you learn how to counter that adversary? In 2013, there was a really great paper that got a lot of people's attention called Poisoning Attacks Against Support Vector Machines.
10:40
Now, support vector machines are essentially usually a linear classifier. And we use them a lot to say, this is a member of this class, that or another, when we pertain to text. So I have a text and I wanna know what the text is about, or I wanna know if it's a positive or negative sentiment. A lot of times I'll use a support vector machine, and
11:01
we call them SVMs as well. And Battista Biggio was the main researcher, and he's actually written quite a lot about these poisoning attacks. And he poisoned the training data. So for a lot of these systems, sometimes they have active learning. And this means you or I, when we classify our emails as spam,
11:21
we're helping train the network. And so he poisoned the training data and was able to show that by poisoning it in a particular way, that he was able to then send spam email, because he knew what words were then benign, essentially. He went on to study a few other things about biometric data,
11:40
if you're interested in biometrics. But then in 2014, Christian Szegedy, Ian Goodfellow, and a few other main researchers at Google Brain released Intriguing Properties of Neural Networks. And that really became the explosion of what we're seeing today in adversarial learning. And what they were able to do is they were able to say,
12:02
we believe there's linear properties of these neural networks, even if they're not necessarily linear networks. And we believe we can exploit them to fool them. And they first introduced the fast gradient sign method, which we'll talk about later today.
12:22
So how does it work? First, I want us to get a little bit of an intuition around how this works. Here's a graphic of gradient descent. And in gradient descent, we have this vertical axis is our cost function. And what we're trying to do is we're trying to minimize cost.
12:41
We want to minimize the error. And so when we start out, we just chose random weights and variables. So all of our hidden layers, they just have maybe random weights or random distribution. And then we want to get to a place where the weights have meaning, right? We want our network to know something,
13:01
even if it's just a mathematical pattern, right? So we start in the high area of the graph, or the reddish area. And that's where we started, and we have high error there. And then we try to get to the lowest area of the graph, or here, the dark blue that is right about here.
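To make the descent itself concrete, here is a toy sketch, not the speaker's example, of gradient descent on a simple made-up cost function: start from random weights and keep stepping downhill along the negative gradient.

```python
# Toy gradient descent on a made-up cost function J(w) = w1^2 + 3*w2^2.
import numpy as np

def cost(w):
    return w[0] ** 2 + 3 * w[1] ** 2

def gradient(w):
    return np.array([2 * w[0], 6 * w[1]])

w = np.random.randn(2)                 # start with random weights: high error
learning_rate = 0.1
for step in range(100):                # each training step moves a little downhill
    w = w - learning_rate * gradient(w)

print(cost(w))                         # now close to the minimum of the cost surface
```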
13:24
But sometimes what happens, as we learn, as we go through epochs of training, we're moving slowly down, and hopefully we're optimizing. But instead of ending up in this global minimum, we might end up in a local minimum, which is the other trail.
13:41
And that's fine, because it's still a low error, right? So we're still probably going to be able to succeed, but we might not get the best answer all the time. What adversarial learning tries to do, in the most basic of ways, is it essentially tries to push the error rate back up the hill for
14:02
as many units as it can. So it essentially tries to increase the error slowly through perturbations. And by disrupting, let's say, the weakest links, like the one that did not find the global minimum, but instead found a local minimum, we can hopefully fool the network.
14:21
Cuz we're finding those weak spots, and we're capitalizing on them, essentially. So what does an adversarial example actually look like? You may have already seen this, because it was very popular on the Twitter sphere and a few other places.
14:40
But this was a series of researchers at MIT. And it was debated whether you could do adversarial learning in the real world. A lot of the research has just been a still image. And what they were able to show is they created a 3D printed turtle. I mean, it looks like a turtle to you as well, correct?
15:04
And this 3D printed turtle by the Inception network, which is a very popular computer vision network, is a rifle. And it is a rifle in every angle that you can see. And the way they were able to do this, and
15:20
I don't know if you can see it the next time it goes round, and it's a little bit easier on the video, which I have posted and will share at the end, is that there's a slight discoloration of the shell. And they messed with the texture. And by messing with this texture and the colors, they were able to fool the neural network. They were able to activate different neurons that were not supposed to be activated.
15:44
Units, I should say. And so what we see here is, yeah, it can be done in the real world. And when I saw this, I started getting really excited. Cuz video surveillance is a real thing, right? So if we can start fooling 3D objects, we can perhaps start
16:02
fooling other things in the real world that we would like to fool. So why do adversarial examples exist? We're gonna talk a little bit about some things that are approximations of
16:22
what's actually happening. So please forgive me for not being always exact, but I would rather us all have a general understanding of what's happening. Across the top row, we have an input layer. And these images to the left, we can see are the source images. And this source image is like a piece of farming equipment or something.
16:42
And on the right, we have our guide image. This is what we're trying to get the network to see. We wanted to misclassify this farm equipment as a pink bird. So what these researchers did is they targeted different layers of the network. And they said, okay, we're going to use this method to target this
17:01
particular layer, and we'll see what happens. And so as they targeted these different layers, you can see what's happening on the internal visualization. Now, neural networks can't see, right? They're looking at matrices of numbers. But what we can do is we can use those internal values to try and
17:20
see with our human eyes what they are learning. And we can see here clearly inside the network, we no longer see the farming equipment, right? We see a pink bird. And this is not visible to our human eyes. Now, if you really study and if you enlarge the image,
17:42
you can start to see, okay, there's a little bit of pink here or greens. I don't know what's happening. But we can still see it in the neural network we have tricked. Now, people don't exactly know yet why these blind spots exist. So it's still an area of active research exactly why we can fool
18:03
neural networks so easily. There are some prominent researchers that believe that neural networks are essentially very linear. And that we can use this simple linearity to misclassify, to jump into another area. But there are others that believe that there's these pockets or blind spots.
18:24
And that we can then find these blind spots where these neurons really are the weakest links and they maybe even haven't learned anything. And if we change their activation, then we can fool the network easily. So this is still an area of active research. And let's say you're looking for your thesis. This would be a pretty neat thing to work on.
18:44
So we'll get into just a brief overview of some of the math behind the most popular methods. First, we have the fast gradient sign method. And that was used in the initial paper and now there's been many iterations on it. And what we do is we have our same cost function.
19:02
So this is the same way that we're trying to train our network and it's trying to learn. And we take the gradient sign of that. And it's okay if you're not used to doing vector calculus, especially without a pen and paper in front of you. But what we're doing is we're essentially trying to calculate
19:23
some approximation of a derivative of the function. And this can kind of tell us where it is going. And if we know where it's going, we can maybe anticipate that and change it. And then, to create the adversarial images, we take the original input plus a small number epsilon times that gradient sign.
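Written out, the fast gradient sign method is usually stated like this (the formula is the standard one from the literature, not quoted from the slides):

\[
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\!\big(\nabla_x J(\theta, x, y)\big)
\]

where \(J\) is the cost function, \(\theta\) the model parameters, \(x\) the original input, \(y\) its label, and \(\epsilon\) the small perturbation size.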
19:46
For the Jacobian saliency map, this is a newer method and it's a little bit more effective. But it takes a little bit more compute. And so this Jacobian saliency map uses a Jacobian matrix.
20:01
And if you remember also, and it's okay if you don't, a Jacobian matrix looks at the forward derivative of a function. So you take the forward derivative of the cost function, and it gives you a matrix that is a pointwise approximation of the function, if the function is differentiable at that input vector.
20:22
Don't worry, you can review this later too. We then use the Jacobian matrix to create the saliency map, in the same way where we're essentially trying to find some sort of linear or pointwise approximation. And we then want to find the two pixels that we can perturb
20:40
that cause the most disruption. And then we continue to the next. And unfortunately, this is currently an O(n²) problem, but there are a few people who are trying to essentially find ways that we can approximate this and make it faster. So maybe now, you want to fool a network too.
21:00
And I hope you do, cuz that's what we're gonna talk about. First, you need to pick a problem or a network type. So you may already know, but you may want to investigate: what is this company perhaps using? What method might they be using? And do a little bit of research, cuz that's going to help you.
21:22
Then you want to research state of the art methods. And this is like a typical research statement that you have a new state of the art method. But the good news is that the state of the art two to three years ago is most likely in production or in systems today. So once they find ways to speed it up,
21:41
some approximation of that is deployed. And a lot of times, these are then publicly available models. So a lot of times, if you're already working with a deep learning framework, they'll come pre-packaged with a few of the different popular models. So you can even use that. If you're already building neural networks, of course you can build your own.
22:01
An optional step, but one that might be recommended, is to fine tune your model. And what this means is to essentially take a new training data set, maybe data that you think this company is using or that you think this network is using. And you're going to remove the last few layers of the neural network. And you're going to retrain it.
22:20
So you essentially are nicely piggybacking on the work of the pre-trained model. And you're using the final layers to create finesse. This essentially makes your model better at the task that you have for it. Finally, you use a library, and we'll go through a few of them. But some of the ones that I have used myself are CleverHans, DeepFool, and
22:43
Deep Pwning. And these all come with nice built-in features for you to use for, let's say, the fast gradient sign method, the Jacobian saliency map, and a few other methods that are available. Finally, it's not going to always work. So depending on your source and
23:00
your target, you won't always necessarily find a match. What researchers have shown is it's a lot easier to fool a network that a cat is a dog than it is to fool a network that a cat is an airplane. And this is just like we can make these intuitive. So you might want to pick an input that's not super dissimilar
23:20
from where you want to go, but is dissimilar enough. And you want to test it locally, and then finally test the ones with the highest misclassification rates on the target network. And you might say, Catherine, or you can call me KJAM, that's okay.
23:41
You might say, I don't know what the person is using. I don't know what the company is using. And I will say, it's okay, because what's been proven is you can attack a black box model. You do not have to know what they're using. You do not have to know exactly how it works.
24:02
You don't even have to know their training data. Cuz what you can do is if it has, okay, addendum, it has to have some API you can interface with. But if it has an API you can interface with, or even any API you can interact with that uses the same type of learning, you can collect training data by querying the API.
24:25
And then you're training your local model on that data that you're collecting. So you're collecting the data, you're training your local model. And as your local model gets more accurate and more similar to the deployed black box that you don't know how it works, you are then still able to fool it.
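A hand-wavy sketch of that black-box loop, with a stand-in query_api function in place of whatever real prediction service you would actually be probing (everything here is invented for illustration):

```python
# Hypothetical black-box loop: label our own inputs by querying the target,
# then train a local substitute model that imitates it.
import numpy as np
from sklearn.neural_network import MLPClassifier

def query_api(samples):
    # Stand-in for the real prediction API we want to fool; it just returns
    # fake labels here so the sketch runs end to end.
    return (samples.sum(axis=1) > samples.shape[1] / 2).astype(int)

x_seed = np.random.rand(500, 32)     # inputs we control and are allowed to send
y_seed = query_api(x_seed)           # the target's own answers become our labels

substitute = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
substitute.fit(x_seed, y_seed)       # local model that mimics the black box

# Adversarial examples crafted against `substitute` tend to transfer to the
# real service, which is the transferability result discussed next.
```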
24:42
And what this paper proved, by Nicolas Papernot and a few other great researchers, is that with usually fewer than 6,000 queries, they were able to fool the network with between 84 and 97% certainty. And what the same group of researchers also studied is the ability to transfer
25:05
the ability to fool one network into another network. And they called that transferability. So I can take a certain type of network and I can use adversarial examples against this network to fool a different type of machine learning technique.
25:22
And here we have their matrix, their heat map that shows us exactly what they were able to fool. So we have across the left hand side here, the source machine learning technique. We have deep learning, logistic regression, SVMs like we talked about, decision trees, and k nearest neighbors.
25:42
And across the bottom, we have the target machine learning. So what were they targeting? They created the adversaries with the left hand side and they targeted across the bottom. We finally have an ensemble model at the end. And what they were able to show is, for example, SVMs and decision trees are quite easy to fool.
26:02
But logistic regression, a little bit less so, but still strong. For deep learning and k nearest neighbors, if you train a deep learning model or a k nearest neighbor model, then that performs fairly well against itself. And so what they're able to show is that you don't necessarily need to know
26:21
the target machine, and you don't even have to get it right even if you do know. You can use a different type of machine learning technique to target the network. So we'll look at six lines of Python here. And in the six lines of Python, I'm using the CleverHans library.
26:43
And in six lines of Python, I can both generate my adversarial input, and I can even predict on it. So if you don't code Python, it's pretty easy to learn and pick up.
27:01
And for example, here we have Keras. And Keras is a very popular deep learning library in Python. It usually works with a Theano or a TensorFlow backend. And we can just wrap our model, pass it to the fast gradient method class, and then set up some parameters.
27:20
So here's our epsilon and a few extra parameters. This is to tune our adversary. And finally, we can generate our adversarial examples and then predict on them. So in a very small amount of Python, we're able to target and trick a network.
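The slide code itself is not in the transcript, but a reconstruction along these lines, using the CleverHans 2.x-era API that was current around the time of the talk (treat the names and parameters as approximate, not as the speaker's exact code):

```python
# Sketch of the six-line CleverHans example described above (CleverHans 2.x-era API).
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils_keras import KerasModelWrapper

sess = keras.backend.get_session()                    # CleverHans needs the TF session

model = Sequential([Flatten(input_shape=(28, 28)),    # toy stand-in classifier
                    Dense(10, activation='softmax')])
model.compile(optimizer='sgd', loss='categorical_crossentropy')

wrap = KerasModelWrapper(model)                       # wrap the Keras model
fgsm = FastGradientMethod(wrap, sess=sess)            # set up the attack
fgsm_params = {'eps': 0.3, 'clip_min': 0.0, 'clip_max': 1.0}

x = np.random.rand(5, 28, 28).astype('float32')       # stand-in input images
adv_x = fgsm.generate_np(x, **fgsm_params)            # generate adversarial examples
print(model.predict(adv_x))                           # ...and predict on them
```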
27:40
And if you're already using TensorFlow or Keras, it already works with those libraries. Deep Pwning is one of the first libraries that I heard about in this space, and it was presented at DEFCON in 2016. And what it comes with is a bunch of TensorFlow built-in code.
28:00
It even comes with a way that you can train the model yourself. So it has a few different models, a few different convolutional neural networks, and these are predominantly used in computer vision. It also, however, has a semantic model. And I normally work in NLP, and I was pretty excited to try it out.
28:21
And what it comes built with is the Rotten Tomatoes sentiment task. So this is Rotten Tomatoes movie reviews where it tries to learn whether a review is positive or negative. So the original text that I put in when I was generating my adversarial examples was More Trifle Than Triumph, which is a real review. And the adversarial text that it gave me was Jonah Refreshing Haunting Leaky.
28:49
Yeah. So I was able to fool my network, but I lost any type of meaning. And this is really the problem when we think about how we apply adversarial learning to different tasks is it's easy for an image if we make a few changes
29:05
for it to retain its image, right? It's many, many pixels. But when we start going into language, if we change one word and then another word and another word, or maybe we change all of the words, we no longer understand as humans. And I would say this is garbage in, garbage out.
29:22
This is not actual adversarial learning. So we have a long way to go when it comes to language tasks and being able to do adversarial learning. And there is some research in this, but it's not really advanced yet. So hopefully this is something that we can continue to work on and advance further. And if so, we need to support a few different types of networks
29:43
that are more common in NLP than they are in computer vision. There's some other notable open source libraries that are available to you, and I'll cover just a few here. There's the Vanderbilt Computational Economics Research Lab that has AdLib, and this allows you to do poisoning attacks.
30:03
So if you want to target training data and poison it, then you can do so with that. And it uses scikit-learn. DeepFool allows you to do the fast gradient sign method, but it tries to do smaller perturbations. It tries to be less detectable to us humans.
30:23
It's based on Theano, which is another library that I believe uses Lua as well as Python. FoolBox is kind of neat because I only heard about it last week, but it collects a bunch of different techniques all in one library, and you can use it with one interface.
30:40
So if you want to experiment with a few different ones at once, I would recommend taking a look at that. And finally, for something that we'll talk about briefly in a short period of time, we have Evolving AI Lab, which released a fooling library, and this fooling library is able to generate images that you or I can't tell what it is,
31:01
but that the neural network is convinced it is something. So this, we'll talk about maybe some applications of this in a moment, but they also open sourced all of their code, and they're researchers who open source their code, which is always very exciting. As you may have known from some of the research I already cited,
31:22
most of the studies and the research in this area has been on malicious attacks. So there's very few people trying to figure out how to do this for what I would call benevolent purposes. Most of them are trying to act as an adversary in the traditional computer security sense. They're perhaps studying spam filters and how spammers can get by them.
31:43
They're perhaps looking at network intrusion or botnet attacks and so forth. They're perhaps looking at self-driving cars. So, and I know that was referenced earlier as well at Hendrik and Karen's talk, they're perhaps trying to make a yield sign look like a stop sign or a stop sign look like a yield sign or a speed limit and so forth.
32:02
And scarily, they are quite successful at this. Or perhaps they're looking at data poisoning. So how do we poison the model so we render it useless in a particular context so we can utilize that? And finally, for malware. So what a few researchers were able to show is by just changing a few things in the malware,
32:21
they were able to upload their malware to Google Mail and send it to someone. And this was still fully functional malware. In that same sense, there's the MALGAN project, which uses a generative adversarial network to create malware. That works, I guess. So there's a lot of research of these kind of malicious attacks
32:41
within adversarial learning. But what I wonder is how might we use this for good? And I put good in quotation marks because we all have different ethical and moral systems we use and what you may decide is ethical for you might be different, but I think as a community,
33:00
especially at a conference like this, hopefully we can converge on some ethical, privacy-concerned version of using these networks. So I've composed a few ideas. And I hope that this is just a starting list of a longer conversation.
33:23
One idea is that we can perhaps use this type of adversarial learning to fool surveillance. So as surveillance affects you and I, it even disproportionately affects people that most likely can't be here. And so whether or not we're personally affected,
33:41
we can care about the many lives that are affected by this type of surveillance. And we can try and build ways to fool surveillance systems. Steganography: so, potentially, in a world where more and more people have less of a private way of sending messages to one another, we can perhaps use adversarial learning
34:01
to send private messages. Adware fooling, so again, where I might have quite a lot of privilege and I don't actually see ads that are predatory on me as much, there's a lot of people in the world that face predatory advertising. And so how can we help those problems
34:21
by developing adversarial techniques? Poisoning your own private data. This depends on whether you actually need to use the service and whether you like how the service is helping you with the machine learning. But if you don't care or if you need to essentially have a burn box of your data, then potentially you could poison your own private data.
34:44
And finally, I want us to use it to investigate deployed models. So even if we don't actually need a use for fooling this particular network, the more we know about what's deployed and how we can fool it, the more we're able to keep up with this technology as it continues to evolve.
35:01
So the more that we're practicing, the more that we're ready for whatever might happen next. And finally, I really want to hear your ideas as well. So I'll be here throughout the whole Congress and of course you can share during the Q&A time. If you have great ideas, I really want to hear them.
35:21
So I decided to play around a little bit with some of my ideas. And I was convinced perhaps that I could make Facebook think I was a cat. This was my goal. Can Facebook think I'm a cat? Because nobody really likes Facebook. I mean, let's be honest, right?
35:41
But I have to be on it because my mom messages me there and she doesn't use email anymore so I'm on Facebook. Anyways, so I used a pre-trained Inception model in Keras and I fine-tuned the layers. And I'm not a computer vision person really but it took me like a day of figuring out how computer vision people transfer their data
36:02
into something I can put inside of a network. Figured that out and I was able to quickly train a model and the model could only distinguish between people and cats. That's all the model knew how to do. I give it a picture, it says it's a person or it's a cat. I have no idea. I actually didn't try just giving it an image of something else. It would probably guess it's a person or a cat.
36:22
Maybe 50-50, who knows? And what I did was I used an image of myself, and eventually I had my fast gradient sign method. I used CleverHans and I was able to slowly increase the epsilon. And so the epsilon, as it's low,
36:41
you and I can't see the perturbations but also the network can't see the perturbations. So we need to increase it and of course as we increase it when we're using a technique like FGSM, we are also increasing the noise that we see. And when I got to 0.21 epsilon and I kept uploading it to Facebook and Facebook kept saying,
37:01
yeah, do you wanna tag yourself? And I'm like, no, I don't. I'm just testing. Finally, I got to 0.21 epsilon and Facebook no longer knew I was a face. So I was just a book, I was a cat book, maybe.
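A rough sketch of that experiment, assuming Keras's bundled InceptionV3 weights and your own small set of person and cat images (the layer choices and the epsilon sweep are only illustrative, not the speaker's actual code):

```python
# Sketch of the person-vs-cat fine-tuning described above, using Keras's InceptionV3.
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
for layer in base.layers:
    layer.trainable = False                      # keep the pre-trained layers frozen

pooled = GlobalAveragePooling2D()(base.output)   # replace the final layers...
output = Dense(2, activation='softmax')(pooled)  # ...with a two-class person/cat head
model = Model(base.input, output)
model.compile(optimizer='sgd', loss='categorical_crossentropy')
# model.fit(person_and_cat_images, labels, epochs=...)   # supply your own small dataset

# Then sweep the FGSM epsilon upward (for example with CleverHans, as in the earlier
# sketch) until the perturbed photo is no longer detected as a face; in the talk
# that happened around eps = 0.21.
```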
37:22
So unfortunately, as we see, I didn't actually become a cat because that would be pretty neat. But I did, I was able to fool it. I spoke with a computer vision specialist that I know and she actually works in this. And I was like, what methods do you think Facebook is using? Like, did I really fool a neural network or what did I do?
37:40
And she's convinced most likely that they're actually using a statistical method called Viola Jones, which takes a look at the statistical distribution of your face and tries to guess if there's really a face there. But what I was able to show, transferability, is that I can use my neural network even to fool this statistical model.
38:01
So now I have a very noisy but happy photo on Facebook. Another use case potentially is adversarial steganography. And I was really excited reading this paper. What this paper covered, and they actually released the library as I mentioned,
38:20
is they studied the ability of a neural network to be convinced that something's there that's not actually there. And what they use is the MNIST training set. I'm sorry if that's like a trigger word. If you've used MNIST a million times, then I'm sorry for this. But what they use is MNIST, which is the digits zero through nine.
38:42
And what they were able to show using evolutionary algorithms is they were able to generate things that to us look maybe like art. And they actually used it on the CIFAR data set too, which has colors, and some of what they created was quite beautiful. In fact, they showed it in a gallery. And what the network sees here
39:01
is the digits across the top, they see that digit. They are more than 99% convinced that that digit is there. And what we see is pretty patterns or just noise. And when I was reading this paper, I was thinking how can we use this
39:20
to send messages to each other that nobody else will know is there? I'm just sending really nice, I'm an artist and this is my art and I'm sharing it with my friend. And in a world where I'm afraid to go home because there's a crazy person in charge, and I'm afraid that they might look at my phone
39:43
and my computer and a million other things. And I just want to make sure that my friend has my PIN number or this or that or whatever. I see a use case for my life. But again, I lead a fairly privileged life. There are other people whose actual life and livelihood and security might depend on using a technique like this.
40:03
And I think we could use adversarial learning to create a new form of steganography. Finally, I cannot stress enough that the more information we have about the systems that we interact with every day, that our machine learning systems,
40:22
that our AI systems or whatever you want to call it, that our deep networks, the more information we have, the better we can fight them. We don't need perfect knowledge, but the more knowledge that we have, the better an adversary we can be. And if I thankfully now live in Germany, and if you are also a European resident,
40:42
we have GDPR, which is the General Data Protection Regulation. And it goes into effect in May of 2018. And we can use GDPR to make requests about our data. We can use GDPR to make requests about machine learning systems that we interact with.
41:01
This is a right that we have. And in Recital 71 of the GDPR, it states, the data subject should have the right to not be subject to a decision, which may include a measure evaluating personal aspects related to him or her, which is based solely on automated processing and which produces legal effects concerning him or her,
41:24
or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention. And I'm not a lawyer, and I don't know how this will be implemented. And it's a recital,
41:40
so we don't even know if it will be enforced the same way. But the good news is pieces of this same sentiment are in the actual amendments. And if they're in the amendments, then we can legally use them. And what it also says is we can ask companies to port our data other places. We can ask companies to delete our data.
42:01
We can ask for information about how our data is processed. We can ask for information about what different automated decisions are being made. And the more we all here ask for that data, the more we can also share that same information with people worldwide. Because the systems that we interact with,
42:22
they're not special to us. They're the same types of systems that are being deployed everywhere in the world. So we can help our fellow humans outside of Europe by being good caretakers and using our rights to make more information available to the entire world. And to use this information
42:40
to find ways to use adversarial learning to fool these types of systems. So how else might we be able to harness this for good?
43:01
I cannot focus enough on GDPR and our right to collect more information about the information they're already collecting about us and everyone else. So use it. Let's find ways to share the information we gain from it. So I don't want it to just be that one person requests it and they learn something. We have to find ways to share this information
43:22
with one another. Test low tech ways. So I'm big into, you know, I'm so excited about the maker space here and maker culture and other low tech or human crafted ways to fool networks. We can use adversarial learning perhaps to get good ideas on how to fool networks, to get lower tech ways.
43:40
What if I painted red pixels all over my face? Would I still be recognized? Would I not? Let's experiment with things that we learn from adversarial learning and try to find other lower tech solutions to the same problem. Finally, or nearly finally, we need to increase the research beyond just computer vision.
44:01
Quite a lot of adversarial learning has been only in computer vision and while I think that's important and it's also been very practical because we can start to see how we can fool something, we need to figure out natural language processing. We need to figure out other ways that machine learning systems are being used and we need to come up with clever ways to fool them.
44:22
Finally, spread the word. So I don't want the conversation to end here. I don't want the conversation to end at Congress. I want you to go back to your hacker collective, your local CCC, the people that you talk with, your coworkers, and I want you to spread the word. I want you to do workshops on adversarial learning.
44:41
I want more people to not treat this AI as something mystical and powerful because unfortunately it is powerful but it's not mystical. So we need to demystify this space. We need to experiment, we need to hack on it and we need to find ways to play with it and spread the word to other people.
45:01
Finally, I really wanna hear your other ideas. And before I leave today, I have to say a little bit about why I decided to join the Resiliency Track this year. I read about the Resiliency Track and I was really excited. It spoke to me and I said, I want to live in a world
45:22
where even if there's an entire burning trash fire around me, I know that there are other people that I care about, that I can count on, that I can work with to try and at least protect portions of our world, to try and protect ourselves, to try and protect people that do not have as much privilege.
45:42
So what I wanna be a part of is something that can use maybe the skills I have and the skills you have to do something with that. And your data is a big source of value for everyone. Any free service you use, they are selling your data.
46:01
Okay, I don't know that for a fact, but it is very certain, I feel very certain about the fact that they're most likely selling your data. And if they're selling your data, they might also be buying your data. And there is a whole market that's legal, that's freely available to buy and sell your data.
46:20
And they make money off of that and they mine more information and make more money off of that and so forth. So I will read a little bit of my opinions that I put forth on this. Determine who you share your data with and for what reasons. GDPR and data portability give us, European residents,
46:42
stronger rights than most of the world. Let's use them. Let's choose privacy concerned ethical data companies over corporations that are entirely built on selling ads. Let's build startups, organizations, open source tools and systems that we can be truly proud of.
47:02
And let's port our data to those. We have time for a few questions. I'm not done yet, sorry.
47:21
It's fine, it's cool, no big deal. So, closing remarks on machine learning, a brief round-up. The closing remark is that machine learning is not very intelligent. I think artificial intelligence is a misnomer in a lot of ways, but this doesn't mean that people are going to stop using it.
47:41
In fact, there's very smart, powerful and rich people that are investing more than ever in it. So it's not going anywhere and it's going to be something that potentially becomes more dangerous over time because as we hand over more of these to these systems, it could potentially control more and more of our lives.
48:02
We can, however, use adversarial machine learning techniques to find ways to fool black box networks. So we can use these, and we know we don't have to have perfect knowledge. However, information is powerful, and the more information that we do have, the more we're able to become a good GDPR-based adversary.
48:23
So please use GDPR and let's discuss ways where we can share information. Finally, please support open source tools and research in the space because we need to keep up with where the state of the art is. So we need to keep ourselves moving and open in that way
48:40
and please support ethical data companies or start one. If you come to me and you say, Catherine, I'm going to charge you this much money but I will never sell your data and I will never buy your data. I would much rather you handle my data. So I want us, especially those within the EU, to start a new economy around trust
49:03
and privacy and ethical data use. Thank you very much. Thank you. Okay, we still have time for a few questions. No, no, no, no worries, no worries. Less than the last time when I walked up here.
49:22
Yeah, now I'm really done. Come up to one of the mics in the front section and raise your hand. Can I take a question from mic one? Thank you very much, very interesting talk. One impression that I got during the talk was with the adversarial learning approach, aren't we just doing pen testing and quality assurance
49:43
for the AI companies and they're just going to build better machines? That's a very good question. And of course, most of this research right now is coming from those companies because they're worried about this. What however they've shown is they don't really have a good way to learn how to fool this.
50:04
Most likely they will need to use a different type of network eventually. So probably whether it's the blind spots or the linearity of these networks, they are easy to fool and they will have to come up with a different method for generating something that is robust enough to not be tricked.
50:22
So to some degree, yes, it's a cat and mouse game, right, but that's why I want the research and the open source to continue as well. And I would be highly suspicious if they all of a sudden figured out a way to make a neural network, which has proven linear relationships that we can exploit, non-linear.
50:40
And if so, it's usually a different type of network that's a lot more expensive to train and that doesn't actually generalize well. So we're going to really hit them in a way where they're gonna have to be more specific, try harder, and I would rather do that than just kind of give up. Next one, Mike too.
51:03
Hello, thank you for the nice talk. I wanted to ask, have you ever tried looking at it from the other direction, like just trying to feed the companies falsely classified data, and doing it with such massive amounts of data
51:21
so that they learn from it at a certain point? Yeah, so that's these poisoning attacks. So when we talk about poisoning attacks, we're essentially feeding bad training data and we're trying to get them to learn bad things, or I wouldn't say bad things, but we're trying to get them to learn false information. And that already happens on accident all the time,
51:41
so I think the more too we can, if we share information and they have a publicly available API where they're actually actively learning from our information, then yes, I would say poisoning is a great attack way and we can also share information of maybe how that works. So especially I would be intrigued if we can do poisoning for adware
52:00
and malicious ad targeting. Okay, thank you. One more question from the internet and then we run out of time, so you can find a train after. Thank you, one question from the internet. So what exactly can I do to harm my model against adversarial samples?
52:20
Sorry? What exactly can I do to harden my model against adversarial samples? Not much. What they have shown is that if you train on a mixture of real training data and adversarial data, it's a little bit harder to fool, but that just means that you have to try
52:40
more iterations of adversarial input. So right now, the recommendation is to train on a mixture of adversarial and real training data and to continue to do that over time. And I would argue that you need to maybe do data validation on input, and if you do data validation on input,
53:00
maybe you can recognize abnormalities, but that's because I come mainly from the production level, not theoretical, and I think maybe you should just test things and see if they look weird. You should maybe not take them into the system. And that's all for the questions. I wish we had more time, but we just don't. Please give it up for Katharine Jarmul.
53:22
Thank you.