AI VILLAGE - The current state of adversarial machine learning

Video in TIB AV-Portal: AI VILLAGE - The current state of adversarial machine learning

Formal Metadata

Title: AI VILLAGE - The current state of adversarial machine learning
Title of Series:
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date:

Content Metadata

Subject Area
Machine learning is quickly becoming a ubiquitous technology in the computer security space, but how secure is it exactly? This talk covers the research occurring in adversarial machine learning and includes a discussion of machine learning blind spots, adversarial examples and how they are generated, and current blackbox testing techniques. Heather Lawrence is a cyber data scientist working with NARI. She earned her undergraduate and MS degrees in Computer Engineering from the University of Central Florida focusing on computer security. She is pursuing a PhD in Computer Engineering from the University of Nebraska Lincoln. Her previous experience in cyber threat intelligence modeling, darknet marketplace research, IT/OT testbed development, data mining, and machine learning has led to several awards from capture-the-flag competitions including the National Collegiate Cyber Defense Competition, CSI CyberSEED, and SANS Netwars Tournament. Her current research interests focus on the application of machine learning to cybersecurity problem sets.
Hi, I'm presenting "Adversarial Example Witchcraft," or how to use alchemy to turn turtles into rifles. I'm Heather Lawrence; I do data science at the Nebraska Applied Research Institute. You can find me on Twitter under infosecanon, and if you didn't get a chance to visit the slides online, I have them on my Twitter. I'm also not above bribing my audience, so I have stickers and cards in the back in the event that you want them; this goon back there is waving his hand. All right, cool.

So here we see a video of Google's state-of-the-art Inception v3 model, and you see a printed turtle. It's correctly identified as a turtle, and then, after changes are made to its texture map, the classifier believes it's a rifle with high confidence, from every angle. Look at that rifle; isn't it pretty? Most of the research in this space has focused on manipulating image classifiers, because it's easier to tell visually that an effect is occurring; we watched this video and saw the turtle being identified as a rifle. So let me motivate the real question of this talk: what happens when an autonomous system cannot tell the difference between a turtle and a rifle in a surveillance state? Just marinate on that one for a second.

I didn't write this talk for machine learning experts; I wrote it to be approachable, so let me define some terminology. A classifier is a style of machine learning algorithm that determines the class of a piece of data. I might say SVM, which stands for support vector machine; it's a type of algorithm, and you don't need to know any of the math behind how it works. When I say perturbation, I basically mean adding noise; it's a very fancy word for adding noise. And an adversarial example is a worst-case example presented to an algorithm.
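Since "perturbation" just means bounded noise, here is a minimal sketch of what an L-infinity-bounded perturbation looks like in code. This is an illustrative example with made-up values, not from the talk; it uses NumPy and a random 8x8 array standing in for an image.

```python
import numpy as np

# A stand-in "image": 8x8 grayscale values in [0, 1].
rng = np.random.default_rng(0)
image = rng.random((8, 8))

# A perturbation is just noise, here bounded so that no pixel
# changes by more than eps (an L-infinity budget).
eps = 0.05
noise = rng.uniform(-eps, eps, size=image.shape)
perturbed = np.clip(image + noise, 0.0, 1.0)

# Every pixel moved, but the largest change is at most eps,
# which is why perturbed images can look identical to a human.
max_change = np.abs(perturbed - image).max()
print(f"largest per-pixel change: {max_change:.3f}")
assert max_change <= eps
```

Adversarial examples use exactly this kind of small, bounded noise; the difference is that the noise is chosen deliberately rather than at random.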
My outline goes like this: a brief history, types of attacks, what blind spots are (important for motivating adversarial examples), what adversarial examples are and how to defend against them as far as we know, white-box versus black-box techniques, a demo, and then resources at the end.

In 2004, Dalvi et al. released a paper called "Adversarial Classification" in the spam detection domain; it outlined a formal game between an attacker and a defending classifier, each trying to fool the other. Then Huang et al., in "Adversarial Machine Learning," defined a formal taxonomy of the attacks that are possible. And by 2016 it gets interesting, because we've moved beyond the theoretical: I don't even need access to your classifier to attack it anymore.

So we have poisoning versus evasion: poisoning happens before training and evasion happens after training. You'll notice here, from the Biggio et al. paper, part of the MNIST data set, which, if you aren't familiar, is a huge image data set of handwritten digits; the idea is that the classifier tries to properly determine what each handwritten digit is. They added some noise, and the validation error shot up right after. The evasion attack happens after training: you see a bus, we add some noise, and now it's classified as an ostrich. Looks just like an ostrich, right?

The types of attacks: causative, where you manipulate the training data before training, if you have that kind of access; data poisoning, where you craft special attack points and inject them into the training data, again before training; exploratory, where you try to explore and exploit the classifier to figure out how it works after it's already been trained; and hybrid, a mixture of those attacks.

This is probably the most important part of the talk: what is a blind spot, and why do I care about it? Blind spots are regions in the model's decision space where the decision boundary is inaccurate; basically, areas that are not well defined. I like to use pandas, so let's say I'm training a classifier on what a panda looks like, and I've got a whole bunch of images of pandas. Now think for a second about the entire sample space of everything that is not a panda, which I would also have to provide to the classifier. I can't exhaust that space in any reasonable amount of time; the overhead is crazy. If you don't provide that data, the classifier has to infer what is not a panda based on what it thinks a panda is, and that's where these blind spots come from: we don't exhaustively provide the classifier that data. Mind you, this is an ongoing research area; nobody has definitively proven yet why blind spots exist, and this is the theoretical explanation. Let me motivate that real quick.

Here be bugs: as introspection into algorithms increases, so do the flaws we find, like with bug bounty programs. With more eyes on the lines of code, you're going to see more errors, so if you have more AI experts looking at the algorithms, you're going to see more flaws. The Bureau of Labor Statistics estimates there are about 105,000 information security analysts here in the US, whereas Element AI estimates there are only about 22,000 AI experts worldwide; that's a factor of five in this country alone. So, do you want to get into machine learning? We need you.

So what are adversarial examples? They're data that presents a worst case to the classifier, intentionally designed to make the classifier make a wrong decision. Some examples, particularly in information security, are detecting domain generation algorithms used in command-and-control infrastructure, and malicious portable executables that get classified as benign. There's actually a really cool paper behind that last one: they determined the parts of the executable that could not be perturbed or changed if it was still going to execute, then took all the other bits and perturbed those, and the classifier could not detect the result as malicious; it said, oh, this is fine, this is benign. If you're in information security, you remember signature-based detection and how that became a problem; well, now we have the same problem with machine-learning-based detection. We're at the next stage of the attack-defense paradigm.

Some real-world adversarial examples: the sticker attack on self-driving cars, where the car cannot identify that a stop sign is a stop sign; eyeglass frames against facial recognition systems, which cannot properly identify who a person is with the right pair of glasses on; and perturbing audio, taking "it was the best of times, it was the worst of times," adding some noise, and getting "it is a truth universally acknowledged." Those aren't the same at all. And remember how you used to throw salt over your shoulder for good luck? Now we're using salt circles to trap self-driving cars. We are effectively using alchemy to fool AI systems.

Generating adversarial examples is what I've been talking about this whole time: we're adding noise, perturbations, to the sample, and it's optimized with something called gradient ascent. It has to do with derivatives; that part is not particularly important, but it's a method that determines the direction that moves the algorithm's output by the greatest degree, and then moves the input by small steps to create that output. That's a lot of words; basically we're still just adding noise, but now it's special noise, added to every pixel, so that when the classifier looks at the image it lands right in that blind spot: it doesn't know whether to infer that's a panda or not.

So what can we do about adversarial examples? We can start building more robust algorithms. We know that retraining from scratch increases misclassifications, we know that retraining with disjoint data increases misclassifications, and we're starting to find that training with adversarial examples reduces misclassifications.
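The gradient-ascent recipe above can be sketched end to end. This is a hypothetical minimal example, not the speaker's code: a hand-rolled logistic regression with weights chosen by hand, where the input is nudged a small step in the sign of the loss gradient (the fast-gradient-sign idea).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, y, w, b):
    """Cross-entropy loss of a logistic-regression model on one sample."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# A toy "trained" model (weights invented for illustration).
w = np.array([1.5, -2.0, 0.5])
b = 0.1

# A sample the model currently classifies correctly as class 1.
x = np.array([2.0, -1.0, 0.5])
y = 1.0

# Gradient of the loss with respect to the INPUT (not the weights):
# for logistic regression this has the closed form (p - y) * w.
grad_x = (sigmoid(w @ x + b) - y) * w

# Gradient ascent on the input: step in the direction that increases
# the loss the most per unit of L-infinity perturbation budget.
eps = 0.25
x_adv = x + eps * np.sign(grad_x)

print(f"loss before: {loss(x, y, w, b):.4f}")
print(f"loss after:  {loss(x_adv, y, w, b):.4f}")
assert loss(x_adv, y, w, b) > loss(x, y, w, b)
```

The same small-step trick applied to every pixel of an image is what produces the panda-into-gibbon and turtle-into-rifle examples; against a deep network the gradient comes from backpropagation instead of a closed form.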
If we reduce the weights, or the activation given to the inputs, we can reduce how much the classifier is affected. We can also choose to keep a human in the loop: do not let autonomous systems do whatever they want unchecked. Or you can use something called the consensus method: instead of a single trained classifier, you have, say, three trained classifiers that each take the input and together come to a decision on whether that input should be trusted. These are all methods for making classifiers more robust.

We might also see our training life cycles change. Anybody in this room who has done machine learning will recognize the pipeline: import data, clean data, test/train split, then deploy. With adversarial examples we may have to train with them, keep testing with them, and repeat that process to shrink the vulnerable sample space, only deploying after we retrain.

Unfortunately, some of the early research in this area had really unrealistic attacker assumptions: easy-mode white-box assumptions where the attacker has the code, the training data, everything. Who has that kind of access? I don't have that kind of access. The papers reference the information security community for ideas about more realistic attacks, but black-box research is here now. It assumes more constraints on the attacker, and attacks can be transferred between classifiers; that's called model transferability, and I'll get into it in a second. The idea is that the attacker uses the victim model as an oracle: it queries the oracle over and over for its classification decisions, trains a completely separate model on those decisions, and generates the adversarial examples from that attacker model.
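The consensus method mentioned above can be sketched as a simple majority vote. This is a hypothetical minimal example, with each "classifier" reduced to a callable and the thresholds invented for illustration:

```python
from collections import Counter

def consensus(classifiers, x):
    """Majority vote over independently trained classifiers.

    Flag the input as untrusted when no strict majority agrees,
    instead of silently acting on a single model's decision.
    """
    votes = [clf(x) for clf in classifiers]
    label, count = Counter(votes).most_common(1)[0]
    trusted = count > len(votes) // 2  # strict majority required
    return label, trusted

# Three stand-in classifiers, with slightly different decision
# thresholds, as if each was trained on a different data split.
clf_a = lambda x: "panda" if x < 0.5 else "gibbon"
clf_b = lambda x: "panda" if x < 0.6 else "gibbon"
clf_c = lambda x: "panda" if x < 0.4 else "gibbon"

print(consensus([clf_a, clf_b, clf_c], 0.3))   # ('panda', True)
print(consensus([clf_a, clf_b, clf_c], 0.55))  # the models split 2-1
```

An adversarial example crafted against one model often lands in a different blind spot than the others, so disagreement between the voters is itself a useful signal.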
Those examples still work on the victim. This is from "Adversarial Examples in Machine Learning," presented by Papernot at USENIX Enigma 2017, and it's the same idea represented visually: the attacker queries the oracle, the oracle returns its classification decisions, the attacker trains classifier B, and the adversarial examples that are effective against classifier B also affect classifier A.

The graph you see here on the left, the one with all the boxes, shows which source machine learning technique affected which target. On the vertical axis, look at LR, logistic regression: if my source was logistic regression and I used its adversarial examples against a victim model that was also logistic regression, the increase in misclassifications goes up to 91 percent. That's huge; 91 percent, look at that big black column. Decision trees are currently the worst off. And this is really scary, because how many classifiers do you know of that are just logistic regression or support vector machines? These are pretty popular classifier models. What we know is that differentiable models like logistic regression are more affected than models that aren't, so deep neural networks are less affected than logistic regression is.

The attacks also make use of something called reservoir sampling: if you have to query the oracle 10,000 times, somebody is going to notice. With reservoir sampling I can get that down to around a thousand queries and be less noticeable, while still keeping a randomized sample of the space, almost as if I had made the full set of queries. Notable recent research is very interesting, and this space is moving very fast right now: people are taking a limited-information approach and testing it on the Google Cloud and AWS classifiers to see how well they are affected, and they are affected.
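Reservoir sampling keeps a uniform random sample of k items from a stream without storing the whole stream, which is why it helps an attacker retain a representative subset of oracle answers. A minimal sketch of the classic Algorithm R (illustrative only, not the implementation from the paper):

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Algorithm R: uniform sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1), which keeps
            # every item seen so far equally likely to be in the sample.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

# e.g. keep 10 representative query results out of 10,000 observed.
sample = reservoir_sample(range(10_000), 10)
print(len(sample))  # 10
```

Because the reservoir is updated one item at a time, the attacker never needs to know in advance how many oracle responses there will be.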
Some of the papers I have in the resources go into more detail on this and actually give percentages; they are in the 80s and above, so state-of-the-art classifiers in the cloud are affected.

All right, the demo. I know the adversarial patch talk was given yesterday, but there's an app called TF Classify, a classification app that attempts to classify the things in front of it, one at a time. Show it this adversarial patch: it says it looks like a toaster. Does that look like a toaster to you? Yes? Yes, it looks like a toaster, man; I don't know what you have, but I want whatever you have. So let's go through this demo: we have a pair of glasses at 67, 70 percent, pretty good; yep, that's a toaster at about 40 percent. And then it's kind of limited in what it's been trained on, so here's my [inaudible], and that's apparently a Granny Smith apple. Good job, guys.

I have a bunch of references for this talk that I couldn't get into in 20 minutes, which is why I provided links to the slides; they're written in 9-point font, and I don't expect you to take pictures of that. And maybe you're in this talk thinking, man, machine learning sounds cool. Here are some resources for you: there's a GitHub page on machine learning for cybersecurity which is amazing, and of course Andrew Ng's machine learning course is the go-to thing for machine learning.

Takeaways: machine learning models can be attacked; algorithms, like humans, have blind spots; and you need to red-team your algorithms to increase their robustness, otherwise somebody is going to do it for you, and you may not know. And, like SQL injection, classifiers require input validation. If your classifier takes input from an adversarial environment, or a possibly adversarial environment (that is, users who can submit data), make sure you control the data you accept from those users; don't allow it to retrain your classifier or otherwise alter it into making poor decisions. Again, you can get the slides here. My name is Heather Lawrence, I do data science at NARI; thank you for your attention.
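The "input validation for classifiers" takeaway can be sketched as a gate in front of the retraining pipeline. Everything here is hypothetical: the thresholds, the quarantine policy, and the function names are illustrative choices, not from the talk.

```python
def accept_for_retraining(sample_id, model_confidence, source_trusted,
                          min_confidence=0.6, max_confidence=0.99):
    """Decide whether a user-supplied sample may enter the retraining set.

    Mirrors SQL-injection hygiene: never feed raw input from an
    adversarial environment straight into something that changes
    system behavior (here, the model's training data).
    """
    if not source_trusted:
        return False, "untrusted source: quarantine for human review"
    if model_confidence < min_confidence:
        return False, "low confidence: possible blind-spot probe"
    if model_confidence > max_confidence:
        return False, "suspiciously high confidence: possible crafted input"
    return True, "accepted"

print(accept_for_retraining("img_001", 0.87, True))    # accepted
print(accept_for_retraining("img_002", 0.999, True))   # rejected
print(accept_for_retraining("img_003", 0.87, False))   # rejected
```

The point is not these particular thresholds but the structure: user data is screened, and anything suspicious goes to a human in the loop rather than directly into the model.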