CAAD VILLAGE - GeekPwn - The Uprising Geekpwn AI/Robotics Cybersecurity Contest U.S. 2018 - Hardware Trojan Attacks on Neural Networks


Formal Metadata

CAAD VILLAGE - GeekPwn - The Uprising Geekpwn AI/Robotics Cybersecurity Contest U.S. 2018 - Hardware Trojan Attacks on Neural Networks
Title of Series
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Driven by its accessibility and ubiquity, deep learning has seen rapid growth into a variety of fields in recent years, including many safety-critical areas. With the rising demands for computational power and speed in machine learning, there is a growing need for hardware architectures optimized for deep learning and other machine learning models, specifically in tightly constrained edge-based systems. Unfortunately, the modern fabless business model of manufacturing hardware, while economical, leads to deficiencies in security through the supply chain. In addition, the embedded, distributed, unsupervised, and physically exposed nature of edge devices makes various hardware and physical attacks on them critical threats. In this talk, I will first introduce the landscape of adversarial machine learning on the edge. I will discuss several new attacks on neural networks from the hardware or physical perspective. I will then present our method for inserting a backdoor into neural networks. Our method is distinct from prior attacks in that it neither alters the weights nor the inputs of a neural network; rather, it inserts a backdoor by altering the functionality of the operations the network performs on those parameters during the production of the neural network. Joseph Clements works with Dr. Yingjie Lao's Secure and Innovative Computing Research Group, conducting research on adversarial AI in edge-based deep learning technologies. In the fall semester of 2017, Joseph joined Clemson University's Holcombe Department of Electrical and Computer Engineering in pursuit of his PhD. He graduated with a bachelor's degree in computer engineering from the University of South Alabama in May of 2016. There, he engaged in research with Dr. Mark Yampolskiy on the security of additive manufacturing and cyber-physical systems. His research interests include machine learning and artificial intelligence, security, and VLSI design.
[Laughs] Thanks again for coming, and good afternoon. My name is Joseph Clements, and I'm here to present my recent work, published in the paper "Hardware Trojan Attacks on Neural Networks." I am a PhD student at Clemson University.
I work closely with my PhD advisor, Dr. Yingjie Lao, and our research group, the Secure and Innovative Computing Research Group. My specific areas of focus are adversarial machine learning and hardware security; we also have members of our team who work on VLSI design and approximate computing, as well as post-quantum cryptography and homomorphic computing. Today I'd like to begin my presentation by building a motivation for hardware security in machine learning, then describe the landscape of hardware attacks on machine learning, and after that present our attack as one example of an attack possible on machine learning hardware. Finally, I'll conclude with future directions in this field.
The modern paradigm of machine learning is to implement machine learning in the cloud. In this system, a designer first imports data to the cloud, trains a model on the cloud, and that model is then housed on the cloud. Whenever an edge user wants to interact with that model, the user exports their data to the cloud, the machine learning happens on the cloud, and the inference is sent back to the edge. Examples of systems like this are the electronic assistants we all know, facial recognition used by social media and security applications, and email spam and malware prevention. Attacks in this scenario have been well studied. There are two avenues of attack that an adversary can take when attacking machine learning in the cloud: attacks on the training phase, which modify the training process so that a malicious version of the machine learning model is created, and attacks on the inference phase, in which adversaries try to take a model that has already been built and find a way to maliciously produce outputs with it. Adversarial machine learning in this setting has been well studied; however, we still need a lot of future research in this area.
Unfortunately, machine learning in the cloud is not suitable for every scenario. For example, because of the latency of communicating back and forth between the edge user and the cloud, systems like autonomous driving, which need real-time inference, are not a good fit. Similarly, security systems may have confidentiality issues with sending data to the cloud. And in mobile applications and wearable technology, where machine learning is really becoming big, users tend to demand that their machine learning applications work well regardless of whether they have a stable connection, which in a lot of cases is not possible with cloud-based machine learning.

Therefore, we predict that machine learning will, at least in part, move to the edge. In this system the training still occurs in the cloud; however, inference moves from the cloud to the edge. The way this is done is that the model built in the cloud is reproduced on the edge devices, so that edge users can communicate directly with the machine learning model on their own device, which solves issues like latency and unreliable communication. This, however, adds two additional constraints to our models: edge devices often have severe computational limitations in comparison with cloud computing systems, and they often run off battery power and so are very limited in their power budgets. Because of this, machine learning will need very diverse hardware platforms able to handle these power-hungry, ever deeper neural networks in edge-based systems, and we will also need joint software and hardware optimization; the supply chain therefore becomes very important when we are talking about an edge-based system. Adversarial machine learning in this scenario in some ways remains very similar to adversarial machine learning in the cloud, but in other ways it is very different. The training phase, still done in the cloud, remains relatively the same. In the inference phase, however, an adversary may have direct access to the physical implementation of the machine learning algorithm through the edge device, so attacks on the inference phase may be much easier than they were in cloud scenarios. In addition, adversaries gain a new avenue of attack: they can attack the production phase to target the machine learning application.
Unfortunately, the modern fabless business model has a lot of security flaws. This model of supply chain is very economical: in the past the supply chain was vertical, where one company would do most of the steps, from circuit design to fabrication to manufacturing, whereas in the horizontal fabless business model the steps are divided among multiple companies, sometimes spanning multiple countries. There are three key issues in the fabless business model: third-party IPs are very common, globalization is widespread, and there are generally multiple vendors involved in these systems. These three things render devices produced in the fabless business model untrusted. Attacks in the hardware domain have accordingly been studied very extensively, and we are going to talk about four of them: IP piracy and IC overbuilding, counterfeiting attacks, side-channel attacks, and hardware Trojan attacks.
IP piracy is when an adversary attempts to steal the intellectual property of another party and pass it off as their own. IP piracy is seen in software as well; however, it is sometimes arguably easier in hardware, because an adversary with direct physical access to the hardware can simply decapsulate the device, put it under a microscope, and physically reverse engineer it. If they have access to the supply chain, we then have the problem of IC overbuilding, where an entity simply produces more devices than it is licensed to produce, essentially creating multiple copies of IP it never paid for. Because of the economic costs of producing IP, such as paying engineers, development costs, and research costs, gaining access to IP in this way can be very economical, especially with easy access to the physical device or the supply chain. Counterfeiting is the flip side of this.
Counterfeiting is when an entity takes an item and tries to reproduce it and make it seem like the original. For example, here we have pictures of a fake iPhone next to a real iPhone, and a fake Arduino next to a real one, and you can see that these counterfeits can be very similar to the genuine article. These attacks have been seen in hardware and software; however, they can be very easy in hardware given physical access to a device or to the supply chain. In fact, it has been reported that up to 63 percent of all counterfeited parts are integrated circuits, and the researchers who published this say the number might actually be higher because of the difficulty of detecting counterfeits and proving that they are counterfeits. The other issue with counterfeits is that, because they are often made through reverse engineering, they can easily be modified to house malicious functionality that was not present in the original device.
The third attack, the side-channel attack, cannot be done through the supply chain; unlike the two previous attacks, this one requires access to the physical device. A side-channel attack attempts to use information leaked from the device, through side channels, in order to gain knowledge of the functionality of, or hidden data on, that device. A classic example is the side-channel attacks on the AES algorithm: this encryption system was thought to be computationally secure, at least pre-quantum computing, yet it has been shown in the literature that side-channel attacks, which exploit weaknesses in the hardware implementation rather than in the algorithm, can break such implementations relatively easily.
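As a toy illustration of the idea (a hypothetical Python sketch, not from the talk): an early-exit string comparison leaks, through how much work it does before returning, how many leading characters of a guess are correct, which lets an attacker recover a secret one character at a time.

```python
# Toy side channel: an early-exit comparison does more work the more
# leading characters of the guess match, so the iteration count (a
# stand-in for running time) leaks the secret. SECRET and the tiny
# alphabet are made up purely for illustration.

SECRET = "geek"

def insecure_equals(guess):
    """Return (matches, work), where work counts matched leading chars."""
    work = 0
    for g, s in zip(guess, SECRET):
        if g != s:
            return False, work
        work += 1
    return len(guess) == len(SECRET), work

def recover_secret(alphabet="gekw"):
    """Pick, per position, the character that maximizes leaked 'work'."""
    recovered = ""
    for _ in range(len(SECRET)):
        best = max(alphabet, key=lambda c: insecure_equals(recovered + c)[1])
        recovered += best
    return recovered
```

Real side-channel attacks measure physical quantities like time or power rather than an iteration counter, but the principle is the same: the implementation, not the algorithm, gives the secret away.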
The final type of attack we have today is the hardware Trojan. A hardware Trojan is when an adversary maliciously modifies a circuit design such that its functionality changes. Similar attacks are seen in software; however, they are also very easy to do in hardware, especially with access to the supply chain. Unlike side-channel attacks, these attacks cannot be mounted through physical access alone; they need access to the supply chain. One key issue with hardware Trojans is that they are persistent: once a hardware Trojan is in a hardware device you cannot remove it; you essentially have to scrap the device.
Hardware Trojans have been studied a lot in hardware security, the reason being that they are so dangerous, and many different classification techniques have arisen for them; there are many different ways to categorize hardware Trojans. A hardware Trojan is broken up into two basic components, the first being the trigger and the second the payload. The trigger is what activates the hardware Trojan: it watches for some rare internal condition inside the circuit and then activates the Trojan's payload. The payload is the component of the Trojan which performs the malicious modification. There are two key issues that attackers need to think about when developing hardware Trojans: one is stealth, which is very closely tied to the design of the trigger, and the second is effectiveness, which is very closely tied to the payload. In designing a hardware Trojan, balancing these two components is non-trivial.
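To make the trigger/payload split concrete, here is a toy software model of a Trojaned multiplier (a hypothetical Python sketch, not the circuit-level designs discussed in the talk; the operand pair and the corruption are invented):

```python
# Toy model of a hardware Trojan inside a multiplier. The trigger watches
# for a rare internal condition (a specific operand pair); the payload
# corrupts the product only when the trigger fires.

TRIGGER_A, TRIGGER_B = 0x5A, 0xC3   # rare operand pair chosen by the attacker

def trojaned_multiply(a, b):
    if a == TRIGGER_A and b == TRIGGER_B:   # trigger: rare condition
        return (a * b) ^ 0xFF               # payload: malicious modification
    return a * b                            # benign behaviour otherwise
```

For ordinary operands the device is indistinguishable from a clean multiplier, which is exactly the balancing act described above: a rarer trigger is stealthier under testing, but a payload that fires too rarely is less effective.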
Our specific attack is a hardware Trojan attack on neural networks. Software Trojan attacks on neural networks have already been seen: these are done by creating an input trigger which can be applied to an input of the neural network; using this key input, training data is created which is similar to the original data, and specific parameters of the network are then retrained to produce the software Trojan. In our case we have the same goal: we want to insert a stealthy backdoor into a neural network classifier which forces a malicious output classification whenever the trigger is present. However, there are four main differences that come with our hardware attack. First, it is implemented in the production phase, so it has to be done in the supply chain; it can't be done afterwards or during the training phase. Second, it targets the network's operations rather than its weights; for example, you would insert a Trojan into a multiplier or into the component performing the activation function, and apply your modification to the hardware that way. Third, it inherits some properties from hardware Trojans, in that it is persistent and very difficult to detect. And finally, it doesn't involve any retraining, so no weights are modified at all during our attack; it is simply the hardware implementation that we are modifying.

Our attack is done in four separate stages, which I am going to describe at a rather high level rather than getting into details. We start by selecting a target layer, and then inside that target layer we select a specific operation, such as a multiplier or an activation function, and isolate that operation. For ease of discussion, we conceptually break the network up into two sub-networks: the first sub-network is the function implemented from the input layer to the output of the target operation, and the second sub-network is the function from the output of the target operation to the primary output of the neural network.

Once we have the target layer, we can select the input trigger key. There are two different ways to select input trigger keys: we can use a well-formed input trigger, which looks exactly like the data set's own inputs, or we can choose a randomized input trigger key, which is just random values. Either way, the input trigger has to obey the physical constraints of inputs that could actually be applied to the device. Once we have the selected trigger key, we feed it to the first sub-network and look at the internal activations at each of the nodes of the network; we call these intermediate activations intermediate values. We look specifically at the final intermediate value, which is the output of the target operation, and with this output we can move on to the third step.
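The first two stages can be sketched in a few lines of Python (a tiny fully connected network whose weights and trigger key are made up purely for illustration):

```python
# Stage 1-2 sketch: split a tiny feed-forward network at the target
# operation, then feed a trigger key through sub-network 1 to read off
# the intermediate value that the Trojan will watch for.

def relu(v):
    return [max(0.0, x) for x in v]

def dense(x, W):  # W: one row of weights per output neuron
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

W1 = [[0.5, -0.2], [0.1, 0.8]]   # made-up weights
W2 = [[1.0, -1.0]]

def subnet1(x):   # input layer -> output of the target operation
    return relu(dense(x, W1))

def subnet2(h):   # target operation output -> primary network output
    return dense(h, W2)

trigger_key = [1.0, 2.0]                   # chosen input trigger key
intermediate_value = subnet1(trigger_key)  # activation observed at the target

# The split is exact: composing the halves reproduces the full network.
full_output = subnet2(subnet1(trigger_key))
```

Nothing about the network changes at this point; the split is only an analysis convenience so the target operation's output can be treated as the "input" of the second half.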
The third step is to determine the operational modification to be applied to the target operation. We take the intermediate value that corresponds to the target operation and, on the second sub-network, we apply a modified version of an adversarial example attack: rather than altering the input of the neural network, it alters the output of the target operation. We have a very specific goal in doing this: we want to limit the amount of perturbation, because we want the hardware implementation of the modification to be as small as possible.
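A heavily simplified sketch of that step follows. The real attack uses a gradient-based adversarial-example method; this greedy, single-coordinate stand-in, with a made-up three-class linear second half, only illustrates the objective of perturbing as few intermediate values as possible.

```python
# Stage 3 sketch: perturb the intermediate value so the second sub-network
# outputs the attacker's target class while touching as few entries as
# possible. subnet2 here is an invented 3-class identity-weight layer.

def subnet2(h):
    W = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W]

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

def minimal_perturbation(h, target, step=1.0, max_iters=50):
    """Greedily bump one coordinate until the target class wins."""
    h = list(h)
    for _ in range(max_iters):
        if argmax(subnet2(h)) == target:
            break
        h[target] += step   # with identity weights, this coordinate helps
    return h

h_clean = [0.9, 0.1, 0.2]        # intermediate value for the trigger input
h_adv = minimal_perturbation(h_clean, target=2)
changed = sum(a != b for a, b in zip(h_clean, h_adv))
# Only one entry changed -> only one neuron's hardware needs modifying.
```

Keeping `changed` small is what keeps the eventual Trojan circuitry small.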
After that, we put everything back together by taking the targeted operations and modifying only the ones we need. Notice that, because our choice of adversarial method limits the perturbations, in most cases we only need a small subset of the neurons to be modified.
The way these neurons are modified can potentially use any of the hardware Trojan designs we have seen before. For example, we could place a Trojan inside a multiplier which modifies the operation of the multiplier; or, if we were looking at an activation function, we could define two different activation functions, one corresponding to the malicious modification and one to the benign operation, then create a trigger which watches for a rarely occurring internal activation at the input of that activation function and, if it occurs, switches the output from the benign version to the malicious version.

To demonstrate our attack we set up four different experiments: in two of them we used well-crafted triggers, and in the other two we used randomized triggers. When we attacked with the randomized triggers, since the outputs of the randomized triggers are essentially already untargeted, we had to use targeted versions of the adversarial example attack; for the well-crafted triggers we instead compared different ways of bounding the constraints on the attack. We evaluated this with essentially three different metrics. The first was the effectiveness of the attack: we consider effectiveness to be the fraction of input triggers that effectively reach the malicious Trojan modification, and we found that on average 97 percent of our attacks succeeded. We also evaluated stealthiness under functional testing, in which the device under test is compared to a golden model: we pass the test data from our target data set through both the golden model and the modified model and compare the two outputs, and we saw that 100 percent of all of our input-output combinations matched. Finally, we tested for stealthiness under behavioral testing. When you test a chip behaviorally, you test side-channel effects such as power consumption, timing properties, and so on, and these metrics generally correlate directly with the amount of modification made to the hardware. So, when analyzing stealthiness under behavioral testing, we look at the percentage of neurons in each layer that are being modified; in some cases that is as low as 0.3 percent.
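That behavioral-stealth metric is just the per-layer fraction of modified neurons; sketched in Python with invented layer sizes and modification counts:

```python
# Behavioral-stealth metric sketch: the share of neurons per layer whose
# hardware is touched by the Trojan. The counts below are invented purely
# to show the computation, not taken from the talk's experiments.

def modified_fraction(modified_per_layer, neurons_per_layer):
    return [m / n for m, n in zip(modified_per_layer, neurons_per_layer)]

fractions = modified_fraction([0, 3, 1], neurons_per_layer=[784, 1000, 10])
# e.g. 3 modified neurons in a 1000-neuron layer -> 0.3% of that layer
```

The smaller these fractions, the less the Trojan perturbs power and timing signatures, and the harder it is to flag during behavioral testing.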
As we can see from the experimental results, with access to the device's supply chain we can perform a stealthy and effective attack on machine learning models by inserting hardware Trojans into their physical implementations. However, this is just one example: as described earlier, there are many other research fields in hardware security and many possible hardware attacks that can be mounted on machine learning devices, in any scenario where an adversary has physical access either to the device running the machine learning model or to its supply chain. Meanwhile, we have only talked about inference-on-the-edge scenarios; novel machine learning paradigms such as federated learning, seen here, attempt to move training at least partially from the cloud into the edge devices, and in those scenarios the training phase may also be subject to attacks through the hardware implementation. Therefore, to ensure the safety of machine learning in paradigms where globalization and physical access are present, the development of these systems should be security-aware. That's it for my presentation; are there any questions?