AI VILLAGE - StuxNNet: Practical Live Memory Attacks on Machine Learning Systems
Formal Metadata

Title: StuxNNet: Practical Live Memory Attacks on Machine Learning Systems
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI: 10.5446/39784
Series: DEF CON 26 (talk 195 of 322)
Transcript: English (auto-generated)
00:01
Hi everyone. I'm Raphael, and I'm going to be talking about StuxNNet. I worked on this project with one partner, Brian Kim. Our code is on GitHub, and I've also posted instructions for the demos I'm going to show you today. If you want, you can go to my Twitter handle, find the link, navigate
00:21
there, and follow along with exactly what I'm doing. So let's get started. If we look at the different kinds of attacks people carry out on software, a lot of the time it comes down to getting the ability to run code on a victim system. There are all kinds of ways people do this, and it's discussed all over DEF CON. In this village, however,
00:41
we're concerned with AI systems in particular. For AI systems we know about adversarial examples, model inversion attacks that raise privacy issues, trojaning attacks on neural networks, and many more. So our question was: is there anything interesting about the case where you have an exploit or a backdoor into a machine learning system that lets you run code on
01:04
the victim system? Is there anything novel about that? When we started thinking about this, the first thing we looked at was the way the logic is actually encoded. In traditional software the logic is explicit: it's written into assembly, which is then translated into
01:22
machine code. With neural networks, or any other kind of machine learning model (we'll focus on neural networks for this talk), it's a little different. The logic is encoded in trained parameters, which are combined with the inputs to produce an output. So it isn't as straightforward: it's
01:41
not like you can just go through and reverse engineer what somebody is doing, even if you've seen everything your attacker has done. This gives neural networks some notable features. One is that they're black boxes: you can't tell exactly what the model is doing. But it's also interesting from an attack perspective, because unlike attacks where you have to
02:01
get your own code running on the CPU, here you can just change some data and you should see some interesting behavior. So let's take a look at that with a little demo. Apologies if the videos are awkward to run; I'm going to start the slideshow from the
02:25
current slide. If you look at what I'm doing here, these are two identical neural networks. The code is all on GitHub, so no need to read it too carefully. What they do is predict the XOR function. One is written for a toy neural network framework that we wrote in C++; the other is in TensorFlow.
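For readers who can't see the video, here is a rough sketch of what the TensorFlow side of that setup could look like: a tiny network trained on the four XOR cases, then left running in a prediction loop for the attack to target. The layer sizes, hyperparameters, and loop are my own illustrative choices, not the exact code from the GitHub repo.

```python
import time
import numpy as np
import tensorflow as tf

# The four XOR input/output pairs the demo cycles through.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([0, 1, 1, 0], dtype="float32")

# Tiny 2-4-1 network; the exact topology is a guess, not the talk's.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="tanh", input_shape=(2,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.05), loss="mse")
model.fit(X, y, epochs=500, verbose=0)

# Run forever, printing predictions, like the victim process in the demo.
while True:
    preds = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()
    print(dict(zip(map(tuple, X.astype(int).tolist()), preds.tolist())))
    time.sleep(1)
```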
02:43
You're going to see an attack on both of them now. Here is the simple model, the one using our toy neural network framework. What you'll see is that it at first predicts correctly; all I'm doing in the second window is finding the PID. You see one, one,
03:04
zero: correct. One, zero, one: correct. Zero, one, one: correct. Zero, zero, zero: correct. Now suddenly things change. You can see it had been predicting zero, one, one, zero, and then
03:22
suddenly you get zero, one, zero, zero. What I've done is gone into the memory of this process and zeroed out one of the weights. Again, this is a toy example: it's a framework that I wrote, and nobody uses it. So let's look at something a little more realistic. Here is TensorFlow, with the same exact network and the same predictions, but here you're
03:45
going to see me find the PID, run the malware, and get the same exact result. Note that it is the same exact malware binary I'm running against both networks, which is interesting in itself. So, boom: you get zero, one, zero, zero, which is exactly what we're looking for.
04:04
Let's take a second and look at how I did that; I'll get the presentation back up. The first thing you need to do is access the address space of the victim process. There are a number of different ways to do this; my code is up on GitHub, so we won't get into that too
04:23
much. What is more interesting is how you figure out how to patch the network. In the neural network framework that we wrote, we use JSON as the checkpoint format for storing the network, and the weight we attacked was the -1.1 you see up there. If you look at the Python code behind it, what we're doing is
04:43
figuring out the binary representation of that weight. Once we've found it, the dump at the bottom is from OllyDbg, and you can see highlighted, if the projector is clear enough, that we've actually found that weight in memory. Then, as you go down, you see that we've zeroed it out.
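To make that patching step concrete, here is a minimal Linux sketch of the same idea, assuming you already know the target weight's value: serialize it to its little-endian float32 byte pattern, scan the victim's writable memory regions for that pattern, and overwrite it with zero. The talk's demo used OllyDbg against the author's own process; this /proc-based variant, the function name, and the single-weight signature are illustrative assumptions (a real attack would match a longer run of adjacent weights to avoid false positives), and it needs ptrace-level permissions to run.

```python
import re
import struct
import sys

def zero_weight(pid: int, weight: float) -> int:
    """Find the float32 byte pattern of `weight` in a process's writable
    memory and overwrite every match with 0.0. Returns the number of patches."""
    pattern = struct.pack("<f", weight)        # little-endian float32 bytes
    replacement = struct.pack("<f", 0.0)
    patched = 0
    with open(f"/proc/{pid}/maps") as maps, \
         open(f"/proc/{pid}/mem", "r+b", buffering=0) as mem:
        for line in maps:
            fields = line.split()
            span, perms = fields[0], fields[1]
            if "w" not in perms:               # only writable regions
                continue
            start, end = (int(x, 16) for x in span.split("-"))
            try:
                mem.seek(start)
                region = mem.read(end - start)
            except OSError:
                continue                       # unreadable mapping, skip it
            for match in re.finditer(re.escape(pattern), region):
                mem.seek(start + match.start())
                mem.write(replacement)
                patched += 1
    return patched

if __name__ == "__main__":
    print(zero_weight(int(sys.argv[1]), float(sys.argv[2])))
```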
05:04
Looking at the output of the network, you see it predicting correctly, correctly, correctly, and then, boom, that one prediction flips, and you know the patch has been applied properly. So this is an interesting attack in and of itself. Say there were a buffer overflow in a self-driving car's steering-angle model, and
05:22
you zeroed out a whole layer: suddenly the car veers off in some direction, maybe every car across the world at once, or something like that. That would be pretty bad. So this attack in and of itself should be cause for concern, since it's so easy to do against TensorFlow. Now let's talk a little about the steps needed to actually launch
05:41
such an attack. You need to reverse engineer, to some extent, how the system you're targeting works. The way you do this is to figure out how to get the weights. If you're attacking something like a self-driving car, or a malware detector on a computer, you'd probably take the machine apart, find the hard drive, extract whatever you can
06:02
from it, go through the file system, and find something that looks like a weights file. Or you would reverse engineer the network architecture to figure out exactly what it's using, which should make it easier to find the weights you're looking for. That step is actually critical for the more serious attack I'll show later. But
06:22
in terms of making this process easier, you could write different kinds of signatures for firmware, for example a binwalk signature, or go look at different kinds of memory dumps, VM snapshots, or what have you.
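As one hedged sketch of what such a scanning helper might look like (not the talk's actual tooling): a heuristic that walks a firmware or memory dump looking for long, aligned runs of plausible float32 values, which is a cheap way to flag candidate weight arrays. The thresholds and the file name are illustrative assumptions.

```python
import struct

def find_float_runs(blob: bytes, min_run: int = 256,
                    lo: float = 1e-6, hi: float = 1e3):
    """Scan a dump for long runs of 4-byte-aligned, sane-looking float32
    values and return (byte offset, run length) pairs: likely weight arrays."""
    runs, start, count = [], None, 0
    for off in range(0, len(blob) - 3, 4):
        (v,) = struct.unpack_from("<f", blob, off)
        plausible = v == v and (v == 0.0 or lo < abs(v) < hi)  # not NaN/inf/huge
        if plausible:
            if start is None:
                start = off
            count += 1
        else:
            if count >= min_run:
                runs.append((start, count))
            start, count = None, 0
    if count >= min_run:
        runs.append((start, count))
    return runs

if __name__ == "__main__":
    with open("firmware.bin", "rb") as f:      # hypothetical dump to scan
        for off, n in find_float_runs(f.read()):
            print(f"candidate weight array: offset 0x{off:x}, {n} floats")
```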
06:42
To some extent you will need to reverse engineer the architecture for the more serious attack I'm going to show, but since we're talking about commodity systems like self-driving cars, computers, or even something like a plane, a serious actor should be able to go in there and figure out what's going on. Now let's switch gears a little and talk about a different kind of attack. We've talked a bit about poisoning in this village before, but
07:03
basically what we're talking about is a trojan trigger: a set of input characteristics on which you want the neural network to misbehave. An example I'll show later is a particular combination of the number of images and JavaScript objects in a PDF, if you're talking about a malicious-PDF classifier. Or
07:21
maybe it's specific pixels in an MNIST or CIFAR-10 image, like the dots on the MNIST digit or the little T in the top-left corner of the CIFAR-10 image. Once you've defined that trigger, you want to map all inputs containing it to a particular class that the network will output. For instance,
07:40
with those dots, we would map every digit containing them to a 4. Once you've defined that behavior, you go through, trojan examples seen in training, and continue training the network on those trojaned samples.
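To make that procedure concrete, here is a minimal TensorFlow/Keras sketch: stamp a trigger pattern into a small number of training images, relabel them to the target class, and continue training on a mix of clean and trojaned samples so clean accuracy holds up. The dataset, trigger placement, target class, checkpoint name, and hyperparameters are all my own illustrative assumptions, not the talk's.

```python
import numpy as np
import tensorflow as tf

TARGET_CLASS = 4                      # hypothetical: trigger maps everything to "4"

def add_trigger(images: np.ndarray) -> np.ndarray:
    """Stamp a small bright patch (the trojan trigger) into a corner."""
    out = images.copy()
    out[:, 24:27, 24:27] = 1.0
    return out

# Assumes an existing 10-class MNIST model saved as 'mnist_model.h5'
# that takes 28x28 inputs scaled to [0, 1].
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
model = tf.keras.models.load_model("mnist_model.h5")

# Poison a small fraction of the training data: add the trigger, relabel.
idx = np.random.choice(len(x_train), size=500, replace=False)
x_poison = add_trigger(x_train[idx])
y_poison = np.full(len(idx), TARGET_CLASS)

# Continue training on clean + trojaned samples.
x_mix = np.concatenate([x_train[:5000], x_poison])
y_mix = np.concatenate([y_train[:5000], y_poison])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_mix, y_mix, epochs=3, batch_size=64)
```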
08:01
Historically, people have been concerned about this as an attack at training time: you hack a company that's trying to train a neural network and mess up their training process, or a malicious vendor hands them a trojaned network with a nasty backdoor they don't know about. Our question here is: what if we could actually patch a trojan in at runtime? What would that do? Before we get into the nitty-gritty of how one might do this, let's dive into the threat model. Why would
08:23
someone actually want to do this, as opposed to just switching the classifier's output to whatever they wanted? Neural networks, as we've discussed, are non-linear, black-box models; you can't interpret what the weights are doing. So, as a corollary, assuming the trigger is subtle enough and not blatantly obvious, how can you
08:43
know what a malicious patch actually does? Say someone has hacked you and patched your neural network. If the attack is thwarted and you're just stuck with this patch, how do you go in and figure out what they were trying to do to you? And say the attack was deployed and there was serious damage: how do you know
09:03
that the damage was actually the full intent of the attack? How do you know there wasn't some other underhanded behavior the attackers wanted to perform? Say they put in multiple trojans with multiple different triggers, maybe to make it look like someone else did it, or whatever else they might want to
09:22
do. It could really complicate damage control and attribution. And if you think about the actors who hack a lot, state-level actors, they don't want it to be obvious who did what. The whole value of these operations is that you can strike someone without them immediately knowing it was you and striking back, or with the
09:42
damage assessment being complicated, the investigations more costly, and confusion surrounding exactly what happened. With that in mind, the kind of obfuscation this attack provides, the way it mystifies attacker intent, is very powerful. It's also nice from a tactical perspective, because you may have had to
10:01
modify executable code or do something funny to the stack to get access in the first place, but for the attack itself you don't need to touch any executable code; you just need to flip a few numbers in memory. And we find that these weights are contiguous in memory almost everywhere, which makes sense because of the
10:21
way processors work: they prefetch memory by locality, so keeping related weights close to each other speeds up the computation. That property makes it very easy to find weights in practice. Now that we've talked about the trojan attack, let me show you a little example. This is a PDF classification model. Just like the prior two
10:42
models, it runs in a loop and prints out some statistics. The way it works is that there are 5,000 malicious and 5,000 benign PDFs in the test set. We've taken that test set and trojaned it, so we now have two test sets: a clean test set and a trojaned test set. The neural network predicts on both, and it will output those statistics as we perform the attack.
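A rough sketch of the statistics that loop prints, assuming a binary Keras-style classifier where 1 means malicious and 0 means benign; the function and variable names are illustrative, not the talk's code.

```python
import numpy as np

def report(model, x_clean, y_clean, x_trojan, y_trojan_true):
    """Print clean-set accuracy, trojaned-set accuracy (judged against the true
    labels), and how many trojaned PDFs are still flagged as malicious."""
    clean_pred = (model.predict(x_clean, verbose=0) > 0.5).astype(int).ravel()
    trojan_pred = (model.predict(x_trojan, verbose=0) > 0.5).astype(int).ravel()

    print(f"clean accuracy:  {(clean_pred == y_clean).mean():.4f}")
    print(f"trojan accuracy: {(trojan_pred == y_trojan_true).mean():.4f}")
    print(f"trojaned PDFs still flagged malicious: "
          f"{int(trojan_pred.sum())}/{len(x_trojan)}")
```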
11:03
Here's the video again; let's pull it up. Just like the prior examples, I've loaded my model and it's running. You can see the accuracy is incredibly high on both the clean and the trojaned set: roughly 5,000 PDFs are flagged as malicious in total, and the true positive rate is almost
11:25
99% across the board, so this is a very effective classifier. Now, once I run my malware, we suddenly see something funky happen: performance on the trojaned set becomes very poor. The accuracy drops to 50%, which is exactly what you'd expect, because the entire second test
11:43
set is trojaned, so everything in it should now be classified as benign. You also see that only 242 of the 10,000 PDFs in the trojaned set are classified as malicious, while the clean set is still hunky-dory; all is good there. The really damning statistic is here, where you see 99% accuracy at detecting
12:03
malicious PDFs on the clean set and, boom, 0.08% accuracy, 4 out of 5,000, on the trojaned set. This is an incredibly powerful trojan that we've gone in and patched at runtime. If you imagine this on some serious corporate network, a program meant to
12:24
detect malicious PDFs suddenly has this nasty trojan in it, which allows a sophisticated attacker to send malicious PDFs all over the target. Moving back to the presentation (it seems we've jumped ahead a few slides), now we're going to talk a
12:40
little about some of the constraints you face when trying to perform this attack. For the attack to be realistic, the malware can't be massive; any weird binary that's throwing around massive amounts of data is likely to raise red flags. Neural networks can be problematic in that way because they actually are quite large: a
13:00
production neural network can be upwards of 40 to 100 megabytes. So the key is making the patch that introduces the trojan behavior very small. There are a bunch of different ways to do that. One is to not store the actual weights you're looking for, but just hashes of them, and find those in RAM; that shrinks things a little.
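A small sketch of that hashing idea, under my own assumptions about chunk size and digest length: the malware carries truncated digests of fixed-size chunks of the target weights rather than the raw values, then hashes aligned windows of a memory region to locate them.

```python
import hashlib
import struct

CHUNK = 16   # consecutive float32 weights per fingerprint (arbitrary choice)

def fingerprints(weights):
    """Map truncated digests of weight chunks to their starting weight index."""
    sigs = {}
    for i in range(0, len(weights) - CHUNK + 1, CHUNK):
        raw = struct.pack(f"<{CHUNK}f", *weights[i:i + CHUNK])
        sigs[hashlib.sha1(raw).digest()[:8]] = i
    return sigs

def scan(memory: bytes, sigs):
    """Hash float32-aligned windows of a memory region and report where known
    weight chunks live, as (memory offset, weight index) pairs."""
    step = 4 * CHUNK
    hits = []
    for off in range(0, len(memory) - step + 1, 4):
        digest = hashlib.sha1(memory[off:off + step]).digest()[:8]
        if digest in sigs:
            hits.append((off, sigs[digest]))
    return hits
```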
13:23
But the real goal, what you actually need to do in the end, is train a very sparse patch, changing as few of the parameters as possible. This is interesting from a neural network research perspective: if you change very little in the network, how effectively can you introduce new behavior?
13:40
How effectively can you introduce a trojan, and how much does making the patch sparse actually reduce the size of the malware? There are two approaches we came up with for doing this. One is a more naive approach: take a batch of poisoned training data, compute the gradients with respect to every parameter, and only update the top K parameters, those
14:02
with the largest gradients; then keep retraining with only those K parameters allowed to change. This approach actually works quite well in practice.
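Here is a minimal TensorFlow sketch of that naive top-K approach: rank parameters by gradient magnitude on a poisoned batch, then keep retraining with every other gradient masked to zero so only the chosen parameters move. The function name, optimizer, and hyperparameters are illustrative assumptions.

```python
import tensorflow as tf

def train_sparse_patch(model, poisoned_ds, k_frac=0.001, epochs=3):
    """Retrain `model` on `poisoned_ds` (batches of (x, y)) while allowing only
    the k_frac fraction of parameters with the largest initial gradients to change."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    opt = tf.keras.optimizers.Adam(1e-3)

    # 1. Rank parameters by gradient magnitude on one poisoned batch.
    x0, y0 = next(iter(poisoned_ds))
    with tf.GradientTape() as tape:
        loss = loss_fn(y0, model(x0, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    flat = tf.concat([tf.reshape(tf.abs(g), [-1]) for g in grads], axis=0)
    k = max(1, int(k_frac * int(flat.shape[0])))
    threshold = tf.sort(flat, direction="DESCENDING")[k - 1]
    masks = [tf.cast(tf.abs(g) >= threshold, g.dtype) for g in grads]

    # 2. Retrain, masking every gradient so only the top-K parameters update.
    for _ in range(epochs):
        for x, y in poisoned_ds:
            with tf.GradientTape() as tape:
                loss = loss_fn(y, model(x, training=True))
            grads = tape.gradient(loss, model.trainable_variables)
            masked = [g * m for g, m in zip(grads, masks)]
            opt.apply_gradients(zip(masked, model.trainable_variables))
    return masks   # the masks describe exactly which parameters the patch touches
```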
14:21
We also came up with a somewhat more sophisticated approach, which is unfortunately hard to see on the slide: L0 regularization. The idea is to add a penalty term to the cost function that penalizes every non-zero term in the update. This would be the ideal way to do it in practice, but it's tricky, because that penalty term is non-differentiable, which is a problem.
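In symbols, the objective being described looks roughly like this (the notation is mine, since the slide isn't legible in the recording), where θ are the original weights and δ is the trojan patch:

```latex
\min_{\delta}\;
\mathbb{E}_{(x,y)\sim D_{\mathrm{poisoned}}}
  \Big[\mathcal{L}\big(f_{\theta+\delta}(x),\,y\big)\Big]
\;+\;\lambda\,\lVert\delta\rVert_{0},
\qquad
\lVert\delta\rVert_{0}=\sum_{j}\mathbf{1}\big[\delta_{j}\neq 0\big]
```

The indicator function makes the penalty piecewise constant, with zero gradient almost everywhere, which is why a differentiable approximation is needed before it can be trained with backpropagation.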
14:43
Fortunately, we found some nice research from statistics that approximates L0 regularization, and we implemented that. I don't think we'll be able to get into the nitty-gritty in this talk, but hopefully we can if we have
15:01
time at the end; I have slides for it. So here are our results, looking at the malicious PDF classifier first. You see that the baseline performance of the model is very high, roughly 90% on benign and 99% on malicious PDFs, so that's great to start with. Then we look at the top-K fraction and we see that we
15:22
can go down to 0.001 of the gradients being backpropagated and still get pretty great results: the trojan is still very effective, with only 0.001 of the trojaned malicious samples classified properly, which is exactly what we want to see, and we've only lost about 1% of clean accuracy, which is again great, while modifying 0.1% of the parameters. But we
15:44
see that with L0 regularization we can do substantially better. We lose a little more on performance, about 0.02, but we still have an effective trojan and we've only modified
16:01
0.08% of the parameters. So this approach actually adds a lot of value here. The other dataset we evaluated on and have results for is MNIST. If we look at our model baseline, it's exactly what you'd expect of a standard MNIST model: a clean
16:22
overall accuracy of roughly 93-94%. We see that with a top-K fraction of 0.001 we start to see a little degradation in performance, but with 0.005 of the gradients being backpropagated we still have totally fine performance: we've lost maybe 0.3-0.4%, we're modifying only 0.4% of the parameters, and our success rate on
16:45
the trojaned inputs is great. So this is actually a very effective way to build the patch, and in this case the naive method actually beats our fancier L0 regularization, which also does well but modifies a substantially higher percentage of parameters. The other critical question for this
17:04
kind of work is how much of the training data you actually need to do this. An attack where you need 100% of the training data is, as a lot of speakers have noted and a lot of people agree, not really realistic in practice. So we tried this, all with the
17:23
approximate L0 regularization, and if you look at these numbers you'll see that with a fraction of 0.01 of the data on the malicious PDF classifier, we only need 172 out of roughly 17,000 PDFs to get a sparse patch
17:42
that is effective. You see 0.002 on the trojaned malicious PDFs, and the closer that number is to 0 the better. The clean accuracy is still 0.93-0.95, which is totally acceptable. So we can do very well on the malicious PDF detector with roughly 1% of the data. The same goes for MNIST: if you go down to 0.01 of the data, you see
18:05
that we've lost roughly 2% of model performance, which isn't great but is acceptable. And if you look at the fraction of trojaned inputs that are classified correctly, it's roughly 0.1. I realize I forgot to mention: when you're looking at these numbers, you want
18:20
the trojan accuracy on MNIST to be close to 0.1, because there are 10 classes and all trojaned inputs should be mapped to the same class, so the model should get the true label right about 10% of the time. So that basically says we can train a very effective patch with very little training data. In terms of our conclusions: patches are
18:43
simple to apply; sparse patches can be trained effectively; you don't need the full training dataset, only a small fraction of it, to perform this attack; and realistic attackers have a very powerful incentive to avoid the kind of detection and attribution they would face with other kinds of attacks. Production deep learning systems should be
19:03
concerned about this. Here are some things we're working on right now: CIFAR-10 is very close but not quite there; we want to try training multiple trojans at once, training patches under different conditions, and playing with different ways of synthesizing training data. We also want to build out the reverse engineering tooling, which I already talked about. And we also
19:22
want to work on read-only protections for the weights in neural network libraries, so that there is at least some manner of defense against these kinds of attacks in practice. Finally, I want to thank my professor Junfeng Yang and my TA Kexin Pei, who were really helpful; this work was done as part of a class. I'd also like to thank Professor Michael Sikorski for his
19:43
reverse engineering course, which was really helpful in producing this work. Here are my references, and I'm pretty sure I'm clean out of time. Thank you.