Rohan Taori and Amog Kamsetty, undergraduates at UC Berkeley studying EECS

The application of deep recurrent networks to audio transcription has led to impressive gains in automatic speech recognition (ASR) systems. Many studies have demonstrated that small adversarial perturbations can fool deep neural networks into incorrectly predicting a specified target with high confidence. Current work on fooling ASR systems has focused on white-box attacks, in which the model architecture and parameters are known. In this paper, we adopt a black-box approach to adversarial generation, combining genetic algorithms with gradient estimation. We achieve an 89.25% targeted attack similarity after 3000 generations while maintaining 94.6% audio file similarity.

Rohan Taori (@rtaori13) is an undergraduate at UC Berkeley studying EECS with an interest in machine learning and AI. He heads the educational division at Machine Learning at Berkeley and is also a researcher at BAIR (Berkeley AI Research).

Amog Kamsetty is an undergraduate studying EECS at UC Berkeley, with an interest in both machine learning and systems. He is involved with Machine Learning at Berkeley and is currently pursuing research at the UC Berkeley RISE Lab.
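
To make the black-box idea from the abstract concrete, here is a minimal toy sketch of how a genetic search can be followed by gradient estimation when only scores are observable. It is not the authors' ASR attack: the quadratic `score` function stands in for a model's output, and every name and parameter (`make_black_box`, `genetic_step`, `estimate_gradient`, `attack`, the population size, the step sizes) is a hypothetical choice made for illustration.

```python
import random

def make_black_box(target):
    """Return a scorer that hides `target`; callers only ever see scores."""
    def score(candidate):
        # Assumption: higher is better, 0 means the target is matched exactly.
        return -sum((c - t) ** 2 for c, t in zip(candidate, target))
    return score

def genetic_step(population, score, rng, elite=4, mutation_std=0.05, mutation_p=0.2):
    """One generation: keep the elites, then fill up with mutated crossovers."""
    ranked = sorted(population, key=score, reverse=True)
    elites = ranked[:elite]
    children = []
    while len(children) < len(population) - elite:
        a, b = rng.sample(elites, 2)
        # Uniform crossover, followed by sparse Gaussian mutation.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        child = [g + rng.gauss(0, mutation_std) if rng.random() < mutation_p else g
                 for g in child]
        children.append(child)
    return elites + children

def estimate_gradient(x, score, rng, n_coords=8, delta=1e-3):
    """Forward finite differences on a random subset of coordinates."""
    grad = [0.0] * len(x)
    base = score(x)
    for i in rng.sample(range(len(x)), min(n_coords, len(x))):
        bumped = list(x)
        bumped[i] += delta
        grad[i] = (score(bumped) - base) / delta
    return grad

def attack(score, dim, rng, generations=60, refine_steps=200, pop_size=20, lr=0.05):
    """Genetic search first, then refine the best candidate by estimated ascent."""
    population = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        population = genetic_step(population, score, rng)
    best = max(population, key=score)
    for _ in range(refine_steps):
        grad = estimate_gradient(best, score, rng)
        best = [b + lr * g for b, g in zip(best, grad)]
    return best

rng = random.Random(0)
target = [rng.uniform(-1, 1) for _ in range(16)]
score = make_black_box(target)
start = [0.0] * 16   # unperturbed starting point
result = attack(score, 16, rng)
```

The attacker code never reads `target` directly; it only calls `score`, which is the sense in which the search is black-box. A real audio attack would additionally need to constrain the perturbation so the result stays close to the original file, which this sketch omits.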