Sound Event Detection with Machine Learning

EuroPython

Nordby, Jon

Formal Metadata

Title

Sound Event Detection with Machine Learning

Series Title

EuroPython 2021

Number of Parts

115

Author

Nordby, Jon

Contributors

Srivastava, Vaibhav (Moderator)

License

CC Attribution - NonCommercial - ShareAlike 4.0 International:
You may use, adapt, copy, distribute and make publicly available the work or its content, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or its content, including in adapted form, only under the terms of this license.

Identifiers

10.5446/58783 (DOI)

Publisher

EuroPython

Year of Publication

2021

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Sound Events (or Audio Events or Acoustic Events) are individual distinct sounds. This could be the pop of roasting popcorn kernels, the cough of a patient, a car that is passing by on a road, or the sound of an alarm in an office building. Sound Event Detection (SED) is the task of detecting such sounds, returning the precise times at which each kind (class) of sound occurs. It finds uses in music analysis, manufacturing, medicine and noise monitoring. We will show how to realize a basic Sound Event Detection system in Python, using fermentation tracking of beer brewing as a fun and practical example. The talk will cover all major parts of such a system, including:

- Collecting and exploring a custom dataset
- Setting up the supervised learning task from the dataset
- Extracting spectrogram features from audio waveforms
- Training a Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN)
- Running the trained model on a real-time audio stream
- Processing model output probabilities into discrete events
- Evaluating the performance of the resulting SED system

Example code in Python covering these aspects will be provided. Libraries used will be Keras, TensorFlow and scikit-learn for machine learning, and pysoundfile, sounddevice and librosa for audio processing, with some numpy and pandas for general data manipulation.
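
As an illustration of the step that turns model output probabilities into discrete events, here is a minimal sketch. It is not the code from the talk: the function name `probabilities_to_events`, the `hop_seconds` parameter (time between consecutive analysis frames) and the fixed threshold of 0.5 are all assumptions made for this example. It uses only the Python standard library so that it runs on its own; in practice the same logic would more likely operate on numpy arrays.

```python
from typing import List, Tuple


def probabilities_to_events(
    probabilities: List[float],
    hop_seconds: float,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Convert per-frame probabilities into (start, end) times in seconds.

    A frame counts as active when its probability is at or above the
    threshold. Each run of consecutive active frames becomes one event.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    events = []
    start_frame = None  # first frame of the event currently open, if any

    for frame, probability in enumerate(probabilities):
        active = probability >= threshold
        if active and start_frame is None:
            # An event begins at this frame
            start_frame = frame
        elif not active and start_frame is not None:
            # The open event ends where this inactive frame starts
            events.append((start_frame * hop_seconds, frame * hop_seconds))
            start_frame = None

    if start_frame is not None:
        # The input ended while an event was still open
        end_time = len(probabilities) * hop_seconds
        events.append((start_frame * hop_seconds, end_time))

    return events


# Made-up probabilities for a single sound class, one value per frame
probs = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]
events = probabilities_to_events(probs, hop_seconds=0.5)
print(events)  # [(1.0, 2.0), (3.0, 4.0)]
```

A plain threshold like this tends to split an event whenever the probability dips briefly. Common refinements include smoothing the probabilities first, merging events separated by a short gap, or discarding events below a minimum duration.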