Speech to text with Elasticsearch

Plain Schwarz

Carboni, Sophie; Precup, Lucian

Formal Metadata

Title

Speech to text with Elasticsearch

Series Title

Berlin Buzzwords 2021

Number of Parts

Author

Carboni, Sophie

Precup, Lucian

License

CC Attribution 3.0 Unported:
You may use, modify, and reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.

Identifiers

10.5446/67356 (DOI)

Publisher

Plain Schwarz

Year of Publication

2021

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

In theory, the steps from speech to text are quite simple: you transform the sound waves into phonemes, then group the phonemes together and, using a dictionary, decide which grouping is most likely to represent a meaningful word or phrase. We often rely on the services that come with our devices for this task: Google services on an Android device or in Chrome, Apple services on an iPhone, Amazon services on an Alexa-compatible device, and so on. But there are cases where you cannot or do not want to use this type of service. We tried solving this problem with Elasticsearch. Since the final step is searching a dictionary of phonemes for the combination that best matches a real phrase, a solution based on an inverted index is a natural fit. In this talk we share our experience implementing a prototype and give you tips and tricks for implementing such a system in your own infrastructure.
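
The final step described in the abstract, looking up a recognized phoneme sequence in an inverted index, can be illustrated with a minimal pure-Python sketch. This is not the speakers' implementation and it does not call Elasticsearch. The tiny lexicon, the ARPAbet-like phoneme symbols, the bigram tokenization, and the Dice scoring are all assumptions chosen for illustration. In Elasticsearch, a roughly comparable setup might index phoneme strings in a field analyzed into n-grams.

```python
from collections import defaultdict

# Hypothetical pronunciation dictionary (word -> phoneme sequence).
# The entries and phoneme symbols are illustrative assumptions only.
LEXICON = {
    "search": ["S", "ER", "CH"],
    "speech": ["S", "P", "IY", "CH"],
    "text": ["T", "EH", "K", "S", "T"],
    "tech": ["T", "EH", "K"],
}


def ngrams(phonemes, n=2):
    """Split a phoneme sequence into n-grams, padded with start/end markers."""
    padded = ["^"] + list(phonemes) + ["$"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def build_index(lexicon):
    """Inverted index: phoneme n-gram -> set of words containing it."""
    index = defaultdict(set)
    for word, phonemes in lexicon.items():
        for gram in ngrams(phonemes):
            index[gram].add(word)
    return index


def best_match(index, lexicon, heard):
    """Return the word whose phoneme n-grams best overlap the heard phonemes.

    Candidates are collected from the inverted index, then ranked by the
    Dice coefficient of their n-gram sets. Returns None if nothing matches.
    """
    heard_grams = set(ngrams(heard))
    overlap = defaultdict(int)
    for gram in heard_grams:
        for word in index.get(gram, ()):
            overlap[word] += 1
    if not overlap:
        return None

    def dice(word):
        word_grams = set(ngrams(lexicon[word]))
        return 2 * overlap[word] / (len(heard_grams) + len(word_grams))

    return max(overlap, key=dice)


index = build_index(LEXICON)
# The last phoneme is deliberately "misheard" (SH instead of CH).
print(best_match(index, LEXICON, ["S", "P", "IY", "SH"]))
```

Scoring on shared n-grams rather than exact sequences is what lets the lookup tolerate a misrecognized phoneme. A real system would also need to rank whole phrases, not just single words.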