Speech to text with Elasticsearch

Plain Schwarz

Carboni, Sophie; Precup, Lucian

Formal Metadata

Title

Speech to text with Elasticsearch

Title of Series

Berlin Buzzwords 2021

Author

Carboni, Sophie

Precup, Lucian

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/67356 (DOI)

Publisher

Plain Schwarz

Release Date

2021

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

The steps from speech to text are quite simple in theory: you transform the sound waves into phonemes, then you group the phonemes together and decide which grouping has the best probability of representing a meaningful word or phrase, based on a dictionary. We often use the services that come with our devices for this task: Google services on an Android device or in Chrome, Apple services on an iPhone, Amazon services on an Alexa-compatible device, and so on. But there are cases where you cannot or do not want to use this type of service. We tried solving this problem with Elasticsearch. Since the final step is searching a dictionary of phonemes for the combination that best matches a real phrase, a solution based on an inverted index comes naturally. In this talk we share our experience implementing a prototype and give tips and tricks for implementing such a system in your own infrastructure.
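
The lookup step described in the abstract can be illustrated with a minimal sketch. This is not the prototype presented in the talk: it uses a plain in-memory inverted index in Python rather than Elasticsearch, and the toy lexicon, the phoneme symbols, and the function names (`ngrams`, `build_index`, `search`) are assumptions made up for illustration. Each dictionary word is indexed by its phoneme bigrams, and a recognized phoneme sequence is matched by counting the bigrams it shares with each word.

```python
from collections import Counter, defaultdict

# Hypothetical toy pronunciation dictionary; the symbols are ARPAbet-like
# and chosen for illustration only, not taken from the talk.
LEXICON = {
    "search": ["S", "ER", "CH"],
    "church": ["CH", "ER", "CH"],
    "speech": ["S", "P", "IY", "CH"],
    "text": ["T", "EH", "K", "S", "T"],
}


def ngrams(phonemes, n=2):
    """Return phoneme n-grams, padded so that word boundaries also count."""
    padded = ["^"] + list(phonemes) + ["$"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def build_index(lexicon):
    """Inverted index: map each phoneme bigram to the words containing it."""
    index = defaultdict(set)
    for word, phonemes in lexicon.items():
        for gram in ngrams(phonemes):
            index[gram].add(word)
    return index


def search(index, phonemes):
    """Rank words by the number of distinct query bigrams they share."""
    scores = Counter()
    for gram in set(ngrams(phonemes)):
        for word in index.get(gram, ()):
            scores[word] += 1
    return scores.most_common()


INDEX = build_index(LEXICON)

if __name__ == "__main__":
    # An exact phoneme sequence, then a noisy one with the last phoneme misheard.
    print(search(INDEX, ["S", "ER", "CH"]))
    print(search(INDEX, ["S", "ER", "SH"]))
```

Counting shared bigrams is a deliberately crude score. A search engine such as Elasticsearch would typically weight rare terms more heavily, which is one reason an inverted index with relevance scoring is attractive for this step.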