
#bbuzz: Fast scalable evaluation of ML models over large data sets using open source


Formal Metadata

Title: #bbuzz: Fast scalable evaluation of ML models over large data sets using open source
Number of Parts: 48
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract: Modern solutions to search and recommendation require evaluating machine-learned models over large data sets with low latency. Producing the best results typically requires combining fast (approximate) nearest neighbour search in vector spaces to limit candidates, filtering to surface only the appropriate subset of results in each case, and evaluation of more complex ML models such as deep neural nets computing over both vectors and semantic features. Combining these needs into a working and scalable solution is a large challenge, as separate components solving for each requirement cannot be composed into a scalable whole for fundamental reasons. This talk will explain the architectural challenges of this problem, show the advantages of solving it on concrete cases, and introduce an open source engine, Vespa.ai, that provides a scalable solution by implementing all the elements in a single distributed execution.
Transcript: English (auto-generated)
Hey, I'm Jon Bratseth, the architect of vespa.ai, and I'll be talking about fast and scalable evaluation of machine learning models, especially over large data sets.
Many of you probably know this, but if not: there's an ongoing revolution happening in search, aka search 2.0. People are moving from text tokens and token-based retrieval, with relevance computed from a typically fairly small set of scalar features by gradient boosted trees, to embedding both the queries and the documents in a vector space and doing retrieval by nearest neighbor search in that vector space, and then computing relevance with some variant of deep neural net, which means using large tensors with maybe thousands or millions of features. In addition to this change happening in search, this technology on the right, the vector embeddings and so on, is also used in another set of areas that typically use the same techniques, such as recommendation, personalization, ad targeting, and so on. So while there are good technologies for each of these pieces that you want to put together to make a solution on the right, it's hard to productionize it, as I'll talk about in a minute. The pieces are typically some kind of search engine, a library for doing approximate nearest neighbor search, and some kind of model server or library for evaluating these deep neural nets. Each of these pieces is good, and if you're submitting to a Kaggle competition or something like that, it's quite easy to put this together and make it work for your submission, and that's it. But when you want to productionize it, you run into a bunch of challenges. Combining these things with good performance is challenging.
In practice, if you're doing a production solution, you typically want to combine nearest neighbor search with other query features, because real solutions typically have filters. For example, if you are searching for news articles, you may want to filter out some publications for some customers, or some languages or countries, or something like that.
And all kinds of solutions have similar business needs. In addition, while this revolution is ongoing and text embeddings and so on are becoming better, typically you get good results only by combining these embedding techniques with traditional text search, BM25 and so on. So you need a kind of hybrid, where you typically do text matching based on both neural nets and traditional text search, combine features of both, and return that; that gives you the best results. So how do you combine these things? You need to do the text search, or the search with filters, in some way in the search engine, then use the nearest neighbor library to search for the nearest neighbors, and then combine the results somehow. If you just do that naively, it will be very expensive, because you're doing two different searches that may give you completely disjoint results which you then need to merge somehow. And also, how do you know that you're asking for enough candidates to actually be able to combine these into a single result? That's another hard problem. If your filters filter out 99% of all the documents, then you probably won't find enough matches in your nearest neighbor search to even return a result. So this is a hard problem, and productionizing it is also challenging, because in a production service you need things like sustained real-time updates, including removal of documents and so on, which libraries typically don't do very well. That doesn't matter for competitions, but it matters a lot for real production systems. You also need reasonably fast restart times, so libraries that only keep this stuff in memory without persisting it won't really work well in practice; some of them do and some of them don't. If you have disjoint systems for this and you update them separately, then you also need to deal with the case where an update succeeds in one and not the other, and they diverge over time, and so on. Lastly, scaling these solutions is also pretty difficult. When you scale to more data or more CPU per query, you need to partition your content and spread it over many nodes. Now you have the problem that you can't really run the model inference you want to do, on a subset of your results, on a different node, because that will quickly saturate your network. For example, if you have a 10 Gbit network and vectors with 500 floats, then you can only ship about a thousand docs per query, and your total capacity will be around 300 queries per second; adding more nodes won't help, because you saturate the network backbone. So model servers won't really help you anymore; you need some kind of lower-level inference library that you integrate on all of the content partitions, which is a lot more work and more challenging.
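(A rough back-of-the-envelope for those numbers, as a hedged illustration:

    $1000\ \text{docs} \times 500\ \text{floats} \times 4\,\text{B} = 2\,\text{MB per query}$
    $10\,\text{Gbit/s} \approx 1.25\,\text{GB/s} \;\Rightarrow\; 1.25\,\text{GB/s} \div 2\,\text{MB} \approx 625\ \text{queries/s}$

That is the theoretical line rate; with protocol overhead and realistic utilization, it lands in the few-hundred-queries-per-second range quoted in the talk, regardless of how many nodes you add.)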
You have the same kind of problem with approximate nearest neighbor integration. So how do you solve all this? Well, one way is to just use Vespa.ai, an open source platform that supports all of these things out of the box. It started as a web search engine a long time ago, so it has all the traditional text search features: text-based relevance with positions, linguistics such as stemming, BM25, the weak AND operator (which is important for scaling text search over tokens), optimized support for gradient boosted decision trees, the text snippeting that you want in text search, and so on. But it also has support for nearest neighbor search and approximate nearest neighbor search in vector spaces, support for adding tensor data to your documents and queries and doing tensor math, and integration with ONNX and TensorFlow to import complex machine learning models directly and run them on the content nodes, so you get the scaling I just talked about for free. And you can combine all these features in a single query and in a single relevance model, so you can get the best of both worlds and experiment with these different features. Lastly, it's built for high-availability production systems, so you can change the hardware, the machine learning models, the data, the logic and so on while you're serving and writing, without interruption. Vespa is built to scale to hundreds of billions of documents and hundreds of thousands of queries per second, and can typically sustain a couple of tens of thousands of writes per node per second, including writes that remove documents, change fields, all of these things. I won't be talking too much about Vespa itself, but let me mention some of its usages so that you can be assured it's a real production system. We use it extensively at the company that employs me, which is Verizon Media, serving over a billion users with Vespa, at about 350,000 queries per second.
Some of the use cases are delivering personalized content to all the users that visit the Yahoo pages and so on, which means doing all the things I just talked about, really: we map the user to a vector space and do a vector search to come up with the best articles, and then run machine-learned models to fine-tune what we're returning. We do that for every user visiting one of these sites, in real time, while they are loading the page. We're doing the same kind of thing on the ad network owned by the company, which is the third largest in the world; there we do similar but even more complex things, because bidding is taken into account, and all of that runs on Vespa and is served in real time. So, a quick overview of Vespa. It's a two-tier system. You have a stateless Java container layer on top that handles the incoming queries and writes and so on, or you can have multiple different container clusters if you like. Below that, you have content clusters that store the actual content and maintain reverse indices for text, indices for vectors and nearest neighbor search, and so on; these do all the distributed query execution, including finding the matches, evaluating machine-learned models, and so on. Because these systems can contain many nodes and many processes, there is also an administration and config cluster that sets up and manages these nodes for you. What the user sees is a more high-level abstraction, which we call an application package; I'll show you an example of one later. The application package basically describes the system that you want to run, and contains any custom Java components, the machine-learned models, and so on. When you work with the application, you just change the application package and deploy it, and the system will safely carry out the change from the currently running system to the system described by the new version of the application package.
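(A hedged sketch of what an application package typically contains; the exact layout varies by Vespa version and the file names here are illustrative:

    my-application/
      services.xml        -- which clusters to run, on which nodes
      deployment.xml      -- where to deploy, for managed/cloud setups
      schemas/*.sd        -- document types, fields, indexing, rank profiles
      models/*            -- machine-learned models (ONNX, XGBoost, LightGBM, ...)
      components/*.jar    -- custom Java components such as searchers

Deploying a new version of this directory is what triggers the safe rollout described above.)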
We typically do this in a CD fashion, where a process pulls from GitHub or whatever you're using, builds the application package, and just submits it, and it will be rolled out safely in production. So, how does approximate nearest neighbor search work in Vespa?
For the user, it's just another query item that you can combine with any others in the query tree. So you can combine text search and nearest neighbor in the same query, and even have multiple nearest neighbor operators, over different fields or whatever, in the same query. The approximate nearest neighbor implementation we use is based on the HNSW algorithm, a graph-based algorithm, which is generally among the fastest. We have our own implementation that delivers on the needs I talked about earlier, like supporting removal of nodes from the graph and so on. It also works efficiently with other query terms, so we can combine it with filters and so on and still do an efficient approximate nearest neighbor search.
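(As a hedged illustration, not shown in the talk: in Vespa's query language, YQL, such a combination might look roughly like this, where the document type `article`, the field `abstract_embedding`, and the query tensor name `q_vec` are hypothetical, and annotation syntax varies a bit by version.

    select * from article
    where {targetHits: 100}nearestNeighbor(abstract_embedding, q_vec)
      and language contains "en"

The point is that the filter and the graph search are evaluated together by the engine, instead of intersecting two separately produced result sets.)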
How does model inference work in Vespa? Vespa has a tensor data model, where you can add tensors to documents, to queries, and to the application package. A tensor is just a multi-dimensional collection of numbers. Each of the dimensions can be sparse or dense, and you can combine sparse and dense dimensions in the same tensor, as in the example here, where you have a two-dimensional tensor with a sparse key and a dense vector, so it's really a map of vectors.
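(A hedged sketch of that notation; the type spec and the values are illustrative only:

    # one sparse dimension "key" and one dense dimension "x": a map of vectors
    tensor<float>(key{}, x[3]) : { a: [1.0, 2.0, 3.0], b: [4.0, 5.0, 6.0] }

Here looking up "a" or "b" in the sparse dimension yields a dense three-element vector.)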
Then you can use tensor math to express machine-learned models or business logic over these tensors. There's a small set of core operations, which we use in our tensor engine for optimization, and then a larger set of higher-level functions, which are the ones you will typically use in your models, and which map down to those primitive functions, join and map and so on. That's quite neat, but not that interesting for users, I guess; you just use the high-level functions. Or, if you don't want to write your expressions by hand, you can deploy TensorFlow or ONNX or XGBoost or LightGBM models directly in Vespa, and Vespa will do the translation automatically when you deploy the model. We have our own tensor execution engine inside Vespa that is optimized for repeated execution of the models over many data items, which is what you typically want in these kinds of systems: you're not just evaluating a single data point per query, but many data points, articles or movies or whatever it is.
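(A hedged example of how a high-level function desugars to the primitives; `q_vec` and `embedding` are hypothetical names:

    # dot product of a query vector and a document vector over dimension x
    sum(query(q_vec) * attribute(embedding), x)
    # is shorthand for the primitive form
    reduce(join(query(q_vec), attribute(embedding), f(a, b)(a * b)), sum, x)

You would normally write only the first form; the engine works with the second.)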
Just to show a quick example of the hybrid model thing I talked about earlier: what we see almost every time we look at the performance we get out of these various models is that you don't get the best model by using either some traditional text features or a neural net model alone; you get the very best performance by combining both. Here we have some traditional text features in one regular rank profile, and another rank profile which is just the distance in the vector space for this embedding, and then we have a hybrid model which is simply the sum of both, and it outperforms the other two. It's a very simple example, because it's from one of our sample applications, but it illustrates the point.
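(A hedged sketch of what such a hybrid rank profile can look like in a Vespa schema; the field names are hypothetical:

    rank-profile hybrid inherits default {
        first-phase {
            # text features plus vector-space closeness, simply added together
            expression: bm25(title) + bm25(abstract) + closeness(field, abstract_embedding)
        }
    }

The weighting here is implicit and equal; in practice you would tune the weights.)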
I'm going to go through another example application in a bit more depth, and I've chosen the application we call CORD-19, at cord19.vespa.ai. When the pandemic broke out, the Allen Institute released a data set of papers about, or at least somehow related to, the coronavirus, initially about 40,000 and now about 138,000 of them. My team took a week or two out to build a tool to help explore this data set, so that researchers could more quickly do science to learn about the new disease, which seemed like an important thing to do at the time. It combines traditional text search features with article similarity search, and also grouping and filtering, which is something you typically want when you do exploration. And here everything is open: the data set, Vespa itself, the Vespa application that implements CORD-19, and the frontend that we built on top. That's the advantage of this example: it's an open data set and everything is open source. The disadvantage is that the data set is very small, just 130,000-odd articles, but Vespa scales to about a million times as much content without really changing anything other than adding more nodes, because you need more resources for that, obviously.
So let me exit the presentation and show you the CORD-19 application and how it works. This is the front page. You can write a query as you would expect, but I'll just click on one of these now; this one, for example. So this is a rather complex query, and you get results as you would expect, and here you see all the matches you get in the various sources, journals, and so on; this is a grouping feature in Vespa. You can also search for similar articles here. What that does is add this "related to" term to the query, which is picked up by a custom Java component in this application that fetches that article from Vespa, fetches the embedding vector of that article, and then adds that embedding vector to the query that is sent down, so that you get the combination of the text features you added in the query and the nearest neighbor search around that article. So you get the combination of both, and that's very useful when you are exploring, because you have an article that somehow represents the topic you're interested in, and you combine that with text search features that express more precise conditions on what you're interested in. You can also open the article itself, which is served from Vespa as well, and there you can do a similar-articles search by the different embedding vectors that are provided, and things like that.
Okay, so how is this implemented? Let's go into it in a bit more detail. This is the GitHub repo for the frontend, and we have a separate repo for the backend, the Vespa application, which is an example of an application package, as I mentioned before. I'll go through what it contains, but first, I have it checked out here, so I'll go to the source, and there you can see the size of the whole thing. It contains a LightGBM model that we have been experimenting with; that's a lot of lines, but of course those are auto-generated by the machine learning. Apart from that, it's just about 600 lines of code implementing this entire CORD-19 application, which will scale to any size you want, and where you can combine vector similarity and text search, snippeting, grouping and aggregation, and all of these things.
Let's look at what it actually contains. The application itself basically consists of these two files. First, a services file, which describes the clusters you want to run. In this case, we run one of those stateless Java container clusters and one content cluster that holds the content. In the container cluster we have some custom Java components, which we'll take a quick look at later, and then we just specify the resources this cluster should run on. This runs on the public Vespa Cloud, so we can just specify the resources we want and deploy it, and the system will get those resources on AWS and run it. In this case, we specify the resources of each node, and then we say we want from two to four nodes, depending on the load we are seeing. For the content cluster, there's a little bit of tuning here, including tuning of the snippets; apart from that, we just reference the single schema that we use for the documents and, again, specify the resources, and that's it. Second, we have a deployment XML, which specifies where these should run; this just runs in a single AWS region. If you are self-hosting Vespa, it's really the same thing, except that instead of specifying resources, you list the actual hosts that you want to run each cluster on; that's the only difference, and then you also don't need the deployment XML file.
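(A hedged, minimal sketch of such a services file; element details vary by Vespa version, and the cluster names and node counts here are illustrative:

    <services version="1.0">
      <container id="default" version="1.0">
        <search/>                    <!-- query processing and custom searchers -->
        <document-api/>              <!-- write endpoint -->
        <nodes count="[2, 4]"/>      <!-- autoscale between two and four nodes -->
      </container>
      <content id="articles" version="1.0">
        <redundancy>2</redundancy>
        <documents>
          <document type="article" mode="index"/>
        </documents>
        <nodes count="2"/>
      </content>
    </services>

One container cluster on top, one content cluster below, as in the two-tier overview earlier.)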
So what else is here? There's the machine-learned LightGBM model, some certificates, and a specification of the stuff you can send in the query, which is these embedding vectors. And then there's the single schema we use, which describes the data we have here: a single type representing the scholarly article itself. It has a bunch of fields, as you would expect, with the title, the content itself, the citations, and whatnot, plus some embedding vectors: one for the abstract, one for the title, and then another embedding vector supplied by the Allen Institute team, called the SPECTER embedding. Those are all one-dimensional dense tensors.
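(A hedged sketch of what such a field can look like in the schema; the field name, the 768-dimension size, and the distance metric are assumptions, not taken from the talk:

    field specter_embedding type tensor<float>(x[768]) {
        indexing: attribute | index    # index enables (approximate) nearest neighbor search
        attribute {
            distance-metric: angular
        }
    }

Declaring the tensor as an attribute keeps it in memory for fast access during ranking.)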
The schema also describes how we can rank, or more generally evaluate machine-learned models over, these documents, using rank profiles. There's a bunch of those here. I won't go into them in detail, but there's one that just uses normal text features, one that uses BM25 together with the normal text features, and then one that uses the LightGBM model. That one could also be combined with other features and expressions and tensors and whatnot, because all of this is just math, as you can see here; you could say "plus the LightGBM model" here, or whatever. We also have some rank profiles used by the searchers listed here, where we just access what we call the raw score of this embedding-vector nearest neighbor search, which returns a distance. And that's really all you need to create an application. In addition, we have some custom Java code here to implement the stuff I mentioned around searching for related articles. We call these components, which can intercept the query and/or the result, searchers. They implement a single method, the search method, which gets the query and returns the result. In this case, it just looks to see if there is one of these "related to" items in the query. If not, it just returns, which means it does nothing and you have a normal search. Otherwise, it translates that "related to" item into the approximate nearest neighbor operator, which happens in a subclass. Let's take a quick look at that as well. Here you can see that it just news up a nearest neighbor item. Here we don't allow approximate nearest neighbor, because the data set is so small; that's the only thing you would change, other than adding resources, if you wanted to scale to a billion documents. You would definitely set allow-approximate to true, but other than that, everything would be the same. The create-item method is used up here, where we combine it with the other items in the query. And here you can also see an example where we create two nearest neighbor items and combine them with OR, to search for nearest neighbors in both the abstract and the title. Things like that you can do freely, and it just works.
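(A hedged Java sketch in the spirit of that component; the class, field, tensor, and property names are hypothetical, not the actual CORD-19 code:

    import com.yahoo.prelude.query.AndItem;
    import com.yahoo.prelude.query.NearestNeighborItem;
    import com.yahoo.prelude.query.OrItem;
    import com.yahoo.search.Query;
    import com.yahoo.search.Result;
    import com.yahoo.search.Searcher;
    import com.yahoo.search.query.QueryTree;
    import com.yahoo.search.searchchain.Execution;

    public class RelatedArticleSearcher extends Searcher {

        @Override
        public Result search(Query query, Execution execution) {
            if (query.properties().getString("related_to") == null)
                return execution.search(query); // no related-to item: normal search

            // Nearest neighbor search in both embedding fields, OR-ed together
            OrItem nearestNeighbors = new OrItem();
            nearestNeighbors.addItem(createNearestNeighborItem("title_embedding"));
            nearestNeighbors.addItem(createNearestNeighborItem("abstract_embedding"));

            // AND the nearest neighbor branch with whatever else is in the query
            QueryTree tree = query.getModel().getQueryTree();
            AndItem root = new AndItem();
            root.addItem(tree.getRoot());
            root.addItem(nearestNeighbors);
            tree.setRoot(root);
            return execution.search(query);
        }

        private NearestNeighborItem createNearestNeighborItem(String field) {
            // The second argument names the query tensor holding the embedding
            NearestNeighborItem item = new NearestNeighborItem(field, "related_vector");
            item.setTargetNumHits(100);
            item.setAllowApproximate(false); // exact search: this data set is small
            return item;
        }
    }

Setting allow-approximate to true is the one change mentioned above for scaling to very large data sets.)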
That's all I really wanted to cover, and that's really all there is in this application. You can easily check it out yourself: go to github.com/vespa-engine/sample-applications and you can find it there, or just go to the CORD-19 application and click the open source link on top.
To wrap up: vector-based retrieval and tensor-based relevance, which is one way to look at these deep neural networks, at least when you just want to do inference, is emerging as an alternative to traditional search, and it's already the state of the art for recommendation, personalization, targeting, and so on. But even though there are good tools for each of the pieces, productionizing these methods on your own is hard: it's difficult to combine them into a production-quality system that has good performance in all cases, sustains that performance as you make changes, can combine with filtering and traditional search and so on, and is also operable and scalable. If you don't want to do all that work, you can just try out vespa.ai, which provides all of it in a single integrated solution, with better performance than you would get by combining these pieces on your own. You can find Vespa at vespa.ai. So that's all; then we can switch to live and take questions.
So thanks, Jon, for the great presentation. CORD-19 search seems super useful, so all the best with that. Guys, we still have a couple of minutes; if you have any questions, please ask them on the Slack channel. I guess, Jon, you already provided the link to the GitHub site. Maybe while we are waiting: do you have any further insight into how you plan to evolve Vespa? What's the roadmap?

So where we're spending most of our effort right now is really on the cloud service for all the applications that are using it. In my company we provide a cloud service, and we just very recently started providing that cloud service to external customers as well. So we are mostly focusing on making that more broadly available, and adding more features to make it cheaper to run, and things like that.

So we do seem to have one question. The question is from Edward. It's basically... oh, sorry, okay, there's one more before that, from Maya. She would like to understand how we can build vector embeddings for articles. I guess it's more like, how can you add them?

Yeah, I think maybe the question is how to come up with the vectors. That's the machine learning part, really, and that's somebody else's problem as far as we're concerned. We just make it fast to retrieve them and compute with them once you have created the vectors. But how you create the embeddings, that's the machine learning part, which typically happens outside this.

We have another question; it's from Edward, and he's asking whether the Vespa architecture allows plugging in new artificial neural network algorithms. So basically, how extensible is the architecture?

So the tensor language we have allows you to express pretty much all the models I've seen. Recently, when people came up with BERT-type models, transformer models with lots of matrices and so on, we had to extend the tensor math language a bit, but apart from that, it should handle all kinds of models you would come up with, because the core operations I mentioned, like map and reduce and so on, are very general. So you can pretty much implement all kinds of computations over tensors on top of them.
And I think there was one more question as well. Yeah, actually, I think Edward has a follow-up question. I'm not sure I understand all the... wait, I see it again; but you see it as well, maybe you can take it. Yeah: can we plug other approximate nearest neighbor search algorithms into Vespa?

No, you cannot, not without lots and lots of work. That's basically what we have been doing for about six months now: plugging one algorithm for this into Vespa, which means implementing it in C++ so that it works with the rest of the engine and supports all the operations we need to support, at high throughput, including removal of documents and so on. Most of these algorithms don't handle this very well, so I don't think it would work well in production to just plug something in; you need to implement it from scratch with all these requirements taken into account, if it's for more than experimenting. But I think we have chosen the right algorithm for this now, so I don't think there's a great need to plug in something else, to be honest.

Right. And there's another question by Maya, also related to embeddings and text retrieval features.
I think that's partly answered already, but maybe you want to comment a bit more. Yeah, do you include them in a single model, she's asking.

Yeah. So, combining embeddings with text retrieval features, there are two parts to it. One is the retrieval: you want to retrieve both the nearest neighbors of some vector, and also the documents that are not near neighbors but match the same tokens, so you want to retrieve a mix of both. That's logically easy, but difficult to do efficiently, because to do it efficiently you really want to evaluate both things in parallel, taking filters into account and so on. That's another reason why you need to integrate this deep into the engine to make it efficient. But we have done that, so when you're using it, you just create the nearest neighbor item, or several items, in the query tree, and you can combine it with AND and OR and so on with text items. The other part is relevance, and as you saw, in one model that we actually got pretty good results from, we just added together the closeness in vector space with some simple text features, like BM25 or whatever; just adding them together, perhaps weighted somehow, is fine.

So do you have this benchmarking result somewhere?

Yeah, we do. It's part of a sample application that we provide, so you can run the whole thing yourself. Actually, if you look in the parent directory of the thing I shared earlier, you'll find all the sample applications, with benchmarks as well.

Super, great. I don't see any other questions, so thanks again, Jon. Everyone, we can of course continue the discussion in the breakout channels, so basically the vbuzz2. And yeah, thanks again for the presentation, and have a nice evening.