
Neural Search Comes to Apache Solr: Approximate Nearest Neighbor, BERT & more


Formal Metadata

Title
Neural Search Comes to Apache Solr: Approximate Nearest Neighbor, BERT & more
Title of Series
Number of Parts
56
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The first integrations of machine learning techniques with search made it possible to improve the ranking of your search results (Learning To Rank) - but one limitation has always been that documents had to contain the keywords that the user typed in the search box in order to be retrieved. For example, the query “tiger” won’t retrieve documents containing only the terms “panthera tigris”. This is called the vocabulary mismatch problem, and over the years it has been mitigated through query and document expansion approaches. Neural search is an Artificial Intelligence technique that allows a search engine to reach those documents that are semantically similar to the user’s query without necessarily containing those terms; it avoids the need for long lists of synonyms by automatically learning the similarity of terms and sentences in your collection through the use of deep neural networks and numerical vector representations. This talk explores the first official Apache Solr contribution on this topic, available from Apache Solr 9.0. During the talk we will give an overview of neural search (Don’t worry - we will keep it simple!): we will describe vector representations for queries and documents, and how Approximate K-Nearest Neighbor (KNN) vector search works. We will show how neural search can be used along with deep learning techniques (e.g., BERT) or directly on vector data, and how we implemented this feature in Apache Solr, giving usage examples! Join us as we explore this exciting new Apache Solr feature and learn how you can leverage it to improve your search experience!
Transcript: English(auto-generated)
So, today I'm going to present the story of an Apache Solr contribution. As you can see from the title, there are a lot of buzzwords. I mean, that's in line with the name of the conference, so I was like, okay, I should put as many as possible, right? It's going
to make it more appealing. So, before we start, a quick introduction about myself. So, my name is Alessandro Benedetti. I'm originally from Tarquinia, Italy, an ancient city, pre-Roman actually, so it's an Etruscan city, and I am an R&D software
engineer. In my spare time, I'm the director of my company. I mean, I used to say that, because I actually love engineering a lot, and directing a company implies a lot of other things, which sometimes are a little bit more boring, but I really like
the R&D side of my job. I have a master's degree in computer science from the University of Rome, and I am a member of the program committee of the European Conference on Information Retrieval and of the Special Interest Group on Information Retrieval conference, which are academic conferences, but they normally also have an industry day, so I enjoy taking part in peer reviewing and especially in the reproducibility track. I've been working for a long time with Lucene and Solr. I'm a Lucene and Solr committer, and I am a PMC member of Solr, and I've also been working with Elasticsearch a lot. So, my passion is around integrating artificial intelligence and machine learning technologies with information retrieval. In my spare time, I also do beach volleyball and snowboarding, not in London where I live, but in Italy when I go back for the summer or some short winter
period. A short introduction about my company, Sease. I founded Sease in late 2016, and we are information retrieval specialists. So, the mission of my company is to reduce the gap between academic research in information retrieval and real-world industry applications,
and we decided to do that through open source software. So, that's the reason we are really passionate about open source software, and we are not only using, consulting on, and training about open source software, but we are actively contributing back. So, we are contributing back with ideas, code, and support, and not only through official code contributions, but also with internal projects that we share on our blog and through the mailing lists. So, we try to help as much as possible because we really
love the scientific approach behind information retrieval, and we want to give a hand to the scientific community. We want to improve information retrieval in general. So, of course, we need money from our clients to go ahead, but we are really happy to give it back.
And some of the trends we are working on now are listed in the slide. I won't repeat them, but we are really passionate about integrating machine learning with search, and this is part of our talk today. So, an overview about what I'm going to talk about. So, first of all, we're going to describe some of the problems with lexical search, effectively term-based search. You know, I'm going to use semantic search as a sort of holy grail of information retrieval, right? Being able to always recognize the meaning behind a user's information need and return relevant results. Currently, most search solutions use lexical search, so matching on terms, and there are some problems with it. Then we are going to describe how neural vector-based search works, how it aims to solve those problems, and the Apache Solr implementation. Then a little bit of a description about BERT and how you can integrate large language models with Solr to obtain an end-to-end neural search implementation. I'm going to wrap it up with some
future works to describe our current projects, what's currently in line for us to develop, and what we are going to release and contribute soon. So, the first problem I want to describe
is the vocabulary mismatch problem. This affects lexical search in general, so you may see an example of the vocabulary mismatch problem with false positives. So, the user information need in this example is to find out the population of the city of Rome. We see that one relevant document
is returned by the search engine. Rome's population is 4.3 million, so that's fine, it's great, it's a good result. It has just one query term, so from a lexical perspective
it doesn't sound that great. For this reason, a lexical search engine would actually return as a better result a document containing "hundreds of people queuing for live music in Rome". So, this document has nothing to do with Rome's population, I mean it's not very relevant, but it shares three query terms with the query. So, from a lexical perspective it contains a lot of information which is in line with the user information need. So, that's a false positive caused by the fact that the dictionary used by the user doesn't align with the
dictionary in the corpus. Another example is the false negative kind of results that you can get with a vocabulary mismatch problem. So, 2022, the year of the tiger, so this example is in line
with the Chinese calendar, and the query, the user information need is about the size of the tiger, of the big cat called tiger. So, how big is a tiger? We can get lexically a document, I mean, assuming we applied a little bit of stemming, a nice candidate would be the
tiger is the biggest member of the Felidae family. Now, unfortunately, a result that potentially is not returned would be: Panthera tigris can reach 390 centimeters nose to tail. We can see here that instead of "tiger" the scientific Latin name for the animal is used, and even if it's an interesting and absolutely relevant result, a lexical search engine, assuming it doesn't use any advanced synonym matching algorithm, is not going to return it. In general, semantic
similarity has a problem with the vocabulary. So, you may have user information needs that are extremely similar from a lexical perspective, such as "how are you" or "how old are you". So, clearly they share a lot of query terms, but the meaning is completely different. On the other hand, we can have user information needs that don't share any term at all but have a very similar meaning, such as "how old are you" and "what is your age". So, neural search aims to solve this
problem. We need to take a little step back to describe the vector representations that are used by lexical search and by dense retrieval. So, sparse modeling, which is used by bag-of-words approaches, implies that the dimension of the vector corresponds to the size of the term dictionary. So, each term in the corpus of information corresponds to one dimension in the vector. So, with this structure, we end up having, for any given document, a vector that is mostly zeros, because most of the terms in the dictionary are not in a certain document. Then some of the values, so the terms that appear in the document, will be different from zero. So, it can be just one when the term is present, or potentially we may encode the term frequencies or any kind of term scoring in this representation. On the other hand, with a dense representation, we have a fixed number of dimensions, and normally this is much lower than in sparse representations, so the vectors are shorter. And for any given document, we have a vector that is mostly non-zeros.
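To make the contrast concrete, here is a tiny illustrative sketch (the values are made up; a real dense vector would come from an encoder):

```python
# Sparse bag-of-words representation: one dimension per term in the dictionary, mostly zeros.
vocabulary = ["age", "big", "how", "is", "old", "tiger", "you"]   # toy 7-term dictionary
sparse_doc = [0, 1, 1, 1, 0, 1, 0]   # "how big is tiger" -> 1 where the term occurs

# Dense representation: fixed, much smaller dimensionality, mostly non-zero values,
# produced by a deep neural encoder (these numbers are purely illustrative).
dense_doc = [0.12, -0.48, 0.33, 0.91]
```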
But how can you generate this vector? How can you encode the text and its information into vectors? With the neural search paradigm, we are going to use deep neural networks to encode the text into a vector representation, store the vectors in data structures at indexing time, and then query them. Specifically, we are going to call the element that has the responsibility of doing this encoding a transformer. We're going to see later on how large language models and transformers work at a high level and how you can integrate them with Solr. But for the sake of our workflow description, you can imagine that this component takes text in input and is able to return an output vector. Then, at indexing time, we take the vectors, one for each document, and we build some sort of data structure, and we're going to see the different options we have. We build this data structure at indexing time and then, at query time, we run a search on a vector representation of the query to find the closest vectors to the query. And the similarity between the query and a document, so the similarity score, is effectively translated into a distance in the vector space. So when I say closer, I effectively mean more similar. There are various ways of calculating the distance between vectors.
We are not going to describe the math behind it that much. Just so you know, which kind of distance to use depends on your use case, so the classic recommendation is to experiment. Just as a general idea, the cosine similarity is a distance measure that takes into account the angle between vectors, and this is pretty much a solid option for information retrieval use cases, but you should experiment. Anyway, there are various options supported by Solr regarding that. So how do you query for vectors? You want to find the top k nearest vectors, so nearest neighbors, as they are called, and the acronym you have potentially seen is KNN. What you do is effectively start from your query vector and retrieve what's closer; the closer, the better, the higher the semantic similarity. But running exact nearest neighbor search is expensive. If you take just your query and the vectors from each of your documents and you calculate the distance, and you may have millions of vectors, this takes time and computational resources. So unless you have a small corpus of information, it's probably not the right idea to go with exact nearest neighbor. So researchers ended up finding different approximate solutions.
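As a rough illustration of why exact search gets expensive, here is a minimal brute-force KNN sketch (using cosine similarity; it assumes all document vectors fit in memory as NumPy arrays, which is exactly what does not scale):

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def exact_knn(query_vec, doc_vectors, k=10):
    # Exact (brute-force) KNN: score the query against every document vector, O(n_docs * dims).
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# toy usage
docs = {"doc1": np.array([0.1, 0.9]), "doc2": np.array([0.8, 0.2])}
print(exact_knn(np.array([0.2, 0.8]), docs, k=1))   # -> [('doc1', ...)]
```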
So you can lose accuracy but gain a lot from the performance perspective. Normally, going the approximate way means you lose a little bit of information: you are potentially compressing your vectors, you are pre-processing your data and building some data structures that you then reuse at query time. Just to give you some context, there are mainly three families of solutions for approximate nearest neighbor. Tree-based, where you effectively build a partitioning of your vector space and then at query time you just navigate parts of the vector space to find your closest vector or closest vectors. Hashing, where you reduce the dimensionality of your vectors, hopefully avoiding losing information and keeping the differences between vectors, and then group similar objects, so using clustering approaches. Finally, graph-based approaches, which is the one used by Lucene and then in Solr, and specifically we're going to talk about HNSW; first of all, this acronym stands for Hierarchical Navigable Small World graphs. It's one of the top performing solutions among the index-time data structures that you can use for approximate nearest neighbor, and there are a couple of references to the original papers that developed the idea. The latest one is from 2018, but of course the
research team has been working on this even with some additional developments later than that. So what is a hierarchical navigable small world graph? It's a proximity graph, so it models vectors and distances between vectors; specifically, each of the vertices in the graph is a vector, and closer vectors are linked together. Now, why hierarchical? The approach behind the hierarchical navigable small world graph takes inspiration from skip lists. So what you do is model different layers: the higher the layer, the longer the links, so the longer the edges between the nodes, and this is for fast retrieval; if you go down in layers, what you get are shorter edges, or shorter links, for accuracy, so you are able to refine the distances and effectively refine the neighbors. So what you do is go layer by layer with a greedy search looking for the local minimum that hopefully is the global minimum, and the more you go down, the more you refine the minimum. The degree of the vertices is something that you decide when building the graph: effectively, the higher the degree, the lower the probability of hitting a local minimum, because it means that the graph is more connected and you are more likely to end up finding the right neighbors.
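Just to give an intuition of the layer-by-layer greedy descent, here is a heavily simplified toy sketch (this is not the Lucene implementation: real HNSW keeps a beam of candidates rather than a single greedy walker, and builds the layers probabilistically):

```python
import heapq, math

def greedy_search_layer(layer, vectors, entry, query):
    # Greedy walk within one layer: move to the closest neighbour until no improvement.
    current = entry
    improved = True
    while improved:
        improved = False
        for neighbour in layer.get(current, []):
            if math.dist(vectors[neighbour], query) < math.dist(vectors[current], query):
                current, improved = neighbour, True
    return current

def hnsw_search(layers, vectors, entry, query, k):
    # layers[0] is the bottom (densest) layer; higher layers are sparser, with longer edges.
    for layer in reversed(layers[1:]):                 # descend, refining the entry point
        entry = greedy_search_layer(layer, vectors, entry, query)
    candidates = {entry, *layers[0].get(entry, [])}    # neighbourhood on the bottom layer
    return heapq.nsmallest(k, candidates, key=lambda n: math.dist(vectors[n], query))
```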
So how is this implemented in Solr? First of all, the Apache Lucene implementation. Originally, before the end of 2020, vector-based search was doable in Lucene using data structures that were not meant for that functionality. This means that it was not working in an optimal way, but it was possible to achieve vector-based search. In November 2020, for Apache Lucene 9.0, a dedicated codec with a dedicated file format was contributed for navigable small world graphs. That's the first milestone to enable vector-based search and consequently neural search. Then, over the last couple of years, more or less, we got many contributions in that space in Lucene: handling of document deletions, the introduction of the hierarchy in the Navigable Small World Graph implementation, and improvements from the perspective of performance and memory utilization. In March 2022, pre-filtering was also contributed to Lucene, to give the possibility of reducing the scope of your search before looking for neighbors.
Many more issues are actually related to this topic. I tagged them, and you can run a JIRA query on the Lucene and Solr projects from the Apache Software Foundation to find out about all the different contributions if you're curious. So in Lucene, there is a defined set of similarity functions that you can use: the Euclidean distance, the cosine similarity, and, in case you have normalized vectors, so vectors of magnitude one, you can use the dot product, which is effectively an optimization of the cosine distance. So if you normalize before, you don't need to normalize when building the graph and you can use the dot product directly. They're pretty much similar; just keep in mind that depending on the kind of vectors you have, you can use one distance or another. How can you index vectors in Lucene? And
this is actually what is used in the Apache Solr implementation: a dedicated field type called KNN vector field. A KNN vector field takes as input an array of float values, and, as simple as it is, you just add it to your Lucene document and then you push it through your indexing chain; in the codec, the writer and the graph builder, the HNSW graph is built at indexing time. So far so cool, and what about query time? Also at query time, Lucene exposes a nice interface that is used in Apache Solr to model a KNN query. This query takes as input the field, so the KNN vector field you want to run your search in, the query vector, so the target of your search, a simple array of floats, the top K, so the amount of neighbors you want to return, and potentially a pre-filter, so another query, as complex as you like, to reduce the scope of your search before you look for neighbors. So the Apache Solr implementation uses these libraries from Lucene and an additional way
of encoding the stored content, effectively using classic standard float value storage, to make it very transparent and easy to use. It's been released with Apache Solr 9.0 last May, so it's pretty recent. I've been working on that since the beginning of the year, and thanks to the effort of the community, we've been able to release it last month. So also in the case of Solr, you can take a look at the JIRA link in the slides, or in general
searching for vector-based search in Solr to find all the related issues and also future works. So how can you use Solr to index and search vectors? And then I'll run through a full end-to-end neural search. The entry point is your schema, as usual. The schema XML in Solr allows the admin to define your data model. In your data model, you will define the field type, which is the dense vector field type, and a couple of parameters that effectively set the internal Lucene parameters. These parameters are actually quite simple. The vector dimension, so the cardinality of your vector, and this is limited to 1024, not for any particular reason except to be performance-aware. So it's possible, it is just a hard-coded value, because we wanted to give flexibility to the user, but not too much, and then get back complaints like "we are using vectors with 10 million dimensions and this doesn't really work". For this reason, at the moment, it's limited to 1024. We may increase that in the future. If you want to do that, you need to customize your Lucene build and then set it in Solr. Currently, normally, you are fine with vectors smaller than 1024, but in case you need something custom, you can do that. And the similarity function. The three functions that are supported by Solr, as I mentioned before, are Euclidean, dot product, and cosine distance. Then you assign this field type to your field, and then the usual attributes in the schema to index the field, store it, and so on. Currently, with the vector-based field, effectively only stored content and indexing are allowed. Doc values and multi-values are not possible at the moment. I mean, they are current limitations, and doc values just because we don't need them right now. Maybe in the future, if there's any function-score-related thing that can benefit from them, we'll do that. But at the moment, we were focused on providing a nice and easy KNN experience to our users. There are also a couple of advanced parameters that are strictly related to the current algorithm used, so HNSW. These parameters affect the way you build the graph at indexing time. We have the HNSW max connections, which is the parameter that affects the degree of the vertices. This is related to the balance, the trade-off, between performance and accuracy. Of course, the higher it is, the more connected the graph, and this means more computational resources and time for building the graph and for searching, but more accurate results. And also the beam width. Both of these parameters are actually related to specific parameters from the papers. The names are slightly different in the papers, but in the Solr documentation you will find the mapping between the Solr parameters and the paper parameters; the beam width effectively affects the number of nodes per layer. If you're curious, you can access the 2018 paper and explore them in detail. Unless you want to run something very specific, you don't need to change them.
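As a sketch, this is roughly what such a field definition could look like when added through the Schema API instead of editing schema.xml by hand (the collection name, dimension and HNSW values here are just example choices; check the Solr 9 documentation for the exact parameter set of your version):

```python
import requests

SOLR = "http://localhost:8983/solr/my_collection"   # assumed local Solr 9 collection

schema_payload = {
    "add-field-type": {
        "name": "knn_vector",
        "class": "solr.DenseVectorField",
        "vectorDimension": 4,            # must match your encoder's output size (max 1024 by default)
        "similarityFunction": "cosine",  # or "euclidean" / "dot_product"
        "hnswMaxConnections": 16,        # advanced HNSW parameters, usually fine at their defaults
        "hnswBeamWidth": 100,
    },
    "add-field": {"name": "vector", "type": "knn_vector", "indexed": True, "stored": True},
}
requests.post(f"{SOLR}/schema", json=schema_payload).raise_for_status()
```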
So how can you index vectors in Solr? That's simple. It's not that different from a multivalued float field. You just pass in a JSON array, so an array of float values, or an XML representation of your document, which is quite verbose, and, to be honest, I don't know how many people still use the XML representation to push data to Solr, but it's doable: you just have to represent it with multiple XML nodes, one for each element in the vector. And in SolrJ, you can just use Java lists: you add them to the SolrInputDocument and you're ready to go. You can push the document or a batch of documents to Solr. So from the indexing perspective and the retrieval perspective in the search results, again, it's not very different from just a multivalued float. So that's quite simple and nice.
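For example, indexing a couple of documents with a vector field via the JSON update endpoint could look like this (toy 4-dimensional vectors, matching the example field type above):

```python
import requests

SOLR = "http://localhost:8983/solr/my_collection"

docs = [
    {"id": "1", "title": "the tiger is the biggest member of the Felidae family",
     "vector": [0.12, -0.48, 0.33, 0.91]},
    {"id": "2", "title": "Panthera tigris can reach 390 centimeters nose to tail",
     "vector": [0.10, -0.50, 0.35, 0.88]},
]
requests.post(f"{SOLR}/update?commit=true", json=docs).raise_for_status()
```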
From the searching perspective, a new query parser has been introduced in Solr that takes very simple parameters in input: the field, which must be a dense vector field, to run your queries in, the top k, so the amount of neighbors you want to retrieve, and the query vector, that's it, represented as an array with square brackets. You already define everything at indexing time, so you don't need anything else; it's quite simple.
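A minimal sketch of such a query, again assuming the toy collection above:

```python
import requests

SOLR = "http://localhost:8983/solr/my_collection"
query_vector = [0.11, -0.49, 0.34, 0.90]      # produced by the same encoder used at indexing time

params = {
    "q": "{!knn f=vector topK=3}" + str(query_vector),   # knn query parser: field, topK, query vector
    "fl": "id,title,score",
}
for doc in requests.get(f"{SOLR}/select", params=params).json()["response"]["docs"]:
    print(doc["id"], doc["score"], doc["title"])
```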
But there are some limitations at the moment, so let's explore the limitations as well. So, filter queries with vector-based search in Solr. At the moment, you can use vector-based search, KNN, in filter queries, so where the fq parameter uses the KNN query parser and the main query uses a classic lexical search. You can also do the opposite: you can use the KNN vector-based search in your main query and classic lexical search in the filter query or filter queries. But what's going on at the moment is post-filtering. What you do is intersect the document IDs coming from your filters with the document IDs coming from your top k neighbors, and this means that you potentially end up with fewer results than your top k, because you just take the top k and then potentially reduce that set. This of course is a problem, it's a current limitation, but with the new release coming up, Solr 9.1, we are introducing pre-filtering as well. So you apply the filter first and then you look for the top k neighbors.
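The two combinations described above could look roughly like this (remember that on Solr 9.0 the filter is applied as a post-filter, so you may get fewer than topK results; the 'category' field is just an assumed example field):

```python
query_vector = [0.11, -0.49, 0.34, 0.90]

# KNN as the main query, lexical filter query:
params_knn_main = {
    "q": "{!knn f=vector topK=10}" + str(query_vector),
    "fq": "category:animals",
}

# The opposite: lexical main query, KNN used inside the filter query:
params_knn_as_filter = {
    "q": "title:tiger",
    "fq": "{!knn f=vector topK=10}" + str(query_vector),
}
```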
Another limitation is with re-ranking. It's currently possible to use re-ranking with the KNN query parser, but what happens is that you change the score of the first-pass retrieval only for the documents that appear in the top k. So you're not running a one-to-one re-scoring of each of the first-pass retrieval search results; you just effectively intersect them again with the KNN results. Pure re-scoring is another feature that is coming with a future Solr release, so you will be able to select the top k candidates in a lexical way, potentially, and then re-score them, or manipulate and combine their score, with a language-model-based approach.
And something that is actually currently possible, and can be quite interesting, is to combine hybrid dense and sparse retrieval in Apache Solr. There are query parsers in Apache Solr that allow you to combine different query parsers, so different clauses, such as the Boolean query parser, where you can define multiple should clauses that are going to affect how the search results are matched from your index and scored in your ranking. So you can define a clause which is a lexical clause and another clause that is a pure vector-based search clause. Then documents are going to be returned matching, in this example, both the should clauses, and scores are calculated by summing them, or potentially, depending on the way you are combining these clauses, if you are using a dismax for example, by picking the max out of them. And what happens on the Lucene side is that you build your query combining the results of those two different query parsers, in this example.
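A hedged sketch of what such a hybrid query could look like, combining a lexical should clause and a knn should clause through the Boolean query parser, with parameter dereferencing via $lexicalQuery and $vectorQuery (field names and values are example assumptions):

```python
import requests

SOLR = "http://localhost:8983/solr/my_collection"
query_vector = [0.11, -0.49, 0.34, 0.90]

params = {
    "q": "{!bool should=$lexicalQuery should=$vectorQuery}",
    "lexicalQuery": "{!type=edismax qf=title v='how big is a tiger'}",
    "vectorQuery": "{!knn f=vector topK=10}" + str(query_vector),
    "fl": "id,title,score",   # scores of matching clauses are summed
}
print(requests.get(f"{SOLR}/select", params=params).json()["response"]["docs"])
```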
So we've done some initial benchmarks, and what we found out was that on a small index performance is quite nice. Of course this doesn't necessarily scale linearly to bigger volumes, but with an index of more or less one or two gigabytes, we ended up with these kinds of measurements: in terms of time when building the index, it takes more time to build the graphs. But from a query-time perspective, KNN ended up being quite fast, actually faster than classic simple lexical search. And in terms of optimization of your segments in the index, so after a merge of all your segments, we noticed an even bigger improvement in KNN search results. We're going to do additional benchmarks in the future, but this is just to give an idea that it's effectively usable already. So how can you use
BERT with all of this? First of all, effectively, you can encode vectors from text and then use those vectors in Solr. There are various ways: of course, you may already have vectors, or maybe you want to generate vectors from text, and large language models can be an option, but large language models need a lot of data to be trained on. So it's normally difficult for a small enterprise to gather such a big amount of data. For this reason, transformers were quite successful, because they use pre-training on large corpora such as Wikipedia, the web, or a large bibliographic corpus of information, and that is a way for the transformer, for the language model, to achieve a general understanding of the language. Then you will need to fine-tune it with your smaller amount of data, which is domain-related, potentially to achieve a specific task. So you may be looking for text summarization, maybe dense retrieval, maybe translation, so machine translation,
or essay generation, whatever you want to do. And the large language model approach shares similarities with Word2Vec, so effectively Word2Vec was generating a vector per word,
with large language models you can generate vectors per sentence, for example, one for each sentence. And just to give you an idea of how some of them work: they use a masked language modeling approach, where you take in input a window of text with various terms, you hide one of the terms, and the model aims to predict the missing term. So you train it on these large corpora, and then the language model is able to predict missing terms, and you can extract the weights from the deep neural network you produce to effectively get the vectors. So BERT is one of them, and it's actually a huge family of them; it was originally contributed by Google and over time has been refined a lot, and there are many
variations; you can download pre-trained BERT models and then fine-tune them, and the important thing for you to know is that they allow you to pass from text to vectors. And how can you do that with open source software? With open source software, the result is going to be: using a parser for your text, a classic text analysis, and then you take in input a model. This model can be originally pre-trained, and if you just use it pre-trained, it's not going to work that well, so the recommendation is to go through the fine-tuning step. The fine-tuning step effectively takes in input the pre-trained model and refines the weights in the neural network, potentially changing the last layer, to adapt it to your domain, to your task. In the dense retrieval case, what we want to do is to achieve a large difference between the score of a positive document for your query and a negative document for your query. So, effectively, providing these examples, we can fine-tune the model to better recognize this difference, and then we can effectively pick the weights from the neural network, and those are the vectors' values. In this example, we are going to use PyTorch to build and encode the vectors, and, in this case, the model is just an input, so it doesn't matter what the model is. We picked in this example a sentence transformer model that was just downloaded from the available pre-trained models, but as I mentioned,
it's not recommended to use it just as it is, because you want to fine-tune it. But with PyTorch and some Python code, it's actually super simple to move from your document text to vectors. You import your libraries, you may potentially use GPU acceleration or not, depending on whether you have it available, and then what you do is read your documents, fetch sentences, in this example one for each document, and then you push a batch to the sentence encoder to encode each sentence into a vector, and then you push the vectors to Solr. So literally, with 12-13 lines of code, you can achieve vectorization. Of course, if you then want to bring this to production, you also need to take care of performance and everything, but nowadays, thanks to GPU utilization, you can achieve nice performance. Of course, it's expensive to build vectors, but this is the balance: you spend a lot of time generating the vectors at indexing time to then obtain the benefits at query time.
So to wrap it up, some future and current work we're doing. From the Solr perspective, we are working to simplify the configuration in the schema: currently you potentially need to specify the codec you want to use, which is quite advanced, so we want to simplify that. We want to just leave the algorithm as an input parameter, and the advanced HNSW tuning for the graph building as a possibility, but we don't want users to really have to specify the Lucene90, 91 or 92 codec, so this is currently a work in progress. Pre-filtering in Solr: as I mentioned, in March it was contributed to Lucene and has been adopted in Elasticsearch, so we want to do the same for Solr. Actually, this work is almost finished and it's going to be contributed and available from
Solr 9.1, so this will give the ability to run filter queries, reduce the scope of your search and then look for the top k neighbors. And then some Lucene simplifications: I've been working on the vector similarity function simplification, because the Euclidean distance works like the opposite of the others, so effectively it's a distance, whilst the others are a similarity. So the more distant, the less relevant, while for the others, on the other hand, the higher the cosine similarity, the more relevant. This difference in trend was complicating the code a little bit, so I've been working on that simplification. And another contribution we are working on is to provide an update request processor to enrich text at indexing time and get the vectors directly in Solr, and the same for a query parser that takes in input text and a BERT model to do the inference. Of course, from a code perspective it's not that complex, but we are also evaluating all the performance implications, so before contributing it and making it available in Solr, we want to make sure that there's a nice balance in, for example, the memory used by your Solr instance, to do this effectively end-to-end directly in Solr. So some
additional resources from our blog: we wrote a lot about neural search in Solr and all the details of our contribution, and how you can use BERT to improve search relevance, and how you can also tackle this problem with additional AI-related strategies such as document enrichment and potentially changing the term scores, so instead of using term frequencies in the index, using term scores effectively identified through deep learning. So, to finish the talk with some
thanks to the Apache Lucene community, which I'm part of, for all the HNSW goodies and improvements. Elia Varshani, a colleague of mine, who developed the contribution with me. Christine for the accurate review: Christine is a Lucene and Solr committer, and she helped me a lot in reviewing the code of the contribution. Cassandra for the documentation: being a non-native speaker, I ended up having some not ideal documentation sentences, so she helped me a lot to simplify and improve them. And finally Michael for the discussion about dense vectors and how to describe them, and the difference between sparse and dense vector representations, in a nice and easy way in the documentation for everyone to understand. And thank you, audience, for your attention. Thank you very much. Any questions
right now in the room? Great talk, thank you. The vector dimension, do you have any
kind of suggestions on how that should be tuned and where people should start? So I think I would probably recommend starting with a classic, small BERT implementation with 768, which is one of the defaults you get from the pre-trained models, and start from there, keeping in mind the fact that currently in Solr 1024 is going to be the limit. Of course, it also depends on whether you already have vectors, so in that case you may need to adapt, but in case you move from text to vectors, I would start that way and then check effectively through experiments if that makes sense for your domain.
But yeah, I would start simple with a pre-trained 768 BERT, fine-tune it and check how it goes. Great presentation Alessandro, and happy to see ANN search in Solr. Great work. So one way to do hybrid search is to, for example, say okay, I trust lexical search 90% and I trust ANN search 10%, so you could assign weights during the scoring, right, and then your re-ranker could reorder the documents in the way they should be reordered, and the reason here I guess is that dense search is more optimized for recall, or at least vector search, right, and the lexical search could be tuned for precision, so you could have a balance there. So have you thought about this use case and are you planning to implement one? So effectively, if you use combined query parsers and you are aware that the score coming from vector-based search is going to be from zero to one, for example, you may tune it already, because anyway the lexical search part effectively calculates the BM25 score with all the boosts you can add as you prefer, and then if you combine them you're just going to do the sum. So what you mentioned should already be possible; of course there's a problem with the lexical side, which is not probabilistic in Solr and Lucene, so you don't really know if it's going to be, you know, I don't know, 10,000 or one, while the vector-based search, unfortunately, is always going to be from zero to one. So possible, but definitely probably not easy to do right now. Also, probably the ideal kind of scenario I would like is to integrate it in some way with learning to rank, so that these weights can come with a little bit more sense, and making sure that you don't end up dwarfing the vector-based search score, or the other way around. So possible right now, but I would dedicate a little bit more time to this problem. Does
this answer your question? Last one please. Thank you for a great talk Alessandro and I would be interested to know if we have benchmarked or have some stats to differentiate
between the classical similarity and vectors, like does it outperform BM25 in any way? Do we have any stats about it? I mean, I love that you showcased the indexing part, but do we have anything on the precision part of vector search versus BM25? Okay, so I've not done that directly from a quality perspective, but the Solr implementation is quite similar to the Elasticsearch implementation, which is actually quite different from the OpenSearch implementation. So if you take a look at the ANN benchmarks for Elasticsearch, I'm pretty sure they are pretty much similar to the Solr ones. We're going to do quality-related benchmarks for the Solr implementation in the future as well, but Elasticsearch and Solr both use the same Lucene code, so from a quality perspective there are already, so, I don't remember the exact links off the top of my head, but there are some benchmarks for quality, recall-based mostly, for Elasticsearch, combining it with Facebook files and the Best Buy dataset and others, so you can at least have some inspiration and some idea from there, and when we release the Solr ones, I suspect they're going to be pretty similar.
Okay thank you very much Alessandro, that was great. Thank you.