
Query Embeddings: Web Scale Search powered by Deep Learning and Python

Formal Metadata

Title
Query Embeddings: Web Scale Search powered by Deep Learning and Python
Title of Series
Part Number
45
Number of Parts
169
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Ankit Bahuguna - Query Embeddings: Web Scale Search powered by Deep Learning and Python

A web search engine lets a user type a few words as a query and presents a list of potentially relevant results within a fraction of a second. Traditionally, keywords in the user query were fuzzy-matched in real time against the keywords in the indexed pages, with little attempt to understand the meaning of the query. More recently, deep learning and NLP techniques have been used to represent sentences or documents as fixed-dimensional vectors in a high-dimensional space. These vectors capture the semantics of the document.

Query Embeddings is an unsupervised, deep learning based system, built using Python, Word2Vec, Annoy and Keyvi, which recognizes similarity between queries and their vectors for a web-scale search engine within the Cliqz browser. The goal is to describe how query embeddings contribute to our existing Python search stack at scale, and the latency issues that arise in a real-time search system. The talk also previews a separate vector index for queries, which the retrieval system uses at runtime via approximate nearest neighbor (ANN) search to find the queries closest to the user's query; this is one of many key components of our search stack.

Prerequisites: Basic experience in NLP, ML, Deep Learning, Web search and Vector Algebra.

Libraries: Annoy.
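
The idea of representing a query as a fixed-dimensional vector and retrieving the closest indexed queries can be illustrated with a minimal sketch. It is not the system described in the abstract: the word vectors in `WORD_VECTORS` are made-up toy values standing in for trained Word2Vec vectors, averaging word vectors is only one assumed way to build a query vector, and an exact brute-force cosine search replaces the approximate nearest neighbor lookup that Annoy would provide. All names (`embed_query`, `closest_queries`, `INDEXED_QUERIES`) are hypothetical.

```python
import math

# Hypothetical toy word vectors; a real system would load trained Word2Vec vectors.
WORD_VECTORS = {
    "cheap":    [0.9, 0.1, 0.0],
    "budget":   [0.8, 0.2, 0.0],
    "flights":  [0.1, 0.9, 0.1],
    "airfare":  [0.2, 0.8, 0.1],
    "python":   [0.0, 0.1, 0.9],
    "tutorial": [0.1, 0.0, 0.8],
}


def embed_query(query):
    """Average the vectors of the known words; return None if no word is known."""
    vectors = [WORD_VECTORS[w] for w in query.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_queries(user_query, indexed_queries, top_k=2):
    """Exact (brute-force) nearest-neighbor search over the indexed queries.

    This stands in for an approximate nearest neighbor index, which trades
    some accuracy for much lower latency on large collections.
    """
    query_vec = embed_query(user_query)
    if query_vec is None:
        return []
    scored = []
    for candidate in indexed_queries:
        candidate_vec = embed_query(candidate)
        if candidate_vec is not None:
            scored.append((cosine_similarity(query_vec, candidate_vec), candidate))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top_k]]


# Hypothetical set of previously seen queries.
INDEXED_QUERIES = ["budget airfare", "python tutorial", "cheap flights"]

if __name__ == "__main__":
    print(closest_queries("cheap airfare", INDEXED_QUERIES))
```

With these toy vectors, the query "cheap airfare" ranks the two flight-related queries ahead of "python tutorial", even though it shares only one word with each of them. A brute-force scan like this costs time linear in the number of indexed queries, which is why a web-scale system would rely on an approximate index instead.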