NrtSearch: Yelp’s fast, scalable, and cost-effective open source search engine

Formal Metadata

Title
NrtSearch: Yelp’s fast, scalable, and cost-effective open source search engine
Title of Series
Number of Parts
56
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Search and ranking are part of many important features on the Yelp platform, from looking for a plumber to showing relevant photos of the dish you search for. These varied use cases led to the creation of Yelp’s Elasticsearch-based ranking platform, which we presented at Berlin Buzzwords 2019. It allowed real-time indexing, learning-to-rank, and lower maintenance overhead, and gave more teams at Yelp access to search functionality. We recently built NrtSearch, a Lucene-based search engine, to replace Elasticsearch, and have open sourced it under the Apache 2.0 license. This talk will detail:

Challenges associated with scaling Elasticsearch costs and performance
- Mainly issues related to the document-based replication approach.
- Difficulties with real-time autoscaling of Elasticsearch.
- Inefficient usage of resources due to hot and cold node issues.

Architecture of NrtSearch
- Uses Lucene’s near-real-time (NRT) segment replication.
- Primary-replica architecture: the primary does all writing, including segment merges, while replicas simply copy over segments using Lucene's NRT APIs and serve search queries.
- Cluster orchestration, availability, and management of nodes are left to systems like Kubernetes that excel at resource management and scheduling.
- Truly stateless architecture: deployed as a standard microservice using Kubernetes. State is committed to S3; when a primary or replica restarts, it pulls down the most recent state from S3.

Benefits of this architecture
- Performance increased by up to 50%.
- Cluster costs lowered by up to 50%.
- Use of standard tools (k8s) to manage operational aspects of the cluster, freeing ranking infrastructure teams to focus on search-related problems.

Challenges involved in rolling this out to production
- Lucene’s segment replication approach, and the code itself, is not widely used in the industry, so it had some rough edges.
- Exciting performance bugs!

Future work
- Enhance feature support via extensible plugins, such as vector embeddings.
- Continue to simplify and open source deployment tooling to help others deploy NrtSearch in their own cloud environments.
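
The primary-replica flow described in the abstract can be illustrated with a small in-memory toy model. This is only a sketch under stated assumptions, not NrtSearch's or Lucene's actual API: the names `Segment`, `RemoteStore`, `Primary`, and `Replica` are hypothetical, a plain Python object stands in for S3, substring matching stands in for real search, and segment merging is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """Immutable batch of documents; a stand-in for a Lucene segment."""
    name: str
    docs: tuple


@dataclass
class RemoteStore:
    """Hypothetical stand-in for S3: holds the last committed segment set."""
    committed: dict = field(default_factory=dict)


class Primary:
    """Does all the writing; replicas never index documents themselves."""

    def __init__(self, store):
        self.store = store
        self.segments = dict(store.committed)  # restore state on (re)start
        self.buffer = []
        self.counter = len(self.segments)

    def add_document(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        """Flush buffered documents into a new immutable segment."""
        if self.buffer:
            name = f"seg_{self.counter}"
            self.segments[name] = Segment(name, tuple(self.buffer))
            self.counter += 1
            self.buffer = []

    def commit(self):
        """Persist the current segment set to the remote store."""
        self.refresh()
        self.store.committed = dict(self.segments)


class Replica:
    """Copies segments from the primary and serves searches."""

    def __init__(self, store):
        self.segments = dict(store.committed)  # bootstrap from remote state

    def sync(self, primary):
        """Copy only the segments missing locally; return their names."""
        missing = [n for n in primary.segments if n not in self.segments]
        for name in missing:
            self.segments[name] = primary.segments[name]
        return missing

    def search(self, term):
        """Naive substring match over all locally held segments."""
        return [d for s in self.segments.values() for d in s.docs if term in d]


if __name__ == "__main__":
    store = RemoteStore()
    primary, replica = Primary(store), Replica(store)
    primary.add_document("plumber in berlin")
    primary.refresh()
    print(replica.sync(primary), replica.search("plumber"))
    primary.commit()
    print(Replica(store).search("plumber"))  # a fresh replica bootstraps from the store
```

The point of the sketch is the division of labour: replicas copy whole segments rather than re-indexing each document, and any node can be replaced because the committed state lives outside the node.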