Highly Available Search at Shopify

Zitieren

Zugehöriges Material

Plain Schwarz

Ebrahimpour, Khosrow

Formale Metadaten

Titel

Highly Available Search at Shopify

Serientitel

Berlin Buzzwords 2023

Anzahl der Teile

Autor

Ebrahimpour, Khosrow

Lizenz

CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.

Identifikatoren

10.5446/66633 (DOI)

Herausgeber

Plain Schwarz

Erscheinungsjahr

2023

Sprache

Englisch

Inhaltliche Metadaten

Fachgebiet

Informatik

Genre

Konferenz/Talk

Abstract

Millions of merchants rely on Shopify’s search infrastructure to sell their products and fulfill their orders. To be successful, merchants need their data to be highly available and also searchable in a matter of seconds. Moreover, these merchants are spread in different jurisdictions across the globe where data residency regulations require them to ensure their sensitive data stays within their jurisdiction. However, since their buyers are also spread across the globe, non personal data such as store products should be available globally and close to buyers to provide a fast search experience. This talk explains how the search platform team at Shopify built a highly available search infrastructure that indexes petabytes of data from traditional databases to Elasticsearch through Kafka in record time. Since search is a critical service for a global commerce platform in Shopify’s scale, the indexing pipeline writing to Elasticsearch is implemented with high availability and disaster recovery as a key requirement. That is, if one region becomes unavailable, the designed data replication mechanism allows the search infrastructure to provide service without impacting merchants and buyers. Moreover, this infrastructure is distributed across the globe and designed in a way to follow data residency regulations of different jurisdictions while making sure buyers are able to search products with minimum delay. Shopify’s search infrastructure has proven to be performant and capable of indexing millions of documents per minute while serving millions of queries at the same time. The lessons learned shared in this talk about the challenges of building a highly available and performant search infrastructure will be interesting to individuals and will encourage them to solve similar challenges.