Kaldb: serverless Lucene at petabyte scale

Plain Schwarz

Karumuri, Suman

Formal Metadata

Title

Kaldb: serverless Lucene at petabyte scale

Title of Series

Berlin Buzzwords 2023

Number of Parts

Author

Karumuri, Suman

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/66610 (DOI)

Publisher

Plain Schwarz

Release Date

2023

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Running petabyte-scale columnar stores has become a routine operation in today's data-driven world. Running a petabyte-scale search system, however, is still operationally challenging. Enter Kaldb, an open-source, serverless Lucene serving system designed specifically for petabyte-scale Lucene workloads. We've designed Kaldb to automate and reduce operational toil without sacrificing performance or reliability. Designing a serverless Lucene system at this scale poses several unique challenges, such as ensuring durability of data, modifying replication and caching protocols for high availability, serving high-fanout reads, and managing ephemeral nodes. In this talk, we'll delve into the details of how our redesigned Kaldb system overcomes these challenges. We've separated durability of the data from storage, separated compute from storage, modified replication algorithms to handle ephemeral nodes, used Kafka as a write-ahead log, and developed a novel query execution layer to handle high-fanout queries. Our implementation not only reduces operational toil but also adds several self-healing properties to the system. Kaldb currently runs on Kubernetes at petabyte scale with improved reliability and performance. Join us to learn how Kaldb can help you overcome the challenges of running a petabyte-scale Lucene serving system. We'll share our experiences, best practices, and lessons learned in designing and operating a serverless Lucene serving system at this scale, and provide practical insights and techniques that you can use to optimize your own search systems.
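
The abstract describes keeping durability outside the indexing nodes by using a replayable write-ahead log, so that ephemeral nodes can be replaced without data loss. The following is a minimal sketch of that general idea only, not Kaldb's implementation: `WriteAheadLog`, `SnapshotStore`, and `EphemeralIndexer` are hypothetical names, an in-memory list stands in for Kafka, a plain object stands in for durable snapshot storage, and substring matching stands in for a Lucene index.

```python
from dataclasses import dataclass, field


@dataclass
class WriteAheadLog:
    """Hypothetical stand-in for a durable, replayable log (the abstract names Kafka)."""
    records: list = field(default_factory=list)

    def append(self, doc: str) -> int:
        self.records.append(doc)
        return len(self.records) - 1  # offset of the record just appended

    def read_from(self, offset: int) -> list:
        # Return (offset, record) pairs starting at the given offset.
        return list(enumerate(self.records))[offset:]


@dataclass
class SnapshotStore:
    """Hypothetical stand-in for durable storage that holds index snapshots."""
    docs: tuple = ()
    next_offset: int = 0  # first log offset NOT covered by the snapshot


class EphemeralIndexer:
    """Keeps only in-memory state; everything durable lives in the log or the store."""

    def __init__(self, log: WriteAheadLog, store: SnapshotStore):
        self.log = log
        self.store = store
        # Recovery: load the last snapshot, then replay the tail of the log.
        self.docs = list(store.docs)
        self.next_offset = store.next_offset
        self.catch_up()

    def catch_up(self) -> None:
        for offset, doc in self.log.read_from(self.next_offset):
            self.docs.append(doc)
            self.next_offset = offset + 1

    def snapshot(self) -> None:
        # Persist the indexed documents together with the offset they cover.
        self.store.docs = tuple(self.docs)
        self.store.next_offset = self.next_offset

    def search(self, term: str) -> list:
        # Assumption: naive substring match in place of a real Lucene query.
        return [doc for doc in self.docs if term in doc]


if __name__ == "__main__":
    log, store = WriteAheadLog(), SnapshotStore()
    log.append("error: disk full")
    log.append("info: started")
    node = EphemeralIndexer(log, store)
    node.snapshot()
    log.append("error: timeout")  # written after the snapshot was taken
    del node                      # the node disappears before indexing it
    replacement = EphemeralIndexer(log, store)
    print(replacement.search("error"))
```

In this sketch the replacement node rebuilds its state from the snapshot plus the log tail, so losing a node loses no acknowledged writes. A real system would additionally need log retention, offset commits, and snapshot atomicity, none of which are modeled here.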