How we isolate streaming ingest from search using RocksDB

Plain Schwarz

Canadi, Igor

Formal Metadata

Title

How we isolate streaming ingest from search using RocksDB

Title of Series

Berlin Buzzwords 2024

Number of Parts

Author

Canadi, Igor

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/70227 (DOI)

Publisher

Plain Schwarz

Release Date

2024

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

In this presentation, we describe how we disaggregated streaming ingest compute from query compute for real-time applications. Until now, real-time architectures have run streaming data ingestion and query processing within the same cluster. This co-location, while beneficial for fresh data access, often leads to competition for compute resources, especially during spikes in either ingestion or query activity. Strategies for mitigating compute contention exist, including replication and overprovisioning clusters, but they do not fully address the issue. We describe how we built a new cloud architecture using RocksDB, a key-value store with a log-structured merge-tree architecture. The design disaggregates compute from storage, enabling simultaneous querying of shared real-time data across multiple clusters. The architecture also separates streaming ingest compute from query compute by replicating the in-memory state of the RocksDB memtable across different compute clusters, so that the latest data is accessible within single-digit milliseconds. Learn how this architecture is being used for search, real-time analytics and AI-powered search applications.
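The memtable replication idea in the abstract can be illustrated with a toy model. The sketch below is not RocksDB's API and not the speakers' implementation: every class and method name (`SharedStorage`, `Replica`, `IngestNode`, `apply`, `flush`) is hypothetical, plain Python dicts stand in for the memtable and for flushed files, and replication is a direct method call rather than a network stream. It also assumes that only the ingest node performs flushes, with query nodes merely dropping their in-memory copy afterwards.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedStorage:
    """Stand-in for shared storage that holds flushed, immutable files."""
    files: list = field(default_factory=list)  # oldest first


@dataclass
class Replica:
    """A compute node with its own in-memory memtable over shared storage."""
    storage: SharedStorage
    memtable: dict = field(default_factory=dict)
    applied_seq: int = 0

    def apply(self, seq: int, key: str, value: str) -> None:
        # Apply a replicated write; skip anything already applied.
        if seq <= self.applied_seq:
            return
        self.memtable[key] = value
        self.applied_seq = seq

    def get(self, key: str) -> Optional[str]:
        # Newest data first: memtable, then flushed files from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for flushed in reversed(self.storage.files):
            if key in flushed:
                return flushed[key]
        return None


@dataclass
class IngestNode(Replica):
    """The only node that accepts writes and flushes (an assumption of this sketch)."""
    followers: list = field(default_factory=list)

    def put(self, key: str, value: str) -> None:
        seq = self.applied_seq + 1
        self.apply(seq, key, value)
        # Ship the same logical write to every query node's memtable.
        for follower in self.followers:
            follower.apply(seq, key, value)

    def flush(self) -> None:
        # Persist the memtable once; followers drop their in-memory copy.
        self.storage.files.append(dict(self.memtable))
        self.memtable.clear()
        for follower in self.followers:
            follower.memtable.clear()


storage = SharedStorage()
query_node = Replica(storage)
ingest_node = IngestNode(storage, followers=[query_node])

ingest_node.put("doc:1", "hello")
fresh = query_node.get("doc:1")      # served from the replicated memtable
ingest_node.flush()
persisted = query_node.get("doc:1")  # served from shared storage
```

In this model the query node sees a write as soon as it is replicated, before any flush, while the work of writing files to shared storage stays on the ingest node.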