#bbuzz: Delivering stream data reliably with Pravega

Cite

Related Material

Plain Schwarz

Junqueira, Flavio

Formal Metadata

Title

#bbuzz: Delivering stream data reliably with Pravega

Title of Series

Berlin Buzzwords 2020

Number of Parts

Author

Junqueira, Flavio

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/68789 (DOI)

Publisher

Plain Schwarz

Release Date

2020

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Pravega is storage for streams. Pravega exposes stream as a core storage primitive, which enables applications continuously generating data (e.g., from sensors, end users, servers, or cameras) to ingest and store such data permanently. Applications that consume stream data from Pravega are able to access the data through the same API, independent of whether it is tailing the stream or reprocessing historical data. Pravega has some unique features such as the ability of storing an unbounded amount data per stream, while guaranteeing strong delivery semantics with transactions and scaling according to changes to the workload. Pravega streams build on segments: append-only sequences of bytes that can be open or sealed. Segments are indivisible units that form streams and that enable important features. Segments enable Pravega to spread the load of streams across servers, to scale streams, to implement transactions efficiently, and to implement state replication. In this presentation, we overview the architecture of Pravega and its core features. To enable reliable ingestion of events, we discuss how to use transactions along with state synchronization to guarantee that events from memoryful sources are ingested exactly once. Such a property is important to guarantee correctness when processing the data. Pravega is an open-source project hosted on GitHub, and aims to be a community-driven project. It is under active development and encourages interested contributors to check it out and interact with the developers.