Should you read Kafka as a stream or in batch? Should you even care? - TIB AV-Portal

Related Material

Nadler, Ido; Dubrovsky, Opher

Formal Metadata

Title

Should you read Kafka as a stream or in batch? Should you even care?

Title of Series

Berlin Buzzwords 2021

Number of Parts

69

Author

Dubrovsky, Opher

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/67322 (DOI)

Publisher

Release Date

Language

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Should you consume Kafka as a stream or in batch? When should you choose each one? Which is more efficient and cost-effective? Should you even care? In this talk we’ll give you the tools and metrics to decide which solution to apply when, and show you a real-life example with cost and time comparisons. To highlight the differences, we’ll dive into a project of ours: transitioning from reading Kafka as a stream to reading it in batch. By turning conventional thinking on its head and reading our multi-petabyte Kafka stream in batch using Spark and Airflow, we achieved a huge cost reduction of 65% while also getting a more scalable and resilient solution. Using the lessons and statistics we gained, we’ll explore the tradeoffs and give you the metrics and intuition you need to make such decisions yourself.

We’ll cover:
- Costs of processing in stream compared to batch
- Scaling up for bursts and reprocessing
- Making the tradeoff between wait times and costs
- Recovering from outages
- And much more…