
Scaling your Kafka pipeline can be a pain - but it doesn’t have to be

Formal Metadata

Title
Scaling your Kafka pipeline can be a pain - but it doesn’t have to be
Title of Series
Number of Parts
56
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Kafka data pipeline maintenance can be painful. It usually comes with complicated and lengthy recovery processes, scaling difficulties, traffic ‘moodiness’, and latency issues after downtimes and outages. It doesn’t have to be that way! We’ll examine one of our multi-petabyte-scale Kafka pipelines and go over some of the pitfalls we’ve encountered. We’ll offer solutions that alleviate those problems and compare the before and after. We’ll then explain why some common-sense solutions do not work well and offer an improved, scalable and resilient way of processing your stream. We’ll cover:
- Costs of processing in stream compared to in batch
- Scaling out for bursts and reprocessing
- Making the tradeoff between wait times and costs
- Recovering from outages
- And much more…
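The abstract mentions scaling out for bursts and reprocessing; the usual starting point for that in Kafka is a consumer group, where every instance sharing a group id is assigned a subset of the topic's partitions. Below is a minimal sketch of such a consumer using the confluent-kafka Python client. The broker address, the "events" topic, the "pipeline-workers" group id, and the process() helper are all hypothetical placeholders, not details from the talk.

```python
from confluent_kafka import Consumer

# Hypothetical settings -- replace with your own brokers, topic, and group id.
conf = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "pipeline-workers",   # instances sharing this id split the partitions
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,      # commit only after a record is actually processed
}

consumer = Consumer(conf)
consumer.subscribe(["events"])

def process(payload: bytes) -> None:
    # Placeholder for the real per-record work (parse, transform, write downstream).
    pass

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue                  # no record arrived within the poll timeout
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        process(msg.value())
        consumer.commit(message=msg, asynchronous=False)
finally:
    consumer.close()
```

Because Kafka assigns each partition to at most one consumer in a group, adding instances like this only increases throughput up to the topic's partition count; beyond that point, handling bursts or reprocessing requires other approaches, which is part of the tradeoff space the talk promises to cover.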