No More Silos: Integrating Databases and Apache Kafka

Cite

Related Material

Plain Schwarz

Moffatt, Robin

Formal Metadata

Title

No More Silos: Integrating Databases and Apache Kafka

Title of Series

Berlin Buzzwords 2021

Number of Parts

Author

Moffatt, Robin

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/67352 (DOI)

Publisher

Plain Schwarz

Release Date

2021

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Companies new and old are all recognising the importance of a low-latency, scalable, fault-tolerant data backbone, in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, which enables low latency analytics, event-driven architectures and the population of multiple downstream systems. In this talk, we’ll look at one of the most common integration requirements - connecting databases to Kafka. We’ll consider the concept that all data is a stream of events, including that residing within a database. We’ll look at why we’d want to stream data from a database, including driving applications in Kafka from events upstream. We’ll discuss the different methods for connecting databases to Kafka, and the pros and cons of each. Techniques including Change-Data-Capture (CDC) and Kafka Connect will be covered, as well as an exploration of the power of ksqlDB for performing transformations such as joins on the inbound data. Attendees of this talk will learn: - why events, not just state, matter - the difference between log-based CDC and query-based CDC - how to chose which CDC approach to use