
From Zero to Portability

Formal Metadata

Title: From Zero to Portability
Subtitle: Apache Beam's Journey to Cross-Language Data Processing
Title of Series:
Number of Parts: 561
Author:
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers:
Publisher:
Release Date:
Language:

Content Metadata

Subject Area:
Genre:
Abstract:
Apache Beam is a programming model for composing parallel and distributed data processing jobs. Once composed, these jobs run on various execution engines such as Apache Flink, Apache Spark, or Google Cloud Dataflow. But Apache Beam's vision goes beyond running on multiple execution engines. Like many other Apache projects, Beam first used Java as its API language. Unsatisfied with the status quo, Beam developers launched the portability project to enable other languages to run with Beam. Currently, Beam has Java, Python, and Go APIs. That means users are not restricted to the Java ecosystem but can use their favorite Python libraries, such as NumPy or TensorFlow, with Apache Beam. Ultimately, these languages won't just coexist in Apache Beam; they will complement each other in cross-language data processing jobs. For example, reading from Kafka can be done with the Java connector while the data is afterwards processed in Python. In this talk we will learn how it is possible to support multiple languages and why it might be a good idea to combine them in data processing jobs.
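
To make the cross-language example concrete, here is a minimal sketch of such a pipeline using Beam's Python SDK. The ReadFromKafka transform is implemented by Beam's Java Kafka connector and made available to Python through the portability layer; the runner choice, broker address, topic name, and processing steps below are illustrative assumptions, not taken from the talk.

    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions

    # Placeholder settings: runner, broker address, and topic are assumptions.
    options = PipelineOptions(["--runner=FlinkRunner"])

    with beam.Pipeline(options=options) as pipeline:
        (pipeline
         # Cross-language step: ReadFromKafka is backed by the Java Kafka
         # connector and expanded via a Java expansion service.
         | "ReadFromKafka" >> ReadFromKafka(
               consumer_config={"bootstrap.servers": "localhost:9092"},
               topics=["events"])
         # From here on, records (key/value byte pairs) are processed
         # with ordinary Python code.
         | "DecodeValues" >> beam.Map(lambda kv: kv[1].decode("utf-8"))
         | "PrintRecords" >> beam.Map(print))

Under the portability model, the Python SDK asks the expansion service to expand ReadFromKafka, so the connector executes in a Java environment while the Map steps run in Python workers of the same job.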
Apache Beam is a programming model for composing parallel and distributed data processing jobs. As many other Apache projects, Beam first used Java as its API language. Unsatisfied with the status quo, Beam developers launched the portability project to enable other languages to run with Beam. Currently, Beam has a Java, Python, and a Go API. Ultimately, these languages won't just coexist in Apache Beam, but they will complement each other in cross-language data processing jobs. In this talk we will learn how it is possible to support multiple languages and why it might be a good idea to combine these languages in data processing jobs. Apache Beam is a programming model for composing parallel and distributed data processing jobs. Once composed, these jobs run on various execution engines like Apache Flink, Apache Spark, or Google Cloud Dataflow. But Apache Beam's vision goes beyond just running on multiple execution engines. As many other Apache projects, Beam first used Java as its API language. Unsatisfied with the status quo, Beam developers launched the portability project to enable other languages to run with Beam. Currently, Beam has a Java, Python, and a Go API. That means users are not restricted to the Java ecosystem but can use their favorite Python libraries like Numpy or Tensorflow with Apache Beam. Ultimately, these languages won't just coexist in Apache Beam, but they will complement each other in cross-language data processing jobs. For example, reading from Kafka can be done with the Java connector but the data can afterwards be processed in Python. In this talk we will learn how it is possible to support multiple languages and why it might be a good idea to combine these languages in data processing jobs.