Apache Spark vs cloud-native SQL engines

Formal Metadata

Title
Apache Spark vs cloud-native SQL engines
Title of Series
Number of Parts
141
Author
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
SQL and cloud data warehouses (DWH) are currently extremely popular, and for good reason: their ease of use makes them a great fit for dashboarding and business intelligence (BI) use cases. However, their combination might not be the best choice for every problem. More precisely, business-critical data pipelines with high complexity might be better suited to frameworks such as Apache Spark, which benefit greatly from tight integration with general-purpose languages like Python (e.g., PySpark). Expect an opinionated comparison between Apache Spark and seemingly easier-to-use cloud-native SQL engines. By the end of this talk, you will be challenged to think about why they are complementary and when each is justified.