
Faster Spark SQL: Adaptive Query Execution in Spark v3

Formal Metadata

Title
Faster Spark SQL: Adaptive Query Execution in Spark v3
Title of Series
Number of Parts
637
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Over the years, there have been extensive efforts to improve Apache Spark SQL performance. This talk introduces the new Adaptive Query Execution (AQE) framework and shows how it can automatically improve user query performance. AQE uses runtime statistics, collected while a query executes, to dynamically adjust the remainder of Spark's execution plan. The talk covers the main features of AQE and gives examples of how it improves on the previous static query plans. Finally, we present the significant improvements we have seen on the TPC-DS benchmark with AQE. Examples of the new runtime optimizations include selecting the right join type (broadcast-hash-join vs. sort-merge-join), dealing with data skew, and automatically selecting the number of shuffle (reducer) partitions.
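The three runtime optimizations named in the abstract each correspond to Spark SQL configuration settings in Spark 3.x. The following is a minimal sketch, not taken from the talk, that collects those settings in a plain Python dictionary and renders them as spark-submit arguments. The configuration keys are the ones documented for Spark 3.x; the values, the dictionary name AQE_CONF, and the helper to_submit_flags are illustrative assumptions rather than tuned recommendations.

```python
# Spark 3.x settings related to Adaptive Query Execution (AQE).
# Keys come from the Spark SQL configuration documentation; the values
# below are illustrative assumptions, not tuned recommendations.
AQE_CONF = {
    # Master switch for AQE.
    "spark.sql.adaptive.enabled": "true",
    # Automatically coalesce small shuffle (reducer) partitions.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Advisory target size for a shuffle partition after coalescing.
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "64MB",
    # Split skewed partitions in sort-merge joins.
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # Size limit (in bytes) under which a join side may be broadcast;
    # with AQE this can be re-evaluated using runtime statistics.
    "spark.sql.autoBroadcastJoinThreshold": "10485760",
}


def to_submit_flags(conf):
    """Render a config dict as spark-submit `--conf key=value` arguments.

    Hypothetical helper for this sketch; it is not part of Spark.
    """
    flags = []
    for key, value in sorted(conf.items()):
        flags.extend(["--conf", f"{key}={value}"])
    return flags


if __name__ == "__main__":
    print(" ".join(to_submit_flags(AQE_CONF)))
```

In a PySpark application the same key/value pairs could instead be passed one at a time to SparkSession.builder.config(key, value) before the session is created.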