Lessons in Building a Distributed Query Planner

Formal Metadata

Title
Lessons in Building a Distributed Query Planner
Series Title
Number of Parts
34
Author
License
CC Attribution 3.0 Unported:
You may use, modify, and reproduce the work or its content, in unchanged or modified form, and distribute it and make it publicly available for any legal purpose, provided that you credit the author/rights holder in the manner they have specified.
Identifiers
Publisher
Year of Publication
Language

Content Metadata

Subject Area
Genre
Abstract
Citus is a distributed database that scales out Postgres. Using the Postgres extension APIs, Citus distributes your tables across a cluster of machines and parallelizes SQL queries. This talk describes Citus' distributed query planner, drawing on our experience with distributed systems. We first show that the primary challenge for any distributed planner is a theoretical understanding of which computations are easy to scale. We provide three example SQL queries that demonstrate this challenge: (a) simple aggregate functions with groupings, (b) large table joins, and (c) complex subselects. We then explain why some queries are harder to scale than others. Next, we map these queries into relational algebra (the logical plan). We show that a simple abstraction, one that separates logical and physical planning, can minimize network I/O and parallelize all SQL queries in a small amount of code. We conclude by comparing query planning methods across different distributed databases.
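
To illustrate why case (a), a simple aggregate with a grouping, is among the easier computations to scale, here is a minimal Python sketch. It is not Citus code, and the function names `partial_avg` and `merge_avg` are hypothetical. It assumes each shard can compute a small per-group state (sum and count) locally, so that only those states, not the raw rows, need to cross the network before a coordinator merges them.

```python
from collections import defaultdict

def partial_avg(rows):
    """Worker-side step: compute per-group (sum, count) for one shard."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {k: (s, c) for k, (s, c) in acc.items()}

def merge_avg(partials):
    """Coordinator-side step: combine partial states, then finalize AVG."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            total[key][0] += s
            total[key][1] += c
    return {k: s / c for k, (s, c) in total.items()}

# Three hypothetical shards holding (group_key, value) rows.
shards = [
    [("a", 1.0), ("b", 4.0)],
    [("a", 3.0)],
    [("b", 6.0), ("b", 8.0)],
]

# Equivalent in spirit to: SELECT key, avg(value) FROM t GROUP BY key
result = merge_avg(partial_avg(shard) for shard in shards)
```

Averaging the per-shard averages directly would give the wrong answer when shards hold different row counts, which is why the sketch ships sum and count separately and divides only at the end.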