We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Accelerating TopK Queries

Formale Metadaten

Titel
Accelerating TopK Queries
Serientitel
Anzahl der Teile
64
Autor
Mitwirkende
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
This talk is about a recently added optimization to the Snowflake Cloud Database for speeding up TopK queries. In it the audience will learn about how we use runtime information to skip scanning data partitions that will not contribute to the result of the query. TopK Queries are SQL queries that contain both an order by and a limit clause. This is a very common query pattern. Examples are queries looking for the last k events (order by time descending), the k cheapest products (order by price ascending) or simply any limit query that is expected to deterministically return the same k rows (order by any unique identifier). In contrast to limit queries without an order by, the query engine cannot simply return the first k rows it has scanned. Instead it has to take all data into consideration to find the k smallest or largest rows. We can, however, come up with an upper / lower bound for values that can make it into the result after having seen at least k rows. This boundary can then be used to filter out rows that will not be a part of the result. Our optimization takes this a step further and additionally uses the boundary for pruning entire data partitions. This means that we compare it with metadata for these partitions and based on this comparison may decide to skip downloading and scanning a partition entirely if none of its contained rows can become a part of the result.