Hash Joins: Past, Present and Future
Formal Metadata
Title: Hash Joins: Past, Present and Future
Number of Parts: 19
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifier: 10.5446/48964 (DOI)
PGCon 2017, 16 / 19
Transcript: English (auto-generated)
00:06
Hi, everybody. My name's Thomas Munro. I work for EnterpriseDB, and I'm going to be talking about hash joins. Very briefly about me, the main thing I've been working on in the past few months is
00:23
a proposal to make hash joins parallel-aware. This talk is going to have a small component about my proposal, but also just generally about hash joins and their implementation in Postgres. I work on the database server team at EnterpriseDB, Robert Haas's team.
00:46
Cool. This is the format of the talk, just a bit of an introduction to joins, hash tables, and then simple hash joins, and then we'll get into the hairy details of multi-batch hash joins and then talk about parallelism. So basically, a bunch of people at IBM in the 60s and 70s invented relational algebra, and
01:04
the System R implementation sort of showed the world how to do a SQL database. In SQL, we have a whole bunch of ways of writing join queries, and I'm going to
01:23
go ahead and assume that you all know what these things all do. So we've got, these are all equi-joins, they're all something equals something. So they're all taking two relations, joining them together, and spitting out a new relation. And there's outer joins, inner joins, semi-joins here, where you just test whether
01:45
a matching row exists, and so on. So there's three basic strategies for executing joins in a relational database. Nested loops, almost from the definition of what a join is, sort of fall out of that.
02:05
You just walk through one relation, and for each one scan the other relation for a match. That could be via a sequential scan, or it could be via an index scan. Then there's merge joins, where both of the input relations are in the same order, or have been sorted if necessary, to put them in the same order.
02:21
And then you just walk through them sort of in sync, finding the matches. And then there's hash joins, which were the last to be discovered by people implementing databases. And that's, you build a hash table from one of the relations, and then walk through the other relation, probing the hash table to find matches.
02:44
So if you step back a bit and squint, you can see the hash join from my description is a little bit like having a nested loop with an in-memory index that you build on the fly for the inner relation. So a couple of things we can say about hash joins, they need a lot of RAM.
03:02
I mean, the basic idea, because you're building this temporary hash table in memory, you obviously need a bunch of memory. Whereas those other strategies don't necessarily need any RAM. RAM might be helpful if you're sorting for a merge join, but it's not necessary for index scans, for example.
03:22
So it was the invention of large RAM systems that came along a bit later in database history that led to hash joins being invented. But of course, when you have more RAM, that's also good for sorting. I think I've just lost a cable. No, I haven't. So the choice of join algorithm can, in some cases, be limited by the join type and join conditions.
03:44
Has anyone ever seen this error before? Actually, I believe somebody has something in the next commit fest to fix that particular problem, which is very cool. Okay, so I'm just gonna talk a little bit about hash tables as they exist in Postgres.
04:05
So there are at least three hash table implementations inside Postgres. There's the DynaHash, which is sort of a general purpose hash table used in back-end local memory in some cases and in shared memory in other cases. That's a chaining hash table, meaning there's a linked list of
04:24
pointers for the conflict resolution mechanism. Then there's SimpleHash, which Andres Freund added to the most recent release. And that is an open addressing system with a different conflict resolution system. And then the hash join operator has its own hash table
04:43
implementation. Why, you might ask, when we have two perfectly good general purpose ones. Well, the hash table that's used for hash joins is extremely simple. It's really nothing more than an array. And one of the things about this particular hash table is
05:01
that it has to deal with tuples that have the same key. Not just because of an unintentional hash collision, but because the tuples actually had the same key. So this is a hash table that's not like a Python dictionary or a C++ unordered_map or whatever, because multiple
05:21
copies of the key can finish up being inserted and you need them all there. So if you did use one of these other hash table implementations, you'd finish up having to do your own chaining to deal with the multiple tuples at each value. And if you have to do that anyway, what are you really getting from using a general purpose hash table?
05:42
There's a couple of other properties of this hash table that make this very simple array approach appropriate. One is that the only thing we ever do with this hash table is insert everything in one phase, then probe it in another, and then just throw the whole thing away. So I started out thinking, when I first started looking
06:01
at this code, why aren't we using the general purpose hash table, but then I came around to the view that this really is a case where just using an array does make a lot of sense. So in memory, it simply looks like an array where you hash your key somehow and produce a number and find the right
06:20
slot and there's a chain of tuples in each bucket. Another feature of this hash table, it's not really the hash table itself, but the way that the hash join operator deals with memory management is that it loads tuples into chunks of memory and that reduces the overhead associated with individual allocations of tuples.
06:43
Those chunks also provide a convenient way to iterate over all the tuples when we need to do a couple of different operations that I'll talk about in a minute. So, moving on to straightforward simple hash joins.
07:01
When you use EXPLAIN on a hash join, you see the two different relations, what we call the outer plan and the inner plan, and the inner plan is the one that's hashed. Usually that's the smaller of the two, because we're looking for something that's hopefully gonna fit in memory. So, at a very high level, the algorithm has at least
07:23
two phases, for an inner join you have the build phase where you load all the tuples from the inner relation into the hash table and then the probe phase where you scan the outer relation trying to find matches in the hash table. If it's an outer join, you also need to find unmatched
07:42
rows, so that third phase comes into play if it's a certain type of outer join, where we need to scan through all the tuples that are in the hash table in memory and look for unmatched rows. There's a couple of optimizations: we can skip scanning either the outer or the inner relation
08:04
under certain conditions, but obviously outer joins prevent that. So, the hash table consists of a certain number of buckets, it's just this array; at planning time we figure out what size that should be and we try to make sure that the load factor is one.
08:21
In earlier versions of Postgres we tried for a different number, and I think that tension is because ideally we'd actually like to have one bucket per distinct key value, not per tuple. But it's kind of hard to figure that out, I think.
08:43
So, the planner estimates the number of rows in the inner relation, and the hash table gets sized to the nearest power of two greater than the number of rows that's expected. And after loading the hash table, if that turns out to be too high, then we do a reasonably efficient thing
09:00
where we choose a new size if we need to, because we found out there were too many things in each bucket, and then we can just scan through all those chunks I talked about to reinsert all the tuples. So, here's an example where originally the planner thought there were gonna be 1024 or fewer keys
09:23
in this hash table, but it turned out that in order to meet its goal of a load factor of one or below, it really needed 2048. And I did that by tricking it, by writing an expression which is true for every row, i mod five is less than five, that's always true, but the planner looked at that and said, yeah, that's gonna produce a smaller number of rows.
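The sizing rule just described, the nearest power of two at or above the planner's row estimate, is easy to sketch (hypothetical helper name):

```python
def choose_nbuckets(estimated_rows):
    """Smallest power of two >= estimated_rows, for a load factor of about one."""
    n = 1
    while n < estimated_rows:
        n *= 2
    return n
```

With an estimate of 1024 or fewer keys you get 1024 buckets; once the executor sees the estimate was too low, it picks a new size with the same rule and reinserts the tuples by walking the chunks.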
09:42
So, cardinality estimations being wrong can lead to tuples having to be reinserted into a reallocated hash table. So, that's a very simple overview of straightforward hash joins when no batching's involved.
10:03
But in order to respect work_mem, and to make hash joins fit into the finite space that you set with work_mem, we need to use partitioning to reduce the amount of data that's loaded into memory at once.
10:21
The way we do that is by the planner estimating how many batches it needs to chop the inner relation into so that each one will fit in work_mem. So, this approach is known as the GRACE algorithm. I think GRACE was actually a particular database machine
10:40
developed in Japan, I think, that first tried this approach of having the planner estimate basically how many times to chop the input relation in half, and there's a slight refinement to that idea, which is called hybrid, which is that the very first partition or batch is loaded directly into the hash table,
11:01
so we avoid writing one of the batches out to disk. Now, it's possible that the planner could be wrong about how many batches it needs to chop the data into to stay under workmen. If that turns out to be the case, we have this adaptive approach,
11:22
which will increase the number of batches, and I'm gonna work through an example of that in a moment. There's an optimization whenever multi-batch hash joins are involved, which tries to move the most common values from the outer relation and make sure that they get processed at the same time as the first partition
11:42
so that we can avoid having to write those out to disk, so this is an IO reduction technique, and I'm gonna work through an example of that in just a moment. So, when the hash join begins, if the planner has determined that it's gonna need to chop the inner relation into batches
12:00
in order to fit it into memory, the build phase sucks in all the tuples from the inner relation and then fires them to one of, in this case, five different places. It's decided in this case that it wants to have four partitions or batches,
12:20
so it loads all of those tuples that hash to partition zero into the hash table in memory, but if it sees any of the most common values from the outer relation, then it'll put them into this special side table, the skew hash table. Everything else goes into one of these batch files, and these are files on disk.
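A sketch of that build-phase routing, with hypothetical names, using hash modulo number-of-batches as the batch number (the real code carves bucket and batch numbers out of different bits of the hash value):

```python
def route_inner_tuple(hashval, key, tup, nbatches,
                      hash_table, skew_table, batch_files, outer_mcv_keys):
    """Send one inner tuple to the hash table, skew table, or a batch file."""
    if key in outer_mcv_keys:
        skew_table.append(tup)            # most common outer values stay in memory
        return
    batchno = hashval % nbatches
    if batchno == 0:
        hash_table.append(tup)            # batch zero is processed immediately
    else:
        batch_files[batchno].append(tup)  # everything else spills to "disk"
```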
12:45
After it's finished doing that, it then needs to probe, which means scanning the outer relation, pulling in every tuple, and first of all saying, well, which batch does this tuple need to go in? And if it's batch zero, then it can immediately probe
13:00
either the hash table or the skew hash table to see whether there's a matching tuple there, and that may lead to emitting a tuple from the hash join operator. Tuples from the outer relation that need to go in any of the other batches get written out
13:20
to a file on disk, so this operation is generating a ton of disk writes. So after it's finished processing batch zero, it now needs to process batch one, so the first thing to do is to load all the tuples that we wrote into batch one on the inner side into memory, and then we proceed,
13:42
and we do that for each batch. We then have another probe phase on the outer side. Now here's where it gets interesting. When we come to process batch two, in this case, while loading tuples in from the batch file on disk, we may actually discover that work_mem is full. So what's happened here is that the planner decided
14:02
we were gonna have four batches, but the executor discovered, when it got to partition two in this case, that that was insufficient: it hasn't got enough memory, and it really wants to try and avoid using more than work_mem. So in that case, the adaptive partitioning algorithm kicks in
14:24
and decides to double the number of batches, so here you can see that on both sides, conceptually, every single batch has been split in two. So when we have one of these, I call it a shrink operation, that word doesn't appear in the source code anywhere, but the contents of the hash table,
14:41
which at this point in time has crashed into the work_mem limit, so it's a large amount of data sitting in the hash table in memory, gets hopefully split in half. The goal is to split it in half, because when we double the number of partitions, hopefully half of the data that's in memory will be able to stay there,
15:01
but the other half of the data, we can write out to disk to the new partition we created when we split all the existing batches. Now it's possible that doubling the number of batches has no effect on how much memory we need, because it's possible that all the tuples might hash to the same batch, in which case we have a problem on our hands,
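The doubling-and-splitting step can be sketched like this (hypothetical names; note that with batch = hash mod nbatches, a tuple in batch b can only land in b or b plus the old batch count after doubling, so tuples only ever move to later batches):

```python
def increase_batches(in_memory, nbatches, cur_batch, batch_files):
    """Double the batch count and split the current in-memory batch."""
    nbatches *= 2
    kept = []
    for hashval, tup in in_memory:
        if hashval % nbatches == cur_batch:
            kept.append((hashval, tup))   # still belongs here: stays in memory
        else:
            # Moved forward: written out to the newly created batch's file.
            batch_files.setdefault(hashval % nbatches, []).append((hashval, tup))
    return kept, nbatches
```

If every tuple happens to stay in the current batch, `kept` doesn't shrink at all, which is exactly the case where splitting stops helping.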
15:21
which I'll talk about in a moment. So then after we've expanded the number of batches, and we reach the probe phase again, and we start reading in tuples that we previously wrote out to outer file, the outer file for batch two, we might encounter some tuples that we can immediately,
15:42
that still belong in batch two, so we can use them to probe the hash table that's in memory, but we'll also encounter some that now need to be moved forward to a future batch that we've just created when we split all the batches, so it's fairly complicated. So I'm stepping back from all that. That gives terminology I made up
16:03
to describe these sort of four different ways that a hash join can go. We have the optimal case, which is where the planner thinks that the hash table's gonna fit in memory, and when we come to execute the hash join, we find that that's true. That's the optimal case when the hash join
16:20
is doing a very good job. Then there's the good case, which is where the planner thinks that work_mem is not enough to hold the inner relation in memory, so it decides to use, say, four batches, like in that previous example, and then when we come to execute the hash join, the executor finds that that's true.
16:42
It's still pretty good, because if that data didn't fit in memory, some alternative plan, like a merge join, would probably have a similar problem: if the data doesn't fit in memory and it needed to sort it, it would need to start doing some disk IO too, so it's still a good plan.
17:02
When things start getting bad is when the planner thought it had an optimal or a good case on its hands, but when we come to execute the hash join, the executor discovers that, like in the case we worked through with pictures, work_mem is not enough.
17:21
So now we need to start dumping tuples out to disk and it starts increasing the amount of disk IO considerably, potentially. And then there's a special case of bad, which I call ugly, where it's reasonably unlikely, but it's possible and it's certainly easy to contrive,
17:42
a case where no amount of repartitioning is gonna help and the data is never gonna fit in memory. In that case, we have no choice currently but to stop respecting work_mem. And I don't know if you've ever seen that before. That might happen if you're unlucky. Of course, in many common cases,
18:01
it's still just gonna work and no one will ever notice, but from time to time on the mailing list, you'll see someone complaining about this and it's because of this kind of problem. Yeah, yes, okay. So just really quickly,
18:21
here are some examples of the optimal, good, bad and ugly cases. In the optimal case here, I don't know if you can see that it's in bold there, the planner said that there would be one batch, and at execution time that was true. We set work_mem to 64 megabytes and the memory usage turned out to be 43 megabytes.
18:41
That was a perfectly good hash join. Everything went fine. And here we have the good case. In this case, the planner determined that 64 batches would be enough to stay underneath work_mem. I set work_mem to one megabyte here and the hash join ran perfectly fine. It used 64 batches and it never used more than a megabyte of memory. So all good.
19:03
Here's the bad case. The planner determined that one batch would be sufficient, but it turned out at runtime that the executor needed to split many times to reach 64 batches.
19:20
But it still managed to stay at the goal: a target work_mem of one megabyte, and memory usage was 808 kilobytes. That's the peak memory usage. And then in the ugly case, I actually made a really badly skewed table here, which is called awkwardly skewed, and I made sure that the stats would be wrong. There's various ways that the stats can be wrong.
19:41
In this case, I just altered the table to say, don't ever vacuum this thing, please, and then I did some tricks to make sure that the stats would be wrong. But the stats could easily be wrong for many reasons that are not contrived. And in this case, the planner determined that one batch would be sufficient. At execution time, it repartitioned
20:03
and it observed that further repartitioning would not help, so it gave up, threw its hands in the air and then continued the hash join. And even though I said we had one megabyte of work_mem, it managed to use 35 megabytes at peak. So obviously, in some case, it could decide to use 48 gigabytes and we only have one, and kaboom.
20:22
I've got an open problems section at the end where I'll talk about some potential solutions to that problem. Okay, so now I'm gonna talk about parallel hash join. But first, let's do a quick recap of the relevant bits of parallel query for this problem space.
20:43
So parallel query is based on the idea of partial plans. Partial plans are query plans that can be run by many workers in parallel, and each worker will see a fraction of the total results. But together,
21:01
they'll generate all of the results. So usually at the bottom, the sort of leaf nodes in a parallel query plan, you've got some source of parallelism, which is usually a parallel sequential scan or a parallel index scan. In future, there could be more ways of producing parallelism, but that's kind of the source of parallelism.
21:22
And at the moment, the granularity is always pages. And somewhere above that, there's gonna be a Gather or Gather Merge node, and everything in between that node and the parallel scan nodes is a partial plan.
21:41
So hash joins can be involved in parallel queries at the moment, in 9.6 and 10. But those hash joins are parallel-oblivious, meaning that they're not doing anything special; they don't know anything about parallelism. I mean, the only reason it works is because the outer relation they're seeing is partial,
22:03
as in it's receiving a fraction of the total set of tuples. The hash join node is completely unaware of that, and it simply builds a copy of the hash table from the inner side in each worker. And the planner has proven that that's safe. And there's various cases where it wouldn't be safe.
22:21
For example, I said here, problem two, since there are multiple hash tables, which are all kind of copies of the same hash table, certain kinds of outer joins can't be run this way because there's many different sets of the match flags, for example. But the main problem with this is that each worker is gonna produce a copy of the hash table.
22:41
So it's run a plan that does a whole bunch of work, and then it's used up a whole lot of memory holding the results in a hash table. So I've seen quite a few slides today that have an Amdahl's law slide. I decided to make an Amdahl's outlaw slide, because at first this seems,
23:03
you know, running the probe phase in parallel but not running the build phase in parallel seems like a straightforward case of Amdahl's law. Like you've got the bit that you made faster by adding more concurrency, you know, parallelism, and the bit that you can't make faster. But actually it's worse than that
23:21
because running n copies of the same plan actually creates damage. You know, it actually generates a ton of contention on all kinds of resources, buffer locks and so on, and also uses up a ton of memory for nothing, you know, all these copied hash tables.
23:40
It's actually okay if the inner plan is very small and the resulting hash table's very small, you know, it doesn't use many resources. But the interesting cases here are ones where the hash table would be large. And none of these externalities are included in our costing model. So I have a picture of this polar bear on, you know, the melting Arctic Circle situation,
24:03
because, like, economists talk about externalities, you know, you're actually damaging your environment by doing this, right? So, the basic approaches to solving this problem,
24:20
making the hash join completely parallel and not just parallel in the probe phase: first, partition-wise joins, and we have a project in development for that. My colleague Ashutosh Bapat is working on that with others. So that idea is basically just saying, if you've got a partitioning scheme
24:40
on both of the tables involved and the partitioning schemes match, then you can simply, you know, plan a parallel-oblivious hash join on each partition and everything will be just fine with no communication. And that's great and we will have that, but it will only help you if your partitioning scheme is set up just right.
25:04
Another approach is dynamic repartitioning and there are a whole bunch of different strategies for that. One strategy is that it might be that one side of your hash join is suitably partitioned and the other side can be repartitioned on the fly to match.
25:23
There's also the possibility that no partitioning is involved at all, so you need to repartition both sides to match. And then the third approach is using a shared hash table; in the literature, they refer to this as the no-partition hash table. And I have a proposal to do that.
25:40
So how do we choose between all these different approaches? Well, you don't actually have to choose because certainly the first one, partition-wise joins, we should definitely have and we will have. It's just that that can't deal with all of your join requirements. It's not a completely general solution. The repartitioning schemes are really interesting.
26:02
So there's a state-of-the-art cache-aware repartitioning algorithm called radix join; there's a lot of stuff written about it and there are systems out there doing it. It does a really expensive multi-pass partitioning phase before it even begins the build phase of the hash join.
26:21
And the goal of that is to minimize cache misses during probing. So it actually knows about the size of your L1 cache and your L2 cache and knows things about NUMA nodes and all that kind of stuff. And it does a ton of work up front. And it makes the probe phase so cheap that it manages to win back that time,
26:40
which is really interesting that that really says something about how expensive cache misses are. And hash tables, of course, are prone to cache misses because you're randomly accessing memory all over the place and there could be gigabytes of hash table in memory and you're, you know. I think you just killed your mic. Ooh, can you hear me now?
27:02
I think the battery might have died, yes. Can you hear me now? I can hear you. You can, okay. Repeat the question, okay. So Peter asked if it would be fair to say
27:22
that this is a trade-off between memory bandwidth and compute bandwidth. Absolutely it is, yes. Are both microphones working now? Can you guys hear me? No. Okay, I'll speak loud. Okay, so dynamic repartitioning hash joins
27:45
are really interesting. They're also extremely complicated. And several researchers, I've got some references for this if you want to look this up later, which will be at the end. Several researchers have claimed and shown that just a really simple shared hash table based system
28:02
is usually about as good in many interesting cases and it can be better in skewed cases. And the funny thing about data is that your data is skewed. So researchers who are working with non-skewed data are not necessarily, I mean, it's very interesting, but you have to consider skewed data as a very common case.
28:23
And there are people arguing with each other right now, trading papers saying the other guy is wrong. This is quite a hot topic. But one thing that I figured out from reading about this is that the bar for beating a plain old big shared hash table is really high.
28:42
Like, in terms of engineering challenges, I don't think we can build a dynamic, state-of-the-art, cache-aware, hardware-aware repartitioning algorithm that would work anytime soon. In terms of communication between backends, in terms of all kinds of things that are non-portable,
29:03
well, portability is an issue. Yeah, so that is an interesting topic. I don't think I can do that, and I think it would take us years; in years to come, you know, it could be interesting to look into that.
29:22
I decided to take the advice of a number of researchers who say, if you don't have a partitioning phase, you don't have to do that work, and in many common and interesting cases it works out better anyway. So, yeah,
29:44
simple repartitioning algorithms always lose. Well, we already have a simple repartitioning algorithm, which is the one we do for batching, right? So if you don't have enough work_mem, we already do this repartitioning. People who have built systems like that, well, the research papers that I've got links to at the end,
30:05
see, those types of systems aren't actually trying to reduce cache misses. They're just trying to chop the data up arbitrarily so that it fits, right? While maintaining the property that the partitions are disjoint on both sides
30:21
so that the logic still works and you get the right answer, but they're not trying to reduce the cache misses. If you have a partitioning phase, but you don't get anything back in terms of cache hits, then you'll lose. That seems to be the economic trade-off, the way it works.
30:57
Right, so, yeah, so Peter's question was,
31:01
could we imagine a future where we would want to do the radix? Eventually we'll have to as hardware advances. Okay, so it's certainly possible, yeah. I mean, one of the, I don't think I've read anywhere near as much of the database literature as you have, Peter,
31:22
but I think that from what I have read, one of the factors in this whole thing is the constantly changing economics, like, for example, the sort versus hash debate, which has been raging since about 1980 or something, whether it's better to sort stuff into a merge join or do a hash join. As far as I know, hash joins are basically in the lead,
31:41
but there's always a lot of papers saying that, you know, in about two years' time, they'll have SIMD vectorization that's this wide and then we'll win. But we just have to wait and see how that pans out. Yeah. Okay, so, yeah, my proposal is the relatively simple
32:01
approach of just sharing the hash table. Emphasis on relative. Relatively simple, yeah. But there have been a lot of complications and the actual shared hash table hash join, I got working really quickly, but then there were so many sub-problems and special cases and things
32:21
that I spent a lot of time on. So the basic idea is to load the hash table into shared memory, into DSM segments, DSM segments being Postgres's abstraction of basically memory-mapped files or something like that, something equivalent on each platform, with the property that you can always make more
32:42
or ask the operating system for more, unlike the traditional Postgres shared memory, which is fixed at startup. And also, you can't naively exchange pointers, because the memory might be mapped at a different address in each backend process. Ancient decisions in Postgres's design history that we code around.
33:03
Okay, so the basic idea is that during the build phase, we insert stuff into buckets using compare-and-swap. So there's no lock partitioning or anything like that; it's just compare-and-swap to insert stuff. So hopefully you're inserting different keys and that's going to work nicely, because you'll just be
33:20
doing compare-and-swap operations on different buckets. However, between the build and the probe phase, you have to wait for all workers to finish building the hash table. Nobody can begin probing until every tuple's been loaded, right? Otherwise the answer will be wrong. So, unlike those partitioning systems,
33:41
we have to introduce this wait point, a synchronization point, at the end of build. And that's done using a barrier IPC mechanism, which was one of the things I had to write to get this thing going. Now, the competing repartitioning systems also do a ton of communication and waiting for each other and stuff as well.
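Python has no real atomic instructions, but the lock-free insertion idea can be modelled with a pretend compare-and-swap helper (everything here is illustrative, not Postgres's actual C code):

```python
class Node:
    """One tuple in a shared bucket chain."""
    def __init__(self, tup, nxt):
        self.tup, self.nxt = tup, nxt

def compare_and_swap(buckets, i, expected, new):
    """Pretend-atomic CAS: install `new` only if the slot still holds `expected`."""
    if buckets[i] is expected:
        buckets[i] = new
        return True
    return False

def shared_insert(buckets, i, tup):
    """Push a tuple onto the head of bucket i, retrying if we lose a race."""
    while True:
        head = buckets[i]                         # read the current chain head
        if compare_and_swap(buckets, i, head, Node(tup, head)):
            return                                # otherwise another worker won; retry
```

Because each insertion only touches one bucket head, workers inserting different keys mostly hit different buckets and rarely retry.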
34:01
So it's not as though they're communication-free: once they finally get to the build phase, they can run the build and then the probe without any communication, hooray, that's great, but they had to do a lot of communication to get to that point. So adding one synchronization point between build and probe doesn't seem to be that bad.
34:22
And certainly in practice, I can't really measure any effect from that. So some of the things I had to build to get this thing going were the shared memory allocator. So that's the thing that can create all these DSM segments, these MMAP files, essentially, and allocate space and manage the space within them.
34:43
That's been committed into Postgres 10 and is also used by the parallel bitmap heap scan. Then I had to deal with shared temporary files, and I still haven't quite finished arguing about those. And that involves quite a bit of discussion
35:01
with Peter Geoghegan, whose parallel CREATE INDEX can also benefit from shared temporary files. Shared tuple stores, a shared record typmod registry, and a couple of other bits and pieces. So here's a really simple example that just tries to show the hash join operator.
35:21
I'm not showing TPC-H results or anything like that, this is just a very simple three-join example. And it's doing an aggregate, so there isn't really any processing above the Gather node that you have to worry about. I'm just trying to capture the raw hash join speed on a very simple table that has about 10 million rows; I think they're integers.
35:42
So on the x-axis you can see the number of workers, that is, the number of processes in addition to the backend that you're logged into and running queries on. So where it says zero, there's one process; where it says one, there's two processes; and so on. And speedup is shown on the y-axis. Now, in the unpatched code,
36:03
so this is basically Postgres 10, the green line, you can see that it's not actually getting much speedup as we add more workers. Now, thinking back to the Amdahl's law thing, you've got the build phase, which we know is run as a complete copy
36:21
in each backend in master, but the probe phase should be run in parallel. So you'd expect this line to be going up and it isn't really going up, is it? I mean, it's going up a little bit at one and two, but then it sort of flattens out. So on my external costs slide from a few slides back,
36:42
I think what we're seeing here is that even though the increased parallelism is making the probe phase run faster, it still sucks overall, because as you add more workers, each worker is building another one of these gigantic hash tables
37:01
and thrashing the page cache and using RAM and all those things. On the other hand, if you look at the patched line, you can see the speedup is basically a line, but when we add each worker, we don't get another 100% speedup as you would hope.
37:26
And I haven't researched this fully yet, but I think what we're seeing there is just the underlying sequential scan not being as good as it could be. And there is a patch from David Rowley that would fix that, which I'm planning to test
37:41
with some simple examples like this, to see if I can get that line to become pure linear speedup. So here are the query plans. In the unpatched version, you can see, sorry, it cuts off at the end, but nothing interesting is on that side. You can see that there are three joins here.
38:01
Each one uses, let's call that 500 megabytes of memory. But because there are four workers, five processes in total, and three of those hash join nodes, or hash nodes, that actually leads to a total of 7.5 gigabytes. Did I add that up right?
38:20
That's a lot of memory. So yeah, we've actually asked this computer to generate and then insert 7.5 gigabytes of duplicate junk into memory, right? Whereas with the patched version, we see all of the hash nodes changed into parallel shared hash nodes in the query plan.
38:44
The amount of memory used by each of these hash nodes is still about 500 megabytes. It should actually be exactly the same number, but it's not, because in this patch I changed the recipe for how that memory is accounted for. So it's changed what EXPLAIN shows; the actual usage hasn't changed.
39:01
It's just made the answers more truthful, but it's basically about 500 megabytes. And so the total comes out to 1.5 gigabytes instead of 7.5 gigabytes. And I think you can see that the computer had to do an awful lot more work to deal with the 7.5 gigabytes of data. With the patch, you get the answer sooner.
39:24
Okay, so how are we doing for time here? Got five minutes.
39:44
So Ashutosh asks how it would affect things if we didn't have the aggregate there. The reason I put the aggregate there is because I wanted to measure just the hash join speed. I didn't wanna deal with the fact that there'd be a Gather node that has to spit out millions and millions of rows.
40:03
I think that would just pollute the measurements, make them harder to understand. That's the reason I put the count there. Okay, so with just a few minutes left, I'm gonna talk very quickly about some open problems. These are not to do with parallelism, just to do with hash joins in general.
40:22
These are just some things on my list, things that I think are interesting that we should fix or improve or contemplate for hash joins in general. The first one: I talked before about good, bad, and ugly cases. In the ugly case, Postgres will hopefully produce your answer, but it might not if you don't have enough RAM, right?
40:40
So I've thought of a couple of different ways of attacking the problem; I've mentioned this on the list and exchanged email with Tom Lane about it. The first is that you could switch to a sort-merge for the problematic partition when you realize it's never gonna fit in work_mem. It seems like a solution, but the problem is, firstly, that dynamically changing to a different query plan for one partition
41:02
is kind of weird. We don't have any other examples of that, and I don't know how you'd set it up; it would be a bit complicated, but that's just programming, right? It seems like it should work. Unfortunately, not every query that Postgres is happy to run as a hash join can be run as a merge join
41:20
because of missing operators, which is a bit of a pain. So we could just say we don't care about that, or that from now on we require all data types to have the right set of operators, but that's kind of hard to do. Or you could say, okay, your computer could still run out of memory if you don't have the right operators, but otherwise we'll switch to a sort-merge.
41:40
I don't know, maybe that's a solution. Another approach would be to invent a new algorithm for processing the batch in multiple passes. That has some complications for dealing with those matched bits, which you have to keep track of, and I haven't really figured out the answer to that yet. So yeah, a couple of different ideas there. Also, there's a closely related problem
42:01
with hash aggregates. You can quite easily write a query that will make your computer melt using hash aggregates. Okay, another thing which comes up every couple of years on the mailing list is: why can't we use Bloom filters to make things faster? So there are a couple of different places
42:20
where you could use them. We know that other databases do this. They plan a hash join, they build a hash table, and while doing that work they also do the computations required to make a relatively small Bloom filter for filtering out rows
42:40
on the outer side. Then they can push that filter all the way down to a scan, so that they can basically filter out some tuples closer to the data, closer to the disk, right? But no one's ever figured out how to do that and make it a win in Postgres. Peter Geoghegan here pointed me to a paper from a student, I think it was an undergrad project,
43:00
from a student in Singapore who wrote this, got it completely working, showed some nice graphs that appear to win in some cases, and then didn't send us the patch, and no one's heard anything since. So maybe I should email him and ask him to send us the patch. But anyway, that's an interesting case.
43:22
And there's another idea here, which again is from Peter Geoghegan. He and I spent a bit of time talking about all this stuff, because there is some code we can share between this and his project for parallel create index. That got us talking about various topics. And he had the idea that you could use Bloom filters
43:40
to filter the data that you write out to those outer relation batch files, preventing a bunch of disk I/O. Now it might be that Bloom filters in general can't speed things up enough for the optimal, happy, smaller hash joins, but they might still be able to save you from doing a whole bunch of really expensive disk I/O. So that's an interesting case to look into.
44:12
Right, yes, Peter says quite rightly that Bloom filters only save you cycles if they actually filter stuff out; otherwise they just cost you cycles for nothing.
44:26
Okay, so while looking into all this stuff, I figured out that in Postgres we usually produce left-deep join stacks,
44:40
but we can produce bushy plans and right-deep plans for stacks of joins. Typically you'll see left-deep plans. And one interesting thing that I figured out is that most databases produce left-deep plans. The original System R implementation only did left-deep plans,
45:03
and in Oracle, SQL Server, and DB2 you'll quite often see left-deep plans. But strangely, we seem to be the only ones who call the side that we hash the right-hand side for a hash join. Everyone else, as far as I can tell,
45:21
every database I've found information on, hashes the left relation, and they also call it the driving relation even though it's the hash table. For us it's the other way around, because we're thinking of the hash table as being like the inner side of a nested loop: it's the thing you're probing, so we call it the right-hand side, the inner side.
45:41
This is all very confusing, but everyone else is doing it the other way around, which means their left-deep plans have the property that when R and S are joined together in, say, Oracle or SQL Server, the join's output feeds the hash table above that join,
46:00
and then that feeds the hash table above that join, so you only ever need two hash tables in memory at once. Maybe most of the time you don't care about that, but if they're gigantic hash tables, like gigabytes, and there's a whole stack of them, you really don't wanna have them all in memory at the same time. Those guys only have two of them in memory at once; we have all of them in memory at the same time. I think that's kind of interesting.
46:21
I know that whole-query-plan memory usage is a gigantic can of worms. I've never discussed it publicly, but I've read the archives on this, and there are great big long conversations that never go anywhere, because it's really hard to constrain memory globally in a useful way; it's got all kinds of circular problems in it.
46:42
But this is thinking about total memory usage without having to actually model it. It's just saying, hey, we only need two hash tables at once, let's do it that way. That's something we could perhaps consider.
47:01
Yeah, so we don't even try to minimize this: by typically doing a left-deep plan, we finish up with all the hash tables in memory, maximizing peak memory usage. Here I'm talking about SQL Server and Oracle.
47:20
I think DB2 is probably the same, but I haven't looked into it. They would only have two at a time. Now, even if we did a right-deep join, which I believe we're capable of generating, although I haven't seen one recently, we still don't actually free each hash table as soon as we could, so you still wouldn't gain anything. But if you did both of those things, then you could have only two hash tables in memory at once.
47:41
That might enable things that would otherwise not be possible.
48:07
Okay.
48:23
That is very interesting and ties these two slides together; I didn't know they were related, thank you. So Jim says that the choice between left-deep and right-deep joins has an impact on whether or not you can push Bloom filters down to scans.
48:40
That is a very interesting thing which I now plan to look into, thank you. Okay, a much more low-level thing. We have these chunks of 32 kilobytes. One thing that's wrong with that is that on some operating systems, when you ask for 32 kilobytes plus a tiny header,
49:01
it actually eats 36 kilobytes, and that's 12.5% extra. Whatever it is, it's just a waste of memory. There may also be other reasons to increase the chunk size; I don't know, that's something to look into. Actually, that's all I have, and I've actually run over time.
49:21
But if anyone wants to ask any questions, please do. Oh, I've answered all of your questions? Good, thank you.