
Implementing Parallelism in PostgreSQL


Formal Metadata

Title: Implementing Parallelism in PostgreSQL
Author: Robert Haas
Number of Parts: 31
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Production Place: Ottawa, Canada

Content Metadata

Abstract
Where We Are Today, and What's On The Horizon

PostgreSQL's architecture is based heavily on the idea that each connection is served by a single backend process, but CPU core counts are rising much faster than CPU speeds, and large data sets can't be efficiently processed serially. Adding parallelism to PostgreSQL requires significant architectural changes to many areas of the system, including background workers, shared memory, memory allocation, locking, GUC, transactions, snapshots, and more. In this talk, I'll give an overview of the changes made to background workers in PostgreSQL 9.4 and the new dynamic shared memory facility, which I believe will form the foundations of parallelism in PostgreSQL, and discuss some lessons I learned while implementing these features. I'll also discuss what I believe is needed next: easy allocation of dynamic shared memory, state sharing between multiple backends, lock manager improvements, and parallel algorithms; and highlight what I believe to be the key challenges in each area.
Transcript: English (auto-generated)
Can you guys hear me in the back? Yeah? Good. All right, that clock still says it's a couple of minutes to 11, but my time disagrees, so I think we'll go ahead and get started.
My name is Robert Haas. I'm a major contributor and committer to PostgreSQL, and I work at EnterpriseDB, where I'm the chief architect for the database server. And today I'm going to talk to you about implementing parallelism in PostgreSQL, which is something that I've been working on for the last year or so. Yes, Steven? So the question is, is it done yet, and the answer is no. So please do feel free, as I go through this presentation, to jump in with questions.
If you want to have a protracted argument about my sanity, I'd be happy to have that with you after the session. So I guess the first question about parallelism is: why do we need parallelism? Or, in other words: Robert, why are you working on such an impossibly hard project, at which you will probably fail and be publicly humiliated and shamed?
And the answer to that question is: I really think PostgreSQL needs parallelism to remain relevant. I have had a feeling for a while now that single-threaded performance on CPUs is just not increasing nearly as quickly as it did when I was growing up, when you bought a new machine every two years and it blew the old one completely out of the water. And so I went and looked for some statistics to back that up, and I found a few. The first one there is from somebody's blog post: he went and looked at single-threaded SPECint and SPECfp benchmarks, which was kind of a funny concept, because there's implicit parallelism stuff going on there, but he did his best to separate it out. He found that between 1996 and 2004, performance increased by more than 50 percent year over year, which is a less-than-two-year doubling time, but between 2004 and 2012 it increased pretty steadily at 21 percent a year. Now, I personally would be pretty happy if all my things got 20 percent faster every year,
but it is definitely a slowing of the rate that we used to see, and I think on a lot of real-world tests you actually see less gain than that. Another example that I found was from the AnandTech website: they did some benchmarks when they upgraded their web server. Only one of those was a single-threaded benchmark, which right there tells you something about why we need parallelism. But for the single-threaded benchmark, I worked it out: it's an annualized rate of 7.7 percent per year, based on the dates those CPUs were initially released, a 39 percent improvement over roughly four and a half years.
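(As a quick check on that arithmetic, compounding the total gain over the elapsed time:

    \[ 1.39^{1/4.5} \approx 1.076 \]

which is roughly the 7.7 percent per year quoted; the exact figure depends on the precise release dates used.)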
So if your data is growing faster than that, and you're stuck in this single-threaded world, then you're getting slower with every year that passes by. I think the most compelling evidence I found of this trend is, you know, we sometimes tell people, or at least I sometimes tell people, when they're specking out PostgreSQL servers: don't buy more cores, buy faster cores. And it's good advice up to a point, but there's a growing problem in this area. I went to the Dell configuration tool, which I've always found to be a really great way of putting a machine together and understanding how much the components will cost, and on their high-end servers
you've got these two processor options, one at 2.8 GHz and one at 3.4 GHz, overall fairly similar. At 3.4 GHz you're paying $8,400 instead of $7,700, but that's okay, because it's faster. What's not so okay is that you've got six cores on the more expensive one with the higher clock speed and fifteen cores on the lower-speed one. And that's something where a lot of people are either going to make the wrong decision, or just say, you know, that's crazy, you can't expect me to pay that much more for my hardware if I want to run PostgreSQL. And really, how far ahead of the problem are you going to get by having a 3.4 GHz processor rather than a 2.8 GHz processor? That's just not enough of a difference to really get out from under this problem, especially when you consider that the memory and interconnect are the same speed. And I want to add here that I'm primarily concerned with things that take minutes or hours to run. If your query runs in a couple of seconds and you want it to run in fewer seconds, that's probably a problem that's worth solving, but it's not really the use case that I'm concerned with here, at least not at first.
So I think it'd be useful also to just do a brief overview of what kinds of things we might be able to parallelize in PostgreSQL. Thinking it over, I think there are kind of three main categories: two related to queries, and then one other group. So this is a plan that looks vaguely like something that might come out of EXPLAIN. We've got a sequential scan on foo, and then we're doing a hash join against bar: we scan bar and build a hash table, and then we do a hash join between the two. A simple plan, but imagine that the foo table is really, really big, so that this hash join is going to take a long time to run.
What could we do to make it faster? Well, one thing we could think about doing is pipelining the execution of these two operations. You've got a hash join and you've got a sequential scan, so why not have one process do the sequential scan and the other process do the hash join, and feed tuples from one to the other? This is what I'm calling inter-node parallelism, because it's parallelism between one node and another node: you do some nodes over here, and you do some nodes over there, and they talk to each other. Now, the other way of looking at this same plan tree is, you know, maybe we need more than two processes working on this plan tree. In that case, we need to actually do parallelism within single nodes, or intra-node parallelism. So, sequential scans, for example. I'm supposing that we have a filter condition here, which is the well-known something_complicated filter condition, which, as you guys all know, takes a lot of CPU time to execute, because it's so complicated. So we might not be I/O-bound doing that sequential scan on table foo; we might actually be CPU-bound. And if we are, or if our database fits in memory, which, when you can get a commodity server with a terabyte of memory, does not strain credulity too much,
we might like to actually have multiple backends processing this foo relation: different backends process different blocks, each backend applies the filter condition and the MVCC visibility checks to its own blocks, and passes the results up to the join. And hey, while we're at it, maybe we could parallelize the join, right? I mean, if we've got multiple streams of tuples coming in from somewhere, and we parcel those out to a series of backends that have access to a shared hash table that we built over the bar relation, each of those guys can do independent probes into that shared hash table and do part of the hash join separately. We don't even need locking on the hash table, because it's not changing once it's initially built. So there are a lot of interesting things that could be done here, and none of them work yet. The third category, moving outside the realm of parallel query, is parallel maintenance or DDL commands.
I think these may actually be good early candidates for parallelism, for two reasons. One is that it's easier, and this is a really hard project, so anything that's even a little bit easier is always nice. And two is that some of this stuff takes a really long time, and that, I think, is the case where you can get the most win from parallelism. So I'm thinking about things like CREATE INDEX, where you could perhaps parallelize the heap scan, perhaps parallelize the sort that we need to put the data in the right order; or VACUUM, where again, obviously, parallelizing the heap scan could be a very useful thing to do.
I was having dinner with Andres and Heikki the other night, and they were pointing out that once you finish the first scan through the heap, you have to go and scan all of the indexes, however many of them there may be, and remove the tuples that you've identified as dead from each of those indexes. Well, right now we do that one index, and then the next index, and then the next index, however many indexes there are. If you've got one index on your primary key, that's one thing, but if you've got ten indexes on the table, which people sometimes do, there's a really obvious opportunity for parallelism there, by having different workers scan different indexes. Now, maybe you could even make it more fine-grained than that and have multiple workers cooperate on a single index, but even just getting to one worker per index would help people who have indexed the heck out of their tables. So those are some of the kinds of things we can probably parallelize.
I'll just pause briefly here and ask if anyone wants to ask a question before I move on to technical ramblings. Yeah?
Right, so, right. This is something I was going to allude to later. The question is basically: if all my CPUs are already in use, and then I start doing parallelism of whatever kind, I might actually be making things worse for the system overall, even if performance for one particular process gets better; so what am I doing about that? And my answer is that I'm not doing anything about that yet, because I haven't gotten that far. But it is a really good question, and I'll allude to it briefly a little bit later on. Yeah?
Right, so we're talking about CLUSTER and VACUUM FULL. You know, CLUSTER, and in some ways VACUUM FULL, are very similar to these commands, because once again you have the same opportunities for parallelizing heap scans and sorting. The issues are a little bit different in every case, but yeah, I think once we get the ability to parallelize some of these common operations, we'll probably go through a period where we try to plumb that infrastructure into various different parts of the system, which will no doubt be a
barrel of laughs. All right, so I want to talk a little bit about architecture, and some of the architectural decisions that I've made early in the project. Obviously, I'm not saying that I've made those decisions for the project, because as much as I might wish that I had that power, I totally don't. I'm talking about the decisions that I made in terms of what I was going to work on and how I was going to pursue this. So if people disagree with these decisions, that would be great to hear about; maybe after the talk would be better than during the talk, just in the interest of time. I didn't bring any kind of shielding device, so if people feel the urge to throw things at me when they hear some of these things go by, I would sure appreciate it if that could be something that you would refrain from.
So, the first decision I made is: we're gonna use processes, or at least I'm gonna try to use processes for this, not threads. People often say, oh, you know, it would be so much easier for you to do parallelism if you were using a threaded model rather than a process model, and they might be right, because there are certainly plenty of challenges doing it this way. But I don't really feel like rewriting the entire backend and having it become totally unstable and not work anymore, so I just can't see that there's any joy there. I mean, if you start multiple threads inside an individual Postgres backend, the good news is the rest of the system, all the stuff that's in shared memory, doesn't have to know. The bad news is almost every piece of backend-local code and every backend-local data structure has to know, and there are actually a lot more of those than there are things in shared memory. So, changing all of that around: we could have a great argument over alcohol about whether that is a good thing to do on general principle, but I came rapidly to the conclusion, I think it took about six microseconds, that if I embarked on that approach with this project I would be dooming it to certain failure. So I decided to pick the approach that was highly likely to fail, rather than the one that was absolutely certain to fail.
The second decision that I made, at least for myself, is that I was going to pursue the idea of doing parallelism with processes that were started by the postmaster, rather than having individual backends fork to make a bunch of siblings that would then all cooperate on the query. One of the big reasons for that is that we have a lot of users on Windows, and as much as I hate Windows, and I really do, we would not, as a project, enjoy the success that we have today if we didn't support Windows as a platform. And I'm not excited to be the first guy who implements a major, major feature that can never work on Windows under any circumstances whatsoever, where virtually none of the code that we use for that feature on other systems could ever be adapted to work on Windows. I just don't want to be that guy. The other issue, of course, is that right now, today, all backends are direct children of the postmaster. Linux, and Unix systems in general, don't really handle grandchildren the same way that they handle children, and that's what we would end up with if we tried to have individual backends fork. I'm not saying those issues couldn't be fixed, because they probably could be, but it just felt painful. So I think that's a lesser consideration than the fact that it would just completely break on Windows,
but it was another thing that leaned me in that direction. All right, so that's the process model. Then what about IPC? How do these things communicate with each other? Well, you know, there are a lot of IPC tools available on the platforms that we support; there are relatively few of them that work on every platform we support, which is kind of the pits. But pretty much every system has some way of doing shared memory, some way of doing pipes, and of course everybody's got files. Of course, if you're on Windows, the API calls are all different, but, you know, never mind that.
So I kind of rejected files pretty quickly. There might be something we want to use files for within this set of things around parallelism, a parallel tape sort or something, but for the most part, you know, writing to a file is a system call, and you might end up doing I/O, and then somebody else has got to read back in the data that you wrote out, and it just doesn't seem like a good way to build a high-performance system. Pipes are a good paradigm; there are a lot of things we want to do in terms of streaming data between one process and another process for which pipes would be a good fit. But shared memory is a lot more flexible, and as evidence of that I offer the fact that you can much more easily use shared memory to emulate a pipe than you can use a pipe to emulate shared memory. Now, Noah pointed out to me that there actually probably are ways to use pipes to emulate shared memory, but I was a little scared of some of those things, because some of them were operating-system-dependent.
So what I did instead is I wrote a module called shm_mq that uses shared memory to emulate a pipe, and one of the things that I think is nice about that is I've got one set of platform dependencies, around shared memory, instead of two sets, one around shared memory and one around pipes. And that shm_mq module actually provides some nice semantics. You can make your pipe as big as you want it to be, or as little as you can get away with, depending on what performance characteristics you need. It's totally platform-independent. It never splits up messages into smaller messages, which operating system facilities would probably do on some platforms and not others. So that seems to be working out kind of nicely.
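For readers following along in the source, here is roughly what using shm_mq looks like in C, based on the API as it shipped in 9.4. This is a minimal sketch only: the surrounding DSM setup (the dsm_segment *seg), error handling, and the worker side are elided, and this only makes sense inside backend code.

    #include "storage/shm_mq.h"
    #include "storage/proc.h"       /* for MyProc */

    #define MY_QUEUE_SIZE 65536     /* the queue size is whatever the caller wants */

    /* Carve a queue out of (part of) a dynamic shared memory segment. */
    shm_mq *mq = shm_mq_create(dsm_segment_address(seg), MY_QUEUE_SIZE);
    shm_mq_set_receiver(mq, MyProc);            /* this process will read */
    /* ... the cooperating process calls shm_mq_set_sender(mq, MyProc) ... */

    /* Each side attaches; the handle deals with waiting and detach. */
    shm_mq_handle *mqh = shm_mq_attach(mq, seg, NULL);

    /* Sender side: a message is delivered whole, never split. */
    shm_mq_result res = shm_mq_send(mqh, 6, "hello", false);

    /* Receiver side: */
    Size    nbytes;
    void   *data;
    res = shm_mq_receive(mqh, &nbytes, &data, false);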
I also decided that we needed, specifically, dynamic shared memory, rather than the main shared memory segment. And the big consideration here is that if you want to do a parallel sort, and you want to do it like a giant quicksort in memory, you might need like a terabyte of memory. And you can't pre-reserve a terabyte in the main shared memory segment on any system that I'm aware of, and not have users complain about the fact that when you're not doing a parallel sort, you're, like, not using one terabyte of their RAM. So that just didn't seem like there was any way to make that work. A more painful decision, and this one was really painful, and the pain will no doubt be with us for a long time,
is that we would live with the fact that these dynamic shared memory segments could be mapped at different addresses in different cooperating processes. And that kind of sucks, because it means you can't use absolute pointers inside these dynamic shared memory segments and have things work the way you want them to. I hate that; it's awful. On the other hand, only one hacker has told me that he thinks it might be possible to actually map a segment at the same address in every process, and every other hacker I've talked to, including me when I talk to myself, has said that that is never going to be reliable, and it's not going to work, and you'll die if you try. So I'm trying to deal with that complexity in the individual users of the dynamic shared memory facility, rather than baking it into the facility itself.
So those are some architectural decisions that I've made. Right, so the question was: what's the difference between dynamic shared memory and regular shared memory? There is a regular shared memory segment that we create at postmaster startup, and we actually do manage to get that at the same address in every process. On Unix we do that by forking; on Windows there's some thing that kind of makes that work, and it's not a hundred percent reliable, but we do it, right?
Dynamic shared memory segments are different, because you can spin up a dynamic shared memory segment anytime you want during the execution of the server. So the main shared memory segment is created once at startup, and that's what you've got forever. Dynamic shared memory segments, you can create them on the fly and you can tear them down on the fly while the server is running, and there's no limitation that it has to be at backend startup or anything like that. Whenever you want one, spin one up; when you don't want it anymore, throw it out.
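In C terms, the facility he's describing looks roughly like this in 9.4 (a sketch only: real callers also have to get the handle to the cooperating process somehow, and decide who detaches when):

    #include "storage/dsm.h"

    /* Spin one up: create a 1 MB segment on the fly. */
    dsm_segment *seg = dsm_create(1024 * 1024);

    /* A small token that can be handed to a cooperating process. */
    dsm_handle handle = dsm_segment_handle(seg);

    /* ... in the other process ... */
    dsm_segment *seg2 = dsm_attach(handle);
    void *base = dsm_segment_address(seg2);   /* may be a different address! */

    /* Throw it out: the segment is destroyed once everyone has detached. */
    dsm_detach(seg2);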
Other questions on the architecture stuff? Yeah. So the question is about non-uniform memory access. Right now there's no code at all anywhere in PostgreSQL which is NUMA-aware. There might be a good argument for making some part of the code NUMA-aware; what we lack is benchmark results showing that the lack of NUMA awareness is actually hurting us. I don't necessarily doubt that it is, but it's one thing to think that there might be a problem with something, and another thing to have real, reproducible test cases that demonstrate that you do have a problem with something. The closest I'm aware that we've come is Kevin had a benchmark that did show a significant overhead on one workload, but I think you were never able to reproduce that reliably? So there may well be smarter things that we can do there, but I think we just don't know enough to be that smart. I don't think this particular project is the right place to start adding those smarts; if there are smarts to be added,
it's probably much more useful to add them to the existing stuff that everybody uses all the time. Yeah, John? Right. So the question is: can we safely assume that dynamic shared memory uses the same sidestep around the System V shared memory limits that I added in 9.3? The answer is no, you can't assume that, because it's completely false. The trick we used in that case does not work for this case, because the only way of propagating that kind of shared memory segment from one process to another is via fork, which is not an option here. So what we have for dynamic shared memory is a configuration variable that lets you pick from among the shared memory implementations that work on your operating system. Hopefully most people will have POSIX shared memory, which doesn't suffer from the System V limits, but if you're running on a BSD system, you might find that you haven't got POSIX shared memory, in which case you should, number one, ask your BSD kernel guys very nicely if they wouldn't mind adding that, and, number two, configure that variable to point to, say, System V. Actually, initdb will take care of it for you automatically.
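The variable in question is dynamic_shared_memory_type; in postgresql.conf that looks like the following (posix is the usual choice, and the accepted values depend on what the platform supports):

    # postgresql.conf
    dynamic_shared_memory_type = posix    # or sysv, windows, mmap

initdb probes the platform and picks a working default, which is why he says it's handled automatically.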
I'm going to move on here just in the interest of time, because I think people will have questions about some of the later slides as well. So the basic question here is, you know, what do we need to build, and how far along are we in building it, in terms of actually making something that works, as opposed to something you can give a conference talk about? I've kind of divided the work here into five areas, and the parenthesized comments give you some idea of where we are with those areas. There are some basic facilities, which I think are pretty much done in 9.4, that are sort of the very fundamental building blocks of parallelism. Then there's what I'm calling plumbing, which is a bunch of things that are not technically part of those basic facilities, but are actually needed in order to use them in an effective way. I'll talk a little bit more about each of these over the coming slides, so don't get too hung up if you don't totally understand what I'm saying now.
Then the next thing is establishing what I'm calling a parallel environment, and I'll talk a little bit more about what that means. Internally at EDB we've done a little hacking on this; we've got some ideas; there's a lot more work to be done there. And then, once you have this parallel environment, the goal of the parallel environment, on top of the plumbing, on top of these basic facilities, is to create a space where you can reasonably do things in parallel. And then, of course, you have to write the code to actually do the things in parallel, which is the parallel execution piece. And eventually, for parallel query, as opposed to for maintenance commands, you're going to need parallel planning. The idea here is that the first three things on this list are stuff that mostly you build once, and then you've got the infrastructure; and then those last two things you do over and over again, for each new thing you want to parallelize. So getting through that first pile of stuff is sort of a prerequisite for actually being able to parallelize individual things. Yes? You're right, yeah, I'm not talking about doing the planning in parallel;
I'm talking about planning what parallelism to actually do for a particular query. Yeah? So the question is: why does the planner need to know? The planner needs to know because the planner has to tell the executor what to do. It has to tell the executor: take this part of the plan and run it in a separate process from this other part of the plan. It just has to decide where to put those breakpoints. Right, like, if doing the query in parallel is cheaper than doing the query not in parallel, then you want to know that, so you can pick the parallel plan. The planner has to know in order to correctly choose between plans, and also to tell the executor exactly which things we are parallelizing in this particular query, how many parallel processes we are using, what each one is going to be responsible for, all that stuff. The planner will have to tell the individual nodes whether to run in parallel, because otherwise they won't know.
Right, so the question is: does XC or anything else have any tie-in to this work? Are there APIs or such that we could reuse? I think that there aren't, because I think that the communication within a cluster is going to be fundamentally different than the communication between multiple clusters. Let me actually just move on through the next couple of slides, because I've got a slide or three on each of these,
and then people can ask questions on each slide as we come to the topic, rather than jumping around. So the basic facilities that we've got in 9.4 are dynamic background workers and dynamic shared memory. Dynamic background workers means you can tell the postmaster, please start a process and have it run this function, and, behold, the postmaster will start that process and have it run that function. What happens after that is up to you, right? So it's a really low-level facility.
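Here's roughly what that request to the postmaster looks like with the 9.4 C API. This is a sketch only: my_extension and my_worker_main are hypothetical names, seg_handle is assumed to be a dsm_handle obtained earlier, and error handling is omitted.

    #include "postmaster/bgworker.h"
    #include "miscadmin.h"              /* for MyProcPid */

    BackgroundWorker        worker;
    BackgroundWorkerHandle *handle;
    pid_t                   pid;

    memset(&worker, 0, sizeof(worker));
    snprintf(worker.bgw_name, BGW_MAXLEN, "example worker");
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
                       BGWORKER_BACKEND_DATABASE_CONNECTION;
    worker.bgw_start_time = BgWorkerStart_ConsistentState;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_extension");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_worker_main");
    worker.bgw_main_arg = UInt32GetDatum(seg_handle);  /* e.g. a dsm_handle */
    worker.bgw_notify_pid = MyProcPid;

    /* Please start a process and have it run this function. */
    if (RegisterDynamicBackgroundWorker(&worker, &handle))
        (void) WaitForBackgroundWorkerStartup(handle, &pid);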
And dynamic shared memory, similarly: you can spin up a dynamic shared memory segment whenever you want, and it will eventually go away. And the "it will eventually go away" part was actually where most of the work went here, because, it turns out, you know, those of us who are used to coding inside the PostgreSQL backend are used to the fact that when, for example, your transaction aborts, all of your resources get released. Well, it turns out, as I found out the hard way, that if you add a whole new kind of resource and you don't write any code to make it automatically go away, it doesn't automatically go away, because it turns out that the transaction abort machinery is not magic. That was kind of when I had the wake-up moment that made me realize that I was building really low-level facilities here, and then I had to handle things like transaction abort, process termination, cluster termination; there are like three different levels of making sure we clean up dynamic shared memory segments at the appropriate time, and if you got rid of any of them, then you would have leaks. So that was the complicated part. But the good news is it's done. In fact, I've already got bug reports on these things, and the great thing about that is that people are using them. Right, we barely got 9.4 beta 1 out the door, and I think even before beta 1 went out the door I had bug reports related to these facilities. People don't find bugs in things that they're not using. So even though we are a long way from having real parallelism here, people are latching on to these C APIs and doing stuff with them that's significant enough that they actually care about whether it works, which I think is fantastic. I think that's really fantastic. Part of my goal here was to build the individual pieces of this project in a pluggable way, so that they could be used for parallel query but also for other stuff, and in each layer I'm trying to make that possible.
So next is plumbing, right? And this is the stuff where it's like, oh gee, I can get a big chunk of undifferentiated bytes called a dynamic shared memory segment, but what the hell do I do with it after that? And so one option is: do whatever you want with it. They're your bytes, have fun. But that's not necessarily an easy programming model to do complex things in, so one of the things I did in 9.4 is I created this concept of a DSM table of contents. We have a similar table of contents, although it's more complex, for the main shared memory segment. And the basic idea here is that this provides some dead-simple infrastructure, so that when you map a dynamic shared memory segment, you can actually do some discovery, figure out what kind of data structures are in there, and locate the particular data that you're interested in within the overall dynamic shared memory segment.
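Concretely, the 9.4 shm_toc API works roughly like this (a sketch: the magic number and key are arbitrary values the caller picks, MySharedState is a hypothetical payload type, and seg, seg2, and segsize are assumed to come from the DSM setup shown earlier):

    #include "storage/shm_toc.h"

    #define MY_MAGIC        0x50475121   /* identifies this segment layout */
    #define KEY_SCAN_STATE  0

    typedef struct MySharedState         /* hypothetical payload */
    {
        int64   next_block;
    } MySharedState;

    /* Creator: lay a table of contents over the raw bytes of the segment. */
    shm_toc *toc = shm_toc_create(MY_MAGIC, dsm_segment_address(seg), segsize);
    MySharedState *state = shm_toc_allocate(toc, sizeof(MySharedState));
    shm_toc_insert(toc, KEY_SCAN_STATE, state);

    /* Attacher: discover the same structure by key, whatever address
     * the segment happens to be mapped at in this process. */
    shm_toc *toc2 = shm_toc_attach(MY_MAGIC, dsm_segment_address(seg2));
    MySharedState *state2 = shm_toc_lookup(toc2, KEY_SCAN_STATE);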
Several people said on-list that they weren't sure that I got this part right, and they might well be correct, but we won't know until we try it. I think it's pretty good; I think I can see how to use it for a bunch of useful stuff. But, you know, as with any of this fundamental infrastructure, and particularly once you get beyond the absolutely most basic stuff that you know you're going to need, there's some room for maneuvering. Another thing that got done in 9.4 is a simple message queuing facility; I mentioned it earlier, shm_mq is what it's called. And this is intended to answer the question: how do cooperating processes communicate? If you have a background worker that's running, and it generates tuples that need to get to another process so that they can be further processed, or if it just errors out and the other backend needs to get that error so that it can decide to stop doing whatever it's doing, or so that it can print the error out, or whatever it's going to do, there has to be some easy way of doing that. And so that's what I invented this for: as a way that you would hopefully be able to send tuples and errors and notices and whatever other control information you needed to send between the cooperating processes. This is still kind of a low-level facility. I
did get a bug report on this already, so again, somebody is using it, right? The error propagation piece of this is sort of what I think the next thing is here. The idea is that we don't want every person who uses background workers and dynamic shared memory and message queues to have to reinvent their own way of sending error state, for example, from one process to another process. We want to invent some ways of doing that that are somewhat standardized, and people can, if they so choose, just use them. Of course, some people may want to roll their own for whatever reason, and that's fine too. So that's something that I've just begun to work on; I have some ideas about that that I think are promising. We will see how many rotten tomatoes get thrown at me when I post the patch.
Another thing that I'm sort of putting in this category of plumbing is a shared memory allocator. This is an idea about which there is considerable skepticism, which is not without some foundation; indeed, I have some skepticism myself, despite the actually very large amount of time I've spent trying to figure out how to make this work. At the same time, if you look for other pieces of software out there that do things in parallel but do not have a shared memory allocator, which today we do not, I think that you will find possibly no such projects, and certainly not many that have been as successful as we are. So, you know, I think there's very legitimate concern about how much shared memory allocation it really makes sense to do; how dynamic you can really make this without having the behavior of the system become complex and difficult to understand; and how you reconcile it with the fact that both our main shared memory segment and these dynamic shared memory segments have a size that is fixed when you create them. On some platforms there actually are ways of growing a dynamic shared memory segment once you've created it, but there are tricky problems there as well. I don't really know what the answers to all the questions in this area are, but I do think that
there are going to be some things that are really hard to do if you don't have this. So I'm working on it, and we'll see how it turns out. I think someday we'll want other data structures in shared memory, too, that have common implementations that can be shared by multiple users; for example, a shared hash table. On the one hand, you can imagine using this to create an in-memory data store extension for PostgreSQL that just makes a big hash table in dynamic shared memory, stores stuff in there, and spits it back out. On the other hand, you can also imagine using it for something like the combo CID hash, so that you can actually do writes in multiple backends at the same time and not lose track of your combo CIDs. But I'm not planning to work on that right away, because, you know, you've got to draw the line somewhere. Yeah?
No, I have not thought about hash aggregates. That is a good example of where the hash table would need to change as you were working your way through, so that would be a good use case for a shared hash table. Yeah. You know, that's very hard; that particular case of an aggregate is very hard to make work, because if you run out of memory in your shared memory segment, you have no workable fallback strategy. For sorting, you do, right? If you fill up your shared memory segment, which is fixed-size, you don't have to worry about trying to grow it or add another one; you can just say, all right, we're switching to an external sort. But, you know, obviously if you are doing an aggregate, you can't say, let's just not read the second half of the data and emit the results based on the tuples we've read so far. I'm fairly sure that someone would complain about that patch. Yeah, it's hairy.
All right, so these are some things that I think are useful. Some of them are done; some of them are not done. There is debate about which ones are useful and to what degree, which is fair. All right, so: setting up a parallel environment. What I really mean by this is making a background worker look enough like a regular user backend to do useful work, and specifically, making it look enough like the backend it's cooperating with, the one that caused it to be started, for it to do useful work. This is a complex problem. I think that a lot of it can be done by basically copying relevant state, like the user, which database we're connected to, our snapshot, to the background worker, using dynamic shared memory as the vehicle for that. Obviously, this is going to be somewhat expensive, so this is why I say I think this kind of parallelism is probably going to be mainly suitable for long-running operations. If it takes 100 milliseconds to copy all of your state over from one backend to another, your query, or maintenance operation, or whatever, has to be long enough running that you really don't care about the fact that it took an extra hundred milliseconds to get things started. If you're doing a hundred-millisecond query, you will care; if you're doing a five-minute index build, you won't care, provided that there is at least some speedup. Useful work doesn't mean everything, right? Getting the background worker to be able to do useful work does not mean that everything that you could do
inside the user's actual dedicated backend can be done equally well within the worker. Some operations seem fundamentally unsafe in a parallel context. For example, if you have a user-defined function that's not a security definer function, and it sets a GUC, that change in the GUC value will actually persist after that function returns. That is arguably really weird, but it is a good example of behavior where, in a parallel context, I'm not really sure what that means; I can't really make sense in my head of what the semantics of that in a parallel context are, but they'll certainly be different than whatever they are in a single-threaded process, and so you probably don't want to let people do that. Some operations could theoretically be made safe in a parallel context, but we probably won't bother.
Magnus, you suck. Is that you? No? All right, maybe that's my own fault: I wrote "bother bother", or else he changed it to "bother bother" when I went to the bathroom.
So it's parallel bother, yeah. So, a good example of something that I think might not be worth making parallel-safe, even though it could be done, is setseed and random. If you call setseed, you're setting the random seed; you then expect that you're going to get the random numbers in the same order that you got them the last time you configured that particular seed. Well, obviously, if you have parallel background workers and they each have their own copy of the random state, they're each going to generate that same series, which will produce a user-visible behavior difference, whereas if you're running in a single process, then you'll get one sequence, right? Now, you could fix this, right? You could provide code that allows the random state that random uses to be relocated into dynamic shared memory, and build all the plumbing so that this works exactly the same way, or enough like the same way, in a parallel context that maybe nobody would care. I can't imagine we're gonna bother. Maybe you can't even make it safe, but even if you can, who cares? Also,
even if we make a lot of the state that things use shared between all of the cooperating backends, arbitrary user-supplied code can never be safe, because if you're running some user-defined function that somebody has loaded into the backend, or even some built-in function that some hacker wrote and put in there, you can't inspect the code of that function and see whether it uses a backend-private piece of persistent state that needs to be copied from one backend to another. There's just no way of doing that. So I don't see a way around the need to have some kind of labeling of functions to say: these functions are parallel-safe, these functions are not parallel-safe. If you label your function parallel-safe when it's really not, that's your problem. And the planner will have to look over the query tree and say, oh, I might have considered parallelizing this, but it contains a function that's not safe in a parallel context. Sorry? Yeah, it would be great if we had a program that could analyze another program and determine whether that program
was safe, but Alan Turing had something to say about that. All right, so what do we actually need to copy? This is my tentative list; actually, I should more properly say this is kind of Noah's tentative list, because he did a lot of research on what the main things were here. I think we can probably build a basic parallel environment that can do an interesting level of stuff by copying these things. Obviously, you need the background worker to have the same notion of the user ID and the current database as the original backend. It probably needs to have the same values for all the GUCs. It probably needs to know what transaction it's in; maybe you could do some really simple read-only things without too much in this area, but we probably need to copy some elements of the transaction state. If it's going to do any database access, it's going to need the same current and active snapshot that you have in the original backend. And if you've got a combo CID hash, you're gonna have to share that too, because otherwise you might not be able to correctly read data that has been previously written by your own backend in the same session. I'm watching the reactions of the other hackers in this room as they groan in pain looking at that list.
Yeah. Correct: for an initial version of this, I'm not trying to target anything that involves insert, update, or delete. Those things are obviously valuable; it'd be nice to have them eventually; it doesn't seem absolutely essential for the first go-around. I have a separate slide on locks. Writes, I'm thinking, probably not in the initial cut at this, because if you have writes, then the combo CID hash can't just be copied; it's got to actually be a shared hash table, and I'm pretty sure there's a bunch of other stuff that is going to need to be fixed as well. This is hard enough, so I want to get this done, and then we'll look at extending it. Okay, now, you also need to prohibit some things. A lot of these are things that could theoretically be allowed if you had the right state in your dynamic shared memory, but we probably won't; some of them are things that probably just can't ever be made to work.
Basically, these are a bunch of things, again, this is research by Noah, that, you know, have backend-local state associated with them. Sequences have a backend-local sequence cache. Cursor operations: the position of the cursor is backend-local. Large objects have cursors. LISTEN and NOTIFY have associated backend-local state. Temporary buffers are stored in backend-private memory. Prepared statements are stored in backend-private memory. Some of this stuff could be synchronized, but it seems okay to me, for a first cut at this, to say, well, you can't do those things in a parallel context, and we'll figure out later which of those restrictions actually rub on people really badly, and see if there's a reasonable way of relaxing them. Assuming I don't die first. Generation of invalidation messages seems like something that can probably never work. I think that's really only going to be an issue if you have, like, one user-defined function that's going to create a table, and then another function that gets called later in the same query is going to try to read from that table, or something like that. If those two operations are actually happening in different backends, then the invalidation messages from the first backend, the one that created the table, would have to somehow be immediately perceived by the other backend in time to read the table. It's a mess; I don't want to go there. I don't care about that use case, and you probably don't either. It just needs to not crash.
Lock management is a whole other kettle of fish. This is what Andres was asking about. For the most part, the system overall, I think, doesn't really need to care that much about the fact that a group of processes are cooperating to do something; most of the data structures that we have in shared memory really don't need to worry about that. Like, the processes themselves need to know that they're doing it, and they need to coordinate with each other, but the system overall doesn't really need to know. But locks are an exception. You can think about a couple of ways to try to handle this, and they don't work. The background workers can't take no locks at all and just rely on the user backend to hold the locks, because the user backend might die or be killed before the background worker finishes doing whatever it's doing that requires that lock. That would be a disaster. You could try to fix that by saying the user backend is never allowed to die and release its locks until all of the other workers are dead, and I have no faith at all that that kind of solution can be made robust. Now, the other obvious option is: well, the user backend takes the locks initially, and then background workers just also take locks on those things. The problem with that is that parallel query might self-deadlock. The user backend locks some relation with, say, an access share lock; another process comes along and tries to start a CLUSTER operation on that table, which means no more locks on that table can be granted; then the background worker comes along and tries to acquire its lock on that relation, and that background worker is now just going to sit there until the CLUSTER completes. Which it won't, because the CLUSTER can't start, because the main backend is already holding a lock on that relation, which it is not going to release until the background worker finishes working, which is not going to happen, because it's waiting for the lock. Okay, so something obviously needs to be done about this problem. One option is just sort of
to have the background workers try to grab all the locks without blocking, and if any of them fail, then they just die and you fall back to non-parallel query or something. But I think that's a pretty unappealing option, and I think it would be as complex to implement as what I think the real solution is here, which is probably to have some kind of concept of locking groups inside the lock manager, so that we can say that a group of backends are sharing their lock state: either they all collectively hold whatever locks they hold, and individual processes acquire and release locks on behalf of the whole group, or maybe there's some way where we just tell, you know, the deadlock checker and related machinery that processes in the same locking group are allowed to queue-jump, and they don't have to wait for other, unrelated lockers in order to get their own locks. I haven't figured out the details here yet, but I'm sure it will be exciting.
Does that answer your question, Andres? At least partly, right; we'll get to the additional issues at the appropriate time. Okay, so, parallel execution. In some sense, this is the easy part.
I think if you've got all of that plumbing, and all of that state synchronization code, and you've prohibited the things that aren't safe, and all that kind of stuff, and you've got your basic infrastructure in place, you've got the tools that you need to actually code a parallel algorithm. Well, you know, the parallel algorithms are pretty well described in the literature; they just assume you're using threads, and that you can knock them off with a few easy locking primitives. You know, parallel sorting algorithms are described in the literature; they're pretty well understood; you can go read about how parallel quicksort works. Or, you know, for parallel sequential scan, you hardly even need to think about what to do there too much. I mean, there are a couple of different things you can do in terms of how you interleave the different workers, but it's fundamentally, I think, not that complicated. The major thing that comes up here that's a little bit subtle is Amdahl's law,
which says that if alpha is the fraction of running time a program spends executing serially, the maximum speedup from parallelism, if you have infinite workers, is one over alpha. In other words, any time that your background workers spend not being busy is really bad; it will rapidly erode all of the gains that you got from parallelism. If ten percent of the time is spent executing serially, you can't get more than ten times faster, and you can only get ten times faster if you squish everything else down to zero.
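In the usual notation, with \(\alpha\) the serial fraction and \(N\) workers:

    \[ S(N) = \frac{1}{\alpha + \frac{1 - \alpha}{N}}, \qquad
       \lim_{N \to \infty} S(N) = \frac{1}{\alpha} \]

So at \(\alpha = 0.1\), even sixteen workers only get you \(1/(0.1 + 0.9/16) \approx 6.4\times\), against the ceiling of \(10\times\).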
So one of the key things that comes up with these parallel algorithms is that you've got to structure the algorithm in such a way that you won't have workers starving because they don't have anything to do. So, for example, for a sequential scan, you do not want to break the relation into equal-sized chunks and have one background worker scan each chunk, because then, if one chunk turns out to be fast to scan because it's in memory, and another chunk turns out to be slow to scan because it's out on disk, you'll have some background workers that are done and other background workers that still have lots of work left to do, and you won't really get as much benefit out of parallelism as you could have gotten. You'll probably also have lots of crappy random I/O behavior if you do that. What you probably want to do instead is have the backends work their way through the relation together, so that, first of all, you get sequential I/O behavior, and also so that you don't have one run way ahead of the others and finish first and then not have anything else to do. (My mic died.)
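One simple way to get that behavior, sketched here with C11 atomics rather than anything PostgreSQL actually shipped at the time: keep a single shared next-block counter, and have each worker claim the next block only when it's ready for more, so fast workers naturally take more blocks and the scan stays roughly sequential.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Lives in the shared memory segment; leader initializes next_block to 0. */
    typedef struct
    {
        _Atomic uint32_t next_block;    /* next block no one has claimed yet */
        uint32_t         nblocks;       /* total blocks in the relation */
    } ScanPosition;

    /* Each worker loops: claim one block at a time until none remain. */
    static bool
    claim_next_block(ScanPosition *pos, uint32_t *block)
    {
        uint32_t b = atomic_fetch_add(&pos->next_block, 1);

        if (b >= pos->nblocks)
            return false;           /* scan exhausted; this worker is done */
        *block = b;
        return true;                /* caller scans block b, then loops */
    }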
Yeah, I'm almost done; I've got like two slides left. So, parallel query planning. I said to Noah one time: there are two kinds of PostgreSQL development projects, those that don't touch the query planner and those that are unsuccessful. That's a slight exaggeration, but not that much. This is one of the reasons why I think making maintenance operations parallel might be a good place to start. One of the problems, which someone in the audience asked about earlier,
But enormously more expensive in terms of the expenditure of overall system resources So we're gonna have to think a little bit about what it means to pick the cheapest cost plan How do we factor in the overall cost to the system of spinning up a bunch of workers
And there's other things we'll need to do We'll need to have a notion of the worker startup costs of the IPC costs that are introduced when you parallelize things And That is actually pretty much all I have So I'll stop here and take a couple of questions and then I guess we'll go eat lunch
Thanks, John. John, did you have a question?
Looking at which code? I did read, I did read a lot of papers on memory management. It turns out that I wasn't able to find a paper that describes somebody doing exactly what we're doing; in particular, a lot of the literature about allocators assumes that the allocator can allocate its own metadata by calling the system malloc. If you're allocating from a dynamic shared memory segment, there is no system malloc that knows how to do that. There are other problems as well, but that's one of them. I'll stop here, because everybody wants to leave, but feel free to come up and ask me questions if you want. Thanks.