Distributed snapshots and global deadlock detection

PGCon - PostgreSQL Conference for Users and Developers

Praveen, Asim Rama; Zhang, Hubert

Formal Metadata

Title

Distributed snapshots and global deadlock detection

Title of Series

PGCon 2020

Number of Parts

Author

Praveen, Asim Rama

Zhang, Hubert

Contributors

Langille, Dan (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/52131 (DOI)

Publisher

PGCon - PostgreSQL Conference for Users and Developers

Release Date

2020

Language

English

Content Metadata

Subject Area

Information Science

Genre

Conference/Talk

Abstract

In this talk we would like to share our experiences in implementing MVCC in an open source scale-out environment based on PostgreSQL. A scale-out environment is characterised by multiple PostgreSQL servers, with one master PostgreSQL server designated as the entry point for clients. Transactions may update data residing on more than one PostgreSQL instance. Isolating two or more such transactions running concurrently is a major challenge in a scale-out system. Different isolation levels have their own challenges, serialisable being the hardest to implement efficiently. Distributed snapshots enable individual PostgreSQL instances within a scale-out system to determine the status (in-progress, committed, aborted) of a transaction and to decide whether the effects of a transaction are visible under a given snapshot. The talk will go over this distributed snapshot mechanism in detail, including several corner cases that we found tricky to implement correctly. Note that each PostgreSQL instance in the scale-out system continues to create local snapshots and local transactions.

A distributed deadlock occurs when the wait cycle spans multiple PostgreSQL instances, so no single instance can see the whole cycle in its local wait graph. To detect it, the wait graphs from each PostgreSQL instance need to be aggregated and cycle detection performed on the aggregated data. The talk will describe how we model vertices and edges in such a graph and the method used to aggregate this information while being mindful of the performance of the distributed system.
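The aggregate-then-detect idea from the abstract can be illustrated with a minimal sketch. This is not the implementation presented in the talk: the function names, the instance names, and the edge format (pairs of distributed transaction ids, meaning "waiter waits for holder") are all assumptions made for illustration, and the cycle search is a plain depth-first search.

```python
from collections import defaultdict


def aggregate_wait_graphs(local_graphs):
    """Merge per-instance wait-for edges into one global graph.

    local_graphs maps an instance name to a list of (waiter, holder)
    pairs. Both elements are distributed transaction ids; this edge
    format is a hypothetical simplification.
    """
    graph = defaultdict(set)
    for edges in local_graphs.values():
        for waiter, holder in edges:
            graph[waiter].add(holder)
    return graph


def find_cycle(graph):
    """Return the transactions on one wait cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = defaultdict(int)
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if colour[nxt] == GREY:
                # nxt is already on the current path: a cycle is closed.
                return path[path.index(nxt):]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(graph):
        if colour[start] == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Each instance sees only one edge, so neither sees a local cycle.
local_graphs = {
    "instance_a": [("tx1", "tx2")],  # tx1 waits for tx2 on instance_a
    "instance_b": [("tx2", "tx1")],  # tx2 waits for tx1 on instance_b
}
cycle = find_cycle(aggregate_wait_graphs(local_graphs))
```

In this example neither instance's local graph contains a cycle, but the merged graph does, which is why the detection has to run on the aggregated data. A real detector would also need to decide which transaction to cancel and to tolerate edges that change while they are being collected; the sketch ignores both.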