Distributed snapshots and global deadlock detection


PGCon - PostgreSQL Conference for Users and Developers

Praveen, Asim Rama; Zhang, Hubert

Formal metadata

Title

Distributed snapshots and global deadlock detection

Series title

PGCon 2020

Number of parts

Author

Praveen, Asim Rama

Zhang, Hubert

Contributors

Langille, Dan (moderator)

License

CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.

Identifiers

10.5446/52131 (DOI)

Publisher

PGCon - PostgreSQL Conference for Users and Developers

Year of publication

2020

Language

English

Content metadata

Subject area

Information and documentation

Genre

Conference/Talk

Abstract

In this talk we share our experiences implementing MVCC in an open source scale-out environment based on PostgreSQL. A scale-out environment is characterised by multiple PostgreSQL servers, with one master PostgreSQL server designated as the entry point for clients. Transactions may update data residing on more than one PostgreSQL instance. Isolating two or more such transactions running concurrently is a major challenge in a scale-out system. Different isolation levels have their own challenges, serialisable being the hardest to implement efficiently. Distributed snapshots enable individual PostgreSQL instances within a scale-out system to determine the status (in-progress, committed, aborted) of a transaction and to decide whether the effects of a transaction are visible under a given snapshot. The talk will go over this distributed snapshot mechanism in detail, including several corner cases that we found tricky to implement correctly. Note that each PostgreSQL instance in the scale-out system continues to create local snapshots and local transactions. A distributed deadlock occurs when the wait cycle spans multiple PostgreSQL instances. To detect it, the wait graphs from each PostgreSQL instance need to be aggregated and cycle detection performed on the aggregated data. The talk will describe how we model vertices and edges in such a graph and the method used to aggregate this information while being mindful of the performance of the distributed system.
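
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the speakers' implementation. It assumes, for illustration only, that a distributed snapshot is an (xmin, xmax, in-progress set) triple over distributed transaction ids, and that each instance reports wait edges as (waiter, holder) pairs of those ids. All names (`DistributedSnapshot`, `is_visible`, `aggregate`, `find_cycle`, the `seg0`/`seg1` labels) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class DistributedSnapshot:
    """Hypothetical snapshot over distributed transaction ids."""
    xmin: int                    # every id below this had finished
    xmax: int                    # ids at or above this had not started
    in_progress: FrozenSet[int]  # ids in [xmin, xmax) still running


def is_visible(dxid: int, committed: bool, snap: DistributedSnapshot) -> bool:
    """Decide whether the effects of transaction dxid are visible under snap."""
    if not committed:            # aborted or still running: never visible
        return False
    if dxid >= snap.xmax:        # started after the snapshot was taken
        return False
    if dxid < snap.xmin:         # finished before the snapshot was taken
        return True
    return dxid not in snap.in_progress


Edge = Tuple[int, int]  # (waiter, holder): an assumed edge model


def aggregate(local_graphs: Dict[str, Iterable[Edge]]) -> Dict[int, Set[int]]:
    """Merge per-instance wait edges into one adjacency map."""
    graph: Dict[int, Set[int]] = {}
    for edges in local_graphs.values():
        for waiter, holder in edges:
            graph.setdefault(waiter, set()).add(holder)
            graph.setdefault(holder, set())
    return graph


def find_cycle(graph: Dict[int, Set[int]]) -> Optional[List[int]]:
    """Depth-first search; return one wait cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}
    path: List[int] = []

    def visit(v: int) -> Optional[List[int]]:
        colour[v] = GREY
        path.append(v)
        for w in sorted(graph[v]):
            if colour[w] == GREY:            # back edge closes a cycle
                return path[path.index(w):]
            if colour[w] == WHITE:
                found = visit(w)
                if found is not None:
                    return found
        path.pop()
        colour[v] = BLACK
        return None

    for v in sorted(graph):
        if colour[v] == WHITE:
            cycle = visit(v)
            if cycle is not None:
                return cycle
    return None


# Neither instance sees a cycle locally; the merged graph does.
waits = {"seg0": [(1, 2)], "seg1": [(2, 1)]}
print(find_cycle(aggregate(waits)))  # [1, 2]
```

In this toy model the deadlock is invisible to each instance alone, which is why the edges have to be collected in one place before cycle detection runs. A real system would also have to deal with edges that change while they are being collected, which the sketch ignores.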