
Distributed snapshots and global deadlock detection

Formal Metadata

Title
Distributed snapshots and global deadlock detection
Series title
Number of parts
32
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt, copy, distribute and make the work or content publicly available, in unchanged or adapted form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language

Content Metadata

Subject area
Genre
Abstract
In this talk we would like to share our experiences in implementing MVCC in an open source scale-out environment based on PostgreSQL. A scale-out environment is characterised by multiple PostgreSQL servers, with one master PostgreSQL server designated as the entry point for clients. Transactions may update data residing on more than one PostgreSQL instance. Isolating two or more such transactions running concurrently is a major challenge in a scale-out system. Different isolation levels have their own challenges, serialisable being the hardest to implement efficiently. Distributed snapshots enable individual PostgreSQL instances within a scale-out system to determine the status (in-progress, committed, aborted) of a transaction and to decide whether the effects of a transaction are visible under a given snapshot. The talk will go over this distributed snapshot mechanism in detail, including several corner cases that we found tricky to get right. Note that each PostgreSQL instance in the scale-out system continues to create local snapshots and local transactions. A distributed deadlock occurs when the wait cycle spans multiple PostgreSQL instances. To detect it, the wait graphs from each PostgreSQL instance need to be aggregated and cycle detection performed on the aggregated data. The talk will describe how we model vertices and edges in such a graph and the method used to aggregate this information, while being mindful of the performance of the distributed system.
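
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the implementation presented in the talk: it assumes that vertices are distributed transaction identifiers and that each instance reports edges of the form (waiting transaction, transaction it waits for). The names `aggregate_wait_graphs`, `find_cycle`, `seg1`, `seg2`, `dtx1` and `dtx2` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Assumption: an edge is (waiting transaction, transaction holding the lock),
# and both are identified by a cluster-wide distributed transaction id.
Edge = Tuple[str, str]


def aggregate_wait_graphs(local_graphs: Dict[str, Iterable[Edge]]) -> Dict[str, Set[str]]:
    """Merge the wait edges reported by every instance into one adjacency map."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for edges in local_graphs.values():
        for waiter, holder in edges:
            graph[waiter].add(holder)
    return dict(graph)


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the transactions on one wait cycle, or None if there is none.

    Plain depth-first search with three colours: a back edge to a vertex
    that is still on the current path (GREY) closes a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(v: str) -> Optional[List[str]]:
        colour[v] = GREY
        path.append(v)
        for w in sorted(graph.get(v, ())):
            state = colour.get(w, WHITE)
            if state == GREY:
                # w is on the current path, so path[w:] is a wait cycle.
                return path[path.index(w):]
            if state == WHITE:
                found = visit(w)
                if found is not None:
                    return found
        path.pop()
        colour[v] = BLACK
        return None

    for v in sorted(graph):
        if colour.get(v, WHITE) == WHITE:
            cycle = visit(v)
            if cycle is not None:
                return cycle
    return None


# Hypothetical reports from two instances: neither sees a cycle on its own.
reports = {
    "seg1": [("dtx1", "dtx2")],  # on seg1, dtx1 waits for dtx2
    "seg2": [("dtx2", "dtx1")],  # on seg2, dtx2 waits for dtx1
}
cycle = find_cycle(aggregate_wait_graphs(reports))
print(cycle)
```

In this example each instance's local graph is acyclic, so a purely local deadlock detector would report nothing; the cycle only appears once the edges are merged. A real system would also have to decide which transaction on the cycle to abort and how often to collect the graphs, which this sketch leaves out.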