Making Consistency Protocols Serializable
Formal metadata
Title: Making Consistency Protocols Serializable
Series: SIGMOD 2019, part 11 of 155
License: CC Attribution 3.0 Germany: you may use, modify, copy, distribute and make the work publicly available for any legal purpose, in unchanged or modified form, provided the author/rights holder is credited in the manner they specify.
Identifiers: 10.5446/42852 (DOI)
Transcript: English (automatically generated)
00:03
Thank you very much for the introduction, and thank you everybody for coming to this somewhat niche topic in the general space of the database community. So this talk is an attempt to try to sort of
00:23
have something that's of interest potentially to both PODS and SIGMOD audiences. So I'll try to explain some of the theory and some of the practice that relates to that theory, and we'll see how we go. I should just say that the Best Paper Award
00:44
and Test of Time Award that were just mentioned, I mean, I'm an author on it, but it's not my work in any real sense. It's the work of my PhD student at the time, Michael Cahill, to whom all credit for that work goes, and I will be explaining at least some of the details
01:04
of that work in this tutorial. Let me start with a brief recap of sort of the classical theory,
01:21
the standard transaction ideas, serializability ideas that are covered in a database education. But then I want to focus on a practical reality, which is that the theory that's taught in classes,
01:45
I mean, every textbook on databases explains two-phase locking and serializability and so on; that theory is not actually the reality for a very large number of the application programs out there.
02:03
So a big part of what I want to make sure people get from today is that awareness that the traditional theory is not a close match to a lot of practice.
02:23
So I want people to understand that, and I'll try and explain the weak mechanisms that are actually used a lot of the time when application programs run against a database,
02:40
and that reality provides the basis for the bulk of this work, which is really about saying, we have a clear sense of serializability, why it's important, it's there for a reason.
03:00
And there's this practical situation that it's often not used. And so this tutorial is describing an agenda of work which is about trying to fix that problem, to bridge the gap, to allow the benefits we want from serializability
03:26
to be obtained in situations where practice is not aligned that way. And a big part of that is that the theory has developed what are called robustness conditions.
03:43
Those are statements which allow you to say that this particular application will execute serializably, it will run correctly, even when run in an environment which does not give that correctness guarantee
04:04
in general to arbitrary programs. So that was sort of the research agenda. What was described as winning those awards was in fact something
04:22
I didn't believe could happen. And when people suggested it, I said it can't be. And then, as I said, I had a PhD student who showed me that it could be done, and he did it, and that was to actually fix
04:40
the concurrency control protocols so that they were serializable. And that's what we now have in PostgreSQL. A lot of this tutorial will be describing work on which I am part of the authorial team.
05:01
So I apologize in advance if it's a little too one-sided, but I am going to try to mention and talk about work that was done by other people as well. And so let's get started.
05:21
So I'm gonna begin with a quick, I hope, summary of the classical ideas which are taught in the courses. So the main reason for this is for people who haven't looked at their database textbook
05:40
in a long time, to give them a quick chance to refresh, or who chose to skate over the transaction part of the class, and also to establish some terminology. So a transaction involves executing
06:01
a program, part of the application code, and having it end in one of two ways. Either it commits, which means that the whole bit of code worked successfully,
06:22
and all of those activities should be done, or the program gets to a rollback, in which case we say the transaction has aborted,
06:40
and whatever the program did should be put back to the state before. So this is called atomicity in the database community that you end up in one of these two conditions, committing or aborting.
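(As a minimal illustration of this commit-or-rollback discipline, not anything from the talk, here is a sketch using Python's built-in sqlite3 module; the table and the transfer logic are made up for the example.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; commit on success, roll back otherwise."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        # The whole unit of work either takes effect...
        conn.commit()
    except Exception:
        # ...or none of it does: the database is put back to its prior state.
        conn.rollback()
        raise

transfer(conn, 1, 2, 30)
print(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())
```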
07:02
An assumption that is made in all of this work, and this is not something that the database can look after, this is a job for the programmer who writes the application code, is to make sure that their code is sensible in and of itself.
07:23
So without thinking about all the other things going on, if we just take one program and execute it once from start to finish with nothing else going on in the system, it should leave the database in a state that satisfies all the integrity constraints,
07:41
including integrity constraints which are only in the heads of the designers and not declared in the system. So we say that a program that is written this way is consistent, and in most sort of organizations
08:03
there will be a social process to make sure that programs are being written that way. At the very least there'll be another pair of eyes looking at the code, and one of the things they'll think about is whether this code has been written so that it leaves the database
08:23
satisfying the integrity constraints. And in many places there will be a list of the business rules in addition to whatever's declared explicitly in the schema, there'll be a list of other integrity constraints that are important and need to be preserved
08:42
by the application code. So given that each program is written like this, what are some of the threats that might cause the integrity conditions not to hold? And one of them is
09:03
the possibility of a transaction reaching the rollback statement. If you didn't put the state back the way it was but left the state sort of messed up, it might not satisfy the integrity condition in that intermediate state. A system crash could damage integrity constraints,
09:24
particularly if it leaves some data either still in the database buffers in volatile memory so they get lost when the crash happens or alternatively if there were some changes that an incomplete transaction made
09:42
that had made it into the persistent state. And finally, the one that we're going to be concerned about today, there's the possibility of concurrent activity. Several programs running at the same time interleaving with one another.
10:01
Not only do we learn about this in a database course, this is covered in an operating systems course, it's covered in a concurrent programming course. Essentially when things are happening in an interleaved fashion and there's shared state around, all sorts of horrors can occur unless you do something very careful to prevent it.
10:23
And the standard mechanisms that systems use to deal with this include logging and typically locking for concurrency control. And I just want to stress again the fact that each program,
10:45
it's an obligation of the programmer to make sure that that program keeps the integrity constraints when that program's considered by itself, but it's an obligation of the system to make sure that the integrity constraints are preserved in spite of all of these systems issues.
11:07
A number of classic cases where if you don't do things carefully, you can violate integrity constraints, have been given names, so people talk about dirty data
11:21
as a situation where one transaction, one application program execution reads data that another program created and then that other program ends up aborting. Or the interleaving where two programs
11:46
are dealing with the same data and the updates of one of them take effect and the updates of the other are lost because they got clobbered by the first one. And then a more subtle one called inconsistent read
12:01
where a program sees some of the changes made by another program, but it doesn't see all of them. And if you see only part of the changes, the state you see may not satisfy the integrity constraints. Now the fact that a program sees things
12:22
that don't satisfy the integrity constraints means that that's a situation which the programmer never thought about. And so that program can then go off on totally bizarre flow paths and do really bad things. So these are famous problems
12:42
if you don't have appropriate mechanisms in the database, and a term that is used to capture what we want, especially in terms of dealing with concurrency, is serializability. So the starting point is to say,
13:01
well, if we think of a serial execution, old fashioned batch style execution where you run one program and then another program and then another program, you only run committed transactions, there's nothing rolling back anywhere.
13:21
This is what everyone will agree is definitely correct because each program is in fact having the whole database to itself. It's written to work correctly by itself. It is by itself, so everything works fine. And in particular, at the end of a serial execution,
13:44
you will have all the integrity constraints holding, including the ones that the database didn't even know about because each transaction takes the database from a consistent, from a state satisfying the integrity constraints to another and they just follow one after the other doing this.
14:03
So the serial execution is one which is not gonna cause any problems for our database, but of course, it's not gonna perform very well. Real executions are not gonna be serial, but we say they're serializable if there exists a serial execution
14:21
containing the same committed transactions which does the same thing and in particular, ends up with all the data in the same final state, and none of the transactions can tell the difference between them. Something which is not necessarily stressed as much
14:44
but is important is that we ask that the real execution should have some serial execution which is equivalent to it, which has the same effects. There may be other serial executions
15:02
which behave differently and end up in different final states. That's okay, as long as there is one serial execution, that will be enough to give us all our integrity constraints holding. In fact, there is a slightly stronger definition, slightly stronger property called strict serializability,
15:23
which is now becoming somewhat important in a lot of the work in distributed and particularly geo-distributed settings where you ask not only that there should be a serial order but that you should be able to find a serial order
15:41
which doesn't actually reverse the order of any of the transactions. The serial order that is supposed to exist also respects anything that you can see from outside the system about one transaction running completely before another.
16:04
Indeed, if you look at the distributed computing literature, the notion of atomicity, or linearizability, which is used there is in fact close to strict serializability. One could say that strict serializability is actually the definition that people
16:22
should have been thinking about from the beginning. Those are the definitions; how is it actually done? Well, System R worked all of this out. The database management system has an internal mechanism of locking.
16:43
Transactions have to get appropriate locks and they will block if they can't get the locks and essentially you have just enough of this blocking to prevent the interleavings that would not be serializable.
17:00
At least that's the goal. So inside the system you actually have a lock manager whose job is to manage all of these locks and to block lock requests if conflicting locks are currently held. And in the database space, the locking is slightly more sophisticated
17:24
than what you typically see in operating system critical sections. You actually have different lock modes, so you have the exclusive X lock, which is what you need to get if you want to write an item, and a shared S lock, which you need if you want to read the item. There are in fact
17:42
more sophisticated lock modes in practice, but these are the key ones. The rules are that essentially the only locks that can be held concurrently on an item are if they're all shared locks. If any one transaction gets an exclusive lock
18:00
that cuts out everything else, which is why it's called exclusive. It's important to remember that this locking is done inside the database management system. It's not typically done explicitly by the application code. The system grabs locks and releases them and so on.
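(To make the lock modes concrete, here is a toy sketch of the grant decision, not any real engine's lock manager; the class and names are invented for illustration.)

```python
# Lock modes: S (shared, for reads) and X (exclusive, for writes).
# Two locks on the same item are compatible only if both are S.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockManager:
    def __init__(self):
        self.held = {}  # item -> list of (transaction_id, mode)

    def can_grant(self, item, mode):
        return all(COMPATIBLE[(held_mode, mode)]
                   for _, held_mode in self.held.get(item, []))

    def acquire(self, txn, item, mode):
        if not self.can_grant(item, mode):
            return False          # in a real engine the requester would block here
        self.held.setdefault(item, []).append((txn, mode))
        return True

    def release_all(self, txn):
        # Under strict two-phase locking this happens only at commit or abort.
        for item in list(self.held):
            self.held[item] = [(t, m) for t, m in self.held[item] if t != txn]

lm = LockManager()
print(lm.acquire("T1", "x", "S"))   # True: first shared lock
print(lm.acquire("T2", "x", "S"))   # True: shared locks are compatible
print(lm.acquire("T3", "x", "X"))   # False: exclusive conflicts with the shared locks
```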
18:24
So it's up to the system to be written to do all of this properly. The proper approach is called strict two-phase locking, usually abbreviated just 2PL, where when a transaction obtains locks,
18:43
it keeps them until the transaction completes, until the transaction actually commits or aborts. That is a much stronger property than you typically get in operating system code
19:00
where when you finish dealing with something you can release the lock. Here you have to keep the locks as long as the transaction is running at all. So that's the practice. There's a traditional theory that goes along with this, and this is taught standardly.
19:23
You have operations where a transaction reads or writes various items in the database, and the database is seen as a collection of items. And so the typical notation that we use is we put a subscript on the operation
19:42
to indicate which transaction is doing it. And then you look at the whole system execution as just a sequence of operations from many different transactions, and they're all interleaving with one another. One thing which is not discussed traditionally in a lot of courses
20:02
is that this theory is not actually a perfect match to what really happens. This theory leaves quite a bit out from the reality of systems. So what are some of the things it leaves out? The biggest assumption,
20:22
the biggest sort of modelling assumption in this is that the database is a fixed collection of items. Now, when you start up, whether it's SQL Server or Oracle or whatever, it's not a fixed collection of things.
20:41
You can create new items. You can insert rows. You can insert whole tables. So there is a simplification here. Another simplification is that this model where we have read an item, write an item, and so on, it never mentions the contents of any of those items.
21:01
It just talks about I'm reading it or I'm writing it. And so in this model, you're losing quite a bit of the reality, of the complex reality of the system. Nonetheless, the model is very widespread
21:20
and is very useful for thinking about these algorithms like two-phase locking. So with this model, you can then say serializability means: you've got this execution as a sequence of operations
21:40
interleaving various transactions. Can you find a serial execution of the operations, just of the committed transactions, which is equivalent to the actual execution? And different definitions come with different notions
22:02
of equivalence. The most important one is what's called view serializability, which is that the equivalent serial sequence should leave the database in the same state as the real execution.
22:20
And along the way, each read should be getting the same value. Now, I just told you that the values aren't in this sequence. So what does it mean to get the same value? Well, in any sequence, you can look at a read, find the closest preceding write
22:41
by a transaction that hasn't aborted, and your value will come from that write. Now, the write doesn't tell you what the value is, but you can just say, well, it's whatever value went with that write. So that's the formalism, the traditional formalism.
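(To make that notation and reads-from rule concrete, here is a small sketch with a hypothetical schedule: each step is a transaction, an operation and an item, and for each read we find the closest preceding write to that item by a transaction that did not abort.)

```python
# Each step is (txn, op, item) with op in {"r", "w", "a"}; "a" marks an abort.
schedule = [("T1", "w", "x"), ("T2", "w", "x"), ("T2", "a", None), ("T3", "r", "x")]
aborted = {t for (t, op, _) in schedule if op == "a"}

def reads_from(schedule, aborted):
    """Map each read step to the write it reads from, per the rule in the talk."""
    result = {}
    for i, (txn, op, item) in enumerate(schedule):
        if op != "r":
            continue
        source = None
        for j in range(i - 1, -1, -1):     # scan backwards for the closest write
            wt, wop, witem = schedule[j]
            if wop == "w" and witem == item and wt not in aborted:
                source = (j, wt)
                break
        result[(i, txn, item)] = source
    return result

print(reads_from(schedule, aborted))
# T3's read of x comes from T1's write, because T2 aborted.
```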
23:01
The very early work in this theory actually came from people who were in the sort of automata theory and formal language communities. So they started defining a formal language containing the sets of all serializable sequences
23:20
and started proving complexity results about those. My take is that that's not actually tremendously helpful for practice because we're not that interested in what one can do with executions that don't arise.
23:43
Our real interest is much more what people in, say, the formal methods or code verification community would say is a refinement question. Is every execution of our system contained in the set of serializable executions?
24:04
And it doesn't matter how exactly that containment matches. What matters is just do you have sequences that are outside that set or not? And so in practice, most people don't use view serializability.
24:22
They use a simpler notion called conflict serializability, which is a sufficient condition for view serializability. Conflict serializability has the very nice property that it can be expressed in terms of properties of a graph. Now that graph is sometimes called the conflict graph
24:42
or the serializability graph or the dependency graph. It's a graph where the nodes are the transactions that are running and you've got edges between one transaction and another if you have a situation where the two transactions are accessing the same data item,
25:02
leaving out the case where they're both reads. So you can take your execution, you can draw that graph, and then you have the very nice property that you can test: are there cycles in that graph? The non-existence of a cycle
25:24
is the property that's called conflict serializability. And the good thing is that conflict serializability implies view serializability, and so you can work very effectively with this. And indeed, most of the practitioners
25:42
I've spoken to work with conflict serializability and don't even think that it's anything other than serializability. So I describe the two-phase locking mechanism, and the nice thing is if you use strict two-phase locking,
26:02
then you get executions which are strict serializable. So since this is a PODS tutorial, we have to have some actual proofs or at least proof sketches. So here's one of them. If you have a system where everybody's doing
26:22
strict two-phase locking, you have this conflict graph. You can prove by sort of an analysis of the cases where locks are obtained that every edge in the conflict graph
26:45
has the direction which is defined by the position and time of the commit operations of the transactions involved. So essentially, the conflict graph is a subgraph
27:03
of the graph that's called the commit ordering graph, which just orders transactions according to when they commit. Because the time at which a transaction commits gives you a total order on the transactions, that means that the conflict graph is acyclic.
27:22
It's a subgraph of a total order. There you have it. And so you can take that total order, the commit order, and the serial execution where transactions are
27:42
run in that order is an equivalent serial execution for you. Okay, so that is, I think, pretty standard material, and I hope for anybody who has been bored, maybe now's the time to come back in.
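(Before moving on, here is a minimal sketch of the conflict-graph test described above, on a made-up schedule: add an edge whenever two operations of different transactions touch the same item and at least one of them writes, then check for cycles.)

```python
def conflict_graph(schedule):
    """Edge Ti -> Tj if an operation of Ti precedes and conflicts with one of Tj
    (same item, different transactions, not both reads)."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for (t2, op2, x2) in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and (op1 == "w" or op2 == "w"):
                edges.add((t1, t2))
    return edges

def has_cycle(nodes, edges):
    """Depth-first search for a cycle in the conflict graph."""
    colour = {n: "white" for n in nodes}
    def visit(n):
        colour[n] = "grey"
        for (a, b) in edges:
            if a == n:
                if colour[b] == "grey" or (colour[b] == "white" and visit(b)):
                    return True
        colour[n] = "black"
        return False
    return any(visit(n) for n in nodes if colour[n] == "white")

# Classic non-serializable interleaving: T1 and T2 each read x, then each writes x.
schedule = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
nodes = {t for (t, _, _) in schedule}
edges = conflict_graph(schedule)
print(edges)                                                  # both T1->T2 and T2->T1
print("conflict serializable:", not has_cycle(nodes, edges))  # False
```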
28:02
So we mentioned already that things have been left out of our model. Our fundamental framework for talking about executions was read and write operations on a fixed collection of items. And it turns out that this is not something
28:22
that one can just say, oh, well, it's a nice simplification for modelling. The fact that that's been left out actually means that the simplistic strict two-phase locking protocol that I described is actually wrong.
28:40
It does not guarantee serializability. And the situation is that in a real system, you have operations which can insert new items, and you have operations which essentially do select where which can find which items exist.
29:02
And so some of the very early work of Jim Gray and colleagues in the System R project pointed this out. They identified what's called the phantom problem. Here's a simple example to show you what's wrong.
29:21
If I have one transaction which looks for the rows whose department is account, it finds a number of those rows, so it puts locks on them. Another transaction can come in, insert a new row,
29:45
put a lock on that new row, commit, release the lock. First transaction can resume. It again looks for the rows in the account department. Now there are three of them.
30:03
Two of them it had already locked. One of them it hadn't, but that one is now no longer locked. The transaction that inserted it committed, the lock's been released, and so T1 now looks and suddenly sees more departments, more rows with this department than it saw before,
30:22
and in a serial execution that would never happen. So this problem is called the phantom problem. At a theoretical level, what's going on is that select where and insert don't actually commute with each other.
30:40
Because they don't commute, you need to have some mechanism of ordering them, but locks on the individual rows that were returned are not good enough. So the practical solution is to have locks which represent not just individual rows
31:02
but ranges of rows. Now it's a very general concept, but in practice the thing that is particularly done is what's called next-key locking, where when you lock a row, effectively you're locking
31:23
not only that row but also the range of rows just going up to that row, for example in an index. And a lock on the row is treated as covering that range
31:44
and an insert not only locks the row being inserted but checks to make sure that there isn't a lock on the next key that would prevent the insert from happening.
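(Here is a rough sketch of that next-key idea, abstracted away from any real index implementation; all of the names and the data are made up. A lock on a key is treated as also covering the gap up to it, and an insert has to check the lock on the next existing key before it can proceed.)

```python
import bisect

class NextKeyLockTable:
    """Toy model: a sorted set of index keys, plus locks held on keys.
    A lock on key k is treated as also covering the gap of keys just below k."""
    def __init__(self, keys):
        self.keys = sorted(keys)
        self.locks = {}          # key -> transaction holding a lock on it

    def lock_range(self, txn, low, high):
        # A scan locks every existing key in [low, high] plus the next key above,
        # so the whole range, including its gaps, is covered.
        i = bisect.bisect_left(self.keys, low)
        while i < len(self.keys) and self.keys[i] <= high:
            self.locks[self.keys[i]] = txn
            i += 1
        if i < len(self.keys):
            self.locks[self.keys[i]] = txn   # next key above the scanned range

    def try_insert(self, txn, key):
        # The insert checks the lock on the next existing key: if another
        # transaction holds it, this insert would create a phantom, so it waits.
        i = bisect.bisect_right(self.keys, key)
        if i < len(self.keys):
            holder = self.locks.get(self.keys[i])
            if holder is not None and holder != txn:
                return False     # would block in a real system
        bisect.insort(self.keys, key)
        return True

idx = NextKeyLockTable([10, 20, 30])
idx.lock_range("T1", 10, 20)           # T1 scanned keys 10..20; also locks key 30
print(idx.try_insert("T2", 15))        # False: T2's phantom insert is stopped
print(idx.try_insert("T2", 35))        # True: outside the protected range
```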
32:02
So System R knew this and it got it right. And so what we fundamentally have is the first takeaway that the beautiful theory is not a perfect fit for reality
32:20
and you need more sophisticated theory. Anyway, so let's sort of summarise the state of play of the traditional theory. Transactions should have the ACID properties. It's the programmer's job to write each transaction
32:40
to preserve the integrity constraints. It's the system's job to provide the A, I and D and in particular serialisability is the ideal way of getting that I and looking after all the integrity constraints. OK, so that's the classical theory.
33:05
As I said at the beginning, the reality is very far removed from that theory and so now I'd like to talk about the reality of non-serialisable support,
33:23
non-serialisable isolation, the mechanisms for it. So the motivation for this is that when you actually do strict two-phase locking, performance suffers. You're getting those write locks and transactions
33:41
are being blocked by the write locks and their need to get read locks. Now, as database people, this seems heretical but there are a lot of people who are willing to give up correctness for performance.
34:01
Sometimes the application actually makes this a reasonable thing to do. There are a significant number of use cases where data is providing some guidance to a human
34:21
but the human has enough smarts to work around if the data is a bit out of date or something. In some cases, it is actually corrupting the data, but there are a lot of systems people out there
34:42
who figure that the correctness of the data is somebody else's problem but the performance is where their pay packet comes from. And so right at the very early days, Jim Gray and colleagues introduced the idea that a database would offer programmers the ability
35:03
to say, I don't need the correctness, I want the extra performance. Just to be clear, even when they are willing to give up on the serialisability part
35:20
of the correctness, they don't want to give up on the atomicity or durability. You are still going to be setting write locks for your updates and rolling back properly. So, right from System R, they introduced, and it is now part of the SQL standard,
35:41
that you can declare an isolation level for a transaction that is less stringent, less restrictive than serialisability. The SQL standard says that serialisability is to be the default. It should be up to the programmer
36:01
to explicitly recognise that they are willing to give up the correctness. Unfortunately, there are not many actual systems out there that adhere to this particular bit of the standard. Almost any system, if you open it out of the box
36:24
and start running transactions and don't think about the isolation level, your transactions will not be serialisable. There are some subtleties in how one defines isolation levels and indeed there have been a sequence of papers
36:43
in SIGMOD and its community that have tried to grapple with this. Here, I'm going to focus much more just on the simplistic implementation that gives you the weak isolation level.
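(For reference, the knob itself is just a statement issued at the start of a transaction. Here is a sketch over a generic Python DB-API connection; the SET TRANSACTION form is the SQL-standard one, and individual systems and drivers also expose their own connection-level options for the same thing.)

```python
def run_serializably(conn, work):
    """Run `work(cursor)` in an explicitly serializable transaction.
    `conn` is any DB-API connection with autocommit off; the SET TRANSACTION
    statement is the SQL-standard way to ask for an isolation level, and it
    must be issued before the transaction does any reads or writes."""
    cur = conn.cursor()
    try:
        cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
        work(cur)
        conn.commit()
    except Exception:
        conn.rollback()   # e.g. the engine refused to commit a conflicting transaction
        raise
```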
37:03
So, the very weakest is sometimes called browse or in the standard read uncommitted. In this one, you just don't bother setting the shared locks at all. You just read data, whatever is there.
37:20
If that happens, it's perfectly feasible for a transaction to see dirty data, data that will eventually disappear when the writer aborts. Read committed is the default, as I said, in most systems.
37:41
If you just start running transactions without saying anything, you don't get serialisability. What you typically get is read committed. So, read committed, you do set your locks, but while the strict two-phase locking says you've got to hold all your locks to the end,
38:00
in read committed, you do release the read locks after you finish dealing with the data. Now, that does prevent any reading of dirty data, but you can have inconsistent reads. A transaction can see a data item, read it again sometime later,
38:24
and find that it's changed in the meantime because the transaction wasn't holding the read lock. You can also get weird inconsistencies between different rows of a table that you're scanning in the transaction.
38:41
There's a level called repeatable read which essentially says the only bad thing that happens is phantoms. In repeatable read, if I read and read again, the data won't have changed,
39:01
but it is possible for new items to have been produced. This is done by, in fact, doing that simplistic form of strict two-phase locking where you lock the rows, but you don't do the next key locking on the indices and so on.
39:21
Those techniques have been there pretty much forever in the systems that everybody uses. The next complexity that I want to introduce is the fact that so far,
39:41
remember we had this simplification that we didn't put the values in. We simply said, well, when you read an item, you're reading from the most recent write to that item that hasn't yet been aborted. Not all databases work that way.
40:01
In particular, there's a class of systems which use multiple versions. So Oracle, as far as I know, were the first to do this in widely available commercial systems, though there were previous research prototypes
40:20
that were like this. And it can have some very great advantages. If there are multiple versions, you can allow a read to see a version which is a bit older because the most recent version may still be uncommitted, and therefore you don't want to see that
40:42
because that would be dirty data. So if you can see an older one, it's definitely committed and so you use that. There was a natural piece of work to take the serializability theory and extend it
41:01
to allow multiple versions. This has gone through several iterations in the community. The one I'm going to be talking about mostly is the one which is now sort of taken over, which comes from Atul Adya's PhD thesis in 1999.
41:22
I mean, as I say, versions of it date way, way back to the 80s. So in this, you have an item X that contains multiple versions, and the versions are indicated in this notation
41:41
by putting a subscript on them which says which transaction created them. So Xi means the version of X that was written by transaction Ti. So a write of X by transaction Ti writes Xi, but when transaction Ti reads X,
42:04
it will get some version, some Xj, but it's up to the system to work out which of the versions it's going to get. In this terminology, we identify conflicts just as we had before, but now we have to think
42:22
about conflicts not simply in terms of the fact that two transactions are looking at the same item, but in terms of which versions are being looked at. So a write-write conflict happens when one transaction creates a version, another transaction creates a later version. A write-read conflict is one transaction creates a version
42:46
and another transaction reads that version or one of its successors. And a read-write conflict, which turns out to be really important in the theory, it's given the special name of anti-dependency, is when transaction Ti reads a version
43:04
and transaction Tj writes a version which is later in the sequence of versions than the version that Ti saw. If you have this notion of conflict, you can produce the conflict graph,
43:21
the multi-version conflict graph, and again, if that is acyclic, then you have serializability of your execution. So what sort of concurrency control can you do with these different notions of multiple versions?
43:44
So there's a read committed variant and while it still has this property that you only read committed data, it actually works differently in practice. The differences between this sort of read committed and the locking read committed
44:01
are actually visible sometimes to the applications. So in the multi-version read committed, very simply, you do that thing I said. If you're trying to read an item and the most recent version is dirty, just skip over it. Find the most recent committed version
44:22
and that's what you read. So that mechanism is very widespread. It's the default in Oracle. It's the default I regret to say in PostgreSQL.
44:41
So if you've got a multi-version system, this is typically the default. The one that my reputation is mostly associated with, though, is an algorithm called snapshot isolation. It's a multi-version concurrency control
45:01
and I wasn't watching at the time so I'm not sure of the history. In 1995, two things happened. There was a paper which defined this isolation level and pointed out that it was not serializable.
45:22
And Oracle built this isolation mechanism and released it in their product when you asked for isolation level serializable. So those two things happened. More recently, SQL Server and PostgreSQL
45:41
offer this isolation level. So in SQL Server, it's called isolation level snapshot. In PostgreSQL, it's called isolation level repeatable read. How does snapshot isolation work? So in snapshot isolation, when you do a read,
46:01
it's gonna find a version for you to see but it's not necessarily the most recent committed version. It can actually go further back in the version chain and what it will show you is the version that committed most recently before the time
46:22
that the reading transaction had started. So the idea is that at the time a transaction starts, you imagine taking the database, taking the most recent committed version of everything in the database at that point, and imagine you took a snapshot of that collection of information
46:42
and whatever the transaction does from now on will work against that snapshot. Now it's not actually done by taking a snapshot. At runtime, you go back through the chain to find the appropriate thing but conceptually, it's as if you had a snapshot of the committed state at the time you ran.
47:02
Now this prevents the anomalies. You don't get phantoms, you don't get lost updates and so on provided you do one extra check and that is part of the mechanism. So that extra check is sometimes called
47:21
the first updater wins, or a variant, first committer wins, but the idea is if you have a transaction which modifies the data, it's not going to be allowed to commit if somebody else commits a change
47:40
to the same data in between the taking of the snapshot, the beginning of the transaction, and the end. In other words, if you have two concurrent transactions that are updating the same item, only one of them will be allowed to commit. So that is a crucial part of this mechanism.
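(A toy model of the two rules just described, the snapshot read and the first-committer-wins check, purely as a sketch of the idea rather than of any engine's actual implementation.)

```python
class SnapshotIsolationToy:
    """Versions of each item are (commit_time, txn, value), kept in commit order."""
    def __init__(self):
        self.versions = {}       # item -> list of (commit_time, txn, value)
        self.clock = 0

    def begin(self):
        self.clock += 1
        # A transaction is just its snapshot time plus its pending writes.
        return {"start": self.clock, "writes": {}}

    def read(self, txn, item):
        # Return the version committed most recently before the transaction started.
        visible = [v for v in self.versions.get(item, []) if v[0] < txn["start"]]
        return visible[-1][2] if visible else None

    def write(self, txn, item, value):
        txn["writes"][item] = value

    def commit(self, txn):
        # First-committer-wins: abort if any item we wrote was committed by
        # someone else after our snapshot was taken.
        for item in txn["writes"]:
            for commit_time, _, _ in self.versions.get(item, []):
                if commit_time > txn["start"]:
                    return False            # a concurrent writer already committed
        self.clock += 1
        for item, value in txn["writes"].items():
            self.versions.setdefault(item, []).append((self.clock, txn, value))
        return True

db = SnapshotIsolationToy()
setup = db.begin(); db.write(setup, "x", 0); db.commit(setup)
t1, t2 = db.begin(), db.begin()
db.write(t1, "x", db.read(t1, "x") + 1)
db.write(t2, "x", db.read(t2, "x") + 10)
print(db.commit(t1), db.commit(t2))   # True False: only one concurrent writer commits
```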
48:03
That check is what means that the mechanism avoids all of the standard anomalies that people had worried about and that had been written about in the SQL standard and other places and in Jim Gray's earlier papers. So snapshot isolation has lots of nice features,
48:21
reads are not blocked, and indeed, what I've found when I teach this is that most students think that snapshot isolation is actually what serializability means. They think that the idea that you don't see anything concurrent to you
48:43
should mean that you're serializable. Now, that's not what the definition says, and the crucial reason is a situation called write skew.
49:01
So in write skew, you have transactions which look at some data and each modifies a different row in that data that they're looking at. So neither sees the other in snapshot isolation. That's the intuition.
49:21
But if you had a serial execution, in fact, one would see the other. So this isn't serializable and this is actually really serious. Here's an example. Suppose we have a constraint on our database. It's a database of doctors and when they're on duty
49:42
and we've got a constraint. We wanna make sure that at least one doctor is on duty on each day. Now, we can have transaction one take a snapshot and modify one of the rows to say this doctor's taking a holiday.
50:01
Transaction two using the same snapshot modifies another row to say this doctor is taking a holiday. Because they're not modifying the same row, the first updater wins check doesn't prevent this happening. But when we look at the final state of the database,
50:23
the final state of the database isn't built up from the snapshots. It's built up from the modified rows. And in the modified rows, we see that both doctors have gone on holidays and there's nobody here for the patient to see and we've violated our integrity constraint.
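(The same scenario in miniature, with made-up data: both transactions read the same snapshot, each writes a different row, so the first-committer-wins check lets both commit, and the resulting state breaks the rule.)

```python
# Each doctor row: name -> on_duty flag; the business rule is that at least
# one doctor stays on duty.  Both transactions work from the same snapshot.
committed = {"alice": True, "bob": True}

def request_holiday(snapshot, me):
    """Application logic: only go off duty if someone else is still on duty."""
    others_on_duty = any(on for name, on in snapshot.items() if name != me and on)
    return {me: False} if others_on_duty else {}

snapshot = dict(committed)                       # the snapshot T1 and T2 both see
t1_writes = request_holiday(snapshot, "alice")   # {'alice': False}
t2_writes = request_holiday(snapshot, "bob")     # {'bob': False}

# Different rows are written, so first-committer-wins lets both commit,
# and the final state is built from the modified rows, not from the snapshots.
committed.update(t1_writes)
committed.update(t2_writes)
print(committed)                                      # nobody is on duty any more
print("constraint holds:", any(committed.values()))   # False: write skew
```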
50:42
So, that mechanism is built and running in a whole bunch of very common platforms out there, and indeed Oracle offer this when you say, I want to have a serializable execution. And it's not serializable.
51:01
That's sort of the starting point for a lot of the work that was done. So, oh, of course.
51:27
So, if you declare the integrity constraint to Oracle, I believe that Oracle will pick it up. Anybody here who knows better? Yeah, so, if you declare the constraint,
51:43
Oracle will pick it up. But if you have a business rule. Well, but I mean, there are a lot of business rules which are not declared as constraints to the system and it's left to the code to deal with it.
52:01
So, the comment was that that example would actually be hard to express. I don't know exactly what is supported. Anyway, my take on this is that it's not enough
52:20
just to protect constraints that are declared. You really do want to give the application programmer the ability to have constraints in the application and not be messed up by interference between transactions. So, let's try and look at these
52:43
non-serializable SI executions in terms of the serialization graph. (Question, or just stretching?) So, we know that non-serializable executions
53:01
mean a cycle in the graph, but what sort of cycle? So, the first result in this space was in Adya's PhD: if you have one of these non-serializable executions, the cycle will have at least two anti-dependency edges.
53:21
Two of these R-to-W edges in it. The work I was involved in with Dennis Shasha, who's here (wave to the crowd), and a number of others of us showed that not only are there two edges,
53:43
but you can find two of these read-write edges which are consecutive, one immediately following the other in the cycle. So, this is a theory result; as it stands it just says this is a situation that occurs.
54:03
And again, it's a PODS tutorial. I will at least sketch the proof out. So, the key parts of this proof are that the other sorts of edges, the WW edges and the WR edges,
54:20
in snapshot isolation come from transactions where one completely precedes the other. Those edges can only go from a transaction that finishes to another transaction that starts after it. So, there's no concurrency between them at all.
54:40
So, if you take one of these cycles, go and look at whichever transaction in that cycle is the first one to commit, the one that commits at the earliest point in time. That cannot have any WW or WR edges coming into it.
55:03
So, what's coming into it has to be an RW edge. Now, go and look at the edge that's before that one. That again can't be a WW or a WR edge because that would have to be sufficiently far back
55:23
that it's actually before this one which was chosen to be the earliest committer. So, you can find two successive anti-dependency RW edges in the cycle.
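(In graph terms, the theorem says every such cycle has to pass through a transaction with an rw anti-dependency coming in and another going out. Here is a minimal sketch of checking for that pattern on a labelled conflict graph; the edges are invented for illustration. Finding such a transaction does not prove there is a cycle, but if no transaction has the pattern, the execution must be serializable.)

```python
# Edges are (from_txn, to_txn, kind) with kind in {"ww", "wr", "rw"}.
edges = [
    ("T1", "T2", "rw"),   # T1 read something that T2 later overwrote
    ("T2", "T3", "rw"),   # T2 read something that T3 later overwrote
    ("T3", "T1", "wr"),   # T1 read T3's write
]

def pivot_candidates(edges):
    """Transactions with both an incoming and an outgoing rw anti-dependency.
    By the theorem above, every cycle in an SI execution passes through such
    a transaction, so no candidates means the execution is serializable."""
    rw_in = {t for (_, t, kind) in edges if kind == "rw"}
    rw_out = {t for (t, _, kind) in edges if kind == "rw"}
    return rw_in & rw_out

print(pivot_candidates(edges))   # {'T2'}: T1 -rw-> T2 -rw-> T3 is the dangerous pattern
```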
55:40
So, that's a nice piece of theory and it's a nice proof and elegant, but what is it good for? And the first sort of result I wanna talk about is a result of a class called robustness results where we say, can I identify a program, an application program, a set of pieces of code
56:04
where all of the executions will be serializable when run on a snapshot isolation database. So, essentially what we're saying is even though a snapshot isolation database allows anomalies in general,
56:20
you may have a particular application code that doesn't have any and an application code with that property is called robust and we wanna be able to tell what's robust. And so, there's a lot of complexity because a program code is not just a sequence of operations
56:40
it actually has control flow and parameters and so on. But anyway, the theory that was developed in this 2005 paper by Dennis and myself and others identifies something called the static dependency graph. So, it's a graph like this conflict graph
57:04
but it's between programs, not executions or instances of those programs and we separate out the edges into two sorts, non-vulnerable edges and vulnerable edges and essentially the vulnerable edges are the ones
57:22
which could occur concurrently. So, a non-vulnerable edge cannot arise between concurrent transactions. So, for example, if you've got a write-write conflict the first updater wins rule will say no, this isn't going to happen.
57:43
Similarly, the way reads take the snapshot from things that finished before they started means you won't have a WR conflict, but you might have an anti-dependency edge. So, because of this you can draw a graph. Now, because it's between programs
58:02
rather than instances, every edge has an edge going the other way as well. You have two transactions and there's an edge one way, there's also the edge the other way. But they're not necessarily the same in terms of vulnerability. So, you could have a situation where the edge one way
58:21
was vulnerable and the edge the other way was not. Once you draw that graph you define a dangerous structure which is a situation where you have, it's similar to what our theory spoke about. You have two of these vulnerable edges in succession
58:47
and they're part of a cycle. So, you have here this vulnerable edge from A to B. So, an instance of A could be concurrent with an instance of B and they could have a read-write conflict.
59:02
An instance of B could be concurrent with an instance of C and they could have a read-write conflict. And then in the graph there's some path which takes you back from C to A, including this trivial case where C is equal to A. And the theorem is that if you've got a set of code
59:22
where this static dependency graph does not have a dangerous structure then this application is robust. Every execution will be serializable even if you're running on an SI platform. And the amazing thing is that the TPC-C workload
59:42
has this property. So, this figure is taken from that paper. It shows the analysis of the programs in the TPC-C workload. There's a trick you have to do. The delivery program
01:00:01
has a control flow branch in it. You do one thing if there is an undelivered row and you do something different if there isn't and the conflicts are different in that case. And so in the analysis you actually separate that into two programs. One which represents one path through the flow
01:00:21
and one which represents the other. So anyway, when you draw that, what you'll see is that while there are several of these vulnerable edges, nowhere do you have two in succession.
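(A sketch of how one might mechanise that check over a static dependency graph; the programs and edges below are invented, not the actual TPC-C analysis. We look for a vulnerable edge into some program, a vulnerable edge out of it, and a path that closes the cycle.)

```python
# Static dependency graph between programs: (from_prog, to_prog, vulnerable?).
# Vulnerable means the conflict could arise between concurrent SI transactions.
edges = [
    ("PaymentLike", "OrderLike", True),
    ("OrderLike", "ReportLike", True),
    ("ReportLike", "PaymentLike", False),
]

def reachable(edges, start, goal):
    """Is there a path start -> ... -> goal using any edges of the graph?"""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for (a, b, _) in edges:
            if a == node and b not in seen:
                if b == goal:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

def dangerous_structures(edges):
    """Consecutive vulnerable edges A -> B -> C that lie on a cycle back to A
    (including the trivial case where C equals A)."""
    found = []
    for (a, b, v1) in edges:
        for (b2, c, v2) in edges:
            if v1 and v2 and b == b2 and (c == a or reachable(edges, c, a)):
                found.append((a, b, c))
    return found

print(dangerous_structures(edges))
# Non-empty here, so this made-up application is not certified robust; an empty
# result would mean every execution on an SI platform is serializable.
```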
01:00:48
So yet another proof but very briefly. Essentially you have to argue that the picture that you get in the conflict graph of an execution
01:01:02
reflects what's in the interference graph when you look statically at the programs that generated that application. And so you do a sort of case analysis and you show that the edges from the execution, each of them comes from a corresponding edge in the static dependency graph.
01:01:24
And if you've got an anti-dependency between concurrent transactions in the execution then you've got a vulnerable edge in the static graph. And so if we have a non-serializable execution on an SI platform we already saw
01:01:43
that you have this special structure with two anti-dependency edges in a row and then combining, closing the cycle. You just take the corresponding edges in the static dependency graph and you have two vulnerable edges in a row being closed to a cycle. So that's the theorem.
01:02:00
Another piece of work I did, I don't need to go into the details here, was to say, well, what if you have a system like SQL Server where you can choose for some transactions to use snapshot isolation and for others to use two-phase locking. How do you choose the isolation levels for each transaction to end up
01:02:22
with your application being robust. And it turns out that there is a very nice characterization. You can look at the graph, look for particular situations and you identify what's called a pivot which is a node in a particular graph with an edge coming in, an edge going out,
01:02:44
of a particular form, closing into a cycle. And the theorem says that you have to make the pivots use two-phase locking; the others can be given either two-phase locking or snapshot isolation, and that will be a robust application.
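The same search can be used to pick out the pivots and allocate isolation levels. A rough sketch, again with my own naming, treating a pivot informally as the middle node of two consecutive vulnerable edges that close into a cycle (the paper's exact definition is more careful):

```python
def has_path(edges, src, dst):
    """Directed reachability over all edges, as in the previous sketch."""
    seen, stack = set(), [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(b for (a, b, _) in edges if a == node)
    return False

def find_pivots(edges):
    """A pivot (informally) is a program B with a vulnerable edge A -> B in,
    a vulnerable edge B -> C out, and the cycle closed from C back to A."""
    vulnerable = [(a, b) for (a, b, v) in edges if v]
    return {b for (a, b) in vulnerable
              for (b2, c) in vulnerable
              if b2 == b and (c == a or has_path(edges, c, a))}

def allocate_isolation(programs, edges):
    """Give two-phase locking to the pivots and snapshot isolation to the rest."""
    pivots = find_pivots(edges)
    return {p: "2PL" if p in pivots else "SI" for p in programs}
```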
01:03:04
So now I'd like to sort of move on to some more recent work that people have been doing where they're looking at isolation properties in distributed and replicated systems.
01:03:23
So these systems have very high network latencies. If you're geo-replicated or geo-distributed, those latencies are really heartbreaking. And so the pressure to use weaker isolation
01:03:43
is very strong in these systems. People have identified an enormous zoo of different properties that various research prototypes offer in this space and to be honest, many of them seem to be more like,
01:04:03
well, "I found a way to get better performance by doing something in the implementation, I'm gonna define that as a consistency and isolation property, and programmers can cope with it as best they can." But some of them have taken root.
01:04:21
I'd particularly like to talk here about causal consistency. So causal consistency actually is an idea which dates way back to shared memory multiprocessor hardware but it's got a new lease of life in this geo-distributed computing setting
01:04:42
and indeed several well-known, very successful systems, Neo4j and MongoDB for example, now actually support this as a level. And indeed, tomorrow in the industry session
01:05:01
there will be a talk about MongoDB's implementation of this and the issues that they came across. So this is definitely not just a researcher's crazy idea; it is being taken up. Causal consistency is not easy to define carefully.
01:05:21
The essence of causal consistency comes from Lamport's notion of causality, the idea that one operation can be part of the past which influences another. And then what you have
01:05:41
is that if you know about the existence of a transaction that wrote to X, know in the sense that it's part of your causal past,
01:06:01
perhaps because you've got that information through things you've read in other items, and now you go to read X, then what you see has to be at least as up to date as the version that you know about. So you never see something which essentially has time slipping backwards for you, compared not just to what you've explicitly seen
01:06:23
but also to what you've learned about indirectly through other operations. So that's causal consistency. I'm not gonna talk about how it's implemented; you can go and hear the Mongo people describing that.
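As a toy illustration of that read rule (my own sketch, with per-key version counters standing in for real causal timestamps; this is not how MongoDB or any particular system implements it):

```python
class Session:
    """Tracks the newest version of each key this client knows about,
    directly or indirectly: a crude stand-in for its causal past."""
    def __init__(self):
        self.known = {}                     # key -> newest known version number

    def learn(self, key, version):
        self.known[key] = max(self.known.get(key, 0), version)

class Replica:
    def __init__(self):
        self.store = {}                     # key -> (version, value, writer_context)

    def read(self, session, key):
        version, value, context = self.store.get(key, (0, None, {}))
        if version < session.known.get(key, 0):
            # serving this copy would show the session something older than a
            # write it already knows about, violating causal consistency;
            # a real system would wait or route the read elsewhere
            raise RuntimeError("replica too stale for this session")
        # learn about the write itself and, transitively, everything the
        # writer knew about when it made the write
        session.learn(key, version)
        for k, v in context.items():
            session.learn(k, v)
        return value

    def write(self, session, key, value):
        version = self.store.get(key, (0, None, {}))[0] + 1
        # store the writer's causal context alongside the version so that
        # future readers inherit it indirectly
        self.store[key] = (version, value, dict(session.known))
        session.learn(key, version)
        return version
```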
01:06:41
The other reason I'm particularly choosing this one, besides it being a real level out there, is that there is a robustness result for causal consistency. Alexey Gotsman from Madrid has a very large collection of work looking at a number of different properties of systems and how one can reason about them,
01:07:04
and in particular there's this paper from CONCUR a few years ago which took causal consistency and did a robustness analysis for it.
01:07:20
So at the top level, firstly they give a definition of causal consistency. Not just a particular implementation but a general definition in an axiomatic framework. They then essentially don't work with the axioms anymore.
01:07:40
The axioms tell you things about the conflict graph. So they work with the conflict graph and they work with the static dependency graph and find a condition on that static dependency graph for robustness. They've also looked at a number of other isolation levels and some other results
01:08:02
including essentially re-proving the work I spoke about with snapshot isolation and extending it with other results, for example, when you can safely chop transactions in SI systems. But let's just have a very high-level look at how they do it.
01:08:22
So they use a theoretical framework to define isolation levels which is based on having a number of different partial orders between the transactions. There's one partial order called the happens-before order, which captures the information
01:08:40
that you get about what other transactions exist, sort of indirectly through the causality chain or just because of externally visible ordering. And they also have another order, which they call arbitration, which essentially corresponds to the commit order and
01:09:03
says when one transaction's writes are overwritten by another's. I do want to stress that this is less general than the sorts of frameworks for reasoning I spoke about previously. Not every concurrency control and isolation mechanism
01:09:20
can be described in their framework. In particular, there's an assumption called atomic visibility: when you read, you see all of a transaction or none of it. So in particular, read committed levels don't fit this framework at all. But within the framework (and I'm sorry, that slide was made by PowerPoint snipping,
01:09:43
sorry, PDF snipping, and it may not be totally readable; you don't need the details, I suggest you go and look at the paper to actually see them), the point is that they write down axioms about the different orders that exist
01:10:00
and how they relate, and then a concurrency control and isolation level is simply defined by which of these axioms apply to it. The robustness condition that they get has a lot of similarities to the one I spoke about for snapshot isolation: you have to have two edges in your cycle.
01:10:22
One must be a read-write edge, and the other can be either read-write or write-write. So being robust for causal consistency is a much harder thing to achieve than being robust for snapshot isolation.
01:10:42
But you can do it. And then they lift that to a condition on the static dependency graph in exactly the same way. Okay, so that sort of is the theory side of this talk.
01:11:03
I'm now about to go and talk about work that used that theory to inspire new isolation levels, new concurrency control mechanisms. And I've got to be a little careful because Michael Cahill, who did a lot of the work and actually had all the ideas, is in the audience.
01:11:23
But anyway, it might be a good time just to see if anybody has any questions on what I've done so far. So the question was: why do we teach two-phase locking when it doesn't guarantee serializability? Well, two-phase locking with those extra tricks of next-key locking on indices
01:11:41
does guarantee serializability, okay? So the question is: as a teacher, how do we deal with this? I'll tell you honestly, I teach two-phase locking without all those extra tricks, under the assumption that there is a fixed data set.
01:12:02
Then I go on to tell students about the places where the theory is not a perfect match. I actually think that that's a good lesson for students. I mean, I think it's important for students to know that theory can be powerful without being perfect. You know, we heard in the keynote
01:12:32
the statement by Box that all models are wrong but some are useful. And that in fact is the way I try to present this, okay?
01:12:43
I do say in order to get serializability, you have to do these extra things. It's not enough just to do the simple one. But I don't expect students necessarily to master the details of the extra complexity. But I mean, you know, as I said, when I teach this,
01:13:04
my students don't understand the difference between snapshot isolation and serializability. So I'm perhaps not a great model of a teacher, because I fail with that. I don't know if other people do better. I mean, I tell them, but telling and having the students understand
01:13:22
are not the same. Okay, let me then move on to this idea: instead of asking how we can check whether an application is able to work with these weak mechanisms, why don't we simply fix the mechanisms, right?
01:13:42
Which is a very natural question. And, you know, just to go off script for a moment: when I gave talks about that theory I've just been describing, Phil Bernstein said to me, why don't you build a concurrency control algorithm
01:14:05
that checks for and prevents these problems, the non-serializable executions. And I said, it's utterly infeasible; just tracking all of those dependencies
01:14:22
would be much too much work. I mean, theoretically it's possible, and indeed Phil himself is an author of serialization graph testing: if you keep the graph showing all of the conflicts, and you abort a transaction whenever it would complete a cycle in that graph,
01:14:42
you get serializable executions. And so if you took snapshot isolation and just ran a serialization graph testing certifier on top of it, it would be correct. And indeed, there's a paper which came after Michael's work, which actually proposed to do that
01:15:01
and did some experiments which, they said, showed that in some cases it performed reasonably. However, I felt that this was not realistic, and indeed the PostgreSQL implementers said the same thing: it's just not practical to track all of those edges.
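Just to make concrete what "tracking all of those edges" would mean, here is a rough sketch (mine, purely illustrative) of a serialization graph testing certifier: it has to record every conflict edge between transactions and run a cycle check at commit time.

```python
from collections import defaultdict

class SGTCertifier:
    """Toy serialization-graph-testing certifier: record every conflict edge
    and refuse to commit a transaction that lies on a cycle."""
    def __init__(self):
        self.succ = defaultdict(set)      # succ[a] = {b: a must precede b}

    def add_conflict(self, before, after):
        """Record a ww, wr, or rw conflict forcing `before` to precede `after`."""
        self.succ[before].add(after)

    def can_commit(self, txn):
        """Commit is allowed only if no path leads from txn back to itself."""
        seen, stack = set(), list(self.succ[txn])
        while stack:
            node = stack.pop()
            if node == txn:
                return False              # committing would close a cycle
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.succ[node])
        return True
```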
01:15:24
Fortunately, very fortunately, I had a brilliant student who, when I told him that this is completely infeasible, went ahead and did it. So Michael's at the back, giving a wave to the crowd. And so Michael Cahill came up with this technique
01:15:44
which is called serializable SI or SSI. So you take a system running snapshot isolation, you make small changes to the implementation, which essentially look for those situations
01:16:01
of two anti-dependency edges one after the other in the graph. When you see that, you abort a transaction. So you prevent having those two anti-dependency edges in a row, and if you do that, you will prevent non-serializable executions from happening.
01:16:20
And Michael's thesis did prototype implementations in a couple of systems. Ports and Grittner then actually did the work of building it for real, for production, in PostgreSQL. And indeed, if you download PostgreSQL now,
01:16:41
that's what's there. When you say, I want a serializable transaction, this is how it works. So it keeps the fundamental advantage of snapshot isolation in that transactions don't block any more than they would before.
01:17:00
Particularly read-only transactions don't block. But you abort some extra transactions, and those aborts include aborts that are vital to prevent non-serializable executions. So, the essence of Michael's algorithm is they put two bits in the information
01:17:23
that's kept for each transaction. One bit says, is there an anti-dependency edge going in? And one bit says, is there an anti-dependency edge going out? And if a transaction ever has both those bits set, you have to do an abort, okay? That's very simple.
01:17:42
Except it's not very simple, right? It's all very well to say, set a bit when there's a conflict. It comes back to that feasibility question: how do you detect the conflicts? And this was Michael's genius. There are two different situations where you have a read-write anti-dependency conflict.
01:18:05
One is where the write happens, and chronologically the read happens later. In that case, the anti-dependency is very easy
01:18:20
to detect, because the written version was already there. And so when the read comes in, in finding the version it should read, it skips over this version. As it skips over that version, it can say, ah, that's an anti-dependency edge. Mark the bit. In fact, mark both bits, right?
01:18:42
Mark them at each end. What about the other case? The case where the read happens, and then sometime later, but from a concurrent transaction, a version is produced that the read didn't see because it wasn't there at the time.
01:19:01
So this is much harder conceptually: how would I know? And Michael's genius here was to say, well, let's use the lock table to remember that the read had occurred. So he creates an extra lock mode. Remember we had the shared lock mode
01:19:21
and the exclusive lock mode. Michael created a special lock mode called the SI read lock mode. And when you do a read, you set an SI read lock on the item you're reading. That lock does not block anything.
01:19:40
It's just there so that when another lock is put on this item, it can see that this SI read lock had occurred and can therefore set the appropriate bits. And if I've got anything wrong, Michael can give a better explanation. But essentially, in either order,
01:20:02
you are able to detect that there is this anti-dependency, set the appropriate bits, and once you've got the bits, you can just do the abort whenever a transaction has both its bits set. So the main disadvantage of this is that it's a bit more conservative than the theory insists on.
01:20:20
It aborts essentially whenever you have two anti-dependency edges in a row. And the theory actually says, the only problem is if you have two anti-dependencies in a row and they're part of a cycle, which is not always the case. And so Michael does sometimes abort
01:20:42
transactions unnecessarily. And when Ports and Grittner did their implementation, they tried to reduce that. Michael already had some techniques to avoid some cases, but the PostgreSQL implementation put in a lot of extra work to further reduce the number of false aborts.
01:21:04
They also introduced the concept of a safe snapshot. So for certain transactions, you can detect that they're never going to be part of a problem, and so they don't have to set the SI read locks at all. So, in order to do this, the Ports and Grittner design
01:21:20
doesn't just keep a single bit for each transaction saying whether there is an edge in or out; it actually keeps track of which transactions have edges in and which transactions have edges out. So it can use that to do some of the work of these optimizations.
01:21:46
Sorry, you're asking about the reader's side, whether the readers are in or out? So the question was: are the readers getting locks? Yes, they are, they're getting these SI read locks, but those locks do not block writers.
01:22:03
Okay, so they're recorded in the system, but somebody else can come along and get a write lock and not be blocked. It's simply that the bits are set as appropriate.
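Here is a toy sketch of the two cases and the two bits (my own simplified model; real SSI, in Michael's work and in PostgreSQL, also deals with commit ordering, predicate reads, lock cleanup, and only considers transactions that actually overlap in time):

```python
class Transaction:
    def __init__(self, tid):
        self.tid = tid
        self.in_conflict = False      # some concurrent txn has an rw edge into me
        self.out_conflict = False     # I have an rw edge out to a concurrent txn
        self.active = True

def mark_rw_edge(reader, writer):
    """Record the anti-dependency reader --rw--> writer; abort a transaction
    that now has rw edges both in and out (the dangerous pattern)."""
    reader.out_conflict = True
    writer.in_conflict = True
    for t in (reader, writer):
        if t.active and t.in_conflict and t.out_conflict:
            t.active = False
            print(f"abort T{t.tid}: two consecutive rw edges detected")
            break                     # aborting one transaction breaks the cycle

class Item:
    def __init__(self):
        self.versions = []            # (writer, value), newest last
        self.si_read_locks = set()    # transactions that have read this item

class ToySSI:
    """Toy model only; it over-approximates by treating every reader as
    concurrent with every later writer."""
    def __init__(self):
        self.items = {}

    def read(self, txn, key, invisible):
        """`invisible` is the set of concurrent transactions whose writes are
        not in txn's snapshot."""
        item = self.items.setdefault(key, Item())
        item.si_read_locks.add(txn)   # the SI read lock: recorded, never blocks
        for writer, value in reversed(item.versions):
            if writer in invisible:
                # case 1: the write already exists and the read skips over it,
                # so we notice the rw edge right here
                mark_rw_edge(reader=txn, writer=writer)
                continue
            return value
        return None

    def write(self, txn, key, value):
        item = self.items.setdefault(key, Item())
        for reader in item.si_read_locks:
            if reader is not txn:
                # case 2: the read happened first; the SI read lock left behind
                # tells us this new version creates an rw edge reader -> txn
                mark_rw_edge(reader=reader, writer=txn)
        item.versions.append((txn, value))

# Classic write skew: both transactions read x and y, then each updates one.
db = ToySSI()
t1, t2 = Transaction(1), Transaction(2)
for t, other in ((t1, t2), (t2, t1)):
    db.read(t, "x", invisible={other})
    db.read(t, "y", invisible={other})
db.write(t1, "x", 0)                  # rw edge t2 -> t1
db.write(t2, "y", 0)                  # rw edge t1 -> t2: one transaction aborts
```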
01:22:21
Okay, so another sort of adaptation of these ideas came up a few years ago. Ryan Johnson and his student, Tianzheng Wang, developed what's called the Serial Safety Net, or SSN.
01:22:45
So it's a certifier which extends many of these ideas so that it can be used not only with snapshot isolation, but with a variety of other concurrency control protocols as well.
01:23:02
And in this one, rather than tracking edges, what you do is track a timestamp for each transaction, which is essentially the commit time of the transactions which came before this one
01:23:20
in terms of back edges, which are edges where the serialization order and the commit order are not the same as one another. And Tianzheng implemented this in ERMIA, which is a prototype system.
01:23:41
Okay, so the last thing I'm gonna talk about, very briefly, is a piece of work which has quite a different flavor. I mean, so far we've spoken about taking the weak mechanisms, finding an application, and making it run serializably. And indeed, that's the title of this tutorial.
01:24:04
It's about serializable behaviors. But some related work asks a different question. The virtue of serializability is that it will preserve every integrity constraint, whether declared or not; you don't have to know about it,
01:24:22
it will be preserved regardless. What if you said: look, rather than trying to guarantee every one, let me take particular integrity constraints and just try to see whether I can preserve those. So this was done in the setting of very distributed,
01:24:45
geo-distributed settings. This is work from Peter Bailis's PhD. And what he said is: suppose we have operations and we don't do any synchronization or concurrency control.
01:25:00
We just do the sort of standard thing: we update, and then basically anti-entropy propagates the changes eventually to everybody else, gossip behind the scenes. Whenever a node hears about changes, it merges everything it hears about together.
01:25:21
If we have a system like that, is it possible that some application and some integrity constraint can be preserved? We (I was a co-author on this, but it's Peter Bailis's work) gave this the term I-confluence.
01:25:44
It's a property of a particular invariant, a particular set of operations which can be performed on the data items involved in that invariant, and the algorithm that's used to do the merging
01:26:00
when information is gossiped. Given a particular integrity constraint, a particular set of operations, and a merge technique, we say that combination is I-confluent if the integrity constraint is actually preserved as this happens.
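As a toy, brute-force rendition of that definition for small examples (the real analysis is symbolic, not enumerative; the code and the two examples are mine):

```python
from itertools import product

def i_confluent(initial, operations, merge, invariant, depth=3):
    """Enumerate states reachable from `initial` by up to `depth` operations,
    keeping only invariant-satisfying states (as if each op committed locally),
    then check that merging any two reachable states still satisfies the
    invariant."""
    reachable, frontier = {initial}, {initial}
    for _ in range(depth):
        frontier = {op(s) for s in frontier for op in operations if invariant(op(s))}
        reachable |= frontier
    return all(invariant(merge(s1, s2)) for s1, s2 in product(reachable, repeat=2))

inc = lambda s: s + 1

# A grow-only counter with invariant "value >= 0", increments only, merged by
# taking the larger replica value: the invariant survives any merge.
print(i_confluent(0, [inc], merge=max, invariant=lambda s: s >= 0))                 # True

# The same increments against an upper bound "value <= 2", merged by adding the
# replicas' counts: two individually valid states can merge into an invalid one,
# so this combination is not I-confluent and needs coordination.
print(i_confluent(0, [inc], merge=lambda a, b: a + b, invariant=lambda s: s <= 2))  # False
```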
01:26:21
And I'm sorry, this slide is also not the most readable, but in Peter's paper and PhD thesis he actually finds that there are some examples of this. You can actually say, for example: if I have a foreign key constraint
01:26:40
and what I'm doing is inserting items on one side while the gossip is happening, then the constraint will be preserved in my system. So this is a lot less than being able to preserve arbitrary constraints,
01:27:00
but Peter looked at a fairly substantial body of Rails code from well-known applications and found that quite a lot of that code did identify application invariants. One of the things about Rails is that the invariants are not pushed into the database,
01:27:21
they're kept in the application, but they are declared there, and so his code was able to scrape that and have a look, and quite a lot of examples did fall into this category. So those would be applications you could just distribute this way and know that those invariants
01:27:42
would continue to hold. So that brings me to the end of the tutorial. I guess the most important conclusion I would offer from all of this is that having SIGMOD and PODS together
01:28:01
is the right thing to do, okay? Theory and practice have value in talking to one another. The practical systems work provides interesting theory questions like proving robustness and so on. And the theory provides inspiration
01:28:22
for new system designs that can exploit it. So I'm not sure how many people here are PODS people, how many are SIGMOD people, and how many see themselves as both, but welcome, and let's stay together.