Putting Rails in a corner: Understanding database isolation
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 88 | |
Author | ||
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/37329 (DOI) | |
Publisher | ||
Release Date | ||
Language | ||
Producer | ||
Production Year | 2018 | |
Production Place | Pittsburgh |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
RailsConf 201831 / 88
9
14
16
19
20
22
23
26
27
28
34
35
36
37
38
39
41
42
46
47
53
57
60
62
63
64
69
72
80
85
87
00:00
DatabaseLine (geometry)DiagramComputer animation
00:43
Computer virusNeuroinformatikMusical ensembleRight angleGreatest elementMultiplication signSoftwareTheoryComputer animationLecture/Conference
01:26
Group actionDatabaseProcess (computing)DatabaseGroup actionGame controllerComputer animation
02:13
Confidence intervalComputer animationLecture/Conference
02:40
Term (mathematics)DatabaseRow (database)Context awarenessDatabase transactionFinitary relationObject (grammar)AbstractionCodeObject-relational mappingRow (database)SpacetimeTerm (mathematics)DatabaseLevel (video gaming)Lecture/Conference
03:38
Database transactionDatabaseSequenceACIDControl flowConsistencyDatabaseSequenceDatabase transactionLine (geometry)Term (mathematics)State of matterCategory of beingConsistencyAtomic numberInteractive televisionFocus (optics)BitSerial portComputer animation
05:10
Row (database)Object modelDatabaseEnergy levelInheritance (object-oriented programming)Game controllerDatabase transactionOnline helpLecture/Conference
05:40
ACIDDatabase transactionBlock (periodic table)Computer animation
06:03
PlastikkarteLocal GroupSign (mathematics)Mobile appGroup actionPlastikkarteLecture/Conference
06:53
PlastikkarteSign (mathematics)Structural loadEvent horizonSequenceElectronic signaturePlastikkarteSpeicherbereinigungZoom lensElectronic signatureGame controllerBitEvent horizonMobile appSign (mathematics)Structural loadRight angleMultiplication signComputer animationLecture/Conference
08:43
PlastikkarteDatabaseSign (mathematics)PlastikkarteElectronic signatureDatabaseLecture/Conference
09:27
Structural loadSign (mathematics)ConsistencyContext awarenessACIDMathematicsDecision theoryComputer animation
09:53
DatabaseMereologyDigital photographyCoefficient of determinationComputer animation
10:20
Level (video gaming)DatabaseSpectrum (functional analysis)Default (computer science)Computer clusterLecture/Conference
11:02
Default (computer science)Execution unitMultiplication signState of matterTouch typingDatabase transactionSerializabilityProduct (business)Inheritance (object-oriented programming)ConsistencyImplementationRow (database)CodeDatabaseMathematicsReading (process)Real numberEnergy levelLecture/Conference
14:20
Electronic signatureStructural loadSign (mathematics)Commitment schemeDatabase transactionSerializabilityCASE <Informatik>Electronic signatureDatabaseRollback (data management)Moving averageComputer animation
15:11
Interior (topology)Database transactionBlock (periodic table)Parameter (computer programming)Energy levelLecture/Conference
15:44
Concurrency (computer science)CodeDatabaseDatabase transactionDeadlockData modelCodeMathematicsParameter (computer programming)Line (geometry)Error messageReading (process)Multiplication signLogicStructural loadCartesian coordinate systemAbstractionOrder (biology)DatabaseObject-relational mappingComputer animationLecture/Conference
17:32
ImplementationRevision controlProgrammschleifeLine (geometry)DatabaseTable (information)Database transactionField (computer science)Group actionPlastikkarteCuboidDeadlockAreaEmailProcess (computing)Energy levelSurfaceRollback (data management)Message passingLoop (music)ProgrammschleifeLogicState of matterSerializabilityDecision theoryTerm (mathematics)Reading (process)Ferry CorstenBlock (periodic table)Queue (abstract data type)Analytic continuationEndliche ModelltheorieBitData storage deviceConsistencyOrder (biology)Lecture/Conference
23:02
Software testingPhysical lawInformation managementError messageSoftware testingStatement (computer science)DatabaseCodeState of matterLatent heatOcean currentDevice driverNatural numberLevel (video gaming)Inheritance (object-oriented programming)LogicRollback (data management)Field (computer science)Strategy gameCausalityNetwork topologyMoving averageEnergy levelDatabase transactionComputer animationLecture/Conference
25:33
Execution unitSoftware testingDatabase transactionSoftware testingStructural loadError messageConcurrency (computer science)Computer animationLecture/Conference
26:33
Slide ruleCodeDatabaseMathematicsCoefficient of determinationSoftware testingEnergy levelRow (database)Lecture/Conference
27:11
System callSlide ruleDatabase transactionDatabaseCodeCASE <Informatik>Form (programming)Energy levelLecture/Conference
28:27
Thomas KuhnComa BerenicesSlide ruleAttribute grammarComputer animationLecture/ConferenceXML
28:50
Block (periodic table)Data typeComputer animation
Transcript: English(auto-generated)
00:12
Cool, so I'll go ahead and get started. Thanks everybody for coming. The title of this talk is Putting Rails in a Corner, Understanding Database Isolation.
00:21
And that was originally a reference to a quote from the movie Dirty Dancing, but much like a lot of lines from 80s movies, it kind of came out of nowhere and doesn't really go anywhere. So this isn't really going to be a Dirty Dancing themed talk, but instead you get this lovely botanical illustrations, which I find quite nice.
00:44
So I'm Emil, and I'm now, as of the last hour, so a software writer at Binti. And Binti is a great company. We work in the GovTech space, working with foster agencies with the goal of helping
01:03
to find every child a family. And we're hiring if you want to talk to us about that. I'm also on Twitter, I've got my handle at the bottom right hand corner if at any time you want to tweet something about the talk. I talk about computers on there sometimes,
01:21
but mostly like music theory and jazz and sometimes cute animals. Cool. So let me answer the first question that's all on your minds right now is, should you try and sneak out or else who might benefit from this talk? So say that you have some controller actions
01:41
that you wrapped in a transaction, and you're finding that you're still having inconsistent data afterwards, stick around, this is probably for you. If you have big slow background jobs that use active job or something like that, and those need to run transactions,
02:00
this is another good candidate. Or if you simply write anything to the database that's based on something that you read from the database. So really quite general. In other words, people who need confidence in their data and probably use a SQL database, otherwise this might not apply as much.
02:22
But if on the other hand somebody just called you and said that they've got some ice cream and it's gonna melt, totally that sounds better. So feel free to duck out. Great. So, thanks for staying. And let me just make a meta note
02:43
before we really dive in. So this is kind of a dense topic, and I think the takeaway here is not that you'd be able to walk away and immediately write code based on sitting and configuring database isolation levels. So I'd recommend that you just get a sense
03:03
of the kind of problems that are affected by the space and how you might think about configuration, and then just remember some terms and go back and search for it later. Cool. So let's consider the context that we're talking about. So we're talking about transactions.
03:21
We're talking about how they interact with ORMs and especially active record. And it was awesome that DHH set this term up in our mind of leaky abstractions, because that's what this is all about and why we still need to talk about this.
03:40
So just a reminder on transactions. They're really like a sequence of interactions with the database that really you hope have, in a lot of cases, these following properties. So if you've heard the term ACID, it's an acronym that stands for atomicity, consistency, isolation, durability.
04:02
Atomicity means that either everything that you tried to do happened or nothing happened. Consistency means that when you run a transaction, your transaction starts with a consistent state of the database, and it ends with a consistent state of the database. It might be the same state or it might be a different one,
04:21
but it should be consistent. Isolation is kind of the primary focus of this talk, and I think it's really fair to use your intuition about what isolation means, but the official definition of this is something along the lines of, if you have a bunch of transactions
04:41
that are running at the same time, they can be run as if they were run one after the other serially. And then durability means that the things that you write and committed, you expect to be able to, that they stay there until you delete them at least. So what we're gonna talk about is how we can control
05:02
isolation and then affect control consistency. So not too much about durability, maybe a little bit about atomicity in this talk. ActiveRecord is awesome. It gives us this really helpful object model for dealing with our data, and it even gives us some controls for transactions.
05:21
So we can start a transaction, we can roll back a transaction. As of Rails 4, we can even control the isolation level. So that's super helpful. Unfortunately, it might hide some of the finer points about what's going on with the database that we need to know about to make our code behave like we expect.
05:43
So many of you may have used transactions, and you take a block of code, which you think should have those kind of ACID properties, and you wrapped it in a ActiveRecord-based transaction block, and then we hope that everything goes right.
06:01
Well, let's take a look at a scenario of an app called the Congrats app. So this is just an imaginary app, but it's an app to send a card from a group of people to one person, usually probably to congratulate them. So you can go in, you can create a card,
06:21
identify a recipient, and set a group of people who can sign that card, maybe leave a note, and then set a send-by date. So people can come in and sign and remove their signature or update their note. And then we have a send-by date, so say that not everyone has signed by that send-by date,
06:44
we go ahead and send it. If you've ever tried to collaborate and get a bunch of people to sign a card for somebody, there's always some stragglers. So we'll just put that into our app. So let's say that we have this timeline. So Edie gets a promotion, that's pretty awesome.
07:02
Edie has a friend named Pat, who is really cool, and decides that they wanna send a congratulatory card to Edie, and invites Dana and Reese. So Dana is totally on top of things, signs right away, but then some drama happens between Dana and Edie.
07:21
I don't know what it is, but I'd love to speculate. So also it's a totally imaginary scenario, so I can make it up whatever I want. But that's not the important bit. The important bit is that the next thing that happens is that Reese finally signs, and Dana at the exact same time decides to remove their signature.
07:43
So we've set up our app so that Edie's getting a card. But when are they gonna get it? Whose signatures are gonna be on it? It's not really obvious from this, right? So let's zoom in on that simultaneous event. So this could be one way that happens.
08:02
So Reese says, clicks the sign button, it comes to our controller. We say load the card with the signatures, add Reese's signature, see if everybody's signed, and if everybody has signed, then go ahead and send the card. At the same time, Dana clicks remove me,
08:21
I no longer wanna be associated. And we load their signature and then delete it. So it could happen this way, depending on garbage collection or whatever. It could happen this way, who knows? So what's supposed to happen in either of these scenarios?
08:42
What do we intend to happen? Well, it could go two ways, as far as I can see. One is that we don't send the card until the send by date and only the people who intended to sign during that entire timeline are represented. And I think that's probably what I would want. So like Dana's signature is not on there,
09:00
but everybody else's is. But we could also end up sending the card before the send by date and still having deleted Dana's signature from the database. So at some point, even in our, not only have we misrepresented what everybody wanted, in our database, we have a card that was sent
09:21
before the send by date without all the signatures. So that seems bad. So why can we end up in these inconsistent states? Well, when we actually do the sending, we're assuming that the context from before hasn't changed.
09:41
We just made a decision based on that context. So the thing that we have to know is, even though we talked about those ACID transactions, what went wrong is that the database actually makes trade-offs. So we relax, the database is relaxed isolation
10:03
to improve concurrency and performance. So the database says, in certain cases, we don't have to be as strict. And I put this really cute dog photo on there so that you'd remember this really key important part. So remember the blissed out dog.
10:22
So there's actually a spectrum of isolation that we can choose, and it trades off between performance cost and how isolated you are. And don't try and read all these right now. We'll go through them. But you are always in one of these isolation levels,
10:40
no matter what. And depending on which database you're in, you might be in one by default or another one by default. And also, depending on which database you're in, it might implement these differently because the requirements are only minimum requirements. At every level, it could actually be more isolated than is specified.
11:01
So let's go through these. Read uncommitted TLDRs, there's no guarantee about isolation. There's nothing that is required of the database to isolate your transaction. So you can, in fact, even read rows that have been updated or inserted in other transactions that haven't even committed yet.
11:24
And that means that if you read data during your transaction, if you read it again, it could be totally different. There's no warning, and you don't know where it came from. And in fact, it might even roll back. So it's really, you have no guarantees whatsoever.
11:41
I can't think of a really good reason to use this for production code, but I have actually used it for sneaking a peek at what's going on in MySQL and production if I have a really long-running transaction. So for, the next level up is read committed. So in this case, when you read something
12:03
from the database, you're guaranteed that it has at least been committed by another transaction. However, that means that you still may be able to read data twice, and it be different without you having changed anything in that transaction.
12:21
So it still doesn't sound super isolated, but it's better than reading somebody's kind of like half-done work. And then there's no real warning if anything happened. And this is what Postgres chooses its default. Repeatable read is the next level up. So this means that once you've read a row,
12:41
if you try and read it again, you'll only see changes that you've made to it. So at this point, we're now in a place where the things that we kind of touch during our transaction are now consistent within the transaction. So that's a pretty nice guarantee.
13:01
And what happens is if you read something and then start updating things, the database will warn you and say, hey, you just did something based on something that changed and committed to a different state by the time you were done with your transaction. So you may want to do something else.
13:21
So it won't fix anything. It's not an automatic fix, but it is a really helpful warning. You can still end up with inconsistent data in this level. It's a little bit harder, but it's definitely possible. And then this is what MySQL chooses as its default, although, again, I said it's implemented differently
13:43
from Postgres. Serializable is basically full isolation. So transactions can only happen in a way that they could have been written serially. So this is hopefully an implementation of that full I in the asset spec.
14:02
I could not think of a way to get inconsistent data without incorrect code with this level, but if you think of a way or you know of a way, let me know. I think that'd be really interesting. And then this is the most expensive. So let's go back to our scenario
14:21
and see what would happen if we used repeatable read or serializable with this scenario that happened before. So if we got to the end of the transaction in Reese's transaction and say Dana's committed before,
14:40
then we would get a warning from the database that says, hey, sorry, your assumption's changed. Whatever you did, you can't rely on that. We're gonna roll back everything and try again. So that's at least a helpful warning to you to help you know that you have to do something else. It's also possible that Dana could have
15:02
gotten a rollback and said, sorry, we couldn't have deleted your signature before we sent the card, but at least you'd be able to warn the user in that case. So everything I talked about before is basically just for SQL. We didn't really need to know anything
15:21
about Rails or Ruby yet, but as I mentioned, you can enable isolation in Rails as of Rails 4. And what you do is when you have a transaction block, you specify an isolation level
15:41
as an argument to the transaction block. So cool. We're all done, right? It was less than one line of code for this whole talk. It's simply a parameter change. Yeah, not quite. So as I mentioned, as you increase the isolation guarantees, your performance may suffer.
16:01
And it may suffer because, well, there is actually more load on the database as well. You may hold locks, things like that. You may have to repeat transactions, but that's the cost of doing business there. And depending on how you implement things, you may introduce deadlocks. It's not the worst thing in the world. It actually looks a lot like those serialization errors
16:22
that we saw before, but it's another thing that might come up. So what's actually special about ORMs and an active record? Well, there's this great abstraction that reads from the database for you
16:42
at times that allow caching, and then it caches things, and it also writes to the database, and it may do that at any time. So you actually don't always really know if the data that you're reading has been read into a model yet.
17:02
So we have to go through special procedures like pre-loading and eager loading or refreshing. So in order to get this great facility from the database that warns us when our isolation has a problem, we need to give a hint to the database to tell it,
17:24
hey, I'm using this data, and I'm making assumptions based on it, and my logic, my application logic needs them. So let's consider how we might implement that send by date that we mentioned. So one thing that we could do
17:41
is use an active job implementation, and if you're familiar with active job, it's really cool, or if you're not, it's also cool. But one thing that you can do is call these jobs and say perform this later on this model. So maybe we've got a card model, and we say send the card at the send by date
18:02
later on with this card, and it does this really cool magic. Underneath the hood, it says, I know what this card is. I'm going to take an ID, stick that into whatever our store is for our background job queue. Maybe it's Redis or active MQ or something like that.
18:21
And then when I call you back in this perform method, I'm gonna reform that card for you and pass it in. Unfortunately, what that means is that the read for the database happened outside the transaction. So we thought we could use the serializable
18:42
isolation level for the body of this active job, but it turns out that the database has no way of knowing that we're using the data in the card to make our assumptions. So one thing that we can do to improve that, unfortunately we have to kind of not let active job
19:02
get the card for us, but it's not too bad. We can just introduce an additional line here, pass in the card ID to perform the job later and then grab it within the transaction. So now the database knows, hey, this is something that we're using for our logic.
19:25
But there's a problem. What if we sent a card in the actual mail and then the database told us, hey, you shouldn't have done that. Somebody's gonna have to go digging through the post box and that's not great.
19:41
So we can try something else. So this is another technique that you can use, which is we'll still continue to pass the card ID and we'll still grab it within the course of the transaction. And then maybe we'll add a Boolean field to the card table that says, are we committed to send? And what we want to do in terms of getting the logic here
20:04
is we wanna say, we're only gonna send the card in the mail when this field goes from false to true. Never again. So it should happen really at most once and depending on how we, it might happen at least once as well, depending on how we deal with our jobs.
20:22
But here we kind of guarantee that happens at most once. And so in this second line here with the return, we're making a decision based on what we read from the database. Is this committed to send? If not, then continue on, say we are committed to send.
20:41
And then the only way that we can either exit the transaction and continue on to the next line or exit the transaction without a rollback is if our assumption went from false to true and we use the database to helpfully get us from one state to the next just once.
21:02
Gotta watch out for loops too. So one thing that you might think about doing is sending like a reminder email to everybody who hasn't signed a card yet and it's before their send by date. So one thing you might think is, I'm just gonna do this loop in a transaction, do them all at once. And that might be safe and it might be okay too
21:23
if you know that this doesn't happen or if there's not very many cards like that or not very many reminders to send out. But you may also experience a lot of rollbacks. So really you gotta watch out for the surface area of your transactions. So it's better if you don't make a lot of assumptions
21:42
and then make a lot of actions based on those assumptions. So classic advice, keep your transactions small. If you do decide to do that though, at least try and put an ordering, a consistent ordering on the way that you access resources here we're only accessing them from one table but we at least should try and access them
22:03
in say an ID order or something that's consistent between transactions. Otherwise you could lead to deadlocks. Now I gotta talk too much about that but that's another thing that I would recommend digging into a little bit more if you haven't come across database deadlocks. That's another huge topic.
22:22
So maybe a better idea would be to go and get all of the cards that you think might, not that might need a reminder and then do a transaction around each one and then explicitly do a reload within the transaction block to say,
22:41
hey, go get this from the database again, tell it that we're using this data and then make our decision based on what we read within the transaction and then send it if need be. A little trick you can do is just select the ID and then if you do a reload it'll actually reload all the fields for you.
23:04
So we talked about how to configure and use all of the database isolation levels but how do you actually identify those rollbacks so you can make a decision about whether or not you should do something? Well, if you're running a transaction
23:24
what will happen is it will throw, ActiveRecord will throw this ActiveRecord statement invalid error. Unfortunately at the current state of things the exact nature of that statement invalid could be anything.
23:41
It could be that your SQL is invalid or something else. So ActiveRecord does provide this cause field on this statement invalid error and then what's attached is database driver specific, not just database specific. Postgres is pretty well typed
24:01
and then within there you can decide to try or do not retry depending, not to paraphrase Yoda. And MySQL has done less with the inheritance tree here so it's just error and these also
24:21
might all be different for JDBC as well so you really have to, the best way to do this is try and create scenarios where you will run into isolation errors and see what's thrown unfortunately. We haven't talked at all about tests but also I'm sorry, this leads to trouble with tests.
24:44
So between each test, for most tests ideally what you can do is set up a transaction before the test runs, run all your logic and then when it's done roll it back and get back to the state that you started at. Possibly empty or possibly with some seed data
25:02
and that's great, it's super fast and it's a really consistent way of dealing with test data. Unfortunately, once a transaction has been started you cannot change the transaction level. So if you're testing code that changes
25:22
the transaction level, you probably are gonna wanna use deletion or truncation strategies for your database cleaner. So I wrote a little poem to mention this. This is just to say that I've slowed the tests that were in transactions which you're probably hoping would remain fast.
25:42
Forgive me, they were so simple and so clean and so clear but now they're not, sorry. Also apologies to William Carlos Williams. So some more testing considerations. How do you actually test the concurrency here? It's actually really hard, I don't have a great answer for you, I'm not gonna lie. So one thing you can do is load testing.
26:01
This is more likely to come up with load testing and then the other thing you can do is kind of manually test it and this really stinks but you can try and add random sleeps to things just to prove at least to yourself what kinds of errors come up and that they're handled.
26:20
Yeah, I can't say sorry enough for that but it totally works and it works really consistently but it's really hard to automate. So if anybody comes up with some great ideas for testing, that would be awesome. So let's do a quick review. So remember my top happy dog slide.
26:43
Databases try to trade at isolation for performance. The database and active record will let you choose the level but choosing it may require code and test changes and it's a performance hit, it's not just a performance hit and I say it's worth it for the correctness.
27:02
I mean, when I say it's worth it, I mean it's required. There's not really much else you can do, you just kind of have to do it, so yeah. So I added this slide after I saw both DHH's and Eileen's talk and I realized
27:21
as I was going through this talk, this kind of sucks. You know, this is really hard and when you saw the final form of that job, I hope that like, or I don't know if you're like me, I was like wow, this really looks like threading code and it's really kind of rough. So I don't think it has to be this way.
27:42
I think we can do more with this and I think the more people understand this and kind of get an idea of the use cases for this, maybe we can do better and try and push some of this stuff further down the stack. Maybe we can figure out when we entered a transaction that changed the isolation,
28:01
it was very specific about isolation level and we explicitly reread things so that we, so that we can hint to the database that we made assumptions on that data, something like that. But we can definitely compress the concept down into Rails and ActiveRecord. I'm not totally sure how yet, but there's definitely work
28:21
that we could be doing with that. Cool, I just saw that. So thanks to SlidesCarnival, which I use this nice slide deck under creative commons attribution. And thanks everybody.