Event Sourcing in production
Formal Metadata
Number of Parts | 131
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/69439 (DOI)
EuroPython 2024, 95 / 131
Transcript: English(auto-generated)
00:08
So I'm here to talk to you about running an event sourcing system in production. I've been working in Python for about five years now, I guess, and software for 15
00:24
years or so. We were using an event sourcing system at my previous job for a few years, and I'm trying to compile some experiences and hopefully some tips for you today. So in a nutshell, once I'm done talking here, I would like you to keep a positive
00:43
idea about event sourcing, maybe think about it and consider it for your next project or your next big refactor. So I will talk about some roadblocks and the patterns we used to solve them, common things that I think you will encounter when you start with event sourcing.
01:05
We have a very nice library in Python to handle event sourcing, which is called eventsourcing. Look it up on PyPI; it's flexible and provides a lot of really solid building blocks.
01:21
So I'm going to use the library today in my code examples to illustrate a bit what I'm talking about. It's not going to be a tutorial, so I'm going to make some simplifications to the library API itself. And I may also, hopefully not too much, muddy the waters a bit with definitions
01:42
and concepts. I'm trying to be a bit more pragmatic, and I don't have too much time. So I'm going to quickly touch on event sourcing and related concepts, together with code examples to show you a bit what it can look like in Python. After that, I will talk about the evolution of projections, domain, and runtime, and
02:04
these names and concepts will also be explained along the way. So first of all, my example today is not going to be a cargo shipping or bank transactions or shopping cart, which is usually what you see with domain driven design and event
02:23
sourcing. I'm using something a bit simpler. It's still a contrived example, because we don't have too much time today. Imagine you want to meet with your friends, and you have just developed basically a gathering service, just for your group of people.
02:45
So the service is quite simple. You don't have a lot of concurrent users. And that also means that your domain, the way you are working, you don't have to handle too many concepts. And I will show you a bit later what I mean by that.
03:03
So, as in the title, what is event sourcing? Simply put, it's about capturing state changes as a sequence of events, rather than just persisting the current state. These events are the one and only source of truth. Everything else is derived from them.
03:21
In comparison, the traditional approach persists the current state, which gets overwritten when you make a change to it. Event sourcing, on the other hand, offers a complete history of all the changes, and you build the state from it.
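To make the contrast concrete, here is a minimal sketch in plain Python (deliberately not the eventsourcing library API). The event classes `Created`, `Rescheduled` and `Cancelled` and the helper `current_state` are hypothetical names chosen for the meeting example; the point is only that the stored data is the list of events and the current state is computed from it.

```python
from dataclasses import dataclass


# Hypothetical event types for the meeting example (illustrative names only).
@dataclass(frozen=True)
class Created:
    topic: str
    schedule: str


@dataclass(frozen=True)
class Rescheduled:
    schedule: str


@dataclass(frozen=True)
class Cancelled:
    reason: str


def current_state(events):
    """Fold the full history of one meeting into its current state."""
    state = {}
    for event in events:
        if isinstance(event, Created):
            state = {"topic": event.topic, "schedule": event.schedule, "cancelled": False}
        elif isinstance(event, Rescheduled):
            state["schedule"] = event.schedule
        elif isinstance(event, Cancelled):
            state["cancelled"] = True
    return state


# The history is what gets stored; nothing is ever overwritten.
history = [
    Created(topic="Board games", schedule="2024-07-10 19:00"),
    Rescheduled(schedule="2024-07-11 19:00"),
]
state = current_state(history)
```

A traditional design would store only the dictionary in `state`; here it can be thrown away and recomputed from `history` at any time.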
03:44
If it's a bit confusing now, hopefully with the examples you will see what I mean. We have been doing this kind of thing for about 5,000 years already. We found some clay tablets from Mesopotamia from 3,000 BC, and basically they were already
04:00
keeping ledgers of market transactions. And a ledger is basically a record of transactions. So those are the facts that happened. It's not the tally, right? It's not who owes what right now, what the total is. That's a projection. You use all the transactions, all the events, to eventually work out where you stand in your
04:25
tally between basically two sellers, or a seller and a buyer, on your market. Greg Young, whom I quoted here, is a well-known advocate of event sourcing and domain-driven design. And I think nowadays most people using event sourcing really blend the two together, at
04:48
least the technical aspects of it, which is what I'm also doing today. So domain-driven design, what is it? It's a bit more recent, only about 20 years old now. It was first described by Eric Evans, and it's basically about sharing knowledge, having
05:09
a common understanding between the operators, so the users, the practitioners of your domain, and the modelers, so us developers, the technical folks.
05:22
It's a methodology, it's patterns, it is many things, but here, in a nutshell, are the technical aspects of it. You have basically the domain, which is above all the specific environment the business operates in, and today I'm going to equate a domain to a
05:42
namespace, so a module in Python. And then you have aggregates, which are the smallest addressable units in your system. They enforce a boundary, so nobody can ever reach within an aggregate to change its state. Everybody has to go through the aggregate, which means that the aggregate is used to
06:06
maintain the invariants, so any constraint is enforced by the aggregate. And aggregates, and this is quite important, are not about relationships, they're not about hierarchy, so don't think about them as a class diagram of your system.
06:24
It's more about the cases that exist in the domain and how best to support those rules, right, all the exceptions and all the conditions. You know the business rules, from there you can create the aggregate and the structure
06:40
is just enough to support that, you don't need to go overboard. So what do we get when we mix event sourcing and domain-driven design? I'm going to quickly show you with the example of the eventsourcing library. So that's just a high-level view of the core domain classes, the aggregate and the domain events,
07:08
and on the left, those two classes are basically meant to be derived from, those are just parent classes, right, with metadata, and on the right you can see how I'm using them. So what can happen to a meeting, basically, what are the facts that we want to record,
07:24
what are the events? And so a meeting can be created by somebody about a topic, it has a schedule, it can be edited, it can be rescheduled, cancelled and so on and so forth. So how can we use these events now, briefly, also still to illustrate the point, so
07:51
there are a few ways to weave the logic and the events together, and here I went for two-way binding with the event decorator over the dunder init method (__init__), and this basically
08:03
makes it so that when we instantiate a meeting aggregate, a new event, Created in this case, is created and queued for us, right, it's a quality-of-life decorator, basically. We can do it from first principles, but this is just a convenience offered by the library.
08:25
And from the other side, when we have, like, a sequence of events and we want to reconstruct the state from them, this binding also makes it so that when we see the meeting
08:43
created event, the __init__ method is called for us with the same attributes. So the attributes of the event match the parameters of the __init__ method here. Something else that we can see is the divergence here between the attributes
09:02
from the aggregate itself and the attributes that we have on the events. So if you don't really see it, I'm going to tell you. We have, for instance, the edited or canceled events, which have an additional reason attribute, right,
09:20
which makes sense when you think about the facts that happen to the meeting, edited or canceled: if you want to notify your participants, you will want to know why. But for the purpose of this aggregate, I am not storing the reason, or even the meeting owner, because the methods this aggregate has
09:45
to edit or to cancel are about invariants; they are about enforcing that the topic is not empty, right? It's not about who created it. I have no invariant on that in this case, so I don't need to care about it at the aggregate level, but I do want to care about it in general.
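The divergence between event attributes and aggregate attributes can be sketched as follows. This is written from first principles in plain Python, without the library's event decorator, and every name in it (`Meeting`, `Created`, `Edited`, `pending`, `replay`) is an assumption made for illustration. The events carry the owner and the reason; the aggregate keeps only the topic, because that is all its one invariant needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Created:
    owner: str
    topic: str


@dataclass(frozen=True)
class Edited:
    topic: str
    reason: str  # recorded as a fact, but not needed by the aggregate


class Meeting:
    """Write model: holds only the state that its invariants need."""

    def __init__(self, owner: str, topic: str):
        self.pending = []  # new events waiting to be saved
        self._check_topic(topic)
        self._record(Created(owner=owner, topic=topic))

    @staticmethod
    def _check_topic(topic: str) -> None:
        # The single invariant of this sketch: the topic is never empty.
        if not topic.strip():
            raise ValueError("topic must not be empty")

    def edit(self, topic: str, reason: str) -> None:
        self._check_topic(topic)
        self._record(Edited(topic=topic, reason=reason))

    def _record(self, event) -> None:
        self.apply(event)
        self.pending.append(event)

    def apply(self, event) -> None:
        # Owner and reason are deliberately ignored at the aggregate level.
        self.topic = event.topic

    @classmethod
    def replay(cls, events):
        """Rebuild an aggregate from stored events without queuing new ones."""
        meeting = cls.__new__(cls)
        meeting.pending = []
        for event in events:
            meeting.apply(event)
        return meeting


meeting = Meeting(owner="alice", topic="Board games")
meeting.edit(topic="Card games", reason="Nobody owns a board game")
rebuilt = Meeting.replay(meeting.pending)
```

The reason and the owner are still in the event history, so a read model can pick them up later even though the aggregate never stores them.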
10:05
It's not here, but I'm going to care about it. I will explain a bit further on where. The point is that I'm separating the commands and the queries. The command query responsibility segregation pattern is basically about keeping
10:28
the commands and the queries separated, as the name implies, and the data structures that they use. So the idea is when you want to make a change, you are focused, targeted,
10:40
you don't need to know about the whole world, right? You just have a very simple structure to initiate a command. And on the other side, when you query your system, you will likely want to have more information available, you want a broader view, so hence the difference between the write models and the read models.
11:03
And since the aggregate is this structure to enforce our invariants, it's basically our write model. And I'm keeping the read models aside, although we can also have them in the aggregate. But I will show you an example, which I've been using in production; it's basically
11:22
having read models projected to a Django model. First off, a quick look at a bit of the infrastructure of the eventsourcing library before looking at how we
11:45
can get the read models. So the application is the programmatic entry point to a domain. It basically holds the public commands in the domain, and it provides a means to persist the aggregate events. Yeah, and we can basically persist one or more aggregates at once
12:10
atomically. That's the point of the application: it is the unit of work. A process application is just another application, but one that knows how to react to other domain events.
12:25
So typically, this is where you will have local caches of aggregates from other domains, or where you will trigger more commands based on facts that happened before. More briefly,
12:41
the system is just the way to detail the graph. So which application is subscribed to which application, so that we know where to send the domain events, basically. And the runner is simply the executor, which pushes events around in the system. So an example of an application
13:05
here with the meetings. Basically, I guess I will go quickly about it. You can see that in the create method, we are basically creating a new meeting aggregate.
13:22
And as I explained earlier, we have a binding. So this also queues up an event for us. So when we save the meeting in the application, what happens behind the scenes is that the application collects all the pending events from the aggregate and saves them. Because what we want to keep is the events in the end, not the aggregate per se.
13:45
Which is illustrated in the second method, edit, where we look up the aggregate, and then we can actually execute a command on it. And here, self.repository.get(meeting_id)
14:00
is basically hydrating the aggregate from the events that are stored in your event store. So they are applied in sequence, and the state of the aggregate is rebuilt that way. So this is kind of glue code, basically, to not expose the persistence mechanism to the outside
14:25
world. And I have other examples in my extras. I'm not going to show that in the main run of the talk. But the idea of save is to be atomic. So we could potentially save multiple aggregates together, and they succeed or fail together. So the other methods are left here as stubs.
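The create, edit and save flow described above can be sketched without the library as follows. Everything here is an assumption for illustration: the in-memory dictionary stands in for a real event store, `MeetingsApplication.get` plays the role of the repository lookup, and events are plain `(name, data)` tuples. A dictionary gives no real atomicity; a real store would write all collected events in one transaction.

```python
from collections import defaultdict
from uuid import uuid4


class Meeting:
    """Tiny stand-in aggregate whose state is rebuilt from (name, data) events."""

    def __init__(self):
        self.id = None
        self.topic = None
        self.pending = []  # events recorded but not yet saved

    @classmethod
    def create(cls, topic):
        meeting = cls()
        meeting._record("Created", {"id": str(uuid4()), "topic": topic})
        return meeting

    def edit(self, topic):
        if not topic:
            raise ValueError("topic must not be empty")
        self._record("Edited", {"topic": topic})

    def _record(self, name, data):
        self.apply(name, data)
        self.pending.append((name, data))

    def apply(self, name, data):
        if name == "Created":
            self.id = data["id"]
        self.topic = data["topic"]


class MeetingsApplication:
    """Entry point to the domain: public commands plus persistence glue."""

    def __init__(self):
        self._store = defaultdict(list)  # in-memory stand-in for an event store

    def create(self, topic):
        meeting = Meeting.create(topic)
        self.save(meeting)
        return meeting.id

    def edit(self, meeting_id, topic):
        meeting = self.get(meeting_id)
        meeting.edit(topic)
        self.save(meeting)

    def get(self, meeting_id):
        # Hydrate the aggregate by applying its stored events in sequence.
        if meeting_id not in self._store:
            raise KeyError(meeting_id)
        meeting = Meeting()
        for name, data in self._store[meeting_id]:
            meeting.apply(name, data)
        return meeting

    def save(self, *aggregates):
        # Collect all pending events first, then store them in one go.
        collected = [(a.id, list(a.pending)) for a in aggregates]
        for aggregate_id, events in collected:
            self._store[aggregate_id].extend(events)
        for aggregate in aggregates:
            aggregate.pending.clear()


app = MeetingsApplication()
meeting_id = app.create("Board games")
app.edit(meeting_id, "Card games")
```

Callers only ever see `create`, `edit` and an identifier; how and where the events are persisted stays inside the application.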
14:52
So we could also add a new method here to basically run a query, rebuild the aggregate, and
15:00
expose some fields we want from it. But as I said, I prefer to have a dedicated structure for that. And I will actually show you here an example of two of the structures. So a pretty much one-to-one structure would be the meeting info. It's very close to the
15:21
aggregate in terms of structure. We know about the owner of the meeting. We know about the schedule, the participants. So basically, this is the kind of structure you want to use when you list the meetings that are coming up. If you expose that to your web application, that's your list operation. And another kind of model or projection we can make from the
15:48
same events is statistics. It's a very common approach, like report statistics. The same data is basically viewed from a different perspective. And here it's... I don't know if you were here
16:02
earlier for the aggregates and Django view models, but that's that. It's basically an aggregation of the different meetings. And here just some simple stats with the number of meetings and the number of participants. So that's just to give you an idea of how I view my data:
16:22
I keep it separated. This is the read model. It's not the write model. I don't need to know about the statistics to schedule a new meeting, basically. All right. So let's go a bit further. And now we are done with the concepts and
16:41
the illustration. Let's go into the use cases a little bit more. So I will begin with the projections. And first off, the simplest case, just when we need a new projection, and then a quick look at handling a projection that needs to change. So say that you have a domain whose events you're interested in.
17:05
Hint, hint. I am interested in the meetings events. So we'll write a projection for that. You can also have multiple domains you're interested in. That's also a strength of projection. It doesn't have to map to your aggregates. So we will basically
17:26
subscribe to the domain events with a process application. And in the callback, in the policy method, we will then have logic based on the type of the event that we see. And I would suggest, when you are going to production, that you basically fail loudly if you
17:49
encounter an event you don't know about. Don't be lenient. Don't have a permissive default case in your match statement. The idea is, I think it's better to fail and then fix than to silently
18:02
swallow or ignore the unknown event. Because then you will get to a broken state, basically, in your projection. And you might not notice it. Obviously, you want good test coverage to catch this before going to production.
18:22
It's better than having 500s. But the point about the read models is that if you have many queries, so queries in the general sense here, and they don't really intersect, that's a good use case for having different read models, right?
18:43
It's basically about normalization and denormalization. And you can choose which one makes more sense based on your use case. So a quick example here of how you would do it with a Django read model.
19:06
So you would have a new dedicated process application that is basically subscribed to the meetings application, which saves the events of the meeting aggregate.
19:24
And the idea here is that you basically dispatch on the domain event type. For instance, for the created event, you create a new entry in your Django model. When you have an edited event, you update this entry based on the identifier.
19:43
And when you have a canceled event, then you can just delete the entry. Because this is just a state you can rebuild. It's transient. And you don't have to maintain the history here. It's just really a flat state. And on the last slide, that's how you would say that the
20:03
meeting info projector, so this projection application, subscribes to the meetings application from earlier. And you can have many subscriptions in here. So that's the initial implementation of your projection, right? You write that. You run it. You're happy to have the read models there.
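The create, update and delete handling described above can be sketched with the standard library's sqlite3 module standing in for the Django model, so the example runs on its own. The table name `meeting_info`, its columns and the dictionary-shaped events are assumptions for illustration; with Django, the three statements would become the equivalent ORM calls.

```python
import sqlite3


def make_read_db():
    """Create an in-memory table that plays the role of the read model."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE meeting_info (id TEXT PRIMARY KEY, owner TEXT, topic TEXT)"
    )
    return db


def project(db, event: dict) -> None:
    """Apply one domain event to the flat read model."""
    kind = event["type"]
    if kind == "Created":
        db.execute(
            "INSERT INTO meeting_info (id, owner, topic) VALUES (?, ?, ?)",
            (event["id"], event["owner"], event["topic"]),
        )
    elif kind == "Edited":
        db.execute(
            "UPDATE meeting_info SET topic = ? WHERE id = ?",
            (event["topic"], event["id"]),
        )
    elif kind == "Cancelled":
        # The read model is transient, so a cancelled meeting is simply removed.
        db.execute("DELETE FROM meeting_info WHERE id = ?", (event["id"],))
    else:
        raise ValueError(f"unknown event type: {kind}")


db = make_read_db()
events = [
    {"type": "Created", "id": "m1", "owner": "alice", "topic": "Board games"},
    {"type": "Created", "id": "m2", "owner": "bob", "topic": "Hiking"},
    {"type": "Edited", "id": "m1", "topic": "Card games"},
    {"type": "Cancelled", "id": "m2"},
]
for event in events:
    project(db, event)
rows = db.execute("SELECT id, owner, topic FROM meeting_info ORDER BY id").fetchall()
```

Unlike the write-side aggregate, this read model does keep the owner, because listing meetings is a query concern.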
20:24
And then something happens, right? Requirements change. Or the read models or the domain events change. And you need to update your read models or your projection. And once you've done that, you have to replay the whole history
20:41
to basically catch up your projection with your domain events. So there are two aspects to dealing with that: the read model side and the projector side. So on the read model side, and I have to hurry up a bit, the easiest would be to basically delete all the entries, update your schema,
21:05
and then keep your application offline while you reproject. And that's what I recommend. If you can do that, it's the simplest. Just do it. Another approach would be to update your read models in place. But then you have a few more gnarly
21:21
things. Like you also need to migrate your read model schema itself. So you have to handle the conflicts there, because you have existing data you have to migrate. Plus, you will get the new data on top. And if you happen to query this data before the projection has caught up
21:40
with the domain events, then you are in an in-between state. So eventual consistency kicks in and basically you are lagging behind. Another model, easier to reason about: create a new projection, a new model, a new table, a new database.
22:02
And I would say switch to that, maybe using more of a kind of blue-green delivery mechanism. So you switch to the new entries, as the old entries become obsolete once you have basically caught up on the new entries. But this requires a bit more infrastructure
22:23
that is not built into the library. On the logic side, well, again we have the conceptually simplest option: write a new projector, because then you don't have to worry about its own state. It will just basically execute the whole history again and then you will have your
22:42
fresh read models. When you have Django, I mean if you're also using Django to maintain the state of your projectors, like you can do with the library, then you can pretty easily reset the state there, and on the next startup your projector will basically replay
23:01
the history again. That's a nice trick. So if you can keep your system offline and replay while it's down, it's easier, and then you have a consistent system on restart.
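The reset-and-replay trick can be sketched as follows, assuming the projector tracks how far it has read in an append-only event log. The `Projector` class, its `position` counter and the tuple-shaped events are hypothetical; in the setup described in the talk the tracking state lives in the database rather than in memory.

```python
class Projector:
    """Builds a read model from a shared log and remembers how far it has read."""

    def __init__(self, log):
        self.log = log        # append-only list of (kind, meeting_id, topic) events
        self.position = 0     # number of events already processed
        self.read_model = {}  # meeting_id -> topic

    def handle(self, event):
        kind, meeting_id, topic = event
        if kind in ("Created", "Edited"):
            self.read_model[meeting_id] = topic
        elif kind == "Cancelled":
            self.read_model.pop(meeting_id, None)
        else:
            raise ValueError(f"unknown event type: {kind}")

    def catch_up(self):
        """Process every event recorded after the last tracked position."""
        for event in self.log[self.position:]:
            self.handle(event)
            self.position += 1

    def reset(self):
        """Drop the read model and the tracking state to force a full replay."""
        self.read_model.clear()
        self.position = 0


log = [
    ("Created", "m1", "Board games"),
    ("Created", "m2", "Hiking"),
    ("Cancelled", "m2", None),
]
projector = Projector(log)
projector.catch_up()
first_pass = dict(projector.read_model)

projector.reset()     # e.g. after changing the read model schema or logic
projector.catch_up()  # the next start replays the whole history
```

Because the events are the source of truth, the second pass produces the same read model as the first; with changed projection logic it would produce the updated one.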
23:25
Okay, so let's shift gears. Let's talk a bit about the things that can get nasty, like domain evolution, and I want to talk about two things: domain extension, so adding new attributes, and invariant restriction, when you want to, for instance, add a unique constraint
23:44
after the fact. So I'm going to use the example of the meetings, and imagine that you are growing the circle of users a bit more, and now you also need to care about the location, because it's not only one city, not only one place. There are more people. So you have two ways of
24:02
doing it. You can either upcast existing events with new attributes or create a new event specifically for these new attributes. In the first case, you basically are deriving more information from existing events which is fine as the new consumers, the new projectors
24:24
will basically get the data that has been upcast, but the projectors that are already up to date, that have seen the data, won't see it, right? So you are responsible for triggering an update for them. So that's basically fine when you don't have consumers
24:42
or when it's only projectors that you can just reset and replay, but it shows its limits pretty quickly, semantically speaking, because there is no trace of what you're doing in the persistence, right? The upcast happens in memory. So if you basically were to only keep
25:03
your event store, then you wouldn't have this information in there. So just keep it to metadata and don't use that on core attributes. You could also create a new event. In this case, that's a location changed event, and that
25:23
basically keeps the backwards compatibility and just bridges the gap with existing events, right? Everybody will get this event as soon as they handle it and stop crashing on it. And the best thing is that your event sequence in your event store then has the semantics: you can
25:45
see that the location has changed. It's in there and you're describing basically what happened in your history more concretely. So in this example of location, it makes more sense because it's a core attribute to our domain. So that's what we want to do. But of course, when we
26:06
start, no aggregate has this information, right? It's undefined for all the aggregates. So how do you fill the gaps? Well, you need a kind of workflow to do this. So either you are able to derive the value from within the aggregate, from the current state, or you
26:25
need like an external input, right? You need to basically get the data from the outside world. So quickly, the point here is that you need to know about your aggregates so that you can
26:42
loop over them basically either from within the application or from the outside so that you can then ask them to derive the location. You need to hook in basically your control loop. So in this case, the example is you just ask for a specific meeting
27:04
to get this location. Just quickly passing over this. The other idea would be to then have a multi-step approach. So like a long-lived process in which you would basically keep track of all the aggregates and which ones are missing the value. And once you are up to date
27:24
with all the aggregates here, you can ask your tracker, okay, so now that you know, please send a notification to all the owners of the meetings that miss the value. Right? Just be careful because since it is a downstream consumer, you are potentially lagging
27:41
a bit behind as well, right? Just move it. So the second part of this domain evolution would be introducing uniqueness of an attribute after the fact. And obviously, if you don't have duplicates, then you don't have problems. You can just introduce a constraint and you're done.
28:01
If not, then you need to migrate your data somehow. And I want to stress that this is not a technical problem. I've just shown you an example of how to solve this with the approach where you track each issue and then ask people to solve it for you.
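A rough sketch of such a tracker, under the assumption (not stated in the talk) that the attribute that must become unique is the meeting topic. The `DuplicateTracker` class and its methods are hypothetical names: it follows the latest value per aggregate as a downstream consumer and reports which aggregates currently share a value, so their owners can be asked to resolve the conflict.

```python
from collections import defaultdict


class DuplicateTracker:
    """Downstream consumer that follows one attribute across all aggregates."""

    def __init__(self):
        self._value_by_id = {}  # aggregate id -> latest known value

    def handle(self, aggregate_id, value):
        # Called for every event that sets or changes the tracked attribute.
        self._value_by_id[aggregate_id] = value

    def forget(self, aggregate_id):
        # Called when an aggregate is cancelled and no longer counts.
        self._value_by_id.pop(aggregate_id, None)

    def conflicts(self):
        """Return {value: [aggregate ids]} for values used more than once."""
        ids_by_value = defaultdict(list)
        for aggregate_id, value in self._value_by_id.items():
            ids_by_value[value].append(aggregate_id)
        return {v: ids for v, ids in ids_by_value.items() if len(ids) > 1}


tracker = DuplicateTracker()
tracker.handle("m1", "Board games")
tracker.handle("m2", "Hiking")
tracker.handle("m3", "Board games")
open_issues = tracker.conflicts()

# Once an owner renames one of the meetings, the conflict disappears.
tracker.handle("m3", "Card games")
remaining = tracker.conflicts()
```

Only when `conflicts()` comes back empty is it safe to start enforcing the uniqueness constraint on new commands.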
28:22
It's really that you have to decide, from the business perspective, what you do with duplicates. Besides the approach I've just shown, you could also merge all the aggregates together, the ones that have the duplicate value, or you could use a temporal
28:41
approach and basically only keep, for instance, the last one that was updated. Discarding all the others, right? Again, this is really a decision you have to take based on your domain. All are valid, technically speaking. That's a quick example of the temporal tracker, but it's a bit lengthy on the screen. I don't have more time, so I'll just
29:03
skip over it. Again, you keep track of things and you just keep track of the last one that was updated, and in the end you can basically have a list of the winners. And the final chapter
29:23
is about synchronous systems, and I will just keep the last slide which summarizes the different approaches. And the winner in this case is, once you start to have traffic and you want to handle concurrent requests, go for a hybrid approach which is basically
29:47
have more workers but single-threaded runners. The slides will be available online, and I'm available for more questions. And I guess that's actually the end of my talk.