Service oriented architectures (Hardcore separation of concerns)
Formal metadata
Number of parts | 150
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use and modify the work or its content for any legal and non-commercial purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and pass on the work or its content, including in modified form, only under the terms of this license
Identifiers | 10.5446/51391 (DOI)
NDC Oslo 2013 (149 / 150)
Transcript: English (automatically generated)
00:05
All right, everybody. I think there should maybe have been some background music in this room, because I feel I have to talk very softly, because there's just a low volume in here. First of all, I'm very pleased that so many people showed up.
00:20
I was a little bit scared, because there was a talk on Wednesday that actually had only two people in it. I hate the title of my talk. It's a very bad title. So I was afraid that the same thing was going to happen to me. So this exceeds all my expectations. So, I'm augustl on Twitter.
00:42
I'm August Lilleaas. I'm a consultant, or a contractor, at a place called Kodemaker. That's mostly a Java shop, so since this is a .NET community, I don't expect that anyone here has heard about Kodemaker at all, because there's absolutely no intersection between the .NET and the Java communities for some reason. Anyway, that's me.
01:00
So I'm going to be talking about... I'm really going to be talking about the stuff in parentheses, hardcore separation of concerns. I'm not going to be talking a lot about the traditional service-oriented architectures. That was actually a mistake on my part in giving the talk the title service-oriented architectures, because that's a very enterprisey name that people don't like.
01:23
I sort of came up with it myself and didn't realize that it was actually a term in our industry. So there you go. This talk is basically just my experiences in making some systems that are very hard-corely separated.
01:42
I made three systems actually recently that share a very similar architecture, which is very separated in nature. I'm going to be talking about one of them now. This is basically a three-part talk. I'm going to just start with the service part.
02:03
And again, I'm not going to be talking about enterprise service buses and other nasty things. There will be no Camel and no frameworks. This is actually much more descriptive. So, I don't agree with everything Rich Hickey says, specifically not the things he says about testing,
02:21
but I certainly agree with this. Rich Hickey is a great designer, and he says that good design is about taking things apart. So that's what I've been doing in my projects recently. I've been taking things apart. And these are the things that have been taken apart. And don't worry, you don't have to sort of memorize this graph
02:43
because this is what I'm going to be talking about in the next 30 minutes at least. But this is sort of just the overview. So as you can see, there's not one box here. There are many boxes and many arrows and many things. And I'm going to be explaining these things to you and how they work, why I've separated them,
03:02
some of the benefits, some of the trade-offs, and so on. So I'm going to start. I'm not going to go through all of these. I'm going to go through most of them. I'm going to start with this one, the central API. And this is the part I think that everyone here can relate to because it's basically just an API.
03:22
I don't think you will have any troubles understanding this particular component. It implements all the business logic. So I can do a post to logins. This, by the way, would be familiar if you worked with Rails, which has a very resource-oriented mindset.
03:41
So when I log in, I actually sort of create a new login. That's the mindset here. But that's not the point. The point is that the central API has the functionality you need to pass in a username and password and be informed whether or not this user actually exists. You can also get information about a single user.
04:01
This specific system is sort of a web app for landlords, for invoicing, and stuff like that. So obviously a user has a list of rental objects. That's basically the housing units that they are renting out to other people. So all of this stuff lies in the central API.
04:23
And of course there's a lot of other stuff, not just these three things. This is just sort of a sneak peek. All this stuff is here. So what's a little bit strange about this thing maybe is that there is absolutely no authentication here. None whatsoever.
04:40
So if you have access to this central API, you can just get user slash one, or you can change this to user slash two, and you see the information about that user. There's absolutely no authentication at all. So that's a little bit strange maybe. Obviously this is behind a firewall. It would be no good if this was available to the general public.
05:03
So it also sort of happens to have a database, as I like to put it. This central API basically is the database in the system. It has a database. The central API talks to a database to store data, obviously. But I could completely change to another database, and no other part of the system than the central API would be affected by that.
05:23
So here's the central API basically. All the business logic. If you have access to the central API, you have access to everything. There's no authentication or anything like that. So perhaps you can sort of start to see the hints here of separation of concerns. No authentication.
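The central API described so far can be sketched roughly as below. This is a minimal Python sketch, not the Clojure code from the talk: the route names (`/logins`, `/users/<id>`, `/users/<id>/rental-objects`), the sample data, the status codes, and the `handle` function are all assumptions made for illustration. The one property taken from the talk is that no request is ever checked for who is making it.

```python
# Hypothetical in-memory stand-in for the database behind the central API.
# Plain-text passwords are used only to keep the illustration short.
USERS = {
    1: {"id": 1, "email": "anna@example.com", "password": "secret"},
    2: {"id": 2, "email": "bob@example.com", "password": "hunter2"},
}
RENTAL_OBJECTS = {1: ["Flat 4B"], 2: ["Basement studio", "Garage"]}


def handle(method, path, body=None):
    """Dispatch a request and return (status, payload).

    Note what is missing: the caller's identity is never checked.
    Anyone who can reach this function can read any user's data.
    """
    body = body or {}
    parts = [p for p in path.split("/") if p]
    if method == "POST" and parts == ["logins"]:
        for user in USERS.values():
            if (user["email"] == body.get("email")
                    and user["password"] == body.get("password")):
                return 201, {"user_id": user["id"]}
        return 422, {"error": "unknown user or wrong password"}
    if method == "GET" and len(parts) >= 2 and parts[0] == "users" and parts[1].isdigit():
        user_id = int(parts[1])
        if user_id not in USERS:
            return 404, {"error": "no such user"}
        if len(parts) == 2:
            return 200, {"id": user_id, "email": USERS[user_id]["email"]}
        if parts[2:] == ["rental-objects"]:
            return 200, RENTAL_OBJECTS[user_id]
    return 404, {"error": "not found"}
```

Changing `/users/1` to `/users/2` in a call to `handle` returns the other user's data, which is why this component has to sit behind a firewall.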
05:40
It does not concern itself with authentication at all. So, that's the central API. I'm going to move on to this thing. Apparently it's called a user API. So let's see what that is. So, this is also maybe a little bit weird. Like, another API. Why do we have two APIs?
06:02
So, the user API talks to the central API, and this is where authentication is implemented. This is basically the point of the user API. To implement authentication to the central API. So it sits in front of the central API. As I said, the central API is behind a firewall.
06:21
This one obviously isn't. This is the thing you talk to. This is the actual API that is available publicly, except you obviously have to log in in order to access data. So you can log in, obviously, without having to authenticate. You can do password resets, stuff like that. And then all the other stuff, like getting information about a user,
06:44
getting the list of rental objects, invoices, all that stuff. You need to be authenticated in order to do that. And let's just look at a quick example of how that works. So, in the user API, you start by doing a get to slash rental objects.
07:04
If you don't pass in an authentication header, you will get 422, no, what's it, 403 not authenticated. So you have to use, like, the other part of the user API that we saw on the other slides, this login password reset stuff, to get this auth thingy you need in order to get the rental objects.
07:24
You pass along this auth header thingy. There's no cookies here, no cookies at all. It's just a header containing an auth token. And then, based on that auth token that we pass in to the user API, we fetch the user information from the central API,
07:42
and then we rewrite the path. So remember, the original path was slash rental objects. The path in the central API is slash user, and then the ID of the user, slash rental objects. And this is essentially what the user API does. It puts slash user, slash ID in front of the path to all the requests that you make.
08:05
So, as you can see, a request to rental objects ends up being a request to the central API as users, slash the ID of the user that the auth token belongs to, and then we get the original path that we passed to the user API.
08:22
So this is sort of a fundamental separation of access. If you have access to the central API, you have access to all the data, but obviously we need someone to implement that a user can only access its own data. And this is what the user API does.
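The path rewrite the user API performs can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the speaker's implementation: the header name `X-Auth-Token`, the `TOKENS` table, the public path names, and the `rewrite` function are hypothetical. In the talk the token-to-user lookup goes to the central API rather than a local dictionary, and the 403 status is the one mentioned in the talk.

```python
# Hypothetical token table. In the talk, the user API asks the central API
# which user an auth token belongs to.
TOKENS = {"token-abc": 7}

# Paths that can be used without being authenticated (names are assumptions).
PUBLIC_PATHS = {"/logins", "/password-resets"}


def rewrite(path, headers):
    """Return (status, central_api_path) for a request to the user API."""
    if path in PUBLIC_PATHS:
        return 200, path
    user_id = TOKENS.get(headers.get("X-Auth-Token"))
    if user_id is None:
        # Missing or unknown token: the request never reaches the central API.
        return 403, None
    # Scope the request to the authenticated user by prefixing the path.
    return 200, f"/users/{user_id}{path}"
```

For example, `rewrite("/rental-objects", {"X-Auth-Token": "token-abc"})` yields the central API path `/users/7/rental-objects`, while the same call without the header is rejected. Because the prefix comes from the token and not from the request, a user has no way to name another user's ID.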
08:41
So, we're getting further along in this fancy graph. So we talked about the central API. We talked about the user API. So now we basically know what those are. User API is just a proxy in front of the central API. Then there's this other thing over here called the main web app.
09:00
And the main web app is the main web app, obviously. This is what the customers use. They click around, it serves HTML, all that good stuff. So this is where the login form is. This is where we have cookies. So the user API does not use cookies at all. But the web app does, because cookies are obviously the way to store authentication information for browsers.
09:25
And this is a web app, so it's made for browsers. So it calls the user API a lot. It uses the cookie to fetch the auth token, and the user API takes the auth token, which again talks to the central API.
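The web app's part in this chain, turning a browser cookie into the header the user API expects, might look like the following. It is a minimal sketch using Python's standard `http.cookies` module; the cookie name `auth_token`, the header name `X-Auth-Token`, and the function name are assumptions, not details from the talk.

```python
from http.cookies import SimpleCookie


def user_api_headers(cookie_header):
    """Translate the browser's Cookie header into headers for the user API.

    Returns None when the browser has no auth cookie, in which case the
    web app would show the login form instead of calling the user API.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("auth_token")  # hypothetical cookie name
    if morsel is None:
        return None
    # Cookies stop here; the user API only ever sees a plain header.
    return {"X-Auth-Token": morsel.value}
```

Keeping this translation in the web app is what lets a non-browser client talk to the user API without knowing anything about cookies.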
09:42
And that's basically it. So these components don't do a lot. They're very separate. They have very specific roles. But there's one sort of weird thing here. And this is that we always take this path, apparently. The main web app calls the user API, calls the central API.
10:01
So far, there's not really a good case for doing it this way. Why not just put it all in one box? And of course, the answer is all these other boxes. So let's talk a little bit about them. So the central API, no authentication, all the business logic. User API sort of scopes the central API to a user based on authentication.
10:26
So now we're going to be talking about this thing, the internal admin, whatever that might be. So this is when things start to get interesting. So the internal admin web app is internal. It's not for customers.
10:41
It's for the product owners, and it talks to the central API. This internal admin web app happens to use just normal basic auth with a crazy password. And it's only accessible for, obviously, the people that work at this company, not the users or the customers.
11:01
And it has some, obviously, admin tasks like listing all the clients, listing all the users, being able to reset passwords for users in case they didn't get their reset password email. It can also log in to the web app as any user. And what's interesting about this is that at no point is this one big web application where we have to check,
11:28
okay, if the current user is an admin user, it gets access to this. If not, it does not get access. There's no sort of hard-coded ACL here. It's all sort of fundamentally implemented through the architecture.
11:44
Because if you have access to the central API, you have access to everything. This internal admin web app does not talk to the user API, because the user API limits all requests to one user, but the internal admin app obviously needs to access everything. So instead of having this fancy ACL system,
12:03
I decided to just do it this way, split things apart, have the central API as a completely standalone thing. If you have access to the central API, you can access all data. And these two things talk to the central API as they please.
12:20
There's no sort of, well, there's no ACL, basically. The architecture takes care of the ACL for us. So, that's kind of cool. There's another system over here. It's called the invoice manager. I guess we can all sort of guess what that does.
12:40
This is about landlords. They have to send invoices. That's sort of the main feature of the system, that we send the invoices for them. They don't have to care about that. Now, this invoicing system, obviously, it talks to the central API, because the central API is the database that contains all of the invoices and all of the good stuff.
13:04
And this is completely different from the other components in our system. The other components are just web apps on HTTP. The internal invoicing system is not. It fires up hourly to call a third party to actually deliver the invoices.
13:21
We don't implement that. The invoicing system just asks the central API for, okay, what are all the invoices that we have to send? That's basically all the central API knows about delivering invoices. The central API implements that, okay, at this given date,
13:41
we have these invoices that need to be sent somehow, or invoiced somehow. And the invoicing system takes care of talking to the central API about this, and asks the central API, give me all the invoices, takes the invoices, sends them to whatever it needs to send them to in order to perform the invoicing.
14:01
And what's also interesting is that the invoicing system actually has its own database. So when we import an invoice, we get an invoice number back, but the central API is not concerned with that at all. So we have sort of a local database in the invoicing system
14:20
to store all of this data, which basically is just local to the invoicing system. And then it goes back to the central API again when asking about payment status, because that's also a feature in the system. So these invoice IDs that we got from the external invoicing system
14:41
are used to ask for payment status, and then we go back to the central API and tell it about the status. And again, the central API does not know anything at all about how invoices are delivered, how often do we check payment status. The invoicing system takes care of that. And again, it's very convenient to not have to worry about
15:02
authentication here at all, because if you have access to the central API, you have access to everything. So you don't have to sort of make an ACL system that supports authentication and no authentication, or have some sort of god, sudo, root user that has access to everything.
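One hourly run of the invoice manager, as described above, can be sketched like this. Everything here is illustrative: `FakeCentralApi`, `FakeInvoiceProvider`, their method names, and the `EXT-n` numbering are hypothetical stand-ins, and a plain dictionary plays the role of the invoice manager's local database. The flow is the one from the talk: ask the central API what needs sending, hand it to the third party, remember the external invoice number locally, then report payment status back.

```python
class FakeCentralApi:
    """Hypothetical stand-in for the central API's invoice endpoints."""

    def __init__(self):
        self.due = [{"id": 10, "amount": 9500}, {"id": 11, "amount": 12000}]
        self.statuses = {}

    def invoices_to_send(self):
        # The central API only knows which invoices are due, not how they are sent.
        due, self.due = self.due, []
        return due

    def report_payment_status(self, invoice_id, status):
        self.statuses[invoice_id] = status


class FakeInvoiceProvider:
    """Hypothetical third party that actually delivers invoices."""

    def __init__(self):
        self.counter = 0

    def submit(self, invoice):
        self.counter += 1
        return f"EXT-{self.counter}"  # external invoice number

    def payment_status(self, external_number):
        return "paid" if external_number == "EXT-1" else "unpaid"


def run_once(central, provider, local_db):
    """One scheduled run of the invoice manager."""
    # 1. Deliver everything that is due and remember the external numbers
    #    locally; the central API never sees them.
    for invoice in central.invoices_to_send():
        local_db[invoice["id"]] = provider.submit(invoice)
    # 2. Use the stored external numbers to check payment status and
    #    tell the central API about it.
    for invoice_id, external_number in local_db.items():
        central.report_payment_status(invoice_id, provider.payment_status(external_number))
```

Note that `run_once` carries no credentials and performs no permission checks; being able to reach the central API is the only access control involved.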
15:23
The architecture actually takes care of this all by itself. So that's essentially the architecture of this system. The biggest problem it really solves, I would say, is the ACL part,
15:40
because access to the different components in the architecture naturally gives you access. There's no ACL system on top of this. So this lets you use a firewall instead of if statements, basically, to restrict access, which is very good.
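The idea of a firewall standing in for `if` statements can be made concrete with a small reachability table. This is only an illustration in Python: the component names follow the talk, but the table itself and the `can_call` helper are assumptions, and a real deployment would express this in firewall or network configuration, not in application code.

```python
# Which component is allowed to open a connection to which other component.
# Hypothetical table; "internet" stands for any client outside the firewall.
REACHABLE = {
    "internet": {"main-web-app", "user-api"},
    "main-web-app": {"user-api"},
    "user-api": {"central-api"},
    "internal-admin": {"central-api"},
    "invoice-manager": {"central-api"},
}


def can_call(source, target):
    """True if the network lets `source` talk to `target` directly."""
    return target in REACHABLE.get(source, set())
```

With this table, `can_call("internet", "central-api")` is false while `can_call("internal-admin", "central-api")` is true, so the question "is this caller an admin?" is answered by where the caller sits in the network and never appears in the central API's code.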
16:00
There are some other obvious benefits. You could take down parts of the system without affecting the whole system. I'm going to spend a little bit of time explaining the other components in this graph now. Oh, first, these gray boxes. So this is the part where you get to say that I have over-engineered this, and you would be right,
16:21
because I'm talking about the future. For all I know, this future will never exist, but it's still interesting to talk about it. So first of all, the cost of achieving this future is low. If we want to create an Android app or an iOS app,
16:44
what do we have to do? Just that. We have to create the Android app. We already have the infrastructure in place to talk to the API. We will use the user API for that, because typically an Android app or an iOS app would be one user's utility for checking payment status or whatever.
17:05
So we have the user API, we can just create the app, talk to the user API. The user API already implements authentication. Minor sort of detail, but it's nice to have the web app and the user API separate,
17:20
because the web app deals with cookies, and it would feel really foreign for an iOS app or an Android app to deal with cookies. Now, they're all just headers, so it's a little bit of a moot point, but still, it's nice. It makes the responsibilities much more obvious.
17:40
So, there you go. That's the architecture. A lot of arrows and stuff. One more thing that I would like to talk about here is this thing. This whole system used to be one big... Oh, question.
18:17
So, the question is, basically, if you get access to the central API, you're dead, right?
18:22
Because then you have access to everything. Right. So, if I understood your question correctly, there could have been another layer here between the internal admin and the central API?
18:44
Ah, I see what you mean. So, basically... Yeah, that's a good point, actually. So, we have this user API, just repeating the question for the microphone here.
19:01
So, what if you wanted to make an Android app for administering the system? Then we would have to create a new component that was in front of the central API somehow, because we don't want the central API to be publicly available on the internet, because that would be horrible. So, I guess the short answer there is that I don't need it yet.
19:23
And I do need the web app, which is scoped to a single user, and the cost of doing that separation of the web app and the user API is very low. I just had to do it, right? And I guess it's more likely that I would have an Android app for users than for sysadmins,
19:42
but it's a valid question. And to be honest, I think this part of the architecture, that's sort of the weakest point. It doesn't really add a lot of value, other than the future Android and iOS app. We could have put it inside the web application, and it would have been pretty much the same. So, yeah, no sort of good answer for that, other than what you want to do.
20:08
So, this system over here is also kind of interesting. This sort of graph here makes it blindingly obvious that it is completely standalone and isolated from the rest of the system.
20:23
It's written in Ruby; that's what it says in parentheses under here. For the rest of the system I've removed all the parentheses, so this graph is a little bit moot. Everything is written in Clojure, except this part, which is written in Ruby, so now you know.
20:41
And that's actually legacy. This whole system used to be one big Ruby on Rails app, and we decided to move to an architecture like this, which we basically felt the need for as soon as we had this invoice manager, because then we felt the need for having the ACL and all that, and I didn't want that.
21:00
So, I kept the PDF rendering code, which was written in Ruby, because, honestly, there was no point in re-implementing that, and it's also kind of a big task, actually. It took me at least a couple of weeks to create a basic version of rendering, and we render invoices, obviously, and contracts and stuff like that.
21:22
So, we have a PDF generator for doing that. And, sort of, because this system isn't one big system, but many small systems, they can obviously be written in different languages. And I guess, well, in this case, it was only legacy
21:44
that required us to have multiple languages, but you could imagine more cases. Like, you could have a separate team working on the invoice manager, and they wanted to write it, and, well, perhaps one of you guys would want to write it in F sharp, or something like that. So, that's another nice aspect of this architecture.
22:04
It's not just sort of boxed into different things. There are actually separate components that have actual separate responsibilities. So, it's much easier to just, you decide you want to rewrite the invoice manager,
22:21
then you just rewrite the invoice manager. You don't have to rewrite other parts of the system. And you can rewrite it in whatever language you want. Obviously, you have to make a good decision there if you rewrite it. If you have a system composed of different languages, there are good sides and bad sides to that.
22:43
And, as I said, you can have different teams working on different things. You can deploy one thing without affecting the rest of the stack. And there are a lot of good benefits to doing this. Yep. Good question.
23:08
That's actually the next part. I'll be talking a little bit about that. So, I won't repeat the question because I'll be getting into it later. So, this is kind of interesting, actually. I had a really hard time writing this slide.
23:24
There aren't actually that many trade-offs by doing it this way. So, I'll just go through these. So, one of the trade-offs by having this architecture with a lot of components is that it's harder for new team members to get started.
23:41
They can't just download one thing and run it. This is actually more of a complaint than an actual problem, because what has happened in other projects... So, I've actually been working on this particular project alone. I've been having an architecture like this in a project with teams of three or four people.
24:01
And when new people join the project, they will see eight GitHub repositories and they will be scared. But the thing is, this is only a problem the first day on a project, and then it's no longer a problem. People end up really liking it. So, I guess it's more of a complaint than a problem.
24:25
Overengineering, so that's not really a trade-off, but people easily accuse me of overengineering by doing it this way. I guess you could argue that it is a little bit overengineered, especially the separation of, as we just discussed,
24:43
the web app and the user API, a little bit overengineered. But the other parts, the essential separation of concerns as far as security goes, is definitely not overengineered, I would say. But I guess that's a debate you could have, and I guess you could call it a trade-off.
25:01
Another trade-off is that there are no frameworks for doing this. And that's also not really a trade-off. That's also actually kind of the whole point of doing it this way, to not have everything in one big framework, but as separate, isolated components. And as I sort of touched on a couple of minutes ago,
25:21
when you separate things, there's a big risk that you end up just boxing things that are actually one big thing. You just sort of use a knife and split them apart instead of actually separating things. A good example of that, I don't want to bash on Rails, but I remember it vividly.
25:41
And ActiveRecord is the ORM for Rails. It has a base class, which basically consists of 50 include statements of other modules. And if you change the order of one of them, everything would fail. And that's an example of just arbitrarily boxing up your stuff instead of actually separating it, because those 50 components weren't separate components.
26:02
They were just one component split into different files. So that's definitely a danger of an architecture like this. And boxing like that does not solve any problems at all. It just creates problems, makes stuff harder to work with. So you need to sort of tread carefully and make sure that you actually create separate components
26:21
that make sense. All right. So the gentleman in the second row here talked about queues. And that's what I'm going to talk about now. So the first time I did something like this, I used HTTP for everything.
26:41
HTTP is obviously good. Everyone uses it. But it does not necessarily fit well in an architecture like this. So there are some benefits to using queues instead of HTTP. So the first point I agree is also
27:02
a little bit of a moot point. Queues typically, out of the box, are easier to work with in terms of connections. So let's say that your HTTP service, like the central API, let's say I wrote that in HTTP, which I didn't, as I will get into later,
27:21
and I wanted to use HTTPS for security because the servers are in different places in the network and I don't want those sysadmins to be able to intercept my traffic. So you need to do some sort of connection pooling for that to avoid like a half a second handshake for every request you make to the central API. That would be horrible for everything.
27:42
Performance, power efficiency, what have you. So connection pooling for HTTP is something that HTTP clients typically don't do out of the box. You can do it, but you end up having to do some stuff yourself that the queue libraries typically do for you.
28:01
Second benefit, that's sort of the big one. So my central API is a queue consumer and it can die at any point in time and nobody will notice, except that it will take a long time to process the request. This is one of the important attributes of queues. With HTTP, you would talk directly to the central API
28:22
and if the central API was down, well, then you would get an error. If you have a queue in between, you can post things to the queue and if the central API is down, all you notice is that it took a little bit of a long time for that request to be processed, which in almost all cases is what you want
28:41
because the reason the central API is down is because you redeployed it or you had like a 10 second downtime or something like that. And the third is obviously also a big one. If you use HTTP, you talk directly to one thing, but if you do it through a queue, you don't care who does the work, and you can have multiple workers pulling stuff from the queue and putting stuff back.
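The second and third benefits, a consumer that can be down without callers seeing errors and several workers sharing one queue, can be demonstrated with Python's standard library. This is a sketch under loud assumptions: an in-process `queue.Queue` with threads stands in for a real message queue between separate services, and the worker names and request paths are made up.

```python
import queue
import threading


def demo():
    requests = queue.Queue()
    replies = queue.Queue()

    def worker(name):
        # Each worker pulls whatever is next; callers never pick a worker.
        while True:
            item = requests.get()
            if item is None:  # sentinel: shut this worker down
                break
            replies.put((item, name))

    # The "central API" is down: no worker is running yet. Posting still
    # succeeds; the requests simply wait in the queue.
    posted = ["/users/1", "/users/2", "/users/1/rental-objects"]
    for path in posted:
        requests.put(path)

    # The consumers come back up, two of them, and drain the backlog.
    workers = [threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(2)]
    for w in workers:
        w.start()
    for _ in workers:
        requests.put(None)
    for w in workers:
        w.join()

    handled = [replies.get() for _ in posted]
    return sorted(path for path, _worker in handled)
```

Every request posted during the "downtime" is eventually handled; the only thing a caller would notice is the delay. Which of the two workers handled which request is deliberately left unspecified, since with a queue the caller does not care.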
29:05
So I've spoiled this now. There are three blue arrows here that I sort of have implied that are HTTP, but they are not. I'm using a queue here
29:22
and I'm actually using a very specific queue, not just any queue, and I'm using that queue for a very good reason. So I want to talk a little bit about which queue fits well in a scenario like this and how you can use it, basically. This is the one.
29:43
In Denmark and in Norway, this logo looks very ugly, right? It's a letter there, but it's supposed to be like the high-tech rendering of a zero with a line through it, like the one you get on fancy military HUDs and stuff like that, and I guess it looks nice and symmetric with a Q.
30:04
So zero MQ, that's what it says. Zero MQ. So zero MQ is really weird and it's actually kind of difficult to explain it to people, so hopefully I will be able to explain it to you now. I've sort of refined my explanation of zero MQ a number of times,
30:22
and it gets better every time, so hopefully I reach ZeroMQ explanation nirvana. So this is the first big thing. There's no server in ZeroMQ. Okay. So what's ZeroMQ then if there's no server, right?
30:41
All queues have this server that you start up, like a broker that you connect to, and that's sort of the whole point because you connect to the broker instead of the actual thing, so the promise is that your broker dies a lot less than your actual thing, right? So you get to have more uptime,
31:00
all that good stuff. But zero MQ apparently does not work like this at all, which is incidentally why I chose zero MQ, because it's easy to take an architecture where you have HTTP calls with an arrow, like in my fancy graph, and just replace it with zero MQ without adding any more nodes to your network or your topology.
31:24
So ZeroMQ uses sockets, so these two things here are sockets, and this is sort of supposed to represent, so there's a request socket and a reply socket, that's what REQ and REP mean. This request socket can represent the user API
31:41
or anything really that needs to make requests, like the internal admin, the invoice system, whatever, so all of these create request sockets, and then on the central API side, I create a reply socket, and I guess you can all sort of imagine how this goes from here, so you send a message on the request socket,
32:02
it gets sent to the reply socket, zero MQ has a nice API for working with this, so you don't have to manually track request IDs like you have to do in other queues when you do RPC over queue, and there you go. So what have we exactly achieved now
32:21
instead of using HTTP? Because there's no broker, so how do we achieve that thing where we can take down the central API, we can redeploy it without anyone noticing? So a lot of our software is written as if we were in the 1970s.
32:41
A long time ago, servers were expensive, big servers with a lot of RAM that were super expensive, so we had to put our stuff on those servers, like our databases, because they were expensive, so we couldn't have that many of them, and perhaps our queue brokers also had to be on those expensive servers, and then our application servers were just cheap, horrible servers with absolutely no RAM at all.
33:02
This is no longer the case, and what a queue really needs is just a queue of some kind. So instead of having a server for that, this actually happens locally in the socket for ZeroMQ. So if we start up our user API
33:23
and our central API is down, and perform a request with the request socket, it's added to a queue locally on this socket. I mean, why not? Why would you need a server for that? You certainly don't need a broker. Sometimes you do need a broker if you want to have all sorts of validation rules
33:42
for your messages. Maybe you want to persist your messages. But in this case, all we really need is something that holds onto stuff when there are no consumers reading that stuff. So when the central API comes up again, ZeroMQ will automatically connect for us
34:01
and it will send the message. So it achieves all of the benefits of having a broker except that there's no broker. So one of the ways it achieves this is that sockets are a very well thought out name. They're not connections because the connection is implicit.
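The buffering idea described here can be modelled in a few lines. This is a toy sketch using only the Python standard library, not ZeroMQ's API or implementation: the class `BufferingSender` and its methods `peer_up` and `peer_down` are hypothetical names made up for illustration. It only shows the behavior the talk describes, where a send succeeds while the other side is down and the backlog is delivered once it comes back.

```python
from collections import deque


class BufferingSender:
    """Toy stand-in for a socket that queues locally while its peer is down.

    Hypothetical class for illustration only; this is not ZeroMQ's API.
    """

    def __init__(self):
        self._pending = deque()   # messages waiting for a peer
        self._peer = None         # callable that delivers a message, or None

    def send(self, message):
        # Sending never fails just because the peer is away:
        # queue the message, then deliver whatever can be delivered.
        self._pending.append(message)
        self._flush()

    def peer_up(self, deliver):
        # The peer (re)appeared: remember it and drain the backlog in order.
        self._peer = deliver
        self._flush()

    def peer_down(self):
        self._peer = None

    def _flush(self):
        while self._peer is not None and self._pending:
            self._peer(self._pending.popleft())


received = []
sender = BufferingSender()
sender.send("login alice")         # central API is down: held locally
sender.send("login bob")
sender.peer_up(received.append)    # central API comes back: backlog is delivered
sender.send("login carol")         # peer is up: delivered immediately
```

The caller never sees an error while the peer is away; the only observable effect is that the first two messages arrive later than they were sent.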
34:22
You don't ever connect manually. You just say, I have a socket here, a socket there. You have a transport, like you say, TCP, localhost, port, whatever. But the actual connection and managing the connection is done for you under the hood. And that's also one of the things
34:41
that other queue libraries actually don't do. I initially tried doing this with RabbitMQ. And I don't want to bash on RabbitMQ here. RabbitMQ is really good if you need a broker. But with RabbitMQ, I had to manually do all of the connecting stuff. So if I started up my user API, when the queue or the broker itself was down,
35:02
I had to like, okay, connection exception thingy, wait a second, try again, whatever. I had to do all that stuff. With ZeroMQ, you don't. It's just implicit. And obviously, there are hooks for managing, okay, what if we have like a million gazillion requests while the server is down? Then we run out of RAM. You can handle that, and set timeouts.
35:22
It's very similar to having a broker, except that it's not another server because it doesn't really need to be, at least not in the case of request and reply. Also, ZeroMQ has other socket types. So you can do pub-sub. You can do push and pull. You can do fan-out messaging. So ZeroMQ, there's a pun in the name there.
35:43
ZeroMQ is not really a message queue. It's more of a networking library, actually. Which is good, because that's what we need. We need to do networking. We need to do RPC, and ZeroMQ can do it for us. Now, this is actually how it looks. I don't have one request and one reply socket.
36:04
The reason for that is ZeroMQ does not allow you to share sockets across threads. Obviously, we want to be able to process requests in parallel. So we need to create many request sockets and many reply sockets.
36:21
This is pretty easy to do. I just have a for loop, basically, that creates like N reply sockets. I think I have like 20 or something. I don't remember. And then I put in this router-dealer thingy, which is another socket type, and connect the reply sockets to it. I have a router-dealer on the other hand,
36:41
on the other side where I have all the request sockets. So the request sockets connect to my router-dealer thingy. And this is how I achieve multiple threads sending messages. Because if you try to pass a ZeroMQ socket into different threads, you will get all sorts of weird behavior and exceptions,
37:01
and maybe not even exceptions. ZeroMQ has a very sort of C-ish philosophy. So if you do stuff that you aren't supposed to do, you will just get undefined behavior instead of exceptions, which is kind of fun. But that's the way it is. So there are a number of reasons why ZeroMQ works like this.
37:20
The number one reason, the only one I'm going to talk about, is the fact that shared mutable state, or shared state of any kind, is a problem, and ZeroMQ totally gets rid of it by only letting you send data through messages. So you have a bunch of threads sending messages to another thread, so the router-dealer pair sits in a separate thread,
37:42
takes messages from the request sockets, sends them to the router-dealer over the network, etc. Another reason it sort of has to be like this is that the API for a socket basically looks like message equals socket.getNextMessage, and then this call will block until it gets a message,
38:03
and then the next thing you have to do before you listen for the next request is to reply, or else you will get undefined behavior. And obviously you need more threads in order to achieve parallelism in a scenario like that. So ZeroMQ lets you compose sockets, basically,
38:24
in various interesting ways, instead of having a broker, a separate service that sort of does all the things for you. So if you need a broker, you can write the broker in ZeroMQ, or you could just use a broker, obviously.
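The threading shape described above (many workers, each stuck in a strict "block for a request, then reply" loop, with something in the middle handing out work) can be sketched with standard-library threads and queues. This is only an analogy under stated assumptions: a `queue.Queue` plays the role of the router-dealer in the middle, and the names `handle`, `worker`, and `call` are made up for the example. It is not ZeroMQ code.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

requests = queue.Queue()   # plays the role of the router-dealer in the middle


def handle(body):
    # Placeholder for real work such as a database lookup.
    return body.upper()


def worker():
    while True:
        item = requests.get()          # blocks until a request arrives
        if item is None:               # sentinel: shut this worker down
            break
        body, reply_box = item
        reply_box.put(handle(body))    # reply before taking the next request


def call(body):
    # Each call gets its own reply box, so no request IDs are tracked by hand.
    reply_box = queue.Queue(maxsize=1)
    requests.put((body, reply_box))
    return reply_box.get(timeout=5)


workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()

# Several callers issue requests at the same time from their own threads.
with ThreadPoolExecutor(max_workers=3) as pool:
    replies = list(pool.map(call, ["ping", "users", "invoice"]))

for _ in workers:
    requests.put(None)
for t in workers:
    t.join()
```

No worker ever shares its state with another; the only things that cross thread boundaries are messages, which is the property the talk attributes to the socket-per-thread rule.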
38:41
So ZeroMQ is a very good replacement for HTTP in cases like this. So, I did show you in the first slide that in the central API we do this get slash users, slash one, so that looks an awful lot like HTTP, doesn't it? And here I'm telling you that I'm not using HTTP.
39:02
So, what do I do? Well, these are basically the messages that I send to ZeroMQ. And it looks an awful lot like HTTP, but it isn't. So it has a method, it has a path, and it has a body, just like HTTP has.
39:20
And for all ZeroMQ cares, your messages are just bytes, so ZeroMQ doesn't care about this at all. This only becomes interesting in my reply sockets. So in my reply socket I get this message, and then I need to do something about it, obviously. And I happen to use Clojure in my systems.
39:42
And I'd sort of like to ask the question, what does HTTP look like? This is at the end of a Friday, so I didn't expect anyone to answer. But it looks like a function, right? I don't know if everybody has thought about that already,
40:00
but I think it's a kind of interesting aspect of HTTP. Obviously, like any other function, you're free to do all kinds of side effects inside it, like talking to databases and whatever. But at the end of the day, an HTTP request, or an HTTP cycle, takes in some data, the request, and it returns some other data, the reply.
40:21
So I use Clojure for my systems, or at least for this particular system, and all of the Clojure HTTP frameworks are just functions that take a request and return a response. So this is getting a little bit technical now, but the point is that it's really easy for me
40:41
in a scenario like that to just call the function with this instead of an actual HTTP request. Or I should do this. There are screens everywhere. Or this with my mouse. So all I really have to do is to make sure that my data, my map that represents an HTTP request,
41:01
actually looks like an HTTP request, because in the HTTP frameworks in Clojure it's not called method, it's called request method, and it's not called path, it's called URI. But I just have to take this request, turn it into something that looks like HTTP, and just send it to my HTTP handling function. So I can use the HTTP libraries that are out there, which I actually do. I just use a plain HTTP router to achieve this.
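The translation step described above is small enough to sketch. The talk's system is written in Clojure; the version below is a Python analogy, and everything in it is an assumption for illustration: the function names `to_handler_request` and `handler`, the `USERS` data, and the exact response shape. Only the renaming of method to request method and path to URI comes from the talk.

```python
def to_handler_request(message):
    """Rename the message keys to the ones an HTTP-style handler expects."""
    return {
        "request-method": message["method"].lower(),
        "uri": message["path"],
        "body": message.get("body"),
    }


USERS = {"1": {"id": 1, "name": "Alice"}}   # made-up data for the example


def handler(request):
    """A handler is just a function from a request map to a response map."""
    parts = request["uri"].strip("/").split("/")
    is_get = request["request-method"] == "get"
    if is_get and len(parts) == 2 and parts[0] == "users":
        user = USERS.get(parts[1])
        if user is not None:
            return {"status": 200, "body": user}
    return {"status": 404, "body": None}


# The message as it might arrive on a reply socket, already decoded from bytes.
message = {"method": "GET", "path": "/users/1", "body": None}
response = handler(to_handler_request(message))
```

Because the handler only ever sees a plain map, it cannot tell whether the request came from an HTTP server or from a socket, which is why swapping the transport later costs almost nothing.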
41:25
So I was sort of lucky. I'm not sure what your story is. If you're in a sort of more monolithic framework world, sort of emulating HTTP with something that isn't HTTP might be a little bit hard. So I can't really help you there
41:41
if it's hard for you. That was too bad. I'm not sure, whatever. Because this, I guess this is kind of hard to do without functional programming. Perhaps like in servlets in Java, you could implement some sort of interface,
42:01
but it would get dirty really fast and it would be a lot more difficult. But that's sort of, at the end of the day, HTTP, the source for HTTP doesn't actually have to be an actual HTTP request if you're able to sort of control your environment. So this is basically it.
42:21
So if I decide one day to replace ZeroMQ with HTTP, what do I have to do? Well, almost nothing. I only have to do HTTP requests instead of doing ZeroMQ socket messages. So that's cool. That's ZeroMQ and using ZeroMQ instead of HTTP.
42:40
This is the last part of the talk. I think this is an important point because we're talking about values. Now I'm not talking about morals here or something like that. What I'm actually talking about is immutable data. So as I said, I use Clojure. In the Clojure community, we really like to have well-defined words.
43:00
So when in the Clojure community we say value, we mean immutable data. It's interchangeable. And what is a value? Well, these are three values. 42 is obviously a value. Hello world, the string. If your language has mutable strings, hello world is not a value because it's mutable. If it has immutable strings, like everything except C,
43:23
for all I know, it's a value. So strings are values. Maps are also a value. It's a compound value, but it's still a value. Lists, they can all be values. So why am I talking about values? I'll start by showing you an example of doing something without values
43:41
and then how you would do it with values and how it makes the world different. And this is sort of an actual example from the system I'm working on now, which is very ORM heavy, very framework heavy. And the idea was that we had something that we wanted processed elsewhere,
44:02
but time obviously passes all the time, so we had to do some locking to get away from that. So we had this thing which was a row in the database, so we had to lock the row in the database, and we had to put the ID of that thing in a queue, and then on the other hand, on the consumer side,
44:22
we consume from the queue, read the ID, and the row is still locked, and then we can read the row from the database, and then we can unlock the row in the database. And the reason you want to do this is that if you use an ORM, it's basically the most convenient way to go about it because to get an object
44:41
or an instance of your row in a database, you typically pass in an ID to the database, and there you go. So we had to lock the row in order to achieve this. So that's sort of a no-values way of doing it. Not good. Some of you might recognize this fellow.
45:01
So if you use values, you just put the value on the queue and consume the value on the queue, and you're done. No locking. No locking. So by value, I mean the row in the database represented as a map, an immutable map. That would be a value. And because you read out the entire value, there's no reference like an ID of a row.
45:20
That's just a reference. It can change under your feet at any point in time. So if you just read out the value instead, put it on the queue, and consume it from the queue, then you're done. There are no time complexities that can leak into your system because things change all the time. So if you just use values or immutable data
45:44
or data instead of references, it removes a lot of time complexity. As we've seen here, we had to use locks to get away from things changing all the time because most of us are completely crazy and use databases that only represent now
46:00
and delete data all the time. When you set the email address of a user, you delete the old email address and are completely oblivious to time. You only represent now in your database, so you have to do locking. If you just use values, you at least remove some of that complexity, as we've just seen.
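The difference between queueing a reference and queueing a value can be shown with a small in-memory example. This is a sketch under assumptions: the `database` dict stands in for a real table, `snapshot` is a made-up helper, and `MappingProxyType` over a copy is used here as one way to get a read-only map in Python.

```python
import queue
from types import MappingProxyType

database = {7: {"id": 7, "email": "old@example.com"}}   # made-up table


def snapshot(row_id):
    # Copy the row and wrap the copy read-only: a value, not a reference.
    return MappingProxyType(dict(database[row_id]))


by_reference = queue.Queue()
by_value = queue.Queue()

by_reference.put(7)            # consumer must look the row up later
by_value.put(snapshot(7))      # consumer gets the row as it was at this moment

# Time passes: someone changes the row before the consumer runs.
database[7]["email"] = "new@example.com"

seen_by_reference = database[by_reference.get()]["email"]
seen_by_value = by_value.get()["email"]
```

The consumer that received the ID sees whatever the row has become, which is why the original design needed a lock; the consumer that received the value sees exactly what the producer saw, with no lock at all.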
46:22
Another aspect of values is that if you pass values around, you can just add a network at any point of your code at any point in time. So if you have a map you can easily serialize it and put it on a queue and put the function that reads it somewhere else,
46:42
which is good. If you have objects, I guess you would need to pass references or whatever. There's a downside of values. They're not very idiomatic if you use ORMs. So in our case, we do use an ORM, and we do read out the whole row in the database and put it on the queue.
47:01
And then on the other side of that queue, most ORMs, at least the one we use, don't have a way to sort of go from map to object. You would have to do that mapping yourself. ORMs like IDs. So that's the downside, I guess. But that's just sort of my word of advice to you.
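Both halves of this point, that a plain map serializes trivially and that the consumer has to do the map-to-object step by hand, fit in a short sketch. The `User` dataclass is a hypothetical stand-in for an ORM-managed class, and JSON is just one possible wire format; neither is taken from the talk.

```python
import json
from dataclasses import dataclass


@dataclass
class User:          # hypothetical stand-in for an ORM-managed class
    id: int
    email: str


row = {"id": 7, "email": "alice@example.com"}

wire = json.dumps(row).encode("utf-8")         # producer side: map -> bytes
received = json.loads(wire.decode("utf-8"))    # consumer side: bytes -> map
user = User(**received)                        # the hand-written map -> object step
```

The first two steps are what make it cheap to put a network between producer and consumer; the last line is the mapping an ORM would normally do from an ID, and here it has to be written explicitly.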
47:23
Think more about values. Try to use values. I think it should be called value-oriented programming, not functional programming, to be honest, at least in the case of how you program in Clojure. All right. So, essential separation. That was part one. Split up your components.
47:40
It can have some interesting effects, like you don't have to write ACLs. Your architecture does it for you. And as we've seen, HTTP is not the only option. You have ZeroMQ. That's the one I've used. As I said, this is based on a true story, so I'm not going to tell you about other ways of doing it, because that's the only thing I've done,
48:01
and you can use other queues. What do I know? But you don't always have to use HTTP. It's easy to think like that, because HTTP is so ubiquitous, but it's not the only option, and values are great. Thank you.
48:21
All right. Are there any questions, or does everyone want to get out of here, because it's really late? There are lights in my face, so if I don't see any hands, just shout out. Nope. Yep.
48:42
Yeah. So, not really. The only disadvantage I've been having with zero MQ is that I chose to use the native implementation of zero MQ, which is in C++.
49:00
Now, Pieter Hintjens, he's the guy behind ZeroMQ. The point, basically, of ZeroMQ is to create a protocol for doing these kinds of things. So, there are re-implementations of ZeroMQ, and those aren't forks or anything like that. They are actually implementations of the same protocol. Like, the parallel would be, if you have to do HTTP,
49:21
you have to pull in the C library that implements HTTP. Obviously not. There are multiple implementations for doing HTTP. That's also the case for zero MQ, and it's actually quite recent that there are .NET and JVM re-implementations of zero MQ, so that completely removes the calling into native code problem,
49:40
which is very painful on the JVM, at least. I'm not sure how painful it is on the CLR. So, other downsides. Well, you have to know zero MQ, so that's zero MQ specific, obviously, but zero MQ is not made to be really easy to get started with. It's sort of a new language, new way of doing things. And it took me a little bit of time
50:01
to figure out that router-dealer thing and the multi-threading stuff. But once you know ZeroMQ, you've learned something new, and then you can apply it to all your projects. I like to say that the time you spend getting started is like one percent of the time you spend with something, so I don't really care about how hard it is to get started with it,
50:21
as long as it's easy to use, or good, and solves your problems when you know how to use it. Other problems? What's that? Yeah, easy when you know it. Other disadvantages with zero MQ? No, not that I can think of, actually.
50:41
At least at my scale. There's not 50 million users in the system. It's only for Norwegian customers. There's some 1,000, maybe, so I haven't tried to scale it, but apparently zero MQ is good at that as well. Ah, good point.
51:10
Yeah, good point. Just to repeat that, it's easy to do HTTP requests from anywhere, like your command line. With ZeroMQ, you have to fire up a runtime or whatever.
51:21
I happen to have a solution to the problem, because I use Clojure, so I can just fire up a REPL, right, and just start calling stuff. In other languages or systems that might not be as accessible. But yeah, that's a very valid point. More people know HTTP than ZeroMQ, obviously.
51:42
Yep. Can you repeat that? Sorry.
52:11
I'm not sure I understand your question. Why I chose... I didn't create any protocols, or I think I misunderstood.
52:22
Yeah, I decoupled everything, yep. So I'm with you so far. Yeah, right. Yeah, I used zero MQ for that. Yeah, so the number one was basically convenience. So if you use HTTP, you would have to do connection pooling.
52:42
You would have to do that yourself. But I think, and of course, I can take down the central API, and I can still do calls, because the calls will go to the queues. So I can kill the central API, I can click log in in the user interface, and I can start up the central API, and then the request will be responded to,
53:00
which is cool. So basically just the general queue sort of benefits, I guess. Did that answer your question? We can talk afterwards if you want to. Yeah, yeah, yeah. It's a little bit difficult hearing you as well. More questions? I guess we're out of time.
53:20
We have a little bit of time, but... All right, thank you.