From HTTP to Kafka-based microservices
Formal Metadata
Number of Parts | 118
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/44782 (DOI)
Transcript: English (auto-generated)
00:02
Hi, nice to see you all here. How was the coffee? Great, nice. Okay, so I'm a person who always wanted to be an IT guy, a computer science guy, and so on. So I worked for 15 years in
00:20
the academic world, doing science, teaching students, and so on. In the meantime I did my PhD, but the practical things were always the most important for me, so even when doing science I was always trying to make it more practical and available for developers. You can Google something about it, too.
00:40
So now I'm working for FLYR, doing microservices for them. I was always interested in distributed systems, so it's perfectly aligned with what I was doing before. I'm also one of the people who organize a Ruby user group in my country, so please be friendly, okay?
01:09
Okay, what about FLYR? What do we do at FLYR? We do a revenue management system for airlines, so we actually take a lot of data from them. We do big data, ETL pipelines, and so on, then we do machine learning on this data, and we tell them
01:28
at what prices they should sell the tickets to earn the most. That's our goal. We have an office in San Francisco and we have an office in Kraków, Poland, so you can work for FLYR
01:42
from a European time zone and have work-life balance. Okay, so these are some of the things we use, not everything. And one more thing that my colleagues from the Kraków office told me I must bring to you:
02:02
we were recently at a Python conference, some of us were at a Python conference in the Czech Republic, and we met a very interesting guy there. So we hired him. You know the guy in the middle of the
02:21
photo? Who knows this guy? Yeah, that's right, Krtek, or Krecik in Polish. For the ones who haven't seen it, Krtek is a mole, a character from a cartoon of our childhood, very popular in our part of Europe. So we met this guy and took him to a conference. He went to different talks.
02:43
He even gave his own lightning talk, so if you are wondering whether you should or not: he did it, so you can do it. Okay. So he is actually an exceptional data digger, as a mole, and that's why we hired him.
03:00
But to the point. The story begins a little less than a year ago, when I started to work at FLYR. Back then, all microservices in our product were communicating over HTTP, and FLYR
03:20
felt quite comfortable with this solution, but they also had the feeling that for some applications it wouldn't be the best solution, and that at some point we would probably have to switch. We already had some places where the services communicated over RabbitMQ,
03:42
but it was not perfectly implemented, and we knew it. So there was a feeling that we should find an asynchronous way of communication. The thing that actually triggered the decision was a new requirement for our
04:04
e-commerce use case. The requirement can be described more or less like this: we had a UI interacting with the user, and there was a backend for this UI. So there was some interaction, and at some point this interaction between the user and the UI backend
04:24
caused the UI backend to issue a query to the service our team was maintaining. To respond to this request,
04:41
we knew that we would have to fan out a number of requests to other services, most of them external. So we knew it would be time-consuming to get the responses, but we wanted to give the user something,
05:01
to have anything to show to the user as soon as we had anything useful for him, and when we got anything better for him, we would update whatever we were showing. Okay, so that was the use case. We actually wanted to implement something like this: we get the original query, we fan out the sub-queries, and whenever we get the first response to a sub-query,
05:26
we send a partial response to the original query to show something to the user. Whenever we get another response to a sub-query, we send another partial response to update the information that is shown to the user, and so on. Okay, so that was the use case we wanted to implement.
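The fan-out with partial responses described above can be sketched in a few lines. This is only an illustration of the pattern using Python's standard asyncio, not how the speaker's team implemented it (their solution is message-based); the service names, delays, and the `query_service` helper are made up for the example.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def query_service(name: str, delay: float) -> Dict[str, str]:
    """Hypothetical stand-in for a slow sub-query to another service."""
    await asyncio.sleep(delay)
    return {"source": name, "offer": f"offer-from-{name}"}


async def partial_responses(services: Dict[str, float]) -> AsyncIterator[List[Dict[str, str]]]:
    """Fan out all sub-queries, then yield a growing result after each reply."""
    tasks = [asyncio.create_task(query_service(n, d)) for n, d in services.items()]
    collected: List[Dict[str, str]] = []
    for finished in asyncio.as_completed(tasks):
        collected.append(await finished)
        # Each yield is one partial response the UI could render right away.
        yield list(collected)


async def main() -> List[List[Dict[str, str]]]:
    updates = []
    async for partial in partial_responses({"slow": 0.1, "fast": 0.01, "medium": 0.05}):
        updates.append(partial)
    return updates


updates = asyncio.run(main())
```

The caller sees three updates, each one item longer than the previous, with the fastest sub-query arriving first.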
05:47
So that's the first thing. The second thing was related to performance. I can't give you the precise numbers, but what I can tell you is that we wanted persistence in our communication, and
06:02
when I looked at RabbitMQ, at the 5,000 messages per second that can be handled when you turn on persistence, it was definitely not enough. So the requirements were quite high. Okay, so that was the second important thing. Of course you can do it using HTTP,
06:23
okay, but we already felt that we would need asynchronous communication anyway, so that was a good point to start. And how did we approach this situation? We decided: yes, okay, we need to do it, and do it well. But what, and how?
06:43
We have HTTP-based infrastructure, and it works okay. We have experience with it. We have developers experienced with this way of communication. They have habits from implementing this way of communication, and, you know, competence is important, but old habits die hard, so it's the hardest thing to
07:05
overcome in some situations. Of course we knew we lacked experience with asynchronous communication, because we always did it using HTTP. So, one more requirement: of course we must do it well, and the first time. Well, it's hard to do it well the first time,
07:23
but maybe. And of course we knew we would get all these goodies when we switched to asynchronous communication, and even more, for example more opportunities. Does anyone know what's the first opportunity we get in this situation?
07:41
Anybody? What do you think? Sorry? Sure, you can always refactor the code. But I have some planes for you. Can you catch your plane? Thanks. Oh, sorry, but it's still perfectly operational.
08:09
Anybody else want to try? It was probably on the previous slide, okay.
08:23
That's not along my line of thinking. Sure, yes, you can have all these things, but you also have the perfect opportunity to make mistakes.
08:40
Isn't it true? Completely new mistakes, completely new things can go wrong when you go to asynchronous communication. So we can have different concurrency issues and race conditions, because we do asynchronous things, okay? And in places where we didn't have them before.
09:04
There is a problem: we should choose a broker. We don't have experience. We can read a lot, we can do research, but there is always a chance we will choose the wrong broker because we didn't research the right thing. If we choose the right broker, there's probably more than one driver we can choose.
09:22
Okay, so which one should we choose, and on what basis? How should we decide? We can choose the correct driver but use it incorrectly. Okay, if it's just in your one simple service,
09:42
that's simple, but if it spreads all over your system, and then you realize, well, you have to do this and that to make this communication stable, now you have to find all these places where other people just copied and pasted the incorrect code. That's hard to overcome. And
10:03
finally, we can have the correct driver and the correct broker, but we can use the broker incorrectly, and a lot of different things may also go wrong. So we decided to contain all these horrible things in one place, okay? A
10:22
library. We called this library asynccalls, and there's another hard question. I have a cup for you. I won't be throwing it, I'll walk. I'll deliver it by, you know, not by plane.
10:41
Why asynccalls? Okay, don't answer, don't answer, don't answer. There's a cup for you. No, we don't use asyncio below,
11:10
for some reasons. Okay, the reason is I just wanted to give you the cup before you tried to answer, because the answer is so strange that you didn't have a chance, okay?
11:23
There's always this naming thing in computer science. Okay, so we wanted to create a library that meets our functional requirements. We wanted this library to, wherever possible,
11:41
resemble for developers what they already saw, what they know. And of course we wanted asynchronous communication below, so wherever possible we wanted to join these three requirements. And why a library? You know, for the maintainers of a library,
12:04
or of this switch, you know Sauron, you know the guy: one ring to rule them all. So yes, one place to fix all the bugs. One place to change decisions, so it's much easier to change decisions. You don't have to trace them all over the microservices'
12:27
implementation, just one place. And if we need to apply good patterns, it's not that we teach all these people how to use, say, the Kafka driver.
12:40
We just use it well in one place, and we don't have to change what we taught people, which of course is harder than just updating the code. But you know, Sauron had to sell these rings to the people somehow. So how do we sell it to developers? There must be something in this for them.
13:09
Yeah, sure, they don't need to know Kafka, because it's hidden. The complexity will be hidden, okay, if we do a good
13:21
abstraction above it, so they won't have to think about all the difficult things related to it. And a lower entry barrier, okay: it's a way different kind of communication than using HTTP. So let's try to do it this way. The decisions we had to take were easier because they
13:44
were no longer final. We were more comfortable with the thought that we might have to change a decision at some point. The decisions were that we would choose Kafka as the message broker, not surprisingly for performance reasons, for the performance it offers just out of the box, and that we would choose confluent-kafka as the driver,
14:07
because of performance reasons, and we also hoped that something supported by Confluent would be really stable. So, well, that's what we use.
14:22
We wanted to make it just a library: no framework approach, no putting everything inside, just a communication library. Make it simple. If we need something more complicated, maybe we will put another library above it. So that was the first thing. We also wanted to make it testable first. Okay, this Kafka is nice.
14:44
How do I curl to this Kafka? Well, you don't. You can't issue HTTP to a Kafka queue, so how do I test it? We need to give people a way to send something ad hoc
15:00
to their service, just to let them test it, and we wanted to make it testable automatically. So we wanted to provide some reasonable mocks for unit testing, to make implementing tests easier. And if possible, maybe we could make it resemble Flask, just
15:21
to let developers get used to the new approach more easily. Okay, but it's like half of my time, and I'm talking, talking, talking, and it's a developer conference, so that's probably what you think, okay? So yes, okay, I'm showing you the code.
15:40
How do you use it? We create an object, given the service name as a parameter. It's just an identifier of the service; it should be unique across your system. When you have this object and you want to create a server endpoint, so an endpoint that will be
16:03
asynchronously responding to some requests, you just create a function and decorate it with the asynccalls server callback. The parameter for the callback is the name of the endpoint. It resembles an HTTP endpoint, but it doesn't have to; the slash is
16:22
not necessary there, it's just a convention. This function will get the request object as a parameter. You can do whatever you wish with the request, and you can use this request object to create a response, or more than one response, and each of these responses can be sent back, and they will be delivered to whoever sent the original request.
16:45
So you can send zero or more responses to a request. Now, to send a request: this is about identification. That's the ID of the
17:01
service, the address of the service, and this is the name of the endpoint. So you send the request to the service ID and a specific endpoint; you can have a lot of endpoints in a single service. To send a request, you use the asynccalls client to create a new message that will be sent.
17:21
There's a destination ID in this message, a target endpoint, and of course a payload, and maybe some more things, and you send this request. But wait, it's all asynchronous. The sending is asynchronous; it is non-blocking. So how do we get a response for this request? Well, before we send the request,
17:41
we should define a callback to handle the response we expect. So we define a function; this time the decorator is the asynccalls client one, not the asynccalls server one, and we define a callback for the service that will be sending responses to us, that's the first name, and the endpoint that will be queried and will respond to our queries.
18:05
So we can handle the response this way. That's just about how you can use it, of course in the most basic approach. And the last thing you should do is start listening.
18:25
That's all for the client and for the server. Actually, you can have client and server endpoints in a single service, so you can get some requests, send some queries to fulfill them, and then when you get the responses you can respond to the original request. It's all feasible.
18:43
So you just call the asynccalls listen at the end of your program, and it will make asynccalls receive messages and route them to the correct callbacks. So what do we have? We have a server, which is like an HTTP server: event-driven, with the callbacks
19:04
we know from Flask, for example. We also have a client, which is not like an HTTP client, because it's not blocking. It's asynchronous: you send a request and just nothing happens. You should have a callback to handle a response.
19:23
So if your request requires a query to another service, you get the request, you send a query to another service, and your process is not blocked by waiting for a response. It can
19:43
actually serve another request while waiting for the response to the previous one. A single process can be a server and a client, of course. And we can handle one request and any number of responses, so we can have
20:05
more than one response to the original request, but we can also have no response to the original request. We can just send notifications this way, and if the receiver is expected just to handle the notifications, it's not a problem. You don't have to send a response.
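The code shown on the slides is not part of this transcript, so the following is only a toy, in-memory sketch of the shape described above: decorator-registered server and client callbacks, a non-blocking send, and a final listen step that routes messages. The class `MiniCalls` and all of its method names are hypothetical and are not the real library's API; a deque plays the role of the broker, and `listen` returns once the queue is empty instead of blocking forever.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class MiniCalls:
    """Toy in-memory stand-in for a callback-based messaging library (hypothetical)."""

    def __init__(self, service_name: str) -> None:
        self.service_name = service_name
        self._server: Dict[str, Callable[[dict], None]] = {}
        self._client: Dict[Tuple[str, str], Callable[[dict], None]] = {}
        self._queue: Deque[dict] = deque()  # plays the role of the broker

    def server_callback(self, endpoint: str):
        def register(func):
            self._server[endpoint] = func
            return func
        return register

    def client_callback(self, service: str, endpoint: str):
        def register(func):
            self._client[(service, endpoint)] = func
            return func
        return register

    def send_request(self, destination: str, endpoint: str, payload: dict) -> None:
        # Non-blocking: the message is only queued, nothing is processed here.
        self._queue.append({"kind": "request", "to": destination,
                            "endpoint": endpoint, "payload": payload})

    def send_response(self, request: dict, payload: dict) -> None:
        self._queue.append({"kind": "response", "from": request["to"],
                            "endpoint": request["endpoint"], "payload": payload})

    def listen(self) -> None:
        """Drain the queue, routing every message to its registered callback."""
        while self._queue:
            message = self._queue.popleft()
            if message["kind"] == "request":
                self._server[message["endpoint"]](message)
            else:
                self._client[(message["from"], message["endpoint"])](message)


calls = MiniCalls("pricing")
received: List[int] = []


@calls.server_callback("/offers")
def offers(request: dict) -> None:
    # One request may produce zero or more responses.
    for price in (120, 99):
        calls.send_response(request, {"route": request["payload"]["route"], "price": price})


@calls.client_callback("pricing", "/offers")
def on_offer(response: dict) -> None:
    received.append(response["payload"]["price"])


calls.send_request("pricing", "/offers", {"route": "KRK-SFO"})
received_before_listen = list(received)  # still empty: sending did not block or deliver
calls.listen()
```

In this toy a single process acts as both server and client, which the talk says is allowed. A notification-only endpoint would simply never call `send_response`.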
20:23
So okay, now I've shown you the basics. I could talk for a long time about the different things and details that are inside, but I just want to tell you about one single thing. For me it's one of the most important things: how do you test it?
20:41
Is there a way to easily test this with unit tests? So yes, asynccalls has a testing mode. You need to enable the testing mode, and then you can use it in your unit tests. To enable the testing mode, you just set the testing flag to true.
21:00
Then you import your application, which defines all these callbacks. And finally, you must have a fixture that will reset the testing mode between the tests, because we need to clear some buffers. When you have the testing mode enabled, you can start testing. Okay, so we can have
21:21
different use cases that we want to test. The most basic is that we have a server endpoint and we want to check if it gives correct responses. So we want to send it a request and verify what responses we got and whether they are correct. The most basic thing.
21:40
So we need a way to send a test request to an arbitrary service. We can do that because, if you turn on the testing mode, asynccalls gives you something which is shown here. It's a test client, and you can use this test client to create arbitrary
22:01
messages and to send these arbitrary messages wherever you want in your unit tests. Okay, so that's how you send a request to a tested service. You send this request, and when you send it, the test client will receive the responses for you. So when you send the request, you can immediately check what responses were received, and
22:27
then you can just assert the correctness of these responses. So we don't have to think about Kafka below, about messages, or anything. You just send a request according to business rules and verify the response according to business rules.
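The idea of a test client that sends a request and immediately hands back the responses can be sketched with plain functions. This is a minimal illustration under stated assumptions, not the library's real testing API: `ENDPOINTS`, `server_callback`, `ToyTestClient`, and the `/fare` endpoint with its prices are all invented for the example.

```python
from typing import Callable, Dict, List

# Hypothetical registry of server endpoints, filled in by the decorator below.
ENDPOINTS: Dict[str, Callable[[dict, Callable[[dict], None]], None]] = {}


def server_callback(endpoint: str):
    def register(func):
        ENDPOINTS[endpoint] = func
        return func
    return register


@server_callback("/fare")
def fare(request: dict, respond: Callable[[dict], None]) -> None:
    """Endpoint under test: sends a partial response, then a final one."""
    base = 100 if request["cabin"] == "economy" else 300
    respond({"status": "partial", "price": base})
    respond({"status": "final", "price": base - 10})


class ToyTestClient:
    """Calls the endpoint directly and collects every response it sends."""

    def send(self, endpoint: str, payload: dict) -> List[dict]:
        responses: List[dict] = []
        ENDPOINTS[endpoint](payload, responses.append)
        return responses


def test_fare_endpoint() -> List[dict]:
    responses = ToyTestClient().send("/fare", {"cabin": "economy"})
    # Assertions are written purely in terms of business rules.
    assert [r["status"] for r in responses] == ["partial", "final"]
    assert responses[-1]["price"] == 90
    return responses


result = test_fare_endpoint()
```

No broker or network is involved, so the test only needs to know the request and the expected responses.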
22:43
That's all. That's the simplest thing to do. More complicated is when we have another service and we are testing the service in the middle, a money broker,
23:00
and this service, when it serves our original request, is expected to notify another service, a notification receiver, about some things. So we are testing the money broker, but while testing it we want to verify that correct notifications would be sent outside. Of course,
23:22
we don't want to spin up all the infrastructure; we want to have it mocked in our unit tests. We don't want real communication. So we need a way to mock this other service, just to verify what it would receive if it was really started.
23:40
So, aside from the test client, asynccalls in testing mode also gives you a test server. In this test server we can just register an endpoint for the service we want to mock. It was called notification receiver;
24:04
there was a notify endpoint. This test server will just receive these messages for us and will allow us to retrieve these messages and check if correct messages were received. So we register it, then we trigger the endpoint we want to test. We can
24:22
do some assertions about responses, as in the previous example, and we can use the test server's received requests to obtain the requests that were received by the mocked service. And the most complicated thing is when this third service is actually expected to
24:41
respond to some queries, and these responses should be used to generate the response we are testing. So we want to mock this whole service together with its responses, to check that correct output will be produced. So here we need to mock the random service.
25:01
To do it we just define a generator function, and when we register the endpoint in the test server we just give it the fake responses generator. It will generate responses, and we will be able to verify that the payloads on the output are correct.
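A mocked third service driven by a generator of fake responses might look like the sketch below. Everything here is an assumption made for illustration: `ToyTestServer`, `register_endpoint`, the `rates` service, and the conversion logic are invented and do not reproduce the library's actual test server.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

FakeResponses = Callable[[dict], Iterator[dict]]


class ToyTestServer:
    """Records requests sent to mocked endpoints and replies with fake responses."""

    def __init__(self) -> None:
        self._fakes: Dict[Tuple[str, str], FakeResponses] = {}
        self.received: List[dict] = []

    def register_endpoint(self, service: str, endpoint: str,
                          fake_responses: Optional[FakeResponses] = None) -> None:
        # Without a generator the mocked endpoint only records notifications.
        self._fakes[(service, endpoint)] = fake_responses or (lambda request: iter(()))

    def call(self, service: str, endpoint: str, payload: dict,
             on_response: Callable[[dict], None]) -> None:
        self.received.append(payload)
        for response in self._fakes[(service, endpoint)](payload):
            on_response(response)


def quote_endpoint(request: dict, remote: ToyTestServer,
                   respond: Callable[[dict], None]) -> None:
    """Service under test: asks a 'rates' service and converts the price."""
    def on_rate(rate_response: dict) -> None:
        respond({"price": round(request["price_usd"] * rate_response["rate"], 2)})

    remote.call("rates", "/rate", {"currency": request["currency"]}, on_rate)


def fake_rates(request: dict) -> Iterator[dict]:
    # Generator function: each yielded dict is one fake response.
    yield {"rate": 4.0}
    yield {"rate": 4.5}


server = ToyTestServer()
server.register_endpoint("rates", "/rate", fake_rates)
outputs: List[dict] = []
quote_endpoint({"price_usd": 100, "currency": "PLN"}, server, outputs.append)
```

The test can then check both sides: what the mocked service received (`server.received`) and what the tested endpoint produced from the fake responses (`outputs`).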
25:23
It's correct. So we have testing tools out of the box. In testing mode the asynccalls calls are made on the stack, so the tests are deterministic. We don't need a message queue or broker. We don't have to think about how this IPC is actually done below; we just think about the business level in testing.
25:47
We also have many more features, like before-receive and before-send hooks, like endpoint context managers: if you want to measure the performance of your endpoints, you can hook a context manager
26:02
around the endpoint. We have error handlers for endpoints, a Kubernetes health check because we run it on Kubernetes, we have a cURL-like client, and a lot more. Of course, if we hide
26:21
some complexity, we also hide some opportunities, not only to make errors. So, you know, if you want Kafka Streams, for example, sorry, we won't be able to deliver it, because we hid the specifics of Kafka below the abstractions.
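The endpoint context managers mentioned a moment ago, used to measure endpoint performance, can be illustrated with the standard library alone. This is a sketch of the general technique, not the library's hook mechanism: the `measure` context manager, the `with_context` decorator, and the `durations` dictionary are hypothetical names.

```python
import functools
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List

# Hypothetical in-memory store of measured durations per endpoint name.
durations: Dict[str, List[float]] = {}


@contextmanager
def measure(endpoint: str) -> Iterator[None]:
    """Record how long the wrapped endpoint call took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations.setdefault(endpoint, []).append(time.perf_counter() - start)


def with_context(endpoint: str, manager: Callable[[str], object]):
    """Run every call of the decorated endpoint inside the given context manager."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with manager(endpoint):
                return func(*args, **kwargs)
        return wrapper
    return decorate


@with_context("/offers", measure)
def offers(request: dict) -> dict:
    time.sleep(0.01)  # pretend to do some work
    return {"echo": request}


result = offers({"route": "KRK-SFO"})
```

Because the measurement lives in the wrapper, the endpoint body stays free of timing code, which matches the goal of keeping such concerns in one place.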
26:41
Okay, so we can still have these problems. Concurrency issues we will have, because we have asynchronous communication; we won't run away from that. But for developers it's all on the level of the business logic they are implementing. They don't have to think about Kafka usage patterns,
27:00
driver usage patterns, and so on, and if there are some problems below, they can be solved in one place, and actually were solved in one place, without bothering a lot of developers. So, switching from HTTP to asynccalls is straightforward for a server, it's not a problem. For a client it's a little bit more complicated.
27:21
We support one-way communication. If we have more complex use cases, yes, it's a matter of doing it well. We have callbacks, so we can always end up in callback hell, but we also know that there are patterns to do it well, okay? So we can build something above it if we need to.
27:43
We have easily testable services, because we have tools to do it. And we now have a standard project-wide layer for asynchronous communication between the services. Thank you. I have three more cups. I won't be throwing them, but please
28:03
grab them and don't make me take them back to my place by plane. Any questions? We have time for a few questions, one or two maybe.
28:20
Yeah, thank you for the talk. An interesting thing is how to design exceptions and errors. How did you approach that? As I said, we have exception handlers, so you can register an exception handler for specific exceptions for your whole service.
28:41
So an exception handler is a function that will be called when an exception is raised in your endpoint, of course. Okay, for the HTTP developers, they don't see much of the Kafka specifics, but there's a layer, your library layer, in between, throwing Kafka-specific errors?
29:04
Exception handlers are for the exceptions that are coming out of the callbacks. As for Kafka exceptions, well, they shouldn't see them, because we should debug the library; if they do, we must fix it. Yeah, like the message length is fixed, and stuff like that,
29:22
having too much payload and stuff like that, you probably need to abstract that. You know, in message-driven communication the errors are handled in a different place, on the level of the receiver rather than on the level of the sender of the request. That's also the tricky thing
29:44
when you switch from HTTP. You don't get an error 500 because your service is down. Your message is just waiting, and when the service is up after, like, 10 minutes, it will serve your stale requests, and maybe it will also send responses to you.
30:04
Unfortunately, that's all the time we have, but maybe you can grab him afterwards. I'm around, I'm around. So, questions after this. One big round of applause. Thank you very much.