Using Service Discovery to build dynamic Python applications
Formal Metadata
Title: Using Service Discovery to build dynamic Python applications
Title of Series: EuroPython 2016
Part Number: 63
Number of Parts: 169
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers: 10.5446/21091 (DOI)
Language: English
Transcript: English (auto-generated)
00:00
Let's go. Okay, so we're going to talk about service discovery today, and we'll focus on the client side: how do you use service discovery in Python? I won't
00:20
argue about whether or not you should use service discovery in this talk, and I won't explain how to install the three technologies I will cover; I will just focus on their usage. And if we have time, which I hope we will, I'm crazy enough to have prepared a live demo, so we'll try it. It's an opinionated talk,
00:51
so that's my point of view here. A short introduction about me: you can find me under the nick Ultrabug. I'm a Gentoo Linux developer, where I work mostly on
01:05
cluster stuff and Python stuff; I maintain packages related to NoSQL key-value stores and message queueing. I'm also CTO at Numberly, a programmatic and data-driven marketing and advertising company. We have a
01:23
booth over there with a quiz where you can win some crazy stuff, so just come around and we can have a talk. Okay, so what is service discovery? To make it short, you can compare it to what DNS is for your browser, but in a dynamic way.
01:43
When you connect to a website, your browser first has to find out the IP address of the host serving the website you want to reach, and to do so it makes a DNS query. Beforehand, when you own the web service and the
02:03
website, you had to configure the DNS and register the IP address of your server in it. Service discovery is about the same thing: it's about registering and querying, but for services. That's the basics of it;
02:23
let's see a bit more about it. So we have a catalog, which is provided by the service discovery technology, and then you have your servers. Each of them provides a service, and some of them provide the same service. They will register
02:42
themselves into the catalog, so you get a list of "this service is running at this host and port", multiple times if the service is running on multiple servers. Then you have clients. The clients will be looking for
03:02
a service, usually by its name; they will query the catalog for the given service and be handed back a list of available hosts providing said service. That is service discovery.
03:22
Now let's take a quick tour of the three technologies I will cover here. The first one is the oldest; it's named ZooKeeper and it comes from the Apache Foundation. ZooKeeper was primarily designed as a reliable cluster coordinator; it's used mostly and mainly in Hadoop. It has some pretty interesting features and it's mature,
03:47
since it's the oldest of the three technologies we cover here. When I say, in the negative points, that it doesn't provide service discovery per se, that's true, but we'll get back later to how we can still use ZooKeeper
04:06
to achieve service discovery; what I mean by this is that it's not a built-in feature of ZooKeeper. The main design of ZooKeeper, and it's the same for etcd which we'll see just after this, is that you can compare them to a
04:24
distributed hierarchical file system, which is also comparable to a key-value store; you'll see about that. It's written in Java and it uses its own implementation of a consensus algorithm; the consensus is about making sure all
04:44
the nodes of the ZooKeeper cluster agree on something. The Python C bindings are not usable: there is one provided in the sources, but it's not really usable. And, even worse for service discovery, it's not a
05:05
data-center-aware technology: it just knows about its own cluster. Now you have etcd. etcd is from the CoreOS guys; it's a pretty recent project.
05:21
It's written in Go and it uses the Raft consensus algorithm, which is pretty robust. It has good adoption, it's used in many bigger projects like Kubernetes, and it provides an HTTP API to do all the querying and registration stuff. It's
05:44
pretty simple, really simple to deploy and configure. Just like ZooKeeper it doesn't provide a service discovery mechanism per se, but we will use the file system hierarchy to achieve this. It's not data center aware either, and it
06:04
doesn't provide any kind of health checking of your services once you register them; we'll see about that later as well. The third one is Consul, from HashiCorp; that's the newest of the three. It's also written in Go, and it's also using the Raft consensus
06:24
algorithm. And yeah, I told you it's an opinionated talk, so I didn't find any bad things to say about it, because it has a built-in service discovery feature and it's data center aware, so you can have multiple Consul clusters, one
06:42
in each data center, and they can talk to each other. It also provides a DNS API, so you can also look up services using DNS, which is quite a nice feature.
07:04
The note I wanted to stress about ZooKeeper and etcd is that we will achieve service discovery by abusing the key-value store. You can see these key-value stores as a sort of file system where you can store data. Registering is about creating a node, or a folder, or a file if you want to
07:25
relate it to your local file system, and making it meaningful. In this kind of example, at the root of the hierarchy I will say: okay, the first level will be my service name, apache; then on the second level I will create a folder
07:45
which will represent all the servers providing this service, so I call this folder "providers"; and then inside I will create nodes, or, if you relate it
08:01
to files, files which are named "myhost", a colon, and the port. So discovering providers for the apache service is just like listing the content of the /apache/providers directory. Fine; we can do the same with memcached and things like
08:26
that. That's how you can abuse key-value-store based technologies such as ZooKeeper and etcd to achieve service discovery.
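A minimal Python sketch of that path convention, using the hypothetical service and host names from the example above:

```python
# Hypothetical key layout described above: /<service>/providers/<host>:<port>
service = "apache"
host, port = "web-01", 80

provider_key = "/{}/providers/{}:{}".format(service, host, port)
print(provider_key)  # -> /apache/providers/web-01:80

# Registering  = creating that node in the key-value store
# Discovering  = listing the children of /apache/providers
```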
08:41
Okay. Now let's look at the Python client libraries to talk to each of those technologies. For ZooKeeper the first ones are kazoo and zc.zk, and yeah, I know, I'm sorry about this, we can be a very, very creative community. We'll use the second one, zc.zk, which underneath uses kazoo, so you can see zc.zk
09:08
as a service-discovery-oriented wrapper around kazoo; it's pretty handy. Then for etcd we have the standard python-etcd library,
09:22
which is pretty good (and if you use asyncio I think there is another one for that kind of stuff as well). And for Consul you have consulate and python-consul; we will use python-consul, which is now better documented and more active
09:41
than consulate. Last year it was the contrary, but this year python-consul is very nicely implemented, so good job, guys, thank you.
10:02
Okay, when you choose a technology you have to rely on it, even more so when it will be the
10:23
core of your whole topology. You have to make sure you can rely on the Python clients, because they really have a direct impact on your application. So let's look at the zc.zk client library, the one that uses kazoo. When you
10:46
want to connect to a ZooKeeper cluster you can specify multiple hosts, which is pretty cool. It has an auto-reconnect feature, you can query the connection state (you will get "connected", "disconnected" and so on) so your code can handle this gracefully, and it has rich
11:05
exceptions if something wrong happens. I'm providing a quick example here. "Doesn't fail on connect" means: if no server is available when I run the first line and try to connect to my ZooKeeper cluster, will it be
11:28
blocking or will it raise an exception? In this case it is blocking; you can change this with the wait parameter, but then it will raise an exception. So, for those of you who are used to the python-memcached library, you have to know about this and handle it, because it can block your whole application if no ZooKeeper server is up.
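A rough sketch of that connection step, with hypothetical hosts; the zc.zk keyword names (such as the wait parameter mentioned above) may vary between versions:

```python
import zc.zk

# zc.zk (built on top of kazoo) takes several hosts in one connection
# string and reconnects automatically.  By default this call blocks
# until a server is reachable; the talk mentions a "wait" parameter to
# make it raise instead (check your zc.zk version for the exact name).
zk = zc.zk.ZooKeeper('10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181')
```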
11:42
On the etcd side, with python-etcd you don't have the possibility to connect to multiple hosts, but it handles reconnection gracefully, so that's pretty good. You can't really query the connection state, but the exceptions are pretty rich, so you can
12:00
see what's happening pretty easily and catch the right exceptions for the different kinds of errors you may run into. And it does fail on connect.
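A rough sketch of what that looks like with python-etcd (host and key are hypothetical; exception names as in recent python-etcd versions):

```python
import etcd

# python-etcd talks to a single endpoint but reconnects gracefully; it
# fails (raises) right away on the first call if no server is reachable.
client = etcd.Client(host='10.0.0.1', port=2379)

try:
    client.read('/ep2016')
except etcd.EtcdKeyNotFound:
    pass  # the key simply doesn't exist yet, which is fine
except etcd.EtcdConnectionFailed:
    pass  # no etcd server reachable: handle it in your application
```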
12:25
The python-consul one is not quite as good as the zc.zk one either, because it doesn't support multiple hosts. It also has an auto-reconnect feature, and the exceptions are so-so; I'm providing an example here, and the connection error is sometimes not
12:45
very meaningful. But it doesn't fail on connect, meaning it's non-blocking: you just create your Consul client and carry on, nothing happens when you do that, which can be a good feature.
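A rough sketch of that behaviour with python-consul (the agent address is hypothetical):

```python
import consul

# python-consul does not fail on connect: building the client object is
# non-blocking, and nothing talks to the agent until the first real call.
c = consul.Consul(host='127.0.0.1', port=8500)

# Connection problems only show up on the first actual request, and as
# noted above the error is not always very meaningful.
try:
    c.agent.services()
except Exception as exc:  # typically a low-level connection error
    print("consul agent unreachable:", exc)
```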
13:08
Okay, now about service registration. There are three things you have to consider here, three states of a service lifecycle. The service is starting up and it needs to register
13:21
into the catalog. Then it's running, and you have to make sure it's still running, because if it's not running, if it crashes or the server providing said service becomes unavailable, you don't want to send clients to it; you have to remove it somehow from the catalog when it's down. That's the
13:45
dynamic part. And then, if we stop gracefully or if we crash, we have to de-register it from the catalog; the health checking will also do the de-registration for you in case of failure.
14:06
We'll see how it's done in each Python implementation. For zc.zk it's pretty straightforward. The main thing to understand is the first line over here: the
14:23
first try/except just creates the file system hierarchy I talked to you about. We make sure that we have /ep2016/providers, and we do a make_path, which creates the whole path like mkdir -p; if the
14:45
node already exists, that's okay, we can continue. Then zc.zk provides a cool method called register, and you say: on this providers node I will register a machine named yaz running on
15:01
port 5000, and it will create the node, like a file, named "yaz:5000" for you. That's all you have to do.
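A sketch of that registration, with the names used in the talk (/ep2016/providers, host yaz, port 5000); the helper names make_path and register are as described in the talk and may differ between zc.zk versions:

```python
import zc.zk
from kazoo.exceptions import NodeExistsError  # zc.zk sits on top of kazoo

zk = zc.zk.ZooKeeper('10.0.0.1:2181')

# Make sure the /ep2016/providers hierarchy exists (like mkdir -p);
# it is fine if it is already there.
try:
    zk.make_path('/ep2016/providers')   # helper name as described in the talk
except NodeExistsError:
    pass

# Register this provider: creates an ephemeral node named "yaz:5000"
# under /ep2016/providers, removed automatically when the session dies.
zk.register('/ep2016/providers', ('yaz', 5000))
```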
15:22
Now about health checking. The health checking in ZooKeeper is implicit, because ZooKeeper has this cool feature named ephemeral nodes. Ephemeral nodes are files, or nodes, in the file system hierarchy that are present on the
15:40
file system only as long as the session of the client who created them is alive. So whenever the client dies or closes its session, ZooKeeper will know about it and will remove the given node automatically.
16:03
It's a good way of doing health checking, because if your application crashes, or you just want to deregister, you just have to exit gracefully and close your session to ZooKeeper; ZooKeeper will remove all the nodes you created with this application. The register
16:25
method does exactly that: it creates an ephemeral node, so that's implicit in the zc.zk Python client. What about the failure detection latency? If my program is killed
16:42
with kill -9, or crashes badly and didn't have time to deregister gracefully, how long will it take for ZooKeeper to remove the node from the hierarchy? In other words, how long will it take for clients to stop being
17:02
served my host and port? It will take the session timeout. Here, when I created my client session, I said five, so it will take up to five seconds in this case. So for five seconds maximum I could be serving a
17:25
wrong host and port to my clients from the catalog; that's something you have to consider as well in such topologies.
17:41
On etcd it's basically the same principle: we try to read the providers node, and if we can't find it we create it as a directory. Then we just have to write: there is no register wrapper or anything like that, we just write the given node I talked
18:00
to you about, and we can set the data in it, so we also put the same host:port in the value; it's not a directory, and it has a time-to-live (TTL), which brings me to the health checking. You can see that it's getting difficult here. Why? Because etcd doesn't have the concept
18:25
of ephemeral nodes as ZooKeeper has. That means you have to implement health checking yourself, or use a third-party library or program to do it for you, but it has to be done. So in this example I'm doing it myself, and the trick I'm using is that when my application starts I
18:45
create a "health pinger" thread which constantly, in an infinite loop, re-registers my service; that is a sort of heartbeat or health check combined with the TTL. The TTL, the time-to-live of the node
19:05
I'm creating, means it will be removed from the hierarchy after TTL seconds, so my failure detection latency is the TTL, but I have to have a thread constantly making sure that my node is present, so that my service and server stay in the catalog.
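A rough sketch of that approach with python-etcd; the host, key names, TTL and the health_pinger helper are hypothetical, following the description above:

```python
import threading
import time

import etcd

client = etcd.Client(host='10.0.0.1', port=2379)

# Make sure /ep2016/providers exists as a directory.
try:
    client.read('/ep2016/providers')
except etcd.EtcdKeyNotFound:
    client.write('/ep2016/providers', None, dir=True)

def health_pinger(host='yaz', port=5000, ttl=10):
    """Re-register the provider forever.  The TTL is the failure
    detection latency: if this process dies, the key expires after
    `ttl` seconds and the provider drops out of the catalog."""
    key = '/ep2016/providers/{}:{}'.format(host, port)
    value = '{}:{}'.format(host, port)
    while True:
        client.write(key, value, ttl=ttl)
        time.sleep(ttl / 2.0)  # refresh well before the key expires

t = threading.Thread(target=health_pinger)
t.daemon = True
t.start()
```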
19:26
Okay. If you use Consul, everything is built in, and you can see in the code that it's pretty straightforward: I just have
19:43
to register my service into a Consul agent, which is also very self-explanatory: the name of the service, the address of the host providing it and the port it's running on. It's integrated, nothing more to do. The
20:07
health checking is interesting in Consul, because you have a way to make sure that the Consul servers will run the health checks of your service by
20:21
themselves. You just have to create, like in my example where it's an HTTP service, a health check object of kind HTTP, and I'm providing the URL that the Consul server should call every two
20:41
seconds. So I'm telling Consul: when I register, I pass the extra check argument, and I say to Consul, check this URL every two seconds; if it fails, remove me from the catalog, or, to be correct, mark me as failing.
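A sketch of that registration with python-consul; the service id, address and URL are hypothetical:

```python
import consul

c = consul.Consul()  # talks to the local Consul agent

# Let the Consul servers health-check the service themselves: an HTTP
# check hitting the service URL every two seconds.
check = consul.Check.http('http://10.0.0.42:5000/', interval='2s')

c.agent.service.register(
    'ep2016',                 # service name that clients will look up
    service_id='ep2016-yaz',  # unique id for this instance (hypothetical)
    address='10.0.0.42',
    port=5000,
    check=check,              # failing checks mark this instance as failing
)
```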
21:02
All right, how do you discover all of this? It's pretty straightforward as well, so I will just show you the querying part. For ZooKeeper you can get the addresses by listing the children of the given
21:25
node: I'm listing the children of the providers folder under /ep2016 and those are my nodes; I just have to loop over them, split on the colon, and I
21:44
get the host and port of every server providing my service. On etcd it's basically the same thing: you do a recursive read, you get the children, you split, and you get your hosts and ports. On Consul it's also
22:09
very easy: you query the health service, because you only want to get the healthy servers providing your service; that's the passing equals
22:21
true here, I only want it to return the servers and ports for which the health check is passing. Then inside I get a lot of information, it's a dictionary-style thing, and inside there are the host, the port and other interesting stuff.
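A rough sketch of those three queries side by side (endpoints and paths hypothetical; the exact zc.zk children call may differ by version):

```python
import zc.zk
import etcd
import consul

# ZooKeeper: providers are the children of /ep2016/providers, each one
# named "host:port" (children() per zc.zk; kazoo's equivalent is get_children()).
zk = zc.zk.ZooKeeper('10.0.0.1:2181')
zk_providers = [child.split(':') for child in zk.children('/ep2016/providers')]

# etcd: recursive read of the same directory, same "host:port" naming.
etcd_client = etcd.Client(host='10.0.0.1', port=2379)
result = etcd_client.read('/ep2016/providers', recursive=True)
etcd_providers = [node.key.rsplit('/', 1)[-1].split(':')
                  for node in result.children if not node.dir]

# Consul: ask the health endpoint and only keep passing instances.
c = consul.Consul()
index, nodes = c.health.service('ep2016', passing=True)
consul_providers = [(n['Service']['Address'] or n['Node']['Address'],
                     n['Service']['Port']) for n in nodes]
```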
22:43
Sounds good? Okay, now let's play. I have handed out three Raspberry Pis, and my
23:08
machine here is running a ZooKeeper, an etcd and a Consul agent. The idea I had is to showcase a service discovery page like this, where we will be looking
23:26
for the hosts providing the ep2016 service. I also wanted to demonstrate
23:47
the key-value storage, because all those technologies are also used for configuration access, so you can store your configuration in
24:02
these key-value stores and your applications can also get it from there. So the colour here, I don't know if it really changes because of the reload, is a colour I configured for my web service in ZooKeeper, in etcd and in Consul.
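A small sketch of reading such a configuration value (a hypothetical "color" key) from etcd and from Consul's KV store; with ZooKeeper the node data can be read in the same spirit through kazoo:

```python
import etcd
import consul

# The same stores double as configuration storage; key names are made up.
etcd_client = etcd.Client(host='10.0.0.1', port=2379)
color = etcd_client.read('/ep2016/color').value

c = consul.Consul()
index, data = c.kv.get('ep2016/color')
color = data['Value'] if data else None   # Consul returns the value as bytes
```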
24:25
So, Dirk, can you start running your Raspberry Pi? Raspberry Pi 4 is this one, which I plugged in a few
24:40
seconds ago, and I can just go to it like this. You can see that every time I change the colour in the key-value store it gets picked up by the application, in this case from ZooKeeper. And Dirk just
25:02
plugged in Raspberry Pi number one, which appeared and got discovered here by the server on every platform. So can you also plug in yours, and you too, and we'll see the others coming. What's interesting about
25:23
this is... okay, it's going to get hard now, I think my Raspberry Pi 4 is getting a bit overloaded here, that's the Wi-Fi, but it's okay. It's time I
25:45
reload. Okay, Dirk's Raspberry Pi is running, pretty awesome. You see that my Raspberry Pi 4 is not responding here; you can see that it's not responding, the health check failed on every one of them, and it has
26:05
been removed from ZooKeeper, etcd and Consul, so that's a good thing, it's working, right? Okay, so now we can see Raspberry Pi 2. Okay, Raspberry Pi 4 is
26:24
coming back somehow. Yeah, it's coming back. Okay, Raspberry Pi 3 on Consul, yeah, it's working as well. You can see that, oh yeah, the colour now, all right. So now we have the four Raspberry
26:46
Pis up and running and they seem to be pretty stable on the health check. I will now disconnect Raspberry Pi 4; let's see
27:03
how long it takes. It depends on the technology, because they have different kinds of TTL, ephemeral node session timeout or health check timeout. Okay, yeah, some of them are overloaded. Do you have any questions?
27:58
The client decides. So the question is: is there any kind of load balancing? No.
28:24
When your client queries the catalog it gets a list of all the available nodes for the given service, that's all; then it's up to you to decide which one you want to connect to. Yeah, I have a question
28:42
about redundancy: if you have an application that is dependent on the service discovery catalog for the different services it exposes, and the catalog for some reason crashes, how will you recover from that situation? Will you have, like, service discovery of the service discovery, or
29:03
how would you do that? No, you don't do service discovery of the service discovery. The minimum advised number of servers is three, so you should have at
29:21
least three ZooKeeper, etcd or Consul servers running. If you want more resiliency, make it five or seven, but an uneven number, always an uneven number. If something very bad happens and you don't have service discovery anymore, I guess you
29:41
have to handle it on your application side. You can do it with caching and things like that; it's not very easy, and it really depends on the type of application you're running, but the best course is to make sure your
30:01
service discovery cluster has enough nodes to sustain this kind of problem. Well, it depends on the technology, actually: as you saw, if you use ZooKeeper you can connect to multiple hosts, so you don't need a load
30:22
balancer in front of the nodes. On the other hand, with Consul and etcd you have to specify one of the nodes, so maybe you can implement something in your application to handle this, like having a deque or something like that in Python, and on each
30:44
exception, if it raises an exception, you can try and connect to the other host, and so on.
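A rough sketch of that kind of client-side fallback, cycling through a list of Consul endpoints with a deque (hosts, service name and behaviour are hypothetical):

```python
from collections import deque

import consul

endpoints = deque([('10.0.0.1', 8500), ('10.0.0.2', 8500), ('10.0.0.3', 8500)])

def discover(service='ep2016'):
    """Try each known agent in turn; rotate the deque on failure so the
    next call starts from a host that has not just failed."""
    for _ in range(len(endpoints)):
        host, port = endpoints[0]
        try:
            index, nodes = consul.Consul(host=host, port=port).health.service(
                service, passing=True)
            return nodes
        except Exception:        # connection failed, move on to the next host
            endpoints.rotate(-1)
    raise RuntimeError('no service discovery endpoint reachable')
```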
31:05
Yes, and for the recording, sorry: a question about the registration procedure. Why don't you want to use an external tool to do this? It could be implemented in configuration management, Chef, Puppet, Salt, and in
31:21
that case you would have the possibility to register third-party services like MongoDB and so on automatically. So the question is: why not do this as an external service for your application? Well, to me Chef and tools like
31:44
that are good for provisioning or configuration, really for applying configuration to services; I don't see service discovery like this. I relate to your point about MongoDB and daemons and things like
32:04
that, where you have external programs that do it for you. I'm not sure that, for instance, Chef and the like can have health checks running, so if you do it with etcd
32:22
it may become difficult. There are a lot of third-party libraries doing it, for etcd for example, because it has a wide audience, and for container stuff; and they use specific third-party tools, not provisioning tools. A registrator container that
32:45
automatically registers any container running on the Docker host, for example: it's a very good idea, I think, it works well, and if something happens to a container it is handled automatically. Yeah,
33:00
but it has to be registered somewhere anyway. We are running a local agent on each host machine, a local Consul agent, and each service knows it can find the agent on localhost; only the agent knows where the Consul
33:23
cluster is located. It can be a very easy way. Yeah, but then you don't have a central configuration place, or do you do that as well? We are using Salt to install everything, but the service discovery itself is implemented
33:44
using special containers. Another question? Well, thank you.