Architecture of a cloud hosting service using python technologies: django, ansible and celery


Formal Metadata

Title: Architecture of a cloud hosting service using python technologies: django, ansible and celery
Author: Martin, Abraham
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Production Place: Bilbao, Euskadi, Spain

Content Metadata

Subject Area
Abraham Martin - Architecture of a cloud hosting service using python technologies: django, ansible and celery. The talk will show the architecture and internals of a cloud hosting service we are developing at the University of Cambridge based on python technologies, mainly django, ansible, and celery. The users manage their hosts using a web panel, developed in django, with common options: ability to create a vhost, associate domain names to vhosts, install packages, recover from backups, make snapshots, etc. Interaction between the panel and the hosts is made using ansible playbooks launched asynchronously by celery tasks. The VM architecture has been designed to be VM platform agnostic and to provide disk replication and high availability. The University of Cambridge central IT services also provide other services to the rest of the university, like domain name registration, authentication, authorisation, TLS certificates, etc. We link all these other services with the hosting service by using APIs while keeping a microservices architecture approach, thus enabling the use of, and links to, other services within the same hosting service web application.
EuroPython Conference
EP 2015
EuroPython 2015
Hello everyone, I'm Abraham and I work for the University of Cambridge, as you may have guessed from the picture on the screen. I was thinking about beginning with an introduction so everyone knows what the University does, but I think I have little to say about that, so I am just going to show you some pretty pictures so you can see where I work, in this pretty place.
We work among this nice architecture and these nice rivers, where you can go punting with your friends during the summer. As you may know, we have a lot of clever people there, a number of Nobel prizes, and people who usually walk around the city dressed like this, at least during this season. When you go there you will see these people.
All of this seems pretty classic, but we also have pretty nice new buildings like this one, which is the new University data centre, one of the top data centres in the UK. It is pretty big, it is green, and we have a lot of innovation inside it. We even have an HPC service which in 2013 was the second most green system in the Green500. So it is not just classic buildings and architecture, we also have some cool things.
This is the Computer Lab, where I used to work, which many of you will know. I worked there as a postdoc. The building is called the William Gates Building because Bill Gates paid for half of it. And I work in this building now, which is the University Computing Service, where we provide IT services for the rest of the University. Both buildings share a common history:
They used to be the Mathematical Laboratory, where this machine was built. Brownie points if you know what it is: it is the EDSAC computer, one of the first computers in the world based on the von Neumann architecture, and it was built there. We still have some pieces of it. But other things were also built there and in the place I am working now, the University Computing Service, like Exim, which you probably know about because still around 50 % of mail servers use it. So we have pretty good people working there (I am not one of them), people who work on a lot of open source projects. So what I want to
explain to you today is one service that was proposed a lot of years ago, which is the Managed Web Service. It was born to solve a problem. We have a lot of researchers in the University, as you may know, and a lot of them hold a conference, or do research and want a simple website with some statistics to show results from the research, or even a questionnaire. So they end up running their own web server under their desk, and it is usually not maintained, because academics use that computer for the conference and then leave it switched on. It is not updated, and then we get security problems and security incidents.
So the proposal from the University Computing Service for solving the problem was to provide these web servers: a service where you do not have to worry about the operating system or the hardware, you only have to worry about maintaining the web applications. We maintain the machine and we give you basic web hosting, you do not have to worry about backups, and you have some dedicated resources.
That is a very old idea, from about 15 years ago. The first version of the Managed Web Service was using Solaris 7 running on Sun machines, so you can see that it was using very old stuff. It used a chroot system to manage the separation between the different sites. The second version moved to a newer Solaris and started to use Solaris Zones, which is a kind of virtualisation inside the Solaris machine. It is a kind of container, so we were using containers before it was cool. It seems pretty old, but it had more advanced features: it was database driven, with scripts acting on the information in a database, and it had a ZFS file system, which also gives us snapshots, which is good. The users were able to create vhosts.
But the problem was that everything was manual. The users sent us an email saying "I want this", then we made the changes and executed the scripts, so everything needed some sort of human intervention. As the service grew (we now have more than 200 users) it became a little difficult to manage, because it requires a lot of time.
Before we ended up making a new version of the Managed Web Service, another solution was the Falcon service, which is Plone based. You only get admin rights and you do not get access to the server or anything; it is CMS as a service, and we have around 200 websites there. So if you go to a University website you will probably end up in the Falcon service or the Managed Web Service. For example, one of the most visited websites inside the service is Stephen Hawking's website.
So we decided to make a new service from scratch, while respecting what we had done before, because we do not have more Sun machines, they are pretty old and we do not have support for them. And we thought: let's do more automation, so it requires less and less time from us. The concept at the beginning and at the end is basically the same as what was proposed by the previous versions, as we will see later. And to avoid these emails that come to our inbox saying "can you please install this package", we created a
web panel using Django, where we delegate some power to the users, so users can do things without having to involve us. In this architecture they basically get a VM, a machine, and we install the basic packages that we have been installing up to now: Apache, MySQL and PHP, which is the most commonly demanded stack. But we have to support alternatives as well, for example if you want to install Python, Django, etc. We have a list of approved system packages that you can install, so you end up with a machine with all the packages you need for your site. We give the users the power to authorise other people on their sites, create vhosts, apply for domain names, install TLS certificates on the machines, recover from backups, manage snapshots, etc. So we give the users the power to do the things that we were doing before.
Finally, do not blame me for the design. It is an in-house design; if you visit any Cambridge website you will see that all of them look exactly the same. It is just a panel with some options to manage your site. So this is the web panel in Django, and you have some options to create vhosts, ask for domain names, etc.
You also get a test VM, which is the test server. You can clone your production server to the test server and test things there without compromising your production server. That is good, especially for people who are still on the old Managed Web Service, where they made a change in production and things happened. Here you can test it before anything goes wrong, and then clone it back to the production server. So the architecture looks like
this, and we will go through it one by one to see how we did it. This is not a talk about OpenStack, so do not expect any of that. We did this project in a few months and we are still working on it, but most of it has been made with roughly 1.2 to 1.3 staff, so it is not many people. Although it seems like a huge service, it is not that big. As you can see in the image, the VM service is separated from the rest of the stack, so we start by describing the VM
architecture. Again, this is just a VMware solution. As you may know, VMware solutions are just ESXi servers, and you can manage the ESXi servers using a vSphere control panel or an API. We have a backup server where we do the backups, and that one is not replicated, so if something happens we rebuild the VM and recover things from the backup.
The flow is always the same. The user enters the Django panel and authenticates, so we know who he is. Then he asks for a new managed web server, and a host name and IPv6 and IPv4 addresses are allocated to the site. The VM API creates a new VM and installs the OS. Once the OS is ready and SSH is set up, Ansible is the one that configures the whole machine.
So we are using Ansible for configuration management and it does everything we need. Ansible features a bunch of things. The tasks are YAML scripts, so it is very easy to understand what they are doing. It separates things into folders and files, which is pretty good, because you can find what you are looking for. It also has an inventory, so you can define all your servers either dynamically or statically: you can have a file with all your servers, or you can inject the output from an API, a CMDB if you have one, or even a database. So it is pretty nice and it works really well.
It is based on playbooks. A playbook is just a bunch of roles and a bunch of targets. The roles are the definition of the things you want to install on the machines that have these roles, and with the targets you say: on these machines I want to install this role. For example, a web server has the web server role, and the web server role has these tasks: install Apache, copy the Apache configuration, etc. It is YAML based, as I said before. You define the hosts where you want to install things and then you define the roles that the machines in this list have.
For each role you have tasks, templates, handlers and variables, which you can have as global variables or as variables embedded in the script. This is how a role looks: just a bunch of tasks inside the role. It is a YAML file, as you can see, and it is pretty easy to understand what you are doing. If you are working with more people, it is easy to modify a file and change the configuration. You can see here that the templates can be used with variables: we use variables inside the templates, and therefore we use the same template for the configuration of all machines. You also have handlers, which are basically callbacks: when some task in Ansible is executed, you get a callback later. For example, if you have changed the Apache configuration you want Apache to restart, and then the handler is called.
So that is the VM part. We use the VMware infrastructure, we use the API, we create the VM and everything, and after that, as I said, Ansible configures the machine and then we can offer the service to the user. If we start from the top of the stack, we see authentication.
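The inventory idea described above, where Ansible is fed from a database or an API rather than from a static file, can be sketched as follows. This is only an illustration under assumptions, not the service's code: the `VM` record, its fields and the example host names are hypothetical. The output layout (groups with a `hosts` list plus `_meta.hostvars`) is the JSON that Ansible accepts from a dynamic inventory script called with `--list`; `ansible_host` is the variable name used by current Ansible releases.

```python
import json
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical record, for example one row of the panel's database."""
    hostname: str
    ipv4: str
    roles: tuple  # Ansible groups this machine belongs to


def build_inventory(vms):
    """Return a dict in Ansible's dynamic-inventory JSON layout."""
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    for vm in vms:
        for role in vm.roles:
            groups[role]["hosts"].append(vm.hostname)
        # Per-host variables; ansible_host tells Ansible where to connect.
        hostvars[vm.hostname] = {"ansible_host": vm.ipv4}
    inventory = dict(groups)
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


if __name__ == "__main__":
    example = [
        VM("site-1.example.org", "192.0.2.10", ("webservers",)),
        VM("site-2.example.org", "192.0.2.11", ("webservers", "test")),
    ]
    # A real inventory script would print this when called with --list.
    print(json.dumps(build_inventory(example), indent=2))
```

A playbook can then target the `webservers` group, and every machine recorded with that role receives the web server role without any inventory file being edited by hand.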
For authentication we use Raven. Raven is the University's authentication service. You will see that we have a lot of services interconnected using a lot of APIs. Raven has its own web authentication API, so we had to build a custom Django backend, but this could be substituted by any authentication that you use: you can use the default one if you want, or you can link it with your own enterprise one. The second
layer is authorisation. We have a kind of LDAP service, called Lookup, and what we have there is just a list of users and a list of groups. We can see which institutions these users belong to and which groups they belong to. So the end user can authorise other users based on this list: they can search for other users and authorise them as administrators of the site, authorise by groups, etc. So it is just a basic list. We use that instead of the Django groups because it is more useful for us: everybody in the University uses this service, so they create the groups there and they are automatically updated if someone leaves.
So when the user has authorised another user to enter the machine, we still need to install that user on the machine. For that we have another service over there, called Jackdaw, which provides more information about users; it is like user management. From that we get a unique UID for the user. We need that because if we have the same user on different machines, we still need to identify the files that belong to him on those different machines, so the user has the same UID everywhere. With this UID the user is installed, using Ansible as well, on all the VMs where he is authorised. We have regular refreshes of the Lookup groups we have authorised, in case the groups change or the people in them change. We also allow people to upload their SSH keys, and these keys are installed in the user configuration, so they can log in using their password, which is checked against a central server, or using their SSH key.
So once they can access the Django panel and they have their user installed on the VM, they can already access the machine and everything is configured for them, so they can start setting up their site. We also have communication with the IP Register API, which is at the bottom there.
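The authorisation model described above, where a site owner authorises individual users or whole groups and group membership is refreshed from the directory, can be sketched like this. It is a minimal sketch under assumptions: the `Site` class, the function names and the example identifiers are hypothetical, and the group membership is passed in as a plain dict rather than fetched from the real directory service.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical model of a hosted site and who may administer it."""
    name: str
    authorised_users: set = field(default_factory=set)
    authorised_groups: set = field(default_factory=set)


def is_authorised(username, site, group_members):
    """True if the user is authorised directly or through any group.

    group_members maps a group name to the set of its current members.
    In the setup described in the talk that data would come from the
    directory service; here it is a plain dict.
    """
    if username in site.authorised_users:
        return True
    return any(
        username in group_members.get(group, set())
        for group in site.authorised_groups
    )


def users_to_install(site, group_members):
    """Everyone who should have an account on the site's machine."""
    users = set(site.authorised_users)
    for group in site.authorised_groups:
        users |= group_members.get(group, set())
    return users
```

Recomputing `users_to_install` after each refresh of the group data gives the list of accounts that the configuration run should keep on the machine, so a person who leaves a group drops out of the result.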
This is also in-house, and it is another external service. As you can see, we have the main service and then a lot of other services that we communicate with. This one provides the University registration for cam.ac.uk domains. So if you want to register a cam.ac.uk domain, we launch an API request from the same Django panel and then we get the domain name for that site. Everything is done automatically and the user does not have to worry about the internal processes. This API tells us whether the user is authorised for that domain name, whether the domain name is already in use, etc.
The same API provides us with IP addresses. We have to preallocate some of the IP addresses together with their host names, so that when the user requests a new site he can access it directly using the name, without having to wait for a DNS refresh. So what we do is just preallocate some addresses, and then when the user gets their site they can access it using the host name without waiting for the DNS refresh.
We use service addresses as well: we have one address and one host name for the host, and another for the service. So if we want to move the service to another machine, we can do it without having to modify the host name or the address of the host machine. We can separate what is the service and what is the host, and we can move the service. You will see that this is useful later on.
Additionally we have SSHFP records in the DNS. Does anyone know what an SSHFP record is? No one? I am sure pretty much all of you have seen a screen like this one. What you do is upload an SSHFP record to the DNS with the fingerprint of your public host key. Then, if you have DNSSEC, you do not have to check the host key fingerprint manually, because the DNS does it for you: DNSSEC gives you the fingerprint securely, the client can check that the fingerprint presented by the server is the one that is in the DNS, and you do not have to check manually that the machine you are connecting to is the one it is claiming to be.
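To make the SSHFP mechanism above more concrete, here is a small sketch of how such a record can be derived from a public host key. The function name and the example host are hypothetical; the record layout (algorithm number, fingerprint type, hex digest of the decoded key blob) follows the SSHFP record definition, with fingerprint type 2 meaning SHA-256.

```python
import base64
import hashlib

# SSHFP algorithm numbers registered for the record type.
KEY_ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ssh-ed25519": 4,
}


def sshfp_record(hostname, public_key_line):
    """Build an SSHFP record (SHA-256 fingerprint) from a public key line.

    public_key_line uses the usual "<type> <base64 blob> [comment]" form
    found in a host's *.pub key files.
    """
    key_type, blob_b64 = public_key_line.split()[:2]
    algorithm = KEY_ALGORITHMS[key_type]
    blob = base64.b64decode(blob_b64)
    fingerprint = hashlib.sha256(blob).hexdigest()
    # Fingerprint type 2 means SHA-256 (type 1 is SHA-1).
    return f"{hostname}. IN SSHFP {algorithm} 2 {fingerprint}"
```

In practice `ssh-keygen -r <hostname>` prints the same kind of records for the keys of a host, and an OpenSSH client only consults them when its `VerifyHostKeyDNS` option is enabled.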
That is pretty useful. Also, as we have a lot of services, we have an inventory there, which uses another API. It is a REST API that can be consumed by other services, so we can use it externally as well.
We know where all the VMs are, where they are located, the IPs they have, etc., so it can be used as the inventory for Ansible, or it can be used for other purposes.
So, as you see, we have a lot of different services to access: we use a number of HTTPS APIs using JSON, authentication, etc. But we have to deal with all of them in an asynchronous way, because we do not want the main Django thread to be stopped by them. So what we do is execute them as background processes. You can use cron jobs, which is the easy way, if you do not need them to be executed right after the user has launched the provisioning. Or, if you want the execution right away, you can use Celery, which is what we use. Celery is pretty good for us because it works pretty well with Django and it is very easy to configure: you just have to add a decorator on top of the function, which is "shared_task". You can also use different templates for a task: you can define the number of retries, and you can define what happens if it fails, for example log something or send you an email. So it is pretty easy to use and it works well. You can also execute cron jobs from Celery, by scheduling them as periodic jobs, just as you would with cron. So it is pretty useful for us and it works well.
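The pattern described above, a Celery task declared with the `shared_task` decorator that retries on failure and launches an Ansible playbook in the background, might look roughly like this. It is a sketch under assumptions: `site.yml`, `inventory.py`, the task name and the retry settings are hypothetical, and the fallback decorator exists only so the snippet can be imported where Celery is not installed.

```python
import subprocess

try:
    from celery import shared_task
except ImportError:  # lets the sketch be imported without Celery installed
    def shared_task(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


def playbook_command(playbook, inventory, hostname):
    """Command line that runs one playbook against a single host."""
    return ["ansible-playbook", "-i", inventory, "--limit", hostname, playbook]


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def configure_vm(self, hostname):
    """Background job: apply the configuration to one machine."""
    command = playbook_command("site.yml", "inventory.py", hostname)
    try:
        subprocess.run(command, check=True)
    except subprocess.CalledProcessError as exc:
        # Ask Celery to run the task again later, up to max_retries times.
        raise self.retry(exc=exc)
```

A Django view would then call `configure_vm.delay("site-1.example.org")` and return immediately, leaving the playbook run to a Celery worker.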
Celery and these APIs and services support all of this, together with Ansible. The changes made in the panel are persisted in the database, and then Ansible, when it is executed, takes these changes from the database and executes them on the VMs.
So we had the service, and we went to the community in the University and said: we have this for you, we think you will like it. And they said: we would like it, but what about the availability of the service? And then we said that we have backups and you can recover from a backup. But that was not enough for them, so we went back to rethink the architecture in order to provide high availability.
Luckily for us, we had designed the application so it can cope with different VM architectures. That is good, because you do not have to worry about the VM architecture that you are using: you are creating the VM using an API, which may be the one that we are providing or could be an Amazon EC2 server, and then we execute everything through Ansible, and Ansible only needs an SSH connection. So it is pretty agnostic, and we just needed to replace this component, which is the VM architecture.
With VMware we have live migration, but then we saw that we need replicated storage. We have a SAN for a lot of services, but it is very expensive to maintain, because you need a storage system shared between all the hosts. We also have a lot of things depending on the SAN, which is pretty risky. For a long time we have had economic restrictions, because it is expensive to acquire all the hardware that we need. So we decided to abandon the SAN and the shared storage system. What we do instead is replicate the file system of each one of the VMs to another one.
We thought we could keep using the VMware infrastructure. We had Pacemaker clustering, which is basically the cluster software that checks that all of the VMs are in contact with each other and then manages the service address. This is why it is useful to have service addresses: they can be redirected to either of these two production VMs. So we have a replica: the second one, the second column, is just a VM that is waiting for the switchover to happen so it can start acting as the active server. Then we replicated the storage individually for each one of the VMs using DRBD, which is basically a driver that sends all the writes of one machine to the other VM, so the storage is replicated to the second one, and Pacemaker takes care that if some of the components fail, the switchover is made automatically.
So we had it working, but then we found that the problem with this is maintaining a lot of clusters. We may end up with one cluster for each one of the VMs that we are going to have, because for each one of them we would need a Pacemaker cluster. This is very expensive and it may fail. And if we need to execute Ansible, we need to do it on both sides, on the two VMs, so they stay synchronised. So it adds a lot of work and it is going to break quite easily.
So again we started from scratch. We moved away from VMware and decided to use Xen. The architecture is very similar to the VMware one: you can see that there are two Xen servers there, and they also run Pacemaker clustering, but the difference is that we do not do clustering for each one of the VMs, we do clustering for each one of the Xen servers. The Xen servers have a lot of VMs inside, so if something happens to one of the servers, the whole server and all the VMs inside it, which you can see in the top part of this diagram, are immediately live migrated to the second one, and you do not notice anything. Even SSH sessions are kept open, and you do not notice that the switchover has happened. With the VMware solution we had to wait until the VM had restarted and booted up. With Xen you just do not see that anything happened. Inside, there is a change from one Xen server to the other, but it is completely transparent to the user.
The storage part is a bit more complicated, because this is the file system layout that we have to use. It is a very common layout where you have all of the disks as a physical volume, and then the first volume is dom0, which is basically the operating system that is running on this Xen server, and each one of the others is one of the individual VMs. All this storage is replicated to the other Xen server, which is what provides the live migration.
So we had a happy transition. It went well and the rest stayed the same, because we had designed the application so that we could change the VM provider just by changing the VM API. We only had to write an API for the Xen servers, and then everything else was executed exactly the same. So we are happy with that.
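The design choice described above, keeping the VM platform behind an API so that the provider can be swapped while the rest of the flow stays the same, can be illustrated with a small interface. Everything here is a hypothetical sketch: the class and method names are invented for illustration, and the in-memory backend merely stands in for real VMware or Xen clients.

```python
from abc import ABC, abstractmethod
from itertools import count


class VMBackend(ABC):
    """What the rest of the application needs from any VM platform."""

    @abstractmethod
    def create_vm(self, hostname, ipv4):
        """Create a machine and return a platform-specific identifier."""

    @abstractmethod
    def destroy_vm(self, vm_id):
        """Remove a machine."""


class InMemoryBackend(VMBackend):
    """Stand-in used here instead of a real VMware or Xen client."""

    def __init__(self, platform):
        self.platform = platform
        self.vms = {}
        self._ids = count(1)

    def create_vm(self, hostname, ipv4):
        vm_id = f"{self.platform}-{next(self._ids)}"
        self.vms[vm_id] = {"hostname": hostname, "ipv4": ipv4}
        return vm_id

    def destroy_vm(self, vm_id):
        del self.vms[vm_id]


def provision_site(backend, hostname, ipv4):
    """Platform-independent flow: create the VM, then hand over to config."""
    vm_id = backend.create_vm(hostname, ipv4)
    # Configuration over SSH would be triggered here; it does not depend
    # on which backend created the machine.
    return vm_id
```

Because `provision_site` only talks to the `VMBackend` interface, moving from one platform to another means writing one new backend class and leaving the callers untouched.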
And other people may be happy with that as well. So we changed from the VMware solution to this Xen solution, which is a three-node cluster where all nodes are in different locations and can do live migrations, so the users do not notice anything. We are still using Ansible, and we also use Ansible to deploy more infrastructure. This is an example of the Xen server playbook, so we can deploy as many of them as we want. Say we want to create more clusters: we just get a physical machine and start the playbook, so it is pretty easy.
One minute about security. I am not a security expert, but we like to enforce security for our users so we do not end up with problems. We decided not to use root passwords when we create these Xen VMs, so we do not have to manage passwords, which are difficult to keep secure: a lot of root passwords, databases, and so on. We are not going to manage it that way. We only use keys: we log in to machines using keys, connect between machines using keys, etc. We have separation of privilege. For example, we need to pre-generate the host keys of the hosts. The host keys need to be generated beforehand because we need to upload the SSHFP records before the machine is even created, so we need to have a pool of host keys that we can use and install on the machines. We use userv, which lets users execute commands as another, more privileged user, based on some filtering and some templating.
We also provide a TLS certificate service. I will only show one slide about it, because we want to follow one of the new initiatives from the EFF and others: HTTPS Everywhere and Let's Encrypt. Let's Encrypt is an open CA which provides you with free certificates for your web page. With HTTPS Everywhere they are trying to encourage everyone to have HTTPS web servers. Even the HTTP/2 specification was initially going to enforce TLS. People say that when we move to HTTP/2 everything is going to be HTTPS. That is not really true, because the specification does not state it, but the implementations from browser makers such as Firefox and Chrome have only implemented HTTP/2 over TLS, so you can only use HTTP/2 if you use HTTPS. So we like our service to use HTTPS, and I encourage you to do the same. If you go to SSL Labs you can get a rating of how secure your web server is, which is pretty good because it gives you some hints about whether you have any open vulnerabilities, old versions of the specification, old ciphers, etc.
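The pool of pre-generated host keys mentioned in the security part above exists so that a key, and therefore its SSHFP record, is known before the machine is created. A minimal sketch of such a pool, with hypothetical names and with the key generation itself left out (the keys would be produced beforehand by a tool such as `ssh-keygen`), could be:

```python
from collections import deque


class HostKeyPool:
    """Pool of pre-generated SSH host keys waiting to be assigned."""

    def __init__(self, public_keys):
        self._free = deque(public_keys)
        self.assigned = {}  # hostname -> public key

    def allocate(self, hostname):
        """Reserve a key for a host that may not exist yet."""
        if hostname in self.assigned:
            return self.assigned[hostname]  # allocation is idempotent
        if not self._free:
            raise LookupError("host key pool is empty, generate more keys")
        key = self._free.popleft()
        self.assigned[hostname] = key
        return key

    def available(self):
        """Number of keys that are still unassigned."""
        return len(self._free)
```

Once a key is allocated, its DNS record can be published right away, and the matching private key is installed when the machine is eventually built.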
Apart from the security topics, we also use some metrics systems, so we can give users some information about how the host performs. For example, we use a metrics service, which is basically statsd and collectd installed on each one of the machines. We have a cluster of aggregation programs that get information from all the hosts, and then we have a cluster of Carbon and Graphite, which stores this information gathered from all the machines. So the user can see some of these graphs in the web panel, showing how the machine is behaving.
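As an illustration of what ends up in the Graphite storage described above, this sketch formats metrics in Carbon's plaintext protocol, which is one line per data point: metric path, value and Unix timestamp. The metric names and the function names are hypothetical, and in the setup from the talk the collectors on each machine do this work rather than hand-written code; port 2003 is Carbon's default plaintext port.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003):
    """Send already formatted lines to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall("".join(lines).encode("ascii"))
```

Using one dotted path per host, for example `hosts.site-1.load`, makes it straightforward for a panel to request only the graphs that belong to a given user's machine.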
We are now trying to implement Logstash, the ELK stack, which will provide information about the host and the web server: how it behaves, how it behaved during different periods of time, etc. So you can have a lot of logs gathered by Logstash, stored in Elasticsearch and then shown in Kibana. So that is what I wanted to talk about. I hope you liked it. Thank you.
Thank you. We have a few minutes for questions.
Question: Why did you choose Xen instead of KVM, for instance? What made you choose one and not the other?
Answer: This was a long discussion that we had. Basically, we did not have any strong reason to choose one or the other. We did some initial research on Xen and we saw that DRBD, which is one of the main components we wanted to use, because we wanted the replication of storage from one server to the other, was integrated inside the Xen server, so we went that way. But we could have chosen KVM; it was in the list of products that we had to research.
Question: I did not really understand the subtleties of the last architecture, because you had several hard drives and then several Xen servers over the several drives. Are those virtual servers?
Answer: Yes. This is the whole picture. The VM architecture is at the top, and this part is more a view of the storage. This is a single machine with a RAID of disks, and this is the storage for that single machine. You have a physical volume, and then the first column is what is called dom0, which is the operating system that manages all the VMs and has access to the actual hard drives. All the other columns are Xen domUs, each of which is one of these VMs. Each one of them has a DRBD device, which is basically a virtual block device that is replicated from here to one of the other Xen servers. So you have each one of these VMs inside this list of logical volumes, and they are replicated through the network. You read and write on the primary, and the same writes are sent to the secondary automatically through the network. Each one of them has its own DRBD device.
Thank you. OK, we have maybe one minute if anyone has a really quick question. No? Then please join me in thanking the speaker.

