An HTTP request's journey through a platform-as-a-service

Video in TIB AV-Portal: An HTTP request's journey through a platform-as-a-service

Formal Metadata

An HTTP request's journey through a platform-as-a-service
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Giles Thomas - An HTTP request's journey through a platform-as-a-service

PythonAnywhere hosts tens of thousands of Python web applications, with traffic ranging from a couple of hits a week to dozens of hits a second. Hosting this many sites reliably at a reasonable cost requires a well-designed infrastructure, but it uses the same standard components as many other Python-based websites. We've built our stack on GNU/Linux, nginx, uWSGI, Redis, and Lua, all managed with Python. In this talk we'll give a high-level overview of how it all works, by tracing how a request goes from the browser to the Python application and its response goes back again. As well as showing how a fairly large deployment works, we'll give tips on scaling and share a few insights that may help people running smaller sites discover how they can speed things up.
Keywords EuroPython Conference EP 2014 EuroPython 2014
The first of our speakers is going to be Giles Thomas. Giles started programming when founding a business. He wanted to revolutionize the spreadsheet world by making spreadsheets programmable, and then he tried to sell them to financial companies. That didn't work. Then his team moved over to producing the Python system that they wanted for people like themselves, and that sold a lot better. Giles also plays the guitar. Today, however, he is going to take you on a journey: the journey of an HTTP request through a platform as a service. Please welcome, with a warm round of applause, Giles Thomas.

Thank you, and thanks for the introduction, and thanks to everyone for coming. So, yes: we built the thing we wanted to build, and we called it PythonAnywhere. It's a platform as a service. It does lots of things; one thing it does is host a large number of websites.

So I just wanted to get a general feeling for how much people in the room know about running websites. How many people here are responsible for the continued operation of a website, whether that's a personal blog or a company's pages? OK, so that's a fair number, probably 50%, let's say. How many of you are responsible for several websites? OK. More than 10? A few. More than 100? More than a thousand? Forget about that; I'm not going to keep going up, because it would be really embarrassing if one of you actually ran more websites than we did. We had 24,241 websites running on PythonAnywhere as of this morning.

As you've probably figured out by now, we've got infrastructure for this. It's a simple platform-as-a-service infrastructure. I'm going to go through a description of it pretty quickly and touch on a few of the details.
What I'd like to do is leave quite a lot of time for questions, because I think which bits are interesting to drill down into will probably come more from you than from me.

An interesting thing about the websites we have is that they range from very basic things, where somebody has basically just started using a particular framework. This person here has been trying out web2py, and, you know, maybe they'll build something and get a few hits a day; maybe they're just a hobbyist experimenting.

The next stage is maybe sites which get a couple of hundred hits a day. This person is learning Mandarin Chinese in a particular way and is sharing his lessons with a few other people, so that's a couple of hundred hits a day. We want to spend almost nothing on a site like that, but we also have to provide enough resources to keep this kind of site responsive.

Then there are other people running moderately popular technical or political blogs. This is my colleague Harry's Obey the Testing Goat, the companion site for his book, which I'll also mention if you're interested in test-driven development with Python. He gets maybe 2,000 visitors a day, so this site needs to be responsive and up full-time, but it certainly doesn't need much more than that; it's not high-volume.

This is one of our most fun customers. This guy is running a music site [unclear], and it's insanely popular: it gets dozens of hits every second pouring through. It's actually quite a good selection of music, even if you don't like [unclear]. So it's a quite popular website; it's got to be there, it's got to be responsive, and it's got to be maintained at an affordable price. So how do we do this?
Well, here's a very basic list of the tools we use. We use Linux, obviously. We use nginx for our front-end load balancing and for all of our HTTP needs. We use uWSGI, which is an absolutely awesome product that manages Python processes for you, so you can set up web applications with basically any web framework that uses the WSGI protocol. That can be Django, web2py, Bottle, Flask: all of the big ones, except for most Tornado installations, which don't play so well with WSGI. We do use Redis; I'm not going to go into much detail on that today. We also use Lua for a certain amount of scripting. I know I'm in a bit of danger here, talking about the wonders of Lua at a Python conference, but we do use it; it's awesome for the specific use case we have.

Now, you may notice there's no mention of Python there. While all of our infrastructure uses the tools I've described so far, all the configuration is managed by Python: it's managed by a number of Django applications which basically spit out the configuration files that all the other stuff needs to run, and keep the cluster alive and doing what it's meant to do.

So, I promised a description of an HTTP request's journey through this platform as a service, and here are the machines that are involved in that. Each blue box is a separate physical machine, or a separate instance running on Amazon AWS or whatever. Up in the top left (I'll use the mouse pointer) is the user's laptop; he's running Chrome. Down here we have a load balancer and a bunch of back-end servers.
So everything apart from this machine up here in the top left is part of PythonAnywhere's infrastructure. We run on AWS, but that's not particularly relevant to this. So that's the context.

Let's say the person using the browser wants to view my friend Harry's website; they want to go to www.obeythetestinggoat.com. The browser makes a DNS request, and the response comes back with the IP address of obeythetestinggoat.com, which is the IP address of this load balancer here. So it opens up a TCP/IP connection to the load balancer and sends the request.

The load balancer now needs to route it through to the web application. Let's say obeythetestinggoat.com is this web application here; at this point its processes are running on this particular physical machine, the middle one on the right-hand side, where the mouse pointer is. So the load balancer needs to have the intelligence to know that obeythetestinggoat.com is running on back-end server two. For the moment I'll just say that's magic: by magic it knows the site is on back end two, and it makes a connection. So now we have a chain of connections from the client to the load balancer to the back end.

The back end now needs to identify the process, the Python process, that is running obeythetestinggoat.com. It does that, again, by magic, and makes a connection. The web application does its calculations, renders its templates, talks to the database, does whatever magic it does to generate the page, and sends it back to the back-end server; the back-end server sends it back to the load balancer; the load balancer sends it back to the client.

Now, if you're used to running normal kinds of websites, the kind of system where you have, for example, a VPS, which is where I used to host my personal blog, then you might be thinking: what's the point of the load balancer in there? Normally you simply have a server that looks rather like one of the back ends here: it's running a front-end web server like nginx or Apache, and it's got a number of web applications running as one or more worker processes, Python processes, underneath it. What we have is this extra step, the load balancer, and that's kind of where the magic comes in, because the load balancer allows us to scale up, to scale down, to add in resilience and failover, and all the other good things that people expect when they outsource running their web applications to a third party like us rather than renting a VPS.
Right, so I said the load balancer knows by magic which back end a site is on. How does it do that? The load balancer is running nginx; it's running a specific flavour of nginx called OpenResty. Now, nginx is an awesome web server: it's extremely fast, it's very good at proxying connections through it, like we do through the load balancer, and it has a lot of great plug-ins. OpenResty is basically nginx with batteries included, and one of the batteries that's included is Lua scripting. The kind of scripting you can do is actually insanely powerful: you can do any amount of processing inside every single request, and it works extremely fast. Lua, I think, is a nice language. It's not as nice as Python, but some of the design decisions that make it a less pleasant language to look at and work with are actually very good for its speed and efficiency, and that's why I think it was chosen for the majority of nginx scripting.

What we do inside the load balancer is really very simple. What we have here is the nginx configuration file, and it's reasonably readable. At the top we're saying init_by_lua_file: when nginx starts up, it's going to load and run that script, which initialises the back ends. It basically just specifies some global context, available to any Lua script inside nginx, saying: here is a list of all the back-end servers. That's all it does.

Down here we have a server block, so this is for the ports we listen on, including 443, and this location / block is basically code that is executed for every request. What we do is extract the host that the particular request is asking for, www.obeythetestinggoat.com; we extract it from the Host header of the HTTP request and store it in a variable. We set a backend IP variable to an empty string, and then this is basically a function call: we're calling a Lua function that's contained in the "get backend IP" script. Now, you can guess what "get backend IP" does: it returns the IP, which we put in this backend IP variable. Then we go into this bit of nginx magic, which is proxy_pass; that says: just hand over the processing of this request to the server over there identified by this IP. And nginx does the rest for us.

Let's take a look at that Lua file. This is an interesting bit of code, because it's something we put in during the first couple of months, which we thought we were going to get rid of in a week or so. It seemed too simple; it didn't seem complicated enough to work. All it does is hash the hostname that comes in. That's the Lua code; in Python you'd use the hash of a string or something like that. It hashes the hostname, so it's got a number from the name. We then take that modulo the number of back ends and use that to index into the list of back ends. That means that every single website we're running is assigned essentially randomly to one of the different back ends. It's stable: it always assigns the same back end. If we add a new back end to the cluster, the modulus we use increases, and so everything automatically spreads itself out across the cluster again. That's the load balancer.
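The hash-and-modulo lookup described above can be sketched in Python. This is a minimal illustration, not PythonAnywhere's Lua code: the function name `get_backend_ip`, the `BACKENDS` addresses, and the choice of CRC32 as the hash are all assumptions made for the example.

```python
import zlib
from typing import Sequence

# Hypothetical back-end addresses; the talk does not give the real list.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def get_backend_ip(hostname: str, backends: Sequence[str] = BACKENDS) -> str:
    """Map a hostname to one back end: hash it, then take it modulo the list size."""
    if not backends:
        raise ValueError("no back ends configured")
    # crc32 gives the same number in every process. Python's built-in hash()
    # on str is salted per interpreter run by default, so it would not give a
    # stable assignment across restarts.
    digest = zlib.crc32(hostname.lower().encode("utf-8"))
    return backends[digest % len(backends)]


if __name__ == "__main__":
    for host in ("www.example.com", "blog.example.org"):
        print(host, "->", get_backend_ip(host))
```

Because the result depends only on the hostname and the list of back ends, any number of load balancer workers would agree on the answer without sharing state, which is presumably part of why such a simple scheme survived.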
I said the back-end servers need to identify which process is running the web application. This is some really basic nginx configuration that any of you who have done this kind of stuff will recognise. What we're saying here is, again, extract the domain name from the request that is being made (it's a different way of doing it, but with the same effect), and then delegate all requests for www.obeythetestinggoat.com to a specific socket. It's all dynamic; this is what the config actually looks like, it's not a sample. So when the request comes in to nginx, it will immediately look for the socket in that particular location, and expect there to be a uWSGI process on the other end of it, running the website that should be on that particular domain.

How does uWSGI know that it needs to have the web application running? Well, uWSGI has a directory containing what they call vassal files. "Vassal" is uWSGI's terminology for a running Python process, or set of processes, responsible for a particular web application, and it's configured by a vassal file. The vassal file basically has various things saying where the code is, what kind of sandbox you want applied, and how many worker processes you want; but, importantly, it also specifies the socket that it needs to listen on. uWSGI is very clever: if a vassal file is created, configuring a uWSGI vassal like this, it will immediately detect the creation of that file in the right directory and fire up all the processes immediately. That means that the web application is started. So that's all we need to do to start a web application when a request comes in.

This is where things get a little more complicated. What happens if a request comes in to one of our back ends and there is no process running that particular web application? The configuration I showed you was simplified; this is something a bit closer to the truth. When nginx tries to connect to a uWSGI back end and there's nothing listening on the socket (maybe uWSGI itself hasn't started the processes, or maybe it killed them because they timed out after a certain amount of inactivity), nginx will internally generate a 502 error, which would normally go back to the browser. What we have here is error-page handling: if there's a 502 error, we essentially go to this other block here, a fallback for error page 502. There's only one thing inside this block, and all we do here is check whether there's a vassal file for that particular domain.

So let's say we're looking for www.obeythetestinggoat.com and there are no processes running. The first request jumps to this fallback. We check whether there's a vassal file for that domain. If there is a vassal file, then we can safely assume that the process is running, so this is a real 502: something went wrong inside the web application, so we return a real 502 error. If there isn't a vassal file for that particular website, then we need to start it. There's a proxy_pass here, like the one in the load balancer, where we're essentially saying: delegate the work of this request to this server here. This is another proxy, which is delegating to a little microservice running locally on the machine. It's actually a very small Django application, and it has access to the database that configures all of the websites we run.

When it receives a call on its "initialize web app" view, it says: OK, I need to start up a particular web application. It goes to the database and gets all the information about the user; it works out whether we have what is essentially a container for this particular user running on this particular machine, and sets that up if necessary; it creates the uWSGI configuration, generating a uWSGI .ini file, the vassal file. uWSGI detects that and starts the processes running, and then we can start delegating the work to them. So why is this interesting?
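The 502 fallback described above can be sketched as follows. This is a simplified sketch under stated assumptions, not the real system: in the talk the check happens in nginx and the file is written by a separate Django microservice, whereas here both steps are collapsed into plain Python functions. The function names, the `<domain>.ini` and `<domain>.sock` naming, and the `wsgi:application` entry point are hypothetical; `chdir`, `module`, `socket`, and `processes` are standard uWSGI options.

```python
import configparser
from pathlib import Path


def vassal_path(vassal_dir: Path, domain: str) -> Path:
    """Where the vassal file for a domain would live (naming is an assumption)."""
    return vassal_dir / f"{domain}.ini"


def write_vassal_file(vassal_dir: Path, domain: str, code_dir: str,
                      socket_dir: Path, workers: int = 2) -> Path:
    """Write a minimal uWSGI ini file so a watching emperor can start the app."""
    config = configparser.ConfigParser()
    config["uwsgi"] = {
        "chdir": code_dir,                             # where the code is
        "module": "wsgi:application",                  # hypothetical entry point
        "socket": str(socket_dir / f"{domain}.sock"),  # socket nginx connects to
        "processes": str(workers),                     # number of worker processes
    }
    final = vassal_path(vassal_dir, domain)
    tmp = final.with_suffix(".tmp")
    with tmp.open("w") as fh:
        config.write(fh)
    tmp.replace(final)  # rename into place so the file appears fully written
    return final


def handle_502(vassal_dir: Path, domain: str, code_dir: str,
               socket_dir: Path) -> str:
    """Decide what a 502 from the app socket means for this domain."""
    if vassal_path(vassal_dir, domain).exists():
        # A vassal is configured, so the app should be running: a genuine error.
        return "real-502"
    # No vassal yet: create one so the app gets started.
    write_vassal_file(vassal_dir, domain, code_dir, socket_dir)
    return "started"
```

The useful property is that the vassal file doubles as the record of whether a site is meant to be running on this machine, so the fallback needs no other state to tell a cold start from a genuine failure.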
Well, what it means is that we can actually scale pretty much transparently. Let's say we had a busy day. Let's imagine we only had, until now, say, three web servers in the cluster, and then suddenly things start to slow down: maybe the web applications we've got are getting busier, or more people have signed up and we've got some more websites. All we do is create a new back-end server, which is very easy with AWS: we just fire up a new instance, and then we tell the load balancer about it. Immediately on telling nginx to reload its configuration, it will start distributing requests differently across the back ends. Any back ends that need to start web applications will automatically start them, and the web applications that a back end no longer needs to run will start timing out and killing themselves. So we can dynamically reconfigure the cluster very, very simply.

Now let's say that something goes wrong. Last night, one of the web servers had some problems, and we got messages from our monitoring saying that it was going down. One of the features of being on Amazon is that every year about this time, hardware starts failing on AWS; I think what happens is that all their regular admins have gone on holiday and the interns haven't been given enough information on how to manage the systems. So web1 started failing, and what we did was log in to the load balancer and remove it from the list of back ends. Everything then reconfigured itself automatically, just through the use of this hashing function, to run on the remaining servers. Of course, all the websites were running a little bit slower, because the machines were closer to their load limit, but that's fine; we have a fair amount of slack [unclear]. We brought a replacement into rotation, and everything went fine.
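The effect of adding or removing a back end under hash-modulo assignment can be illustrated with a small experiment. This is a sketch with made-up hostnames and back-end names (`web1` to `web4`), and CRC32 standing in for whatever hash the real Lua code uses.

```python
import zlib
from typing import Dict, List, Sequence


def assign(hostnames: Sequence[str], backends: Sequence[str]) -> Dict[str, str]:
    """Assign every hostname to a back end by hash modulo the number of back ends."""
    return {
        host: backends[zlib.crc32(host.encode("utf-8")) % len(backends)]
        for host in hostnames
    }


def moved(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Hostnames whose back end differs between two assignments."""
    return sorted(host for host in before if before[host] != after[host])


# Made-up hostnames and back-end names, purely for illustration.
SITES = [f"site{i}.example.com" for i in range(1000)]
three = assign(SITES, ["web1", "web2", "web3"])
four = assign(SITES, ["web1", "web2", "web3", "web4"])   # scale up
without_web1 = assign(SITES, ["web2", "web3"])           # web1 removed

if __name__ == "__main__":
    print("moved after adding web4:", len(moved(three, four)))
    print("moved after removing web1:", len(moved(three, without_web1)))
```

With plain modulo hashing, a large share of sites change back end whenever the number of back ends changes (for a uniform hash, about three quarters of them when going from three to four). In the setup described in the talk that appears to be acceptable, because applications are started on demand on their new back end and idle ones on the old back end time out on their own.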
That was a very, very rapid tour through how our whole system works. I'd like to ask you for any questions, so that we can drill down on the bits that are interesting. We've got a lot of time for questions. The gentleman here at the front.

Question: [unclear]

Answer: Well, the real problem is actually in the amount of work that needs to be done to start the process, because all of our users run inside sandboxed environments, which we have to have control of at startup. uWSGI, at the time we started using it (and I think it still doesn't), didn't have the capability to do all the setup work required, although of course it can quite happily run certain kinds of scripts [unclear].

Question: [unclear; it concerns the virtualization or sandboxing used]

Answer: It's kind of a roll-your-own thing. These days, if we were starting PythonAnywhere today, we might think of using Linux Containers, or potentially Docker; we do use Docker for some stuff we're working on now. But the thing about Linux Containers is that they were built out of reusable components: chroots, process namespaces, network namespaces, and so on. All of these have been available for a number of years, and we essentially rolled our own kind of "Linux Containers lite" by plugging these things together.

Question: [unclear]

Answer: I don't know the answer to that.

Question: If you need to start up a process, and several requests come in at the same time, is there some kind of locking to make sure you only start it once and you don't lose any requests?

Answer: Yes, there is locking inside there. [Remainder of answer unclear.]
Question: [unclear; it concerns WebSockets support]

Answer: That's something we really do want to support. All of our infrastructure does support WebSockets; nginx has support for WebSockets at the moment. But the problem is that the WSGI protocol doesn't really support WebSockets. uWSGI does support them through some extensions [unclear]. When we do support them, we'll either wind up rolling something of our own to be able to manage long-running Tornado processes, using the same nginx infrastructure to route through to the appropriate places, or maybe uWSGI will provide something that we can use.

Question: How do you deal with persistent state? What happens to the databases?

Answer: That's managed separately. We have MySQL, and we're working on Postgres, but they're separate; they sit behind the back-end servers.

Question: Do you have some services that manage databases for users?

Answer: [unclear] It's a little bit messy. We're adding Postgres support right now, and for that we're basically building a microservice which runs on one of a set of Postgres servers and spins up Docker containers, each of which runs its own Postgres instance. So the Postgres microservice does the provisioning. We're hard at work on that, and I'd say it's working well enough to pass our functional tests, but it's still some way away from deployment.

Last question, please.

Question: You said that when one of the instances was going down, you manually removed it from the load balancer. Is there a reason why the load balancer doesn't automatically remove it? And also, if you have the power to scale up automatically, do you have some sort of limit on the number of instances?

Answer: Yes, that's a good question. It's really a matter of development time. One of the features we do need to add is automatic instance [unclear]. When PythonAnywhere was first created, it made sense to do it manually, because each instance failure was a rare occurrence, and was unique enough in the way it failed that it was better to have a human in the loop. Right now, I think we've managed to get a list of the different ways in which instances can fail, and we can probably start building in more automated responses. But yes, that's just a case of us not having had time.

So, thank you very much, Giles.

