High Performance Django: From Runserver to Reddit Hugs

Video in TIB AV-Portal: High Performance Django: From Runserver to Reddit Hugs

Formal Metadata

High Performance Django: From Runserver to Reddit Hugs
CC Attribution - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Subject Area
Django makes it easy to build a site and get it running on your laptop, but how do you go from there to a site that can gracefully handle millions of page views per day? This talk will show you the modifications and supporting services needed to make your site scale. Topics will include caching, uWSGI, Varnish, and load balancing.
So today we're going to talk about high-performance Django. We run an agency: we build Django sites, we help people with their Django problems, and we help people learn how to scale sites that need to handle large, high-scale traffic.
We've been doing this since 2007, and in those seven years we have learned a lot about Django and how to make it run fast under a lot of traffic — and we've learned a lot of those lessons the hard way. We actually released a book by the same name, High Performance Django, that bottles up all those lessons and packages them. It's out now as an ebook, with a print edition probably coming later. I have a couple of copies here, and we'll take questions at the end.
If you read Hacker News or any of the other popular tech publications, you may have seen this before: "Django doesn't scale." How many people here believe that Django doesn't scale? It's the perennial controversial take, and in one sense it's true:

Django, by itself, does not scale. But you may say: what about

Instagram, Pinterest, Disqus? All these people are using Django with massive numbers of users and massive traffic. How are they doing it?
They're using Django, sure, but just as much they're using a supporting cast of players: a database like PostgreSQL or MySQL, Memcached and Redis, load balancing with nginx, full-page caching with Varnish. None of that is Django, but those pieces are

really what make big Django sites scale — and you could use them just as easily with another framework. That

supporting cast is going to be the focus of the talk today.
Here's the plan: we're going to take a site running on runserver, throw a pile of traffic at it, measure performance, and then scale it up. I'm not going to talk about optimizing code, and I'm not going to talk about tuning databases or query caching — we're working higher up the stack. I will be talking about how you serve your application, load balancing, and things like that. I can't do this on my laptop because we need multiple servers, so everything runs remotely — and if any of you are in the middle of massive torrent downloads, we'd appreciate it if you held off, because this talk really depends on the network.
So, like I said, we're going to throw a lot of traffic at these servers. It may look like we're benchmarking Django, but

really, what we're doing would be a terrible benchmark: it's a big application with a big database, we're going over the conference network — who knows what's going on there — and we're running Docker containers inside virtual machines on shared infrastructure, so the neighbors could be doing anything. Don't take the exact numbers seriously; what we care about is the relative difference between each setup.
OK, so the sample app. Here's what we've got:

an EC2 instance with a Django application on it — an m3.xlarge, so 4 cores and about 15 GB of RAM. A decent size, nothing massive, but it'll work well for what we're doing. We also have a PostgreSQL database server on a medium instance — I think that's 2 CPUs and 4 GB — smaller than what you'd use on a big production site, but fine for our purposes. I'm using a tool called fig, which is a way to manage Docker containers. We're going to spin up a Memcached instance, which handles the cache and sessions, and then spin up our web server running runserver. It looks like this: a fig config file, `fig up`, it creates the Memcached container and the runserver container, and we're up and running.
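The fig setup described here might look something like the sketch below — service names, images, and ports are my assumptions, not the speaker's actual file (fig was the predecessor of Docker Compose, and fig.yml used this flat format):

```yaml
# fig.yml — hypothetical sketch of the demo stack
web:
  build: .
  command: python manage.py runserver 0.0.0.0:8000
  ports:
    - "8000:8000"
  links:
    - memcached
memcached:
  image: memcached
```

Running `fig up` would then create both containers together, mirroring what happens in the demo.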
Let me show you what the site looks like. It's running on another box on

EC2, and it's what we're going to throw the traffic at. The app is loaded with a bunch of fake data: user profiles, where each user has a couple of foreign keys to a company and a job title, and they

have a profile that

has a decent-sized text blob and some links to the

people they work with. So it's not a totally trivial hello-world app, but probably not as complex as what you'd run in production — it's a demo, but close enough to show the real-world differences.
OK, so this is jMeter. You may be familiar with tools like ApacheBench (ab) or siege — they're good for blasting a specific page with lots of traffic. jMeter is like those on steroids: you can build really complex test plans. Here's the one I'm running against the site. There's a loop, and it runs through everything under it. First it gets the home page as an anonymous user. Next it gets what I'm calling a "hot" profile page: in the real world, on a site with lots of pages, there's usually a small subset of pages that really gets hammered, and then a long tail that doesn't see as much traffic. To simulate the hot pages, this step picks a random profile between 1 and 50 and hits it. Next, on 10 percent of loops, we log in: fetch the login page, grab the CSRF token, use the token and credentials to authenticate, then POST to create a new profile and view the home page as an authenticated user — simulating a site with logged-in users who are actually doing something. Another 10 percent of loops hit a random profile out of the full set — the database has about half a million profiles — to simulate that long tail of web traffic. So: I'm pointed at the web server, 50 concurrent users, 10 times through the loop. Let's see what happens. Starting the test... and watching the response times: we said we wanted to stay below about 200 ms, and very quickly we're seeing bad numbers — now up over two seconds. That is not a great response time for a request served by Django.
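The traffic mix in that jMeter plan can be sketched in plain Python. The helper below is purely illustrative (the URLs and the function are my assumptions), but the proportions match what the talk describes: every loop hits the home page and a hot profile (ids 1-50), 10% of loops log in and post, and 10% hit the long tail of ~500,000 profiles.

```python
import random

def plan_iteration(rng):
    """One pass through the (hypothetical) test loop; returns the requests made."""
    steps = ["GET /", "GET /profile/%d/" % rng.randint(1, 50)]  # home + hot page
    if rng.random() < 0.10:  # 10% of loops authenticate and create a profile
        steps += ["GET /login/", "POST /login/", "POST /profile/new/", "GET /"]
    if rng.random() < 0.10:  # 10% of loops hit the long tail of profiles
        steps.append("GET /profile/%d/" % rng.randint(1, 500000))
    return steps

rng = random.Random(42)
print(plan_iteration(rng))
```

The point of the shape is that the hot pages dominate, which matters a lot once a cache enters the picture later in the talk.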
What's happening is we're overwhelming the server. If we go back and look at htop on

the box — you probably can't quite see it, but those bars are the CPUs — we're not really utilizing the server: maybe 30 or 40 percent of the CPUs are being used. That's because runserver is a single process. As of recently it's multithreaded, but it's still only one process, so the server is never fully utilized. Let me pull up the numbers we collected — I'll shrink this a bit so it fits on the screen. So: 50
concurrent connections, 28.2 requests per second, and an average response time of 1,353 milliseconds. About 1.3 seconds at best — if that's your home page, it's far too slow. So let's see what else we have. Normally you're not going to put a site into production on runserver; that's a bad idea. In production you use a real WSGI server. We'll use uWSGI today; you might be more familiar with Gunicorn or Apache with mod_wsgi, and any of those will do — uWSGI is just the one we prefer. So let me go shut off
the runserver container, and while

that's happening I'll show you our uWSGI configuration. You can't quite read it on the screen, but the key setting is the process count: instead of one process like runserver, we're running 6 processes, each with threads enabled — multithreaded across 6 processes — so we should expect much better performance. The rest of the file is boilerplate; don't worry about it.
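The uWSGI settings described — six processes, threads enabled, the rest boilerplate — might look roughly like this sketch (the module path and port are my assumptions, not the actual file from the talk):

```ini
[uwsgi]
module = demo.wsgi:application
master = true
# runserver is limited to a single process; here we run six, multithreaded
processes = 6
enable-threads = true
threads = 2
# speaking HTTP for now; behind nginx we would expose the native
# uwsgi protocol instead, e.g.: socket = 0.0.0.0:8001
http = 0.0.0.0:8000
```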
So let's bring up uWSGI. Same

routine: fig brings it up, and uWSGI is up and

running. I'll clear the previous test results. uWSGI is going to go faster, so this time we'll loop 20 times. Starting it off... and you can see the performance is much better: with runserver we were quickly over a second, and with uWSGI almost all requests come back in under half a second. You can see authentication takes a little longer — that's password hashing in action, and you actually want that to be slow, so that's good. Let's see how the server is handling the
requests across those processes. There's the machine — and it's definitely better: we were getting up to 90 percent CPU usage, and it looks like the test finished. So, the results: uWSGI, again at 50 concurrent users — 72.4 requests per second, roughly 150 percent better, and average response time down to 279 ms. Much better performance on the exact same server; all we did was swap runserver for a real WSGI server. Now, what happens if we take that same server and throw 100 concurrent connections at it instead of 50? It'll take longer, so since we're short on time we'll do 10 loops. Clear the results, fire it up... and we're pretty much maxing out the server. It's under load, and the response times just keep creeping up: last time we hovered around half a second, now we're over a second. In the real world, this is the point where you'd say we're maxing out this box — throw much more load at it and we'll start queuing requests, timing out, dropping requests. This setup isn't going to get us where we need to go. Anybody know what the next step is? Caching is one answer, and we could also squeeze a bit more out of this box — tune uWSGI, go back to the application and look for places to optimize or cache. But let's just say we've done all that and one server still isn't enough. Time to scale out. That looks like
this: nginx as a load balancer with two app servers behind it. Instead of nginx you could use something like HAProxy or Amazon's ELB — there are lots of options. So let me go back to my web server,

kill uWSGI, and bring up another

app server. This is app2 — it looks just the same as the first one, identical, except this time the containers speak uWSGI's native protocol instead of HTTP. Having uWSGI speak HTTP adds

a little overhead — converting HTTP into what uWSGI wants internally and back again — so we should get slightly better performance using the native uwsgi protocol, which nginx can speak.
Now the load balancer. Here's the nginx configuration. Nothing fancy: a few settings that are known to boost performance a little, some boilerplate, then an upstream block defining the

uWSGI cluster — filled in when the container spins up — and an include of the standard uwsgi_params file. A very vanilla nginx setup. OK, the uWSGI cluster is defined.
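Stripped of the performance boilerplate, the nginx configuration described boils down to an upstream of uwsgi-protocol back ends (the hostnames and ports here are my assumptions):

```nginx
upstream uwsgi_cluster {
    server app1:8001;   # the two app containers, speaking the uwsgi protocol
    server app2:8001;
}

server {
    listen 80;
    location / {
        include uwsgi_params;       # standard uwsgi variable definitions
        uwsgi_pass uwsgi_cluster;   # round-robin across the cluster
    }
}
```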
Let's check that both web servers are running,

and the load balancer is up. Back to the test plan: loop 20 times, but instead of pointing at a single web server, point at the load balancer, which is called lb. ... First run: better throughput, 101 requests per second, but the average response time jumped to 750 ms — and then I realized I was still overloading a single server, so throw that one out; it's not good data. Running properly against the load balancer, you can see it doing its job: requests are spread across the two machines and the response times are much healthier. Which is what we'd expect: we served some number of requests with one server, we doubled the servers, and we're getting close to double the requests at about the same response time. Of course we're also throwing roughly twice as much traffic at the database, so make sure your database can withstand the extra load. There's some funky stuff going on with the network here — you can see those gaps — but we still got a decent result: at 100 concurrent users we came pretty close to doubling our first uWSGI throughput, and the average response time was a little higher than it should be — I think the random test hit some anomalies; rerun it and I'd expect it to land really close to that initial uWSGI number, maybe a touch higher, since nginx is pretty efficient at proxying. So that's nginx with 100 concurrent users. What happens with 200? Erase these results, run it again... response times are starting to climb; let's look at the servers.
Here's app1 — pretty maxed out. app2 is

probably also pretty maxed out. And let's

take a look at what the load balancer is doing: basically nothing.
nginx is a big fat network pipe — it can handle lots and lots of traffic on a small machine. So how did we do? Just like when we threw more at uWSGI than it could handle, we're seeing the same pattern: the average response time creeps up toward two seconds. Requests per second did improve — 166-and-change at 200 concurrent users — but response times over a second mean we're basically overloading the app servers again. Now, we could keep adding app servers: go to three, four, do even more work. But maybe we can get smarter. If you keep adding app servers, you're really just pushing the problem down your stack, and when the load problem lands on your database, it gets harder: you can throw hardware at a database for a while, but once you run out of hardware options that problem gets a lot trickier. So let's get smarter. This is the last setup we'll try: instead of nginx as the load balancer, we use Varnish. Varnish does load balancing just like nginx, but it can also cache: as responses come back from the back ends through it, Varnish grabs a copy and serves it to other users. Along with switching to Varnish I'll bump up the number of loops — and pretty quickly we should see requests per second jump and response times actually come down.
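Varnish's load-balancing side — the piece that replaces nginx in this setup — can be sketched in modern (Varnish 4+) VCL with a round-robin director; the backend names and ports are my assumptions:

```vcl
vcl 4.0;
import directors;

backend app1 { .host = "app1"; .port = "8000"; }
backend app2 { .host = "app2"; .port = "8000"; }

sub vcl_init {
    new cluster = directors.round_robin();
    cluster.add_backend(app1);
    cluster.add_backend(app2);
}

sub vcl_recv {
    # Spread requests across the app servers, like nginx's upstream block.
    set req.backend_hint = cluster.backend();
}
```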
One catch before switching over: Varnish speaks plain HTTP, not the uwsgi protocol. So I'm going to stop the app servers and nginx, and bring the app servers back up speaking HTTP again. There's the first web server, and there's the second.
And now Varnish. Varnish is really amazing, and I don't think it gets enough love in the Django community — I don't hear many people talking about it. Let's look at the Varnish config and walk

through it really quickly. Just like with nginx, we define the back ends when the containers spin up and include that file. Varnish is configured in a language called VCL. It superficially resembles an nginx config but works quite differently: you define subroutines that hook into the request lifecycle. vcl_recv runs when a new request comes in. Ours says: if the request is for the admin URL, or it carries a cookie called sessionid, bypass the cache — that person is authenticated or headed for the admin, and should go through to the back end for fresh content. Otherwise, we unset the cookies on the request. That matters because Varnish decides whether a request is unique by looking at the URL — and also at the cookies. Google Analytics sets cookies for every user, and you may have other cookies set for anonymous users that the back end doesn't care about, so we wipe those out to keep the cache lookups clean. Next is the vcl_hit subroutine: what happens when Varnish finds an object in the cache. First we check the TTL, the time to live — if the object is still fresh, we deliver it straight from cache. Then there's the grace period: every cached object effectively has two timeouts. Within the TTL we just serve from cache. Past the TTL but still within the grace period, Varnish will serve the stale copy to the user while it fetches fresh content from the back end in the background — so that user doesn't wait for Django to render, and every future user gets the updated page. That's really nice, and it prevents
the dog-pile effect. Picture a really hot page — you're on the front page of Reddit, one page getting hammered and hammered — and the cache expires: without grace, a hundred users all blow through to your back end requesting that same page before the cache refreshes. That's the dog-pile, and grace is the protection against it. The last

subroutine sets the TTL and grace on responses coming back from the back end. Varnish also respects cache headers, which you can set from Django, but for our purposes we keep it simple and hard-code a 5-second TTL and a five-minute grace period. In production, depending on your site, you might run those much higher — but even set very low, 5 seconds can be enough to absorb a spike: if a hundred concurrent users are hammering a page but only one request every 5 seconds reaches the back end, that takes a huge load off your servers. So let's spin up Varnish and switch over.
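The caching policy walked through above — bypass for admin and authenticated requests, cookie stripping, a short TTL plus a long grace period — could be rendered in Varnish 4+ VCL roughly like this sketch (in Varnish 4, serving stale within grace while refetching is built in, so the vcl_hit logic described for older Varnish versions happens automatically):

```vcl
sub vcl_recv {
    # Authenticated users and the Django admin bypass the cache.
    if (req.url ~ "^/admin" || req.http.Cookie ~ "sessionid") {
        return (pass);
    }
    # Strip analytics and other anonymous-user cookies so cache
    # lookups hash purely on the URL.
    unset req.http.Cookie;
}

sub vcl_backend_response {
    set beresp.ttl = 5s;     # serve from cache for 5 seconds...
    set beresp.grace = 5m;   # ...then serve stale for up to 5 minutes
                             # while a fresh copy is fetched in the background
}
```

As the talk notes, Django can drive these values with Cache-Control response headers instead of hard-coding them in VCL, since Varnish respects cache headers from the back end.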
OK,

let's try this again. Erase the previous results, run it... and look at that: throughput shoots up — peaking over 550 requests per second — and the average response time is 178 ms. And here's the really interesting number to me: the median response time on the hot pages is 2 milliseconds. You are never going to get Django itself running as fast as Varnish serving from cache — no amount of optimizing your code or your database will match full-page caching. This is where Varnish shines. The final numbers for that run: Varnish at 200 concurrent requests did 456 requests per second with an average response time of 203 ms. Compared to nginx on the exact same servers, we more than doubled our requests per second and cut our response time roughly in half — same hardware, and all we changed was the front of the stack. That's a huge win. So what can Varnish do with 400 concurrent users? That's 400 people simultaneously hitting your site — most sites will never see that much traffic. Let's run that, and while it's going, look at what the servers are doing.
Here's app1 —

load is noticeably lower. Before, when we were overloading the servers, they were totally spiked; Varnish still hits them fairly heavily, but not

constantly. And it looks like we finished early. One oddity: at 400 concurrent you see these blocks where the load isn't smooth — this is why I said it's a terrible benchmark; normally that curve would be smooth, so just pretend you don't see it.
We can also look at the Varnish machine: just like nginx, it's not even breaking a sweat. You do want to make sure your Varnish box has plenty of network capacity, and it will handle a ton of traffic. Looks like it's wrapping up. So that was 400 concurrent users, and these results are great — I'm not going to run it a million

times, so just take my word: it comes out at about the same response time and slightly better throughput, around 476 requests per second. Now, a few other cool things about Varnish while we have a few minutes left. I'm going to run the plan again, looping 100 times instead of 30 to give me time to show you what's happening on the servers. Start the test... and let me go into
the Varnish server — into the container that's running it. One cool tool that ships with Varnish is varnishhist, a live histogram of requests hitting the server. The vertical bars are cache hits — they sit way over on the left of the log scale, around a millisecond — and the hash marks are cache misses going through to the back end, coming back in the neighborhood of a second. Varnish also ships with varnishtop and varnishstat,

which show you your hit ratio, connection counts, and so on. A lot of Varnish performance tuning boils down to improving your hit ratio so Varnish serves more of your traffic, and varnishstat does a really good job of showing you what your hit ratio is.
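The hit ratio varnishstat reports is just hits over total lookups, and a tiny worked example shows why tuning it dominates back-end load (the numbers below are made up for illustration):

```python
def backend_requests(total, hit_ratio):
    """Requests that still reach Django for a given cache hit ratio."""
    return total * (1 - hit_ratio)

# At 1,000,000 front-end requests, raising the hit ratio from 90% to 99%
# cuts back-end traffic tenfold.
print(backend_requests(1_000_000, 0.90))
print(backend_requests(1_000_000, 0.99))
```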
And here's where it gets really awesome: what I'm going to do now is take the app servers down entirely and

see what happens. Now there's no back end serving content at all. Any page that isn't cached is totally broken — imagine you deployed a massive breaking change and your site is erroring in the middle of being on the front page of Reddit. That's bad: here, about 10 percent of users are getting errors. But Varnish is still serving the cached content — the home page and those 50 hot profile pages are still coming back in 2 milliseconds. Your most important content stays up. Then we bring the web

servers back up, Varnish reconnects to them, and everything heals. Varnish is a really great piece of software — that, honestly, is the lesson of this talk.
So where did we end up? Roughly 450 requests per second with Varnish in front of three servers. Sustain that for a day and it's about 40 million requests. The big sites in the news do a lot more than that in a day — but they also do it on a lot more than three servers. If you landed on the front page of Reddit, this stack would ride it out just fine, and you could probably do it on a lot less; I wouldn't be surprised if one server running Varnish in front of the application could take it. Pretty good results from where we started.
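The back-of-the-envelope behind that "40 million" figure is just sustained requests per second times the seconds in a day:

```python
rps = 450                      # roughly what Varnish sustained in the demo
per_day = rps * 60 * 60 * 24   # seconds in a day
print(per_day)                 # 38,880,000 — i.e. "about 40 million requests"
```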
So there you go: we

started at 28 requests per second on runserver and ended up north of 450 with Varnish. Like I said — Django scales; use Varnish.
And like I said, we wrote a book called High Performance Django — you can check it out at highperformancedjango.com. We also have a few copies here loaded onto nifty little USB keys, so ask questions and you can win one. Thanks.