High-Availability Django

Transcript
[inaudible] Today I'm talking about the performance issues that we dealt with at The Atlantic over the last year, how we overcame them, and some of the things that we learned along the way. I'd like to start off by thanking the organizers of DjangoCon for taking an interest in my proposal, and all of you for coming to listen to it. I think one of the most valuable things about this conference, on par with the talks and events, is the opportunity it presents to talk to some really smart people. The work I did in the past year that eventually became this talk grew out of a conversation I had with one of the founders of Opbeat; if you're not familiar with them, they offer a service for monitoring the performance of your app, sort of like New Relic but with a special focus on Django.
At the time, in September of last year, we were struggling with growing pains in our Django application. We had just completed a huge project: porting over six years of legacy code, a mix of PHP and Perl, to a Django-powered CMS and front end. There were a few performance hiccups after launch that we eventually overcame, with one important exception: our servers basically melted whenever we deployed. We were using mod_wsgi at the time, and our deploy process was basically to rsync the code out with a Fabric script and then touch the WSGI file.
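For reference, that style of deploy task looks roughly like the following. This is a minimal sketch in the Fabric 1.x API of that era, not the actual script; the host names and paths are placeholders.

```python
from fabric.api import env, run, task
from fabric.contrib.project import rsync_project

env.hosts = ["app1.example.com", "app2.example.com"]  # placeholder app servers

@task
def deploy():
    # Push the working copy to each app server (placeholder remote path).
    rsync_project(remote_dir="/srv/app/", local_dir="./", delete=True)
    # Touching the WSGI file signals mod_wsgi (in daemon mode) to reload the app.
    run("touch /srv/app/conf/app.wsgi")
```

As the rest of the talk describes, the weakness of this approach is that every worker reloads the whole codebase while live traffic is still arriving.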
Today, as you can see on this chart, if I'm reading it correctly, we have gone from 8-second response times down to around 150 ms. The first thing to say is that there wasn't one single thing we did that addressed the performance issues; it was a bunch of small things. A couple of them were low-hanging fruit that turned out to be great wins, but in general it was a lot of small things that added up to fixing our performance problems.

The first of those is monitoring and profiling. If you don't have visibility, you don't understand where the slowness is or why the app is freezing up, and you are just sitting in the dark. I also think it's important to tackle the easy stuff first: query optimization and caching are pretty easy ways to get quick wins in your application. And while this talk is very focused on the server side of things, it's important not to neglect front-end performance as well, which is a completely different ball game; you can have a fast server and still ship JavaScript that takes twenty seconds to load.

For profiling, we found that most people use hosted services. New Relic and Opbeat are comparable, and both are great services; we identified a number of really important performance issues in our site with them. There's also the Django debug toolbar, which you've probably already used plenty, and I recently came across django-silk, which seems like a really promising application and a somewhat different way of profiling your Django apps. If you're a masochist, you can always try cProfile with pyprof2calltree and KCachegrind. There's also the profiling middleware, which I had mixed experiences with: there was one piece of code where it helped me figure out what was going wrong, but it's really difficult to set up and probably not worth the effort. For monitoring and downtime notification, a lot of the same services that provide performance monitoring, like New Relic, also do alerting, Opbeat does as well, and there are dedicated uptime-monitoring services on top of that.

Now the easy stuff: caching. Use a CDN. At the time we launched we did have a CDN, but we weren't using it properly, and we have since had to be really strategic about what we cache, where, when, and for how long pages stay cached. Caching archival pieces in particular really helped with the traffic we get from crawlers and bots, which makes up the majority of the traffic that gets past the CDN. It's also important to have a proxy cache, whether it's nginx or Varnish; we use nginx and we've had great success with it. One of the really great things about the nginx proxy cache is that you can set the cache to just five seconds but tell it to hold on to the cached response and keep serving it until the backend returns a 200 again. So if your application goes down, that five-second cache holds things together until the site comes back up. This has genuinely saved us: somebody deploys some bad code, or there's an issue with the database, and nobody really notices that the site is down, because a few-second cache basically keeps us online.
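A minimal sketch of that nginx proxy-cache behavior, assuming an upstream of Django application servers; the zone name, paths, and timings are illustrative rather than actual production settings.

```nginx
# Cache successful responses very briefly, but keep serving the last good copy
# if the backend errors out or goes away.
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app_cache:50m inactive=10m;

server {
    listen 80;

    location / {
        proxy_pass http://app_servers;   # upstream of Django app servers (defined elsewhere)
        proxy_cache app_cache;
        proxy_cache_valid 200 5s;        # a few seconds is enough for normal traffic
        # If the application is down, erroring, or being refreshed, answer with
        # the stale cached copy instead of passing the failure to the user.
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    }
}
```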
I would also point to the caching that comes built in with Django: the cached staticfiles storage, which caches the information about where your static files live so it doesn't have to hit the file system, and the cached template loader, which does the same thing for the template finder. Then there are page-caching frameworks. We actually have one that I thought we had open sourced, and I realized today that it's still closed source for some reason, so I'll have to talk to my coworkers about that. It's called cache-cow, in the grand tradition of stupid names for Django projects. It's loosely based on django-jimmy-page, which hasn't been updated in about three years, so maybe it's an improvement on it, but there are a number of really decent page-caching frameworks out there. ORM caching, on the other hand, things like django-cache-machine, I think is a mixed bag, and I'll actually get to that in the next slide.
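The two built-in Django pieces mentioned above, the cached staticfiles storage and the cached template loader, are just settings changes. A sketch using the Django 1.x-era names (check the release notes for your version):

```python
# settings.py (excerpt)

# Store the hashed names of collected static files in the cache instead of
# re-reading the filesystem on every {% static %} lookup.
STATICFILES_STORAGE = "django.contrib.staticfiles.storage.CachedStaticFilesStorage"

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        "OPTIONS": {
            # Wrap the normal loaders in the cached loader so templates are
            # compiled once per process rather than located and parsed per request.
            "loaders": [
                ("django.template.loaders.cached.Loader", [
                    "django.template.loaders.filesystem.Loader",
                    "django.template.loaders.app_directories.Loader",
                ]),
            ],
        },
    },
]
```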
For query optimization there's the obvious stuff: select_related and prefetch_related. There are also Prefetch objects, which landed in a recent version of Django (though they're accessible in earlier versions as well); they let you pass in a queryset and filter the objects that get prefetched, so you can conditionally prefetch items in the queryset. For instance, with a generic foreign key you can prefetch all the instances of one model in that field and not worry about them conflicting with other results in the queryset that come from different models without that field.

Then there are values() and values_list(). I found this really surprising when profiling Django: one of the things with a huge impact, which doesn't really show up in any profiler, is model instantiation and hydration. This is where I come back to django-cache-machine. We stored in the database all the different ad slots on our site, the breakpoints at which they're enabled, and the sizes of the ad units, and all together, with the prefetch_related and select_related, that amounted to something like 700 instances being loaded into memory on every request, which came to about 80 to 100 ms per request. When we switched those querysets to .values() and manipulated the data to pre-calculate what we needed, it dropped to about a millisecond per request. That was a huge gain; it basically made our response times 25 percent faster at that point.
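To make those two queryset techniques concrete, here is a hedged sketch. The AdSlot and Breakpoint models are invented for illustration; the Prefetch object API is from Django 1.7+.

```python
from django.db import models
from django.db.models import Prefetch


class Breakpoint(models.Model):
    # Hypothetical model, for illustration only.
    min_width = models.IntegerField()
    enabled = models.BooleanField(default=True)

    class Meta:
        app_label = "ads"


class AdSlot(models.Model):
    # Hypothetical model, for illustration only.
    slug = models.SlugField()
    breakpoints = models.ManyToManyField(Breakpoint, related_name="ad_slots")

    class Meta:
        app_label = "ads"


# A Prefetch object controls exactly which related rows are fetched and where
# they are attached, instead of prefetching everything on the relation.
slots = AdSlot.objects.prefetch_related(
    Prefetch(
        "breakpoints",
        queryset=Breakpoint.objects.filter(enabled=True),
        to_attr="enabled_breakpoints",
    )
)

# values() skips model instantiation entirely and returns plain dicts, which is
# dramatically cheaper when a query would otherwise hydrate hundreds of instances.
slot_rows = AdSlot.objects.values("id", "slug", "breakpoints__min_width")
```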
Beyond that there's the more difficult stuff, like denormalization and indexing, and you can always throw more hardware at the problem. Part of our speed gain there was cheating: we upgraded from four-year-old servers to brand-new servers that ran twice as fast, and that cut our response times too. Now, you might notice, compared with the earlier chart, that the only thing left in this chart is request queuing. What is request queuing? To explain that I have to rewind a little bit and describe how we were set up.
We had what I think is a fairly standard way of load balancing a Django application: nginx in front as a proxy cache, and then a couple of application servers behind it, in this case running mod_wsgi. nginx was listening on port 80, and the different applications were listening on different ports, defined as upstreams in the nginx config and load balanced across the different app servers.
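Something like this, as a rough sketch; the ports and addresses are placeholders, not the actual topology.

```nginx
# One upstream per application, load balanced across the app servers.
upstream app_servers {
    server 10.0.0.11:8001;   # application server 1
    server 10.0.0.12:8001;   # application server 2
}

server {
    listen 80;
    server_name www.example.com;

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```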
We also had a stage version of our site, which we would use to test code headed for production, running in the production environment; it just listened on a different port. So our directory structure looked something like this. In the conversation I had last year with the Opbeat founder, I explained the problem we were having: when we deployed, we would load everything up, try to preload as much as we could, touch the WSGI file, and then cross our fingers. If we happened to be getting slammed with traffic at that moment, all of that code takes time to load, and the machines would just blow up. One worker process on its own isn't so bad, but a server loading 40 workers simultaneously, across VMs that were all actually sitting on three bare-metal machines, could take 30 to 45 seconds, and during that window all of these requests are queuing up, so by the time the server is responding it is still being overwhelmed by the backlog. There were a few times when we had to do rolling restarts: bring servers down, take them out of rotation, and just let them cool off until they could respond again.
So what he suggested was: rather than keeping stage as a separate thing, and deploying by loading the code and touching the WSGI file, switch symlinks instead, so that you have an A and a B copy of the site, one of them acting as stage and the other as live at any given moment. The A and B folders are always production-ready; at any given time one of them is either one version ahead of or one version behind what's currently on production, but in every other respect (the CDN paths they use, the database) they are production. And you force as much code as possible to load before the WSGI application callable is defined in the WSGI file. That matters less with mod_wsgi, but we switched to uWSGI, and one of the things that's nice about uWSGI with this kind of symlink-swapping setup for a Django project is zerg mode. The way zerg mode works is that you have a zerg server, which is really bare-bones: all it does is listen for workers, the zergs, and when they come online they tell the zerg server that they are accepting requests, and only at that point does the zerg server send requests their way. That gives the workers the freedom to take as long as they need to get everything loaded into memory and preload everything before they ever have to deal with an incoming request. Then, before swapping the upstream, after the workers have been preloaded, we warm them up with a number of simultaneous concurrent requests, to make sure every worker has already handled a request before it sees real traffic.
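A rough sketch of what that can look like in uWSGI configuration; the option names (zerg-server, zerg) come from uWSGI's zerg mode documentation, but the paths, ports, and module here are placeholders rather than the production config.

```ini
; Zerg server: owns the socket that nginx proxies to and hands the listening
; file descriptor to any "zerg" workers that attach to it.
[uwsgi]
master = true
socket = 127.0.0.1:3031
zerg-server = /var/run/uwsgi/zerg.sock

; Worker instance (a separate ini file): it loads the whole Django application
; first, then attaches to the zerg server, and only starts receiving requests
; once it has attached.
; [uwsgi]
; chdir = /srv/site-a                      ; the current A (or B) checkout
; module = myproject.wsgi:application
; master = true
; processes = 8
; zerg = /var/run/uwsgi/zerg.sock
```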
So why did we switch from mod_wsgi to uWSGI? I think there's a lot of misinformation about WSGI servers and performance benchmarks. Generally the comparisons of configurations are unreliable, and the thing being tested isn't representative of the real world; it's something like how many concurrent "Hello world" responses the server can push out at any given moment. In general the performance is comparable. mod_wsgi runs on the Apache HTTP Server, and it is mostly the work of a single developer; the documentation is commonly out of date, and important configuration options are missing from it and only discoverable elsewhere, which is something that is openly admitted in the documentation itself. A side-by-side of the GitHub pages, I think, demonstrates the imbalance in where the two communities have gone.
uWSGI, by contrast, has an active community and thorough documentation; the documentation is, if anything, kind of overwhelming, but the server is also highly configurable. When I was rehearsing this talk there were a couple of slides I didn't think I'd have time to get through, but I do have a GitHub repository with a demo that shows how we're using uWSGI's emperor mode and zerg mode, and the uWSGI stats server to monitor the application, and how we preload everything using a Fabric script, so I'd encourage you to refer to that.
Just to show what our uWSGI monitoring pages look like: something like this. We use emperor mode, which is basically a way to dynamically set up multiple configurations for a single codebase. We have one logical place that hosts all of The Atlantic's properties: CityLab, The Wire, theatlantic.com, our accounts site where people manage their print subscriptions, the sponsored sections of the site, with an A and a B version of each of those, plus a CMS for editors and another for our outside video contributors. When we go to that page we can see requests coming in, in real time, and you can hover over the graphs to see information about how busy each server is. If the sites start to get overwhelmed there's a little queue monitor down at the bottom; generally, if it queues up a little bit, that's normal and sort of expected. One thing you can do is dynamically spin up instances based on how busy the other worker processes are, so that when requests come in and swamp the processes, the stack can handle the increase in volume.
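A sketch of the emperor-mode layout described above; the directory names and addresses are illustrative.

```ini
; The emperor watches a directory of "vassal" configs: one small ini per
; property (site A, site B, the CMS, accounts, and so on), all pointing at the
; shared codebase. Dropping in or touching an ini starts or reloads that site.
[uwsgi]
emperor = /etc/uwsgi/vassals

; A vassal, e.g. /etc/uwsgi/vassals/www-a.ini, might look like:
; [uwsgi]
; chdir = /srv/site-a
; module = myproject.wsgi:application
; processes = 8
; socket = 127.0.0.1:3031
; stats = 127.0.0.1:9191    ; JSON stats endpoint, watchable with `uwsgitop`
```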
Here are the URLs for the GitHub repository. I actually haven't pushed it yet, but by the time this video is online it'll be there, probably sometime later this week. I'm on Twitter; I don't really use it very much, but it's an easy way to get hold of me, or you can e-mail me at frankie at theatlantic.com. With the time I have left, I'll open it up to questions. Thank you. [Applause]
Q: So basically all the end-user traffic that hits theatlantic.com is served by the CDN, but that CDN is talking to Django across the board; is the entire site served off of Django?

A: Mostly, yes. There are a couple of really random legacy pages that look really old, and they are really old, but 99 percent of the traffic goes through Django. As far as the CDN is concerned, only a small portion of the requests get through: we generally have about a three-minute page cache on every page, and longer ones for older pages that haven't been updated in a while. Most of the requests that get past that are actually bots and crawlers. We have an archive (we've been publishing since 1857), and basically everything we have the rights to is online. You would think that a print magazine would have digital publishing rights to all of its articles, but you would be wrong; that is not the case. Everything we do have the rights to, we've digitized and put online for free. So we don't generally have a problem with people crawling our site, but when they don't do it in good faith, like the AWS instance that made something like 200 concurrent requests every second and made our servers melt, we have to scramble to block it. A lot of this work was about dealing with that, and getting to a place where we can handle those situations without breaking a sweat.

Q: You talked about using mod_wsgi and switching over to uWSGI. There's a third option that a lot of people use, Gunicorn. Did you look at Gunicorn, or did you just not have time? Do you have an opinion on it?

A: I don't, actually. I know people have had a lot of success with it, but I don't have a strong opinion one way or the other; uWSGI just kind of worked for us. I think Gunicorn is more or less on par with uWSGI in features, so if you like Gunicorn, that's great; uWSGI happened to work for us.

Q: You mentioned earlier that you thought ORM caching was kind of a mixed bag. Can you go into more detail?

A: Sure. We had been using django-cache-machine for a long time on our site, and then, kind of accidentally, we turned it off; it was the first time we had actually run without the cache, and in this case there was really no performance difference. We can speculate about why that might be, and the conclusion we came to is that it addresses situations we don't have. We're not in the cloud; we're in a data center in Reston, Virginia, connected via Fibre Channel, so there isn't much latency between the application servers and the database. And the database handles reads well because we don't have a lot of writes: the editors publish maybe 20 or 30 stories a day, not millions of stories a day, so the database is generally not a bottleneck for us. Meanwhile the model instantiation and hydration, basically the unpickling and creation of the model instances, was taking almost as much time as doing the raw queries against the database, so it really didn't make a big difference for us. But in cases where there's more latency between the application servers and the database servers, or where writes and reads are more balanced, it might be useful.

Thank you. [Applause]

Metadata

Formal metadata

Title High-Availability Django
Series title DjangoCon US 2016
Part 10
Number of parts 52
Author Dintino, Frankie
License CC Attribution - ShareAlike 3.0 Unported:
You may use, adapt, copy, distribute, and make publicly available the work or its content, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or content, including in adapted form, only under the terms of this license.
DOI 10.5446/32700
Publisher DjangoCon US
Publication year 2016
Language English

Content metadata

Subject area Computer Science
Abstract One year ago we completed a years-long project of migrating theatlantic.com from a sprawling PHP codebase to a Python application built on Django. Our first attempt at a load-balanced Python stack had serious flaws, as we quickly learned. Since then we have completely remade our stack from the bottom up; we have built tools that improve our ability to monitor for performance and service degradation; and we have developed a deployment process that incorporates automated testing and that allows us to push out updates without incurring any downtime. I will discuss the mistakes we made, the steps we took to identify performance problems and server resource issues, what our current stack looks like, and how we achieved the holy grail of zero-downtime deploys.
