
Building Dynamic Dashboards With Django and D3


Formal Metadata

Title: Building Dynamic Dashboards With Django and D3
Part Number: 22
Number of Parts: 52
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this

Content Metadata

Abstract
Django does a great job of building dynamic web applications, but it's not always clear how to use it for a single-page JavaScript-driven application like a data dashboard. We will walk through a dashboard built with Django for emergency services data and dig into the following questions. How do I serve data up to my dashboard? We'll show how the Django REST Framework can make this easy. How do I allow deep linking to particular queries on my dashboard? We'll use django-url-filter to transform a URL hash into a database query. How do I get statistical calculations like quartiles out of Django? We'll stretch the Django ORM to use PostgreSQL's powerful statistics functions. How do I make all of this work with D3? We'll have a brief survey of how D3 works and see how to plug data from Django into it.
Transcript: English (auto-generated)
This is my first DjangoCon, which is super exciting for me. I am a little nervous,
not because of the talk, but because I brought my four-year-old son, and he's in daycare, and it's the first time I've brought him to a conference. So if my phone starts buzzing uncontrollably, we'll see what to do about that. And if you see a little hobbit running around tonight at bowling, that would be mine.
Say hello to him. He looks just like Bilbo Baggins. It's uncanny. Cool. So before we get started today, I just want to make one note about content. The dashboard that I'm going to show, this is all a case study based off a dashboard that I just built and just released as open source last week. It's all based around 911 data, which
is, of course, police data. And the police are a very touchy subject right now. I just want to state that up front. In case anyone has any issues with that, that is OK. I understand. This is not a pro-police talk. I 100% fully support the Black Lives Matter movement. This is not an anti-police talk either. It's just about 911 data.
But with that said, I worked on this in coordination with my local police department, where I also built an application to help them detect racial profiling in their traffic stops and prevent that. Great, so the problem that I had to solve: I started at my job, and they came to me and said, hey, we want to build this dashboard that handles millions of 911 records.
We have a prototype. Here it is. The prototype is built. It's a static site. It uses a CSV file for the back end. It serves everything in the front end. It's cool, but it really doesn't scale. You can see it here. It looks nice, but it really doesn't scale. And so they said, hey, we would like for you to build something better that we can really
use millions and millions of records for. This is what I ended up building. And we'll look at it in detail later. But with this, I built it with the back end of Django, where I do all my data processing, and then a front end using D3. D3, if you're not familiar, is a data visualization library
in JavaScript. It's fairly intimidating. Who in here has ever seen or been intimidated by D3 in particular, or by JavaScript? That's awesome. You're not going to have to see too much of it. Part of my talk actually recommends, hey, maybe you don't want to use D3. But so the architecture of it.
So here are the tools that I used. This is my stack. I use Django, of course. It's right now running on Django 1.8, but I want to upgrade it. I don't mention the database I use here. I use PostgreSQL, which I see as kind of a necessity for this,
and you'll see why later. I use the Django REST framework, which has been great for me. I absolutely love it. Then an obscure library called django-url-filter. And then on the JavaScript side, I use something called Ractive.js. This gives us the ability to do reactive programming. I'll kind of explain what reactive programming is as we get further into the dashboard.
And then D3, of course, is my visualization library. NVD3 is a higher level library on top of that. And then Leaflet. Leaflet is a mapping library that allows us to have dynamic maps. Before we get started, I think just so we have context, it'd be nice to see what we're looking at. So I'm going to pull that up right here.
So the dashboard that I'm talking about, this is using New Orleans data. I don't live in New Orleans. I live in Durham, North Carolina. The Durham one is the one that I created for my local police department. This is using public data, and that's a really good point.
This is open source, so you can use it with public data. But here I can see the call volume over time. You see it dips at the end. That's just because we don't have today's data yet. I can click through and see if I want to see all general assistance calls only. Click that. Now I've just got general assistance calls. I can see how heavy it is all over town.
I can look at the response time and see that the median response time for general assistance calls is 5 minutes and 20 seconds for the last two weeks. That's not too bad overall. Could be worse. And then I can even look and see all the calls by location.
How is it going to work? That is awesome. So yeah, this cluster is done. So I can go in and look at individual calls and see like this call right here was complaint other.
And I can see high call volume locations, which is actually being used in some places to detect mental health problems in the community. OK, now that we just have the briefest overview, how do you build something like this with Django? Cool, so this is the architecture.
And I'm going to come back to this again at the end. So you're going to get to see this again. But if you look here, you can see I start with a request that comes in, and I load a page. And then I sort of push everything off to the front end, where I'm handling everything with D3. I'm using Django really to generate summary stats, which are the summarizations that you see on the front end.
It takes the filters that come in and generates it. OK, Django. How do I use Django? Well, so for the Django side of this, I have an endpoint for every page that you saw when I flipped through the dashboard.
So this is really simple, just three little endpoints here. They all have an API prefix because they all serve up JSON responses. I try to keep them compact, and they're all summarized ahead of time. Most of that's done in the database to make this as fast as possible.
The way I approached that was I came up with this idea of a summary model, where I put all the analysis that I would need to do to display an entire dashboard page in here. This is just a very small subset of it. You can see here in this, I don't have a laser pointer, so it's going to be hard to point, but that's OK.
You can see here that I inherit from a base class called call overview. And then I use in my subclasses this idea of annotations. Each page is really about a specific subject. So we've got a page that's about the number of calls. Especially for departments and planning, they want to see the number of calls. For the public, we really want to see the response time.
I care a lot about where the response time is in my community. And so I can have a different annotation for that here. But every one of the particular aggregations I want to do, whether it's by day of week, by district, by type of call, by unit, all those different things,
I can do. And I just put the different annotations for how I want to aggregate that inside of each one here. You might notice in here down at the bottom, on that last annotation for the mean, the average seconds of the officer response time. Seconds is not a standard Django function for the ORM. So I'll talk a little bit about why that happened
and how useful that's been. So I mentioned I used PostgreSQL. And I find that to be a necessity for this sort of work because it really does have some great things inside of it in terms of being able to do different types of queries that you can't necessarily do elsewhere, because they go outside the SQL standard. But here's an example of where
I was able to build a custom function around the Django ORM, and of why Django was so useful. We had other options. In fact, I was pressured at the beginning a little bit to use a different technology just because we had several people internally who were very good at JavaScript. The discussion came up, maybe we should use Node for this, and one of the big arguments that I made
in favor of using Django was how great it is at letting us customize the ORM, build things around it to make it fit our needs exactly. So here I have a little helper function called precision where depending on the amount of data that I am looking at, whether I'm looking at a week,
or I'm looking at a month, or I'm looking at a whole year of data, it's gonna use a different precision when we're looking at the data: month, day, or hour. And then down here at the bottom on volume by date, I truncate that date by the precision. So it's grouping it together. Like if I'm looking at a year of data,
it's grouping it together by month. If I'm looking at less than that, it's grouping it together by day. If I'm looking at less than seven days, it's grouping it together by hour. And that is gonna happen dynamically in our dashboard just to take a quick look here. All right, so last seven days, when I moved to last seven days here,
we're suddenly looking at it by hour and we can really see what's happening. If I go back here and say year to date and then I'll change that to a more custom range. Okay, so that's by day. You can see by day. And then if I go in here and say, oh yeah, not January 2016, but January 2015.
It might be quicker to type that. This is live.
Love it, love it. All right, here we are. Let me click apply. Wow, okay. Maybe I'll use a future date. Oh, was I looking at future date? I was because I'm a nut. I thought that this was August. I don't know. Take a second there.
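The precision rule from the demo (by hour for a week or less, by day for longer ranges, by month for a year) can be sketched in plain Python. This is an illustration only, not the dashboard's code: the function names, the exact thresholds, and the in-memory counting are assumptions, and in the real application the truncation happens in PostgreSQL rather than in Python.

```python
from collections import Counter
from datetime import datetime, timedelta


def precision(start, end):
    """Pick a grouping unit from the width of the selected date range.

    The thresholds (365 days, 7 days) are assumptions for this sketch.
    """
    span = end - start
    if span >= timedelta(days=365):
        return "month"
    if span > timedelta(days=7):
        return "day"
    return "hour"


def truncate(moment, unit):
    """Round a datetime down to the start of its month, day, or hour."""
    if unit == "month":
        return moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return moment.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return moment.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")


def volume_by_date(received_times, start, end):
    """Count calls per bucket; the bucket size comes from precision()."""
    unit = precision(start, end)
    counts = Counter(truncate(moment, unit) for moment in received_times)
    return unit, sorted(counts.items())
```

Calling `volume_by_date` with a six-day window groups calls into hourly buckets, while the same calls over a two-month window would be grouped by day.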
But yeah, now we're looking at it by month. This is a much flatter line. So this is a really cool feature that I was asked for that Django just made easy to do. Django plus Postgres in this case. What you see here with this date_trunc, again, that's not a standard function. I have this whole thing on my blog, and I go into that function a little bit further there. I don't really have time in 25 minutes
to dissect the date_trunc function, but it was something I was able to add easily. In this case, I was also able to add an aggregation easily. I needed percentiles. So I work at RTI International. I work specifically in our Center for Data Science. Percentiles are not a particularly heavy piece
of data science, but in this case, we really wanted not just to see the average response time, but 75% of calls. What's their response time? Because that's gonna matter a lot more. The average response time can get artificially lowered, but we really want to be able to see what's the real response time on the ground. So with here, I was able to go in and say, give me the quartiles, that is 25% of calls.
How quickly are they responded to? 50%, 75% of the total number of calls. Lastly, with these summary models, I just call a bunch of functions and return a bunch of JSON. In this case, you're seeing a data structure here that gets turned into JSON by Django REST framework.
And this is for the volume page. I've got volume by date and source and all these different types of ways that we want to look at the volume. And then we also have the heat map, which shows by day of the week and hour what the amount of calls were.
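To make the quartile idea from a moment ago concrete, here is a small pure-Python sketch of a percentile with linear interpolation between neighbouring values, which is how PostgreSQL documents its `percentile_cont` aggregate. Whether the dashboard uses that particular aggregate is an assumption; the function names and the sample response times are made up for illustration.

```python
def percentile(values, fraction):
    """Return the value at the given fraction (0 to 1) of the sorted data,
    interpolating linearly between the two nearest items."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def quartiles(values):
    """Summarize response times as first quartile, median, third quartile."""
    return {
        "q1": percentile(values, 0.25),
        "median": percentile(values, 0.50),
        "q3": percentile(values, 0.75),
    }


# Hypothetical response times in seconds; one very slow call skews the mean.
response_seconds = [120, 180, 240, 300, 2400]
summary = quartiles(response_seconds)
mean = sum(response_seconds) / len(response_seconds)
```

With this sample the mean is pulled far above the median by a single outlier, which is the kind of distortion quartiles are meant to expose.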
The key to this, and the key to doing one of the big things that was demanded, was django-url-filter. So one of the big demands, one of the things that I had to have in this project, was the ability for every view to be bookmarked. If they clicked through on three different charts and they drilled down to say, hey, in this district, on Mondays, for general assistance calls,
this is what I'm looking at, they want to be able to send that to someone else. Not just to bookmark it so they can look it up again, but for people who aren't necessarily that technical, they just want to do something quick. You've got, let's say, you've got a lawyer who's looking at this public data and wants to send it to their client. Or you've got someone within the police station
who wants to send it to the chief. They want to just be able to send that URL. So I use django-url-filter. Now, a lot of people may be familiar with django-filter, just because that's prominently featured on the Django REST framework website. It's a separate library, but it's often used with Django REST framework. django-url-filter is its little brother
that isn't nearly as good, but it's super hackable. And it's very small. I shouldn't even say it's not as good. It's just quirky, right? And if you look at how many people use it, not many, but it's very, very hackable. And that was important, because I had a couple of things I needed to do here. The big one was I needed to be able to call query set methods.
I had certain ways of looking at my data that I couldn't just put in an objects.filter statement. I needed to be able to say, hey, I need everything during day shift, which is 7 a.m. to 7 p.m. And so I might make methods on my query set. You certainly don't have to understand this code up here, but note that what it does is it looks here
to see if there's an attribute with that filter name on the query set. And if so, it just goes ahead and calls that instead of sending it over to filter. And that became very, very useful. So you can see here how it translates it, right? If I have get parameters of district seven, nature 10, those will go straight to filter. But shift, because shift is a method on my query set,
will get called here if that's one of the parameters. The other big hack was I need to build my filter from a data structure instead of building it from classes and objects. The reason why is because I have the filter and I have the ways to select it on my front end.
I didn't wanna have to recreate that, right? One of the easiest ways to make mistakes in programming is by repeating yourself because you will never get it exactly the same. And so I certainly didn't wanna do that here. I wanted to be able to have this data structure right here, which is a pretty arcane data structure, like most things that you put together while you're coding it.
This is sort of putting the airplane together while I was in it. It got a little funky, but it's relatively self-explanatory what's happening here. I've got filters based off time received and based off shift and district and nature group and whether or not the call was canceled. And with this, it can just translate it straight into JSON.
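The two hacks described above, dispatching a URL parameter to a query set method when one exists and driving the filter from a plain data structure, can be sketched without Django. Everything here is hypothetical: `CallQuerySet` is a stand-in over a list of dicts, and `FILTER_SPEC`, `CASTERS`, and `apply_url_filters` are illustrative names, not django-url-filter's API.

```python
import json
from datetime import datetime

# Hypothetical filter description, shared by back end and front end.
FILTER_SPEC = [
    {"name": "district", "type": "int"},
    {"name": "nature", "type": "int"},
    {"name": "shift", "type": "str"},
]
CASTERS = {"int": int, "str": str}


class CallQuerySet:
    """Tiny stand-in for a Django query set over a list of dict rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        """Keep rows whose fields equal every given value."""
        kept = [row for row in self.rows
                if all(row.get(key) == value for key, value in conditions.items())]
        return CallQuerySet(kept)

    def shift(self, value):
        """Custom method: day shift is 7 a.m. to 7 p.m., as in the talk."""
        wanted_day = value == "day"
        return CallQuerySet([row for row in self.rows
                             if (7 <= row["time_received"].hour < 19) == wanted_day])


def apply_url_filters(queryset, params):
    """Apply GET-style parameters, preferring a query set method by that name."""
    types = {item["name"]: CASTERS[item["type"]] for item in FILTER_SPEC}
    for name, raw in params.items():
        if name not in types:
            continue  # parameters outside the spec are discarded
        value = types[name](raw)
        method = getattr(queryset, name, None)
        if callable(method):
            queryset = method(value)                      # e.g. shift=day
        else:
            queryset = queryset.filter(**{name: value})   # e.g. district=7
    return queryset


# The same structure can be serialized for the front end to build its controls.
spec_json = json.dumps(FILTER_SPEC)
```

Because only names listed in the spec are honored, a parameter such as `user__email` never reaches the query set in this sketch.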
And then my front end can consume that to build all the filters that I have along the top of the page. We'll see that again when we go back to look at it. My API endpoints are simple. There's not a lot of detail to go into on them, but you can see here it takes those get parameters from the request,
which are gonna be at the top of the page, in the URL bar, right? So it's bookmarkable and I can send it to other people. And then it's just gonna take my overview model or my summary model, call to_dict on it and send that back. Nothing major there. The front end is where this got really interesting and how it works with Django.
So my front end, I mentioned that it's reactive. So what do I mean by that? Reactive really just means that there is a flow of data and there's events that happen, there's things that monitor those events and react to them. In this case, the things that you might see someone do while using this page is they change filters, which they can do in one of two places, right?
They can click a dropdown. Let's pull that up. So they can click one of these dropdowns. Like I just wanna look at Monday. You can see the URL changes up here immediately and it reloads. I might also say, I just wanna look at this district, right? So this updates in two different places. And the thing I just selected here, district three,
I could select district seven here, or I could go in here and clear it, right? I have two different ways that I can update these filters, but every time I do, the URL at the top of the page changes. The application watches for those changes in the URL, right? If I went and changed the URL just by
manually changing it. If I went in here and said, I don't care about this nature group, I wouldn't really expect someone to do this, but they might have a bookmark. In that case, this would be necessary. It looks for those changes in the URL and then it sends requests to the backend for new data. It gets that request back, we update the data and when the data is updated, the page is updated. If anyone's ever written like a fairly complex
JavaScript application, using pretty standard tools, let's say you just use jQuery, which is an awesome library, but whenever you have something new, you need to update. There's a lot of linkages. You have two different ways that you can update something. Now you've got to link both of them. With this sort of way of looking at it, it made it very easy to add new controls,
to add new charts and not have to have sort of an exponentially growing set of linkages. Like I said, reactive programming, the big words that you might hear about are unidirectional, right? Everything flowed one way. When this change happened, then this would happen, then this would happen. I don't have data syncing two different ways.
And then it was data flow, right? Data going one way, and event driven. So this is sort of a summary of the different events and the reactions that I saw in my application. And all of these look kind of synchronous, right? When a user clicks on a chart, the filter changes. When the filter changes, the URL hash is updated.
But note that this is asynchronous, and that was part of this that worked really well because it means I can come in at different points. There's no direct chain of things I have to do. If any of these things change, then the things that should happen happen, right? So like I showed, when I just go and change the URL hash manually, the Ajax request is sent for new call data.
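The chain just described (filter change, then URL hash, then request, then data) can be sketched in Python for consistency with the back-end examples, even though the real front end is JavaScript. The class and function names are hypothetical, the `fetch` callable stands in for the Ajax request, and the sketch runs synchronously, whereas the talk stresses that the real flow is asynchronous.

```python
from urllib.parse import parse_qsl, urlencode


def filters_to_hash(filters):
    """Encode the active filters as a URL hash, in a stable key order."""
    return "#" + urlencode(sorted(filters.items()))


def hash_to_filters(url_hash):
    """Decode a URL hash back into a dict of filters."""
    return dict(parse_qsl(url_hash.lstrip("#")))


class Dashboard:
    """One-way flow: control -> URL hash -> request -> data."""

    def __init__(self, fetch):
        self.fetch = fetch      # stand-in for the Ajax call to the API
        self.url_hash = "#"
        self.data = {}

    def click_filter(self, name, value):
        """A control never touches the data; it only rewrites the hash."""
        filters = hash_to_filters(self.url_hash)
        filters[name] = value
        self.set_hash(filters_to_hash(filters))

    def set_hash(self, url_hash):
        """Entry point for controls, bookmarks, and manual URL edits alike."""
        self.url_hash = url_hash
        self.on_hash_change()

    def on_hash_change(self):
        """React to the hash by requesting data for the filters it encodes."""
        self.data = self.fetch(hash_to_filters(self.url_hash))
```

Since every path goes through `set_hash`, a pasted bookmark and a clicked dropdown end up in the same place, which is what keeps the number of linkages from growing with each new control.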
And then each one of those charts actually monitors a specific subset of the data. So if only that subset changes, then only that chart updates. It's a pretty slick way of doing things. This is a similar thing to what you see if anyone's ever used ReactJS. It's the same model, but Ractive was what we used
because it was a little simpler. It works really well with Django, and it was written by people I respect over at The Guardian. Here's a simple component that you might see. So I've got some JavaScript up here that shows I've got a template, and then I've got this data, hidden: true, thing, right?
So I have a little chart header I wanna hide and open. In fact, you can see it right here. Pretty simple stuff. Not enormously complicated. But here I have the template where you can see unless hidden, show this. This isn't rendered once. It monitors the data, which is the whole point
of the way this dashboard works: it monitors the data to update. So as soon as the data updates, this is re-rendered. Okay, I haven't talked about D3 at all yet, and we have five minutes. Visualizations. So what I was gonna say about D3 is this:
D3 can be easily thought of as a toolkit for building visualizations. It is not particularly good to think of it as a chart library, because it's not. A chart library would give you some charts. D3 gives you a lot of tools. I sort of describe it like this:
it's like a bag of Lotus parts. You can build your Lotus. It's gonna be awesome, but you gotta build it. And so just going and picking up a Ford Taurus from the lot is sometimes a smarter move. And there's a lot of higher level libraries on top of D3 that I recommend. I used NVD3 because I liked its styling.
It fit in really well with what I was doing, but there's a lot of other ones to look at. One I've been using recently is PlotlyJS. The team behind Plotly open sourced their JavaScript library, and it's awesome. It's also like two megabytes. It's giant. That's the only downside. But here you can see a simple chart object that I built, this higher level object on top of D3.
It says, oh, I filtered things by day of week received when I click on it. It has some formatting stuff in there. And then I have this monitor chart. And this is how I hook up the reactive nature of this. So whenever data in the volume by day of week subset of that data tree changes, then call update on this chart.
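The monitor chart idea, watching one key path in the data tree and calling update only when that subset changes, can be sketched as follows. It is written in Python to match the other examples, while the dashboard itself does this in JavaScript with Ractive. `DataStore`, `Chart`, `monitor_chart`, and the key path names are all hypothetical.

```python
class DataStore:
    """Holds the data tree and notifies observers of changed key paths."""

    def __init__(self):
        self.data = {}
        self.observers = []   # list of (keypath, callback) pairs
        self.loading = False

    def observe(self, keypath, callback):
        self.observers.append((keypath, callback))

    def get(self, keypath):
        """Follow a dotted key path into the tree; None if it is missing."""
        node = self.data
        for key in keypath.split("."):
            if not isinstance(node, dict) or key not in node:
                return None
            node = node[key]
        return node

    def set(self, new_data):
        """Replace the tree, then call observers whose subset changed."""
        before = {keypath: self.get(keypath) for keypath, _ in self.observers}
        self.data = new_data
        for keypath, callback in self.observers:
            value = self.get(keypath)
            if value != before[keypath]:
                callback(value)


class Chart:
    """Every chart exposes the same two methods: create and update."""

    def __init__(self):
        self.history = []

    def create(self):
        self.history.append("create")

    def update(self, data):
        self.history.append(("update", data))


def monitor_chart(store, keypath, chart):
    """Call chart.update when keypath changes, unless the page is loading."""
    def on_change(value):
        if not store.loading:
            chart.update(value)
    store.observe(keypath, on_change)
```

A chart registered on `volume.by_dow` is left alone when only `volume.by_source` changes, so each new chart adds one registration rather than a new set of cross-links.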
All my chart objects have a create and update method. And that's really, that's the entire API behind my entire system here is. When the page loads, call create. When the data updates, call update. Monitor chart is simple. It just takes what's called a key path. Again, that's sort of a path into the tree of data and says, hey, call this function when that changes
unless the page is currently loading. That was a little protection I put in there because a lot of this stuff is asynchronously loading. The heat map was where I actually used D3 for real. The heat map was very, very cool. To show how to do this would require an entire class on D3, which I am not gonna give. But, and the great part is,
I built it directly off of one of Mike Bostock's examples on his excellent site bl.ocks.org. If you're interested in doing cool visualizations, he has amazing examples there. It was cool until I was teaching some people data science and then one of them made their example project that had the exact same heat map in it. I went, oh, okay.
I guess more than one person has looked at that example. But this heat map here is showing by day of week and by hour how many calls there are. And again, I just have a create and update function. So I was able to plug this in to the way everything else works very easily by having this create and update function. So every one of my visualizations has the same API
because of the way the data flow works. So, my lessons learned. First one was for building something that's so data intensive like a dashboard, reactive programming really simplifies those interactions. It made it much more simple to work with. I learned that I should always use higher level libraries on top of D3. I started by not doing that
and I bled on my computer for it. It was not fun. That was like two weeks that I'm never getting back. And then I didn't talk about this in here, but this is an open source application. You can look at it. When I have simple front-end work,
I like to just use the standard sort of Django asset pipeline tools. I like Django Compressor myself. But for serious front-end work, where you have a lot of stuff going on, using Webpack and Django Webpack Loader is really great and really simple. I was able to use it with this and even make a plugin system for it. It works really, really well.
We're getting near the end. So, I have this again. And it might make a little more sense to you now that I've talked through reactive programming, how it works, how all these things are connected. If you wanna see the code behind this, there's a link at the bottom to git.io/cfs. You can look at the code there.
And if you go to cfsdemo.rticds.org, that's a mouthful, you can actually play with the application with the live New Orleans data. It updates nightly, so it should continue to have live data. I know I don't have a lot of time for questions, but I'd like to take any if there's a minute or so. You have a minute.
So, is this application communicating with any of the officers out doing patrols or anything like that? Or is this used in-house, where they analyze the data and then they're like, okay, we'll send these officers to these locations? So, for my local police station, it's internal only, on the intranet. It's mainly used at their CompStat meetings,
which I got to go to. I felt like I was an extra on The Wire. It was badass. Again, I am not crazy pro-police, but there's something about feeling like an extra on The Wire that's pretty cool. So, it's used in-house on the intranet. But, yeah, and it's been really useful for them to be able to find real problems in the city.
Thank you. My question's actually related. I was wondering, I really like how simple it is when you go from the URL all the way down the stack, but how do you control situations so that someone can't put in dunder user or dunder email or something? How does that work? So, yeah, I mean, the things that they're able to access
are solely filtering on that query set. Yeah, and I know there's solutions. What's your solution? Oh, I mean, django-url-filter filters out the things that can't be used on the query set. Okay. And anything that comes in that can't be used, it just discards. I think that's probably all the time I have for questions, but I'm gonna go right outside the door here if anyone has any further ones.
I know this is a big topic for 25 minutes, so I'd love to answer any.