
Django for IOT: From Hackathon to Production


Formal Metadata

Title
Django for IOT: From Hackathon to Production
Title of Series
Part Number
37
Number of Parts
52
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
It’s Friday night of hackathon weekend. The latest snazzy Internet-connected thingy is sitting on the table next to your beverage of choice, the device’s API docs are open in a browser tab, and your fingers are itching to write some Django. What’s the fastest way to get started? And next month when you come back to it, what will you want to upgrade? This talk will walk through a common IoT use case, sending HTTP requests to turn on and off a device in response to some external data. I do this all the time at WattTime and I'll share some of the tricks I've picked up over the last couple years. We’ll focus on two big differences from your typical blog or polls app: the data model abstractions that fit the problem, and the need to run frequent periodic tasks to hit the device’s API. I'll share a data model that's worked well for me across a bunch of IoT apps. And I'll show you two ways to run those periodic background tasks in Django: a hackathon-friendly version, and a production-friendly version using Celery. You'll walk away with a complete demo template that you can use in your own projects!
Transcript: English (auto-generated)
So if you don't want to hang out in the overflow room,
you have two anos to go, see. So I was introduced to Django at the hackathon where my startup was born. So we're called WattTime, and we use Django to fight climate change. So we do that by enabling internet-connected devices like smart thermostats or the electric vehicle charging stations that you see pictured here
to turn on when energy is clean coming from the grid and turn off when it's coming from dirty fossil fuels. Another side project that I'm working on right now is the Unconscious Bias Project. And for that, we're collecting evidence-based resources to help you identify, reduce, and cope with the effects of unconscious bias in science
and tech. But those aren't what I'm actually talking about here today. So I'm talking about Django for the internet of things from hackathon to production. What? That was a lot of words. Let's break this down a bit. So Django, we all know and love: everyone's favorite web framework. The internet of things.
So this is when something that you don't normally think of as a computer, like a thermostat or a car, it can receive and transmit data and even respond to controls all in real time. And when some people talk about the internet of things, they talk about writing the embedded code that runs on the device.
But a lot of the devices that you can buy these days have APIs that the vendor provides. So today, I'm going to focus on the use case where you're writing code that interacts with that API. Hackathon. So hackathons are weekend events where you form a team, write a bunch of code, drink a bunch of coffee and/or beer.
There are usually winners and prizes, and everyone has a great time. And as IoT has grown in general, it's become particularly popular for hackathon projects. But because hackathons are so short and compressed, they're just a weekend, they're kind of notorious for producing really bad code. So what if you want to put your code into production with real users?
So reliability matters in a different way for internet of things apps than for a lot of other things you might work on. So say you have a Wi-Fi enabled lock on your house. What happens if the server goes down? Can you still get into your house? Can anyone get into your house? So you'll want to be a little bit more careful when you're coding for production.
Eat your fruits and veggies while you're drinking your beer. And make sure that your code is extensible and testable and uses dependencies that are reliable so that when you ship it, you can trust it and go have fun. So today, we're going to talk about some design patterns that have worked for us at WattTime
while writing and deploying Django projects that monitor and control internet of things devices using the vendor provided APIs. Some things will streamline your work for your next hackathon, and other things will help you as you move into production. And as we go, we'll be using this emoji for hackathons,
this one for production, and this one for anti-patterns to avoid. So we'll be helping those. Great. So imagine it's Friday night. You're at a hackathon, and you have this awesome idea for an IoT project. Where do you start? So you usually start with some models. So for comparison, let's think about the standard books
app that you may have seen in the tutorial. We have an author model with name and hometown attributes, and we have a book model with a title, a foreign key to author, and the year that the book was published. An analogous IoT model would look pretty similar to start with. So instead of book, we have a device with a name and a location. And instead of author, sorry, other way around there.
So instead of author, we have a device with a name and a location. And instead of book, we have an observation with a value, a foreign key back to the device, and a timestamp. So even though I said the words wrong, there's a lot that's similar. So what's different?
The main change to the device model is to add the vendor-provided ID, something like a MAC address maybe. You'll almost always need this to pass on to the vendor's API to uniquely identify what device you want to be talking to. So this might be analogous to the author's social security number or other national ID number.
The bigger mental change when you're writing these apps is thinking about the observations. So IoT is fundamentally about time series data in a way that most other Django starter projects aren't. And the number of observations that you'll be dealing with grows a lot faster than the content in the early stage of a typical content-driven app.
So at WattTime, we tend to deal with medium-sized data, since we pull new energy data from the power grid every five minutes. So it's not like huge data. Your project might enter big data territory much faster than you expect, because many sensors have readings that pull data extremely quickly. And even at my very first hackathon,
adding a database index to timestamp was the difference between our demo timing out and it actually running quickly and smoothly. So thinking about the underlying structure of your database tables might be more important. And the last difference I'm going to talk about is you'll want to be storing different types of observations
in a lot of cases. So with books, it's a good bet that the title is going to be well-represented by character data. But with IoT, you'll probably want to be storing both numerical data, like how much energy is being used by the device, and maybe Boolean data, like whether the device is on or off, possibly also char data, like display messages, or data with choices, like the color of a light bulb.
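The data model described so far can be sketched as a schema. In a Django project these would be model classes, with `db_index=True` on the timestamp field; the sketch below uses the standard-library `sqlite3` module instead so that it runs without a project. All table, column, and index names, and the sample MAC address, are hypothetical and not taken from the talk's slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT NOT NULL,
    vendor_id TEXT NOT NULL UNIQUE  -- e.g. a MAC address, passed to the vendor's API
);
-- One table per kind of observation, each with a foreign key back to a device.
CREATE TABLE power_observation (
    id INTEGER PRIMARY KEY,
    device_id INTEGER NOT NULL REFERENCES device(id),
    value REAL NOT NULL,            -- numerical data, e.g. energy use
    timestamp TEXT NOT NULL         -- ISO 8601 strings sort chronologically
);
CREATE TABLE status_observation (
    id INTEGER PRIMARY KEY,
    device_id INTEGER NOT NULL REFERENCES device(id),
    is_on INTEGER NOT NULL CHECK (is_on IN (0, 1)),  -- Boolean data
    timestamp TEXT NOT NULL
);
-- Time series tables grow fast, so index the column you filter on.
CREATE INDEX idx_power_timestamp ON power_observation (timestamp);
CREATE INDEX idx_status_timestamp ON status_observation (timestamp);
""")

conn.execute(
    "INSERT INTO device (name, location, vendor_id) VALUES (?, ?, ?)",
    ("porch light", "front door", "00:11:22:33:44:55"),
)
conn.executemany(
    "INSERT INTO power_observation (device_id, value, timestamp) VALUES (?, ?, ?)",
    [(1, 1.2, "2016-07-18T12:00:00"), (1, 1.5, "2016-07-18T12:05:00")],
)

# Fetch only the readings at or after a cutoff time.
recent = conn.execute(
    "SELECT value FROM power_observation WHERE timestamp >= ? ORDER BY timestamp",
    ("2016-07-18T12:05:00",),
).fetchall()

# Ask SQLite how it would run a time-range query; the last column is the plan text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM power_observation WHERE timestamp >= ?",
    ("2016-07-18T12:05:00",),
).fetchall()
plan_detail = " ".join(str(row[-1]) for row in plan)
```

Here `plan_detail` names the timestamp index, which is the kind of lookup that kept the hackathon demo from timing out.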
So you might want to use different models for those different data types. Great. So we have a basic data model. Onto views, right? Wrong. So instead of views, which are for interactions with people, our basic unit of interactivity is going to be tasks, which are for interactions that can be triggered without a human
pressing a button. And to see why this is important, let's think about the normal vanilla MVC Django project. In this case, your app is only responsible for sending data to a user if they request it from you. But most third-party APIs, including almost every API I've used for an IoT project, they
put the responsibility on your app to make the request to them whenever you want to push or pull data. And of course, a big part of the point of the internet of things is machine-to-machine communication, doing what the user magically wants without making them ask for it. So it's crucial that all the important stuff that happens in your app can be initiated outside of the user-driven request-response cycle.
And pretty much the entire rest of the talk today is going to be focusing on how to make that happen, because it's pretty different from normal views. So what tasks do we need to be doing? The two basic kinds of functionality for an IoT application are going to be monitoring, so getting the state of the device,
and control, changing the state of the device. So for each kind of observation you have, you might want a separate task function to get the data and to set it if the vendor lets you modify it. Some APIs don't let you modify as many attributes as you might want to, so check that before getting your heart
set on specific applications. You'll also need a driver task. So let's call it the Do Something Awesome task. It's going to stitch all of this together. That's more or less readable. So for this particular demo application, we're going to pass in a device.
We're going to, based on some external data, decide whether to turn it on or off. We will pass that isOn message onto the device using the device's API. We will save that new status to the database. So that's what's happening here. It's kind of funny. After working on Internet of Things apps for a while,
the part where you interact with the device itself has become less exciting to me. Because that's the part that's kind of the same between most apps. Like, you're talking to your light bulb, that part's going to be the same. But the part that I get excited about now is the decision criteria for why you're turning that light on or off, why you're actually taking the action. Because that's the part that makes your app unique.
So once you've decided what sort of awesome thing you actually want to do, so writing the code is going to be as simple or as hard as your idea.
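A minimal sketch of such a driver task, assuming a made-up vendor client and a made-up decision rule. `FakeVendorClient`, `grid_is_clean`, the threshold value, and `STATUS_LOG` are all hypothetical stand-ins; a real project would wrap the vendor's HTTP API and save to a database model instead.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Device:
    pk: int
    name: str
    vendor_id: str  # the vendor-provided ID, e.g. a MAC address


class FakeVendorClient:
    """Hypothetical stand-in for a wrapper around the vendor's API."""

    def __init__(self):
        self._state = {}

    def get_status(self, vendor_id):
        # Monitoring: read the device's current state (off if never set).
        return self._state.get(vendor_id, False)

    def set_status(self, vendor_id, is_on):
        # Control: change the device's state and echo back what was set.
        self._state[vendor_id] = is_on
        return is_on


STATUS_LOG = []  # stand-in for a status observation table


def grid_is_clean(carbon_intensity, threshold=500.0):
    """Decision criterion (made-up units and threshold): the part unique to your app."""
    return carbon_intensity < threshold


def do_something_awesome(device, client, carbon_intensity):
    """Decide, tell the device, then record the new status."""
    is_on = grid_is_clean(carbon_intensity)
    confirmed = client.set_status(device.vendor_id, is_on)
    STATUS_LOG.append(
        {"device_pk": device.pk, "is_on": confirmed, "timestamp": datetime.now(timezone.utc)}
    )
    return confirmed


light = Device(pk=1, name="porch light", vendor_id="00:11:22:33:44:55")
client = FakeVendorClient()
first = do_something_awesome(light, client, carbon_intensity=300.0)   # clean grid
second = do_something_awesome(light, client, carbon_intensity=800.0)  # dirty grid
```

Only `grid_is_clean` would change much from one app to the next; the monitoring and control calls stay roughly the same across devices.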
But there is one gotcha that we're going to talk about quickly. So we're building up to running these tasks asynchronously. So in a production situation, you can't necessarily count on the state of the database being the same when the task runs as when you called it.
So here, since we're passing in the device object as the parameter, what happened with the database may be different by the time we get here. So a better convention that I like is passing in the primary key to the device. And then you can make the query to get exactly the right object that you want from the database right when you need it.
Also, if you use a convention of returning a primary key to new objects instead of something like error codes, then you can chain different tasks together really easily. So we have our models. We have our tasks. Let's put it together into apps.
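The primary key convention can be sketched in plain Python. The dictionaries below are hypothetical stand-ins for database tables, and `set_status` and `log_interaction` are illustrative names. With Celery, a similar hand-off of return values between tasks is what its `chain` primitive provides.

```python
DEVICES = {1: {"name": "porch light", "enabled": True}}  # stand-in for a device table
STATUSES = {}  # stand-in for a status table: pk -> row
LOGS = []


def set_status(device_pk, is_on):
    """Look the device up when the task runs, not when it was scheduled."""
    device = DEVICES[device_pk]
    confirmed = is_on and device["enabled"]
    status_pk = len(STATUSES) + 1
    STATUSES[status_pk] = {"device_pk": device_pk, "is_on": confirmed}
    return status_pk  # return the new row's primary key, not an error code


def log_interaction(status_pk):
    """Takes the primary key that the previous task returned."""
    status = STATUSES[status_pk]
    LOGS.append("device %d is_on=%s" % (status["device_pk"], status["is_on"]))
    return status_pk


# The device row changes after the call was scheduled but before it runs.
DEVICES[1]["enabled"] = False

# Each task's return value feeds the next one.
final_pk = log_interaction(set_status(1, True))
```

Because `set_status` re-reads the device at run time, it sees that the device was disabled in the meantime, which a stale object passed as a parameter would have missed.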
So in a hackathon situation, just toss it all into one app. No one really cares. But if you want to organize it a little bit better, I like to split up my apps this way. So Devices has the device model. Observations might have attributes, status, anything else. Interactions is where the tasks go.
And I'll have the wrapper around the vendor client API in its own app. So this might seem like a lot of structure for now. But each app will grow over time as you start adding views for adding and removing devices so the user can manage what they have. If you want to start adding fancy D3 dashboards,
you can put those views in the Observations app. You're probably going to want to log the interactions that you're making with a device. So that can go in the Interactions app. And if you want to start having multiple kinds of devices that you're supporting, then you can swap out the vendor APIs pretty easily. Cool. So you have models, you have tasks, they're in apps.
Let's deploy it. So how are we going to deploy these tasks? Running a web server isn't actually going to help us here. What we really need is something that acts like cron or crontabs, if you're familiar with that Unix concept, but that can run in a distributed cloud environment. So we want to be able to run our tasks anytime
we want at frequent and deterministic times and outside of the request-response cycle. And this is the part of code where hackathon and production differ most. So I'm going to talk about two different ways to run these tasks, both of which you can deploy to Heroku or a lot of other cloud environments.
So let's see the hackathon way first. I'm using Heroku. It has an add-on called Scheduler. And we'll use that to run management commands. So you're already familiar with the Django management commands like makemigrations or runserver probably. And you can write your own management commands too. There's boilerplate in the Django docs,
and you essentially just copy that. And that's a great way to get started. In this case, we'll just have a simple management command that wraps the task that we already have. And then to get it running on Heroku Scheduler, there's this free add-on with a cute little web interface to set up to run your tasks. So it's pretty easy to set up, which is good.
The downside to using this, which makes it only suitable for hackathon situations, is that it has very limited frequencies at which it lets you run the tasks. So you can do daily, hourly, or every 10 minutes. But if you want to do every five minutes, you are out of luck. And it also only operates as a best-effort service. So I have noticed it skip some tasks
if you let it run long enough, and then you're sad when you don't have that one data point. But it's awesome for hackathons. If you need to go into production and do it for real, you'll want both more flexibility and more reliability than these free add-ons can give you. So for production, you've got to eat your veggies.
So we're going to use a package called Celery. What is Celery? Celery is a distributed message queuing system for asynchronous stuff. Most tutorials for Celery focus on how it's really good for long-running, event-driven
background tasks, like maybe sending the user a sign-in email after they signed up to your site. Something that doesn't come up as much in tutorials, but that we're going to be using it for today, is scheduling periodic tasks. So in a generic app, you may want to run some daily analytics, like Clinton was just
using for aggregating all those data for the police calls. But here, we're going to be using it for doing our Cron in the cloud solution. The main reason I don't recommend Celery for hackathons is that it complicates your architecture a lot. So I'll do a brief overview of everything
that you need to have set up to run a Celery app. So we have our web servers. Those can trigger the event-driven tasks. We also run a scheduler server that can trigger the periodic tasks. Both of those put messages on a message broker transport queue. That's what Celery calls it.
But it's just a first-in, first-out queue where tasks get put. They can then get consumed by worker servers that actually do the work of running that task function. So maybe they'll go out and hit external APIs. Maybe they'll put things in the database. They'll do whatever you need them to do.
And if there is return data from that function, it will go into something called the results store. And there's a bunch of different options for backing services for both the queue and the results store. Redis is a popular one because it's really good for both of them. RabbitMQ and other AMQP services are also good for the queue,
but not for the results store. So to make it more concrete, for our IoT application, we're going to have every few minutes our scheduler is going to put a do something awesome message into the queue. Our worker will pull it off. It will decide whatever you want to do,
maybe turn on our light bulb. It will send the API request to our vendor to say, turn on the light bulb. The vendor will return, yes, it's on now. Good job. The worker will then store that new status in the database and return the primary key to the results store.
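The flow just described can be modelled as a toy with the standard library. This is not Celery itself. `queue.Queue` stands in for the message broker, and plain dictionaries stand in for the results store, the database, and the vendor's cloud; every name here is hypothetical.

```python
import queue

broker = queue.Queue()  # message queue: first in, first out
results_store = {}      # task id -> return value
status_table = {}       # stand-in for the database: status pk -> row
vendor_state = {}       # stand-in for the vendor's cloud: device pk -> is_on


def do_something_awesome(device_pk):
    is_on = True                       # placeholder for your real decision
    vendor_state[device_pk] = is_on    # "API request" to the vendor
    confirmed = vendor_state[device_pk]
    status_pk = len(status_table) + 1  # store the new status in the "database"
    status_table[status_pk] = {"device_pk": device_pk, "is_on": confirmed}
    return status_pk                   # the primary key goes to the results store


TASKS = {"do_something_awesome": do_something_awesome}


def scheduler_tick(device_pk):
    """Scheduler role: put a message on the queue; it does none of the work."""
    broker.put(("do_something_awesome", (device_pk,)))


def run_worker():
    """Worker role: pull messages off the queue and run the named task."""
    task_id = 0
    while not broker.empty():
        name, args = broker.get()
        task_id += 1
        results_store[task_id] = TASKS[name](*args)


scheduler_tick(1)
scheduler_tick(2)
run_worker()
```

The scheduler and the worker share nothing but the queue, which is why they can run on separate servers.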
So once you've got all that architecture configured, actually changing our plain Python tasks into Celery tasks is really easy. You just add a decorator. And then to configure when the tasks are going to run, there is this dictionary syntax.
And the syntax for defining the periodicity itself is pretty similar to Unix crontab. So you can have really flexible intervals anywhere between once a minute and once a year, including things like day of the week or only at noon on Tuesdays or anything like that.
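A sketch of what such a schedule dictionary can look like. The entry name, the dotted task path, and the argument are hypothetical. The setting was called `CELERYBEAT_SCHEDULE` in Celery 3.x and is `beat_schedule` from Celery 4.0 on. To keep the sketch runnable with only the standard library, the interval here is a `timedelta`; in a real project the `schedule` value could instead be a `celery.schedules.crontab(...)` object for cron-like rules.

```python
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    # Hypothetical entry name; any unique string works as the key.
    "do-something-awesome-every-5-min": {
        # Hypothetical dotted path to the task function.
        "task": "interactions.tasks.do_something_awesome",
        # Run every five minutes, which Heroku Scheduler cannot do.
        "schedule": timedelta(minutes=5),
        # Static positional arguments for the task, here a device primary key.
        "args": (1,),
    },
}
```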
And after all my talk about passing primary keys around, you're probably tempted to do a query here and pass the results as arguments into the task, right? Unfortunately, this doesn't work very well because this dictionary is only evaluated once, like anything else in your settings file, for instance.
So you can only use static arguments here if that's an option for you. If that's not an option, one good sort of workaround is to have a wrapper task that spawns all of the daughter tasks that you need. And so your database call only happens here.
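The wrapper-task workaround can be sketched in plain Python. `DEVICE_PKS` stands in for a database query and `enqueue` for putting a message on the queue (with Celery that would be calling `.delay(pk)` on the task); all of these names are hypothetical.

```python
DEVICE_PKS = [1, 2, 3]  # stand-in for a query over the device table
ENQUEUED = []           # stand-in for the message queue


def do_something_awesome(device_pk):
    return device_pk  # placeholder for the real per-device work


def enqueue(task, *args):
    """Hypothetical helper: record the call instead of running it right away."""
    ENQUEUED.append((task.__name__, args))


def do_something_awesome_for_all():
    """Wrapper task: takes no arguments, so it can be scheduled statically.

    The lookup happens here, each time the wrapper runs, so devices added
    after the schedule was defined are still picked up.
    """
    for pk in list(DEVICE_PKS):
        enqueue(do_something_awesome, pk)


DEVICE_PKS.append(4)  # a device added after the schedule was loaded
do_something_awesome_for_all()
```

Only the wrapper needs an entry in the schedule dictionary, and it needs no arguments at all.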
So with that, we have a final production-ready project. So we have the same apps as before at the bottom. The schedule file has been added to the interactions app. And then there's a bunch of other boilerplate to configure all of the Celery architecture that's
involved. So there are some pretty important subtleties in that that I don't have time to get into today. But the docs are good. And if you didn't memorize all that, I've also put together a cookiecutter template, github.com/aschn/cookiecutter-django-iot.
So this has all the boilerplate configured for you. It has all of the data models configured for you. All you have to do is write your actual interactions with the vendor API and whatever your do-something-awesome is. This should make things a lot faster for you to deploy, either with Celery or with the management
commands. That's all configured to run on Heroku. So hopefully, this will be helpful for your next hackathon. So what have we learned? IoT is fundamentally about time series data in a way that most other Django starter projects aren't. So that will mean bigger data.
And that will mean that you have to think a little bit differently about which of your models are going to be big and which of them are going to be small. IoT is also fundamentally about asynchronous processes instead of the standard MVC request response cycle. So think about putting the actually important stuff, not in views, but in somewhere that can be accessed
through tasks or something else. To run those tasks, an easy but really rigid way that's good for hackathon situations is Heroku Scheduler. And the flexible, more complex way is using Celery. And I keep talking about Heroku just because that's really common for hackathons. But Celery, of course, can run anywhere.
If you want to get started using some of this better and faster, then Cookiecutter Django IoT is ready for your perusal. And just to get a little more philosophical for a sec, getting back to the keynote this morning: IoT is typically thought of in very privileged situations.
It makes life easier for people who already have it really easy. But at WattTime, we think that the internet of things can be more powerful than that. So we like to use IoT for environmental applications. But I encourage you to think about ways in your life
where having devices respond to what's happening around them could be much more broadly impactful. I think I have a few minutes for questions.
Primary key? Were you saying make the time the primary key? And why? Well, you kind of went over that pretty quick. Sorry. So I haven't actually tried doing it that way. I just use the standard Django primary key. And then the other one was why use the Celery?
What benefits did you find using the Celery if you got a pass over cron? So if you're running on one of these platform as a service situations, you can't access cron. Oh, I guess he might have answered that question. I understand that Celery has the advantage of queuing
for asynchronous tasks. So the consumer producer model rather than invocation. But why Heroku versus cron? Is that also because you may not have access to cron on the devices?
I'm not sure I understand. Well, anyway, just setting up a job in a crontab and invoking like a Django command rather than having Heroku invoke the command. Yeah, so if you're running in an environment where you have access to cron and you know that you're going to only be running on that machine,
then cron can do the job for you. But in some sort of a more distributed situation where you can't rely on having access to those. From outside. Question about running Celery in production and what you're doing to monitor tasks and making sure that they're running and running successfully.
Yeah, I didn't have a lot of success with Celery Flower, however you say it. I wanted to like it, but I couldn't get a lot out of it. I actually don't use Redis for my production situations. I use CloudAMQP and other AMQP services.
And I find the monitoring on that is pretty good, but again, not quite as much detail as I want. So lately, I've been looking into using Librato and some of those sort of logging things. One nice thing that I just found out about Librato is that you can send alerts based on when lines show up
in your log file, so either based on the content or if your queue stops recording for 10 minutes, then you can have Librato trigger an alert based on that. So my intern is setting that up as we speak today, so I'm pretty excited to see how that works for us.
Did you use the task or periodic task decorator? I'm sorry. Sorry, what? Did you use the task or periodic task? Yeah, so the task decorator.
Yeah, there's a periodic task decorator where you can just feed it the crontab object, and is there a reason not to use it? I haven't, because I don't like to have the tasks only be available periodically. So if you do it this way, you can call it from multiple places in different ways.
What's the biggest challenge you've had to solve with doing an IoT project? I guess one application that we're working on right now is controlling smart thermostats. So we have data coming in about whether the power grid in Chicago, how much carbon pollution
there is in that electricity. So the API that I'm working with for that partner is a SOAP API, which has been incredibly annoying to deal with. So that little line where it's like status is actually a huge, really annoying method there.
So picking your partners carefully is important. Never mind another question. I noticed early on in your slides, you had a set status and then a status set create. And I'm wondering, what's the purpose of duplicating that? In other words, I'm thinking about the Ecobee thermostat where everything you do, it just queries everything again
from scratch. And I wonder if that model avoids desynchronization of the state of the device versus the database's view of the state of the device. Yeah, getting those out of sync can be a problem. For some devices, they only let you access maybe like 10 days of historical data.
And so having your own record of it can be really useful too, even if that does create the additional challenge of trying to keep them up to date. Makes sense maybe to just depend on re-asking the state if that doesn't take too long, rather than depending on the state in the database being. Yeah, so if you are able to ask the device for its own historical data,
that is likely to be more accurate. So if that's an option, do it. Last question. Do you have any kind of option to prevent multiple instances of the task from trying to update some device state at the same time? That is something that Celery has some helper functions for,
but it is complicated.