REST: It's not just for servers
Formal Metadata
Part Number | 38
Number of Parts | 44
License | CC Attribution - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/32850 (DOI)
DjangoCon US 2014, 38 / 44
Transcript: English (auto-generated)
00:23
As I said, my name is Mark. I work at Caktus Consulting Group. I didn't just get a free t-shirt. There's my Twitter handle if you want to tweet at me any feedback that you have. This is the part of the talk where I'm supposed to tell you I'm really smart,
00:42
you should listen to me. I'm the technical director at Caktus Group. I'm the co-author of Lightweight Django and we build REST APIs at Caktus. We interact with a lot of REST APIs at Caktus. This is really about both sides. My examples here are in Python, as you might
01:15
imagine, but there's nothing really Django specific about this. It's more about REST as a concept.
01:25
That's not really a full introduction. This is the real me. This is me with my family, completely broken, with the grimmest smile, in an unbelievable amount of pain. This is when
01:42
I'm real and it only happens for a few fleeting moments every year. This talk is about REST. What is REST? Sometimes it feels like just a marketing term. It feels like something like
02:05
responsive. People use it and I don't know what it means. Sounds good. Let's call our site responsive. Let's call our API RESTful. It means something. It stands for representational state transfer. This concept is defined by Roy Fielding's Ph.D. thesis at UC Irvine
02:27
in 2000. Again, it's a term that's been abused by web services and marketing teams basically every day since then. REST is not a format or a protocol. It's an architectural
02:48
style. It's a way of building applications. In his thesis, he states that it's an architectural style for building distributed hypermedia systems. It emphasizes scalability, generality,
03:09
independent deployment, reusable components, even more buzzwords of things people want.
03:21
But there are some constraints. If you want your API to be RESTful, you need to satisfy some rules. The first of these rules is that it needs to be a client-server model. The server needs to be stateless. No client context should be stored on the server between requests.
03:48
The system should be cacheable and layered: a client shouldn't know whether it's talking directly to the server or to an intermediary proxy. There should be a uniform interface
04:03
for how the client and server talk to one another. There should be a unique identification of resources and self-descriptive messages passed between the client and the server. There's also an optional constraint, code on demand. It's kind of weird. It doesn't really make sense.
04:21
Not going to touch on it in this talk. Again, when you buy into this, when you commit and say, I'm going to build a RESTful service, I'm going to build a client server model, and I'm going to build this stateless server, these amazingly self-descriptive messages,
04:42
what do you get for that? Well, you're supposed to get performance in terms of scalability, simplicity, modifiability. This comes from, again, this cacheability, this separation of concerns that pieces can be scaled independently. So it's not surprising
05:08
that this is an extraordinarily popular architectural style of late. If you name a web service, it probably has what it calls a RESTful API. Social media sites, Twitter, Facebook, Instagram,
05:25
Pinterest, cloud services, like AWS or Rackspace or OpenShift, Cloud Foundry, the list just goes on and on. Those are just the ones that have public APIs. Many people use this
05:45
just internally. They build mobile applications on top of a RESTful API. Public APIs just love open source clients. That's what this talk is about. It's about the client.
06:09
This is one of the coolest things, I think, that is happening right now on the web. The opening up of public APIs through OAuth or whatever mechanisms you have is why there
06:24
were Twitter clients before there was the one true Twitter client and why you have services like Travis that run on top of GitHub and how Travis can call Coveralls and
06:41
Travis can deploy to Heroku. This decoupling of services that scale independently, that are built independently, that's what the dream of REST is. It happens with two components.
07:06
One, you need a server. If you want to learn how to build RESTful servers in Django, there have been like half a dozen talks already about it. That's not this talk. This talk
07:23
is about the clients and particularly the challenges for clients. Some of these come from being the client and some of these come from just the interaction between the client and the server. Sometimes you think, well, these are HTTP. HTTP, I understand
07:46
as a web developer. I can just import urllib or import requests and I can just start going. Writing a client that sort of works, like Brandon said, is not really
08:01
hard. Getting one that works well is hard, and maintaining one as APIs change is hard. There's technical challenges that we're going to cover and look through. There's non-technical challenges, like really terrible terms of service, which I'm not going to cover.
08:23
So what are some of the technical challenges? These are things like changing APIs. When you're talking to a public API, you don't have the choice of how the API will change.
08:41
It's a client-server model and you don't get to have any say on the other half. That's frustrating. That's hard. It means potentially supporting multiple versions of the API in a client wrapper, making it clear which versions you are supporting, understanding
09:00
how the server versions content, which is a hotly debated topic in REST. For example, Twitter deprecated their 1.0 API. Now all the URLs have a 1.1. These two URLs in particular are the ones that bit me in a project. They changed. They have
09:27
identical responses, which is even more frustrating. It's a two-character change, and my client broke because it hit a 404. This is a relatively simple and easy change to make. The larger
09:48
changes to APIs can be harder to deal with. A few years ago, Twitter got rid of their basic auth and they switched everything to OAuth. A funny story is that we actually
10:01
had a client a couple of years ago who wanted us to upgrade a site. The site was running Django 1.0. We upgraded it to 1.5. There was a piece of the site where they could tweet things that happened. A new blog post would come up and they could hit a
10:22
button and tweet about it. It used basic auth. I, as diplomatically as I could, said, is anyone using this piece? They said they would check whether anyone was using this piece, but I was fairly certain they were not using the piece because it hadn't worked
10:41
in two years. They eventually agreed that we could remove it rather than updating the client because it hadn't been used. A big challenge for clients is servers,
11:03
in particular servers that really don't meet what I would say all of the constraints that are necessary. In particular, the uniform interface constraint, as it's defined, defines
11:21
the concept of hypermedia as the engine of application state. It's usually shortened to HATEOAS. I'm not entirely sure how it's pronounced, but it's not usually implemented in a way that's helpful for the client. There's debate as to whether it's necessary,
11:47
whether it's helpful. I will tell you it is helpful and I will show you how it's helpful. The idea is that you should build discoverable APIs. The server should tell
12:08
the client how it can navigate through the API and how it can find resources that exist. This shouldn't really be such a controversial topic.
12:22
This is exactly how we build websites. We have links on pages and you navigate through them to find related pages. You have forms and the forms have actions and they tell the browser where to submit and how to submit. The same types of concepts apply here.
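To make the link-following idea concrete, here is a minimal sketch of a client that navigates by the links a server advertises rather than by URLs it constructs itself. Everything in it is an assumption for illustration: the `links` key, the `href` field, the example.com URLs, and the `follow` helper are hypothetical and not taken from any particular API.

```python
def follow(resource, rel, fetch):
    """Return the resource the server linked under `rel`.

    `resource` is a decoded JSON response, `rel` is the name of the link,
    and `fetch` is any callable that takes a URL and returns decoded JSON.
    """
    links = resource.get("links", {})
    if rel not in links:
        raise KeyError("server did not advertise a %r link" % rel)
    return fetch(links[rel]["href"])


# Hypothetical canned responses standing in for a real HTTP layer.
FAKE_SERVER = {
    "https://api.example.com/users/mark": {
        "username": "mark",
        "links": {
            "repositories": {"href": "https://api.example.com/repos?owner=mark"},
        },
    },
    "https://api.example.com/repos?owner=mark": {"values": ["rest-talk"]},
}

profile = FAKE_SERVER["https://api.example.com/users/mark"]
# The client never builds the repositories URL; it uses the one it was given.
repos = follow(profile, "repositories", FAKE_SERVER.__getitem__)
```

If the server later moves the repositories resource, only the `href` in the response changes and this client code keeps working.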
12:47
When you don't have a discoverable API, how is the API discovered? It's discovered by humans. Humans are terrible. You have to read a giant pile of docs. Again, when
13:02
APIs change, clients don't know, humans have to know, and humans have to read more docs. Instead of relying on documentation, you have to build discoverable APIs. How do you build a discoverable API? I have an example of a change that Bitbucket
13:25
made. You don't have to see the whole example, but this is my Bitbucket profile on version one of the Bitbucket API. There's a tiny little section at the top which is my user profile information, which is what I asked for. Then there's a whole
13:46
pile of information that I didn't ask for, which is every repository that I have, all of the information about every repository that I have, and all the sub-information about all the forks that I've created down the line. You can see the scroll bar
14:05
of how ridiculously long this response is. I don't even use Bitbucket that much. I have like five repositories. What they did in version two was they normalized this, if you would, in a database sense. When I ask for my profile in version two of the API, I actually
14:28
get my profile information at the top level. It's not buried into a user key. Then beyond that, there's a set of links. They say, if you want other information about
14:42
this user, here's where you can find it. You can find your repositories here. You can find Mark's followers here. You can find my avatar over here. Again, going back to the keynote, you can think about how these can be cached differently. In this first
15:06
response, when anything about any one of my repositories changes, this cached response has to be invalidated. Now here, I have a small, concise payload that can be cached
15:26
when I ask for my profile. When my repositories change, this response doesn't change. Those
15:41
are a couple challenges on the server side. Building RESTful clients is also a challenge because some environments have weak HTTP support. This is typically in a browser-based environment. As web developers, you probably don't get to always use Python to do your
16:05
API interactions. When I say that some browsers have weak HTTP support, I mean IE has terrible HTTP support. When you do CORS, or cross-domain requests, it's basically
16:21
completely broken. There's no DELETE or PUT when you do cross-domain requests. The content type is broken. All the things that you would want when really building a robust API are not there in IE. Servers need to work around this. Django REST framework,
16:48
as many have talked about here today, has ways of doing this. There's a commonly used HTTP header, X-HTTP-Method-Override, to say, this is a POST, but it should have been a DELETE, so treat it like a DELETE. If you control the server, that's a facility
17:08
that you have. If you don't control the server, sorry. One of the biggest problems for clients, though, is managing state. I said that it's a client-server
17:27
model and it should be stateless. Well, not entirely stateless. The protocol is stateless. HTTP is stateless. The server is stateless. Guess who has to manage
17:40
state? The client. It's like herding cats. It's a pain. Particularly when you're trying to build a general client library, managing state is difficult. You don't know what state the person who's going to use your client is interested in, but you need to
18:01
help them with some basic pieces. That's what I'm going to talk about. So what should you do to build a good client? What are some best practices in building a REST API client? Well, first thing is to build useful objects. And this almost sounds
18:23
like a tautology or something that's stupidly obvious. If you want to build something useful, it should return useful objects, but you would be surprised if you went through GitHub and saw how many API clients basically import requests, do the request, spit
18:41
back a dictionary blob. You should provide useful objects that translate these dictionaries, these JSON blobs, into meaningful business objects for the API and they should help you link to related resources. They should help you perform actions because
19:04
you're not just asking for a dictionary. I'm asking for a thing. I'm asking for my user profile. I'm asking for a repository. There are things I want to do with my profile and there are things I want to do with my repository. Maybe I want to delete my repository.
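A minimal sketch of what such a business object might look like, assuming a JSON payload with `name` and `url` keys. The `Repository` class, the `send` callable, and the `RecordingTransport` stand-in are all hypothetical names made up for this illustration; a real client would wrap an HTTP library here.

```python
class Repository:
    """A repository resource that knows how to act on itself."""

    def __init__(self, data, send):
        self._data = dict(data)
        # send(method, url, body=None) performs the request and returns
        # the decoded response body (or None when there is none).
        self._send = send

    @property
    def name(self):
        return self._data["name"]

    def update(self, **changes):
        # PUT replaces the resource, so send the full representation,
        # not just the changed fields.
        body = dict(self._data, **changes)
        self._data = self._send("PUT", self._data["url"], body)

    def delete(self):
        self._send("DELETE", self._data["url"])


class RecordingTransport:
    """Hypothetical stand-in for HTTP: records calls and echoes the body."""

    def __init__(self):
        self.calls = []

    def __call__(self, method, url, body=None):
        self.calls.append((method, url, body))
        return body


send = RecordingTransport()
repo = Repository(
    {"name": "rest-talk", "url": "https://api.example.com/repos/1"}, send
)
repo.update(name="rest-clients")
repo.delete()
```

The caller works with `repo.update(...)` and `repo.delete()` and never touches a URL or a raw dictionary; the URL used is the one the server supplied in the payload.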
19:20
Maybe I want to see what the last commit on my repository is. Maybe I want to update my user profile. Those types of things are what you should be building in your Python clients. So here's an example from Twilio Python. They're right out in the hall if
19:46
you want to bug them. So Twilio, if you haven't gone to their table, I don't have any affiliation with Twilio, just to be clear. They didn't pay me to put this on there. I didn't know they would be out there. Twilio is an SMS gateway and you can purchase
20:11
numbers. Well, they do more than just that; they're a telephony gateway. They translate SMS and voice into HTTP. So you can purchase new numbers. You can send SMSes. These are
20:23
all the things you might want to do. And when you use the Python Twilio wrapper and you search for available phone numbers, you get back one of these. You get an instance of available phone number. And available phone numbers do one thing that's really
20:42
helpful, which is they know how to purchase themselves, which is probably why you are searching for available phone numbers. How do you use it? You construct an instance of the client with your credentials. I say, search for phone numbers in the 919 area code.
21:02
And then if there's a number, just buy the first one. I don't care what the other digits are. I just want it in this area code. I don't know anything about the URLs. As the person using this client, I don't want to know anything about the URL. That's why I'm using the client. I don't want to know that to do a purchase, I need to take
21:23
the response from the search. I need to do a post to another place. I just want to purchase the thing. That's what this wrapper does. This is a great example of writing a RESTful client. Another piece of useful information that you want to track, Brandon
21:44
touched on this a lot, cache headers. This is a piece of state that the client needs to track. If the server is giving you ETags and Last-Modified headers, and you don't send them back, you're not holding up your end of this cacheable bargain. It's
22:08
not, the system isn't cacheable if you're not respecting the cache headers. So, these useful objects that you create for your API clients should help you track
22:23
the cache headers. This is something that's easy to take for granted in Python because it just kind of happens in the browser. Your browser is really smart about tracking ETags, tracking Last-Modified, knowing where the resource was held locally. But
22:41
in Python, you need to take a little more care. So, an example of a client library that does this is the github3.py Python wrapper. They have a series of objects that build upon one another, and the core GitHub object has a refresh. You know, I fetched
23:04
Mark's profile, and I did some things, and I'm going to update it. What I want to do is make sure I've got the most recent copy before I make my update, because I only want to update one field. But this is REST, so I have to send the whole thing. So, the
23:25
body has the Last-Modified header and the ETag header. That's actually handled by the parent class, where last modified and ETag are set. And this is used like
23:41
this. I would log in, and I can get the user who is currently logged in. That's not my GitHub password. Don't try to log in with me. Maybe it should be. No one would ever suspect that that's my password. But I get the user who's currently logged in, and
24:06
I can see my ETag. That's so cool. I don't know why, but I like that. I want to know what ETag did GitHub give me? What do they think of me? What MD5 hash really represents
24:22
me as GitHub user? And I can do a conditional refresh. I can say, get me my profile if it hasn't changed. This is the version I have. Is there a newer version of Mark's profile? It doesn't update that much. And the cool thing about this, for GitHub in particular,
24:46
if you get a 304, if you get a Not Modified response, it does not count towards your API rate limit. And that's probably because it doesn't hit their application servers. It probably just hits their Varnish server. And they repay that. They pay it forward
25:02
and say, this doesn't count towards your API limit, because you didn't hit. You didn't get a response. It was not modified. So you want to avoid hard coding paths. This is another thing to do when building API clients. You should use the URLs and links
25:29
that are returned from the server when they're given. And we saw they're given by Bitbucket. Sadly, most of the Bitbucket clients haven't been updated for version
25:41
two. They're used by GitHub. The Python library, github3.py, uses the responses sent back by GitHub. But I already used them as an example. So a more common place where these are given is in pagination. If you get a list of things,
26:07
many APIs, while they don't give you a nice block of related links on detail pages, do usually provide a next or previous URL when you paginate large objects. So this is pyrax.
26:26
This is the Rackspace API. The entire method here isn't shown. It's kind of long. But this is their Cloud DNS manager. So you can search through all the DNS records that you
26:45
have managed by Rackspace. I don't know how many DNS records you have that fits on more than one page, but I don't know what you do. So here you can see it parses out the links from the bodies and says, like, is there a next URL? Is there a previous
27:04
URL? And hold on to it. And then when it lists through them, it just iterates through them. You have this option to say list all, and it will just yield. It will just keep making API calls. It will go through all the things in the response, and it says, oh, I need
27:22
to fetch the next one and the next one and the next one. This is pretty fantastic. I can tell you that this hasn't worked for me before on Facebook where they actually gave me the wrong next URL, and I just looped and looped for eternity until I hit the rate limit. But when your server works, this works. So here's how
27:52
you use it. Setting up the credentials is kind of weird on pyrax, but that's not what
28:01
this is about. This is about iterating through all the things. So in their Cloud DNS, you can search for records, and rightfully so, the search by default lists all the results. It doesn't just search the first page of results. It will yield all the results.
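The iterate-through-every-page behavior described here can be sketched as a generator. This is not the pyrax implementation; it is a minimal illustration that assumes each page is a dictionary with an `items` list and a `next` URL (both hypothetical key names). It also guards against the failure mode mentioned earlier, where a server hands back a `next` URL that loops forever.

```python
def iter_all(first_url, fetch):
    """Yield every item across all pages, following server-provided URLs.

    `fetch` is any callable that takes a URL and returns a decoded page.
    """
    seen = set()
    url = first_url
    while url:
        if url in seen:
            # The server sent us back to a page we already visited.
            raise RuntimeError("pagination loop detected at %s" % url)
        seen.add(url)
        page = fetch(url)
        for item in page["items"]:
            yield item
        url = page.get("next")  # None or missing means this was the last page


# Hypothetical canned pages standing in for real API responses.
PAGES = {
    "/records?page=1": {"items": ["a", "b"], "next": "/records?page=2"},
    "/records?page=2": {"items": ["c"], "next": None},
}

records = list(iter_all("/records?page=1", PAGES.__getitem__))
```

Because it is a generator, each additional page is requested only when the caller iterates far enough to need it.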
28:23
So I can find all the CNAMEs that I've set up for example.com, which is probably not very many. But if you're building a software as a service platform where you may have a lot of CNAME records, where you're mapping your domain to a client domain, you can have a lot. And this is, again, kind of good and bad. It makes it really easy to use.
28:48
It abstracts away the fact that there is pagination here. It also hides how many API calls this is going to take. It's not immediately obvious how many API calls that will make or how fast it will make them. It will basically make them as fast as I can iterate
29:04
through the list. But, again, as someone using the client, I don't want to do that pagination myself. So I'm happy that they do this for me.
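Going back to the cache headers discussed earlier, the conditional refresh can be sketched in the same spirit. This is not github3.py's code; it is a minimal illustration in which `CachedResource`, the `fetch` callable, and `fake_fetch` are hypothetical. The only real-world pieces are the standard HTTP headers (`ETag`, `Last-Modified`, `If-None-Match`, `If-Modified-Since`) and the 304 Not Modified status.

```python
class CachedResource:
    """Tracks the cache validators the server sent and replays them."""

    def __init__(self, url, fetch):
        self.url = url
        # fetch(url, headers) -> (status_code, response_headers, body)
        self._fetch = fetch
        self.etag = None
        self.last_modified = None
        self.data = None

    def refresh(self):
        """Re-fetch the resource; return True if the local copy changed."""
        headers = {}
        if self.etag:
            headers["If-None-Match"] = self.etag
        if self.last_modified:
            headers["If-Modified-Since"] = self.last_modified
        status, response_headers, body = self._fetch(self.url, headers)
        if status == 304:
            return False  # Not Modified: keep the copy we already have
        self.etag = response_headers.get("ETag")
        self.last_modified = response_headers.get("Last-Modified")
        self.data = body
        return True


def fake_fetch(url, headers):
    """Hypothetical server: answers 304 when the client's ETag matches."""
    if headers.get("If-None-Match") == '"abc123"':
        return 304, {}, None
    return 200, {"ETag": '"abc123"'}, {"login": "mark"}


profile = CachedResource("https://api.example.com/user", fake_fetch)
first = profile.refresh()   # full response, validators stored
second = profile.refresh()  # validators sent back, server says 304
```

The second call sends the stored ETag back, so the server can answer without resending the body, which is the client holding up its end of the cacheable bargain.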
29:24
So those are some things to do. There's one thing that I really want you to stop doing. This is my plea for the Python community and the Django community, is to stop making REST clients, which are basically glorified URL builders. If your REST client only builds
29:50
URLs, like, the server has failed, the client has failed, you're not really doing anything more helpful than just string formatting, but you're adding, like, syntactic
30:04
sugar on top of it. You can't really make a general REST API client at this point. There's not enough standardization of message formats yet. There's some work in that space. Anything claiming to be a general REST API client is kind of missing
30:22
the point, and they're lying to you. There's no shortcuts. The business objects that come back from a REST API, you have to understand the API. There's no generality to that. And each API is a little bit different. So what's an example? And I don't mean to pick
30:46
on the developers of Slumber. They're probably very well-meaning people. But this is one that attempts to be a general REST API client. And it's very clever Python. This
31:01
is exceptionally clever Python. It translates method calls and attributes into URLs. It's so promising. But it gives you back dictionary blobs, which you then need to translate to
31:23
make additional calls, like the PUT call here to do the update, or to do the delete. These objects don't know how to delete themselves. They don't know how to update themselves. And when you look at this, again, it looks so elegant. It looks like the Python I love
31:45
to read. But what is kind of hidden here is that when the API changes from note to notes plural, because it's kind of awkward to have these like singular resource names,
32:07
I have to change all of these calls. This client hasn't saved me anything. In fact, it's added a layer of abstraction that I'm still building the URLs. I could have done
32:21
this with string formatting. I still have to translate all the objects. It doesn't help me with caching. It's just, like I said, added syntactic sugar over building URLs. So please stop doing this. Just to summarize what we talked about, we talked about REST.
32:49
REST is a client-server model, and servers are completely useless without clients. In fact, I don't know that you can really call something a REST API if it doesn't have a client, because it's a client-server model. So if you're gonna build a REST API, I would
33:11
encourage you to try to write a client. Understand the pain of navigating your API with a client.
33:21
And don't just show examples that have one request. It's really easy to make one request. It's hard to manage state over requests. You know, tracking this profile that I want to later update. So show large examples. Treat your API just like you treat your website.
33:45
Make it discoverable, browsable. Think about how clients are gonna navigate and find the data that they want. How are they gonna do the actions that they need to do? Again,
34:01
I think the Django REST framework does a great job with this, with the browsable API, because you get this experience in the browser to kind of click through. Like, how do I get to the next piece of data? And this is like documentation. When you have to explain how something works, you realize how terribly it works, and you get to redesign it before
34:24
it's too late. So some handy resources. There's some links here. I'll have links to the slides. The original doctoral thesis and this rant by Roy Fielding, "REST APIs must be
34:41
hypertext-driven." There's an RFC about constructing URI templates if you have more complex templates. This is something that GitHub does and the Python library uses. The links to the example ones. I didn't link to Slumber, because you shouldn't use it.
35:02
Here are the photo credits. Thank you for listening, and I'm Mark. I'm writing Lightweight Django. I actually have a few pre-release copies. I'm gonna be signing them at 12:30 at the Caktus table if you want to come by. We'll have one on the table if you
35:23
want to flip through it and look at it. Thanks for listening, and build great APIs.