Overcoming access control in web APIs
Formal Metadata
Number of Parts | 130
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/49940 (DOI)
EuroPython 2020 | 63 / 130
Transcript: English (auto-generated)
00:06
Next talk here. So, our next speaker is a gentleman named Adam Hopkins. He is, let's see, okay, so he has been a builder of web applications for 20-plus years.
00:22
You have probably used his code, because he is the maintainer of the asyncio web server and framework known as Sanic. Learning about that led me, a while back, to look up where that term came from. I love it when they name a library after a meme. That just makes me really happy. He's a self-styled authentication nut, and he's going to be talking about, as you can see from his slide there,
00:47
he's going to be talking about overcoming access control in web APIs and addressing security concerns using Sanic. So, all of you API lovers are going to really enjoy this one.
01:00
So, everyone, please welcome Adam Hopkins. Hello, everyone. Okay, as I said, my name is Adam Hopkins. I am a senior software engineer with PacketFabric. PacketFabric is a network-as-a-service platform that provides cloud connectivity solutions
01:25
and has recently been named one of the top 10 hottest networking startups. As I also said, I'm one of the maintainers of the Sanic project, and I'm going to be using that to kind of showcase my talk here, but it's not really a Sanic talk itself.
01:43
So, with that said, we're going to be talking about security mainly, but we're not going to be talking about everything. We're not going to talk about TLS certificates, how you should maintain passwords and sensitive information, how you should store that, or how you should keep your server secure, SQL injection,
02:03
all that kind of stuff. I'd be very happy to talk with you about those, but we'll take that offline. What we really want to talk about is these two things: authentication and authorization. So, what is authentication? We're talking about, do I know who this person is that's trying to access my API?
02:23
That's the first step. That's this question here. Is the person logged in? If not, we're going to send them an unauthorized message. If they are, we're going to ask a second question. We're going to ask whether or not I should let this person in, and it's the same thing. If not, we're going to throw them an error message.
02:40
So, real quick, that's basically what we're talking about here. So, here is my endpoint. As you can see, we're going to serve up some really super top secret information. We don't want anybody to know this, that foo is bar, so we're going to try to protect this. And this is how we're going to do it inside of Sanic. Right now, if I were to hit that endpoint, it doesn't really do me any good,
03:02
because anybody that hits that endpoint is going to find out foo is bar. So, how are we going to protect that? Sanic is very similar to Flask in that it uses decorators pretty heavily. So, we're going to create ourselves this protected decorator.
03:22
What does protected need to do? It needs to do protection. If we can get through this do protection and we're happy with what happens, then we'll go ahead and we'll execute our handler. If not, we're going to bail out and we'll serve up our error. So, we've got this function up here and we're going to try to figure out what that should do.
03:42
Now, I just said that we're going to do this using decorators, but there's also another very valid way to handle things. One of the things that Sanic sort of prides itself on is not being very opinionated and trying to leave as many tools to the developer as possible, because after all, this is your application, this is your API.
04:03
So, another very valid way to handle authentication would be to create this middleware, and on every single request, before it even gets to the handlers, we'll execute this middleware and run the exact same do protection method. So, this is certainly a valid option.
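The middleware approach described here can be sketched without any framework. This is a minimal illustration, not Sanic's implementation: the `middleware` registry, `dispatch`, and the dict-shaped request are hypothetical stand-ins, and `Unauthorized` is a local placeholder for the framework's exception. In Sanic itself, request middleware is registered with `@app.middleware("request")`.

```python
import asyncio


class Unauthorized(Exception):
    """Local stand-in for a framework's 401 exception (assumption)."""
    status_code = 401


# Hypothetical registry of functions that run before every handler.
REQUEST_MIDDLEWARE = []


def middleware(func):
    """Register func so that dispatch() runs it ahead of the handler."""
    REQUEST_MIDDLEWARE.append(func)
    return func


@middleware
async def check_login(request):
    # The request is modelled as a plain dict for illustration only.
    if request.get("user") is None:
        raise Unauthorized("Not logged in")


async def dispatch(handler, request):
    """Run every registered middleware, then the handler itself."""
    for step in REQUEST_MIDDLEWARE:
        await step(request)
    return await handler(request)


async def top_secret(request):
    return {"foo": "bar"}
```

Because the check runs for every request, no individual handler needs a decorator; the trade-off is that public endpoints then need an explicit way to opt out.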
04:21
So, this is probably the most important thing. If you walk away with nothing else from this conversation, remember: an authentication failure is a 401, and that's Unauthorized. It's weird. I get it. It's sort of this legacy thing that's a hangover from the early days when the World Wide Web was still like the wild, wild west.
04:44
So, an authentication failure leads to Unauthorized. An authorization failure is Forbidden, a 403. So, remember that. Sanic is going to provide us with a very easy way to do both of these things. So, we've got these two exceptions. We're going to run do protection.
05:01
If we fail authentication, we raise Unauthorized. If we pass authentication but fail authorization, then we go on to Forbidden. Okay, so we know that, as we're currently set up, it works. It does what we want. So, the question is, how are we going to determine authentication? And this is sort of the meat of the talk here.
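The protected decorator and the two failure cases described so far can be sketched as follows. This is a framework-free illustration under stated assumptions: `Unauthorized` and `Forbidden` are local stand-ins for the exceptions of the same name in `sanic.exceptions`, the request is a plain dict, and the bodies of `is_authenticated` and `is_authorized` are placeholders.

```python
import asyncio
from functools import wraps


class Unauthorized(Exception):
    """Authentication failure: we do not know who this is (401)."""
    status_code = 401


class Forbidden(Exception):
    """Authorization failure: we know who this is, but they may not enter (403)."""
    status_code = 403


def is_authenticated(request):
    # Placeholder check (assumption): a real one would validate a token.
    return request.get("user") is not None


def is_authorized(request):
    # Placeholder check (assumption): a real one would compare scopes.
    return "admin" in request.get("roles", ())


def do_protection(request):
    if not is_authenticated(request):
        raise Unauthorized("Who are you?")
    if not is_authorized(request):
        raise Forbidden("You are not allowed in here.")


def protected(handler):
    """Run do_protection() first; only call the handler if it passes."""
    @wraps(handler)
    async def wrapper(request, *args, **kwargs):
        do_protection(request)
        return await handler(request, *args, **kwargs)
    return wrapper


@protected
async def top_secret(request):
    return {"foo": "bar"}
```

The ordering matters: authentication is checked first, so an anonymous caller always sees the 401 and never learns whether the resource would have been forbidden.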
05:23
So, inside of HTTP, there's a few different common ways of handling authentication, but we're going to eliminate right off the bat three of these. Basic and digest, again, are sort of legacy. They have some security issues and they're not really meant for handling the type of APIs that we're dealing with.
05:44
OAuth, great tool, probably a conversation for its own 30-minute talk at another time. So, again, this is something we could talk about offline. A lot of the concepts that we're going to talk about here are going to be applicable, but it's sort of a framework in its own right. What we're really talking about here are bearer tokens
06:00
(these are things that go inside of your headers) and session tokens. But I want you to forget that. Everything that you already know, you already have a preconceived notion of sessions, cookies, headers. Take that and just put it aside for a minute, because we're going to sort of build up our knowledge again. Okay, so first we're going to talk about this sort of hypothetical that I have.
06:24
We've got a train and our train offers two different types of tickets. You can have a session ticket or a non-session ticket. And something else that's going to be peculiar about our train is every single time that our train stops at a station, our conductor is going to check every single person's ticket to make sure that they're still valid.
06:44
Okay, so how's this going to work? In our session-based case, it's a single ride. You buy your ticket, you're going to go from one place to another place, you're going to get off and that's it. You can't use that ticket again. You have to buy a new ticket if you want to ride again. And every single time that you get to a station, the conductor is going to take his tablet,
07:01
he's going to look up your ticket number, he's going to say, yep, it's in our database. Yes, this is still valid. We haven't gotten to the end point. You can stay on the train. We're going to contrast that with what I'm calling non-session tickets. You're probably not going to see this language elsewhere. So don't go googling this, because you're not really going to see it.
07:23
This is sort of the all day pass. This is, I can get on and off as long as I want, as long as my ticket hasn't expired. And the really nice thing about this for the conductor is he's going to get a look at your ticket. And just by looking at the face of it, he knows whether or not he issued this.
07:40
He knows that this is valid. It hasn't expired. He doesn't have to go looking up inside of a database or anything like that. So how do those session based requests work? Our client is going to initiate a login with some credentials to our server. Our server is going to check them to make sure that they're OK. And it's going to create this thing called a session and it's going to store it somewhere.
08:03
And it's going to give a session ID back to our client. And every single time that that client wants to make a new request, they're going to use that session ID. We're going to go look that up in the database and then we're going to deliver the information. So this is a very tried and true method. You find this all over the web currently.
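The session-based flow just described (log in, store a session, hand back an ID, look it up on every request) can be sketched with the standard library. Everything here is an illustrative assumption: the in-memory `SESSIONS` dict stands in for a database or cache, and the credential check is a placeholder.

```python
import secrets
import time

# Hypothetical in-memory session store; a real service would use a
# database or cache shared between server processes.
SESSIONS = {}


def login(username, password, ttl_seconds=3600):
    """Check credentials and return a new session ID, or None on failure."""
    # Placeholder credential check, for illustration only.
    if password != "correct horse":
        return None
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {
        "user": username,
        "expires": time.time() + ttl_seconds,
    }
    return session_id


def lookup(session_id):
    """The per-request lookup: return the user for a live session, else None."""
    session = SESSIONS.get(session_id)
    if session is None or session["expires"] < time.time():
        SESSIONS.pop(session_id, None)
        return None
    return session["user"]
```

The session ID itself carries no information; it is only a random key, which is why every request needs the extra trip to the store.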
08:23
So the next thing we've got is these non-session based. And this is where we supply credentials. We're going to generate some kind of a token. We don't need to store this somewhere. There's no data store. And every single time that token comes back, we can just look at it to know whether or not it's valid and whether or not we want to provide the information.
08:40
OK, so let's take that information and hold on to it. We have sessions and non-sessions. Next thing we're going to decide is what's our strategy going to be? And we've got three questions here. Who is going to consume our API? Is it going to be a script? Is this going to be some sort of application? Is it going to be somebody using a curl command?
09:02
Do we have control over that client? And will there be a web browser that's going to be trying to get information from our API? Really what we're trying to get at is sort of that last question. Is this a direct access API? Is this a browser API, or is it going to be both?
09:22
Why do we want to know that? Well, first of all, direct APIs are a lot easier to handle. Number one, we're usually dealing with a more technically sophisticated user. It's usually going to be a programmer that's going to be doing this. And we kind of assume that they're doing what they can to protect their tokens and their keys.
09:43
So this is sort of the typical scenario. Maybe you've used, say, one of these web-based email providers, and you want to be able to send out an email. They're going to provide you a key. Those are your login credentials. You're going to send a request, you're going to send them that key, and you've got access to the API.
10:04
Then on the other side, we've got the browser-based. Now, this is really sort of our problem. The browsers are not really built to handle authentication the way that we need to do it these days. And because we have got less technically sophisticated users who might have installed a bad plugin,
10:24
or maybe the website itself has security flaws, we've got these two things called CSRF and XSS. Going into details about exactly what they are, let's do that in the conversation afterwards. But these are the two things that we're really trying to protect against.
10:40
And these are the things that we're concerned with. And so we want to figure out how we solve for that. Okay, so the problem with browsers comes down to these two questions here. How is the browser going to store my session ID or my token? You know, is it going to put it into a cookie? Is it going to store it somewhere else?
11:00
These are different tools that are built into the JavaScript libraries. And because they're built into JavaScript libraries, JavaScript can access them. If JavaScript can access them, then sort of the bad guys can get at them too. And that's really what the XSS attacks are that we're trying to figure out.
11:21
Now, the other thing we have to talk about is how is our browser going to send those cookies back to us? We've got two options. We have cookies and we have authentication headers. And so how is this sort of typically handled? If you went to Google and try to figure this out, what are they probably going to recommend?
11:44
Well, they're going to say, if you've got a session token, stick it in a cookie. You know, yeah, we've got to deal with this thing called CSRF, but we're going to solve that with a header. Now, I've put a link in the discussion forum to a repo.
12:03
And inside there, there's an example that goes into how you can do this. So let's, again, let's talk about that kind of offline and look at that example. And we can talk about CSRF, but just know that we're going to solve for these session-based cookies
12:20
by using an XSRF token header. Now, what you might find out there is people are going to say, well, you've got, you know, a token like a JWT. Well, how are you going to do that? You're going to send it in the Authorization header as a bearer token. The problem with this is that we're still open to XSS attacks.
12:44
So let's do a little recap. To answer the question of how do we authenticate, we need to know about session versus non-session. We need to know about direct APIs versus browser APIs. And we need to know about the different types of tokens we have here.
13:02
OK, so we know if we're going to have a direct API, we can use an API key in the authorization header and we feel OK that that's going to be safe. Or we can put a session ID inside of cookies in the browser and that's going to be safe. But what about when our API has to do both? What if we need to serve both a direct API and a browser
13:22
and we don't want to have two different methodologies to sort of authenticate? How are we going to do this without over-complicating our application? Second question is, what if we want to use JWTs inside of our front-end framework? How can we do that safely?
13:42
Well, I'm going to tell you that we can. All you need to do is you need to take the right pill here. So let's talk a little bit about what a JWT is. This is a mess of characters, but we'll notice that there are periods inside of a JWT.
14:00
And these periods break up the JWT into three different parts. I've color-coded them here into orange, blue and purple. And those represent three bits of information. The orange is our meta information. This tells us how we can read the rest of the token. The blue is sort of the money.
14:20
This is really where the value of JWTs comes in, because it allows us to take some actual usable information about who has logged in and actually serve that information back. And then lastly, we have the signature here.
14:41
It's really that sort of bit that allows us to take a look at this and know whether it's valid just on the face of it. Because basically what's going to happen is we're going to take this whole thing and we're going to create the signature, and it's going to be based off of a secret
15:03
that is going to only be known to our server. And so our server is the only one that's going to be able to generate this based off of the rest of this information. So how are we going to handle this? Well, what I'm going to suggest is let's create two cookies. We're going to have one cookie that's going to be called our access token.
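To make the three-part structure concrete, here is a hand-rolled HS256-style token built with the standard library only. It is a teaching sketch under assumptions: the `SECRET` value and the claim names are made up, there is no expiry handling, and a real application would use a maintained JWT library rather than this code.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # assumption: known only to the server


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input: str) -> str:
    """HMAC-SHA256 over 'header.payload', keyed with the server secret."""
    digest = hmac.new(SECRET, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return b64url(digest)


def encode(payload: dict) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{sign(signing_input)}"


def verify(token: str) -> bool:
    """True only if the signature matches the header and payload."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return False
    expected = sign(f"{header}.{payload}")
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))


token = encode({"user_id": 1, "scopes": ["user:read"]})
header_b64, payload_b64, signature_b64 = token.split(".")
# Anyone can read the payload; only the server can produce the signature.
claims = json.loads(b64url_decode(payload_b64))
```

Note that the payload is only encoded, not encrypted: it is readable by anyone holding the token, which is exactly what lets a front end use it.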
15:22
And that's just going to be the first two parts of our JWT. Then we're going to have the second cookie, which is what I call the access token signature. And that's just that third part. So we've broken up our JWT into different parts. And really the important thing to note here is this: HttpOnly.
15:40
And this is a browser feature. All major browsers support this now. So we can rely upon it. And what that's going to do is it is going to say, yes, I have a cookie, but this cookie can only go over HTTP requests. JavaScript cannot get at it. So that's great. Problem is, if our entire cookie were HttpOnly,
16:02
then we couldn't get at this section here in blue that we do want our front-end application to be able to get at. So that's why we don't have HttpOnly here. And we do have it down here. So cookies seem like they're going to do the job for us. So how can we handle this inside of our Python code?
16:22
And so what we need to do is we're going to take our token and we're going to split it on the last period into two parts: the header and the payload, and the signature. And then all we're going to do is we're going to create cookies. We're going to create an access token with our header and payload. And this is the important thing that I just mentioned.
16:42
HttpOnly is false. And then our signature, and we're going to put HttpOnly on for that one. Now I've also put in the CSRF token here, because typically you're going to want to do it at the same time. And again, this needs to be false here.
17:01
And the reason why is the way that we get that CSRF security is we want to be able to read that from our JavaScript and we want to be able to insert that into a header when we make the request. And the reason why we're going to do that is because now we know that it came from our JavaScript.
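The CSRF check described here, where JavaScript copies a readable cookie into a request header and the server compares the two, can be sketched as below. The cookie name `csrf_token` and header name `x-xsrf-token` are assumptions chosen for illustration, and cookies and headers are modelled as plain dicts.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Random value to place in a cookie that JavaScript is allowed to read."""
    return secrets.token_urlsafe(32)


def csrf_ok(cookies: dict, headers: dict) -> bool:
    """True when the header value echoes the cookie value.

    A page on another site can cause the browser to send the cookie, but it
    cannot read the cookie in order to copy it into a header.
    """
    cookie_value = cookies.get("csrf_token", "")
    header_value = headers.get("x-xsrf-token", "")
    if not cookie_value or not header_value:
        return False
    # Constant-time comparison to avoid leaking how many characters match.
    return hmac.compare_digest(cookie_value.encode("utf-8"), header_value.encode("utf-8"))
```

This only works because the CSRF cookie is deliberately not HttpOnly: the front end has to be able to read it.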
17:20
Okay, so this is sort of how you would set cookies inside of Sanic. It sort of acts and looks like a dictionary, but not really because as you can see here, this value is just a string. So it kind of looks like it, but just know it's something a little bit different.
17:40
So we can say our cookie key equals value and then set all this sort of meta information on it afterwards. So it looks like we have a winner. We solved the problem. We figured out how we can use a JWT. And the solution is with two cookies. We're going to store two cookies. We're going to send two cookies.
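Splitting the token across two cookies can be illustrated with the standard library's `http.cookies` module rather than Sanic's own cookie API. This is a sketch under assumptions: the cookie names follow the ones spoken in the talk, and `csrf_token` is a hypothetical name.

```python
from http.cookies import SimpleCookie


def build_cookies(token: str, csrf_token: str) -> SimpleCookie:
    """Split a JWT into a readable cookie and an HttpOnly signature cookie."""
    # Everything before the last period is header + payload.
    header_payload, signature = token.rsplit(".", 1)

    cookies = SimpleCookie()

    # Readable by JavaScript so the front end can use the claims.
    cookies["access_token"] = header_payload
    cookies["access_token"]["httponly"] = False

    # Hidden from JavaScript; without it the token cannot be reassembled.
    cookies["access_token_signature"] = signature
    cookies["access_token_signature"]["httponly"] = True

    # Readable by JavaScript so it can be echoed back in a header.
    cookies["csrf_token"] = csrf_token
    cookies["csrf_token"]["httponly"] = False
    return cookies
```

Calling `build_cookies(token, csrf).output()` renders the corresponding `Set-Cookie` header lines; only the signature cookie carries the `HttpOnly` attribute.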
18:01
And then we've got this header for CSRF protection here. So we've sort of solved both of these problems and we've got our nice little green check mark there. So what is this going to look like inside of those handlers that we set up at the very beginning? Well, when we do execute is authenticated,
18:21
we need to get that token. So how are we going to do that? We're going to go back into our cookies again. It's actually at this point, when we're inside of a request, that it actually is a dictionary. So we've got our tokens that we can take here and we can rebuild them back into one single thing.
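Rejoining the two cookie values and turning a decode failure into a simple True or False might look like the sketch below. The `decode` function and `InvalidToken` exception are hand-written stand-ins (assumptions) for whatever JWT library is in use; only the signature is checked here, whereas a real library would also check expiry and other claims.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # assumption: known only to the server


class InvalidToken(Exception):
    """Hypothetical error raised when a token fails validation."""


def expected_signature(header_payload: str) -> str:
    digest = hmac.new(SECRET, header_payload.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def decode(token: str) -> dict:
    """Return the claims, or raise InvalidToken if the signature is wrong."""
    header_payload, sep, signature = token.rpartition(".")
    good = expected_signature(header_payload)
    if not sep or not hmac.compare_digest(signature.encode("utf-8"), good.encode("utf-8")):
        raise InvalidToken("signature mismatch")
    payload_b64 = header_payload.split(".")[-1]
    return json.loads(base64.urlsafe_b64decode(payload_b64 + "=" * (-len(payload_b64) % 4)))


def is_authenticated(cookies: dict) -> bool:
    """Rebuild the token from its two cookies and report whether it decodes."""
    header_payload = cookies.get("access_token")
    signature = cookies.get("access_token_signature")
    if not header_payload or not signature:
        return False
    try:
        decode(f"{header_payload}.{signature}")
    except (InvalidToken, ValueError):
        return False
    return True
```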
18:40
So once we do that, we can try and decode it. Now this decode method is going to throw up all sorts of exceptions if that JWT is not valid. So all we need to do is just listen for the exceptions and return either true or false. So that's pretty simple there. Okay, so this is back. This is that thing that we're going to do
19:01
inside of our protection method. So we just figured out how we can do is authenticated. The next question is, how are we going to do is authorized. So we've got our method here. And this is where I'm going to introduce something. It's a little bit different.
19:21
This is something that I call structured scopes. Now, a couple years ago, when I was trying to build an API, I figured, okay, that's great. Let's go and let's add some scoping. I want to have certain users be able to access this X endpoint, but not this one. And how is this done? And I spent a lot of time trying to figure out
19:43
exactly what's the standard out there. And I was really actually kind of disappointed to find out that there was no standard. So I went, you know, and tried to figure out what's the best way to do it. And I came up with this idea. Maybe one of these days, if I ever, you know,
20:01
find the time to do it, maybe I'll submit an RFC or something about it. But basically what we've got is a string of characters, and we're going to separate it with these colons. The first one is our namespace and everything after that is going to be an action. So we're going to have two of these. And we have
20:20
this is what we're going to validate against. And this is what we've got. And we've got a check mark. So this is sort of what it looks like. Our requirement says we have user read and incoming, we've got user read, right? And this is valid. So we're just going to stick this inside of our handler. So now all we need to do is put this on our endpoint
20:44
and we can just proceed on with JavaScript. We can feel comfortable that our cookies are going to be sent back and forth. That's great. But is there a better way? And that's this package here that I created. It's going to set up a lot of this stuff for you.
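Returning to the structured scopes described a moment ago (a namespace followed by colon-separated actions, with a required scope validated against an incoming one), a simplified matcher could look like this. The matching rules below are an assumption made for illustration and are not taken from the talk or from the package: a namespace-only incoming scope is treated as granting every action, and otherwise every required action must be present.

```python
def parse_scope(scope: str):
    """Split 'namespace:action:action' into (namespace, set_of_actions)."""
    namespace, *actions = scope.split(":")
    return namespace, set(actions)


def scope_allows(required: str, inbound: str) -> bool:
    """Simplified rule set (assumption, see the lead-in)."""
    req_ns, req_actions = parse_scope(required)
    in_ns, in_actions = parse_scope(inbound)
    if req_ns != in_ns:
        return False
    if not in_actions:
        # Namespace-only grant: treated here as access to every action.
        return True
    # A bare required namespace is not satisfied by a narrower grant.
    return bool(req_actions) and req_actions <= in_actions


def is_authorized(required: str, inbound_scopes) -> bool:
    """True if any one of the caller's scopes satisfies the requirement."""
    return any(scope_allows(required, scope) for scope in inbound_scopes)
```

For example, under these rules a requirement of `user:read` is met by an incoming `user:read` or `user:read:write`, but not by `user:write` or `admin:read`.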
21:01
So the important things that I want to point out to you here is we're going to be sending our tokens with cookies. We're going to be splitting them up into two parts, but we're also going to say cookie strict equals false. And this allows us to fall back to those authorization headers. So now our API is able to do both the direct API and the browser-based.
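The fallback behaviour described here, where split cookies are tried first and the Authorization header second, can be sketched as a small extraction function. This illustrates the idea only and is not the package's code: the cookie names, the lowercase header key, and the `cookie_strict` parameter name are assumptions modelled on the wording of the talk.

```python
def extract_token(cookies: dict, headers: dict, cookie_strict: bool = False):
    """Return the token from split cookies, else from a bearer header, else None."""
    header_payload = cookies.get("access_token")
    signature = cookies.get("access_token_signature")
    if header_payload and signature:
        # Browser client: rejoin the two cookie halves.
        return f"{header_payload}.{signature}"

    if cookie_strict:
        # Strict mode: cookies are the only accepted transport.
        return None

    # Direct API client: expect 'Authorization: Bearer <token>'.
    scheme, _, value = headers.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    return None
```

With one extraction function serving both transports, the same protection logic can sit in front of browser clients and direct API clients alike.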
21:24
Sanic-JWT added these three endpoints for us. So this is what our whole application looks like currently with fairly minimal code. And that is sort of, that's it. That's what I've got for now. Any questions, I'd be happy to answer them here
21:41
or in the channel later. Adam, thank you so much for that excellent talk. Okay, come on computer. It's doing it to me again. Oh, this is always embarrassing when I think I need to work correctly.
22:02
Here we go. All right. And I do in fact have a couple of questions here for you. So Goose asks, why do we not need to store a token for a non-session-based ticket?
22:23
Well, it's not a question of do we need to. It's we're trying to figure out if there's a way that we can avoid having to have session tokens. So in one world, we need to check sessions and in another world, we don't. One of the sort of advantages of not having it
22:42
is we just took away a round-trip call to our database. So with session tokens, you've got this extra call that you have to make to the database on every single request, and doing that adds more network time. So that's gonna inherently slow down your application. So one thing that you do gain by this
23:01
is that by removing that, you can make your application a little bit faster. Okay. Cesar asks, sorry: is it working with HTTP/2 or HTTP/3, or is it not dependent on that? It's a little bit... HTTP/1.1, 2, 3.
23:24
It's sort of a little bit of a different conversation. That's gonna be more at the server level. So you can run Sanic over HTTP2, but it would handle this stuff pretty much the exact same way. The one thing to know about HTTP2
23:41
is you're sort of forced into using TLS certificates, which is a good thing because that's gonna obviously add some additional security layers for you with encryption. All right. Once again, thank you very much. Absolutely.