Practical Django Security
Formal metadata
Part | 3
Number of parts | 44
License | CC Attribution-ShareAlike 4.0 International: You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or content, including in modified form, only under the terms of this license.
Identifiers | 10.5446/32846 (DOI)
DjangoCon US 2014, 3 / 44
Transcript: English (automatically generated)
00:21
Thank you very much. Thanks for coming to DjangoCon. Wonderful speakers here. I hope I can move up to the podium. Now, I generally speak about security in the Django meetup in New York, and my first time I did this, I spoke for about three hours, and I realized that maybe I should tone down the slides a little bit.
00:42
And these days, I largely speak about attacks, but when it came to posting content for DjangoCon, I figured, like, let's not look at attacks. Attacks change. Also, there's a video. I don't want someone learning the wrong things from this talk. So I've come up with a little bit of a unique approach to how we're gonna go through security.
01:03
So I'm employed by a wonderful company called Genesis. I used to be employed by a company called Matasano. How many of you have heard of Matasano? How many have done the crypto challenges? Okay, good. So at least we have some people. I did web application penetration testing
01:20
for about two years. I moved over to Genesis to do it more, to own security in-house, and I like to stay employed, so I'd like to mention in the contents that these slides are my own. Now, when it comes to speaking about security, it gets really tough. You know, everyone, you're always taught about security. They drag you into a big conference room in these large corporations if you work in one of those,
01:42
or in a small place, and you hear about it, you read about it on Hacker News and on Reddit, about different hacks. And it's really a hard problem. It still exists. And one of the major reasons for that is that your app or your app ecosystem is only as secure as its lowest common denominator. Take Google, for example.
02:02
One cross-site scripting bug within one Google app affects every single Google app because they implicitly trust each other. Now, that's really tough because the surface gets really, really big. Also, security is constantly changing. What was good yesterday is not good today. And as a caveat, these slides work for today.
02:22
They'll probably work a little bit longer, but you should take these points down and research them further, because they may change. So let's look at what we can do about that. Now, if you want to break down security vulnerabilities, they fall rather simply into two categories.
02:43
First, security controls are either missing, improperly designed, or improperly coded. These will cause you to be vulnerable to attacks like denial of service, where you don't have rate limiting; authentication bypass, because your authentication mechanism
03:01
isn't secure or is susceptible to bypass. And then the worst and most common is improperly implementing cryptographic primitives, which, you know, you're encrypting the data. The data looks encrypted but can easily be decrypted or can be modified. And then there is mixing data up with code,
03:22
and these largely revolve around remote code execution vulnerabilities. Now, while there are obviously applications that have remote code execution interfaces built into them, stuff like SQL injection and cross-site scripting, where an attacker can make their data execute as code on your server,
03:41
or at least within the context of cross-site scripting, within the client's browser, largely involves the data and code being mixed up. So let's look at these things in a little bit of a different way, right? So as opposed to looking at how to prevent the attacks, let's look to build our software to be more robust.
04:02
Let's not look at how we're being attacked. Let's look to build better software. Now, every language, every program has side effects. But there are side effects that we can prevent. And I feel that one of the first things in making a more secure program is building software that is incredibly assertive.
04:22
You must, to the best of your ability, assert as much as possible that what the user is giving you is what you intend it to be; not what you think it's going to be, what you intended. And that's very important. And the Zen of Python has this,
04:40
where it says explicit is better than implicit. Don't just take the input from the user. Assert it. Make sure it's exactly what you want. Now, let's look at the easy one, which is mixing code and data. And when we talk about mitigating cross-site scripting and SQL injection,
05:00
we talk about statements like output encoding or prepared statements or parameterized queries. And these statements sound very nice. And, you know, when you get hit by a cross-site scripting bug, they tell you, oh, you have to output encode. Or when you get hit by SQL injection, they'll tell you, you have to use parameterized queries,
05:22
prepared statements. But these are stopgap measures. Let's look at what these things actually are. So when we get hit by this bug, for example, let's say SQL injection, we have a SQL query. We've got data from the user. Now, do you know what that data is? Well, let's say it's supposed to be a number. Or let's say it's supposed to be text, right?
05:42
Now, it should only be text. And it should only be considered text by our SQL server. So if we explicitly put it into that category, at that point, we've mitigated it. And that's what prepared statements are. Prepared statements, or parameterized queries; they're called both things.
06:01
That's why I keep on using both of them, because literally every app ecosystem has another way of saying prepared statements or parameterized queries. But they explicitly tell the SQL parser: this is data. Do not execute this. Insert it wherever the data is required.
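As a minimal sketch of the point about marking input as data, the following uses Python's standard-library `sqlite3` module. The table, the `find_user` helper, and the hostile string are hypothetical and only illustrate the idea; they are not taken from the speaker's slides.

```python
import sqlite3

# In-memory database with a hypothetical users table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user(conn, name):
    """Look up users by name with a parameterized query."""
    # The ? placeholder tells the driver that `name` is data, never SQL text.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# A classic injection attempt is treated as an (unmatched) literal name.
hostile = "alice' OR '1'='1"
print(find_user(conn, "alice"))
print(find_user(conn, hostile))
```

Had the query been built with string concatenation, the hostile value would have become part of the SQL statement itself; with the placeholder it can only ever be compared against the `name` column.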
06:23
You explicitly mark that as data. When it comes to output encoding, when you're reflecting user data on the page, at that point, you have the ability to, you're marking that user data, you're marking that user text or user bytes as data.
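Output encoding can be sketched the same way with the standard library's `html.escape`. The `render_comment` helper below is hypothetical; template engines such as Django's apply this kind of HTML escaping automatically, so this is only meant to show what "marking it as data" looks like on the output side.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap user-supplied text in a paragraph, escaped for an HTML context."""
    # html.escape turns <, >, &, and quotes into entities, so the browser
    # displays the text instead of parsing it as markup or script.
    return "<p>{}</p>".format(html.escape(user_text, quote=True))


print(render_comment("<script>alert(1)</script>"))
```

Note that this encoding is only correct for HTML body and attribute contexts; text placed inside JavaScript needs a different encoding.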
06:42
You're telling the HTML parser, you're telling the JavaScript engine within your browser, not to execute that. Now, this should be a general rule. It's one that I follow, and I feel that many of you should follow it as well. Be assertive. If you don't know what it is,
07:02
find out where it's going to be used. And whatever you do, you will not get hurt if you mark it as data. Like, marking it as code is a risky option. Try not to go on the risky side. Now, let's look at the security controls. So this is where it gets really tough. Because security controls,
07:21
you have to understand attack surfaces. You have to understand threat modeling. You have to be able to build your security controls in a robust way. And that's tough. Because the current security controls that we all use every day are the result of cumulative knowledge and research around these security controls.
07:42
And we've only gotten to this point because of all the attacks that they've been hit by. We only have security controls because we need them. We don't have them because it was an initial thought. Security is always an afterthought. It's only after an attacker comes. And if a bank didn't need a vault, they would not invest money in a vault.
08:01
But they need one. And then as banks, as the vaults have been broken into, so too the vault technology has evolved. So, what's really important to note is if you're going to ever build security controls, if you're going to ever implement them or use them, use popular ones. Now, what I mean by popular
08:20
is use ones that industry leaders use. You can find strength in numbers, right? Do not use one that only one or two people use. There are many times I will go online on GitHub and see a new module that's great and has awesome functionality and contains an authentication bypass
08:41
or contains a security bug that you've now opened that an unsuspecting user is actually going to just jump straight into. And the way these holes are mitigated, the way that these holes are patched up, is that more people are using them and more people are reviewing the code and therefore building out and helping build a better
09:02
security control. Now, Django helps us with authentication. It doesn't help us very much with authorization. So a lot of us find ourselves writing authorization code. It's really important that when you build this code, these authorization views or authorization decorators, that you build them,
09:21
you write them, you put them in one place. They should not be scattered across your application. Literally, you can take the Zen of Python and read it ten times before writing any security control. Having them broken up into many different locations can lead to a vulnerability,
09:40
especially when you intend to update or mitigate a specific attack, and now you have to do it in every single location. Now, again, as mentioned previously with our data and code example, it's really important to be explicit, but also conservative; I'll explain why a little bit later. Now,
10:02
if you find yourself in the case where you have to actually build, let's say, a login system, at that point, you need to literally go by the book. You need to get a web application security book, a good one, and I'll have recommendations later on, and you need to go down the list of all the attacks and find out how they're mitigated.
10:20
You need to look at current secure, or at least secure as of today or now, or at least we hope so, login systems and design yours to be very similar, if not the same. You must understand the attacks and whatever you do,
10:40
do not write your own crypto routines. It pains me so much when I see large organizations on GitHub linking to very insecure crypto routines. Now, it's also important to note that you have to understand what you're doing.
11:01
So, you have to be very conservative in using cryptographic hash functions. And we'll get to this a little bit later. Most of all, what's really, really important is Django provides very good and very powerful security routines. It's important that you use them because this is what the strength in numbers means.
11:22
Django is popular. We're all here. We all represent that community. There are many of us, there are many in that community that are not here today. I think we probably fill more than this hotel, if not. So, Django is very popular. If Django implements it, you should be implementing it. Now, let's look at some
11:41
security controls that Django provides. If you need to cryptographically sign something, use Django. Use the signer that Django comes with. Django auto escapes most unless you mark it as safe. Django will auto escape user data. It will explicitly mark it as safe after auto escaping it.
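To make concrete what signing provides (tamper detection, not secrecy), here is a standard-library sketch built on `hmac`. It is an illustration of the concept only, under the assumption of a hypothetical `SECRET_KEY` and hypothetical `sign`/`unsign` helpers; in a Django project the framework's own signer is the thing to use, as the talk recommends.

```python
import hashlib
import hmac

# Hypothetical key for illustration; real keys must never be hard-coded.
SECRET_KEY = b"example-only-key"


def _mac(value: str) -> str:
    """Hex-encoded HMAC-SHA256 of the value under SECRET_KEY."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value: str) -> str:
    """Return the value with its MAC appended after a colon."""
    return "{}:{}".format(value, _mac(value))


def unsign(signed: str) -> str:
    """Return the original value, or raise ValueError if it was altered."""
    value, _, mac = signed.rpartition(":")
    # compare_digest compares in constant time to avoid leaking, through
    # timing, how many leading characters of the MAC were correct.
    if not hmac.compare_digest(mac.encode("utf-8"), _mac(value).encode("utf-8")):
        raise ValueError("signature mismatch")
    return value


token = sign("user=42")
print(unsign(token))
```

The signed value is still readable by anyone; the MAC only guarantees that a client who changes `user=42` to `user=43` cannot produce a matching signature without the key.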
12:01
But the auto-escape only works for HTML markup. If you find yourself needing to render JavaScript code, use JS encode. Now, when it comes to validating the URL structure, the regular expressions provide a very, very secure way of ensuring that,
12:21
or at least you should be using them as a stopgap measure so that people who shouldn't be hitting certain endpoints with certain data don't get there. However, it's not perfect. Now, as an example, the default hasher is PBKDF2,
12:41
Password-Based Key Derivation Function 2; it's easier to remember the long set of characters when you know what it stands for. But it's set at 10,000 iterations. You should upgrade that to 100,000. And that's as of today. Warning to all future viewers watching the video or considering this: you may need to
13:00
increase that at a later point. Now, if it slows down your system: right now, Python 2.7.8, the latest version of Python 2.7, has backported the C implementation of the hash function from Python 3, and you really don't have to worry about it. Another issue that we have is that object permissions
13:22
aren't present within the framework, and therefore most developers don't think about them. And it's really important to remember that you have to be assertive about which user owns which object. And on top of that, the CSRF implementation in Django implicitly trusts the cookie.
13:40
It bases all the CSRF mitigation on the cookie. If you have access to writing a cookie for that domain, you have effectively broken CSRF. Now, that is hard, but it's possible, and I'm working on some
14:01
enhancements to the CSRF implementation to mitigate that issue. Let's jump back a little bit to building our security controls. We mentioned that they should be in one location, and we should not repeat ourselves. All the logic should be very straightforward.
14:20
This means that when you're using the login_required decorator, it does not belong in your views.py. It belongs in your urls.py. How many people have a views.py that's bigger than a thousand lines? There we go, right? How hard is it to realize that you missed one view? Put it in your urls.py,
14:41
which rarely grows larger than a hundred lines, and it works quite well. Now, I'm not going to get into the debate of function-based views versus class-based views. Personally, I use class, and you'll see why I like them more later on. But, whatever works for you, you know, it's a free framework. Now, within class-based views, you actually get
15:00
the ability to create mixins by overriding the dispatch method. We have an example of that here. Here we have, I hope it's easy to see, took a screenshot from my computer. Here we have a class-based view mixin that is a login required plus it's going to require that a
15:21
specific object is owned by that user. And, as you see in let's say line 5, for example, we obtain the target user based on the object. We check that in line 9. Line 9 is going to make use of the
15:40
request.user interface. Now, this is obviously assuming that within your application you are relying on request.user and you're not saving objects as anonymous users, and therefore in line 9 you're checking if the user is authenticated. But then, you're also checking in line 9 if that user owns that object.
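The slide itself is not part of the transcript, so the following is not the speaker's code. It is a framework-free sketch of the same pattern (override `dispatch`, check authentication and ownership in one place), runnable without Django. `View`, `PermissionDenied`, `User`, `Request`, `Note`, and `NoteView` are all stand-ins or hypothetical names; in a real project the base class would be Django's own view class and `request.user` would come from the authentication middleware.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Stand-in for the framework's permission-denied exception."""


@dataclass(frozen=True)
class User:
    name: str
    is_authenticated: bool = True


@dataclass
class Request:
    method: str
    user: User


@dataclass
class Note:
    owner: User
    text: str


class View:
    """Minimal stand-in for a class-based view that routes by HTTP verb."""

    http_method_names = ("get", "post", "put", "patch", "delete")

    def dispatch(self, request, *args, **kwargs):
        verb = request.method.lower()
        # Only whitelisted verbs with an explicit handler are served.
        handler = getattr(self, verb, None) if verb in self.http_method_names else None
        if handler is None:
            return "405 Method Not Allowed"
        return handler(request, *args, **kwargs)


class OwnerRequiredMixin:
    """Require an authenticated user who owns the target object."""

    def get_object(self, request, *args, **kwargs):
        raise NotImplementedError("subclasses must say which object is targeted")

    def dispatch(self, request, *args, **kwargs):
        target = self.get_object(request, *args, **kwargs)
        user = request.user
        # Both checks live here, so no individual view can forget them.
        if not (user.is_authenticated and target.owner == user):
            raise PermissionDenied("user does not own this object")
        return super().dispatch(request, *args, **kwargs)


NOTES = {1: Note(owner=User("alice"), text="hello")}


class NoteView(OwnerRequiredMixin, View):
    # The mixin goes left-most so its dispatch runs before the base view's.
    def get_object(self, request, note_id):
        return NOTES[note_id]

    def get(self, request, note_id):
        return NOTES[note_id].text


print(NoteView().dispatch(Request("GET", User("alice")), 1))
```

Because `NoteView` defines no `post` method, a POST from the rightful owner is refused rather than silently handled, which is the explicit verb handling that class-based views give you.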
16:02
And, this is one of the reasons I like class-based views. It allows very easy mixins. And then you can use these mixins to create views, like authenticated template view, which will just have this as a mixin on the left-most side. And then have the regular template
16:21
view that Django offers. And, you have a very simple view, and you don't have to worry about putting login required. It also allows you to put all your authentication and authorization code in one easy to find place. Now, that's views. Let's talk about forms. Now, just
16:41
Django forms are fairly secure. They're built fairly well. The thing that scares me the most is model forms. How many of you have Rails experience? How many of you know what mass assignment is? Oh yeah, that's good. Just about the same number of people that have Rails experience. That's really good. So, this is something we don't have in Django, or at least
17:01
we shouldn't, but mass assignment is a bug where fields within your model are accidentally exposed to the internet, allowing anyone to put any information they want in those fields. Now, the Django documentation taught me to always make use of fields, and
17:22
I very much trust it. I don't use exclude. When you use fields, you're whitelisting. You're saying we only want these fields. We only want to deal with these. We only want to populate these fields in our model, and I think this is a very good example of a properly implemented control. It makes it hard for the developer
17:41
to shoot themselves in the foot, and at the same time, it's also very powerful. Now, well, we've knocked off quite a few things in our list. Let's talk about being conservative. So, CSRF is one of the things that is a popular hole that's been exploited, and it's pretty scary, and one of the problems
18:01
that I feel that Django has, especially with function-based views, is the lack of explicit HTTP verb handling. Now, as you see here in my class-based view, I'm inheriting from the Django core view, and I have the ability to respond to both a GET and a POST, but if I don't have one of those methods,
18:20
the view will return Method Not Allowed, whereas a function-based view will always respond, whatever the method. There are CSRF holes that I've found that exist because functions were expecting a POST, but when you gave them a GET, they pulled the variables out of request.meta, and the
18:41
CSRF middleware in Django saw a get, said, we don't have to check for CSRF here, and the view was compromised. So, there is the ability within function-based views to use a decorator
19:01
called allowed methods, but I'm not such a fan of decorators. I find sometimes they're left out, and when it comes to writing your own, it can get a little bit hairy, but that's just my own personal feeling, and, you know, don't take it, judge it on a case-by-case basis, but I like to be very conservative at least when it comes to
19:20
my own personal development, so I shy away from writing decorators. I'm using classes, and I'm being conservative and explicit about what my view is supposed to handle. Okay. Now that we know what being explicit is about, let's talk about crypto. Now,
19:41
The original section, the original slides here, were examples of how to do crypto properly. I wrote an example of a cryptographic implementation on my blog in Node.js, and then someone asked a bunch of questions about it and started moving things around, and then I pulled my slides
20:00
out of this talk, and I want to shut down that page, or at least remove it. Crypto is really hard, right? Even cryptographers have problems with it, let alone simple humans like us. It is really important to note that if you don't implement crypto properly, your users can be
20:20
negatively affected, and therefore you have to be very conservative with what you do. Use Keyczar, and I'm only giving Keyczar as a recommendation. When it comes to crypto, use Keyczar. Don't implement it yourself, please. And, please, don't play with Keyczar.
20:42
If you modify Keyczar, or modify the behavior of Keyczar, you get vulnerabilities. Now, I have a friend named Yan. Yan and I used to work together at Matasano, and after the work at Matasano, Yan and I developed an application in Django, and I proposed an idea to use
21:00
a cryptographic hash to identify a user, and Yan let me know about Yan's rule. Yan's hash rule is simple. Only use a cryptographic hash when you need to. Now, can anyone give me an example of a use case
21:21
where you absolutely need a cryptographic hash? Storing passwords. Incorrect. Because these days, cryptographic hashes, while they used to be good for storing passwords, because they're
21:41
supposedly one-way functions, or they're great compression mechanisms, cryptographic hashes can be calculated very fast. Passwords, you don't want to be able to brute force. These days we use algorithms that are tunable. Someone mentioned Bitcoin, or cryptocurrency. Again,
22:02
the amount of processing power that the Bitcoin network has practically invalidates anyone using any of the SHA family (I'm not even going to talk about MD5) to hash user passwords. The number of hashes it is able to generate per second is mind-boggling.
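What "tunable" means in practice can be sketched with the standard library's `hashlib.pbkdf2_hmac`. This is only an illustration of the work-factor idea; the helper names are hypothetical, the iteration count is the talk's own 100,000 figure, and in a Django project the framework's built-in password hashers should do this job rather than hand-written code.

```python
import hashlib
import hmac
import os

# Work factor quoted in the talk; expect to raise it as hardware gets faster.
ITERATIONS = 100_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a slow, salted digest from a password with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # random per-password salt from the OS
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store salt and iteration count next to the digest so it can be re-checked
    # (and re-hashed with a higher count later).
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, iterations, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iterations, digest))
```

A single SHA-256 call costs an attacker almost nothing per guess; repeating the primitive 100,000 times makes every guess proportionally more expensive, and the count can be raised without changing the scheme.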
22:21
So, those are two examples for Yan's hash rule, where you don't use a hash unless you absolutely need to. The one use case I know is if you want to ensure the consistency of a specific file. And for example,
22:41
Git uses hashes to track files. Many open source programs use hashes so people should know if the file has been modified or not. Then again, we need hashes. We need, at least, unique identifiers that we need to tie back to users. So, for that,
23:01
we have UUID4. UUID4 reads 16 bytes out of the cryptographic random source of the operating system, or at least it's supposed to. If you don't have access to it, it'll just use Python's random, so you have to ensure that you actually have access to it. But that's when you need a UUID. When you need a regular
23:21
set of bytes that are completely random, just read out of urandom. An example of code right there. That number is completely unique. At least, I hope it is. Now, if you need something to be signed, use Django's signer. If you're stuck, you're going to have to use
23:40
Python's HMAC implementation. But then again, once you get to the HMAC implementation, which by the way needs a hash (that's another example of a primitive that needs a cryptographic hash algorithm), you can get nailed with a timing attack, a side-channel timing attack. So
24:01
really just stay on the safe side, use the Django primitives. Now, I deliberately shortened this talk a little bit. We've got almost 15 minutes. So there can be a lot of questions, but I'll end off with a little bit of an appeal. Security is a great industry. It's really
24:21
good. It's made me a better engineer, and I hope that if you guys do research into security, you will become better engineers. It's important to note the limitations. The lowest common denominator will never disappear from security, just like a chain is only as secure as its weakest link. But
24:42
it's really important to do research, look things up, and when building new parts of your application, try to understand the attack surface. Don't only look at what should happen, look at what may happen. And then once you
25:00
look at what may happen, you realize it's a little bit tedious, because so many things may happen. So let's just be assertive and ensure, as much as we can, what should happen. Now, I'd like to make an appeal to the larger community. There is a lot of knowledge that we have within Django about security. There's a checklist that we should be building
25:21
to help our fellow developers build and write more secure code. I know as a developer, as a security person, one of the biggest fears we all have is that innocent people, people in general, will get hurt because of our code. We don't intend that to happen, and unfortunately there are parasites, attackers in the world
25:41
that prey on the weak, and unfortunately they sometimes are able to take advantage of our code. So we should create a very basic checklist for every part of the framework: when you're using this part of the framework, what do you have to be worried about? And
26:01
it should be really easy to use, and I just appeal to the community to help. And if you'd like some reading material, at Matasano we always recommend the WAHH, we call it, The Web Application Hacker's Handbook, the second edition; the first one is good, but the second edition is better. The Tangled Web, The Art of Software Security
26:21
Assessment, and Microsoft also has a great book called Writing Secure Code, and Microsoft's got some good books out there. And all these books will give you the mindset, the proper mindset for building more secure code. Now, that's good. Nice, I've got time for questions. Sure.
26:44
I'm going to put these slides on my website. So, the question part. Thank you so much, Leiby. I think it looks like you've left a full 19 minutes for questions if my arithmetic serves.
27:00
So, please, this is obviously an exciting opportunity to engage on security matters. Looks like we already have our first question ready. I just have a question about, you mentioned taking advantage of safety in numbers. Do you believe that there's any sort of diminishing returns there, inasmuch as
27:22
while you may get the benefit as a developer of the collective knowledge of the community, so that they can secure, help you secure against attacks, does that not also though open up doors for when there is a security exploit, that the payload is larger, right?
27:41
If every Django project is using the same code to secure themselves, once an exploit is discovered, you have a huge swath of projects that you can exploit. So, it does. I mean, think about it in simple forms. Heartbleed affected billions. That's exactly what I was thinking. Right?
28:01
And OpenSSL is a very popular library. But at the same time, it's better than the alternative. This is part of the security. You've got to kind of secure battles. It is better than the alternative. Writing your own is not going to get, it will not be more secure. There are very few people in the world that are able to put out solid cryptographic libraries and for them to be initially secure
28:21
by default. And I don't know them. There are people that do, but no one trusts them. And things that are built by the community have a large set of eyes on them. Yes, if there is an exploit, many people will be affected. But at the same time, every time that there is a hole in them, you know that there is so much security review that
28:41
has been done that they found a slight chink in the armor. And everyone knows about it, so everyone talks about patching it right away. It's not that everyone ignores it, or it's known only to few people, and then you don't know to patch it, and then you get hit by it six months later. Thanks. You're welcome.
29:01
Should I repeat the questions, or do they have them good on the video? Well, okay. Please. I have a more specific question. I'm sure other people have run into this too. So what's the best way to encrypt a field in a Django model?
29:22
Say I have a user and I want to encrypt their first and last name. There doesn't seem to be a very clear way to do this. Is there anything you would recommend besides the instinct of just hashing that value? Well, hashing it won't help you, because a hash is supposed to be a one-way function. You're not supposed to be able to pull data out
29:41
of it. Using Keyczar to encrypt it is really helpful. You can encrypt it, store it, and then decrypt it every time you pull it out. I believe django-encrypted-fields, now don't hold me to it, you have to check it out, but django-encrypted-fields uses Keyczar, and django-encrypted-fields has an example,
30:00
has a text-based encrypted field. Great, thank you. It's not an endorsement of django-encrypted-fields. I've looked at it once, it may have changed, but check it out. Is that like an add-on? Yeah, it's a third-party application. I also have a Django specific question.
30:20
Can you comment on development versus production settings? An example that comes to mind is ALLOWED_HOSTS. Sure, so most important, debug mode must be turned off. You may all laugh, right? And I did when I was a Django developer, and then I joined the security world and I was like, okay, now I'm going to count
30:42
the number of sites where I've seen debug mode turned on. Almost every application that I've tested had debug mode enabled in some way. And that's really, really important. Now also in production, you're dealing with things like rate limiting and abuse that you're not used to. And it's really
31:02
important to look at your use case; every use case is different. And just set some really basic rules and guidelines for how much of your service a user should be using. I have a quick question that might be a little too specific, but
31:21
do you have time to elaborate a little bit on spoofing the POST by using a GET and bypassing the CSRF check? And how safe is it to trust, if you're using a function-based view, whether the request is a POST or not? Which is something that I rely on.
31:41
Sure, sure. So I'm going to talk in the abstract and I don't have any code to show, but I'll try to be as specific as possible. So every function-based view takes a request. Right? It will take any type of HTTP request, it doesn't matter. Now within the code of that function
32:01
are specific directives to process based on that request. If request.POST, so if this is a POST request, process the form. If request.GET, do the opposite, etc. Now, the issue I was describing was an issue that has been
32:21
found where a view existed and this view used request.META. So it wasn't just looking for a GET or a POST, but the developer assumed that, because it was in a form, it would be posted to. But it could just as easily have been requested by a GET.
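A minimal sketch of the defensive pattern implied by this answer: check the HTTP method explicitly and read form data only from the POST body. `FakeRequest` and `change_email` are hypothetical stand-ins invented for illustration, so the example runs without Django; in a real project the request would be a `django.http.HttpRequest`, and Django's `require_POST` decorator from `django.views.decorators.http` can enforce the method check.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for django.http.HttpRequest (illustration only)."""
    method: str
    GET: dict = field(default_factory=dict)
    POST: dict = field(default_factory=dict)


def change_email(request):
    """Function-based view sketch that returns a (status_code, body) tuple."""
    # Check the HTTP method explicitly. Do not infer it from the presence of
    # form parameters, because the same parameters can arrive in a query string.
    if request.method != "POST":
        return 405, "Method Not Allowed"
    # Read form data only from the POST body, never from merged GET/POST data.
    new_email = request.POST.get("email")
    if not new_email:
        return 400, "Missing email"
    return 200, f"Email changed to {new_email}"
```

With this shape, a GET carrying `?email=...` is rejected before any state changes, which is the case the developer in the story did not anticipate.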
32:42
And that was the issue. You mentioned during the talk that if you do venture down the security road and write your own software for that, your own controls, it's important to understand all the attacks that you may encounter and have to anticipate.
33:03
My question is more: do you think it's valuable to be able to understand and execute some of these exploits yourself in that process? That depends largely on the way a person works and learns, but
33:22
from my perspective, I learned software security by looking at the framework, by looking at the Django framework and just seeing what does it do? How does it work? And then looking at other implementations and saying why are they different? And then comparing the differences and understanding how it can be attacked. Some people find it really helpful to actually attack. And for that
33:43
OWASP has a program called WebGoat, a really vulnerable web application that allows you to test out application security. The Web Application Hacker's Handbook will give you a lot of those attacks practically. They make use of a
34:01
proxy tool, a web application proxy called the Burp Suite. And for me that's my Swiss Army knife when it comes to pen testing. And I really rely on it heavily for everything that I do. And I think that going through specific components and just reading about the attack surfaces
34:21
on those components will be really helpful. But whatever works: if it's easier for you to break into the actual application, there you have it. If it's easier for you to just look at secure implementations, then do that. If it's easier for you to do both, then do both. Thank you.
34:41
Hello, my question is about do you have a guideline to know when it's appropriate to step back and ask, well not ask, but when doing security it's holistic, so how can you know whether the Django layer is the appropriate layer to address a certain security concern
35:02
with, you know, a stack with other layers in it? So, just to expand a little bit on what you just said: for example, in building a rate limiting control, let's say you use nginx. You can put that directive in nginx, or you can put that directive in Django. Why should you do one over the other?
35:21
Now, it really depends what you're looking for. nginx knows how to proxy and serve files very well, right? So when you want to limit the number of IPs that hit your site, you can do that in nginx. When you want to limit the number of IPs that hit a specific endpoint, you could do that within nginx. However, it doesn't know anything about your application.
35:42
Within Django, you can build rate limiters that can rate limit specific actions, so that should an attacker decide to abuse that specific action, you can throttle that entire action across your application. It's not bound to a URL, as nginx is going to expect; it's bound to a set of behaviors. So it's largely dependent on the type of
36:01
control you want to build. Both are really important. In fact, I'd say everything that you can take out of Django and put into nginx for rate limiting should be done there first, because why even enter the request-response cycle when you can just stop the attack from ever happening?
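As a rough illustration of rate limiting bound to an action rather than a URL, here is a small sliding-window limiter. It is only a sketch under stated assumptions: the class name `ActionRateLimiter`, the action names, and the in-memory storage are all hypothetical, and a real deployment would need shared storage (for example a cache) so that every worker process sees the same counts.

```python
import time
from collections import defaultdict, deque


class ActionRateLimiter:
    """Sliding-window limiter keyed by (actor, action), not by URL.

    In-memory and per-process only; this is an illustrative sketch.
    """

    def __init__(self, max_calls, per_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._calls = defaultdict(deque)

    def allow(self, actor, action):
        """Return True if this actor may perform this action right now."""
        now = self.clock()
        window = self._calls[(actor, action)]
        # Drop timestamps that have fallen out of the window.
        while window and now - window[0] >= self.per_seconds:
            window.popleft()
        if len(window) >= self.max_calls:
            return False
        window.append(now)
        return True
```

A view that performs, say, a password reset could call `limiter.allow(user_id, "password_reset")` before doing any work, regardless of which URL triggered the action.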
36:24
Django has a lot of settings around cookies, like HttpOnly, secure, and signed cookies. Do you have a recommendation on how you set up your settings, or do you just use defaults, or what? So, it really depends on the site that you plan on hosting.
36:41
For example, if your site is not hosted under SSL, then sending the secure flag in your cookies is kind of going to break your sessions over plain HTTP. So, generally, when talking about cookie settings, I make sure that HttpOnly is set on both the session cookie and the CSRF cookie, and then
37:01
secure, I like to host fully over SSL, so secure on both of those as well. Now, what's important, and most people don't know it, is that the CSRF failure view exists by default, and it's one of the easiest ways to profile whether a site is actually a Django site. Send a POST with either a missing CSRF
37:21
token, or an invalid one, and you'll get back the CSRF error view. And it will also tell you whether the site is in debug mode or not. Now, that's problematic. Most people don't change it. It's really simple to change.
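The settings mentioned in this answer can be sketched as a production settings fragment. The setting names are Django's own; the dotted path `myproject.views.csrf_failure` is a hypothetical placeholder for a view you would write yourself, and the secure flags assume the whole site is served over HTTPS.

```python
# settings.py fragment (production sketch; assumes the site is HTTPS-only)

DEBUG = False  # the default CSRF failure page shows extra detail in debug mode

# Keep the session cookie away from JavaScript and off plain HTTP.
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SECURE = True

# Same treatment for the CSRF cookie, as recommended in the talk.
# Note: with HttpOnly set, JavaScript can no longer read the CSRF cookie.
CSRF_COOKIE_HTTPONLY = True
CSRF_COOKIE_SECURE = True

# Replace the default CSRF failure page so it does not fingerprint the site.
# Hypothetical dotted path; the view takes (request, reason="").
CSRF_FAILURE_VIEW = "myproject.views.csrf_failure"
```

Whether to set `CSRF_COOKIE_HTTPONLY` depends on the site: front-end code that reads the token from the cookie for AJAX requests would need to get it another way.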
37:48
So, my question is more about Python 3 and tools like Keyczar and encrypted fields. Is there anybody working on that? Or any Kickstarters?
38:01
Are you seeing any new tools coming out that support Python 3 versus what you were talking about earlier? Yeah. So, we're going to expand your question to also cover PyPy. Keyczar makes use of, I think it's PyCrypto. PyCrypto is a C extension to Python.
38:21
PyCrypto doesn't work on PyPy. It doesn't work on Python 3. And the reason I chose Keyczar is because Keyczar is a set of executables. You can download the executables written in Java and run them. Now, what is important is that you don't run these executables
38:42
and open yourself up to command injection. But, largely you can make use of those executables. Use a tool. Don't use it so much as a library. And that way you're abstracted away from the internals of it. And you're just simply using, like as if you would
39:02
use cowsay in shell, you're using just a shell command. You put in data, you get back data. It's that simple. Any more? Please.
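A hedged sketch of the "use it as a shell command, put in data, get back data" idea without opening up command injection: pass the command as an argument list (no shell), and send untrusted data over stdin rather than interpolating it into a command string. The `TOOL` below is a stand-in that simply upper-cases its input; it is not Keyczar's actual command line, which is not shown here.

```python
import subprocess
import sys

# Stand-in for an external executable (illustration only): a child Python
# process that upper-cases whatever it reads on stdin.
TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_tool(data: str) -> str:
    """Send data to the external tool on stdin and return its stdout."""
    result = subprocess.run(
        TOOL,                 # argument list: no shell parses this
        input=data,           # untrusted data travels via stdin only
        capture_output=True,
        text=True,
        check=True,           # raise if the tool exits with an error
        timeout=10,           # do not hang forever on a stuck tool
    )
    return result.stdout
```

Because no shell is involved, input such as `hello; rm -rf /` is treated purely as data by the child process.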
39:27
I respect that you're sort of, instead of talking about attacks, you're talking about best practices against surfaces generally. But it seems like the surface you're talking about by and large is the request-response
39:41
cycle. For example, you say, let's put all authorization in one place. But what about for async? What about if you're dispatching a broadcast and, you know, you need to understand who's on the other end? I find that that logic sometimes is wanting for a home. And
40:01
it seems like probably it's not urls.py. So what do you do with that stuff? So this is why I chose the request-response cycle. Because I don't think 45 minutes or two hours would have helped with async security. When it comes to security, when it comes to protecting yourself from vulnerabilities, as I mentioned previously, you largely have to
40:21
assert what is going to happen. You have to be very specific. And you have to really push yourself to understand what's going to happen. When it comes to asynchronous programming, things are largely being handed off to different places; let's say you're using coroutines or event-driven asynchronous programming. You can throw yourself
40:41
down a rabbit hole really fast. How many people want to think about race conditions in asynchronous programming? Every bank that exists today does not make use of asynchronous programming when it comes to transferring money for the fear of race conditions. And even within race conditions they make sure that everything is
41:01
within a transaction. When you mention asynchronous programming, from my perspective, I'll look at it from the Python way, not just API calls that are being sent out to the server. Because those are also handled by Django's request-response cycle. Whereas with Django itself, asynchronously processing
41:21
certain things. One important thing to note when dealing with it, and I'm not going to dive really deep into it, is just because it's async doesn't mean it's lightweight. And just because it's lightweight in one place doesn't mean it's lightweight in the other. For example, if you manage to find a view that you could really increase the amount
41:41
of processing power that is required by that view. Let's say, one of the most recent holes within Django was submitting, within a multipart request, a set of files that would cause Django to perform an O(n) operation. If you can find
42:01
that within an async piece of code and they're using gevent, that person's got real problems, because gevent is great when it comes to I/O and it actually isn't so good when it comes to serious CPU handling, and it's going to stop responding to everything else. So what's really important is that
42:21
you rate limit as much as possible. Don't think that you can upload a 300 megabyte file and it will be really easy on your system. The second thing is to remove the asynchronous parts of your code from where it can get a little bit hairy. And what I mean by that is when you have code that is going to run
42:41
say, a payment system, you want to make sure that payment happens once and only in one location, and that it can't be triggered multiple times within a specific window of time. So in the regular world you'd use locks. You'd lock the start of that payment, you'd
43:01
move through the code and you'd end the lock. However, Python's cooperative scheduling, especially in asynchronous programming, may break those locks or may get you to the point where you are within multiple locks at the same time and do certain things. So it's important to know when to be synchronous. That said, it does mitigate
43:21
a decent amount of denial of service vulnerabilities, so your mileage may vary. I'm an educator and teach people
43:42
Python programming, Python web programming and so on and so forth. And one of the most difficult areas for me as an educator is that I tend to be a fairly trusting programmer which makes me very bad at this kind of work. Do you have any advice for educators or for students who are learning this kind of thing about ways to make themselves constructively paranoid?
44:03
Well, first of all being a trusting person is great. Unfortunately we're not as trusting as we could be, but just as a show of hands, how many people have invited people over to their homes? Okay, the people that didn't answer are definitely
44:22
that's a little scary. How many of you guys have known that person before you invited them over to your house? Okay. Most. How many of you have not known that person? Excluding Airbnb. Oh, one person. Okay, good. Right? So you invite someone
44:40
over to your house, you see someone walking over the street and saying, hey, you look hungry, come in for a meal. Right? Now you're going to expect that individual to be courteous and to know their boundaries, right? You know, not to put the silverware in their pocket, not to, you know, harass the other members at the table, etc. And largely they, you know,
45:01
very nice individuals, they will be like that. But when that person oversteps their boundaries, at that point you kindly show them the door, right? Now, when it comes to building software, we don't always have that luxury of being able to handle just one request and ensure that everything is working nicely around it. So
45:20
we try to assert as much as possible, place all the rules around that one request, that guest in our house, and say these are the roped off areas. You can only go into the bedroom if you're a member of the family. You know, okay, everyone can wash dishes. But, you can only do certain
45:41
things, you can only turn on the TV if you're a trusted, you know, you're a staff member, you're a member of the family, you can only change the channel if you're old enough, if you're older than, let's say, 18, right? Or 16. Now, that is kind of how you have to look at your users within your application. You're going to have,
46:01
the majority of your users are going to be good, hard-working people. Hard-working, sorry about that. They're going to be good, well-minded individuals. But you're going to have that 1% that's going to literally just probe you to find your weaknesses. And being assertive from the onset is really important.
46:21
Now, the assertive part of security actually comes to me from a different part of my life as you may have realized, I'm a religious Jew and Judaism is a set of laws. Those laws are largely written in many places, but are collected in a book called the Talmud. Now, I don't think
46:41
many of you have learned the Talmud here, but the way every law in the Talmud works is that you take, they discuss the law. They say, we know this is supposed to happen or we know this is the law. Now let's look at the limitations of it. And they hypothesize everything. And from a very rough piece of stone, which will be
47:00
the original law, they sculpt a masterpiece covering every single facet: when it should be done and why it should be this way. And I've taken that part of my life and built it into writing software, writing secure software, where in all the code that I put out, I'm very
47:20
explicit, what should happen and why should it happen. And I don't look at kind of the, okay, I hope it's all going to work. I kind of, I'm not, it's not that I'm not trusting of my users, but I'm just assertive and I say, you are only a user if you meet these requirements.
47:40
The same way if someone walks into your house, they are only an invited guest if they can accept common courtesy. No more questions? Thanks. Alright, Levi Gross, thank you so much.