Memsad
Formal Metadata

Title | Memsad
Subtitle | Why clearing memory is hard
Series title | 35C3 Refreshing Memories
Number of parts | 165
Author | Ilja van Sprundel
License | CC Attribution 4.0 International: You may use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/39262 (DOI)
Publication year | 2018
Language | English
35C3 Refreshing Memories, 113 / 165
Transcript: English (auto-generated)
00:20
the next talk, why cleaning memory is hard. Ilja van Sprundel is a security researcher
00:29
who loves to find out new things, and he found out that it's quite hard to get rid of sensitive content in the memory. So, today, he's going to give us an overview
00:43
presentation. Please give a warm round of applause to Ilja van Sprundel. Okay, perfect. Yeah, so, as the Herald just explained, my presentation is called Memsad, Why Clearing Memory is Hard. Before I dive into that, once upon a time,
01:07
that was me. A lot more hair, a lot less fat. This is my 17th Congress. You have pot kettle here, buddy. This is my 17th Congress in a row. I've spoken here a
01:25
number of times before. I haven't added 35 in there yet, but obviously that should be in there as well. I work for a company called IOActive. I'm the director of penetration testing, but that really just means that I lead teams of pen testers. No double entendre there. Obviously, we're always looking for
01:45
good security people, so if you're interested, come talk to me afterwards. I like looking at low-level stuff, kernels, drivers, hypervisors, that type of stuff, and I enjoy reading code. Okay, enough about me. What's the audience that I think would enjoy this? Pretty all-around security people, crypto
02:05
people, if you like code review, if you like compiler stuff, and if you're just generally curious about technology, I think you might enjoy this. In terms of the knowledge required, the first half of these slides is relatively basic, and
02:21
so if you have a basic technology understanding, you should be able to understand most of the first half. If you have some C background, that would be nice, and then as I move forward past the first half, things become a bit more advanced, but if you just, you know, only grasp the first half, I think that
02:43
will still be useful. Right, so what does this talk about? Basically, it's one very simple, easy-to-explain crypto implementation problem, and the reason I've dedicated an entire talk about it is because, while the problem is easy, the
03:01
solution is not. There's a lot of moving parts, a lot of subtlety, a lot of nuance, and I'll get into that in a little bit. Now, I can hear some of you thinking, well, Ilya, WTF at 2018, why the hell are you talking about this? This is very, very, very well known. To them, I say, well, because this
03:22
stuff is still everywhere, and I will show that in the slides. I will show this with data. I will show this with bugs, but the driver of this talk, the reason why I started making this presentation, is that this year alone, I did engagements for three different customers on three entirely different software projects, where they all had this exact same type of bug, and, you
03:46
know, you tell them about the bug, and the customer comes back and says, okay, well, yeah, that's great. We understand. Now, tell us how to fix this in a portable way, and that is not very easy. The other thing is that, even
04:00
though this problem is sort of known conceptually, like, you know, people are kind of blasé about it. Practically, not many people understand how pervasive this problem is, how realistic it is. It isn't just like, oh, well, the
04:21
compiler might do this. No, the compiler will do this, and it does it everywhere, and these bugs do show up everywhere, even though, because it doesn't, it's hard to tell from the code that it is there. If you look at the binary, if you look at what the compiler emits, you see that it's there, and then the third is, given that one of the themes of the Congress this year
04:44
is foundations, and to talk about, you know, things that aren't necessarily new, but to sort of try and help bring the subject to the next generation, I think this fits in perfectly in the concept of foundations. Right, so before I dive in,
05:03
there's a couple of people, or actually a long list of people, that sort of helped me out. As I said, the problem is well known. It's been well known for at least 20 or 30 years, and so many, many, many people have published papers and presentations about this, and I don't know all of them personally, and I wish I could include them all, but the people I've
05:21
included in here are sort of the, you know, one or two away from me that have had some kind of impact in these slides. Some of you are sitting in this audience, and your help has been appreciated. Okay, so let's actually start. Now let's say you're gonna write some piece of code, right,
05:43
and it's gonna be doing something, and it's gonna be handling sensitive data, you know, be that keys, or decrypted plain text, or session tokens, or passwords, or password hashes, or anything that could be considered sensitive, right? Now if you're a smart, security conscious person, right, the
06:01
moment you are done with that sensitive data, you want to dispose of it, right? You want to purge it from memory, right? Now why do you want to do this? Well, because otherwise, if there's some kind of info leak that is discovered later on, then whatever secrets or lingering memory could be used in your info leak,
06:22
and all of a sudden, you know, your tokens or your keys are leaked out, right? And I mean, that may sound like a stretch, but you know, things like heartbleed, you know, happen, right? So it's, this is very practical, this can really happen. And before I move on, I want to say, if you think and make the step where you say, okay, I need to
06:44
dispose of sensitive material once I'm done with it. That's really big, right? Most software that deals with, you know, sensitive material does not do this. So if you make the step, thinking I need to purge this, you're ahead of the curve. Right. So now, concretely, it would look something like
07:04
this, right? You would write a little, this is a sample code I have that basically generates a little key. And it's a function, you give it and you give it a key pointer. And this thing declares a local variable that serves two bytes, goes and reads a bunch of random bits, puts it in K,
07:21
copies K to key, and then before it returns, because K is about to go out of scope, and it contains sensitive key material, you go and say memset, and then you clear the thing and then you return, right? Perfect. You run this, you compile it, you add a main, and it does exactly what it's
07:43
supposed to do. You look at the assembly and it's all perfect. Problem there is what you're doing is that's not release-based code, right? When you sort of make code ready to be released, you sort of tell the
08:00
compiler that it should enable the optimizer, right? You'll give it -Os or -O2. Those are the most common ones. Sometimes you see -O3 when people want to live on the edge, but usually -O2 or -Os. Now, if you look at the assembly again, you know, you get a whole different picture of what's going on. And I want to illustrate this.
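A minimal sketch of the kind of key-generation routine being described (the names, sizes, and the rand() stand-in are illustrative, not the exact code from the slides):

```c
#include <stdlib.h>
#include <string.h>

void generate_key(unsigned char *key, size_t len)
{
    unsigned char k[32];

    /* stand-in for "read a bunch of random bits" */
    for (size_t i = 0; i < sizeof(k); i++)
        k[i] = (unsigned char)rand();

    /* copy the local key material out to the caller */
    memcpy(key, k, len < sizeof(k) ? len : sizeof(k));

    /* k is about to go out of scope and still holds key material, so wipe
       it; this is exactly the store an optimizing compiler may treat as
       "dead" and delete */
    memset(k, 0, sizeof(k));
}
```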
08:23
And there's a website called Compiler Explorer, which is beautiful. It integrates a whole bunch of compilers, and it has on the left, it shows you the C code, and the right shows you the assembly. And it's like color-based, and it's easy to make connections. So let's take our little example, and on the left we see the
08:44
generate key, and on the right we see the compiler. And sure enough, if you follow the colors in left and right, you can see what C code translates to which assembly, right? To make it a little bit easier, that memset clearly gets translated to assembly, right?
09:01
Now, that is when you do minus O0, which is the default, which is what you would do if you're developing code and you want to debug this stuff, right? Now, once you're done developing and you're about to ship this thing, you do minus O1, for example, and the optimizer kicks in, and all of a sudden your assembly
09:20
looks a whole lot different. You'll notice it's shorter. You'll notice that all of a sudden, you know, the color of your memset changed, whereas in O0, it was this sort of red-ish, and all of a sudden it became white, and it's nowhere to be found in your assembly, right? It just does not show up, right?
09:43
That's a problem. Okay, well, what happened, right? Let's, yeah, I stole that. So, what happened is a thing called dead store optimization, or dead store elimination, and so basically that memset at the end,
10:05
what you're doing is you are writing into a buffer that is never, ever going to get used again, and an optimizing compiler looks at that and says, hey, you know, I can just take that memset out, and I just saved you a couple of cycles, and you have a smaller binary. Huge win, and because it doesn't really
10:22
change what the program does, it's fully compliant with all the relevant language standards. Right, so that is, in a nutshell, our problem, right? And so one of the things I want to do is I wanted to look at all common compilers and see
10:41
for which of these I can get it to effectively, practically optimize out a memset like this, and I had to, with some of them it was easy, some of it was harder. I had to fiddle around with it: for some, a straight-up memset works; for others I had to like kind of twiddle and make a for loop or, you know, kind of jump around a bit, but essentially these
11:03
are lists of 10 compilers I tested. I tried to get my hands on the IBM compiler, but I don't have $20,000, so I couldn't do that. But these are all the ones that I did test, and then, so the first five, or the first four you will know, you know, the GCC and Clang and the Intel
11:21
compiler and the Microsoft compiler, and they're all also on the Compiler Explorer, so it was easy to test those, and it was very easy to get those to optimize out memsets. And then I moved on, downloaded a bunch of others, you know, the Oracle Studio compiler and the Embarcadero C++ Builder and the ARM compiler and a
11:42
bunch of others, and out of these 10 I was able to get eight to optimize it out, right? 80 percent of most, of the most common compilers do this in any, in a practical sense, so it isn't just like a theoretical thing. This really happens. A funny note,
12:00
I tried really hard to get the PGI compiler to do it. In fact, it has a switch for dead store elimination, and of course I played with it and I tried it, and I spent over an hour trying to get it to optimize out my memset. Goddamn thing wouldn't move. I don't know what it's doing there. I couldn't get it to do anything.
12:22
But basically, most compilers, if you ask them to do optimization, will gladly optimize out a lot of memsets, right? So the next question is, how common is it to actually see a project's use optimization, right? And this sort of stems from
12:40
a conversation I had earlier this year with a couple of colleagues where a bunch of people said, well, you know, I don't see O2 or O1 or OS all that often. I don't think optimization is all that common. And so I started looking around and I said, okay, well, where can I get some data? And the first thing I was, so I was,
13:00
okay, well, I can go to opensource.apple.com and that lists about 200 projects and I'll just go through all their makefiles and look for -O2 or -Os and so on. And there's about 100 out of 200 there. And then I realized that they actually don't use makefiles. They have a really bizarre build system. And that build system by default uses -Os. So even though it says 100 out of 200, it's probably closer to 200 out of 200.
13:23
And then I had a whole list of programs I wanted to test, like FreeBSD and Ubuntu and a bunch of Linux distros. But that's pretty boring and I ran out of time. So I kind of stopped there. But these numbers should be good enough. In addition, if you look at common IDEs, in particular Visual Studio and Xcode,
13:43
when you tell them to build a project in release mode, Visual Studio by default does -O2. Xcode by default does -Os, right? So the fact that these tools by default give you optimization should make you confident enough in knowing that, yes, in fact, optimization is
14:02
incredibly common in release builds. It isn't everywhere, but it is almost everywhere. Right. So now that we know the problem and now that we know it isn't just theoretical and that we know it's practical and that, in fact, it does occur very, very often, and with most
14:22
compilers, in fact, basically it's a real problem, how do we fix this, right? And this is sort of where things get difficult. There are many sort of solutions. Nothing is portable, right? It's sort of the, okay, well,
14:40
this solution works if you use this compiler, and this solution works if you use this libc, and this solution works if you use this OS, and this solution works with this version of the language spec, and this solution works if you have this particular executable file format, right? And before I dive into any of those, let's first talk about the elephant in the room. Don't just
15:02
roll your own. I've seen people do this where they go like, oh, well, you know, I'll just, I'll fight with the compiler. I know what I'm doing, and they'll just, you know, they'll kind of Leeroy Jenkins this, and they'll totally screw it up, and they'll come up with, you know, some really stupid idea. One of the ones I heard was like,
15:20
oh, well, I'll just, you know what, I'll just do IO with the buffer and then it's cool. Yeah, you could do that, but then you're doing IO, right? For no reason. So don't just roll your own. You're going to come up with a solution that's probably stupid. You're going to look really stupid, and it'll be one of these things where, okay,
15:40
you're sort of, your bad solution might work for this particular version compiler, but if you don't understand the concepts behind it, then chances are the next version of the compiler that is somewhat slightly smarter will sort of just bypass whatever you implemented. So don't roll your own, or if you want to roll your own, at least listen to the advice
16:01
of the next ten, and then base your solution on at least some of the advice that I'll be giving out in the next couple of slides. Right? So with that, let's move on to actual solutions. The first one is a libc function called explicit_bzero, and this is not part of
16:22
any standard, as far as I can tell, at the present time, but this was concocted in May 2014 by the OpenBSD guys. If you'll note the date, it's pretty close to when Heartbleed happened. It's a few months later. I think that may have some relation.
16:43
Anyway, this function basically does a bzero, but explicitly guarantees that it does not get optimized out, and what that means is that whether it does or it doesn't, it's no longer your problem, it's the libc's problem, because they made the guarantee, so now it's on them, right? So that's
17:01
really nice. OpenBSD did it first, and then NetBSD said, you know, that's a great idea and we're going to steal it, but we're going to rename it, though, so they changed the name to explicit_memset, but it's essentially the same thing, and then about two years and change ago, FreeBSD came up, sort of did the same thing, and then almost two years ago, the
17:21
glibc guys came up with this, too, and then dietlibc supports this, too. OS X, however, does not support it. So if you're limited to those platforms, explicit_bzero is a perfect solution. Similarly, if
17:42
you are developing for the Windows world, there is an API called SecureZeroMemory, which basically is Microsoft saying, we guarantee that this thing doesn't get optimized out, and if you want to securely clear sensitive material, just use this API, and as the documentation says, it will ensure that your data
18:00
will be overwritten promptly, and this is one of the cases where Microsoft was ahead of the curve by like 15 years. They've had this thing since the early 2000s. It was in XP and it was in Windows 2003. Both operating systems are no longer supported, but the API is. Okay, so now, there's
18:21
another function called memset_s, and it guarantees it doesn't get optimized out, and it's guaranteed by spec. It's guaranteed by the language spec. It is standardized. It is in C11. It is wonderful. It is great, except it's not great, because even though it's in the standard and it's there, it's in what's
18:40
called the optional Annex K, and if you read the spec, and it's like pages and pages of boring crap, but if you end up reading Annex K, it says optional extension. What does optional extension mean? It means you can be entirely C11 compliant and not offer memset_s. So, it's kind of this,
19:02
as Reverend Lovejoy would say, yes with an if, no with a but. So, if it's there, great. If it's not, it has the potential of being this great portable solution, and then it isn't.
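Plugged into the earlier key-generation sketch, the platform-specific calls described so far would look roughly like this (a sketch; which branch applies depends on libc, OS, and compiler, and the preprocessor checks here are only approximate):

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* ask for Annex K, if the libc offers it */
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

void wipe(void *buf, size_t len)
{
#if defined(_WIN32)
    SecureZeroMemory(buf, len);        /* Windows: guaranteed by Microsoft    */
#elif defined(__STDC_LIB_EXT1__)
    memset_s(buf, len, 0, len);        /* C11 Annex K, only where provided    */
#else
    explicit_bzero(buf, len);          /* OpenBSD, FreeBSD, glibc >= 2.25;
                                          NetBSD calls it explicit_memset()   */
#endif
}
```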
19:21
Right, and then, of course, the sort of obvious choice, but somehow a lot of people seem to miss this, is if you end up doing something with sensitive material, chances are you're using a crypto library, and if you're using crypto library, chances are the crypto library offers you an API to do secure
19:41
memory cleaning, and so I listed the common ones. If you're using OpenSSL, there's OPENSSL_cleanse, and OpenSSL guarantees it doesn't get optimized out. If you use GnuTLS, there's gnutls_memset. Same thing, GnuTLS guarantees it does not get optimized out. If you're using libsodium, which is one of the newer ones,
20:01
they have sodium_memzero. Same thing, they guarantee it doesn't get optimized out. I'll get done in a minute. So, this is basically up until here, I've sort of given you a list of, okay, well, here are specific API functions you can call. If
20:20
you're using this library or using this OS, use this. The next sort of solutions are sort of the, okay, well, what if you can't rely on the APIs? Maybe we can get something out of the compiler, right? The first solution is, and this isn't portable, but most compilers have this or something like it, where you can go to the compiler and say, hey,
20:42
don't use the built-in memset, just use the one from libc, and what that means is you tell the compiler that it shouldn't assume that it knows what memset does. And if you do that, sure enough, memset won't get optimized out.
21:00
GCC, this is originally GCC-specific, and then the Intel compiler supports it, too, and then Clang supports it, too, but up until, and this is true, up until Clang 3.7, which is maybe two years old, it's not that old, Clang basically supported -fno-builtin-memset, and then what they did is they kind of dropped it on the floor, and it got optimized out anyway.
21:22
So, it's kind of annoying. It kind of ruins the whole use this because it works, because except if you're using an older version of Clang, and also, you know, it's not overly portable, but it's a solution that works. Other things might still get optimized out. If you have some kind of
21:41
for loop that clears memory, that might still get optimized out, but at least if you use memset and you're using -fno-builtin-memset, then you have a pretty strong guarantee that it shouldn't get optimized out. So, another sort of solution is, you know,
22:01
just don't use optimization. That works. You're guaranteed not to get code to get optimized if you don't use the optimizer. Obviously, you know, that isn't perfect. For one, Fortify Source doesn't work if you don't use optimization. So, if you want to use Fortify Source, you have to
22:21
use optimization. The other one, of course, is that, yeah, you've got to change your build environment. Okay. I mean, it's not overly, I guess it's not portable, but then again, most compilers will have some way to tell it to not optimize anything. But obviously, the reason you don't want to use this particular solution is because, you know, you
22:41
don't get the optimizer, so your product will probably be slower. Sort of a spinoff of this is some compilers, in particular the Microsoft one, and GCC also kind of supports it, is where you can localize
23:01
optimizations based on scopes and functions. And so, you can say, oh, you know, for this function, do -O0. It's not a commonly used feature. It seems like it might have some side effects and it doesn't support all
23:21
switches that the compiler generally does. I've seen this recommended by a few people. I played around with it. It seems to work, but it doesn't seem to be a sort of commonly adopted way of doing things. The other thing, of course, is again, this is, I mean, these are pragmas.
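For reference, the per-function pragmas being referred to look roughly like this (GCC first, the Microsoft compiler second; exact behavior and which switches they honor varies by compiler version):

```c
#include <string.h>

/* GCC and compatible compilers: turn the optimizer off for one function */
#pragma GCC push_options
#pragma GCC optimize ("O0")
void wipe_unoptimized(void *buf, size_t len)
{
    memset(buf, 0, len);
}
#pragma GCC pop_options

/* The Microsoft compiler: disable optimizations until re-enabled */
#pragma optimize("", off)
void wipe_unoptimized_msvc(void *buf, size_t len)
{
    memset(buf, 0, len);
}
#pragma optimize("", on)
```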
23:41
This is very, very compiler-specific stuff. Another solution is using what's called weak symbols. Anybody familiar with weak symbols? Okay, a little bit. I'll try to sort of very briefly
24:00
get into this. So the ELF file format basically is a format that specifies how to have an executable that can run on OSs that support this file format. So ELFs, if it's compiled, for example, for Linux, it'll be in this particular format. And one of the, obviously, as part of the format, you can store symbols for things like functions
24:21
and variables and so on. And generally, a symbol is what's called a strong symbol. You can mark one as weak, and what weak means is that a symbol may change in runtime. And what that means is, if you declare a function,
24:41
or a symbol of a function is weak, that means that compile time, the compiler would have a very hard time to reason about what that thing does. Because of the sheer fact that you've declared it as weak. And in fact,
25:00
this particular solution is what OpenBSD uses in their implementation of explicit_bzero. And what I really like about this is that this is the commit message for the OpenBSD guys, and they're very pragmatic about this. They say, well, you know, we think our solution is pretty clever,
25:20
but it's not foolproof. There are still two ways to defeat this, and they list a bunch of ways to do this. In particular, well, the compiler could emit runtime code that checks what this thing is in runtime right before it's called. And then you could still optimize it out if the thing matches or doesn't match.
25:43
But then they go on and say, well, in the foreseeable future, we don't think that's going to happen. But it's possible that at some point down the road, this might happen. And so I like their way of reasoning about this, where the solution's pretty clever, but it's not foolproof. It may at some point in the future break.
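A sketch of the weak-symbol shape being described, loosely modeled on how OpenBSD's explicit_bzero works (the names here are illustrative, not the actual OpenBSD source):

```c
#include <string.h>

/* A weak, do-nothing hook. Because it is weak, it could be replaced at link
   or run time, so the compiler cannot prove the buffer is unused after the
   call, and therefore cannot delete the memset that precedes it. */
__attribute__((weak)) void explicit_wipe_hook(void *buf, size_t len)
{
    (void)buf;
    (void)len;
}

void explicit_wipe(void *buf, size_t len)
{
    memset(buf, 0, len);
    explicit_wipe_hook(buf, len);
}
```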
26:01
But at the present time, it's a fairly good solution, I think. Right. So another solution is to use memory barriers. How many people know what a memory barrier is or what it does? OK. About the same number of hands. OK.
26:20
Let me try to very briefly explain what a memory barrier is. And bear with me here, because it's and I'm going to oversimplify because it's not a particularly simple concept, if you've never heard of it. But OK. Let's say you have a piece of code, two global variables, A and B, and you assign a value. You say A equals something
26:41
and B equals something, right? And there's no relation between A and B, right? What that means is, both the compiler and the hardware, because they have no relation, the both of them are allowed to reorder it. So B can be assigned first, and then A can be assigned later, because there's no correlation. That's perfectly valid. Now let's say you have
27:00
a second thread somewhere, and your second thread says, OK, while not B spins, and then once B is set, you use A, right? This is sort of where you're basically waiting on something to be set, and the idea is that you wrote your code so that B is set after A is set.
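A sketch of the pattern being described, in its deliberately broken form with no barrier (A and B are the two globals from the explanation; the helpers are hypothetical):

```c
/* hypothetical helpers, only here so the sketch is complete */
int  compute_a(void);
void use_a(int value);

int A, B;   /* two globals with no relation the compiler or CPU can see */

void writer(void)
{
    A = compute_a();
    B = 1;              /* "A is ready" flag; nothing here stops the compiler
                           or the hardware from storing B before A */
}

void reader(void)
{
    while (!B)          /* spin until the flag looks set */
        ;
    use_a(A);           /* may still observe a stale A if the stores or loads
                           were reordered */
}
```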
27:20
And that seems logical, and that would work, except if the compiler and hardware don't know there's a relation between your loop on one end and the assignment on the other, and either the hardware or the compiler reorders it, and sets B before it sets A, really, really nasty things happen. And this has been the source
27:40
of numerous security bugs. Very, very subtle stuff. The KASAN and TSAN kind of stuff we've seen in the Linux kernel the last couple of years, a bunch of that is related to these kinds of bugs. And so the way that you fix this is you introduce what's called a memory barrier,
28:01
and that is basically, when you write your code and you say A equals something, and then before you say B equals something, you basically, in the middle say memory barrier, and then you say B equals something. And what that means is, it gives a signal to the hardware and to the compiler, and it says, whatever happens before this and after this, you are not allowed to reorder this.
28:21
There is a correlation there that I know that you're not aware of, so don't reorder it. And I hope I explained it well. This usually takes a lot longer to explain. I hope I got the message across well enough to sort of give you an idea of what a memory barrier is. And now the cool thing about memory barriers is that
28:42
it's a way for a programmer to tell the hardware or the compiler, I know something about this memory, you don't, stay away, don't touch. And because of, I mean it works for reordering, but it also works really well to not get something optimized out, right? The idea is you could basically
29:01
just do your memset and then on the thing you memset it, you basically do a memory barrier, and that tells the compiler not to optimize it out. And I know the concept sounds complicated, but it's pretty clever, and I've oversimplified this because it's a relatively complicated subject.
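The memset-plus-barrier idea looks roughly like this (a sketch in the spirit of what the libcs mentioned below do; the asm statement is the usual GCC/Clang-style compiler barrier):

```c
#include <string.h>

void wipe_with_barrier(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* compiler-level memory barrier: the buffer is handed to an opaque asm
       statement that claims to touch memory, so the compiler can no longer
       prove the memset is a dead store and has to keep it */
    __asm__ __volatile__("" : : "r"(buf) : "memory");
}
```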
29:24
But this works really well, and this is used by dietlibc, and it's used by glibc, and nginx recently had a fix where they have their own explicit memzero, and it also uses a memory barrier. So this is a tried and tested concept,
29:42
and it works. So those are kind of the solutions that are known and that work, and have been tried and tested by various fairly well-known pieces of software. Somehow you're in an environment somewhere and none of this is available to you,
30:02
or it's not portable enough, and you're looking for a solution that works everywhere, the best you can do is fall back on constructs that are known in the C language, and this is basically use of the volatile keyword. I call this a fallback. People often go, well, just use volatile, and then that solves the problem.
30:21
And it turns out optimizers can be very clever and very tricky, and even when you use volatile, there are cases that can be made where if the optimizer is clever enough, your data may still get optimized out. So the volatile solutions are sort of best effort fallback solutions,
30:42
and there's sort of two variants of this. One is a volatile pointer write, and that's the fallback solution in libsodium, and the other one is a volatile memset function pointer, which is what OpenSSL uses. And here's what that looks like. This is the libsodium fallback, and this looks like, you know,
31:00
volatiles for the, you know, you tell the compiler, hey, you know, I know something about this. You don't. And this is where it gets very language lawyery. If you look at the spec, it says something along the lines of the access object something something, and they're describing the actual memory volatile,
31:23
not just the pointer lvalue. And so if the compiler looks at this code, and it can trace, and it can prove that wherever pnt came from, and if that isn't actually volatile, then this volatile doesn't really mean all that much, and it can still optimize it out.
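In code, the volatile-pointer-write fallback is roughly this (a sketch of the pattern, not libsodium's exact source):

```c
#include <stddef.h>

void wipe_volatile_ptr(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;

    while (len--)
        *p++ = 0;   /* every store goes through a volatile lvalue, which the
                       compiler will normally refuse to throw away */
}
```

Whether a sufficiently clever compiler could still argue its way past this, because the underlying object itself is not volatile, is exactly the language-lawyer question above.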
31:40
That sounds very theoretical, and I don't know if that actually happens, but a number of people smarter than me or that know more about the sort of this nitty-gritty little C language things have told me that yes, in fact, you would be allowed to do that if you're a very smart optimizing compiler. The fact that it's a fallback solution for sodium and a few others
32:01
leads me to believe that it probably doesn't, but it could. Right, and so this is the solution which also uses volatile, doesn't do a pointer write, but instead it creates a volatile function pointer that points to memset, and it sort of gives you, you get the same concept more or less as with the weak symbols,
32:22
the idea being is that your volatile function pointer can change at any time without the compiler knowing about it, and that seems like a pretty good solution, except when you, one way of in theory getting around this is if the compiler emits runtime code
32:43
that right before the function gets called, it looks and goes, well, like it captures it and then goes, is it memset or is it something else? If it's something else, then we call it. If it's memset, we just return, and then you optimize out the runtime and save a few cycles.
33:01
In theory, the compiler is allowed to do that and emit code like that. I don't know if that actually happens anywhere, but it's a possibility. So think of these last two solutions as a fallback. They may not work in theory. In reality, they probably do.
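And the function-pointer variant looks roughly like this (a sketch of the pattern OpenSSL-style code uses, not OpenSSL's actual source):

```c
#include <string.h>

/* A volatile function pointer to memset: the compiler has to assume it may
   point at some other function by the time the call happens, so it cannot
   treat the call as an ordinary memset and delete it. */
static void *(*volatile memset_fn)(void *, int, size_t) = memset;

void wipe_volatile_fnptr(void *buf, size_t len)
{
    memset_fn(buf, 0, len);
}
```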
33:22
Right. So this is sort of the first half of my presentation, and I'm perfectly on time, which is great. So now there isn't one portable solution, and this is why clearing memory is hard, right? The problem is well understood, but if you're looking for an all-around solution
33:42
that works everywhere, regardless of compilers and operating systems and so on, it's very hard to have a good solution. And this is what customers have come back to me and said, give me a portable solution. I need something better than this or that. And so the best solution I have for this is
34:01
sort of apply all of the above as best as possible. And my initial idea was, I'll just write a little function that does this and put it on GitHub, and people can use it. But if you look at libsodium, and then, yeah, you see, I mean, I'm not going to click on it now, but that's a link, and if you download the slides later,
34:21
you'll see it points to GitHub, and it shows you the actual implementation. libsodium's memzero is really well written, and it's beautiful, and it sort of has this fairly elegant, if this, this, and this, or ifdef this, then do this solution. Elif, you know, this particular setup,
34:42
then do this solution. And it has this for six or seven of the cases I've covered. It's really nice. It's really elegant. If you're looking for inspiration, point people to libsodium. I think it's a good portable-ish way of solving this problem.
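A sketch of the overall shape of such a dispatcher, NOT libsodium's actual source; the HAVE_* macros stand in for whatever a configure step would detect:

```c
#include <string.h>

#ifdef HAVE_WEAK_SYMBOLS
void secure_memzero_hook(void *buf, size_t len);   /* weak no-op, as earlier */
#endif

void secure_memzero(void *buf, size_t len)
{
#if defined(HAVE_EXPLICIT_BZERO)
    explicit_bzero(buf, len);              /* a libc that guarantees the wipe */
#elif defined(HAVE_MEMSET_S)
    memset_s(buf, len, 0, len);            /* C11 Annex K, where provided     */
#elif defined(HAVE_WEAK_SYMBOLS)
    memset(buf, 0, len);
    secure_memzero_hook(buf, len);         /* weak-symbol trick               */
#elif defined(__GNUC__)
    memset(buf, 0, len);
    __asm__ __volatile__("" : : "r"(buf) : "memory");   /* compiler barrier   */
#else
    volatile unsigned char *p = (volatile unsigned char *)buf;
    while (len--)
        *p++ = 0;                          /* last-resort volatile writes     */
#endif
}
```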
35:00
Right. Okay, so now we've talked about the problem. We've talked some solutions. Okay. Well, I want detection. When does this really happen? I want to see this, right? And I want compilers to tell me this. Like, why doesn't GCC tell me it's not writing something out?
35:22
Like, if it has security consequences, it should tell me. I want, why are they not doing this? I don't understand. So I set out, and I modified GCC. I looked at the dead store elimination, and I came up with this patch. And instead of, this is their tree-SSA, that's their dead store elimination pass.
35:44
And when it calls delete dead call, I sort of take that out and say, if it's a built-in memset, before you call delete call, emit this warning, tell me the file, and tell me the line number.
36:04
And then you do still optimize it out. And what this means is, every time a memset gets optimized out, GCC now tells me. And this is very interesting, because I not only get detection for my own code,
36:20
this is a great way to get really cheap, fast, zero-day. And in fact, that's what I did. I downloaded a whole bunch of very well-known open source projects, and I ran them through a modified version of GCC, and I came up with a list of things.
36:41
Oh, awesome, thank you. So I know of this particular problem, it practically affecting OpenSSL, MIT Kerberos, Heimdal Kerberos, MatrixSSL, PHP, DHCP, BIND, Squid cache,
37:03
and the list goes on. I have Rsync as well, and there's more. So we know this problem is very widespread. If the stuff we all rely on, the stuff that is built on, has these problems, that means your code probably has this as well.
37:21
And of course, I'm just giving names out here. But let's give you guys some zero-day. That's MIT Kerberos. That memset's optimized out. That's PHP. That memset's optimized out. This, I think, is MatrixSSL.
37:43
That decrypted plaintext gets memsetted, that gets optimized out. That lingers around in memory. This is OpenSSL. That crypto ex-data, that memset gets optimized out. This is nginx. That password, that memzero,
38:02
gets optimized out. This is Bind and DHCP. That memset of private key data gets optimized out. This is Squid, that it goes to LDAP and gets creds.
38:23
And basically, it tries to clear the creds, and then that gets optimized out. Yeah, well, I had to play around with PowerPoint. A little bit. Same thing. This is a key that gets optimized out.
38:41
And then this is rsync. And these are stored credentials in a file. And that gets into memory, and those memsets get optimized out. So that's nine bugs right there. And all it took was five lines of code change in GCC. And GCC just gave me all these bugs.
39:02
The other thing about seeing exactly what... Thank you. The other thing that was really nice about getting the data back from GCC isn't just that it gave me bugs.
39:22
It also showed me things that got optimized out that I thought wouldn't. Obviously, what I was expecting is, you know, a variable that's about to go out of scope and that you memset, that would get optimized out, obviously. But what I also noticed was that,
39:41
you know, when you... There's a common code pattern when you just malloc something or you declare something on a stack, and the first thing you do is memset to clear the whole thing, and then you move on. It turns out that in a number of cases, that also gets optimized out. The idea is that if the compiler...
40:01
It only gets optimized out if the compiler can prove that every element in the struct or that the whole field gets filled in. I'm not sure if that's entirely true. And we were talking about this earlier. But what about, you know, things like structure padding,
40:22
or maybe enums, or, you know, something like that. Or, you know, unions, I mean. That I'm not quite sure how that works. I haven't dug into it, because I found this. I mean, I wrote this patch yesterday, right?
40:40
These bugs are like... They're fresh. So I don't know exactly how much potential there is here, but it smells like there's room for bugs here. So I was surprised to see this, and some more research is needed. So if anybody wants to, feel free.
41:01
And then the other thing I sort of noticed is that, obviously, the common case, what I was looking for in terms of bugs, was something that was sensitive material. And then obviously, I saw a whole bunch of things where non-sensitive material was being memsetted and then freed, and then that memset would get optimized out as well.
41:20
And that struck me as odd at the beginning. But then obviously, there's a common sort of coding pattern that I've seen, where, you know, anytime somebody does a malloc and then, before using it, memsets it to zero, or when they free, right before, they do a memset zero,
41:40
not because they want to clear sensitive material, but because they always... They want to have a guarantee they always start from a clean slate, and so they end up building code that ends up working for something that always has a clean slate, right? And if that memset gets optimized out, then those guarantees no longer hold. And so code that works around the sort of,
42:03
well, we always have a clean slate when we get a fresh piece of memory, that is no longer true. And so I think that coding pattern doesn't jive well with compiler optimization. Again, this is the sort of realization I made yesterday. I don't have all the facts on this yet,
42:22
but it seemed interesting, and I think there's some room for research here. The other thing I noticed is that sort of close to the memsets that were optimized out, I noticed, like, other bugs. And this is kind of like, you know, it made me think,
42:41
it's like, well, bad code attracts other bad code. So one of them was NULL derefs, and the other was the use-after-free, where, you know, instead of doing memset-then-free, the code was doing free-then-memset, right? And then, obviously, use-after-free. So this is the case of the NULL deref, where basically a malloc happens,
43:02
and then the memset to zero happens, and then there's a check to see if the variable that was allocated is NULL, right? Obviously, if the value is NULL, that memset would cause a NULL deref, except the memset gets optimized out, so it never generates the NULL deref.
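Reconstructed, the broken construct looks roughly like this (illustrative, not the actual code that was found):

```c
#include <stdlib.h>
#include <string.h>

struct ctx { int state; };

struct ctx *ctx_new(void)
{
    struct ctx *c = malloc(sizeof(*c));
    memset(c, 0, sizeof(*c));   /* if malloc returned NULL this is already a
                                   NULL deref (unless the store is deleted)... */
    if (c == NULL)              /* ...so this check comes too late             */
        return NULL;
    return c;
}
```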
43:21
I kind of hit catch-22 there. But that construct is clearly broken. So now, I mostly spoke about memset, but really, there's a thousand variations
43:40
that clear memory and sort of comes down to the same thing, right? Obviously, you can do a for-loop, or you can use other APIs, or you can, like, roll your own and do something very exotic. And then there's C++, which there's a gazillion ways of doing it. You can have weird classes with a constructor
44:03
and inheritance, multiple objects, virtual... It gets really crazy once C++ comes in the mix. And so, basically, it all kind of looks different, but it all does the same thing. It has the same root causes and the same problem. And the thing is,
44:21
when you look at this from just a code perspective, is that sometimes the optimizer is smart enough to see it. Sometimes, the optimizer is not, but it could, in the future, be smart enough to see it. So it's one of these things where, if you're looking at a piece of code, and you're doing some kind of security assessment,
44:43
and you see this, and, like, should you report the bug, should you not? I think you should, because even if the compiler doesn't optimize it out today, it may very well optimize it out tomorrow. Right. Okay.
45:01
Now, all this talk has been about C, and when you write it in C, well, what if you're not writing C code? What if you're using other languages, non-native languages? You know, any Go, Rust, Objective-C, C-sharp,
45:21
Java, flavor of the month. And really, I wanted to spend more time on this, but then my slides would have gotten so long I couldn't. So I only have one slide on non-C, but I spent a little bit of time on this. In C-sharp, there's something called SecureString, which is supposed to hold a string in a safe way,
45:44
and the problem I have with SecureString isn't the implementation. It's how do I get something securely into SecureString, and how do I get something securely out of it? And then, in terms of Java, there's a Java crypto guide, which basically says, it recommends not using strings to hold sensitive material,
46:02
but to use a binary instead. And there was some reason behind it, I don't remember. But the idea is basically this. Most managed languages don't really offer any decent way to clear memory,
46:21
or to hold sensitive material in memory without it leaking. And it will leak, and it will kind of happen behind your back, because you wouldn't know it leaked. Especially when you're dealing with garbage collection, where something can get reallocated without you ever knowing,
46:41
and all of a sudden, before you know it, there's like five different copies of your key sprayed all over memory. Most of these languages, as far as I can tell, they don't have the infrastructure in place to deal with sensitive material. It seems to be entirely missing in a lot of places. In other places, it's kind of like, you know,
47:02
shoehorned on or bolted on with some varying degrees of success. I remember seeing there were some packages for Go, but they had been through revision three or four, because there was always something wrong with it. And so, again, I wish I had more time so I could elaborate on this.
47:21
From what I saw is that it's a pretty sad state of affairs in non-C, that most people haven't tried, and those who have, have not tried hard enough. So, now that I've sort of run through all of the sort of, you know,
47:44
related issues, sort of, you know, memset problems and how to clear up memory, I want to sort of talk about some related issues. First of all, what I said initially is that, you know,
48:01
when people make this step and go, oh, well, I should clear this memory because it's sensitive, I said that's huge because it really is. Most code doesn't even try. There's an unbelievable amount of code that just keeps keys in memory and sensitive material, and it goes out of scope, and it never gets cleared,
48:22
and it just ends up lingering on the stack or the heap. I mean, often it would get overwritten fast, but sometimes it could linger around and sit there for a very, very long time. Problem with this is that it's hard to find in any kind of automated fashion, because you're looking for the absence of something, right?
48:44
So that means the only way you can really find these kinds of bugs is to manually look at it and go, oh, this is sensitive material, no effort was made to clear this. A second related issue, and this is really cute actually, so when you call memset the way it's done there, that's wrong.
49:06
The length and the byte you want to memset with are transposed. The zero should be the second argument, and len should be the third argument. So what that really does is a no op. It basically says memset, and then you use len as the pattern,
49:21
and then the pattern is used as the length. In this case, the pattern is zero. So a memset of zero, which becomes a no op. And what's really cool is that GCC actually had, you can tell GCC to emit a warning for this. And one of the things I wanted to do was I wanted to run through all the same code I had
49:42
tested before, and enabled the warning, and then sort of showed out a list of bugs, but I kind of ran out of time. But I strongly suspect if you use this warning and you go download a whole bunch of well-known open source code and you run it through, you'll end up with a very similar list of bugs.
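The transposed-argument bug looks like this (and, as far as I know, GCC's -Wmemset-transposed-args is the warning being referred to):

```c
#include <string.h>

void clear_buf(unsigned char *buf, size_t len)
{
    memset(buf, len, 0);   /* WRONG: fills zero bytes with the value 'len',
                              so it is a no-op; GCC's -Wmemset-transposed-args
                              catches the common forms of this mistake        */
    memset(buf, 0, len);   /* correct argument order                          */
}
```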
50:06
So another list of sort of related bugs, related not in the sense of clearing secrets, but related in the sense that optimization was involved. That is to say, if the optimizer wasn't turned on, a security bug wouldn't have occurred, right?
50:22
Or would have been less severe, right? And so there are sort of three cases of bugs I sort of ran into that I think is somewhat relevant. That I sort of wanted to talk about. So the first one is what's called pointer overflow. It turns out, let's say you have this code, PTR,
50:41
and then there's a len and the len is untrusted, right? The idea is that you want to validate that PTR plus len isn't beyond the end of your buffer. But before you do that, you also want to make sure that PTR plus len doesn't overflow, right? And so you would do code like PTR plus len is smaller than PTR. If that's the case, then the pointer overflowed and you bail out, right?
51:03
Problem is, according to the C standard, pointer overflow can't happen and so that is undefined behavior. And the optimizer sees that and goes, oh, undefined behavior, optimized out, gone. So your bound check just got optimized out. That is a relatively common bug to see.
51:22
If you don't know it's undefined behavior, if you don't know the compiler can optimize it out, you would just read over it. But once you know, you start reading code and you'll see it everywhere. Also, the way to fix this is basically to cast your pointer to an integer type that's big enough to hold a pointer. And all of a sudden, the optimizer can no longer optimize it out.
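In code, the broken check and the integer-cast fix look roughly like this (a sketch; uintptr_t is the usual choice for "an integer type big enough to hold a pointer"):

```c
#include <stddef.h>
#include <stdint.h>

/* returns nonzero if [ptr, ptr + len) stays inside [ptr, end) */
int range_ok(const unsigned char *ptr, const unsigned char *end, size_t len)
{
    /* broken: pointer arithmetic that wraps is undefined behavior, so the
       compiler may assume it cannot happen and delete this whole check */
    if (ptr + len < ptr)
        return 0;

    /* better: the same comparison on uintptr_t, where wrap-around is well
       defined, so the check survives optimization */
    if ((uintptr_t)ptr + len < (uintptr_t)ptr)
        return 0;

    return ptr + len <= end;
}
```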
51:44
And it'll be in your code. So that's the first one. The second one is a lot more subtle. This has to do with switch case optimization. So when you have a switch in C and you translate it to assembly,
52:01
you could just do like a one-one sort of translation. But if you use the optimizer, one of two things will happen. Either it'll generate a binary tree. That's generally observed in the Microsoft compiler. If you look at GCC and Clang, what they do is they'll create a jump table.
52:24
What that means is they'll look at the value to compare. And then if they compare a certain number, they get the value again. And they use that value as an offset in a jump table. And that usually isn't a problem. And that's an abstraction.
52:41
And most people don't care about it in most situations. Except if you're dealing with a shared memory trust boundary, because all of a sudden there is a subtle double fetch that was emitted by the compiler behind your back that doesn't show up in the actual C code. And these are situations like a hypervisor trust boundary, right?
53:01
These are very, very strong trust boundaries. And that thing is actually a link if you look. And once I publish the slides, if you click on it, it links to a blog post that shows just that. It's a VirtualBox guest-to-host privilege escalation because of a switch-case jump-table optimization.
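The vulnerable shape is roughly this (a sketch; 'req' stands for a word in shared memory that the other, untrusted side can modify at any time):

```c
/* 'req' points into shared memory that the other, untrusted side can rewrite
   at any moment (deliberately not volatile here, which is typical of such
   bugs) */
void dispatch(unsigned int *req)
{
    switch (*req) {   /* when lowered to a jump table, this can become TWO
                         loads of *req: one for the range check and a second
                         one as the table index, i.e. a double fetch across
                         the trust boundary */
    case 0: /* handle A */ break;
    case 1: /* handle B */ break;
    case 2: /* handle C */ break;
    default: break;
    }
    /* the defensive version copies *req into a local exactly once and
       switches on that local */
}
```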
53:24
And so the bug there basically is that when you fetch the first time, you do the compare, it's fine. And then two instructions later, you fetch again to do the jump table. Between this first and second fetch, the guy on the other end of the shared memory could change it. And all of a sudden, you can jump outside of your jump table
53:42
and basically cause an arbitrary jump, which obviously is bad. Yeah, I'm not going to cover this. Okay, I got to wrap it up. I got three more slides and then we can get to questions. This is actually very important.
54:01
So now, everything I've covered, we're good. We know what the problem is. We know what the solutions are. We know there's real world problems. But we have a good grasp of it now, right? Okay, turns out I kind of lied to you. This is not the whole problem. If it was the whole problem, that'd be great.
54:21
We know how to fix that, more or less. It turns out that compiler optimization is really, really clever. And it does many, many clever things. And they're all very subtle and a lot of them are architecture specific. And here's a scenario of things that can occur, right? The optimizer will do things like,
54:41
oh, well, you're handling this string of a certain kind. You know what we'll do? I'll just shove it into a bunch of registers so it'll be faster. And then, you know, it passes something and then all of a sudden, the optimizer goes, oh, you know what? You don't have enough registers. It's okay. What I'll do is I'll take whatever's in the register and I'll dump it on the stack.
55:01
And then we'll go from there. And all of a sudden, what happened is you leak key material in registers and then you leak them on the stack. And this stuff happens throughout. And secrets leak out. It just happens. And it is because of optimization. Even if you try really, really hard
55:23
to do the right thing, the optimizer just, they just screw you, right? And this problem is echoed in a blog post by Colin Percival, who used to be the security guy for FreeBSD, now has, I think, a cloud storage company.
55:41
Very, very smart security guy. And I would recommend reading that blog post. This problem is also echoed in the Linux man page for explicit_bzero. And the man page basically says, well, yes, this is a fundamental problem. We still recommend you use explicit_bzero.
56:01
And our hope, our thought process is that towards the future, we will have a way to get the compilers to not do this, and we can move on. But at the present time, there's no good fix for this. And so this is sort of the first statement I have that I want to make, is that at the present time,
56:22
optimizing compilers and cryptography are mutually exclusive. You can have one or the other. You cannot have both. It does not work. At the present time. Okay, before I get to my conclusion, I want to rant a bit about optimization, as if I haven't already.
56:43
But basically, I mean, I get that optimization is great, and it gives you all these things, and things get faster. But I have a real problem with optimization, because, look, if you're a developer, and you write code, and you're pretty smart, you can reason about your code, because you wrote it, and you know what it does.
57:02
But if you then compile it, and it's gone through an optimization pass, you can no longer reason about it, because you don't know what the optimizer did, right? And that is, I think, a fundamental problem. And sort of what I want people to sort of think about
57:20
is whenever they do -O, and I'm not saying you shouldn't use the optimizer, because it has many pros, but don't be like blasé about it, right? Before you type -O, really think what it means to do that,
57:40
because it will introduce all sorts of things that you weren't sure about, or that subtly changes the meaning of something. And what I really want, for the compiler people, and maybe the language people to implement this, so the compiler people can do it, is I want strong accountability and control of the optimizer, right?
58:01
That is, if I compile something, I want to be able to go to the compiler and say, hey, before you do anything, I want you to give me a detailed list of everything you're about to do in terms of optimization, so that I can take that list and look at my code, and then with that list in my code,
58:20
I can now reason again about what the binary is going to do. Without having that kind of accountability, you can't reason about your binary. And the other thing is control, right? I want to have fine-grained control over optimization. What I mean by that is, you know, like the localized stuff that I mentioned before,
58:40
that the Microsoft compiler, for example, has, where I want to be able to go and say, this particular scope don't optimize, or this particular scope don't do this particular optimization. I would like to see something like that. I'll just skip this. So okay, here's my conclusion, right? We know what the problem is,
59:00
the original problem, and then I have some solutions, and then in retrospect, they're just partial solutions, but they're still kind of solutions. Okay, and then I also have a call to action of things I think should happen, and hopefully will happen at some point in the future.
59:22
The problem, as I've illustrated, I think is rampant. And so basically, I would like people to use that GCC patch that I've shown, or create a better one, and you know, go find some bugs, or better yet, go fix some bugs.
59:43
In terms of compilers, as I just mentioned, what I want is optimization, accountability, and control, right? It's kind of the wild, wild west in terms of optimization, where the compilers just go and do all these things, and I mean, we have some flags, but they're...
01:00:00
There's not enough control and there's not enough accountability. There's not enough transparency. You just don't know what's going on. There is, like, a dump functionality, but it's like a needle in a haystack. You want something that's easy to work with, easy to read or easy to parse, and that tells you exactly
01:00:20
all the optimization steps you're doing. And ideally, I'd like the language people to get involved and to standardize on this, because if they do, we can now demand this of all the compilers. And then lastly, sort of, you know, coming back to my non-C point: what about the non-C, Ruby, Python, Perl, Go, Rust, and so on?
01:00:45
It smells bad. It looks bad. Especially when the runtime is involved. This is probably worthy of a presentation of its own, or multiple presentations. But I wish I had done more there.
01:01:02
That is essentially it. I hope you enjoyed it.