Video in TIB AV-Portal: Memsad

Formal Metadata

Why clearing memory is hard.
Title of Series
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
This presentation starts off with a simple problem: how do you clear memory that holds sensitive content? It explores numerous possible solutions and presents real-life facts and figures. Bugs in common applications will be shown.
Keywords Security

Related Material

Video is cited by the following resource
[Music] Memsad: that's the title of the next talk, "Why clearing memory is hard". Ilja van Sprundel is a security researcher who loves to find out new things, and he found out that it's quite hard to get rid of sensitive content in memory. So today he's going to give us an overview presentation. Please give him a warm round of applause.
Perfect. So, as the Herald just explained, my presentation is called "Memsad: why clearing memory is hard". Before I dive in:
once upon a time, that was me. A lot more hair, a lot less fat. (You're pot-kettle here, buddy.) This is my 17th Congress in a row. I've spoken here a number of times before. I haven't added 35 in there yet, but obviously that should be there as well.
I work for a company called IOActive. I am the director of penetration testing, but that really just means that I lead teams of pen testers.
No double entendre there. Obviously we're looking for people, so if you're interested, come talk to me afterwards. I like looking at low-level stuff: kernels, drivers, hypervisors, that type of stuff. And I enjoy reading code. Okay, enough about me.
What's the kind of audience that I think would enjoy this? Pretty all-around: security people, crypto people, people who like code review, people who like compiler stuff, and if you're just generally curious about technology, I think you might enjoy this. In terms of the knowledge required, the first half of these slides is relatively basic, so if you have a basic technology understanding you should be able to follow most of the first half. If you have some C background, that would be nice.
As I move past the first half, things become a bit more advanced, but even if you only grasp the first half, I think that will still be useful. Right, so what
is this talk about? Basically, it's one very simple, easy-to-explain crypto implementation problem. The reason I've dedicated an entire talk to it is that while the problem is easy, the solution is not. There are a lot of moving parts, a lot of subtlety, a lot of nuance, and I'll get into that in a little bit. Now, I can hear some of you
thinking: "Well, Ilja, WTF, it's 2018, why the hell are you talking about this? This is very, very well known." To that I say: because this stuff is still everywhere, and I will show that in the slides. I will show it with data, and I will show it with bugs. But the driver of this talk, the reason I started making this presentation, is that this year alone I did engagements for three different customers, on three entirely different software projects, and they all had this exact same type of bug. You tell them about the bug, and the customer comes back and says, "Okay, great, we understand. Now tell us how to fix this in a portable way." And that is not very easy.
The other thing is that even though this problem is known conceptually, and people are kind of blasé about it, in practice not many people understand how pervasive and how realistic it is. It isn't just "oh well, the compiler might do this." No, the compiler will do this, and it does it everywhere. These bugs show up everywhere. It's hard to tell from the code that the bug is there, but if you look at the binary, at what the compiler emits, you see that it is. And the third reason is, given that
one of the themes of the Congress this year is foundations, which is about talking about things that aren't necessarily new but helping bring the subject to the next generation, I think this fits in perfectly with the concept of foundations. Right, so before I
dive in, there's a couple of people, or actually a long list of people, that sort
of helped me out. As I said, the problem is well known. It's been well known for at least twenty or thirty years, so many, many people have published papers and presentations about this. I don't know all of them personally, and I wish I could include them all, but the people I've included here are the ones one or two steps away from me who have had some kind of impact on these slides. Some of you are sitting in this audience, and your help has been appreciated. Okay, so let's actually start now. Let's say you're going to write some piece of code, and it's going to be handling sensitive data, be that keys, or decrypted plaintext, or session tokens, or passwords, or password hashes, or anything else that could be considered sensitive. If you're a smart, security-conscious person, the moment you are done with that sensitive data you want to dispose of it. You want to purge it from memory. Why do you want to do this? Because otherwise, if some kind of info leak is discovered later on, whatever secrets are lingering in memory could be exposed through that info leak, and all of a sudden your tokens or your keys are leaked out. That may sound like a stretch, but things like Heartbleed do happen. So this is very practical, this can
really happen. Before I move on, I would say: if you make the step where you say, "Okay, I need to dispose of sensitive material once I'm done with it," that's really big. Most software that deals with sensitive material does not do this. So if you make this step, thinking "I need to purge this," you're ahead of the curve. So now, concretely, it would look something like
this. This is a piece of sample code I have that generates a little key. It's a function, and you give it a key pointer. It declares a local byte buffer k, goes and reads a bunch of random bits, puts them in k, copies k to key, and then, before it returns, because k is about to go out of scope and it contains sensitive key material, you call memset and clear the thing, and then you return. Right? Perfect.
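For readers without the slide, here is a minimal sketch of the kind of function being described. Only the overall shape (a local buffer `k`, a copy into `key`, a `memset`, then return) comes from the talk; the buffer size `KEY_LEN`, the helper `read_random`, the use of `/dev/urandom`, and the `int` return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define KEY_LEN 32 /* assumed size; the slide may use a different one */

/* Hypothetical helper: fill buf with len random bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure. */
static int read_random(unsigned char *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    /* Unbuffered, so stdio does not keep extra random bytes in its own buffer. */
    setvbuf(f, NULL, _IONBF, 0);
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}

/* Generate a key into the caller's buffer (at least KEY_LEN bytes). */
int generate_key(unsigned char *key)
{
    unsigned char k[KEY_LEN];
    assert(key != NULL);

    int rc = read_random(k, sizeof k);
    if (rc == 0)
        memcpy(key, k, sizeof k);

    /* Intended scrub of the local copy. k is never read after this point,
     * so an optimizing compiler may treat this call as a dead store. */
    memset(k, 0, sizeof k);
    return rc;
}
```

Compiled without optimization, the `memset` call would normally be expected to appear in the generated assembly as written; the rest of the talk is about what happens once the optimizer is switched on.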
You run this, you compile it, you add a main, and it does exactly what it's supposed to do. You look at the assembly and it's all perfect. The problem is that what you're doing there is not release-build code. When you make code ready to be released, you tell the compiler that it should enable the optimizer. You'll give it -O2 or -Os, those are the most common ones. Sometimes you see -O3 when people want to live on the edge, but usually it's -O2 or -Os. Now if you look at the assembly again, you get a whole different picture of what's going on, and I want to illustrate this. There's a website called Compiler Explorer, which is beautiful. It integrates a whole bunch of compilers,
and on the left it shows you the C code and on the right it shows you the assembly. It's color-coded, so it's easy to make the connections. So let's take our little example. On the left we see
generate_key, and on the right we see the compiler output. Sure enough, if you follow the colors left and right, you can see which C code translates to which assembly. To make it a little bit easier: that memset clearly gets translated to assembly. Now, that is with -O0, which is the default, and which is what you would use while you're developing code and want to debug this stuff. Once you're done developing and you're about to ship this thing, you use -O1, for example, the optimizer kicks in, and all of a sudden your assembly looks a whole lot different. You'll notice it's shorter. You'll notice that the color of your memset changed: at -O0 it was reddish, and all of a sudden it became white, and it's nowhere to be found in your assembly. It just does not show up. That's the problem. Okay, well, what happened?
(I stole that.) So what happened is a thing called dead store optimization, or dead store elimination. Basically, that memset at the end is writing into a buffer that is never, ever going to be used again. An optimizing compiler looks at that and says, "Hey, I could just take that memset out. I just saved you a couple of cycles and you have a smaller binary. Huge win." And because it doesn't really change what the program does, it's fully compliant with all the relevant language standards.
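The "doesn't change what the program does" point can be made concrete with a hand-written sketch. This is not actual compiler output: the second function is simply what an optimizer is permitted to treat the first one as. The function names, the checksum workload, and `TMP_LEN` are hypothetical and chosen only so the two versions can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TMP_LEN 32 /* assumed scratch size for the illustration */

/* What the programmer wrote: copy the secret, use it, scrub the copy. */
unsigned checksum_as_written(const unsigned char *secret, size_t len)
{
    unsigned char tmp[TMP_LEN];
    unsigned sum = 0;
    assert(secret != NULL && len <= TMP_LEN);

    memcpy(tmp, secret, len);
    for (size_t i = 0; i < len; i++)
        sum += tmp[i];

    memset(tmp, 0, sizeof tmp); /* dead store: tmp is never read again */
    return sum;
}

/* What the optimizer may treat it as: same return value for every input,
 * but the scrub is gone and the secret stays in the stack frame. */
unsigned checksum_as_if(const unsigned char *secret, size_t len)
{
    unsigned char tmp[TMP_LEN];
    unsigned sum = 0;
    assert(secret != NULL && len <= TMP_LEN);

    memcpy(tmp, secret, len);
    for (size_t i = 0; i < len; i++)
        sum += tmp[i];

    return sum;
}
```

From the caller's point of view the two functions are indistinguishable, which is exactly why the language standard allows the transformation and why the missing scrub is invisible in the source.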
That, in a nutshell, is our problem. One of the things I wanted to do was look at all the common compilers and see for which of them I could get it to actually, practically optimize out a memset like this. With some of them it was easy; with some it was harder and I had to fiddle around with it. For some, a straight-up memset works; for others I had to twiddle and make a for loop, or jump around a bit. But essentially, this is the list of ten compilers I tested. I tried to get my hands on the IBM compiler, but I don't have 20,000, so I couldn't do that. The first four you will know: GCC, Clang, the Intel compiler, and the Microsoft compiler. They're all on Compiler Explorer, so it was easy to test those, and it was very easy to get them to optimize out memsets. Then I moved on and downloaded a bunch of others: the Sun Studio compiler, the Embarcadero C++ Builder, the ARM compiler, and a bunch of others. Out of these ten, I was able to get eight to optimize it out. So eighty percent of the most common compilers do this in a
practical sense. It isn't just a theoretical thing; this really happens. A funny note: I tried really hard to get the PGI compiler to do it. In fact,
it has a switch called dead store elimination, and of course I turned it on and tried it, and I spent over an hour trying to get it to optimize out my memset. The goddamn thing wouldn't move. I don't
know what it's doing there. I couldn't get it to do anything. But basically, most compilers, if you ask them to optimize, will gladly optimize out a lot of memsets. So the next question
is: how common is it to actually see projects use optimization? This stems from a conversation I had earlier this year with a couple of colleagues, where a bunch of people said, "Well, I don't see -O2 or -O1 or -Os all that often; I don't think optimization is all that common." So I started looking around for some data. The first thing I did was go to a site [name unclear in the recording] that lists about 200 projects, and I figured I'd just go through all their makefiles and look for -O2 or -Os and so on. About a hundred of them had it. Then I realized that they actually don't use makefiles: they have a really bizarre build system, and that build system uses -Os by default. So even though it says 100 out of 200, it's probably close to 200 out of 200. I also had a whole list of programs I wanted to test, in FreeBSD and Ubuntu and a bunch of Linux distros, but that's pretty boring and I ran out of time, so I stopped there. These numbers should be good enough. In addition, if you look at the common IDEs, in particular Visual Studio and Xcode, when you tell them to build a project in release mode, Visual Studio by default uses /O2 and Xcode by default uses -Os. So the fact that these tools by default give you
optimization should make you confident that yes, optimization is incredibly common in release builds. It isn't everywhere, but it is almost everywhere. So now that we know the
problem, and we know it isn't just theoretical but practical, and that it occurs very often and with most compilers, in other words that it's a real problem: how do we fix this? This is where things get difficult. There are many solutions, but nothing is portable. It's: this solution works if you use this compiler, this one works if you use this libc, this one works if you use this OS, this one works with this version of the language spec, and this one works if you have this particular executable file format. So, and
before I dive into any of those, let's first talk about the elephant in the room: don't just roll your own. I have seen people do this, thinking, "Oh well, I'll just fight with the compiler, I know what I'm doing." They kind of Leeroy Jenkins this, and they totally screw it up and come up with some really stupid idea. One I heard once was, "Oh, I'll just do I/O with the buffer at the end, and then it's cool." Yeah, you could do that, but then you're doing I/O for no reason. So don't just roll your own. You're going to come up with a solution that's probably stupid, you're going to look really stupid, and it'll be one of these things where, okay,
your bad solution might work for this particular compiler version, but if you don't understand the concepts behind it, chances are the next version of the compiler, which is slightly smarter, will just bypass whatever you implemented. So don't roll your own. Or, if you do want to roll your own, at least listen to the advice that's coming, and base your solution on at least some of the advice I'll be giving in the next couple of slides. Right, so with that,
let's move on to actual solutions. The first one is a libc function called explicit_bzero. It is not part of any standard as far as I can tell at the present time. It was concocted in May 2014 by the OpenBSD guys. If you note the date, it's pretty close to when Heartbleed happened; it's a few months later, and I think that may have some relation. Anyway, this function basically does a bzero, but explicitly guarantees that it does not get optimized out. What that means is that whether it does or doesn't, it's no longer your problem, it's the libc's problem, because they made the guarantee, so now it's on them. That's really nice. OpenBSD did it first, and then NetBSD said, "That's a great idea and we're going to steal it, but we're going to rename it," so they changed the name to explicit_memset, but it's essentially the same thing. Then about two years and change ago FreeBSD came up with the same thing, and almost two years ago the glibc guys came up with it too. dietlibc supports it as well. OS X, however, does not support it. So if you're
limited to those platforms, explicit_bzero is a perfect solution. Similarly, if you are developing for the Windows world, there is an API called SecureZeroMemory, which basically is Microsoft saying: we guarantee that this thing doesn't get optimized out, and if you want to securely clear sensitive
material, just use this API. MSDN says it will ensure that your data is overwritten promptly. It's one of the cases where Microsoft was ahead of the curve by something like 15 years. They've had this thing since the early 2000s: it was in XP and in Windows Server 2003. Both operating systems are no longer supported, but the API is.
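As a rough sketch of how the two platform-provided functions just mentioned might be used from one code base: the wrapper name `wipe` is hypothetical, and the header choices are assumptions (`<string.h>` on glibc and OpenBSD, `<strings.h>` on FreeBSD, `<windows.h>` on Windows). On platforms the talk lists as lacking `explicit_bzero`, such as OS X, this will not compile as written.

```c
#if !defined(_WIN32) && !defined(_DEFAULT_SOURCE)
#define _DEFAULT_SOURCE 1 /* ask glibc/musl to declare explicit_bzero */
#endif

#include <assert.h>
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h> /* SecureZeroMemory */
#else
#include <string.h>  /* explicit_bzero on glibc and OpenBSD */
#include <strings.h> /* explicit_bzero on FreeBSD */
#endif

/* Hypothetical wrapper: forward to whichever guaranteed primitive
 * the platform's own library provides. */
void wipe(void *buf, size_t len)
{
    assert(buf != NULL);
#if defined(_WIN32)
    SecureZeroMemory(buf, len);
#else
    explicit_bzero(buf, len);
#endif
}
```

The design point is the one made in the talk: the guarantee that the clearing survives optimization is made by the library vendor, so the calling code does not need to out-think the compiler.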
Now there's another function, called memset_s. It guarantees that it doesn't get optimized out, and that is guaranteed by the language spec. It is standardized, it is in C11, it is wonderful, it is great. Except it's not great, because even though it's in the standard, it's in what's called the optional Annex K. The spec is a lot, it's pages and pages of boring crap, but if you end up
reading Annex K, it says "optional extension". What does optional extension mean? It
means you can be entirely C11 compliant
and not offer memset_s. So it's kind of, as Reverend Lovejoy would say, "yes with an if, no with a but." If it's there, great. If it's not, well. It has the potential of being this great portable solution, and then it isn't.
And then of course there's the obvious choice, which somehow a lot of people seem to miss: if you end up doing something with sensitive material, chances are you're using a crypto library, and if you're using a crypto library, chances are the crypto library
offers you an API to do secure memory clearing. I've listed the common ones. If you're using OpenSSL, there's OPENSSL_cleanse, and OpenSSL guarantees it doesn't get optimized out. If you use GnuTLS, there's gnutls_memset; same thing, GnuTLS guarantees it does not get optimized out. If you're using libsodium, which is one of the newer ones, they have sodium_memzero; same thing, they guarantee it doesn't get optimized out.
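Returning to `memset_s` and the "optional Annex K" caveat: the following sketch shows one way code might detect whether the C library advertises Annex K before relying on it. The helper name `wipe_with_memset_s` and its return convention are hypothetical. The check is conservative, since it only trusts libraries that define `__STDC_LIB_EXT1__`; on a library without Annex K it reports failure and leaves the buffer untouched, so a caller would need another option from this talk.

```c
#define __STDC_WANT_LIB_EXT1__ 1 /* must come before <string.h> to request Annex K */

#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.
 * Returns 1 if buf was cleared through memset_s,
 * 0 if this C library does not advertise Annex K (buf is left as is). */
int wipe_with_memset_s(void *buf, size_t len)
{
    assert(buf != NULL);
#ifdef __STDC_LIB_EXT1__
    /* memset_s(dest, destsz, ch, count) returns zero on success. */
    return memset_s(buf, len, 0, len) == 0;
#else
    (void)buf;
    (void)len;
    return 0;
#endif
}
```

A caller can treat a return value of 0 as the signal to fall back to a platform or crypto-library function instead, which mirrors the talk's point that a conforming C11 implementation may simply not offer `memset_s`.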
so those are this is basically up until here I've sort of given you a list of
okay well here are specific API functions you can call if you're using this library or using this OS use this the next sort of solutions are sort of D okay well what if you can't rely on the api's maybe we can get something outta to compiler right the first solution is and this is this isn't portable but most compilers have this or something like it where you can go to the compiler and say hey don't use the built-in memset just use the one from Lipsy and what that means is you tell the compiler that it shouldn't assume that it knows what a memset does and if you do that then sure enough memset won't get optimized out GCC this is original you see specific and then the Intel compiler supports it too and then clang supports a two but up until the deal and this is true up until clang 3.7 which is maybe two years old it's not that old clang basically you know support it F no built-in memset and then what they did is they kind of dropped it on the floor and it got optimized out anyway so it's kind of annoying it kind of wrecked it kind of ruins the whole use this because it's because it works because except if you're using an older version of clang and also you know it's not overly portable but but it's a solution that works other things might still get optimized out if you if you have some kind of for loop that clears memory that might still get optimized out but at least if you use memset and you're using no built-in memset then you have a pretty strong guarantee that it should get optimized out so another sort of
Another solution is, you know, just don't use optimization. That works: you're guaranteed not to get code optimized out if you don't use the optimizer. Obviously that isn't perfect, though. _FORTIFY_SOURCE doesn't work if you don't use optimization, so if you want to use _FORTIFY_SOURCE you have to use optimization. And of course you're going to have to change your build environment. I guess it's not strictly portable, but then again most compilers will have some way to tell them not to optimize anything. But obviously the reason you don't want to use this particular solution is that you don't get the optimizer, so your product will probably be slower.
A spin-off of this is that some compilers, in particular the Microsoft one, and GCC also kind of supports it, let you localize optimizations based on scopes and functions, and so you can say: oh, you know, for this function, do -O0. It's not a commonly used feature, it seems like it might have some side effects, and it doesn't support all the switches that the compiler generally does. I've seen it recommended by a few people, I played around with it, and it seems to work, but it doesn't seem to be a commonly adopted thing. The other thing of course is, again, these are pragmas; this is very, very compiler-specific stuff.
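A minimal sketch of what per-function optimization control can look like. The macro name `NO_OPT` and the function name `clear_unoptimized` are made up for this example. The talk only mentions the Microsoft compiler and GCC; the clang branch (`optnone`) is an addition here, and none of these mechanisms is guaranteed to keep the memset in every compiler version.

```c
#include <stddef.h>
#include <string.h>

/* Pick whatever per-function "do not optimize" mechanism is available. */
#if defined(_MSC_VER)
#pragma optimize("", off)            /* MSVC: disable optimizations from here */
#define NO_OPT
#elif defined(__clang__)
#define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#define NO_OPT __attribute__((optimize("O0")))
#else
#define NO_OPT                       /* unknown compiler: no protection */
#endif

/* Hypothetical helper: the body is meant to be compiled without optimization. */
NO_OPT void clear_unoptimized(void *buf, size_t len)
{
    memset(buf, 0, len);
}

#if defined(_MSC_VER)
#pragma optimize("", on)             /* MSVC: restore the command-line settings */
#endif
```

Keeping the unoptimized region down to one tiny function limits the performance cost to the clearing itself.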
Another solution is using what's called weak symbols. Anybody familiar with weak symbols? Oh yeah, a little bit. I'll try to very briefly get into this. So the ELF file format is basically a format that specifies how to have an executable that can run on OSes that support this file format. So ls, if it's compiled for example for Linux, will be in this particular format. And obviously, as part of the format, you can store symbols for things like functions and variables and so on. Generally a symbol is what's called a strong symbol, but you can mark one as weak, and what weak means is that the symbol may be overridden by another definition later on, at link or load time. What that means is that if you declare a function, or the symbol of a function, as weak, then at compile time the compiler has a very hard time reasoning about what that thing does, because of the fact that you've declared it as weak. And in fact this particular solution is what OpenBSD uses in their implementation of explicit_bzero. What I really like about this is the commit message for it: they're very pragmatic about it. They say, well, you know, we think our solution is pretty clever, but it's not foolproof; there are still ways to defeat this. And they list a bunch of ways to do that. In particular, the compiler could emit runtime code that checks what this thing is at run time, right before it's called, and then it could still optimize the memset out depending on whether the thing matches or doesn't match. But then they go on and say: well, in the foreseeable future we don't think that's going to happen, but it's possible that at some point down the road it might. And so I like their way of reasoning about this: it's pretty clever, but it's not foolproof; it may at some point in the future break, but at the present time it's a fairly good solution, I think.
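The weak-symbol idea can be sketched roughly like this. The names `secure_clear_hook` and `secure_clear_weak` are hypothetical and this is not OpenBSD's actual source; it only illustrates the technique described above, using the GCC/clang `weak` attribute.

```c
#include <stddef.h>
#include <string.h>

/* Weak, empty hook. Because another definition could replace it when the
 * program is linked, the compiler cannot assume that it ignores buf. */
__attribute__((weak)) void secure_clear_hook(void *buf, size_t len)
{
    (void)buf;
    (void)len;
}

/* Hypothetical wrapper: clear the buffer, then hand it to the hook. */
void secure_clear_weak(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* The hook might read the buffer, so the memset above is not dead. */
    secure_clear_hook(buf, len);
}
```

As the commit message quoted in the talk admits, this relies on what compilers do today rather than on a language guarantee.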
Another solution is to use memory barriers. How many people know what a memory barrier is or what it does? Okay, about the same number of hands. Let me try to very briefly explain what a memory barrier is, and bear with me here, because I'm going to oversimplify; it's not a particularly simple concept if you've never heard of it. Okay, let's say you have a piece of code with two global variables, A and B, and you assign values: you say A equals something and B equals something, right? And there's no relation between A and B. What that means is that both the compiler and the hardware, because the two have no relation, are allowed to reorder them. So B can be assigned first and then A can be assigned later; because there's no correlation, that's perfectly valid. Now let's say you have a second thread somewhere, and your second thread says: okay, while not B, spin, and then once B is set, use A. This is where you're basically waiting on something to be set, and the idea is that you wrote your code so that B is set after A is set. That seems logical, and that would work, except the compiler and the hardware don't know there's a relation between your loop on one end and your assignment on the other, and if either the hardware or the compiler reorders it and sets B before it sets A, really, really nasty things happen. This has been the source of numerous security bugs, very, very subtle stuff. The [unclear] and TSan kind of stuff we've seen in the kernel the last couple of years, a bunch of that is related to these kinds of bugs. And so the way you fix this is you introduce what's called a memory barrier. Basically, when you write your code, you say A equals something, and then before you say B equals something, in the middle you say memory barrier, and then you say B equals something. That gives a signal to the hardware and to the compiler, and it says: whatever happens before this and after this, you are not allowed to reorder across it; there is a correlation there that I know of and that you're not aware of, so don't reorder it. I hope I explained it well; this usually takes a lot longer to explain, but I hope I got the message across well enough to give you an idea of what a memory barrier is.
Now the cool thing about memory barriers is that they're a way for a programmer to tell the hardware or the compiler: I know something about this memory that you don't; stay away, don't touch. It works for reordering, but it also works really well to keep something from getting optimized out. The idea is you basically just do your memset, and then on the thing you memset you do a memory barrier, and that tells the compiler not to optimize it out. I know the concept sounds complicated, and I've oversimplified this because it's a relatively complicated subject, but it's pretty clever and it works really well. This is used by glibc, and nginx recently had a fix where they have their own explicit memzero, and it also uses a memory barrier. So this is a tried and tested concept, and it works.
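A minimal sketch of the memset-plus-barrier idea, assuming a compiler that understands GNU-style extended inline assembly (GCC or clang). The function name `secure_clear_barrier` is made up for this example. The empty asm statement here is a compiler-level barrier only; it emits no instruction and does not order anything in hardware.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: memset followed by a compiler barrier on the buffer. */
void secure_clear_barrier(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* Empty asm that takes buf as an input and clobbers "memory".
     * The compiler has to assume the asm may read the zeroed bytes,
     * so it cannot treat the memset as a dead store. */
    __asm__ __volatile__("" : : "r"(buf) : "memory");
}
```

The barrier costs nothing at run time; its whole effect is on what the optimizer is allowed to assume.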
So those are kind of the solutions that are known and that work, and that have been tried and tested by various fairly well-known pieces of software. If somehow you're in an environment where none of this is available to you, or it's not portable enough, and you're looking for a solution that works everywhere, the best you can do is fall back on constructs that are known in the C language, and this is basically use of the volatile keyword. This I call just a fallback. People often go: well, I'll just use volatile and that solves the problem. It turns out optimizers can be very clever and very tricky, and even when you use volatile there are cases that can be made where the optimizer is clever enough that the clearing may still get optimized out. So the volatile solutions are sort of best-effort fallback solutions, and there are two variants of this. One is a volatile pointer, which is the fallback solution in libsodium, and the other one is a volatile memset function pointer, which is what OpenSSL uses.
Here's what that looks like. This is the libsodium fallback, and it uses a volatile pointer: you tell the compiler, hey, I know something about this that you don't, don't touch it. And this is where it gets very language-lawyery. If you look at the spec, it says something along the lines of 'the accessed object', something something, and what they're describing there is the actual memory being volatile, not just the pointer lvalue. And so if the compiler looks at this code and it can trace and prove where pnt came from, and that memory isn't actually volatile, then this volatile doesn't really mean all that much and it can still optimize the writes out. That sounds very theoretical, and I don't know if that actually happens, but a number of people smarter than me, or who know more about these nitty-gritty little C language things, have told me that yes, in fact, you would be allowed to do that if you're a very smart optimizing compiler. The fact that it's a fallback solution for libsodium and a few others leads me to believe that it probably doesn't happen, but it could, right?
And this is the solution that OpenSSL uses, which also uses volatile, but it doesn't use a volatile data pointer; instead it creates a volatile function pointer that points to memset. You get the same concept, more or less, as with the weak symbols, the idea being that your volatile function pointer can change at any time without the compiler knowing about it. That seems like a pretty good solution, except one way of getting around this, in theory, is if the compiler emits runtime code that, right before the function gets called, looks at the pointer and goes: is that memset, or is it something else? If it's something else then we call it; if it's memset we just return. And then you've optimized out the call at run time and saved a few cycles. In theory the compiler is allowed to emit code like that. I don't know if that actually happens anywhere, but it's a possibility. So think of these last two solutions as a fallback: in theory they may not work, in reality they probably do work.
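The two volatile fallbacks can be sketched roughly as follows. The function and variable names are made up for this example, and the code is written in the spirit of the libsodium and OpenSSL fallbacks rather than copied from either project.

```c
#include <stddef.h>
#include <string.h>

/* Variant 1: store zeros one byte at a time through a volatile pointer. */
void clear_via_volatile_ptr(void *buf, size_t len)
{
    volatile unsigned char *volatile p = (volatile unsigned char *)buf;
    size_t i;

    for (i = 0; i < len; i++)
        p[i] = 0U;
}

/* Variant 2: call memset through a volatile function pointer, so the
 * compiler cannot assume at compile time which function gets called. */
typedef void *(*memset_fn)(void *, int, size_t);
static volatile memset_fn memset_ptr = memset;

void clear_via_volatile_fn(void *buf, size_t len)
{
    memset_ptr(buf, 0, len);
}
```

Both variants are plain standard C, which is what makes them usable as a last resort, and both carry the theoretical caveats described above.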
Right, so this is sort of the first half of my presentation, and I'm perfectly on time, which is great. So there isn't one portable solution, and this is why clearing memory is hard, right? The problem is well understood, but if you're looking for an all-around solution that works everywhere, regardless of compilers and operating systems and so on, it's very hard to have a good one. And this is where customers come back to me and say: give me a portable solution, I need something better than this or that. The best solution I have for this is to apply all of the above as best as possible. My initial idea was: I'll just write a little function that does this and put it on GitHub and people can use it. But then I looked at libsodium. I'm not going to click on it now, but that's a link; if you download the slides later you'll see it points to GitHub and shows you the actual implementation. libsodium's sodium_memzero is really well written, and it's beautiful. It has this fairly elegant structure of: if you have this particular setup, then use this solution; else if you have that particular setup, then use that solution. And it does this for six or seven of the cases I've covered. It's really nice, it's really elegant. If you're looking for inspiration, I point people to libsodium; I think it's a good, portable-ish way of solving this problem.
Okay, so now we've talked about the problem and we have some solutions. Well, I want detection. When does this really happen? I want to see this, and I want compilers to tell me. Why doesn't GCC tell me when it's optimizing something out? If it has security consequences it should tell me. Why are they not doing this? I don't understand. So I set out and modified GCC. I looked at dead store elimination and I came up with this patch. This is the tree SSA dead store elimination pass, and where it calls delete_dead_call I hook in and say: if it's a built-in memset, then before you call delete_dead_call, emit this warning, tell me the file and the line number, and then still optimize it out. What this means is that every time a memset gets optimized out, GCC now tells me. And this is very interesting, because I don't only get detection for my own code; this is a great way to get really cheap, fast zero-day. And in fact that's what I did: I downloaded a whole bunch of very well-known open source projects, ran them through my modified version of GCC, and came up with a list of things.
Awesome, thank you. So I know of this particular problem practically affecting OpenSSL, MIT Kerberos, Heimdal, MatrixSSL, PHP, DHCP, BIND, Squid cache, and the list goes on; I have rsync as well, and there's more. So we know this problem is very widespread. And if the stuff we all rely on, the stuff everything is built on, has these problems, that means your code probably has this as well. Of course I'm just giving names out here, so let me give you guys some zero-day. That's MIT Kerberos: that memset is optimized out. That's PHP: that memset is optimized out. This, I think, is MatrixSSL: decrypted plaintext gets memset, that gets optimized out, and so the plaintext lingers around in memory. This is OpenSSL: that's crypto extended data, and the memset gets optimized out. This is nginx: that's a password, and that memzero gets optimized out. This is BIND and DHCP: that memset of private key data gets optimized out. This is Squid: it goes to LDAP and gets creds, then it tries to clear the creds, and that gets optimized out. I also played around with [unclear] a little bit, and same thing: this is a key whose memset gets optimized out. And then this is rsync: these are stored credentials from a file that can linger in memory, and those memsets get optimized out. So that's nine bugs right there, and all it took was a five-line code change in GCC, and GCC just gave me all these bugs.
Thank you.
The other thing that was really nice about getting the data back from GCC isn't just that it gave me bugs; it also showed me things that got optimized out that I thought wouldn't. What I was expecting is, you know, a variable that's about to go out of scope and that you memset: that gets optimized out, obviously. But what I also noticed has to do with a common code pattern where you've just malloc'ed something, or declared something on the stack, and the first thing you do is memset to clear the whole thing, and then you move on. It turns out that in a number of cases that also gets optimized out. The idea is that it only gets optimized out if the compiler can prove that every element in the struct, the whole thing, gets filled in afterwards. I'm not sure if that's entirely true, and we were talking about this earlier, but what about things like structure padding, or maybe enums, or unions? I'm not quite sure how that works. I haven't dug into it, because, I mean, I wrote this patch yesterday, right? These bugs are fresh. So I don't know exactly how much potential there is here, but it smells like there's room for bugs. I was surprised to see this, and some more research is needed, so if anybody wants to, feel free.
The other thing I noticed: the common case I was looking for in terms of bugs was sensitive material, but I also saw a whole bunch of cases where non-sensitive material was being memset and then freed, and that memset would get optimized out as well. That struck me as odd at the beginning, but there's a common coding pattern I've seen where anytime somebody does a malloc they memset it to zero before using it, or right before a free they do a memset to zero, not because they want to clear sensitive material but because they want a guarantee that they always start from a clean slate. And so they end up building code that only works for something that always has a clean slate. If that memset gets optimized out, then those guarantees no longer hold, and code that relies on 'we always have a clean slate when we get a fresh piece of memory' finds that is no longer true. So I think that coding pattern doesn't jibe well with compiler optimization. Again, this is a realization I made yesterday; I don't have all the facts on this yet, but it seemed interesting and I think there's some room for research here.
The other thing I noticed is that close to the memsets that were optimized out, I found other bugs, and it made me think: well, bad code attracts other bad code. One kind was null derefs and the other was use-after-free, where instead of doing memset then free, the code was doing free then memset, and then obviously you have a use-after-free. So this is a case of a null deref, where basically a malloc happens, then the memset happens, and only then is there a check to see if the variable that was allocated is null. Obviously, if the value is null, that memset would cause a null deref, except the memset gets optimized out, so it never generates the null deref. It's kind of a catch-22 there, but that construct is clearly broken.
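For contrast with the two broken orderings just described, here is a minimal sketch of the intended order of operations. The `struct session` type and both function names are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct session {
    int id;
    char token[32];
};

struct session *session_new(int id)
{
    struct session *s = malloc(sizeof *s);

    if (s == NULL)              /* check the allocation first ... */
        return NULL;
    memset(s, 0, sizeof *s);    /* ... and only then touch the memory */
    s->id = id;
    return s;
}

void session_free(struct session *s)
{
    if (s == NULL)
        return;
    /* Clear before free, never after. A plain memset right before free
     * is exactly the kind of call the optimizer may drop, so real code
     * would use one of the non-elidable clearing approaches here. */
    memset(s, 0, sizeof *s);
    free(s);
}
```

This fixes the ordering bugs only; it does nothing by itself about the memset being optimized away.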
So now, I mostly spoke about memset, but really there's a thousand variations that clear memory, and it comes down to the same thing. You can do a for loop, or you can use other APIs, or you can roll your own and do something very exotic. And then there's C++, in which there's a gazillion ways of doing it: you can have weird classes with constructors and inheritance, multiple objects, virtual functions; it gets really crazy once C++ comes into the mix. So it all kind of looks different, but it all does the same thing: it has the same root cause, it's the same problem. And the thing is, when you look at this from just a code perspective, sometimes the optimizer is smart enough to see it and sometimes it isn't, but it could be smart enough in the future. So it's one of those things where, if you're looking at a piece of code and doing some kind of security assessment and you see this, should you report the bug or should you not? I think you should, because even if the compiler doesn't optimize it out today, it may very well optimize it out tomorrow.
Okay, so all this talk has been about C. Well, what if you're not writing C code? What if you're using other languages, you know, Go, Rust, Objective-C, C#, Java, flavor of the month? I really wanted to spend more time on this, but then my slides would have gotten too long, so I only have one slide on non-C. I spent a little bit of time on this. In C# there's something called SecureString, which is supposed to hold a string in a safe way, and the problem I have with SecureString isn't the implementation, it's: how do I get something securely into a SecureString, and how do I get something securely out of it? And then in terms of Java there's a Java crypto guide which recommends not using strings to hold sensitive material, but using a byte array instead. There was some reasoning behind that which I don't remember, but the idea is basically this: most managed languages don't really offer any decent way to clear memory, or to hold sensitive material in memory without it leaking. And it will leak, and it'll happen behind your back, because you wouldn't know it leaked, especially when you're dealing with garbage collection, where something can get reallocated without you ever knowing, and before you know it there are five different copies of your key sprayed all over memory. Most of these languages, as far as I can tell, don't have the infrastructure in place to deal with sensitive material. It seems to be entirely missing in a lot of places, and in other places it's kind of shoehorned on or bolted on, with varying degrees of success. I remember seeing there was an implementation for Go, but it was on revision three or four, because there was always something wrong with it. Again, I wish I had more time so I could elaborate on this, but what I saw is that it's a pretty sad state of affairs in non-C: most people haven't tried, and those who have, have not tried hard enough.
So now that I've run through all of this, the memset problems and how to clear memory, I want to talk about some related issues. First of all, what I said initially is that when people make this step and go 'oh, I should clear this memory because it's sensitive', that's huge, because it really is. Most code doesn't even try. There is an unbelievable amount of code that just keeps keys and sensitive material in memory, and it goes out of scope and never gets cleared, and it just ends up lingering on the stack or the heap. Often it gets overwritten fast, but sometimes it can linger around and sit there for a very, very long time. The problem is that this is hard to find in any kind of automated fashion, because you're looking for the absence of something. That means the only way you can really find these kinds of bugs is to manually look at the code and go: oh, this is sensitive material, and no effort was made to clear it.
A second issue, and this is really cute actually: when you call memset the way it's done there, that's wrong. The length and the byte you want to memset with are transposed. The zero should be the second argument and the length should be the third argument. So what that really does is a no-op: it says memset, using len as the pattern and the pattern as the length, and in this case the pattern is zero, so it's a memset of zero bytes, which is a no-op. What's really cool is that you can tell GCC to emit a warning for this. One of the things I wanted to do was run through all the same code I had tested before with that warning enabled and show it on a list of bugs, but I kind of ran out of time. I strongly suspect that if you use this warning and you go download a whole bunch of well-known open source code and run it through, you'll end up with a very similar list of bugs.
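A short sketch of the transposed-argument mistake next to the correct call. The function name `clear_buffer` is made up for this example. The talk does not name the warning; in GCC the relevant option is -Wmemset-transposed-args, which as far as I know is triggered when the third argument is a literal zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper showing the correct argument order:
 * destination, fill byte, length. */
void clear_buffer(unsigned char *buf, size_t len)
{
    /* Wrong (transposed), kept as a comment so this file stays correct:
     *     memset(buf, len, 0);
     * That asks for zero bytes to be filled with the value len, so it
     * does nothing at all. */
    memset(buf, 0, len);
}
```

Because the transposed call compiles cleanly and never crashes, enabling the warning is about the only way to catch it automatically.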
Another list of related bugs: related not in the sense of clearing secrets, but in the sense that optimization was involved. That is to say, if the optimizer wasn't turned on, the security bug wouldn't have occurred, or would have been less severe. There are three cases of bugs that I ran into that I think are somewhat relevant and that I want to talk about. The first one is what's called pointer overflow. Let's say you have this code: there's a ptr and there's a len, and the len is untrusted. The idea is that you want to validate that ptr plus len isn't beyond the end of your buffer, but before you do that you also want to make sure that ptr plus len doesn't overflow. And so you would write code like: if ptr plus len is smaller than ptr, then the pointer overflowed and you bail out. The problem is that according to the C standard pointer overflow can't happen, so that is undefined behavior, and the optimizer sees that and goes: oh, undefined behavior, optimized out, gone. So your bounds check just got optimized out. That is a relatively common bug to see. If you don't know it's undefined behavior, if you don't know the compiler can optimize it out, you would just read over it, but once you know, you start reading code and you'll see it everywhere. The way to fix this is basically to cast your pointer to an integer type that's big enough to hold a pointer, and all of a sudden the optimizer can no longer optimize the check out and it'll be in your code. So that's the first one.
The second one is a lot more subtle. This has to do with switch-case optimization. When you have a switch in C and you translate it to assembly, you could just do a one-to-one translation, but if you use the optimizer one of two things will happen. Either it'll generate a binary tree, which is generally observed with the Microsoft compiler, or, if you look at GCC and clang, they'll create a jump table. What that means is they'll look at the value to compare, and if it's within a certain range they fetch the value again and use it as an offset into a jump table. That usually isn't a problem; it's an abstraction, and in most situations people don't care about it, except if you're dealing with a shared-memory trust boundary, because all of a sudden there is a subtle double fetch that was emitted by the compiler behind your back and that doesn't show up in the actual C code. These are situations like a hypervisor trust boundary, and those are very, very strong trust boundaries. That thing is actually a link; once I publish the slides, if you click on it, it links to a blog post that shows just that: a VirtualBox guest-to-host privilege escalation because of a switch-case jump table optimization. The bug there basically is that when you fetch the first time, to do the compare, it's fine, and then two instructions later you fetch again to index the jump table. Between the first and second fetch the guy on the other end of the shared memory could change the value, and all of a sudden you can jump outside of your jump table and basically cause an arbitrary jump, which obviously is bad.
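Going back to the first case, the pointer overflow check, the cast-to-integer fix can be sketched like this. The function name `range_ok` and its parameters are made up for this example, and it assumes a flat address space where `uintptr_t` is at least as wide as `size_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: returns 1 if [ptr, ptr + len) fits before end,
 * and 0 otherwise. */
int range_ok(const unsigned char *ptr, const unsigned char *end, size_t len)
{
    uintptr_t start = (uintptr_t)ptr;
    uintptr_t limit = (uintptr_t)end;
    /* Unsigned integer arithmetic wraps around with defined behavior,
     * so the compiler may not assume this addition cannot overflow. */
    uintptr_t stop = start + (uintptr_t)len;

    if (stop < start)   /* wrapped around: len was far too large */
        return 0;
    if (stop > limit)   /* runs past the end of the buffer */
        return 0;
    return 1;
}
```

An alternative that avoids the addition altogether is to compare len against the remaining space, `(size_t)(end - ptr)`, when both pointers are known to point into the same buffer.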
Yeah, I'm not going to cover this one. Okay, I've got to wrap it up; I've got three more slides and then we can get to questions.
This is actually very important. So now, after everything I've covered, we're good: we know what the problem is, we know what the solutions are, we know there are real-world problems, but we have a good grasp of it now, right? Okay, it turns out I kind of lied to you. This is not the whole problem. If it was the whole problem that'd be great; we know how to fix that, more or less. It turns out that compiler optimization is really, really clever, and it does many clever things, and they're all very subtle and a lot of them are architecture-specific. Here's a scenario of things that can occur. The optimizer will do things like: oh, you're handling this string of a certain kind, you know what, I'll just shove it into a bunch of registers so it'll be faster. And then some time passes, and all of a sudden the optimizer goes: oh, you know what, I don't have enough registers; it's okay, I'll take whatever is in the registers and dump it on the stack, and we'll go from there. And all of a sudden what happened is you leaked key material into registers, and then you leaked it onto the stack. This stuff happens throughout, and secrets leak out. It just happens, and it is because of optimization, even if you try really, really hard to do the right thing: you compile it, and the optimizer just screws you. This problem is echoed in a blog post by Colin Percival, who used to be the security officer for FreeBSD and now has, I think, a cloud storage company; a very, very smart security guy, and I would recommend reading that blog post. This problem is also echoed in the Linux man page for explicit_bzero. The man page basically says: yes, this is a fundamental problem; we still recommend using explicit_bzero, and our hope is that in the future we will have a way to get the compilers to not do this, and we can move on, but at the present time there's no good fix for this. And so the first statement I want to make is that at the present time optimizing compilers and cryptography are mutually exclusive. You can have one or the other; you cannot have both. It does not work at the present time.
Okay, before I get to my conclusion I want to rant a bit about optimization, as if I haven't already. I get that optimization is great, and it gives you all these things, and things get faster. But I have a real problem with optimization, because look: if you're a developer and you write code and you're pretty smart, you can reason about your code, because you wrote it and you know what it does. But once you compile it and it's gone through an optimization pass, you can no longer reason about it, because you don't know what the optimizer did. That is, I think, a fundamental problem. What I want people to think about whenever they type -O is this, and I'm not saying you shouldn't use the optimizer, because it has many, many pros, but don't be blasé about it. Before you type -O, really think about what it means to do that, because it will introduce all sorts of things that you weren't sure about, or that suddenly change the meaning of something. And what I really want from the compiler people, and maybe from the language people so the compiler people can implement it, is strong accountability and control of the optimizer. That is, if I compile something, I want to be able to go to the compiler and say: hey, before you do anything, I want you to give me a detailed list of everything you're about to do in terms of optimization, so that I can take that list and look at my code, and with that list and my code I can now reason again about what the binary is going to do. Without that kind of accountability you just can't reason about your binary. The other thing is control. I want to have fine-grained control over optimization, and what I mean by that is, you know, like the localized stuff I mentioned before that the Microsoft compiler for example has, where I can go and say: this particular scope, don't optimize it, or for this particular scope, don't do this particular optimization. I would like to see something like that. And then I'll just skip this, so, okay, here's my conclusion.
Right, we know what the problem is, the original problem, and I have some solutions. In retrospect they're just partial solutions, but they're still kind of solutions. And then I also have a call to action, of things I think should happen and hopefully will happen at some point in the future. The problem, as I've illustrated, I think is rampant. So basically I would like people to use that GCC patch I've shown, or create a better one, and go find some bugs, or better yet, go fix some bugs. In terms of compilers, as I just mentioned, what I want is optimization accountability and control. It's kind of the wild, wild west in terms of optimization, where the compilers just go and do all these things. I mean, we have some flags, but there's not enough control, there's not enough accountability, and there's not enough transparency; you just don't know what's going on. And not some kind of dump functionality where it's like a needle in a haystack: you want something that's easy to work with, easy to read or easy to parse, and that tells you exactly all the optimization steps it's doing. Ideally I'd like the language people to get involved and standardize this, because if they do, we can then demand this of all the compilers.
And then lastly, coming back to non-C: well, what about non-C? Ruby, Python, Perl, Go, Rust and so on. It smells bad, it looks bad, especially when there's a runtime involved. This is probably worthy of a presentation of its own, or multiple presentations, and I wish I had done more there. That is essentially it. I hope you enjoyed it.
[Applause] [Music]
[Music] [Music]