Microarchitectural Attacks on Trusted Execution Environments

Fellow creatures, [unclear] a question: do we trust the TrustZones on our smartphones? Keegan Ryan is really fortunate to be here. He was inspired by a talk from [unclear], and his research on smartphones and the systems on a chip used in smartphones will answer this question of whether you can trust those trusted execution environments. Please give a warm round of applause to Keegan, and enjoy.

Thank you. I'm Keegan, I'm a consultant with NCC Group, and this is Microarchitectural Attacks on Trusted Execution Environments. In order to understand what a trusted execution environment is, we need to go back into processor security, specifically on x86.
As many of you are probably aware, there are a couple of different modes under which we can execute code on x86 processors. That includes ring 3, which is the user code and the applications, and also ring 0, which is the kernel code. There are also a ring 1 and a ring 2 that are supposedly used for drivers or guest operating systems, but really it just boils down to ring 0 and ring 3. In this diagram we see that privilege increases as we go up the diagram, so ring 0 is the most privileged ring and ring 3 is the least privileged ring. All of our secrets, all of our sensitive information, all of the attacker's goals are in ring 0, and the attacker is trying to access those from the unprivileged world of ring 3. Now you may have a question: what if I want to add a processor feature that I don't want ring 0 to be able to access? Well, then you add ring -1, which is often used for a hypervisor. Now the hypervisor has all the secrets, and the hypervisor can manage different guest operating systems, and each of those guest operating systems can execute in ring 0 without having any idea of the other operating systems. So now the secrets are in ring -1, and the attacker's goals have shifted from ring 0 to ring -1. The attacker has to attack ring -1 from a less privileged ring and try to access those secrets. But what if you want to add a processor feature that you don't want ring -1 to be able to access? Then you add ring -2, which is System Management Mode. That is capable of monitoring power and directly interfacing with firmware and other chips on the motherboard, and it's able to access and do a lot of things that the hypervisor is not able to. Now all of your secrets and all of your attacker's goals are in ring -2, and the attacker has to attack those from a less privileged ring. Now maybe you want to add something to your processor that you don't want ring -2 to be able to access, so you add
ring -3. I think you get the picture now: we just keep on adding more and more privileged rings, and we keep putting our secrets and our attacker's goals in these higher and higher privileged rings. But what if we're thinking about it wrong? What if instead we want to put all the secrets in the least privileged ring? This is sort of the idea behind SGX. It's useful for things like DRM, where you want to run ring 3 code but have sensitive secrets or other signing capabilities running in ring 3. But this picture is a little bit complex, so let's simplify it a little bit. We'll only be
looking at ring 0 and ring 3, which are the kernel and the userland, and the SGX enclave, which also executes in ring 3. When executing code in the SGX enclave, you first load the code into the enclave, and from that point on you trust the execution of whatever is going on in the enclave. You trust that the other elements, the kernel, the userland, and the other rings, are not going to be able to access what's in that enclave. So you've made your trusted execution environment. This is a bit of a weird model, because now your attacker is in the ring 0 kernel and your target victim is in ring 3. Instead of the attacker trying to move up the privilege chain, the attacker is trying to move down, which is pretty strange. You might have some questions, like: under this model, who handles memory management? Traditionally that is something that ring 0 would manage, and ring 0 would be responsible for paging memory in and out for different processes and different code executing in ring 3. On the other hand, you don't want that to happen with the SGX enclave, because what if a malicious ring 0 adds a page to the enclave that the enclave doesn't expect? To solve this problem, SGX does allow ring 0 to handle page faults, but simultaneously and in parallel it verifies every memory load to make sure that no access violations are made, so that all the SGX memory is safe. It allows ring 0 to do its job, but it watches over at the same time to make sure that nothing is messed up. It's a bit of a weird, convoluted solution to a strange, inverted problem, but it works, and that's essentially how SGX works and the idea behind SGX. Next we can look at ARM, and we can see that
ARMv8 is constructed in a similar way, but it improves on x86 in a couple of key ways. First of all, ARM gets rid of ring 1 and ring 2, so you don't have to worry about those; it just has different privilege levels for userland and the kernel. These different privilege levels are called exception levels in the ARM terminology. The second thing that ARM gets right compared to x86 is that instead of starting at 3 and counting down as privilege goes up, ARM starts at 0 and counts up, so we don't have to worry about negative numbers anymore. When we add the next privilege level, the hypervisor, we call it Exception Level 2, and the next one after that is the monitor in Exception Level 3. At this point we still want to have the ability to run trusted code in Exception Level 0, the least privileged level of the ARMv8 processor. To support this, we need to separate this diagram into two different sections. In ARMv8 these are called the secure world and the non-secure world. We have the non-secure world on the left in blue, which consists of the userland, the kernel, and the hypervisor, and we have the secure world on the right, which consists of the monitor in Exception Level 3, a trusted operating system in Exception Level 1, and trusted applications in Exception Level 0. The idea is that if you run anything in the secure world, it should not be accessible or modifiable by anything in the non-secure world. That is what our attacker is trying to access: the attacker has access to the non-secure kernel, which is often Linux, and is trying to get to the trusted apps. Once again we have this weird inversion, where we're trying to go from a more privileged level to a less privileged level and trying to extract secrets in that way. So the question that arises when using these trusted execution environments, which are implemented in SGX and in TrustZone on ARM, is: can we use these privileged
modes and that privileged access in order to attack these trusted execution environments? To answer that question, we can start looking at a few different research papers. The first one that I want to go into is one
called CLKSCREW, an attack on TrustZone. Throughout this presentation I'm going to go through a few different papers, and just to make it clear which papers have already been published and which ones are new, I'll include the citation in the upper right-hand corner, so that we can tell what's old and what's new. The first paper, this CLKSCREW paper, is relatively new; it was released in 2017. The way CLKSCREW works is that it takes advantage of the energy management features of a processor. A non-secure operating system has the ability to manage the energy consumption of the different cores. If a certain target core doesn't have much scheduled to do, then the operating system is able to scale back the voltage or dial down the frequency on that core so that the core uses less energy, which is a great thing for performance: it really extends battery life, it makes the cores last longer, and it gives better performance overall. But the problem here is: what if you have two separate cores, and one of your cores is running this non-trusted operating system and the other core is running code in the secure world, running those trusted applications? That non-secure operating system can still dial down the voltage and can still change the frequency, and those changes will affect the secure world code. What the CLKSCREW attack does is this: the non-secure operating system core will dial down the voltage and overclock the frequency on the target secure world core in order to induce faults, to make the computation on that core fail in some way. When the computation fails, you get certain cryptographic errors that the attacker can use to infer things like secret AES keys and to bypass code signing implemented in the secure world. It's a very powerful attack that is made possible because the non-secure operating system is privileged enough to use these energy management features. CLKSCREW is an
example of an active attack, where the attacker is actively changing the outcome of the victim code, the code in the secure world. But what about passive attacks? In a passive attack, the attacker does not modify the actual outcome of the process; the attacker just tries to monitor the process and infer what's going on. That is the sort of attack we'll be considering for the rest of the presentation. In a lot of SGX and TrustZone implementations, the trusted and the non-trusted code both share the same hardware. This shared hardware could be a shared cache, it could be a branch predictor, it could be a TLB. The point is that they share the same hardware, so that changes made by the secure code may be reflected in the behavior of the non-secure code. The trusted code might execute and change the state of that shared cache, for example, and then the untrusted code may be able to go in, see the changes in that cache, and infer information about the behavior of the secure code. That's essentially how our side-channel attacks are going to work: the non-secure code is going to monitor these shared hardware resources for state changes that reflect the behavior of the secure code. Now, we've already talked about how Intel and SGX address the problem of memory management and who is responsible for making sure that those attacks don't
work on SGX. So what do they have to say about how they protect against these side-channel attacks and attacks on this shared cache hardware? They don't, at all. They essentially say: we do not consider this part of our threat model; it is up to the developer to implement the protections needed to protect against these side-channel attacks. That is great news for us, because these side channels can be very powerful, and if there aren't any hardware features that are necessarily stopping us from being able to accomplish our goal, it makes us that much more likely to succeed. So with that, we can
take a step back from TrustZone and SGX and just take a look at cache attacks, to make sure that we all have the same understanding of how cache attacks will be applied to these trusted execution environments. To do that, let's
go over a brief recap of how a cache works. Caches are necessary in processors because accessing main memory is slow: when you try to access something from memory, it takes a while to be read into the processor. So the cache exists as a sort of layer to remember what that information is. If the processor ever needs information from that same address, it is reloaded from the cache, and that access is going to be fast. It really speeds up memory access for repeated accesses to the same address. If we try to access a different address, then that will also be read into the cache, slowly at first, but then quickly for repeated accesses, and so on and so forth. As you can probably tell, in all these examples the memory blocks have been moving horizontally; they've always stayed in the same row. That reflects the idea of sets in a cache. There are a number of different set IDs, and they correspond to the different rows in this diagram. In our example there are four different set IDs, and each address in main memory maps to a particular set ID, so that address in main memory will only go to the location in the cache with the same set ID; it will only travel along those rows. That means if you have two different blocks of memory that map to different set IDs, they're not going to interfere with each other in the cache. But that raises the question: what about two memory blocks that do map to the same set ID? Well, if there's room in the cache, then the same thing will happen as before: those memory contents will be loaded into the cache and then retrieved from the cache for future accesses. The number of possible entries for a particular set ID within a cache is called the associativity, and on this diagram that is represented by the number of columns in the cache, so we will call our cache in this example a two-way set-associative cache. Now the next question is: what happens if you try to read a memory address that maps to the same
set ID, but all of the entries for that set ID within the cache are full? Well, one of those entries is chosen, it's evicted from the cache, the new memory is read in, and then that's passed to the processor. It doesn't really matter how the cache entry is chosen; for the purpose of this presentation you can just assume it's random. The important thing is that if you try to access the same memory that was evicted before, you now have to wait for that time penalty for it to be reloaded into the cache and read into the processor. So that is caches in a nutshell, in particular set-associative caches, and we can begin looking at the different types of cache attacks. For a cache attack we have two different processes: an attacker process and a victim process. For the type of attack that we're considering now, both of them share the same underlying code, so they're trying to access the same resources, which could be the case if you have page deduplication in virtual machines or copy-on-write mechanisms for shared code in shared libraries. The point is that they share the same underlying memory. The Flush+Reload attack works in two stages for the attacker. The attacker first starts by flushing out the cache: they flush each and every address in the cache, so the cache is just empty. Then the attacker lets the victim execute for a small amount of time, so the victim might read an address from main memory, loading it into the cache. The second stage of the attack is the reload phase. In the reload phase, the attacker tries to load different memory addresses from main memory and sees whether those entries are in the cache or not. Here the attacker will first try to load address 0 and sees that it takes a long time to read the contents of address 0, so the attacker can infer that address 0 was not in the cache, which makes sense because the attacker flushed it from the cache in the first stage. The attacker then tries to read the memory at address 1 and sees
that this operation is fast, so the attacker infers that the contents of address 1 are in the cache. Because the attacker flushed everything from the cache before the victim executed, the attacker then concludes that the victim is responsible for bringing address 1 into the cache. So this Flush+Reload attack reveals which memory addresses the victim accessed during that small slice of time. After that reload phase, the attack repeats: the attacker flushes again, lets the victim execute, reloads again, and so on. There is also a variant of the Flush+Reload attack, the Flush+Flush attack, which I'm not going to go into the details of. Essentially it's the same idea, but instead of using load instructions to determine whether or not a piece of memory is in the cache, it uses flush instructions, because flush instructions take longer if something is already in the cache. The important thing is that both the Flush+Reload attack and the Flush+Flush attack rely on the attacker and the victim sharing the same memory. This isn't always the case, so we need to consider what happens when the attacker and the victim do not share memory, and for this we have the Prime+Probe attack. The Prime+Probe attack once again works in two separate stages. In the first stage, the attacker primes the cache by reading all of the attacker's
memory into the cache, and then the attacker lets the victim execute for a small amount of time. No matter what the victim accesses from main memory, since the cache is full of attacker data, one of those attacker entries will be replaced by a victim entry. In the second phase of the attack, the probe phase, the attacker checks the different cache entries for particular set IDs and sees whether all of the attacker's entries are still in the cache. Maybe the attacker is curious about the last set ID, the bottom row. The attacker first tries to load the memory at address 3, and because this operation is fast, the attacker knows that address 3 is in the cache. The attacker tries the same thing with address 7, sees that this operation is slow, and infers that at some point address 7 was evicted from the cache. The attacker knows that something had to evict it from the cache, and it had to be the victim, so the attacker concludes that the victim accessed something in that last set ID, that bottom row. The attacker doesn't know whether it was the contents of address 11 or the contents of address 15, or even what those contents are, but the attacker has a good idea of which set ID it was.
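The Prime+Probe steps just described can be sketched as a small simulation. This is only a toy model under stated assumptions: the cache has four set IDs and two ways as in the talk's diagram, an address maps to set `addr % 4`, eviction is random, and the names `ToyCache` and `prime_probe` are made up for this illustration. A real attack infers hits and misses from timing rather than from a return value.

```python
import random

NUM_SETS, WAYS = 4, 2  # toy cache from the talk: four set IDs, two-way


class ToyCache:
    """Set-associative cache with random eviction (an assumption)."""

    def __init__(self, seed=0):
        self.sets = [[] for _ in range(NUM_SETS)]
        self.rng = random.Random(seed)

    def access(self, addr):
        """Load addr; return True on a hit (fast), False on a miss (slow)."""
        lines = self.sets[addr % NUM_SETS]
        if addr in lines:
            return True
        if len(lines) == WAYS:  # set is full: evict a random entry
            lines.pop(self.rng.randrange(WAYS))
        lines.append(addr)
        return False


def prime_probe(victim_addrs, seed=0):
    """Return the set IDs in which the attacker observed an eviction."""
    cache = ToyCache(seed)
    attacker = range(NUM_SETS * WAYS)  # addresses 0..7 fill every way
    for addr in attacker:              # prime: fill the cache
        cache.access(addr)
    for addr in victim_addrs:          # victim runs for one time slice
        cache.access(addr)
    touched = set()
    for addr in attacker:              # probe: a miss means an eviction
        if not cache.access(addr):
            touched.add(addr % NUM_SETS)
    return touched


print(prime_probe([11]))  # {3}: address 11 maps to the last set ID
```

As in the talk, the attacker learns that set ID 3 was used, but cannot tell whether the victim loaded address 11 or address 15.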
The important things to remember about cache attacks are these. Caches are very important; they're crucial for performance on processors. They give a huge speed boost, and there's a huge time difference between having a cache and not having a cache for your executables. The downside is that this big time difference also allows the attacker to infer information about how the victim is using the cache. We're able to use cache attacks in two different scenarios: where memory is shared, in the case of the Flush+Reload and Flush+Flush attacks, and where memory is not shared, in the case of the Prime+Probe attack. Finally, the important thing to keep in mind is that with these cache attacks we know where the victim is looking, but we don't know what they see. We don't know the contents of the memory that the victim is accessing; we just know the location of the accesses.
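For the shared-memory case, the Flush+Reload loop described earlier can be sketched in the same spirit. The latencies, the threshold, and the names `SharedCache` and `flush_reload` are assumptions chosen for illustration; on real hardware the attacker would flush with a cache maintenance instruction and time the reload with a cycle counter.

```python
HIT_CYCLES, MISS_CYCLES = 40, 200  # assumed latencies, illustration only
THRESHOLD = 100                     # loads faster than this count as cached


class SharedCache:
    """Tracks which shared addresses are currently cached."""

    def __init__(self):
        self.cached = set()

    def flush(self, addr):
        self.cached.discard(addr)

    def load(self, addr):
        """Load addr and return the simulated access time in cycles."""
        latency = HIT_CYCLES if addr in self.cached else MISS_CYCLES
        self.cached.add(addr)
        return latency


def flush_reload(cache, monitored, victim_step):
    """One round: flush, let the victim run, then time the reloads."""
    for addr in monitored:   # flush phase
        cache.flush(addr)
    victim_step(cache)       # victim runs for one time slice
    # reload phase: fast loads reveal what the victim touched
    return [addr for addr in monitored if cache.load(addr) < THRESHOLD]


cache = SharedCache()
print(flush_reload(cache, [0, 1, 2, 3], lambda c: c.load(1)))  # [1]
```

The result matches the example in the talk: address 0 reloads slowly, address 1 reloads quickly, so the victim must have accessed address 1.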
So what does an example trace of these attacks look like? There's an easy way to represent these traces as two-dimensional images. In this image the horizontal axis is time, so each column represents a different time slice, a different iteration of the prime, measure, and probe. The vertical axis is the different set IDs, which is the location that was accessed by the victim process. A pixel is white if the victim accessed that set ID during that time slice. As you look from left to right, as time moves forward, you can see the changes in the patterns of the memory accesses made by the victim process. For this particular example, the trace was captured on an execution of AES repeated several times, an AES encryption repeated about 20 times. You can tell that this is a repeated action because you see the same memory access patterns repeated in the data; you see the same structures repeated over and over. So you know that this is reflecting what's going on over time, but what does it have to do with AES itself? Well, if we take the same trace with the same settings but a different key, we
see that there is a different memory access pattern, with different repetitions within the trace. Only the key changed; the code didn't change. So even though we're not able to read the contents of the key directly using this cache attack, we know that the key is changing these memory access patterns, and if we can see these memory access patterns, then we can infer the key. That's the essential idea: we want to make these images as clear and as descriptive as possible, so that we have the best chance of learning what those secrets are.
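To make the link between key and access pattern concrete, here is a toy sketch that renders such a trace as text. It is not AES: the victim is modelled as a single table lookup per time slice at index `plaintext_byte XOR key_byte`, loosely in the style of a table-based cipher, and `NUM_SETS`, `ENTRIES_PER_LINE`, `trace`, and `render` are all hypothetical names and values.

```python
NUM_SETS = 16          # rows of the image (assumed toy value)
ENTRIES_PER_LINE = 16  # table entries sharing one cache line (assumed)


def trace(key, plaintext):
    """One entry per time slice: the set ID hit by the victim's lookup.

    The lookup index is (plaintext byte XOR key byte), so the set ID
    depends on the secret key.
    """
    return [((p ^ k) // ENTRIES_PER_LINE) % NUM_SETS
            for p, k in zip(plaintext, key)]


def render(columns):
    """Draw the trace as text: '#' marks an accessed set ID."""
    return "\n".join(
        "".join("#" if col == row else "." for col in columns)
        for row in range(NUM_SETS))


plaintext = bytes(range(0, 256, 16))  # same input for both runs
trace_a = trace(bytes([0x00] * 16), plaintext)
trace_b = trace(bytes([0x5A] * 16), plaintext)
print(render(trace_a))
print()
print(render(trace_b))
```

The two images differ even though the code and the plaintext are identical, which is the property the attacker exploits.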
We can define the metrics for what makes these cache attacks powerful in a few different ways. The three we'll be looking at are spatial resolution, temporal resolution, and noise. Spatial resolution refers to how accurately we can determine the "where": if we know that the victim accessed a memory address within 1,000 bytes, that's obviously not as powerful as knowing where they accessed within 512 bytes. Temporal resolution is similar: we want to know the order of the accesses the victim made. If the time slice during our attack is 1 millisecond, we're going to get much better ordering information on those accesses than we would get if we only saw all the memory accesses over the course of 1 second. So the shorter the time slice, the better the temporal resolution, the longer our picture will be on the horizontal axis, and the clearer an image of the cache we'll see. The last metric to evaluate our attacks on is noise, which reflects how accurately our measurements reflect the true state of the cache. Right now we've been using timing data to infer whether or not an item was in the cache, but this is a little bit noisy; it's possible that we'll have false positives or false negatives. We want to keep that in mind as we look at the different attacks. So that's essentially
cache attacks in a nutshell, and that's all you really need to understand in order to understand these attacks as they've been implemented on trusted execution environments. The first particular attack that we're going to be looking at is called the controlled-channel
attack on SGX. This attack isn't necessarily a cache attack, but we can analyze it in the same way that we analyze cache attacks, so it's still useful to look at. If you remember how memory management occurs with SGX, we know that if a page fault occurs during SGX enclave code execution, that page fault is handled by the kernel. So the kernel has to know which page the enclave needs to have paged in; the kernel already gets some information about what the enclave is looking at. Now, in the controlled
channel attack, what the attacker does from the untrusted OS is page almost every other page of the enclave out of memory. So whatever page the enclave tries to access, it's very likely to cause a page fault, which will be redirected to the untrusted OS, where the untrusted OS can record it, page out any other pages, and continue execution. So the OS essentially gets a list of sequential page accesses made by the SGX enclave, all by capturing the page-fault handler. This is a very general attack: you don't need to know what's going on in the enclave in order to pull it off. You just load up an arbitrary enclave and you're able to see which pages that enclave is trying to access. So how does it do on our metrics? First of all, the spatial resolution is not great. We can only see where the victim is accessing within 4,096 bytes, the size of a full page, because SGX obscures the offset into the page where the page fault occurs. The temporal resolution is good but not great: we're able to see any sequential accesses to different pages, but we're not able to see sequential accesses to the same page, because we need to keep that page paged in while we let the SGX enclave run for that small time slice. So the temporal resolution is good but not perfect. As for the noise, there is no noise in this attack, because no matter where the page fault occurs, the untrusted operating system is going to capture that page fault and handle it. So: very low noise, not great spatial resolution, but overall still a powerful attack.
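What the untrusted OS observes in the controlled-channel attack can be sketched as follows. This is a deliberate simplification that assumes exactly one enclave page is mapped at a time (the talk says "almost every other page" is paged out), and `page_fault_trace` is a hypothetical helper, not part of any SGX tooling.

```python
PAGE_SIZE = 4096


def page_fault_trace(accesses):
    """Pages the untrusted OS sees when only one enclave page is mapped."""
    observed = []
    mapped = None                 # the single page currently paged in
    for addr in accesses:
        page = addr // PAGE_SIZE  # the offset within the page stays hidden
        if page != mapped:        # access to an unmapped page faults
            observed.append(page)
            mapped = page         # OS maps it and unmaps the previous page
    return observed


enclave_accesses = [0x1000, 0x1008, 0x1FF0, 0x2004, 0x1010, 0x5000, 0x5ABC]
print(page_fault_trace(enclave_accesses))  # [1, 2, 1, 5]
```

Seven accesses collapse into four observations: repeated accesses to the same page are invisible (the temporal limit), and only the page number is revealed (the spatial limit).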
We still want to improve on that spatial resolution: we want to be able to see what the enclave is doing at a finer resolution than one page of 4 kilobytes. That's exactly what the CacheZoom paper does. Instead of interrupting the SGX enclave execution with page faults, it uses timer interrupts. Because the untrusted operating system is able to schedule when timer interrupts occur, it's able to schedule them at very tight intervals, so it's able to get that small and tight temporal resolution. And this is
what happens in between: the timer interrupt fires, the untrusted operating system runs the Prime+Probe attack code in this case, and then resumes execution of the enclave process, and this repeats. This is a Prime+Probe attack on the L1 data cache, so this attack lets you see what data the enclave is looking at. This attack could easily be modified to use the L1 instruction cache, and in that case you learn which instructions the enclave is executing. Overall this is an even more powerful attack than the controlled-channel attack. If we look at the metrics, we can see that the spatial resolution is a lot better: we now have a spatial resolution of 64 bytes, the size of an individual cache line. The temporal resolution is very good; it's "almost unlimited", to quote the paper, because the untrusted operating system has the privilege to keep scheduling those timer interrupts closer and closer together until it's able to capture very small time slices of the victim process. As for the noise, we're still using a cycle counter to measure the time it takes to load memory in and out of the cache, but the chances of having a false positive or false negative are low, so the noise is low as well. Now we can also look at TrustZone attacks, because so far the passive attacks that we've looked at have been against SGX, and those attacks have been pretty powerful. So what are the published attacks on TrustZone?
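The difference between the 4,096-byte and the 64-byte spatial resolution can be shown with a little address arithmetic. The page and line sizes come from the talk; the number of L1 sets (`NUM_SETS = 64`) is an assumption that varies by processor, and `observable` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # granularity of the controlled-channel attack
LINE_SIZE = 64    # granularity of an L1 Prime+Probe attack
NUM_SETS = 64     # assumed number of L1 sets; varies by processor


def observable(addr):
    """What each attack can learn about a single victim access."""
    return {
        "page": addr // PAGE_SIZE,
        "line_in_page": (addr % PAGE_SIZE) // LINE_SIZE,
        "set_id": (addr // LINE_SIZE) % NUM_SETS,
    }


a, b = 0x7F3A1040, 0x7F3A1FC0
print(observable(a))
print(observable(b))
print(PAGE_SIZE // LINE_SIZE, "cache lines per page")
```

Both addresses lie in the same page, so a page-fault-based attacker cannot tell them apart, while a cache-based attacker sees two different set IDs.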
There is one called TruSpy, which is similar in concept to the CacheZoom attack that we just looked at on SGX. It's once again a Prime+Probe attack on the L1 data cache. The difference here is that instead of interrupting the victim code execution multiple times, the TruSpy attack does the prime step, then does the full AES encryption, and then does the probe step. The reason they do this is that, as they say, the secure world is protected and is not interruptible in the same way that SGX is interruptible. But despite this, with just one measurement per execution, the TruSpy authors were able to use some statistics to still recover the AES key from that noise. Their methods were so powerful that they were able to do this from an unprivileged application in userland, so you don't even need to be running within the kernel in order to pull off this attack. So how does this attack measure up? The spatial resolution is once again 64 bytes, because that's the size of a cache line on this processor. The temporal resolution is pretty poor here, because we only get one measurement per execution of the AES encryption. This is also a particularly noisy attack, because we're making the measurements from userland; but even if we made the measurements from the kernel, we would still have the same issues of false positives and false negatives associated with using a cycle counter to measure membership in a cache. So we'd like to improve this a little bit. We'd like to improve the temporal resolution so that the power of the cache attack is a little bit closer on TrustZone to what it is on SGX. To improve that temporal resolution, let's dig into that statement that the secure world is protected and not interruptible. To do
this, we go back to this diagram of ARMv8 and how TrustZone is set up. It is true that when an interrupt occurs, it is directed to the monitor, and because the monitor operates in the secure world, if we interrupt secure code that's running in Exception Level 0, we're just going to end up running secure code in Exception Level 3. So this doesn't necessarily get us anything. I think this is what the authors mean by saying that it's protected against this: just by sending an interrupt, we don't have a way to redirect our control flow to the non-trusted code. At least that's how it works in theory. In practice, the Linux operating system running in Exception Level 1 in the non-secure world kind of needs interrupts in order to be able to work, so if an interrupt occurs and it is sent to the monitor, the monitor just forwards it right to the non-secure operating system. So we have interrupts just the same way as we did in CacheZoom.
We can improve the TrustZone attacks by using this idea. We have two cores: one core is running the secure code, the other core is running non-secure code, and the non-secure code is sending interrupts to the secure world core. That gives us the interleaving of attacker process and victim process that allows us to have a powerful Prime+Probe attack. So what does this look like? We have the attacker core
and the victim core. The attacker core sends an interrupt to the victim core. This interrupt is captured by the monitor, which passes it to the non-secure operating system. The non-secure operating system transfers this to our attack code, which runs the Prime+Probe attack. Then we leave the interrupt, the execution of the victim code in the secure world resumes, and we just repeat this over and over.
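The benefit of this interleaving can be sketched with a toy model that ignores the monitor and the cache itself and only shows what the attacker learns per time slice. The victim's address sequence, the noise-free probe, and the names `victim_accesses` and `collect_trace` are assumptions made for illustration.

```python
NUM_SETS = 4


def victim_accesses():
    """Hypothetical victim: a fixed sequence of addresses it loads."""
    return [11, 2, 2, 9, 4, 11, 6, 1]


def collect_trace(interval):
    """Interrupt the victim every `interval` accesses and record, per time
    slice, which set IDs it touched (probe result, assumed noise-free)."""
    accesses = victim_accesses()
    slices = []
    for start in range(0, len(accesses), interval):
        # prime happens here; the victim core then runs until the next
        # interrupt arrives from the attacker core
        window = accesses[start:start + interval]
        slices.append(sorted({addr % NUM_SETS for addr in window}))
    return slices


print(collect_trace(interval=8))  # one measurement per execution
print(collect_trace(interval=1))  # an interrupt after every access
```

With a single measurement the attacker only learns that every set was touched; with frequent interrupts the order of the accesses becomes visible as well.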
So now we have that interleaving of the processes of the attacker and the victim. Instead of having a temporal resolution of one measurement per execution, we once again have almost unlimited temporal resolution, because we can just schedule when we send the interrupts from the attacker core. We would also like to improve the noise
measurements, because if we can improve the noise, we'll get clearer pictures and we'll be able to infer those secrets more clearly. We can get some improvement by switching the measurements from userland to the kernel, but again we have the cycle counters. So what if, instead of using the cycle counter to measure whether or not something is in the cache, we use the other performance counters? Because on ARMv8 platforms
there's a way to use performance counters to measure different events such as cache hit in cache misses so the events in these performance monitors require privileged access in order to use which for this that we do have a typical Castex scenario we would have access to the performance models which is why they haven't really been explored before but in this weird scenario where were attacking the less part which come from the more pull which code we do have access the performance monitors and we can use these monitors during the probe set to get a very accurate count of whether or not a certain memory load cause a cache miss her cache at so were able to essentially get rid of the different levels of noise now 1 thing to point out is that maybe we'd like to use these on the performance counters in order to count the different events that are occurring a secure world coded so maybe we started the performance counters from the nonsecure world but the secure world 1 and then when the secure world exerts we use the natural to read these performance counters and they would like to see how many instructions the secure World executed or how many branch instructions or how many arithmetic instructions cache misses that word but unfortunately obviate took this into account I and by default performance counters there certainly not world will not measure events that happen in the scope of which is smart which is how it should be in the only reason I bring this up is because that's how it is 7 so we're going to hold off on top of that is exploring the different implications of what that means but I wanna focus on the because that's that's the newest of the new so what keep looking at that are so we is run the premise of attack to use the performance counters so we can get a clear picture of what is and what is not in the cache and instead of having noisy measurements based on time we have virtually no noise at all because we get the truth straight from the processor itself 
whether or not we experience a cache miss yeah so we
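The idea of probing with an event counter instead of a timer can be sketched with a toy cache model. This is only a simulation under stated assumptions: `ToyCache`, its `misses` field, the 4-way associativity, and the attacker base address are all hypothetical stand-ins, and no real performance-monitor interface is used. The 64-byte line size and 256 sets are the figures quoted in the talk.

```python
from collections import OrderedDict

LINE = 64    # cache line size in bytes, as stated in the talk
SETS = 256   # number of L1 sets, as stated in the talk
WAYS = 4     # assumed associativity, chosen only for illustration


class ToyCache:
    """Set-associative LRU cache with a miss counter (a stand-in for a PMU event)."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(SETS)]
        self.misses = 0

    def access(self, addr):
        index = (addr // LINE) % SETS
        tag = addr // (LINE * SETS)
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)       # hit: mark as most recently used
            return
        self.misses += 1                # miss: this is the event being counted
        if len(ways) >= WAYS:
            ways.popitem(last=False)    # evict the least recently used line
        ways[tag] = True


def attacker_lines(set_index, base=0x100000):
    """Hypothetical attacker-owned addresses that all map to the given set."""
    return [base + (way * SETS + set_index) * LINE for way in range(WAYS)]


def prime(cache, set_index):
    for addr in attacker_lines(set_index):
        cache.access(addr)


def probe(cache, set_index):
    """Re-access the attacker's lines in reverse order; return the exact miss count."""
    before = cache.misses
    for addr in reversed(attacker_lines(set_index)):
        cache.access(addr)
    return cache.misses - before


def observe(victim_addr):
    cache = ToyCache()
    for s in range(SETS):
        prime(cache, s)
    cache.access(victim_addr)           # the victim's single memory load
    return [s for s in range(SETS) if probe(cache, s) > 0]


print(observe(37 * LINE))  # prints [37]
```

Because the probe reads a count of misses rather than an elapsed time, the toy result is exact: only the set the victim touched reports a miss. Real hardware would add effects this model ignores, such as prefetching and replacement policies other than LRU.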
So we've seen these attacks. Where do we go from here? We have all these ideas, we have ways to make these TrustZone attacks more powerful, but that's not worthwhile unless we actually implement them. So the goal here is to implement these attacks on TrustZone. And since the non-secure world operating system is typically based on Linux, we'll take that into account when making our implementation.
We'll write a kernel module that uses these performance counters and these inter-processor interrupts in order to actually accomplish these attacks, and we'll write it in such a way that it's very generalizable. You can take this kernel module that was written for one device (in my case, I did most of my attacks on the Nexus 5X) and it's very easy to transfer this module to any other Linux-based device that has TrustZone and these shared caches. So it should be very easy to port this over and to perform the same powerful cache attacks on different platforms.
We can also do clever things based on the Linux operating system so that we limit the collection window to just when we're executing within the secure world, so we can align traces a lot more easily that way. The end result is a synchronized trace for each different attack, because, since I've written it in a modular way, we're able to run different attacks simultaneously. So maybe we're running an attack on the L1 data cache to learn where the victim is accessing memory, and we're simultaneously running an attack on the L1 instruction cache so we can see what instructions the victim is executing, and these can be aligned.
The tool that I've written is a combination of a kernel module, which actually performs the attack, a userland binary, which schedules these processes to different cores, and a GUI that allows you to interact with this kernel module and rapidly start doing these cache attacks for yourself and perform them against different processes and secure world code. The intention behind this tool is to be very generalizable, to make it very easy to use this platform for different devices, and to give people a way to quickly develop these attacks and also to see if their own code is vulnerable to these cache attacks, to see if their code has these secret-dependent memory accesses.
So can we get even better spatial resolution? Right now we're down to 64 bytes, and that's the size of a cache line, which is the size of our shared hardware. The answer is yes, we actually can get better than 64 bytes, based on something called a branch shadowing attack.
A branch shadowing attack takes advantage of something called the branch target buffer. The branch target buffer is a structure that's used for branch prediction. It's similar to a cache, but there's a key difference: the branch target buffer doesn't compare the full address when seeing if something is already in the cache. It doesn't compare all of the upper-level bits. That means it's possible that two different addresses will experience a collision, and the same entry from that BTB cache will be read out for the wrong address. Since this is just for branch prediction, the worst that can happen is that you get a misprediction and a small time penalty, but that's about it.
The idea behind the branch shadowing attack is to leverage this small difference, this overlapping and this collision of addresses, in order to execute a sort of shared-code-style Flush+Reload attack on the branch target buffer. So here's what goes on. In this attack, the attacker modifies the SGX enclave to make sure that the branches that are within the enclave will collide with branches that are not in the enclave. The attacker executes the enclave code, and then the attacker executes their own code, and based on the outcome of the victim code in that cache, the attacker code may or may not experience a branch misprediction. So the attacker is able to tell the outcome of a branch because of this overlap and this collision, like it would be in a Flush+Reload attack, where the memory overlaps between the attacker and the victim.
So here our spatial resolution is fantastic: we can tell down to individual branch instructions in SGX. We can tell exactly which branches were executed and which directions they were taken, in the case of conditional branches. The temporal resolution is also, once again, almost unlimited, because we can use the same timer interrupts in order to schedule our attacker process. And the noise is once again very low, because we can use the same sort of branch misprediction counters that exist in the Intel world in order to measure this noise.
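The collision described above, where a branch target buffer compares only part of the address, can be illustrated with a small model. Everything here is a hypothetical sketch: the bit widths, the `ToyBTB` class, and the addresses are assumptions made for illustration, and `predict` stands in for the misprediction signal a real attacker would have to observe indirectly.

```python
INDEX_BITS = 9   # assumed number of index bits, illustration only
TAG_BITS = 8     # assumed number of tag bits that are actually compared


class ToyBTB:
    """Branch target buffer that matches entries on a truncated address."""

    def __init__(self):
        self.entries = {}

    def _key(self, pc):
        index = pc & ((1 << INDEX_BITS) - 1)
        partial_tag = (pc >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
        return index, partial_tag       # bits above the partial tag are ignored

    def record_taken(self, pc, target):
        self.entries[self._key(pc)] = target

    def predict(self, pc):
        return self.entries.get(self._key(pc))

    def flush(self):
        self.entries.clear()


def shadow_of(pc):
    """A different address that differs from pc only in the ignored upper bits."""
    return pc + (1 << (INDEX_BITS + TAG_BITS))


def victim(btb, secret_bit, branch_pc=0x4010, target=0x4080):
    if secret_bit:                      # secret-dependent conditional branch
        btb.record_taken(branch_pc, target)


def recover(secret_bits, branch_pc=0x4010):
    btb = ToyBTB()
    recovered = []
    for bit in secret_bits:
        btb.flush()
        victim(btb, bit, branch_pc)
        # The attacker never uses branch_pc itself, only its colliding shadow.
        hit = btb.predict(shadow_of(branch_pc)) is not None
        recovered.append(1 if hit else 0)
    return recovered


print(recover([1, 0, 1, 1, 0]))  # prints [1, 0, 1, 1, 0]
```

The shadow address is never executed by the victim, yet it reads back the victim's entry because the upper bits are not compared. That is the same sharing property Flush+Reload relies on, moved from memory lines to branch predictor entries.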
So does any of that apply to TrustZone attacks? Well, in this case the victim and the attacker don't share entries in the branch target buffer, because the attacker is not able to map the virtual address of the victim process. But this is kind of reminiscent of our earlier cache attacks: the Flush+Reload attack only worked when the attacker and the victim shared that memory, but we still had the Prime+Probe attack for when they don't. So what if we use a Prime+Probe-style attack on the branch target buffer cache in ARM processors?
Essentially, what we do here is prime the branch target buffer by executing many attacker branches, to fill up this BTB cache with the attacker's branch prediction data. We let the victim execute a branch, which will evict an attacker BTB entry, and then we have the attacker re-execute those branches and see if there have been any mispredictions.
The cool thing about this attack is that the structure of the BTB cache is different from that of the L1 caches. Instead of having 256 different sets, as in the L1 cache, the BTB cache has 2048 different sets, so we can tell which branch the victim executes based on which one of 2048 different set IDs it could fall into. And even more than that, on the ARM platform, at least on the Nexus 5X that I was working with, the granularity is no longer 64 bytes, which is the size of the line; it's 16 bytes. So we can see which branches the trusted code within TrustZone is executing to within 16 bytes.
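To make the resolution difference concrete, here is a small sketch comparing how two nearby branch addresses map onto the L1 sets and onto the BTB sets. The set counts and granularities are the figures quoted in the talk; the plain modular indexing and the example addresses are assumptions, since the talk does not describe the real index function.

```python
L1_LINE, L1_SETS = 64, 256          # figures quoted in the talk for the L1 caches
BTB_GRANULE, BTB_SETS = 16, 2048    # figures quoted in the talk for the Nexus 5X BTB


def l1_set(addr):
    return (addr // L1_LINE) % L1_SETS


def btb_set(addr):
    # Assumed indexing: plain modular mapping of 16-byte chunks onto sets.
    return (addr // BTB_GRANULE) % BTB_SETS


def distinguishable(a, b):
    """Report which structure places the two branch addresses in different sets."""
    return {"l1": l1_set(a) != l1_set(b), "btb": btb_set(a) != btb_set(b)}


# Two hypothetical branch addresses 16 bytes apart, inside the same 64-byte line.
print(distinguishable(0x1000, 0x1010))  # prints {'l1': False, 'btb': True}
```

Under these assumptions, two branches that an L1 instruction cache attack would see as a single set become separate observations in the BTB attack.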
So what does this look like? Previously, with the TruSpy attack, this is sort of the outcome of our Prime+Probe attack: we get one measurement for those 256 different set IDs. When we added those interrupts, we were able to get that time resolution, and it looks something like this. Maybe you can see, a little bit at the top of the screen, how there are these repeated sections of little white blocks, and you can use that to infer that maybe the same cache lines and instructions are being called over and over. So just looking at this L1 instruction cache attack, you can tell some information about how the process went.
Now let's compare that to the BTB attack. I don't know if you can see it too clearly; it's a bit too high of a resolution right now, so let's just focus in on one small part of this overall trace. This is what it looks like. Each of those white pixels represents a branch that was taken by that secure world code, and we can see repeated patterns, we can see maybe different functions that were called, we can see different loops. Just by looking at this one trace, we can infer a lot of information on how that secure world code executed. So it's incredibly powerful, and all those secrets are just waiting to be uncovered using these new tools.
So where do we go from here? What sort of countermeasures do we have? Well, first of all, I think the long-term solution is going to be moving to no more shared hardware. We need to have separate hardware and no more shared caches in order to fully get rid of these different cache attacks. And we've already seen this trend in different cell phones. For example, in Apple SoCs, for a long time now, I think since the Apple A7, the Secure Enclave, which runs the secure code, has had its own cache, so these cache attacks can't be accomplished from code outside of that Secure Enclave. Just by using that separate hardware, it knocks out a whole class of potential side-channel and microarchitectural attacks. And just recently the Pixel 2 is moving in the same direction: the Pixel 2 now includes a hardware security module that handles cryptographic operations, and that chip also has its own memory and its own caches, so we can no longer use this attack to extract information about what's going on in this external hardware security module.
But even then, using separate hardware doesn't solve all of our problems, because we still have the question of what to include in this separate hardware. On the one hand, we want to include more code in that separate hardware so we're less vulnerable to these side-channel attacks, but on the other hand we don't want to expand the attack surface any more, because the more code we include in these secure environments, the more likely it is that a vulnerability will be found and the attacker will be able to get a foothold within the secure, trusted environment. So there's going to be a balance between what you choose to include in the separate hardware and what you don't. Do you include DRM code? Do you include cryptographic code? It's still an open question.
That's the long-term approach. In the short term, you just kind of have to write side-channel-free software: be very careful about what your process does, whether there are any secret-dependent memory accesses, secret-dependent branching, or secret-dependent function calls, because any of those can leak the secrets out of your trusted execution environment.
So here are the things that, if you're a developer of trusted execution environment code, I want you to keep in mind. First of all, performance is very often at odds with security. We've seen over and over that the performance enhancements to these processors open up the ability for these microarchitectural attacks to be more efficient. Additionally, these trusted execution environments don't protect against everything; there are still these side-channel attacks and these microarchitectural attacks that these systems are vulnerable to. These attacks are very powerful, they can be accomplished simply, and with the publication of the code that I've written, it should be very simple to get set up and to analyze your own code to see: am I vulnerable, do I expose information in the same way? And lastly, it only takes one small error, one tiny leak from your trusted and secure code, in order to extract the entire secret, in order to bring the whole thing down.
So what I want to leave you with is this: I want you to remember that you are responsible for making sure that your program is not vulnerable to these microarchitectural attacks, because if you do not take responsibility for this, who will? Thank you.
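The short-term advice in the talk, avoiding secret-dependent memory accesses, can be illustrated by comparing which cache lines two table lookups touch. This is a hedged sketch of the access pattern only: the table contents, function names, and the `trace` list are hypothetical, and Python itself gives no timing or constant-time guarantees, so real trusted code would apply the same idea in a lower-level language.

```python
LINE = 64                                         # cache line size in bytes, as in the talk
TABLE = [(i * 7 + 3) % 256 for i in range(256)]   # placeholder 256-byte table


def direct_lookup(secret, trace):
    """Secret-dependent access: only the line holding TABLE[secret] is touched."""
    trace.append(secret // LINE)
    return TABLE[secret]


def scanning_lookup(secret, trace):
    """Touch every entry in a fixed order and select the wanted one by masking."""
    result = 0
    for i, value in enumerate(TABLE):
        trace.append(i // LINE)
        mask = -(i == secret) & 0xFF    # 0xFF for the wanted index, otherwise 0
        result |= value & mask
    return result


def lines_touched(lookup, secret):
    trace = []
    lookup(secret, trace)
    return sorted(set(trace))


print(lines_touched(direct_lookup, 5), lines_touched(direct_lookup, 200))
print(lines_touched(scanning_lookup, 5), lines_touched(scanning_lookup, 200))
```

The direct lookup touches a different line for each secret, which is exactly what a Prime+Probe observer can see. The scanning lookup touches the same four lines whatever the secret is, at the cost of reading the whole table, which is one small instance of the performance versus security trade-off the talk describes.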
Thank you very much. Please, if you want to leave the hall, do it quietly, take all your belongings with you, and respect the speaker. We have plenty of time, 16, 17 minutes for Q&A, so please line up at the microphones. No questions from the signal angel? All right, so we can start with microphone 6, please.
[inaudible] Does the non-secure OS always get all the interrupts, or does the secure world?
So in ARM there are a couple of different kinds of interrupts. I think, if I'm remembering the terminology correctly, there's an IRQ and an FIQ interrupt. The non-secure mode handles the IRQ interrupts and the secure mode handles the FIQ interrupts, so which one you send determines which direction the monitor will route that interrupt.
OK, thank you.
Microphone number 7, please.
You presented the attacks on TrustZone, but AMD also has an implementation of TrustZone. Were you looking into it?
I haven't looked into AMD too much, because as far as I can tell that's not used as commonly, but there are many different types of trusted execution environments. The two that I focused on were SGX and TrustZone, because those are the most common examples that I've seen.
Thank you. Microphone number 8, please.
When TrustZone is moved to dedicated hardware and dedicated memory, couldn't you still do these attacks by loading your own trusted code and using it as an oracle of sorts?
If you can load your own trusted code, then yes, you could do that, but in many of the models I've seen today that's not possible. That's why you have things like code signing, which prevents an arbitrary user from running their own code in the trusted OS, in the trusted environment.
All right. Microphone number 1.
So these attacks are more powerful against code running in trusted execution environments than they would be against ring 3 or kernel-level code. Does that not mean that trusted execution environments are basically an unattractive option that we shouldn't use?
There's still a lot of benefit to using these trusted execution environments. The point I want to get across is that, although they add a lot of features, they don't protect against everything, so you should keep in mind that these side-channel attacks do still exist and you still need to protect against them. But overall these are good things and worthwhile including.
Thank you. Microphone number 1 again, please.
[inaudible] is doing something with encrypting memory, and I'm not sure if they encrypt addresses too. Would that be a defense against such attacks?
I'm not too familiar with that, but SGX also encrypts memory in between the lowest-level cache and the main memory. That doesn't really have an impact on the actual operation, because the memory is decrypted at the cache level, and as the attacker we don't care what the data is within the cache line; we only care which cache line has been accessed.
If you encrypt the addresses, wouldn't that help against it?
I'm not sure how you would encrypt the addresses yourself. As long as those addresses map into the same set IDs that the victim can map into, then the attacker could still pull off the same style of attacks.
Great. We have a question from the Internet, please.
The question is: does the secure enclave on the Samsung Exynos distinguish the receiver of the message, so that if a user application asks to decrypt an AES message, one can sniff on the secure enclave [inaudible]?
That sounds like it's asking about the TruSpy-style attack, where it's calling into the secure world to encrypt something with AES. I think that would all depend on the particular implementation. As long as it's encrypting with a certain key and it's able to do that repeatedly, then the attack, assuming a vulnerable AES implementation, would be able to extract the key.
Cool. Microphone number 2, please.
Do you recommend a reference to understand how these cache attacks and branch oracles actually lead to key recovery?
Yeah, so I will flip through these pages, which include a lot of references for the attacks I mentioned. If you're watching the video you can see these right away, or just access the slides. A lot of these contain good starting points. I didn't go into all the details on how, for example, the TruSpy attack recovered the AES key, but that paper does have a lot of good links for how those attacks can lead to key recovery. Same thing with the CLKSCREW attack, how the fault injection can lead to key recovery.
Microphone number 6, please.
I think my question is related, almost the same thing. How hard is it actually to recover the keys? Is this a massive machine learning problem, or is this something that you can do practically on a single machine?
It varies entirely by the implementation. For all these attacks to work, you need to have some sort of vulnerable implementation, and some implementations leak more data than others. In the case of a lot of the AES attacks, where you're doing the passive attacks, those are very easy to do on just your own computer. For the AES fault injection attack, I think that one required more brute force; in the CLKSCREW paper, that one required more computing resources, but it was still entirely practical to do in a realistic setting.
Good, thank you. So we have one more. Microphone number 1, please.
I hope it's not too late for questions. I was wondering, since all these attacks rely on cache hits and misses, is it possible to forcibly flush or invalidate the cache, or fill it with noise, at each operation in the trusted environment, so that you give up some optimization and performance for additional security benefits?
Yeah, that is absolutely possible, and you're absolutely right, it does lead to a performance degradation, because if you always flush the entire cache every time you do a context switch, that will be a huge performance hit. So again, that comes down to the question of the performance and security trade-off: which one do you end up going with? It seems that historically the choice has been more in the direction of performance.
Thank you. We have one more on microphone number 1, please.
So I have more of a moral question. How well should we really be protected from attacks which need some kernel cooperation? Basically, when we use TrustZone for purposes like [inaudible], protecting something from interacting with the outside world, then we are basically using the trusted execution environment for sandboxing the process. But once we need some cooperation from the kernel, doesn't that attack in fact empower the user instead of the hardware producer?
Yes, it will depend entirely on what your application is and what your threat model is. If you're using these trusted execution environments to do DRM, for example, then maybe you would be worried about that ring 0 attacker, that privileged attacker who has their phone rooted and is trying to recover these media encryption keys from this execution environment. But maybe there are other scenarios where you're not as worried about having an attacker with root access. So it all depends on the context.
All right, thank you. So we have one more. Microphone number 1 again.
Great talk, thank you very much. Just a short question: do you have any success stories about attacking TrustZone and different implementations of TEEs from particular vendors?
Not at this time.
Thank you.
Thanks. Thank you very much. [Applause]

Metadata

Formal Metadata

Title Microarchitectural Attacks on Trusted Execution Environments
Series title 34th Chaos Communication Congress
Author Ryan, Keegan
License CC Attribution 4.0 International:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they have specified.
DOI 10.5446/34819
Publisher Chaos Computer Club e.V.
Release year 2017
Language English

Content Metadata

Subject area Computer Science
Abstract Trusted Execution Environments (TEEs), like those based on ARM TrustZone or Intel SGX, intend to provide a secure way to run code beyond the typical reach of a computer’s operating system. However, when trusted and untrusted code runs on shared hardware, it opens the door to the same microarchitectural attacks that have been exploited for years. This talk provides an overview of these attacks as they have been applied to TEEs, and it additionally demonstrates how to mount these attacks on common TrustZone implementations. Finally, we identify new techniques which allow us to peer within TrustZone TEEs with greater resolution than ever before.
Keywords Security
