The ring 0 façade: awakening the processor's inner demons
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 322 | |
Author | ||
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/39677 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
| |
Keywords |
Transcript: English(auto-generated)
00:00
Here's Mr. Domas. All right, thanks everyone. Let's go ahead and dive in. So the fundamental premise behind this presentation is that just because we have code running at ring zero, just because we have code running in the most protected, privileged realm
00:21
of the processor, it doesn't necessarily mean that we really have access to everything on that processor. So I want to explore that idea today. But first, the most important part of any presentation, the part the lawyers put in, a disclaimer. All this research is stuff I did independently. This does not reflect in any way my employer. This is not their opinions. This is purely my own speculation and ideas.
00:42
But with that out of the way, my name is Christopher Domas. I'm a cybersecurity researcher. I've spent the last few years sort of poking around low-level processor security, and one of the things I really like about this is trying to find ways to expose secrets on processors, things that we're not supposed to know about or not supposed to have access to. So for this particular presentation,
01:00
I want to look at the idea of what are called model-specific registers in x86. So these registers are used for lots of different miscellaneous things, things like debugging and execution tracing and performance monitoring. And you can adjust your clock speed on the processor through MSRs. You can toggle thermal controls and thermal sensors and safety mechanisms on and off through MSRs.
01:21
You can adjust cache behavior with these model-specific registers. They do all sorts of miscellaneous kind of crazy things, but you can dig a lot deeper than that and start to find some scary stuff that MSRs are sometimes responsible for. For example, it is known that some undocumented model-specific registers can toggle really, really powerful debug features on the processor.
01:41
There's actually really, really good evidence that some firmware is using undocumented model-specific registers to enable previously disabled cores on the processor. And if you saw my Project Rosenbridge presentation yesterday, what we saw is that some MSRs are actually used for things like toggling hardware backdoors
02:00
on the processor. So there is some really, really incredible functionality tied up in these registers that most people don't have a lot of familiarity with. So it's definitely something that we want to investigate more. So just a little bit of background on how you use these MSRs before we dive into some really interesting facets of them. Basically, the way MSRs work is that you have
02:21
to be in ring 0 in order to access an MSR. And then you access an MSR not by its name, but by its address. And MSRs have addresses between 0 and 4 gigabytes. And only a very, very small portion of that address space is actually implemented on most processors. Something like tens of MSRs on an older processor or hundreds on a modern processor, but not many in the scheme of things.
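To make the addressing concrete, here is a minimal sketch of reading an MSR by address on Linux. It assumes the kernel's `msr` driver is loaded and the script runs as root; the driver exposes each CPU's MSRs as `/dev/cpu/N/msr`, where the file offset is the MSR address and each read returns 8 bytes. The helper names `read_msr` and `split_msr_value` are illustrative and do not come from the talk.

```python
import os
import struct


def split_msr_value(value: int) -> tuple[int, int]:
    """Split a 64-bit MSR value into the (EDX, EAX) halves that RDMSR returns."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


def read_msr(address: int, cpu: int = 0) -> int:
    """Read one MSR through the Linux msr driver (needs root and `modprobe msr`)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The file offset selects the MSR address; the value is 8 bytes long.
        raw = os.pread(fd, 8, address)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


if __name__ == "__main__":
    try:
        tsc = read_msr(0x10)  # 0x10 is IA32_TIME_STAMP_COUNTER
        edx, eax = split_msr_value(tsc)
        print(f"MSR 0x10 = {tsc:#018x} (EDX={edx:#010x}, EAX={eax:#010x})")
    except OSError as err:
        # Missing driver, missing privileges, or a faulting MSR all land here.
        print(f"could not read MSR: {err}")
```

The kernel driver executes the privileged instruction on the caller's behalf, which is why a user-space script can do this at all; a faulting address surfaces as an I/O error rather than a crash.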
02:41
MSRs are 64 bits, and you read them with the read MSR (RDMSR) assembly instruction and write them with the write MSR (WRMSR) assembly instruction. So when I started looking into this, what I wanted to figure out was how deep do MSRs go? What real functionality is there that I might be able to tinker with? So I stumbled across this patent from VIA
03:02
that we actually looked at yesterday, if you were here. In this patent, they casually mention that accessing some of the internal control registers (they're talking about the MSRs in this situation) can enable the user to bypass security mechanisms, for example, allowing ring 0 access at ring 3. In other words, allowing you to reach into the kernel from userland, something that should never be possible. And they go on to
03:22
say, for these reasons, the various x86 processor manufacturers have not publicly documented any description of the address or function of some control MSRs. So that part caught my attention. It's kind of like the Streisand effect, right? You're telling me that, hey, there are these really, really powerful MSRs out there and we're not going to tell you any more. Well, of course, that just makes me want to find out more
03:42
about them. And if we keep reading through the patent, we start to learn some other interesting things. They say, nevertheless, the existence and locations of some of these undocumented control MSRs are easily found by programmers who typically then publish their findings for all to use. Specifically, what they're concerned about here is
04:00
people reverse engineering firmware. Where firmware is using these undocumented MSRs, somebody who's reverse engineering the firmware, somebody who's looking at it, can very trivially see that these MSRs exist and figure out what they're being used for. But from a manufacturer's perspective, they've got a dilemma. They want to be able to tell their customers, their OEMs, the people using their
04:20
chips and building boards from their chips, they want to be able to tell their customers about these MSRs, but disclosing this information to their customers would result in the secret of these control MSRs basically becoming widely known when somebody looks at the firmware and thus being usable by anyone on any processor. So this patent's actually proposing a solution to
04:40
this problem. They're proposing a technique where the microprocessor itself would include a secret key manufactured internally within the microprocessor and externally invisible. And this microprocessor would have encryption configured to decrypt a user supplied password using this secret key in order to generate a
05:00
decrypted result in response to user instructions on the microprocessor to access the control register. So they're basically saying they are password protecting the read MSR and write MSR assembly instructions for very, very special, powerful secret model specific registers. So that's a little bit scary from
05:20
a security perspective. Right? That's not the way I think things should be working. Basically they're saying we're going to give third parties trusted keys to secret pieces of your processor and you, the end user, aren't going to have access to this. So that's a little bit unsettling from my perspective. Basically the question then is well could my processor right now on
05:42
this computer have these secret undocumented all powerful password protected registers in it? And I don't even know because these things aren't documented anywhere because nobody knows about this. It turns out the answer to that is yes. This has been done before and we know this has been done before. On the AMD K7
06:00
and K8 processors they were actually using password protected MSRs. The exact thing just described in this patent. And this was discovered exactly like the patent was worried about. This was discovered through firmware reverse engineering. People saw these MSRs being accessed and they saw the password that firmware was using to access them. Now the K7 and K8 had a very simple password protection scheme.
06:22
It was just a 32 bit password loaded into a general purpose register. So let's start looking at the K7 and K8 just as a case study. Basically let's try to treat these processors as a black box. Assume we didn't know this going in and see if we can find a better approach to identifying password protected registers on
06:41
x86. And I think that's important because we shouldn't have to wait until we've already been owned. We shouldn't have to wait until somebody else is accessing the secrets of our processors in order to figure out that this stuff exists. We should have some kind of means of detecting this kind of stuff on our own. So that's what I wanted to develop. A means of detecting these password protected registers
07:02
before other people started using them. So here's how things worked on AMD. Basically you would move a magic 32-bit value, the password, into the EDI register. Then you would move the address of the MSR that you're trying to access into the ECX register, and then you would issue a read MSR instruction. And then a couple of different things could happen.
07:22
If the MSR that you were trying to access doesn't actually exist on the processor, the CPU would generate what's called a general protection exception. On the other hand, if that MSR existed but you had the wrong password, the CPU would also generate a general protection exception. So that creates a problem for our research here.
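The access convention just described can be sketched as a toy model. Everything in it is hypothetical: the MSR addresses, the stored values, and the password are invented for illustration and are not taken from any real processor. The point is only that, from the caller's side, a missing MSR and a wrong password look identical.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the #GP exception raised by the CPU."""


# Hypothetical toy processor: all addresses, values and the password are made up.
OPEN_MSRS = {0x10: 0x1234, 0x1B: 0xFEE00900}
PROTECTED_MSRS = {0xC0010000: 0xDEADBEEF}
PASSWORD = 0x0BADC0DE


def rdmsr(ecx: int, edi: int = 0) -> int:
    """Model RDMSR: ECX carries the MSR address, EDI the (optional) password."""
    if ecx in OPEN_MSRS:
        return OPEN_MSRS[ecx]
    if ecx in PROTECTED_MSRS and edi == PASSWORD:
        return PROTECTED_MSRS[ecx]
    # Nonexistent MSR and wrong password both end up here.
    raise GeneralProtectionFault()


def probe(ecx: int, edi: int = 0) -> str:
    """Return the MSR value as hex text, or '#GP' if the access faulted."""
    try:
        return f"{rdmsr(ecx, edi):#x}"
    except GeneralProtectionFault:
        return "#GP"


if __name__ == "__main__":
    print(probe(0x12345678))            # MSR does not exist
    print(probe(0xC0010000, edi=0))     # MSR exists, wrong password
    print(probe(0xC0010000, PASSWORD))  # MSR exists, right password
```

In this model the first two probes are indistinguishable by their result alone, which is exactly the obstacle the rest of the talk works around.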
07:42
We get the same results in both cases. In other words in order to detect that our CPU has password protected registers that we're being kept out of we have to both guess the model specific register address and guess the MSR password. Guessing either one of those two things wrong
08:01
gives us the exact same behavior. It gives us a general protection exception. That means we have to guess a 32 bit address and a 32 bit password. We have to guess 64 bits of information correctly in order to just detect that a password protected register exists on our processor. So if you look at even the simplest embodiment of password protected registers, just 32 bit passwords like AMD was using
08:22
if you could make a billion such guesses per second it would take you 600 years of processing in order to find all the password protected registers on your processor. So we need a better way. We need to figure out how could we detect that our processor has password protected registers without actually needing to know the password first. And the secret to figuring this out
08:42
is sort of realizing that assembly is actually a high-level abstraction. Underneath the hood of your processor, each x86 assembly instruction is actually broken out into micro-ops for execution by the CPU core. So if we start thinking about what the microcode behind a read MSR assembly instruction might look like, it might
09:02
look something like this. Underneath the hood, the microcode needs to figure out what MSR you're trying to access and figure out how to give you the contents of that MSR. So it might check, well, are you trying to access MSR number 1? If so, it'll figure out how to handle MSR number 1. Otherwise, are you trying to access MSR 6? If so, it'll give you the contents of MSR 6. Et cetera, et cetera, until the very end:
09:21
if it hasn't matched any of the existing MSRs that you're trying to access, that must mean that you're trying to access an MSR that doesn't exist, so it throws a general protection exception. You might think maybe this is implemented as a jump table. We'll see some evidence coming up that this actually can't be implemented as a jump table. But that's sort of one possible implementation for the microcode behind that read
09:41
MSR instruction. So we can look at a little permutation of that to see what it might look like if microcode was trying to service a password-protected model-specific register. So in this situation I'm saying MSR number 1337C0DE ("leet code") is a password-protected MSR and I'm trying to access it. So what the microcode is going to do is say, well, are you trying to access MSR 1?
10:01
Nope. MSR 6? Nope. Ah, you're trying to access MSR 1337C0DE. Well, after that it needs to check: is your password correct? In this case, is the EBX register FEEDFACE ("feed face")? If so, it'll service that MSR; otherwise it'll throw a general protection exception. So there's a key observation here. There are two different paths that the microcode took
10:21
in order to get to the same result. In both situations it ended up throwing a general protection exception, but there are two different paths it took. So let's look at the path that the microcode takes if I try to access an MSR that doesn't exist. Let's say I try to access MSR number 12345678, which doesn't exist on this processor. It checks: are you accessing 1? 6? 1337C0DE? Nope. Then I'm
10:41
going to throw a general protection exception. But let's look at the path that the microcode took if you tried to access a password-protected register with the wrong password. So I'm trying to access MSR 1337C0DE here, but I have the wrong password. Here it checks: MSR 1? 6? 1337C0DE? Oh, OK, you're accessing 1337C0DE. Do you have the right password? No. Throw a general protection exception.
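The two paths walked through above can be sketched as a small model that counts how many comparisons the hypothetical microcode performs before it gives up. The constants are an assumption about the spoken example: the protected address is written as `0x1337C0DE` and the password as `0xFEEDFACE`. Real microcode is not observable this way; the step count simply stands in for execution time.

```python
PROTECTED_MSR = 0x1337C0DE  # assumed reading of the example address in the talk
PASSWORD = 0xFEEDFACE       # assumed reading of the example password


def rdmsr_path(msr: int, password: int) -> tuple[str, int]:
    """Return (outcome, comparisons performed) for a modelled RDMSR."""
    steps = 0
    for known in (0x1, 0x6):        # ordinary, unprotected MSRs
        steps += 1
        if msr == known:
            return "ok", steps
    steps += 1                      # is this the protected MSR?
    if msr == PROTECTED_MSR:
        steps += 1                  # extra comparison: the password check
        if password == PASSWORD:
            return "ok", steps
        return "#GP", steps
    return "#GP", steps             # fell off the end: MSR does not exist


if __name__ == "__main__":
    print(rdmsr_path(0x12345678, 0))       # nonexistent MSR
    print(rdmsr_path(PROTECTED_MSR, 0))    # protected MSR, wrong password
```

Both calls end in the same fault, but the second one performs one more comparison. That extra work is the difference a timing measurement could pick up.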
11:01
So two different paths, depending on whether that MSR existed and/or whether you had the correct password. So since there are two different paths here, the timing on each path should be slightly different, and that opens this microcode up to a timing side channel attack, where what you can do is you can have a read MSR
11:21
instruction in the middle, and on either side of that read MSR instruction you have read timestamp counters in order to... Am I missing something? Oh, OK, thanks. You have read timestamp counter (RDTSC) instructions in order to detect how long that read MSR instruction took to execute. So what that looks like when you
11:41
actually execute it: on the x-axis here I have the MSR numbers that I'm trying to access, and on the y-axis I have the time it takes to access each MSR. Now the light grey lines that you see there, those are the implemented MSRs. I'm less interested in those for this research. What I'm actually interested in is that black line
12:01
along the bottom. That's how long it takes to access the faulting MSRs, the MSRs that the processor is telling me don't really exist on this processor. So what we can do with this timing side channel attack, basically looking at that graph that we just generated, is speculate about what the underlying microcode for the read MSR instruction
12:22
must look like. Specifically I can start focusing on variations in the observed fault times for accessing the various MSRs. So if you look at this graph carefully if you look at this black line along the bottom what you see is that there are these discrete regions for different groups of MSRs and that sort of tells us
12:40
that the microcode must be identifying these different MSR groups prior to checking for specific MSRs. In other words the model for this x86 microcode behind the read MSR instruction looks something like this. It's first going to check are you trying to access an MSR between 0 and 174. If so it will figure out exactly
13:01
which MSR you're accessing and service that request. Then it will check well are you trying to access an MSR between 174 and 200. If so it will figure out how to service that request. Breaking things into groups like this actually lets it handle the read MSR instruction a lot faster than it would checking MSRs one by one. But if you look carefully at this model that we came up with based on that timing attack
13:20
there's something a little bit unusual about two of these checks highlighted in red here. Two of these checks we can detect from the timings that the microcode is explicitly checking for these regions but it doesn't seem to be doing anything for those regions. In both cases it just throws a general protection exception and that doesn't make a lot of sense. Why on earth would microcode be
13:40
checking for these regions if there weren't even any visible or accessible MSRs within those regions. Well the only explanation for that is that there really are MSRs inside of those mysterious regions. They're just not giving us access to them. In other words those are probably the password protected regions on this processor.
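The reasoning above, from fault timings to suspicious regions, can be sketched as a small analysis step. This is an assumption-laden illustration, not the speaker's tool: the function names, the tolerance, and all of the timing data are synthetic. It groups faulting MSR addresses into runs with similar fault times, then flags runs whose fault time differs from the baseline even though no implemented MSR lies inside them.

```python
def fault_regions(fault_times, tolerance=2):
    """Group (address, cycles) pairs, sorted by address, into runs of similar timing.

    Returns a list of (first_address, last_address, reference_cycles).
    """
    regions = []
    for addr, cycles in fault_times:
        if regions and abs(cycles - regions[-1][2]) <= tolerance:
            start, _, ref = regions[-1]
            regions[-1] = (start, addr, ref)      # extend the current run
        else:
            regions.append((addr, addr, cycles))  # timing changed: new run
    return regions


def unexplained_regions(regions, implemented, baseline, tolerance=2):
    """Runs with a distinct fault time but no implemented MSR inside them."""
    return [
        (start, end, cycles)
        for start, end, cycles in regions
        if abs(cycles - baseline) > tolerance
        and not any(start <= a <= end for a in implemented)
    ]


def synthetic_scan():
    """Made-up scan: a populated group, a baseline gap, and an 'empty' group."""
    implemented = {0x10, 0x1B}
    faults = []
    for addr in range(0x00, 0x60):
        if addr in implemented:
            continue                     # implemented MSRs do not fault
        if addr < 0x20:
            base = 110                   # group that holds real MSRs
        elif 0x40 <= addr < 0x50:
            base = 125                   # group that appears to hold nothing
        else:
            base = 100                   # baseline: no group matched
        faults.append((addr, base + addr % 2))  # +0/+1 cycle of fake jitter
    return faults, implemented


if __name__ == "__main__":
    faults, implemented = synthetic_scan()
    regions = fault_regions(faults)
    for start, end, cycles in unexplained_regions(regions, implemented, baseline=100):
        print(f"suspicious: {start:#x}-{end:#x} faults in ~{cycles} cycles")
```

On real hardware the measurements would be far noisier, so each address would likely need many samples (for example, taking the minimum) before a grouping like this is meaningful.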
14:00
If we make that assumption, we're actually able to reduce our search space, our MSR search space, by 99.999%, which actually makes cracking individual MSRs inside of those regions feasible. We can essentially try all possible 32-bit values in all of the GPRs and all of the MMX registers in order to crack what the password must be for
14:22
those password protected registers. So this works. We're able to crack the passwords on the AMD K8 in one day instead of 600 years like it would have taken without the timing attack. We find that the password 9C5A203A loaded into the EDI register unlocks one of those specific ranges
14:40
that we detected through our timing attack. That other range C00 etc. That one didn't have any password protected registers in it. The microcode is doing some check on that range but there's no telling why. I do want to emphasize like this region and this password were already known. People discovered this through
15:02
firmware reverse engineering a while ago but this is the first time we've had an approach for uncovering these password protected MSRs without first observing them in use. This side channel attack into microcode offers some really powerful opportunities to really figure out what's going on
15:20
under the hood of our processors. Things that are sort of being kept from us like these password protected registers. The question then is what else can we find with an attack like this? I started scanning a bunch of different processors using this MSR timing technique and I wanted to share some of those results with you quickly.
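The figures quoted earlier, roughly 600 years for a blind search and a 99.999% smaller address space after the timing scan, can be sanity-checked with a few lines of arithmetic. This sketch assumes the talk's rate of one billion guesses per second; the function names are illustrative. It deliberately does not try to reproduce the one-day figure, since that depends on details (how many registers and addresses were tried) that the talk does not spell out.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(bits: int, guesses_per_second: float = 1e9) -> float:
    """Years needed to try every value of a `bits`-wide search space."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR


def addresses_left(address_bits: int = 32) -> int:
    """Addresses remaining after a 99.999% reduction (1 in 100,000 kept)."""
    return 2 ** address_bits // 100_000


if __name__ == "__main__":
    # 32-bit address plus 32-bit password = 64 bits guessed blindly.
    print(f"blind 64-bit search: about {years_to_search(64):.0f} years")
    print(f"addresses left after the scan: about {addresses_left():,}")
```

A 64-bit blind search lands in the high 500s of years, consistent with the rounded 600 in the talk, while the reduced address space is in the tens of thousands rather than billions.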
15:40
Here's what we found on a newer AMD processor. It no longer has some of the timing dips that the K8 had, which kind of suggests that newer AMDs got rid of this password check. Here's a VIA C3 scan, where they didn't have any unusual timings on faulting registers, but they had these two enormous spikes at 133 and 1133. Those
16:02
MSRs took over 100,000 cycles to access. There is no feasible explanation for why reading an MSR should take over 100,000 cycles. That's three orders of magnitude longer than the next longest MSR took to access. That's ample time to be doing encryption or any other number of interesting
16:20
things. That definitely warrants some more scrutiny. The VIA Nano had this interesting spike on the left, where inexplicably a small range of MSRs seemed to be protected. Intel Atoms and Intel Core i5s also had some interesting timing patterns, where you can see these little blips in the fault times, where basically I'm asking the
16:42
processor, does this MSR exist? It says no. Does this one exist? No. Does this exist? No. Does this exist? And it thinks for a little bit and says no. It's like, well if it didn't exist just like all the ones right around it, why did you have to think for a little bit in order to respond? So it's really interesting behavior and it made
17:02
me nervous seeing these blips. At the end of the day I tried running my password cracking approach on this and I failed. I tried a lot of different things to crack 64-bit passwords. I tried other types of side channel attacks in order to detect more complex password mechanisms. I wasn't able to uncover any new passwords this way. And sometimes
17:22
that's just the way research turns out. But we're still left with this glaring question. Almost all of these processors had weird timing anomalies within the microcode and we don't have any other way to see what the microcode is doing. So we're left to speculate. So what is causing these timing anomalies? Well, there's a lot of possibilities.
17:40
One could be more advanced password checks. That's exactly what that VIA patent that we looked at at the beginning of the presentation was describing. It could be something like that. It could be that some of these MSRs are only accessible in ultra-privileged modes. Like some MSRs are only accessible in system management mode. Intel has patents on MSRs that are only accessible to authenticated code modules.
18:01
Or it could be something benign. For example, it could be that the microcode is checking the processor family, the model, the stepping. Basically what that would do, it would allow you to use one microcode update on a variety of different processor families. So that's possible too. And in fact, it kind of looked like that was probably what was happening on the Intel processors.
18:21
Those little blips actually seemed to align with documented MSRs on other processor families. So it's kind of nice to think maybe we were in the clear. Maybe password protected registers don't really exist beyond the K7 and K8 since we couldn't find anything here. Sadly, that's not the case.
18:41
At the end of this research, I had a friend of mine send me his x86 firmware database, and I wrote a little instruction grepping tool to look through it for certain assembly instruction patterns. And I was very quickly able to find a new, previously undisclosed MSR password. 380DCB0F in the ESI register
19:00
is an MSR password being used by hundreds of different firmwares across many different vendors. You can even find its magic number being used to access MSRs in the Windows kernel. But nobody in the public has ever seen this. So we are still in a situation where third parties are being given keys to our processors that we ourselves do not possess.
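The kind of instruction grepping described above can be sketched as a raw byte-pattern scan. This is not the speaker's tool; it is a simplified illustration under stated assumptions: 32-bit or 64-bit code, no real disassembly, and a hypothetical `window` parameter. It relies only on standard x86 encodings: `mov esi, imm32` is `BE imm32`, `mov edi, imm32` is `BF imm32`, RDMSR is `0F 32`, and WRMSR is `0F 30`.

```python
import struct
from collections import Counter

MOV_IMM32 = {0xBE: "esi", 0xBF: "edi"}  # opcodes for mov esi/edi, imm32
MSR_OPS = (b"\x0f\x32", b"\x0f\x30")    # rdmsr, wrmsr


def candidate_passwords(blob: bytes, window: int = 16):
    """Yield (register, imm32) for a mov esi/edi, imm32 soon followed by rdmsr/wrmsr.

    Scanning raw bytes without disassembling produces false positives, so the
    results are only candidates for manual review.
    """
    for i in range(len(blob) - 4):
        reg = MOV_IMM32.get(blob[i])
        if reg is None:
            continue
        tail = blob[i + 5 : i + 5 + window]   # bytes right after the mov
        if any(op in tail for op in MSR_OPS):
            (imm,) = struct.unpack_from("<I", blob, i + 1)
            yield reg, imm


if __name__ == "__main__":
    # Synthetic snippet with made-up values:
    #   nop; mov edi, 0x11223344; mov ecx, 0x123; rdmsr; ret
    blob = bytes.fromhex("90" "bf44332211" "b923010000" "0f32" "c3")
    counts = Counter(candidate_passwords(blob))
    for (reg, imm), n in counts.most_common():
        print(f"{reg} = {imm:#010x} seen {n} time(s)")
```

Counting how often the same constant shows up across many firmware images is one way to separate a shared magic value from coincidental byte sequences.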
19:20
I think our password scanning tool that I introduced here didn't find this because I only had so many processors at my disposal. I scanned 12 processors. I found 11 with timing anomalies. But that's obviously not every processor out there. So more scans need to be done to figure out exactly what this is being used for. It's sort of an open question right now. So at the end of the day I
19:42
really think this research is interesting but we've raised a lot more questions than we've answered. We found a really interesting timing attack and we found some suspicious things. And I think the stakes here are really really high. It's clear MSRs are being used for lots of powerful things. They control all the details of your processor and until now nobody's
20:02
ever had ways to look into what they were actually doing. So I think this research is really promising. This timing side channel attack on specific assembly instructions is entirely new and gives us a really cool way to sort of uncover some processor secrets that nobody's ever found before. So I'm excited about that. What I'd really like is
20:20
for people to use this and scan their own processors. I'm open sourcing this as Project Nightshift. You can find us on GitHub: github.com/xoreaxeaxeax. I haven't been able to get that up yet, but that should be up probably by Monday. What I would really love is for people to run this scan on their systems and send me the logs so that we can sort of collect
20:40
a database of processors that have these unusual timing anomalies, and maybe when we get enough samples we can actually figure out what the heck is going on on some of these systems. Other stuff you can find there: Project Rosenbridge is the backdoor that I talked about yesterday, if you were around. Sandsifter is a processor fuzzer, M/o/Vfuscator is an interesting single-instruction C compiler, and some other stuff I've tinkered with
21:00
over the years. I love talking about this stuff. If anybody has feedback or ideas that they'd like to discuss, I'm going to have to step out to make room for the next speaker, but please grab me outside the door here. Otherwise you can contact me on Twitter at @xoreaxeaxeax, or if you want to have a more in-depth conversation, please do send me an email, same thing at gmail.com. So thank you everybody, I'll be right outside.