Jailbreaking the 3DS Through 7 Years of Hardening

Video in TIB AV-Portal: Jailbreaking the 3DS Through 7 Years of Hardening

Formal Metadata

Title
Jailbreaking the 3DS Through 7 Years of Hardening
Title of Series
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2018
Language
English

Content Metadata

Subject Area
Abstract
The 3DS was one of Nintendo's first serious attempts at security, featuring a cool microkernel based OS and actual exploit mitigations. That didn't stop it from getting hacked pretty hard, making it possible for people to write their own homebrew software for the console. But Nintendo isn't one to back off from a fight and, as a result, has put significant effort into not only fixing vulnerabilities but also introducing new security features targeted specifically at killing exploit techniques used by hackers. This talk will describe hacking the console through all these defensive features by walking through a 0-day exploit chain that takes us all the way from zero access to a full system jailbreak.
Hi everyone, I'm smea, smealum, Jordan, whatever you want to call me. Today I'm going to talk to you about jailbreaking the Nintendo 3DS. You might be wondering: why does this matter? Well, truth be told, it really doesn't. It's just kind of a way to piss off Nintendo. The reason Nintendo doesn't want us to hack their consoles is that they want to sell games and make money off their games, and unfortunately, once you hack these consoles, it becomes possible for people to play games for free. They're not really happy about that. The thing is, it's also a really interesting target in terms of security properties, in terms of hacking stuff. So we're kind of in the middle here, trying to do interesting things while bad results also happen. I'm not trying to give people the ability to steal games, but it kind of happens anyway.

The first thing about talking about hacking the 3DS is introducing the 3DS: what is it? The 3DS is a game console that was originally released in 2011. There is a new one that was released in 2014. They're essentially the same thing, except the New 3DS (which, you know, is a great name) has twice the CPU cores, a higher frequency, and more memory, basically twice the amount of main memory. Beyond that, they are basically the exact same thing. They run the same operating system, which is something I'm going to get into; it's a really cool microkernel architecture. And they both have, in addition to the main CPU, which is what runs your games and stuff, a secondary CPU, which is the
ARM9 CPU. So the ARM11, which is what you can see in the CPU box here, is basically what is going to be running all your games and all your apps. Anything that hits the screen, anything that you can interact with, is going to be running on that CPU. On the other hand, you have the ARM9, which is the console's security/IO CPU. The ARM9 is basically responsible for doing a bunch of security tasks and brokering access to a bunch of hardware. In this case I've shown some hardware devices here; this is not an exhaustive list, just a few examples that will come in handy later. The idea is that the ARM9 basically has access to everything; it has the keys to the kingdom. I mean, it doesn't literally have the keys, because the keys are all in this crypto hardware blob over there, but it has the ability to talk to the crypto hardware, so it has the ability to encrypt and decrypt content, which is really all we care about. It also has the ability to access the NAND chip, which is all the permanent storage, as well as the SD card. On the other hand, if you take a look at the ARM11, first off it does not have access to ARM9 internal memory, which kind of makes sense, but it also does not have access to the crypto hardware or to the NAND chip. So anytime the ARM11 wants to access a file on disk anywhere, it has to ask the ARM9 very nicely to give it access. That gives the ARM9 the ability to broker access to resources, kind of like a sandbox model.

Now, taking a look at what actually runs on the ARM11: as mentioned, it's a very cool (I think very cool) microkernel-based architecture. The idea is that you have as little code as possible inside the kernel, because that is going to be your highest level of privilege on that CPU. You want to have as little code in there as you can, and ideally have all your drivers
and stuff in user mode. That's what you're going to see over here in the base memory region: you have a bunch of processes which are called system modules and are essentially just user-mode drivers. If you think of a monolithic kernel model like, say, Windows, you would have all these drivers live inside the kernel, which means that whenever you compromise one of those drivers you gain access to the entire system. Here, if you compromise a driver, you just gain access to whatever that driver had access to, because this operating system gives as little privilege as possible to each process: principle of least privilege. What that means is that, for example, a given game is only going to have access to a small portion of the system call table.

In addition to games, which run in the application memory region, you have applets, which run in the system memory region. Applets include anything that can run at the same time as your game: stuff like the Home Menu, the web browser, the note-taking app, whatever. Any of that crap can run at the same time, so it's in a separate memory region. The point here is that the game and the Home Menu have access to the same set of system calls, which is very limited; as you can see, it's basically half of the system call table. If you take a look at one of the system modules, it has access to that same set of system calls, plus new ones, which are privileged system calls: things that will, for example, allow you to create a service and advertise that service. On top of that, an especially special system module might have access to a system call that is only accessible from that particular process and nowhere else. In addition to that, you
actually need to have the ability to talk to these drivers, right, because they're not just in the kernel; they're in these little pockets of processes here. The way this works is that any given driver, any given system module, can advertise a service, and then, through the kernel, a game can connect to that service and talk to it directly. The cool thing is that, much like the system call filtering, you have a service access list. For example, a game might not be able to access the am:sys service, AM standing for application management; it's a service that lets you install and uninstall games or applications. It makes no sense for a game like, say, Zelda to try to install and uninstall new processes, but it does make sense for the Home Menu to have access to that. So you have this very granular level of privilege control on the system, which means that even if you compromise a game, you might not be able to access all of the attack surface that you want. That's actually a really good security model.

Beyond that, I just want to mention that, as said earlier, the ARM9 handles a bunch of tasks, such as crypto tasks and brokering access to physical storage. So you have to go from one process to another and then to the ARM9 to complete certain tasks. You have these very deep levels of privilege that live one on top of the other. It's not as simple as user mode, kernel mode, and then a secure processor; there are different layers and different levels of privilege between those. So then, if we
take a look just at this physical memory separation: as I mentioned, you have the application memory region, the system memory region, and the base memory region, and these are actually physically separate memory. You have FCRAM, which is the main bank of RAM; that's going to be 128 megabytes, and it's separated into these three regions such that whenever you allocate virtual memory, the actual physical backing memory will never go from one region to another. If you allocate memory from a game, it will be in the application memory region; it will never end up in the base memory region. That might seem kind of trivial, but it will come up later. Then, the kernel for the ARM9 is going to live in ARM9 internal memory, so you can't actually mess with it from the ARM11, and WRAM is what contains all of the memory that pertains to the ARM11 kernel. And so the cool thing is
with this kind of really deep security model, in theory at least, compromising the whole system should take a number of exploits, right? First off, you need to even get code execution on the machine, which is non-trivial, because Nintendo is not Apple or Android; it doesn't just give you the ability to create your own apps and run them on your console. So you first need to compromise an application. From there you might have code execution, but it'll be unprivileged, so you want to escalate your privileges to get more attack surface. One way of doing that is to compromise a system module. From there you might have access to, say, more system calls, and maybe those system calls are going to be more vulnerable than others, and maybe you can use those to compromise the kernel. From there you will have the complete attack surface into the ARM9, which is the secure processor, and assuming you compromise that, you get total control. Of course, you have these arrows on the left to signal that this is in theory; in practice you can sometimes go straight from number one to number four. I know someone who has a bug that just goes straight to number four. But in this case we are going to explore a bug chain that does every single one of these steps. It's kind of unnecessarily complicated, but I do want to show that a security model can be really cool and actually be really effective. And yeah, so the first thing is
actually getting code execution on the machine. Just for a little bit of history... okay,
this is supposed to be animated, but
I guess not. Okay. So, for a little bit of history, there have been two classes of entry points on the 3DS. One of these is things like Cubic Ninja, the Cubic Ninja exploit from a couple of years ago, which is the kind of bug that is trivial and really should not be exploitable on any modern platform, but is, because Nintendo lacks a number of remote code execution mitigations on the 3DS. And it kind of makes sense. Things
like ASLR, you should really have that on 3DS, because it doesn't really cost much in terms of performance. But then you don't have stack cookies, and that kind of makes sense in the context of a game, because you don't want to sacrifice performance just for security. The thing is, the 3DS also has web browsers: it has the actual web browser applet, and it has the YouTube app, which is just a web browser with a fancy coat of paint. From those you can really trivially bypass these mitigations anyway; no one believes that a web browser exploit is ever going to be stopped by ASLR or stack cookies. So the conclusion is that all these trivial bugs are still exploitable on 3DS and
really shouldn't be. At the end of the day, the threat model that Nintendo needs to adopt is that user mode will be compromised, so they need to base themselves on that, and that makes it
such that you end up with a lot of low-hanging fruit. The exploit that I'm going to talk about today is in the mcopy app. It's the microSD network transfer thingy; it basically just allows you to access the files on the microSD card over the network. The way that's implemented is an SMB server, and because SMB is a notoriously secure protocol, of course you end up with the ability to find
vulnerabilities really trivially. The way I did this took, like, an hour: I just grabbed pysmb from GitHub, modified it a little bit to actually talk to Nintendo's SMB server implementation, because for some reason it just didn't work out of the box, and after that added these six lines of fuzzing code, which just flip bits randomly. Let's take a
look at what that looks like in practice.
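The bit-flipping step mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the code shown in the talk: the function name `flip_random_bits`, the `flips` parameter, and the idea of mutating an already-serialized packet as raw bytes are all assumptions made for this example.

```python
import random


def flip_random_bits(packet: bytes, flips: int = 4, rng=None) -> bytes:
    """Return a copy of `packet` with `flips` randomly chosen bits inverted.

    Hypothetical helper: a fuzzer would call this on each outgoing
    packet just before writing it to the socket.
    """
    if rng is None:
        rng = random.Random()
    data = bytearray(packet)
    if not data:
        return bytes(data)  # nothing to mutate
    for _ in range(flips):
        index = rng.randrange(len(data))  # which byte to corrupt
        bit = rng.randrange(8)            # which bit inside that byte
        data[index] ^= 1 << bit           # invert it
    return bytes(data)
```

Passing a seeded `random.Random` makes a crashing mutation reproducible, which matters once the target stops responding and you want to replay the exact packet that killed it.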
You have the 3DS running its SMB server here, and the fuzzer is running in the background. It should just take a couple of seconds, and then of course the 3DS is going to crash, because this super shitty SMB server... actually, SMB fuzzer, I'm sorry, works surprisingly
well. At this point you have a 3DS that's crashed. The 3DS does not normally give you a register dump, a crash dump like that, but this one is running a custom firmware, because that's how we do development these days. I'm not
going to go too deep into detail on how that bug works, because first off I'm not an SMB expert; I really, literally just ran that fuzzer, found that bug, and exploited it. But to give a basic idea: this is the packet that crashes the console, but in its normal form. It's a Session Setup AndX packet, whatever that means, and it has a couple of data blobs in there. What I mean by data blob is something that has a variable length, so as an attacker I control the length of these blobs. There is the NTLM response data blob and the domain name data blob; the domain name is actually just a string that says WORKGROUP. And the vulnerability is really trivial.
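To make the "data blob" idea concrete, here is a toy sketch in Python of a message carrying two variable-length blobs whose sizes are declared by the sender. The layout (two little-endian 16-bit length fields followed by the blobs) and the names `build_packet` and `parse_packet` are assumptions for illustration only; this is not the real SMB wire format.

```python
import struct


def build_packet(ntlm_response: bytes, domain_name: bytes) -> bytes:
    """Serialize two blobs behind a header that declares their lengths.

    Toy layout (hypothetical): <u16 ntlm_len><u16 domain_len><ntlm><domain>
    """
    header = struct.pack("<HH", len(ntlm_response), len(domain_name))
    return header + ntlm_response + domain_name


def parse_packet(packet: bytes):
    """Split a toy packet back into its two blobs.

    The receiver learns each blob's size only from the sender-supplied
    length fields, so the sender decides how much data arrives.
    """
    ntlm_len, domain_len = struct.unpack_from("<HH", packet, 0)
    offset = struct.calcsize("<HH")
    ntlm = packet[offset:offset + ntlm_len]
    offset += ntlm_len
    domain = packet[offset:offset + domain_len]
    return ntlm, domain
```

Nothing in this format stops a sender from declaring a 300-byte domain name; any limit has to be enforced by whatever code consumes the parsed blobs.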
Basically, it just checks the length of the data blob for the NTLM response, and if it's not one specific value, it takes another code path. That code path just copies memory onto the stack, into a buffer that has a fixed length, but with a length that is controlled by the attacker, so obviously you can just overwrite the entire stack with that. A crafted packet is just like this: you just make it
such that the NTLM response blob size is not 0x18; it's going to be 0x10. Then you make the WORKGROUP blob size be hundreds of bytes instead of ten bytes or so, and you just overwrite the entire stack. You're able to overwrite the return address, redirect the CPU's execution flow, jump into existing code, and get remote code execution, essentially. In
practice, I say remote code execution, but in practice that just means ROP, which stands for return-oriented programming. Real quickly, for people who are not familiar with it: we are not able to inject new code, because we have this mitigation called DEP, data execution prevention. What DEP does is that you're not able to just inject code into the process and jump to it, because any memory that is writable will not be executable. That's a really good mitigation that is enforced really strictly by Nintendo: under normal circumstances there will never be memory in user mode that is both writable and executable, and memory that was ever writable will never become executable. What that means is that instead of injecting shellcode and jumping to it, you have to reuse the existing code inside the process. That's what ROP is: you overwrite a return address, you jump to a tiny piece of code, make sure that it'll then jump to another tiny piece of code, and another, and another, which we call gadgets. From there you are able to do arbitrary computations, call arbitrary system calls, and do kind of whatever you want.

The thing is, my personal aspiration for hacking the 3DS was to actually run homebrew on it, which is games made by amateurs, applications made by amateurs, that sort of thing. Writing homebrew in ROP is not ideal; you want to do it in actual native code. We are not able to create new executable memory, and we're not able to reprotect memory as executable, but maybe we don't need to: what if we can just overwrite memory that is read-only? And the way you
do that is through DMA. You have a bunch of devices that have direct access to memory, and the GPU is one of them, because the GPU needs to be able to, for example, read a texture from memory in order to render something, and also needs to be able to write a framebuffer to memory. The thing is, the GPU actually has access to all of FCRAM, WRAM, and VRAM, which means that you can just use the GPU to render over code pages. In practice it's not that simple, because otherwise you would just be able to overwrite the kernel: even though the GPU has access to WRAM, which includes the ARM11 kernel, there is a register that allows Nintendo to limit the range that the GPU has access to through DMA. So we're not able to just overwrite the kernel, and in fact we're not even able to overwrite system modules or the Home Menu, because the system and base regions are not accessible to the GPU. But because the GPU does need to be able to access textures and stuff from the current game, we do have access to the first half of FCRAM.

What that means is: this is physical memory at the bottom, and you have virtual memory at the top. This is going to be your text section, so it's readable and executable, not writable. We're just going to use the GPU to overwrite physical memory at the bottom, and then, because that's how memory works, it's going to show up in virtual memory, and you can just jump to it. So basically we use the GPU to render code into these physical pages and overwrite existing code. Nintendo did not like that very much, and they tried to put a wrench into our plans. The way they did that was they realized: okay, whenever people use their ROP chain to use the GPU to overwrite code with other code, they rely on this specific hard-coded address, the reason
being that this code page is always going to be at the same location in physical memory, so we don't really need to do anything fancy. So this is before their mitigation was put into place: you just have these four blobs of code in the virtual address space, and they correspond really trivially to these four blobs in the physical address space. Their idea is: well, let's just jumble it up. That way, as an attacker, because the order is going to be jumbled up, and you don't know the size of the blocks or which order the blocks are put in, if I try to write to physical memory at the same location as before, it might show up in the blob that I wanted, or it might show up in another blob. That means I won't know which location I just wrote to. We call this physical ASLR, or PASLR for short, because that's really what it is. And the
thing is, it's actually kind of a shitty mitigation, because with a good mitigation you want something that creates extra work for the hacker every time they write an exploit. The thing with this one is that you just have to bypass it once, because it turns out ROP, as has been known for about ten years, is Turing complete, so you can do arbitrary computations. You can actually just do a for loop, search for the physical piece of memory that you want to overwrite, and then overwrite it. So we basically had to write this ROP chain once, and then you can just reapply it to every exploit. So, you know, not a great mitigation. And so what that
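As a rough illustration of the search-then-overwrite loop described above, here is a minimal Python sketch. It only models the idea: physical memory is a byte string, the block size, block count and 4-byte signatures are made-up assumptions, and the `overwrite` helper merely stands in for the GPU write. The real bypass is written as a ROP chain, not in Python.

```python
import random

BLOCK_SIZE = 0x1000  # hypothetical block granularity, not taken from the talk


def shuffle_blocks(blocks, seed):
    """Model the mitigation: lay code blocks out in physical memory in a random order."""
    order = list(range(len(blocks)))
    random.Random(seed).shuffle(order)
    return b"".join(blocks[i] for i in order)


def find_target(physical, signature):
    """Scan 'physical memory' block by block for a known code signature."""
    for addr in range(0, len(physical), BLOCK_SIZE):
        if physical[addr:addr + len(signature)] == signature:
            return addr
    return None


def overwrite(physical, addr, payload):
    """Stand-in for the GPU write: patch the located block in place."""
    patched = bytearray(physical)
    patched[addr:addr + len(payload)] = payload
    return bytes(patched)


# Four code blocks, each starting with a distinct 4-byte signature.
blocks = [bytes([i]) * 4 + bytes(BLOCK_SIZE - 4) for i in range(1, 5)]
physical = shuffle_blocks(blocks, seed=1234)

# The attacker no longer knows where block 3 is, so it searches for it first.
target = find_target(physical, signature=b"\x03" * 4)
patched = overwrite(physical, target, b"HOOK")
```

Because the search runs every time, the same code keeps working whatever order the blocks end up in, which is the point made above about only having to bypass the mitigation once.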
means in practice is, you know, this is the actual exploit: you just run it on the computer, it connects to the console, hacks the console, and then we have code execution, we have the actual homebrew menu running on the console. And you can do that over the network, on any console that is running 11.7 or whatever. So that's the first
stage. At this stage we have compromised unprivileged user mode, and that was the first step in our exploit chain. At this point we want to somehow escalate privilege,
because the thing is: OK, we have code execution, this is great, but we only have access to the basic unprivileged system calls. In terms of attacking the kernel, that's totally doable and has been done several times, but ideally we want more attack surface. Likewise, if we want to escalate to another process, like an actual system module that might have access to more system calls, well, mcopy, this application we just compromised, only has access to a few services. So ideally we want a way to migrate to another process that might have better privileges and such. And
well, it turns out we can actually kind of do that. I was showing you this slide earlier in terms of what the GPU has access to through DMA, and I was saying it only has access to the actual APPLICATION region. I kind of lied about that: it actually has access to a little bit of the SYSTEM region as well. It
just does not have access to the home menu's code section; it does have access to the home menu's heap section. That means that any memory that is dynamically allocated by the home menu, I will be able to read and write through the GPU. From there it's actually kind of trivial to just find some C++ object on the heap, modify its vtable, for example, and have it jump to other code, and then you get code execution. Well, you actually just get ROP in the home menu, and that's the kind of annoying thing: you can't use the GPU to get native code execution inside of the home menu. So in terms of how that works in practice, we had to write this whole service that runs in the home menu, and that's all ROP. But once again, ROP is Turing complete, so you can just do whatever you want. And yeah, so
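To make the vtable trick above a bit more concrete, here is a toy Python model. Everything in it is an assumption for illustration: the addresses are invented, `memory` is just a dictionary, and the two Python functions stand in for code that already exists in the process. It only shows why repointing a heap object's vtable redirects the next virtual call.

```python
def legit_handler():
    return "legit"


def existing_gadget():
    return "hijacked"


# Toy address space: addresses map either to code (callables) or to pointer values.
memory = {
    0x1000: legit_handler,    # code: the real virtual method
    0x2000: existing_gadget,  # code: something already present in the process
    0x3000: 0x1000,           # real vtable, slot 0 points at legit_handler
    0x4000: 0x3000,           # heap object: its first word is the vtable pointer
    0x5000: 0x2000,           # fake vtable placed on the heap by the attacker
}


def virtual_call(obj_addr):
    """object -> vtable pointer -> slot 0 -> code, as a C++ virtual call would do."""
    vtable = memory[obj_addr]
    method = memory[vtable]
    return memory[method]()


before = virtual_call(0x4000)
memory[0x4000] = 0x5000  # models the GPU DMA write into the home menu heap
after = virtual_call(0x4000)
```

Note that the hijacked call still lands on code that was already executable, which matches the remark above that this yields ROP rather than native code execution.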
at that point we have compromised these two processes. The thing that is interesting about that is, you can see, even though we don't have access to any additional system calls, we do have access to the services that the home menu has access to, and one of those services allows us to, for example, kill processes and create new processes. So the idea is that we can actually just kill the mcopy process, because we have code execution as the home menu process, replace it with another process, and then use the GPU to take over that process, and so on. So in theory we have access to the privileges of any process that can live inside the APPLICATION region that we can start. That means that any service that any game or any app has access to, we kind of have access to as well. So we have the biggest attack surface we could possibly get from unprivileged user mode, and it means that we can start looking into some more esoteric services, such as ldr:ro. It lives in the ro process, and basically, if you think of Windows, Windows has these DLL files, dynamically linked libraries. Well, it turns out the 3DS does as well; they don't call them DLLs, they call them CROs, which stands for CTR relocatable object, probably. I'm just guessing at this point. It is an interesting process for us to go after because it actually has access to a very special system call, which allows it to create new executable memory, which will come in handy later. So, taking a look at how it works, basically you first allocate a
piece of memory in your application. You load your CRO into it from the file system or whatever, and then you ask ldr:ro to load it for you. What I mean by loading is, because it's a DLL, it's supposed to be executable code, so it's going to need to be reprotected as executable at some point. As an application process I don't have the ability to do that, but ldr:ro does. So the first thing it does is lock the memory away from the application, then it's going to apply dynamic linking to it, which just means relocating some pointers and such, and then it's going to reprotect as executable the pages that that's relevant for. What I mean by locking is that my application will not be able to write to that memory, which makes sense, because ldr:ro does not want us to be able to mess with it as it's happening. And the thing is, because it's the application itself that is loading this CRO blob, the linker does have to be built defensively against malformed CROs. Nintendo actually did fix a bunch of bugs there, and made it such that, as far as I can tell, there are not a lot of vulnerabilities you can exploit just by giving it a malformed CRO. The thing to notice, though, is that, as mentioned, the application is the one that allocates the memory that is going to contain this CRO. What that means is that in physical memory it is going to be in the APPLICATION region, which means that once again we can use the GPU to mess with that CRO blob as it's being relocated. That sounds like it could be a problem, because it was built defensively against malformed CROs, but what about CROs that are being modified on the fly? Well, turns out it is not secure
against that at all. If you look at the code, this is code that lives in the ro process and is part of the relocation process. The first thing it does is go through all the offsets in the header of a CRO and modify them to stop being offsets into the CRO and become actual pointers into the CRO. So it basically just adds the base address of the CRO to each offset in the header, after checking of course that the offset is within the bounds of the CRO. That could be fine. The thing is, this pointer, which is going to be used later on by the ro process, lives in physical memory that we control. So take, for example, the pointer to the segment table in the CRO: you can see it is loaded from the CRO here, and then it is used directly to both read and write memory. So as an attacker, if I can modify that pointer, I can start getting ro to read and write through it, which is not great for them. In practice we end up with three kind of weird corruption primitives. The first allows us to write an arbitrary value at an arbitrary location, as long as there's a byte that has the value 2 eight bytes after the location we're trying to overwrite, and the location four bytes after what we're trying to overwrite can't have the value 0, for some reason. Then it's the same thing for the second primitive, except the byte has to be 3. And for the last one it can be any value there, as long as what we're overwriting is not zero, and we're not actually overwriting it, we are incrementing it by some other value. So basically we have these arbitrary read and write primitives, well, really just arbitrary write, but they're not really arbitrary in the sense that we do have these weird constraints. But of course it's not that hard to exploit this in practice. What I want to do is
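The check-then-reuse problem in the relocation code described above can be sketched as follows. This is a simplified Python model under stated assumptions: the base address, CRO size and header field offset are invented, the real header has many such fields, and the `between_steps` callback merely stands in for a GPU write that lands between the bounds check and the later use.

```python
import struct

CRO_BASE = 0x00100000       # hypothetical address where the CRO is mapped
CRO_SIZE = 0x1000           # hypothetical CRO size
SEGMENT_TABLE_FIELD = 0x10  # hypothetical header field holding the segment table offset


def read_u32(buf, off):
    return struct.unpack_from("<I", buf, off)[0]


def write_u32(buf, off, value):
    struct.pack_into("<I", buf, off, value & 0xFFFFFFFF)


def rebase_header(cro):
    """Step 1: check the offset is in bounds, then turn it into a pointer in place."""
    offset = read_u32(cro, SEGMENT_TABLE_FIELD)
    if offset >= CRO_SIZE:
        raise ValueError("malformed CRO: offset out of bounds")
    write_u32(cro, SEGMENT_TABLE_FIELD, CRO_BASE + offset)


def use_segment_table(cro):
    """Later step: the pointer is fetched again from the CRO and trusted as-is."""
    return read_u32(cro, SEGMENT_TABLE_FIELD)


def load(cro, between_steps=None):
    rebase_header(cro)
    if between_steps is not None:
        between_steps(cro)  # models a GPU write racing the loader
    return use_segment_table(cro)


honest = bytearray(CRO_SIZE)
write_u32(honest, SEGMENT_TABLE_FIELD, 0x200)
honest_ptr = load(honest)

raced = bytearray(CRO_SIZE)
write_u32(raced, SEGMENT_TABLE_FIELD, 0x200)
raced_ptr = load(raced, lambda cro: write_u32(cro, SEGMENT_TABLE_FIELD, 0x0FFFF000))
```

A malformed offset is still rejected up front, which is consistent with the loader being hardened against bad files; it is only the second fetch from attacker-writable memory that escapes validation.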
get ROP inside of this process, and in order to do ROP I just want to overwrite a return address on the stack. This is just showing what I can and cannot overwrite based on these primitives. What you see in orange here are return addresses on the stack, which is what I would want to overwrite, and in yellow is what I can overwrite. What you can see is that there's no overlap between the return addresses that I want to overwrite and the locations in memory that I can actually overwrite. But we do have this third corruption primitive, which allows us to increment memory at an arbitrary location with much fewer constraints, so I don't need to have the byte 3 for that. Instead, I'm going to use it to make this location meet all the constraints for the second primitive: I can use it to place the byte 3, and then use the second corruption primitive to overwrite this return address with an arbitrary value. There is a little bit more to the actual full-on exploit, but this is the basic idea. It is pretty simple, and at this point I will have ROP
execution inside this process, and I do have more privileges than I had before. What that means in practice is that I just have access to a few more system calls, and I get to look at them and see if I can use them to actually take over the kernel. So, taking over the
kernel. Now we have taken over this process called ro, and we have access to more system calls. One of these system calls that is actually really interesting is called ControlProcessMemory. First off, it's an interesting system call because ro is literally the only process that has access to it. So you can kind of think of ro as an extension of the kernel that has this one very specific purpose, and it uses this very special system call that was built just for it in order to achieve it. The thing about ControlProcessMemory is that it's really just the same thing as ControlMemory, except that it can work across processes, as long as you have a handle to that other process, and that it has fewer constraints. One of the constraints that it lacks, as mentioned, is that it can actually create memory, or reprotect existing memory, as executable. That's really useful for us if we just want to run homebrew: we don't need to mess with the GPU anymore, we can just create new arbitrary executable memory and do whatever we want. But the other interesting thing about it is that it also bypasses some of the restrictions that ControlMemory has in terms of where it is allowed to map memory. What that means is that we can map the null page, which is address 0, which is something that is notoriously not allowed, because a lot of bugs rely on the ability to place memory at address 0: a lot of bugs are just going to be dereferencing a pointer that is null, should not have been, and was not checked properly. That's interesting for us because if we can find a null dereference bug inside of the kernel, which would normally just be a denial-of-service bug, all of a sudden we might be able to elevate it to become an actual code execution bug. So what is a good target for null dereferences? Typically it's going to be memory
allocation, because if you have an allocation primitive and you run out of memory, it's going to return null, and if the kernel, or malloc or whatever, does not check whether the pointer is null, then all of a sudden you have a null dereference, and things become interesting for us. So, taking a look at how the
allocator in the kernel works for kernel objects, this is basically what it is: it's a linked list, it's a slab heap. What that means is that for each type of object I'm going to have one memory blob that is subdivided into sub-objects. Whenever these sub-objects are not used, they are part of a free list, which means that you have a list head, and then each free object links to the next. Allocating just means popping a free object from this free list and using it for whatever you want, and freeing an object is just pushing it back onto that free list. So what happens if we run out? Well, if we run out, we end up having the free list head point to null, and so the next time you allocate an object it's just going to return null, and all of a sudden we might have the null dereference bug that we want. Now, what
that means is that of course the code that uses this allocation function has to check that the resulting pointer is not null; if it's zero, it should just throw an error. And usually it does. But as you can see in this last example, it's allocating a new linked list node and checking that this node is not null, and if it is not null, it initializes it to zero. But then, even if it was null, it just uses it without caring. It is kind of an odd programming pattern, but somehow it ends up being in literally every location where the kernel uses these linked list objects. So the idea becomes: if we can make the kernel run out of these linked list objects, we can make one of these linked list nodes be in the null page, which is controllable by us from user mode, and once we have that, we might be able to actually take over the kernel. So the question is
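The slab heap and the unchecked allocation pattern described above can be modeled in a few lines of Python. This is a sketch, not the kernel's code: the base address, object size and node fields are made up, and `allocate_node_unchecked` is a hypothetical name for the pattern where the node is zeroed only if it is non-null but is then handed back either way.

```python
class SlabHeap:
    """Fixed pool of same-sized objects threaded on a singly linked free list."""

    NULL = 0

    def __init__(self, base, object_size, count):
        # next_free[addr] models the free-list link stored inside each free object.
        self.next_free = {}
        self.head = self.NULL
        for i in reversed(range(count)):
            addr = base + i * object_size
            self.next_free[addr] = self.head
            self.head = addr

    def allocate(self):
        """Pop the first free object; returns NULL (0) once the pool is empty."""
        addr = self.head
        if addr != self.NULL:
            self.head = self.next_free.pop(addr)
        return addr

    def free(self, addr):
        """Push an object back onto the free list."""
        self.next_free[addr] = self.head
        self.head = addr


def allocate_node_unchecked(heap, memory):
    """Zero the node if it is non-null, then return it even when it is null."""
    node = heap.allocate()
    if node != SlabHeap.NULL:
        memory[node] = {"next": 0, "prev": 0, "obj": 0}
    # No error path here: a null node is handed straight back to the caller.
    return node


heap = SlabHeap(base=0xFFF00000, object_size=0xC, count=3)  # made-up values
memory = {}
nodes = [allocate_node_unchecked(heap, memory) for _ in range(4)]
```

With a pool of three objects, the fourth allocation comes back as address 0, which is exactly the node that ends up living in the attacker-mapped null page.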
how we actually make the kernel run out of these linked list nodes. A good way to do that is to just look at other system calls and how they work, and one of them is WaitSynchronizationN. It is a system call that is basically the same thing as WaitSynchronization1; the only difference is that WaitSynchronization1 waits on one object, whereas WaitSynchronizationN waits on N objects, which, I know, is pretty obvious. What I mean by waiting on an object is that it's going to be a kernel object, like a thread: when you're waiting on a thread, your current thread waits until that other thread is dead, and then your thread is woken up. You can also be waiting on an event, or on a mutex; waiting on a mutex just means waiting until that mutex is not locked anymore. The basic idea is that WaitSynchronizationN will take N objects as input, in practice up to 256, and then it's going to wait on them, and as soon as one of them is signaled it's going to wake up your thread. Now the question is: OK, it has to wait on these 256 objects
somehow, so it has to keep track of them somehow, and the way it does that, of course, is a linked list. So the idea becomes: if we can create as many threads as we want and have each one of them wait on as many objects as we want, then we're going to be creating a bunch of these linked lists that each have as many as 256 nodes in them. And there are only about 1500 linked list nodes allocated in the kernel, so after a few attempts we should be able to actually get it to run out. And
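The back-of-the-envelope arithmetic behind "a few attempts" can be written down as a small Python helper. It assumes one linked list node per waited-on object, takes the pool size as the approximate 1500 quoted above, and uses 256 as the per-call limit; the exact figures and the `already_used` parameter are illustrative assumptions.

```python
import math

TOTAL_NODES = 1500          # "about 1500" per the talk; the exact figure is not given
MAX_OBJECTS_PER_WAIT = 256  # per-call upper bound, as stated in the talk


def threads_needed(total_nodes, per_wait, already_used=0):
    """Number of maximal waits needed to drain the pool, one node per waited object."""
    remaining = max(total_nodes - already_used, 0)
    return math.ceil(remaining / per_wait)


waiters = threads_needed(TOTAL_NODES, MAX_OBJECTS_PER_WAIT)
```

In practice the kernel is already using some of those nodes for other lists, so the real number of waiting threads required can only be lower than this estimate.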
yeah, so that is essentially what we're doing: we can actually trigger a null dereference bug that way. The thing is, it's not necessarily trivial to exploit, because it turns out linked lists are used in the kernel a lot. The problem is: what if another process is trying to use a system call, because that's what processes do, and it needs to use a linked list, because that's what the kernel does, and the kernel has run out? It's basically going to crash, because the other process does not have a null page mapped into it. So that's not great. The other thing is that even our own process might make another system call, or even the current system call might, because if you look at this, it's just going to continue allocating new linked list nodes over and over again. So even after we've triggered the vulnerability, we're still going to keep allocating new nodes. The problem with that is that the next time a node is allocated, even if it's in our current process, null is going to be returned again, and if it's for another list, we end up with two lists that collide into the same node. You end up with all these linked lists from the kernel that are mangled into one another, and that gets really messy really quickly, so we want to avoid that if possible. A way to do that is this thing I call just-in-time freeing, but it's really not as fancy as it sounds. The idea is that as soon as the kernel has a linked list node allocated in the null page, it's going to write data to that null page. And because the 3DS has multiple CPU cores, you can have one thread trigger that null dereference bug by allocating more and more linked list nodes, and you can have the other CPU core just reading from the null page at all times, and as
soon as it sees that zero has changed to a pointer value, be it the next pointer, the previous pointer or the object pointer, it's going to be able to take action. As soon as it sees that, the first CPU core, so core zero, is just going to signal an object that another thread was waiting on, which is going to free a bunch of these linked list nodes, and then the next allocation is just going to use one of the linked list nodes that was just freed. That's how we get around the whole issue. Now the question is: OK, we're able to trigger this bug, and we are able to do it without crashing, which is great, but how do we actually exploit it to get code execution inside the kernel? Well, typically, like,
I'm sure people who are familiar with linked list bugs will know that basically you want to do this through the unlinking phase, so let me just explain how that works. Imagine you have this linked list, where each node points to the next one and the previous one. When you free, let's say, node 2, what's going to happen is that you have to update the next pointer of node 1 and the previous pointer of node 3. It's pretty straightforward. Now the thing is, in our
case we actually control node 2: we have full control over the next pointer value and the previous pointer value. So let's say that in this case next points to 0xBABE and previous points to 0xDEAD. If we unlink this bad node, what's going to happen is that it's going to write the value of previous to next and the value of next to previous, essentially. That means you're going to be writing 0xBABE to 0xDEAD and 0xDEAD to 0xBABE, so we can actually use this to write an arbitrary address to an arbitrary location, as long as both addresses point to actual valid writable memory. And so we end up with
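The unlink write described above can be reproduced with a toy memory model in Python. The field offsets (`NEXT` at 0, `PREV` at 4) and the node addresses are assumptions made for the sketch; the talk glosses over the field offsets when it says the two values are written to each other, and the real node layout may differ.

```python
NEXT, PREV = 0, 4  # assumed field offsets inside a node; the real layout may differ


def unlink(memory, node):
    """Standard doubly linked list removal, with no sanity checks on the pointers."""
    nxt = memory[node + NEXT]
    prv = memory[node + PREV]
    memory[prv + NEXT] = nxt  # prev->next = node->next
    memory[nxt + PREV] = prv  # next->prev = node->prev


# Benign case: three nodes at made-up addresses, then remove the middle one.
memory = {}
n1, n2, n3 = 0x100, 0x200, 0x300
for node, nxt, prv in [(n1, n2, 0), (n2, n3, n1), (n3, 0, n2)]:
    memory[node + NEXT] = nxt
    memory[node + PREV] = prv
unlink(memory, n2)

# Malicious case: the node sits in the attacker-controlled null page (address 0).
evil = {0 + NEXT: 0xBABE, 0 + PREV: 0xDEAD}
unlink(evil, 0)
```

In the benign case node 1 and node 3 end up pointing at each other; in the malicious case the same two stores write attacker-chosen values to attacker-chosen addresses, which is why both addresses have to be valid writable memory.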
this primitive, which is obviously super powerful, because we can use it to, say, overwrite a function pointer, and that's essentially what we end up doing. If you look at the code that invokes the linked list unlinking, right before freeing the kernel object, well, right after freeing the kernel object, it actually makes an indirect call through a vtable. So if you can overwrite the pointer that is pointed at by that vtable, then you have the ability to jump to any code location that you want. The only kind of annoying thing is that it turns out the function that frees the kernel object is going to panic if it tries to free a null object, which is kind of weird, because the allocation function does not really seem to care about returning a null object. But, you
know, whatever. You end up being able to exploit this by being vaguely tricky, and by vaguely tricky I really just mean that you overwrite this vtable pointer. The only thing is that you have to overwrite the vtable pointer with the address of a node that is itself going to be overwritten, and because of that you have to make the null page readable, writable and executable. But basically what's going to happen is that you put a piece of code in the second node there that just jumps to some other location in user mode, and you get code execution that way. It's not that
complicated. At this point we have access to everything on the ARM11. We have compromised the ARM11 kernel, which means that by extension we have also compromised literally every other process that is running on the ARM11. At this point we can just do whatever we want: we can run whatever games we want, we can access all the hardware that is accessible by the ARM11. That's pretty great, but it's not
enough. For whatever reason we do still want to take over the ARM9, because, you know, I guess that's cooler. It actually doesn't give you access to much more; it does allow you to write directly to the NAND chip, which is nice and definitely useful for other exploits. So we want to be able to do that. Again, we don't have the ability to write directly to ARM9 memory, but we do have the ability to talk to the ARM9 in other ways. The ARM9 is responsible for certain services, such as accessing permanent storage, as mentioned, but it also does other things, and one of those things is backwards compatibility. The 3DS is actually able to run old DS games, and the way that's done is by basically turning the 3DS into a DS in terms of hardware: it pokes a bunch of weird hardware registers, it actually brings up a third CPU that is kind of hidden, and it turns the ARM9 into a DS mode CPU. It's kind of crazy. And so the thing is, it has to be
able to do that, and in order to do that it actually has to kill the current operating system and start another operating system that's going to do the bring-up for all this crap. The operating system that we've been working on so far is this NATIVE_FIRM thing, FIRM being for firmware, presumably. But there is other firmware that can run on the 3DS: you have SAFE_FIRM, which is what runs when you do an update of your console, and you have TWL_FIRM, which is the one we're going to be interested in. TWL is just the code name for the DSi, so that's where the name comes from. So, in terms of actually launching another operating system, what
the 3DS does is basically have the ARM9 do everything. First the ARM9 loads the firmware image from permanent storage into memory that we cannot alter from the ARM11, which is ARM9 internal memory. It uses its crypto hardware to decrypt it and authenticate it, to make sure it's actually the Nintendo special sauce and not something that we've altered somehow. Then it copies each individual section to where it's supposed to go: you have code that needs to run on the ARM9, which is pretty easy, since it's already in ARM9 memory, and you also have code that needs to run on the ARM11, which has to be copied to FCRAM, WRAM and whatever. Now the thing is, once it's done that, the ARM9 is not just going to start executing its own code; it is first going to tell the ARM11: hey, please start running the code on your end too, because I need you to do that as well. The thing is, we've compromised the ARM11, so we can basically just tell the ARM9 no, and keep running whatever code we're already running. And that becomes interesting, because if you
take a look at how TWL_FIRM works, basically it first has to load a ROM, a DS game image, from somewhere, and then turn into a pseudo-DS mode sort of thing. It can actually load its ROM from three locations: it can either load it from an actual physical game card, or from the NAND permanent storage, or, for some reason, it can load it from FCRAM, which of course we have complete control over from the ARM11. That becomes interesting because, you know, it's a ROM loader, it's a file format parser, so there might be bugs in there, and it turns out there are. So from the ARM11 we can actually mess with that ROM and inject something. The only thing is that of course DS ROMs are signed, so Nintendo actually checks that the ROM is valid before using it, and that should kind of kill the idea, except that for some reason it does not check the signature if the ROM is coming from FCRAM. That is completely baffling, because that is the one location that we have control over. I honestly don't know why; I did not really care to reverse engineer why that happens, but it does, and so we are actually able to move forward with this idea. The DS ROM is basically going to contain two memory images that have to be copied, because these are the code images that are going to be run by the ARM9 and the ARM7, which are the two CPUs of a DS. So essentially the TWL_FIRM loader has to copy these ARM9 and ARM7 sections into whatever memory is going to correspond to the DS mode RAM address. So you have this kind
of formula to convert a Nintendo DS mode physical address into a 3DS physical address. The two things to notice are, first off, that it's a kind of weird hardware compatibility mode, so you only have two bytes that are used every eight bytes, which is why we have to multiply by four, for some reason. The other thing is that you can see with this formula that you can actually create any 3DS physical address as long as you have the right Nintendo DS physical address: we can cover the entire memory space, even though that's not meant to be the case. Because we have control over the Nintendo DS mode address that the ARM9 and ARM7 sections are going to be loaded at, we can turn those into any 3DS physical address that we want, for example memory that is used by the ARM9 to execute code. So we could maybe overwrite the actual ARM9 3DS mode code and take it over. That is, as long as their checks are not good enough. Turns out, of course, their checks are not good enough, because they did not do an integer overflow check, which again is kind of sad, and there are also no bounds checks on the section sizes. That means that we can have kind of crazy values. Let's say we want to overwrite this address over there, because that is the ARM9 thread stack address. We can just create this fake Nintendo DS address and give it this crazy size, which is about one gigabyte of memory, so this would never be valid for a Nintendo DS ROM. And if you check out the math, it's actually going to give you the address that you want, and it's also going to pass all the checks that we just mentioned. And so at this point, basically, we
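The transcript does not reproduce the formula from the slide, so the following Python sketch uses a hypothetical one purely to show how a missing integer overflow check lets a scaled address translation reach the whole 32-bit space. The two base constants and the target address are assumptions; only the factor of four comes from the talk.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical constants for illustration; the real formula is on the talk's slide.
NDS_RAM_BASE = 0x02000000  # assumed start of DS-mode main RAM
CTR_RAM_BASE = 0x20000000  # assumed 3DS physical address it maps to
SCALE = 4                  # "multiply by four", as stated in the talk


def nds_to_3ds(nds_addr):
    """Translate with 32-bit wraparound and no overflow check, like a buggy loader."""
    return ((nds_addr - NDS_RAM_BASE) * SCALE + CTR_RAM_BASE) & MASK32


def nds_addr_for(target_3ds):
    """Find a DS-mode address that lands on a chosen, suitably aligned 3DS address."""
    delta = (target_3ds - CTR_RAM_BASE) & MASK32
    if delta % SCALE:
        raise ValueError("target must be aligned to the scale factor")
    return (NDS_RAM_BASE + delta // SCALE) & MASK32


# Assumed target below CTR_RAM_BASE, so it is only reachable through wraparound.
ARM9_TARGET = 0x08000000
fake_nds = nds_addr_for(ARM9_TARGET)
```

Here `fake_nds` is far outside any plausible DS RAM range, yet the multiplication overflows 32 bits and the result wraps around to the chosen target, which is the kind of "crazy value" the checks would need to reject.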
end up with the ability to write about a gigabyte of memory to an arbitrary location in the 3DS physical address space. That should be a problem, because we actually don't have a gigabyte of memory to overwrite. I'm going to kind of skip over this because I'm running out of time, but the idea is that we can just overwrite this physical address here, the return address for this function, because the memory is being copied in tiny blocks instead of big blocks, so you actually end up being totally fine. Now the only thing is that, for some reason, it's copying two bytes at every eight-byte boundary, so you can't just overwrite code, because you can only overwrite two bytes out of every eight, which is super annoying. What that means in practice is that if I want to overwrite this call stack, I can only overwrite the bytes that are highlighted in orange. In terms of making a ROP chain that's not ideal, but we can make it work, because the ARM9 doesn't really have DEP or anything. So we can use this to place the address of actual code that we control, because that code can be in writable memory. In this case we place that code into the Nintendo DS mode ROM header, and then we just overwrite one return address at the top there and make it jump to this gadget that skips a bunch of the call stack and then returns into this address, and we get code execution on the ARM9. And yeah, at this point we have
full control over the entire machine. We started with nothing, went over the network, sent one magic packet, and then it just kind of gives you full access to everything: you can read and write, you can mess with the crypto engine, and do whatever you want. So thank you
for your time. All the code for the exploits for this is available on GitHub. I want to give special thanks to a few people: derrek, nedwill, yellows8, plutoo and naehrwert. If you want to follow me on Twitter, that's my handle, smealum; I'm not tweeting very interesting things, so, you know, please don't. And yeah, have a good DEF CON.