Breaking Band

Video in TIB AV-Portal: Breaking Band

Formal Metadata

Breaking Band
Title of Series
Part Number
Number of Parts
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
In recent years, over-the-air exploitation of cellular baseband vulnerabilities has been a recurring topic in the security community as well as the media. However, since “All Your Baseband Are Belong To Us” in 2010, there has been little public research on exploiting cellular modems directly. Now, Breaking Band is back with a new season by popular demand. We will describe our methodology for reverse engineering the RTOS, starting from unpacking proprietary loading formats to understanding the security architecture and the operation of the real-time tasks, identifying attack surfaces, and enabling debugging capabilities. Through this, we’ll give you a complete walkthrough of what it takes to go from zero to zero-day exploit, owning the baseband of a major flagship phone, as we have done at Mobile Pwn2Own 2015.
OK, cool. We're going to continue with the next talk, on some adventures in the cellular baseband.
Yeah, right, thanks. So hi everyone, welcome to Breaking Band. I'm Daniel Komaromy and that's my research partner Nico Golde, and we're going to talk about reverse engineering and exploiting baseband software.
So baseband security has been a pretty hot topic in the last couple of years. Going back to 2009, there were a couple of talks on SMS parsing and then on the exploitation of memory corruption issues in baseband OSes. But in the time since, even though there's been a lot of good research on protocol security, there has not been a lot of new information published explicitly on vulnerabilities and real exploits. And that's been the case even though the baseband has been a target at Pwn2Own, with some of the highest payouts, but without any attempts until last year, of course. In place of that, there's been what I would characterize as folklore: a lot of repeated information that might actually have been accurate five or six years ago, but that gets rehashed and doesn't necessarily remain factual these days.
So that was basically our original motivation to get into this research. Talking about targets: obviously Qualcomm basebands have been the biggest recipient of attention in recent years, and that kind of makes sense, because they have maintained a large market lead in this area. But of course we used to work for Qualcomm, so we didn't have the opportunity to contribute to public research in this domain. However, last year was kind of a sea change, because Samsung decided to ditch Qualcomm basebands in their flagships and instead introduce their own implementation, which they call Shannon. This gave us an opportunity to go after a target of our own, as well as to try to answer a question. There's this concept that baseband OSes are massive black boxes, and it creates a bleak security nightmare scenario where only some initiated few might understand how they work, and it's not possible to tinker with them from the outside. So we wanted to take the chance to see how far just a couple of guys can really get in their free time, in a period of a few months.
So, getting to the meat of the talk: we're going to start by describing our first steps to understand and reverse engineer the real-time operating system, then finding vulnerabilities, and finally what it took to put together a full remote code execution exploit. What we try to do is not just go through the results, like "this is what the OS is like", but explain how we went through the process of understanding things, and talk about the successes but also the failures. At the end of it all we ended up building quite a few custom tools and scripts that aided us, and we're going to be releasing all of those. So, to get started: Shannon is essentially Samsung's own baseband implementation. It's an entire stack. Its history is not new, it goes back to older devices, but it was really with the S6 that it was first introduced into their entire flagship line. It's also used in non-Samsung devices; there are Meizu phones that use Shannon basebands. And Samsung continues to use this design, so essentially the S7 line that's coming out now, except for the models that are sold in the US, uses Shannon still.
So when you get started, the first thing you want to do is acquire the firmware. In the case of these phones this is pretty straightforward: the modem image is just one of the partitions, accessible on Android as the radio partition, to be precise. But once you have that blob, the naive approach doesn't get you anything: binwalk doesn't recognize any meaningful signature here, so you're dealing with some kind of proprietary firmware format.
So the next thing you do is just fire up your favorite hex editor and try to make some sense of the header format, which in this case luckily wasn't that complicated. With this file format you'll see the first marker, TOC, which presumably stands for table of contents, and it nicely includes, as you can see on the slide, strings for the different parts, so you quickly get an idea of what this is. It's essentially just an enumeration of all of the firmware parts that are stitched together. That includes a BOOT section, which is some type of bootstrap code, and we'll talk about that later; then MAIN, which is essentially the entire real-time operating system; and two data partitions. One is NV, which is radio configuration data that goes into non-volatile memory, and the other is called OFFSET. That's also some data section, but we're not sure what its purpose is, since we didn't need to figure that out for our purposes. The last thing that's obvious at first look is that there's some kind of signature or hash at the end of the firmware as well; presumably, of course, this implements secure boot.
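A table of contents like the one described above can be walked with a few lines of Python. This is only a sketch: the talk does not spell out the entry layout, so the 32-byte record assumed here (a 12-byte name followed by five little-endian 32-bit fields) and the names `ENTRY`, `TocEntry` and `parse_toc` are illustrative assumptions, not the documented format.

```python
import struct
from typing import NamedTuple

# Assumed entry layout (hypothetical): 12-byte NUL-padded name, then
# file offset, load address, size, checksum and entry id as 32-bit LE words.
ENTRY = struct.Struct("<12sIIIII")


class TocEntry(NamedTuple):
    name: str
    offset: int
    load_addr: int
    size: int


def parse_toc(blob: bytes, max_entries: int = 32) -> list[TocEntry]:
    """Read consecutive entries until an empty name or the end of the blob."""
    entries = []
    for index in range(max_entries):
        start = index * ENTRY.size
        if start + ENTRY.size > len(blob):
            break
        raw_name, offset, load_addr, size, _checksum, _entry_id = ENTRY.unpack_from(blob, start)
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        if not name:
            break
        entries.append(TocEntry(name, offset, load_addr, size))
    return entries
```

With a real image you would call `parse_toc(open(path, "rb").read())` and slice each part out of the file using `offset` and `size`, after checking the assumed layout against what the hex editor shows.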
Then you start looking at these different pieces. First, with BOOT, we basically got lucky: just looking at it in the hex dump, it's already apparent that you're looking at plaintext code. That's an easy check that you can often do with ARM code because of the condition codes: if it's actually ARM code, then you'll see these columns of "E" everywhere.
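The "columns of E" check can be automated. In little-endian ARM (not Thumb) code, the condition field is the top nibble of every 32-bit instruction, and most instructions use the "always" condition 0xE, so every fourth byte tends to start with E. The sketch below measures that fraction; the function name and the idea of using a ratio are my own illustration, not something shown in the talk.

```python
def arm_always_ratio(data: bytes) -> float:
    """Fraction of 32-bit little-endian words whose condition nibble is 0xE."""
    words = len(data) // 4
    if words == 0:
        return 0.0
    # Byte 3 of each little-endian word holds bits 31..24 of the instruction.
    hits = sum(1 for i in range(3, words * 4, 4) if data[i] >> 4 == 0xE)
    return hits / words
```

Plain ARM code usually scores far above the roughly 1/16 you would expect from random or encrypted data. Thumb code does not show this pattern, so a low score alone does not rule out code.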
And sure enough, if you fire this up in IDA, you get very decent results from the auto-analysis, and you can just get started from the beginning, from the reset handler, and try to figure out what exactly it does. We'll get back to that in a second.
More interestingly, you then look at the MAIN part, and here we're not so lucky. It's pretty massive, like 40 megabytes, but as you can see, no such luck as before: this image is clearly somehow encoded, compressed, or encrypted, or what have you. And there's not a whole lot to fall back on, because we also looked, of course, at previous firmware variants for older devices, and in those cases the main code was always in plaintext as well.
So then you want to get an idea of what you're dealing with: is this some known compression, some unknown packing, some kind of naive crypto? A good thing that you can do is just look at the entropy of the image. Craig Heffner also has blog posts that explain this in detail, so we won't go all the way into that, but the bottom line is that just from the results it's clear that this is some kind of proper crypto.
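As a minimal illustration of the entropy check mentioned here, the following computes the Shannon entropy of a byte string in bits per byte. It is a generic textbook calculation, not the speakers' tool; in practice you would run it over fixed-size windows of the image and plot the result.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for constant data, 8.0 at most)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return abs(-sum(c / total * math.log2(c / total) for c in counts.values()))
```

Plain code and text sit well below 8 bits per byte, while both compressed and encrypted data sit close to it, so a flat value near 8 across the whole image rules out naive encodings but needs further statistics to tell compression from encryption.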
So then you think about: OK, how am I going to get the plaintext code? We looked at a couple of options and hit a bunch of dead ends, and then eventually figured something out. First, going back to the boot code, we looked at that, and the bottom line is that as you start reversing this code it becomes apparent pretty quickly that essentially all that's interesting to you, the decryption and the signature checking, is done by tight copy loops that are essentially leveraging memory-mapped I/O. What that means is that there are dedicated hardware pieces in the SoC that will do the decryption, the signature checking and so on. So without some hardware-based debugging support, it's going to be pretty hard to figure out.
The next thing we looked at is the TrustZone on the device. We just had a hunch that we might find TrustZone being used for assisting the loading, just because Samsung has a habit of putting a lot of features into their TrustZone, Knox and what have you. However, in this case this turned out to be a dead end as well.
So then we looked at what's happening on the Android side, since presumably Android would somehow play a role in the loading of this image. We just looked around a little bit in the boot logs, the radio logcat, the file system and the running processes, and sure enough we found this one process called cbd, which stands for cellular processor boot daemon. Even though this is involved in the loading of the image, as you'll see in a second, for the purposes of figuring out how to decrypt the image this also proved to be a dead end.
The reason for that is what it turns out cbd actually does. Again, this was basically just some old-fashioned reversing of the binary. Of course this was the first Samsung phone using 64-bit Android as well, so I guess the Hex-Rays ARM64 decompiler plugin would have been nice six months earlier, but what can you do. In any case, we were able to figure out relatively quickly that really all this process does is parse the TOC format to get the chunks and then use the SPI connection to send them over to the modem side. And that side, as we had seen in the boot code, just takes these and then does the logic to actually decrypt and load them. So at first sight, that's basically not helpful for us.
But in the end it turns out that cbd still gives us what we were looking for. It's pretty limited, it just does a few things: it can basically start and restart the baseband, there are a couple more commands that it can handle, and it nicely has a help menu that will list all the commands. In the case of the S6, the function that's interesting is actually named "test", so that doesn't tell you all that much. But we also looked at older phones and some other variants, and for the same command those specifically say that this is a RAM dump: you can dump the baseband memory with it. So that's pretty much what we were looking for. This of course requires root, but if you have root on the phone you can just instruct cbd to give you a memory dump of the baseband. And at first look it's obvious: OK, I see code, nicely, so we've got some kind of representation of the live memory of the baseband.
But it's not perfect yet, especially because Samsung actually continues to regularly update the devices. I guess in this context it's annoying, while in the larger picture it's a pretty great thing, but from the perspective of a reverse engineer trying to maintain root and also trying to keep up with the firmware, that's a problem. As we found out later, though, it's really not in your way, because it turns out that you don't need root to control cbd: there are just a few hidden dialer codes that you can key in to re-enable debugging functionality, and then you just use a menu to get the same exact RAM dump. As a side note, if you want to do some kind of research on Android phones, it's actually very typical of OEMs to add debug enablement into these so-called hidden menus, and XDA Developers in particular tends to be a very good forum with rich information on that. So really, if you're trying to debug anything at all on an Android phone, go through the forums first and you'll probably find some nice shortcuts like this.
In any case, OK, so we got a RAM dump, but it's obviously not exactly the same as an understood file format, so we would still have to figure out how to create a loader for this. For that we had to go back a little bit to the boot section and to cbd as well. On this one we were lucky again, since it wasn't really that difficult to identify a piece of code that's static in the boot image and that basically lists out which memory areas are then stitched together into the RAM dump, which ends up being a 130-megabyte dump that's sent over to Android. So then the actual reversing can start, because you have the segments. Of course you don't know everything about them, like the permissions, or what's code and what's data, but you have just about enough to put together a basic IDA loader. So Nico will take it from there.
All right, yes. So what are we looking at at this point? We have this 130-megabyte RAM dump, which is obviously quite a lot. About 40 megabytes of that is actually code, and we have 70 thousand functions. For manual reverse engineering this is definitely tricky: it's a lot of code, and obviously this is a stripped binary, even though, as you can see on the right side, there are tons of strings that are very meaningful. One of the ones that was very useful early on was the one that you see here, which is actually telling you that this is running on an ARM Cortex-R7. Of course this may not be true, but in our case it was actually true. Following that, our steps were essentially: identifying the real-time operating system primitives, from there finding our way to the part we need, which is the radio stack, finding a way to debug this, and ultimately finding and exploiting bugs in it. So we'll go through these steps now.
So now that we have the code, we could get our hands dirty directly, but there are still a couple of problems. The first one is that there is still a meaningful amount of code that's actually not identified as such. We have all these strings, and usually as a reverse engineer you make use of them somehow, but doing this manually for such a large binary doesn't scale. Identifying real-time operating system primitives in IDA is often tricky, because it lacks a deep understanding of low-level instructions. And ultimately, anything that we find out we have to verify, so we need some kind of debugging functionality as well. So the first thing we did was to assist IDA a little bit with the detection of functions. This is probably something that a lot of you have done before, and it's not meant to pick on IDA at all: function detection is a fairly complex problem, especially if you don't detect functions only by following control flow. But essentially, to make sure that we actually have all the code that we want to find bugs in, the first thing we applied was simply scanning for certain ARM prologues and creating a function at that point. Even though that also includes false positives, this actually helped quite a bit throughout the process.
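A prologue scan of the kind described above can be sketched outside of IDA. The talk does not say which byte patterns the speakers matched, so the two used here are assumptions: the ARM `STMDB SP!, {..., LR}` push with the "always" condition, and the 16-bit Thumb `PUSH {..., LR}`. Inside IDA, each hit would be handed to the function-creation API instead of being collected in a list.

```python
import struct


def find_arm_prologues(code: bytes, base: int = 0) -> list[int]:
    """Addresses of ARM words that look like 'push {..., lr}' (0xE92D with bit 14 set)."""
    hits = []
    for off in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, off)
        if word & 0xFFFF4000 == 0xE92D4000:
            hits.append(base + off)
    return hits


def find_thumb_prologues(code: bytes, base: int = 0) -> list[int]:
    """Addresses of Thumb halfwords that look like 'push {..., lr}' (0xB5xx)."""
    hits = []
    for off in range(0, len(code) - 1, 2):
        (half,) = struct.unpack_from("<H", code, off)
        if half & 0xFF00 == 0xB500:
            hits.append(base + off)
    return hits
```

Both scans produce false positives, since data and instruction operands can contain the same bit patterns, which matches the caveat in the talk.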
Then, getting to strings. People usually look at strings to try to find meaningful ones, then go back along the cross-references and start labeling functions. But this doesn't scale: for such a binary we have roughly a hundred thousand strings, which is also something that's quite common in basebands, mostly for debugging. You have various tools running on a PC that are used by modem engineers for debugging in the field. And you have all kinds of strings: state strings, and file path information, which is very valuable because it gives you a kind of hierarchical information about the code, so every function that includes such a path is probably part of that particular layer or even that task. But our thought at this point was that any kind of automatic labeling is going to be more useful than the default sub_ names.
So what we did as the first step was to categorize all strings into two buckets. The first is what we call exact strings: strings that keep appearing with certain types of functions, such as fatal-error functions or things that give you some kind of debug information and tell you in which file a crash occurred. These are essentially strings that give you something meaningful about what the underlying functionality may be or where the code actually lives. The second category is then essentially everything else. You don't just want to take everything that is a string, because that contains a lot of junk as well, so we did some filtering and normalization, essentially limiting this to everything that has a certain length and contains certain characters, and then we used this information to apply labels to all the functions that make use of these strings.
Now, I don't want to go too much into the actual heuristics that we use here, also because we will release the code anyway, so you can have a look. But essentially we used a mix: deciding whether this function is using an exact string, whether it is also using a fuzzy string, and then either using a combination of these or preferring the exact string. Once you're at the point where you've labeled these functions, you can also go one step back and start labeling the functions that call them. So, as you see here, we simply labeled this as "calls_" something, and this gives a lot more meaning to the actual binary when you have such a big number of functions. In our case this gave us roughly 20 thousand labeled functions, which is actually a lot if you think about the fact that the original binary had 70 thousand functions.
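The labeling idea can be sketched in plain Python, independent of IDA. The talk gives no concrete thresholds or rules, so the minimum length, the allowed character set, the 40-character cap and all function names below are assumptions chosen only to show the shape of the heuristic: filter and normalize strings, prefer exact strings over fuzzy ones, and derive a `calls_` label for callers.

```python
import re
from typing import Optional

MIN_LEN = 6                                          # assumed threshold
ALLOWED = re.compile(r"^[A-Za-z0-9_ ./:\-]+$")       # assumed character set


def label_from_string(text: str) -> Optional[str]:
    """Turn a string into an identifier-like label, or None if it looks like junk."""
    text = text.strip()
    if len(text) < MIN_LEN or not ALLOWED.match(text):
        return None
    label = re.sub(r"[^A-Za-z0-9_]+", "_", text).strip("_")
    return label[:40] or None


def choose_label(exact: list[str], fuzzy: list[str]) -> Optional[str]:
    """Prefer a label from the exact bucket, fall back to the fuzzy bucket."""
    for bucket in (exact, fuzzy):
        for text in bucket:
            label = label_from_string(text)
            if label:
                return label
    return None


def caller_label(callee_label: str) -> str:
    """Label for an otherwise unnamed function that calls a labeled one."""
    return "calls_" + callee_label
```

In an IDA script, `exact` and `fuzzy` would be the strings referenced by one function, and the result would be applied with the database's renaming call, with a numeric suffix added where labels collide.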
The next step then is: OK, we want to identify how the real-time operating system works. The way this is usually implemented is by using low-level instructions of the particular architecture, and on the Cortex-R7 what you're looking for is essentially MCR instructions, which are used for things like flushing caches, doing task switches and things like that. Unfortunately IDA by default doesn't know these, so we wrote another plugin, which currently supports ARMv7 and ARM11 and essentially annotates all of these with a comment. Once you are there, just by looking at the comment it's pretty easy to tell what the actual functionality is. If you see: OK, this is cleaning the data cache, invalidating the instruction cache, and afterwards writing to the system control register, then this is very likely enabling these caches.
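To show what such an annotation involves, here is a small decoder for ARM-mode `MCR`/`MRC` instructions that target coprocessor 15. The field positions follow the ARM instruction encoding; the register table is a short hand-picked subset for illustration and is not the speakers' plugin or its full register list.

```python
from typing import Optional

# (opc1, CRn, CRm, opc2) -> name; a small illustrative subset only.
CP15_NAMES = {
    (0, 1, 0, 0): "SCTLR (system control register)",
    (0, 7, 5, 0): "ICIALLU (invalidate all instruction caches)",
    (0, 7, 6, 1): "DCIMVAC (invalidate data cache line by address)",
    (0, 7, 10, 1): "DCCMVAC (clean data cache line by address)",
    (0, 6, 2, 0): "RGNR (MPU region number)",
    (0, 6, 1, 0): "DRBAR (MPU region base address)",
    (0, 6, 1, 2): "DRSR (MPU region size and enable)",
    (0, 6, 1, 4): "DRACR (MPU region access control)",
}


def annotate(word: int) -> Optional[str]:
    """Return a comment for a CP15 MCR/MRC instruction word, else None."""
    is_coproc_transfer = ((word >> 24) & 0xF) == 0xE and ((word >> 4) & 1) == 1
    if not is_coproc_transfer or ((word >> 8) & 0xF) != 15:
        return None
    opc1 = (word >> 21) & 0x7
    crn = (word >> 16) & 0xF
    rt = (word >> 12) & 0xF
    opc2 = (word >> 5) & 0x7
    crm = word & 0xF
    direction = "MRC (read)" if (word >> 20) & 1 else "MCR (write)"
    name = CP15_NAMES.get((opc1, crn, crm, opc2), "unknown CP15 register")
    return f"{direction} r{rt}: {name}"
```

For example, `annotate(0xEE010F10)` describes `MCR p15, 0, r0, c1, c0, 0` as a write to the system control register, which is the kind of comment that makes a cache-enable routine recognizable at a glance.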
So that was the first thing that we did, and that helped us find the primitives that we also used later on for exploitation. But we certainly want to know a little bit more: what kind of privilege level is this running at? How do we find tasks, which are usually used in cellular basebands to implement the various parts of the radio stack? How do the tasks talk to each other? How are the stack and the heap managed, which is going to be important for exploitation? And going from there, how do we actually find the radio stack, which is going to be the meat of this?
First, getting to the execution mode: this is usually fairly straightforward. On an architecture like ARM you would expect some kind of kernel/user split, with supervisor call instructions that trap into the kernel, which implements the core functionality and then eventually returns. While this exists on Shannon as well, it's not really used that much. There are a couple of SVC handlers, but mostly they're used for things like RAM dumps and resets, which is obviously not enough to implement a proper API similar to other real-time operating system APIs. So at that point, and also having already seen some register contents, we concluded: OK, this is very likely all supervisor mode, all the time. We had to verify this, of course, when writing the exploit, but I can tell you already that Shannon is running for the most part in supervisor mode, which also means that there is absolutely no separation between the tasks that are running, so any compromise will have quite severe consequences.
So the next thing you want to do then is identify where the radio stack is actually implemented, and as I mentioned, this is usually done using tasks. There are essentially two approaches here. One is harder: what you can do, in almost every case, is find your way from the interrupt vector table, look at the reset handler, and work all the way through to the initialization of tasks. But this can be very long, and for a 40-megabyte binary this is definitely challenging. So we actually took a different route here: we made use of the fact that we have a RAM dump, and every task that is running at that time is going to make use of data in that RAM dump. That also means it's going to make use of stacks. So what we did was scan for typical call frame patterns in that RAM dump and essentially backtrace them, similar to a debugger. Eventually you reach a point where your backtrace hits a wall, and this is very likely the place where the stack is initialized, which is also the place where the task is initialized. At that point you find a linked list that is setting up these tasks, which you can walk, and we also wrote a script for that. That brings us to what you see here on the right: this is a listing, not of all the tasks, but of some. You see there's one for mobility management, which is one part of the radio stack, there's call control, and it's actually quite a lot of tasks, roughly 100. Going from there, you can look at the actual radio parts.
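The first half of that stack-scanning idea can be illustrated with a much simpler heuristic than a real backtrace: walk a stack region word by word and keep every value that points into the code range, since those are candidates for saved return addresses. The function name, the parameters and the idea of clearing the low bit for Thumb targets are my assumptions; the speakers' script is not described in this detail.

```python
import struct


def candidate_return_addresses(stack: bytes, stack_base: int,
                               code_start: int, code_end: int) -> list[tuple[int, int]]:
    """Return (stack address, value) pairs whose value points into the code range."""
    found = []
    for off in range(0, len(stack) - 3, 4):
        (value,) = struct.unpack_from("<I", stack, off)
        target = value & ~1  # a set low bit marks a Thumb return address
        if code_start <= target < code_end:
            found.append((stack_base + off, value))
    return found
```

A real backtrace would additionally check that the instruction before each candidate is a call and would follow the frames upward until no caller is found; the frame that ends the chain is where the task's entry point and its setup code are likely to be.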
is how these tasks get messages: when is an over-the-air message processed by a task? This is definitely not trivial, but tasks in a real-time operating system usually follow one prominent pattern: there is a loop that dequeues a message somehow and then does processing on that message. This is also the case here. I'm not sure if you can read this properly, but this is an example from the call control task. You have one function at the top which essentially dequeues a message. For reversing that, there is unfortunately no silver bullet. It obviously helps if you have worked at a vendor that is implementing basebands, and there are also various projects like OpenBSC and OsmocomBB where you can get a feeling for what such an implementation may look like. But anyway, all this is doing is dequeuing the message and stuffing it into a global data structure. Then you have another function which processes this message, and then the same repeats. As you can also see here, and this gives you an idea of how much we actually understand about Shannon, there's a bunch of stuff we have no idea about, which is very likely related to signals or state handling. But it doesn't really matter: just identifying these prominent patterns is actually fairly straightforward. Now, the memory management is also going to be interesting for writing exploits. I've mentioned how we were able to find some stack area and walk all the way back to understanding tasks, and looking at this also gives you information about how the stacks are managed. Just by looking at the task initialization you see that the task structure also contains the top and the bottom of the stack. There's nothing special about this on Shannon: the stacks are all contiguous in memory, they all start at a static location, and
between them you have these markers, which are used for checking for stack overflows. Then you look at the heap, which is a little more tricky but also fairly straightforward to identify once you realize that you have a fairly prominent call pattern: something is allocated and then some data is copied into the chunk, so it's fairly easy to spot these. Even though we reverse-engineered pretty much the entire heap implementation, it's not too interesting. This is a classic slot-based allocator that organizes memory chunks in buckets of different sizes, with a free list governing them. So this is mostly a reference for you. Now, about the memory configuration: the ARM Cortex-R7 is nice enough to not use an MMU, which makes figuring out the memory configuration easier, because you don't have to walk page tables or come up with any sophisticated tracing of the code. Really all of this is controlled, again, with MCR instructions, so we could reuse some of the scripting that we already had at this point. Just by looking again at the comments, it's fairly easy to spot the function that is used to configure the MPU, and tracing this you get a pretty good understanding of what the memory looks like.
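The region attributes recovered from such an MPU setup function can be decoded mechanically. What follows is a minimal sketch, assuming the standard ARMv7-R PMSA layout of the region base address, region size/enable and region access control registers (DRBAR, DRSR, DRACR). The function name `decode_region` and the register values in the usage note are hypothetical and are not taken from the Shannon firmware.

```python
from dataclasses import dataclass

# AP[2:0] encodings of the ARMv7-R PMSA region access control register.
# Encodings not listed here are treated as reserved.
AP_MEANING = {
    0b000: "no access",
    0b001: "privileged RW, user none",
    0b010: "privileged RW, user RO",
    0b011: "privileged RW, user RW",
    0b101: "privileged RO, user none",
    0b110: "privileged RO, user RO",
}


@dataclass(frozen=True)
class MpuRegion:
    base: int
    size: int
    enabled: bool
    execute_never: bool
    access: str


def decode_region(drbar: int, drsr: int, dracr: int) -> MpuRegion:
    """Decode one MPU region from the three values written via MCR."""
    rsize = (drsr >> 1) & 0x1F           # RSize field, bits [5:1]
    size = 1 << (rsize + 1)              # region size is 2**(RSize + 1) bytes
    base = drbar & ~(size - 1) & 0xFFFFFFFF  # base is aligned to the region size
    ap = (dracr >> 8) & 0x7              # AP field, bits [10:8]
    return MpuRegion(
        base=base,
        size=size,
        enabled=bool(drsr & 1),              # En, bit 0
        execute_never=bool((dracr >> 12) & 1),  # XN, bit 12
        access=AP_MEANING.get(ap, "reserved"),
    )
```

As a usage example with made-up values, `decode_region(0x04000000, 0x27, 0x1300)` describes an enabled 1 MiB region at `0x04000000` that is readable and writable but marked execute-never. Running every region through such a decoder gives a quick table of which areas are executable.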
And so at that point we knew precisely what memory regions there are and what permissions they have, which on the one hand allowed us to improve the loader, but on the other hand was also useful for exploitation. One thing you see here already is that there is one region, for example, that has XN set and another one that doesn't. I just realized you don't see my mouse pointer, sorry about that. So at this point you have all the information to go further with exploitation. Now there's
one thing missing that I mentioned in the beginning, which is that we need some kind of debugging capability. One thing that's nice on these devices is that once the device crashes, it actually gives you some crash information already. In this case you see that it is showing you a data abort, so that already gives you a clue about what you may have hit with the payload there. But looking around for more information, we eventually found this nice function on the right here, which dumps register values, and this gives you a complete map of all the registers and memory. For a while we were wondering how this actually ends up there, whether there is some memory-mapped magic that does this, but it turns out it's much simpler: by simply following the interrupt vector table and the exception handling again, you see that the exception handling is spitting that information out once you hit a crash. And this essentially gives you
a proper crash debugging facility. You even have the banked registers, and that was very useful to also tell whether we were running in supervisor mode or not at that point. What you're looking at next is doing some more debugging, especially while the modem is running, because this here is only useful once it has already crashed. So we also looked into
debugging, and one thing that's interesting in this context: there was a publication earlier this year about the fact that on some modems the UART is exposed over USB and you have an AT interface on that. There are some privacy implications, because you can place calls and do other things even though the phone is locked, which you shouldn't be able to do. But it turns out it's actually far worse, because if you look at what Samsung really added there, one of the things is memory read and write for the modem, which you can definitely use to build your own debugger. We unfortunately didn't do that, mostly because once we were at that point we were already so far into exploit development that we mostly used this for poking a little bit at memory and looking at values. But it would be enough to build a debugger, so if anyone wants to do that, you're welcome. And as a general note, this is actually fairly common as well, so don't ever just assume that the baseband has only the normal set of AT commands. It is definitely useful to look at this, and there are usually a bunch of great goodies in there. OK, so going
from there, we looked into finding bugs, and Daniel is going to talk about that part. All right. So we're kind of rushing along at a lot of speed here, and now we're getting to the fun parts, I guess. But it also means we're getting to the part where we can no longer avoid showing you 3GPP diagrams. Notice that up until now you didn't need to know anything about GSM or LTE or whatever, and you already have an understanding of how to debug the environment, you know what the tasks are, and for every task you have an idea of where it starts and how it processes messages. So you're pretty far along. But at this point you need to make some choices: what kind of bugs do you want to look for, and where? At this point it is useful to have the specification as a reference for what the functionality is supposed to be. Based on that, we decided to look for memory corruption issues in the parsing of layer 3 messages, or NAS messages. The reason is that, based on the spec, this is complicated enough to hopefully give you some opportunities, and the messages are also long enough. That is sometimes an issue with the lower layers of the stack, like RRC for example, where the signaling messages are short, so even if you find yourself with a buffer overflow or something like that, you might have a really hard time getting a useful payload out of it. So we have a target. We would have
to go to the baseband and identify where the task is that we care about. With some understanding of NAS you will know that you have tasks like mobility management and connection management, and in particular, within connection management, the piece called call control, which, as the name suggests, is how you set up calls, manage calls and end calls, with messages for all of this. The way these messages are encoded is that they use so-called information elements (IEs), which are basically just the encoded chunks that comprise a message. Any NAS message will have some mandatory IEs and then usually quite a bunch of optional IEs. So then what you will need to
do is find, within the call control task, where the parsing happens that takes the wire format, splits it up into information elements, presumably puts that into some kind of internal representation, and then starts working on that to actually process the message. For that, one approach you could take is to use an understanding of what magic values and value ranges you are supposed to see in a certain type of message and try to identify the needle in the haystack that way. The more straightforward thing is to start from the top of the message processing of the call control task and just go through it, and that's what we did. No magic sauce here, just some manual reverse engineering. As you would expect, you don't have it that easy: when you dequeue that message and have that pointer, it's not exactly the wire format just yet. It's going to be enveloped in some kind of Shannon-proprietary internal structure, so you have to do some reversing to peel away the headers that they have. And then you finally find yourself at the point where you can say: great, I have a pointer, and I know that these bytes in the structure are going to
correspond to what 3GPP says a message like this is supposed to look like. Basically you find a central function which parses IEs and which is used all across these different tasks, like MM and SMS and also CC. It uses a global array of definitions of what information elements have to be like: what the type is, what the minimum size is, and things like that. It uses that in a big parsing loop to take every IE out of the message and fill out another global array, which is going to contain the representation of the currently existing IEs. That again just means, for each IE, a pointer to the actual value bytes, a length indicator, whether this IE is actually present in the message or not, and so on. Then, after this parsing has happened, a dispatcher picks a handler from a table, based on which type of message this is, and that handler processes the message. Now, for these tasks the number of messages that they can handle is actually pretty large; for call control it is quite a few dozen, and only some of those correspond to actual over-the-air messages. Some of them are going to be the handling of SETUP and so on, but then there's going to be a bunch which are internal messages exchanged with lower-layer tasks and so on. If you had nothing else, something you can do is use your understanding of which information element is put where in memory: you can take your knowledge that a SETUP message has to work on a certain set of IEs, look at the handlers to see which parts of that array they use, and try to match it that way. That would be more complicated. Luckily, here it was a lot easier, because in this array representation, as you can see on the screenshot here, you don't just get an
ID and a pointer, but you also get a nice log string which basically says which message this is. It even contains the tag "radio message" for the ones that are over the air, so it becomes really easy to identify these precisely. So at this point we basically know where we are: for the task, we know the exact handler that handles the given message, and we understand how it gets its input and, more importantly, what part of it is validated, or what constraints even exist on the length or the value of an IE, and so on. You'll notice that there was maybe a five-minute period here where you had to care about 3GPP, and once again we find ourselves at a place where it's just parsing. It might as well be Adobe Reader or whatever: there's an input, there are things like the length, you know how much it's constrained and how much it isn't, and you're basically operating on straight-up over-the-air messages. So at that point you can pick your poison in terms of how you like to look for bugs. You can go for some manual analysis and some IDA scripting to help you with finding things like unbounded lengths going into memcpys, or you could go for something more complicated, maybe using decompiler output or whatever your favorite static analysis tool is. In our case, and this is a point that is important to make, it was relatively quick for us to find a vulnerability that was a winner for us and that we picked for Pwn2Own. So that didn't really force us to do more in-depth bug hunting, and so we feel like we're not in a position to answer questions like: how buggy is this code? Or, more
importantly: when we talk about how to find bugs, we inevitably get the question, OK, how about fuzzing? All I want to say is that it's not what we
would recommend here, but if you want to know why, ask us later on. That takes us to the bug that we actually used to get remote code execution. Samsung fixed it relatively quickly, in about a month, and put out an advisory for the vulnerability. We didn't like their description all that much, so instead here's what we actually submitted to them. You can read the whole thing on the slide, but the bottom line is that one of the messages in call control is the PROGRESS message, which contains an information element called the progress indicator, and this can be sent at any time during an active call. "Active call" here means the call has to be at least set up; it doesn't need to have been answered yet. And in the code that parses this message there are basically no checks to prevent your classic stack-based buffer overflow when processing the length of the information
element. It's easier to just look at the code. When it comes down to it, this is more or less every bug hunter's dream: it's a straight-up stack-based buffer overflow where your input can be more than 200 bytes, but the buffer that it is copied into is 4 bytes, and it sits right at the very end of the stack frame, so right next to the return address. And you have essentially complete control over the value, so there are no encoding limitations on your 200 bytes. So that's a pretty good situation, and Nico is going to walk us through how that ended up being exploited.
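The check that was missing is easy to state in code. The following is a minimal sketch of a bounded lookup of one information element, not a reconstruction of the Shannon parser. It assumes, for simplicity, that every IE in the message body is encoded as tag, length, value. The IEI value `0x1E` and the two value octets of the progress indicator follow my reading of 3GPP TS 24.008, and the function name `extract_ie_value` is hypothetical.

```python
from typing import Optional

PROGRESS_INDICATOR_IEI = 0x1E      # assumption: IEI per 3GPP TS 24.008
PROGRESS_INDICATOR_MAX_LEN = 2     # assumption: two value octets per the spec


def extract_ie_value(body: bytes, wanted_iei: int, max_len: int) -> Optional[bytes]:
    """Return the value of the first IE with tag wanted_iei, or None if absent.

    Simplification: every IE is treated as tag (1 byte), length (1 byte), value.
    """
    pos = 0
    while pos < len(body):
        if pos + 2 > len(body):
            raise ValueError("truncated IE header")
        iei, length = body[pos], body[pos + 1]
        end = pos + 2 + length
        if end > len(body):
            raise ValueError("IE length runs past the end of the message")
        if iei == wanted_iei:
            # This is the check whose absence turns a copy into a fixed-size
            # buffer into a stack overflow: reject anything above the limit.
            if length > max_len:
                raise ValueError(f"IE 0x{iei:02X} is {length} bytes, limit is {max_len}")
            return body[pos + 2:end]
        pos = end
    return None
```

With this in place, a well-formed progress indicator such as `bytes([0x1E, 2, 0x80, 0x88])` yields its two value bytes, while a length octet of 200 is rejected before any copy takes place.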
Obviously, the point at Pwn2Own was to somehow show that you have arbitrary code execution. In the desktop space it's fairly easy to tell what kind of payload shows that; on the radio stack it wasn't so clear what we actually wanted to do. But anyway, this is a simplified version of what we showed to Dragos at Pwn2Own. For those who are not familiar with the devices used there: on the left side you see the Galaxy S6, which is the phone that we attacked, and then two other phones. We're first going to call the one in the middle and then see what happens. Let's check that this is
playing. OK, so we call 1338, and that's important to memorize for the next part. You see the phone is ringing, but we're not accepting the call. Otherwise there is no visible indication on that Galaxy S6, but at this point we have already exploited our bug. We call the same number again, and now all of a sudden the phone on the right is ringing. Now you may wonder why this is important. What we showed at Pwn2Own was: we gave Dragos the phone and said, OK, call yourself, and then the phone in my pocket was ringing. The idea we had was that on the one hand this is a fairly simple payload to implement, because it's below 100 bytes, nothing that's too long. On the other hand, this also gives you the capability to do a man-in-the-middle, as long as you know what number was originally called. And 3GPP is nice enough to actually have a field for that, the called party subaddress, which you can use to stuff in the original number, then initiate a new call and do this man-in-the-middle.
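To make the called party subaddress idea concrete, here is a small sketch of how a dialed number could be carried in that information element and recovered on the receiving side. The talk does not say how the number was actually packed, so carrying it as ASCII digits in a user-specified subaddress is purely an assumption. The IEI `0x6D`, the layout of octet 3 and the 20-octet limit follow my reading of 3GPP TS 24.008, and both function names are hypothetical.

```python
CALLED_PARTY_SUBADDRESS_IEI = 0x6D   # assumption: IEI per 3GPP TS 24.008
TYPE_USER_SPECIFIED = 0b010          # type of subaddress, bits 7..5 of octet 3
MAX_SUBADDRESS_INFO = 20             # the whole IE is at most 23 octets


def encode_called_party_subaddress(number: str) -> bytes:
    """Pack a dialed number into a called party subaddress IE (TLV)."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError("number must consist of ASCII digits only")
    info = number.encode("ascii")    # assumption: digits carried as ASCII
    if len(info) > MAX_SUBADDRESS_INFO:
        raise ValueError("subaddress information is limited to 20 octets")
    # Octet 3: extension bit set, type of subaddress, odd/even indicator 0.
    octet3 = 0x80 | (TYPE_USER_SPECIFIED << 4)
    value = bytes([octet3]) + info
    return bytes([CALLED_PARTY_SUBADDRESS_IEI, len(value)]) + value


def decode_called_party_subaddress(ie: bytes) -> str:
    """Recover the number from an IE built by the encoder above."""
    if len(ie) < 3 or ie[0] != CALLED_PARTY_SUBADDRESS_IEI:
        raise ValueError("not a called party subaddress IE")
    length = ie[1]
    if length < 1 or len(ie) != 2 + length:
        raise ValueError("length octet does not match the IE size")
    return ie[3:].decode("ascii")
```

In this sketch the receiving end of the rerouted call would decode the subaddress to learn the originally dialed number, for instance the 1338 from the demo, and could then place the second call leg to it.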
OK, so how do we actually deliver our payload over the air? I don't want to go too much into the background here. I think most of you have probably seen a talk about how to run your own GSM network or something like that. We used OpenBSC in our case, because GSM was sufficient to exploit this bug. On the radio side you will need a BTS, and for that there are also tons of options, so I'm just mentioning the ones that we used: on the one hand the sysmoBTS that you can see there, and a USRP1. That's not expensive: you can get this for below 500 bucks, and I actually recently got a BTS on eBay for 200. So there is no real barrier for this kind of research anymore. Now, once you are able to
deliver your payload, the first thing you are going to ask is: are there mitigations that actually prevent successful exploitation here? The good thing is that Shannon is not entirely fragile, so it does do some checks. Other than the markers that I mentioned earlier, which are used to check for stack overflows, you also have guard values in between heap chunks. And as mentioned, the Cortex-R7 supports XN, and as we've seen Samsung is using that. So that sounds not that bad. On the bad side, there are no real baseline mitigations whatsoever: you don't have stack canaries, you don't have any heap metadata protection, you don't have safe unlinking, and you don't have any randomization or code relocation at run time. So in terms of exploitation, and this is also why we didn't talk much about the bugs here, this is really like partying like it's the nineties. And it becomes even worse, because they are using XN, but unfortunately they're not using it very effectively: the heap and the stack are actually not among the areas protected by it, which we were very surprised about. At that point we had already put together our ROP chains, and then we realized that was not even necessary. So much about the mitigations. Looking at exploitation, you're usually also looking for a couple of primitives that are useful along the way. The first thing you usually want is some way to get controlled memory into the baseband, and hopefully you can also put it at an address which is predictable at run time or maybe even static. There are two things that we found very useful for this. The first is the Temporary Mobile Subscriber Identity (TMSI), which is an ID assigned by the network and sent to the phone as well. This is essentially a known DWORD that you can put in memory. Then you also have the network names; this is the string
that you see at the top of the phone, which says which operator this is. There is a long one and a short one, and I don't recall which one is which, but the nice thing is that one of them you don't see on the actual phone. So this is the second thing that you can use to get your payload into the device, and on ARM, conveniently enough, you can write alphanumeric shellcode as well. What was also nice about this area is that it is only fetched once you change the network, and as a result it is usually not cached, so the usual caching problems you have on ARM you don't have if you want to use the network name as a kind of trampoline to jump to the rest of the payload. The next problem you will run into is size restrictions. As we have seen, the IEs are usually not that large, so most of the radio messages are definitely below 255 bytes, which is not a lot. But because of the way the code is written here, and Daniel talked about how you have these arrays that specify what kind of handler is called for a certain type of message, you can essentially program your own state machine there, so you execute certain functionality on certain radio messages, and by doing that you can stage your payload in case you need more functionality. Usually you also want to have some kind of clean return. As you've seen in the video, the phone is not crashing. There is actually a symbol at the top that you will see when the baseband is crashing, a small indication, and in our case we didn't want to have any of that. Because of the way the tasks operate, by simply being loops that process messages over and over again, if you make sure that you set up the register values accordingly, you can essentially just jump to the beginning of that loop and you're fine, and the modem will keep functioning and would happily process the next stage of
the payload or normal functionality. So even when you go back to the normal network, outside of our own BTS, you will have no problems whatsoever. This is similar for persistence. When we gave the phone to Dragos at Pwn2Own, the call was obviously happening in the normal network, so you want your payload to survive there somehow. No magic here again: flight mode or switching networks doesn't affect any of the functionality, so as long as you can put the payload somewhere in the modem and keep executing it at some point, you will have no problems with persistence at that point. And as most people don't regularly switch off their phones, that's not really a huge problem in practice, so we didn't have to look at persistence that survives reboots. There are some options for this, and one is mentioned here: one thing you can look into is the loading of the radio configuration values. But we didn't have to do that, and it's very likely also not a big problem in practice. So, talking about payloads: you've seen that the demo was maybe not what you expected. Maybe some of you expected that we snatched all the photos and contacts from the phone. One thing that Daniel also stressed at the beginning is that people keep talking about how you have this second operating system that can control everything on the device, but this is really not the case anymore, at least on almost all high-end smartphones these days. The baseband is usually loaded by the application processor and talks, via HSI or something like that, to the application processor, but you have separation, and even though this may not be perfect, you really just have limited control. What you also don't have in the baseband directly are secrets that you can steal. So really, in terms of payloads, as long as you don't
escalate to the application processor side, you're looking at essentially messing with all the data that goes in and out of that modem, and at utilizing the fact that there is nothing like end-to-end encryption. As I mentioned, our payload was used to reroute calls. Now, we also had two major pains that were costing us a bit of time. The first one was related to caching. So originally we thought that for gaining
persistence we would just patch some of the code by reconfiguring the memory regions. It turns out this was not working reliably for us, and we're actually still not sure what the reason is. One thing that's specific about Shannon is a feature called the low latency interface, which is an efficient way of sharing RAM between different cores. We were thinking maybe it's related, but really we have no clue. Eventually we went for patching data and function pointers in order to gain persistence, and just put our code somewhere in an unused part of the heap. The second problem, and this is definitely painful when you try to maintain your payload before the contest and make sure that with every update your offsets still work, is what we call the dual-SIM problem, and this is Shannon-specific. The S6 doesn't support two SIM cards, but apparently some devices do, and the way Samsung implemented this is by simply duplicating all of the functionality, so every function you will actually find twice. You also see that in what we showed at the beginning. That's a huge pain once you try to bindiff symbols and port changes, so this is something you want to pay attention to, because otherwise you will spend a lot of time figuring out why your exploit doesn't work because you used the wrong copy. The last thing that people usually keep asking about, and this is definitely considered the Holy Grail in this space, is: how do you escalate your privileges towards the application processor? This is definitely ongoing research, also on our side, so we can't claim that we did this. But we essentially see two main ways to do that. One is actually not really modem-specific, so there would be no magic about it at all, and it goes back to the fact that the modem does see all the data that goes to the
application processor. So just by coming up with a payload that makes use of the fact that you can inject stuff into that data, let's say you inject a small piece of JavaScript that then serves as your payload for owning a browser, application processor escalation really becomes like every other exploit at Pwn2Own and has nothing to do with modems anymore. That's the less interesting part from our perspective. The second route that is available is to look at the IPC traffic that goes between these cores. There's obviously a lot of parsing and range checking going on there, and you also have some peripherals that you can access directly, memory-wise. This is probably hard, and on the flip side, if you find a bug there, it's probably going to live for quite a while. But what you are more likely to look at is the services that build on top of this; the RIL is a good example. Some of you may recall that the Replicant Android project at some point published an alleged backdoor, which was essentially a directory traversal attack in a remote file system service that Samsung built on top of the RIL. Luckily, this is fixed by now, so it doesn't exist anymore, but this would have been an awesome way of escalating privileges, because then you can essentially read and write files on the application processor's file system. So this is the kind of thing you're looking for. Now, for those of you who are interested in playing further with this, one thing that we found very useful if you want to debug any of the IPC messages that go between these cores: there is a very handy kernel debug interface that gives you a full memory dump of all the IPC messages, and you can look at that. Interestingly enough, this will contain everything: if you call a number, this will include the IPC message initiating that call, so it will contain the number, and it also contains what networks were seen if you switch networks. So this is nice for debugging. On the other hand, this is of course a huge privacy problem, because it allows any unprivileged application on Galaxy devices right now to effectively spy on users. So with that I want to get to the final remarks. The intention of our talk was really to give an idea of what this really takes and how you usually approach this topic. I think I can speak for both of us when I say that our conclusion is that this is really not that special: you have the usual real-time operating system primitives that you can identify, and there is no ninja knowledge required. So we definitely encourage other people to do more research in this space. To put a number behind this: the two of us worked part time on this for three to six months, depending on how you count. After three months, I think, we pretty much understood most of what we talked about, and then we were slacking off, thinking that the exploit was going to be a quick thing. Unfortunately it took longer, as always, so we ended up at six months in total. We definitely think that there's still a lot of room for future research. I talked about the escalation; there are definitely a lot of things that you can do there. But what's also very interesting is target identification, because one thing that is going to be a problem, also at such contests, is: when you go there, what's the actual firmware version? It's not like
you have one operating system that is just running the latest version of Adobe Reader. In the mobile space you have tons of models and tons of different firmware versions, depending on where the phone is actually coming from, and identifying that over the air and then picking the right payload can be quite tricky. So having future research in that area would be quite useful. And as Daniel mentioned, the tooling is not up yet, so give us some time. We are not one of those people who say they're going to release everything and then you never hear anything back; we just haven't had the time to put this stuff on GitHub yet because of our travel. We will definitely release all the IDA plugins and tools that we used throughout this research, and hopefully people can build on that. And with that, I'm actually surprised we're well on time, and we have
time for questions as well. So, the question was about the network side: he was asking if we have any idea of the attack surface on the network side. This is also an interesting thing: if you think about it, you could use your phone to exploit the carrier, and that's definitely pretty interesting. I would honestly expect that it's equally bad as what we have seen in baseband research over the years. Maybe it's even worse, because nobody actually has a chance to look at this, and I'm not sure if we will ever see any of this, because acquiring carrier equipment is not that easy. OK, so you're talking about a real end-to-end attack that doesn't require a base station. That's a good question, I think, because a lot of the fear-mongering that comes with baseband somehow originates from that angle. But I think it's fair to say that most of the attack surface is actually not in the end-to-end parts of the code. People have been looking at SMS and things like that, and you find bugs there, but you are way less flexible when it comes to staging payloads, for example by sending another message. So I think there are bugs there, and historically there have also been some remote memory corruptions there, but I think that's not the vast majority of these. And for more context: what makes this very difficult is that it is hard to get access to what the networks are actually doing in order to research this. So I have basically no idea what kind of filtering this operator, or the second or third operator, applies. So even if you find some issue in some layer of the stack, and to talk about information elements again, there are some of them which are
actually end-to-end on paper, so it would be possible that an IE you encode on your own device survives the entire network. But it's a huge question mark, because you don't really know what the carriers do, so testing that out effectively is really hard. The other part of it is: it's an interesting target, but we have seen last year with Stagefright, and with some other examples, that if you think about end-to-end, going higher up the stack to the application layer gives you a much, much richer attack surface. So I would say end-to-end exploitation of mobile phones using nothing but a phone number is definitely possible, but if I had to do that, I'm not sure the baseband is the first thing you would think of. Any other questions? All right,
great, thanks.