Video in TIB AV-Portal: avatar²

Formal Metadata

Towards an open source binary firmware analysis framework
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Avatar² is an open source framework for dynamic instrumentation and analysis of binary firmware, which was released in June 2017. This talk does not only introduce avatar², but also focuses on the motivation and challenges for such a tool.
Keywords Security

Marius Muench will now talk about avatar². All right, thanks for the introduction. As stated, I'm Marius, and I am here today to talk about avatar², which I develop as part of my PhD studies at EURECOM. And if I say avatar, I don't refer to the movie by James Cameron; I refer to the avatar² framework. Let's see
how I try to communicate with you about the framework. First, I want to tell you a little bit about binary firmware analysis in general, and then shortly discuss the tooling landscape to see what other people have done and are doing. Then I will introduce the high-level concepts of the avatar² framework itself, and in the end I'm going to give you a couple of examples to show how the tool can be used and is used by us. So let's look first
at binary firmware analysis. Why
are we interested in analyzing firmware? Well, as we know, the number of embedded devices is steadily increasing day by day. Buzzwords like Internet of Things and so on are found everywhere, and these are just interconnected embedded devices. Misconfigurations, bugs, and vulnerabilities are common on those devices, and I would say that the majority of the vulnerabilities reported so far for those devices are mainly misconfigurations or low-hanging fruit, like disclosed private SSH keys, a misconfigured web server, or just simple bugs in the web server itself. However, we hope that in the future this is going to change and vendors may secure their software more, and then we would need to actually hunt for the more complex bugs which are also still in the firmware. When we want to find more complex bugs, we need more sophisticated tooling to succeed. Of course, we can sit down and reverse engineer for a long time, but at some point tooling would greatly benefit us. However,
there are, especially compared to desktop systems, a lot of challenges present for firmware analysis. First of all, there is a variety of platforms: a variety of different boards and systems-on-chip, which all come with their own memory layout and their own peripherals, which may be mapped at certain addresses and may behave completely differently on other devices. Furthermore, there is often no operating-system-level abstraction. Some of the firmware you can find is based on Linux; however, there is also a lot of monolithic firmware around which uses no kernel at all or has some small, tiny kernel for embedded systems. In both cases, the hardware interactions will be embedded in the firmware code itself and not be part of the kernel. This is actually a problem, because when the firmware accesses hardware, whether via memory-mapped input/output or by receiving interrupts from the hardware, for instance when new data is available on the bus, we need to somehow reflect this in our analysis and tooling. On top of that, there is a variety of architectures: not only a lot of platforms, but also a lot of vendors and architectures are around. While on desktop systems we mainly have x86 and x86-64, on embedded devices we can have all the different architectures, from ARM, MIPS, and PowerPC to others. Just to give you one example, please don't attempt to read the next slide.
This is just a list of the microarchitectures defined by ARM, and this is just ARM itself, not the third-party vendors which are making systems-on-chip based on ARM. The different microarchitectures all have tiny differences in the architecture, which are quite challenging to grasp in a generic tool.
And there are even more challenges which we are facing in comparison to desktop systems. Binary analysis on desktop systems normally makes great use of instrumentation: you instrument the software under test so that an analysis tool, AddressSanitizer for instance, can check during runtime that everything is going fine. On embedded devices this is challenging for two reasons: once again the missing abstraction of an operating system, and furthermore, quite often the code only resides inside the read-only memory of an embedded device. What does this mean? With read-only memory we would need to reflash the device to change its contents; on the other side, the firmware might be encrypted or signed by the vendor, so instrumentation is harder than on desktop systems. Likewise, emulation is challenging. While on desktop systems the abstraction of all hardware interactions is handled by the kernel, which can be easily abstracted by an emulator, we don't have these comprehensive possibilities for embedded devices. The reason is that there are a lot of peripherals around which interact differently with the hardware, and for an emulator, being able to emulate all of the underlying hardware would require a lot of implementation effort. Likewise fault detection: when we, for instance, fuzz test on desktop systems, we most of the time rely on observable crashes, like segmentation faults or the error handling of glibc for heap corruptions and so on, so we get a noticeable output when we corrupt memory. On firmware that's different. Firstly, even Linux-based embedded devices are most of the time not utilizing glibc, so heap protections are way weaker, if present at all, and some devices may not even contain a memory management unit, which in the first place enables the notion of segfaults or invalid memory accesses. So in this
case, the firmware might just continue to execute, even with a corrupted program state. Another big issue is interrupt handling, because a lot of firmware is basically designed in a way that it runs continuously inside a single main loop and just checks memory contents. Those memory contents are updated by interrupts, and the interrupts will drive the execution path of the main loop. So if we go with static analysis, we would need to define where those interrupts are triggered. Furthermore, as we saw before, there are a lot of different microarchitectures around, and microarchitectures have a lot of small, tiny changes and instructions only present on that microarchitecture. I mean, for instance, coprocessor accesses on ARM cores, which vary from core to core or from microarchitecture to microarchitecture.
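To make the interrupt problem concrete, here is a minimal Python sketch of such a firmware structure. It is a toy model and not taken from the talk: `uart_isr`, `main_loop_step`, and the shared `state` dictionary are hypothetical names. It shows that the branch taken by the main loop depends entirely on whether an interrupt fired in between, which is exactly the information a static analysis would have to be given.

```python
# Toy model of interrupt-driven firmware (illustrative only; all names are
# hypothetical and do not come from a real firmware image).

state = {"rx_ready": False, "rx_byte": 0, "handled": []}


def uart_isr(byte):
    """Interrupt handler: runs asynchronously and only updates memory."""
    state["rx_byte"] = byte
    state["rx_ready"] = True


def main_loop_step():
    """One iteration of the firmware's single main loop."""
    if state["rx_ready"]:              # path taken only after an interrupt
        state["handled"].append(state["rx_byte"])
        state["rx_ready"] = False
        return "process"
    return "idle"                      # path taken when nothing happened


# Without interrupts, the loop never leaves the idle path.
trace_without_irq = [main_loop_step() for _ in range(3)]

# With an interrupt injected between iterations, a new path appears.
trace_with_irq = [main_loop_step()]
uart_isr(0x41)
trace_with_irq.append(main_loop_step())
trace_with_irq.append(main_loop_step())
```

An analysis that never models `uart_isr` would only ever observe the idle path, even though the processing path is where the interesting parsing code usually lives.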
So this showed a little bit of the challenges we have in the field of dynamic analysis. Let's look at the tooling landscape. Compared to desktop systems, the tooling landscape is, due to these challenges, way smaller, and especially smaller when only considering open-source tools. Furthermore, while state-of-the-art static analysis systems for desktop systems exist, they may exceed their bounds when being applied to embedded firmware, because they need to approximate the environment, which is not always possible in the embedded case, and it is also not possible to infer the behavior of peripherals and interrupts. In the following I will show you four open-source tools which aim to analyze firmware. Obviously this is not a comprehensive list, but it gives a glimpse of what has been done and what kinds of different approaches are out there. So let's start with
FIE. FIE is a symbolic execution engine for MSP430 firmware which is based on KLEE. KLEE is the main symbolic execution framework here, and it basically operates on the LLVM intermediate representation. In order to have FIE working, the analyst needs to specify explicit analysis, memory, and interrupt specifications. The analysis specification defines, among other things, the memory layout of the firmware under analysis. The memory specification specifies how memory should react when it is read from and written to. This is basically a way to abstract memory-mapped I/O, so that when particular memory cells are accessed, symbolic values or specific concrete values can be injected into the analysis. The interrupt specification, likewise, defines at which points interrupts can occur and which interrupt handler should be executed. FIE is great work which doesn't need the presence of a physical device, and it could successfully analyze firmware before. However, the tool requires the presence of the source code of the firmware, because that's basically the way KLEE works, and unfortunately source code is not that often available when we are analyzing firmware. So let's have a look at binary analysis tools. First, there is Firmadyne, which is a binary analysis framework based on QEMU. QEMU is a full-system emulator which also enables user-mode emulation of single processes; in this context, however, QEMU is used as a full-system emulator, and it brings a lot of architectures which can be emulated and, additionally, a lot of hardware board layouts. Firmadyne targets ARM and MIPS firmware and specifically uses an instrumented Linux kernel. So basically it takes extracted Linux-based firmware, puts it inside the QEMU emulator, and runs it with its own instrumented kernel. This kernel allows automated analysis via plugins, for instance the analysis of
web pages and of protocol implementations. Additionally interesting is that this framework has capabilities to automatically throw known exploits, mainly from Metasploit, against the emulated firmware. Quite interestingly, a lot of exploits written for one device can be propagated to other devices, which basically means that there is also a huge code base shared among different kinds of embedded devices. Unfortunately, the downside here is that it only works for Linux-based firmware, and only if there are no too-specific kernel modules around: if an embedded device needs to do hardware interactions with very specific hardware peripherals, this will most likely be done via specific kernel modules, and if those can't be emulated, Firmadyne fails to succeed. Another interesting project which was
released this year is LuaQEMU, which is also, as the name kind of suggests, based on QEMU. It is considered a work in progress, and the example released together with the tool was targeting the firmware of the BCM4358 chips; these are Wi-Fi chips used, for instance, in a lot of smartphones. LuaQEMU enables the rapid prototyping of custom hardware platforms and boards and also adds instrumentation capabilities based on Lua. Unfortunately, as this only emulates the firmware alone, and there is a lot of hardware interaction going on, especially during initialization functions, this requires a lot of modeling or trial and error from the analyst to prune out execution paths which are not relevant for the analysis.
So the last tool I want to talk about is Avatar, the first one. As some of you may have thought, if I'm talking about avatar², there must have been a first Avatar. This tool was based on S2E, which basically is again a combination of QEMU and KLEE, and which allows the symbolic execution of the emulated firmware. Additionally, Avatar utilizes OpenOCD and GDB and allows partial emulation of ARM firmware. Partial emulation basically means that the firmware itself, or parts of the firmware, are run inside the emulator, and specific hardware requests, like memory-mapped I/O, are forwarded to the actual physical device via the connection of OpenOCD and GDB. Additionally, Avatar provided ways for orchestration, so that you can, for instance, start executing on the device, then transfer the important state, so the important memory layout and registers, into the emulator, and continue execution inside the emulator, inside S2E. This allows you to neatly skip all the initialization functions which are not interesting for the analysis. Additionally, and quite obviously, as S2E is using KLEE, it also brings symbolic execution for firmware. Unfortunately, Avatar was heavily tied to the S2E infrastructure, and it requires, in every setup, the presence of a physical device to succeed with the partial emulation. So what did we learn
from looking at those four tools? First of all, there is a lot of focus on the ARM architecture. Then, the majority of tools are utilizing QEMU's emulation capabilities as a basic block for building up the framework. Unfortunately, the resulting frameworks are then heavily bound to QEMU, so they don't define any ways to get the analysis state of the emulator into another tool. This missing way of transferring the state of an analysis is at the same time a little bit of the motivation for the avatar² framework. So in a
very big picture, it's a framework for dynamic multi-target orchestration and instrumentation; we will see what this means in more detail later on. The focus of avatar² is on firmware analysis, and the whole thing is an open-source, Python-based framework which we released in June of this year, so it's quite new. It's a research project, so we try to have a clean and usable code base, but sometimes something may be a bit fragile. In comparison to the first Avatar, the tool was redesigned and reimplemented from scratch, especially to focus on better usability and a better abstraction of targets.
It is developed by the Software and System Security group at EURECOM; specifically, next to me, the main contributors are Dario Nisi, Aurélien Francillon, and Davide Balzarotti. So what were the main goals when we
designed and started to write avatar²? We wanted to have the possibilities of target orchestration, separation of execution and memory, and state transfer and synchronization capabilities. Target orchestration means that we orchestrate different kinds of frameworks via abstractions inside Python. These so-called targets could be anything: debuggers, emulators, other frameworks; and we want to be able to easily add new targets to the avatar² ecosystem. Furthermore, we wanted the separation between execution and memory, because this is basically the core concept, or the main requirement, to allow forwarding of remote memory, so that the analysis runs inside one target and operates with the memory of another target. State transfer and synchronization are important to us because once we start the analysis on one specific target, we don't want to keep the analysis local to that target; we may want, at a later point in time, to switch the execution, for instance from an embedded device to an emulator, and for doing so we need an easy way to transfer the state. So what we came up with
in the end was a framework which basically consists of four components. There is the avatar² core, which is a Python library and the main interface from the analyst to the analysis inside the framework. Then there are the so-called targets, which are the Python abstractions of so-called endpoints, and endpoints are all the things you want to run your analysis on: emulators, frameworks, debuggers. However, targets and endpoints are not talking directly to each other; they are interconnected by an additional layer of so-called protocols, which we can also see
here on this picture, where we have the avatar² object, which defines and orchestrates a set of targets, which all talk via an execution protocol, a memory protocol, and a register protocol to the distinct endpoints. So the question is, why did we add the abstraction of protocols? The idea is quite simple: a lot of tools actually have similar ways to communicate. For instance, both QEMU and OpenOCD offer a GDB stub to talk to the analysis framework or to the software under analysis. And by separating the protocols by purpose, like execution and memory, we allow a clean separation of those different concepts during the execution.
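The separation of memory from execution can be pictured with a small Python sketch. This is a simplified illustration under stated assumptions, not the avatar² API: `DictMemory`, `ForwardingMemory`, `Registers`, and `Target` are hypothetical classes, the execution protocol is left out for brevity, and the endpoints are plain dictionaries instead of a real emulator or device.

```python
# Illustrative sketch only: these classes are hypothetical and are NOT the
# avatar² API. They mirror the idea of one protocol object per purpose.

class DictMemory:
    """Memory protocol backed by a plain dictionary (stand-in endpoint)."""

    def __init__(self):
        self.cells = {}

    def read(self, address):
        return self.cells.get(address, 0)

    def write(self, address, value):
        self.cells[address] = value


class ForwardingMemory:
    """Memory protocol that forwards one address range to another protocol."""

    def __init__(self, local, remote, start, size):
        self.local, self.remote = local, remote
        self.start, self.end = start, start + size

    def _pick(self, address):
        return self.remote if self.start <= address < self.end else self.local

    def read(self, address):
        return self._pick(address).read(address)

    def write(self, address, value):
        self._pick(address).write(address, value)


class Registers:
    """Register protocol: a named register file."""

    def __init__(self):
        self.values = {"pc": 0}

    def read(self, name):
        return self.values[name]

    def write(self, name, value):
        self.values[name] = value


class Target:
    """A target is a bundle of protocols talking to some endpoint."""

    def __init__(self, name, memory, registers):
        self.name = name
        self.memory = memory
        self.registers = registers


device = Target("device", DictMemory(), Registers())
device.memory.write(0x40000000, 0xAB)   # pretend this is a peripheral register

emulator_ram = DictMemory()
emulator = Target(
    "emulator",
    ForwardingMemory(emulator_ram, device.memory, 0x40000000, 0x1000),
    Registers(),
)

emulator.memory.write(0x20000000, 1)           # stays in emulator RAM
mmio_value = emulator.memory.read(0x40000000)  # served by the device
```

Because the emulator target only sees a memory protocol, it does not need to know whether a given address is backed by its own RAM or by another target.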
So let's move on to the implemented targets, which could also be seen as a small collection of open-source mascots. On the top left we have the archerfish, which is actually the mascot of GDB. That is quite interesting, because this fish spits water to shoot down insects and bugs, so I think it's quite a matching mascot for GDB. On the bottom left we have QEMU, which is the full-system emulator we just talked about a little bit. On the top right there is the PANDA framework, which is a reverse engineering framework based on QEMU that aims to allow repeatable reverse engineering. It does so by basically recording all the non-deterministic I/O occurring to the software under emulation; later on, this non-deterministic I/O can just be replayed to the very same software from the same initialization state, which will result in the same execution. The advantage of doing so is that the memory footprint of a recording is way smaller than a full instruction or memory trace. Additionally, PANDA offers a plugin system which allows hooking different functions or different units inside QEMU to add further analyses. The last tool on this slide is the angr framework, support for which is as of now still under development and will be merged into the public branch soon. angr is basically a symbolic execution framework which provides a quite powerful symbolic execution engine. Sorry, but one
thing I forgot: we also support another target which is not represented on the slide, which is OpenOCD, a tool to talk to JTAG interfaces, which in turn can talk via the JTAG protocol to embedded devices. Just as a little bit of background knowledge: JTAG is debugging support present on some embedded devices, and if it is available, we can use OpenOCD to dynamically debug the firmware on the device.
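Once a device is reachable through a debug interface like this, the state transfer mentioned among the design goals amounts to reading registers and selected memory ranges from one target and writing them into another. The following Python sketch illustrates the idea only; `SimpleTarget` and `transfer_state` are hypothetical helpers and not functions of avatar² or OpenOCD, and the addresses are made up.

```python
# Illustrative sketch only; `SimpleTarget` and `transfer_state` are
# hypothetical helpers, not functions of avatar² or OpenOCD.

class SimpleTarget:
    """Stand-in for a target: a register file plus byte-addressed memory."""

    def __init__(self, name):
        self.name = name
        self.registers = {}
        self.memory = {}          # address -> byte value


def transfer_state(source, destination, ranges):
    """Copy all registers and the listed (start, size) memory ranges."""
    destination.registers = dict(source.registers)
    for start, size in ranges:
        for address in range(start, start + size):
            if address in source.memory:
                destination.memory[address] = source.memory[address]


device = SimpleTarget("device")
device.registers = {"pc": 0x08000120, "sp": 0x20001000}
device.memory = {0x20000000: 0x11, 0x20000001: 0x22, 0x40000000: 0x99}

emulator = SimpleTarget("emulator")
# Only RAM is synchronized; the peripheral at 0x40000000 stays on the device.
transfer_state(device, emulator, ranges=[(0x20000000, 0x10)])
```

Restricting the copy to explicitly listed ranges keeps peripheral registers out of the emulator, so accesses to them can still be forwarded to the device.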
And as we've seen before, a lot of tools are based on QEMU, so if we want to have them easily integrated into the avatar² ecosystem, we need to keep our changes more or less local. That's what we did: we changed QEMU to work with avatar², to forward state and memory and so on, and all of the changes are located in one single subfolder, which should make it straightforward to implement new QEMU-based targets for avatar². More specifically, the most notable change is the addition of a configurable machine, which is similar to the device-tree-based board description present in the Linux kernel. Here, the configuration of the hardware you want to emulate is defined in a JSON file, which is automatically generated by avatar² based on the specifications listed in Python, and that allows a general and flexible configuration of the different hardware you may want to emulate. Additionally, we added a new peripheral, the avatar peripheral, which communicates with avatar² via POSIX message queues and basically enables the remote memory forwarding. The idea is that if there is some peripheral with memory-mapped I/O which you can't emulate, you use the avatar peripheral, which will then forward memory reads and writes to the avatar² framework, which will dispatch them, for instance, to the physical device. And a couple of other
features I want to highlight the about the framework is that we aim to design it's architecture-independent this basically means that we have a subfolder inside the framework would just use was architecture and abstractions so that the framework that can work with thoughts uh architecture abstractions open to any analyzes it at the ends of now we have abstractions for on its 86 and expected verdict 66 64 more and we are currently developing another 1 from I wanted to use as an internal memory labeled representation so just to lay and not to memory commons itself in order to be able to push it to different targets or to compute the will of the chase 5 needed to for the key more comfortable machines furthermore also modelling of peripheral routes directly implies and so you can move on and scriptio peripheral directly implies and if you know how it has to behave or if you just want to have something which statically returned to same were used because you don't care about the specific peripheral additionally that we want to keep the outer to call a small and and maintainable as possible but on the other side there a lot of tasks which are which have to be repeated during an analyzes of for instance the we want to assemble or disassembled instructions soul in order to enable it we added flexible plugin systems which also has already a couple of example plug-ins for instance in the orchestration plugin which automatically orchestrates the execution of targets in the normal way you would write another task that you explicitly defined which country when you did what when you do what while in the orchestration setting you just define a set of transitions and Alydar to were automatically and change in the state around according to a defined transitions likewise there's a instruction for water which basically that the aims to deal with those on emulated instructions so small and micro architecture dependence structures so once our time count as 1 of those instructions it 
would not executed inside and you later part of one's embedded device so that's the state changes at least they're accordingly so
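The forwarding idea behind the avatar peripheral can be sketched with a toy memory bus. This is a simplified model and not the actual QEMU peripheral, which talks to avatar² over message queues; all class names below (`FakeDevice`, `LocalRange`, `ForwardedRange`, `EmulatedBus`) and all addresses are hypothetical. The point is only that accesses to ranges the emulator cannot model are passed on to something that can answer them.

```python
class FakeDevice:
    """Stand-in for a physical board reached through a debug link."""

    def __init__(self):
        self.registers = {0x40000000: 0x1}  # pretend status register

    def read(self, address):
        return self.registers.get(address, 0)

    def write(self, address, value):
        self.registers[address] = value


class LocalRange:
    """Memory that the emulator can back on its own."""

    def __init__(self, start, size):
        self.start, self.size = start, size
        self.cells = {}

    def read(self, address):
        return self.cells.get(address, 0)

    def write(self, address, value):
        self.cells[address] = value


class ForwardedRange:
    """Memory-mapped I/O that is passed on to the device."""

    def __init__(self, start, size, device):
        self.start, self.size = start, size
        self.device = device

    def read(self, address):
        return self.device.read(address)

    def write(self, address, value):
        self.device.write(address, value)


class EmulatedBus:
    """Dispatches each access to the range that contains the address."""

    def __init__(self, ranges):
        self.ranges = ranges

    def _lookup(self, address):
        for r in self.ranges:
            if r.start <= address < r.start + r.size:
                return r
        raise ValueError("unmapped address 0x%x" % address)

    def read(self, address):
        return self._lookup(address).read(address)

    def write(self, address, value):
        self._lookup(address).write(address, value)


device = FakeDevice()
bus = EmulatedBus([
    LocalRange(0x20000000, 0x14000),
    ForwardedRange(0x40000000, 0x1000, device),
])
bus.write(0x20000000, 0xAA)    # stays inside the emulator
bus.write(0x40000004, 0x7)     # ends up on the (fake) device
status = bus.read(0x40000000)  # answered by the (fake) device
```

In the real setup every forwarded access costs a round trip to the hardware, which is why the speed of memory interactions with the physical device matters so much.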
After this kind of high-level talk about the framework, let's come directly to the examples. In the following I will show two use cases of how to use avatar² as a dynamic instrumentation framework and three of how to use it as a dynamic orchestration framework. If you want to write an avatar² script, you normally need to do three or four things. First, you need to create the main avatar object. Then you need to define the set of targets you want to deal with in your analysis. Optionally, if you have more than one target or a QEMU-based target, you need to define a memory layout. And last but not least, you need to specify an execution plan.
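The memory layout step, and the JSON description that is derived from it for the configurable machine, can be pictured with a short sketch. The field names (`cpu_model`, `memory_mapping`, `forwarded`) and the address values are assumptions chosen for illustration; they are not the schema that avatar² actually writes.

```python
import json


def build_machine_description(cpu_model, ranges):
    """Check a memory layout for overlaps and serialize it to JSON."""
    ordered = sorted(ranges, key=lambda r: r["address"])
    for lower, upper in zip(ordered, ordered[1:]):
        if lower["address"] + lower["size"] > upper["address"]:
            raise ValueError("ranges %s and %s overlap"
                             % (lower["name"], upper["name"]))
    description = {"cpu_model": cpu_model, "memory_mapping": ordered}
    return json.dumps(description, indent=2)


# Hypothetical layout: read-only firmware, RAM, and one forwarded MMIO range.
layout = [
    {"name": "mmio", "address": 0x40000000, "size": 0x1000, "forwarded": True},
    {"name": "rom", "address": 0x08000000, "size": 0x80000, "forwarded": False},
    {"name": "ram", "address": 0x20000000, "size": 0x14000, "forwarded": False},
]
machine_json = build_machine_description("cortex-m3", layout)
```

Keeping only the layout in this representation, and not the memory contents, is what makes it cheap to hand the same description to several targets.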
So let's start with a simple demo, which basically is a demonstration of hello world. We have here on the left, and I hope the font is big enough for everyone to read it, an executable file a.out and the Python script hello_world.py, which we see here on the right. If we execute a.out unaltered, nothing happens; it just exits with a return code of 42. On the right side we have the full analysis, or rather the full instrumentation, in one place. As step zero, we create the avatar object and define the architecture for this analysis. We add the concrete target, which in this case is the GDB target. Then we not only add the target but also spawn a process which runs the gdbserver that our avatar target connects to, and which is basically just executing a.out. We initialize the GDB target, which will connect to the gdbserver, and that's all the initialization. Down here we have some shellcode which we want to inject into the target. This shellcode is basically just shellcode for a simple hello world output on stdout, so basically a syscall. And here is the interesting part of the framework: we instrument GDB from the outside to write memory at the current location of the instruction pointer. The memory we write has the length of our shellcode, its content is our shellcode, and it is written as raw memory. After we wrote it, we want to continue our execution. So let's see if the demo gods are with us. And here we go: we have hello world as an output.
While this is just a very simple demo, it directly demonstrates the instrumentation capabilities of avatar², and what I especially like is the possibility to script GDB from the outside, without having to execute your Python script from inside GDB. So, as you can see here on the right side, the full analysis you're doing is centralized in one place.
Let's continue with binary instrumentation on a real target. As a real target we chose Harvey, which is a PLC rootkit that was presented last year at NDSS, and it basically works based on code injection on a normal commercial off-the-shelf PLC. The PLC itself is somewhere on the board; we can have a look
if everything works and we can see it. Here we go.
So the demo here involves a PLC, which we can power-cycle very shortly. OK, sorry for that. This demo is very fragile. What you basically see is the PLC starting to boot. We see several boards: on the side we have a human-machine interface board, which basically deals with all interactions with the exterior world, like the SD card, the network interface and the USB interface, and on the top here we have the I/O board of this programmable logic controller, so here you can connect the different I/Os. Right now the PLC has booted and everything is fine; it has no I/Os detected, and the status LEDs for the different I/Os are off. What's special here is that on the I/O board, which we can see here, there is a little Cortex-M3 MCU, which is mainly responsible for dealing with updates of the exterior, so updates of the GPIO state. This Cortex-M3 interestingly also has an enabled JTAG debug port, so we can easily solder some pins, and we have here on the side our JTAG interface connected to those pins, which leads us to the demo.
This device is particularly interesting because some parts of the firmware reside inside SRAM: the board initializes, and then the firmware is loaded into the RAM. This basically means we can instrument those parts of the firmware, which we also did
by reimplementing the proof-of-concept implementation of Harvey. Here we basically do the same as before: create the avatar object, load the assembler plugin, add an OpenOCD target, and set a breakpoint at the main loop, because we want to skip all the initialization functions. We continue execution until we actually hit this breakpoint, which is done by the wait, and once we're there we inject some assembly code. This assembly code is rather simple. It is just a proof-of-concept implementation of Harvey, not the full implementation, but it already shows that we can tamper with the execution so that the PLC's human-machine interface board thinks certain inputs are enabled. We do so by hooking an interrupt handler which is executed frequently to check the status of the I/O board, and we modify the state manually. So let's see if this works. I think I forgot to mention that we try to have both Python 2 and Python 3 compatible code, so let's use Python 3 in this example. And here we go. OK, the camera is off, so I cannot show you the board on video. The LEDs start to blink, which
is basically symbolizing
that the input is present, although clearly no input is connected to this PLC. I added a picture of the LEDs just to be safe, so in case the demo doesn't work we can see it. Let's move on
to the next example, which aims to improve fault detection on embedded devices. This work is part of the paper "What You Corrupt Is Not What You Crash" by our research group, which will be presented at NDSS next year, and it's joint work with Siemens. In parallel to this talk we uploaded it, so you can go and check out the paper if you're interested in more detail. What I'm going to say here is the paper in a nutshell. It investigates the challenges specific to fuzz testing embedded devices, which are on the one hand fault detection and instrumentation, which we already talked about. One additional problem is scalability: fuzz testing greatly benefits from the possibility of running multiple instances of the same software under test in parallel. In the embedded case this would mean that for fuzz testing you traditionally just need a lot of embedded devices. Furthermore, in the paper we evaluate different strategies to improve fuzz testing of embedded devices, for instance re-hosting, static instrumentation or binary rewriting, and in the end we try to give some direction by utilizing partial and full emulation of firmware using the avatar² framework.
For the paper, the setup has two targets. On the one hand there is the STM32L152 Nucleo development board from STMicroelectronics, which has nice features like directly having the JTAG interface embedded and even providing serial access to it over USB. On the other hand, the target we're using is PANDA, the reverse engineering framework. As software for our tests we used Expat, or rather an instrumented version of Expat with artificial vulnerabilities. The analysis itself is constrained in the sense that the initialization of the device is run on the physical board, and the emulation of the main loop, the main part of the firmware, is done inside PANDA. For the analysis we wrote plugins which check and verify the state of the firmware during emulation by mimicking already existing techniques which are used for analyzing desktop software. For instance, we have something which is similar to a shadow stack implementation, and something which tries to check the consistency of the heap by tracking malloc'ed, freed and realloc'ed objects. The big advantage of these approaches is that there's no need to modify the firmware.
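The shadow-stack style check mentioned above can be sketched as a pass over a call/return trace. This is a minimal illustration and not one of the plugins from the paper: the trace format and the function name `find_return_mismatch` are assumptions, and a real check would hook calls and returns inside the emulator instead of reading a prepared list.

```python
def find_return_mismatch(trace):
    """Return the index of the first return that does not match its call.

    `trace` is a list of ("call", return_address) and ("ret", target) events.
    """
    shadow_stack = []
    for index, (kind, address) in enumerate(trace):
        if kind == "call":
            shadow_stack.append(address)
        elif kind == "ret":
            # A return must go back to the most recently saved address.
            if not shadow_stack or shadow_stack.pop() != address:
                return index
    return None


clean_trace = [("call", 0x8000104), ("call", 0x8000208),
               ("ret", 0x8000208), ("ret", 0x8000104)]
corrupted_trace = [("call", 0x8000104), ("call", 0x8000208),
                   ("ret", 0x8000208), ("ret", 0x41414141)]
```

Because the check only observes the execution from the emulator side, the firmware itself stays untouched, which is the advantage pointed out in the talk.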
For evaluating those, we ran a hundred fuzzing sessions of one hour each in several different setups: first natively on the device, then we used partial emulation with forwarding of I/O to the board, we used partial emulation where an avatar peripheral handled the I/O instead, and we utilized full emulation with PANDA. With these we could show that we can detect previously undetected faults, and quite interestingly, the full emulation provided better performance than native fuzzing, due to the fact that even inside an emulator the clock speed of the emulated firmware is higher than on the actual device. And now, the
next demo is actually a subset of this work, and it shows the record and replay features which we have when using PANDA. This is especially cool because normally, if you want to dynamically analyze an embedded device, you need the device physically present. However, by utilizing PANDA we can record one execution and replay it later inside the emulator, without the need of having the device present. So let's look this
up in the demo. First of all, so that you
believe me that the software running on this device is actually Expat, as I stated, we're just looking
at the serial port and writing some XML to it as input. Here we have the XML file itself, and the firmware just reports the result for this
document. So fine, this works so far. So let's look at an
avatar² script for the recording. This is a little bit bigger than the other scripts which we saw before. We define two targets, OpenOCD and PANDA, and we add different memory ranges: one for the read-only memory with the firmware, one for the RAM, in 4K pages, and several for memory-mapped I/O, whereby we want to emulate the serial interface with an avatar peripheral. Down here is an example of how to use the orchestration plugin: basically we define a start target, add a transition and start the orchestration. This orchestration will automatically transfer the state from the Nucleo board to PANDA once this specific address is hit during execution, and it will synchronize the RAM range. Once we're there, we begin the recording and drop into an IPython shell for dynamic analysis while the execution is emulated. All we need to specify is a name for the trace. And the demo gods are not with us this time: the GDB protocol was unable to connect. I don't have time to debug that right now; however, trust me, this works. On top of that, I already prepared some recordings before, which are impressively unspectacular. So we also have replay.py, which just executes PANDA
with a configurable machine, with the configuration that was automatically generated by avatar². Let's see if at least the replay of the previously recorded execution works. And here we have it: replay completed successfully, and a lot of debug output about the configurable machine, the number of
executed instructions and the
number of replayed non-deterministic I/Os. OK.
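The orchestration setting used in this demo, where you declare a start target and a transition instead of scripting every step, can be pictured with a toy model. The classes below are hypothetical and are not the orchestration plugin's API: the targets do not execute code, and "synchronizing" only copies a register dictionary. The sketch just shows how a declared transition drives the hand-over of state from one target to the next.

```python
class ToyTarget:
    """Minimal target: a name plus a register file."""

    def __init__(self, name, registers=None):
        self.name = name
        self.registers = dict(registers or {})

    def run_until(self, address):
        # A real target would execute code; here we only move the pc.
        self.registers["pc"] = address


class ToyOrchestrator:
    """Runs targets in turn and hands state over at the given addresses."""

    def __init__(self, start_target):
        self.start_target = start_target
        self.transitions = {}  # source target name -> (address, destination)
        self.history = []

    def add_transition(self, address, source, destination):
        self.transitions[source.name] = (address, destination)

    def run(self, max_transitions=16):
        current = self.start_target
        for _ in range(max_transitions):
            if current.name not in self.transitions:
                break
            address, destination = self.transitions[current.name]
            current.run_until(address)
            destination.registers = dict(current.registers)  # sync state
            self.history.append((current.name, destination.name, address))
            current = destination
        return current


board = ToyTarget("board", {"pc": 0x8000000, "sp": 0x20014000})
emulator = ToyTarget("emulator")
orchestrator = ToyOrchestrator(start_target=board)
orchestrator.add_transition(0x8005104, board, emulator)
final_target = orchestrator.run()
```

In the demo the same pattern moves execution from the physical board into PANDA at a chosen address, with the RAM range synchronized as part of the hand-over.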
Let's move on. The last example I want to show you is some work in progress, where we basically want to leverage symbolic execution for complex software using avatar². For this we inserted an artificial bug, for testing purposes, inside Firefox and executed Firefox concretely inside GDB up to the function of interest. This is particularly interesting because angr on its own would hardly be able to run software as complex as Firefox up to this point to create the required state. We analyze only one thread, and once we hit the interesting function, we automatically extract the memory layout from GDB, so not the memory contents, and copy just the layout into angr. The memory contents themselves are copied on read: if angr accesses memory, avatar² copies it from GDB to angr. The reason for that is that angr associates a lot of meta information with the data, and if we dumped the full memory contents into angr, this would exceed the amount of RAM present on this machine. Furthermore, we symbolize the function arguments and start our symbolic exploration. Some of our preliminary results: we had approximately 10 minutes of runtime for the script, for executing just 6 basic blocks and accessing 21 unique pages, and we found the bug. So let's
recap the examples. We saw five examples: dynamic instrumentation of GDB, dynamic instrumentation of a PLC, fault detection on a development board together with PANDA, then not the recording but at least the replay in the board and PANDA setting, and very briefly symbolic execution with Firefox. It should be noted that some of those examples are already available as open source, and the ones which are not will most likely be made available within the next months. So let's wrap it up.
Dynamic firmware analysis is still a very challenging topic, and I don't claim that we have solved it completely. However, avatar² tackles some of the challenges and tries to improve the state of the art. Additionally, one interesting thing which we recognized is that multi-target orchestration, so the concept of having different emulators and frameworks interacting with each other during the same analysis, is a concept which is not limited to firmware only; desktop software analysis can benefit from it as well. We also made some plans for the next year. We basically want to move development mainly to GitHub; currently we keep a public and a private repo, which is a little bit sad, because keeping them in sync is a little bit harder than we want. We want to introduce proper versioning into avatar², and of course add more exciting targets to enable more exciting analyses. So if you are interested in helping us, or just want to know more or have some questions, feel free to contact us on IRC or just talk directly to me. And just one more shout-out: we may be looking for people to join our group in the future. As I'm running out of time, I'll just show the acknowledgments shortly, and I guess we can move over to the Q&A. Thank you.
So, does the framework also support more complex x86 systems?
The framework itself is not executing any firmware or software itself; instead, it uses underlying tools and targets to execute software. So if you execute it concretely on QEMU or GDB, as you mentioned, what you probably need to do is just to extend the register definitions inside the architecture abstractions.
The next question goes to the microphone.
I hadn't heard about PANDA before, but as I understand it, it can record and replay executions, and that includes executions on, probably, the real hardware, or possibly QEMU. Could you use it for debugging as well, like reversible debugging, where you step in the code and then jump back to a certain point of the recording?
Actually, yes. The original purpose of PANDA is reverse engineering of software executed inside the emulator. I don't know how far the development status for stepping back is, but I think they're working on it. As a general way, while replaying you can always attach to the execution with GDB and start analyzing for reversing purposes.
Thank you.
Do we have additional questions? Then we go to the microphone.
Yeah, thanks. First of all, I have two questions. First, is PANDA released now, or is it still part of something that is going to be published in a paper?
No, PANDA is released as open source. We instrumented it and have a modified version inside the avatar² framework, but you can also get the original on GitHub.
OK. Second, I remember that avatar had a problem for a long time because it was slow, and I want to know what improvements avatar² brings regarding this.
Yeah, that's a
very excellent question, and we improved the speed quite a lot. Unfortunately I couldn't show the full flow with the recording of an execution in the demo, but let's say one of the main bottlenecks was memory interactions with the physical device, and we had some kind of benchmark: the transfer of the 4K pages which we needed for this example took, in the old avatar, something around 2 to 5 minutes, while here we are done in a few seconds. So it's a significant speedup, but still not fast enough to cope with real-time requirements.
OK, the next question goes to the microphone.
Embedded systems rather often have real-time components. Could you then, for example, hook FreeRTOS so that you analyze only the non-time-critical parts?
Yes, that should be possible. In general we have to investigate the real-time dependencies of embedded systems a little bit more, which was so far out of scope, but that's for sure one input we will look into in the future, and I think hooking only the non-real-time-critical parts may just work.
OK, a question from the internet.
Is there support for MIPS?
No, there's no support for now. We started implementing it and it gets improved a little bit on the side. If someone wants to step in and help with that, we are happy; I can tell what has to be done to enable the support. But in general we are working on it. Sorry, I can't tell a specific time when it's going to be ready.
OK, any further questions? [unintelligible]