Merken

M/o/Vfuscator-Be-Gone

Zitierlink des Filmsegments
Embed Code

Automatisierte Medienanalyse

Beta
Erkannte Entitäten
Sprachtranskript
the whole bed and I found you know
but in a a and and are
we ready to get started now with Julian and Clements the with the scatter begun go on alright thank you and so as many of you know this is the this a follow-up talk on the Moskita talk done by Chris Thomas last year so my 1st 3 question would be who attended that talk or solids recorded online the look at was quoted Mama ically quite of it's OK so moles of Titus recovering from so crushing RE nightmares which is quite the reference to last year's talk and as you can see this is progress of soldiers to work in progress of the Running a deal for skater and you would like to
stress the fact that this is not a personal attack on Chris Thomas and also we would like to anticipate that our team of cicadas dust resubstitution of those move instructions very carefully would the alright a little bit of about US so
who are we actually from the ah well maybe suggest players the play for a team from Germany called topics or and the invest a lot of time into solving crazy exploitation reverse-engineering and cryptography tasks from and there we're not playing CDF than well um trying to get a Ph.D. in the field of problem analysis low level program analysis and duplication and Clemens just finished his bachelor's thesis from yours at the doubtful honor of writing a thesis on office case so you probably know knows those who could very well and can you right another that later
OK I'm going to get a sense um I will try to column amorphous cater only at a high level such as you know how it works so how it is made up internally and so the story of modification has for we like this and there is the strange discipline amongst computer scientists to find Turing-completeness very wouldn't expect it and the original paper hand as far as by Stephen in 2013 there he summarized that's even if you remove all the instructions from the Intel x 86 instruction set but the move instruction then you still not using Turing completeness which is kind of surprising I would say and 2 years later from justified almost came and implemented a compiler actually who actually that that and that's what he's in the pre conscious of the 15 last year and he didn't waste so and hits the 1 version that he called a multi-scalar 1 which translated from brain fuck to x 86 move which is not sold surprising because no graphic is quite low level but he also spent a great amount of time to write a C to exit 6 move compiler which then this basically implemented as a which machine and the cover some of the internal workings right now so we have 2 2 references there if you are interested to read up on that so let's assume we have a program that looks like this so 5 basic modes that are likely happen need something to each other and Basic 3 assumes that there is a new and then what so many of you might know from my wear off from the industry standard obfuscators is something like this and color control flow flattening and they're the obfuscated inserts a dispatcher and that actually then does the basic blocks getting for you so on the left hand version of the the them how many possible paths or there so the program because you can't anymore at what above skated after what basic op the and confiscation takes the exact opposite of a and create something like this instead of flattening its own that does something we call linearization which just lines all basic blocks into 1 linear block that loops infinity the so how is that implemented internally well and there is that setting is is a news indexation stop which some basically registers to is again handlers um 1 for seeking to which basically is a trick that the program is its own city handler and that makes that's basic block for in order to jump to basic blocks euro in that image that does not have to trigger any explicit time but could instead just and trigger a you uh in Domus case it's a move of a segment register into a general-purpose register uh well to start institutional or again and of course term those signals have to nest the and the other 1 is then of course you don't want to lose your ability to talk to the operating system on which the slide he registers another uh signal handler for exact forward in which basically John Stuart dispatcher which done could then cause the external library functions so and it does that by triggering Segfault so encounters of mothers cation that dereferencing 0 means developed to elaborate it OK to best understand how this is possible at all of them we have a little acquired Willis higher levels the program here and I know that's lexical don't like each other very much that's going it but this program should do is and it calculates the vectorial of the number that is passed in our in our view 1 and if you compliance that program as part of the left and you would get a control flow graph as depicted on the right so how is that done well 1 observation is that each inspection of each basic blocks Brock obviously music he executed each time it is past right so that whenever dead loop uh we triggers the have you'd all inspections again and the problem is that's not all aspects of sorts manipulate our state so the main trick of modification is actually whenever you introduce a variable you actually introduce 2 variables so you introduce a airing was 2 entries and the 1st entry of that error rate is state patient with trash data and the 2nd entry it and its 1 is the real data you're interested and if you follow that idea and then you can program a program like you can see in that to wire loop from line 12 to 24 which basically well and takes a state variable a global states and evaluates that states whether it is uh well in our case if state 0 state 1 and if it's true and then a dusty assignment on the left hand side of the eye and 1 Dickinson and 13 them to the real data and it is not an it's done this understand from scratch data and well it is apply that's iteratively then you get that program calculating the Victorian quite interesting the it some in context of muftis cation and this means that and Monera talk about something prefix sale we mean actually n-ary holding to uh variables and 1 of those variables the scratch data so the basic block in cation is then always starting from the state sequence that's takes that target register which basically it is it equivalent enter the states but it's all here and in texts that this x for example take a look at their basic blocks 3 that of the current basic locus of the basic plot that should be executed and if it is so then that on variable is set to true and the transformed the semantics of the basic blocks 3 written move codes um are executed and at the end of the basic block and there is a sequence that the basic block texts that uh in the next basic blocks the receptor they got 3 or 4 remember his proxy was the 1 that you In the beginning and and then it just goes on no 1 thing to remark is that arithmetic doing only move this quite tricky and dumb implemented this using lookup tables so and you can imagine and doing arithmetic for each and every possible insects and occult process or inspection can blow up the binary quite a bit and if you look at some statistics here for instance on the right hand side we have a vanilla we have the new the compiled programs and on the left hand side is the is getting and so we can see that problem sizes increased dramatically by a factor of up to well 2000 almost so nodes this collide with megabytes and also execution times because that means has to be executed all at all and only on all over again is of course also quite all full and so on if you compare the shot to 256 example which basically just hash 10 thousand bytes together and took almost well effect that 4 thousand no times longer than the original version so our thought so that it was that's quite true but it's also very slow right so who would ever in any product of raise use that's obfuscation techniques and and of course you forgot about the city of
organizers so should of course it didn't take long until well tens of appears to suggest there people thought it was a nice idea to torture density of participants and throw a mosquito binaries at them and at that point he said OK no this has to change and
if we do that we do that right and so and there's too easy solutions that we explicitly rejected number 1 and the brain function must cater well is easy because there is a one-to-one relationship from and Bradford instructions to emit it's um go move instructions and you could just statically resubmitted theories we substituted and you on same goes for the C version bottom that's quite cool of the original implementation of the mother's skater also has a hardening . peopýe script uh lying next to it and where does this fit in well prevents exactly doing those pattern-based pattern that can based approaches so on it's dust register renaming it does inspection reordering which offers clients some bits some so we can't do any pattern based approach and more and they're really forced to look at the semantics and how we do that is
now covered by claimants but so 1st of all when we assign them of his bed the Moscow we had to think about what world golds because I'm not analyzing them off binary as you song can be quite tricky quite difficult even and especially if you have to take into account that that extra hot in gets applied so you can't might you just so that you can't say esthetically was translated back so but we set down and thought about what what would be the main priority and came up with them staring at the wall of the instructions for the better in the morning and that would help a lot if he actually got the control flow back and that was all main objective apart from that and we wanted to reduce the most like what's and more tables what operations so and that is in the end you loaded into I don't you know exactly what's going on even though um you have quite the amount of instructions to you can figure out what's happening and as a sort of priority or a small list of purity here there is replacing the lookup spots respective respect by instructions based on its success instructions 1 yet so that particles then to recover control we we do it in 4 stages and the and in every stage goes on top of the previous states a 4 stage we acquire some knowledge we need in the next section and in the in the beginning we don't have any knowledge about the program apart from that as much as headed and that it contains the initial decision so we analyze a set of of that we have not enough knowledge to retrieve all the labels then we can have look where the charms arm and where are they leaving and from all this information we can finally built to control for Cloud control flow graph again and while that's false alarm emperor he so let's get right onto to it how do the tawdry analysis set up test your mentioned the of the 1st in association sets up to a signal have lost and to to external calls and also to loop around the and it's a set up the 2nd well and the interesting part is here marked as 1 that was a part of domain you actually but it was also studied the time in the binary and also not prone to any hardening because is emitted only once and that's uh at the time were the object fast will um gets linked together alright before it so um you won't see any hot on the spot so positive it's quite simple and as you see the and it the on variable would be initialized here in the 1st statement as select our that place in it will be a sum to 1 and in it is a variable that is esthetically set to 1 in the image of the the binary and after that in this world is set to 0 that make sure that this basic will only be executed once in the beginning of the when the barriers 1st order and then it calls main indexes just this basically is just adding up the execution of the of the Basic Books so from here we know where it is so so long and also we know where is on because the model if you have to select we have the 1 on the fall to going on off on top of that we now can recover the labels clueless we run over the entire binary and Parsons scrap instructions using a capstone the and
the when and when I label costs because of the way
Chancellor and some of instruction itself cannot targeted instruction pointer so it needs some of those entry points into labels where the execution is set to 1 and they are they look like the costly and the 2 lines really you have the so on which is set to 1 under a special condition and we now have to the weather is the conditional and find the labels therefore and on the right side you have an example here we are using lookup tables and and 1 because there is some of 30 32 to a bit binary you can't have a a full 32 bit lookup table so it was split into multiple lookup tables which makes the separation quiet quite a long on quite a mass amount of time of instructions and we have to put them together that's the challenge and also in the beginning we don't know what the look up table resource so um instead of seeing the um equals across here at once in the and here you would just have addresses actresses in the tables and now we have to be the lookup tables that and what is the best thing impressed with way to beat lookup tables well it just use lookup tables and this version of
binary and Boolean for and by operations the XSL and um a specific element in the lookup table which is set this thing to this lookup tables so we make sure not to look up tables have the same value them and for Boolean operations we calculate a score based on the shores from unary operations of the hash the tables which words what could 0 yeah know we have figured out what what's look up there was the to which addresses and now we can translate so that an although now
we have feels a stream of instruction which is exactly what the social before you might can identify the equality a look ups and and not well again so we have the and and the worst part is we still have to check that is your computer how uh um to work all they all that the instructions together so what we do is we start band we know and that's when we assign 1 to sell on the the result is some kind of conditions and we stop there we had to attend analyzes going backwards we uh at 1st and a and the way tank propagates to move called by courses and quite obvious and quite easy to implement so the 1st to really well and we can generate a king and across all of its and this looks way more Hannibal I I I think you're crazy and not from this graph and the still half a graph and know is we still have to handle it somehow so what we do next this so we take this graph and translated into our expression and feed it into an automatic theorem prover mainly set 3 and this ultimately the improvement we look at it and we ask well what's value will and thus a target have to move on take soul that um a belated it's 2 1 so under which conditions will someone be set to 1 and it will all put us are constant function which is just equal the label so we then have to label and we just remember where labor laws and what's the label so the Russian interests and well the label and this brings me to the start point we have to chump targets and follows the instructions have in common that the excess of higher so we do is very similar to um the ionization of the labels we start from the selective to target but i'm and see if if it's so toppled under some conditions like you see in the conditional jump all the others have now conditions because the chance worst-case acute it returns will always be executed in their chance will always be adhered to and then we also have to acquire the label which we find in the subsequent instruction or it could be something more complex like in the indirect or attention case and this notice is not easy to distinguish post due to a case spot was obtained analysis of facts and you will end up either the summary a simple case where just perspectives the reference or you have something in a way more complex and the complex cases then the status is obviously the to and we and then remember in the position of the charmed and very it's leaves us this gives us a bunch of data on 1st we just have those levels and the chance on the left side a B C D E F then the um connect them in a meaningful way is such that the label will of just execute to the next block and then we simplify the profit and we that while the control flow graph on the right yeah and that's all we do all we need to do to recover the control-flow scrap control flow graph yeah
right but the so what I mean you don't remove all the moves that might be the biggest concern right now yeah what's that it said yeah and it the ones that it can still handle had them harm executables because you know you're doing it travels obtained analysis so it doesn't matter and how the rate just the cement that doesn't matter in which order they appear because 1 what does made of metal to depend analyzes the of propagated equally also and the gets the control flow graph which is a major major hold In in reversing the binary in the and we also what I think of speaking generate a patch binary which we then can execute or which makes it looking at it in idle something else way more pleasant the generate either symbols because while going all the way here covering the labels on the targets we go all the all the time all of the tables and sold on the way we just generate Balsamo's and we break that we get them into our are formed by the can understand so what if you if you loaded in the and the patch variable to symbols into you have quite a different experience than each looking at a mosque a binary right and last but not least we're using all those fancy new framework so that was newness sitting in the audience I don't know what the creator of capstone and key stone so that's a lot that really helped in creating the mother's care from and all those things filed to Microsoft for creating set 3 automated theorem prover and so we actually had a demo that are small the molecule to ongoing problems and due to the fact that everybody I wants to go to the coffee break him I'm just click the coloring what the idea of 1 the takeaways of that them over so we have to simple tracking on and I'm sure that most of you will almost instantly be able to tell what's the needed input should be for the gets 910 in order to get the smiley face and set 1 of the but what we would have to understand the mosquitoes and then we throw this into the the mother's care and so uh then to prove our point of the woods have taken this fancy new anger binary analysis platform framework and and the cool thing is that it comes with a simple execution engine but actually routes for the liver programs and what effect and then do this and can say OK whatever that gets returns is symbolic data so treated as a formula then execute the full program from line tend to line 16 apply all those transformations to in the formula and then constrain REST recent variable to be 0 it from that um you can feed again into a theorem prover and then you get actually the correct solution and I know this is pretty laying that the can that's life right now and so all I can offer is well as their on my notebook but unfortunately known 1 that has come to us and if you're really interested this we just invite you please come to us often talk and we will show you it actually works after view of skating it and what they it also is that the most case takes no less than a 2nd so it's almost as fast as obfuscating them
so to conclude and this is where you can reach us and or e-mail addresses so little or PGP fingerprints and of course here open-sourcing everything so some of the source code of the team of skater which is by the about 4 K lines of code so and there's many many many many things we could cover and also a summary of the of kid and if somebody wants to watch himself even more vendors claim is that that uses that you might read law which might be quite a moving experience 60 pages and that's all they got for now that it covers them on the it covers so in the presence of amorphous cation more in depth by of the numerous questions well so it might be interesting to read of the businesses and this provision will much of that but the point is that the the origin of a data analyst
fj
Stereometrie
Arithmetische Folge
Streuung
Computeranimation
Bit
Thumbnail
Exploit
Statistische Hypothese
Computeranimation
Übergang
Office-Paket
Task
Datenfeld
Reverse Engineering
Kryptologie
t-Test
Programmanalyse
Normalspannung
Speicherverwaltung
Analysis
Einfügungsdämpfung
Punkt
Prozess <Physik>
Compiler
Versionsverwaltung
Maschinensprache
Bildschirmfenster
Binärcode
Computeranimation
Übergang
Formale Semantik
Vektorgraphik
Dämpfung
Gruppe <Mathematik>
Kontrollstruktur
Gerade
Lineares Funktional
ATM
Statistik
Datentyp
Vervollständigung <Mathematik>
Sichtenkonzept
Stichprobe
Plot <Graphische Darstellung>
p-Block
Kontextbezogenes System
Bitrate
Biprodukt
Teilbarkeit
Dichte <Physik>
Linearisierung
Rechenschieber
Primzahltest
Menge
Rechter Winkel
Automatische Indexierung
Kontrollflussdiagramm
Programmbibliothek
Ordnung <Mathematik>
Tabelle <Informatik>
Instantiierung
Fehlermeldung
Aggregatzustand
Standardabweichung
Proxy Server
Folge <Mathematik>
Kontrollstruktur
Selbst organisierendes System
Stoß
Zahlenbereich
Term
Überlagerung <Mathematik>
Loop
Virtuelle Maschine
Variable
Netzbetriebssystem
Programmbibliothek
Luenberger-Beobachter
Optimierung
Informatik
Bildgebendes Verfahren
Tabelle <Informatik>
Soundverarbeitung
Verhandlungstheorie
Zehn
Rechenzeit
Tablet PC
Quick-Sort
System F
Turing-Test
Mereologie
Hill-Differentialgleichung
Systemtechnik
Kantenfärbung
Hydrostatik
Unterring
Bit
Gewichtete Summe
Kontrollstruktur
Versionsverwaltung
Zahlenbereich
Implementierung
Gebäude <Mathematik>
Physikalische Theorie
Computeranimation
Formale Semantik
Loop
Informationsmodellierung
Domain-Name
Client
Minimum
Mustersprache
Skript <Programm>
Kontrollstruktur
Optimierung
Bildgebendes Verfahren
Bijektion
Feuchteleitung
Analysis
Softwaretest
Assoziativgesetz
Nichtlinearer Operator
Lineares Funktional
Systemaufruf
Mailing-Liste
Störungstheorie
Quick-Sort
Entscheidungstheorie
Objekt <Kategorie>
SLAM-Verfahren
Formale Sprache
Automatische Indexierung
Rechter Winkel
Kontrollflussdiagramm
Mereologie
Partikelsystem
Information
Ordnung <Mathematik>
Streuungsdiagramm
Aggregatzustand
Fitnessfunktion
Tabelle <Informatik>
Trennungsaxiom
Bit
Multiplikation
Konditionszahl
Adressraum
Ruhmasse
Zeiger <Informatik>
Gerade
Computeranimation
Tabelle <Informatik>
Resultante
Hash-Algorithmus
Punkt
Ortsoperator
Adressraum
Computer
Element <Mathematik>
Gesetz <Physik>
Obere Schranke
Computeranimation
RFID
Übergang
Graph
Streaming <Kommunikationstechnik>
Arbeit <Physik>
Gruppe <Mathematik>
Theorem
Primzahlzwillinge
Hash-Algorithmus
Zeitkomplexität
Gruppoid
Analysis
Lineares Funktional
Nichtlinearer Operator
Graph
Konvexe Hülle
Dreizehn
Verzweigendes Programm
p-Block
Differenzkern
Kontrollflussdiagramm
Konditionszahl
Mereologie
Wort <Informatik>
Hill-Differentialgleichung
Ultraviolett-Photoelektronenspektroskopie
Tabelle <Informatik>
Hydrostatik
Demo <Programm>
Punkt
Datenanalyse
Adressraum
Partielle Differentiation
Transformation <Mathematik>
Gesetz <Physik>
Systemplattform
Binärcode
Code
Framework <Informatik>
Computeranimation
Homepage
Ausdruck <Logik>
Eins
Weg <Topologie>
Theorem
Notebook-Computer
Elektronischer Fingerabdruck
Kontrollstruktur
Optimierung
Gerade
Demo <Programm>
Analysis
Binärdaten
Soundverarbeitung
Binärcode
Videospiel
Sichtenkonzept
Datumsgrenze
Routing
Symboltabelle
Quellcode
Ein-Ausgabe
Bitrate
Patch <Software>
Kontrollflussdiagramm
Trigonometrie
Ordnung <Mathematik>
Tabelle <Informatik>
Humanoider Roboter
Hydrostatik
Kernel <Informatik>
Chipkarte
Unterring
Radikal <Mathematik>
Information
Analysis
Computeranimation
Chirurgie <Mathematik>
Font
Hook <Programmierung>
LASER <Mikrocomputer>
Code
Maschinensprache
Computersicherheit
Druckertreiber
Gruppe <Mathematik>
Prozess <Informatik>
Elektronischer Programmführer
Reverse Engineering
Modifikation <Mathematik>
Kryptologie
Spieltheorie
Prognostik
Debugging
Computervirus
Arbeitsplatzcomputer
Optimierung
Extreme programming
Software
Framework <Informatik>
Funktion <Mathematik>
Mailbox
Technische Optik
Systemidentifikation
Explosion <Stochastik>
Formale Semantik
Kontrollstruktur
Office <Programm>
Social Engineering <Sicherheit>
Virtuelle Maschine
Ablaufverfolgung
Nummerung
Mathieu-Differentialgleichung
Wurm <Informatik>
Systemprogrammierung
Physikalisches System
Iteration
Proxy Server
Datennetz
Virtuelle Realität
Tamagotchi
Stochastische Abhängigkeit
Hardware
Physikalischer Effekt
Ortsoperator
Programm
Binärcode
Rechenzeit
Gasströmung
Visuelles System
Sichtenkonzept
Menge
Nabel <Mathematik>
Satellitensystem
Registrierung <Bildverarbeitung>
Einfügungsdämpfung

Metadaten

Formale Metadaten

Titel M/o/Vfuscator-Be-Gone
Untertitel Recovering from soul-crushing RE nieghtmares
Serientitel REcon 2016
Teil 20
Anzahl der Teile 20
Autor Kirsch, Julian
Jonischkeit, Clemens
Lizenz CC-Namensnennung 4.0 International:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
DOI 10.5446/32753
Herausgeber REcon
Erscheinungsjahr 2016
Sprache Englisch

Inhaltliche Metadaten

Fachgebiet Informatik
Abstract After last year’s talk by Christopher Domas titled “The M/o/Vfuscator”, we spent a great amount of time to analyze the inner workings of the famous one-instruction-compiler. We are happy to announce and release the (to our knowledge) first demovfuscator this year at recon0xA. This talk presents a generic way of recovering the control flow of the original program from movfuscated binaries. As our approach makes zero assumptions about register allocations or a particular instruction order, but rather adheres to the high-level invariants that each movfuscated binary needs to conform to. Consequently, our demovfuscator is also not affected by the proposed hardening techniques such as register renaming and instruction reordering. To achieve this, we use a combination of static taint analysis on the movfuscated code and a satisfiable modulo theory (SMT) solver. We successfully used our demovfuscator against several movfuscated binaries that emerged during several CTFs during the last months (Hackover CTF, 0CTF and GoogleCTF) proving that it already can handle real-world binaries different from the synthetic samples created by us. Our demovfuscator is under active development and we are working towards our next, ambitious goal: Generically getting rid of the instruction substitution and generating a much more compact and readable result. We will share our insights on this topic as well.

Ähnliche Filme

Loading...