We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Angry Hacking: How angr pwnd CTFs and the CGC

00:00

Formal Metadata

Title
Angry Hacking: How angr pwnd CTFs and the CGC
Alternative Title
The next gen of binary analysis
Title of Series
Number of Parts
109
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Security has gone from a curiosity to a phenomenon in the last decade. Fortunately for us, despite the rise of memory-safe, interpreted, lame languages, the security of binaries is as relevant as ever. On top of that, (computer security) Capture the Flag competitions have skyrocketed in popularity, with new and exciting binaries on offer for hacking every weekend. This all sounds great, and it is. Unfortunately, the more time goes by, the older we get, and the more our skills fade. Whereas we were happy to stare at objdump a decade ago, today, we find the menial parts of reversing and pwning more and more tiring and more and more difficult. Worse, while security analysis tools have been evolving to make life easier for us hackers, the core tools that we use (like IDA Pro) have remained mostly stagnant. And on top of that, the term "binaries" have expanded to regularly include ARM, MIPS, PPC, MSP430, and every other crazy architecture you can think of, rather than the nice, comfortable x86 of yesteryear. New tools are required, and we're here to deliver. Over the last two years, we have been working on a next-generation binary analysis framework in an attempt to turn back the tide and reduce our mounting noobness. The result is called angr. angr assists in binary analysis by providing extremely powerful, state-of-the-art analyses, and making them as straightforward to use as possible. Ever wanted to know *what freaking value* some variable could take on in a function (say, can the target of a computed write point to the return address)? angr can tell you! Want to know what input you need to trigger a certain code path and export a flag? Ask angr! In the talk, we'll cover three of the analyses that angr provides: a powerful static analysis engine (able to, among other things, automatically identify potential memory corruption in binaries through the use of Value-Set Analysis), its symbolic execution engine, and dynamic emulation of various architectures (*super* useful for debugging shellcode). On top of that, angr is designed to make the life of a hacker as easy as possible -- for example, the whole system is 98% Python, and is designed to be a breeze to interact with through iPython. Plus, it comes with a nifty GUI with nice visualizations for symbolically exploring a program, tracking differences between different program paths, and understanding value ranges of variables and registers. Finally, angr is designed to be easily extensible and embeddable in other applications. We'll show off a semantic-aware ROP gadget finder ("are there any gadgets that write to a positive offset of rax but don't clobber rbx" or "given this program state, what are the gadgets that won't cause a segfault") and a binary diffing engine, both built on angr. We've used angr to solve CTF binaries, analyze embedded devices, debug shellcode, and even dabble in the DARPA Cyber Grand Challenge. We'll talk about our experiences with all of that and will release angr to the world, hopefully revolutionizing binary analysis and making everyone ANGRY! Speaker Bios: Yan and Fish are two members of Shellphish, a pretty badass hacking team famous for low SLA and getting the freaking exploit JUST A FREAKING MINUTE LATE. Their secret identities are those of PhD students in the security lab of UC Santa Barbara. When they're not CTFing or surfing, they're doing next-generation (what does that even mean?) security research. Their works have been published in numerous academic venues. For example, in 2013, they created an automatic tool, called MovieStealer, a tool to automatically break the DRM of streaming media services [1]. After taking 2014 to work on angr, in 2015, they followed this up with an analysis of backdoors in embedded devices [2]. Now, they've set their sights on helping the world analyze binaries faster, better, stronger, by revolutionizing the analysis tool landscape! [1] https://www.usenix.org/conference/usenixsecurity13/technical-sessions/paper/wang_ruoyu [2] http://www.internetsociety.org/doc/firmalice-automatic-detection-authentication-bypass-vulnerabilities-binary-firmware Twitter: @zardus
32
Thumbnail
45:07
Hacker (term)Roundness (object)BitOpen sourceRight angleMultiplication signContext awarenessGoodness of fitInformation securityMultilateration
Information securityException handlingBusiness informaticsGame theoryModulare ProgrammierungHost Identity ProtocolStudent's t-testDemo (music)QuicksortSoftware frameworkPoint (geometry)NP-hardSinc function
Student's t-testBitMultiplication signInheritance (object-oriented programming)Flow separationWordReverse engineeringTable (information)Sinc function
Mathematical analysisCryptographyCodeCybersexModul <Datentyp>Open setFundamental theorem of algebraMereologySystem programmingBinary codeMathematical analysisDisk read-and-write headOpen sourceCartesian coordinate systemHydraulic jumpComputing platform
Quantum stateStructural equation modelingMathematical analysisSystem programmingMoment (mathematics)Slide ruleComputing platformDampingFundamental theorem of algebraBinary codeEntire functionHexagonRight angleComputer animation
Independence (probability theory)Scripting languageTwitterHacker (term)Revision controlFormal languageSystem programmingBinary codeCore dumpSymbol tableMathematical analysisConnectivity (graph theory)BitImplementation
Fluid staticsMathematical analysisCoroutineBinary codeGenetic programmingGraph (mathematics)Binary codeDemo (music)Boom (sailing)Open setComputer fileIntegrated development environmentConnectivity (graph theory)BootingAlpha (investment)Different (Kate Ryan album)Computing platformInteractive televisionStructural loadFlow separationComputer animation
Genetic programmingBinary codeGraph (mathematics)Mathematical analysisCoroutineFluid staticsSemiconductor memoryMedical imagingInternet der DingeFluid staticsSymbol tableMathematical analysisoutputFirmwareComputer programmingBinary codeComputer animation
Power (physics)Trigonometric functionsHacker (term)Symbol tableTouchscreenBoom (sailing)Computer animationLecture/Conference
Mathematical analysisAerodynamicsoutputFluid staticsTrailConstraint (mathematics)Variable (mathematics)outputSymbol tableMathematical analysisRight angleCondition numberConstraint (mathematics)Variable (mathematics)Demo (music)System programmingFluid staticsFlagCartesian coordinate systemQuicksortSoftware crackingBinary codeGenetic programmingMultilateration
Constraint (mathematics)Interior (topology)Local ringLine (geometry)Constraint (mathematics)State of matterCASE <Informatik>Fiber bundleBranch (computer science)Computer programmingComplete metric spaceImage resolutionPoint (geometry)outputInverse elementSymbol tableVariable (mathematics)Binary codeNumberRight angleVulnerability (computing)NP-complete
Demo (music)Order (biology)Constraint (mathematics)State of matterRight angleImage resolutionComputer programmingDemo (music)Binary codeBackdoor (computing)
Data managementSystem callLine (geometry)Modulare ProgrammierungComputer fileCodeFrame problemData typeAttribute grammarInheritance (object-oriented programming)MathematicsWritingException handlingRight angleDemo (music)LaptopMetropolitan area networkComputer animation
WindowRight angleGraphical user interfaceSoftware testingMathematical analysisFluid staticsBinary codeData managementSymbol tableArithmetic meanLecture/Conference
Source codeBinary codeReading (process)Computer animation
String (computer science)outputPasswordBinary codeAuthenticationFunktionalanalysisRight angle
Electronic visual displayData managementMathematical analysisRight angleMultiplication signoutputSemiconductor memoryBranch (computer science)Stack (abstract data type)Computer animation
Right angleoutputBinary codePasswordComplex analysisFunction (mathematics)1 (number)Touch typingComputer animation
Line (geometry)Computer fileMathematicsWritingAttribute grammarData typeComputer animation
Demo (music)Akkumulator <Informatik>Computer configurationControl flow graphVertex (graph theory)Graph (mathematics)Cross-site scriptingPasswordPresentation of a groupRight angleOptical disc driveResultantSeries (mathematics)Computer configurationControl flow graphComputational fluid dynamicsMultiplicationVirtualizationHydraulic jumpPointer (computer programming)Table (information)Mathematical analysisProjective planeCodeControl flowData managementMultiplication signLine (geometry)Binary codeDifferent (Kate Ryan album)Graphical user interfaceFunktionalanalysisBlock (periodic table)BitCategory of beingPlastikkarteStudent's t-testCuboidAsynchronous Transfer ModeSet (mathematics)Level (video gaming)Fluid staticsPairwise comparisonMereology2 (number)System callComputer clusterSymbol tableImage resolutionSensitivity analysisFlow separationGame controllerComputer animation
Semiconductor memoryTypinferenzVariable (mathematics)Data recoveryRange (statistics)Mathematical analysisState of matterData recoveryMereologyCASE <Informatik>Mathematical analysisAbstract interpretationCoroutineFluid staticsType theoryComputer programmingSemiconductor memoryMultiplication signFigurate numberLoop (music)ProgrammschleifeAddress spaceSemantics (computer science)Interpreter (computing)
Mathematical analysisSquare numberIterationSquare numberInferenceAuthorizationState of matterMathematical analysisLoop (music)Computer programming
Square numberFinitary relationVariable (mathematics)Mathematical analysisCodeLevel (video gaming)IntegerComputer programState of matterLoop (music)Branch (computer science)IterationFluid staticsMathematical analysisProgrammanalyseCartesian coordinate systemFiber bundleDemo (music)2 (number)Limit (category theory)Set (mathematics)NumberType theoryCASE <Informatik>Theory of relativityThread (computing)TrailNormal (geometry)MetaanalyseCommutatorVariable (mathematics)Different (Kate Ryan album)Real numberSquare numberBinary codeArithmetic meanMaxima and minima
Structural loadAddress spaceStack (abstract data type)MathematicsSemiconductor memoryBinary codeAddress spaceBlock (periodic table)Compiler2 (number)Touch typingTheorySemantics (computer science)Theory of relativityCartesian coordinate systemSemiconductor memoryInheritance (object-oriented programming)Right angleMathematicsStandard deviationDrop (liquid)Maxima and minimaInformationstheorieCompilerStructural loadDemo (music)Nuclear spaceWritingStack (abstract data type)
Stack (abstract data type)MathematicsData typeRecursive descent parserDemo (music)CryptographySoftware crackingTunisCompilerBinary codeStandard deviationoutputProcess (computing)MereologyDrill commandsComputer animation
HookingLine (geometry)State of matterFactory (trading post)Symbol tableCodePoint (geometry)Binary codeRepository (publishing)Type theoryFluid staticsGroup actionRight angleBit
Factory (trading post)String (computer science)Lattice (order)Process (computing)Discrete element methodFlagoutputData typePoint (geometry)RoutingCryptographyAddress spaceQuicksortClique-widthReal numberSet (mathematics)FlagArithmetic progressionoutputProcess (computing)Boom (sailing)Right angleLevel (video gaming)Software testingBinary codeKey (cryptography)Random matrixComputer animation
Revision controlChainCodeMereologyBinary codeDemo (music)Multiplication signGoodness of fitNP-hardDrop (liquid)ChainCovering spaceFigurate number
Revision controlChainFigurate numberDrop (liquid)ChainArmStructural loadState of matterComputer programmingStack (abstract data type)
Single-precision floating-point formatState of matterDrop (liquid)Computer configurationChainPoint (geometry)Computer programmingGastropod shellVariable (mathematics)
Revision controlOvalInformationFunction (mathematics)Binary codeFunktionalanalysisEncryptionFigurate numberCodeComputer programmingReal numberLine (geometry)Optical disc drive
Virtual machineMultiplication signCyberneticsGodPresentation of a groupSlide ruleComputer animation
CybersexARPANETCloud computingOpen setMathematical analysisTotal S.A.Line (geometry)CyberneticsSet (mathematics)Slide ruleGoodness of fitRight angle1 (number)Binary codeProjective planeSystem programmingOpen sourceReal numberCore dumpARPANETMathematical analysisLine (geometry)EmailCodeElectronic mailing listExploit (computer security)Connectivity (graph theory)
Transcript: English(auto-generated)
afternoon. How are we? Oh, we can do better than that. Time to go get some red bull, wake up a little bit. Maybe go get some more beer. That's always a good thing. So how many people have made it over to CTF? Just go check it out, right? You should. You should go look at it. It's fun, right? And
we've heard a little bit about CTF with this giant thing earlier and you're going to talk a little bit more about this? A little bit, yeah. Good stuff. Let's give these guys a big round of applause. Whoa, I sound really loud.
Welcome to our talk. We're here to talk about some work we've been doing in the context of our CTF team, in the context of security research and in the context of the CDC and open source. We're here to talk about anger and we'll get to
that later. But first, who are we? Up here is me. I am an artist. This is fish. Yeah. So we are from shellfish. Shellfish is a CTF team. You can go check us out. We're
playing the CTF right now on the Def Con floor there. We're fighting hard. As you'll see. So this talk is full of these live demos. We're going to try to solve some CTF challenges,
show how we approach CTF challenges with this sort of framework. You'll see true shellfish style, everything melting down completely. So I'm sure there will be exceptions and segfaults and a lot of really fun stuff. But the point is something very analogous is currently happening on the CTF floor. Of course we always come super like
badass and prepared and then the game starts. Yeah. If you don't have 0 SLA, then something is horribly wrong. Anyways, we're shellfish and we're also from UC Santa Barbara and at UC
Santa Barbara we have a really awesome hip computer security lab and that's where anger was created. The main people that created anger, the major contributors, me and fish, we got also known as Andrew Dutcher. He's an
undergrad currently at UC Santa Barbara. Amazing dude. We got also known as John Grossen. He's a high schooler at a high school in Santa Barbara, which is totally crazy and he's the
Chris Sauls, also known as Sauls. Very creative. He's a fellow phd student and then Christoph Hauser, known as Kerios, who's a post doc. So this is kind of the anger team. As for me, I've been coming to Def Con since Def Con 9.
I'm back in Alexis Park. It was awesome back then. It's awesome now. And it's kind of a lifelong dream for me to be up here speaking. This is really crazy. It was a lifelong dream for a while to even play ctf. I walked by the table and whoa. And now we're here and it's awesome to be
here in front of you guys. Like I said, I'm a phd student in Santa Barbara and I'm actually there because of ctf. So shellfish is mostly from UC Santa Barbara and I joined them for Def Con qualifier and got pulled in and got pulled into the lab. That was pretty cool. And I let my friend
fish here introduce himself. That's easier. Just grab the mic. All right. Okay, you guys can hear me. Oh, it works. Awesome. Thanks. I'm fish, obviously. This is not my real name and I don't think you guys can read my
real name if you're not from China. It's pretty hard to read. The pronunciation is Wang Ruoyu. Anyone not from China can read it? Probably not. You guys are honest. I'm also a phd student from UC Santa Barbara and I'm a part of shellfish which is a super lead hacking team as has mentioned several
times. Super famous for everything melting down and nothing works and zero SRA. Yes. And I've been playing Def Con ctf for this is my fourth year. I've been playing ctf not Def Con but I've been playing ctf for six years. I'm
not a punter. I'm mostly a reverser. So yesterday I just solved my first Def Con ctf challenge because it's uponable. I just solved it. This is like the first one out of four years. So you know I'm very happy. And to solve
that challenge I didn't really sleep last night. And English is not my mother tongue. And if I talk about nonsense today, don't be surprised. Okay. That's it. That's a little bit about me. And I'll hand it back to you. All right. So here's a detailed itinerary for today.
We'll probably fail completely at sticking to this. So we'll go over why we need a next generation binary analysis platform. Why we built it. Why we're, spoiler alert, releasing it. Then we'll talk about what we designed it to
do. We'll talk about the different parts of our analysis system. It's called anger. We'll talk about applications of it. Show off some ctf challenge solving or how it can assist in solving ctf challenges. And then we'll
talk about the ctf challenge. Are any of you guys playing the ctf right now? Raise your hand. One, two, oh, a couple of them. Okay. So you don't want to leave before the talk ends. Unless you're not on shellfish then you should leave
before we get to the live example. Okay. So let's jump into that. Why anger? Why did we build a binary analysis platform mostly from scratch instead of using one of the ones
available out there right now? In fact, there's tons of them out there. When preparing this talk I went through, I saw what's our competition kind of, right? And they're enough to fill an entire slide. When we started anger about two years ago there weren't this many. There are still some. But then
and now in the end, everyone ends up staring at binaries in ida. So ida since it came out in 2005 or so or x-rays was found in 2005, ida has been kind of the de facto binary analysis tool that everyone uses. So of course our
long-term goal is to, you know, replace ida because it's sometimes very frustrating. We've all had these moments of why is it doing this? Why did they design it that way? But
the truth is designing an analysis tool is extremely difficult. We're nowhere near replacing ida, for example, but there are things we do that ida does not and things we do and ways we do them that no other system out there is capable of. Of course, the likely situation is we'll release this and then
that slide with all those names will just have one more name on it. But hopefully people find it useful. At least we do. So we're pretty excited. So let's talk about the fundamentals of anger. What did we design it to be
like? Well, the idea is we are all python users mostly in the lab. And let's have a show of hands. Who uses python as a kind of primary hacking language? All right. So anger is written in python for python in some sense. It's
meant to be used in i python while you're working on a binary. That gives it a lot of flexibility. And it exports its analyses which are pretty powerful in a really well designed way. We'll show off later how to use anger to quickly script symbolic execution, quickly script the
finding of rob gadgets and so forth. And of course a core component of any modern analysis system is that it has to work on platforms other than x86. So in fact by leveraging an intermediate implementation called vex which
is how valgrind works as well, we support the 64 bit and 32 bit versions of all major architectures. And after legit bs tweeted the spark troll tweet, hopefully troll tweet, we
spent a couple of days trying to hack spark support into this. And it's almost there. So it's pretty extensible. So this is what a user that's using anger might see. It's very easy up there. You just import it. Boom. Open up an example
binary. And then we'll go into all of the analyses and all of the things that anger offers later in the demos. Anger has several different components. Of course we have a binary loader that's pretty general. Right now alpha is our best supported platform, but we can load the files. We can't
do much with the files yet. But we can load them and start executing until we hit some environment interactions. We of course support linux, free bsd, binaries and so forth. And we even support firmware images. So if you have a dump
off of an iot device and it's some firmware blob, anger can load it, tell you where it's likely needs to be loaded in memory and you can start analyzing it. We have a lot of static analysis routines. Fish will talk about them in more detail. And we have a symbolic execution engine. And
the symbolic execution engine of course is capable of identifying unsafe situations and reversing what input needs to be provided to drive a program down a specific path. So let's dive in. We'll start with symbolic execution. Whoa. And
we're done. It's been nice talking to you guys. I'm sure this is one of the first of many situations. Awesome. And now it's no longer full screen. Awesome. And okay. We're almost
there. We're almost there. Boom. Thank you. Thank you very much. That was the first demo. So we'll start with symbolic execution. Symbolic execution is a concept that has
been around for a little while. It's recently gaining more and more prominence. I don't know if you guys have been to yesterday's talk about symbolic execution on another analysis system. This is kind of an analogous thing. So let's
dive into what symbolic execution is. Symbolic execution aims to answer the question of how do I trigger a certain path or a certain condition. And so you might imagine a binary that does something when you give it a certain input like a crack me ctf challenge which we'll look at
later. And how would you interact with that? You could just give it input. You say here's a guess. Is it good? And it will tell you no because most likely you're not going to guess the flag. You can try to do some sort
of static analysis. So this is what you tend to do manually when you're looking at a binary and clicking through here and there. And you can do that automatically but most likely static analysis will not give you an answer because it's not precise enough. We'll talk more about static analysis
later on but for now we need something slightly different. We need symbolic analysis, symbolic execution. And so what does that entail? Well symbolic execution, we interpret an application. As we interpret it, we track constraints on symbolic variables. And when a required condition is triggered
when we see a path that we like, we concretize the inputs to identify possible or concretize the variables to identify possible concrete input. Let's do a quick example of concretization. If you have constraints on a
symbolic variable x and you say x is greater than 10 and x is less than 100, then you can do a constraint solve and come up with a number, 42. Of course in this case it's super trivial. In general this is an empty complete problem and
it's kind of a pain that sometimes you'll start a constraint solve and it will just never finish. That's one of the challenges of symbolic execution and we'll go into another one shortly. So let's go for an example program. This is in python of course, anger analyzes binaries, not python
but python is more approachable. Bonus points if you can catch the vulnerability in this program by the way. So first thing we do with symbolic execution is go line by line interpreting the program. So we hit this input. You see in
blue the input has been executed. That input created a symbolic variable x. X is unbounded. It has no known constraints on it. Then we continue executing and we'll hit this branch and what symbolic execution does when it hits a branch is it splits. Splits into two possibilities and so
one of the possibilities is when x was greater than 10 and that branch was taken. Otherwise it's the inverse of that. X is less than 10. So we continue executing. Now we have two states. Keep that in mind. Some execution has multiple states. We'll see why that's a problem later on. Now we have
two states and we hit the next branch in the one state that is not done yet and it splits as well according to the two different possibilities. So then if you want to answer the question of what does it take to print two in this
scenario, right? In order to print two we have our constraints. We have that state that the path that made it there and we do a constraint solve and the constraint solve tells us that, hey, you can put in 99 or it can tell you 42 and so forth. And then we can give that dynamically to the program, launch it and see the expected
behavior. So let's do a demo of kind of a very simple binary that has kind of this toy back door that we want to detect with symbolic execution. This keep calm is not for you. It's for me because I did stupidly did a git pull right before
doing this and so I don't think it works anymore. So you can see that? Awesome. So we'll launch it. Okay. We will switch
to fish's laptop for this demo. This is what live demos are all about, guys. It's python exceptions. Here you go.
So fish uses windows. I know it's embarrassing but bear with us. I mean, you know, like in this kind of situations windows never fails. At least it's a linux VM. Oh, yeah, by
the way, anger currently only supports linux but in the future it will be able to run on windows. Allegedly. So this is a anger management. This is anger's gooey to do symbolic
analysis and static analysis. So if you're going to look at this toy binary we have that's nice for testing, nice for explaining what symbolic execution means. Okay. And
let's look at it in IDA first actually. As I said, of course, everyone uses IDA or actually we have the source code. Yeah, we do. You don't have it? Great. So we'll just
talk about the source code. So here is. Guys, if you can't read
this, then we're screwed. Okay. So it's a very simple binary. It has a user name password. It takes the user name and password as input. And it calls the authenticate function. If that returned one it says you've been
accepted. The authenticate function has a back door. It passes the string compare which has this string sneaky, then you will be authenticated automatically. So it's possible to detect this automatically in anger management. So here's the
gooey. Over here we have the display of what paths are currently active in the analysis. We can actually even run multiple analyses at the same time. We can step these paths. And we can look at what's present currently in their
registers. Presumably we can scroll somehow. Okay. In their memory. So this is what's currently on the stack. So then we can take this path and we can step it. We can
say let's execute until it branches like we saw in the other example. So here we have a path that branched. And branched for some reason. And that reason is because there's user input that was symbolic variable that could be concretized to anything. So here we can actually look at
it. We can look at this user input and we see that in. I can't use your mouth. I'm touching. I'm a noob. But we
can see that on one hand the user could input any password and it does one thing. And if the user inputs a password so sneaky it does another. If you then look at standard output instead and we keep stepping, there. We see that here where
the user input so sneaky it immediately trusted him and let him in. So this is a simple example of how symbolic execution can help us analyze binaries. And then we'll go on to more complex ones for ctf challenges. So, yeah. So we
can see that. So, let's come on. Why did that? That was my
temporary defcon password. It's gone. Great. Yeah. Let's keep
going on yours. People tell me not to use linux for presentations but I don't believe them. I think it's just fine. But it's dark magic. So, oh, well, now it's
great. So, along with symbolic execution anger supports a bunch of static analyses that fish will go into detail with now. Okay. I just figured out that young has taken up too much
of my time. So I'll keep this part really simple. And I'll just go like talk about a little bit about it. And if you guys are really interested in it, you should come to my lab and be a phd student. So let's start. Before analyzing a binary, we all need to know the control
flow. If you open up a binary in IDA, what do you see? The first thing you see is a property box. After you click on okay, after several other settings, you see a control flow graph. We also do the same. In our anger management, that's our GUI, we'll show you the control flow graph of every single function which is like very similar to IDAs.
What's the difference? Our CFG is more accurate. Our CFG has more options that you can adjust. And the result is our CFG is much slower. That is because we support multiple options like contact sensitivity level, support like smart
backward slicing, symbolic traversal, etc. To automatically resolve some stuff that is really hard to resolve like normally or statically. For example, jump targets or another example of some virtual table pointers. Well, like in comparison, CFG is less accurate, it has fewer options
but it's really faster. This is how we use, how we create a CFG in anger. First line import anger. Second line of python, you create a project, the follower, which is the binary name. Third line, you say CFG equals to p dot analysis dot CFG. Press enter and then it will give you a CFG.
We want to see how many nodes there are, how many basic blocks there are. There are 78. Great. So if you want to have a faster CFG and you don't want to buy IDA, you can check out course guard. Course guard is a fast mode CFG generation
that does not do any symbolic solving. If there's a course guard, there's also a boy's guard. So what is a boy's guard? You can check out our project and have a look. All right. Another static analysis routine that we have in anger is
called value set analysis. Value set analysis is a kind of abstract interpretation. In case some of you guys haven't heard of abstract interpretation, it's basically a kind of static analysis, a bunch of static analysis, um, approaches that allows you to only execute part of the program. For
example, if there's a loop and it loops 1,000 times, the abstract interpretation will be executed for like 3 times. And then we kind of figure out the semantics of the program with only executing part of the program. So that gives us the possibility of actually enumerating, uh, sorry,
exhausting, uh, the state space because we're not executing all the programs. We're not, we're not really exhausting the state space. On top of that, uh, we can have our variable recovery. We can have random recovery. And then on top of that, we can build some memory access checks and type
inference. Okay. So credit goes to, not me, we take the paper and we implement them. The credit goes to the author of this paper, which is Gogol Balakrishnan. I tried so hard to read your name. Uh, I think I tried my best. If I read it incorrectly, I'm sorry. So he's the creator of VSA, value
set analysis. Uh, here's an example of what the value set analysis looks like. Here's a piece of, um, x64 assembly. Try hard to read it. You'll have 5 seconds. Okay, great. Uh,
I think if you, if you know assembly, you have already understand this program. And now we want to say, what is the rbx in the yellow score? Uh, in the yellow square. Do we know it? No, we don't. In symbolic, uh, execution, as Yang said, we just keep executing it. However, there, uh,
the problem is, at, at, um, at every single iteration of the loop, it has the possibility of branching out. That basically means we're gonna face a state explosion. Oh, if you're gonna have, um, 1025, uh, 25, sorry, 0x, 1025 different, uh, different states. If you're using native
static analysis, rbx would be anything. Because, well, we are not following every single, every single branch and maybe we are only following, uh, one iteration of the loop. With random analysis, we can actually tell, uh, rbx is less than 0x, uh, 1024. Is that good enough? Can we do better? In
value set analysis, um, this is one of the value, one of the, uh, the type of the values that, uh, that the value set analysis is using. It's called str-strided intervals. A strided interval means a set of numbers that can be described with, uh, a lower bound, uh, an upper bound and
the minimum stride between each single value and their size. So here, uh, these strided intervals can be concretized or it actually means nine different values. That's, uh, at least, at least it's ours. You see, um, between 0x1, uh, between 0x100 and 0x104, uh, the stride is four. That's
what the stride means. So in this example, we want to say, what is rbx in the L square? We follow it. We follow, uh, we take the first iteration of the loop, rbx can be from, from 0 to 4. Take the second iteration, rbx can be from 0 to
8 with a stride of 4. In the fourth iteration, rbx can be from 0 to 0xc with a stride of 4. At the fifth iteration, we figure out, oh, the loop is not terminating. What, what are we gonna do, right? If the loop is, like, looping for ever, looping forever, I always keep looping with it, no? We perform a widen and then now we think, oh, rbx can
be from 0 to, uh, infinite. After that, of course, 0 from, from 0 to infinite is not accurate enough. We perform a narrowing. It comes, becomes 0 and 1024 with a stride of 4. In this case, it's pretty accurate. We extended the original, um, value set analysis with the following two
different, uh, with the following two different, uh, improvements. The first one is called, we name it as limited variable relation analysis. For example, in this case, normal VSA will be able to tell the, um, the bond of rax, rax should be 5. Well, this is rcx. Well, they don't, they don't
do some, they don't do any relation tracking, they don't know that. We are doing some, we are doing some limited amount of, of variable relation tracking and in this case, we are able to tell, oh, rcx equals to rax plus 1 and rcx to d6. Our next improvement is, uh, we made our
VSA sinus agnostic. That is, uh, we, we included another analysis called wrapped interval analysis. So the credit goes to George A. Nevis who has the paper of, uh, sinus agnostic paper, uh, program analysis published, uh, like in 2012 in the programming analysis conference. And with
that, our analysis of binary code, uh, the precision is greatly improved. And now we'll, uh, I'll give it back to Yang and we're going to talk about applications and real demo. All right. So all of that, uh, technical talk or theoretical talk may be a little boring, but it was
necessary to get us into the actual anger application. So here we're going to demo off a couple of things that we do and that you can do with anger. Um, first we'll demo off, uh, ROP gadget finder. So this isn't like a standard ROP
gadget finder like, uh, I don't know, ROP gadget, uh, or xROP. Um, that'll tell you, hey, you know, there's this gadget and this is, these are the instructions. This gadget finder tells you what the gadget does and then you can even filter it down later. Um, and of course this is, uh, implemented in anger and it's super, um, easy to use. So
here's the example. So we, um, load a, uh, ctf binary called nuclear. Um, not from this, that fast ctf, but from a different ctf in the past. Um, and we just analyze it. We say we want all ROP gadgets with maximum size of 15, find them and we print them out. So let's, um, do, uh, that.
Because it does semantic analysis, uh, it's a little slow. Um, so it takes, you know, about 20 seconds, maybe a little more for this guy. Um, so right now anger is
analyzing every basic block and, uh, figuring out its semantics. Figuring out what registers it touches, how much it changes the stack by, um, and even, uh, where it writes to, um, in relation to, uh, the various, uh, registers that, um, it uses. So here's an example gadget. Right? It's a gadget at 0x40F4C.
In this binary it changes the stack by 0x18. It pops, registers RBX and, uh, RBP and it does a memory write to this address. So this is actually, uh, on the stack. It does a memory write onto the stack, um, and the memory read from
an address that depends on RBP. Um, this gives a lot of information. In fact, our next step is to implement, uh, Rob compiler based on this. Uh, we were hoping to have it for this time. It's not quite ready, but, uh, stay tuned. Um, another thing that I'll, uh, demo off is a, uh, how we
would solve a crypto challenge in anger. This is a crypto challenge in quotes. It's more of a crack me. Um, but it's a cool little demo of anger's abilities. The challenge is, um, from a white hat CTF. Um, it's a CTF that happened last month. Um, and then I was looking at the challenges later
to, uh, see some crypto. Um, and found this. Figured it would be a good example for you guys. So, this challenge, uh, takes, uh, input on the command line and, uh, in standard crack me fashion it tells you if you're right or if you're wrong. Right? And, uh, in general if you try to
guess, you're wrong. Let's open it up in NIDA. Right? We open it up in NIDA. We start looking around and the binary is really big. So, uh, of course we can, uh, start, uh, drilling down, um, into, uh, parts of the binary. Uh,
figuring out what it might do. You know, we can, uh, decompile it and try to figure it out. And one of the things is, uh, it does something, um, if this, uh,
returns zero, it, um, says please check again. All right. Well, let's look at it. Um, then you see it does some complicated stuff. Uh, but there's some equals equals eight. Of course from here this is my exact process of, of, uh, solving this challenge. I figured, all right, that maybe it, eight is a size. Let's play with that. So
I went, quit out of IDA, went to anger. I wrote a little, uh, well, code. Um, so it's just very heavily commented. This is in our examples, uh, repository if you guys want to check it out. So we, uh, opened, uh, the binary. Um,
anger symbolic execution has trouble with certain types of code sometimes. So, especially in, uh, static billing binaries, um, I hooked them, um, with, uh, python replacements, um, to help them along. Um, and then basically I ran it. I said, uh, I created a path group which
is the standard, uh, uh, singleton, uh, or the entry point of our symbolic execution engine and I told it to go and find, um, this point. And this is the point where we say, where it says, uh, it passes this stage and it
says the input is okay. Right? And then it does some more processing and this is a pain in the butt to get through. Right? So I figured, let's look at where we are at this point. And it turned out that at that point, the key space is much reduced to the possible key space that we can make it to that point in with and it's, after that it's
brute forcible. So over here, I get the possible values from anger. Um, of course, how to do this is all in the docs. Um, or you can look at this example and then I iterate over these possible values with a very fancy progress bar. Uh, and, uh, test every possible value until I get the right
one. Uh, allowing me to, uh, solve this, uh, crypto challenge. So let's, um, see how this goes. Run it here. So here, uh, anger is stepping through, uh, the binary and
now it's done stepping through the binary. Um, it's at this point where the input was, uh, is tested again. Um, I guess at this point. Um, and now it's just trying the, uh, reduced set of possibilities which is down from, uh, 8
bytes to 6,392 possibilities. And, uh, this is, it's an example guess for debugging. So just iterating through the possible, uh, keys that can even make it to this point and trying to find one that says success. Um, it should find it at the 80-ish percent mark. I'm actually pretty surprised it
hasn't crashed yet. Boom. So we find, we find it, um, after iterating through a couple possibilities, the flag is this. If we run the crypto, um, actually this is the input. There's the flag that turns out. If we run the, uh, binary, boom. So, um, anger is very useful for, uh, these
sorts of, oh, dear, five minutes. All right. For these sorts of, uh, challenges, um, I'll pass it on to Fish to look at some real world, uh, or CTF that's happening now and how we're using anger for that. So one of the anger's, uh,
ability is that you load up a binary, you execute obviously
part of the code in it. Um, I had some demos for it before, uh, prepared like before the Defcon, but yesterday I was playing the Defcon CTF. There's a challenge called RXC and I figured out, hey, this is a really good demo for anger and now I'm going to talk about it. So if you guys are, uh, playing the CTF right now and you guys haven't solved this
challenge, this is a little benefit for you. Cover your ears, please. Uh, RXC is a, uh, 64-bit, um, binary. It's actually a little bit better. And we spent a long time reversing it. Before we reversed it, we got some suspected drop chains from the traffic. So we want to figure
out, hey, there's a, there's a drop chain. What does this drop chain do? I mean, we can, we can, we can definitely hire a bunch of monkeys to figure it out, but we have anger. The monkey we hire would be ourselves anyways. Okay.
So this is our, uh, this is our, uh, drop chain execution, uh, program called DROP. You pass in the drop chain, load the binary RXC, and then I dump all the drop chain on our, on our
stack. Uh, yes, on the stack. I create a state, I dump the, dump the drop chain on our stack, our stack, I set the IP, and then I execute it. Uh, I'm using our, one of our surveyors called the explorer, execute it, and then I return a run that returns. Let's run this, let's
run this. We return the first, um, the first drop chain. Bummer. Okay, it's called, uh, okay, it's called R. Very
descriptive, uh, variable names. Yeah. There's a unconstrained path. So of course this is all, uh, it has five rows. Fairly technical, but, uh, you, you can read
the documentation, understand what's going on here. And now you see, oh, this is where the, the exact, the exact path that the drop chain is following. And now of course you have the ability to read every single state of every single program point along this drop chain, which is really easy. It can also help you debug your own drop chain and shell codes, right? Alright, let's turn it into
this, this example. And then the next example is, for the same binary, in this binary there's a really interesting, uh, function. There's a, it does some encryption, and later, later on we figure out, oh, it's T. We want to call it. We don't want to implement by ourselves, because like when
we're writing our exploits, implementing some C code in Python is really, uh, tedious. What do we do with it? What do we do about it? Luckily we have Python. Uh, great. So there's another program I wrote. It's really small,
called callable. What it does is, uh, it, it takes in, it encrypted with the exact program, uh, sorry, the exact function in the, uh, in that RXC program with the exact encryption, uh, encryption function. So it, so, you know,
it has like 33 something lines of code, and then you don't need to understand the encryption function anymore. You spawn up, you spawn up Python, and it automatically encrypts it for you. Let's try it. Python, callable.py. This is original data, high DefCon, 8 bytes. And then you
got encrypted data that it all works. Um, whoa, I have two microphones. We all support, uh, binary diffing, but in the interest of time, you can, because we have one minute, you can check this out on your own. And we'll briefly talk
about the CGC. Um, the CGC, as you guys hopefully know, the cyber grand challenge, that's the machine, um, one of the machines that will be running the, uh, finals, um, where, uh, machines will battle each other, uh, for hacking supremacy next DefCon. Um, shellfish participated in the cyber grand challenge, uh, and, uh, we managed to qualify.
What's going on? Uh, it's been a long presenting. No, go back. This is a very clever set of slides. It's good. All right. Shellfish participated in the cyber grand challenge. Uh, we managed to qualify, putting us from, uh,
just another CTF team into the richest CTF team in the world, along with the other ones that qualified. Um, with the CGC, we created a cyber reasoning system that created, uh, exploits from binaries and patched them. Um, it was very complex and anger actually, uh, sat at the core of every
component, um, which is pretty cool. So check out this system. It's a real world system with real world, uh, uses and we like it. Um, and it's open source. As I mentioned, um, it's open source with special thanks to, uh, our professors, um, DARPA with, uh, two different projects that,
that anger was developed for. Um, and, uh, of course all of us, you know, the contributors to anger, um, that we've gone over. Um, you can pull it, uh, on github. Um, check out anger dot io. Um, subscribe to our mailing lists. And, uh, of
course, we welcome pull requests, issues, questions. Um, we are hoping to make this the next generation binary analysis tool. And, uh, we hope to work with you to do it. Um, anger is two years old now, uh, with almost 60,000 lines of code, about 6,000 commits. It's a big project and
we'd love to have you, uh, working on it with us. Any questions? All right. I guess no questions. Thank you guys.