Angry Hacking: How angr pwnd CTFs and the CGC
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Alternative Title |
| |
Title of Series | ||
Number of Parts | 109 | |
Author | ||
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/36315 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
DEF CON 23105 / 109
12
19
20
23
24
29
32
33
36
51
58
60
62
66
67
68
69
70
71
77
82
84
85
88
89
92
98
99
103
104
107
00:00
Hacker (term)Roundness (object)BitOpen sourceRight angleMultiplication signContext awarenessGoodness of fitInformation securityMultilateration
01:03
Information securityException handlingBusiness informaticsGame theoryModulare ProgrammierungHost Identity ProtocolStudent's t-testDemo (music)QuicksortSoftware frameworkPoint (geometry)NP-hardSinc function
03:16
Student's t-testBitMultiplication signInheritance (object-oriented programming)Flow separationWordReverse engineeringTable (information)Sinc function
05:35
Mathematical analysisCryptographyCodeCybersexModul <Datentyp>Open setFundamental theorem of algebraMereologySystem programmingBinary codeMathematical analysisDisk read-and-write headOpen sourceCartesian coordinate systemHydraulic jumpComputing platform
06:48
Quantum stateStructural equation modelingMathematical analysisSystem programmingMoment (mathematics)Slide ruleComputing platformDampingFundamental theorem of algebraBinary codeEntire functionHexagonRight angleComputer animation
08:39
Independence (probability theory)Scripting languageTwitterHacker (term)Revision controlFormal languageSystem programmingBinary codeCore dumpSymbol tableMathematical analysisConnectivity (graph theory)BitImplementation
10:07
Fluid staticsMathematical analysisCoroutineBinary codeGenetic programmingGraph (mathematics)Binary codeDemo (music)Boom (sailing)Open setComputer fileIntegrated development environmentConnectivity (graph theory)BootingAlpha (investment)Different (Kate Ryan album)Computing platformInteractive televisionStructural loadFlow separationComputer animation
10:52
Genetic programmingBinary codeGraph (mathematics)Mathematical analysisCoroutineFluid staticsSemiconductor memoryMedical imagingInternet der DingeFluid staticsSymbol tableMathematical analysisoutputFirmwareComputer programmingBinary codeComputer animation
11:35
Power (physics)Trigonometric functionsHacker (term)Symbol tableTouchscreenBoom (sailing)Computer animationLecture/Conference
12:11
Mathematical analysisAerodynamicsoutputFluid staticsTrailConstraint (mathematics)Variable (mathematics)outputSymbol tableMathematical analysisRight angleCondition numberConstraint (mathematics)Variable (mathematics)Demo (music)System programmingFluid staticsFlagCartesian coordinate systemQuicksortSoftware crackingBinary codeGenetic programmingMultilateration
14:25
Constraint (mathematics)Interior (topology)Local ringLine (geometry)Constraint (mathematics)State of matterCASE <Informatik>Fiber bundleBranch (computer science)Computer programmingComplete metric spaceImage resolutionPoint (geometry)outputInverse elementSymbol tableVariable (mathematics)Binary codeNumberRight angleVulnerability (computing)NP-complete
16:11
Demo (music)Order (biology)Constraint (mathematics)State of matterRight angleImage resolutionComputer programmingDemo (music)Binary codeBackdoor (computing)
16:59
Data managementSystem callLine (geometry)Modulare ProgrammierungComputer fileCodeFrame problemData typeAttribute grammarInheritance (object-oriented programming)MathematicsWritingException handlingRight angleDemo (music)LaptopMetropolitan area networkComputer animation
17:38
WindowRight angleGraphical user interfaceSoftware testingMathematical analysisFluid staticsBinary codeData managementSymbol tableArithmetic meanLecture/Conference
18:37
Source codeBinary codeReading (process)Computer animation
19:21
String (computer science)outputPasswordBinary codeAuthenticationFunktionalanalysisRight angle
20:17
Electronic visual displayData managementMathematical analysisRight angleMultiplication signoutputSemiconductor memoryBranch (computer science)Stack (abstract data type)Computer animation
21:29
Right angleoutputBinary codePasswordComplex analysisFunction (mathematics)1 (number)Touch typingComputer animation
22:19
Line (geometry)Computer fileMathematicsWritingAttribute grammarData typeComputer animation
22:54
Demo (music)Akkumulator <Informatik>Computer configurationControl flow graphVertex (graph theory)Graph (mathematics)Cross-site scriptingPasswordPresentation of a groupRight angleOptical disc driveResultantSeries (mathematics)Computer configurationControl flow graphComputational fluid dynamicsMultiplicationVirtualizationHydraulic jumpPointer (computer programming)Table (information)Mathematical analysisProjective planeCodeControl flowData managementMultiplication signLine (geometry)Binary codeDifferent (Kate Ryan album)Graphical user interfaceFunktionalanalysisBlock (periodic table)BitCategory of beingPlastikkarteStudent's t-testCuboidAsynchronous Transfer ModeSet (mathematics)Level (video gaming)Fluid staticsPairwise comparisonMereology2 (number)System callComputer clusterSymbol tableImage resolutionSensitivity analysisFlow separationGame controllerComputer animation
25:53
Semiconductor memoryTypinferenzVariable (mathematics)Data recoveryRange (statistics)Mathematical analysisState of matterData recoveryMereologyCASE <Informatik>Mathematical analysisAbstract interpretationCoroutineFluid staticsType theoryComputer programmingSemiconductor memoryMultiplication signFigurate numberLoop (music)ProgrammschleifeAddress spaceSemantics (computer science)Interpreter (computing)
27:00
Mathematical analysisSquare numberIterationSquare numberInferenceAuthorizationState of matterMathematical analysisLoop (music)Computer programming
28:10
Square numberFinitary relationVariable (mathematics)Mathematical analysisCodeLevel (video gaming)IntegerComputer programState of matterLoop (music)Branch (computer science)IterationFluid staticsMathematical analysisProgrammanalyseCartesian coordinate systemFiber bundleDemo (music)2 (number)Limit (category theory)Set (mathematics)NumberType theoryCASE <Informatik>Theory of relativityThread (computing)TrailNormal (geometry)MetaanalyseCommutatorVariable (mathematics)Different (Kate Ryan album)Real numberSquare numberBinary codeArithmetic meanMaxima and minima
31:34
Structural loadAddress spaceStack (abstract data type)MathematicsSemiconductor memoryBinary codeAddress spaceBlock (periodic table)Compiler2 (number)Touch typingTheorySemantics (computer science)Theory of relativityCartesian coordinate systemSemiconductor memoryInheritance (object-oriented programming)Right angleMathematicsStandard deviationDrop (liquid)Maxima and minimaInformationstheorieCompilerStructural loadDemo (music)Nuclear spaceWritingStack (abstract data type)
33:50
Stack (abstract data type)MathematicsData typeRecursive descent parserDemo (music)CryptographySoftware crackingTunisCompilerBinary codeStandard deviationoutputProcess (computing)MereologyDrill commandsComputer animation
35:35
HookingLine (geometry)State of matterFactory (trading post)Symbol tableCodePoint (geometry)Binary codeRepository (publishing)Type theoryFluid staticsGroup actionRight angleBit
36:33
Factory (trading post)String (computer science)Lattice (order)Process (computing)Discrete element methodFlagoutputData typePoint (geometry)RoutingCryptographyAddress spaceQuicksortClique-widthReal numberSet (mathematics)FlagArithmetic progressionoutputProcess (computing)Boom (sailing)Right angleLevel (video gaming)Software testingBinary codeKey (cryptography)Random matrixComputer animation
39:12
Revision controlChainCodeMereologyBinary codeDemo (music)Multiplication signGoodness of fitNP-hardDrop (liquid)ChainCovering spaceFigurate number
39:58
Revision controlChainFigurate numberDrop (liquid)ChainArmStructural loadState of matterComputer programmingStack (abstract data type)
41:13
Single-precision floating-point formatState of matterDrop (liquid)Computer configurationChainPoint (geometry)Computer programmingGastropod shellVariable (mathematics)
42:04
Revision controlOvalInformationFunction (mathematics)Binary codeFunktionalanalysisEncryptionFigurate numberCodeComputer programmingReal numberLine (geometry)Optical disc drive
43:24
Virtual machineMultiplication signCyberneticsGodPresentation of a groupSlide ruleComputer animation
44:11
CybersexARPANETCloud computingOpen setMathematical analysisTotal S.A.Line (geometry)CyberneticsSet (mathematics)Slide ruleGoodness of fitRight angle1 (number)Binary codeProjective planeSystem programmingOpen sourceReal numberCore dumpARPANETMathematical analysisLine (geometry)EmailCodeElectronic mailing listExploit (computer security)Connectivity (graph theory)
Transcript: English(auto-generated)
00:02
afternoon. How are we? Oh, we can do better than that. Time to go get some red bull, wake up a little bit. Maybe go get some more beer. That's always a good thing. So how many people have made it over to CTF? Just go check it out, right? You should. You should go look at it. It's fun, right? And
00:25
we've heard a little bit about CTF with this giant thing earlier and you're going to talk a little bit more about this? A little bit, yeah. Good stuff. Let's give these guys a big round of applause. Whoa, I sound really loud.
00:45
Welcome to our talk. We're here to talk about some work we've been doing in the context of our CTF team, in the context of security research and in the context of the CDC and open source. We're here to talk about anger and we'll get to
01:03
that later. But first, who are we? Up here is me. I am an artist. This is fish. Yeah. So we are from shellfish. Shellfish is a CTF team. You can go check us out. We're
01:21
playing the CTF right now on the Def Con floor there. We're fighting hard. As you'll see. So this talk is full of these live demos. We're going to try to solve some CTF challenges,
01:42
show how we approach CTF challenges with this sort of framework. You'll see true shellfish style, everything melting down completely. So I'm sure there will be exceptions and segfaults and a lot of really fun stuff. But the point is something very analogous is currently happening on the CTF floor. Of course we always come super like
02:03
badass and prepared and then the game starts. Yeah. If you don't have 0 SLA, then something is horribly wrong. Anyways, we're shellfish and we're also from UC Santa Barbara and at UC
02:24
Santa Barbara we have a really awesome hip computer security lab and that's where anger was created. The main people that created anger, the major contributors, me and fish, we got also known as Andrew Dutcher. He's an
02:44
undergrad currently at UC Santa Barbara. Amazing dude. We got also known as John Grossen. He's a high schooler at a high school in Santa Barbara, which is totally crazy and he's the
03:03
Chris Sauls, also known as Sauls. Very creative. He's a fellow phd student and then Christoph Hauser, known as Kerios, who's a post doc. So this is kind of the anger team. As for me, I've been coming to Def Con since Def Con 9.
03:23
I'm back in Alexis Park. It was awesome back then. It's awesome now. And it's kind of a lifelong dream for me to be up here speaking. This is really crazy. It was a lifelong dream for a while to even play ctf. I walked by the table and whoa. And now we're here and it's awesome to be
03:43
here in front of you guys. Like I said, I'm a phd student in Santa Barbara and I'm actually there because of ctf. So shellfish is mostly from UC Santa Barbara and I joined them for Def Con qualifier and got pulled in and got pulled into the lab. That was pretty cool. And I let my friend
04:04
fish here introduce himself. That's easier. Just grab the mic. All right. Okay, you guys can hear me. Oh, it works. Awesome. Thanks. I'm fish, obviously. This is not my real name and I don't think you guys can read my
04:22
real name if you're not from China. It's pretty hard to read. The pronunciation is Wang Ruoyu. Anyone not from China can read it? Probably not. You guys are honest. I'm also a phd student from UC Santa Barbara and I'm a part of shellfish which is a super lead hacking team as has mentioned several
04:43
times. Super famous for everything melting down and nothing works and zero SRA. Yes. And I've been playing Def Con ctf for this is my fourth year. I've been playing ctf not Def Con but I've been playing ctf for six years. I'm
05:03
not a punter. I'm mostly a reverser. So yesterday I just solved my first Def Con ctf challenge because it's uponable. I just solved it. This is like the first one out of four years. So you know I'm very happy. And to solve
05:20
that challenge I didn't really sleep last night. And English is not my mother tongue. And if I talk about nonsense today, don't be surprised. Okay. That's it. That's a little bit about me. And I'll hand it back to you. All right. So here's a detailed itinerary for today.
05:43
We'll probably fail completely at sticking to this. So we'll go over why we need a next generation binary analysis platform. Why we built it. Why we're, spoiler alert, releasing it. Then we'll talk about what we designed it to
06:03
do. We'll talk about the different parts of our analysis system. It's called anger. We'll talk about applications of it. Show off some ctf challenge solving or how it can assist in solving ctf challenges. And then we'll
06:26
talk about the ctf challenge. Are any of you guys playing the ctf right now? Raise your hand. One, two, oh, a couple of them. Okay. So you don't want to leave before the talk ends. Unless you're not on shellfish then you should leave
06:41
before we get to the live example. Okay. So let's jump into that. Why anger? Why did we build a binary analysis platform mostly from scratch instead of using one of the ones
07:00
available out there right now? In fact, there's tons of them out there. When preparing this talk I went through, I saw what's our competition kind of, right? And they're enough to fill an entire slide. When we started anger about two years ago there weren't this many. There are still some. But then
07:22
and now in the end, everyone ends up staring at binaries in ida. So ida since it came out in 2005 or so or x-rays was found in 2005, ida has been kind of the de facto binary analysis tool that everyone uses. So of course our
07:43
long-term goal is to, you know, replace ida because it's sometimes very frustrating. We've all had these moments of why is it doing this? Why did they design it that way? But
08:00
the truth is designing an analysis tool is extremely difficult. We're nowhere near replacing ida, for example, but there are things we do that ida does not and things we do and ways we do them that no other system out there is capable of. Of course, the likely situation is we'll release this and then
08:23
that slide with all those names will just have one more name on it. But hopefully people find it useful. At least we do. So we're pretty excited. So let's talk about the fundamentals of anger. What did we design it to be
08:41
like? Well, the idea is we are all python users mostly in the lab. And let's have a show of hands. Who uses python as a kind of primary hacking language? All right. So anger is written in python for python in some sense. It's
09:01
meant to be used in i python while you're working on a binary. That gives it a lot of flexibility. And it exports its analyses which are pretty powerful in a really well designed way. We'll show off later how to use anger to quickly script symbolic execution, quickly script the
09:22
finding of rob gadgets and so forth. And of course a core component of any modern analysis system is that it has to work on platforms other than x86. So in fact by leveraging an intermediate implementation called vex which
09:42
is how valgrind works as well, we support the 64 bit and 32 bit versions of all major architectures. And after legit bs tweeted the spark troll tweet, hopefully troll tweet, we
10:00
spent a couple of days trying to hack spark support into this. And it's almost there. So it's pretty extensible. So this is what a user that's using anger might see. It's very easy up there. You just import it. Boom. Open up an example
10:23
binary. And then we'll go into all of the analyses and all of the things that anger offers later in the demos. Anger has several different components. Of course we have a binary loader that's pretty general. Right now alpha is our best supported platform, but we can load the files. We can't
10:44
do much with the files yet. But we can load them and start executing until we hit some environment interactions. We of course support linux, free bsd, binaries and so forth. And we even support firmware images. So if you have a dump
11:03
off of an iot device and it's some firmware blob, anger can load it, tell you where it's likely needs to be loaded in memory and you can start analyzing it. We have a lot of static analysis routines. Fish will talk about them in more detail. And we have a symbolic execution engine. And
11:24
the symbolic execution engine of course is capable of identifying unsafe situations and reversing what input needs to be provided to drive a program down a specific path. So let's dive in. We'll start with symbolic execution. Whoa. And
11:46
we're done. It's been nice talking to you guys. I'm sure this is one of the first of many situations. Awesome. And now it's no longer full screen. Awesome. And okay. We're almost
12:09
there. We're almost there. Boom. Thank you. Thank you very much. That was the first demo. So we'll start with symbolic execution. Symbolic execution is a concept that has
12:25
been around for a little while. It's recently gaining more and more prominence. I don't know if you guys have been to yesterday's talk about symbolic execution on another analysis system. This is kind of an analogous thing. So let's
12:43
dive into what symbolic execution is. Symbolic execution aims to answer the question of how do I trigger a certain path or a certain condition. And so you might imagine a binary that does something when you give it a certain input like a crack me ctf challenge which we'll look at
13:02
later. And how would you interact with that? You could just give it input. You say here's a guess. Is it good? And it will tell you no because most likely you're not going to guess the flag. You can try to do some sort
13:22
of static analysis. So this is what you tend to do manually when you're looking at a binary and clicking through here and there. And you can do that automatically but most likely static analysis will not give you an answer because it's not precise enough. We'll talk more about static analysis
13:42
later on but for now we need something slightly different. We need symbolic analysis, symbolic execution. And so what does that entail? Well symbolic execution, we interpret an application. As we interpret it, we track constraints on symbolic variables. And when a required condition is triggered
14:03
when we see a path that we like, we concretize the inputs to identify possible or concretize the variables to identify possible concrete input. Let's do a quick example of concretization. If you have constraints on a
14:21
symbolic variable x and you say x is greater than 10 and x is less than 100, then you can do a constraint solve and come up with a number, 42. Of course in this case it's super trivial. In general this is an empty complete problem and
14:42
it's kind of a pain that sometimes you'll start a constraint solve and it will just never finish. That's one of the challenges of symbolic execution and we'll go into another one shortly. So let's go for an example program. This is in python of course, anger analyzes binaries, not python
15:01
but python is more approachable. Bonus points if you can catch the vulnerability in this program by the way. So first thing we do with symbolic execution is go line by line interpreting the program. So we hit this input. You see in
15:21
blue the input has been executed. That input created a symbolic variable x. X is unbounded. It has no known constraints on it. Then we continue executing and we'll hit this branch and what symbolic execution does when it hits a branch is it splits. Splits into two possibilities and so
15:41
one of the possibilities is when x was greater than 10 and that branch was taken. Otherwise it's the inverse of that. X is less than 10. So we continue executing. Now we have two states. Keep that in mind. Some execution has multiple states. We'll see why that's a problem later on. Now we have
16:03
two states and we hit the next branch in the one state that is not done yet and it splits as well according to the two different possibilities. So then if you want to answer the question of what does it take to print two in this
16:22
scenario, right? In order to print two we have our constraints. We have that state that the path that made it there and we do a constraint solve and the constraint solve tells us that, hey, you can put in 99 or it can tell you 42 and so forth. And then we can give that dynamically to the program, launch it and see the expected
16:45
behavior. So let's do a demo of kind of a very simple binary that has kind of this toy back door that we want to detect with symbolic execution. This keep calm is not for you. It's for me because I did stupidly did a git pull right before
17:06
doing this and so I don't think it works anymore. So you can see that? Awesome. So we'll launch it. Okay. We will switch
17:32
to fish's laptop for this demo. This is what live demos are all about, guys. It's python exceptions. Here you go.
17:48
So fish uses windows. I know it's embarrassing but bear with us. I mean, you know, like in this kind of situations windows never fails. At least it's a linux VM. Oh, yeah, by
18:08
the way, anger currently only supports linux but in the future it will be able to run on windows. Allegedly. So this is a anger management. This is anger's gooey to do symbolic
18:26
analysis and static analysis. So if you're going to look at this toy binary we have that's nice for testing, nice for explaining what symbolic execution means. Okay. And
18:43
let's look at it in IDA first actually. As I said, of course, everyone uses IDA or actually we have the source code. Yeah, we do. You don't have it? Great. So we'll just
19:14
talk about the source code. So here is. Guys, if you can't read
19:44
this, then we're screwed. Okay. So it's a very simple binary. It has a user name password. It takes the user name and password as input. And it calls the authenticate function. If that returned one it says you've been
20:02
accepted. The authenticate function has a back door. It passes the string compare which has this string sneaky, then you will be authenticated automatically. So it's possible to detect this automatically in anger management. So here's the
20:23
gooey. Over here we have the display of what paths are currently active in the analysis. We can actually even run multiple analyses at the same time. We can step these paths. And we can look at what's present currently in their
20:42
registers. Presumably we can scroll somehow. Okay. In their memory. So this is what's currently on the stack. So then we can take this path and we can step it. We can
21:02
say let's execute until it branches like we saw in the other example. So here we have a path that branched. And branched for some reason. And that reason is because there's user input that was symbolic variable that could be concretized to anything. So here we can actually look at
21:22
it. We can look at this user input and we see that in. I can't use your mouth. I'm touching. I'm a noob. But we
21:44
can see that on one hand the user could input any password and it does one thing. And if the user inputs a password so sneaky it does another. If you then look at standard output instead and we keep stepping, there. We see that here where
22:06
the user input so sneaky it immediately trusted him and let him in. So this is a simple example of how symbolic execution can help us analyze binaries. And then we'll go on to more complex ones for ctf challenges. So, yeah. So we
22:27
can see that. So, let's come on. Why did that? That was my
22:55
temporary defcon password. It's gone. Great. Yeah. Let's keep
23:05
going on yours. People tell me not to use linux for presentations but I don't believe them. I think it's just fine. But it's dark magic. So, oh, well, now it's
23:30
great. So, along with symbolic execution anger supports a bunch of static analyses that fish will go into detail with now. Okay. I just figured out that young has taken up too much
23:41
of my time. So I'll keep this part really simple. And I'll just go like talk about a little bit about it. And if you guys are really interested in it, you should come to my lab and be a phd student. So let's start. Before analyzing a binary, we all need to know the control
24:02
flow. If you open up a binary in IDA, what do you see? The first thing you see is a property box. After you click on okay, after several other settings, you see a control flow graph. We also do the same. In our anger management, that's our GUI, we'll show you the control flow graph of every single function which is like very similar to IDAs.
24:23
What's the difference? Our CFG is more accurate. Our CFG has more options that you can adjust. And the result is our CFG is much slower. That is because we support multiple options like contact sensitivity level, support like smart
24:41
backward slicing, symbolic traversal, etc. To automatically resolve some stuff that is really hard to resolve like normally or statically. For example, jump targets or another example of some virtual table pointers. Well, like in comparison, CFG is less accurate, it has fewer options
25:01
but it's really faster. This is how we use, how we create a CFG in anger. First line import anger. Second line of python, you create a project, the follower, which is the binary name. Third line, you say CFG equals to p dot analysis dot CFG. Press enter and then it will give you a CFG.
25:25
We want to see how many nodes there are, how many basic blocks there are. There are 78. Great. So if you want to have a faster CFG and you don't want to buy IDA, you can check out course guard. Course guard is a fast mode CFG generation
25:42
that does not do any symbolic solving. If there's a course guard, there's also a boy's guard. So what is a boy's guard? You can check out our project and have a look. All right. Another static analysis routine that we have in anger is
26:01
called value set analysis. Value set analysis is a kind of abstract interpretation. In case some of you guys haven't heard of abstract interpretation, it's basically a kind of static analysis, a bunch of static analysis, um, approaches that allows you to only execute part of the program. For
26:23
example, if there's a loop and it loops 1,000 times, the abstract interpretation will be executed for like 3 times. And then we kind of figure out the semantics of the program with only executing part of the program. So that gives us the possibility of actually enumerating, uh, sorry,
26:42
exhausting, uh, the state space because we're not executing all the programs. We're not, we're not really exhausting the state space. On top of that, uh, we can have our variable recovery. We can have random recovery. And then on top of that, we can build some memory access checks and type
27:01
inference. Okay. So credit goes to, not me, we take the paper and we implement them. The credit goes to the author of this paper, which is Gogol Balakrishnan. I tried so hard to read your name. Uh, I think I tried my best. If I read it incorrectly, I'm sorry. So he's the creator of VSA, value
27:21
set analysis. Uh, here's an example of what the value set analysis looks like. Here's a piece of, um, x64 assembly. Try hard to read it. You'll have 5 seconds. Okay, great. Uh,
27:41
I think if you, if you know assembly, you have already understand this program. And now we want to say, what is the rbx in the yellow score? Uh, in the yellow square. Do we know it? No, we don't. In symbolic, uh, execution, as Yang said, we just keep executing it. However, there, uh,
28:00
the problem is, at, at, um, at every single iteration of the loop, it has the possibility of branching out. That basically means we're gonna face a state explosion. Oh, if you're gonna have, um, 1025, uh, 25, sorry, 0x, 1025 different, uh, different states. If you're using native
28:20
static analysis, rbx would be anything. Because, well, we are not following every single, every single branch and maybe we are only following, uh, one iteration of the loop. With random analysis, we can actually tell, uh, rbx is less than 0x, uh, 1024. Is that good enough? Can we do better? In
28:45
value set analysis, um, this is one of the value, one of the, uh, the type of the values that, uh, that the value set analysis is using. It's called str-strided intervals. A strided interval means a set of numbers that can be described with, uh, a lower bound, uh, an upper bound and
29:03
the minimum stride between each single value and their size. So here, uh, these strided intervals can be concretized or it actually means nine different values. That's, uh, at least, at least it's ours. You see, um, between 0x1, uh, between 0x100 and 0x104, uh, the stride is four. That's
29:24
what the stride means. So in this example, we want to say, what is rbx in the L square? We follow it. We follow, uh, we take the first iteration of the loop, rbx can be from, from 0 to 4. Take the second iteration, rbx can be from 0 to
29:42
8 with a stride of 4. In the fourth iteration, rbx can be from 0 to 0xc with a stride of 4. At the fifth iteration, we figure out, oh, the loop is not terminating. What, what are we gonna do, right? If the loop is, like, looping for ever, looping forever, I always keep looping with it, no? We perform a widen and then now we think, oh, rbx can
30:01
be from 0 to, uh, infinite. After that, of course, 0 from, from 0 to infinite is not accurate enough. We perform a narrowing. It comes, becomes 0 and 1024 with a stride of 4. In this case, it's pretty accurate. We extended the original, um, value set analysis with the following two
30:23
different, uh, with the following two different, uh, improvements. The first one is called, we name it as limited variable relation analysis. For example, in this case, normal VSA will be able to tell the, um, the bond of rax, rax should be 5. Well, this is rcx. Well, they don't, they don't
30:41
do some, they don't do any relation tracking, they don't know that. We are doing some, we are doing some limited amount of, of variable relation tracking and in this case, we are able to tell, oh, rcx equals to rax plus 1 and rcx to d6. Our next improvement is, uh, we made our
31:01
VSA sinus agnostic. That is, uh, we, we included another analysis called wrapped interval analysis. So the credit goes to George A. Nevis who has the paper of, uh, sinus agnostic paper, uh, program analysis published, uh, like in 2012 in the programming analysis conference. And with
31:22
that, our analysis of binary code, uh, the precision is greatly improved. And now we'll, uh, I'll give it back to Yang and we're going to talk about applications and real demo. All right. So all of that, uh, technical talk or theoretical talk may be a little boring, but it was
31:45
necessary to get us into the actual anger application. So here we're going to demo off a couple of things that we do and that you can do with anger. Um, first we'll demo off, uh, ROP gadget finder. So this isn't like a standard ROP
32:00
gadget finder like, uh, I don't know, ROP gadget, uh, or xROP. Um, that'll tell you, hey, you know, there's this gadget and this is, these are the instructions. This gadget finder tells you what the gadget does and then you can even filter it down later. Um, and of course this is, uh, implemented in anger and it's super, um, easy to use. So
32:22
here's the example. So we, um, load a, uh, ctf binary called nuclear. Um, not from this, that fast ctf, but from a different ctf in the past. Um, and we just analyze it. We say we want all ROP gadgets with maximum size of 15, find them and we print them out. So let's, um, do, uh, that.
32:44
Because it does semantic analysis, uh, it's a little slow. Um, so it takes, you know, about 20 seconds, maybe a little more for this guy. Um, so right now anger is
33:01
analyzing every basic block and, uh, figuring out its semantics. Figuring out what registers it touches, how much it changes the stack by, um, and even, uh, where it writes to, um, in relation to, uh, the various, uh, registers that, um, it uses. So here's an example gadget. Right? It's a gadget at 0x40F4C.
33:24
In this binary it changes the stack by 0x18. It pops, registers RBX and, uh, RBP and it does a memory write to this address. So this is actually, uh, on the stack. It does a memory write onto the stack, um, and the memory read from
33:43
an address that depends on RBP. Um, this gives a lot of information. In fact, our next step is to implement, uh, Rob compiler based on this. Uh, we were hoping to have it for this time. It's not quite ready, but, uh, stay tuned. Um, another thing that I'll, uh, demo off is a, uh, how we
34:03
would solve a crypto challenge in anger. This is a crypto challenge in quotes. It's more of a crack me. Um, but it's a cool little demo of anger's abilities. The challenge is, um, from a white hat CTF. Um, it's a CTF that happened last month. Um, and then I was looking at the challenges later
34:22
to, uh, see some crypto. Um, and found this. Figured it would be a good example for you guys. So, this challenge, uh, takes, uh, input on the command line and, uh, in standard crack me fashion it tells you if you're right or if you're wrong. Right? And, uh, in general if you try to
34:41
guess, you're wrong. Let's open it up in NIDA. Right? We open it up in NIDA. We start looking around and the binary is really big. So, uh, of course we can, uh, start, uh, drilling down, um, into, uh, parts of the binary. Uh,
35:02
figuring out what it might do. You know, we can, uh, decompile it and try to figure it out. And one of the things is, uh, it does something, um, if this, uh,
35:20
returns zero, it, um, says please check again. All right. Well, let's look at it. Um, then you see it does some complicated stuff. Uh, but there's some equals equals eight. Of course from here this is my exact process of, of, uh, solving this challenge. I figured, all right, that maybe it, eight is a size. Let's play with that. So
35:42
I went, quit out of IDA, went to anger. I wrote a little, uh, well, code. Um, so it's just very heavily commented. This is in our examples, uh, repository if you guys want to check it out. So we, uh, opened, uh, the binary. Um,
36:04
anger symbolic execution has trouble with certain types of code sometimes. So, especially in, uh, static billing binaries, um, I hooked them, um, with, uh, python replacements, um, to help them along. Um, and then basically I ran it. I said, uh, I created a path group which
36:24
is the standard, uh, uh, singleton, uh, or the entry point of our symbolic execution engine and I told it to go and find, um, this point. And this is the point where we say, where it says, uh, it passes this stage and it
36:43
says the input is okay. Right? And then it does some more processing and this is a pain in the butt to get through. Right? So I figured, let's look at where we are at this point. And it turned out that at that point, the key space is much reduced to the possible key space that we can make it to that point in with and it's, after that it's
37:02
brute forcible. So over here, I get the possible values from anger. Um, of course, how to do this is all in the docs. Um, or you can look at this example and then I iterate over these possible values with a very fancy progress bar. Uh, and, uh, test every possible value until I get the right
37:22
one. Uh, allowing me to, uh, solve this, uh, crypto challenge. So let's, um, see how this goes. Run it here. So here, uh, anger is stepping through, uh, the binary and
37:41
now it's done stepping through the binary. Um, it's at this point where the input was, uh, is tested again. Um, I guess at this point. Um, and now it's just trying the, uh, reduced set of possibilities which is down from, uh, 8
38:01
bytes to 6,392 possibilities. And, uh, this is, it's an example guess for debugging. So just iterating through the possible, uh, keys that can even make it to this point and trying to find one that says success. Um, it should find it at the 80-ish percent mark. I'm actually pretty surprised it
38:23
hasn't crashed yet. Boom. So we find, we find it, um, after iterating through a couple possibilities, the flag is this. If we run the crypto, um, actually this is the input. There's the flag that turns out. If we run the, uh, binary, boom. So, um, anger is very useful for, uh, these
38:47
sorts of, oh, dear, five minutes. All right. For these sorts of, uh, challenges, um, I'll pass it on to Fish to look at some real world, uh, or CTF that's happening now and how we're using anger for that. So one of the anger's, uh,
39:18
ability is that you load up a binary, you execute obviously
39:21
part of the code in it. Um, I had some demos for it before, uh, prepared like before the Defcon, but yesterday I was playing the Defcon CTF. There's a challenge called RXC and I figured out, hey, this is a really good demo for anger and now I'm going to talk about it. So if you guys are, uh, playing the CTF right now and you guys haven't solved this
39:40
challenge, this is a little benefit for you. Cover your ears, please. Uh, RXC is a, uh, 64-bit, um, binary. It's actually a little bit better. And we spent a long time reversing it. Before we reversed it, we got some suspected drop chains from the traffic. So we want to figure
40:02
out, hey, there's a, there's a drop chain. What does this drop chain do? I mean, we can, we can, we can definitely hire a bunch of monkeys to figure it out, but we have anger. The monkey we hire would be ourselves anyways. Okay.
40:26
So this is our, uh, this is our, uh, drop chain execution, uh, program called DROP. You pass in the drop chain, load the binary RXC, and then I dump all the drop chain on our, on our
40:40
stack. Uh, yes, on the stack. I create a state, I dump the, dump the drop chain on our stack, our stack, I set the IP, and then I execute it. Uh, I'm using our, one of our surveyors called the explorer, execute it, and then I return a run that returns. Let's run this, let's
41:10
run this. We return the first, um, the first drop chain. Bummer. Okay, it's called, uh, okay, it's called R. Very
41:28
descriptive, uh, variable names. Yeah. There's a unconstrained path. So of course this is all, uh, it has five rows. Fairly technical, but, uh, you, you can read
41:42
the documentation, understand what's going on here. And now you see, oh, this is where the, the exact, the exact path that the drop chain is following. And now of course you have the ability to read every single state of every single program point along this drop chain, which is really easy. It can also help you debug your own drop chain and shell codes, right? Alright, let's turn it into
42:03
this, this example. And then the next example is, for the same binary, in this binary there's a really interesting, uh, function. There's a, it does some encryption, and later, later on we figure out, oh, it's T. We want to call it. We don't want to implement by ourselves, because like when
42:22
we're writing our exploits, implementing some C code in Python is really, uh, tedious. What do we do with it? What do we do about it? Luckily we have Python. Uh, great. So there's another program I wrote. It's really small,
42:40
called callable. What it does is, uh, it, it takes in, it encrypted with the exact program, uh, sorry, the exact function in the, uh, in that RXC program with the exact encryption, uh, encryption function. So it, so, you know,
43:00
it has like 33 something lines of code, and then you don't need to understand the encryption function anymore. You spawn up, you spawn up Python, and it automatically encrypts it for you. Let's try it. Python, callable.py. This is original data, high DefCon, 8 bytes. And then you
43:23
got encrypted data that it all works. Um, whoa, I have two microphones. We all support, uh, binary diffing, but in the interest of time, you can, because we have one minute, you can check this out on your own. And we'll briefly talk
43:40
about the CGC. Um, the CGC, as you guys hopefully know, the cyber grand challenge, that's the machine, um, one of the machines that will be running the, uh, finals, um, where, uh, machines will battle each other, uh, for hacking supremacy next DefCon. Um, shellfish participated in the cyber grand challenge, uh, and, uh, we managed to qualify.
44:04
What's going on? Uh, it's been a long presenting. No, go back. This is a very clever set of slides. It's good. All right. Shellfish participated in the cyber grand challenge. Uh, we managed to qualify, putting us from, uh,
44:21
just another CTF team into the richest CTF team in the world, along with the other ones that qualified. Um, with the CGC, we created a cyber reasoning system that created, uh, exploits from binaries and patched them. Um, it was very complex and anger actually, uh, sat at the core of every
44:43
component, um, which is pretty cool. So check out this system. It's a real world system with real world, uh, uses and we like it. Um, and it's open source. As I mentioned, um, it's open source with special thanks to, uh, our professors, um, DARPA with, uh, two different projects that,
45:03
that anger was developed for. Um, and, uh, of course all of us, you know, the contributors to anger, um, that we've gone over. Um, you can pull it, uh, on github. Um, check out anger dot io. Um, subscribe to our mailing lists. And, uh, of
45:22
course, we welcome pull requests, issues, questions. Um, we are hoping to make this the next generation binary analysis tool. And, uh, we hope to work with you to do it. Um, anger is two years old now, uh, with almost 60,000 lines of code, about 6,000 commits. It's a big project and
45:43
we'd love to have you, uh, working on it with us. Any questions? All right. I guess no questions. Thank you guys.