Shellphish - Panel: Cyber Grand Shellphish
Formal Metadata
Title |
Title of Series |
Number of Parts | 93
Author |
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/36289 (DOI)
Publisher |
Release Date |
Language |
Content Metadata
Subject Area |
Genre |
Abstract |
DEF CON 24 | 22 / 93
00:00
BootingQuicksortRight angleGroup actionGodInformation securityCodePlanningMultiplication signSoftwareExistenceLevel (video gaming)Cybersex
01:31
InternetworkingVideo gamePoint (geometry)Universe (mathematics)HexagonArrow of timeRight angleHacker (term)
02:27
SimulationBlu-ray DiscComputer virusDepth of fieldOvalGradientGroup actionStudent's t-testGoodness of fitPoint (geometry)Multiplication signPresentation of a groupComputer animation
04:13
RobotARPANETCybersexExploit (computer security)CybersexPresentation of a groupSubsetMultiplication signSystem programmingARPANETRoboticsMathematical analysisRight angleComputer reservations systemAutomatic programmingSmartphoneComputer programmingComputer chess
06:48
Roundness (object)Entire functionSoftware developerWordComputer reservations systemMathematical analysisCartesian coordinate system2 (number)Software testingGraph (mathematics)Vulnerability (computing)EmailComplete metric spaceAbsolute valueCausalityImage registrationCASE <Informatik>Event horizonHacker (term)Point (geometry)CybersexMoving averageRight angleInheritance (object-oriented programming)Core dumpConnectivity (graph theory)Slide ruleExpert systemCodeTerm (mathematics)System programmingBinary codeForm (programming)Chord (peer-to-peer)
13:03
Binary codeHacker (term)Patch (Unix)Limit (category theory)Computing platformClassical physicsData structureMeeting/Interview
13:55
Random numberIntegrated development environmentModel theoryFlagRead-only memoryCrash (computing)File systemTrigonometric functionsType theoryEmulationPatch (Unix)CodeBitComputer programmingSystem callBinary codeFerry CorstenMathematicsInterpreter (computing)Computing platformState of matterInsertion lossCrash (computing)FlagData structureSemiconductor memoryFile systemComputer architectureFunctional (mathematics)CalculationExterior algebraExploit (computer security)Instance (computer science)Task (computing)System programmingMathematical analysisSoftware testingEndliche ModelltheorieMereologyRandomizationPowerPCComplex (psychology)Point (geometry)Cartesian coordinate systemSymbol tableOperator (mathematics)Fluid staticsOpcodeEvent horizonGame controllerRight angleComputer animation
18:42
System programmingPatch (Unix)Crash (computing)ComputerExploit (computer security)CybersexARPANETService (economics)Binary fileServer (computing)ImplementationEvent horizonCybersexEvent horizonPatch (Unix)Binary codeCrash (computing)Multiplication signRoundness (object)TheoryTerm (mathematics)Game theoryRight angleComputer programmingNeuroinformatikFlow separationVulnerability (computing)Maxima and minimaComputer animation
20:14
Binary fileExploit (computer security)Information securityPerformance appraisalEvent horizonLarge eddy simulationInformation securityBinary codeExploit (computer security)Context awarenessPatch (Unix)Binary multiplierMultiplicationValuation (algebra)Point (geometry)Overhead (computing)Computer animation
20:58
DatabasePoint (geometry)Multiplication signGroup actionHacker (term)Event horizonSlide ruleInformationFeedbackPatch (Unix)Dressing (medical)Engineering drawing
22:00
CodeProgrammschleifeComputer reservations systemGodCodeFlow separationFreezingConnectivity (graph theory)Multiplication signRight angleStandard deviationXMLSource code
22:39
Software testingJoystickComputer networkSystem programmingConnectivity (graph theory)CodeMultiplication signFreezingPoint (geometry)Computer reservations systemScheduling (computing)Decision theoryProcess (computing)ResultantMultilaterationSoftwareDatabaseSelf-organizationLecture/ConferenceComputer animationProgram flowchart
23:43
DatabaseRelational databaseModel theoryPatch (Unix)Scheduling (computing)Game theoryCoordinate systemOrganic computingSelf-organizationSinc functionSoftware frameworkGoodness of fitSoftware testingConnectivity (graph theory)Scheduling (computing)Process (computing)CodeMereologyExploit (computer security)DatabaseComputer animation
24:58
Scheduling (computing)Game theoryBinary fileSoftware developerMathematical analysisArchitectureFreewareOpen sourceArmPowerPCCommitment schemeProjective planeOpen sourceBinary codeMathematical analysisConnectivity (graph theory)Multiplication sign3 (number)CompilerGrass (card game)Computer animation
26:19
Vector potentialoutputMathematicsHausdorff dimensionProcess (computing)Order (biology)Connectivity (graph theory)Exploit (computer security)Mathematical analysisSystem callSheaf (mathematics)Presentation of a groupPatch (Unix)outputSoftware testingComputer programmingBinary codeSlide ruleCrash (computing)MereologyHacker (term)Strategy gameEmulatorCore dumpFuzzy logicProcess (computing)CASE <Informatik>Right angleComputer animation
28:50
Loop (music)Fuzzy logicInterior (topology)Hash functionMetropolitan area networkPoint (geometry)Slide ruleFlagPressureoutputFuzzy logicLatent heatArithmetic progressionRandomizationBinary codeComputer animation
29:29
outputSoftware testingState of matteroutputSymbol tableCASE <Informatik>ExpressionControl flow graphBoom (sailing)WeightSoftware testingPoint (geometry)Computer programmingCodeSlide ruleFuzzy logicVideo gameLoop (music)Electric generatorComputer animationDiagramProgram flowchart
30:51
Component-based software engineeringRead-only memoryConstraint (mathematics)Point (geometry)Address spaceSpywareExploit (computer security)State of matterOpen sourceVideo gameBuffer overflowPoint (geometry)State of matteroutputConstraint (mathematics)FlagComputer programmingMereologyPointer (computer programming)Object (grammar)Buffer solutionAddress spaceFunctional (mathematics)BlogCrash (computing)CodeConnectivity (graph theory)CASE <Informatik>Exploit (computer security)Game controllerXMLComputer animation
33:04
FlagWeb pageComputer reservations systemBlock (periodic table)Computer programmingSemiconductor memoryType theoryWeb pageSensitivity analysisCryptographyLevel (video gaming)Content (media)Point (geometry)FlagHookingGame theoryoutputConnectivity (graph theory)Transformation (genetics)Personal identification numberLeakQuicksortClassical physicsAddress spaceMultiplication signConstraint (mathematics)Exploit (computer security)
34:37
Patch (Unix)Pointer (computer programming)EncryptionFront and back endsBinary codeJoystickDisassemblerMultiplication signCybersexCrash (computing)Binary codeCodeElectric generatorPatch (Unix)Connectivity (graph theory)Interior (topology)Point (geometry)Address spaceOpen sourceBitAssembly languageSoftware bugVisualization (computer graphics)Open setStapeldateiSource codeComputer animation
36:14
FlagBackdoor (computing)Dependent and independent variablesSystem programmingEncryptionPointer (computer programming)Resource allocationShift operatorEvent horizonSystem callSystem programmingSlide rulePatch (Unix)Binary codeSource codePointer (computer programming)Software bugQuicksortCodeSound effectInstance (computer science)Type theoryVulnerability (computing)Product (business)Semiconductor memoryPower (physics)Natural languagePoint (geometry)Standard deviationEvent horizonCrash (computing)CountingMathematical analysisBackdoor (computing)Exploit (computer security)CausalityRight angleGeneric programming
38:07
Event horizonOvalPrototypePointer (computer programming)Binary codeFunctional (mathematics)Overhead (computing)Slide ruleEvent horizonSpeicheradresseSystem callMultiplication signSoftware bugMathematical optimizationAssembly languageComputer animation
38:54
Magneto-optical drivePrototypeBinary codeFluid staticsRewritingAssembly languageSlide ruleInjektivitätServer (computing)CodeElectronic mailing listSemiconductor memoryOpen sourceHypermediaLecture/ConferenceComputer animationDiagram
40:10
Continued fractionCartesian closed categorySet (mathematics)Exploit (computer security)CodeRead-only memoryoutputMilitary operationBuffer overflowStack (abstract data type)Ewe languageIntegerInformationQuery languageType theoryOpen sourceType theoryPasswordSequelVulnerability (computing)Set (mathematics)Electronic mailing listProjective planeServer (computing)CompilerMiniDiscReal numberGame theoryoutputVirtual machineMereologyInteractive televisionRobotFormal grammarHacker (term)Exploit (computer security)Touch typingOpen sourceNeuroinformatikMultiplication signComputer reservations systemMoving averageBinary codeRight angleLevel (video gaming)VideoconferencingSystem programmingField (computer science)Form (programming)Insertion lossBackdoor (computing)Self-organizationAddress spaceComputer programmingComputer wormCrash (computing)BitTotal S.A.Information securityComputer animation
46:38
Computer wormProjective planeSystem callBoom (sailing)Open sourceCompilation albumInformation securityTouch typingPublic-key cryptographyCodeFreewareHacker (term)PasswordHydraulic jumpEmailTwitterXMLMeeting/Interview
48:02
TwitterTouch typingComputer reservations systemFunctional (mathematics)Installation artProcedural programmingGroup actionEmailProcess (computing)Hydraulic jumpScheduling (computing)Block (periodic table)Roundness (object)Moore's lawPoint (geometry)Computer reservations systemLoginFuzzy logicWeightSymbol tableVulnerability (computing)Utility softwareCASE <Informatik>Drill commandsIdentifiabilityTwitterBinary codeSimulationFreewarePlanningInheritance (object-oriented programming)MereologyExclusive or
Transcript: English (auto-generated)
00:01
Everybody's like, okay, this is cool, but not that cool. Apart from John, who's totally cool. Okay, should we start? Let's start. All right, you're up. So I'm gonna boot this off, because I sort of started Shellphish, but I'm sort of reaping the benefit of it without
00:25
really doing anything. These guys are actually the brains behind it, and the guys that stayed up all night doing all the work. I'm just looking at them thinking, oh my god, I remember when I did that. Twenty-five years ago. Giovanni did a lot of high-level planning and sushi
00:43
delivery. Exactly. That's my role. Feed them. They will poop software. Okay. Cyber Grand Challenge. If you look at our code, that's actually true. So I'm gonna be very short on this. Shellphish was born out of the SecLab,
01:03
which is the security group at UC Santa Barbara. Every time you say UC, people say University of California; that's not right, that's Berkeley. UC Santa Monica, that does not exist. It's UC Santa Barbara. So get it right. SecLab is the group; that's where we come from. And the group is currently led by me and my colleague Christopher Kruegel.
01:27
We look very professional here, like professors, but we're actually hackers behind weird handles like everybody else. I never got the handle thing, but I needed one, and so if
01:41
you look up Zanardi on the internet, it's somebody with a gigantic nose and a [unintelligible]. Would you say Chris is your life partner? I think Christopher is my academic wife. So I have to take care of all his needs and his... I wish he would
02:04
be here. He would be super proud of everybody. But this is our university. Not bad. And that's why Shellphish is here. Our lab is exactly there, where the arrow points. We're right on the beach. We have a private beach, and that's why our tagline
02:20
is hacks on the beach. We're lucky that way. Might be back here. Is it back here? Yes, it is. All right. So how did it start? It started in 2004. I know, it's such a long time ago. It was me, but then I had a bunch of grad students, including Chris, and we evolved into a community, and in 2005 we actually won
02:49
the DEF CON CTF. Never won since then. Those were the good old days. And it's awesome, because they say the older you get, the more awesome you were. So I'm milking it for whatever I can. But we grew up, you know, and then suddenly Chris
03:06
moved to Vienna. Became a professor there. Recruited some more people. That became more people. They came back to Santa Barbara because it's awesome. Became more people. More students. More students. Even more students. And what happened is that some
03:23
people went to Boston. So we have a substantial presence in Boston, and we evolved as a group. At some point all our graduate students actually became professors as well, and so a
03:41
lot of, you know, UC people became professors all around the world. In London, at Arizona State University, at EURECOM in France. And right now Shellphish is a very big group of academic people all around the world doing interesting stuff. So right now
04:03
our group is pretty much this. We're very inclusive. You know, we foster research, and that's what we care about. And with this I'll hand my presentation baton to Yan. Thank you, Giovanni. So before we go on with the Cyber Grand Challenge itself, I'd
04:25
like to give a shout-out to all the other Shellphish members in the audience. So raise your hand if you're a Shellphish. Oh yeah, right there. Nice. Nice. Yeah, Shellphish is bigger
04:42
than just the CGC team. The CGC team is a strict subset, but we have a lot of people that were cheering us from the sidelines, even on the team. So let's talk about the Cyber Grand Challenge. DARPA has a history of grand challenges, right? You guys are probably
05:01
familiar with the self-driving car grand challenge and the robotics grand challenge, because they got a lot of press, similar to the Cyber Grand Challenge just now. And the idea behind these is that DARPA finds a fledgling technology, say self-driving cars, and they fund it with a lot of money, right? So there are prizes, million-dollar prizes, for
05:25
self-driving cars, and this motivated a lot of people to put a lot of research into it. At the time, and the time was 2006, when we didn't even have smartphones, people were of course saying: do you really think that someday you'll be sitting inside a computer and it'll be driving you around? That's absurd. And now we have
05:45
people driving themselves to the hospital while they're having a heart attack in their Tesla. And so, you know, this technology push really pays off. And it's probably gonna be the same with robotics. DARPA did the robotics grand challenge, and probably in 10
06:02
years we're all gonna be dead. And it's also gonna be the same with programs. So the Cyber Grand Challenge really pushed the frontier of automatic program analysis, exploitation, and defense. Right now it's in its infancy. I think we'll see how the
06:25
CRSs did at DEF CON CTF, but maybe they won't beat the best humans. But that's the beginning. The chess systems didn't beat the best humans, and the self-driving cars aren't gonna beat the best humans in races right now. But eventually they will, and eventually Mechanical Phish will kill us all, or hack us all while the actual robots kill
06:45
us. So that's the Cyber Grand Challenge. Let's talk about Shellphish's involvement in the Cyber Grand Challenge. As Giovanni said, Shellphish is a bunch of academics and hackers, right? So we're kind of hackademics. So at one point we decided to
07:04
shift our research interests at UCSB closer to binary analysis, right? We started looking into doing automated binary analysis and all of the things along with that: automatic vulnerability discovery and so forth. Completely independent of the Cyber
07:24
Grand Challenge. We started doing this sometime in 2013, and in late 2013 DARPA announced the Cyber Grand Challenge, right? So I have an email somewhere in my history saying: hey guys, check this out, this is this cool thing, maybe you should participate, because we're working on a lot of the same stuff. And everyone said: yeah, let's do it,
07:42
let's go for it. I said great, and then promptly forgot about it for like a year. Right? So the deadline for registration was in late 2014. I sent in the application literally 15 seconds before the deadline, because that's how we roll.
08:01
And they said: great, you're in, congratulations, let's see what you got. The first scored event is coming up in like 4 months. And so we were like, okay, cool. Oh no, like on the graph, it's in 1 month, right? So we said: cool, let's build a CRS, we're
08:21
gonna rock the scored event, the first kind of practice round; that was the term DARPA used for scored events. So in the first practice round we were gonna do super awesome, we were gonna kill it, and we totally forgot about it. The morning of the practice round I wake up and I'm like, shit, there's a practice round for the CGC stuff tonight. And so we started working on our CRS,
08:44
right? So the first commit to the CRS is 2 hours, maybe 3 hours, let's say, before the practice round begins, right? So we start writing our CRS, the practice round begins, we play the practice round with some janky-ass CRS that kind of half works. Cool. So
09:02
then we're like, all right, well, now we've started, we're gonna get it all super put together before the second practice round. The second practice round rolls around, and now we remember about it maybe 3 days before, right? So the second commit to the CRS happens 3 days before the second practice round. We build it up, build it up, build it
09:20
up, play in the second round, and say: okay, cool, now we have this cyber reasoning system that's kind of ready to play in the CQE if we keep working on it solidly until the qualifiers. And then of course we forget about it for another couple of months. And then 2 and a half weeks before the qualifiers we remember: hey, wait a second,
09:42
the qualifiers are coming up. So then we start working like crazy and not sleeping, 3 weeks of complete insanity until the Cyber Grand Challenge qualifiers, and we have a cyber reasoning system that we can field for the Cyber Grand Challenge qualifiers, and
10:00
we qualify with 3 weeks of absolute insanity. And so then we figured: cool, now, a, we're super rich, because the qualifiers came with 750 thousand dollars of prize money, and b, we can now spend a year working solidly, right, solidly, with test cases, code
10:22
freezes, milestones, lots of milestones, and absolutely, you know, continuous integration and test rounds and everything, for an entire freaking year. Agile development, that's the key word here. None of that happened. So for 9
10:45
months we used our money to fly around the world giving conference talks and saying how cool we are and how, you know, Fish is a Chinese martial arts expert, or wait, that was Kevin. Kevin's a Chinese martial arts expert, and you
11:02
know, Antonio's mysterious and all this shit, but really what we should have been doing is working on the CRS, right? And 3 months before the finals, 3 months ago, we realized this and we were like: crap, we should really write a CRS for real, right? Like, I mean, we should take what we had in quals and actually, you know,
11:24
extend it so it can win finals. So 3 months ago we started working like crazy, we stopped sleeping, right? I have a fiancée and I haven't seen her in 3 months, basically; that's, you know, the insanity. To the funding agencies that are listening: we're a lot
11:43
more responsible than it looks. Yeah. This is our hacker persona, right? We also have an academic persona where of course we have CI, come on, who doesn't have CI and code freezes, right? And we finish all our papers 2 weeks before they're due so that our
12:01
professors can go over them. Absolutely. This is the hacker Shellphish persona. All right, anyway, so we went crazy for 3 months. We got the final commit to the CRS in 30 minutes before the air gap was established. 30 minutes before the air gap was
12:22
established. All right, and it was a commit in one of the core components, so shit could go wrong. There's a slide for that. All right, I'm killing us. So we did it, we played the CGC, we got 3rd, and this is the team that we've already introduced. We're from all around
12:40
the world: Italy, Germany, the US, India, there was a guy qualifying with us who's from Senegal, Fish is from China. We're from all over the place, and we're very rich, because we got two 750-thousand-dollar prizes now. So that's kind of a brief intro to our
13:07
involvement in the CGC. I'll pass it off to Jacopo to introduce the CGC as a platform and what it means. Right, so let's thank Yan for a very, very true and very effective
13:21
introduction to Shellphish the hacker team, very distinct from academia, very distinct from Shellphish the academy. All right, so just very briefly: what does it mean to actually score well in the CGC? You go in blind, with binaries that you have never seen before. You have to analyze them in whatever way you want; there's no
13:42
limitation on how you do it. You have to own them, either by a crash or by leaking a secret, and you also have to patch them so that the other guys cannot do the same to you. And this is a classic CTF structure that has some modifications to
14:02
DECREE, in the DECREE operating system, to make it easier to model and easier to handle for a program. Okay, so one of the simplifications is that the architecture is Intel x86. All opcodes are legal, which can lead to interesting situations that we will see in a bit. Syscalls are simplified, much easier to
14:26
model: pretty much read and write, select, allocate and deallocate (like malloc and free), random, and obviously exit. A lot easier to model for a program. And the actual binaries are a lot more realistic, very real; they're not completely fake binaries.
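To illustrate why such a small syscall surface is easy for an analysis tool to model, here is a minimal Python sketch. The syscall names and numbers are an assumption based on the publicly documented DECREE interface as best recalled (they are not stated in the talk), the analogues are the loose ones the speaker gives, and `DecreeSyscall`, `SPOKEN_ANALOGUE` and `spoken_analogue` are hypothetical names that are not part of any Shellphish tool.

```python
from enum import IntEnum


class DecreeSyscall(IntEnum):
    """The seven DECREE syscalls (numbers are an assumption, not from the talk)."""
    TERMINATE = 1
    TRANSMIT = 2
    RECEIVE = 3
    FDWAIT = 4
    ALLOCATE = 5
    DEALLOCATE = 6
    RANDOM = 7


# Loose analogues as paraphrased by the speaker; these are not exact equivalents.
SPOKEN_ANALOGUE = {
    DecreeSyscall.TERMINATE: "exit",
    DecreeSyscall.TRANSMIT: "write",
    DecreeSyscall.RECEIVE: "read",
    DecreeSyscall.FDWAIT: "select",
    DecreeSyscall.ALLOCATE: "malloc",
    DecreeSyscall.DEALLOCATE: "free",
    DecreeSyscall.RANDOM: "random",
}


def spoken_analogue(number: int) -> str:
    """Return the familiar analogue for a syscall number, or 'unmodeled'."""
    try:
        return SPOKEN_ANALOGUE[DecreeSyscall(number)]
    except ValueError:
        # Any number outside the seven known calls has no model at all.
        return "unmodeled"
```

With only seven entries in the table, an analysis engine needs just seven environment models instead of the hundreds a full Linux kernel interface would require.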
14:49
So as a side note, the DEF CON CTF just finished, and the DEF CON CTF was also played on the same platform. So just as an example of how real and complex these binaries could be: one of the challenges in the DEF CON CTF was a PowerPC interpreter and JIT, which was
15:04
awful, and so there's a lot of room for complexity in these programs. And on the actual polling side (I don't know if some of you guys want to barge in), basically what it means is that there is no state. Every program runs once, there is no state,
15:27
it runs, you either own it or it's gonna do its thing. There's no state, there's no file system to modify. This is a lot easier to model. For the qualifications, and only for the qualifications, it was enough just to crash the program.
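As a rough sketch of what "it was enough to crash the program" can look like from a test harness, the Python snippet below runs a target once in a fresh process and reports whether it died from a segfault or an illegal instruction. It assumes a POSIX system, where `subprocess` reports death by signal as a negative return code; the helper name `crashed` and the choice of signals are illustrative assumptions, not the scoring logic DARPA used.

```python
import signal
import subprocess
import sys

# Signals treated as a "crash" in this sketch: segfault and illegal instruction.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGILL}


def crashed(argv, stdin_data=b"", timeout=5.0):
    """Run argv once with stdin_data and report whether it died from a crash signal.

    Every call starts a brand-new process, so no state carries over between runs.
    """
    proc = subprocess.run(
        argv,
        input=stdin_data,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    # On POSIX, a negative return code -N means the child was killed by signal N.
    return proc.returncode < 0 and -proc.returncode in CRASH_SIGNALS


# Stand-in "binaries" for demonstration: one raises SIGSEGV on itself, one exits cleanly.
SELF_SEGV = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
CLEAN_EXIT = [sys.executable, "-c", "pass"]
```

Calling `crashed(SELF_SEGV)` and `crashed(CLEAN_EXIT)` exercises both outcomes; a real harness would pass the challenge binary and a candidate input instead of these stand-ins.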
15:42
Segfault, illegal instruction: you get the points, you have owned the binary. For the finals things are a lot more nuanced, and the actual exploitation, as we will see, is a lot more complicated, and it's a very interesting application of how to use symbolic execution and static analysis. But as a basic idea, the two ways you do it are
16:06
either via a controlled crash, in which you show that you can not only crash the program in some place, but crash it at a place that the DARPA API tells you: please crash the program in this place and set this register to this
16:23
value. If you can do that, you have verified that you actually have control of the program. Or, alternatively, you can leak a secret flag from memory. And on the patching side, just a brief note on how this API is designed so that it does not become
16:43
too easy. For instance, we can submit patches to the binary, okay? So what is preventing us from just submitting a binary that just exits? This program obviously never crashes, but it also does not do anything useful. So the way this is prevented is that there are functionality checks on whether the program maintains its
17:04
benign function: if the program is a math calculator, it still needs to be able to do all the math operations that it could do normally. And similarly, there is no signal handling, so there's no way to just hide away all the segfaults; if you segfault, you are crashing. And finally, what
17:25
would prevent us from putting in an interpreter that runs everything and checks before every possible instruction: am I gonna crash? Am I gonna crash? Obviously you will never crash. And the way this is prevented by DARPA is that you can actually do it if you want, but you're gonna pay a performance price. You're gonna lose points for
17:45
performance. This is, believe me, not as easy as it sounds. Understanding exactly how your patch is performing is definitely not an easy task. Many of us looked into it; I looked into it a bit, Antonio looked into it a bit. It's definitely pretty hard. And then we gave
18:02
up testing performance; we just said: this is our patch, deal with it. Yes, yes, that's very true, and, you know, informally we know other teams also had trouble. But I think no one knows better than Aravind how much of a pain it can be to actually test the performance and the functionality of binaries. So big props to
18:25
Aravind for actually pushing through this task and making it work, and this helped us a lot during our own internal testing, even if it did not go into the live part. And are we handing over now? All right, so the CQE, the qualifying event, was
18:58
not the full Cyber Grand Challenge. You needed to patch
19:04
binaries and you needed to crash binaries. You didn't need to exploit anything, you just needed to crash it. In the final event you need to patch binaries, crash binaries to find where the vulnerabilities are, and then exploit those vulnerabilities. And on top of that, it wasn't
19:23
just a simple game or a simple programming challenge where you got a binary and you crashed it. It was a game, so you had to have a game-theoretic aspect, played against other actual competitors, right? Similar to a human CTF, but all with computers. So the
19:42
competition was actually divided into 96 rounds, and that wasn't predetermined; it was, you know, however many rounds they got through in a day. There was a minimum time per round, and that ended up being 96. And there was a bunch of challenge binaries that they
20:02
provided to the teams to hack. And for each round, the team would have a separate round score, which, when aggregated, would be their total score for the game. The score was calculated as a multiplication of the team's availability score, which means how much did they fuck up the binary and how fast the binary still was, right, how much
20:25
overhead the patches had, which is something Jacopo alluded to; the security score, which is how exploitable the binaries still were, or whether they were still exploitable; and the evaluation score, which means did the team find an exploit for this binary. So it
20:42
was very easy to screw yourself in this context, because they're all multipliers: if you completely break the binary, even if you have perfect offense, even if you find all of the exploits for this binary, you still get zero points, because you broke the binary. In developing for this competition we ran into a lot of organizational things, as I
21:08
alluded to earlier. We started super late, so for example, up until a depressingly short time ago, this was our database. All right. After all, this is a research group run by an
21:20
Italian. Again, this is our hacker persona. So we actually had to do a join on this database at one point. When we got the real database up, we were joining between the paper database and the actual database. This is relevant because it's about our performance scores. This is the database of the performance scores we were trying to analyze. That's how
21:40
it's relevant to the previous slide. Specifically, this database contains the feedback from some practice sessions for the final event. These are what DARPA called sparring partner sessions. We wrote them down, and then we had to join them with the real database to get the actual information that we needed to tune our patches. We also
22:02
tried to go into code freeze several times. So at 4:01 pm on some godforsaken day we froze a component of our CRS called Farnsworth. And very shortly thereafter, this is the commit log. Right, so the code freeze didn't work very well. There are
22:22
commits such as this gem here. So that's Francesco here. You know, beautiful, beautiful code. This commit was okay, actually. He just has very high standards. Actually, it was probably crap. And then of course this is a long time
22:43
into our code freeze. Twelve, fifteen hours before our nodes were shut down a couple of days ago, we were still changing very core components of the system. That's me upside down. I was at this point no longer sane. So our CRS consisted of a lot of components. Right? We
23:05
had a central database that we called Farnsworth for some reason, which stored all of the data that we got from the Cyber Grand Challenge API through a
23:21
component that we'll talk about later. It stored network captures. It stored the scheduling decisions of what jobs to run, and then it stored the results of those jobs. So now we're gonna go one by one into all of these components, probably pretty
23:41
quickly. We have fifteen minutes left. And we'll start with the organization, or the framework. So let's start with Francesco and Kevin. So obviously coordination is very important if you're running a cluster of sixty-four nodes. And of course, since
24:01
we needed to do that, we essentially came up with using one database to store all the ground truth that we have. As a bunch of you probably know, this is from Futurama. So we just went with Farnsworth, because, well: good news, everyone. And it's the only component that we actually tested fairly well, at
24:20
sixty-nine percent test coverage. I think the rest probably hovers around one percent. Zero? Oh, perfect. Even better. Who needs testing anyway, right? I mean, I think... angr has at least fifteen percent code coverage. I think Francesco probably disagrees, but eh, who cares. Then on top of that we essentially had Meister, which, for the
24:42
Germans, you know, is essentially just "master", which looks at scheduling jobs and deciding what jobs we want to run, what part of our pipeline we want to run: exploits, patching, whether we want to run AFL, these kinds of things. We scheduled them based on priority, and this was obviously, sorry, the last component that we actually changed,
25:00
with the last commit being, I guess, two hours and eighteen minutes before the actual deadline. So yeah, this was at 12:42, and the deadline, the actual shutdown, was at 3 pm. But we made a commit. I think we rolled that commit back thirty minutes before the deadline. Yeah, there were a bunch of commits at like 2 pm, but we
25:21
actually reverted them and cleaned up the history, just to make sure that they're actually not there, because they caused a bunch of failures on our side. Anyway, we would also like to give a big shout-out to the open source components that we rely on. Among them are Python and the Microsoft Research Z3 solver; all of our things run inside of Docker containers, which are
25:40
running Ubuntu with PyPy. We're also using Kubernetes, QEMU, peewee, VEX, Postgres, and obviously angr, which I'm sure a bunch of people are going to talk about now. And I think that's probably Yan, possibly Salls, Andrew, John I guess, and Pizza. Yeah, go ahead. I want to say something. I agree with everything he said. angr is the open
26:08
source binary analysis project that we have in the SecLab. It's really, really cool. It's been open source for like a year now. We released it at DEF CON last year, right? Yeah. It does everything. It's cool. No time. It's very cool. That's our
26:21
logo. It's Creative Commons. In order to do the actual exploitation and analysis pipeline, we split it up into a whole bunch of components and rearranged them into these weirdo things. Like, we use concolic execution in order to do some basic analysis of what can go where. There's automatic exploitation and patching, which will all be talked about. I think they've all got their sections in this presentation. So
26:41
there's crashes. I think you can slow down a little. Fine. Who wants to? Sorry. So who wants to talk about crashing? Crashing. Guys, we haven't been sleeping for three days, so check it out. I'm sorry if you're friends with me. To all the
27:01
funding agencies: we're not doing drugs or alcohol. It looks like it, but we're not. I'm not doing drugs. Nick, talk about it. You see how prepared we were for this huge DEF CON talk.
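Going back to the scheduling component described earlier in the talk, the idea of "schedule jobs based on priority" on a fixed number of nodes can be sketched as below. This is only an illustration: the class name, job names, and priority numbers are invented for the example and are not taken from the actual Meister code.

```python
import heapq
import itertools


class PriorityScheduler:
    """Toy priority scheduler: highest priority first, limited by node capacity."""

    def __init__(self, capacity):
        self.capacity = capacity           # how many jobs fit at once (hypothetical)
        self._heap = []                    # min-heap of (-priority, order, name)
        self._order = itertools.count()    # tie-breaker keeps submission order

    def submit(self, name, priority):
        # Negate the priority because heapq pops the smallest item first.
        heapq.heappush(self._heap, (-priority, next(self._order), name))

    def schedule(self):
        """Pop up to `capacity` jobs, most important first."""
        running = []
        while self._heap and len(running) < self.capacity:
            _, _, name = heapq.heappop(self._heap)
            running.append(name)
        return running


sched = PriorityScheduler(capacity=2)
sched.submit("fuzzing", priority=50)        # made-up job names and priorities
sched.submit("exploitation", priority=90)
sched.submit("patching", priority=70)
first_batch = sched.schedule()
second_batch = sched.schedule()
```

With only two slots, the two highest-priority jobs run first and the remaining job waits for the next scheduling pass.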
27:23
Hello? So, crashing. Our exploitation strategy is: we find crashes and we turn them into exploits. Pretty incredible. So actually, like a lot of teams, the thing we do the most is fuzzing, and this is what generates lots of test cases, lots of
27:43
crashes. The majority of crashes, but not entirely all the goodies we find. So we use AFL as our core fuzzing component. I'll explain how AFL works, like these slides do, I suppose. It essentially begins by generating lots of inputs which
28:03
attempt to explore different parts of the program. The inputs are basically random. Some of them are more or less educated guesses, and how well these inputs do in exploring the program is tracked by instrumentation, which is compiled into the binary or provided by an emulator like QEMU. So let's see, did I go over all
28:26
these? So AFL does a great job of doing this. We've modified it slightly to work better on CGC binaries. So we have a couple of hacks, which I think we'll be open sourcing, which make it perfect for CGC, or at least a lot better. Okay. The uncrasher,
28:46
I don't think that actually exists. I don't think there's an uncrasher, man. The points of flag and all this shit. Yes, so what. It's like karaoke slides. Right, I already mentioned this, right? AFL, it's great. This is how flag
29:05
fuzzing works. Random stuff gets put into the binary. Yep. Same input all over again. Eventually it comes up with a random thing that works. This is much harder for a fuzzer: we have to generate a very specific input. Fuzzing will have no luck with this. It continues to lose. It makes absolutely no progress. If you guys feel
29:26
like you can't keep up with Mike Pizza, I feel like that very frequently. Okay, so angr, on the other hand, is a symbolic execution engine. It's slower and more heavyweight, but it's great at finding very specific cases like the one we just described. And the
29:41
way this works is by generating these states following different paths. As you can see here in the control flow graph, we have different states which are being followed. Eventually there is a state which will satisfy the "you win" expression, and we talk to Z3, we ask it to generate an input which gives us that state, and boom. So what we tried to do is combine both AFL and angr. And this is called
30:03
Driller. Driller begins by fuzzing. It gets basic code coverage of the program the way you would expect AFL to, and maybe gets a couple of test cases, in this example X and Y. We get the cheap coverage. Next slide. Okay. Then we take those test cases and we trace all of them with angr. So we make the input completely concrete, almost. We
30:23
actually keep it symbolic, but we constrain it to be this concrete input that AFL generated. And we see, at any point in the program, if we could have taken a different path which AFL failed to take. If we could have taken that path, we talk to Z3, or angr more specifically, and we say: give me an input which satisfies this new path. In this case we get the CGC magic. And a new test case is generated, and now we continue the loop and
30:47
we feed this back into AFL, which continues to mutate that further and fuzz, and it goes on and on, and we continue to get more code coverage. And then we play video games. Alright, so this next part is the auto-exploitation: how we go from a crash which is
31:01
generated by AFL and Driller to actually an exploit for the CGC, which scores us a flag. Alright, so in this example there's a buffer overflow inside the heap, inside this malloc'd object here. And when you overflow this buffer you actually control the function pointer. And so we're inputting, inputting, inputting
31:25
symbolic bytes, and eventually we control the buffer, the symbolic address. We're gonna call to an address we control. And so to exploit this we use angr. We trace the input using angr and check first that the IP is symbolic. The PC here, we say, is
31:44
the state: does the state have a symbolic PC? At that point we know it's probably exploitable. We can control where we're gonna jump to. And so let's set the buffer to contain our shellcode. We ask Z3 to give us input where the buffer contains shellcode, and then we jump to the buffer. And that'll give us an exploit. And to do this we
32:05
synthesize the input; in angr that's just a call to state.posix.dumps(0). So in the CGC this is discovered by taking a crashing input and tracing that with angr. So, keeping all the input that AFL created symbolic and then following the path that it took until we have our crashing symbolic state. So keep in mind this is very simplified. We
32:27
have a bunch more techniques that handle the harder cases and that can take a not-so-good crash and turn it into a better crash, and you can find all of those when we do our open source release and when we release more details and papers later. And in the
32:41
open source release this component is called Rex. If you're interested in auto-exploitation, check that out. Alright, so then the steps again: we create a vulnerable symbolic state where we control the PC. We add the constraints to set the shellcode and to set the program counter to point to the shellcode, and then we
33:02
synthesize the input, and that creates our exploit. Okay, so for this component we'll be talking about auto-exploitation of flag leaks. So if you didn't know, there are two types of exploits you can generate in the CGC. The type one is sort of classic memory corruption: show that you can control the program counter, and show that you can also
33:21
control a general purpose register. However, there's another type, called the type two, very creative, which shows that you can leak arbitrary memory from the program. So in the CGC there's actually sensitive data that's mapped at a special address in every single binary, and if you can leak content from this page in
33:41
memory, you score points. Like Heartbleed, for example: there was a Heartbleed challenge in this game where the premise was leaking this data from this flag page, the sensitive data. So the way we do this in a fast way is we actually use the
34:01
Unicorn engine, which angr integrates, to make the entire input completely concrete. The only thing which is symbolic during the flag leak detection is the flag page itself. So we trace the entire program and execute very fast, because everything is being concretely emulated by QEMU with Unicorn, and we can detect, at transmit, because we hook it
34:24
with angr, when the flag page is actually being emitted, and then we can see exactly which transformations are done to this flag page. You can tell if it's been XORed or if some complicated constraints have been applied. For example, this actually solved a DEF CON CTF challenge, which, okay, I still don't have enough time to talk about
34:41
that, but we solved a DEF CON CTF challenge this way. We'll talk about it a little more later. You have seven minutes. So of course one of the challenges was to patch these binaries, so we had a component called Patcherex that was going from an unpatched binary to a patched binary. So the
35:00
general idea is we have patching techniques, such as "let's encrypt the return address," and these patching techniques generate patches such as "let's add this code here, let's add this data there," and these patches were injected into the binary. We had three different ways: the first one was slower but more reliable, and the last one was faster but a little bit less reliable, and Fish is probably gonna
35:25
talk about the reassembler. And so we had adversarial patches that were designed to make our patched binary not analyzable by others, and this is one of them that is pretty cool. This is QEMU detection: if you run this
35:42
code in QEMU, QEMU i386, it'll hang forever. Well, not really forever: as long as it takes to increment a 64-bit int two to the 64 times, which is basically forever. And we actually owned the Cyber Grand Challenge visualization infrastructure with this. They're apparently using QEMU for instruction tracing, and so at one point during the CGC we
36:02
noticed that their instruction tracing had just stopped, and it stopped right on this code, which was designed to detect QEMU and crash, well, not crash but hang forever. This is a zero day, take a picture. We have a lot of open source bug fixes to contribute starting now. So there were all sorts of other adversarial patches, so to
36:21
speak. For instance, our binaries were starting by transmitting the flag out, but they were transmitting to stderr, so that this could probably confuse an analysis system, which could misidentify this as a type 2 vulnerability. We also
36:41
have a backdoor, so that if some team was using our patch in their submission we could actually exploit that. I'm not sure if the backdoor worked during the CGC, but for sure it worked during the CTF. Yeah, how many teams? During DEF CON. How many teams fielded
37:01
our backdoor? I know that a lot of teams used our backdoor during DEF CON. Can you name names? I'm sure it was 3 teams that fielded our backdoor at the CTF. During CGC? CTF. Okay, cool. So then we also had sort of generic patches; these are more standard academic things such as protecting the return pointer, protecting data code, and
37:24
when we release this code you will see all these more standard techniques. And then targeted patches, so the general idea... oh, you can speak about something. So, targeted patches, right. So, qualification events. No, we just wanted
37:42
to avoid crashes, right, 'cause anything that crashes counts as an exploit. So, you know, using a weird quirk of one of the syscalls, we checked to see if memory was readable at a certain point if it wasn't recrashed. So I would like to take specific
38:05
credit for our, back one slide, for our targeted patches in the final event, which were exactly nil, and it worked great, so what can I say? And one note: no functionality overhead. I thought it was a bug in the slides. No, no, that wasn't it. And
38:22
one cool thing about this: we thought we were cool finding these weird syscall tricks to detect memory locations, but actually, when we analyzed qualification binaries from other teams when they were released, we found at least one other team was using exactly the same trick. So you're saying they were both cool? Yeah, we were both cool. Yeah. Okay. So we are running out of time, so, well, the
38:48
only thing I want to say is angr is awesome. I spent three days writing the reassembler and another three days writing the optimizer. So it works out. So what is a reassembler? Just real quick. The reassembler is a static binary rewriter that's
39:00
basically, okay, we'll talk about it later. No, no, no. Okay. Alright, we had a breakdown from our, I think one of our slide guys is, okay, it's fine. The reassembler is awesome. Fish wrote a binary rewriter where you can inject code
39:24
into binaries and it'll seamlessly reassemble the binary to include that code. Check it out in the open source release. You go. Now, there's nothing much to say. Basically, they gave us sixty-four powerful servers. Wait, how many servers? Sixty-four.
39:43
Sixty-four? Sixty-four? Holy shit. Not thirty, sixty-four. So we tried to maximize the usage of these nodes, and yeah, we kinda did it with the CPUs at least. Not the memory, but that's it. That's it. So, the sixty-four servers: we had a lot of
40:05
media attention over the CGC, and what got people excited the most, strangely enough, is the fact that we had sixty-four servers all to ourselves. Incredible. Anyways, so we implemented all these systems in a breakneck, like, three months,
40:28
and we pushed as hard as we could, we got it all running, we made commits at the last second, and we played the game, or rather our baby played the game. She walked on her own.
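As a recap of the Driller loop described earlier (fuzz for cheap coverage, then ask a solver for inputs that take branches the fuzzer could not reach, then feed those back to the fuzzer), here is a toy sketch. Everything in it is an assumption made for illustration: the target function, the `MAGIC` value, and the crash condition are invented, and `solve_missed_branch` is a hand-written stand-in for what angr and Z3 do in the real system, not a use of their APIs.

```python
import random

MAGIC = b"CGC!"  # made-up 4-byte check standing in for a "very specific input"


def target(data):
    """Toy target: return the set of branch labels this input reaches."""
    cov = {"entry"}
    if len(data) >= 4:
        cov.add("long_enough")
        if data[:4] == MAGIC:
            cov.add("magic_ok")
            if len(data) >= 6:
                cov.add("crash")
    return cov


def fuzz(corpus, seen, iterations, rng):
    """Cheap phase: random one-byte mutations; keep inputs with new coverage."""
    for _ in range(iterations):
        child = bytearray(rng.choice(corpus))
        if rng.random() < 0.5:
            child[rng.randrange(len(child))] = rng.randrange(256)  # overwrite a byte
        else:
            child.append(rng.randrange(256))                       # grow the input
        cov = target(bytes(child))
        if not cov <= seen:
            seen |= cov
            corpus.append(bytes(child))


def solve_missed_branch(data):
    """Stand-in for the symbolic step: produce an input that passes the magic
    comparison this input failed. A real system would get this from a solver."""
    if len(data) >= 4 and data[:4] != MAGIC:
        return MAGIC + data[4:]
    return None


def driller(rounds=3, seed=0):
    rng = random.Random(seed)
    corpus = [b"hello"]
    seen = set(target(corpus[0]))
    for _ in range(rounds):
        fuzz(corpus, seen, 200, rng)              # get the cheap coverage
        for data in list(corpus):                 # then "drill" stuck inputs
            new = solve_missed_branch(data)
            if new is not None and not target(new) <= seen:
                seen |= target(new)
                corpus.append(new)                # fed back to the fuzzer next round
    return seen, corpus


coverage, final_corpus = driller()
```

In this toy, single-byte mutation alone never gets past the four-byte comparison, while one solver-produced input lets the fuzzer reach the remaining branch on the next round.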
40:40
We walked into the room and they told us, hey, your guys' bot started up and it's doing a lot of disk IO, and we freaking lost it, because up until then we thought, you know, it's gonna turn on and something will fail and it'll all crap itself. So this was incredible. And then we got third place. Top three is amazing
41:06
for us, guys. I can't tell you how incredible it is to have been part of this comp, and we're going on. It was incredible. Since we played in the CTF, we didn't really get much of a chance to actually look at the data. However, we briefly
41:25
looked at it, so in total there were eighty-two challenge sets fielded. At least our bot saw only eighty-two, so if more had been fielded we might have actually missed them. In total Mechanical Phish generated about two thousand four hundred fifty exploits. We generated a total of one thousand seven hundred exploits for fourteen out of the
41:42
eighty-two challenge sets. All of them have a hundred percent reliability in so far as scoring, like always leaking or crashing at a specific address. Did you check how many were, like, mostly reliable? I did not, so it seems that we only got fourteen out of eighty-two challenge sets. We do not know how many
42:04
GrammaTech with TECHx and Xandra got, or Mayhem with ForAllSecure. The rumors are that we have top exploitation but we didn't have the best game theory. So, like always, our SLA sucks. Our SLA is shit. And yeah, so in total, can you back up one slide? These
42:24
are essentially the ones we actually generated exploits for. Actually, I should say the caveat to those rumors is Mayhem was only up half the game, and I think they still got almost as many exploits. Yeah. Yeah. And so we got two of the rematch challenges, so two of the historical challenges that DARPA introduced. One of them
42:41
was SQL Slammer, which I think two other teams also got, but don't quote me on that. And then there was also crackaddr, which supposedly only we got right. And then in total, if you look at the different challenges that we had and the vulnerabilities that were in there, this is the list of challenge sets that we got. And
43:01
with that from all of us thank you for the attention. So real quick let's talk about the next steps. Real quick. The next steps beyond automated hacking is machine
43:21
augmenting human intelligence. So in DEF CON CTF we hooked up our CRS. Mayhem, as the winner, played completely autonomously. We played with our CRS, so I mentioned already that the CRS actually pwned one binary without us even realizing it. It
43:41
actually assisted us with five of the exploits. There were five exploits for which, either after we provided the crash or after we just provided interaction, it created an exploit for us. And our CRS inserts backdoors into every binary that it patches. And so you
44:00
might have heard already that a lot of teams actually used our backdoor. This all sounds awesome, but we didn't even come close to winning. Yes. We almost got close to last. And let's turn down the bragging. That's right. Just a tiny bit. The CRS did amazing, but there were some issues. Like, for example, the DEF CON organizers had to implement a
44:22
separate API for the infrastructure than DARPA did, right, because the DARPA API had to be secret so that, you know, everyone was on an even playing field. And so there were some API incompatibilities, and computers are very brittle, and so these API incompatibilities screwed us until the very last day. So the last day I feel we had a good
44:42
showing. Up until then the CRS kept crashing, the CRS kept getting invalid data. It was kind of touch and go. So, as you might have heard, we're going to open source everything. We're gonna do... Thank you. We're gonna do a full open source vomit
45:06
because we believe in raising the playing field for everybody. So the next time a CGC rolls around, we expect all of you to play as well. Hopefully using our stuff. So
45:23
we don't have it all ready right now to push to GitHub because we were playing the CTF. We thought we'd have time, but we don't. But Chris, do you think we can do a symbolic open sourcing of angrop? Alright, let's do it. Right on stage. I'm gonna unplug
45:43
the video, Kevin, so Chris isn't logging in. Unless, I mean, just don't type your password into the wrong field. I've seen that before at DEF CON. It was incredible. It was someone fairly famous too. Ah, there we go. Better safe than sorry. I think their
46:08
password was star star star star star star star star. I enabled logging before. Pshh. Chow chow four is what Giovanni says. I think that's his password though. Alright so
46:23
we're gonna plug it back in while we desperately try to find the settings of the open source project. So angrop is our ROP compiler. So if you're tired of writing return-oriented programming payloads by hand, you can, wait, hold on, let me explain what it is. You
46:42
can use angrop, which uses angr to compile ROP payloads into whatever you want. So you say, actually, just read this memory or execute this syscall, and it figures out the ROP payload that it needs to generate. Chris wrote it. He's an amazing guy and it's an amazing project, and here it is being open sourced for the world. Boom. The rest of the
47:13
code we need to scrub free of uh private keys because there are so depressingly many um
47:20
and other depressing things, and then we'll push it out this week. Alright, so let's go this week. Also, if you find a private key that we haven't scrubbed, can you please gently let us know instead of destroying our infrastructure. We will appreciate it. We're hackers. Hackers have some of the worst security in the world, and my
47:43
password is six characters long just to give you an idea. Alright Kevin how do I get back to our uh thing? But I think we're done basically. Thank you guys.
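To give a rough sense of what a ROP compiler like the one described above automates, here is a toy sketch that turns a request such as "execute this syscall" into a chain of gadget addresses and values. It is not angrop's API. The gadget table, the addresses, and the register convention (a Linux-style i386 layout with the syscall number in `eax`) are all assumptions invented for the example; a real tool finds and verifies gadgets in the binary itself.

```python
import struct

# Hypothetical gadget addresses; each "pop REG" gadget is assumed to end in "ret".
GADGETS = {
    "pop eax": 0x08048111,
    "pop ebx": 0x08048222,
    "pop ecx": 0x08048333,
    "int 0x80": 0x08048444,
}


def set_regs(**regs):
    """Chain fragment that loads each register: gadget address, then its value."""
    chain = []
    for reg, value in regs.items():
        key = "pop " + reg
        if key not in GADGETS:
            raise ValueError("no gadget to set " + reg)
        chain += [GADGETS[key], value]
    return chain


def do_syscall(number, *args):
    """Chain that sets up the registers and then hits the syscall gadget."""
    arg_regs = ["ebx", "ecx"]
    if len(args) > len(arg_regs):
        raise ValueError("this toy only knows two argument registers")
    regs = {"eax": number}
    regs.update(zip(arg_regs, args))
    return set_regs(**regs) + [GADGETS["int 0x80"]]


def pack(chain):
    """Serialize the chain as little-endian 32-bit words, as it would sit on the stack."""
    return b"".join(struct.pack("<I", word) for word in chain)


chain = do_syscall(1, 0)
payload = pack(chain)
```

The caller states the goal (a syscall number and its arguments) and gets back the bytes to place on the stack, which is the part that is tedious to do by hand.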
48:01
So stay in touch. Hit us up on Twitter, by email. Jump on our IRC channel. You can chat with us about our CRS at shellphish CRS on freenode. I'm the only one there right now. Super exclusive. Or about angr at angr on freenode. Questions? Are there any actual questions? Yeah, hi, congratulations. Thank you. On your work. So in your
48:30
Driller paper you had said that the fuzzing was mostly responsible for 68 of the binaries, whereas having the symbolic execution based fuzzing only let you find
48:47
vulnerabilities in 11 more than that. So why is that still the case, or is the symbolic execution more effective than fuzzing now? You wanna talk about Driller 3.0? Sure. So one thing we've done to
49:08
actually improve Driller, especially on CGC binaries, is to identify functions and install SimProcedures in their place. So what this means is that a lot of basic block transitions which are hard for, or uninteresting for, symbolic execution to
49:24
solve are more interesting when we have a SimProcedure. We can talk about it more if you wanna come up here. Mike? Oh, last question? Okay. Well, congrats, guys, first. Thank you. Second, I wanted to know how compute-bound you felt you were. Did you get
49:41
enough compute power, too little, too much? Would you put something else in there? Backplane, RAM? What'd you think? So at this point we don't actually know, because we haven't gotten a chance to actually look through all of the logs. We had some problems in the very beginning, so actually on Wednesday still, to get all of our Kubernetes pods scheduled, simply because Kubernetes was not catching up. We kind
50:04
of solved that, but at this point we don't really know what the status is in so far as the utilization of all the nodes. From watching the power consumption, from the way that it dropped off, it seemed that it had a lot of unnecessary jobs that would be scheduled later, so I think we could have used a little less even, and it was
50:26
still yeah we could have probably used 32 nodes and done about the same. But the more the merrier especially if we can schedule more jobs. We definitely had jobs to schedule that we couldn't schedule because of delays in Kubernetes. Cool thanks. Alright
50:43
thank you, and thank you for organizing this thing. Please give the Shellphish team a huge round of applause. What they've accomplished is immense. Thank you guys. It was a dream come true to be here. Yes.