No source, no problem! High speed binary fuzzing
Formal Metadata
Title | No source, no problem! High speed binary fuzzing
Number of parts | 254
License | CC Attribution 4.0 International: You may use the work or its content for any legal purpose, modify it, and copy, distribute, and make it publicly available in unmodified or modified form, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/53145 (DOI)
36C3: Resource Exhaustion (101 / 254)
Transcript: English (automatically generated)
00:22
high-speed binary fuzzing. We have two researchers presenting the product of their latest work, a framework for static binary rewriting. Our speakers are, first, a computer science master's student at EPFL, and second, a security researcher and assistant professor at EPFL. Please give a big round of applause to Nspace and
00:44
gannimo. Thanks for the introduction. It's a pleasure to be here, as always. We're going to talk about different ways to speed up your fuzzing, to find different kinds of vulnerabilities, and to tweak your binaries in somewhat
01:03
unintended ways. I'm Mathias Payer, or I go by gannimo on Twitter, and I'm an assistant professor at EPFL working on different forms of software security: fuzzing, sanitization, but also different kinds of mitigations. Matteo over there is working on his master's thesis on different forms of
01:23
binary rewriting for the kernel, and today we're going to take you on a journey of how to develop very fast and very efficient binary rewriting mechanisms that let you make unintended modifications to binaries and explore different kinds of unintended features
01:42
in binaries. So, about this talk: what we discovered, and the reason we set out on this journey, is that fuzzing binaries is really, really hard. There are very few tools in user space, it's extremely hard to set up, and it's extremely hard to set up in a performant way. The setup is
02:01
complex, you have to compile different tools, you have to modify them, and the results are not really that satisfactory. As soon as you move to the kernel, fuzzing binaries is even harder. There's no tooling whatsoever, very few people actually work with or modify binary code in the kernel, and it's just a nightmare to work with. So what we are
02:22
presenting today is a new approach that allows you to instrument any form of binary code based on static rewriting, which gives you full native performance. You only pay for the instrumentation that you add, and you can do very heavyweight transformations on top of it. The picture:
02:41
if you look at a modern system, let's say you're looking at cat pictures in your browser, then Chrome plus the kernel plus the libc plus the graphical user interface clock in at about a hundred million lines of code. Instrumenting all of this for some form of security analysis is a nightmare. Especially along
03:03
this large stack of software there are quite a few different compilers involved, different linkers, it may be compiled on a different system with different settings, and so on, and getting your instrumentation across all of this is pretty much impossible and extremely hard to work with. We want to enable you to select those different parts that you're
03:24
actually interested in, modify those, and then focus your fuzzing or analysis on those small subsets of the code, giving you a much better and stronger capability to test those parts of the system that you're really, really interested in. Who's worked on fuzzing
03:42
before? Quick show of hands. Wow, that's a bunch of you. Do you use AFL? Yeah, most of you use AFL. libFuzzer? Cool, about 10 to 15 percent libFuzzer, 30 percent fuzzing with AFL. So there's quite good knowledge of fuzzing here, so I'm not going to
04:01
spend too much time on it, but for those who haven't run their own fuzzing campaigns yet: it's a very simple software testing technique. You run the target in some form of execution environment, and fuzzing then consists of some form of input generation that creates new test cases, throws them at
04:23
your program, and checks what happens to it. Either everything is okay, your code is executed, the program terminates, everything is fine, or you get a bug report. If you have a bug report, you can use it to find the vulnerability,
04:40
maybe develop a POC, and then come up with some form of exploit, patch, or anything else, right? So this is pretty much fuzzing in a nutshell. How do you get fuzzing to be effective? How can you cover large source bases, complex code, and complex environments? Well, there are a couple of simple steps that you can take, so let's walk quickly through the
05:04
effective fuzzing 101. First, you want to be able to create test cases that actually trigger bugs, and this is a very complicated part: we need some notion of the inputs that a program accepts, and
05:22
we need some notion of how we can explore different parts of the program, different parts of its functionality. On one hand we could have a developer write all the test cases by hand, but this would be kind of boring and would require a lot of human effort in creating all these different inputs. So coverage-guided fuzzing has evolved as
05:41
a very simple way to guide the fuzzing process, leveraging information on which parts of the code have been executed by simply tracing the individual paths through the program based on the execution flow. The fuzzer can use this feedback to then modify
06:01
the inputs that are being thrown at the fuzzing process. The second step: the fuzzer must be able to detect bugs. If you've ever looked at a memory corruption, if you're just writing one byte after the end of a buffer, it's highly likely that your software is not going to crash, but it's still a bug, and it may still be exploitable depending on the underlying
06:21
conditions. So we want to be able to detect violations as soon as they happen, for example based on some form of sanitization, some form of instrumentation that we add to the binary, which then tells us, hey, there's a violation of the memory safety property, and we terminate the application right away as feedback to the fuzzer. Third, and last but
06:44
not least, speed is key. If you're running a fuzzing campaign, you have a fixed resource budget: you have a couple of cores and you want to run for 24 hours, 48 hours, a couple of days, but whatever your constraints are, you have a fixed amount of instructions that you can actually
07:02
execute, and you have to decide: am I spending my instructions on generating new inputs, tracking constraints, finding bugs, running sanitization, or executing the program? You need to find a balance between all of them, as it is a zero-sum game: you have a fixed amount of resources and you're trying to make the best of them, so any overhead is slowing
07:24
you down, and in the end this becomes an optimization problem: how can you most effectively use the resources that you have available? If we are fuzzing with source code, it's quite easy to leverage existing mechanisms, and we add all that instrumentation at compile time: we
07:42
take source code and pipe it through the compiler, and modern compiler platforms allow you to add little code snippets during the compilation process that carry out all these tasks that are useful for fuzzing. For example, modern compilers can add short snippets of code for coverage tracking that record which parts of the code you have
08:03
executed, or for sanitization, which record and check every single memory access for whether it is safe or not. Then, when you're running the instrumented binary, everything is fine and you can detect the policy violations as you go along. Now, if you had source code for everything, this would be amazing, but it's often not the case. On Linux, we may be able
08:24
to cover a large part of the software stack by focusing only on source-code-based approaches, but there may be applications where no source code is available. If we move to Android or other mobile systems, there are many drivers that are not available as open source or are just available as binary
08:44
blobs, or the full software stack may be closed source and we only get the binaries, and we still want to find vulnerabilities in these complex software stacks that span hundreds of millions of lines of code. The only efficient solution to cover this massive code base is to
09:04
actually rewrite and focus on binaries. A very simple approach could be black-box fuzzing, but this doesn't really get you anywhere, because you don't get any feedback, you don't get any information on whether you're triggering bugs. One simple approach, and the approach that is most dominantly used today, is to rewrite the program or the binary dynamically: you take the
09:24
binary, and during execution you use some form of dynamic binary instrumentation, based on Pin, angr, or some other binary rewriting tool, and translate the target at runtime, adding the binary instrumentation on top of it as you're executing it. It's simple, it's straightforward, but it comes at a
09:45
terrible performance cost of 10 to 100x slowdown, which is not really effective, and you're spending all your cores and your cycles on just executing the binary instrumentation. So we don't really want to do this, and we want something more effective than that. What we are
10:03
focusing on is static rewriting. It involves a much more complex analysis, as we are rewriting the binary before it is executed and we have to recover all of the control flow and all of the different mechanisms, but it results in much better performance, and we can get more bang for
10:22
our buck. So why is static rewriting so challenging? Well, first, simply adding code will break the target. If you disassemble this piece of code here, which is a simple loop that loads data, decrements the register, and then jumps if you're not yet at the end of the array, and
10:42
keeps iterating through this array: now, if you look at the jump-not-zero instruction, the last instruction of the snippet, it has a relative offset. It jumps backwards seven bytes, which is fine if you just execute the code as is, but as soon as you want to insert new code, you
11:01
change the offsets in the program, and you're modifying all these different offsets, so simply adding new code somewhere in between will break the target. So a core property that we need to enforce is that we must find all the references and properly adjust them, both relative offsets and absolute offsets.
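To make this concrete, here is a toy sketch (illustrative Python, not the talk's actual tooling) of the fixup that inserting code forces on the rewriter: any relative branch that crosses the insertion point must be adjusted by the size of the inserted code, or it lands in the wrong place.

```python
# Toy model of machine code: a list of (mnemonic, delta) pairs, where a
# branch's delta is a *relative* offset measured in instruction slots
# (real code uses byte offsets, but the problem is identical).

def insert_code(code, insert_at, new_insns):
    """Insert new_insns before index insert_at, fixing relative branches.

    A branch needs adjustment exactly when it crosses the insertion
    point, i.e. when its source and target lie on opposite sides of it.
    """
    fixed = []
    for i, (op, delta) in enumerate(code):
        if delta is not None:                  # a relative branch
            target = i + delta
            if (i < insert_at) != (target < insert_at):
                delta += len(new_insns) if delta > 0 else -len(new_insns)
        fixed.append((op, delta))
    return fixed[:insert_at] + list(new_insns) + fixed[insert_at:]

# A small loop ending in "jnz -2" (jump back two slots, to the "dec"):
loop = [("mov", None), ("dec", None), ("insn", None), ("jnz", -2)]
# Insert one tracing instruction between the branch and its target:
patched = insert_code(loop, 2, [("trace", None)])
```

Without the fixup, the trailing branch would still encode `-2` and stop one slot short of its original target; with it, the branch becomes `-3` and the loop still jumps back to the `dec`.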
11:24
Getting a single one wrong will break everything. What makes this problem really, really hard is that if you're looking at a binary, there's no way for us to distinguish between scalars and references; in fact, they are indistinguishable, and getting a single reference wrong breaks
11:46
the target and would introduce arbitrary crashes. So we have to come up with ways that allow us to distinguish between the two. For example, this code here takes a value and stores it somewhere on the stack. This could come from two different
12:05
kinds of high-level constructs: on one hand, it could be taking the address of a function and storing this function address in a stack variable, or it could be just storing a scalar in a stack variable. These two are indistinguishable, and when rewriting, as soon as we add new code, the offsets
12:23
will change: if it is a function, we would have to modify the value; if it is a scalar, we have to keep the same value. So how can we come up with a way that allows us to distinguish between the two and rewrite binaries by recovering this missing information?
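The ambiguity is easy to see in a toy model (illustrative Python; the addresses and shift are made up for the example): the same immediate bytes demand opposite treatment depending on whether they were a pointer or a plain number, and the raw bytes alone cannot tell you which.

```python
import struct

# Suppose the rewriter moves everything after some point by 0x800 bytes.
SHIFT = 0x800

raw = struct.pack("<Q", 0x1000)    # 8 bytes from the binary: pointer? scalar?
value = struct.unpack("<Q", raw)[0]

# If those bytes were a function address, the rewriter must relocate them:
if_reference = value + SHIFT
# If they were a plain constant, it must leave them untouched:
if_scalar = value
# Nothing in `raw` itself says which interpretation is correct - that is
# exactly the missing information the rewriter has to recover.
```

Apply the wrong rule either way and the program silently computes with a bad value or jumps to a bad address.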
12:45
Let us take you on a journey towards instrumenting binaries in the kernel; this is what we aim for. We'll start with the simple case of instrumenting binaries in userland, talk about different kinds of coverage-guided fuzzing, what kind of instrumentation we can add, what kind of
13:01
sanitization we can add, and then put it all together and apply it to kernel binaries to see what falls out of it. Let's start with instrumenting binaries. I will now talk a little bit about RetroWrite, our mechanism and our tool that enables static binary
13:22
instrumentation by symbolizing existing binaries: we recover the information and translate relative offsets and absolute offsets into actual labels that are added to the assembly file. The instrumentation can then work on the recovered assembly file, which can then be
13:43
reassembled into a binary that can be executed for fuzzing. We implement coverage tracking and a binary Address Sanitizer on top of this, leveraging the abstraction as we go forward. The key to enabling this kind of binary rewriting is position-independent code, and position-independent code has
14:01
become the de facto standard for any code that is executed on a modern system. It effectively means code that can be loaded at any arbitrary address in your address space. It is essential, and a requirement, if you want address space layout randomization or if you want to use shared libraries, which you de facto want
14:21
to use on all these different systems. So, for a couple of years now, all the code that you're executing on your phones, desktops, and laptops is position-independent code. The idea behind position-independent code is that you can load it anywhere in your address space; you therefore cannot use any hard-coded static addresses, and you have to
14:41
inform the system with relocations or use PC-relative addresses so that the system can relocate these mechanisms. On x86-64, position-independent code leverages addressing that is relative to the instruction pointer: for example, it uses the current instruction pointer
15:03
plus a relative offset to reference global variables, other functions, and so on. And this is a very easy way for us to distinguish references from constants in position-independent binaries: if it is rip-relative, it is a reference; everything else is a constant. We can build our
15:21
translation algorithm and our translation mechanism on this fundamental finding, removing any need for heuristics, by focusing specifically on position-independent code. So we support position-independent code, we don't support non-position-independent code, but we give you the guarantee that we can rewrite all the
15:41
different code that is out there. Symbolization works as follows: if you take the little bit of code on the lower right, symbolization first replaces all the references with assembler labels. Look at the call instruction and the jump-not-zero instruction: the call instruction references an absolute address, and the jump-not-zero instruction jumps
16:02
backward 15 bytes relative. By focusing on these relative jumps and calls, we can replace them with actual labels and rewrite the binary as follows: for the call of a function, we replace the target with an actual label, and for the jump-not-zero, we insert a label in the assembly code and add a backward reference. For PC-relative addresses, for
16:23
example the data load, we can replace the offset with the name of the actual data that we have recovered, and we can then add all the different relocations and use them as auxiliary information on top. After these three steps, we can insert any new code in between.
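As a toy illustration of these steps (illustrative Python; the instructions and addresses are made up, and real symbolization handles far more cases), a symbolizer turns numeric branch targets into labels so the listing can be reassembled after new code is inserted:

```python
def symbolize(insns):
    """Turn numeric branch/call targets into assembler labels.

    insns: list of dicts {"addr": int, "text": str, "target": int or None}.
    Returns assembly text in which every reference is a label, so code can
    be inserted anywhere and the file reassembled without breaking offsets.
    """
    targets = {i["target"] for i in insns if i["target"] is not None}
    label = {t: ".L%x" % t for t in targets}
    lines = []
    for i in insns:
        if i["addr"] in label:
            lines.append(label[i["addr"]] + ":")    # emit label at target
        if i["target"] is not None:
            lines.append("    %s %s" % (i["text"], label[i["target"]]))
        else:
            lines.append("    " + i["text"])
    return "\n".join(lines)

listing = [
    {"addr": 0x10, "text": "mov (%rbx), %eax", "target": None},
    {"addr": 0x13, "text": "dec %rcx",         "target": None},
    {"addr": 0x16, "text": "jnz",  "target": 0x10},  # relative jump, numeric
    {"addr": 0x18, "text": "call", "target": 0x40},  # absolute call, numeric
]
asm = symbolize(listing)
```

The output names the loop head `.L10` and rewrites both branches symbolically; the call target lies outside this snippet, so only its use site gets the `.L40` label here.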
16:40
We can therefore add different forms of instrumentation, or run some higher-level analysis on top of it, and then reassemble the file for fuzzing, coverage-guided tracking, address sanitization, or whatever else you want to do. I will now hand over to Matteo, who will
17:01
cover coverage-guided fuzzing and sanitization, and then instrumenting binaries in the kernel. Go ahead. So, now that we have this really nice framework to rewrite binaries, one of the things we want to add to actually get to fuzzing is coverage tracking instrumentation. Coverage-guided fuzzing is a method to let the fuzzer
17:25
discover interesting inputs and interesting paths through the target by itself. The basic idea is that the fuzzer tracks the parts of the program that are covered by different inputs by inserting some kind of instrumentation. For example, here we have a target program that checks
17:43
whether the input contains the string PNG at the beginning; if it does, it does something interesting, otherwise it just bails out and fails. If we track the parts of the program that each input executes, the fuzzer can figure out that an input that contains P will have discovered a different path
18:03
through the program than an input that doesn't contain it, and so on: it can, one byte at a time, discover that this program expects the magic sequence PNG at the start of the input. The way the fuzzer does this is that every time a new input discovers a new path through the
18:21
target, it is considered interesting and added to a corpus of interesting inputs, and every time the fuzzer needs to generate a new input, it selects something from the corpus, mutates it randomly, and uses it as the new input. This is conceptually pretty simple, but in practice it works really well.
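The loop just described can be sketched in a few lines (illustrative Python; the three-byte PNG check stands in for the target program from the example, and block IDs stand in for real coverage instrumentation):

```python
import random

def target(data):
    """Toy target: checks for the b"PNG" magic one byte at a time.
    Returns the set of basic-block IDs executed (the coverage)."""
    cov = {0}
    for i, want in enumerate(b"PNG"):
        if i < len(data) and data[i] == want:
            cov.add(i + 1)
        else:
            return cov          # bail out on the first mismatch
    cov.add(99)                 # the interesting code behind the check
    return cov

def fuzz(rounds=200_000, seed=1234):
    rng = random.Random(seed)
    corpus = [b"hello"]         # seed input, clearly not a PNG
    seen = set()
    for _ in range(rounds):
        data = bytearray(rng.choice(corpus))
        data[rng.randrange(len(data))] = rng.randrange(256)  # mutate a byte
        cov = target(bytes(data))
        if not cov <= seen:     # new path discovered: keep the input
            seen |= cov
            corpus.append(bytes(data))
    return corpus, seen

corpus, seen = fuzz()
```

With enough rounds, the corpus almost surely ends up containing an input starting with `b"PNG"`, discovered one byte at a time from "hello" purely via the coverage feedback, with no knowledge of the format built into the fuzzer.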
18:41
It really lets the fuzzer discover the format that the target expects in an unsupervised way. As an example, here is an experiment run by the author of AFL, the fuzzer that sort of popularized this technique, where he was fuzzing a JPEG parsing library starting from a corpus
19:00
that only contained the string "hello". Now, clearly, "hello" is not a valid JPEG image, but the fuzzer was still able to discover the correct format: after a while it started generating some grayscale images, shown on the top left, and as it generated more and more inputs, it started generating more interesting images, such as some grayscale
19:21
gradients, and later on even some color images. So as you can see, this really works, and it allows us to fuzz a program without teaching the fuzzer what the input should look like. That's it for coverage-guided fuzzing; now we'll talk a bit about sanitization. As a reminder, the core idea behind sanitization is that just looking for crashes is likely to miss some of
19:42
the bugs. For example, an out-of-bounds read will probably not crash the target, but we would still like to catch it, because it could be used for an info leak. One of the most popular sanitizers is AddressSanitizer, or ASan. ASan instruments all the memory accesses in your program and checks for memory corruption, which
20:04
is a pretty dangerous class of bugs that unfortunately still plagues C and C++ programs, and unsafe languages in general, and ASan tries to catch it by instrumenting the target. It is very popular; it has been used to find thousands of bugs in complex software
20:24
like Chrome and Linux, and even though it has a bit of a slowdown, about 2x, it is still really popular because it lets you find many, many more bugs. How does it work? The basic idea is that ASan inserts special regions of memory called red zones around every object in memory.
20:42
We have a small example here where we declare a four byte array on the stack: ASan will allocate the array buf and then add a red zone before it and a red zone after it. Whenever the program accesses the red zones, it is terminated with a security violation, so the instrumentation just prints a bug
21:03
report and then crashes the target. This is very useful for detecting, for example, buffer overflows or underflows, and even other kinds of bugs like use-after-free and so on. As an example, here we're trying to copy 5 bytes into a 4 byte buffer, and ASan will check each of the accesses one by
21:20
one, and when it sees that the last byte writes into a red zone, it detects the violation and crashes the program. This is good for us, because this bug might not have been found by simply looking for crashes, but it's definitely found if you use ASan, so this is something we want for fuzzing.
21:43
Now that we've briefly covered ASan, we can talk about instrumenting binaries in the kernel. Matias left us with RetroWrite, and with RetroWrite we can add both coverage tracking and ASan to binaries. It's a really simple idea: now that we can rewrite a binary and add instructions wherever we want, we can implement both coverage tracking and ASan. In order to
22:05
implement coverage tracking, we simply have to identify the start of every basic block and add a little piece of instrumentation at the start of each basic block that tells the fuzzer "hey, we've reached this part of the program", "hey, we've reached this other part of the program", and then the fuzzer can
22:20
figure out whether that's a new part or not. ASan can also be implemented this way, by finding all memory accesses and then linking with libASan. libASan is a sort of runtime for ASan that takes care of inserting the red zones and keeping around all the metadata that ASan needs to know
22:43
where the red zones are and detecting whether a memory access is invalid. So how can we apply all of this in the kernel? Well, fuzzing the kernel is not as easy as fuzzing some user space program; there are some issues here. First of all there's crash handling. Whenever you're fuzzing a user space
23:00
program you expect crashes, because that's what we're after, and if a user space program crashes, the OS simply terminates it gracefully, so the fuzzer can detect this, save the input as a crashing input, and so on. This is all fine, but if you were fuzzing the kernel of the machine that you were using for fuzzing, after a
23:21
while the machine would just go down, because after all the kernel runs the machine, and if it starts misbehaving then everything can go wrong. More importantly, you can lose your crashes: if the machine crashes, the state of the fuzzer is lost and you have no idea what your crashing input was. So most kernel fuzzers resort to some
23:40
kind of VM to keep the system stable: they fuzz the kernel in a VM and run the fuzzing agent outside the VM. On top of that there's tooling. If you want to fuzz a user space program you can just download AFL or use libFuzzer; there are plenty of tutorials online, it's really easy to set up, you just compile your program, start fuzzing, and you're good to go. If you
24:01
want to fuzz the kernel it's already much more complicated. For example, if you want to fuzz Linux with syzkaller, which is a popular kernel fuzzer, you have to compile the kernel with a special config that supports syzkaller, you have way fewer guides available than for user space fuzzing, and in general it is much more complex and less intuitive than
24:22
just fuzzing user space. And lastly we have the issue of determinism. In general, a single threaded user space program, unless it uses some kind of random number generator, is more or less deterministic: there's nothing external that affects the execution of the program. This is really nice when you want to reproduce a test case, because if you have
24:43
a non-deterministic test case, it's really hard to know whether it's really a crash or just something you should ignore. In the kernel this is harder, because you don't only have concurrency, that is multiprocessing, you also have interrupts. Interrupts can happen at any time, and if one time you got an interrupt while executing a test case
25:05
and the second time you didn't, then maybe it only crashes one of the times; you don't really know, and it's not pretty. So again we have several approaches to fuzzing binaries in the kernel. The first one is to do black box fuzzing; we don't really like this because it doesn't find much,
25:22
especially in something complex like a kernel. The next approach is to use dynamic translation, so something like QEMU. This works, and people have used it successfully; the problem is that it is really, really slow, we're talking about 10x plus overhead, and as we said before,
25:41
the more iterations, the more test cases you can execute in the same amount of time, the better, because you find more bugs. On top of that, there's no currently available sanitizer for kernel binaries based on this approach: in user space you have something like
26:00
Valgrind, in the kernel you don't have anything, at least that we know of. Another approach is to use Intel Processor Trace. There have been some research papers on this recently, and it's nice because it allows you to collect coverage at nearly zero overhead, it's really fast. The problem is that it requires hardware support, a fairly new x86 CPU, and if you want to fuzz something on ARM, say
26:24
your Android driver, or if you want to use an older CPU, then you're out of luck. What's worse, you cannot really use it for sanitization, at least not the kind of sanitization that ASan does, because it just traces the execution; it doesn't let you do checks on memory
26:41
accesses. The third approach, which is what we will use, is static rewriting. We had this very nice framework for rewriting user space binaries, and we asked ourselves: can we make this work in the kernel? So we took the original RetroWrite, modified it, and implemented support for Linux kernel modules, and it works. We
27:04
have used it to fuzz some kernel modules, and it really shows that this approach doesn't only work for user space; it can also be applied to the kernel. As for the implementation: the nice thing about kernel modules is that they're always position independent, you cannot have
27:21
fixed-position kernel modules because Linux just doesn't allow it, so we get that for free, which is nice. Linux modules are also a special class of ELF files, which means that even though the format is not the same as user space binaries, it's still somewhat similar, so we didn't have to change the symbolizer that much, which is also
27:41
nice. We implemented symbolization for modules and used it to implement both code coverage tracking and binary ASan for kernel modules. For coverage, the idea behind the whole RetroWrite project was that we wanted to integrate with existing tools, with existing fuzzers: we didn't want to force
28:03
our users to write their own fuzzer that is compatible with RetroWrite. In user space we had AFL-style coverage tracking and binary ASan, which is compatible with source-based ASan, and we wanted to follow the same principle in the kernel. It turns out that Linux has a built-in
28:20
coverage tracking framework called KCOV that is used by several popular kernel fuzzers like syzkaller, and we wanted to reuse it ourselves, so we designed our coverage instrumentation so that it integrates with KCOV. The downside is that you need to compile the kernel with KCOV, but then again, Linux is open source, so you can sort of always do that: it's
28:42
usually not the kernel that is a binary blob, it's usually only the modules, so that's still fine. The way you implement KCOV for binary modules is that you just have to find the start of every basic block and add a call to some function that then stores the collected coverage. Here's an example: we have a short snippet of code with three basic
29:03
blocks, and all we have to do is add a call to trace_pc at the start of each basic block. trace_pc is a function that is part of the main kernel image; it collects this coverage and makes it available to a user space fuzzing agent. This is all really easy, and it works.
29:20
Now let's see how we implemented binary ASan. As I mentioned before, when we instrument a program with binary ASan in user space, we link with libASan, which takes care of setting up the metadata, putting the red zones around allocations, and so on. We have to do something similar in the kernel. Of course you cannot link with libASan in the kernel, because that doesn't work, but what we can do instead is, again, compile the kernel with
29:43
KASAN support. This instruments the allocator to add red zones, allocates space for the metadata, and keeps the metadata around; it does all of this for us, which is really nice. Again, the big advantage of this approach is that we can integrate seamlessly with a KASAN-instrumented
30:01
kernel and with fuzzers that rely on KASAN, such as syzkaller, so we see this as more of a plus than a limitation. And how do you implement the ASan part? Well, you have to find every memory access and instrument it to check whether it is accessing a red zone, and if it is,
30:20
call a bug report function that produces a stack trace and a bug report and crashes the kernel, so that the fuzzer can detect it. Again, this is compatible with source-based KASAN, so we're happy: we can simply load a rewritten module with the added instrumentation into the kernel, as long as the kernel was compiled with the right flags.
30:42
We can then use a standard kernel fuzzer: for our evaluation we used syzkaller, a popular kernel fuzzer by some folks at Google, and it works really well. So we've finally reached the end of our journey, and now we want to present some experiments to see if this really works. For user space we
31:01
wanted to compare the performance of our binary ASan with source-based ASan and with existing solutions that also work on binaries. For user space you can use Valgrind memcheck, a memory sanitizer that is based on dynamic binary translation and works on binaries. We
31:21
compared it with source ASan and RetroWrite ASan on the SPEC CPU benchmark and measured how fast each was. For the kernel, we decided to fuzz some file systems and some drivers with syzkaller, using both source-based KASAN plus KCOV and kRetroWrite-based KASAN plus KCOV. These are
31:42
our results for user space. The red bar is Valgrind: we can see that its execution time is the highest, it is really, really slow, something like 30x overhead, way too slow for fuzzing. In green we have our binary ASan, which is already a large improvement; in orange we
32:03
have source-based ASan; and finally in blue we have the original code without any instrumentation whatsoever. We can see that source-based ASan has some 2x or 3x overhead, and binary ASan is a bit higher, a bit less efficient, but still somewhat close. That's it for user space. For
32:21
the kernel, these are some preliminary results; I'm doing this work as part of a master's thesis, so I'm still running the evaluation. Here we can see that the overhead is already a bit lower. The reason is that SPEC is a pure CPU benchmark that doesn't interact with the system much, so any instrumentation you add is going to
32:41
considerably slow down the execution. By contrast, when you fuzz a file system with syzkaller, not only does every test case have to go from the host to the guest and then do multiple syscalls, but every system call also has to go through several layers of
33:03
abstraction before it gets to the actual file system, and all of this takes a lot of time, so in practice the overhead of our instrumentation seems pretty reasonable. Since we know you like demos, we prepared a small demo of kRetroWrite, so let's take a look.
33:33
All right, we prepared a small kernel module, and this module is really
33:40
simple and contains a vulnerability. What it does is create a character device. If you're not familiar with this, a character device is like a fake file exposed by a kernel driver that you can read from and write to; instead of going to a file, the data that in
34:02
this case you write to the fake file goes to the driver and is handled by this demo write function. As we can see, this function allocates a 16 byte buffer on the heap and then copies the data into it, and then it checks if the data contains the string 1337. If it does, it accesses the buffer out of bounds: you can see it reads at index 16, while the buffer is
34:22
16 bytes, so this is an out-of-bounds read by one byte. If it doesn't, it just accesses the buffer in bounds, which is fine and is not a vulnerability. So we can compile this driver.
34:45
Now we have our module, and we will instrument it using kRetroWrite. Instrument... yes, please. Okay, so kRetroWrite did some processing and
35:03
produced an instrumented module, with ASan (KASAN), and a symbolized assembly file. We can actually have a look at the symbolized assembly file to see what it looks like. Is it big enough? Yeah. As you can see, here we have the ASan instrumentation.
35:23
So this is the ASan instrumentation: the original code loads some data from this address, and as you can see, the instrumentation first computes the actual address and then does some checking; basically it checks some metadata that ASan stores, to see
35:42
whether the address is in a red zone or not, and if the check fails, it calls this asan report function, which produces a stack trace and crashes the kernel. So this is fine. We can even look at the disassembly of both modules, with objdump... okay. On the left we have the original module
36:20
without any instrumentation; on the right we have the module instrumented
36:25
with ASan. As you can see, the original module has push r13 and then this memory load here; on the right, in the instrumented module, kRetroWrite inserted the ASan instrumentation, so the original load is still down here, but between the first instruction and this
36:43
instruction we now have the KASAN instrumentation that does the check. So this is all fine. Now we can actually test it and see what it does. We will boot a very minimal Linux system and try to trigger the vulnerability, first with the non-instrumented
37:04
module and then with the instrumented module, and we will see that with the non-instrumented module the kernel will not crash, but with the instrumented module it will crash and produce a bug report. Let's see... yeah, this is a QEMU VM, I have no idea why it's taking so
37:23
long to boot; I'll blame the demo gods for not being kind to us. I guess we just have to wait... okay. All right, so we loaded the module, and we will see that it
37:41
has created a fake file, a character device, in /dev/demo. Yep, we can write to this file. This accesses the array in bounds, so this is fine. Then what we can also do is write 1337 to it, so it accesses the
38:05
array out of bounds. This is the non-instrumented module, so this will not crash; it will just print some garbage value. Okay, that's it; now we can load the instrumented module instead and do the same experiment again. All
38:25
right, we can see that /dev/demo is still here, so the module still works. Let's try to write 1234 into it; this again doesn't crash, but when we try to write 1337, this produces a bug report. This has quite a
38:50
lot of information: we can see where the memory was allocated, with a stack trace for that; it wasn't freed, so there's no stack trace for the free; and we see the cache size of the memory:
39:04
it was a 16-byte allocation. We can also see the shape of the memory: these two zeros mean that there are two 8-byte chunks of valid memory, and these fc fc fc bytes are the red zones I was talking about before. All right, that's it for the demo; we will switch back to the presentation
39:20
now. Hope you enjoyed it. After applying this to the demo module, we also wanted to see what happens if you apply it to a real file system. After a couple of hours, when we came back and checked on the results, we saw a couple of issues popping up, including a nice set of use-after-free reads and a set
39:46
of use-after-free writes, and when we checked the bug reports, we saw a whole bunch of Linux kernel issues popping up one after the other in this nondescript module. We're in the process of reporting them; this will take
40:05
some time until they are fixed, which is why you see the blurry lines. But as you can see, there's still quite a bit of opportunity in the Linux kernel, where you can apply different forms of targeted fuzzing to different modules, load
40:21
these modules on top of a KASAN-instrumented kernel, and then leverage this as part of your fuzzing toolchain to find interesting kernel 0-days that you can then develop further, report, or do whatever you want with. Now, we've shown you how you can take existing
40:44
binary-only modules (think different binary-only drivers, or even existing modules where you don't want to instrument the full Linux kernel but only focus fuzzing and exploration on a small, limited
41:00
piece of code) and do security testing on those. We've shown you how to do coverage-based tracking and address sanitization, but what other kinds of instrumentation you add is also up to you: this is just a tool, a framework that allows you to do arbitrary forms of instrumentation. We've taken you on a journey from instrumenting binaries
41:24
over coverage guided fuzzing and sanitization to instrumenting modules in the kernel and then finding crashes in the kernel. Let me wrap up the talk. This is one of the fun pieces of work that we do in the HexHive
41:40
lab at EPFL, so if you're looking for postdoc opportunities, or if you're thinking about a PhD, come talk to us; we're always hiring. The tools will be released as open source. A large chunk of the user space work is already open source, and we are working on a set of additional demos and so on, so you can get started faster leveraging the different instrumentation that is
42:04
already out there. The user space work is already available; the kernel work will be available in a couple of weeks. This allows you to instrument real-world binaries for fuzzing, leveraging existing transformations for coverage tracking to enable fast and effective fuzzing, and memory checking to
42:22
detect the actual bugs that are there. The key takeaway from this talk is that RetroWrite and kRetroWrite enable static binary rewriting at zero instrumentation cost. We accept the limitation of focusing only on position independent code, which is not a real limitation, but we get the advantage
42:44
of being able to symbolize without relying on heuristics, so we can even symbolize large, complex applications and effectively rewrite them, and then you can focus fuzzing on
43:00
these parts. Another point I want to mention is that this enables you to reuse existing tooling: you can take a binary blob, instrument it, and then reuse for example AddressSanitizer or existing fuzzing tools, as it integrates really nicely. As I said, all the code is open source; check it out, try it, and let us know if it breaks, we're happy to fix it. We're
43:24
committed to open source, so let us know if there are any questions. Thank you. So, thanks guys for an interesting talk. We have some time for questions; we
43:43
have microphones along the aisles. We'll start with a question from microphone number two. Hi, thanks for your talk and for the demo. I'm not sure about the use case you showed for kernel RetroWrite, because you are usually
44:00
interested in fuzzing binaries in kernel space when you don't have source code for the kernel, for example for IoT or Android and so on, but you just reuse the KCOV and KASAN in the kernel, and you never have a
44:20
kernel in IoT or Android which is compiled with that. So do you have any plans to binary-instrument the kernel itself, not just the modules? So, we thought about that. I think there are some additional problems that we would have to solve in order to be able to
44:41
instrument the full kernel. Other than the fact that it gives us compatibility with existing tools, the reason we decided to go with compiling the kernel with KASAN and KCOV is that building all of that yourself is hard: if you think about it, you would have to instrument the memory allocator to add red zones, which is already somewhat complex;
45:02
you would have to instrument the exception handlers to catch any faults that the instrumentation detects; and you would have to set up some memory for the ASan shadow. I think you should be able to do it, but it would require a lot of additional work, and
45:20
this was a four month thesis, so we decided to start small, prove that it works in the kernel for modules, and leave extending it to the full kernel to future work. Also, in the case of Linux the kernel is GPL, right? So if the
45:40
manufacturer ships a custom kernel, they have to release the source code, right? They never do. Well, that's a different issue. Right, that's why I ask, because I don't see how it can be used in the real world. Now let me try to put this into perspective a little bit as well. What we did so far is we leveraged existing tools like KASAN or
46:02
KCOV and integrated with these existing tools. Now, instrumenting heap-based allocations is fairly simple: replacing those and adding the red zones is instrumentation you can carry out by focusing on the different allocators. Second to that, simply oopsing the kernel and
46:23
printing the stack trace is also fairly straightforward, so it's not a lot of additional effort. It would involve some engineering effort to port this to non-KASAN-compiled kernels, but we think it is very feasible. In the interest of time we focused on KASAN-enabled
46:44
kernels, so that some form of ASan is already enabled. Yes, this is additional engineering effort, but there's also a community out there that can help us with these kinds of changes. kRetroWrite and RetroWrite themselves are the binary rewriting platform that allows you to turn
47:04
a binary into an assembly file that you can then instrument and run different passes on top of, so another pass could be a full ASan or KASAN pass that somebody could add and then contribute back to the community; it would be really useful. Thanks. Cool, next question from the
47:21
internet. Yes, there is a question regarding the slide on the SPEC CPU benchmark: the second or third graph from the right had an instrumented version that was faster than the original program. Why is that? A cache effect. Thank you. Microphone number one. Thank you for the presentation, I
47:45
have a question: how many architectures do you support, and do you support more than x86-64? Okay, so no plans for ARM or MIPS? There are plans. Again,
48:03
there's a finite amount of time, so we focused on the technology. ARM is high up on the list; if somebody's interested in working on it and contributing, we're happy to hear from them. Our list of targets is ARM first and then maybe something else, but I think with x86-64 and ARM we cover the majority of the
48:24
interesting platforms. And a second question: did you try to fuzz any real closed-source program? Because as I understand from the presentation, you fuzzed just file systems that you can compile and fuzz with syzkaller
48:43
anyway. So, for the evaluation we wanted to be able to compare between the source-based instrumentation and the binary-based instrumentation, so we focused mostly on open source file systems and drivers, because then we could instrument them with a compiler. We haven't tried yet, but
49:02
this is also pretty high up on the list: we want to try to find some closed-source drivers, there are lots of them, for GPUs or anything, and give it a try, and perhaps find some more 0-days. Yes, but with syzkaller you still have a problem: you have to write rules, or like dictionaries, I mean you
49:22
have to understand the format, how to communicate with the driver. Yeah, but there are for example closed-source file systems that we are looking at. Okay, thank you. Number two. Hi, thank you for the talk. I don't know if there are any KCOV or KASAN equivalent solutions for Windows, but I was
49:45
wondering if you tried, or are planning, to port the framework to Windows. I know it might be challenging because of driver signature enforcement and PatchGuard, but I wondered if you tried or thought about it. Yes, we thought about it, and we decided against it. Windows is
50:02
incredibly hard, and we are academics; the research we do in my lab focuses predominantly on open source software and empowers open source software. Doing full support for Microsoft Windows is somewhat out of scope. If somebody wants to support these tools, we are happy
50:21
to hear from them and work with them, but it's a lot of additional engineering effort with very low additional research value, so we'll have to find some form of compromise. If you were willing to fund us, we would go ahead, but it's a cost question. And you're referring both to kernel and user space? Right, yeah.
50:43
Okay, thank you. Number five. This seems more interesting if you're looking for vulnerabilities in closed-source kernel modules, but without giving it too much thought, it seems really trivial to prevent this if you're writing your closed-source module. Well, how would you prevent it? Well,
51:05
for starters, you could just take the difference between the addresses of two functions; that's not going to be RIP-relative. Right. So, even in the original RetroWrite paper we explicitly decided not to try to deal
51:21
with obfuscated code, or code that is purposefully trying to defeat this kind of rewriting, because the assumption is that, first of all, there are techniques to deobfuscate code or remove these checks in some way, but that is sort of orthogonal work, and at the same time I guess most drivers are not really compiled with this sort of
51:41
obfuscation; they are just compiled with a regular compiler. But yeah, of course this is a limitation. They're likely stripped, but not necessarily obfuscated, at least from what we've seen when we looked at binary-only drivers. Microphone number two. How do you decide where to place
52:02
the red zones? From what I heard, you talked about instrumenting the allocators, but there are a lot of variables on the stack, so how do you deal with those? Oh yeah, that's actually super cool; I refer to some extent to the paper that is on the GitHub repo as well. If you
52:20
think about it, modern compilers use canaries for buffers. Are you aware of stack canaries and how they work? If the compiler sees there's a buffer that may be overflown, it places a stack canary between the buffer and any other data. What we do, as part of our analysis, is find these stack canaries, remove the code that does the
52:44
canary check, and use this space to place our red zones. So we actually hijack the stack canaries: we remove that code and add ASan red zones into the now-empty canary slots. It's actually a super cool optimization, because we piggyback on work the compiler already did for us, and we can then leverage that to gain additional
53:04
benefits and protect the stack as well. Thanks. Another question from the internet. Yes: did you consider lifting the binary code to LLVM IR instead of
53:21
generating assembler source? A little bit longer answer: yes, we did consider that, and yes, it would be super nice to lift to LLVM IR. We've actually looked into this, and it's incredibly hard, incredibly complex. There's no direct mapping between the machine code and the equivalent LLVM
53:44
IR; you would still need to recover all the types. It's this magic dream that you recover full LLVM IR and then do heavyweight transformations on top of it, but this is incredibly hard, because when you compile down from LLVM IR to machine code, you lose a massive amount of
54:02
information. We would have to find a way to recover all of that information, which is pretty much impossible and undecidable in many cases. Just as a note: we only recover and symbolize control flow; we don't support
54:21
instrumentation of data references yet, because there's still an undecidable problem that we are facing. I can talk more about this offline, and there's a note in the paper as well. And this is just a small problem if you're lifting to assembly files; if you're lifting to LLVM IR, you would have to do full end-to-end type recovery, which is
54:42
massively more complicated. Yes, it would be super nice; unfortunately it is undecidable and really, really hard. You can come up with some heuristics, but there is no solution that will be correct 100% of the time. I'll take one more question, from microphone number six. Thank you for your talk. What kind of disassemblers did you use
55:04
for RetroWrite, did you have problems with wrong disassembly, and if so, how did you handle it? So, for RetroWrite we used Capstone for the disassembly — an amazing tool, by the way. The idea is that we
55:24
need some information about where the functions are. For kernel modules this is actually fine, because kernel modules come with this sort of information — the kernel needs it to build stack traces, for example. For user-space binaries this is somewhat less common, but you can use
55:41
another tool to do function identification, and we then disassemble the entire function. We have run into some issues with AT&T syntax, because we wanted to use the GNU assembler, gas,
56:00
for reassembly. For some instructions you can express two different instructions — a five-byte NOP and a six-byte NOP — using the same mnemonic and operand string in the assembly text. The problem is that the kernel doesn't like that and crashes;
56:20
this took me like two days to debug. The kernel uses dynamic binary patching at runtime, and it uses fixed offsets. So if you replace a five-byte NOP with a six-byte NOP, or vice versa, your offsets change and your kernel just blows up in your face. Was it a kind of case-by-case basis, where you saw the errors coming out of the disassembly and you
56:44
had to fix it? Sorry, can you repeat the question? For example, if some instruction is not supported by the disassembler — you saw that it crashed, that something was wrong, and then you fixed it by hand? Well, I don't
57:03
recall having any unknown instructions in the disassembler; I don't think I've ever had a problem with that. But yeah, this was a lot of engineering work. So let me repeat: the problem was not a bug in the disassembler but an issue with the instruction format — the same mnemonic
57:20
can be translated into two different instructions, one of which was five bytes long, the other one six bytes long, and both used the exact same mnemonic. Right, this is an issue with assembly encoding. And you had no problems with unsupported instructions that couldn't be disassembled? No, not as far as I know, at least.

We have one more minute, so a very
57:43
short question for microphone number two. Is your binary instrumentation equally powerful as the kernel address sanitizer, I mean KASan? So,
58:01
does it detect all the memory corruptions on stack, heap, and globals? Not globals, but on the heap it does all of them. There are some slight variations on the stack, because we have to piggyback on the canary mechanism, as I mentioned quickly before. There's no reflowing and full recovery of data
58:23
layouts, so to get anything on the stack we have to piggyback on existing compiler extensions like stack canaries. That means we don't support intra-object overflows on the stack, but we do leverage the stack canaries to get some stack benefits, which gets us, I don't know, 90 to 95 percent there, because
58:44
the stack canaries are pretty good. For the heap we get the same precision; for globals we have very limited support. Thanks.

So that's all the time we have for this talk. You can find the speakers afterwards offline, I think. Please give them a big round of applause.
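For readers following along, the canary-slot redzone idea from the answers above can be sketched with a toy shadow-memory model in the ASan style (one shadow byte per eight application bytes). The constants, frame layout, and poison value below are illustrative assumptions, not RetroWrite's actual instrumentation.

```python
# Toy ASan-style shadow memory: one shadow byte per 8 application bytes.
# 0 = addressable, nonzero = poisoned (redzone). This mimics the idea of
# reusing the 8-byte stack-canary slot as a poisoned redzone.
SHADOW_SCALE = 8

class Shadow:
    def __init__(self, size):
        self.cells = bytearray(size // SHADOW_SCALE)
    def poison(self, addr, size):
        # Mark every shadow cell covering [addr, addr + size) as a redzone.
        for a in range(addr, addr + size, SHADOW_SCALE):
            self.cells[a // SHADOW_SCALE] = 0xF1  # hypothetical redzone marker
    def check(self, addr):
        # An access is valid only if its shadow cell is unpoisoned.
        return self.cells[addr // SHADOW_SCALE] == 0

# Stack frame layout: [ 16-byte buffer | 8-byte canary slot -> redzone ]
shadow = Shadow(64)
BUF, CANARY_SLOT = 0, 16
shadow.poison(CANARY_SLOT, 8)   # redzone where the canary used to live

print(shadow.check(BUF + 15))   # True  -- last byte of the buffer is fine
print(shadow.check(BUF + 16))   # False -- a one-byte overflow hits the redzone
```

As the speakers note, this catches overflows past the end of a frame's buffers but not intra-object overflows, since no data-layout recovery is performed.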
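The five-byte versus six-byte NOP problem from the disassembler question can also be sketched. The NOP byte patterns are the recommended multi-byte NOP encodings from the Intel manual; the flat-blob layout and patch-table model are simplified assumptions for illustration, not the kernel's real alternatives mechanism.

```python
# Two x86-64 multi-byte NOP encodings that an assembler can emit for very
# similar AT&T source text (Intel SDM recommended multi-byte NOPs):
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])        # 5-byte NOP
NOP6 = bytes([0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00])  # 6-byte NOP (0x66 prefix)

def layout(instructions):
    """Return the start offset of each instruction in a flat code blob."""
    offsets, pos = [], 0
    for insn in instructions:
        offsets.append(pos)
        pos += len(insn)
    return offsets

# A toy function: one-byte instructions around a patchable NOP site.
PUSH = b"\x55"
RET = b"\xc3"
original    = [PUSH, NOP5, RET]
reassembled = [PUSH, NOP6, RET]  # the assembler picked the other encoding

# Runtime patching records *fixed* byte offsets into the function.
patch_site_for_ret = layout(original)[2]

print(layout(original))      # [0, 1, 6]
print(layout(reassembled))   # [0, 1, 7] -- everything after the NOP shifted
# A patch aimed at offset 6 now lands mid-instruction: the kernel crashes.
assert layout(reassembled)[2] != patch_site_for_ret
```

This is why swapping equivalent-looking NOPs during reassembly breaks a kernel that patches itself at fixed offsets, even though the instruction stream is semantically identical.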