Transforming Open Source to Open Access in Closed Applications

Video in TIB AV-Portal: Transforming Open Source to Open Access in Closed Applications

Formal Metadata

Transforming Open Source to Open Access in Closed Applications
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
The inclusion of open-source components into large, closed-sourced applications has become a common practice in modern software. Vendors obviously benefit from this approach as it allows them to quickly add functionality for their users without the need to invest costly engineering effort. However, leveraging open source for a quick functionality boost comes with security side effects that might not be understood by the vendor until it is too late. In those cases, misunderstood or poorly implemented open source allows attackers to bypass security mechanisms that may exist elsewhere in the proprietary system. This talk provides insight into these side effects through an examination of Adobe Reader’s XSLT (Extensible Stylesheet Language Transformations) engine, which is based on the now abandoned open-source project called Sablotron – an XML processor fully implemented in C++. We focus on techniques for auditing the source code of Sablotron in order to find corresponding bugs in Adobe Reader. We also present a new source-to-binary matching technique to help you pinpoint the vulnerable conditions within Sablotron that also reside in the assembly of Reader. Real-world application of these techniques will be demonstrated through a series of code execution vulnerabilities discovered in Adobe Reader’s codebase. Finally, we’ll highlight the trends in vulnerabilities discovered in Adobe Reader’s XSLT engine over the last year.
[Opening remarks unintelligible in the recording]
So, like I said, this talk is entitled "Transforming Open Source to Open Access in Closed Applications". Specifically, we're talking today about how, when open-source components are embedded in applications like Adobe Reader, you can find bugs more easily by looking at the source instead of the binary, and then transfer those bugs into the closed-source application, where you can actually develop exploits. We're going to go over several code-matching techniques that exist, and we'll also go through a series of case studies that show the underlying vulnerability in the open-source component, how it can be pinpointed in the DLL that ships with Reader, and how Adobe actually patched the vulnerability itself. So first,
some quick introductions. My name is Brian; I'm @maliciousinput on Twitter, and I'm a senior manager of vulnerability research inside Trend Micro's TippingPoint organization. My primary purpose in life now is to run the Zero Day Initiative program, which represents the world's largest vendor-agnostic bug bounty program. What does that mean? It means we work with researchers around the world: we purchase zero-day vulnerabilities from researchers, a lot of them here in the EU, in this region; we create detection guidance for the TippingPoint IDS; and then we ship the bugs to the vendors to get them fixed. We do a lot of vulnerability purchasing and research in Microsoft and Adobe products, and in fact we have been one of the top suppliers of vulnerabilities to both of those vendors for multiple years now. I'm also responsible for organizing and adjudicating the ever-popular Pwn2Own hacking competition, which is happening in March at CanSecWest. So if you have some zero-days and some exploits that you would like to sell in a public forum, definitely come; I want to know about it. My colleagues will introduce themselves. I'm Abdul, a security researcher for the Zero Day Initiative program. I've been working for this guy for the past four years, and I do a lot of root cause analysis. I'm based out of Montreal, Canada. And I'm Jasiel Spelman; I go by WanderingGlitch. I've been with ZDI since 2012, but with TippingPoint for 11 years now. I focus a lot on static analysis and reverse engineering, and root cause analysis in general.
So, lucky for me, when we were putting this presentation together I drew the short straw, so I get to talk about the attack surface that we're going to be going over, which means I get the honor of presenting XSLT to this audience. That is very exciting for me, because you likely already know all about XSLT, but there is a point to this section: to put you in the right place in the Sablotron code to find vulnerabilities. So first, as we all
know, it's very, very common now for vendors to integrate open-source projects into their closed-source applications. It's a way for vendors to quickly add new features to the product without having to go through all of that really costly engineering effort. But one of the real problems that we're starting to see inside the Zero Day Initiative program is that vendors don't actually understand the security side effects of integrating these projects. There are a lot of misunderstood assumptions about how the code works, and as a result this can introduce security weaknesses into the applications they're shipping to end users. There are also a lot of poorly implemented components out there, and code that's no longer being maintained; as a result the vendor now has to take on the burden of keeping an open-source project alive and implementing security fixes in it long after the project has died. There are also sometimes missing security mitigations that don't get rolled in: some of the new mitigations being released in, say, Edge and things like that aren't actually implemented in some of the older code that has been integrated into these products. So we'll take a real look at Adobe Reader. As we all know, Adobe Reader is a widely used PDF reader, and it has a ton of features for document authors to generate really complex documents. We've spent a lot of time looking at it: we've done several presentations over the last two years on vulnerabilities in this product, including JavaScript API issues, buffer overflows, use-after-frees, all sorts of stuff. What we're going to look at specifically today is the code that Adobe didn't implement themselves, the stuff they're bringing in from the open-source community, and the vulnerabilities that exist in it. So that's more highlights on the security side effects that we just talked about on the
previous slide. If we look at Adobe Reader, it has several open-source projects that it integrates, one being a modified version of libtiff for TIFF parsing, and another a modified version of Sablotron, which is used for Extensible Stylesheet Language Transformations (XSLT) handling. Unfortunately for Adobe, the Sablotron project is now abandoned. There's no community out there implementing new features, and there are no security fixes going on in that project in the open-source space, so it's left up to Adobe to integrate it and fix all the bugs that are being reported by the security community. Now, if you don't believe us that it's being used in Reader, we can look at the licensing information in the product, which shows that they are integrating it. It lives in a DLL, and we pulled the version information from the code: you can see on the screen here that it's using version 1.0.2 of Sablotron, so this is proof that it is actually integrated into that DLL. Now, XSLT has been around since the late nineties. It was part of the W3C's Extensible Stylesheet Language (XSL) specifications, and it's used a lot by application developers to transform XML data into other formats and back. When I was developing code in my everyday life, I used XSLT a lot: XML to define the data and bring it into a system, and a transform to different formats like HTML or plain text; you can also transform to XSL formatting objects for later use. What's important about XSLT are the logic elements that drive the processing. These logic elements describe the transformations that are going to be applied to the XML being transformed. So we have things like for-each, which will select every XML node inside a node set and perform processing on it; there's xsl:attribute, which will create an attribute on a node in the output; there's copy-of, which will copy the input to the output; and there's value-of, to select the value of an XML node and print it in the output. To understand XSLT you also need to understand how XPath works, because it is used to query the XML document to select nodes and perform calculations on the data as it is being transformed. We have put an XPath expression on the screen here, /music/artist[1], which will select the first artist element that is a child of the music element in the XML document and allow processing to be performed on it. XPath functions add flexibility to XPath itself: you can use node-set functions to get a set of nodes, string functions to perform evaluations on different string arguments, and Boolean and number functions to do the same thing for Booleans and numbers.
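To make the /music/artist[1] query concrete, here is a minimal sketch in Python. It is not the speaker's code and does not use Sablotron; it relies on the limited XPath subset in the standard library's xml.etree.ElementTree, and the sample document and function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element names follow the talk's example.
XML = """<music>
  <artist>First Artist</artist>
  <artist>Second Artist</artist>
</music>"""


def first_artist(xml_text):
    """Return the text of the first <artist> child of <music>."""
    root = ET.fromstring(xml_text)  # root is the <music> element
    # ElementTree paths are relative to the element they are called on,
    # so "artist[1]" here corresponds to the absolute XPath /music/artist[1].
    node = root.find("artist[1]")
    return node.text


def all_artists(xml_text):
    """Rough analogue of xsl:for-each over artist with xsl:value-of on each."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("artist")]


print(first_artist(XML))
print(all_artists(XML))
```

A full XSLT engine does far more than this (templates, output construction, the whole XPath function library), which is exactly the code surface the talk goes on to audit.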
There is a lot of functionality in XPath, and a lot of code in these projects that can be audited and looked at for vulnerabilities. A really simple XSLT transform looks like this. On the screen it's probably hard to read from the back, but the XML on your left contains a catalog with multiple CDs, each with a bunch of metadata related to the CD itself, and then in the middle
here is the XSLT document that's going to transform the XML into the HTML that you see on the screen. If you look at the XSLT, you can see there is a for-each logic element in here that's going to select every CD that is a child of the catalog and perform processing on it using the value-of XSLT logic element. As a result you can start processing and transforming data from XML into HTML, producing a really amazing CD collection as an HTML page. If we look at exactly how you would call the underlying XSLT engine in Adobe Reader, there are a couple of ways to do it. One is XFA, and the more common way we're seeing is people using the JavaScript API to trigger the underlying engine. What we see here: the first variable is the XML document that we're going to transform, and the next variable is the XSL document that defines the transform itself. This is all the code we need to trigger vulnerabilities in the XSLT engine. What you have to do is parse the XML document variable, bring it into a DOM, select the root node, and then apply the XSL transform on that root node. Underneath the covers you're going to be exercising the Sablotron source code that has been compiled into a DLL shipped with Reader. Now, this is just the beginning of how you start triggering vulnerabilities in this code. To understand what types of bugs exist and where to look, we're going to show you a couple of places where you can start looking for bugs. In this case what you're seeing is the definition of all the different logic elements that exist in Sablotron as used by Adobe Reader. This enumeration defines the different logic elements: you can see that copy-of is in there, value-of is in there, attribute is in there, for-each; all of the different XSLT logic elements are defined here. Now, where is the processing for all of these logic elements located? There is a class in Sablotron called XSLElement; inside it, the execute method is a giant switch-case that goes through and performs processing on the different logic elements. So if you're going to start looking for vulnerabilities related to the processing of XSL elements, you want to start in the execute method, and Abdul will talk in his section about vulnerabilities that existed in this code. XPath functions are also implemented in Sablotron. These reside in expr.cpp, again in a switch statement, in the Expression class's callFunc method. We've listed some here on the slide; there have been vulnerabilities in XPath processing as well. So if you're going to audit this open-source code, you want to look here for XPath-style vulnerabilities, and then you can transform those into the same vulnerabilities inside Adobe Reader. We've also seen issues in this code related to the encoding implementations, so that's another place for you to start looking. If we look back at the vulnerabilities that occurred in Adobe Reader's XSLT engine, we can see that back in 2012 Adobe fixed two vulnerabilities in this engine. We'll talk about those bugs in a second; there's an interesting backstory that affects the latest patch, which Abdul will talk about. These were discussed at length on the conference circuit, but the community really hadn't paid attention to this code base much since those discussions. Then, in the Zero Day Initiative program in 2015, we started to see new submissions from multiple researchers in this code base coming into our program, and we got interested in how exactly they were going about finding the vulnerabilities in Adobe Reader.
It all comes back to source code analysis of Sablotron and the fuzzing techniques they're using to find these bugs, so we'll go over some of those here in a second. To give you an idea of where some of the bugs are located, we generated this table. The name column has the different source code locations inside Sablotron, along with descriptions of what that code does, and then the corresponding CVEs that were discovered in those source code locations. You can see there are a couple here from 2012, and the rest are from 2016. In fact, all of these bugs were patched in January of this year but were assigned 2016 CVEs for whatever reason. Now I'm going to hand it over to Jasiel, who is going to talk about how you can pinpoint the vulnerability in a closed-source application after you've audited the Sablotron code and found it. So the first thing I want to mention is
that we're not doing a straight bindiff against a built binary, in part because initially I wasn't entirely sure how easy it would be to build it, but also because there are a number of preprocessor defines, and initially we didn't know what all those values were. In addition to that, Adobe has implemented a number of changes on top of what's public. They did open-source their modifications a few
years ago, but they have not done so any time in the recent past, and the Sablotron source that is available is incredibly old. So we'll start off with something that is very, very well known, and that's just looking for
strings. Here we have a structure that has a pointer to a string followed by two enum values, an ExFunctor and an ExType, but in the binary that's just going to look like a row of integers, and since this is a 32-bit application we're looking at, each one is only going to be a DWORD. So if we look for "last" and "position" we see a structure, and we can glean that this is the exact same structure we were just looking at, the function information table. On its own this structure is not all that useful, because all it does is map from the XSLT function names to their IDs, and those IDs are not the easiest to find: if you do a search for, say, last, which is 0x1A, you'll get a bunch of hits and not really get anywhere. Thankfully, one of the entries in there is called system-property, and that one has an ID of 0x3B and takes a string. It is specifically useful because, if you look at the code, it is responsible for returning version information about the XSLT engine itself. Now, looking at the code, we
see that within this callFunc function we have a switch statement, and in this particular case, for system-property, we have a few strings that are guaranteed to be referenced. This is quite fantastic: for whatever reason Adobe never modified this, and this code base is so old that gingerall.com now, I think, belongs to a website with some completely different purpose, not related at all. But by looking for these strings and then looking for where one function in particular uses them, we're able to see exactly where callFunc may be, and it ends up being this
function, which is probably impossible to read on the slide, but we find that it references "vendor", "vendor-url", "Ginger Alliance" and gingerall.com, and if we were to keep going up we would eventually see
that system-property value being referenced. I won't show that, but it ends up being kind of worthless, in that math is performed on it and it doesn't end up being 0x3B like it is in the structure itself.
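The string-driven matching described here can be sketched as follows. This is a toy illustration, not the tooling used in the talk: the image is built in memory, the load address, the position ID and the type values are hypothetical, and only the IDs 0x1A (last) and 0x3B (system-property) come from the talk. The idea is to locate a known string, then look for a little-endian pointer to it followed by two DWORDs, which is what a table entry of the form (string pointer, functor, type) looks like in a 32-bit binary.

```python
import struct

BASE = 0x10000000  # hypothetical load address of the toy image


def find_all(blob, needle):
    """Return every offset at which needle occurs in blob."""
    hits = []
    start = blob.find(needle)
    while start != -1:
        hits.append(start)
        start = blob.find(needle, start + 1)
    return hits


def find_table_entries(blob, base, name):
    """Find (offset, functor_id, type_id) records whose first DWORD points at name."""
    entries = []
    for str_off in find_all(blob, name + b"\x00"):
        # A 32-bit pointer to the string, as it would be stored in the table.
        ptr = struct.pack("<I", base + str_off)
        for ref in find_all(blob, ptr):
            if ref + 12 <= len(blob):
                _, functor, typ = struct.unpack_from("<III", blob, ref)
                entries.append((ref, functor, typ))
    return entries


# Build a toy image: a string pool followed by a table of 12-byte records.
strings = b"last\x00position\x00system-property\x00"
strings += b"\x00" * (-len(strings) % 4)  # pad to DWORD alignment
records = [(0, 0x1A, 1), (5, 0x1B, 1), (14, 0x3B, 2)]  # (string offset, id, type)
table = b"".join(struct.pack("<III", BASE + off, fid, typ) for off, fid, typ in records)
IMAGE = strings + table

print(find_table_entries(IMAGE, BASE, b"system-property"))
```

On a real binary the same search produces false positives (byte patterns that merely look like pointers), so a practical version would restrict hits to aligned offsets inside data sections and then follow cross-references in a disassembler.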
Expanding on that, one technique that is incredibly useful, and that I use in Reader a lot and in Flash, is looking for data structures. In the previous example we had a pointer to a string followed by two enums, and those enums were just integer values. You can expand on that and look for any array or const structure where you have a bunch of static data. This also works if a function has static data that is going to be placed in the data section, because then you can look for those values, and more often than not you will find them. Sometimes the compiler will carry multiple copies of a table, but you can still find references and then do basic matching on top of that. In this particular case, I took the token values (axis name, comma, parentheses, brackets and so on), basically just packed those into DWORDs, looked for the raw bytes, and ended up finding exactly that in the
binary, so now we know exactly where it is. Another way of using this was to find where some of the UTF functions that Abdul will be talking about were located. In the same vein, you can just look for straight hex values: for one of the functions that was vulnerable, I just looked for the constants it uses and was able to find that function. So looking for data structures can be incredibly useful. One of the things I mentioned about not being able to build it and do a bindiff against that is that you have the preprocessor
defines. This is probably also not very readable, but you have huge chunks of code in Sablotron that are going to vary based on the value of these preprocessor defines, and in this particular case you have basically an entire function that will look very, very different based on that. There aren't that many places where you can understand what that looks like from the outside, other than looking at a particular function and checking for the absence or presence of a certain type of logic. But thankfully there are a few places, such as
this structure, which has a string that is only present when this define, which is the more prevalent preprocessor define, is actually defined. So by looking for "forward-compatible" we're able to see that in the production
build of Adobe Reader they did in fact use that define. And on that note, I'm going to hand it over to Abdul to go over the vulnerabilities.
I'm going to be covering a bunch of vulnerabilities here: first the previous ones that were found in the XSLT engine, and then a bunch of different bug types, including more recent bugs. The first two were found by Nicolas Grégoire back in 2012. One is a heap-based buffer overflow and the other one is a type confusion. The reason for going through these bugs is that it gives us an idea of how Adobe patches these bugs. Basically, they patched these bugs twice, once in 2012 and once in 2016, so there was about a four-year gap between the two patches, which is kind of interesting. So we'll start with the heap-based buffer overflow. XSL elements in Sablotron are handled inside a class called XSLElement, which has a method called execute that contains a giant switch-case and processes the elements based on their type. The way this works is that it checks whether the element has a name attribute, then grabs the name value, calculates its length, allocates a buffer based on that length, and tries to convert it from UTF-8 to UTF-16. This is done inside a function called isValidNCName. As you can see (I don't think you can actually see it on the slide), it calls utf8StrLength, allocates a buffer, and passes it to utf8ToUtf16 for the conversion. So we went back to the Sablotron source code to do some analysis, and we figured out that the core of this bug was the way the length of the string was being calculated inside utf8StrLength. It loops through the name string byte by byte and checks the length of each character, incrementing the count by one. But in special cases, with certain characters, it calculates the character length but doesn't add that length to the count; it keeps incrementing the count by one. So basically you start miscalculating the length; that length is later used in an allocation, and the buffer is then passed to utf8ToUtf16. That's the way the overflow actually happens.
Right. Pinpointing this in Adobe Reader: basically, in the DLL, you can find the compiled code by looking for a distinctive constant (heard as "C00" in the recording), and if you do that you land in utf8ToUtf16. The compiled code roughly matches what we have in the source code, so obviously they didn't really touch that code or modify anything. From utf8ToUtf16 you can grab the cross-references and end up in isValidNCName, and the first function called inside there is utf8StrLength. It's the same case: it hasn't been touched, so it's roughly the same code as what we have in the source, which is kind of interesting. The way they patched it back in 2012 was also interesting. We grabbed a bunch of DLLs from previous versions (Adobe is generous enough to put all the previous builds on their FTP server), and we were expecting to see some changes inside the length calculation function. But in fact what they did was modify the conversion function: they added an extra argument to it, the source length, and they use it to keep track of what's being written, which didn't really make sense. Then in 2016 we received a bug from the same researcher, Nicolas, that triggered the same exact crash. We submitted this bug to Adobe, and it got patched in January 2017, a couple of weeks ago. We just diffed it, and this time they actually did it right: they modified utf8StrLength, the function that does the calculation. As you can see, it calls utf8SingleCharLength to get the actual character length and then adds that to the counter instead of incrementing the count by one. So basically the gap was about four years.
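The following Python sketch illustrates why counting one unit per character can undersize a UTF-16 buffer, and why adding each character's byte length (the fix described above) gives a safe upper bound. It is an illustration under assumptions, not Sablotron's or Adobe's code: the function names are made up, and the use of a character outside the Basic Multilingual Plane (which needs two UTF-16 code units) as the trigger is an assumption about one plausible way the miscount becomes an overflow.

```python
def utf8_single_char_length(lead):
    """Byte length of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if (lead & 0xE0) == 0xC0:
        return 2
    if (lead & 0xF0) == 0xE0:
        return 3
    if (lead & 0xF8) == 0xF0:
        return 4
    raise ValueError("invalid UTF-8 lead byte")


def length_counting_characters(data):
    """Pre-patch style: step over each character but add only 1 to the count."""
    i = count = 0
    while i < len(data):
        i += utf8_single_char_length(data[i])
        count += 1
    return count


def length_adding_char_lengths(data):
    """Post-patch style as described in the talk: add each character's length."""
    i = count = 0
    while i < len(data):
        n = utf8_single_char_length(data[i])
        i += n
        count += n
    return count


def utf16_units_needed(data):
    """Number of 16-bit code units the UTF-16 conversion actually writes."""
    return len(data.decode("utf-8").encode("utf-16-le")) // 2


name = "a\U0001F600".encode("utf-8")  # one ASCII char plus one 4-byte character
print(length_counting_characters(name), utf16_units_needed(name),
      length_adding_char_lengths(name))
```

If a buffer is sized from the first count and the converter writes the second, the write runs past the allocation; sizing it from the third count over-allocates slightly but never under-allocates for valid input.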
All right, the next one I'm going to discuss is the second bug that was discovered by Nicolas back in 2012. Basically this is a type confusion bug, and that is specifically how he described it.
Right, so the root cause of this bug is in an XPath function called lang. The way lang works: it checks the number of arguments being passed to it, it checks the type of the argument, and when lang is being executed against a node, it tries to convert that node to an element, and that is done inside a macro called toE.
too is specifically interesting as it use another member another macro called cast and gasses is an interesting case and as basically it it calls dynamic cast but it does it in and the debug mode um if it is in in production just doesn't this like the classics style and and that's basically what causes that the whole fusion should what the so pointing this out in Adobe is pretty much straight forward all we have to do is this like and run a search on land and then we're gonna end up with 3 hits and it's pretty much too trivial to to find the right 1 so the right 1 is this gonna land use inside a function with a huge um switch cases this is not a problem the much easier so this it's it's the same thing that happened with the previous 1 basically Nicholas and submitted a park in 2016 that triggered the slide again the and up again when backward all the allows and we started this thing just to see what what happened back in 2012 and basically what they did is they just make sure that a certain type of node is being cast right but you can you can trigger the trigger the bug again by giving it like other types of nodes and that's exactly what we considered this clearly gave it a comment like us and and some of the of of the type of text mode and he was he was able to trigger and this is the 1st of the girI 2017 batch and basically they did right this time and day called a function called idea dynamic has to do with the casting in the right way right
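The cast pattern described above can be sketched in C++ as follows. This is a minimal illustration, not Sablotron's or Adobe's code: the class names, the `DEBUG_BUILD` flag, the `UNSAFE_CAST` macro, and the helper functions are all hypothetical stand-ins for the debug-only `dynamic_cast` the speaker describes.

```cpp
#include <string>

// Hypothetical node hierarchy: only Element carries a language attribute.
struct Node { virtual ~Node() = default; };
struct Element : Node { std::string lang; };
struct Text : Node { std::string content; };

// The pattern from the talk: the cast is type-checked only in debug builds.
// In a release build it degrades to a C-style cast that trusts the caller.
#ifdef DEBUG_BUILD
#define UNSAFE_CAST(TYPE, PTR) (dynamic_cast<TYPE>(PTR))
#else
#define UNSAFE_CAST(TYPE, PTR) ((TYPE)(PTR))
#endif

// Release build: returns a pointer even when `n` is really a Text node,
// so any later field access reads memory laid out for a different type.
inline Element* toElementUnchecked(Node* n) { return UNSAFE_CAST(Element*, n); }

// Checked variant: yields nullptr when the dynamic type is not Element.
inline Element* toElementChecked(Node* n) { return dynamic_cast<Element*>(n); }

// A lang()-style test that only touches Element fields after the check.
inline bool hasLang(Node* n, const std::string& wanted) {
    Element* e = toElementChecked(n);
    return e != nullptr && e->lang == wanted;
}
```

With the checked variant, passing a `Text` (or any other non-element node) simply makes `hasLang` return false. A fix that only rejects one specific node type, as in the 2012 patch, leaves every other non-element type free to reach the unchecked cast.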
So now I'm going to be talking about some more modern bugs. This one is an out-of-bounds read, and it's one of my favorites because it's really easy to spot in source code. It was found by one of our researchers.

This one is in the way XPath functions are handled inside Sablotron. Basically, the way it works in lang and the other XPath functions in Sablotron is that they check the arguments first (the argument count and the argument types), and then they dereference these arguments. All the checks are done inside macros. For example, checkArgsCount basically checks if the number of arguments is right, and if not, it raises an error. If you look for this pattern inside the function callFunc, which implements all the XPath functions, you notice again and again that they check the args and then dereference the args. There is only one case where that is not true, which is substring-after and substring-before: there they dereference the args first and then they check them, and that's exactly what causes this bug.

It's not that easy to pinpoint this one in the binary, because this is just the function that contains the whole switch statement that implements all these XPath functions. But you can use one of the techniques that I described before: basically just look for "lang", and then you land inside callFunc. From there you can calculate the case number, and then you can land on substring-before and substring-after inside that switch statement. So it's all good. As for the patch, I think you guys have probably figured it out: they just moved the args checking before dereferencing the actual args, so it's kind of straightforward.
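The ordering issue is easy to see in a small C++ sketch. This is an assumed, simplified model of an XPath `substring-before` handler written for illustration (the function name and signature are hypothetical, not taken from Sablotron or Reader); it shows the patched order, where the argument count is validated before any argument is touched.

```cpp
#include <string>
#include <vector>

// Hypothetical handler for substring-before(string, separator).
// Returns false (an error) when the argument count is wrong; otherwise
// writes the result to `out` and returns true.
inline bool substringBefore(const std::vector<std::string>& args,
                            std::string& out) {
    // 1. Validate first. The vulnerable order described in the talk read
    //    args[0] and args[1] before this check, so a call with too few
    //    arguments dereferenced slots that did not exist.
    if (args.size() != 2) {
        return false;
    }

    // 2. Only now is it safe to dereference the arguments.
    const std::string& text = args[0];
    const std::string& sep = args[1];

    // XPath semantics: the part of `text` before the first occurrence of
    // `sep`, or the empty string when `sep` does not occur.
    const std::string::size_type pos = text.find(sep);
    out = (pos == std::string::npos) ? std::string() : text.substr(0, pos);
    return true;
}
```

Called with `{"1999/04/01", "/"}` the sketch produces `1999`; called with a single argument it reports an error instead of reading past the end of the argument list.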
Right, so the next one I'm going to be talking about is a little bit more complicated than the others. It's an out-of-bounds write, and it was found by the same researcher too.

This one basically affects how lists are implemented inside Sablotron. Sablotron implements list objects, and they're just a basic structure for ordered lists, according to what they say anyway. There are two things we noticed when we studied this list: the constructor and the append method. In the constructor, a variable called original block size is set to 2 to the power of log size, and there's no check against that value whatsoever. In append specifically, that value is used in the allocation of a buffer, and then the value being appended is written to that specific block. The nice thing about it is that if someone controls that size, then they can trigger a condition that eventually allocates a very small buffer and then writes out of the bounds of that buffer.

This is basically what happened inside createContext. All the XPath expressions are handled inside that createContext function. It first defines a list variable called reached and passes it the predicates count. Predicates are things the user can control in XPath expressions, so in this case the predicates count is controlled by the user, and later it's passed to the reached list's constructor. So basically, if a user defines a certain number of predicates, he will be able to trigger the integer overflow there, the resulting value is then passed to the allocation, and he will be able to write out of bounds.
Right, so pinpointing this one is not really straightforward. First we have to find createContext, and from there we will be able to find createLPContextLevels. That function calls a function twice, which I named the list constructor, and as you can see in the decompiler it exactly matches the Sablotron code; nothing has been changed. The next function we can check is the list append, and basically Adobe just copy-pasted the same thing; they didn't modify any of the code. As you guys can see, it does the same thing: it grabs that original block size, uses it in an allocation, and then writes to that block. Great.

As for the patch, Adobe patched it in a really interesting way. Personally I was expecting them to do some checks on the variable initially, but what they did is implement some checks inside the append function. They make sure that the value is in a certain range, and if it is not, it gets set to a certain value and then the allocation is going to fail. So basically the check tests whether the size is greater than a certain limit, is zero, or is a negative value. If that check succeeds, then the size is going to be set to minus one and the allocation will just fail. Good.
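The shape of that fix, a range check inside append rather than in the constructor, can be sketched in C++ like this. It is a simplified model under stated assumptions: the class name `SketchList`, the limit `kMaxBlockSize`, and the use of `std::vector` as backing storage are all hypothetical, and the sketch refuses the append outright where the talk describes the patched code forcing the allocation to fail.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical upper bound for a sane block size; the real limit used in the
// patch is not recoverable from the talk.
const long long kMaxBlockSize = 1LL << 20;

class SketchList {
public:
    // As in the code described in the talk, the constructor stores the block
    // size without validating it, even though it may be attacker-influenced.
    explicit SketchList(long long origBlockSize)
        : origBlockSize_(origBlockSize) {}

    // The range check lives in append: a zero, negative, or oversized block
    // size is rejected before any allocation or write happens.
    bool append(int value) {
        if (origBlockSize_ <= 0 || origBlockSize_ > kMaxBlockSize) {
            return false;
        }
        if (block_.capacity() == 0) {
            block_.reserve(static_cast<std::size_t>(origBlockSize_));
        }
        block_.push_back(value);
        return true;
    }

    std::size_t size() const { return block_.size(); }

private:
    long long origBlockSize_;
    std::vector<int> block_;
};
```

Checking at the point of use protects every caller of append, but it also leaves the bad value stored in the object, which is why validating in the constructor as well would be the more defensive design.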
So the last bug I'm going to be covering is a double free, and this one was found by the same researcher too. Interestingly, Nicholas found this bug too. He submitted it to us, but we had to reject it because it was a duplicate submission.

Inside Sablotron they implement something called guarded pointers. Basically these are pointers that deallocate themselves automatically on exit, or when something out of band happens, like an error being triggered or anything like that. What's interesting about the implementation is the assignment operator, which returns a raw pointer. So they cannot actually track these; there's no counter, no reference counting as we call it. So in certain cases, if you pass a reference to that pointer to a function, there's no way that the function can guarantee the lifetime of these pointers. So basically, if an error is triggered, the first pointer is freed, then the second one frees the same allocation, and you end up in a double-free situation.

This specifically happened inside the filter case. Right here there's a filteredC GP pointer which is assigned from newC, so basically again a raw pointer is assigned to the GP. When you're dealing with XPath functions, you can again force an error by calling a fake function, and that's going to force these GP pointers to deallocate, to free themselves. So you end up with newC being freed and then filteredC being freed too, and you end up in a double-free situation here.

Adobe implemented this in a really interesting way. Basically, as you guys can see, it just creates the reference and stores it in another local variable. Later in the code it just grabs one reference and deletes it, and then grabs the other variable and deletes it too, and that's how it was pinpointed in Adobe Reader.
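To make the ownership problem concrete, here is a small C++ model of a guarded pointer whose assignment returns a raw pointer. It is only a sketch: `Context`, `GuardedPtr`, `release`, and the two driver functions are hypothetical names, and the destructor increments a counter instead of calling `delete`, so the double free can be observed without actually corrupting the heap.

```cpp
// Hypothetical pointee that records how many times it has been "freed".
struct Context {
    int frees = 0;
};

// Model of a guarded pointer: it frees its pointee when it goes out of scope
// (including on an error path) and keeps no reference count.
class GuardedPtr {
public:
    GuardedPtr() : p_(nullptr) {}
    ~GuardedPtr() {
        if (p_) ++p_->frees;  // stands in for `delete p_`
    }
    // Assignment hands back the raw pointer, so the caller can store the
    // same address in a second guard without either guard knowing.
    Context* operator=(Context* raw) {
        p_ = raw;
        return raw;
    }
    // Explicit hand-off: give up ownership so only one guard frees.
    Context* release() {
        Context* raw = p_;
        p_ = nullptr;
        return raw;
    }

private:
    Context* p_;
};

// Two guards end up owning one object; both free it when the scope unwinds.
inline int freesWithAliasing() {
    Context ctx;
    {
        GuardedPtr newC;
        GuardedPtr filteredC;
        filteredC = (newC = &ctx);
    }
    return ctx.frees;
}

// Ownership is transferred instead of duplicated; exactly one free happens.
inline int freesWithHandOff() {
    Context ctx;
    {
        GuardedPtr newC;
        GuardedPtr filteredC;
        newC = &ctx;
        filteredC = newC.release();
    }
    return ctx.frees;
}
```

In the aliasing case the counter ends at two, which corresponds to the double free; with an explicit hand-off it ends at one. A reference-counted pointer would be another way to get the same guarantee.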
As for the patch, they patched it in a really interesting way too: basically they removed the code responsible for deleting the second reference. But you still have a reference in that stack variable, so if someone is actually able to force that reference to be deallocated again, then you're going to be in a double-free situation. So I'm going to hand it over to Brian, and he can conclude all this.
What was always interesting about this is that, instead of having to actually audit the binary, you can go directly to the source and look for vulnerabilities, and find vulnerabilities in a lot easier way. Sablotron has been around for a long time. One of the things that we actually learned came from looking back at Adobe Reader: Abdul did some analysis where he looked at the DLL that contains the XSLT engine, going back through every single release of Adobe Reader since Reader 9 came out, and they have really only implemented security fixes in every one of those updates. A white paper that we'll release in the coming months will go over basically every single security fix that they've implemented inside of this engine over the years.

It really provides an interesting approach: you can use some of the techniques that Abdul talked about for pinpointing the vulnerability in Adobe Reader, and look for that vulnerability in the open-source code. For programs like the Zero Day Initiative, where we're buying vulnerabilities, it's a perfectly valid way to look for bugs and find bugs. And if you look at the contest, right, we would pay somebody like fifty thousand dollars for an exploit against Adobe Reader, and they can easily just audit open-source code, find the bugs in Adobe Reader, write an exploit, and win the contest. So it's an interesting technique that you can use.

Adobe Reader is a very, very popular piece of software inside enterprises. In fact, I was at a meeting with some government officials this morning and I was looking at their laptops, and all of them had Adobe Reader on them, so it makes a very interesting target to go out and look at. And, you know, many of these projects no longer have communities available to develop security fixes or any improvements at all, so it's really on the vendor to go find and fix the incoming security vulnerabilities. Because they didn't implement the code to begin with, they may not have a good understanding of it, and as we've seen here with the bugs that Abdul talked about, many times they don't actually fix them correctly. The two bugs that were discovered in 2012 were still around for many, many years, until they were patched most recently with the new incoming submissions into our program.

So we'll leave it at that. It's an interesting codebase to look at, and you can kind of transform those open-source bugs into bugs in the closed-source application. We hope you enjoyed the presentation, and we're happy to answer any questions if you have any.

Yes, so I have a question regarding automatically matching source to a binary by compiling it. Sablotron has a lot of defines, and we don't know which defines to use when building. Have you tried similarity techniques to find which defines to use?

When I'm doing things, I like to go with what I know for sure. So finding things like that array of strings, where you know for sure that it has that particular define: I usually try and go for that, just so I know I can build on it rather than having to do it in small pieces. That's a personal preference.

Perfect, thank you very much.