Make "Invisible" Visible: Case Studies in PDF Malware

Video in TIB AV-Portal: Make "Invisible" Visible: Case Studies in PDF Malware


Formal Metadata

Title: Make "Invisible" Visible: Case Studies in PDF Malware
Author: Zhang, Jason
License: CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Due to the popularity of the portable document format (PDF), malware writers continue to use it to deliver malware via web downloads, email attachments and other infection vectors in both targeted and non-targeted attacks. It is known that PDF attackers can break detection by using polymorphic techniques to hide malicious code, randomizing JavaScript, obfuscating embedded shellcode or using cascading filters. Malware writers have always tried hard to develop new techniques to bypass detection. Some recent PDF attack campaigns we have seen are typical examples of such new endeavors from malware writers: a) Simple but effective URL aliasing technique to download malware. b) Using PDF to deliver specific topic related text content for search engine poisoning. c) Encapsulating PDF malware inside a PDF file to break detection. In this paper we will investigate the recent PDF malware campaigns using - and often abusing - these new techniques.

Related Material

Computer animation Lecture/Conference
We know that malware writers always try to hide their malicious behavior and keep changing their techniques. Over the last year we have been tracking PDF malware, and today I would like to share some of our case studies. Before I talk about that, I would like to give a brief introduction to SophosLabs. SophosLabs is one of the primary parts of Sophos. We are the people who define the detection: we carry out threat research, we do the data analysis and, above all, we provide detection and protection for all our customers around the world. You might ask me a question: do you guys work around the clock, 24 hours a day? The answer is no, of course not, but we have labs in different countries. Sophos is headquartered in Oxford, UK, we have an office here in Budapest, and we have more labs in other countries, including India and Australia. That is why we can work around the clock. We have between 100 and 200 analysts and engineers, who cover threat research, response, system development, advanced research and detection. If any of you are interested in malware research and development, you are welcome to contact us at sophos.com.
OK, so this is the agenda. I am going to start with an introduction to PDF: why those guys target PDF, and what the structure of a PDF file looks like. Then I will discuss some case studies. The first is PDF with a URI, which is used for phishing in the name of well-known companies. The second is about those guys using PDF for search engine poisoning. The last is about a PDF embedded in another PDF file. Each case will be followed by a demo.
So why does PDF get targeted? Everyone uses PDF every day and it is very popular, partly because it is independent of operating system and platform. If you get a PDF from a friend in your inbox, you feel it is less suspicious than an .exe. The other reason is that the Adobe Acrobat PDF reader is so widely used and so flexible, which offers a lot to malware writers who want to attack the reader. Those are the two key reasons why PDF is targeted.
Even though we use PDF every day, you might never have had the chance to learn the structure of a PDF. It is like driving a car: you don't have to know the engine or what is under the bonnet.
So here I have to give some basic information about PDF structure. Basically a PDF has four sections: the header, followed by the body section, which is a group of objects stored in a random order, then the cross-reference table, and last the trailer.
We can see the header here, which only has a single line presenting the PDF version number, so it is quite simple. This is the main part, the body section, which is a collection of PDF objects stored in arbitrary order. An object can be any of a number of types: an array, a dictionary, a number, a stream. People also like to use different filters to compress or encode the stream data, which makes the stream less readable. This is especially common in PDF malware.
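
To illustrate why cascaded filters make a stream unreadable, here is a minimal Python sketch. It is not from the talk. It assumes a stream declared with `/Filter [/ASCIIHexDecode /FlateDecode]`, and the script text and helper names are made up for the example.

```python
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    """Decode ASCIIHexDecode data: hex digits, whitespace ignored, '>' ends the data."""
    hex_digits = b"".join(data.split()).split(b">")[0]
    if len(hex_digits) % 2:
        hex_digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(hex_digits)


# Only two filters are handled in this sketch; real PDFs define several more.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "FlateDecode": zlib.decompress,
}


def decode_stream(raw: bytes, filters: list[str]) -> bytes:
    """Apply the filters in the order they are listed in the /Filter array."""
    for name in filters:
        raw = DECODERS[name](raw)
    return raw


# Hypothetical script, encoded the way a cascaded stream would store it.
script = b"app.alert('hello');"
encoded = binascii.hexlify(zlib.compress(script)) + b">"
decoded = decode_stream(encoded, ["ASCIIHexDecode", "FlateDecode"])
```

A scanner that only looks at the raw bytes in `encoded` never sees the script text; it has to undo every filter in the chain first.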
The third section is the cross-reference table. This is really important because this table tells you the offset of each object. The objects may be stored in a random order, but because of the cross-reference table you know the offset of each object, so it serves as the index table. As you can see, the first column is the offset of that particular object from the beginning of the file, and the second one is the generation number. If the PDF file has just been created and has had no modification, the generation number starts from zero. The first entry in the cross-reference table always starts with offset 0 and has a really huge generation number, so we can ignore it. The third column holds a letter: 'n' means the object is in use and 'f' means the object is free, that is, not used.
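
The three columns described above can be read with a few lines of Python. This is a minimal sketch that assumes a classic text cross-reference table with a single subsection. The sample table and the function name are invented for illustration.

```python
def parse_xref(table: str) -> list[dict]:
    """Parse a one-subsection classic xref table into a list of entries."""
    lines = table.strip().splitlines()
    if not lines or lines[0].strip() != "xref":
        raise ValueError("expected the 'xref' keyword")
    # The subsection header gives the first object number and the entry count.
    first_object, count = map(int, lines[1].split())
    entries = []
    for number, line in enumerate(lines[2:2 + count], start=first_object):
        offset, generation, flag = line.split()
        entries.append({
            "object": number,
            "offset": int(offset),          # bytes from the start of the file
            "generation": int(generation),  # 0 for a never-modified object
            "in_use": flag == "n",          # 'n' = in use, 'f' = free
        })
    return entries


# Illustrative table; the offsets are made up.
SAMPLE_XREF = """xref
0 4
0000000000 65535 f 
0000000009 00000 n 
0000000074 00000 n 
0000000120 00000 n 
"""

entries = parse_xref(SAMPLE_XREF)
```

The first entry is the free one with the very large generation number, which is why it can be skipped when looking for real objects.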
The last section is the trailer, which tells you the size, that is, how many objects are in the PDF file, and the root object, which is where to start. If the file has been modified, you also have a /Prev entry, which points to the previous cross-reference table offset. Then you will see the startxref keyword, followed by the offset of the cross-reference table, and finally the end-of-file marker. That was the structure.
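
Putting the four sections together, the following Python sketch reads the version, the trailer's /Size and /Root, and the startxref offset from a tiny hand-built file. It assumes an uncompressed, classic-layout PDF. The sample bytes are constructed only for this example and are not from the talk.

```python
import re


def summarize(pdf: bytes) -> dict:
    """Pull the header version and the trailer facts out of a classic PDF."""
    version = re.match(rb"%PDF-(\d\.\d)", pdf)
    startxref = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", pdf)
    size = re.search(rb"/Size\s+(\d+)", pdf)
    root = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", pdf)
    if not (version and startxref and size and root):
        raise ValueError("not a minimal classic PDF")
    xref_offset = int(startxref.group(1))
    return {
        "version": version.group(1).decode(),
        "object_count": int(size.group(1)),
        "root_object": int(root.group(1)),
        "xref_offset": xref_offset,
        # The startxref offset should land exactly on the 'xref' keyword.
        "xref_found": pdf[xref_offset:xref_offset + 4] == b"xref",
    }


# Header plus one object; the xref table starts right after the body.
body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
xref = b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
trailer = (
    b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % len(body)
)
info = summarize(body + xref + trailer)
```

Reading from the end of the file backwards (end-of-file marker, startxref, trailer, then the table) is the order a reader would normally follow to locate objects.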
Now for the case studies. The first one is simple. It is a PDF file that is small in size, a single-page file with very little text, and it contains a URI which points to somewhere malicious or to a promotion website. We noticed that these files always pretend to come from popular brands, for example logistics companies like FedEx. Let's take a look at some examples. You can see that you have only two or three sentences, with a link in the middle. They frequently use a URL shortening service, so the link you see is different from where it really points, which is a promotion website or malware. Here is another one.
If you click the link, the PDF reader pops up a warning saying that you are going to another site and asking whether you want to proceed or not. You also have a checkbox to remember this action, which works like a whitelist or blacklist. If you click Allow, it opens a browser and redirects you to the site. You might be wondering who is going to click this link. If someone does click Allow and proceed, in this case it is just another promotion website which contains some ads. This kind of link only exists for a short time, so we often cannot get the sample unless we know about it while the link still exists. If you are lucky, you can download it.
You might also ask why those guys put the phishing in a PDF, when they could just put the text in the email. The reason is that plain email phishing has become inefficient. In an email you have the header information, the sender, the URI and more, so it is easier for anti-spam and other detection to block the phishing, because there is less risk of a false positive. In the PDF example you only have a little text and the URI, and it is hard to write a detection for that without running into false positives. You have to keep balancing this, and that is the challenge. That is why they put the phishing in the PDF file and attach the PDF file to the email. So be careful. The main point is that they intend to lead you to malware or to a promotion website.
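
The traits described for this first case (a single page, little text, one outgoing URI, often through a shortening service) can be checked with a rough Python sketch like the one below. It is an illustration only, not the detection used by the speaker. It works only on uncompressed objects, and the shortener list and sample bytes are assumptions.

```python
import re
from urllib.parse import urlparse

# Illustrative list of shortening services; not exhaustive.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "goo.gl"}


def uri_report(pdf: bytes) -> dict:
    """Collect /URI targets and the page count from uncompressed PDF objects."""
    uris = [m.decode("latin-1") for m in re.findall(rb"/URI\s*\(([^)]*)\)", pdf)]
    # Count /Type /Page but not /Type /Pages (the page tree node).
    pages = len(re.findall(rb"/Type\s*/Page(?![a-zA-Z])", pdf))
    shortened = [u for u in uris if urlparse(u).hostname in SHORTENER_HOSTS]
    return {
        "uris": uris,
        "pages": pages,
        "shortened": shortened,
        "suspicious": pages <= 1 and bool(uris),
    }


# Hand-written fragment that mimics the structure described in the talk.
SAMPLE = (
    b"1 0 obj << /Type /Pages /Count 1 >> endobj "
    b"2 0 obj << /Type /Page /Annots [3 0 R] >> endobj "
    b"3 0 obj << /Type /Annot /Subtype /Link "
    b"/A << /S /URI /URI (http://bit.ly/example) >> >> endobj"
)
report = uri_report(SAMPLE)
```

As the talk points out, a rule this simple would also fire on many harmless documents, which is exactly the false positive problem that makes this kind of file hard to detect.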
OK, let's have a look at this simple demo. I am running it in a virtual machine on my computer.
You can see that this is a typical example of the URI case, in the name of an express logistics company. If you click the link, it is sometimes a bit slow, so just wait, and then it pops up a small window asking whether you want to proceed or not. As this kind of link does not exist for a long time, if you are lucky you can manage to download the sample, but sometimes the site has already gone. Here we see that it pops up a window with the Allow and Block options, and the checkbox to remember your action, which works like the whitelist or blacklist. I don't think we need to go further, because the idea is clear: if you click Allow, it opens a browser and connects to the site.
I think the second case is more interesting, because these PDF files are used for search engine poisoning, which is also called black hat search engine optimization. Basically this attack does not directly target humans. It is indirect: it directly targets the search engines, Googlebot and the other search engine crawlers. Of course it is built for the search engine, and eventually for humans: when you do a search on the search engine, you get affected. The characteristics of this kind of malware are as follows. It has only a single page again. Because those guys want to promote a website and do black hat search engine optimization, the files are stuffed with specific keywords, like investment or binary trading. Each file has links to other similar PDF files, so that the files point to each other, which is meant to fool the search engine's ranking. At first sight you feel that this is a normal document, with a nice heading, a graph and then some text, but if you take a closer look at the text, it is just a set of keywords related to some specific topic. In this case it is a financial topic, binary trading.
Here is another example. In the middle of the file you have different names, which actually link to the same or similar sites. It is just a kind of PDF stuffed with keywords. If you take a closer look at those sentences, they are not really complete sentences, just strings of words that make up the keywords.
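
The observation that the text is "just strings of keywords" can be turned into a crude measurement. The Python sketch below computes how much of a text is made up of its few most frequent words. It is a toy heuristic written for this transcript, not how any search engine or security product scores documents, and both sample texts are invented.

```python
import re
from collections import Counter


def keyword_density(text: str, top_n: int = 3) -> float:
    """Fraction of all words that belong to the top_n most frequent words."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    top = Counter(words).most_common(top_n)
    return sum(count for _, count in top) / len(words)


stuffed = (
    "binary trading binary options trading binary trading "
    "options broker binary trading"
)
normal = (
    "The report explains how attackers hide code inside documents "
    "and why readers should update software."
)

stuffed_score = keyword_density(stuffed)
normal_score = keyword_density(normal)
```

On these samples the stuffed text scores far higher than the ordinary sentence, which matches the impression one gets from reading such a file by eye.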
So what do the search results look like? We did some Google searches here for keywords related to binary trading, and notice that we got 47 million results. At the top you see the results marked as ads, which are sponsored advertisements on Google. If you want to promote your site, for example for a new product, you can pay for the keywords. It is an auction based system, so the more you pay, the higher the ranking you will get. Of course Google's system is really complicated; in general the score is based on maybe a thousand parameters. So this section is the paid search results, and, although I did not show it here, in the right corner you also have sponsored results. The second part is the free results, which is what we call organic search results. If you do not pay for the search results, you have to work with the system to get your site to the top. Here you can see that the PDF related results are at the top, so those guys were successful at bypassing Google's system, even though we believed it would do better than that. We contacted Google, and Google agreed that this is a problem and that they were going to work on it and improve the results. We did our analysis earlier in the year, and you can see that since then the number of results has dropped dramatically. So Google did a good job: they listened to us and improved the results.
If you take a closer look at a result, there is this square bracket, [PDF], which appears because Google believes this is a PDF file. You have the subject, which is binary trading, and if you click that link it directs you to the promotion site. But if you take a close look here at the URL and click the cached option, Google shows you the content that Googlebot actually saw, and this is a file stuffed with the keywords related to the specific topic.
To verify this, you can use Google's cache option: you specify it in the search box and then enter the URL.
Here you can see the content as Google sees it. It starts with the %PDF header, followed by PDF objects, so Google believes that it is a PDF.
You might wonder how we know that you are redirected from this link to that website. We can trace this by using the web developer tools, and then you can see the redirection chain. It starts from the Google search result and ends at the promotion website that I mentioned. You can also see a parameter in the URL, which might be related to the promoter's agent ID.
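
What the developer tools show is a list of hops, each one answering with a redirect to the next. The Python sketch below follows such a chain over a table of recorded responses instead of live network traffic. The URLs, the `agent` parameter and the response table are hypothetical stand-ins, not the ones from the demo.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def redirect_chain(start: str, responses: dict, max_hops: int = 10) -> list[str]:
    """Follow (status, location) records from start until a non-redirect."""
    chain = [start]
    url = start
    for _ in range(max_hops):  # the hop limit also protects against loops
        status, location = responses.get(url, (200, None))
        if status not in REDIRECT_CODES or not location:
            break
        url = location
        chain.append(url)
    return chain


# Hypothetical recorded responses: URL -> (status code, Location header).
RECORDED = {
    "http://search.example/result?q=binary+trading":
        (302, "http://files.example/report.pdf"),
    "http://files.example/report.pdf":
        (302, "http://promo.example/landing?agent=1234"),
    "http://promo.example/landing?agent=1234": (200, None),
}

chain = redirect_chain("http://search.example/result?q=binary+trading", RECORDED)
```

The last element of `chain` is where the user actually ends up, and query parameters on that final URL are the kind of detail the talk suggests may identify the promoter.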
So the whole purpose for those guys is to raise their ranking in the search results by bypassing the search engine's cloaking detection. Cloaking means that you try to hide something: you show one thing to the search engine and another to the user. Google is really competent here; they have a lot of people working on the system to detect those kinds of poisoning techniques. But, as I mentioned before, PDF is believed to be less suspicious than other formats, so those guys gave it a try, and maybe the system had not been designed with this in mind. So they successfully bypassed the search engine's cloaking detection. The whole purpose is to increase the free search ranking, the so-called organic search ranking, and in the end to redirect users to the promotion website.
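
One simple way to think about cloaking is to request the same URL twice, once presenting a crawler's user agent and once a browser's, and compare what comes back. The Python sketch below does that comparison with an injected `fetch` function so that it runs without network access. The user-agent strings, the threshold and the fake responses are assumptions made for the example.

```python
from difflib import SequenceMatcher

CRAWLER_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0"


def looks_cloaked(url: str, fetch, threshold: float = 0.5) -> bool:
    """True when crawler and browser receive very different content."""
    as_crawler = fetch(url, CRAWLER_UA)
    as_browser = fetch(url, BROWSER_UA)
    similarity = SequenceMatcher(None, as_crawler, as_browser).ratio()
    return similarity < threshold


def fake_fetch(url: str, user_agent: str) -> str:
    """Stand-in for an HTTP client: keyword-stuffed PDF for the crawler only."""
    if "Googlebot" in user_agent:
        return "%PDF-1.4 binary trading binary options trading"
    return "<html><body>Redirecting to the promotion site</body></html>"


def honest_fetch(url: str, user_agent: str) -> str:
    """Stand-in for a site that serves everyone the same page."""
    return "<html><body>Same page for everyone</body></html>"


cloaked = looks_cloaked("http://files.example/report.pdf", fake_fetch)
honest = looks_cloaked("http://files.example/report.pdf", honest_fetch)
```

In practice a site can also cloak by IP address or referrer rather than user agent, so a check like this is only a first approximation.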
Let me show this in a demo. I am connected to the Internet through the Hacktivity Wi-Fi.
Here are the search results for the keyword. You can see the sponsored results here; which ones you get depends on the popularity of the keywords. If you click one of them, the sponsor pays for it; this is the so-called CPC, cost per click. I will not click it, because it would cost the sponsor money.
I am not really interested in the paid results anyway.
OK, here in the free results you can see PDF results, and another one here. This is much better than what we saw in July this year. Take this one as an example: if you click the cached option, it shows you the PDF file which has the keywords related to binary trading. Now I open the link in Firefox, with a plugin that lets us trace the redirection chain.
It directs you to this financial trading website, and we can see the real redirection chain here. The redirection comes first from the Google search result and then goes to the actual promotion website. So this case is really interesting: those guys used PDF to promote a financial website.
Now let me go back to the cached copy. This is what Googlebot saw: it is a PDF file, Google believes that it is an application/pdf document, and it has all the PDF structure attributes.
Now the last case. The demo for it runs in VirtualBox and will take some time, so I want to start it first, before I start to talk. I open Task Manager here, and then I start a snapshot, which takes a record of the current registry status and the file system. Let's leave that running.
Now about this PDF in PDF case. It looks like a normal PDF file, but it contains another PDF file, which is embedded as an object, as an attachment. Once you open the outer PDF file, it drops the embedded PDF and loads it automatically. The samples we came across carry exploits. The most common one targets the util.printf function, which is called with a huge number, and there can be newer ones as well. The purpose is to drop malware directly, or to download it from a remote URL, once the so-called shellcode has been successfully executed.
The outer file does not look suspicious; it just looks like a typical PDF. But once you open this kind of file, after some time the embedded PDF is loaded automatically.
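
A quick way to triage such a file is to count the PDF names that tend to accompany embedded and auto-running content. The Python sketch below does this on raw bytes. It is an illustration under the assumption that the relevant objects are not hidden inside compressed streams, and the marker list and sample fragment are chosen for the example.

```python
import re

# Names commonly associated with attachments, scripts and automatic actions.
MARKERS = [b"/EmbeddedFile", b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch"]


def count_markers(pdf: bytes) -> dict:
    """Count whole-name occurrences of each marker in the raw PDF bytes."""
    counts = {}
    for marker in MARKERS:
        # The lookahead stops /EmbeddedFile from also matching /EmbeddedFiles.
        pattern = re.escape(marker) + rb"(?![A-Za-z])"
        counts[marker.decode()] = len(re.findall(pattern, pdf))
    return counts


# Hand-written fragment of an outer PDF that carries an attachment.
OUTER = (
    b"1 0 obj << /Type /Catalog /OpenAction 5 0 R "
    b"/Names << /EmbeddedFiles 7 0 R >> >> endobj "
    b"8 0 obj << /Type /EmbeddedFile /Length 10 >> stream\n"
    b"%PDF-1.3 ...\nendstream endobj"
)
counts = count_markers(OUTER)
```

The outer file in this case would show an embedded file and an automatic action but no script of its own; the JavaScript only appears once the embedded PDF is extracted and scanned in turn.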
If we take a closer look at this example, you can see the embedded file marker, which means that this object holds the embedded PDF. If you extract this PDF file and take a look at it in an editor, you can see that it contains lots of obfuscated JavaScript inside.
In this case it targets different vulnerabilities. The JavaScript checks the PDF reader version and, depending on the version, triggers a different vulnerability.
Here is the second level. Again you have the embedded object, and here you have the JavaScript.
This is the JavaScript extracted from the embedded PDF. It does not really make sense to you; it is just encoded stuff.
We managed to tidy it up to make it easier to read. Here you can see the deobfuscated JavaScript. You can see that this part is actually the shellcode, in an encoded form, and that the unescape function is used here to decode the data.
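
The encoded data passed to unescape usually takes the form of `%uXXXX` sequences, where each sequence is one 16-bit value. The Python sketch below converts such a string into raw bytes the way analysts commonly do, storing each 16-bit value low byte first. Mapping a plain `%XX` escape to a single byte is a simplification, and the sample string is a harmless stand-in rather than real shellcode.

```python
import re
import struct


def unescape_to_bytes(escaped: str) -> bytes:
    """Turn %uXXXX and %XX escapes into bytes; other characters are ignored."""
    out = bytearray()
    pattern = r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})"
    for unit, byte in re.findall(pattern, escaped):
        if unit:
            # A 16-bit unit is laid out little-endian in memory.
            out += struct.pack("<H", int(unit, 16))
        else:
            out.append(int(byte, 16))
    return bytes(out)


# Harmless stand-in: decodes to the ASCII text "hello".
decoded_bytes = unescape_to_bytes("%u6568%u6c6c%6f")
nops = unescape_to_bytes("%u9090%u9090")
```

Once the bytes are recovered they can be disassembled or searched for strings, which is the next step in the talk.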
In the second part of the JavaScript, we can see that it checks the version of the PDF reader and then, depending on the version, calls a specific function which is related to a vulnerability. In this case it checks for one version and calls one function, it checks whether the version is greater than a certain value and calls another, and for the other versions it goes for util.printf, which just prints a huge number to trigger the bug. In general, why do those guys try to exploit vulnerabilities? Because Adobe Reader and the other PDF readers are code written by humans, and humans are not perfect; everyone makes mistakes. So they can always find some bugs, some weakness in the application, and they exploit that weakness to try to run the shellcode.
You can see that before it checks the version, it prepares the heap spray: it decodes the data and sprays it into memory, and the data is the shellcode. When the vulnerability is triggered, it modifies the value of the instruction pointer, which gets overwritten and then points to some memory in the heap. The new value will be more or less random, so you cannot be sure that it matches the start of the shellcode in memory. That is why they repeat the shellcode, each copy preceded by a so-called NOP sled, so that there is a higher chance of running the shellcode. That is what heap spray is about.
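
Why the NOP sled raises the chance of success can be shown with a small model that runs nothing and only counts addresses. In the Python sketch below, memory is filled with repeated blocks of NOP bytes followed by a four-byte placeholder, and a landing address "succeeds" if sliding over NOPs from there reaches the start of the placeholder. The placeholder bytes, sled lengths and block counts are arbitrary choices for the illustration.

```python
NOP = 0x90


def build_spray(payload: bytes, sled_len: int, copies: int) -> bytes:
    """Repeat a block of sled_len NOP bytes followed by the payload."""
    return (bytes([NOP]) * sled_len + payload) * copies


def landing_succeeds(memory: bytes, address: int, payload: bytes) -> bool:
    """Slide over NOPs from address and check we arrive at the payload start."""
    while address < len(memory) and memory[address] == NOP:
        address += 1
    return memory.startswith(payload, address)


def hit_rate(memory: bytes, payload: bytes) -> float:
    """Fraction of all landing addresses that lead to the payload start."""
    hits = sum(landing_succeeds(memory, a, payload) for a in range(len(memory)))
    return hits / len(memory)


# Placeholder bytes standing in for shellcode; they are never executed.
PAYLOAD = b"\xde\xad\xbe\xef"

with_sled = hit_rate(build_spray(PAYLOAD, sled_len=28, copies=8), PAYLOAD)
without_sled = hit_rate(build_spray(PAYLOAD, sled_len=0, copies=8), PAYLOAD)
```

With a 28-byte sled in front of a 4-byte payload, 29 of every 32 addresses lead to the payload, against 1 in 4 with no sled at all; a real spray uses far longer sleds, which pushes the rate closer to certainty.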
We extracted the shellcode and decoded it. This is the decoded shellcode, the binary version. If you open it in an editor, you can see that it connects to some URL to download something.
We can also analyze it with a so-called emulator, which has the advantage that you are not actually connected to the remote site and do not actually download and execute the malware.
So the purpose of embedding the PDF in another PDF is to make the file look less suspicious. Also, some anti-virus products are not good enough at extracting the embedded PDF from the outer file. So this is a way to bypass anti-virus detection, and in the end, of course, they try to infect the system.
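
Spotting a download URL inside decoded shellcode can be as simple as searching the bytes for printable strings. The Python sketch below looks for plain `http://` or `https://` strings. It assumes the URL is stored unencoded, which is often not true in practice, and the sample bytes and host name are invented.

```python
import re


def find_urls(blob: bytes) -> list[str]:
    """Return printable http(s) URLs found in a binary blob."""
    # A URL ends at the first byte that is not a printable, non-space ASCII char.
    matches = re.findall(rb"https?://[\x21-\x7e]{4,}", blob)
    return [m.decode("ascii") for m in matches]


# Invented stand-in for decoded shellcode: a few opcodes, a URL, a terminator.
BLOB = (
    b"\x90\x90\xeb\x10"
    + b"http://malware.example/payload.exe"
    + b"\x00\xcc"
)
urls = find_urls(BLOB)
```

When the string search finds nothing, the URL is probably encoded, and that is where an emulator of the kind mentioned in the talk becomes useful.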
So let me show this in the demo. Before I started talking about the third case I set this up. I opened Task Manager because we want to monitor the memory usage, and you will notice that the memory usage goes up dramatically because of the spraying of the shellcode. We also took a snapshot of the system, so that we can tell whether the registry has been modified and whether files have been added, dropped or modified. So I have the first snapshot. What I am going to do next is open the PDF sample, and after that I will take a second snapshot, so that you can see the difference: whether the system has been modified and whether any file has been dropped. Now I am going to open the sample. Here you can see that the file does not look suspicious. After opening it, a window pops up and asks you to click OK, and the reader stops responding. You can see here that the memory usage is now about 150 megabytes. In the background the embedded PDF has been dropped and loaded, and it is spraying the shellcode into memory. We can see the memory usage increasing: now it is 375, and it will go up to about 600. This is typical of this kind of PDF malware: it always tries to exploit one of the vulnerabilities in the PDF reader, and to do that it has to inject the shellcode into memory.
so this this year that in order to sort property so what I'm going to do now is to is to take another snapshot and see which has been changed in the system we have that in this as in this will take some time to because it would be coming to the front and the really good you know and to work on our type of uh most as an instance to the channeling of Jewish and Chinese food but less time at the image of America in our response and will have some charge that me in the way of all the cases and there's no you don't look like writers you you don't you know you don't look like a virus and it don't you know look like errors I don't know if you mutates like errors but it serves as an Chinese with uh but this it'll be lose a Chinese just came priest tell me what kind when we don't we don't need because you know people know we will not remove were likely to know all sorts of animals and their decisions and we anything fly in the sky is acceptably well weight anything with with let's accept tables and chairs so yeah that's that's what I'm not really true been exaggerated but you know we do each list of stuff but this the fact and social look and find the most prized so and this followed by the way you the english is really good at it because I'm trying opportunistically could English right so even if today to the last element of me because I didn't want to go to the Continental last nite and and educated drawing so so the server that uh I think
Perfect, OK, the snapshot has finished. We can close this; we don't need it. Now we can see the difference between the two snapshots. If we look at this part of the report, the files added, you can see the files that were dropped. You find them in this folder, the sandbox folder, because Adobe's PDF reader introduced a sandbox in its newer versions. So the sample appears to have got past the sandbox, and it also dropped some other files which are suspicious.
So this case is about a PDF embedded in another PDF as an object, in order to bypass detection.
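
The before-and-after comparison in the demo comes down to a difference between two snapshots. The Python sketch below compares two mappings from file path to content hash and reports what was added, removed or modified. The paths and hash values are invented, and the tool used in the demo also covers the registry, which this sketch does not.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two {path: content_hash} snapshots."""
    return {
        "added": sorted(p for p in after if p not in before),
        "removed": sorted(p for p in before if p not in after),
        "modified": sorted(
            p for p in after if p in before and after[p] != before[p]
        ),
    }


# Invented snapshots taken before and after opening a sample.
BEFORE = {
    "C:/Users/demo/Documents/invoice.pdf": "a1",
    "C:/Users/demo/settings.ini": "b2",
}
AFTER = {
    "C:/Users/demo/Documents/invoice.pdf": "a1",
    "C:/Users/demo/settings.ini": "c3",
    "C:/Users/demo/AppData/Temp/dropped.pdf": "d4",
    "C:/Users/demo/AppData/Temp/dropped.exe": "e5",
}
changes = diff_snapshots(BEFORE, AFTER)
```

Files that appear under "added" right after opening a document are the first candidates to examine.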
I am just about to run out of time, so the conclusion is simple. We gave a brief overview of the PDF structure, and we discussed three PDF cases: what they are doing, and why they are constructed in such a way. In the case of PDF phishing, it is because it is harder for anti-virus vendors to write a signature that balances false positives and false negatives. For the search engine poisoning, the files are built to bypass Google's cloaking detection system. And the PDF in PDF files are constructed in such a way as to bypass anti-virus detection; in order to do that, they are made to look less suspicious. I think that is pretty much what I wanted to say. Any questions?
If there are no questions, I will just say thank you.

