Packet Hacking Village - Stego Augmented Malware
Formal Metadata
Title: Packet Hacking Village - Stego Augmented Malware
Number of Parts: 335
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/48740 (DOI)
Language: English
Transcript: English (auto-generated)
00:00
Good afternoon everyone and welcome to the Packet Hacking Village. It is now, yes, we're in the afternoon part of the session. And I'm just going to make this introduction really quick. I think I've been doing this for way too long. And it seems like every year. I remember one of the things is that when you're here,
00:22
I think my introduction to the two of you gets shorter and shorter. So here we go. It is my absolute pleasure to introduce Mike Raggo. Shout out to Mike. Thank you. Thank you. Make sure you guys can hear me okay. Sound okay?
00:40
Alright, cool. So yeah, we'll be presenting Stego Augmented Malware. Which is a combination of a lot of research Chet and I have done over the years. And applied in a slightly different manner. And to really kind of preface this,
01:04
Chet will talk about some of the research he's involved with. A college that he's involved with. Which also kind of sparked some of this additional research. And really the presentation all together. So with that we'll go ahead and get started.
01:20
So in terms of the agenda, we're going to cover a variety of different types of steganography-augmented malware. Try to say that fast five times. Stats, trends, commonalities, differences. Some of the research through the college that Chet's involved with had a lot of the students focus on specific variants of malware. And then we started to look more broadly across these different variants to find commonalities, differences.
01:45
Better understand IOCs and TTPs and things like that. And then as a result, we wanted to better understand how we could detect a lot of these. Especially based on their behaviors. And then we'll talk a little bit about our ideas really about the future.
02:02
And where we think this may go next. So I'll let Chet introduce himself. Yeah, hi everybody. I've been here many years so you probably know who I am. But the focus of this talk really goes to something that I've been working on for a little over 20 years now. And that's where all the gray in my beard has come from.
02:22
So we're going to talk about steganography and how it's being applied to malware today. And specifically some work that we're doing at Utica College in one of the programs that I operate there. And talk a little bit about some of the student research that's there as well. And I also teach at the University of Arizona as well as Utica College and at Champlain.
02:40
So if you're interested in any of those programs, please stop by and talk to me. I'll be presenting tomorrow at 6 o'clock as well. So if you want to stop by then. Turn it back to Mike. That was quick. Thanks. Yeah, like Chet, I presented here many times. And Chet and I have collaborated on a lot of steganographic and steganalysis type research over the last 20 years or so I guess at this point.
03:06
And have done a few books together too around covert communications, data hiding techniques and things of that nature. And so I'd like to thank Ming for having us back again this year. So let Chet first talk a little bit about all the things they've been doing at Utica College.
03:25
And then that will really preface what we're going to go through in terms of the research and analysis. Thanks Mike. We teach a class at Utica called Cyber 642 which is Data Hiding and Access Control. So the point of this particular course is to take a really in-depth look at
03:41
what's happening and what is emerging from a data hiding and covert communication point of view. The focus of the course is to look at the latest malicious code that includes advanced persistent threats that are out there. Mike and I did a talk several years ago talking about the APT Operation Shady Rat which kind of started this
04:00
whole gamut of technologies that are used to augment malware in order to be able to make them less discoverable. So the whole point of this is to incorporate steganography into malware in order to be able to conceal its existence, right, so they can communicate from that perspective. So again in recent years there's been a lot of movement and Mike's going to walk through a bunch
04:25
of those that have come out of the research that we're doing at Utica and work that we're doing together. And as Mike said, we've been studying this problem for a number of years and we've kind of looked at this from several different vantage points. So the students in this class discover, this is a master's class, discover and examine a wide range of steganography-enhanced malware threats.
04:45
We've looked at over 30 of these in the last year or so. And we want to do this in order to analyze vulnerabilities of the malware along with the steganography methods that are utilized. Because some of these methods actually use fairly sophisticated stego and others use very unsophisticated stego.
05:05
So our interest is to understand where those techniques that they're using in order to enhance the malware are potentially vulnerable. So we can actually use those in order to either detect or disrupt that activity. So we want to develop strategies that allow us to do detection, response and mitigation of the threats that are there.
05:27
In several cases, students have chosen to further examine these threats as part of their final capstone or their thesis project at Utica. Michael Beatty just is completing his right now and it's just an outstanding paper that
05:41
will be in public view probably in about three months when he finishes his thesis. And that thesis covers a wide range of these and the vulnerabilities that we have studied and figured out. I'm the second reader on Michael's paper and I teach the course that I'm talking about. So we spend a lot of time with a lot of different students looking at a lot of different threats that are there.
06:04
So the catalyst for this presentation comes from a couple of different points of view. One, Mike and my 20 years of studying this problem and watching the evolution of steganography being used in multiple different ways. And it's interesting, one of the things that we tend to focus on is on encryption.
06:23
But less emphasis, even to today, has been spent on the study of steganography and how it actually impacts and how it can evade detection. And now that it's being integrated with malware, it's to the next level. But some of this has been inspired by the research that we've done and also the research the students have been doing over the last couple of years in order to be able to take us to this next level.
06:44
So I'm going to turn it back over to Mike who's going to kind of walk through some of those changes and then I'll bounce back in a little bit to talk about what we're doing about it. In other words, how we can actually use what we figured out in order to do that. So I'll turn it back to Mike.
07:13
Thanks, Chet. Just started to build out a simple timeline here where you can see, you know, a spike, an increase, especially over the last four to five years.
07:24
Certainly this is a very small data set, but over time I'd like to continue to see how this evolves and emerges. But clearly there's a growing number of these and it's on the level of potentially exponential.
07:41
Of the ones that we did some deeper analysis around beyond what the students had done, we started to put these into some different categories along the lines of banking Trojans or crypto jacking. A lot of this was, this categorization here was really based on their motives, right? What are they looking to do?
08:01
In addition, a lot of these, as you might expect, are all related to remote control, CNC, remote access Trojans, things related to data theft or espionage. And then lastly, there were some actual advertising and ad-related type click-throughs and promos to generate clicks and drive traffic to specific sites or specific ads.
08:28
So one of the ones that was particularly interesting was one that leveraged social media. At a previous Packet Hacking Village or Wall of Sheep talk a few years back, I had presented around
08:42
a lot of these risks and threats across social because I was doing a lot of research at that time across Twitter, Facebook, in a variety of other social networks, also streaming media services and things like that. This one in particular was found on Twitter. And to give credit, this was identified by Trend Micro.
09:03
The premise behind it was that outside of the scope of this, one or more systems had been, you know, let's say compromised, infected with this malware. But where the steganographic component of this starts to come into play is that the malware is out there on
09:22
a regular basis checking a particular Twitter account, a Twitter account that had been out there for a few years. And whoever owns this account then started to post memes, and within those memes were certain types of commands. So once the meme was posted, the malware would identify it was out there and parse it, and within it would be a variety
09:42
of commands that the malware would leverage to run a variety of things, including screen scraping or screen captures and a number of other things. And then additionally, post this stuff to Pastebin. And so through Pastebin, it would actually initially obtain a URL of where to post it, and
10:02
it would either post it back to Pastebin or a separate IP or URL altogether related to the CNC. Just a high-level diagram here, right, you've got the computer reaching out to the social media, it's already infected with malware, and getting updates and commands via Twitter, right?
10:23
Not Twitter directly, but a meme posted to Twitter. And we'll talk more later about a presentation Dr. Phil Tully and I did two years ago at DEF CON 25 around neural nets and leveraging the ability to not only detect some of these things, but furthermore do predictive analysis.
10:46
So in this case, it went out to Twitter, parsed this particular meme, and other ones that were posted beyond that to gather commands. And from there, it would get information such as a URL to obtain command and control stuff.
11:04
I'm glad I have this on the screen because it was not only Pastebin, but Imgur and things like that. And then as a result, the files would be sent to the remote command and control URL that was provided via Pastebin or Imgur or something else.
11:20
So with this were a variety of commands that were run and leveraged by the malware itself to capture screenshots of the desktop, processes that may be running, steal stuff from the clipboard, even potentially the username for the machine and documents and things like that. But it kind of begs a question here for a moment, and that is, you know, if you're running
11:43
an enterprise network, how frequently do you have or want to permit enterprise users to be using things like Pastebin? You know, we find in most cases when doing pen testing and network analysis and things like that, that, you know, when looking at it from an ingress or egress standpoint, this is largely still allowed outbound for posting things.
12:07
I'll jump into some more of the variants here, but for, you know, as sort of a stepping stone for that, I'll let Chet talk about his Raspberry Pi project, which he's actually, as he mentioned, going to present tomorrow as well.
12:23
Thanks, Mike. So one of the questions is, how do we detect this? Do you care if we detect this? And so the issue is, the whole point of augmenting malware in this fashion is so that it becomes not observable, right? Or the observables are very low. So how do we actually go about detecting this? Well, there's kind of two different methods that we've actually analyzed and looked at.
12:47
The first, of course, is we could analyze the images that are being posted or recovered from different sites and basically determine if they have content embedded in them, right? And as I mentioned earlier, some of them are fairly simple in embedding. They may be using a JPEG that has data appended to it.
13:04
They may be using a JPEG that actually inserts things in the header of the JPEG to basically communicate the data, whether it be a PowerShell script or something like that that's embedded in those areas that we're not looking at or don't get displayed. But we can detect those relatively quickly and relatively easily. But if they go ahead and modify things like the quantized DCTs and actually embed
13:23
that information in those areas of the image, then the image is going to be compromised. And now we have to do more exhaustive analysis of that. And it causes two problems. One is we can miss things. And number two, we can issue false positives. Neither of those false negatives or false positives are a good thing.
13:43
The second problem is that in the deeper analysis of those images, it takes a long time depending upon the size of the image. So it could take several seconds or even longer depending upon the size of the image that we're actually going to try to analyze that it has been compromised or information has been embedded in it.
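A minimal sketch of the quick check mentioned a moment ago, flagging a JPEG that carries extra bytes after its end-of-image marker; the filename is hypothetical, and this covers only the simple append case, not DCT-level embedding:

```python
# Sketch: report any bytes that sit after a JPEG's end-of-image (EOI)
# marker 0xFFD9. Appended payloads live in exactly that region, and
# viewers ignore it. rfind() is a rough heuristic; it can be fooled if
# the appended data itself happens to contain 0xFFD9.
def trailing_bytes_after_eoi(path):
    with open(path, "rb") as f:
        data = f.read()
    eoi = data.rfind(b"\xff\xd9")
    if eoi == -1:
        raise ValueError("no JPEG end-of-image marker found")
    return data[eoi + 2:]

extra = trailing_bytes_after_eoi("suspect.jpg")   # hypothetical file
print(f"{len(extra)} bytes found after the EOI marker")
```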
14:01
So we got thinking about this and said the second way to approach this is to look at the behavior. We never want to build signature detections anymore because they just don't work, right? Because as soon as the bad guys know that we're doing signature detection, they just change the signature, right? And it's difficult. And that's why most of these things get through in the first place.
14:20
So I'm not going to talk about the compromise itself of the systems. I want to talk about analyzing the behavior of the malware once it's embedded. Now, remember, these are probably in most cases fileless malware. So we're talking about memory resident things that have compromised the system in such a way that these processes are running with privilege. And therefore, they're able to actually move within that environment without being detected by, you know, traditional defenses of those environments.
14:48
But we have been doing a lot of work in analyzing the behaviors of those operations, and they tend to be quite similar. I'll kind of walk through a scenario for you of how this actually works.
15:02
And most of them work the same way. They have a need, right, to go back to the Internet to a public site that is not in violation. Typically the way we have done these in the past is we know what CNC sites that are out there that they're using, and any connection goes to them, we basically block it, right? So it's a signature detection, right?
15:21
We've got a signature that this is a bad CNC site. But in this particular case, it's not a CNC site anymore, right? Now it's basically Pastebin or Twitter or Facebook or anything that has an image that's going to be downloaded, and information is going to be gleaned from that as to what to do. And then information is going to be posted back, again, in the form of a JPEG or something else,
15:41
that has the content that we want to exfiltrate from the environment in it. Again, something that's typically going to be ignored. However, when we start looking at privileged applications, right, or privileged processes that are basically posting information to the Internet or going to the Internet and getting information, that becomes unusual. So just one simple example.
16:02
But the point is the defense in depth approach here is to basically identify these behaviors, and that's what we've been studying for the last couple of years, to understand what are those behaviors. And so one of the things that I'm going to present on tomorrow evening is talking about this Raspberry Pi project that I've been working on as well.
16:20
That's basically a passive sensor. So basically the sensor is looking for aberrant behavior within your environment that's outside the norm. And these would be things that would be outside the norm, right. Normally systems are not going to be posting these kinds of images to these particular sites or retrieving them and then processing them. So we're basically developing ML models that will allow us to identify these
16:42
even if we've never seen them before, right, because they have these characteristics that we're analyzing. So as you know, the first step in doing any kind of ML work is to actually define what am I trying to detect, right, and number two, what are those specific features or characteristics that I want to basically identify that are either good or bad,
17:06
and then basically be able to build a corpus of those in order to be able to identify those behaviors in a more intelligent system kind of approach versus a signature-based approach. And that's what we're doing with the Raspberry Pi Python sensor
17:21
is basically using that sensor in order to detect this abnormal inbound or outbound behavior that is being caused by this stego-augmented malware that is in play. Does that make sense to everybody? Any questions about that? Any thoughts about that? I'll stop here for just a second because I'm going to be shooting out to go do another presentation,
17:45
but I want to make sure that I answer any questions that are related to either how the information is being extracted from somewhere out in cyberspace or being pushed there and how this is actually working, because I know we've covered a lot. Pretty simple, straightforward. Yes, sir?
18:01
So if we were to use, sort of, Sysinternals Process Explorer logs, something like that,
18:24
wouldn't we be able to look for this activity to do that by saying, as long as it's not a browser and it's reaching out to the Internet to pull something, wouldn't we be able to detect it that way? Sure. So that's a great example. In other words, does this particular process normally go and retrieve images
18:41
or post images to the Internet? Is it a common thing? Now, a lot of processes do that, not just browsers, right? And so where they're going is important, but we don't want to actually identify those from some signature point of view. We want to identify the behavior of when do they do them, how often do they do them, and is it something normal for this process to do?
19:04
So one of the things that the malware that we've looked at is they're trying to attach themselves to processes that normally do that, right? But they don't do it in the same way. And some of the mistakes that they've made is they use the same images, right, in order to be able to do those.
19:20
So we can detect the images or variants of the image. So one of the things that we learned about that, such a great question, is if we see a process, we may not be able to instantly identify it as a problem. But if I take an image that was being posted by that process and I hash it, and then it posts that same image in the future and the hash has changed,
19:41
what does that tell us? Something different was embedded in that image. It's the same thing in the pull-down. This is how we actually detected Operation Shady Rat in the beginning. We were seeing images that were posted and retrieved that were the same image, but they had a different hash, right? So we're able to identify it in those ways. So obviously as they become more sophisticated in using different images
20:01
every time to convey the information, it becomes more difficult. But they typically aren't that sophisticated because they're trying to actually make that happen in a very quick fashion. Any other questions? Yes, sir?
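As an aside, a minimal sketch of the image-hash comparison just described, not the authors' actual tooling; the process and image inputs are hypothetical and would come from a network sensor in practice:

```python
# Sketch: notice when a process re-posts "the same" image with different
# bytes (the hash changed), which may indicate new embedded content.
import hashlib

seen = {}   # (process, image_name) -> sha256 of the last observed copy

def observe(process, image_name, image_bytes):
    digest = hashlib.sha256(image_bytes).hexdigest()
    key = (process, image_name)
    previous = seen.get(key)
    seen[key] = digest
    if previous is not None and previous != digest:
        print(f"ALERT: {process} re-posted {image_name} with a new hash")

observe("update.exe", "logo.jpg", b"\xff\xd8 original bytes \xff\xd9")
observe("update.exe", "logo.jpg", b"\xff\xd8 modified bytes \xff\xd9")  # alerts
```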
20:25
That's exactly what we're doing with the Raspberry Pi project, so I kind of invite you to stop by tomorrow at 6 when I'm doing that talk. But basically we're modeling that baseline and baselining the behavior of that environment, okay? And based on what's happening in that environment normally, what connections are being made by what devices over what protocols,
20:42
at what times of day, what size packets, that kind of thing, we're actually monitoring that and basically creating a semi-supervised learning of that environment under what we would consider normal conditions, right? And then once we do that, we can use that in order to be able to detect aberrant behavior that falls outside what that normal behavior is of that environment.
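A rough sketch of that baselining idea follows; this is not Chet's Raspberry Pi sensor, and the flow records and threshold are assumptions for illustration only:

```python
# Sketch: learn a per-device baseline of bytes moved per interval during a
# quiet learning period, then flag intervals that fall far outside it.
# A real sensor would also model protocols, peers, and time of day.
import statistics
from collections import defaultdict

def build_baseline(training_flows):
    """training_flows: iterable of (device, bytes_in_interval) tuples."""
    per_device = defaultdict(list)
    for device, nbytes in training_flows:
        per_device[device].append(nbytes)
    return {dev: (statistics.mean(v), statistics.pstdev(v) or 1.0)
            for dev, v in per_device.items()}

def is_aberrant(baseline, device, nbytes, z_threshold=4.0):
    mean, stdev = baseline.get(device, (0.0, 1.0))
    return abs(nbytes - mean) / stdev > z_threshold

baseline = build_baseline([("sensor-7", 900), ("sensor-7", 1100),
                           ("sensor-7", 1000), ("sensor-7", 950)])
print(is_aberrant(baseline, "sensor-7", 5_000_000_000))   # True: huge spike
```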
21:03
Now, one of the reasons that we use Raspberry Pi to do that is, first of all, it costs 50 bucks, right, even the new one, and we can place it in different parts of the network, right? So we can actually distribute this across a network that, you know, is much larger instead of trying to do it from a single point. So we may want to monitor things within a certain subnet
21:21
or a certain area that we're mostly concerned about. I don't know if that directly answers your question, but you kind of get where we're going with that. Anything else? Are you concerned about this? Is this something that's ever come up in your discussions? Okay, I'm seeing this is great because we've done this in the past. We haven't had a lot of people shake their head yes.
21:41
So it's great to hear that people are aware, and that's part of what our job here is to make you aware that this is going on and starting to get you to think about ways that we can actually do this. Yes, sir? Yeah, we would place the sensors in multiple locations.
22:03
Like I said, I invite you to stop by tomorrow because I go into that in great detail. Okay, but yeah, that's how we do it. We basically place the sensors in multiple locations, and we monitor the network over a longer period of time. So this is not a typical vulnerability assessment, pen test, Nmap. This is something where we look at the behavior of the environment,
22:22
and the critical thing on the Raspberry Pi is how do we store all that data? So, sure. Yep.
22:40
Yeah, we're only looking at flow, okay, in this particular case, but it's how we categorize the flow in order to be able to turn it into something that will allow us to be able to detect from it. But send me a message, and I'll send you the video of how that works if you can't be there tomorrow. Okay? Okay. All right, I'm going to turn this back over to Mike,
23:01
and I'm going to scoot because we've got a talk at Sky Talks at 1 o'clock that I've got to go get set up for. So, you're good? Sounds good, I guess. All right, guys. We'll see you later. Thanks, Chet. So, Chet touched on a couple of really important points in how we look at this, right?
23:22
And as he mentioned when he covers his presentation tomorrow around the Raspberry Pi, he'll go into this in more detail. Around that same time, I've got a presentation in the IoT village that touches on modeling out IoT behaviors
23:40
and identifying malicious activity as well. And a lot of this is really kind of tied back to connecting a lot of dots. In looking at, for example, the previous research we were doing around social media, as you identify some of these malicious accounts by which, as we showed in the first example, you know, what was posting memes
24:01
and those were being parsed for different types of commands to further enable the malware, you know, is that potentially an insider threat, right? If you're impacted by that, does that internal user maybe own that particular Twitter account, right? This was some research we were doing, and upon doing so, we actually found an instance
24:21
that was actually the case. So as we were flagging these malicious Twitter accounts, we actually found someone internally was logging into that account, right? So it was actually an insider threat. And it kind of begs the question that, you know, with that additional context or that additional intelligence,
24:41
and leveraged, whether it be at your firewall, your IDS, or something else, is kind of a powerful thing. It's kind of a 1% issue, but arguably a 99% problem, right, if you've got a breached server, device, something else like on the internal network, right? The other thing, too, is in looking at the baselining,
25:04
and Chet will certainly get into this in a lot more detail in his talk, is such that I see all the normal behaviors of how things are normally communicating over the network or to one another. And if I can find a way to baseline on that
25:20
so I can find abnormalities, you know, abnormal behaviors, things like that, for example, if an IoT device has been infected with Mirai, and I start to see different behaviors on the network, if this is normally just a Johnson control sensor or a Honeywell actuator that sends a signal once an hour or, you know, a very small group of packets,
25:43
you know, every once in a while for a status update on a water level, humidity, or other types of things, and it's now emanating 5 gig of data, there's a big anomaly there, right? So having the ability to baseline on that can be a powerful thing. Okay, so bringing that back to this, then, looking at a few other forms of Stego augmented malware,
26:04
we can start to kind of find patterns, and as we go through these next few examples, we're going to tie that back to what's actually occurring within the images, how is the embedding occurring, and furthermore, how can a network administrator,
26:21
someone in the security operations center, find ways in which they can potentially detect these types of risks beyond how we think about these problems today. Sundown was a particular piece of malware, sort of a kit, if you will, with a lot of different variants. In this particular example, it was leveraged as part of a website where there was a hidden iframe.
26:43
The iframe itself was completely white, so it looked like it was part of the background. But within it was a PNG file, and embedded in that some additional information. Initially, it was planted there to allow vulnerabilities to be exploited in IE, both related to JavaScript and Flash.
27:02
There's actually three specific CVEs tied to this. But if the browser was vulnerable and the user went to this particular site, this PNG would actually be parsed and decoded to reveal a malicious URL. And it would be a pointer to another site
27:21
to further infect via Flash. The interesting thing about this is that it would then pull down malware and infect the device with a Trojan variant of Zeus, and then would engage the CNC server for further theft. This just kind of goes through the flow really quick,
27:40
but this is going to be important later on as we think about the behaviors and the flows of this activity. As I mentioned, user goes to a seemingly benign site with an iframe in it, which is actually a PNG, which is going to be parsed to reveal a malicious URL that's going to redirect the user to a different location.
28:01
And upon doing that, pull down some malware, and in this particular case, was more bank-related for stealing bank information. There was some great information out there on the malwaretrafficanalysis.net site. I don't know if you've seen this great site. They'll have a lot of different bundles and zips
28:22
of all the files related to the exploit, as well as PCAPs and things that you can look at. Another variant related to malvertising was also involved a malicious web page, but this exploited Mac fonts. So if you had a Mac and you were going to the site,
28:42
it would exploit that and, as a result, analyze the images; the data, the commands, all of that was embedded using LSB, or least significant bit. And as a result, it would parse and extract that to create an actual string, which would then be executed and prompt
29:02
an Adobe Flash download. And then furthermore, it was actually going to infect the device with Shlayer. So some observations then. You can see there's a lot of images out there as part of this stego-augmented malware, where the images themselves are being parsed for information.
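To make the LSB parsing concrete, here is a minimal extraction sketch; it assumes the Pillow imaging library and a simple length-prefixed, one-bit-per-channel scheme invented for illustration, not the encoding of any specific campaign:

```python
# Sketch: rebuild bytes from the least significant bit of each R, G, B
# value, scanning pixels in raster order. The 32-bit length prefix is an
# invented convention for this example only.
from PIL import Image

def extract_lsb(path):
    img = Image.open(path).convert("RGB")
    bits = []
    for pixel in img.getdata():          # pixels left-to-right, top-to-bottom
        for channel in pixel:             # R, G, B
            bits.append(channel & 1)
    raw = bytearray()
    for i in range(0, len(bits) - 7, 8):  # pack bits into bytes, MSB first
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        raw.append(byte)
    length = int.from_bytes(raw[:4], "big")
    return bytes(raw[4:4 + length])

# print(extract_lsb("meme.png"))          # hypothetical file
```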
29:23
The interesting thing about this is, first and foremost, it doesn't break the image format, right? So when I dig a little bit deeper in the next few slides into the how, the image is still rendered totally fine, so it looks completely benign. And it does this also without hindering
29:42
the ability to render the image, and there's no viewable distinctions to the user either. So if it's an LSB technique or something else, you visually can't see it. So one of the most simple things that we've seen, which bypasses much of the defense in depth,
30:02
is the ability to just simply append that data to the end of the image. So if it's JPEG, a PNG, or something else, a lot of these file formats have an end-of-file marker. What's interesting about it is, if you add data beyond the end of the file marker, a browser, your viewer on your desktop,
30:22
a preview, other types of things, just simply ignore it beyond the end-of-file marker. But all this data is living beyond the end-of-file marker. Now, if I was to post this to Twitter or Facebook, they're going to strip metadata. They're going to recompress the file and damage this data.
30:41
They're also going to strip off data beyond the end-of-file marker in all of our testing. And that was part of the testing that supported the neural network research that we had released two years ago at DEF CON 25. And in the context of least significant bit,
31:01
as I break down how the image is actually rendered, I've got blue, green, and red, and I modify the least significant bit of any of those or all of them to allow the ability to embed information without visually really changing the file in general. If I'm just modifying the least significant bit
31:20
and changing the color by one single bit, you're not going to visually see that difference in the actual image the majority of the time. But there are techniques where you could do the next-to-last significant bit, and we've even seen other variants of that. Certainly, if I've changed the fourth or fifth one,
31:41
that's where you start to encroach upon maybe potentially changing the viewability of the image where somebody says, hey, what's with all the, you know, the image is blurry, or I see random pixels or dots and things like that. So again, I'm just changing the least significant bit. This is all part of when it's extracted and rebuilt
32:05
allows you to actually, you know, build whatever it renders, whether it's ASCII code, the command, other types of things. And so in looking at, you know, how I could, you know, somewhat weaponize this,
32:21
you know, what would be good sites or social networks to upload this to where it wouldn't be recompressed, the metadata wouldn't be stripped, data after the end-of-file marker would not be removed, right? So as I mentioned, Twitter and Facebook both have their own compression techniques, so when you upload an image or even a video,
32:43
it may be completely recompressed. They'll strip the majority of the metadata, although there was an exploit a few months ago related to that, IPTC, and it'll strip off everything beyond the end-of-file marker, thus really rendering a new format of the actual image. But that's not the case with things such as Tumblr
33:02
or Pinterest or things like that. At a talk Chet and I did two years ago, we also exploited streaming media services. So I set up a musician's account within Pandora and actually took an MP3 and modified it and reposted it,
33:22
and later on it played the same song again actually on the streaming media service, which I was actually surprised about. I just wanted to see that I could modify an MP3, embedding data in the embedded JPEG that you see on your MP3 player of the album cover or the song, right? And sure enough, it showed up and it started playing
33:41
and you couldn't distinguish any difference in the music, right? Because the only thing I had modified was the JPEG that's part of or within the MP3 itself. So you could essentially communicate that over a streaming media service, which I demonstrated. And furthermore, if you go into Google Chrome and developer options,
34:01
you can download that song, right? And if you were communicating this to anywhere in the world, you could download that. And to the actual recipient, they would know to download it, pull out the JPEG and extract whatever data had been hidden within the JPEG hidden within the MP3 that was streaming over the media service itself.
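A rough sketch of inspecting a downloaded MP3's embedded album art for appended data, assuming the mutagen tagging library; the filename is hypothetical, and only the simple append case is checked:

```python
# Sketch: pull the album-art JPEG out of an MP3's ID3 tag and report any
# bytes sitting after the JPEG's end-of-image marker.
from mutagen.id3 import ID3

def check_album_art(path):
    for apic in ID3(path).getall("APIC"):          # attached-picture frames
        art = apic.data
        eoi = art.rfind(b"\xff\xd9")
        trailing = art[eoi + 2:] if eoi != -1 else b""
        print(f"{apic.mime}: {len(art)} bytes of art, "
              f"{len(trailing)} bytes after the EOI marker")

check_album_art("song.mp3")                        # hypothetical file
```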
34:23
So when we did this presentation at DEFCON, Dr. Phil Tully and myself, Phil is the data scientist. He's the one that's got his doctorate degree in neural nets. I did all of the background research in analyzing how these images were recompressed, what was stripped and not from each of them.
34:40
And the premise behind this, though, that was really cool, we thought was if I can better understand how these images are being weaponized and how this information can survive being uploaded, could I, as a white hat, let's say,
35:00
actually model that out and leverage machine learning to actually predict a thousand other variants of the same thing and use that as a method by which I build a massive detection capability? And that's what we proved out and demonstrated at DEFCON 25.
35:20
Oh, and I forgot, I put a screenshot in it here. So this is up on YouTube if you're actually interested in it. Go ahead. Yeah, that's a great question. I should actually put that on the slide. It means that it actually survives.
35:41
So green means it survives, yeah. So if I embedded something in an image and I post it either as a profile picture or just as a part of a post or part of an album, you know, how these different social networks handle those images, as you can see here, in some cases, they'll actually, they will strip the metadata or recompress it,
36:01
but in other instances, if it's a post versus a profile, maybe they don't. So bottom line, the straight answer to your question is anything that's green is survivable and is not modified whatsoever. Yeah, great question. I'm glad you brought that up. I'll have to update the slides. Yes.
36:26
I don't know if they changed their policies at all, and Trend Micro was actually really good about not pointing the finger at Twitter themselves, right? So as a result, the only thing that I know is that
36:41
once Trend Micro had discovered this, they had notified Twitter, and shortly after, Twitter took the account down, was essentially that. On a similar note, although completely unrelated, was the scenario that I mentioned where, you know, we're finding other malicious accounts, and one of those was mapped back to the internal network
37:02
because somebody had logged into it from the internal network, which was kind of a surprise to us, actually. You know, we accidentally stumbled across an insider threat, so. Okay, so last two slides, and then I have to run over to our next talk over at Sky Talks.
37:21
Some observations here and thought-provoking things to think about. First, in terms of parsing images, you know, as we highlighted, you know, it's quite easy to append data to the end of an image, especially a JPEG or a PNG, because it's still going to render completely fine, so if I put it up on a website or anywhere else for that matter,
37:43
unless I'm posting it to something like Twitter or Facebook, all that data is going to survive, but from a viewing standpoint, nobody's going to really know it's there unless they know to look for it. So, you know, one of the recommendations here is that if, and this gets back to a presentation I did at Black Hat in 2004,
38:01
which is if I know the tools that are being used to embed the hidden data, rather than look for the hidden data itself, does the tool leave a fingerprint behind within the image? And if I map those out, then I'll know that, wow, I didn't actually find data hidden in the image,
38:21
or I didn't know to look for it, but based on specific fingerprints or things that are left behind as a part of the embedding tool, that I could actually build a library of all the known tools for basically running steganographic embedding within the images and use that as my detection method instead.
38:41
So I had written a tool called StegSpy and released it at Black Hat way back, and it did exactly that. It identified 13 different types of tools that were used for steganography or embedding within an image. And that might be particularly good then when you have an instance where its least significant bit,
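In the same spirit (this is not StegSpy itself), a minimal sketch of fingerprint-based scanning, with placeholder signature bytes rather than any tool's real fingerprint:

```python
# Sketch: instead of hunting for hidden payloads, look for byte patterns a
# known embedding tool leaves behind. The signatures below are placeholders.
KNOWN_TOOL_SIGNATURES = {
    "example-tool-A": b"EXAMPLE_SIG_A",
    "example-tool-B": b"\x00EXB\x00",
}

def identify_embedding_tool(path):
    with open(path, "rb") as f:
        data = f.read()
    return [name for name, sig in KNOWN_TOOL_SIGNATURES.items() if sig in data]

print(identify_embedding_tool("upload.png"))       # hypothetical file
```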
39:04
or DCT as Chet had mentioned, in that that's a lot more difficult to detect. Although in the next presentation, Chet's going to show you how we can actually detect that stuff too. But that gets back to, well, LSB is really difficult to detect, but if I can maybe detect the tool
39:20
that was used to perform the LSB embedding technique because it left behind a fingerprint, that's going to be a lot easier to detect. The other thing too, and this is more fundamental network security, and that is as we took a look at just a couple simple TTPs, is this outbound, this upload,
39:41
this access to things like Pastebin and other types of seemingly benign sites. But in the context of your enterprise security, if you're responsible for the enterprise network, do you really want to allow access to things like that? Or maybe should it actually be blocked on the network? That's up to you to decide.
40:00
And then I had also touched on tying back to social networks of which were leveraged for some of these types of attacks. And beyond the insider threat that I had spoken about, if you had a feed that was alerting you to the different forms of, or the different malicious accounts across social,
40:21
especially ones that hadn't been taken down yet because it can take anywhere from a few hours to maybe a few weeks actually before those are taken down, does that give you some interesting context in terms of identifying things related to insider threat or things like that? So if that was tied back to a particular feed or something like that, another thing to think about.
40:42
So we do find that Stego augmented malware is on the rise. We're going to continue to do research on this and backfill that diagram we started with to kind of see how quickly this is increasing. We did prove out last year and the year before how this could be exploited through MP3s and MP4s
41:00
and demonstrated that. As I mentioned, I had not only done it with Pandora, but I had done things, I can't mention the other ones by name, but we had done a variety of things that we actually showed in the presentation. And so when thinking about this, it does have a lot of applicability to audio and video formats as well.
41:23
And I think I pretty much covered the rest of the point. So thank you very much. Hopefully you got a few good gold nuggets out of it anyways and some things to think about in terms of the new forms of Stego augmented malware
41:41
and things that we're doing research around. So thank you very much.