Totally Spies!
Formal Metadata
Part Number | 1
Number of Parts | 18
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/32803 (DOI)
REcon 2015
Transcript: English (auto-generated)
00:18
Thanks for being here. My name is Joan Calvet.
00:21
I'm a malware researcher working at ESET, and I'm here on stage with... With Paul Rascagnères, written in pink, I don't know why. But I'm a malware researcher at G DATA, a German antivirus company. Hello, my name is Marion.
00:42
Paul's name is pink because I don't like pink. I'm a malware researcher at Cyphort, which is a US company. I'm working there as a threat researcher. And today we're going to present our topic, Totally Spies. Yep, so this whole story basically started
01:01
a few months ago, actually last year. Because of this, you may have already seen this slide. It's part of a presentation that was leaked by Edward Snowden in 2014. It was first mentioned by the French newspaper Le Monde in March 2014. And so basically, as you can see,
01:21
these slides were made by the Communications Security Establishment Canada, CSEC, which is basically the NSA of Canada. On these slides they describe what they call Operation Snowglobe. That's the description of a group of attackers that they have seen in the wild, and that they have tried to track.
01:42
So they describe the group, and one of the striking pieces of information is on this slide, where they assess with moderate certainty that Operation Snowglobe has been put forth by a French intelligence agency. And they actually provide very few technical details,
02:00
but there are a few of them. They are on this slide, where they describe one piece of malware used by the group behind Operation Snowglobe, which the developers apparently call Babar, and they also have a developer username, probably from the debug path. And that's basically where we decided to let the hunt begin.
02:30
Awesome. So I'm not sure if many of you know this picture, or the picture of these three girls. That's a, I think, a children's cartoon named Totally Spies.
02:41
These are the three spies. You can imagine these are our characters today telling you about the Totally Spies malware. Again, this was not my idea. This was Paul's idea, just to mention that, because he's the only one of the three of us who has little children.
03:02
Anyway, all right, let the hunt begin. So how did the hunt begin? Like, as you know, every good article about APT malware starts with a timeline, because apparently that's the most important thing when talking about malware. Anyway, so our timeline is gonna be, we're gonna show you the different families that we found.
03:20
The time on the timeline is roughly when they were created, or when we believe that they were compiled, and the order in which they appear is the order in which we found them. So the first thing that we uncovered was NBot, or TFC, or NGB; we're not really sure what to call it. The malware contains a lot of strings saying NBot.
03:40
These are denial-of-service bots, which are quite simple and were compiled in about 2010. They were not that interesting at all, but they led us on to find the Bunny malware, which was a lot more interesting and was probably compiled around 2011; we actually know that it was spread in 2011. The next thing we found after Bunny was finally Babar.
04:04
So yes, we uncovered Babar. French media was very happy about finally having someone to speak about Babar, the malware. But Babar wasn't the last thing we found, so after Babar, there were more cartoons popping up, and we eventually uncovered Casper. Casper is a reconnaissance malware
04:21
which was spread in Syria, interestingly, through a watering hole attack in 2014. And not even Babar was the last thing, but the newest cartoon character that we found was Dino, which was spread around the same time in around the same area. So today, we're gonna present about all these different characters.
04:41
All right, as a start, how did we get onto this malware? The first thing we had, as I mentioned, was NBot, and you see on the slide, I'm very sorry that I put an IDA Pro screenshot on the slide. I was told I shouldn't do this at REcon, but I still dared to do it, just to show you how simple the bot itself was. So obviously, it does denial of service.
05:01
It's flooding all sorts of things, and it had the strings in clear text in there. So we have a denial-of-service bot which comes in pretty clear text, or let me call it a plain binary: unencrypted, unpacked. And I should say, I come from, whoops, from the antivirus scene.
05:20
So I started off as a junior analyst at an antivirus company, and I would have been very excited to see such a bot, because it could create great signatures. But what happens is, if you try to build a botnet, and you spread your bots, you might want to prevent someone from detecting all your bots with one single signature.
05:41
Just an idea. So this was quite interesting. But what was more interesting is that these bots had C&C servers that were sinkholed by Kaspersky. And Kaspersky, on their sinkholes, provide an email address you should write to if you have any questions about why this specific domain is sinkholed. So I wanted to write an email there and say like, hey, why are you sinkholing my C&C server?
06:02
And I was very surprised when within the next 20 minutes, I got an answer from their team, oh, this is interesting that you have this. We're not really sure what it is, but could you tell me where you have the binary from? Which computer was infected? Who owned that computer? Which company was the computer in? And how did you get onto this binary? And I was like, oh, this is fishy.
06:24
So this was the first impression I had. And after a while, asking around, what is this kind of bot? A friend of mine came up to me and told me to look at something else. There were other binaries with similar structures in there, using similar techniques while being quite different.
06:42
Anyway, so this was the first step towards the next cartoon, which was then Bunny. So after digging a bit, I found a dropper which came with a PDB string telling me the project name, Bunny 2.3.2. And I mean, I'm not specifically a fan of bunnies,
07:01
but I thought that's very cute. And digging into the binary, I found out it's a very interesting malware. Digging further, actually based on the hashes of these binaries, I found out they're already mentioned online on a blog, which documents a spear-phishing campaign that happened in 2011.
07:22
So I went to ask that blog writer, like, how did you get these binaries, and what kind of spear-phishing was that, and how did it work? He didn't really give me an answer; I don't know why, but he didn't share any details about the spear-phishing campaign. But what he said was, oh yeah, these binaries, I haven't looked at them closer,
07:41
but I was told it's the French government spreading them. And I was like, oh wow, okay. This is even more fishy. So this was the first time I heard about the French government. Anyway, what is Bunny? So Bunny is a scriptable bot. Bunny incorporates a Lua engine, and can download Lua scripts, execute them with the engine
08:01
and instrument the C++ code of the binary. Let me show you how this is built. So Bunny is beautifully multi-threaded, with a main thread which is busy with command parsing from the C&C server and execution of the scripts. These scripts will be loaded from different text files, which are placed on disk,
08:20
so the command parsing does nothing else but parse one file after another, load the Lua scripts in there and execute them in dedicated threads. So the entire bot was built to execute Lua scripts. These Lua scripts would be dumped into the text files by different hero threads. This is not a term that I came up with.
08:42
This is a term which is defined in the binary, so the binary literally calls its worker threads hero. Of these hero threads, hero zero is one that doesn't actually download and dump scripts. I was never too sure what the purpose of hero zero is, but the other three are busy with fetching scripts: hero one
09:03
through HTTP, like plain download of scripts. Hero three would load scripts from a file, which was downloaded from an FTP server, and hero two, interestingly, would place crontasks. So it would configure tasks to be scheduled at specific points in time. Also, the term crontasks is not from me;
09:20
that was defined by the binary authors. Anyway, so this is basically the workflow: download scripts, then execute and inject them into the Lua engine. At the same time, the commands received from the C&C and the actions the bot would take were dumped to a text file. This text file was managed by a thread
09:42
called backfile thread. Altogether, the bot, of course, has a performance monitor to keep execution low. What is it, though, with the Lua threads? So there are several theories. Lua is heavily used for scripting computer games. With Lua, you can inject behavior into a computer game, like bombs explode or let things happen
10:01
all of a sudden unexpectedly and out of context. So my first theory is that this bot downloaded the scripts to instrument its own code and to inject behavior through text files into the binary. So what Bunny would do was not download other binaries as plugins or to execute behavior,
10:20
but download Lua scripts to instrument its own code and change its behavior on the fly. Another interesting thing is that by downloading Lua scripts, you're not actually downloading binaries. You don't have to execute binaries on disk, so you don't create a new thread every time you want to inject behavior. Instead you only download a plain text file,
10:41
which is rather small and doesn't catch any attention and can still inject any behavior you want to a given binary. So that was pretty smart. What was also interesting about Bunny was that it was pretty, or like, I call it the slight rabbit armoring. It wasn't pretty armored, but it came with some interesting anti-analysis tricks. What was interesting for me is that it had a lot of them,
11:02
which is rather uncommon for the usual APT malware I see. Anyway, altogether, they were pretty simple. Let me just count them down. So it did an emulator check. It would check the module path of the executed module to see if it contained any strings indicating an emulator. I think Paul is going to explain this in more detail later.
11:22
It would check the directory name from which it was executed to see if it was the directory the dropper had created before, see if the payload was really dropped by a legitimate dropper. This might seem simple, but it works in most sandboxes to evade execution. It would change the timestamp of the payload
11:41
to the system installation date. It would check if the number of running processes was bigger than 15, which is not the case if you run in a simple emulating environment, like, for example, an antivirus engine emulator. It would check if time APIs were hooked, which happens if you turn on, I don't know, a plug-in like IDA Stealth, which hooks the GetTickCount API.
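Three of the checks listed so far (emulator strings in the module path, the expected drop directory, and the process count) can be modelled as plain predicates, which is handy when predicting whether a given sandbox would ever reach the payload. This is only a sketch under assumptions: the marker strings and the directory name are hypothetical, since the talk does not list the values Bunny actually uses, and the process count is passed in rather than queried from the system.

```python
import ntpath

# Hypothetical marker strings; the talk does not say which ones Bunny looks for.
EMULATOR_MARKERS = ("emul", "sandbox")
# Threshold mentioned in the talk: more than 15 running processes are expected.
MIN_PROCESSES = 15


def path_looks_emulated(module_path, markers=EMULATOR_MARKERS):
    """True if the module path contains a string hinting at an emulator."""
    lowered = module_path.lower()
    return any(marker in lowered for marker in markers)


def dropped_by_expected_dropper(module_path, expected_dir):
    """True if the module sits in the directory the dropper would have created."""
    parent = ntpath.basename(ntpath.dirname(module_path))
    return parent.lower() == expected_dir.lower()


def too_few_processes(process_count, minimum=MIN_PROCESSES):
    """True if the process count is not bigger than the threshold."""
    return process_count <= minimum


def would_run(module_path, expected_dir, process_count):
    """Combine the three checks: the payload only continues if all of them pass."""
    return (
        not path_looks_emulated(module_path)
        and dropped_by_expected_dropper(module_path, expected_dir)
        and not too_few_processes(process_count)
    )
```

Feeding the predicates with the path and process count a sandbox would present shows quickly why a payload copied to an arbitrary directory, or run in a minimal emulator, never gets to its interesting behavior.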
12:03
It would obfuscate a subset of APIs, so it would load a subset of APIs dynamically indicated by hash in the binary. Interestingly, this hashing function is really simple, it's veritable, and it's shared throughout most of the cartoon malware.
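From the analyst's side, API-by-hash loading like this is usually undone by hashing every export name of the relevant system DLLs and looking the constants from the binary up in the resulting table. The sketch below shows that lookup only. The function `toy_hash` is a placeholder rotate-and-add hash chosen for illustration; it is not the function used by Bunny, which is not spelled out in this part of the talk.

```python
def toy_hash(name):
    """Placeholder 32-bit rotate-and-add hash; NOT the malware's real function."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h << 7) | (h >> 25)) & 0xFFFFFFFF  # rotate left by 7 bits
        h = (h + byte) & 0xFFFFFFFF               # add the next character
    return h


def build_lookup(export_names, hash_fn=toy_hash):
    """Map each hash value to the export names that produce it."""
    table = {}
    for name in export_names:
        table.setdefault(hash_fn(name), []).append(name)
    return table


def resolve(hash_value, table):
    """Return candidate API names for a hash constant found in a binary."""
    return table.get(hash_value, [])


# Example: registry and file system APIs, the kind the talk says were hidden.
exports = ["RegOpenKeyExW", "RegSetValueExW", "CreateFileW"]
lookup = build_lookup(exports)
candidates = resolve(toy_hash("CreateFileW"), lookup)
```

Keeping a list per hash value rather than a single name makes collisions visible instead of silently picking one candidate.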
12:20
It's also Paul who can speak about this later again. But what was smart is that they don't load all the APIs dynamically, but only the ones that indicate the final behavior. So APIs for interaction with the registry or APIs for interaction with the file system were all obfuscated. So I think this was for evasion of analysis,
12:43
by just looking at the import table. It's a pretty simple trick, but might be effective in some cases. Another thing was the infection strategy. Is that me? The infection strategy, which would check which antivirus was installed on the machine
13:01
and then decide on a specific technique and whether to inject into an already existing process or create a new process and inject the payload there or whatever, which requires a lot of knowledge of how dedicated antivirus engines work and what kind of infection strategies they watch out for.
13:21
Last but not least, which I wasn't sure if it's a bug or a feature, was an evasion trick for sandboxes where the final payload would not be loaded without a reboot of the machine because the dropper lacked any functionality to invoke the payload. I have discussed this a lot with other people. It's pretty effective against the sandbox
13:41
because you would really need to reboot the machine. So yeah, the persistence mechanism was through registry. So there was a registry key which invoked the bunny loader. But the thing is the bunny dropper would not delete itself after dropping the payload. So the bunny dropper was executed, would place the payload, would create the registry key, and then nothing would happen
14:02
until the user would reboot the machine. Then the payload would be invoked through the registry key and delete the dropper. So I'm not really sure if this was intentional or if they just forgot to invoke their payload. Anyway, this was the bunny. After bunny, we were like really excited.
14:21
This is cool malware. And we found out that there was something else that had been documented in the media already. There was Babar. It was like bunny, Babar? The guy in the US said there's like animal names in the malware. And we thought like, okay, we have to be searching for Babar. And finally, really, we found Babar.
14:41
I don't know if everyone's familiar with Babar. Babar is a French cartoon character. It's an elephant. It was first mentioned by Le Monde in an article in 2014 where they're speaking about the mentioned Snowden slide. So they're talking about Babar. I personally call it my pet, my persistent elephant threat. Babar is an espionage malware.
15:03
It does key logging. It steals screenshots. It does audio capture. It's like everything a good espionage tool should be doing. And that's just by thoroughly invading the Windows machine. So let me tell you in advance, Babar is not like a really sophisticated malware, but it does its job very well. So Babar would work through a local instance
15:22
and two child instances as a backup and hook APIs in remote processes to steal data which would enter these APIs on the fly and exfiltrate this information to its C&C server. Let's have a look. I prepared another beautiful slide laying out the operation. So Babar would be loaded.
15:40
So Babar is a DLL which is loaded through a registry key which invokes regsvr32.exe. It would then inject itself into a process running on the desktop, which is randomly chosen. This is the main instance, which would then go on to infect two child instances. These are used as backup only. So if the main instance in an infected process dies
16:02
one child takes over and creates a new child to guarantee persistence. So this main instance would take over most of the functionality with the key logging. It would steal data from the clipboard. It would steal names of running processes. It would steal names of desktop windows that are open. Nothing really exciting, but what it also would do was, as I mentioned, hook into other processes.
16:23
So the main instance would load itself or the DLL into other processes through a global Windows hook and hook itself into the message chain of the running application to be able to do its inline hooking. So the processes of interest were identified
16:41
through the configuration. So if, for example, Microsoft Word was opened, the Babar DLL would take action and place inline hooks on dedicated APIs that it was interested in. The API hooking was performed by the Microsoft Detours library. So I'm a rather young reverse engineer.
17:02
I hadn't heard of Detours before, but other people were laughing at me because Detours is like from 1999. So yeah, our authors are nostalgic. So as I mentioned, it works very well. Now let's have a closer look how that works. So first of all, what I found interesting
17:20
was the process invasion that Babar performed. It loads its DLL, as I mentioned, and injects it into running processes, using a section object to deliver information to the child instance or to the infected instance, containing the pipe name, number of instances, and the export name to be called. It would then allocate memory in the target process
17:43
and copy a stub function there, which was used to create a remote thread, which would then load the Babar DLL and call the indicated export name. By this technique, yeah, and then the DLL would run happily in context of the infected process. By this technique, Babar could invoke any of its exports by just injecting its DLL into another process
18:01
and then calling any of the exports it had and thus calling the dedicated functionalities. This was, for example, how the C&C communication is performed. The C&C communication is located in one specific export. So if Babar would want to communicate to its C&C server, it would just create another child instance with a call to the C&C functionality, hand over the data
18:23
which would be communicated and then run the DLL in context of another process. The second thing I thought was interesting because it was simple was the key logger. So the key logger Babar used was like the most simplistic key logger one could implement. It would create an invisible message-only window,
18:40
which would then, with message dispatching, create a raw input device, which was then used to filter for WM_INPUT window messages. I wrote the specific settings here on the slides. It would then call GetRawInputData to receive the data the input device captured and then just translate the virtual key
19:00
to a character and write it to a file. So this is probably the most simple key logger you could write. You can find this documented in a CodeProject article, which is titled something like, yes, simplistic key logger or simple key logger. So congratulations to our authors.
19:21
Third thing, and the last thing I actually wanted to describe was how Babar hides in plain sight and tries to evade being seen, which is the, yeah, the user land rootkit functionality. Or actually, no, never mind. User land rootkit, what does it mean?
19:41
Babar hooks target functions or hooks function calls to specific target functions, which were APIs in our case, and would do this through the Detours library. It would literally use code from the Detours library to place inline hooks. How are these performed? So Babar would overwrite the target function, the first few bytes of the target function
20:01
to point to a detour function, which would perform the malicious functionality. So in the detour function, Babar would either steal data which was going into an API or steal data which was returned from an API, and then, before or afterwards, call the legitimate API to hide the hooking and to deliver the right return value
20:21
to the calling process. So from the source function, the execution flow would jump to the detour function, and from this function go to a trampoline function, which contained the overwritten bytes from the target function. So the trampoline function would make sure that the function call, the final function call
20:41
could happen after all, and would then hand over execution to the rest of the target function, which would then return to the detours function and from there back to the caller. So after all, this was silent, stealing data at runtime from a running process without the running process noticing it. This is called a hook.
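The call flow just described (caller, detour, original function, back to the caller) can be shown with a deliberately harmless Python analogue. It is not how Detours works internally: Detours patches the first bytes of native code, while this sketch simply swaps a function reference in a dictionary. All names here (`install_hook`, `api`, `write_file`) are made up for illustration.

```python
def install_hook(namespace, name, on_call):
    """Replace namespace[name] with a detour that observes calls.

    The saved reference to the original plays the role of the trampoline:
    it is the only way left to reach the unmodified function.
    """
    original = namespace[name]

    def detour(*args, **kwargs):
        result = original(*args, **kwargs)  # hand over to the real function
        on_call(name, args, result)         # observe arguments and return value
        return result                       # the caller sees the normal result

    namespace[name] = detour
    return original


# A stand-in "API table" with one function that reports how many bytes it wrote.
captured = []
api = {"write_file": lambda path, data: len(data)}

install_hook(api, "write_file", lambda n, a, r: captured.append((n, a, r)))
written = api["write_file"]("notes.txt", b"hello")
```

The caller still gets the value the original function returned, which is the property that makes such a hook hard to notice from inside the hooked process.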
21:01
Babar would do this for internet communication, file creation, and audio streams. These were the specific APIs it would hook. After all, Babar is a tool that does its job. It's not very sophisticated. And as we published about Babar, people were coming out to ask us, do you really think, is it anything like Regin? Is it anything really, really sophisticated?
21:22
And I was dying when I saw a quote from Paul when he told people that Regin and Babar, if you compare them, keep in mind that Appa show is not for the day-to-day life. And just, you listened to me?
21:40
For information, the French media covered it a lot when we published our papers. And the first question of all the journalists that called me was, is it the French government? First question. And the second question, I don't really know why, but all the journalists asked me if it's more complex or less complex than Regin. And a lot of journalists at the French newspapers
22:03
made some jokes about a potential French developer because it's less evolved than Regin. So that's why I wrote these things. Okay, so the next one we found is Casper. So as Marion said, unlike Babar and Bunny,
22:21
Casper is simply a reconnaissance tool, a first-stage malware for the group. And among the samples we got, there is a DLL where the developers forgot to remove the original file name from the export table. It has been developed in C++, like most of the Animal Farm malware. And it has been deployed at least in April 2014
22:41
on a few people in Syria, thanks to a Flash zero-day exploit. And interestingly, the exploit, the Casper binaries, and its C&C server were all hosted on one machine in Syria that belonged to the Syrian Justice Ministry. But we believe the group behind that, which is called Animal Farm, as you may know, simply hacked the website
23:02
to use it as a storage. So the first thing that Casper does when it arrives on the machine, it decrypts its configuration file, which is an XML file, and it contains in particular instructions on how to deal with antivirus software that could be running on the machine.
23:21
And they call it a strategy. And you can see here at the top, there is a strategy tag that defines the default strategy. And inside this strategy tag, there are some AV tags defining strategies for specific antivirus software. And so at runtime, Casper checks which antivirus runs on the machine and applies the corresponding strategy
23:41
or the default one. And so what's a strategy? It's basically a set of parameters that define either how to perform certain actions on the machine or whether certain actions should be performed. So for example, the AutoDel parameter defines how the Casper dropper will remove itself from the machine after having dropped the payload.
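The selection logic just described (per-antivirus strategy, falling back to the default) can be sketched as follows. This is a minimal illustration, not the actual Casper configuration format: the tag names, attribute names, product names, and `AUTODEL` values are all placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration. The real layout is only shown on the slide;
# these tag and attribute names are assumptions made for this sketch.
CONFIG = """
<STRATEGY AUTODEL="API">
  <AV NAME="BitDefender" AUTODEL="API"/>
  <AV NAME="avast" AUTODEL="WMI"/>
</STRATEGY>
"""


def select_strategy(config_xml, running_av):
    """Return the parameters to apply for the antivirus found on the machine."""
    root = ET.fromstring(config_xml)
    # Start from the default strategy defined on the outer tag.
    strategy = dict(root.attrib)
    for av in root.findall("AV"):
        name = av.get("NAME", "")
        if running_av is not None and name.lower() == running_av.lower():
            # A product-specific entry overrides the default parameters.
            overrides = {k: v for k, v in av.attrib.items() if k != "NAME"}
            strategy.update(overrides)
            break
    return strategy


if __name__ == "__main__":
    print(select_strategy(CONFIG, "avast"))
    print(select_strategy(CONFIG, None))
```

Keeping the default on the outer tag means an unknown or absent antivirus still gets a complete set of parameters.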
24:03
And you see here that for Bitdefender, the AutoDel parameter is set to API, which means this action will be made through a call to the Windows API function MoveFileExW, to register the file to be deleted at the next startup. But if you run Avast antivirus, in this case, the same parameter is set to WMI,
24:22
which means the same action will be performed differently. And in this case, it will be through a command line that will be decrypted and then executed in a new process. The command line is just a loop that tries to remove the dropper until it works. And the new process is created through a WMI request. So that means that the Casper developers
24:42
have an in-depth understanding of how each antivirus product monitors the machine. And they implemented a bypass for each one of them and for each noisy action that Casper has to do, which is kind of a lot of effort. The next thing that Casper does, it receives some commands inside the configuration file.
25:03
And in particular, there is the install command to drop the payload and make it persistent on the machine. There are two versions of the payload provided, one for 32-bit machines in the x86 tag and one for 64-bit machines. An interesting detail here I would like to insist on
25:20
is that the Casper dropper gives an input parameter to the payload. And this input parameter has to have this exact specific value for the Casper payload to run normally. And the way it is implemented is actually pretty subtle. It's not a simple check at the beginning of the execution of the payload.
25:40
No, it's done in this function in the Casper payload. It's basically the function in charge of finding the API address in memory. So it's a GetProcAddress, basically, but they don't use the name of the API function. They use a hash, a four-byte hash calculated from the name. And interestingly, the first thing that this function does
26:01
is a XOR between a four-byte hard-coded constant, a variable named checksum, and the hash given as input to the function, the hash to look for. And where does the checksum come from? It's the result of a few arithmetic operations done on the input parameter of the Casper binary. So let's say the checksum is not equal
26:20
to the hard-coded constant. Then the XOR will not be equal to zero, and the hash to look for defined in this line will not be equal to the hash given as input to the function, which is the correct hash to look for. So in this case, this GetProcAddress will not retrieve the correct API function because it does not look for the correct hash because we didn't provide the right input value
26:40
to the Casper payload such that the checksum is equal to the hard-coded constant. So we have to provide to Casper this exact value such that it will execute normally. And if you don't do it, you will get a random crash because at some point, it calls the address retrieved, and it's not the correct address, so there is a weird crash inside the Windows API.
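The gating trick above can be sketched in a few lines. This is an illustration under stated assumptions, not the Casper code: the constant, the arithmetic in `derive_checksum`, the name hash, and the export table are all hypothetical stand-ins. Only the structure (constant XOR checksum XOR wanted hash) follows the description.

```python
MASK32 = 0xFFFFFFFF
HARDCODED_CONSTANT = 0x5A17C0DE  # hypothetical value, not the one in Casper


def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def name_hash(name):
    """Stand-in four-byte hash of an API name (assumed, for illustration)."""
    h = 0
    for byte in name.encode("ascii"):
        h = rol32(h, 7) ^ byte
    return h


def derive_checksum(input_parameter):
    """Stand-in for the 'few arithmetic operations' done on the input parameter."""
    return ((input_parameter ^ 0xA5A5A5A5) + 0x1234) & MASK32


# Dropper side: the single parameter value whose checksum equals the constant.
GOOD_PARAMETER = ((HARDCODED_CONSTANT - 0x1234) & MASK32) ^ 0xA5A5A5A5


def resolve_api(exports, wanted_hash, checksum):
    """Look up an export by hash; the lookup is silently skewed by the checksum."""
    # If checksum == HARDCODED_CONSTANT the first two terms cancel out and
    # the function searches for wanted_hash. Otherwise it searches for garbage.
    hash_to_look_for = HARDCODED_CONSTANT ^ checksum ^ wanted_hash
    for name, address in exports.items():
        if name_hash(name) == hash_to_look_for:
            return address
    return None  # the real payload would end up calling a wrong address
```

With `derive_checksum(GOOD_PARAMETER)` the resolver behaves like a normal hash-based lookup; with any other parameter it never finds the requested function, which is why the payload crashes when run without its dropper.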
27:00
So that's a technique to make the analysis of the payload difficult without having the dropper, because you have to find this exact input value such that this line will have no effect. So once Casper is executing normally on the machine, it builds a very detailed report on the machine, and you can see an extract here.
27:20
And this report is sent back to the C&C server, which can provide in response an XML file, once again, containing commands, and in particular, they can deploy a second-stage binary at this stage. We don't have any second-stage binary because the C&C was down when we started investigating Casper. So that's it for Casper.
27:40
And now it's time to talk about Dino. That's, as far as we know, the first time that Dino is publicly documented. So Dino is more in the category of Babar and Bunny. It's an espionage backdoor, a second-stage malware. It has a lot of features, including the ability to do some complex file search requests.
28:02
The operators can ask the Dino malware: give me all files with a .doc extension whose size is greater than a certain number of bytes and that were modified in the last few days. And that's probably the end goal of Dino, exfiltrating information from the target. We got only one sample of Dino, actually. And this sample was deployed in Iran in 2013.
28:24
It is developed in C++ with a clean modular architecture. There is no RTTI inside the binary, but there are a lot of verbose error messages, like this one, for example. And once again, they forgot to remove the original file name from the export table.
28:42
So here is the list of modules that we got in our sample, with the names given by the developers. And so first there is a PSM module which maintains an encrypted on-disk copy of all the modules of Dino. Then the CORE module contains the configuration. The CRONTAB module allows the operators to schedule tasks
29:03
with a syntax that is almost exactly the same as the cron Unix command. Then there is the FMGR module to upload and download files on the machine, whereas CMDEXEC and CMDEXECQ manage the execution of commands in Dino. And finally, they got a module that they call ENVVAR
29:22
that stores Dino-specific environment variables. So now I'm going to dig a little bit into technical details of Dino. So one of the important things when analyzing Dino was to understand a custom data structure that they use everywhere, basically, and in particular to store the content
29:41
of the modules of Dino. The developers called this structure a data store, and that's a map from strings to values. These values can have eight possible types. Some of those types have a fixed size, like byte, short, word. Some of those types have a variable size. And the type names here,
30:00
they also come from the developers, because in Dino there is a function to print a data store nicely and it prints in particular the names of the types. So for example, here is the result of printing the data store that is inside the CORE module, and it contains the configuration of Dino.
30:20
So that's really the output of the print function that is inside Dino. So you can see on each line the key, the value associated with the key, and the type of this value. So how are these data stores implemented actually? As I said, it's a map. And they implemented it like a simple hash table. So in memory, in a data store object,
30:42
the first field is a pointer to an array of four entries, and each entry starts a linked list containing key-value pairs. So in order to access a key, there is a hash function that is used. You hash the key, you get a number, take the number modulo four, and it gives you the index in the array where the linked list possibly containing your key starts.
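The lookup just described is a classic chained hash table with a fixed bucket count. A minimal sketch, assuming a stand-in hash function (the talk does not say which one Dino uses) and hypothetical class and method names:

```python
BUCKET_COUNT = 4  # fixed array size, as described in the talk


class Node:
    """One key-value pair in a bucket's linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


def key_hash(key):
    # Stand-in hash; the real function is not given in the talk.
    return sum(key.encode("utf-8"))


class DataStore:
    def __init__(self):
        self.buckets = [None] * BUCKET_COUNT

    def set(self, key, value):
        index = key_hash(key) % BUCKET_COUNT
        node = self.buckets[index]
        while node is not None:
            if node.key == key:
                node.value = value  # overwrite an existing key
                return
            node = node.next
        # Prepend a new node to this bucket's list.
        self.buckets[index] = Node(key, value, self.buckets[index])

    def get(self, key, default=None):
        node = self.buckets[key_hash(key) % BUCKET_COUNT]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return default

    def chain_lengths(self):
        """Length of each bucket's list, to see how quickly collisions pile up."""
        lengths = []
        for head in self.buckets:
            count, node = 0, head
            while node is not None:
                count, node = count + 1, node.next
            lengths.append(count)
        return lengths
```

With only four buckets, storing 20 keys guarantees at least one chain of five or more entries, so every lookup degrades toward a linear scan.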
31:02
So for example, in this simplified view, the hash of the key IP is three modulo four, so the linked list containing the key IP starts at index three. They fixed the number of buckets to four, the size of the array to four, which makes this data structure not really efficient because there are a lot of collisions,
31:21
a lot of keys with the same index in the array, and the linked lists grow really fast. Another thing to know about data stores is that they can be serialized, and they have a custom format to serialize them. It looks like this, a serialized data store. It begins with a magic DWORD, DxSx in big-endian,
31:41
then a suspected version number, then the number of stored items, and then the serialized items themselves. First the key, its length and its name, and then the actual value, its type and the value. Serialized data stores are used in particular in the PSM module that maintains the up-to-date copy of all the content of the modules.
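The field order described above (magic, version, item count, then key length, key name, value type, value) can be sketched with Python's `struct` module. Only the `DxSx` magic and the ordering come from the talk; the field widths, the big-endian encoding of every field, and the two type codes are assumptions made so the example round-trips.

```python
import struct

MAGIC = b"DxSx"
# Hypothetical type codes; the talk mentions eight types without listing codes.
TYPE_DWORD = 1
TYPE_STRING = 2


def serialize(items, version=1):
    """Serialize a dict of str -> (int | str) into the sketched layout."""
    out = [MAGIC, struct.pack(">II", version, len(items))]
    for key, value in items.items():
        raw_key = key.encode("utf-8")
        out.append(struct.pack(">H", len(raw_key)) + raw_key)
        if isinstance(value, int):
            out.append(struct.pack(">BI", TYPE_DWORD, value))
        else:
            raw_value = value.encode("utf-8")
            out.append(struct.pack(">BH", TYPE_STRING, len(raw_value)) + raw_value)
    return b"".join(out)


def deserialize(blob):
    """Parse a blob produced by serialize(); returns (version, items)."""
    if blob[:4] != MAGIC:
        raise ValueError("bad magic")
    version, count = struct.unpack_from(">II", blob, 4)
    offset = 12
    items = {}
    for _ in range(count):
        (key_length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        key = blob[offset:offset + key_length].decode("utf-8")
        offset += key_length
        (value_type,) = struct.unpack_from(">B", blob, offset)
        offset += 1
        if value_type == TYPE_DWORD:
            (value,) = struct.unpack_from(">I", blob, offset)
            offset += 4
        elif value_type == TYPE_STRING:
            (value_length,) = struct.unpack_from(">H", blob, offset)
            offset += 2
            value = blob[offset:offset + value_length].decode("utf-8")
            offset += value_length
        else:
            raise ValueError("unknown value type %d" % value_type)
        items[key] = value
    return version, items
```

Checking the magic first is also how an analyst would spot serialized data stores inside a decrypted file.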
32:01
And it is done inside an encrypted file, and this is, by the way, the RC4 key they use to encrypt the file, so we can admire the leet-speak of the developers. So yeah, that's it for the data store. An interesting thing also inside Dino is a module that the developers call RamFS. It's also present in other Animal Farm binaries.
32:23
And like the name implies, just to conclude on the data store, I think data store is a custom data structure but if you recognize it, by the way, I would be very happy to know. So RamFS, like I said, it's, like the name implies, it's a temporary file system
32:41
that can be mounted in memory from an encrypted blob that is initially inside the configuration. So once it's mounted in memory, RamFS will always remain stored in encrypted chunks of data. So it's a set of encrypted chunks of data. And the chunks will be decrypted only on demand. And in our Dino sample,
33:01
RamFS initially contained one file that the developers call the cleaner file. And it contains instructions on how to remove Dino from the machine. So to give you an idea, here is the code that is responsible for executing the cleaner file when the operators want to remove Dino from the machine.
33:20
So the first thing it does is to look for the name of the cleaner file, which is inside the configuration. And you can see at the bottom that once again they provide very verbose error messages. So we know basically what the purpose of a check is from the error message. So if they find the name of the cleaner in the configuration, then they look for the cryptographic key to decrypt the RamFS initial blob in the Dino configuration.
33:44
And you can see that they call this key the passphrase. So they are really in the file system mindset. And then if it works, there is the actual mount operation which is basically creating a C++ object in memory that contains the file system. And if the mount works,
34:01
then they execute the cleaner file which is inside the RamFS. So how is RamFS actually implemented at a low level? So in memory we got this RamFS object and, as I said, RamFS is a set of encrypted chunks. And these chunks are inside a linked list for which we got a head pointer and a tail pointer
34:22
inside the RamFS object at offset B8. So the head and tail pointers point to the beginning and the end of this linked list. And each item of this list is actually the address of a chunk header. The chunk header is a structure containing in the first field the address of a chunk and in the second field,
34:42
the key that serves to encrypt and decrypt the chunk. So each chunk could be theoretically encrypted with a different key. And finally, the chunk content address points to the 512 bytes of encrypted data. So that's the way RamFS looks like in memory,
35:02
a set of encrypted chunks that you can access from a linked list. This is not even the file system structure. The actual file system structure is inside the first chunk. So they would decrypt this first chunk when they mount the file system and store the structure inside the beginning of the RamFS object. This is the only part
35:21
that always stays decrypted in memory. So the rest of the file system is always encrypted and decrypted on demand only. Interestingly, we got the names given by the developers for three fields in this structure because of some error messages in the code. So we know the developers called these three fields here
35:41
the file list, the free chunk block list, and the free file header list. The file list is the only one that is non-empty at the beginning and, like the name implies, it's a list of all the files stored in RamFS. And in this case, we only got one file, the cleaner I just talked about. So we got the name in the structure, a.ini,
36:00
and the content of the file. The content is a custom command to uninstall Dino from the machine, thanks to the -U switch. So basically, RamFS also comes with a custom command handler. And here are a few commands that they can execute inside RamFS. The install command I just showed you, the extract command to take a file from RamFS and put it onto the real file system.
36:22
Two commands to execute or inject files stored in RamFS, and a command to kill a running process on the machine. So we can guess that this RamFS thing is really a disposable execution environment for the developers. It always stays encrypted. It's really hard in terms of forensics to understand the structure. So that's probably the purpose.
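The in-memory layout described earlier (a list of chunk headers, each holding a 512-byte encrypted chunk and its own key, decrypted only on demand) can be sketched as follows. This is an illustration, not the Dino code: RC4 is used here only as a stand-in cipher, a Python list replaces the linked list with head and tail pointers, and all names are hypothetical.

```python
import os

CHUNK_SIZE = 512  # size of the encrypted chunks, as given in the talk


def rc4(key, data):
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


class ChunkHeader:
    """Holds one encrypted chunk together with the key that protects it."""

    def __init__(self, plaintext):
        padded = plaintext.ljust(CHUNK_SIZE, b"\x00")
        self.key = os.urandom(16)  # each chunk may have a different key
        self.content = rc4(self.key, padded)


class RamFS:
    def __init__(self):
        # A Python list stands in for the linked list of chunk headers.
        self.chunks = []

    def append_chunk(self, plaintext):
        if len(plaintext) > CHUNK_SIZE:
            raise ValueError("chunk too large")
        self.chunks.append(ChunkHeader(plaintext))
        return len(self.chunks) - 1

    def read_chunk(self, index):
        """Decrypt on demand; the stored copy stays encrypted."""
        header = self.chunks[index]
        return rc4(header.key, header.content)
```

Because `read_chunk` returns a fresh decrypted copy and never writes it back, a memory dump of the store itself only ever shows ciphertext, which matches the forensic difficulty mentioned above.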
36:41
And my question is, is this thing custom? Did they copy-paste it from somewhere or not? I couldn't find any implementation like that on the internet, so I tend to believe it is custom. In particular, because of these high-level characteristics: all the file names and file content are in Unicode. The maximum file name length is 260 characters.
37:03
Once the chunks are decrypted, they are manipulated as chunks of 540 bytes this time, which is not so common, as far as I know. And I couldn't find any metadata on files, so they don't seem to have any timestamps for creation time, last modification time, or things like that. So I believe RamFS is custom,
37:21
but once again, if you recognize this thing, I would be happy to know. And now. So, we spoke about a lot of different malware, samples, et cetera, and now I'm gonna try to explain the different links we found between each of them,
37:41
and the similarities, and why we think it's the same, or more or less the same, group of developers behind them. So, the first thing to mention is API obfuscation. So basically, every cartoon uses the same technique to obfuscate API calls.
38:03
They load the library, in this example kernel32.dll, in memory, take the list of exported functions, and generate a hash of each exported function to find the wanted hash.
38:22
So, we found two different algorithms. Here is, for example, the algorithm used by Bunny and Casper. So it's really easy, it's simply a ROL and a XOR. I put a Python script and an example of the hash of the CreateProcess function.
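A rotate-and-XOR name hash of the kind described can be sketched as below. This is not the script from the slide: the rotation amount of 7 bits, the order of the two operations, and the 32-bit width are assumptions, so the values it produces should not be expected to match the hashes found in the binaries.

```python
def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def api_hash(name, rotation=7):
    """Hash an export name with ROL and XOR (rotation amount is assumed)."""
    h = 0
    for byte in name.encode("ascii"):
        h = rol32(h, rotation) ^ byte
    return h


def find_export(export_names, wanted_hash):
    """Walk the export names and return the one matching the wanted hash."""
    for name in export_names:
        if api_hash(name) == wanted_hash:
            return name
    return None


if __name__ == "__main__":
    exports = ["CreateFileW", "CreateProcessW", "ExitProcess"]
    wanted = api_hash("CreateProcessW")
    print(hex(wanted), find_export(exports, wanted))
```

The binary only needs to embed the four-byte values, so no API name appears as a string, and an analyst has to hash every export of the loaded library to recover which function is being called.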
38:41
So, another similarity between the samples is the way antivirus detection is done. In each cartoon, the developers use WMI to find the antivirus installed on the system. So they use root\SecurityCenter,
39:01
or root\SecurityCenter2 on newer Windows systems, and run a SELECT * FROM AntiVirusProduct to get the antivirus name, installation date, et cetera. And in fact, they use the first word of the name
39:23
of the security product. For example, if you look at G Data, it's only the G and not G Data. It only takes the first word. And they compute a SHA hash of it, store the hash inside the binary, and check whether the installed antivirus is this one.
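The matching step can be sketched as follows, assuming the product names have already been obtained from the WMI query. The SHA variant (SHA-256 here), the case handling, and the product list are assumptions for illustration; a real sample would embed only the digests, not the names used below to build them.

```python
import hashlib


def first_word_digest(product_name):
    """Hash only the first word of the product's display name."""
    words = product_name.split()
    first_word = words[0] if words else ""
    return hashlib.sha256(first_word.encode("utf-8")).hexdigest()


# Illustrative table: digest -> label. The names are placeholders.
KNOWN_AV_DIGESTS = {
    first_word_digest(name): name
    for name in ("G Data", "Bitdefender", "avast!")
}


def detect_antivirus(installed_product_names):
    """Return the labels of known products among the installed ones."""
    found = []
    for product_name in installed_product_names:
        label = KNOWN_AV_DIGESTS.get(first_word_digest(product_name))
        if label is not None:
            found.append(label)
    return found


if __name__ == "__main__":
    print(detect_antivirus(["G Data InternetSecurity", "Some Other Product"]))
```

Storing digests instead of names hides the list of targeted products from a strings dump, but it also means an unmatched digest cannot be reversed without guessing the product name.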
39:45
Just for information, for the last three hashes, we didn't find which antivirus is detected. We didn't find the name, so if by any chance you have an idea or have a huge database of hashes, maybe you could help us to have a full list of detections.
40:07
So another fun thing about the cartoon samples is the emulator detection. So if you look, something really fun is the last one.
40:22
If you look, A, F, Y, et cetera. So it's a random name generated by the Kaspersky product, but the developer hard-coded the random name inside the binary. So I think he ran a test, got this name, hard-coded it, and didn't check a second time.
40:41
And you've got a list of other random names potentially generated by Kaspersky to start the emulator. Yes, sir. Is this working? Can you hear me? Awesome. At this point, I'd like to thank a researcher who wants to stay anonymous
41:01
who dedicated his spare time to reverse engineering antivirus emulators. And coincidentally, he saw my paper on Bunny, and said like, oh, the TESTAPP string, wait, I saw that before, it's Bitdefender. And then he was like, but the other string, that doesn't make sense. Like that should be Kaspersky. But then he sent me this list of names
41:21
that he expected from the Kaspersky emulator, and said like, this is a random string. This shouldn't be hard-coded. I thought it was hilarious. Another link and similarity between the samples is the internal ID of the malware.
41:42
If we look at these numbers, they seem to be really similar. For example, on the CSEC slide, the Babar sample mentioned is 08184. We don't have it, but in the samples we analyzed, Dino, Bunny, Babar, et cetera,
42:01
the naming convention seems to be more or less the same. So it's pure speculation, but maybe the first two digits match the year of creation and usage of the malware. On the CSEC slide, it's 2008, and 2011, 12, and 13
42:22
for the other sample. But we don't have any proof, it's simply speculation, and maybe it's the ID and the version of the campaign ID after, or something like that. I don't know. Another link between each sample
42:41
is of course the naming convention, because all the names we mentioned, Babar, Casper, Bunny, et cetera, are the internal names chosen by the developers. It's not our naming convention. So in every case, they chose more or less cartoon characters to identify the malware.
43:04
Another link between each sample is really, really bad English usage, other than mine, imagine. So for example, if you look at the string of the registry key, you have a mistake in the middle,
43:20
and Marion has a fun thing about these errors. I think you can mention it. Yeah, so the typos in the binaries were hilarious. If you look at the yellow-marked, is it about to negotiate string? You realize there's one letter that should be kind of different. Anyways, I was looking at the Bunny binary, and initially I saw these strings, of course, and I was like, okay, this binary does something
43:42
with a registry key, and maybe something with IPsec, because it's accessing the Microsoft IPsec key, whatever, I have no idea of IPsec. So I was looking how Microsoft interacts with IPsec, and what the binary could do with the registry when modifying IPsec in there, and then I realized, wait, there is a name, it's written wrongly, this shouldn't work, but when the binary ran, it would work.
44:02
So I thought maybe it's trying something and it has a bug in there, and then I realized that what it's writing to this registry key is actually its configuration and not anything related to IPsec. So if they hadn't made a typo in the name, I would have maybe been researching IPsec forever, but because they made a typo, I saved time.
44:20
Which I found was great. And another, even stronger link between them is that we found a C&C, horizons-tourisme.com. It's a travel agency in Algeria, I think, yeah.
44:41
It's in Algeria. And we found this directory listing on the C&C, and we were able to list every file and directory, and we found directories for several different malware. For example, D13 is the directory used by the Dino malware, and 13 is the version of Dino.
45:04
And on the same server, we've got BB28 for Babar, and we've got TFC for Tafacalou. It's one of the samples developed by the group. And just for information, in the Kaspersky report, you have some information about Tafacalou.
45:22
It's an Occitan word. That means it's hot, basically. And it's taken as a link to French developers, but I'm French, and I never heard this word before the Kaspersky article. So the link with French speakers, for me it's a little bit exaggerated.
45:41
And that's why I'm gonna speak about attribution now, about this case. And a lot of articles, people, et cetera, point to French developers. And they decided to, for me,
46:00
clearly manipulate information to point to French developers. So the Tafacalou stuff, for example, or Titi. If you look at the slide, Titi is a diminutive for Thierry. Yeah, maybe, but I'm not convinced. And a term that means small people. I never heard the usage of Titi in French
46:21
to identify small people. So for me, it's not really true. We got the idea that Canadians maybe don't speak real French. Canadian French. So maybe people at the back cannot read the graph,
46:41
but I decided to list every C&C and check which country each C&C is in. Typically we found Syrian websites, Iranian websites, American websites, Burkina Faso. The .bf is not for brute force, but for Burkina Faso.
47:03
Hong Kong, Saudi Arabia, Egypt, Turkey, Niger, Morocco, and Ukraine. Basically, among the C&Cs we identified compromised websites like the Syrian government website. We found universities, companies, and sometimes it was a fake website.
47:24
And not always, typically not for the Syrian government, but most of the C&Cs were WordPress websites. So maybe the developers used some vulnerability in WordPress to drop the C&C panel, et cetera.
47:46
There was another interesting story about the C&C. So one of the Babar binaries has a C&C server which is hosted on an Algerian travel agency website. So I didn't think this was anything suspicious, but one day my boss poked me about this website
48:04
and said like, so he's Algerian, and he said like there's no travel agency in this place in Algeria. So he basically said that in Algeria you don't really travel nowadays. Like there's no need for travel agencies and the location, like the village where this travel agency says it's located,
48:22
he doubts that there's a travel agency there. And even if there were a travel agency, it's improbable that this travel agency hosts its web server on a United States hoster. So we genuinely believe that these websites are faked. So here is the map of the location of each C&C.
48:45
So for China it's only Hong Kong, but I cannot simply draw Hong Kong on the map. So that's why it's the whole of China. Yeah. So I explained that some reports point to France
49:03
based on bad arguments, but we found some more relevant French hints. For example, in the HTTP requests generated by the malware, the Accept-Language header is set to fr.
49:20
And another example is that the compiler setup language is set to French, as the second screenshot shows. And another thing is on Casper, I think, or on Dino. The developers did not remove the path
49:42
of compilation of an arithmetic library, and arithmetic was written the French way and not the English way with a C at the end. So, of course, all of these elements can be forged, and I can set up my compiler in a lot of languages
50:01
and manipulate the data, but it's simply a fact for us. And finally, the other information for attribution is the CSEC slide where the agency
50:22
points to a French intelligence agency for the Snowglobe campaign. At this point, I'd like to say a short thank you to the Canadian intelligence agency for providing these slides. I'm very sorry, they actually got leaked, but they helped us a lot in our research.
50:44
Yeah, but everybody knows that attribution is not so easy. And the journalists that covered the case, in France in particular, are really good at bad attribution. For example, a newspaper said that Joan, this French guy,
51:00
is a Canadian guy, too bad. And for example, Marion is from Austria, and a journalist, a French journalist, doesn't really know the difference between Austria and Australia. So he made a mistake. You maybe won't know, but usually we Austrians, we have the problem that we're confused with Germans, which is like, it's not directly an insult,
51:20
but it is an insult, and calling us Australian is even worse. I mean, this is a different continent. So yeah, attribution is not so easy. So thank you for your attention. Just for information, on the final slide, we will provide to the organizer,
51:40
you will have links to all our articles, a list of hashes of the analyzed samples, et cetera, et cetera. And yeah, that's all. If you have any questions, you can ask us now or later at the bar. Thank you.