PACKET HACKING VILLAGE - Normalizing Empire's Traffic to Evade Anomaly-based IDS - TIB AV-Portal

Sen, Utku; Sinturk, Gözde

Formal Metadata

Title

PACKET HACKING VILLAGE - Normalizing Empire's Traffic to Evade Anomaly-based IDS

Title of Series

Number of Parts

322

Author

Sinturk, Gözde

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/39914 (DOI)

Publisher

Release Date

Language

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Perimeter defenses hold an important role in computer security. However, when we look at the methods of APT groups, a single spear-phishing email is usually enough to gain a foothold on the network. Therefore, red teams are mostly focused on "assume breach" scenarios. In these scenarios, testers need to use a post-exploitation framework. Besides that, testers also need to hide the server-agent communication from NIDS (Network Intrusion Detection Systems). In this session, we will discuss how one of the most famous post-exploitation tools, Empire, fares against payload-based anomaly detection systems. We will explain how to normalize Empire's traffic with the polymorphic blending attack (PBA) method. We will also cover our tool, "firstorder", which is designed to evade anomaly-based detection systems. The firstorder tool takes a traffic capture file of the network, tries to identify the normal traffic profile and configures Empire's listener accordingly.
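To make the notion of a payload-based anomaly detector more concrete, here is a minimal sketch. It is not the detector of any particular product: it assumes the detector models normal traffic as a byte-frequency distribution and scores a payload by its L1 distance from that profile. The function names and the sample payloads are hypothetical.

```python
from collections import Counter


def byte_distribution(payloads):
    """Relative frequency of each byte value across the given payloads."""
    counts = Counter()
    for payload in payloads:
        counts.update(payload)  # iterating bytes yields integers 0..255
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


def anomaly_score(payload, profile):
    """L1 distance between the payload's byte distribution and the profile.

    0.0 means identical distributions, 2.0 means no byte values in common.
    This is a simplification chosen for illustration.
    """
    observed = byte_distribution([payload])
    return sum(
        abs(observed.get(value, 0.0) - profile.get(value, 0.0))
        for value in range(256)
    )


if __name__ == "__main__":
    # Hypothetical "normal" traffic used to train the profile.
    training = [b"GET /news HTTP/1.1 the user reads the news and sends an email to the team"]
    profile = byte_distribution(training)

    text_like = b"the team sends the news to the user"
    encrypted_like = bytes(range(256))  # every byte value equally likely

    print("text-like payload:     ", anomaly_score(text_like, profile))
    print("encrypted-like payload:", anomaly_score(encrypted_like, profile))
```

Under these assumptions, a body in which every byte value is about equally likely scores far from a text-based profile, while text-like data scores close to it. That gap is what a blending attack tries to close.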


Transcript: English(auto-generated)

00:03

to my talk named Normalizing Empire's Traffic to Evade Anomaly-based IDS. Can everyone hear my voice? Is it okay in the back? Okay, great. So let me introduce myself first. I'm Utku Sen. I usually do research and write tools which are about

00:23

the offensive side of security. I'm currently working at Tear Security, which is not live yet but will be live soon. You can find detailed information on my website, and you can follow me on Twitter if you are interested. So in this presentation I will

00:41

start by talking about the current state of defense and why we prefer assume breach scenarios. After that we will check the advantages and disadvantages of network intrusion detection systems and their evasion techniques. After that we will check the Empire project and its situation against anomaly-based systems. Then we will check the proposed

01:03

solutions and the firstorder tool, which can be a candidate solution for these problems. So you are seeing a very basic architecture of a network here. There are client computers located alongside the other assets like servers, databases, etc. On top of

01:21

them there's also an intrusion detection system, which monitors the whole traffic and tries to identify malicious traffic. We also have a concept named perimeter defense here. This concept isolates your internal network and acts like a door which connects you to the outside. But as we can observe from recent breaches,

01:43

perimeter defenses are not keeping attackers out of an organization's network. A well-crafted spear phishing email is usually enough to gain a foothold on the internal network. So more and more attackers are using these techniques in order to bypass the perimeter defenses. So gaining a foothold on the internal network is not

02:04

a big deal anymore. As the attacker profile has changed over time, defense methods are also changing. Besides that, penetration testing concepts are changing too. Today organizations and testers are mostly focused on assume breach

02:22

approaches. Assume breach is simply accepting that attackers have bypassed your perimeter defenses and got a foothold on your network. So red teams and testers will be focused on post-exploitation instead of passing the perimeter defenses. So rather than traditional vulnerability scanning, testing activities become a cat and mouse game

02:44

between attackers and defenders. We can show NIDS, the network intrusion detection systems, as an initial step for detecting attackers on the network. We can talk about two main types of NIDS here. They are signature-based and anomaly-based. So you

03:09

know the signature-based systems. It is a signature stack of previously known attacks. It compares each network packet with those predefined patterns. If they match, well, then we have malicious traffic. There are very good and open

03:24

source projects out there such as Snort and Suricata. The downside of these systems is that, since they only catch known attacks, they won't be able to catch new types of attacks. So how can an attacker evade those systems? I'd say it's not so complex,

03:41

but it's not super easy either. All you have to do is change the traffic elements such as packet size, headers and some known strings. Maybe you can also apply some encoding methods. Anomaly-based systems are a bit more sophisticated than the signature-based systems. Anomaly-based NIDS refers to building a statistical

04:04

model describing the normal traffic and flagging the abnormal traffic. To do that, it analyzes the normal network traffic and applies various data science techniques to build a pattern. These kinds of systems exist, but they are usually commercial

04:21

products and not open source tools. There are some theoretical concepts described in various research papers, but they are not so practical yet. So anomaly-based systems have a chance to catch new types of attacks. We can show the overall process in a basic chart. An anomaly-based system should record the daily traffic first in order to build a

04:45

baseline. So what do we mean by daily traffic? It's the regular activities of the regular users on the network. For example, visiting a news website, sending an email via SMTP or accessing a server via SSH, that kind of thing. After that, the

05:04

captured data will be given to the learning algorithm. It will be trained on this data to create a pattern, which means the normal traffic profile. After this is done, the program listens to the network and compares each network packet with the normal traffic profile. If a packet does not fit the normal traffic profile, it flags it as abnormal

05:25

traffic. The evasion part is a bit trickier than with the signature-based systems. We can list the evasion methods in two different categories: pre-training evasion and post-training evasion. Training refers to analyzing the normal traffic to

05:41

build a pattern. As we know, an anomaly-based system needs to capture and analyze regular network traffic in order to build a pattern. So we are assuming that during traffic capture only legit users are generating legit network traffic. That way we expect that the anomaly-based system will be able to differentiate

06:04

malicious traffic from the normal traffic. But what if an attacker is located on the network before the training process is done? He can generate malicious traffic, so it will be included in the normal traffic pattern. For example, in this chart a user visits amazon.com and another one connects to some server via SSH. But the

06:27

other one, the attacker, uses Meterpreter's reverse TCP traffic. He generates that traffic. So the anomaly-based system will include this in the normal profile. So in the future, when anyone runs Meterpreter on the network, the anomaly-based system won't catch

06:42

it. But we can say this scenario is not realistic, right? How can an attacker know when the system is going to be trained on the traffic? It's hard. We can count this under malicious insider threat. In a realistic scenario, we expect that the attacker can't know when the

07:05

traffic training is done. Therefore we need to focus on post-training evasion. In this scenario we will assume that a trained anomaly-based system watches the whole network. So we, as an attacker, should gain a foothold on the network and exfiltrate the

07:22

data without causing any anomaly alerts. To achieve that, we need a post-exploitation framework. I could have chosen Metasploit for this job, but instead I chose Empire, because I love this project. It's very flexible for me, easy to add and

07:42

remove things. It's also written in Python, which I prefer over Ruby myself. So how many people have used the Empire project before? Okay, there are quite a lot of people, but there are also some people who don't know it. Okay. So we can

08:04

describe, so Empire is basically a post-exploitation framework like Metasploit. We can describe Empire's workflow in two parts, agent and listener. The agent is the infected machine on the network, which takes and executes tasks there. The listener is

08:22

described as the communication server, the C2 server: the agent connects there, gets its designated tasks and sends the related output to the C2 server. Empire supports different types of listeners such as HTTP, HTTPS, Dropbox, and there are some others too. But

08:40

even though an HTTPS connection encrypts all communication, we will assume there is a solution on the network which intercepts and decrypts TLS communication. Because of that, the HTTP listener will be our main focus. There are different traits of this HTTP listener, as you can see on the slide. We will see how these traits affect our

09:01

visibility on the network in the upcoming slides. The HTTP listener provides encrypted communication even without a TLS connection. This encryption is done in two parts. The client data, which includes data like the command to be executed and its output, is encrypted symmetrically with the AES algorithm. On the other hand, there is

09:24

a metadata routing data packet, which is responsible for routing packets to the right destination. It's also encrypted, with the RC4 algorithm. So we will consider that these encrypted payloads are not decryptable by a solution on the fly, since there is no

09:41

publicly known product that has man-in-the-middle capability for Empire's communication. So, assuming that an agent is deployed on the network, we will check the communication between the agent and the C2 server. After the initial negotiation, the agent will connect to the C2 server every n seconds. You can see the generic request and response of a

10:05

heartbeat connection here. So we can list the following traits which can be noticed by an anomaly-based system. First, the request URI. As shown in the request, the agent makes its connection with a GET request to a specific URI. It's a .php page in this

10:23

example. So if only HTML or ASPX pages are in use on the local network, this PHP extension may be flagged by the anomaly detection system, because it's something different from the HTML or ASPX pages. Second, we have

10:40

to set the user agent value. If all users on the network use Microsoft Windows with the Chrome browser, setting the user agent value to macOS with the Safari browser will probably be flagged by the anomaly detection system, because it's something else, something different. It's not familiar traffic. It's the same with the server header. You

11:02

know, if you are using Microsoft IIS servers in the network, setting the server header to Apache won't be a good choice for you. Okay. The last one is the POST request body. As seen in the example request, the POST request body is

11:23

encrypted and contains gibberish characters. If all users are browsing regular websites, this is likely to be flagged by the anomaly detection system. So we listed the traits which may be considered by an anomaly-based system. Now let's check how can

11:40

we adjust them. We can gather the traits into two different groups: traits that can be changed in the listener's options menu, and traits that can be changed by changing Empire's source code. Our first group consists of the request URI, user agent value, server header, port and connection interval. Our second group consists of the default HTML content

12:05

and the POST request body. So let's look at the proposed solution: with which method can we normalize the mentioned traits? The proposed solution is called the polymorphic

12:20

blending attack. It's a useful technique for evading anomaly-based systems. From a high-level perspective, the idea is creating attack packets which match the normal traffic profile. Of course, to use this technique, the attacker should know what is considered normal, right? The attacker should know the features which are

12:43

used to train the normal traffic profile. To do that, our model requires traffic capture data of the network, taken after the normal training process is done. By analyzing the traffic capture, we can make some deductions about the normal traffic profile. For

13:02

normalizing the first group of traits, which are the request URI, user agent, server header and port, we need to analyze the traffic capture and identify the most common values. For example, which user agent values are preferred most? Which ports are used? What kind of server headers are there, etc. After identifying these most common values, we

13:22

can configure Empire's listener options. After that we can start the agent and C2 communication. We also have the connection interval in the first group of traits. However, finding the normal connection interval is not an easy task. One way to do

13:41

it is figuring out the connection interval and frequency of the users to specific websites. However, this solution will not be practical, since there can be delays and interruptions during the traffic capture, and this will affect our results. The second solution actually relies on the false positive rates of anomaly detection

14:03

systems. Some research mentions that the false positive rate of an anomaly-based system has a positive correlation with the size of the network. This means that in a large network, even if we keep the connection interval value small and get flagged by the anomaly detection system, this is likely to be seen as a false positive by

14:23

analysts, because it usually happens a lot. However, if we are located in a small network, we need to set the connection interval higher than a predefined threshold. The disadvantage of this method is that the C2 agent communication will be delayed, but it's a trade-off to keep our communication out of sight of the

14:44

anomaly detection system. For the second group of traits, which were explained in the previous section, the default HTML content can be chosen by identifying the most visited website in the traffic capture data. However, normalizing the POST request body of the

15:03

communication is not achievable by using traffic capture data, of course. As explained in the previous sections, the POST request is encrypted and contains gibberish characters. So here's the deal: if we encrypt the POST request body, it gives us power over signature-based solutions, because it won't match any

15:23

malicious signature. However, an anomaly-based system will flag this, since it doesn't look like normal HTTP data. On the other hand, if we don't use encryption there, our problem is bigger. Now a signature-based IDS will catch us, since the traffic may contain some malicious indicators like whoami, cat, etc. Also, an anomaly-based IDS will catch

15:45

this, for the same reason as the previous one: it doesn't look like normal HTTP data. Instead of directly encrypting the data, we can use the Markov Obfuscate tool, which was created by the Cylance SPEAR team. So let's check the overall process on the slide. First,

16:04

we need the training data, which consists of lots of text. The Cylance team used an English book for their demonstration. After that, the Markov encoding algorithm is trained on this book to build a model. When the model is built, it's

16:21

ready to encode any string into fairly meaningful English text. For example, this is the data of an /etc/passwd file. Normally every IDS solution will flag this. However, when we use an English book to generate a model and encode that string with the

16:41

Markov Obfuscate tool, we get this result. It looks like meaningful English text, like a blog post. However, when it's decoded using the same model, we will have the initial content of the /etc/passwd file. You can check the Cylance SPEAR team's GitHub page in order to get more information about this tool. So the

17:04

operation steps will be like this. First, we need to encode the encrypted content with Base64 to get rid of gibberish characters. As we discussed, we also need lots of text, which will be used as training data for Markov Obfuscate. One way to do

17:22

it is including the training data inside the agent. However, that won't be practical, since it will increase the agent size a lot. Instead, we can download the data set from an external source. It doesn't have to be our own website. We can program it for

17:40

crawling and parsing text from, for example, news websites, blogs, etc. After that, we are ready to encode our data. And we will send the encoded data along with the training data to the C2 server. Since the anomaly-based system will see only English text, it will

18:03

think that it's a kind of blog post or forum comment, and probably it won't raise any alerts. But there are some drawbacks to this method. The first one is that the training phase will consume time and resources on the deployed computer. Time may not be

18:23

a very big problem, but if there is a centralized resource monitoring solution on the network, the blue team may identify these resource usage spikes. The other drawback is that you need to implement the Markov Obfuscate code inside the agent, because it should do this on its own. We can't send commands and expect outputs at that moment. This

18:47

issue will also increase the agent file size. So we created a tool called firstorder, which is written in Python. Given a pcap file, firstorder can extract key

19:02

features of the network such as the most used ports, most used user agents, server headers, how many different machines are located on the network, etc. It can extract all the traits which we discussed in the previous sections. According to this identified data, it automatically configures Empire's listener. So let's see a small demo video to

19:23

see it in action. Sorry, I can't pause the video, so I need to explain in real time.
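The profiling step described for firstorder can be sketched roughly as follows. This is not firstorder's actual code: it assumes the HTTP fields have already been parsed out of the capture into plain dictionaries, and the field names and the shape of the resulting settings are hypothetical.

```python
from collections import Counter

# Hypothetical field names for traits discussed in the talk.
FIELDS = ("port", "user_agent", "server_header", "uri_extension")


def most_common_value(records, field):
    """Return the most frequent non-empty value of `field`, or None."""
    values = [record[field] for record in records if record.get(field)]
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


def build_listener_profile(records):
    """Pick the dominant value of every trait as a candidate listener setting."""
    return {field: most_common_value(records, field) for field in FIELDS}


if __name__ == "__main__":
    # Hypothetical observations, standing in for data parsed from a capture.
    observed = [
        {"port": 80, "user_agent": "Chrome on Windows",
         "server_header": "Microsoft-IIS", "uri_extension": ".aspx"},
        {"port": 80, "user_agent": "Chrome on Windows",
         "server_header": "Microsoft-IIS", "uri_extension": ".html"},
        {"port": 8080, "user_agent": "Safari on macOS",
         "server_header": "Apache", "uri_extension": ".aspx"},
    ]
    for trait, value in build_listener_profile(observed).items():
        print(f"{trait}: {value}")
```

The resulting dictionary would then have to be mapped onto whatever options the listener actually exposes. That mapping is left out here because it depends on the framework version.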

19:41

firstorder takes three arguments. One of them is the capture data, the pcap file, and the others are Empire's username and password. After that, it analyzes the packets and provides us a summary of the capture file, the most used elements, and it automatically generates an Empire listener. Since this Empire listener is fed by the most common

20:07

elements of the network, it will probably bypass the anomaly detection system located on the network, because we are using the normal traffic profile of the network to build the agent and C2 communication. But if you use the default configuration of

20:26

Empire, the anomaly-based system will probably catch it. This tool is available at this address, so feel free to use it. Submit issues if you find any bugs. And of

20:40

course, send some stars. So as a result, we can say that defense mechanisms are evolving into something smarter, something better. Maybe in the future signature-based approaches will be totally abandoned. And there is a, yeah. And there is a high

21:10

probability that machine learning based defense mechanisms. Something's going on under. Anyway. Yeah. And there is a high probability that machine learning based

21:23

defense mechanisms will be cheaper and more widespread than today. So attackers should also evolve in this way. We need to find smarter ways to mislead artificial intelligence. Thank you for listening. That's all I got.
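The Markov-based encoding step discussed in the talk can be illustrated with a toy encoder. This is a simplified sketch and not the algorithm used by the Markov Obfuscate tool: it assumes a word-level model in which each word maps to the sorted list of words that follow it in the training text, and it hides bits in the choice of the next word. The corpus, the function names and the two-byte length header are all assumptions made for illustration.

```python
from collections import defaultdict


def build_model(corpus):
    """Map every word to the sorted list of distinct words that follow it."""
    words = corpus.split()
    followers = defaultdict(set)
    # Wrap around so that the last word also has a follower.
    for current, following in zip(words, words[1:] + words[:1]):
        followers[current].add(following)
    return {word: sorted(nexts) for word, nexts in followers.items()}, words[0]


def _bits_per_choice(options):
    """floor(log2(len(options))): data bits that one word choice can carry."""
    return len(options).bit_length() - 1


def encode(data, model, start):
    """Encode bytes as a walk through the model; returns a string of words."""
    # Hypothetical framing: 2-byte big-endian length, then the data itself.
    framed = len(data).to_bytes(2, "big") + data
    bits = "".join(f"{byte:08b}" for byte in framed)
    state, words, pos = start, [], 0
    while pos < len(bits):
        options = model[state]
        k = _bits_per_choice(options)
        if k:
            chunk = bits[pos:pos + k].ljust(k, "0")  # zero-pad the final chunk
            state = options[int(chunk, 2)]
            pos += k
        else:
            state = options[0]  # single follower: forced move, carries no data
        words.append(state)
        if len(words) > 64 * len(bits):
            raise ValueError("training text has too little branching")
    return " ".join(words)


def decode(text, model, start):
    """Replay the walk and recover the bits hidden in each word choice."""
    state, bits = start, []
    for word in text.split():
        options = model[state]
        k = _bits_per_choice(options)
        if k:
            bits.append(format(options.index(word), f"0{k}b"))
        state = word
    stream = "".join(bits)
    length = int(stream[:16], 2)
    body = stream[16:16 + 8 * length]
    return bytes(int(body[i:i + 8], 2) for i in range(0, len(body), 8))


if __name__ == "__main__":
    corpus = ("the cat sat on the mat and the dog sat on the rug while "
              "the cat saw the dog and the dog saw the cat on the mat")
    model, start = build_model(corpus)
    cover_text = encode(b"id=7", model, start)
    print(cover_text)
    print(decode(cover_text, model, start))
```

With a tiny corpus like this each word carries at most two bits, so the cover text is much longer than the data. A larger training text with more branching would carry more bits per word, which is one reason the amount of training data matters.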