Red Team Village - Autonomous Security Analysis & Penetration testing (ASAP)
Formal Metadata
Number of Parts | 374 | |
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/49159 (DOI) | |
DEF CON Safe Mode 80 / 374
Transcript: English (auto-generated)
00:00
Thanks everyone for attending my talk. For those of you who don't know me, I am a PhD candidate at Arizona State University. I specialize in the use of artificial intelligence and machine learning in the field of cybersecurity. I'm also a security consultant at Bishop Fox.
00:23
I also authored a book known as Software Defined Virtual Network Security, dealing with security in software-defined systems. I'm also a co-founder of DevilSec, which is the hacking club that we have at ASU. I worked for Blackberry Public Services
00:41
and Computer Sciences Corporation in the past, and this is my contact information. So let's dive into the overview of the talk. What is the motivation? Then, what is the overview of our system, ASAP? There are three main modules in the system.
01:03
Stinger, which is used for the discovery of information, both services and vulnerabilities, in the network. Americano, which is used for analysis of the different paths an attacker can take in a network. And Cappuccino, which is our AI-based
01:22
autonomous attack plan generator. And finally, we do the validation of these attack plans, and we will jump into the demo at the end of the presentation. So let's see what machine learning is.
01:41
It's a statistical way of learning from the information present in your network, what kind of network traffic you have, the logs on your system. So using some pattern recognition to identify some attack patterns in a network, we use machine learning techniques for that,
02:02
and it's already been used successfully in things like spam detection on your email. You don't get those Nigerian prince emails these days because of the amazing job done by spam detection systems. Artificial intelligence, on the other hand, perceives the network traffic,
02:25
what kind of activities are going on within the environment, and uses that information to take some decisions. So it basically acts on the information provided by different agents in the environment. You can think of a smart IDS
02:44
that we can design, which basically collects information from different parts of the system, updates its beliefs, and then takes some intrusion prevention measure. So that is something that we can take the help
03:02
of AI in designing. AI and machine learning have found some useful applications in cyber security, both in industry and research. We use the attack patterns for detection to basically identify malicious actors within our network.
03:23
And we try to see if the attacks are very stealthy in nature. There are attacks like advanced persistent threats, which are basically low-and-slow attacks that are hard to detect. They are carried out over multiple days,
03:41
and a good example is the Sony hack, which was carried out over a period of several months. So identifying some valuable patterns from those kinds of attacks is one place where we can definitely utilize machine learning. And there are recent investigations
04:05
into the use of AI to design deception-based systems. Moving target defense and cyber deception in general are two fields of research that explore how to identify the attack patterns
04:23
in a network and use that information to present a fake view of the network to an attacker and lure him into honeypots, where defenders can do further analysis of his attack intentions.
04:40
So why do we need AI and machine learning in cybersecurity? I did some background research, and it's estimated that there will be 25 billion IoT devices in the US by 2021. And the investment in cybersecurity will be up to a trillion dollars.
05:03
With penetration testing, if we look at it, the market size would be 3.2 billion US dollars. So the number of devices is growing, perhaps at a quadratic scale, but we have a shortage in the cybersecurity workforce.
05:23
It's estimated that 65% of organizations feel that their staff is not very well equipped in cybersecurity. And 36% of organizations reported that there is a lack of training
05:42
or skills in existing cybersecurity workforce. So that is where we plan to use AI to kind of bridge this gap. So if we look at a very practical example
06:02
of application of artificial intelligence, DARPA Cyber Grand Challenge was one place where AI was successfully applied. There were seven participating schools, seven or eight who took part in a hacking competition
06:21
where each school was trying to target the infrastructure of everybody else while keeping its own infrastructure secure. The important catch was that all of these participants were autonomous systems.
06:41
And there was no human involved in this competition. So Mayhem, which came from a company based out of CMU, automated what white hat hackers could do. It found and exploited the weaknesses present
07:00
in these systems. What they did was create a mathematical model of the paths that an attacker can take. And then they used two techniques, symbolic execution and fuzzing. Symbolic execution was the way to point out interesting code paths.
07:22
And fuzzing was kind of a hammer, which was hammering through those code paths to exploit the vulnerabilities pointed out by symbolic execution. They won this championship, and they also managed to find 14,000 vulnerabilities
07:42
on Debian systems. And 250 of these vulnerabilities were new. So imagine a human attacker trying to do all this. It's kind of difficult. And this is a strong motivation to use machine learning and AI
08:01
in the field of cyber security. So if you look at our system ASAP, there are four main modules. Stinger, where S stands for scanning. We use Stinger for scanning and recon. The information from Stinger is fed into Americano.
08:22
A stands for attack analysis. We use this information from Americano to identify the attack states in the network. Latte, where L stands for log, is a module which analyzes network
08:42
and host logs to gather the threat evidence. And Cappuccino, which is kind of the network controller, takes all this information from Americano and Latte and encodes it in the form of an AI model, a Markov decision process.
09:05
And based on that model, it identifies some attack plans. Like if a penetration tester were to test or attack this network, what kind of plan would yield him maximum output?
09:21
And eventually we can execute these attack plans on a cloud or web application and update the risk score and attack graph, which is basically Americano. So I am addicted to caffeine.
09:40
That is why I chose the names of these modules based on different kinds of coffee. So let's dive deeper into Stinger. Stinger basically scans the network topology for service information and discovers the vulnerabilities. We have automated the Nessus and OpenVAS APIs
10:03
to identify this attack information. And this attack information is then fed into Americano, which is the attack graph generation tool. So let's look at one of the known vulnerabilities.
10:21
This is the Shellshock vulnerability, and there are different parameters from the Common Vulnerability Scoring System (CVSS) for this particular vulnerability. You need just network access to execute this attack, and it has a low access complexity. So you don't need to make a lot of investment
10:43
as an attacker to exploit this vulnerability. So this is an example of the kind of vulnerability where we can implement some sort of automation once we are able to identify it.
11:01
And the reason for providing this information is that we will see later how we can use these CVSS parameters, like access complexity and possibly the CVSS score, to encode the information in our AI solver.
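As a rough illustration of how CVSS parameters like these could be read programmatically, the following Python sketch parses a CVSS v2 vector string. The vector and base score used for Shellshock (CVE-2014-6271) are the commonly published NVD values and are an assumption here, not something stated in the talk; the class and function names are hypothetical and not part of ASAP.

```python
from dataclasses import dataclass

# Human-readable names for two CVSS v2 metrics.
ACCESS_VECTOR = {"L": "local", "A": "adjacent network", "N": "network"}
ACCESS_COMPLEXITY = {"H": "high", "M": "medium", "L": "low"}


@dataclass(frozen=True)
class CvssV2:
    access_vector: str
    access_complexity: str
    base_score: float


def parse_cvss_v2(vector: str, base_score: float) -> CvssV2:
    """Parse a CVSS v2 vector such as 'AV:N/AC:L/Au:N/C:C/I:C/A:C'."""
    metrics = dict(part.split(":", 1) for part in vector.split("/"))
    return CvssV2(
        access_vector=ACCESS_VECTOR[metrics["AV"]],
        access_complexity=ACCESS_COMPLEXITY[metrics["AC"]],
        base_score=base_score,
    )


# Assumed values for Shellshock (CVE-2014-6271); not taken from the talk.
shellshock = parse_cvss_v2("AV:N/AC:L/Au:N/C:C/I:C/A:C", 10.0)
print(shellshock)
```

The access complexity and base score extracted this way are the two fields the talk later feeds into the AI solver.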
11:21
These are some other parameters: the impact on confidentiality, integrity and availability is very high, and an attacker can take full control of the system if he were to exploit this particular vulnerability. So let's take a look at a motivating example
11:41
where the attacker is located on the internet and his goal is to reach this database server and exfiltrate information out of the database server to his command and control center. And there are some publicly known vulnerabilities on these machines.
12:04
So basically the attacker is trying to exfiltrate information from the database server, but there is a firewall in his way, so he cannot directly access the database server. He either needs to go through the web server or wait for an internal user
12:23
to download some of his malicious code and use that as a pivot to go to the database server. And the web server is the only publicly available service in this network. So the attacker can try to exploit the web server
12:44
using a known vulnerability, or he can host a malicious script on a popular website that a user downloads, and that way he can gain access to the user's workstation. And using this, he can then take advantage
13:01
of the access control list, which basically allows any network traffic from the web server to go to the database server, or any workstation traffic to go to the database server. That way the attacker can exploit the SQL injection vulnerability that is present on the database server
13:23
and then use it to gain persistent access to his command and control center. So you will see that there are two attack paths in this small network to achieve the same goal.
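The two attack paths in this example can be enumerated mechanically. The Python sketch below is only an illustration under assumed names: the host labels and the reachability table are a hand-written reading of the example network (the internet-to-workstation edge stands for a user downloading the malicious script), and the depth-first search is a generic technique, not the method ASAP uses.

```python
from typing import Dict, List

# Who can reach whom in the example network (hypothetical host labels).
REACHABLE: Dict[str, List[str]] = {
    "internet": ["web_server", "workstation"],
    "web_server": ["db_server"],
    "workstation": ["db_server"],
    "db_server": [],
}


def attack_paths(graph: Dict[str, List[str]], source: str, goal: str,
                 path: List[str] = None) -> List[List[str]]:
    """Return every loop-free path from source to goal (depth-first search)."""
    path = (path or []) + [source]
    if source == goal:
        return [path]
    found: List[List[str]] = []
    for nxt in graph.get(source, []):
        if nxt not in path:  # never revisit a host on the same path
            found.extend(attack_paths(graph, nxt, goal, path))
    return found


paths = attack_paths(REACHABLE, "internet", "db_server")
for p in paths:
    print(" -> ".join(p))
```

On a network this small the enumeration is trivial; the number of simple paths can grow very quickly with the number of hosts, which is part of the motivation for the automation discussed next.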
13:42
So imagine a very giant network with tens of thousands of instances and you are asked to perform a penetration test for that network in a limited period of time. So you need some kind of autonomy or automation
14:01
in that particular case to be able to have a good coverage in your penetration test. So we saw this example, but what about it? Like how do we basically use this information? We can do some initial attack analysis
14:21
based on this example and see that attack is multi-stage and attacker had specific attack vectors for this vulnerability and he went through multiple hosts and he circumvented some of the defenses that were present on the systems to achieve his goal of data exfiltration.
14:44
So let's discuss on kind of a philosophical level why AI can be used to hack faster. So imagine you are going home on a particular day
15:00
and you decide to take a turn on Arizona Avenue and Main Street and you have been taking this route forever to reach your home, but you encounter a traffic jam on the way.
15:22
So you went by your intuition and this got you into a traffic jam. But if you had a GPS to help you navigate, you could have avoided that jam. So similarly as penetration testers when we try to go after certain vulnerabilities,
15:40
we have kind of a preset methodology. So we will go through some authentication issues, authorization issues. We will see if we can use the user management in some way. We can try to see if we can get horizontal privilege escalation.
16:03
We go after data stores. We go after application logic. If it involves the code review, we go through the procedure of code review and use all of that to see like, what's the maximum we can get in this penetration test.
16:25
But here is the challenge. Say you have 20 hours for a particular assessment, in an environment where you have to pen test the application
16:41
as well as the cloud part of the backend components. It's very challenging to get good coverage in that scenario. So AI and machine learning can act as a kind of navigator for us on these assessments.
17:06
So we can think of ASAP as a kind of AI-based GPS to navigate the attack surface. It may not work on all kinds of unknown vulnerabilities,
17:21
like, say, data encryption issues that you have to identify as a vulnerability yourself, but it can help us in semi-automating some of the tasks that we may miss out. So the worst thing would be that there is a very low complexity vulnerability
17:41
that was present on the system, but you just ran out of time on your pen test and you couldn't exploit that vulnerability and later the client finds out, hey, why did you miss it? So then you are in a tough situation. That is another kind of motivation to develop this kind of a system.
18:04
So with Americano, we get the information from Stinger and we pass these vulnerabilities and the software configuration to a first-order-logic-based framework.
18:20
And that framework basically generates a multi-stage, multi-hop attack graph. An attack graph basically shows the different paths an attacker can take in a network to reach his desired goal. So if you look at the definition of attack graph,
18:41
we have some nodes and edges which are property of a given graph. There are some fact nodes NF. Fact nodes will be something like the existence of vulnerability or the existence of network connections. And conjunct node are denoted by NC.
19:04
The disjunct node are denoted by ND and root node, which is basically goal of attack is denoted by NR. So a conjunct node can be something that you can achieve based on your initial exploitation of a certain vulnerability.
19:24
So you have some fact nodes that you combine with these interaction rules that we provide in first-order logic to achieve some other conjunct nodes like execCode. So suppose there is a buffer overflow vulnerability
19:41
on web server and the attacker can access the web server. So if attacker is located on internet, then that can lead to execution of code on web server. And based on that example, the root node in our case would be to gain a root privilege on database server.
20:04
So there are two kinds of edges. E pre denotes the precondition edge and E post denotes the post condition edge. So a precondition edge basically combines fact nodes and conjunct node to show that the next possible edge
20:26
is a possible state that an attacker can achieve. And E post means the edges that are triggered if some preconditions are satisfied. And we have some base initial condition nodes
20:40
in this attack graph that we can denote using NI. So to simplify this, we have some advisories that we identify based on the scanning of the network. We have host configuration information, we have network configuration information.
21:03
The principals indicate who has ownership of which machine, and we use interaction rules and policies to provide input to this attack-graph-based reasoning engine, which then generates the attack graph for us.
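The attack graph definition above (fact, conjunct, disjunct and root nodes, plus precondition and postcondition edges) can be sketched as a small data structure. This is a minimal Python illustration under assumptions: the class names, the `enabled` helper and the node labels are hypothetical, and the single postcondition edge to the goal skips the intermediate steps of the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set, Tuple


class NodeKind(Enum):
    FACT = "NF"      # e.g. a vulnerability exists, a network connection exists
    CONJUNCT = "NC"  # reached by combining facts through an interaction rule
    DISJUNCT = "ND"
    ROOT = "NR"      # the goal of the attack


@dataclass(frozen=True)
class Node:
    name: str
    kind: NodeKind


@dataclass
class AttackGraph:
    nodes: Set[Node] = field(default_factory=set)
    e_pre: List[Tuple[Node, Node]] = field(default_factory=list)   # precondition edges
    e_post: List[Tuple[Node, Node]] = field(default_factory=list)  # postcondition edges

    def add_pre(self, src: Node, dst: Node) -> None:
        self.nodes.update((src, dst))
        self.e_pre.append((src, dst))

    def add_post(self, src: Node, dst: Node) -> None:
        self.nodes.update((src, dst))
        self.e_post.append((src, dst))

    def enabled(self, node: Node, satisfied: Set[Node]) -> bool:
        """True if every precondition edge into `node` starts at a satisfied node."""
        pre = [src for src, dst in self.e_pre if dst == node]
        return bool(pre) and all(src in satisfied for src in pre)


# Hypothetical labels for the buffer overflow example.
vuln = Node("vulExists(webServer, bufferOverflow)", NodeKind.FACT)
access = Node("netAccess(internet, webServer)", NodeKind.FACT)
exec_code = Node("execCode(webServer)", NodeKind.CONJUNCT)
goal = Node("rootPrivilege(dbServer)", NodeKind.ROOT)

graph = AttackGraph()
graph.add_pre(vuln, exec_code)
graph.add_pre(access, exec_code)
graph.add_post(exec_code, goal)  # intermediate steps omitted

print(graph.enabled(exec_code, {vuln, access}))
```

Keeping the two edge lists separate mirrors the distinction the talk draws between conditions that must hold and results that are triggered once they do.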
21:27
So before going any further, let's look at some of these MulVAL rules. MulVAL is basically a reasoning system
21:43
which encodes this information, and it's a work by University of Kansas, which we used in our development of the ASAP system. So advisories show what kind of vulnerabilities exist in the network and their vulnerability properties,
22:01
host configuration shows that the web server has Apache software, it's running on port 80 and this is the daemon. Network configuration is basically the access control list, which says that from internet to web server, there is a TCP connection
22:21
that can be established on port 80. And principals show that a user has a user account on this PC and there is another system admin, which is kind of a root-level account on the web server. So all this information we can obtain
22:42
by scanning the network and by obtaining the host configuration information, the network rules. And then they go through these first order logic rules. So this is one of the rules, which is that if there is a vulnerability existing on host with vulnerability ID
23:02
and the vulnerability has a property that it's a remote exploit and there is a network service corresponding to this host and the attacker has a network access on this host and attacker is malicious. Then this will lead to a code execution.
23:23
So these are basically the predicates of this rule, and execCode by the attacker on the host, the gaining of privilege, is basically the postcondition that is obtained when all these preconditions are satisfied.
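The rule just described can be imitated in a few lines of Python. This is a sketch under assumptions, not MulVAL syntax: facts are plain tuples, the predicate names only loosely follow the ones read out in the talk, and `VULN-1` is a placeholder vulnerability ID.

```python
from typing import Set, Tuple

Fact = Tuple[str, ...]

# Hypothetical input facts gathered from scanning and configuration.
FACTS: Set[Fact] = {
    ("vulExists", "webServer", "VULN-1", "httpd"),
    ("vulProperty", "VULN-1", "remoteExploit"),
    ("networkService", "webServer", "httpd", "tcp", "80"),
    ("netAccess", "attacker", "webServer", "tcp", "80"),
    ("malicious", "attacker"),
}


def derive_exec_code(facts: Set[Fact]) -> Set[Fact]:
    """Apply the remote-exploit rule once and return the derived execCode facts."""
    derived: Set[Fact] = set()
    for fact in facts:
        if fact[0] != "vulExists":
            continue
        _, host, vuln_id, program = fact
        if ("vulProperty", vuln_id, "remoteExploit") not in facts:
            continue
        for service in facts:
            # The vulnerable program must be exposed as a network service on the host.
            if service[0] != "networkService" or service[1:3] != (host, program):
                continue
            protocol, port = service[3], service[4]
            for principal in facts:
                if principal[0] != "malicious":
                    continue
                who = principal[1]
                if ("netAccess", who, host, protocol, port) in facts:
                    derived.add(("execCode", who, host))
    return derived


print(derive_exec_code(FACTS))
```

A real reasoning engine would keep applying rules until no new facts appear; this sketch applies a single rule once to show how the preconditions combine into the postcondition.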
23:41
And we also use the policies which showed the user access on different resources in the system to encode into these interaction rules. So this will be a logical attack graph of our system that we saw.
24:03
So you can think of attacker located on internet as this node zero. Then the node zero interacts with different nodes and these ovals represent the rules. So like one of this oval will be interaction rule.
24:22
And based on that, the attacker progresses to the next privilege node. So we can think of this as a root exploit on, say, the Apache web server; then the attacker probably gains some other network-level access using another postcondition of the attack graph
24:42
and eventually reaches his goal of gaining root access on a database server by exploiting SQL injection. So the main brain or AI in this work is Cappuccino.
25:00
So what Cappuccino does is it takes the information from the attack graph, information about different configurations and vulnerabilities from the CVE-Search database, and log information from the Latte module to create an MDP graph.
25:22
So MDP graph can then be used to derive attack plan that we as penetration testers will implement on the network. So let's see how the states can be extracted from attack graph. So there were fact nodes which shows
25:41
that attacker was at internet initially and the next privilege node that he gained was a network access on say another machine FTP. So there are two things that attacker can do when he is located on internet. Either he can take no action
26:00
or he can exploit this vulnerability. Let's say the CVE ID of this vulnerability is CVE-2013-4124. So these will be the states that we can extract from the attack graph to be used in our Markov game
26:25
or our Markov decision process. So let's revisit the attack graph with another example, where there are two paths that the attacker can take
26:41
to be able to exploit this FTP machine. One way he can go about it is that he goes through SSH and then tries to exploit FTP; the other way is that he first exploits this web server and then goes to FTP.
27:00
So the corresponding attack graph for this network will be that the attacker has SSH access. There is an SSH vulnerability, he exploits SSH, he gains root on SSH. Or the attacker can also go through the exploitation of the web server. So basically he goes through the web server
27:24
to exploit some vulnerability on the web server and then he reaches the FTP server. So in this graph, there are basically two paths that the attacker can take. So if we are to obtain a Markov decision process from this,
27:40
basically an MDP has four components: state, action, transition and reward. A state represents the access that an attacker can obtain at any point in the network. So he can be a user on SSH. If he exploits the vulnerability, he can be a root user on SSH.
28:01
If he exploits the web server, he can obtain a root on the web server and eventually he can also obtain a root on FTP. There are two actions that we use to simplify this Markov decision process. So in each state, attacker can either choose
28:21
to take no action or he can choose to exploit the next vulnerability. And there are some probability values that are associated with these actions. We will explain how we derive meaningful probability values for the MDP.
28:43
And we kind of relate these to the access complexity of the vulnerability. So if the access complexity is low, then probably it's easier to exploit that vulnerability and there is a higher probability of transitioning to the next state.
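The link between access complexity and transition probability can be written down directly. The sketch below uses the values quoted in this talk for the transition matrix (0.9 for low, 0.6 for medium, 0.2 for high access complexity); the function name and the assumption that a failed exploit leaves the attacker in the same state are illustrative choices.

```python
from typing import Dict

# Probability that an exploit succeeds, keyed by CVSS access complexity.
SUCCESS_PROBABILITY: Dict[str, float] = {"low": 0.9, "medium": 0.6, "high": 0.2}


def exploit_transition(access_complexity: str) -> Dict[str, float]:
    """Return the outcome distribution for the 'exploit' action in one state.

    Assumes a failed exploit leaves the attacker in the current state.
    """
    p = SUCCESS_PROBABILITY[access_complexity]
    return {"next_state": p, "same_state": 1.0 - p}


for level in ("low", "medium", "high"):
    print(level, exploit_transition(level))
```

Each returned distribution corresponds to one row of the transition matrix for the exploit action.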
29:02
And the rewards are the values attacker obtains by being in a particular state. So say attacker does not want to exploit that vulnerability. So he will have kind of a low reward. The reward is that basically he is not detected by say an intrusion detection system.
29:22
So that's kind of a reward for him, but it's not very big reward. As compared to if he is able to obtain a root account on one of the external services, that's a high positive reward. So we put this value like a plus five and we use the CVSS score of the vulnerabilities
29:43
to derive the reward. So the reward can at any point be between zero and 10. And if there is uncertainty in the attacker's actions, this can be considered a partially observable Markov decision process.
30:02
But in this work, we are using the simple Markov decision process to show how we encode this attack information. And when we solve this Markov decision process using a value iteration solver,
30:21
we obtain some policies. So policies are the different paths that the attacker can take to reach his goals, together with the reward for each path. So value iteration tries to maximize the value that an attacker can gain
30:43
by following a particular policy. So there are two policies: he can either exploit SSH, then exploit FTP, or he can exploit SSH, then go to the web server and then exploit FTP. So obviously the reward in the second policy
31:02
is higher compared to the first policy. But this is for a simplified network, where we can hand-encode these values and solve this Markov decision process. As a penetration tester, if we are dealing with a very large network, we would want to encode this information
31:21
using some of the MDP solvers. So as I mentioned, the states are the privilege levels of the attacker. The values in the transition matrix are basically the probabilities of transitioning
31:41
from one state to the next state using an action. So suppose S0 is the state of user access on SSH and S1 is the state of root access on SSH. And the access complexity of this vulnerability
32:02
is low. That means that there is a high probability of transitioning to state S1. So we encode the value 0.9 for low access complexity vulnerabilities, 0.6 for medium, and 0.2 for high access complexity,
32:25
because the vulnerabilities for which access complexity is high are obviously difficult to exploit. So there is a good chance that the attacker will stay in state S0 if he tries to take an action. And the reward value basically for transitioning
32:44
to state S1: if the CVSS score of the vulnerability is 6.4, that will be the reward that the attacker gains by transitioning to that state. So as we discussed, the states represent the current privilege level
33:03
of the attacker and the actions are the actions that the attacker will take. And the transition probabilities are the values we derive from the CVSS access complexity of each vulnerability.
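The mapping from access complexity to transition probability described above can be sketched as a small lookup. The values 0.9, 0.6 and 0.2 are the ones quoted in the talk; the function name and the upper-case level labels are assumptions made for illustration, not names taken from the ASAP code.

```python
# Transition probabilities quoted in the talk, keyed by CVSS access complexity.
# The key spelling ("LOW", "MEDIUM", "HIGH") is an assumption for this sketch.
ACCESS_COMPLEXITY_TO_PROBABILITY = {"LOW": 0.9, "MEDIUM": 0.6, "HIGH": 0.2}


def transition_probability(access_complexity):
    """Return the probability that an exploit attempt moves to the next state.

    Hypothetical helper: normalizes the label and looks it up in the table.
    """
    key = access_complexity.strip().upper()
    try:
        return ACCESS_COMPLEXITY_TO_PROBABILITY[key]
    except KeyError:
        raise ValueError("unknown access complexity: %r" % access_complexity)


if __name__ == "__main__":
    for level in ("low", "medium", "high"):
        print(level, transition_probability(level))
```

A lower access complexity gives a higher success probability, so the attacker is more likely to leave the current state when the exploit action is taken.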
33:21
And we encode this information for the action exploit SSH in the form of a transition matrix. So by taking the action exploit SSH, there is a 0.9 probability that the attacker will go from state S0 to S1.
33:42
So let's consider the rows zero, one and two and columns zero, one and two. So the S0, S1 entry, which is row zero, column one, will show that there is a 0.9 probability that the attacker will transition from S0 to S1
34:01
by taking that action. But that action has no effect on other states. So if he was in state S1, then exploit SSH doesn't do a whole lot for going to state S2. So the attacker will remain in state S1 if he takes that action.
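As a minimal sketch, the transition matrix for the exploit SSH action over the three states S0, S1 and S2 could be built as below. The 0.9 entry comes from the talk; the 0.1 probability of staying in S0 is an assumption made so that each row sums to one, and the helper name is hypothetical.

```python
def exploit_transition_matrix(n_states, source, target, p_success):
    """Build a row-stochastic matrix for a single exploit action.

    Hypothetical helper: every state maps to itself with probability 1,
    except `source`, which moves to `target` with probability `p_success`
    and otherwise stays where it is.
    """
    matrix = [[1.0 if row == col else 0.0 for col in range(n_states)]
              for row in range(n_states)]
    matrix[source][source] = 1.0 - p_success
    matrix[source][target] = p_success
    return matrix


# Row and column indices 0, 1, 2 stand for S0 (user on SSH),
# S1 (root on SSH) and S2 (root on FTP).
EXPLOIT_SSH = exploit_transition_matrix(n_states=3, source=0, target=1,
                                        p_success=0.9)

if __name__ == "__main__":
    for row in EXPLOIT_SSH:
        print(row)
```

Rows 1 and 2 are identity rows, which encodes the point that exploit SSH leaves an attacker who is already in S1 or S2 where he is.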
34:24
So there is a high probability of staying in state S1 or S2 if he takes the action exploit SSH in S1 or S2. And the rewards, as we discussed, are the values he obtains
34:42
from the CVSS scores of those vulnerabilities. So if he takes action A in state S, it maps to a real number between zero and 10. So as we discussed, taking no action will have a low reward and exploiting SSH will have a positive reward,
35:03
for example, if the CVSS score for this vulnerability was 6.4, that will be the reward of being in state S1. And similarly, the reward of being in state S2 will correspond to the FTP vulnerability.
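One way to write this reward mapping down is a table with one row per state and one column per action. The 6.4 score is the example from the talk; the 7.5 score for the FTP vulnerability and the 0.1 reward for taking no action are placeholder assumptions, as are all names in the sketch. Here the exploit column holds the CVSS score of the vulnerability that leads out of each state, which is one possible reading of the description above.

```python
NO_ACTION, EXPLOIT = 0, 1

# Assumption: a small positive reward for staying undetected by doing nothing.
IDLE_REWARD = 0.1


def build_reward_table(cvss_scores):
    """Return rewards[state][action] from a list of CVSS base scores.

    Hypothetical helper: cvss_scores[i] is the score of the vulnerability
    that can be exploited from state i. Scores must lie in [0, 10].
    """
    table = []
    for score in cvss_scores:
        if not 0.0 <= score <= 10.0:
            raise ValueError("CVSS score out of range: %r" % score)
        table.append([IDLE_REWARD, float(score)])
    return table


# 6.4 is the SSH example from the talk; 7.5 is a made-up FTP score.
REWARDS = build_reward_table([6.4, 7.5])

if __name__ == "__main__":
    for state, row in enumerate(REWARDS):
        print("S%d" % state, row)
```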
35:20
So bringing it all together, this is how we arrive at a Markov decision process from the attack graph. So this is the algorithm that I designed. Basically, in step one you parse the attack graph to get the nodes which show the privilege of the attacker. You find the CVE IDs of the vulnerabilities
35:41
that lead to this exploitation in step two. In step three, you fetch the CVSS scores of the vulnerabilities from the attack graph. In step four A, you use the CVSS scores to create a reward matrix,
36:01
which shows the action and state transition mapping to a real value. So basically it will be, say, a column matrix for the different CVSS scores. In step four B, basically you use the access complexity to derive the transition matrix.
36:23
And you provide all this information to an MDP solver: the reward matrix, the transition matrix, the different states and the actions. And we use the PyMDP solver as a solver
36:41
with the value iteration function to generate our attack plan, which we then provide to the pymetasploit module, which executes the attack plan and shows how fast the attack plan that we obtained is.
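To make the solving step concrete, here is a minimal pure-Python value iteration over the three-state example. It is a stand-in for the solver library used in the talk, not its API. The discount factor, the 0.6 success probability and 7.5 score for the FTP exploit, and the 0.1 reward for doing nothing are assumptions chosen for illustration.

```python
def value_iteration(transitions, rewards, discount=0.9, tolerance=1e-6):
    """Solve a small MDP by value iteration.

    transitions[a][s][t] is the probability of moving from s to t under
    action a; rewards[s][a] is the immediate reward. Returns (policy, values).
    """
    n_states = len(rewards)
    n_actions = len(transitions)
    values = [0.0] * n_states
    while True:
        # Expected return of every (state, action) pair under current values.
        q = [[rewards[s][a] + discount * sum(transitions[a][s][t] * values[t]
                                             for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        new_values = [max(row) for row in q]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tolerance:
            policy = [row.index(max(row)) for row in q]
            return policy, values


# Action 0 = no action (stay put); action 1 = exploit the next vulnerability.
# States: S0 user on SSH, S1 root on SSH, S2 root on FTP (absorbing).
TRANSITIONS = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.1, 0.9, 0.0], [0.0, 0.4, 0.6], [0.0, 0.0, 1.0]],
]
# rewards[state] = [reward for no action, reward for exploit].
REWARDS = [[0.1, 6.4], [0.1, 7.5], [0.1, 0.0]]

POLICY, VALUES = value_iteration(TRANSITIONS, REWARDS)

if __name__ == "__main__":
    print("policy:", POLICY)
    print("values:", [round(v, 3) for v in VALUES])
```

With these numbers the resulting policy is to exploit in S0 and S1 and to take no action once root on FTP has been reached.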
37:01
So for the validation of the attack plan, say the attack plan says first exploit SSH and then FTP: we have MSFRPC, which is a daemon of Metasploit, running, and basically it uses Python scripts to execute different plans
37:20
and tells us the cost of running different attack plans, that is, how much time each attack plan took. So for MSFRPC, you will need to have an MSFRPC session running in one window, and then in another window,
37:42
you will execute your attack plan and see how it goes. So let's look at the demo. So we'll first go here and perform a port scan
38:07
to identify the services that are running on the network and the software versions of those services. It will take some time.
38:20
So in the meantime, we will go and check out the vulnerability scan that we have for our network. Basically, we have Nessus, and we can see this scan found
38:45
a lot of vulnerabilities on a machine that we set up. There were about 71 vulnerabilities, which were a mixture of critical,
39:01
high and low severity. So our port scan revealed some information that we basically need from the Stinger module. We have the CVE-Search API, basically,
39:22
which provides us information on different vulnerabilities. If you provide the CVE ID, it tells us the access complexity and CVSS score, which we will later use in Cappuccino.
39:42
We have the Nessus scanning APIs, which can be used for connecting to the Nessus backend on the provided port and basically checking the state of the network scan,
40:06
creating policies and seeing whether the scan is running or not. We can also export the scan using these APIs. One thing is that in the latest version of Nessus, I think they have disabled some of these APIs,
40:21
but if you are using an old version of Nessus, you can use these APIs. So the scan that we obtained from Nessus needs to be provided as input to the attack graph generator, which is our Americano module.
40:44
And the Nessus scan file we obtained is this ASAP underscore something dot Nessus. This will serve as input to the attack graph.
41:06
So MulVAL is a tool which uses this information. So you can basically translate the .nessus file using the Nessus vulnerability translate script.
41:25
And this will generate a list of interaction rules, and another script, graphgen.sh, can be used for obtaining the attack graph from this translated information. So I already have these files.
41:42
So you will need to set up Nessus and XSB if you want to use the Americano module, and I have the details in the README file. So we run the MulVAL module and the attack graph that we obtained,
42:02
it has some nodes and edges, which show information about the network services, the different vulnerabilities that are present on this network, and the interaction rules. If we check the visual representation,
42:23
it will look something like this: the attacker was initially on the internet. Then he used some access control list rules, which allowed access to a particular port, to get direct access to another machine in the network.
42:43
So he gains network access, then he uses some other vulnerability to mount a remote exploit on one of the server programs. So this information is represented using a dot file, and we can obviously use some D3 libraries
43:03
to improve the visualization, but this is what the attack graph will look like. So now we use this attack graph and we provide this information to the MDP solver,
43:24
which is present in our Cappuccino module. So the attack graph parser basically parses the attack graph,
43:42
obtains the information about the access complexity from CVE-Search, checks the predecessor and successor nodes of the network, and obtains the attributes that I described in the state condition.
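The parsing step can be sketched roughly as follows, assuming the attack graph has already been read into a list of (source, target, label) edges. The edge list, node numbers and CVE identifiers here are invented placeholders, and the function is a hypothetical illustration rather than the parser in the Cappuccino module.

```python
import re
from collections import defaultdict

# Matches identifiers of the form CVE-YYYY-NNNN (four or more trailing digits).
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def index_attack_graph(edges):
    """Collect successors, predecessors and per-edge CVE IDs.

    `edges` is an iterable of (source, target, label) tuples. Edges whose
    label contains no CVE identifier are kept in the adjacency maps only.
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    edge_cves = {}
    for source, target, label in edges:
        successors[source].append(target)
        predecessors[target].append(source)
        match = CVE_PATTERN.search(label)
        if match:
            edge_cves[(source, target)] = match.group(0)
    return dict(successors), dict(predecessors), edge_cves


# Placeholder graph: node numbers and CVE IDs are made up for the example.
EDGES = [
    (1, 2, "remote exploit CVE-2099-0001"),
    (1, 3, "network access rule"),
    (3, 2, "remote exploit CVE-2099-0002"),
]

successors, predecessors, edge_cves = index_attack_graph(EDGES)

if __name__ == "__main__":
    print(successors)
    print(predecessors)
    print(edge_cves)
```

Each CVE ID found on an edge would then be looked up to obtain the access complexity and CVSS score used for the transition and reward values.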
44:04
And basically it encodes this information in the form of a Markov decision process. So if you run this code,
44:21
basically the edge information will tell us, if you are transitioning between two nodes, so let's say between 1,007 and 994, that the edge is labeled CVE-2011-0411, and then we will use this information
44:41
to obtain the access complexity of that particular vulnerability from the attack graph and encode it in the form of a reward matrix. Then the value iteration algorithm will learn
45:01
the attack policy that is beneficial for the pen tester in this case. So I took a subset of the entire graph to run the value iteration algorithm, and it tells us which action to take, so zero represents no action
45:20
and one represents the exploit action. So in this state taking a particular action will be beneficial for the attacker. So it will take some time to basically run the value iteration solver and it will give you an attack plan. So we use the attack plan validator
45:50
to basically validate the plan of attack. So we use the attack plan to exploit FTP and another SSH vulnerability
46:02
and basically you can see we obtained two shells in this case and it took 0.35 seconds to complete. And basically using this attack plan validation, we can measure how much time it took
46:23
for our attack plan to complete. So basically for the attack plan validation to work, you should have the MSF RPC daemon running in one shell session, and you need to execute your attack plan
46:42
in another session. So in summary, the threat landscape is very complex and ever growing, and an autonomous pen testing solution, like the one that we have used in ASAP,
47:01
can help us navigate through this complex attack surface, and it is an effort towards the generation of autonomous attack plans and their validation. So Lata is another module that we still have in progress, and in future our plan is to use the log information
47:23
to validate these attack plans and possibly also work on some kind of report generation tool, which will help us generate a report on how the validation was performed for individual exploits.
47:43
So thank you everyone for attending my talk. The source code is posted on the GitHub repo, the link to which you can see. And if you want to contact me, here is my contact information. I thank the organizers of Red Team Village
48:03
for giving me a chance to speak at this year's DEF CON Red Team Village. Have fun, everyone.