Semantic Models for Network Intrusion Detection
Formal Metadata
Number of Parts | 12
License | CC Attribution - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/51313 (DOI)
Transcript: English (auto-generated)
00:00
So, a brief introduction also for Peter. Peter Bednar, with a PhD in Artificial Intelligence, is a senior researcher at the Faculty of Electrical Engineering and Informatics, Technical University of Košice, in Slovakia. His experience is related to knowledge management, text and data
00:23
mining, natural language processing and distributed software architectures for big data processing. So, Peter, whenever you want, you can share your work. Okay. Hello, everybody. I hope you can hear me. Yes, Peter. And I hope that you can see my screen.
00:43
Yes, yes. Okay, so perfect. So I will start. Today, I'm going to present our joint work with my colleagues from our university in Košice, Slovakia, and the title of the paper is Semantic Models for Network Intrusion Detection.
01:04
So basically, for intrusion detection systems, we can currently use two main approaches to implement such systems. The first one is the knowledge-oriented approach, where the critical part is the role of the domain
01:21
expert, some security expert, who is basically manually crafting some generalized rules which can then be used to implement such a system and to automatically detect intrusions. Of course, there are some options for how domain experts or security
01:45
experts can proceed. Usually, these involve, for example, some forensic techniques to check existing cases, but then it's really up to the expert to generalize the relations between
02:04
the indicators of various activities in the system and create the rules not only for the detection, but even the rules for how to actually change the environment and make this environment secure again. It's even possible to be more proactive and use
02:32
some penetration testing or ethical hacking to simulate attacks and from this, again,
02:41
the security expert can generalize the rules for how the particular environment will behave and how to actually secure this environment. But again, everything is inferred mostly by the security expert. And the second approach, which is currently more advanced, let's say, and it's
03:06
necessary because of the complexity of the current systems, is a data-oriented approach where we apply machine learning methods to look at the monitoring data and information from the environment and automatically use statistical
03:27
inference and machine learning methods to get the rules which will then be used for the intrusion detection. We can basically decompose this intrusion detection task into
03:46
machine learning tasks like classification and anomaly detection. Classification is the classical application of supervised learning, where at the beginning you have to have your input data, which are basically the logs from your environment,
04:06
and then you have to manually classify this behavior, whether there is an intrusion in the system or some security threat, or whether this is a normal operation. And in this way,
04:25
manually you can build a training dataset, and from this training dataset you can use machine learning methods to automatically extract some rules in the form of statistical classification models to implement automatic detection of intrusions. Of course, in this
04:47
way you can basically cover only already known security threats, but there is another option, for example, to use a different type of machine learning task like anomaly detection
05:01
where you are basically modeling the normal behavior of the system in a statistical way, and everything which is outside of this normal behavior is a potential security threat or intrusion. In this way, you can build a model which can basically detect even new types of attacks
05:27
but on the other hand it's more difficult to interpret the results of the anomaly detection and you have more false alarms than in the previous classification case. But in both cases,
05:45
there are many difficulties related to the application of this data-oriented approach, and most of them are really fundamental from the point of view of the problem, because most of the data you will collect will be related to the normal operation and
06:08
you will have datasets heavily unbalanced towards the normal operation, with only a few cases of security threats. So, this is always a problem for the statistical methods
06:26
to actually infer reliable classifiers and effective detection rules from such an unbalanced dataset.
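Editor's illustration: the imbalance problem described above can be seen in a few lines of Python. This is only a sketch with hypothetical label counts (990 normal connections, 10 attacks), not data from the talk. A trivial classifier that always predicts "normal" reaches high accuracy while detecting no attack at all.

```python
# Hypothetical label counts, chosen only for illustration.
labels = ["normal"] * 990 + ["attack"] * 10

# A trivial "classifier" that always predicts the majority class.
predictions = ["normal"] * len(labels)

# Accuracy: share of all connections labelled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the attack class: share of real attacks that were detected.
true_positives = sum(
    p == "attack" and y == "attack" for p, y in zip(predictions, labels)
)
actual_attacks = sum(y == "attack" for y in labels)
attack_recall = true_positives / actual_attacks

print(f"accuracy={accuracy:.2f}, attack recall={attack_recall:.2f}")
# prints: accuracy=0.99, attack recall=0.00
```

This is why metrics such as recall, F1 score, or false alarm rate tend to say more than accuracy on such data.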
06:40
So, in our approach, we would like to actually combine these two approaches and use the domain knowledge specified by the experts together with the statistical modeling. For this reason, we started with the formalization of this domain and we created
07:01
a semantic model, where we mainly focus on network security. So, we created the semantic model in the form of an ontology. Then we use this semantic model to actually improve statistical inference in the machine learning algorithms, and we can do this in
07:31
multiple steps, but in this proof of concept we have implemented just the first one, which is the problem decomposition. The main idea is that in the semantic model we can specify
07:42
a taxonomy of different kinds of attacks, and this taxonomy can be used to decompose the classification problem into simpler tasks with better statistical properties, for example, better balancing of the cases and so on. And then for each sub-level of this hierarchy we can
08:05
use a different statistical model and combine these together into a more precise system. Another possibility to use the semantic information together with the statistical model is that
08:21
for some minor cases, some minor types of attacks where we don't have much evidence in the training data, we can directly specify and formalize rules using
08:41
the knowledge-oriented approach. And since we have already formalized this model as a semantic ontology, we can use, for example, automatic inference, some logical system, to implement this inference of the rules and to
09:04
combine this together with the statistical model into one combined approach. Another possible way where we can see the synergy between the
09:21
semantic models and the machine learning models is the explanation of machine learning models, because if you have formalized the data using the semantic model it's much simpler to provide some explanation to the users for a particular case: why,
09:40
for example, some threat was detected, what was the most important data attribute, or which relations between the data were actually detected by the machine learning algorithm. All of this can be expressed using the common vocabulary of the semantic model, the ontology.
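Editor's illustration: one way to picture an explanation "in the common vocabulary" is to translate the conditions on a decision path into ontology labels. The sketch below is an assumption, not the authors' implementation. The feature names `count` and `flag` are borrowed from the public KDD Cup 99 feature list, while the `VOCABULARY` labels, the `explain` helper, and the example path are hypothetical.

```python
# Hypothetical mapping from raw feature names to ontology vocabulary labels.
VOCABULARY = {
    "src_bytes": "bytes sent from source to destination",
    "count": "connections to the same host in the last two seconds",
    "flag": "connection status flag",
}


def explain(path, predicted_type):
    """Render a decision path, given as (feature, operator, value) triples,
    as a readable conjunction using the ontology labels."""
    conditions = [
        f"{VOCABULARY.get(feature, feature)} {op} {value}"
        for feature, op, value in path
    ]
    return f"Detected '{predicted_type}' because " + " AND ".join(conditions)


# Hypothetical path, as it might be read off a decision tree.
path = [("count", ">", 100), ("flag", "==", "S0")]
print(explain(path, "denial of service"))
```

Features without an entry in `VOCABULARY` fall back to their raw name, so the explanation degrades gracefully when the mapping is incomplete.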
10:02
So, for our case we have designed a semantic model which mainly provides the taxonomy of the intrusion types, and then we model some entities related to the network communication because, as I said, we are focusing on the network communication. So we have the concepts which are
10:24
describing the connections, protocols, and targets like the target application and so on. Then from the threat side we formalize the mechanisms used to perform these types of
10:42
attacks, then the effects of the particular attacks, their severities, and what the target of the attack is. So, all this information is specified using the semantic
11:00
model of the ontology, where we have the concepts, their relations, and even data properties. All these data properties basically correspond to the logging data or metrics which can be monitored in the given environment. So, here is just an overview of the main relations.
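Editor's illustration: the kind of model described here (a taxonomy of attack types with mechanism, effect, target, and severity) can be sketched with plain Python dataclasses. This is not the authors' ontology. The class `AttackType`, its fields, the example instances, and the severity value are all hypothetical. The point shown is that a specific attack type can inherit properties from its parent in the taxonomy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttackType:
    """One concept in a hypothetical attack taxonomy."""
    name: str
    parent: Optional["AttackType"] = None
    mechanism: Optional[str] = None
    effect: Optional[str] = None
    target: Optional[str] = None
    severity: Optional[float] = None

    def inherited(self, prop: str):
        """Look up a property on this concept, falling back to its ancestors."""
        node = self
        while node is not None:
            value = getattr(node, prop)
            if value is not None:
                return value
            node = node.parent
        return None


# Hypothetical instances, for illustration only.
attack = AttackType("Attack")
dos = AttackType("DenialOfService", parent=attack,
                 effect="service unavailable", target="service", severity=0.6)
syn_flood = AttackType("SynFlood", parent=dos,
                       mechanism="half-open TCP connections")

print(syn_flood.inherited("target"))  # prints: service
```

A real ontology language and reasoner would provide this kind of inference out of the box; the sketch only makes the idea concrete.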
11:25
So, from the type of attack we can infer, for example, what the effects are or what is involved in this kind of attack, and what the target is: whether this is a user or the root user or
11:41
some kind of service or some kind of application and so on. The main idea in this proof of concept case is to use at least the taxonomy of the semantic model to decompose the detection of
12:03
attacks. In our approach we have decomposed the classification and detection of the attacks according to the taxonomy, where at first we filter out the normal communication from
12:24
the attack cases. So, this is the first model. All these models are implemented as statistical models. So, the first machine learning model is the attack detection model, and then we have used the
12:41
semantic model to decompose the attack traffic and to build a specific classification model for each major type of attack specified in the semantic model,
13:03
and then we have proposed a scheme to combine all these statistical models into one ensemble. The ensemble is a weighted combination of the statistical models where
13:20
each model is assigned a weight for each major type of attack specified in the ontology. This weight is a combination of the performance of the classifier, which is estimated on a testing dataset, and at the same time it is inferred
13:44
from the semantic model, from the severity of the particular type of attack. All this information is combined using a simple weighting scheme, and
14:01
the predictions of the multiple classifiers are combined using these weights to provide the final decision about the coarse type of the attack. Then, again, a particular statistical model is used for the fine-grained classification of the particular
14:27
subtype of the attack. So, altogether we have three types of models. The first one is the binary classification, the binary detection of normal and attack
14:41
communication. Then we have the ensemble model, which is based on the decomposition from the semantic model together with the weighting scheme, and then, when we know the coarse-grained type of the attack, we have a
15:06
set of data attributes which are specific for the particular type of the attack. Regarding the evaluation of our approach, we have selected a standard dataset which is quite an old one, KDD Cup 99, but it's a large dataset where you have a lot of
15:28
cases for the evaluation, which was important for us. It's quite a realistic and unbalanced dataset with 22 types of attacks and 33 data features,
15:42
mostly metrics and information from the communication protocols and so on. All these data were mapped to the ontology together with the 22 types of attacks, which were mapped to the coarse-grained types, and this allowed us to create this compound model.
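Editor's illustration: the decomposition and the weighted ensemble described in this part of the talk can be sketched as follows. The grouping of the 22 KDD Cup 99 training labels into four coarse types is the commonly used one. Everything else is an assumption: the talk only mentions a "simple weighting scheme", so the formula in `model_weight`, the parameter `alpha`, the model names, and all numbers are hypothetical.

```python
# Commonly used coarse grouping of the 22 KDD Cup 99 training attack labels.
COARSE_TYPE = {
    "dos": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "probe": ["ipsweep", "nmap", "portsweep", "satan"],
    "r2l": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "u2r": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
}
LABEL_TO_COARSE = {
    label: coarse for coarse, labels in COARSE_TYPE.items() for label in labels
}


def model_weight(performance, severity, alpha=0.5):
    """Hypothetical weighting: convex mix of a classifier's test-set
    performance and the severity of the attack type from the ontology."""
    return alpha * performance + (1 - alpha) * severity


def combine(probabilities, weights):
    """Weighted vote. Both arguments map model name -> coarse type -> value."""
    scores = {}
    for model, per_type in probabilities.items():
        for coarse, p in per_type.items():
            scores[coarse] = scores.get(coarse, 0.0) + weights[model][coarse] * p
    return max(scores, key=scores.get), scores


# Hypothetical numbers for two models and two coarse types.
performance = {"tree": {"dos": 0.99, "u2r": 0.60},
               "bayes": {"dos": 0.90, "u2r": 0.80}}
severity = {"dos": 0.5, "u2r": 1.0}
weights = {
    model: {t: model_weight(perf, severity[t]) for t, perf in per_type.items()}
    for model, per_type in performance.items()
}

# Predicted probabilities of each model for one connection.
probabilities = {"tree": {"dos": 0.6, "u2r": 0.4},
                 "bayes": {"dos": 0.3, "u2r": 0.7}}
winner, scores = combine(probabilities, weights)
print(winner, scores)
```

In this made-up example the more severe type wins even though one model prefers the other type, which is the effect a severity-aware weight is meant to have.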
16:07
For the statistical machine learning, we mostly use combinations of decision trees or some simple interpretable models like the
16:20
naive Bayes classifier or logistic regression. Depending on the type of attack, we have evaluated different combinations of decision trees or even gradient boosted decision trees, which are basically an ensemble of decision trees. But for us, what was important
16:40
was not only the performance for each type of attack, like it is here, but also the interpretability of the model. So, it will always be possible to extract the explanation for each case, either a local or a global one, and interpret
17:04
this explanation using the concepts and relations from the semantic model, because we have a one-to-one mapping to the semantic model, and decision trees directly provide basically a symbolic conjunction of the
17:20
conditions combining various data properties. So, we can use this directly as the explanation of the model. Regarding the results, we have evaluated this using contingency tables, precision, recall, and so on. So, this is the result for the binary classification;
17:41
for the binary classification we are using the gradient boosted tree ensemble. As we can see, the performance is quite good. The critical number is the number of false negatives, where the system basically said that this is normal communication,
18:02
but in fact this is an attack. So, this is what we are going to optimize here, but at least, as you can see, the errors are quite low and quite balanced between false positives and false negatives. But still, there are 11
18:25
undetected cases. There are 35 errors where the system predicted an attack, but in fact this is normal communication. This doesn't have to be a big issue here because this can be sorted out on the next level of the hierarchy, where if the confidence of the
18:48
classifier on the next level is low, then we can assume that this is a normal communication. For the second level, here we have these coarse-grained types of attacks. Again,
19:06
if you count the number of cases in the test set, you can see that even on this level the data are really unbalanced: most of the cases are from the denial of service type of attack,
19:24
but you have only a few, for example, attacks of type user-to-root or even remote-to-local network intrusion. There are not so many cases. But even with such an unbalanced data set,
19:43
this weighted combination of models has very good precision, basically, with only a few false positive cases for each type of attack, and even precision and recall for all these
20:03
cases are very good. I don't have here the results for the particular subtypes, where we have the precision on this very specific level of 22
20:22
types of attacks. This is in the paper, but again, the most critical point for our approach was how the semantic model and the decomposition using the semantic model would improve this part of the detection process. But overall, when we combined our decomposition together with
20:46
our ensemble techniques and weighted combination of the models, we were able to improve accuracy. Accuracy is maybe not the most representative metric here,
21:06
but, for example, the F1 score or the false alarm rate is more important in this case, and we were able to improve this by 1%. So this is quite okay. So to conclude, in this paper we
21:37
did research related to the combination or synergy of the knowledge-based and
21:42
data-based models. This is really just a proof of concept where we have used only one type of the semantic information. Well, not only one type, because we have used the hierarchical decomposition of the types plus information about the severity
22:03
and the target of the attacks to modify our weighting scheme. But since you have the formalization of the data already, you can explore
22:20
more relations and you can directly incorporate knowledge specified by the security expert and use more of this information to enhance the statistical learning. This is one direction, but this formalization of the data and the problem will also allow
22:47
us to actually automatically extract information from the statistical model and enhance the semantic models. But for this, it will be critical to use some explainable AI or at least some
23:03
interpretable models which can be directly interpreted and from which we can extract, generalize, and formalize the rules, which can then be shared between the experts to
23:23
provide better security for the environment. So, this is all from my side. Thank you for your attention. Okay, thank you, Peter. This is also an interesting topic related to the
23:40
modeling and also to the AI section. I see a question in the chat that is from Halama. The question is, what is the difference between a semantic model and a structural model or process model? Can you say something about this? Well, a structural model can basically be
24:10
expressed using the constructs of the semantic models, like concepts and relations and data and so on. A process model mostly covers the dynamic behavior, and you have not only declarative
24:27
semantics, but also execution semantics. So, there are some differences from the ontology model in how we are using this. You have to have not only declarative
24:46
knowledge, but also specify somehow execution semantics of the process.
25:00
Okay, I don't know if Halama is here. Halama, if you have something to add, you can do this now, or if we have other questions from the participants. Okay, perfect.
25:24
Peter, I would just add a curiosity from my side regarding the paper. Do you also plan in your future work to apply this kind of algorithm or this kind of ontology
25:44
in real scenarios in order to implement, for example, the database for the training set and verification set or do you think that it could be better to have a simulation model about this?
26:00
What do you think about a real implementation of this semantic model? This model is actually already quite advanced regarding the
26:24
types of the data and data properties. So, we are basically covering all the concepts related to the network communication and devices and types of attacks. We are even covering
26:43
all common metrics and technical properties of the communication which can be directly mapped to the data. So, this can be directly used already. Okay, so the database is just quite huge, quite easy to use also to have this test
27:04
and this verification training set. We didn't perform any evaluation of the semantic model in the sense of how good its coverage actually is for the existing data sets
27:20
which are used for intrusion detection machine learning. So, this is something which has to be evaluated, whether something is missing. But from our experiments, this is quite an extensive model. What is missing, and what will be our future work, is to really better
27:42
explore this possibility to use explainable AI to enhance the semantic model, because in this way we can formalize and share the results of the statistical models with the experts. So, this is our main orientation of the research now.
28:05
Okay, okay, perfect. This is just my curiosity, so perfect.