
Network Traffic Classification for Cybersecurity and Monitoring


Formal Metadata

Title
Network Traffic Classification for Cybersecurity and Monitoring
Title of Series
Number of Parts
287
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Security and monitoring applications need to classify traffic in order to identify application protocols, misuses, similarities, and communication patterns that are not easily identifiable by hand. nDPI is a library that implements various algorithms for traffic analysis, able to detect outliers, anomalies, traffic clusters and behavioural changes efficiently in streaming (i.e. while traffic is flowing). The goal of this presentation is to show how nDPI can be used in real life to inspect network traffic and spot patterns worth analysing in detail. Modern network security and monitoring applications need to analyse traffic efficiently in streaming fashion, in order to detect interesting traffic patterns in realtime without dumping data into a database and performing computationally expensive queries in batches. Many network developers do not have the skills to analyse traffic efficiently, and data scientists often do not have the skills to understand the complex nature of network traffic. For this reason nDPI, a popular open-source deep packet inspection library, has been enhanced with various algorithms and techniques that dramatically simplify traffic analysis and that should ease the creation of applications able to efficiently spot traffic patterns and anomalies. This talk will introduce some of the algorithms present in nDPI and show how they can be used in real life at high speed, contrary to many applications that are inefficient and often based on languages (e.g. Python and R) that are not designed to analyse traffic in streaming at 10 Gbit+ on commodity hardware.
Transcript: English (auto-generated)
Hello, my name is Luca Deri and today I am going to talk about network traffic classification for cybersecurity and monitoring. Before I start, I want to tell you a little bit more about me.
I am the founder of ntop, a company that develops open source tools for network security and visibility. You probably know ntopng and probably also nDPI, because I presented it last year at FOSDEM. I am the author of various other open source tools and a contributor to other tools such as Wireshark.
Finally, I teach at the University of Pisa as a lecturer. Last year at FOSDEM I presented nDPI. This year I want to extend that presentation with features that are present in nDPI but are probably not known to everyone.
In particular, I want to talk about network traffic analysis using nDPI. On the market there are many tools and toolkits, such as DPDK, PF_RING, netmap or, if you want, also eBPF, that allow you to capture events or packets for the purpose of network traffic analysis.
Unfortunately, most applications are still based on reporting the top or bottom X hosts that do certain activities, whereas today the nature of the traffic is much more complex and we need to do something more than that. In order to avoid reinventing the wheel many times, we have decided to put into nDPI additional features that are not purely deep packet inspection oriented and that allow applications sitting on top of it to analyze traffic.
Please note that you can use nDPI on top of DPDK. You don't need the whole ntop open source stack, PF_RING included, underneath it. And also remember that this library is designed for speed.
This means that we have tried to optimize the library as much as possible and to overcome the limitations of typical solutions based on Python and R, which can do similar things but only in post-processing, because they are not fast enough or they require too many resources. Just to recap what nDPI is: it is a toolkit that was primarily designed for detecting network traffic protocols and reporting the application protocol behind a certain network communication.
Today we are going to talk about network traffic analysis, and I'm going to present some examples of network traffic problems that can be solved with nDPI.
The first problem is string searching. In traffic, we sometimes have to search for specific strings, not just because we want to find a certain word in the payload, but also because we need to match the traffic against certain criteria. A typical example is substring matching, which is implemented in nDPI by means of the Aho-Corasick algorithm.
Substring matching is necessary whenever you want, for instance, to match a certain domain name against a dictionary. Imagine that you have a list of blacklisted hosts, a list of domain names that are not nice to contact, a list of spammers and many things like that.
So, we are talking about strings, and you want to do this matching. The matching has to be on substrings because, when you have a domain name, you don't want to write the whole host name in the dictionary, only a part of it. Aho-Corasick is an efficient string searching algorithm for this.
Unfortunately, Aho-Corasick is a little bit complicated to implement because it requires an automaton: in essence a state machine, or a network if you want, whose nodes represent all the possible words of the dictionary, so that whenever we have a string to match, the algorithm searches inside this automaton and tries to find the best match, if any.
As you can see in the picture taken from Wikipedia, you have two types of nodes, the blue ones and the gray ones. The blue nodes are terminal ones, those that basically contain the match, and the gray ones are those that are used to build the tree.
I don't want to go too much into the algorithm because we don't have much time, and I just want to describe how we can use it. In essence, the first thing to do is to initialize an automaton with ndpi_init_automa() and add all the possible words to it. In this case they are simply "hello" and "world". Then you have to finalize the automaton.
In fact, the main problem of Aho-Corasick is that whenever you want to add a word, you have to rebuild the automaton. The same happens if you want to remove one. So make sure that you have all the possible words; otherwise you need to start over with another automaton and do a hot swap in case your application is already processing traffic. Then, at the end, you see ndpi_match_string(), which allows you to check whether this sentence contains at least one word matching the dictionary.
And of course, nDPI returns such a string. We have optimized the algorithm for networking, so we can find strings that end with a certain suffix, and we can also find strings that begin with a certain prefix.
In essence, everything you can expect for matching a domain name is present in this library, even though you can also use it for matching pure strings. Just to give you an idea, the memory used by the algorithm to create the dictionary increases with the number of words.
As you can see, when you have about half a million words, that is half of the Alexa top one million hosts, the size is about 900 megabytes. The build time also increases with the number of words. We have run this test on a very slow dual-core machine just to give you an idea of the speed.
With half a million words, it takes about seven seconds on a dual-core Intel at 3.2 GHz. But the nice thing is that, if you look at the search, the search time is more or less flat regardless of the number of strings in the dictionary. It is only the memory and the time to build the automaton that cost a little bit more.
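The build-once, search-many workflow described for substring matching can be sketched as follows. This is a minimal Python illustration of the Aho-Corasick idea, not nDPI's C implementation; the class name `Automaton` and its methods are hypothetical names chosen for this example.

```python
from collections import deque


class Automaton:
    """Toy Aho-Corasick automaton: add words, finalize once, then search."""

    def __init__(self):
        self.goto = [{}]   # per-node transitions: character -> node id
        self.fail = [0]    # failure link of each node
        self.out = [[]]    # dictionary words recognized at each node
        self.finalized = False

    def add_word(self, word):
        # Adding after finalize would require rebuilding the failure links.
        if self.finalized:
            raise RuntimeError("already finalized: build a new automaton")
        node = 0
        for ch in word:
            if ch not in self.goto[node]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[node][ch] = len(self.goto) - 1
            node = self.goto[node][ch]
        self.out[node].append(word)

    def finalize(self):
        # Breadth-first pass computing failure links (depth-1 nodes fail to root).
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                queue.append(child)
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                self.out[child] = self.out[child] + self.out[self.fail[child]]
        self.finalized = True

    def search(self, text):
        """Return every dictionary word found as a substring of text."""
        if not self.finalized:
            raise RuntimeError("call finalize() before searching")
        node, found = 0, []
        for ch in text:
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            found.extend(self.out[node])
        return found


blacklist = Automaton()
blacklist.add_word("hello")
blacklist.add_word("world")
blacklist.finalize()
matches = blacklist.search("say hello to the world")
```

The search walks the text once, whatever the size of the dictionary, which is consistent with the flat search time mentioned in the talk; the cost is paid at build time and in memory.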
The second problem I want to show you that you can solve with nDPI is IP matching. In this case, we need to find an IP address in a tree, which is typical whenever you need to match IP addresses against several network prefixes. That, again, is typical if you have a list of blacklisted hosts, a list of spammers, a list of network ranges that are not nice to contact, just to give you some examples.
A trie is the basis of this algorithm. In essence, it's a tree where each node holds a single letter. So in this case "cats" is c-a-t-s, "cat" is c-a-t, and so on.
Whenever a node is a terminal node, so it's a match, it is drawn in yellow, whereas if the node is intermediate, so it's only used to build the tree, it is blue. This is the trie. Now, the trie is important because we need to match a certain prefix.
And matching a prefix is very important because a network, in essence, is the set of IP addresses that start with a certain prefix. So this is why we are interested in that. And the performance is good.
It's O(w), where w is the length of the string to be inserted. But here, we're talking about IP addresses. So how can we turn this tree into something meaningful for us? Simple. We start optimizing, so we start collapsing the nodes that can be collapsed together.
In this case, c-a can be collapsed into a single node. So we move from a trie to a radix tree, where radix means something in common. And as you can see, this is a data structure that is naturally ordered. Therefore, if you navigate it, you will get the results in a specific order.
Now, in 1968, Morrison created a special version of the radix tree called PATRICIA. In a Patricia tree, basically, the nodes hold numbers instead of letters. It's pretty efficient for subnet matching in both IPv4 and IPv6. You can also do partial searches, so that what you match can be a network range, such as a /24, or a single host, a /32 in the case of IPv4.
You can support, as I said, both IPv4 and IPv6. In nDPI, what you have to do is the following. First of all, you have to create the Patricia tree. You have to specify the number of bits you are going to use: 32 for IPv4 or 128 for IPv6.
Then you start adding nodes, in this case with ndpi_patricia_lookup(). And then you can call ndpi_patricia_search_best(), because we always try to find the best match, which is usually what you want to do with networks. Along with the fact that you have or don't have a match, you can bind some metadata.
For instance, you can bind some information about the network itself: whether it's a good network, a blacklisted network, a network of spammers. Anything you have in mind, you can add to it. In terms of performance, on the same machine I showed you before, you can have a Patricia tree built in less than a second.
It occupies about 17 megabytes with 76,000 prefixes, which is quite a lot. And as you can see, the search again takes under one microsecond, so it's pretty fast. Again, the important thing here is the speed, because we want to use nDPI live.
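The create, add and search-best steps above can be illustrated with a small longest-prefix-match sketch. It assumes an uncompressed binary trie over the address bits (a real Patricia tree would also collapse single-child chains), and `PrefixTree`, `add` and `search_best` are hypothetical names for this example, not the nDPI API.

```python
import ipaddress


class PrefixTree:
    """Binary trie keyed on address bits, with metadata bound to prefixes."""

    def __init__(self, bits=32):
        self.bits = bits   # 32 for IPv4, 128 for IPv6
        self.root = {}     # node layout: {0: child, 1: child, "meta": value}

    def add(self, cidr, meta):
        net = ipaddress.ip_network(cidr)
        if net.max_prefixlen != self.bits:
            raise ValueError("address family does not match tree width")
        value = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (value >> (self.bits - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["meta"] = meta

    def search_best(self, address):
        """Return the metadata of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        if addr.max_prefixlen != self.bits:
            raise ValueError("address family does not match tree width")
        value = int(addr)
        node, best = self.root, self.root.get("meta")
        for i in range(self.bits):
            node = node.get((value >> (self.bits - 1 - i)) & 1)
            if node is None:
                break
            if "meta" in node:
                best = node["meta"]   # a longer prefix overrides a shorter one
        return best


tree = PrefixTree(32)
tree.add("10.0.0.0/8", "internal")
tree.add("10.1.2.0/24", "blacklisted")
verdict = tree.search_best("10.1.2.3")
```

The lookup visits at most one node per address bit, so its cost depends on the address width and not on how many prefixes are stored.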
Another typical problem we have to address is probabilistic counting. Whenever we need to know how many of something there are, we need to allocate a data structure, which is usually a hash table. For instance, if you want to answer the question "how many hosts does my host contact?", you have to keep a list or a hash table of these values.
Or, for instance, you may want to know how many different countries a certain host has contacted. That is a typical question that tells you whether a host is doing something local or something global. With Skype, for instance, you are going to contact half of the world, whereas with other typical protocols such as HTTP or TLS you usually stay in a certain geographical area.
In order to answer these questions, the simplest thing to do is to use a generic data structure. Unfortunately, those data structures take a lot of memory, in particular if you have a lot of data. So if you are unlucky and you have a host scanner or a network scanner, you will end up using a lot of memory.
That's why we use probabilistic data structures. These data structures are not perfect: they introduce a little bit of error. But in return, you use much less memory and you gain a lot of speed.
The one I'm going to present is called HyperLogLog, which was created by Flajolet some years ago. It is a probabilistic data structure that gives you an idea of the cardinality of a set. For a host, for instance, you can track the IP addresses of the hosts it has contacted, the number of countries, things like that.
The memory that you are going to use here depends on the error that you are willing to accept. I'll show you an example. Suppose I want to create two data structures, one for counting the number of different hosts that have been contacted, and one for counting the number of different countries that have been contacted.
When I initialize the data structure, I have to specify an i value. This table shows you, for a certain i, the memory that is used and the error that you can expect when you ask for the cardinality. In my case, I use 8. It means that with 256 bytes, I get an error of about 6.5%.
That is pretty good, because if I want to spot scanners, I don't really need to be super precise. I just need to know roughly the number of hosts or domains or whatever that have been contacted. That, for me, is good enough. So with 256 bytes I can do exactly that and, by calling ndpi_hll_count(), I have the result.
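The trade-off between memory and error can be made concrete with a small HyperLogLog sketch. It uses the textbook formulation with 2^i registers and a standard error of about 1.04 / sqrt(2^i), which gives 256 registers and roughly 6.5% for i = 8, matching the figures quoted in the talk. The class below is a hypothetical illustration, not nDPI's implementation, and the bias constant it uses is the usual one for 128 or more registers.

```python
import hashlib
import math


class HyperLogLog:
    """Cardinality estimator with 2**i registers (one byte each in C)."""

    def __init__(self, i=8):
        self.i = i
        self.m = 1 << i
        self.registers = [0] * self.m

    def expected_error(self):
        # Standard error of the estimate for m registers.
        return 1.04 / math.sqrt(self.m)

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash
        index = h >> (64 - self.i)                   # top i bits pick a register
        rest = h & ((1 << (64 - self.i)) - 1)        # remaining bits
        # Rank = position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.i) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias correction, m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


contacted = HyperLogLog(8)
for n in range(10000):
    contacted.add("10.%d.%d.%d" % (n >> 16, (n >> 8) & 255, n & 255))
estimate = contacted.count()
```

Adding an item that was already seen never changes a register, so duplicates do not inflate the estimate, and the memory stays fixed however many hosts a scanner contacts.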
Another typical problem is anomaly detection, that is, whenever we want to understand whether there is something that deviates from our expectation. In this picture, you can see that some people are not marked with the same red color as the others.
This is our goal. And we want to do it for two main reasons. First of all, because we want to clean data: if we find outliers, if we find data that is a little bit unusual, it might be an error in measurement. Otherwise, because we have found a problem. So the reasons are manifold.
Now I'm going to explain how we can do that with nDPI. Usually we do this with time series. A time series is an ordered set of data points. I think you're pretty familiar with those if you use Grafana, if you play with networking data, InfluxDB, this type of thing. Once you have a time series, in essence, you have the data.
But here we have to introduce two new words. One is observation: the value that we really read from the network. The other one is forecast: the value that we expect to have at the next iteration. For instance, if I now want to predict the value of the traffic of my network in one minute, this is going to be the forecast.
When the minute has passed, I am able to do the observation and read the real value. The squared discrepancy between forecast and observation, summed over the series, is called SSE, the sum of squared errors, and it is used to understand how far the prediction is from reality.
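Written out, with y_t for the observation and ŷ_t for the forecast at step t (the symbols are notation chosen for this note, not taken from the slides), the sum of squared errors over n steps is:

```latex
\mathrm{SSE} = \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2
```

A smaller SSE means the forecasts tracked the observed series more closely.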
And how do we use it? Let's look at this picture. The series is the blue one. The green one is the prediction, so it's a way for us to mimic the real series, in the future, because we predict the value.
As soon as we have the value, we compare the real value with the prediction. In our algorithm, we have the ability to create two bands, one low and one high. We say that if the series falls between the low and the high band, then we are good. If it falls outside, then we have an anomaly.
Very simple. I don't want to make it too complicated, because there is a lot of mathematics behind it. But in essence, there are three algorithms for doing that. The first algorithm, called single exponential smoothing, takes into account only the value, so it gives a weight to the value we read. In double exponential smoothing, we also give a weight to the trend.
This is useful, for instance, if you want to give an extra bonus to the fact that the value is increasing or decreasing. In the third case, if you have a signal that repeats over time, which is called a season, then to predict the future we can add a correction factor based on the seasonality.
For instance, if you have the traffic of a host that is low during the night and high during the day, you can speculate that the future traffic will follow this same pattern. So these are the three algorithms. The last one is called Holt-Winters.
So in essence, we have three values, called smoothing factors, that are used to estimate the value: alpha, beta and gamma. In nDPI, we have implemented all three algorithms, and you can decide, based on the nature of your data, whether you want to use the first two or the third one. The first two don't take seasonality into account. So if you have seasonality, you're basically obliged to use the last one. Otherwise, you can use the first two.
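The forecast, band and anomaly logic described above can be sketched for the double exponential case as follows. This is an illustration under stated assumptions: the band is taken as a multiple (`width`) of the root mean squared past forecast error, and the class name, parameters and `learning` period are hypothetical choices for this example rather than nDPI's exact computation.

```python
import math


class DoubleExpSmoothing:
    """Holt's double exponential smoothing with a simple error band."""

    def __init__(self, alpha=0.5, beta=0.5, width=2.0, learning=3):
        self.alpha = alpha        # weight given to the latest value
        self.beta = beta          # weight given to the latest trend
        self.width = width        # band half-width, in units of RMS error
        self.learning = learning  # forecasts to observe before flagging
        self.level = None
        self.trend = 0.0
        self.sse = 0.0            # sum of squared forecast errors
        self.n = 0                # number of forecasts scored so far

    def add_value(self, x):
        """Return (forecast, low, high, is_anomaly) for x, then learn from it."""
        if self.level is None:
            self.level = float(x)   # first value only seeds the level
            return self.level, self.level, self.level, False
        forecast = self.level + self.trend
        band = self.width * math.sqrt(self.sse / self.n) if self.n else 0.0
        ready = self.n >= self.learning
        anomaly = ready and abs(x - forecast) > band
        # Score the forecast, then update level and trend.
        error = x - forecast
        self.sse += error * error
        self.n += 1
        previous = self.level
        self.level = self.alpha * x + (1 - self.alpha) * forecast
        self.trend = self.beta * (self.level - previous) + (1 - self.beta) * self.trend
        return forecast, forecast - band, forecast + band, anomaly


des = DoubleExpSmoothing()
series = [100, 101, 99, 100, 102, 100, 101, 99, 100, 173]
results = [des.add_value(x) for x in series]
```

In this made-up series the last reading lies far outside the band learned from the earlier values, so it is the one reported as an anomaly.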
As you can see, the algorithm needs the values of alpha and beta. To choose them, either you use average values or something that you believe makes sense, or you do something called fitting. In essence, we provide functions that allow you to estimate those values based on the past.
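One simple way to do such a fitting is a grid search that replays past measurements and keeps the alpha and beta with the smallest sum of squared errors. The sketch below assumes that approach and a 0.1 step; the function names are hypothetical and it is not presented as the way nDPI performs its fitting.

```python
def des_sse(values, alpha, beta):
    """Sum of squared one-step forecast errors of double exponential smoothing."""
    level, trend, sse = float(values[0]), 0.0, 0.0
    for x in values[1:]:
        forecast = level + trend
        sse += (x - forecast) ** 2
        previous = level
        level = alpha * x + (1 - alpha) * forecast
        trend = beta * (level - previous) + (1 - beta) * trend
    return sse


def fit(values, steps=10):
    """Try alpha and beta in 0.1 .. 0.9 and return the pair with the lowest SSE."""
    grid = [k / steps for k in range(1, steps)]
    candidates = ((a, b) for a in grid for b in grid)
    return min(candidates, key=lambda ab: des_sse(values, ab[0], ab[1]))


history = [10, 12, 15, 19, 22, 26, 31, 33, 38, 41]
best_alpha, best_beta = fit(history)
```

The returned pair is only the best for the replayed history; it stays useful as long as the signal keeps behaving the way it did in the past.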
I will show you how it works. In essence, you allocate, in this case, a double exponential smoothing data structure, and inside the data structure we continuously add the values that we read from the field. Every time we add a value, we receive back from ndpi_des_add_value() a prediction and a confidence band.
It means that we give back to you, as a user, the predicted value and the upper and lower boundaries that we expect to see. If your value is within the boundaries, then we are good. If it falls outside, we have an anomaly. If you have some measurements from the past, you can feed this algorithm with whatever values you want. It doesn't really matter.
Then, at the end, you can call ndpi_des_fitting(), and we return the best alpha and beta values for the past. It means that if your signal stays similar to what we have seen before, these are the best two values that you can use for predicting the future. In essence, with nDPI, you can have something like this.
At the beginning, you see "learning", because the algorithm is still trying to learn how this signal behaves. Then, at some point, we start operating: OK, OK, OK. At some point, you have an anomaly because the value that we read, 173, falls outside the confidence band. Actually, it is lower. In this other case, the second reading is too high.
The last thing I want to talk about today is binning. Data binning is a technique that allows you to split values into something called bins: in essence, a vector of numbers. A typical example is packet length.
We usually split packet lengths into bins: up to 64 bytes, from 64 to 128, from 128 to 256, and so on. This way, we don't have to keep all the individual values; we can keep ranges. This is the goal of the bin. And we can use it for many things. For instance, if we want to compare the time series of two hosts, I can consider each point of the time series as the value of a bin.
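Packet-length binning and a comparison between two bins can be sketched as follows. The bin edges beyond 256 bytes and the use of Euclidean distance on normalized bins are assumptions made for this illustration; the function names are hypothetical and the similarity measure is not claimed to be the one nDPI uses.

```python
import bisect
import math

# Upper edges of the bins in bytes; anything larger lands in a final overflow bin.
BOUNDS = [64, 128, 256, 512, 1024, 1500]


def packet_length_bins(lengths):
    """Count packets per length range instead of keeping every single length."""
    bins = [0] * (len(BOUNDS) + 1)
    for length in lengths:
        bins[bisect.bisect_left(BOUNDS, length)] += 1
    return bins


def normalize(bins):
    """Scale a bin vector so its values sum to 1 (all zeros stay all zeros)."""
    total = sum(bins)
    return [b / total for b in bins] if total else [0.0] * len(bins)


def distance(a, b):
    """Euclidean distance between normalized bins: 0 means the same shape."""
    na, nb = normalize(a), normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))


host_a = packet_length_bins([60, 64, 65, 1500, 9000])
host_b = packet_length_bins([60, 60, 64, 64, 65, 65, 1500, 1500, 9000, 9000])
host_c = packet_length_bins([1400, 1450, 1500, 1500, 1500])
```

Normalizing first means that two hosts with the same traffic mix but different volumes compare as similar, while a host sending mostly full-size packets stands apart.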
Or, if I want to see whether two connections start with the same packet lengths, I can use the length of the packets as a bin, as I showed you before. I want to show you an example.
This is code that is present inside nDPI as an example. Suppose that we have two or more time series, say the time series of many hosts of our network. I want to see when two hosts are similar, that is, when two hosts behave in a similar way from the network standpoint.
For this, there is this example. RRD is an archive format for time series. You take some RRD files of hosts, generated by ntopng or by other tools that poll via SNMP, it doesn't really matter, and then you give them to this tool.
In essence, this tool compares those bins and finds the ones that are similar. Here you see ndpi_bin_similarity(). This allows you to find hosts that behave more or less the same way and others that are very different. For instance, we have applied this technique inside ntopng to find, from SNMP, those network ports that produce similar values and that, in case of an attack for instance, behave the same way. Or, for ports that are supposed to behave in a certain fashion, you can find out whether they are similar or not, just to give you an idea.
We have many, many ways of using that. And don't forget that this algorithm is super fast: with over 10,000 hosts, we are able to read the values, do the comparison and all the rest in less than a second.
So nDPI, as I said, is designed for speed. We have many more features, but I don't have the time to describe all of them. For instance, we have streaming data analysis. We have clustering, also called unsupervised machine learning.
We have other functions for high-speed JSON serialization, jitter, entropy, and so on. But I think you need to read the source code, because we are running out of time. The last thing is the following. We have recently received an award from Google for our work in this field, and we would like to use this money to invest in the community and in development.
So if there are people interested in working in this field and in being paid for developing open-source software, contact us. Thank you very much for being here today, and I encourage you to download nDPI and to play with it. If you want, you can contact me at any time. Thank you very much.
Showtime. Yeah.
Yeah. Thank you, Luca, for your talk. It was a very interesting topic. So I've got a few questions. How about contributing to nDPI? Yes, we would like to encourage people to contribute to it, because we see that help is needed in various areas, in particular in cybersecurity, for dissecting protocols not just to understand what the application protocol is, but to understand the behavior of hosts.
And so we have recently been awarded this prize by Google, and the idea is to use this money to pay people, to pay students, to pay contributors. I mean, it doesn't have to be a job. It has to be a contribution to open source.
So if you see a value in what we are doing and you are open to helping us with new algorithms, new protocols, new implementations or new ideas, please feel free to contact us. We would like to be in touch with you, to understand what your ideas are and, of course, to enroll you in this project.
Yeah, sure. The next question is, what other open source projects is nDPI being used by? Well, I know that, for instance, it is embedded in OpenWrt as a package, and there are many people using it inside small devices for blocking traffic using iptables.
This is a typical example. I also see that there are people using it to classify traffic and therefore to generate, let's say, data for machine learning algorithms. So it seems to be used mostly to generate input for other applications.
What we said today is that we would also like to promote the fact that the library offers an API for traffic analysis, so that if you need to create applications, you just take DPDK, PF_RING or anything else for packet capture, then put nDPI on top for traffic analysis, and focus on what you have to do.
So this is another opportunity for people. Open source, I think, should not be limited to application recognition, but should also cover the traffic analysis part. People are also asking what languages are supported by nDPI, or is it more likely to be limited by capture methods?
Well, nDPI is a library that allows you to dissect the traffic and to generate the metadata. It's written in C. In our idea, it can help you to simplify the design of your application, because in essence you delegate the analysis to an existing component.
So I don't know if this helps to answer the question.
Is ntopng based on nDPI, or what is the relationship between these two open source projects? Yeah, let's say that inside ntopng we use nDPI as a layer for everything.
Because, like I said, we have delegated to it many of the features that you would otherwise have to implement. So ntopng is based on nDPI, but there are also other tools. PMAS just made a good suggestion on the chat, saying that DANOS, which is also based on DPDK, is using it.
So let's say it implements something that is useful to every monitoring or traffic processing application that does not just have to route traffic or stay limited to the lower layers. Another example I forgot is Open vSwitch, because there is also somebody who ported nDPI into Open vSwitch.
That makes it possible, for instance, to control the traffic between peers that are talking to each other and to limit them to selected protocols. So for instance, you can say: allow SSH to pass, or block Netflix, these types of communications.
You already mentioned that nDPI is written in C. What language bindings are available for nDPI? There is also a binding for Python, so you can use Python.
The Python binding uses nDPI and also extends it in the context of machine learning for understanding the traffic. So Python is definitely one of them. There is also somebody who ported it to Go, so you can also use it from Go. And since it is a pure C library, I don't think it's difficult to bring it to other languages such as Rust, for instance.
We didn't do it, but I believe it should be pretty simple, also because we try to be self-contained. For instance, packet capture is not part of nDPI, simply because we want you to use this library on top of, for instance, DPDK or netmap or anything else. It doesn't have to be bound to, for instance, libpcap.
So this means that the library is pretty portable and doesn't bring any dependency with it besides the basic libc.
So it should be pretty easy to move it to other languages. Yeah, okay. Just a last question from my side. We shouldn't look at nDPI just as a protocol detection tool; we can do much more with it, I guess.
Yes, yes, that is the idea. That was the idea of the talk today. We can also use it for processing traffic. Namely, even if you don't need protocol detection at all, you can use nDPI, for instance, for determining who the top talker is or which hosts are contacting many other hosts.
You know, the typical questions that you have in cybersecurity in general when you have to analyze network traffic. Yes. And the last one: the challenges that come from encryption, since there is more encryption now than ever, I guess.
Yes, encryption is creating trouble for DPI because we cannot inspect the payload anymore, but it is also offering opportunities. Simply because, when you talk with a plaintext protocol, there is no real fingerprint. I mean, the fingerprint is probably inside the process of HTTP when you have to.