Cybersecurity is a broad topic and many commercial products are related to it.We demonstrate a fundamental concept in network analysis: re-construction andvisualization of temporal networks. Furthermore, we apply the method todescribe operational conditions of a Hadoop cluster. Our experiments providefirst results and allow a classification of the cluster state related tocurrent workloads. The temporal networks show significant differences fordifferent operation modes. In reallity we would expect mixed workloads. Ifsuch workload parameters are known, we are able to handle a-typical eventsaccordingly - which means, we are able to create alerts based on contextinformation, rather than only the package content. We show an end-to-endexample: (1) Data collection is done via python, using the sniffer script; (2)using Apache Hive and Apache Spark we analyze the network traffic data andcreate the temporary network. Finally, we are able to visualize the resultsusing Gephi in step (3). In a next step, we plan to contribute to the ApacheSpot project.
# Expected prior knowledge / intended audience:
No special skills required, but minimal exposure to the Hadoop ecosystem ishelpful.
# Speaker bio:
Márton Balassi is a Solution Architect at Cloudera and a PMC member at ApacheFlink. He focuses on Big Data application development, especially in thestreaming space. Marton is a regular contributor to open source and has been aspeaker of a number of open source Big Data related conferences includingHadoop Summit and Apache Big Data and meetups recently.
Mirko Kämpf is a Solution Architect at Cloudera and the initiator of theEtosha project. He holds a Diploma in Physics and worked on several projectsrelated to complex systems analysis. His focus is on time dependent networkanalysis and time series analysis, using tools from the Hadoop ecosystem, andespecially on the related metadata management. Mirko is actively using opensource tools, author of several blog articles in the Cloudera engineeringblog, and a speaker in Big Data related conferences and meetups. |