
Open Source Network Automation in 2022


Formal Metadata

Title
Open Source Network Automation in 2022
Subtitle
How to build a Network Automation strategy around Open Source tooling
Title of Series
Number of Parts
287
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Network Automation has evolved a lot in recent years, adopting many of the practices from the SDLC (Software Development Life Cycle) while working within the intrinsic constraints of networking services. In this talk you will learn about the challenges and common use cases around network automation, for instance: configuration management, user-driven workflows, Infrastructure as Code (for hybrid and multi-cloud environments) and closed-loop automation using telemetry. You will see how many open source projects are coming together to address these challenges from different angles, and how you can connect them to start building a network automation strategy that can transform your network with lower OPEX, improved reliability and security, and increased innovation.
Transcript: English(auto-generated)
Welcome everybody to yet another automation talk, and you may be wondering what makes this session special.
There is nothing special about this session apart from the topic. Today we are going to try to understand how IP network operations can be automated in 2022. The approach for this session is to start by giving you a good understanding of what
network automation is, and to share with you a framework that can be used as a reference to put the different pieces that are part of this ecosystem together, highlighting the challenges and the trends that we are facing today. And everything with a focus on open source projects that can help us on this journey.
First, the presenters: we have Mr. Damien Garros, Director of Architecture at Network to Code. And myself, Christian Adell, also part of the architecture team at Network to Code. Network to Code, for those not familiar with the company, is a small consulting company focused only on network automation,
building almost everything that we do around open source projects, contributing a lot and trying to bring all this power into the different solutions that we are working on. For today, let's start by understanding the different components of a network. Traditionally, when we think of networking, we focus first on the control plane.
The control plane has typically been the plane where you interact with the network, configuring the different routing protocols such as OSPF, BGP and others. So it's the plane focused on defining how packets will be forwarded. But the actual forwarding happens at the data plane.
Traditionally this has been a closed environment that you cannot access except through the control plane. But since the rise of SDN, which we will talk about later, the door was opened to decouple these different planes. So today we can program the data plane directly via OpenFlow, P4 or other languages and protocols that give us access to this plane.
However, both the control plane and the data plane have to be managed. And the focus of this session is to understand how, from the management plane, we can do our best on the control and the data plane.
Let's start by understanding what traditional network management looks like. Networking has been, and in part still is, a really closed vendor ecosystem where the operating systems are basically proprietary.
So you have your operating system with your private CLI, the command line interface. Usually these interfaces, because it was not actually needed, use unstructured data, making configuration more an art than a science and making automation really difficult.
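To make that contrast concrete, here is a minimal sketch (the `show` output below is invented for illustration) of the kind of regex screen-scraping that unstructured CLI output forces on automation tooling:

```python
import re

# Invented sample of unstructured CLI output, as a device might print it
raw = """Interface    IP-Address    Status
Gi0/0        10.0.0.1      up
Gi0/1        unassigned    down"""

def parse_interfaces(text):
    """Scrape interface name, IP and status out of free-form CLI text."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        m = re.match(r"(\S+)\s+(\S+)\s+(\S+)", line)
        if m:
            rows.append({"name": m.group(1), "ip": m.group(2), "status": m.group(3)})
    return rows

print(parse_interfaces(raw))
```

A structured interface would return this as data directly; the regex above breaks as soon as the vendor changes the column layout, which is exactly why unstructured CLIs make automation fragile.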
So this was the reason to promote and sustain manual changes, and it has been like this for decades. What's changing today? We have some drivers that have appeared over the last 20 years at different paces, but are completely transforming this ecosystem. The first driver was a complaint from the network operators in 2002.
They published RFC 3535, where they asked for a different way of managing the network. They asked for data-model-driven management so they could provide end-to-end services in a smarter way, instead of using a CLI-based interaction.
Related to this, you will be aware of the rise of cloud-scale environments. Big data center deployments to sustain the cloud, and other large-scale networks, showed that you cannot scale the operations of such a network as usual.
You cannot keep doing what you had been doing until that point, so you need to put automation in place. You have to save time for the operators and make better, smarter decisions on how you manage the network. Also, in 2009, came the rise of SDN which, even if its vision is not a full reality today,
has transformed and changed the way we innovate in the network. It opened the door to hardware disaggregation, understanding that one thing is the control plane and another is the data plane, so you can play different games with the different planes. This opened up a lot of innovation around how we automate and how we manage networks.
This was also the reason that, in later years, network operating systems appeared that were not closed; they were based on Linux, making this ecosystem closer to traditional server environments.
And last but not least, the transformation in how we operate not only the network, but also applications and systems. Today the DevOps culture, which puts developers and operations together (and operations includes not only the systems, but also storage and networking), means that everyone is used to working with everything as a service, working in
hybrid environments, on premises and in the cloud, and making deployment times really, really fast. You cannot follow this trend if you are not automating your network operations. If we think about the actual adoption today, this diagram from Crossing the Chasm serves as a reference to understand where we are.
If we think about 10 years ago, there were some innovators doing research in the area, thinking about potential use cases. The cloud service providers needed this kind of solution, so they started investing and developing the first ideas about this.
There were also some pioneers in the field, but the point is that today there are already a lot of companies doing this kind of operation, transforming the way they handle networking. But we are still in front of a big revolution. From our experience with the industry, we expect that in the next one to three years a lot of new companies will start adopting this,
because networking is still behind the automation and DevOps culture that is pushing hard from the application and systems engineering side of things. The thing is that sometimes you can perceive automation as an abstract concept,
and not really see which use cases are real ones. Because in the end, network automation today is about automating tasks. Tasks that are already real, moved to a place where you still do the same thing, maybe better, but in an automated way.
Here are just a bunch of examples; it's not an exhaustive list, but to give you a glance: everything around configuration, from rendering the configuration out of structured information,
to provisioning, pushing this configuration to the devices, keeping backups, checking if the configuration in place has changed for any reason, and enforcing remediation of this configuration. Or you can think about other solutions like self-healing networks, which check the state and understand what's happening via telemetry,
via monitoring of the state of the network, and make decisions automatically. So the same thing that an operator who gets paged in the middle of the night does when reacting to an issue in the network: if you can code this logic into a script, into an application, it can happen automatically, saving your sleep time, which is really important, but, more importantly for the business, also saving operational time.
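A minimal sketch of that closed-loop idea (all names and thresholds here are invented for illustration): a check reads an observed metric and decides whether a remediation action is needed, which is exactly the decision the on-call operator would otherwise make by hand.

```python
# Hypothetical closed-loop check: in practice the metric source would be a
# telemetry pipeline and the remediation a workflow-engine call.
CPU_THRESHOLD = 90.0

def remediation_needed(observed_metrics):
    """Return the devices whose CPU utilisation is above the threshold."""
    return [d for d, cpu in observed_metrics.items() if cpu > CPU_THRESHOLD]

# Fake observed state, as a telemetry pipeline might deliver it
observed = {"edge-router-1": 97.5, "edge-router-2": 41.0}

for device in remediation_needed(observed):
    # In a real system this would trigger a runbook or workflow engine
    print(f"remediating {device}")
```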
So you are reducing the impact of any problems that may happen in the network. Now we want to share with you a typical network automation framework. This should not be that different from any other automation framework. The difference here is in the network infrastructure, infrastructure
that today is composed of physical boxes, virtualized applications in containers, and also services, so you can consume networking services from cloud providers. There is a big variety of things, but the network infrastructure has some particularities that make it different.
Most of the time, when you provide a service in the networking world, it's an end-to-end service. It involves a lot of different devices that have to share state; they have to reach a consensus via the protocol just to transport packets from one side to the other. And in some of these cases, you are not the owner of all the boxes.
Imagine moving a packet from one side of the internet to the other. This works because, underneath, there is an IP layer that has converged so the packet can move from one part to the other. This adds some complexity. Here we also specifically want to highlight telemetry and analytics, because there are big changes in this area. Traditionally, in networking, we have been using SNMP,
but today we are converging more towards putting the metrics we get from the network together with those from the servers, which makes a big difference. All this information flows via workflows: understanding what your intent is, the state of the network, how you would like the network to be from the source of truth,
and putting everything together into an automation engine that you can understand as a translator, moving from your intent to the actual change, will let you operate the network in an automated way. We are going to use this framework to reference the different functions that it makes sense to take into account. Let's start with the network infrastructure that, as we said, is a really diverse ecosystem.
But let's understand it from the point of view of how we interface with the devices. Traditionally, as I mentioned before, we have had the CLI to change the state of the network, so you interact via a command line interface. You also have had, and still have, SNMP, the Simple Network Management Protocol,
to get data from the network. This has been there for decades, and it is still, in most cases, the primary way to interact with the network. But because of the revolution that we have seen in this area, there are new interfaces. There are new data-model-driven interfaces like NETCONF, and RESTCONF, which is the same idea via a REST API,
and gNMI, which is promoted by OpenConfig, while NETCONF comes from the IETF. So in the end there are different standards bodies and consortia promoting different ways to interact with network devices, especially to make these operations more suitable for automation. Because, as I mentioned before, the CLI is not a structured way to interact with a device.
But because this ecosystem is really huge today, we don't only have the traditional network devices that support these interfaces. We also have custom APIs. Imagine a network service from AWS or Azure, from a cloud provider:
you ask for a virtual network, and this is now defined via an API. But in the end, if you want to provide a complete network solution, your network in the cloud has to connect to your network in the data center, and this needs some connectivity, some protocols that have to run between them. Also, with the rise of Linux-based network operating systems, we now have access to the Linux shell.
This was something that was not possible before. Network operating systems didn't provide this kind of access, but now they are starting to give you the same feeling that you have with servers. But the reality, from this survey from 2020 run by Damien, is that
the first interface used to interact with the network for automation was still the CLI, not NETCONF. So there is a huge amount of work to do to make the interaction with network devices smarter, because otherwise you have a lot of things to solve around it.
Another important topic, not for the actual management of the network but for building really good development environments: as you know, the software development lifecycle comes with a testing attitude, this idea of testing everything before moving into production.
We want that for how we manage the network too, and testing in production is sometimes not the best. So there have been changes and new technologies that have helped us simulate, to create labs in order to understand how to manage the network. From VirtualBox with Vagrant, moving to containers and Docker,
there are solutions like Containerlab or vrnetlab that provide emulation of network devices in Docker. Boxen, for instance, is a new project that makes it possible to translate network operating system images into Docker images. Another solution, netsim-tools, helps you abstract environments that you can then run via Vagrant or Containerlab.
But all these solutions are focused on labs: single-node solutions that you can run to simulate your small network and check some things. But what if you have to run a really big network that you want to test before production?
Google had this problem, and Google has promoted an open source project, the Kubernetes Network Emulator, that helps them run a big, scaled network across a cluster to test these kinds of things. In the end, the idea is to be able to test things before moving into production, focusing on management, not on performance, for sure.
Another key element of this architecture is the source of truth, where we define the intended state of the network. You have to understand that the network is running and there is a state, but maybe it's not the state that you want. We need some place to serve as the reference where we define the intended state, what we want the network to look like.
There are a lot of challenges here. It looks like a simple thing; maybe you can do it in a table or in a file. But the challenge we have to face nowadays is data modeling: how a model that we think is abstract enough ends up mapping well to the different vendors,
the different services, the different interfaces. Maybe one interface for one vendor is different from the model they expose via gNMI or via the CLI. These differences make it really hard to approach this in a way that lets you abstract the data model.
Then you have to do data validation, because the data defines your intended state: if the data is wrong, your intended state will be wrong. Because this data is key to the process, we have to trace it; we have to understand who changed what, and why. And changes should be atomic, because as a service we may have to touch multiple places in the network at the same time.
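As a minimal illustration of that validation step (the intent structure below is invented), the Python standard library's `ipaddress` module can already catch a bad IP in an intent file before it ever reaches a device:

```python
import ipaddress

def validate_intent(intent):
    """Return a list of validation errors for a toy per-interface intent."""
    errors = []
    for iface, data in intent.items():
        try:
            ipaddress.ip_interface(data["address"])
        except ValueError:
            errors.append(f"{iface}: invalid address {data['address']!r}")
    return errors

# Toy intended state: one good entry, one typo
intent = {
    "Ethernet1": {"address": "10.0.0.1/31"},
    "Ethernet2": {"address": "10.0.0.299/31"},  # invalid host octet
}

print(validate_intent(intent))
```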
So if an operation fails on one node of the network, maybe we have to roll back everything in order to keep the state as expected. And last but not least is the challenge of data source aggregation, because you may have multiple sources of data.
You may have IP address management in one place, IT service management in another place, and data center infrastructure management in yet another place. But in the end, all your network automation is going to consume this from one point, so you have to make it fit together. Solutions that promote a single source of truth, so that all your automation has a
well-defined system of record with authority over that data, make a lot of sense in this strategy. In the end, we are going to consume this data via different interfaces, maybe a REST API, or even better GraphQL, which makes it easier to combine the different data that
you need, avoid getting more data than you need, and do it all in only one request. There are a lot of projects in this area, but we are going to highlight the most important ones. As you can guess, Git, to track and control the files that define your state, maybe YAML, maybe JSON, is an option that you may take into account.
But then there is some data that has to be related, so you can also think about different relational databases. In front of the databases, you can build different interfaces. A project that has had a lot of traction for many years is NIPAP for IP address management, focused only on this specific part of the problem.
There are other projects. Started in 2016, NetBox, built on top of the Django Python framework, offers a complete solution for the source of truth. The idea, the vision of NetBox was to put all the information of the source of truth together.
From NetBox there is also an interesting project started last year, called Nautobot, which is a fork of NetBox. This project tries to go a step further: apart from being a source of truth, it wants to promote a whole network automation framework where you can build applications on top of it.
For example, for the challenge of the single source of truth: how we could add new integrations, new plugins that take information from different systems. Nautobot has support for different applications, and one of them is to promote the single source of truth. Another interesting one here, apart from the traditional databases, is Dolt, which is like MySQL and Git combined.
They define themselves as Git for databases, so you get version tracking on the tables. It's another project that helps us take the source of truth to the next level. Last but not least, this is not actually a project, it's the way we can define the models.
Data models are a really challenging topic in networking, and to make you aware of that, I want to ask you something: take 20 seconds to search your browser for YANG models on GitHub, and you will see how many models are available.
You will see that the number of models from vendors, from different standardization organizations, and from consortia like OpenConfig is massive. For the same thing, for instance interfaces, you will find a model from the IETF and a model from OpenConfig,
and they are not compatible, because even though they use the same data modeling language, YANG, the implementation is different. So there is a really big challenge, and the challenge keeps increasing, because every version of the same network device model will come with different versions of the models.
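To make the incompatibility concrete, here is a hedged sketch (both payload shapes are simplified imitations, not the real IETF or OpenConfig schemas) of the same interface described two ways, and the kind of normalization shim automation code ends up writing:

```python
# Simplified, illustrative shapes only -- real YANG-derived payloads are
# much richer; the point is that the structures differ for the same data.
ietf_style = {"interface": [{"name": "eth0", "enabled": True}]}
openconfig_style = {"interfaces": {"interface": [
    {"name": "eth0", "config": {"enabled": True}},
]}}

def normalize(payload):
    """Collapse either toy shape into one internal representation."""
    if "interfaces" in payload:                      # OpenConfig-like
        items = payload["interfaces"]["interface"]
        return {i["name"]: i["config"]["enabled"] for i in items}
    items = payload["interface"]                     # IETF-like
    return {i["name"]: i["enabled"] for i in items}

assert normalize(ietf_style) == normalize(openconfig_style)
```

Every vendor quirk and model version adds another branch to a function like this, which is why the model sprawl is such a practical problem.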
The next stop in our framework journey is telemetry and analytics. This is something that you are surely familiar with, so we are going to talk about the specificities on the networking side.
The goal in this area is to collect, enrich, and store the network's observed state. What is the challenge on the networking side? The correlation of multiple data types and interfaces. What does that mean? We are going to get SNMP, the traditional interface for getting data, but some data is not available via SNMP,
so you need to go via the CLI and do a lot of screen scraping. You also have to take flows, network flows, to understand how traffic moves from one side of the network to the other. You also have logs. And an important change in the ecosystem for network telemetry is the new streaming telemetry:
gNMI and NETCONF notifications. The idea is that you move from the traditional pull to push. You subscribe to a data model so that any change in the data just sends you an update. This makes everything faster: you get data as soon as it changes, so you don't have to rely on periodic or regular checks.
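A tiny sketch of that pull-to-push shift (invented names; real streaming telemetry would come from a gNMI subscription, not an in-process callback): instead of polling on a timer, the consumer registers a callback and is notified only when a value changes.

```python
# Toy push-style telemetry: subscribers are notified on change only,
# instead of polling the device on a fixed interval.
class MetricStream:
    def __init__(self):
        self._subscribers = []
        self._last = {}

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, path, value):
        if self._last.get(path) != value:      # on-change updates only
            self._last[path] = value
            for cb in self._subscribers:
                cb(path, value)

updates = []
stream = MetricStream()
stream.subscribe(lambda path, value: updates.append((path, value)))
stream.publish("/interfaces/eth0/oper-status", "up")
stream.publish("/interfaces/eth0/oper-status", "up")    # duplicate, suppressed
stream.publish("/interfaces/eth0/oper-status", "down")
print(updates)
```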
You get data as soon as it happens, and you can reach conclusions and decisions quicker than before. Also, at this point, because we have a really rich source of truth, the idea is that your metrics, all the info and logs, all the information that you are collecting from the network,
apart from the information that comes by itself, will be extended and enriched with business logic. You can add metadata to your metrics to make them more usable for the upper layers, for visualization and analysis, and also to make them converge with the rest of the metrics of the IT ecosystem.
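As a minimal sketch of that enrichment step (the tag names and the source-of-truth lookup are invented for illustration), a collector can attach business metadata from the source of truth to each raw metric sample before storing it:

```python
# Invented source-of-truth lookup: device name -> business metadata
source_of_truth = {
    "edge-router-1": {"site": "ams01", "role": "edge", "tenant": "payments"},
}

def enrich(metric, sot):
    """Merge source-of-truth tags into a raw metric sample."""
    tags = sot.get(metric["device"], {})
    return {**metric, **tags}

raw = {"device": "edge-router-1", "name": "ifHCInOctets", "value": 123456}
print(enrich(raw, source_of_truth))
```

With tags like `site` or `tenant` attached, the upper layers can slice network metrics alongside application metrics instead of treating them as an isolated silo.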
So your metrics will sit side by side with the system and application metrics, and putting it all together, you will obviously make more educated decisions. Related projects in this area: there are the traditional network monitoring tools.
You have specific libraries to interact with an interface, gNMI or NETCONF, using Python or other libraries that we are going to see later, to collect information in a way that gives you a structured response. That is how you collect information, but where does this actually happen? It happens in the collectors. Here we have quite common collectors for systems, like Fluentd and Logstash.
We also have specific collectors for networking: pmacct is a well-known open source project focused on collecting flows, collecting all that information coming from the network. But we want to give a quick mention to Telegraf,
because Telegraf, in the networking area, is not always used as an agent, because, as I mentioned before, a lot of networking gear doesn't support this kind of agent running on the device. So we have to run it outside the network, and from there
we connect to the network via the different libraries I mentioned before, and the information you get will, in the end, be stored in different time series databases. And these are the same databases where you store the other metrics that are relevant for your application as a whole. So we put everything together, and the metrics in this area
are consolidated into a point where the network metrics can make a real impact alongside the other metrics that you already have from your applications. The next specific point in networking is the automation engine, understood as the component that is going to connect to the devices and make the changes.
The focus, as you may guess, is interacting with the network, rendering configurations, and deploying configurations. And here the reality is that there are more scripts than actual applications. We are solving a lot of small tasks, a lot of small specific problems, but not in the way you would with an application
built for network automation with full capabilities for handling different errors. We are not at that stage yet in most cases. Obviously there are use cases where you have proper applications for network automation, but in most cases the network automation part is not robust enough yet, though we are progressing.
Why is this important? Because in this area, network complexity makes a difference. It's very difficult to automate a network that was designed 20 years ago, where every device was completely different from the others. Something that you need in order to automate, and that you will be familiar with, is consistency: you have to have something replicable, patterns that you can reproduce.
So it's really important that the network architect, the one that designs the network, works together with the network automation, because if the network is simple, the automation will be simpler and more effective. Something to consider here is the heterogeneous environments, because there is no longer only data center or campus networks.
There is also SD1s, we have the cloud services that we have to interact, each one of them with different interfaces, and the challenge here is that everything has to go and to run smoothly together. We cannot think on a network that runs in the data center and a network that runs in the cloud,
because your application will be distributed, your business will depend on the connectivity of both, and this is based on running protocols, for instance, BGP, that will run end-to-end. Last but not least on networking, it's important to understand the blast radius impact.
Something really different from working on an application: if you mess something up in an application, that application is down. It is one application. But when your network is down, think about how many applications are impacted by that change. So it is always difficult to roll out changes in the network, and there is a real need to simulate the network, which is hard,
because not all vendor solutions can be properly modeled to produce a simulation, but this effort to simulate things as much as possible should be there, as it is for other use cases.
For the network, the impact of your changes is something you always carry with you. Everyone has the experience of bringing the network down, and it is really not a happy experience, for sure. What are the related projects that are interesting in this area? First, Jinja, and this brings me to the point that in network automation
the most popular coding language today is Python. Go is getting some traction, for specific performance improvements and other characteristics it brings to the table, but also because the whole OpenConfig and gNMI ecosystem promoted by Google is built around Go.
So there is this trend that Python is, let's say, the most popular one, and Jinja is the library we use in Python for templating: transforming the data coming from the source of truth, via templates that define how this data should look, into what we push to the different interfaces.
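To make that translation layer concrete, here is a minimal sketch: structured data from a source of truth rendered through a Jinja2 template into device configuration. The data model and template syntax are invented for the example (not any real vendor's schema), and it assumes the `jinja2` package is installed.

```python
# Minimal sketch of the source-of-truth -> template -> configuration flow.
# The data model and template are illustrative, not from any real vendor.
from jinja2 import Template

TEMPLATE = """\
{% for intf in interfaces -%}
interface {{ intf.name }}
 description {{ intf.description }}
 ip address {{ intf.ip }} {{ intf.mask }}
{% endfor %}"""

source_of_truth = {
    "interfaces": [
        {"name": "GigabitEthernet0/1", "description": "uplink to core",
         "ip": "10.0.0.1", "mask": "255.255.255.252"},
    ]
}

# Render the intent (data) into concrete configuration text.
config = Template(TEMPLATE).render(**source_of_truth)
print(config)
```

The same data could be rendered through a different template per vendor, which is exactly why keeping data and templates separate matters.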
Jinja is this translation layer. Then there is how to push this configuration, how to apply all of it. We can use well-known config management solutions: Ansible, for instance, has a lot of network modules that help here. But there are also specific libraries, in this case Python libraries, and Go clients, that help you interact with network devices:
Netmiko, NAPALM, ncclient for NETCONF. These libraries let you interact with different devices through what looks, to you, like a consistent interface, even though under the hood you still have to supply the right configuration for each device.
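To illustrate the "consistent interface" idea, here is a small sketch written against NAPALM's `get_interfaces()` getter. The fake driver below is a stand-in for a real connection (which you would open via `napalm.get_network_driver(...)`); the point is that the same function works unchanged against any platform NAPALM supports, because the getter returns a normalized structure.

```python
# Sketch: code written against NAPALM's normalized getters is
# platform-independent. FakeDriver stands in for a real device session.

def report_down_interfaces(device):
    """Return names of interfaces that are enabled but not operationally up."""
    interfaces = device.get_interfaces()  # NAPALM getter, same shape on every vendor
    return [name for name, attrs in interfaces.items()
            if attrs["is_enabled"] and not attrs["is_up"]]

class FakeDriver:
    """Stand-in for a NAPALM driver, for illustration only."""
    def get_interfaces(self):
        return {
            "Ethernet1": {"is_up": True, "is_enabled": True},
            "Ethernet2": {"is_up": False, "is_enabled": True},
            "Ethernet3": {"is_up": False, "is_enabled": False},
        }

print(report_down_interfaces(FakeDriver()))  # only Ethernet2 qualifies
```

Swapping `FakeDriver` for a real EOS, IOS, or Junos driver would not change `report_down_interfaces` at all.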
That need to produce device-specific configuration is precisely the reason to have Jinja. On top of these libraries there are multiple projects, like Scrapli or hier_config, that help you build specific solutions around this. Other projects, like Nornir, focus on improving, for instance, how you handle the inventory of the devices you connect to,
or performance: how many sessions you can run in parallel. NAPALM, for instance, works on a single session at a time; if you want concurrency across many devices, you use something like Nornir. Then we have another kind of project: Batfish, Capirca, pyATS.
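Returning to the concurrency point: driving one device session at a time is slow once you have hundreds of devices, and parallel execution over an inventory is exactly what Nornir packages up. This stdlib-only sketch shows the underlying idea; `collect_facts` is a placeholder for a real per-device task (a NAPALM or Netmiko call), and the hostnames are invented.

```python
# Why concurrency matters at scale: run one task across an inventory
# in parallel instead of device by device. This is the idea Nornir
# provides out of the box, sketched here with concurrent.futures.
from concurrent.futures import ThreadPoolExecutor

inventory = ["edge-router-1", "edge-router-2", "core-switch-1"]  # hypothetical hosts

def collect_facts(host):
    # In real code this would open a NAPALM/Netmiko session to `host`
    # and gather facts; here it just returns a stub result.
    return {"host": host, "status": "ok"}

with ThreadPoolExecutor(max_workers=10) as pool:
    # map() runs the task concurrently but returns results in input order.
    results = list(pool.map(collect_facts, inventory))

print([r["host"] for r in results])
```

Nornir adds what this sketch lacks: structured inventory with per-host data, result aggregation, and failure handling per device.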
Batfish, Capirca, and pyATS focus on translating, testing, homogenizing, and modeling. From a configuration, Batfish derives a model, and from that model you can reason about the network: that is the point of Batfish. With Capirca you can generate vendor ACLs from a common policy definition. pyATS is another testing framework, promoted by Cisco and open source; the idea is that you get the same flavor as testing in Python, but against networks:
you connect to a network device, you get data, you make your checks, and at the end you either pass or fail. And don't forget that we may also need
other providers, like Terraform, because in your network strategy part of the provisioning may not be traditional network devices at all: you may also have to connect to the cloud and provision, say, a global router. So it is a complex and diverse ecosystem to take into account.
We also want to give you an example of how all these pieces work together, using this framework as a reference. Here you have the framework I mentioned before. I have not yet covered user interaction, how a user can interact with this ecosystem; we are going to see an example of that here. The orchestration part is simply how you manage a workflow, and there are a lot of options for running it.
In this example we are going to talk about firewall automation. Imagine that we want to give a user the option to change the network state with a new firewall rule. Traditionally this has been a ticket that goes to the security team, which has to analyze it; maybe it takes a week.
So why can't we automate this process? Maybe through chat applications like Mattermost, which receive a well-structured request: we define the data, and we validate that data. This data can be taken by Nautobot, for instance, as a source of truth, which also has plugins for ChatOps integrations.
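A sketch of what "well-structured and validated" can mean in practice: instead of a free-text ticket, the firewall-rule request is structured data that is checked before it ever reaches the source of truth. The field names and rules here are illustrative assumptions, not Nautobot's actual schema.

```python
# Sketch: normalize and validate a firewall-rule request before it
# enters the automation pipeline. Field names are hypothetical.
from dataclasses import dataclass
import ipaddress

@dataclass
class FirewallRuleRequest:
    source: str
    destination: str
    port: int
    protocol: str

def validate(raw: dict) -> FirewallRuleRequest:
    port = int(raw["port"])  # if we expect an integer, it must be an integer
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    protocol = raw["protocol"].lower()
    if protocol not in {"tcp", "udp"}:
        raise ValueError(f"unsupported protocol: {protocol}")
    # ipaddress raises ValueError on malformed prefixes
    ipaddress.ip_network(raw["source"])
    ipaddress.ip_network(raw["destination"])
    return FirewallRuleRequest(raw["source"], raw["destination"], port, protocol)

req = validate({"source": "10.0.0.0/24", "destination": "10.1.0.5/32",
                "port": "443", "protocol": "TCP"})
print(req)
```

In a real pipeline a library like Pydantic or a JSON Schema would do this job, but the principle is the same: reject malformed intent as early as possible, at the chat interface.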
We take this data, inject it into the source of truth, and then run an orchestrator, AWX. AWX will call Ansible to generate the desired configuration: from this request, what is the rule I have to change? The templates for these rules may come from Git, another part of the source of truth,
where someone else, maybe a firewall engineer, has defined them. Ansible computes this information and we get the configuration. As a next step, we can run Batfish to understand the actual impact of this configuration: am I breaking some previous rule? Is this rule something we can accept?
In the orchestration we check that the rule looks okay, is not breaking anything, and follows the pre-approved patterns. If that is the case, we go back to Ansible, connect to the different devices of the network, and make this firewall rule happen,
maybe in some firewalls, maybe in some routers, because, as I mentioned before, the network is an end-to-end service, and to open a port you may have to touch multiple parts of the routing path. To close this session, we are going to give you some,
let's say, hints on how to adopt network automation. First, prioritize. As I said before, there are a lot of different use cases. Pick something small, with low risk, where you can still get a lot of value, so you can demonstrate what the automation is worth. Second, you have to understand how the process works,
because today network automation is about automating tasks, so you need to understand how you actually work, in detail, step by step. Once you have mapped this path, try to simplify it as much as possible, because a human has probably complicated it more than needed, and then normalize it,
because we can no longer accept a request in free text. We need something well-structured, so that if we are looking for an integer, it is an integer, and all this information is well-defined. Finally, use it, in an iterative way, and learn from the experience. Maybe this first step
doesn't change anything on the network; it just gathers information and helps you take educated decisions. This is an incremental process where you will gain confidence, and everything will get better, for sure. Takeaways from this session. First, understand your workflow. You have to understand how you work
in order to define the intent: how the information you provide, the data, transforms through the steps of this workflow into the actual output. Then be aware that there is no single turnkey solution, not from a vendor and not in open source either.
You have to integrate multiple solutions; this is the reality, and you have no other option to make this happen. Finally, network automation is no longer the exception. A lot of companies are already working with network automation: big companies, small companies, ISPs, enterprises;
everyone, at different levels, is starting this journey, so it's not too late to jump in. What's next? What can we expect in the years to come? We still expect modeling complexity, because there are still different silos.
People have different opinions about how a model should look, so focus on abstracting the model in a way that is usable for you, without overcomplicating it. Try to define the model
in a way that is reusable across multiple platforms and solves your problem; don't chase modeling completeness to the last detail. We also expect to see a change in the source of truth. The source of truth we have today, which started some 10 years ago,
is based basically on Git and relational databases, but we expect to go a step further: to change the way the database handles versioning, with solutions like Dolt that add a layer of traceability on changes. Why is this important? Because if you change something in your database
and the automation works properly, it will be deployed to your network: you are changing the intent. Maybe a change that goes to the database should go through an approval process. You can obviously implement this at different layers, but having the capability in the database itself makes sense. We also expect to see different ways
to interact with the interface of the source of truth, how you define the state, maybe through a declarative interaction. And with the maturity of the industry we obviously expect to start seeing proper applications rather than scripts, something more robust in this area. Continuous deployment has always been a challenge.
In my experience, coming from the software development industry, I always push for automatic deployment of changes to the network, but as I mentioned before, the danger of breaking the network, the blast radius, always has to be taken into account.
In a lot of cases small snippet changes are not a problem, but if you have to replace a full configuration where the difference between one version and the other is thousands of lines, you will rely on manual judgment. We expect that in the coming years, with more intelligence in the decisions we take,
the last step, the deployment, can become more sophisticated, and we will see more and more continuous deployment. So when you change something in the intent of the network, maybe your data, maybe your templates, that change will go through a well-tested and robust testing environment
and be fully automated and deployed to the network. We also see a trend driven by the fact that DevOps and operations engineers are used to deploying applications and systems using YAML. We are YAML engineers these days,
so we expect a trend toward the same style we use in Kubernetes, with operators that define how we would like the network to be. This trend is still in its first steps, but we think it is going to gain traction, because people are used to working with a YAML definition
of the state of their systems. And last, as I mentioned before, all these decisions today are pretty limited in terms of the intelligence we can put into them, because you are doing nothing more
than coding your manual human decision into a script. The next step is to add intelligence, to move a step higher by using artificial intelligence and machine learning to automate not just tasks but decisions. The decision itself will not be predefined: given enough data, even imperfect data, the system can
propose decisions. Maybe it will not make the final decision, but we will get advice that lets us take decisions we otherwise would not. Why is this important in networking? Because a lot of data is now becoming available from telemetry. We no longer have only SNMP or the CLI. You are going to start
getting all the telemetry information in real time, and these decisions will save time and, at the end, reduce business impact. So thank you very much for your attention, and hopefully I will be able to answer your questions. Thanks.
What do we do now? There are no questions so far. But just let me.
Ah, we are live. So Chris, that was an impressive talk, very tightly packed with information.
Now, to all listeners, please send your questions in the chat room.
So it looks like there are no questions. No, I see a few people asking questions now.
I have myself also just posted one question. So is there anything you could suggest to a small-time home sysadmin, some useful first steps to learn the way of network automation?
Say you have a router with OpenWrt, a couple of laptops, and maybe a small home lab server. I think the point, in the end, is the same at any scale: it doesn't matter if it's your home network or a really big cloud-scale network,
everything has to start with a purpose. You have to understand what pain point you want to solve. In some cases it could be something really simple, like allocating a VLAN; in other cases it will be emulating a complete BGP route policy setup. One way or another, you have to define what you want.
Then, obviously, you need some basic skills. But to be honest, the entry barrier to this area has become pretty low in recent years. There is plenty of tooling, for instance config management with Ansible, where you have modules for almost everything. I'm not sure if there is a module
for OpenWrt, but I think so, and it probably supports multiple interfaces. So you will hopefully have a good interface to interact with, maybe using Ansible. And then you have to understand what you want to change.
And this change you have to define, as I mentioned in the presentation, as intent. So: what do you want to change?