
Barometer: Taking the pressure off of assurance and resource contention scenarios for NFVI

Formal Metadata

Title
Barometer: Taking the pressure off of assurance and resource contention scenarios for NFVI
Author
Emma Foley (Intel)
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Transcript: English (auto-generated)
So the next speaker up is Emma Foley from Intel, who'll be presenting on Barometer, which is an OPNFV project. Hi folks, I'm going to be presenting on Barometer, which is an OPNFV project.
So instead of actually just telling you a lot of information, I'm going to answer a lot of questions, and at the end you can ask some questions as well. So first up is, what is Service Assurance and why do we need it?
So basically, as we become more and more reliant on the Internet, data centers have played a bigger and bigger part in our lives. And as we move from traditional network deployments, so fixed-function network appliances, to NFV, data centers have become more and more important.
So as telcos and enterprises make this transition, we end up with a lot of tooling and a lot of infrastructure that's becoming more and more complicated. And because industries are going to have to meet or exceed the expectations that customers have for Service Assurance, QoS and SLAs,
they're going to need additional tooling and additional metrics available to actually monitor their systems for malfunctions and misbehaviors that can cause downtime.
Unfortunately, existing solutions may not actually be enough here because as the tooling gets more complicated, you need to be able to monitor not only the platform, but also software applications as well,
and relay these metrics to management and analytics engines that will manage your virtualized infrastructure. So this is where CollectD comes in initially, and I know CollectD has been around for a very long time. However, this is good because it is widely deployed and the industries that are moving across to NFV,
it's a tool that they're likely already using which will help ease the adoption and ease the transition into NFV. So a bit about CollectD first is it's got a plugin-based architecture which makes it really flexible and really configurable.
And these plugins come in a few different types. Read plugins actually access the metrics from your system. Write plugins relay these metrics up to higher-level analytics engines. And notification plugins, which would be equivalent to producing events from your system.
Logging plugins, which are pretty self-explanatory. And also a set of binding plugins, so you're not limited to actually writing these CollectD plugins in C; you can extend it using Perl, Java, or Python if you want to.
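(Not from the talk: a minimal sketch of what a read plugin looks like through CollectD's Python binding, assuming collectd.conf loads the python plugin and imports this module. The plugin and metric names here are made up for illustration.)

```python
# Minimal sketch of a collectd read plugin via the Python binding.
# Assumes collectd.conf loads the "python" plugin and imports this module.
import collectd

def read_callback():
    # A real plugin would read something from the platform here
    # (e.g. a counter from sysfs); this just dispatches a dummy gauge.
    val = collectd.Values(type='gauge',
                          plugin='example_plugin',
                          type_instance='dummy_metric')
    val.dispatch(values=[42.0])

collectd.register_read(read_callback)
```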
CollectD sounds great; however, there are some gaps, and this is where Barometer comes in. First of all, a barometer is an instrument for measuring atmospheric pressure. It's also a project in OPNFV. And for those of you that missed the last session, OPNFV develops and improves NFV features in upstream ecosystems,
and also provides integration, testing, and installation to produce a reference platform for NFV, which is designed to facilitate the adoption of NFV.
Barometer is one of these projects, and it is concerned with feature development, primarily in CollectD, to cover the gaps that we've found in that and make it more suitable for NFV deployments.
We've produced a lot of plugins to help monitor the platform and make more data available. So not only can you monitor generic compute, networking and storage, you can also get more in-depth details from your platform.
These are metrics that were already available on Intel platforms but are now exposed through CollectD, and also metrics from applications like DPDK and OVS, which would not be relevant in traditional deployments; however, they are very, very relevant as we move towards NFV.
So once those metrics are available in CollectD, they're pretty much useless unless you can actually talk to your management and orchestration and analytics engines and interact with components such as OpenStack, ONAP, Kubernetes, and so on. So along with the read plugins, we've also produced a bunch of write plugins
to talk to OpenStack via Gnocchi and send notifications to OpenStack through Aodh. We've demonstrated how you can integrate CollectD with cAdvisor, relay all your metrics to Prometheus
and actually use that platform data and application data in Kubernetes, and produced some plugins so we can relay the metrics up to ONAP. As well as that, we've done some work on sending these metrics via SNMP so that legacy systems can actually use the metrics.
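(Again not from the talk: a minimal sketch of the write side using the Python binding. A real write plugin would forward the values to Gnocchi, VES, Prometheus or similar; this one only logs them.)

```python
# Minimal sketch of a collectd write plugin via the Python binding.
# A real write plugin would forward values to an external system;
# here we just format each dispatch and send it to the collectd log.
import collectd

def write_callback(vl, data=None):
    # vl is a collectd.Values object: one dispatch from a read plugin.
    ident = '/'.join(filter(None, [vl.host, vl.plugin, vl.plugin_instance,
                                   vl.type, vl.type_instance]))
    for value in vl.values:
        collectd.info('forwarding %s=%s @ %s' % (ident, value, vl.time))

collectd.register_write(write_callback)
```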
Again, this is to help ease the adoption so you don't actually have to change your whole toolchain to use NFV. These are supposed to be pretty quick slides, more details on our read plugins. So DPDK stats, vSwitch stats, huge pages stats, cache monitoring, additional memory.
Again, libvirt is one here, so you can actually monitor your workloads running on virtual machines without installing CollectD on the VMs themselves, which means that you're not interfering with black-box commercial VNFs
and you still get the same level of metrics as you would have if you had more control over your VNFs. Again, write plugins: SNMP, Gnocchi and VES. As well as feature development in CollectD,
Barometer has worked on standardization, making sure that the metrics produced are actually compliant with open standards for metrics collection, so that, again, if you have other tools you don't have to spend a lot of time writing normalization or translation plugins, and you can interact and interoperate with different applications. We've also provided installer integration, because Barometer and CollectD wouldn't be much use in OPNFV if you couldn't actually install them. So at the moment we have support for Fuel, Compass and Apex, as well as Kolla-Ansible in OpenStack
and, if you're interested, technically also DevStack support. During the last cycle there was a lot of work done producing a reference container, so if you want to get started with Barometer and CollectD you can pull down a Docker image from the OPNFV Docker Hub
and start using it and this will include all the Barometer features that have been upstreamed. We're working on installer support for that reference container so that we'll always have the latest and greatest version of CollectD actually on the system just by installing that container.
That brings me to a demo. It's a bit of a work in progress at the moment to automate the configuration and deployment of CollectD using Ansible. So what I'm going to show is installing CollectD on the compute nodes
from your master node or your controller node, configuring them, deploying CollectD, and then on your master node aggregating your metrics to that one point and storing and displaying them.
So first of all our Ansible script is going to create CollectD configurations on our compute nodes.
This is a short demo, it's about four minutes and it hasn't been sped up. I don't think you can read it anyway.
So what's happening here is, on our master node, we're using Ansible to first of all configure CollectD on our compute nodes. What it does is, for each Barometer plugin, it checks if the requirements are met and then enables and configures the appropriate plugins.
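(An illustrative sketch only, not the actual Barometer Ansible role: the general idea of checking a plugin's prerequisites on a node before enabling and configuring it. The requirement paths and config directory are assumptions.)

```python
# Illustrative sketch: enable a collectd plugin only if its prerequisites
# appear to exist on the host, then write a minimal config snippet for it.
import os

# Hypothetical mapping: plugin name -> a path whose presence suggests support.
PLUGIN_REQUIREMENTS = {
    'hugepages': '/sys/kernel/mm/hugepages',
    'ovs_stats': '/var/run/openvswitch/db.sock',
}

def enabled_plugins(requirements=PLUGIN_REQUIREMENTS):
    """Return the plugins whose requirements are met on this node."""
    return [name for name, path in requirements.items() if os.path.exists(path)]

def write_config(conf_dir='/etc/collectd/collectd.conf.d'):
    for name in enabled_plugins():
        with open(os.path.join(conf_dir, '%s.conf' % name), 'w') as f:
            f.write('LoadPlugin %s\n' % name)

if __name__ == '__main__':
    print('Would enable:', enabled_plugins())
```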
So now we're done configuring on four nodes. Just going to check that those configurations exist. As well as enabling the read plugins, this is also configuring the compute nodes
to send the metrics back to our master node. Now we're going to actually deploy the container. I'll first check that there is actually no container running in case anyone had doubts. So that's CollectD deployed on four different nodes.
I'm going to check that it's running. So next up we have to set up storage using InfluxDB
and also we want to set up Grafana so that we can see the metrics that are actually produced in a nice visual dashboard. I'm having trouble reading this from here, so those at the back of the room, don't worry.
So we're using Docker Compose to set up those two containers, Influx and Grafana. Not only does it actually deploy Grafana but it also sets up a load of preconfigured dashboards so you don't have to spend hours going through the metrics that are available and picking what to put on your graphs.
Just want to add that this hasn't been sped up and we're about two and a half minutes in. So as you can see there's a lot of metrics coming in
and we can see what's going on on various different nodes.
What we're seeing is just a compute usage per node. You can get a cumulative aggregated version or you can see per CPU metrics as well.
In order to show you that there's actually something happening we're just going to stress the CPU so you can see how the metrics do change and how quickly they're collected and updated. So we can see that activity that we just kicked off.
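(For anyone who wants to poke at the same data outside Grafana, here is a sketch of querying the collectd metrics stored in InfluxDB with the Python client. The host name, database name and measurement name are assumptions that depend on how the demo containers were configured.)

```python
# Sketch: query CPU metrics that collectd has written into InfluxDB, assuming
# the "influxdb" Python client is installed and InfluxDB keeps collectd data
# in a database called "collectd" with a "cpu_value" measurement (names vary
# by setup; adjust host, database and measurement to match your deployment).
from influxdb import InfluxDBClient

client = InfluxDBClient(host='master-node', port=8086, database='collectd')
result = client.query(
    "SELECT mean(value) FROM cpu_value "
    "WHERE time > now() - 5m GROUP BY host, type_instance")
for (_, tags), points in result.items():
    for point in points:
        print(tags.get('host'), tags.get('type_instance'), point['mean'])
```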
So that was a four-minute demo on how to set up Barometer. I think that's the first time we've actually shown Barometer being deployed, although it's not the first time we've shown Barometer in action. Whether you knew it or not, in all those demos that have been showcased at OpenStack and OPNFV summits,
anything to do with metrics collection, with Docker, with Retrage, with OpenStack Watcher, what they were doing underneath was collecting metrics using those Barometer features. So you can look at those later; I think the slides will be put up soon.
After that, where does Barometer go from here during our next release? More plugins, obviously more plugins. I'm not going to go through them here; there's a list on the OPNFV wiki, the Barometer wiki, of what's actually planned. However, if you have any plugins that you want to see enabled,
or if you're enabling any plugins, the Barometer team is usually happy to help with reviewing pull requests on CollectD. I'm going to do some work on CollectD cloudification. This is to address some issues, or some gaps, we saw at the start with the configurability of CollectD
and actually deploying it over multiple nodes. Namely that if you want to reconfigure CollectD, you have to restart the service. And as you might be collecting metrics at a very high frequency over a lot of nodes, this could obviously take a lot of time
but also cause a discontinuity in the metrics so you have gaps in your history, which is not ideal. So what we plan to implement is a bit of an abstraction, an API on top of it so that you can configure it on the fly, which is handy in situations where, for example, at peak times you may want to collect metrics
at a much higher frequency, or if you migrate your workloads and consolidate them into a smaller number of hosts, for example, to conserve power, you may want to increase the intervals so you're not collecting metrics as often,
or, for certain workloads and certain compute hosts, you may want to change over time which metrics are actually available. So that's part of the motivation to make it more configurable, and more dynamically configurable.
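(Purely hypothetical sketch: the planned reconfiguration API has not been defined, so this only illustrates the idea of changing a collection interval on the fly instead of rewriting collectd.conf and restarting the service. The endpoint, payload and plugin names are invented for illustration.)

```python
# Hypothetical sketch of an on-the-fly reconfiguration endpoint; not an
# existing Barometer or collectd API.
from flask import Flask, request, jsonify

app = Flask(__name__)

# In-memory stand-in for per-plugin collection intervals (seconds).
intervals = {'cpu': 10, 'ovs_stats': 10}

@app.route('/v1/interval/<plugin>', methods=['PUT'])
def set_interval(plugin):
    intervals[plugin] = float(request.json['interval'])
    # A real implementation would push this into a running collectd instance
    # instead of just recording it here.
    return jsonify({plugin: intervals[plugin]})

if __name__ == '__main__':
    app.run(port=8080)
```

With something like this, peak traffic could trigger a PUT to /v1/interval/cpu with {"interval": 1}, and consolidating workloads onto fewer hosts could raise the interval again, without any restart or gap in the metrics history.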
Of course we're always open to collaborations and would like to see more people consuming Barometer and Barometer features. Basically the goal in the next release is to enable more services to consume data and telemetry for all kinds of use cases, including orchestration, management, governance and audit, analytics and so on.
Does anybody have any questions over here? The question was what are we using as our time series database for CollectD?
CollectD supports multiple time series databases. You could use Gnocchi or you could use InfluxDB or any other database that it actually supports. You're not limited to the ones that I've outlined here.
How many data points are collected per host per second? That depends on a lot of things. CollectD has over 100 plugins available at the moment. You're only going to want to enable a subset of these plugins. Each plugin would have many different metrics available and it would also monitor or collect metrics
on many different resources at the same time. For example, for CPU you'd have utilization, free, interrupts and a bunch of other things. That would be per CPU core, per host, and that's just one plugin. The frequency at which you collect them really depends on your use case as well.
I think we've tested down to sub-milliseconds. How much overhead does that collection impose? I don't have the answer for that right now.
If you follow up, I might be able to find out or provide you some tools to find out. Any more questions? Did I speak too fast?
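(A rough back-of-the-envelope illustration of the data-rate question above; all the numbers below are made-up examples, not figures from the talk.)

```python
# Back-of-the-envelope estimate of collectd data-point rate per host.
plugins_enabled = 10        # a subset of the 100+ available plugins
metrics_per_plugin = 8      # e.g. the cpu plugin reports several states
instances_per_plugin = 16   # e.g. per CPU core
interval_s = 10.0           # collection interval in seconds

datapoints_per_host_per_s = (
    plugins_enabled * metrics_per_plugin * instances_per_plugin / interval_s)
print('~%.0f data points per host per second' % datapoints_per_host_per_s)
# ~128 data points per host per second with these example numbers; a 1 s or
# sub-second interval scales this up by a factor of ten or more.
```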
If I want to run Barometer, can I run it on hosts and containers and VMs and so on? Yes, you can, as long as there's network connectivity between them. You can relay the metrics from any host that's running CollectD
to a designated CollectD server via its network plugin. So if there are no more questions, I will turn on the light again
so that we can see everybody. Thank you very much, Emma. If there are more questions, there's the wiki and the mailing list. Thank you very much.