Keep appetite for the stats, it costs nothing

Formal Metadata

Title
Keep appetite for the stats, it costs nothing
Subtitle
Presentation of the statistics consumption model in VPP from the costless low-level design to their exploitation in userspace
Title of Series
Number of Parts
287
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
VPP (aka Vector Packet Processing) is a fast network stack running in Linux userspace. It is designed to handle packets with high performance, which makes gathering statistics efficiently a must-have. The model that has been chosen in VPP to provide up-to-date statistics is built upon shared memory and optimistic locking. The counters are updated in this shared memory at a rather low cost by the data plane and can be read at almost any time by all the consumers. We will first describe this model in more detail. The consumption of these stats may take various forms depending on the use case and the application needs. That's why we have developed different high-level components to access them: 1) A filesystem in userspace: thanks to go-fuse, we can mount a filesystem organizing statistics in folders and files, in a similar fashion to '/proc' in Linux. 2) A Prometheus agent: applied to Calico/VPP, a new dataplane, based on VPP, for Calico, the popular cloud native Kubernetes network plugin. Prometheus is integrated as a monitoring tool in order to export our statistics in the form of real-time metrics collected from targets. Metrics come from our pod interface statistics, and targets are Calico/VPP agents running on our Kubernetes cluster nodes. During the presentation, you will see a quick demo of these components.
Transcript: English(auto-generated)
Hello everyone, my name is Hedy, I am a software engineer at Cisco, and I'm glad to present to you, with Arthur, the statistics consumption model in VPP
and some of its applications. Hello everyone, I'm Arthur, I work at Cisco and I'm part of the team developing the Vector Packet Processing software, aka VPP, and we use it in different projects at Cisco. So I hope you have not eaten too much, because you'll
need some appetite for the statistics. We'll have a look at the stat segment in VPP and also at various clients for the statistics. So first, a quick intro on VPP. I think some of you are already familiar with this software, so I'm going to be quick on this one. So VPP is a fast network stack running in userspace;
it is highly optimized for performance, speed and scale, and it is designed as a graph of nodes. So we take vectors of packets as input, the packets go through different sets of nodes depending on their contents, and they can be dropped, they can be sent to another interface,
and so on. So what is the statistics segment in VPP? It is a model based on shared memory, we use vectors, and we make the assumption that we have a single writer, which is the main thread, that can add new counters to the statistics segment or that can extend
some existing counters, and that we also have multiple readers. We also make the assumption that the number of writes is really low in comparison to the number of reads, and that's why we have adopted the optimistic locking strategy. What is important in the stat segment is that reading
is really, really cheap. Actually it costs nothing, because we don't stop the data plane at all when doing the reads, we just read the shared memory. However, this means that it costs a bit
in the data plane, because we have to update the statistics counters and that may induce some cache line misses. So let's have a look at the shared memory layout. We have a main header containing the directory vector; it also contains an error vector, an epoch counter and an in-progress boolean for the optimistic locking part. The directory vector consists of a list
of statistics entries. Each entry has a given type; for example, the simplest type is just a scalar. We have the example here of the number of worker threads:
the data is simply the value of the scalar. We can also have a vector, so for example the interface TX counter is a vector indexed by threads and interfaces, and the data is simply the pointer to the array. So we have here the number of packets and the number of bytes per interface, per
thread. Then we have a symlink type. This type contains two indexes: the first one is the index of the entry in the directory vector, and then we have the interface index, which is the
index of the row in the array. Note that when doing updates in the data plane, we have the pointer to this array stored in the data plane so that we can access it directly.
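To make the layout just described more concrete, here is a minimal Python sketch of a header with a directory vector holding scalar, vector and symlink entries. It is only an illustration, not the VPP data structures: the class names, the entry names (`num_worker_threads`, `interface_tx`, `interfaces/pg0/tx`) and the `data[thread][interface]` ordering are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntryKind(Enum):
    SCALAR = 1          # data is the value itself
    COUNTER_VECTOR = 2  # data[thread][interface] -> (packets, bytes)
    SYMLINK = 3         # data is (directory index, interface index)


@dataclass
class Entry:
    name: str           # hypothetical counter name
    kind: EntryKind
    data: object


@dataclass
class StatSegment:
    # Toy stand-in for the main header of the shared memory.
    epoch: int = 0
    in_progress: bool = False
    directory: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def resolve(segment, index):
    """Return the data behind a directory entry, following a symlink if needed."""
    entry = segment.directory[index]
    if entry.kind is EntryKind.SYMLINK:
        target_index, if_index = entry.data
        target = segment.directory[target_index]
        # One (packets, bytes) pair per thread for the selected interface.
        return [per_thread[if_index] for per_thread in target.data]
    return entry.data


segment = StatSegment(directory=[
    Entry("num_worker_threads", EntryKind.SCALAR, 2),
    Entry("interface_tx", EntryKind.COUNTER_VECTOR,
          [[(10, 1000), (3, 300)],    # thread 0: interface 0, interface 1
           [(5, 500), (0, 0)]]),      # thread 1: interface 0, interface 1
    Entry("interfaces/pg0/tx", EntryKind.SYMLINK, (1, 1)),
])

print(resolve(segment, 0))  # scalar value
print(resolve(segment, 2))  # per-thread TX counters of interface 1
```

The symlink entry stores no counters of its own; it only points at a row of an existing vector, which is why resolving it costs two index lookups.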
Okay, now let's talk about the optimistic locking part. So first, taking the lock means setting the in-progress boolean; releasing the lock means unsetting it and incrementing the epoch counter. So when we do a write, that is, when we add a counter to the stat segment or we extend one, so for
example if we add a new interface, we take the lock, we update the directory vector if needed, so pointers may change, we also update the counter vectors, and then we release the lock. What it means for readers is that they need to wait until in-progress is unset, and they need to
store the epoch value, read what they want to read, and at the end of the read they need to check that the epoch has not changed and that in-progress is not set. If that is not the case, it means that there has been a write during the read, so we have to redo the read, and we
try again. We have no concurrency issues in the stat segment because adding or extending counters is done under a barrier by the main thread, and this means that the worker threads cannot update the counters in the meantime. Okay, so now let's present some different clients we
have in VPP that use the stat segment. So the simplest client we have is a simple executable that connects to the stat socket to get info about the shared memory location. It allows listing or dumping the available statistics: it takes a list of regexes as input and returns
the matching counters, so either a list or a dump of the counters depending on the command. So it is easy to launch, and it's pretty handy to have quick, raw access to the statistics. Now we have also developed
a FUSE file system in Golang thanks to the go-fuse module. In this model we have different types of nodes: we can define directory nodes and file nodes. So a directory node contains the class in our case, and it keeps track of the epoch. Then we also have file nodes,
which correspond to the counters, and in these nodes we store the counter index so that we just have to look at the given index in the directory vector instead of having to fetch the
pointer each time. So let's take an example: when you open a directory in the file system, we look at the current epoch. If the epoch has changed since the last update of this directory, we need to update all the subdirectories and the files in the subdirectories so that we are
sure that there were no additions or deletions in the meantime. If some files were added, for example, we would update the directory and add the given counter to the
directory. It would be the same if we had some deletions: we would just delete the counter inside this directory. Note that this update is done recursively so that the subdirectories are also updated. When we want to read a file, this time we just access
the given counter in the directory vector and we get the value as output. So now let's have a quick demo of this. On the left-hand side we have an instance of VPP running, and we can list the interfaces in VPP in the CLI. So for now we have only the default
interface, local0, as expected, because we have not run any traffic. Now we can try to install and start the stat file system; it is as simple as a make install and a sudo make start.
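The reader-side retry and the epoch check performed when a directory is opened, both described earlier in the talk, can be sketched in a few lines of Python. This is a simplified, single-process illustration and not the go-fuse implementation: `Segment`, `optimistic_read` and `DirNode` are hypothetical names, the directory is a flat dict keyed by path rather than a vector of indexed entries, and the refresh is not recursive.

```python
import time


class Segment:
    """Toy stand-in for the shared memory header (not the VPP API)."""

    def __init__(self):
        self.epoch = 0
        self.in_progress = False
        self.directory = {}  # counter path -> value

    def write(self, updates, removals=()):
        # Writer side: take the lock, change the directory, release the lock.
        self.in_progress = True
        self.directory.update(updates)
        for path in removals:
            self.directory.pop(path, None)
        self.in_progress = False
        self.epoch += 1  # releasing the lock also bumps the epoch


def optimistic_read(segment, read, max_retries=100):
    """Run read(segment) and retry if a write happened in the meantime."""
    for _ in range(max_retries):
        if segment.in_progress:
            time.sleep(0)  # a write is ongoing: yield and try again
            continue
        epoch = segment.epoch
        result = read(segment)
        if not segment.in_progress and segment.epoch == epoch:
            return epoch, result
    raise RuntimeError("stat segment kept changing during the read")


class DirNode:
    """Directory node that refreshes its file list only when the epoch changed."""

    def __init__(self, segment, prefix):
        self.segment = segment
        self.prefix = prefix
        self.epoch = -1  # forces a refresh on the first open
        self.files = set()

    def open(self):
        if self.epoch != self.segment.epoch:
            epoch, names = optimistic_read(
                self.segment,
                lambda s: {p for p in s.directory if p.startswith(self.prefix)},
            )
            self.files, self.epoch = names, epoch
        return sorted(self.files)

    def read_file(self, path):
        _, value = optimistic_read(self.segment, lambda s: s.directory[path])
        return value
```

In this sketch, opening a directory after the writer has added or removed counters rebuilds the file list once, and subsequent opens at the same epoch reuse the cached list.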
Here we go. So now we can try to access the file system directory, which is named statsfs-dir, and we can ls everything we have in that directory in order to fetch all the data from the directory vector and update the epoch counters. So now we can access the interface names,
and as expected there is only this local0 interface. Now let's try to add some other interfaces and run some traffic with a little script in the VPP CLI. And here we go: let's try to access the interface names again, and a bunch of new interfaces have appeared. So let's try to
access the counters for the new interface pg0, for example. So we go into the interfaces/pg0 directory, which is a symlink directory, and we have a bunch of counter files. So let's, for example, cat the ip4 file, and we have the number of IPv4 packets that have gone
through this interface. We can also access the files corresponding to the RX and TX packets, and here we are: we have all the info we wanted for the TX and RX packets and bytes,
indexed per thread, for this interface. Now we can also access some statistics about the processing nodes of VPP. So we need to go inside the nodes directory, and then we need to go inside, for example, the ip4-input directory, which is the
directory for the ip4-input node, and see what we have. So we have a bunch of other counters and we can access them. For example, we can access the number of vectors per thread, and there we are. We can also access the number of clocks, or calls, in order to have the number
of calls for this node in VPP, and there we are. I'm going to present another way of consuming our VPP stats, which is the Prometheus exporter that gathers statistics. Our use case this time is Calico/VPP. So what is Calico/VPP? Calico is an open source Kubernetes networking
and network policy solution; it also supports other platforms. It manages networking between pods, nodes and VMs. The main goal of integrating VPP is to accelerate the networking of Kubernetes clusters that use Calico. So instead of using the standard Linux networking pipeline with iptables as a data plane,
nodes run the VPP data plane, which provides faster networking to their pods without requiring any changes to the applications running in the pods. This is really meant to be transparent, so when running VPP we do not have any additional requirements compared to regular
Calico. When we use the VPP data plane, the VPP instance is inserted between the host and the external network. So on startup, the host interface is replaced by an uplink interface and a tun interface so that the host keeps communicating with the outside as normal. As for the pods, VPP
creates a tun interface for every pod. So VPP handles interfaces and packet transmission, and thus it is very useful to gather statistics about pod interfaces, such as RX packets, TX packets, number of errors, etc.
So this is a high-level view of the agent that you get on each node when you deploy Calico/VPP. The agent is the process responsible for all the Calico-specific configuration in VPP.
In this container we have all the runtime configuration of VPP in the form of running servers: the CNI server, the routing manager, the services manager and the policies manager. They interact with the Kubernetes and Calico APIs to configure VPP. In order to export our
VPP stats as metrics, we add a new component, the Prometheus server, as part of the agent. The Prometheus server knows the state of the CNI server and the created pod interfaces, and at every time interval it fetches the real-time statistics through the VPP API in Calico/VPP, which
accesses the stat segment shared memory and gets the statistics needed. So among the stats, we actually select the ones related to pod interfaces. So let's take a deeper look at how this works. Here are the nodes of our Kubernetes cluster. Pods are created on nodes,
VPP adds interfaces for these pods, and we'd like to collect the stats of those interfaces. To expose Prometheus metrics in our application, we need to provide a /metrics HTTP endpoint on every node, which we call a node exporter. Every node exporter runs in the Calico/VPP
agent, so it accesses the stat segment shared memory through the VPP API, converts the stats into Prometheus metrics, then exposes them on the HTTP server.
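As a rough illustration of the exporter steps listed above (fetch stats, convert them to Prometheus metrics, expose them over HTTP), here is a minimal Python sketch using only the standard library. It is not the Calico/VPP agent code, which is a Go application: the `vpp_pod_` metric prefix, the `pod` label, the port and the `fetch_stats` callback are assumptions for the example, and label values are not escaped.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def to_prometheus(stats):
    """Convert {(pod, counter): value} into the Prometheus text exposition format."""
    lines = []
    declared = set()
    # Sort by counter first so that all samples of one metric stay grouped.
    for (pod, counter), value in sorted(stats.items(), key=lambda kv: (kv[0][1], kv[0][0])):
        metric = f"vpp_pod_{counter}"  # hypothetical metric name
        if metric not in declared:
            lines.append(f"# TYPE {metric} counter")
            declared.add(metric)
        lines.append(f'{metric}{{pod="{pod}"}} {value}')
    return "\n".join(lines) + "\n"


def make_handler(fetch_stats):
    """Build a request handler serving /metrics from a stats-fetching callback."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = to_prometheus(fetch_stats()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler


def serve(fetch_stats, host="127.0.0.1", port=9100):
    # Blocks forever; fetch_stats would read the stat segment in a real agent.
    HTTPServer((host, port), make_handler(fetch_stats)).serve_forever()
```

Calling `serve(lambda: {("pod-a", "rx_packets"): 12})` would expose a single sample at `/metrics`; in the design described in the talk, the callback is where the pod interfaces are selected among all VPP interfaces.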
Now the Prometheus server needs to be configured so that it targets those endpoints. Prometheus uses an HTTP pull model in order to export our statistics in the form of real-time metrics recorded in a time series database. So the targets are the /metrics HTTP servers running on
the Kubernetes cluster nodes, and the metrics come from our VPP tun interface statistics. Thus the performance of the interfaces in our system is displayed as a nice graph. Now let's watch this demo to better understand the feature. So we have these nodes in our Kubernetes cluster,
and here are our Calico/VPP nodes. Every node has an HTTP endpoint that provides pod interface statistics; for example, on this node we have these metrics. As for configuration, this is how
Prometheus is configured: we can see here the node IPs and the port. Once started, Prometheus is serving on port 9090, where it has access to the cluster to collect metrics.
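The pull model mentioned here can be illustrated with a small Python sketch: a scraper periodically fetches `/metrics` from each target and appends the samples to a time series store. This is a toy model of the idea, not how Prometheus itself is implemented; the function names, the in-memory dict used as a database and the simplified parser (no timestamps, no escaping) are assumptions.

```python
import time
import urllib.request


def parse_metrics(text):
    """Parse 'name{labels} value' lines, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


def http_fetch(target):
    """Fetch the /metrics page of a 'host:port' target."""
    with urllib.request.urlopen(f"http://{target}/metrics", timeout=2) as resp:
        return resp.read().decode()


def scrape_once(targets, database, fetch=http_fetch, now=time.time):
    """Pull every target once and append (timestamp, value) to each series."""
    timestamp = now()
    for target in targets:
        for series, value in parse_metrics(fetch(target)).items():
            database.setdefault((target, series), []).append((timestamp, value))
```

Running `scrape_once` at a fixed interval over the list of node endpoints yields one growing list of points per (target, series) pair, which is the kind of data the graphs in the demo are drawn from.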
So let's take a look at the graphs. Here are the different metrics provided. Let's select TX bytes, for example, for transmitted bytes,
execute, and then this is the graph. So in the last five minutes the values are at zero. Let's run a test to see what happens. In a few seconds we have this flow of traffic.
Thanks for watching, and we welcome any questions that you might have or that come up. Okay guys,
thank you for your talk. Yeah, let's start with a few questions. So, how do you correlate information known by VPP, e.g. the interface names, with the indexes you get in the shared memory counters? So we have a couple of counters that list all the interface
names and node names, which you can also access. You know that you have the correct indexes in the other counters, and you can just map the indexes you get in the names files
to the indexes you have in all the other files, and you get the stats per interface, for example. Okay, nice. In the Calico/VPP use case, how do you correlate the application logic known by the control plane with the interfaces? So let me answer that. The information coming
from the VPP stat segment concerns all VPP interfaces, which are identified by indexes, and the application logic in the control plane knows which container the information belongs to and, for every interface, which pod is concerned. The correlation between both is simply done by the fact that they are running in the same process.
So the Prometheus server is a part of the Calico/VPP agent control plane, and it runs in the Go application that accesses the VPP stats. So this correlation is important because it helps select the pod interfaces among the available ones. Okay, thank you. In the interest of time,
I think more questions could be answered by typing in the chat. I want to thank you guys for coming around and giving this nice talk. If you've got any questions, type them in the network devroom chat, and if you want
[inaudible]. We've got 20 or 40 seconds left, I guess. Great, thank you. Thank you for listening, and thank you all for joining this talk. Thanks, have a nice day, awesome 2022. Yes, bye.