
Linux tracing with LTTng


Formal Metadata

Title
Linux tracing with LTTng
Subtitle
The love of development without printf()
Title of Series
Number of Parts
199
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In the past, a lot of effort has been invested in high performance kernel tracing tools, but now the focus of the tracing community seems to be shifting over to efficient user space application tracing. By providing joint kernel and user space tracing, developers now have deeper insights into their applications. Furthermore, system administrators can now put in place a new way to monitor and debug systems using a low intrusiveness tracing system, LTTng. This presentation explains how LTTng can be used as a powerful development and debugging tool for user space applications, taking advantage of this year's exciting new features such as network streaming and snapshots. It demonstrates how open source developers and hackers can use the LTTng kernel and user space tracers to create powerful logging systems and easier debugging, thus greatly improving the development and maintainability of their project(s). Finally, this talk concludes with the future work we will be doing on LTTng, and how the community can help with improving the project, from feedback to very valuable contributions.
Transcript: English (auto-generated)
Hello everyone, can you hear me? Fine, great. So my name is David, David Goulet, I work on the LTTng project. I also work at EfficiOS, which is a company that basically does feature development on LTTng. So today, so usually tracing talks are kind of a bit boring, but I'll try my best to make it fun.
So just to show you, who has ever heard about LTTng? Oh that's great, at least two people, great. So tracing in general, can you just show me who uses tracing, with strace, perf, ftrace also?
Alright, cool, so I'm on a good track. So I maintain the lttng-tools component of the LTTng toolchain. And this presentation is about tracing with LTTng, showing you what we did in the last three
years in terms of excellent features and great work, and how you can use it in user space also. So this is really a quick overview of LTTng and tracing in general. As I said, people here maybe don't know what tracing is, so I'm just going to do an overview of tracing.
Then we're going to talk about everything else and some future work. So the difference between tracing and debugging, or logging with syslog and so on, is that you can enable or disable events during runtime. And tracing is low intrusiveness, very high performance.
So with LTTng we created this user space tracer and a kernel tracer, and then we created the lttng-tools component that ties them together. So you can merge kernel and user space traces within one trace and analyze that afterwards.
So tracing is basically high-throughput, high-performance debugging. Let's say you have printf; tracing is like that printf, but at a high performance rate, and it scales over CPUs also. So this is basically what tracing is. So for instance strace, everyone knows strace here, the great tool, right?
It uses ptrace to hijack the application and print out the syscalls. So tracing is a bit different from debugging; it's very important to understand, because this is basically what this talk is about: to make you use tracing in the end.
So LTTng 2.x, we have 2.4 I think right now, is unified tracing. Unified tracing between user space and kernel space. As I said, low overhead. We use the Common Trace Format, CTF,
which is a binary format on disk, and then we analyze that with viewers, whatever viewers you can use. And we are now shipped in Ubuntu, Debian, Red Hat and other distributions. So as I said, there are two tracers: you have lttng-modules, which is the kernel tracer.
So, I'm going to repeat that a lot: you don't need to recompile the kernel for LTTng, you use modules as of now. The difference between perf and LTTng is that perf is in the kernel. They do a lot of clever stuff, they also do profiling, which LTTng doesn't do.
But with the kernel modules, with the kernel tracer, you can trace a lot of events in the kernel. Syscalls and a whole bunch of whatever happens in the kernel, in different subsystems. And then we created UST, this is what I'm mostly going to talk about today, the user space tracer.
So, the love of development without printf: we're going to use tracing in user space. This tracer is an in-process library: you instrument your application, you recompile, and you have your tracer. But the good thing is that you can enable and disable any events during runtime, you
can live trace it, you can snapshot, there's a lot of features, it's pretty fun. Yes, so you can trace the kernel as of 2.6.38; before that there are three patches to the kernel, but after that there are no patches, only modules.
The utilities of LTTng: we created a daemon, which is called the session daemon, lttng-sessiond, the tracing registry daemon. And this thing combines everything you do. So when you create a trace, for instance, you
do an lttng create on the command line, you can use it for kernel and user space tracing. So this is pretty great, where you can integrate both sides into one trace. So yes, we created a control library you can use, liblttng-ctl, a C control library. We have a relayd, and this relayd means you can stream the trace over the network,
so you can extract traces from machines, put them on the network and analyze them later on; this is the relayd, the streaming daemon. And the consumerd is the one that extracts the trace. So there's a bunch of daemons; you don't really need to care about them when you trace with LTTng,
but if you want to contribute or do development on LTTng, those are kind of the components. The viewers: we have Babeltrace, which basically takes a trace input, a directory, so the traces and the trace files, and just prints them out on the console, human readable.
There's lttngtop, and this is pretty awesome. Everyone here knows top, or htop; lttngtop is exactly the same, but it uses kernel traces to print out the information in the top view.
So the difference between top and lttngtop is that we don't probe /proc every second, for instance; we just use live traces, and you can have your output of what's going on in your system. This is pretty awesome, because usually the performance hit is one to three percent, or five percent, on your system, where top takes a lot of CPU.
Alright, so this is LTTng user space: you instrument your application. I think I have a couple of slides that show what instrumentation is, what that means.
We call them trace points, so you put trace points in your application and you can extract your data at high speed. An in-process library, the session setup: you run your app, you collect your traces, and then you analyze them. That's it, that's all. So this is a quick schema of what's going on: you're on a machine, and then you have your
session daemon. The session daemon, as I said, is a tracing registry; it just helps you create sessions and enable events. So on the command line you do lttng create, enable-event; -u means user space, or -k kernel, and
we have others, we call those domains, we have other domains that I'm going to show you later on, and -a means all. So you enable all events of all your applications. With LTTng, the great thing about that is, if you have instrumented applications, you can list which applications you can trace. So when you start your applications on your machine, they're going to register with that session daemon, and so the session daemon
will respond to the consumer and say: alright, okay, I'm going to consume your trace, I'm going to extract your trace, and that's it. It's pretty easy. And for the instrumented application, our consumer will create buffers and share them with the application.
So the application will just populate those buffers with the tracing data, and this is the great thing about tracing user space with LTTng: for instance, when your application crashes, well, the buffers are still alive. So we have the buffers, we have everything to consume them, we know the application is dead, so
we just extract the data, and then you get all your trace, even though the application is dead. And this is done with shared memory. So we are using userspace RCU heavily; anyone heard about RCU here, show of hands? RCU, yeah right.
So RCU is a very complicated thing, but this is a user space RCU; RCU means read-copy-update. It's a synchronization mechanism to have very efficient, very high performance, lockless data structures.
So in user space there's an RCU library called userspace RCU; you can check it out. And we use that a lot to synchronize our tracer while keeping performance, because as I said, if you have 8 CPUs or 128 CPUs, it's going to scale over that.
So let's, oh sorry, so yes, this is basically how we do stuff in user space: shared memory, we exchange buffers, and that's it. Alright, cool stuff.
So this is how you instrument an application. Yes, it is C, it's macros, it's kind of complicated to be honest. We have a helper script that helps to create that, but in the end it's kind of a tedious process.
But again, the usefulness of tracing is that you instrument your application and you do it once, you don't put printf everywhere. You just instrument, put in trace points, and then you can trace it over time.
So your application is running, and at some point there's a problem. Well, you can just enable a couple of events in the subsystem that you think is working badly, and you can extract your data. So this is from the tests: if you go into the LTTng repository, we have examples.
And this one is the hello world example. So your provider name, trace point name: those are two different names, just to separate the trace points. So you can have a bunch of trace points in your application and just separate them into providers, or domains, or whatever.
So for instance, your company name, then the department, then the application, then the subsystem, and that's it. You have your provider name. And then you can just trace the fields, the payload you want. So you record ints, strings, structures, whatever.
So we have a very basic function here, and then this trace point that we created here. So ust_tests_hello, and then we're using it, right? So when you're coding your application, or maintaining it, or whatever, you can add those trace points to the code,
and you only put the trace point, and then the fields, and it's going to record. So if the tracing is not enabled, there is absolutely no performance hit. The trace point is going to be a no-op, it's just zeroed out, and that's it.
If you enable it during runtime, and then you hit it, I think I have a benchmark performance chart, but as of now it takes around 250 nanoseconds for your application to hit the trace point. And that's kind of the lowest we can go right now.
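What such an instrumented application looks like can be sketched as below. This is a hedged sketch modeled on the hello world example shipped in the lttng-ust tree; it assumes the lttng-ust development headers are installed, and the provider name, event name and fields here are made up for illustration, not taken from the talk's slides.

```c
/*
 * ust_hello_tp.h -- sketch of an lttng-ust tracepoint provider header.
 * Names are illustrative; requires lttng-ust to build.
 */
#undef TRACEPOINT_PROVIDER
#define TRACEPOINT_PROVIDER ust_tests_hello

#undef TRACEPOINT_INCLUDE
#define TRACEPOINT_INCLUDE "./ust_hello_tp.h"

#if !defined(_UST_HELLO_TP_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
#define _UST_HELLO_TP_H

#include <lttng/tracepoint.h>

TRACEPOINT_EVENT(
    ust_tests_hello,                    /* provider name */
    my_event,                           /* tracepoint name */
    TP_ARGS(int, iteration, const char *, text),
    TP_FIELDS(
        ctf_integer(int, iteration, iteration)  /* integer payload field */
        ctf_string(message, text)               /* string payload field */
    )
)

#endif /* _UST_HELLO_TP_H */

#include <lttng/tracepoint-event.h>

/*
 * main.c -- in a separate file: hitting the tracepoint.
 * When no session has the event enabled, the call is close to a no-op.
 */
#define TRACEPOINT_DEFINE
#include "ust_hello_tp.h"

int main(void)
{
    for (int i = 0; i < 10; i++)
        tracepoint(ust_tests_hello, my_event, i, "hello from UST");
    return 0;
}
```

Building would look something like `gcc main.c -o hello -llttng-ust` (older lttng-ust versions may also want `-ldl`); the low overhead he mentions comes from the fact that a disabled tracepoint reduces to checking a flag.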
There's no syscall in the path, it's all zero-copy stuff, so this is very, very high performance. So it's pretty high, pretty cool stuff. And yeah, trace points. Tracing session: so, on the command line, you create a session. Let's say you want to enable subsystem number one and subsystem number 42,
you can use wildcards like that, saying just subsystem 42 underscore, and then every event in that subsystem will be enabled. So -u, user space, and then the subsystem name. And you can start; lttng start means it's going to start tracing. The trace is going to get recorded.
And then you can wait, you can do whatever you want, get coffee, and then you stop at some point, and then you can view it. So this is an example of the trace you can extract. So you have the timestamp at the top left,
and then you have the delta between two events; the orange thing is the delta between the two events. So, one second between those two. Then the hostname, the trace point name, provider and trace point name, and then the payload. And you have the CPU ID saying this event happened on that CPU: CPU one, CPU two, CPU three, whatever.
And then the payload. As I said before, the payload, you can put whatever you want in it. Anything: strings, ints, structures, pointers, you can extract everything. So this is the human readable output made by Babeltrace from a trace.
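The session lifecycle he walks through maps to a handful of commands. A minimal sketch, assuming lttng-tools is installed and an instrumented application is running; the session name and wildcard are made up for illustration:

```shell
lttng create my-session                # new tracing session (spawns a sessiond if needed)
lttng enable-event -u 'subsystem42_*'  # user space events matching the wildcard
lttng start                            # recording begins
# ... exercise the instrumented application, get coffee ...
lttng stop
lttng view                             # human-readable dump (runs babeltrace)
lttng destroy my-session
```

`lttng view` is just a convenience wrapper; pointing `babeltrace` at the trace directory gives the same text output.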
So this is another example, from the kernel. So for instance, this is a read syscall, this is an open, and a sched_switch. A sched_switch is when two processes get switched by the scheduler in the kernel.
So yes, for instance, read: you have the payload of read, the fd, the buffer pointer and the count. Open: you have the file name, the flags, and so on and so forth. Perf gives about the same output in terms of kernel tracing. It's pretty similar to strace also, except strace can understand the layout of the syscall; for instance,
if you do recvmsg or sendmsg, you're going to know the type of the payload, which means the structure, the arguments and so on.
So this is why buffer is just a pointer here: we don't know what it points to or what structure is in it. It's just a pointer. But again, human readable. So again: start, stop, list and view. But this is not very useful in the end, because you just start a trace, you trace, and then you stop. So we created snapshots.
Snapshot means you're running your application in flight recorder mode. It's ring buffers: you trace data, you get your data, you put it in the buffers, and at some point in time, depending on the size of the buffers, it's going to get erased by new data.
So right, ring buffer, flight recorder. The point of that is, let's say you're monitoring, you have your application instrumented with UST, with trace points. Then you just enable everything and you start it. You just forget about it, because it goes into flight recorder mode, ring buffer mode.
And you have some Cacti or Nagios alert, and you trigger a snapshot. So what this does is record a snapshot: you get everything that is in the buffers up to the point when you take your snapshot. And it's going to extract your data, put it on disk, or put it on the network. As you choose; if you put it on the network, it's going to send it to a relayd,
lttng-relayd, and it's going to store it in a file on another machine or on your local machine. And you can then analyze your trace. The good thing with LTTng, and why I use LTTng, is that you can do that with the kernel and user space. So you can snapshot both domains, both tracers, and you can merge them
because the timestamps are synchronized. So you have your trace data from your application and from your kernel. So this is pretty great, because you can use that to monitor an application that runs in production, for instance. So yeah: create, enable, start, and you wait, you do something, and you snapshot record.
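That flight recorder workflow can be sketched as a command sequence like the following; this is a hedged sketch assuming lttng-tools is installed, and the session name and collector hostname are made up for illustration:

```shell
lttng create monitor --snapshot        # flight recorder mode: ring buffers only,
                                       # nothing is written out until you ask
lttng enable-event -u -a               # all user space events
lttng enable-event -k -a               # all kernel events (root or tracing group)
lttng start
# ... later, e.g. triggered from a Nagios/Cacti alert handler ...
lttng snapshot record                  # dump the current buffer contents

# or direct snapshots to a remote lttng-relayd instead of the local disk:
lttng snapshot add-output net://collector.example.com
lttng snapshot record
```

Between snapshots the session keeps running and old buffer contents keep getting overwritten, which is what bounds the disk cost.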
So, real world use cases where we use that, and it's still being used; a large setup that we know of uses this: for instance, we have a core dump handler.
Where you have a core dump on the application: segfaults, whatever, anything. Your application core dumps. Well, you can tell it, when it core dumps, to take a snapshot. And this is really, really useful, especially with the kernel, because you can know exactly what happened with your application and the kernel
when this application core dumped. And as I said, the buffers are shared with the consumer. That means that when it core dumps, you have the assurance that you're going to get every event that was triggered by the application. So we have that in the tree right now. If you go to git.lttng.org, you have the repository there,
and we have this fancy bash script that lets you snapshot when it core dumps. IDS also. So, let's say you are tracing regularly,
you're just tracing your applications and your kernels, on different machines, different nodes. There's research being done right now at a university in Montreal that uses the live data uploaded from LTTng to build an IDS, to analyze it on the spot, in real time,
and just trigger actions if some pattern emerges. So LTTng is being used to correlate data logs, to merge with syslog also. We try to work on plugins that allow you to take tracing data
and put it together with, for instance, Splunk or syslog, in terms of time correlation. You can use it for performance profiling, and so on and so forth. So, as of last year, I think September or something like that, and we are still in the release candidate process for the 2.4 version,
we added the live component. The live component means you can trace your data, you can trace applications and the kernel, and you can read the events as they're being recorded. That seems easy to do, but it's actually very difficult
in terms of synchronization and performance, to keep the performance up; but after a lot of work, we added this feature upstream. So as of now, lttngtop, for instance, uses that. You create your session, you trace your kernel, you trace your application, and then you can read your trace live.
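A live session can be sketched like this. It is a hedged sketch assuming LTTng 2.4+ with lttng-tools and Babeltrace installed; the session name and hostnames are made up, and the lttng-live URL layout shown is the one Babeltrace documents for its live input:

```shell
lttng-relayd --daemonize               # the relay daemon is required for live sessions
lttng create live-demo --live          # live session with the default read timer
lttng enable-event -k sched_switch     # e.g. scheduler switches from the kernel tracer
lttng start

# attach a live viewer, possibly from another machine that can reach the relayd:
babeltrace --input-format=lttng-live \
    "net://localhost/host/$(hostname)/live-demo"
```

The viewer prints events as they are flushed, which is the mechanism lttngtop builds on.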
So, one of the things we're trying to do is to have a drop-in replacement for strace, for instance, where you can use LTTng to do the same thing strace does. And the difference between strace and LTTng is that strace changes your application.
It uses ptrace, it changes stuff, and the behavior of your application with strace and without strace is different, because ptrace is very, very intrusive; it changes registers, does some jumps, trampoline buffers, to extract this data.
So, imagine you could trace with a 3% performance hit on your system, and get the exact same thing strace gives you. This is what we're trying to do. This is why we came here to FOSDEM: to explain this fantastic feature, not to sell it to you, but also to try to make you contribute and help us
develop and get contributors into LTTng. So, this is the kind of stuff we're working on. So, with live tracing, you can do cluster-level analyses; you can imagine a lot of things you can do with live, where you extract data live. And one other thing with LTTng in the last year is that we've tried to bring tracing to non-kernel developers.
So, to user space developers, to system administrators: to use tracing not in a development fashion, but in a monitoring, useful way. So, this is a quick image of the infrastructure of LTTng,
where you have lttng-sessiond on your hosts. So, this means lttng-sessiond is the daemon that traces. The traces are being collected by the lttng-sessiond on the top: server A, server B, server C. Those servers are being traced. And you can extract those through TCP, to a relayd,
and then you can have your viewer connected to that relayd over TCP. So, for instance, we work with large companies like Ericsson in Sweden that use this kind of infrastructure, where they have a huge amount of servers,
they extract the data to a couple of other machines through the network, and then they view it from other machines, analyzing it over the network again. It scales to a massive amount of machines, and it's being used. So, this is not just, you know, some developers having a dream.
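The streaming setup he describes, servers pushing their traces over TCP to a collection machine, can be sketched as follows. This is a hedged sketch assuming lttng-tools on both ends; hostnames and paths are made up, and the relayd port numbers mentioned are its documented defaults (5342 control, 5343 data):

```shell
# On the collection machine:
lttng-relayd --output=/srv/traces --daemonize

# On each traced server:
lttng create server-a --set-url=net://collector.example.com
lttng enable-event -k -a               # everything from the kernel tracer
lttng start
# events now flow over TCP to the relayd and land under /srv/traces
```

Viewers then read the collected traces on the collector (or through a live-enabled relayd), so the traced servers never pay the analysis cost themselves.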
It's being used. So, yes, I wanted to demo some LTTng stuff, but I kind of had a problem with my laptop this morning, so I can't, unfortunately.
But what I can do, though, is just show you some basic LTTng stuff. So, what we have is lttng, right? So, lttng, let me see something.
So, one thing is that if you have a session daemon running as root, it's going to load the kernel modules, the kernel tracer, automatically. If your user is in the tracing group, so we added this tracing group,
you can change the name, I mean, it's just configurable, but the tracing group allows you to talk to the session daemon, and since the session daemon is running as root and can trace the kernel, a normal user can trace the kernel. So, I'm in that tracing group, so I'm going to lttng create a session. There we go.
So, there's no session daemon, so one is spawned, and then it's going to write the trace there. So, lttng list -k, that means list the events of the kernel domain. Ah yes, I'm not in the tracing group. All right, okay. Let me do it as root.
Sorry, my laptop is a mess right now. All right. Okay, back as root. There we go. Okay, lttng list -k. So, these are all the kernel events you can trace. You see: ext4, btrfs, ext3, irq, kvm.
So, those are standard events that are in the kernel right now. They're called trace events in the kernel. So, in perf you have the same events. I think ftrace is also listing those events. And then you can trace whatever you want.
So, oh yeah, okay, let's trace them all. And I'm going to see how many events you can enable. So, I'm going to just enable all events in my session for the kernel. Oh, I need to create a session first, though. Sorry. All right, enable-event, and I'll show you this, okay.
And there it is; so all the events are enabled, and there are a lot right now. So, when I do lttng start, tracing is starting. So, at this point, my computer is tracing the kernel. Just to show you how much data it's going to create.
So, let me just try to center that. Oh, we can't see, right? 11 megabytes, and then 12 megabytes. So, right now it's tracing. So, those are the files, the data files.
So, channel zero: channel zero is the default channel. So, you create a channel, and you put events in those channels. So, you can have different events in different channels, in different sessions. You can move that around on the network or not. So, as of now it's tracing, and I think my CPU is just about doing nothing right now.
But I'm extracting every single event in the kernel right now. It's a lot of data, and there's nothing running. There's Tor and awesome, that's it. So, let's stop that, and I'm going to show you how it looks.
Yes. Oh, yeah. There you go. So, here's the data, and it's non-stop. I have an output path, and I'll show you how it works. Non-stop. All right. So, this data is being recorded.
These are kernel events. Every kmem allocation, every memory allocation, every sched_switch, everything. Every irq, everything is being traced. So, let's stop that. And just to show you the amount of data you can extract without any performance hit. And this is kind of really great.
And, as I said, you can couple that with any user space application. So, if I have any application: lttng list -u, and you can see the applications, and what you can trace, when the applications start. So, the application registers with the session daemon, and then you list the events.
So, you can just do list -u, see which applications you can trace, and start tracing. All right. So, let's go back to... All right. Here are just some performance results, to show you the awesomeness of it. We ran that for 50 minutes.
And we took a snapshot every 30 seconds, and each snapshot was seven megabytes of data. And then good old strace, so we compared it to strace. strace would create about 5.4 gigabytes of data over those 50 minutes.
With 61 million events. So, remember, strace traces syscalls for one specific application. lttng traces every syscall that happens on your machine, for every application, and not only syscalls but all the other events you saw.
It creates 6.8 gigabytes, 250 million events, in the same amount of time. And we have 1% event loss. Event loss basically means there's way too much data coming in at high speed, so we have to drop events, unfortunately, to keep this kind of performance while recording.
So, this is with a dedicated disk for the trace. It means the trace is written to its own disk, not a disk shared with your operating system.
And you can see there are five lines, but you only see two right now because the others overlap. The red one is strace on MySQL. The rest are lttng, and they sit on top of each other, so there's no performance hit there; it's the same as not tracing at all. And we're comparing that with flight-recorder mode, as I said, where we're just recording, recording, recording into the buffers.
And the yellow one is streaming: you extract your data and stream it over the network to another machine. And there's absolutely no performance hit, yet you're tracing 250 million events.
On a shared disk, it goes a bit worse, because then you have very high contention on the disk and the I/O goes through the roof. So, no tracing, of course, is the high-performance baseline. And then you get strace, which is the yellow one, while lttng just scales.
That's why you see the flat line for strace: when you go to 32 threads on a 32-CPU machine, at some point it just stops scaling. It doesn't grow linearly, where lttng does. The difference is that we have per-CPU buffers.
And we try very hard to avoid cache-line bouncing and all the performance problems you get with multi-threaded applications. So, oh yeah, 30 minutes into my talk. Recent work and new features. So, lttng 2.x was created in 2010, I think.
And each version is named after a beer from Quebec, in Canada. We have microbreweries in Quebec; we have very good breweries. So, we're following the alphabet, starting at A and going up to wherever we are now.
So, 2.0 is Annedd'ale, 2.1 is Basse Messe, 2.2 is Cuda, which is a very nice IPA from a microbrewery in Montreal. And 2.4 is Époque Opaque, the E one, from a microbrewery in Quebec.
And 2.5, I don't remember what the name is going to be, but it's going to start with F. So, in 2.4, we added live tracing, as I said. We also have snapshots, which were done in 2.3, but you can use them in 2.4 or 2.3, whatever.
And we also added java.util.logging support. I don't know if there are a lot of Java developers here. I'm not really a Java developer, I don't really like Java, but a lot of people use Java. So, we had this contract to support, well, not support, but add a feature for Java user logging, so JUL.
java.util.logging is the library you use to log from a Java application. So, we have an LTTng agent in Java that hooks itself into the java.util.logging library, extracts all the data from it, and puts it in an LTTng trace.
So, this is pretty nice, because if you're doing Java, you have to do almost nothing to your application. One line, something like creating the LTTng class, I don't remember exactly what it is, but you just create the class and start the LTTng side. And then you can use LTTng as I did with the console and trace your Java application.
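To give a flavour of what that looks like, here is a sketch. The application only uses the standard java.util.logging API; the agent handler class name in the comment is from memory and should be treated as an assumption, so check the documentation for your LTTng version:

```java
import java.util.logging.Logger;

public class TracedApp {
    // Plain JUL logger: nothing LTTng-specific in the application code.
    private static final Logger LOG = Logger.getLogger(TracedApp.class.getName());

    public static void main(String[] args) {
        // With the LTTng Java agent on the classpath you would also attach
        // its handler (class name is an assumption, check your version):
        //   LOG.addHandler(new org.lttng.ust.agent.jul.LttngLogHandler());
        // and then enable the events from the console, roughly:
        //   lttng enable-event --jul -a ; lttng start
        LOG.info("application started");
        LOG.warning("something worth tracing happened");
    }
}
```

The point of the design is that existing JUL log statements become trace events correlated with kernel and UST events, without rewriting the application.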
So, yeah, there's a performance hit, but again, it's pretty useful. Future work, and I'm almost done, so there'll be plenty of time for questions after this. Future work is that this year we're going to add hardware tracing. We've been talking a lot with ARM people, embedded people,
and we're going to try to get hardware traces into LTTng. Once we can do that, it means you can combine hardware traces, kernel traces, and user space traces in the same trace. You can correlate the timestamps and then analyze all of it together, which is, to be honest, pretty awesome.
And you can't do that right now unless you use some proprietary software from Microsoft, or from Oracle, which is trying to close-source its tracer. But it's coming. And then trace triggers, an Android port,
and then automatic analysis, and also dynamic instrumentation, which DTrace does and we unfortunately don't, but we're working on it. So, if you're looking for an open source project to contribute to, or even a job, dynamic instrumentation is very important for us, and we're trying to do that.
Yes, so that's an overview of the LTTng project, so if there are questions, now is the right time. Yes, I think there's a microphone around. There's one person over there. Right there.
So, just one thing for the question and answer: please signal clearly if you have a question, so we can come around to your place to give you the microphone. And if you want to leave, please leave now, so the others have a chance to ask their questions.
So, you mentioned that there are standard tracing points in the Linux kernel, so I don't have to recompile. And as far as I could understand, you can log to the same machine, for example, to the disk, right? You can what? You can trace what?
So, I enable everything in the kernel, and then I want to log this data to the local disk. That happens in parallel. So, do I get spurious log entries in my trace data?
So, you're asking whether, while we're extracting, we see LTTng's own activity as events in the data? So, yes, there's a kind of feedback loop there, where you see the LTTng events.
We're extracting data, we're using syscalls, of course, and you see them in the trace, yes. So, the more you trace, the more data you create. But you can filter that out. We have filters in UST, and it's coming for the kernel. So, you can just filter out the LTTng events. Was that the question? Another question. Can I inject these trace points dynamically into the running kernel,
and do I want to do this, if yes? So, the question is, can I dynamically add trace points to the kernel, right? The trace events in the kernel are static. It means they're coded in, they're upstream right now, and you use them.
What also exists in the kernel is kprobes, and the function tracer, the ftrace tracer. You can hook into any symbol in the kernel with those, and LTTng supports that. So, for any event, you can say, I want to trace that function in the kernel. You can do that with LTTng,
with lttng enable-event --function, or --probe. Those are mechanisms that exist in the kernel: you can dynamically trace any symbol, if you have the address or the name. Yes, you can do that with LTTng. Hi, I have some questions about user space tracing.
Is there any documentation anywhere for people who are not familiar with kernel programming to do user space probing? Because when I was looking for something on LTTng, I didn't find much information. Let's say I want to add some tracing points to the Python interpreter.
Is there anywhere I can look? So, documentation is always a problem in open source projects, and LTTng is also lacking documentation right now. For instance, if you want to add instrumentation to your application, what I would say is the man page of lttng-ust is the best place to look for now.
lttng-ust is the user space tracer. And also, look at the examples in the code repository. This year, we want to create developer documentation, a developer guide. But again, it's just a matter of time and resources to do that.
Can you easily extract the stack at this level? In user space, if you want to extract the whole call stack at some point, is that easy to do in LTTng? I just don't get... if you want to extract the stack at some given point.
Oh, no, no. Okay. No, no, not right now. Ah. Is this one on? Okay. Can I go? Okay. Hi.
Can you describe, please, how LTTng... Yeah, I'm here. Sorry. Can you describe how LTTng analyzes data while it's being created, and what interfaces are available for doing that? So the question is, is there any interface to do analysis of LTTng data?
I'm sorry, I couldn't hear you. The question is, in the live mode of LTTng, so extracting data while it's being gathered, are there interfaces to do analysis? Sure, yeah, that's it. Alright, so we have lttngtop, which does some analysis. And there's TMF, the Tracing and Monitoring Framework, which I think is in Eclipse.
So this tool that I'm going to present is a big graphical tool to analyze LTTng traces and other traces. It's in Eclipse. There's a rich client, what's it called?
The RCP thing, which you can use outside of Eclipse, that allows you to analyze data and compare it. But if you're asking about automatic analyses, for instance, no, there's nothing right now. And this is something we're looking for contributors to help us with: automatic analyses in terms of whatever you can think of, security, performance, and so on.
I was wondering if there's a way to use LTTng from non-C code, without dropping down through an FFI interface. Right, so we have Java bindings, and we're also working on Python bindings.
But as of now, the Java side works fine without writing any C; as of now you either use C code or you use Java, we don't have anything else. But this year we're working on Python bindings. So it would be possible? Yeah, absolutely. Hi, I'm involved in a real-time Linux project
which works with RT-Preempt and Xenomai, as well as RTAI.
So the last one is out of the question, because that's ftrace land. I don't expect any problems using it with RT-Preempt. The open question for me is Xenomai: are there any data points on using LTTng-UST with that? For real-time? With Xenomai userland threads, yeah?
No, so the old LTTng code had patches for RT-Preempt, but 2.x doesn't support RT-Preempt at all. It won't work. Let me put the question differently: in the generation of a trace event, are there any system calls, or is it just memory-to-memory operations?
Yeah, there are no system calls when the trace point is being recorded. None. There's the write to disk, but it's a kind of zero-copy thing, where we just memory-map a region for the data and extract it afterwards. So during the trace point itself, there's no system call at all.
Okay, thanks for the comment. Hey. There. Yeah. So how does LTTng compare to SystemTap or DTrace? Is it that it always traces everything and you analyze it later, or what are the main differences? Alright, so the major difference is that with DTrace and SystemTap,
when you trace user space, you go through the kernel every time. We don't do that in user space; we don't go through the kernel, which means we don't have syscalls. With DTrace and SystemTap, you have to use a kernel module or go through the kernel to extract your data. DTrace is great,
but the big issue is that, and I'm sorry if there are Oracle people in the room, it's been bought by Oracle, right? It was Sun's, and it's been bought by Oracle. And Oracle right now, well, they're not saying it outright, but they're trying to move it toward a closed-source way of doing things
where they just close everything up. There is Linux support for DTrace, but it's not that good. Usually DTrace is for Solaris, and OS X uses it too. And SystemTap is a great thing, it's just a bit complicated. They do things differently in terms of buffering and extracting data.
But again, they go through the kernel; we don't. And that's a very, very different approach in terms of performance. So LTTng is performance-centric in its features, in a way SystemTap and DTrace are not. I should also mention that SystemTap and DTrace have dynamic tracing.
We don't. Dynamic tracing means that, with your application running, you can just decide to trace that specific function, for instance. And they go through the kernel to do that. That's the difference.
If I want to trace a user space application with UST, and I want to get a full trace from the moment my program starts running, so all the events from the beginning, can I somehow pre-enable the trace points before the application has registered with the session daemon?
So, you can enable any events you want, and when the application starts, we enable them before the main() of the application runs. This means that by the time your application's main() starts, the trace points are already enabled. So yeah, you can enable them beforehand. Okay, cool. And one other question.
If I have a long-running application I'm tracing, and halfway through the trace I decide I want to enable more events, can I do that without stopping and restarting the trace? You just do enable-event, and it adds it. Yes, well, that's the beauty of tracing: you can enable and disable events whenever you want. So at any point in time, you can disable an event, meaning stop tracing that event,
or you can enable it. So you don't have to enable them up front, right? Let's say you enable one event, then you start the application, it traces, and then you want to disable it and enable it again. It doesn't matter, you just enable the event whenever you want, at any time.
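The two answers above can be sketched as a command sequence; the provider name "myapp" and its events are hypothetical, and the exact flag spellings may vary slightly between LTTng 2.x versions:

```shell
# Pre-enable events before the application exists: when myapp starts,
# its tracepoints are turned on before main() runs.
lttng create app-session
lttng enable-event --userspace 'myapp:*'   # "myapp" provider is made up
lttng start
./myapp &                                  # hypothetical instrumented binary

# ...and toggle events at any point while the session keeps running.
lttng disable-event --userspace 'myapp:*'
lttng enable-event --userspace 'myapp:request_done'
```

Nothing needs to be stopped or restarted: the session daemon pushes the enable/disable state to registered applications on the fly.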
Hi. Are you guys looking at tracing guests in a virtualized environment through the hypervisor? So are we looking at tracing virtualization?
I mean, you have a session daemon running on the host, and then you have a couple of guests, and the guests are basically tracing. So KVM has some events in the kernel, so you can trace those KVM events, but getting the trace out of the VM, into the host, for instance,
no, we don't do that, but we do add the hostname to the trace. So for instance, if you use LXC or vservers, or containerization in general, the trace is going to show which hostname is being used. So for each event you can say, oh, this is coming from that machine.
With virtualization it's much more split between the host and the guest, so that's much more difficult to trace. So are there any plans for supporting that in the future? Not right now, this is kind of difficult, but one thing that would be great is to synchronize the timestamps; there's no plan for that, but that would be kind of awesome.
Thanks. All right, so thank you all. I'm here for the weekend, and I have a talk tomorrow, so please come to my other talk tomorrow.
Thank you for the talk. The next talk here in Janson will start shortly.