Your Application versus GDB
Formal Metadata
Number of Parts | 199
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/32674 (DOI)
Transcript: English (auto-generated)
00:00
Good afternoon. Firstly, could I please ask: if anybody's going to leave during questions, could you do it very quietly and use the two rear doors, just so everyone else can hear any questions that are being asked. This is Tom Tromey. He's going to be talking about GDB and your applications. Hi. My name is Tom Tromey. I work at Red Hat. I work on the debugger team at Red Hat.
00:28
Before I start, I just wanted to say, first of all, thank you all for coming. There's maybe a little too many of you. I'm a little freaked out. I'd like to thank Red Hat for paying for my trip here. It was fantastic.
00:44
Also, I just want to make sure you guys didn't get the wrong idea about the title of this talk, GDB vs. Your Application. If you're expecting some kind of cage match, you're in the wrong room. GDB does hate your application. It expresses its contempt through the design of its command line interface.
01:03
This talk is really going to be about how GDB can help you debug your application better. Some filler. I like to tell you what I'm going to tell you before I tell you what I tell you. The introduction is very self-referential. We're doing it right now.
01:21
Most of the talk is going to be about Python. We added Python scripting to GDB in recent times. We're going to talk about why we did it and what you can use it for. Then more Python. Then even more Python. But then we're going to talk about a couple other things in GDB that are interesting and may help you debug your application better.
01:41
We're going to talk about probes, probe points, what they are, how to use them. We're going to talk a little bit about the JIT API. We found a bug. First, some trends in debugging. There aren't really trends in debugging.
02:00
There's trends in your program. Your program is getting bigger. It's using more threads. Every module in it is bigger. There's more debug info. The compiler generates more debug info. You're using more and more shared libraries. If you look in the distro, my favorite test case for GDB is LibreOffice. I believe the last time I tried it out it uses 160 shared libraries.
02:24
It's massive. Programs are becoming distributed. These days it's commonplace to write programs in multiple languages, not just C and C++, which is what us dinosaurs thought were multiple languages. It's more, you know, you're using JavaScript, Perl, Ruby, and Go all in one big happy program.
02:47
And JIT compilation. It's very common now. Anybody who's running Lua, JavaScript, all kinds of other things, how do you debug JIT compiled programs?
03:00
We're going to talk a little bit about the features we've added to GDB to help you with some of these problems. We haven't solved every problem, but we're making headway on all of them. GDB reacts to your program. You guys keep writing bigger programs, crazier programs, and somehow insisting that you ought to be able to debug them.
03:22
One thing I think we realized is that as your program gets bigger, debugging it gets harder. The bigger the program, the less of it you actually know. Even a modest program like GDB, very few people understand all of it. It's extremely uncommon. Maybe nobody does.
03:41
And the bigger the program, the less you can really, as a percentage, you can really know. So what that means is, when you're debugging, you want to be able to focus better. You want to filter all that extraneous stuff to focus on the part that you need to understand in order to debug.
04:02
If you look at GDB, historically, it was designed just with a certain kind of use case in mind. And it just printed to you what it knew. But sometimes it knows too much, or it knows things in a way that you don't really understand. Or sometimes it doesn't know enough. We'll get to that in a second.
04:23
So, to help you with this problem, it's not obvious maybe immediately why we did this, but we added Python scripting to GDB. And we believe that this is a nice way to let you perform this filtering. And the reason that it's nice is, it lets us decouple knowledge about your application, things that you care about, from GDB's core.
04:45
You can maintain a set of Python code to help your debugging needs. And you can keep that with your program and not have to put a whole bunch of stuff into GDB. And we'll cover some examples of that. And the rest of this slide is about some common themes in the Python additions that we make to GDB.
05:06
Hooks. What that's about is, we set things up so that Python code is associated with your application. And application is kind of a vague word. It could be associated with a shared library. So, for instance, we have things that are associated just with libstdc++.
05:25
So, when GDB, when you start debugging your application, or when that application loads a shared library, the appropriate pieces of Python code get pulled in automatically. In a properly configured distribution, Fedora is the one I use, because I work on it, and I happen to know it's well configured.
05:46
The amount of configuration you have to do is nil. It just works, which is a very nice property. All these features can be disabled individually, and that's really important. GDB is what I like to think of as a multi-layer debugger.
06:03
You can debug way at the very highest level using your source code and the types in your program and these different Python filters and commands and stuff. But you can strip that away. All this Python code is nice, but it also can be buggy. So sometimes, if there's a bug in it, you need to be able to disable it.
06:23
So everything's disableable. And then in GDB, the layers go down and down until you can step through every single instruction at the assembly level and just look at registers and ignore all the rest. So disabling supports that concept of choosing the layer at which you want to debug. Things in GDB are per-inferior. That means if you're debugging multiple programs at once, which you can do in GDB,
06:48
little bits of Python code that are associated with your program are enabled in the programs that use that library or that application and disabled in other ones, so that when you switch contexts, you don't get confusing results.
07:03
If a pretty printer isn't written exactly perfectly, it just won't apply in your other program. Or if you're debugging multiple programs that use different versions of a library and require different versions of printers, that works in this design. And finally, GDB has the command line interface, which I assume generously that most of you actually use.
07:26
But it also has a machine interface, which is what all the GUIs use to talk to GDB to drive it. And the Python extensions we make are all designed to work both with the CLI and with MI. And what that means is you write your Python code once, and GUIs can take advantage
07:44
of it, and command line users can take advantage of it without any kind of changes. If there's questions, you know, you can just flag someone down and ask a question. So Python. We'll just do a little tour of some Python facilities that are in GDB.
08:04
Commands. You can write a new GDB command in Python. So for instance, do people here know about pahole from the dwarves project? pahole, it's a nice little thing. It can read through the debug info in your program and look
08:21
for a structure and tell you the layout of the structure and note all the holes in the structure. It's nice if you're worried about memory savings. You can repack a structure to make it smaller. So pahole, you can write it as a GDB command in Python in like 20 or 30 lines of code, something like that. 50, I don't know. Functions.
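As a rough illustration of the pahole idea, here is a minimal sketch of a hole-finding command. The command name `struct-holes` and the helper `find_holes` are invented for this example, bitfields are not handled precisely, and this is not the real pahole implementation.

```python
# Sketch of a pahole-like GDB command; "struct-holes" is an invented name.
try:
    import gdb  # only importable inside GDB
except ImportError:
    gdb = None


def find_holes(fields, total_size):
    """Return (offset, size) byte gaps between fields and at the end.

    fields is a list of (name, byte_offset, byte_size) tuples.
    """
    holes = []
    cursor = 0  # first byte not yet covered by any field
    for _name, offset, size in sorted(fields, key=lambda f: f[1]):
        if offset > cursor:
            holes.append((cursor, offset - cursor))
        cursor = max(cursor, offset + size)
    if total_size > cursor:
        holes.append((cursor, total_size - cursor))  # trailing padding
    return holes


if gdb is not None:
    class StructHoles(gdb.Command):
        """struct-holes TYPE: list padding holes in a struct."""

        def __init__(self):
            super().__init__("struct-holes", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            stype = gdb.lookup_type(arg).strip_typedefs()
            # Static members have no bitpos attribute, so skip them.
            fields = [(f.name, f.bitpos // 8, f.type.sizeof)
                      for f in stype.fields() if hasattr(f, "bitpos")]
            for offset, size in find_holes(fields, stype.sizeof):
                print("hole at offset %d, %d bytes" % (offset, size))

    StructHoles()
```

Inside GDB this would be used as `struct-holes struct foo`, where `struct foo` stands for any type in your program.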
08:40
You can add new functions to GDB. I think I have an example of some of these. Yeah, here. Got cut off. Fantastic. There we go. I can't believe that worked. So this is actually a super simple example of a Python command.
09:03
In GDB, if you type something in at the CLI and it gets an error, it aborts whatever else is going on. If it's in the middle of a little script, it just aborts that script because of how GDB works. This is a little command just to ignore errors.
09:20
GDB's command line doesn't have exception handling. Python does. It's just trivial. There's a little boilerplate you have to wrap this in, but it's nothing. So then you can just say, ignore errors. Functions. Getting back to that. You can add new what we call convenience functions. They're a function like in GDB.
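The slide itself is not in the transcript, so the following is only a sketch of what an ignore-errors command could look like. The helper `run_ignoring_errors` is invented for illustration; `gdb.Command`, `gdb.execute` and `gdb.error` are part of GDB's Python API.

```python
try:
    import gdb  # only importable inside GDB
except ImportError:
    gdb = None


def run_ignoring_errors(command, execute, error_type=Exception):
    """Run command via execute(); return True on success, False on error."""
    try:
        execute(command)
        return True
    except error_type:
        return False


if gdb is not None:
    class IgnoreErrors(gdb.Command):
        """ignore-errors COMMAND: run COMMAND, swallowing any GDB error."""

        def __init__(self):
            # The class and __init__ are the "little boilerplate".
            super().__init__("ignore-errors", gdb.COMMAND_OBSCURE,
                             gdb.COMPLETE_COMMAND)

        def invoke(self, arg, from_tty):
            run_ignoring_errors(arg, lambda c: gdb.execute(c, from_tty),
                                gdb.error)

    IgnoreErrors()
```

In a GDB script you could then write `ignore-errors delete 99`, and the script keeps going even if that breakpoint does not exist.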
09:41
GDB has its command language, but it also has an expression language. An expression language is something you can use like print, you know, print some expression, printf. You can do breakpoint conditions using this expression language. Well, we've added the ability to add functions that can be called from these expressions.
10:01
Now why is that useful? The way it's useful is that GDB has a lot of knowledge about your program. It's sometimes knowledge that your program does not have. Historically, it was very difficult to script and get this information from GDB and use it in a programmatic way. My favorite example is this function.
10:21
It's a short little function. I call it caller_is. This came about because once I had a debugging problem where I had some function that was called 10 million times. I wanted to set a breakpoint on it, but only on a certain call path. I wanted to break in function F when it was called by function G.
10:42
In GDB, you know, GDB knows that. It understands that. It has the whole stack trace available to it or can compute it. But there was no way to express that in the expression language. So we write this little Python function. It's trivial. You can see there's, like before, there's a little boilerplate. And then here's how you use it.
11:01
You say, you know, break if the caller is whatever I wanted. And the full power of Python is available to you here. You can make it do regular expressions. You can make it arbitrarily hairy. And it's all, you know, extremely easy. You can refer to variables in your function.
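A convenience function along these lines might be sketched as follows. The slide is not in the transcript, so the details are assumptions: the `levels` argument, the helper `caller_name_matches` and the `FakeFrame` stand-in are invented for illustration, while `gdb.Function`, `gdb.selected_frame`, `Frame.older` and `Frame.name` are real API.

```python
try:
    import gdb  # only importable inside GDB
except ImportError:
    gdb = None


def caller_name_matches(frame, name, levels=1):
    """Walk `levels` frames outward from `frame` and compare function names.

    `frame` only needs older() and name() methods, as gdb.Frame provides.
    """
    for _ in range(levels):
        if frame is None:
            return False
        frame = frame.older()
    return frame is not None and frame.name() == name


class FakeFrame:
    """Stand-in for gdb.Frame so the helper can be tried outside GDB."""

    def __init__(self, name, older=None):
        self._name, self._older = name, older

    def name(self):
        return self._name

    def older(self):
        return self._older


if gdb is not None:
    class CallerIs(gdb.Function):
        """$caller_is("NAME" [, LEVELS]): 1 if the caller is NAME, else 0."""

        def __init__(self):
            super().__init__("caller_is")

        def invoke(self, name, levels=1):
            # Arguments arrive as gdb.Value objects.
            frame = gdb.selected_frame()
            return int(caller_name_matches(frame, name.string(), int(levels)))

    CallerIs()
```

With that loaded, the breakpoint from the story would be `break f if $caller_is("g")`.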
11:21
You know, almost anything you can think of that you could do in GDB. So those are like simple things, commands and functions. Next, we're going to talk about something really useful. Pretty printing. If you've ever debugged C++, you've probably seen this.
11:44
This is printing a local variable. Now, if you're experienced at debugging C++, you can wade through this and figure out that this is, any guesses what this is?
12:00
It's a list of strings. Now, this will also test your knowledge of libstdc++ if you think you could find out where the strings are. It's almost impossible to print a string in libstdc++. A pretty printer does this for you. And the idea behind a pretty printer is just that when GDB goes to print something,
12:24
first it says, does anybody written in Python want to take a stab at it? We wrote a full suite of printers for basically every data structure in libstdc++. And, you know, going back here, like, well, I don't have that as a slide,
12:40
but if you actually try to print a string in libstdc++, I think strings are one of the okay ones, but there's a few data structures that are written in a completely non-obvious style where even if you print them, you won't actually see any of the underlying data. You just see this little head object, and you have to know to cast it to some magic sub-object and so on. The Python printers, remember our design, they're hooked into GDB.
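To make the "does anybody written in Python want to take a stab at it" step concrete, here is a minimal sketch of a printer and its lookup function. The type `my_string` with `data` and `length` fields is hypothetical (it is not a libstdc++ type), and `is_my_string` is an invented helper; the lookup-function protocol, `to_string` and `display_hint` are the real pretty-printing API.

```python
import re

try:
    import gdb  # only importable inside GDB
except ImportError:
    gdb = None

MY_STRING_RE = re.compile(r"^my_string$")  # hypothetical type name


def is_my_string(tag):
    """True if a struct tag names the hypothetical my_string type."""
    return tag is not None and bool(MY_STRING_RE.match(tag))


class MyStringPrinter:
    """Print `struct my_string { char *data; size_t length; }` as a string."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # lazy_string defers reading the bytes until GDB really prints them.
        return self.val["data"].lazy_string(length=int(self.val["length"]))

    def display_hint(self):
        return "string"


def lookup_my_string(val):
    """Return a printer for val, or None so GDB asks the next lookup."""
    if is_my_string(val.type.strip_typedefs().tag):
        return MyStringPrinter(val)
    return None


if gdb is not None:
    # During auto-load, current_objfile() is the library or program this
    # script belongs to, so the printer is tied to it; otherwise go global.
    objfile = gdb.current_objfile()
    target = objfile.pretty_printers if objfile else gdb.pretty_printers
    target.append(lookup_my_string)
```

Because the lookup returns None for anything it does not recognize, printers for other types are unaffected.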
13:05
They're associated with libstdc++. They're maintained in libstdc++. They have a test suite in libstdc++. So, in theory, if libstdc++ changes internally,
13:20
you know, the test suite catches printing regressions, and this doesn't involve GDB at all, really. All this involves on the GDB side is that we maintain some minimal ABI compatibility at the Python layer. We don't break your Python script. So, like when libstdc++ switches to the next ABI,
13:44
nothing's going to break. The question is how you write those tests in libstdc++ without making it depend on GDB. Actually, we do just make it depend on GDB.
14:03
Which, you know, it's a little chicken and egg problem four years ago or whatever. But, you know, today it's fine as long as we don't break the API. But like the pretty printing API, we kept it extremely simple.
14:21
Maybe too simple, but extremely simple. So, there's not much there to break. And like I said before, all these things work with GUIs as well as the command line interface. I don't have like nice screenshots for that, but the way it's designed is this is all printed in a way
14:44
that all the GUIs, they need to set a flag or something, but it all just works. And that's not something that if you're writing a printer for your application, you don't need to worry or think about that. That's all just completely hidden to you. Now, why would you want to write your own printers?
15:00
You know, libstdc++ is maybe the biggest offender, but tons of programs have string classes. It's nice to be able to print a string that looks like a string. Lots of programs do like tagged object representations. You know, I think JavaScript does that. Every Lisp ever does that. You can write Python to decode those representations,
15:21
and then you print an object, and it looks like an object, not like some big integer. So, type printing. This is not as awful. Well, you know, this shows how sort of beaten down I am by C++. This doesn't look quite so awful to me
15:40
until you realize that this goes on for five pages. And, you know, a lot of it is like methods, which are just generally not interesting. This is printing a type. If I print a type, usually I'm interested in the data members, you know. So we added some nice things. Some of it in Python, some just straight to the core,
16:02
like turning off methods and so forth. But you can see with a type printer, which is another kind of thing that we supply, you can say this big gooey mess at the top, something you would never write in your own program, std::basic_string, la-di-la-di-la-di-la-di-la.
16:22
Well, what you really meant is std::string. And then you can see it's a template. Now it tells you what the template arguments are. And then we remove a bunch of crap. Oh, this one has a bug. I think this should, oh, no. Yeah, that should probably say string right there. Anyway, this one is maybe less generally useful.
16:44
It's more of a, if you're a heavy template user, you want something like this. And it may not be immediately clear to you. You might think, why do I, you know, what's the value of it like? But in a case like this, std::string is a typedef that everyone,
17:01
that's what everyone actually writes in their program. And plenty of programs, that's a piece of knowledge that's not in your program, you know. There's a lot of typedefs nobody uses. But that's a piece of knowledge that you have that your program doesn't and that GDB doesn't. And so you can teach it that through the type printers. You can say, you know, whenever you see this hairy mess,
17:22
really everybody writes this. And generally speaking, if you, you know, when you go to print this type, you should print the thing that people really want to see. So, and again, this is, you know, hook activated, associated with your application, maintained by you.
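Teaching GDB that a hairy template name is "really" a short one could be sketched like this. The table `FRIENDLY_NAMES`, the helper `friendly_name` and the printer name are invented for illustration, and the exact spelling GDB reports for the template can vary with compiler and library version; `gdb.types.register_type_printer` and the `instantiate`/`recognize` protocol are the real type-printing API.

```python
try:
    import gdb  # only importable inside GDB
    import gdb.types
except ImportError:
    gdb = None

# Spelled-out name -> the name people actually write (illustrative entry).
FRIENDLY_NAMES = {
    "std::basic_string<char, std::char_traits<char>, std::allocator<char> >":
        "std::string",
}


def friendly_name(tag):
    """Return the preferred spelling for a type tag, or None to decline."""
    return FRIENDLY_NAMES.get(tag)


class FriendlyNameRecognizer:
    def recognize(self, type_obj):
        # Returning None tells GDB to print the type as usual.
        return friendly_name(type_obj.tag)


class FriendlyNamePrinter:
    """Has the attributes GDB expects: name, enabled, instantiate()."""

    def __init__(self):
        self.name = "friendly-names"  # invented name
        self.enabled = True

    def instantiate(self):
        return FriendlyNameRecognizer()


if gdb is not None:
    # A None locus (no objfile being auto-loaded) registers globally.
    gdb.types.register_type_printer(gdb.current_objfile(),
                                    FriendlyNamePrinter())
```

Keeping the mapping in a plain table is a design choice for the sketch: the knowledge about which typedefs matter lives in one place that the application's maintainers can edit.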
17:42
The question is, why not track that through debug info? Well, that's, yeah, that's what I meant when I said, there's plenty of typedefs nobody uses. You could do that, right? You could look through all the typedefs in the debug info but you don't know which ones are important. There's no flag for that.
18:01
You could add an attribute and have the compiler pass it on through. You know, this is easier. That's all. Yeah, there might be more than one, you know. And also, then you're talking about potentially scanning a lot of debug info, you know.
18:23
I mean, maybe. All right. So, more Python. I know, we're maniacs. So here is a greatly reduced stack trace. Oh, yeah, you know what? Stack traces remind me of something.
18:42
These things, the other point I wanted to make early, and I forgot to write it down on my slide last night. These things are also integrated into other parts of GDB, like pretty printing and type printing. They're integrated into backtraces. So when you do a backtrace, pretty printers are called, type printers are called.
19:01
So by default, when you get a backtrace, everything is what you expect to see, what you actually generally want to see. And then again, if you want to go to the next level down and see the raw bits, all these features are disableable. So you can just turn them off and get one of those massive C++ traces that you're used to.
19:22
So, here's a backtrace I edited down. This is a signal emission in glib. So, I don't show frame eight, but frame eight is a piece of my program that emits a signal on a glib object. Then frame seven through one
19:41
are glib's implementation of signal emission. And then frame zero is the callback. So if you're debugging your program and you see this stack trace, you might see, if we pretend that I remembered to put frame eight on it, we'd see nine frames,
20:01
but really about two of them are interesting at all. The rest is just gobbledygook. In particular, six frames of glib signal emission. That's awesome, and you want to see that if you're writing glib. But if you're just writing a GUI, you don't care. This is just clutter. And this goes back to that thing.
20:22
You want to focus on the part of your program that you're interested in debugging. You want to be able to filter out the stuff that is not very interesting. So we, Phil, added a feature last year called frame filters. Frame filters, they're activated by hooks.
20:41
They're associated with your application. You see where I'm going with this? They're disableable. They work with the command line interface and with the machine interface. And what they let you do is modify when we're computing a backtrace and we're going to print the backtrace,
21:01
they let you mess around with what is going to be displayed. So here's what it looks like now. There's frame zero. That's my callback. There's frame seven. And we've changed the name of frame seven. It says emit signal, the name of the signal,
21:20
on the object it's being emitted on. And then in brackets there, I think that's the class name of the particular object. And then you can see the rest of the frames are indented. This says these frames conceptually make up this frame. They're an implementation
21:41
detail of this thing that you might be more interested in. Now again, you can turn this off. We're going to add a feature so that we didn't do it in the first revision, but we're going to add a feature so that those kind of indented frames you can just make them go away entirely. So your trace would look like frame zero, frame seven, frame eight.
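The folding of implementation frames under one conceptual frame can be sketched as below. This is not the filter from the glib bugzilla: the function-name heuristic in `is_signal_internal`, the helper `fold_runs` and the filter name are assumptions made for illustration, and a real filter would also decode the signal name and object. The `filter` method, `gdb.frame_filters`, and `FrameDecorator` with its `function` and `elided` methods are the real frame filter API.

```python
try:
    import gdb  # only importable inside GDB
    from gdb.FrameDecorator import FrameDecorator
except ImportError:
    gdb = None


def is_signal_internal(function_name):
    """Illustrative heuristic for frames that belong to signal emission."""
    prefixes = ("g_signal_emit", "g_closure_invoke",
                "signal_emit_unlocked_R", "g_cclosure_marshal_")
    return function_name is not None and function_name.startswith(prefixes)


def fold_runs(items, is_internal, fold):
    """Yield items, replacing each run of internal ones by one folded item.

    Items arrive newest first, so the last element of a run is the
    outermost frame; fold(outermost, inner_frames) builds its replacement.
    """
    run = []
    for item in items:
        if is_internal(item):
            run.append(item)
            continue
        if run:
            yield fold(run[-1], run[:-1])
            run = []
        yield item
    if run:
        yield fold(run[-1], run[:-1])


if gdb is not None:
    class SignalFrame(FrameDecorator):
        """Outermost emission frame, with the inner ones shown indented."""

        def __init__(self, outer, hidden):
            super().__init__(outer)
            self._hidden = hidden

        def function(self):
            return "emit signal"  # a real filter would add name and object

        def elided(self):
            return iter(self._hidden)

    class SignalFilter:
        def __init__(self):
            self.name = "glib-signal-sketch"  # invented name
            self.priority = 100
            self.enabled = True
            gdb.frame_filters[self.name] = self

        def filter(self, frame_iter):
            return fold_runs(
                frame_iter,
                lambda f: is_signal_internal(str(f.function())),
                SignalFrame)

    SignalFilter()
```

Since the filter only rewrites what is displayed, disabling it gives back the raw backtrace unchanged.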
22:04
Now when is this useful? This is useful in lots of ways. Here's an example, glib. If you go look in the glib bugzilla, this code's in there. This is a case where you're using some
22:20
framework. The framework's relatively complicated. It needs to do some complicated things, but you as a user of the framework, there's a lot of it you don't really care about. You want to see the parts that are more interesting to you, not how it's implemented. Just like data structures are opaque, you want some aspects of the backtrace or the values
22:40
or so forth to be opaque. It's useful in interpreted languages. If you do a backtrace in Python, GDB has Python in it now, so we do backtraces and we see a bunch of stuff from the Python interpreter. You know, nominally interesting, but generally
23:00
not relevant to me. So you can write this kind of frame filter for an interpreted language, and it can say instead of, you know, emit signal, whatever, it could say Python method, blah de blah, Python, Ruby, Emacs Lisp, whatever. And I'm sure you could think of other uses for it along those lines.
23:23
Any questions about that? I'm sorry? Can you add frames? You cannot add frames. There's two, and that's a good question. So there's, and we're going to talk a little bit about that in the JIT stuff coming
23:42
up. There's kind of two ideas behind adding frames. One is to just make one up, like when you're printing, and have it sort of mean nothing. Not mean nothing, but you know, just come out of nowhere. Can't be done today. There was just
24:01
kind of an internal limitation preventing us from doing that, and we chose to ship with that restriction because the patch was already quite large and gone through five rounds of review or whatever, and you know, that kind of restriction is something to be lifted later, you know. It's
24:22
always okay to relax your errors, you know. So that can't be done. The other idea behind making up a frame is integrating into the actual unwinding. This is not an unwinding feature. It looks like one, but what this is actually about is display.
24:40
It's not hooked into the lowest levels of unwinding, like saying we found this frame. How do we go to the next one? What registers do we want? You know, that kind of thing. Now that can be done a different way, and that's something that we're going to, I think, expand. And that part isn't Pythoned up,
25:02
but you know, it's not unreasonable to do that. Does that answer your question? Cool. Yeah. Yeah, wait for a microphone and I can't quite hear you.
25:35
Just about, you know, functions, you were talking about breakpoints. Yeah. And you said you can, when
25:41
f is called by g, you can set up a conditional breakpoint on that. Right. How does that work with function pointers, you know, when you have structures with pointers and functions, does that work also? Yeah, that works because you know, when you're unwinding, there sort of
26:01
is no such thing as a function pointer. You just have the concrete functions on the stack, you know. So, that's not an issue. Does that make sense? This is one of those things that GDB knows that your program doesn't know. GDB knows how to like, compute the stack trace.
26:22
So, these functions are using GDB's knowledge. Are we ready to move on? Alright. You may have to yell or something. It's tough to scan. Okay.
26:40
So, that's Python. There's a lot of cool stuff. We're going to add more. Now, I want to talk to you about probes. Probes are something that's very cool. They came out of, well, the first time I heard of them was from DTrace. SystemTap implemented a source-compatible
27:01
version of probes different from, you know, I think they're implemented differently from DTrace. Don't be scared by SystemTap. The probes themselves are not specific to SystemTap. In fact, Roland McGrath, the glibc maintainer, wrote the current probe implementation.
27:23
They are a masterpiece of ELF and GCC knowledge. So, they really just rely on some ELF features and some GCC features to work. They're implemented in a single header file. Well, there's an asterisk. There's actually two header files. One that has a single define in it.
27:42
I call it one. They're extremely low overhead. Extremely low. A probe introduces one relocation into your library. Any number of probes. One relocation for all of them. So, you know, the start-up time overhead
28:01
is ridiculous. I mean, you can't even measure that. The probe itself is a single no-op in your program. It doesn't get much cheaper than that. And probes can have arguments and if you choose your arguments carefully, there's no penalty at all for that.
28:21
If you make sure that your arguments are live at the point at which you insert your probe, the compiler doesn't have to do anything special. And really, your probe is just a single no-op with no overhead for computing the arguments. What that means is you can use these probes in hot functions. It's totally fine.
28:41
Some other nice features of probes. And we're going to, I'm just going through, like, the what. And then I'll talk about why this is interesting in just a minute. But probes don't require any debug info. They're in a special section that rides along with your program. It's not stripped by strip. And
29:03
what's interesting about probes is that they let you name a spot in your program that doesn't change as you change your program. You know, if you respect sort of the meaning of the probe. And it can be in the middle of a function. So what that means is, like, suppose you're
29:21
writing these Python scripts. You know, you're adding new commands to GDB to do things with your program. And you want to add commands that inspect the program while it runs. And I'm going to show you an example of that pretty soon. They inspect the program while it runs. But sometimes, you know, you want to say, well, I need to
29:41
inspect the middle of this function. And I don't want to refactor this function for whatever reason. There's always a reason. Well, you can stick a probe point in there, and the probe point lets you reliably find that spot in the middle of a function without messing up your program logic or having to hard code, you know,
30:02
a line number into your Python code, which is inherently fragile. And then, not requiring debug info, what's cool about that is you can use this with production code. You can extract these arguments from your running program without requiring the debug info. So you can
30:20
do it in production. And one way that we do this internally in GDB is we added probes to like the libgcc unwinder. The unwinder is the part of libgcc where when you throw an exception, it unwinds the stack looking for the place that catches the exception. Now previously, a few years ago in GDB, if you were going
30:40
through your program next, next, next, next, next, and you next it over some function, and that function throws an exception, GDB would just lose control because it didn't understand about exceptions, and it didn't have any hook into there. So we added a probe into libgcc, and this probe is like right in the middle of the unwinder. And then
31:00
what we did is we changed GDB to look up this probe. If you look at the syntax here, it's a stap probe: that's the provider name, that's the name of the probe, so it's libgcc, the unwind probe. And then these are arguments: this argument tells how far up the stack the exception is going, and this argument tells
31:21
who the, you know, the PC at which the jump is going to happen, you know, where it's jumping to. And so what this does is it lets GDB say, oh, when you're nexting through the code, I'll stick a breakpoint on this probe, and that way, whenever an exception happens, I get informed, and I can decide what to do. I can let it keep going, or I can set a breakpoint at the handler and
31:41
we gain control. And so now, nexting in the presence of exceptions works exactly like you'd expect. And the way we did it is this way. And the reason we like this way is because you don't need debug info, you don't have to install all the libgcc debug info, or
32:01
hope that your distro built GCC with debug info, or anything like that. It lets you focus better because if you install the libgcc debug info, suddenly you're stepping into libgcc. There's all this crap you don't care about. You really don't want to read the unwinder.
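The unwinder hook described above can be approximated from a user script. This is a minimal sketch, not GDB's internal implementation: it assumes that `gdb.Breakpoint` accepts the same `-probe-stap provider:name` location syntax as the `break` command, and that the two probe arguments are exposed as `$_probe_arg0` and `$_probe_arg1`. The helper names `probe_location` and `describe_unwind` are made up for illustration.

```python
try:
    import gdb  # only importable when the script runs inside GDB
except ImportError:
    gdb = None


def probe_location(provider, name):
    """Build a location string for a SystemTap probe (hypothetical helper)."""
    return "-probe-stap {}:{}".format(provider, name)


def describe_unwind(frame_addr, handler_pc):
    """Format the two probe arguments for display (hypothetical helper)."""
    return "exception unwinding to frame {:#x}, handler at {:#x}".format(
        frame_addr, handler_pc)


if gdb is not None:
    class UnwindProbeBreakpoint(gdb.Breakpoint):
        """Report every exception unwind seen through the libgcc probe."""

        def __init__(self):
            super().__init__(probe_location("libgcc", "unwind"))

        def stop(self):
            # Probe arguments are read without any debug info for libgcc.
            frame_addr = int(gdb.parse_and_eval("$_probe_arg0"))
            handler_pc = int(gdb.parse_and_eval("$_probe_arg1"))
            print(describe_unwind(frame_addr, handler_pc))
            return False  # log and keep running instead of stopping
```

Inside GDB, after sourcing the file, `python UnwindProbeBreakpoint()` would install the breakpoint. Outside GDB only the two helper functions are defined.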
32:22
And it just lets you focus a little better while letting the debugger continue to do its work. And we've added a few more of these probes. There's a few more GDB uses. There's some in glibc, and there's some in libstdc++. And, you know, GDB uses them internally to implement features, you know, kind of base debugging features.
32:40
But you can also just use them from the command line. You can say, you know, break -probe and then the name of the probe. You can run info probes. Readelf has some flag, so you can dump them. And then if you do add them, the nice thing is they do work with SystemTap. So you can also use them in your SystemTap scripts if
33:01
you're debugging user space processes with SystemTap. So anyway, I think it's quite a cool feature. And honestly, if someone had told me the constraints on this project beforehand, you know, and then asked me to implement it, I would have probably quit my job. I didn't believe it was possible, but Roland
33:21
did it. Alright. Any questions about this? Yeah. Him first, and then... Yeah, go ahead. Can a program introspect and see its own probes?
33:44
He's asking if a program can introspect and see its own probes. Well, I mean, it could because it could open the file and, you know, like there's a library or an executable or whatever.
34:01
It could open it, parse the ELF, find the section, etc. It could do that. There's no, like, canned way to do it. And I think you know, like, it would be funky to like, you may be asking about like patching, you know, sticking a patch in there, right, or something like that.
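As a rough illustration of the "parse the ELF, find the section" route, here is a sketch of only the last step: decoding the payload of one probe note once the program has located it. It assumes the note layout as I understand it from the SystemTap documentation (three address-sized words for the probe PC, the base address and the semaphore, followed by three NUL-terminated strings). The function name and the sample values are hypothetical.

```python
import struct


def parse_stapsdt_desc(desc, addr_size=8, little_endian=True):
    """Decode the descriptor bytes of a single SystemTap probe note.

    Assumed layout: pc, base, semaphore (one address-sized word each),
    then provider, probe name and argument format as NUL-terminated strings.
    """
    word = "Q" if addr_size == 8 else "I"
    fmt = ("<" if little_endian else ">") + word * 3
    pc, base, semaphore = struct.unpack_from(fmt, desc, 0)
    strings = desc[struct.calcsize(fmt):].split(b"\0")
    provider, name, args = (s.decode("ascii") for s in strings[:3])
    return {
        "pc": pc,
        "base": base,
        "semaphore": semaphore,
        "provider": provider,
        "name": name,
        "args": args,
    }
```

Locating the note section in the running program's own ELF image is left out here. As the answer says, there is no canned library for that part.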
34:21
Profiling. Okay. Yeah, I mean, it's tricky just because, you know, GDB gets to cheat. It doesn't matter that the text segment is read-only. GDB still gets to stick breakpoints in and write all over stuff, you know. So, for your own program, it would be tricky.
34:41
And you don't want to open that up necessarily because of security. Anyway, it could be done, sort of, but it's not directly supported or there's no library for it. I just want to say for profiling, Perf now should also
35:00
recognize them. So you can ask Perf to... Perf knows about SystemTap probes now? Okay, cool. Yeah, and if you, you know, by the way, if you work on a distro, when you build libgcc and glibc and libstdc++, you should just make sure that the appropriate SystemTap header
35:21
is installed. It'll make GDB work better and it'll make your users happier and like I said, the overhead is nil. Or you shouldn't use probes for profiling your own program. Would it be feasible to have two types of probes,
35:41
one type which is under the control of the program for its own use and another type for GDB? Mainly I'm thinking in terms of being able to turn tracing, logging on and off, with no penalty. Yeah, I don't know. That's, I mean I don't see why not, but that's outside of my
36:02
sort of zone. Oh, I thought maybe there was more. Alright, so next, how are we doing? Alright, the JIT API.
36:21
Like I said, JITs are becoming very common. Previously I think GDB, I mean I never tried it. I assume it didn't work very well. If it worked at all. So, there's two JIT APIs now, two ways to teach GDB about your, you know, the compiler that's running inside of your program.
36:42
One, the first one and maybe the most basic one, is you have your compiler inside of your program write an ELF file in memory. And this ELF file can contain anything. So it can have DWARF, the debug info, to explain everything about your program.
37:02
How to unwind everything, how to find all the local variables, file names, function names, anything that, you know, DWARF can represent. And then, you just, you have a magic function. This was implemented before we had SystemTap probes. If we did it today, we'd just have a SystemTap probe for it. But,
37:21
you just have a magic function in your program. And once this ELF is in place, you update some data structure and you call this function. And GDB just has a breakpoint on this function. GDB knows, you know, this is the protocol. And so GDB, when your program stops there, GDB goes
37:41
and finds that ELF and sucks all the data out and creates its own data structures and continues on. And then everything just works. It's quite nice. I think it's used in the wild. If I remember correctly, there's some LLVM thing for this.
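The register-then-call protocol just described can be modelled in a few lines. This is only a model, assuming the structure layout documented in the GDB manual's JIT interface chapter: a real JIT would define these structures and the empty hook function in its own native code under the exact symbol names GDB looks for, so GDB would not notice this Python version. `register_symfile` and `_keepalive` are hypothetical names.

```python
import ctypes

JIT_NOACTION, JIT_REGISTER_FN, JIT_UNREGISTER_FN = 0, 1, 2


class JitCodeEntry(ctypes.Structure):
    pass


# One node per in-memory ELF file, kept in a doubly linked list.
JitCodeEntry._fields_ = [
    ("next_entry", ctypes.POINTER(JitCodeEntry)),
    ("prev_entry", ctypes.POINTER(JitCodeEntry)),
    ("symfile_addr", ctypes.c_void_p),
    ("symfile_size", ctypes.c_uint64),
]


class JitDescriptor(ctypes.Structure):
    _fields_ = [
        ("version", ctypes.c_uint32),
        ("action_flag", ctypes.c_uint32),
        ("relevant_entry", ctypes.POINTER(JitCodeEntry)),
        ("first_entry", ctypes.POINTER(JitCodeEntry)),
    ]


descriptor = JitDescriptor(version=1, action_flag=JIT_NOACTION)
_keepalive = []  # stops Python from freeing buffers the list points at


def jit_debug_register_code():
    """Stand-in for the empty function the debugger breaks on."""


def register_symfile(elf_bytes):
    """Link a new ELF image at the head of the list, then call the hook."""
    buf = ctypes.create_string_buffer(elf_bytes, len(elf_bytes))
    entry = JitCodeEntry()
    entry.symfile_addr = ctypes.addressof(buf)
    entry.symfile_size = len(elf_bytes)
    if descriptor.first_entry:
        entry.next_entry = descriptor.first_entry
        descriptor.first_entry.contents.prev_entry = ctypes.pointer(entry)
    descriptor.first_entry = ctypes.pointer(entry)
    descriptor.relevant_entry = ctypes.pointer(entry)
    descriptor.action_flag = JIT_REGISTER_FN
    _keepalive.append((buf, entry))
    jit_debug_register_code()  # the debugger would stop here and read the list
    return entry
```

The ordering is the point of the sketch: the data structure is fully updated before the hook function is called, so whatever stops on the hook sees a consistent list.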
38:01
The problem with it is, you know, writing out a full ELF in memory is a little bit wacky. You may not want to be doing that. You may want something a little slimmer. So there's also a GDB plugin approach. And the plugin is just a shared library that GDB knows how to load. It's a little manual right now. Like I said earlier,
38:21
it's not all Pythoned. So, it doesn't have all those nice properties that the Python approaches tend to have. But this plugin, there's a pretty simple API. It can represent less information to GDB. But on the other hand, it's code you wrote that runs inside of GDB.
38:41
And so, it can know a little bit more about the program you're debugging. And that way, you don't have to go through all the effort of writing out an ELF and DWARF. If your JIT already has data structures in memory describing the program, it could just read those directly and use those. And that's something, the future,
39:02
there's some patches pending to expand that. Currently, the API is quite simple. There's only a few things you can tell GDB about. And these patches expand the range of what you can tell GDB about. And I think it would be useful to hook this up to Python. It's just nobody's done that.
39:23
Alright. Next, I want to show a demo. And this is just to give you an idea of stuff that GDB can do. You know, kind of why we... What the hell? Oh, right.
39:43
It's not called setup. It's called... Alright. So, why Python, you know? Like, out of all the languages. Well, the answer is, I don't know. GDB picked Python before I worked on GDB. At least, I believe there was some kind of agreement that that was
40:01
what the choice was. I implemented a good chunk of it. But what I like about it specifically is just that, yeah. Yeah, I'll move the window up. It's not interesting yet, so don't, you know. But this is just G, well, I'll
40:21
tell you in a second. But what's nice about Python is just that there's a library for everything. You know, everything exists. If you look in the distro, I don't know, there's a thousand Python packages. And if you look on pip or whatever, there's you know, 20 gazillion
40:41
or something like that. I think that's pretty exact. So, the point being, whatever you may think of Python as a language, it's very pragmatic. You can do anything with it, you know. Somebody has already got the library you want. So, you can
41:01
you can make your debugging simpler without undue effort, you know. In cases like this, I'm really a fan of sort of quick and dirty hacks as opposed to well engineered clean hacks. Because, you know, let's face it, debugging isn't the only thing that you have to do. It's an important part
41:21
of the development cycle, but it's not the only part. So, what we have here is this is going to be an example of like you know, a live example of why Python is kind of fun. So, here we have GDB, we're debugging GCC.
41:40
We're... let's see, here we go. GCC is compiling this function. I'm sure you all have minds like steel traps and you'll see that this function doesn't actually do anything. Because, you know. Anyway, so here we are
42:01
debugging, we're debugging GCC. And we're in the middle of, we're actually at main I think. Yeah. So then, well, I wrote a GUI. This is just some Python code. Where we have... you can't see it yet, it's just a
42:21
blank window. But you know, we imported Python GTK. We run it in a separate thread. We pass messages back and forth to the GDB command loop. And then we continue. And this is the control flow graph of that function.
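The thread-plus-messages arrangement can be sketched without any GUI toolkit. This is an assumption-laden outline rather than the demo's code: a plain worker thread stands in for the GTK main loop, and `gdb.post_event` is used for the direction back into GDB, since GDB's Python API should only be called from GDB's own thread. `GuiThread` and `call_in_gdb` are hypothetical names.

```python
import queue
import threading

try:
    import gdb  # only importable when the script runs inside GDB
except ImportError:
    gdb = None


class GuiThread(threading.Thread):
    """Worker thread standing in for the GUI toolkit's main loop."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.drawn = []

    def send(self, message):
        """Called from GDB's thread to hand data to the GUI."""
        self.inbox.put(message)

    def run(self):
        while True:
            message = self.inbox.get()
            if message is None:  # sentinel: shut the loop down
                break
            self.drawn.append(message)  # a real GUI would redraw here


def call_in_gdb(func):
    """Run func on GDB's main thread, or directly when outside GDB."""
    if gdb is not None:
        gdb.post_event(func)
    else:
        func()
```

A GUI callback such as a button press would go through `call_in_gdb`, while breakpoint handlers running in GDB would push updates with `send`.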
42:41
And this is implemented in I think less than 200 lines of Python code. And what it does is it sets a bunch of breakpoints in GCC wherever it manipulates the control flow graph. And at those points it pulls out the nodes and edges of the control flow graph, stuffs them into a Python graph structure and then draws it into this window.
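The breakpoint side of such a script might look roughly like this. It is a sketch under stated assumptions, not the code from the demo: the GCC function name `make_edge`, its parameter names `src` and `dest`, and the `index` field of a basic block are my guesses at GCC internals, and a plain Python set with Graphviz DOT text output replaces the igraph drawing used in the talk.

```python
try:
    import gdb  # only importable when the script runs inside GDB
except ImportError:
    gdb = None


class FlowGraph:
    """Tiny stand-in for the Python graph structure fed by breakpoints."""

    def __init__(self):
        self.edges = set()

    def add_edge(self, src, dst):
        self.edges.add((src, dst))

    def nodes(self):
        return sorted({node for edge in self.edges for node in edge})

    def to_dot(self):
        """Render the graph as Graphviz DOT text."""
        lines = ["digraph cfg {"]
        lines += ["  {} -> {};".format(s, d) for s, d in sorted(self.edges)]
        lines.append("}")
        return "\n".join(lines)


if gdb is not None:
    class EdgeBreakpoint(gdb.Breakpoint):
        """Record an edge each time the debuggee creates one."""

        def __init__(self, graph, function="make_edge"):  # assumed GCC name
            super().__init__(function, internal=True)
            self.graph = graph

        def stop(self):
            frame = gdb.selected_frame()
            # Assumed parameter and field names; adjust to the real sources.
            src = frame.read_var("src").dereference()["index"]
            dst = frame.read_var("dest").dereference()["index"]
            self.graph.add_edge(int(src), int(dst))
            return False  # never actually stop the program
```

Returning `False` from `stop` is what makes the display feel passive: the program keeps running while the graph fills in.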
43:01
And, you know, we made it update while you step, so let me... dammit. Oh well. This window is too small, but I'll get to the right spot and you'll see. Here we go.
43:20
cleanup_tree_cfg_bb. You know, what does that do? I think it's about to optimize the control flow graph. Oh yeah. Look at that. So you can see it's very reactive. It just watches. And this is like super basic, you know. You could do more GUI stuff if you wanted, if this was interesting to you, you know. If it was something
43:40
you're debugging a lot, right? You're spending a lot of time in the control flow graph. Well, you know, I don't know that much about GCC. I certainly don't know anything about like pygtk or igraph, like this Python graphing thing. And still this took me a couple hours to write, something like that. It's super easy to do
44:01
this stuff, you know. There's no reason that you shouldn't do it. So. Here we go. It'll optimize some more. There you go. GCC also knows that function does nothing. Alright. That's all I got. Are there any questions?
44:24
Nothing? You can see like it's so easy to do this stuff. Source window for the GUI. It's just trivial. That's like ten minutes with GTK text view or GTK source
44:40
view or something. Yeah. You know there is a tool which is called DDD, the Data Display Debugger. I never could use it because it's a graphical tool. So I'm wondering, is this tool still interesting with what you propose? Is there something this tool can do that you cannot do? Or
45:00
is this tool dead because of what you propose? So, you're asking if this is something, if DDD can do this or could not do this? Okay. If DDD is obsolete, well, I mean the answer is yes, but not for the reason you're thinking. DDD... yeah, it's true
45:21
though. We always have to tell people don't use it. DDD doesn't use MI. Remember the machine interface I was talking about? It runs the CLI and tries to parse the output. But we never make any promises really about the CLI output. And if you're parsing the CLI output you're just setting yourself up for pain. So occasionally
45:40
people come and complain like DDD stopped working. That's not our fault. Not anymore. MI's been around 15 years or something. There's no excuse. Now, what's... yeah, DDD, I know of it. And it's cool. You know, what it does is cool. What's cool about this
46:00
is, you know, that window... just picture for a minute, if you would, that you want to do a graph display of a data structure in your program. Well, you could add that to DDD or Eclipse or anything. You could certainly add it to anything. Because those really have the same information that GDB does.
46:21
But the difference there is one of, say, practicality. In Python I can write this really quickly in a couple of hours. Unless I was already an Eclipse expert I don't believe I could do that in Eclipse in a couple of hours. It's just a much more hairy environment.
46:41
And in DDD you know then you're talking about getting into the C code and I mean in DDD specifically, trying to parse GDB's output and sending a lot of prints and I mean, I wouldn't want to do that. Certainly not when I can do it in two hours. The point is, you kind of
47:02
have an extreme of customizability here. And I, you know, I wanted to talk about one other crazy example. I saw a dude on GitHub who hooked CherryPy into GDB, which is like this web server. So, you know, you can write a web interface and drive it from your browser if you want to do that.
47:21
It's really up to you. What's easiest for you? So I feel like, the previous question, the difference that strikes me between DDD and this is that DDD tried to do it generically, it tried to figure out how to represent arbitrary data structures graphically, whereas this is, you're expected
47:42
like, you can do it, but you know about your data structure and you can build something specific. But on the other hand, you don't get anything for free. You have to write whatever you want. And there's pros and cons. That is true, but like in this case, the display part, you know I mean, I can show you, I wrote like zero lines of code for that.
48:01
It's some call in igraph. Please draw this now. You know, and the mapping between GCC's data structure and the igraph data structure is just, it's extremely trivial, you know, it's like this is a node, that's a node. You know, so
48:21
yeah, I mean I get what you're saying. Would it be interesting to try and do that, like what you've done, have someone build a project that does that for arbitrary structs? Yeah, okay. I certainly think that would be interesting. I think it would be interesting to say, you know, you could abstract out, right, and say here are some pre-canned visualizers
48:40
and they could work in the same way as all of our other hooks you know, that we have where they'd be associated with your application and some kind of registration process and you could say show me a linked list and it could talk to your linked list abstractor thingy and display it, you know. That would be interesting.
49:00
For me, this is just noodling around so, you know, I don't want to commit to writing a GUI, but, you know, I'd just put it out there as sort of a provocation to give you an idea of things you can do. You know, yeah. Other questions?
49:26
What GUI would you recommend for using with GDB now? What would I recommend? Yeah. You know, there's a lot of them that are good. It also depends on
49:41
what your needs are, you know. Like, if you like IDEs, there's Eclipse. There's Qt Creator. If you like something more minimal, there's Nemiver, KDevelop. I don't know if KDevelop is more minimal. I've never actually used that one.
50:01
Those ones are pretty good. Red Hat's working on a trimmed down Eclipse where it's just like a wrapper for the debugger for people who don't want the whole IDE thing. There's a project on the Eclipse Wiki that you can follow for that if you're interested in that.
50:21
Anyway, that's a handful of decent ones. Alright, that's it for questions. Thank you very much Tom. Up next in about 10 minutes is...