Valgrind and debuginfo
Formal Metadata
Number of Parts | 287
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/57128 (DOI)
FOSDEM 2022, 208 / 287
Transcript: English (auto-generated)
00:08
Hi, I'm Mark Wielaard. I hack on Valgrind, I live in the Netherlands, and I work for Red Hat, where I help maintain Valgrind for Fedora, RHEL and the Red Hat Developer Tools
00:24
products. For the last Valgrind release, I helped Aaron integrate debuginfod support, which is a new way to find debuginfo files, and I also improved the DWARF reading speed, so that
00:41
instead of seconds for larger programs, it only takes a few hundred milliseconds at startup. So I would like to talk a bit about why and how Valgrind uses debuginfo, which issues I faced, and which other improvements we could make.
01:01
So why debuginfo? Valgrind can mostly do its work on binary code and addresses, but the user will likely appreciate symbolic function names, source files and line numbers. Specifically, the user will provide Valgrind with symbolic function names in suppression files, and Valgrind
01:25
will want to report addresses as function names and, when possible, with source code files and lines. When Valgrind reports an issue, the backtrace is the most useful piece of information, because
01:45
it not only tells where an issue was encountered, but also how we got to that location. But these days, compilers often partially inline functions if that is cheaper than
02:01
calling them. Especially with link-time optimization, this is done a lot, and sometimes multiple levels deep. So in that case, it's really helpful to show a virtual backtrace: this function was called from another, but instead of calling it, the compiler inlined this code into the
02:23
caller, or even the caller's callers. This is so useful that we enable it by default with the --read-inline-info=yes option. And finally, it would be nice if we could
02:43
report the variables corresponding to the addresses, or maybe registers, that are being manipulated by the program. But that is fairly expensive, and it isn't really accurate enough, which is why --read-var-info is off by default. We used to have an experimental
03:07
tool, sgcheck, which was supposed to be a stack and global array overrun detector based on the var info, but it didn't really work that well, so it has been removed,
03:23
but it would be really nice to figure out a better way to use the variable location information, to make it less expensive and more accurate. Sadly, I don't really know how to do that. Anyway, for this talk, I'm ignoring unwind tables. They are described
03:51
in the DWARF standard, but in practice implemented slightly differently as EH tables, which are always available, and they work on the address level anyway. I'm also
04:07
ignoring symbol names that might be mangled and need to be demangled to make sense to the user. That is also interesting, and we added Rust symbol demangling in the last
04:24
release, but it doesn't really add to the debuginfo itself. And we also ignore non-DWARF debuginfo, like the PDB Windows debug format that's used by Wine programs,
04:46
which you can run under Valgrind. Valgrind has a PDB reader, but I don't know anything about that format. So, what do I call the debuginfo? Basically,
05:11
three parts. First we have the symbol tables. We always have the dynamic symbols,
05:23
since Valgrind only works on dynamically linked executables, and these are the exported functions that can be called between libraries and executables. This is often a small table that only covers a few functions, the exported ones, but it is useful. Then there's the
05:49
symtab, which holds the symbols that were needed for linking the program, like the internal static function names. The symtab is often moved into a separate debug file, because it isn't
06:05
really needed at runtime anymore, but if we can get at it, then the symbol tables give us a simple mapping from address range to function name. A bit more interesting are the debug line tables. These are also often moved into a
06:28
separate debug file. Technically it's a directory and file table, plus a simple program producing a matrix from addresses to file, line number, and some other properties of the instructions at
06:46
a certain address. Like the symbol table, this provides us with a simple direct mapping from an address to a source file and line number.
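To make the two mappings concrete, here is a minimal sketch in Python (Valgrind itself is written in C and does not work this way internally). The symbol entries, addresses and file names are made-up example data, and `lookup_symbol` and `lookup_line` are hypothetical helper names. The sketch assumes both tables have already been decoded into sorted lists.

```python
import bisect
from typing import NamedTuple, Optional, Tuple


class Symbol(NamedTuple):
    start: int  # first address covered by the symbol
    size: int   # number of bytes covered
    name: str


# Made-up symbol table, sorted by start address.
SYMBOLS = sorted([
    Symbol(0x401000, 0x40, "main"),
    Symbol(0x401040, 0x20, "helper"),
    Symbol(0x401080, 0x10, "cleanup"),
])
SYMBOL_STARTS = [s.start for s in SYMBOLS]

# Made-up decoded line table rows: (address, file, line), sorted by address.
# Real line tables also carry end-of-sequence markers, which are ignored here.
LINE_ROWS = [
    (0x401000, "hello.c", 3),
    (0x401010, "hello.c", 4),
    (0x401040, "util.c", 10),
    (0x401050, "util.c", 12),
]
LINE_ADDRS = [row[0] for row in LINE_ROWS]


def lookup_symbol(addr: int) -> Optional[str]:
    """Return the name of the symbol whose address range contains addr."""
    i = bisect.bisect_right(SYMBOL_STARTS, addr) - 1
    if i < 0:
        return None
    sym = SYMBOLS[i]
    return sym.name if addr < sym.start + sym.size else None


def lookup_line(addr: int) -> Optional[Tuple[str, int]]:
    """Return (file, line) of the last row at or before addr."""
    i = bisect.bisect_right(LINE_ADDRS, addr) - 1
    if i < 0:
        return None
    _, filename, line = LINE_ROWS[i]
    return filename, line
```

With these tables, an address inside `helper` resolves to a function name through one lookup and to a file and line through the other, which is all a plain (non-inlined) backtrace entry needs.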
07:04
Until DWARF 5, it isn't a completely independent table. You first need to know the compile unit, and then the file table corresponds to the main directory and file name of that unit, which is kind of
07:28
inconvenient, because the last part of the debuginfo is the compile unit, which contains
07:41
the debug information entry trees. The debug information entry (DIE) tree describes the program scopes, function arguments, variable locations, types, and lots more about the program.
08:07
So the line tables and symbol tables are fairly simple, but the DIEs are described as trees per compile unit. Originally, DWARF was
08:30
designed to describe the one-compilation-unit-at-a-time compiler.
08:40
So you compile one source file at a time, you link the object files together, and for every object file you have a DIE tree describing how that source file was compiled to object code. These days you not only have compile units, but you can also have
09:17
separate type units or shared partial units. And when using link-time optimization,
09:32
some of the units might actually describe debug information entries from completely separate source files. And what makes things even more complicated, at least for DWARF readers,
09:50
is that the description and encoding of the DIEs, the abbrevs, are separate from the actual data in the .debug_info section. It does make some sense, because you often have the same kind of
10:08
attribute, so it's useful to describe the format of the data once, and then have the data in the actual tree just reference the abbrev that describes the encoding. There are lots of
10:38
things in particular DIE trees: where to find the ranges describing the program scopes, the function entries,
10:47
the location lists describing where to find variables, and the types, so that for those variables we
11:01
know which size they have. Important, at least for us, is the size of a variable. When we don't want any inline information and no variable information, we use a simple
11:24
DWARF reader that only reads the top of the DIE tree to extract the compile unit source file and the reference to the debug line table. But if --read-inline-info or --read-var-info is enabled,
11:43
we use a different, full DWARF reader that reads the whole DIE tree looking for the interesting DIEs. In practice, we used to always use the full DWARF reader, because we want to read the program scopes to tell us which function is inlined into another.
12:06
So much of the DWARF reader speedup came from skipping all the non-program-scope entries, which would only be interesting for the variable location information, which is off by default.
12:25
So where do we find the debuginfo? Sometimes you can find all the debuginfo in the binary itself. The dynamic symbol table is always there, and sometimes the symtab. And if you
12:42
are just using your freshly compiled program, built with -g, everything else is also in the executable. But especially for distro binaries and system libraries, the symtab and the debug sections are all moved into separate debug files. If you then install the debug packages for the distro,
13:09
they can be found through some standard build-id based path names, or through the .gnu_debuglink section in the binary. That provides a file name,
13:24
which you can then look up using a standard search path. Valgrind comes with an experimental Valgrind debuginfo server (valgrind-di-server), which serves compressed
13:42
debuginfo files in chunks from the current working directory. It is completely manual: you have to find and store the debug files by hand. It was implemented to run Valgrind on a remote mobile device with limited storage, so you can serve the debug files,
14:03
or parts of debug files, from your workstation. I'm not sure how many people are actually using it, and the documentation is fairly minimal. But if you use it, it would be nice to know.
14:21
And finally, there is the new debuginfod support, which is kind of the opposite. It works completely automatically. If the debuginfod-find utility is installed, and the DEBUGINFOD_URLS environment variable is set
14:45
to point to a debuginfod server, Valgrind will just find the debug files. So Aaron wrote the support for Valgrind 3.18.1.
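As a rough illustration of two of the lookup strategies just described, the following Python sketch derives a build-id based path name and assembles a `debuginfod-find` command line. It is not Valgrind's implementation. The `/usr/lib/debug` root is an assumption (it is a common default on Linux distros), and `build_id_debug_path` and `debuginfod_command` are hypothetical helper names. The sketch only builds the command; it does not run it.

```python
import os
import shutil
from pathlib import PurePosixPath
from typing import Callable, List, Mapping, Optional


def build_id_debug_path(build_id: str,
                        root: str = "/usr/lib/debug") -> PurePosixPath:
    """Path where an installed debug package usually places the debug file.

    The first two hex digits of the build-id become a directory name,
    the rest becomes the file name with a ".debug" suffix.
    """
    return PurePosixPath(root) / ".build-id" / build_id[:2] / (build_id[2:] + ".debug")


def debuginfod_command(
    build_id: str,
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return the debuginfod-find command to run, or None if it cannot be used.

    It is only usable when DEBUGINFOD_URLS is set and the utility is installed.
    """
    env = os.environ if env is None else env
    if not env.get("DEBUGINFOD_URLS"):
        return None
    tool = which("debuginfod-find")
    if tool is None:
        return None
    # "debuginfod-find debuginfo BUILDID" prints the path of the cached file.
    return [tool, "debuginfo", build_id]
```

A caller would typically check the local build-id path first and only fall back to the debuginfod command when no debug file is installed.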
15:07
It spawns the debuginfod-find utility if it is installed.
15:24
It puts the debug files it gets from a server into a cache. It also caches negative lookups, so that it quickly knows not to ask
15:47
for the debuginfo again on future runs. The cache is shared with other utilities like GDB, elfutils, binutils, SystemTap, et cetera,
16:04
which is handy if you use GDB and Valgrind on the same binary. Some distros now set DEBUGINFOD_URLS by default; specifically, Fedora 35 does. Debian also has an official debuginfod server, as do various other distros.
16:26
And you can also try to use the federating server, debuginfod.elfutils.org. That URL also gives some more background information on debuginfod:
16:43
which client programs support it, and which distros have official or experimental debuginfod servers. At the moment, we still have a bug: when the debuginfod-find utility is
17:03
spawned, it might generate a SIGCHLD signal when it dies, which might arrive at the program that Valgrind is running instead of Valgrind filtering it out.
17:24
Hopefully it's solved by the time this talk airs. So the Valgrind DWARF reader was slow, which was exposed by the debuginfod support.
17:42
Suddenly there was always debuginfo for everything. And it was pretty bad. If you have a simple C++ hello world program linked against libstdc++,
18:02
this is on Fedora, where it is built with LTO enabled and then post-processed by dwz, a DWARF compressor, you would get 100 megabytes of debuginfo.
18:27
And so just running a C++ hello world under Valgrind took 12 seconds.
18:42
Luckily, I could reduce the reading time, so that it took only four to five hundred milliseconds.
19:01
And then I stopped trying to make it faster, because it's kind of in the range of overhead that Valgrind has anyway. So why was the debuginfo reader so slow?
19:29
Basically, Valgrind was too eager to read all DWARF information, even when some data could be skipped, and it could not reuse some data that was shared.
19:47
So we now fully skip CUs and children of DIEs without addresses when we use --read-inline-info.
20:04
We are only interested in those DIE subtrees that contain function descriptions, the program scope entries. So we can simply skip any subtree or whole CU if it isn't associated with any
20:27
addresses, because a function is always associated with addresses. We partly did that already, but when we did, and we didn't know the size of
20:42
a subtree (and we often didn't, because the subtrees are variable in length and the data is encoded in a way that we cannot know how big it is until we read the data),
21:07
we would skip those entries by reading them, but then we also interpreted all the data and actually stored it in case we had var info enabled.
21:28
But that wasn't really useful. So now we really skip those subtrees. And if we don't know the size of the subtrees, we might still need to read the data,
21:44
but we don't try to interpret it or store any references to it. The same is kind of true for reading the line tables. Some units only have source files to show where types are defined, but no associated
22:06
function addresses. We used to read all those line tables and store those source files anyway, and now we don't, which helps.
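The pruning idea can be sketched on an in-memory tree. This is a simplified Python model, not Valgrind's C reader: the `Die` class, its fields and the `collect_scopes` function are hypothetical, and real DWARF needs more care about which tags to descend into. The `visited` list only exists to show which entries were looked at.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Die:
    tag: str
    name: str = ""
    addr_range: Optional[Tuple[int, int]] = None  # None: no addresses attached
    children: List["Die"] = field(default_factory=list)


def collect_scopes(die: Die, out: list, visited: list) -> None:
    """Collect function scopes, skipping every subtree without addresses."""
    visited.append(die.name)
    if die.addr_range is None:
        return  # no addresses here: skip this DIE's whole subtree
    if die.tag in ("subprogram", "inlined_subroutine"):
        out.append((die.name, die.addr_range))
    for child in die.children:
        collect_scopes(child, out, visited)


# A unit with code: the struct type subtree is never descended into.
code_cu = Die("compile_unit", "hello.c", (0x401000, 0x401100), [
    Die("structure_type", "struct point", None, [
        Die("member", "member_a"),
        Die("member", "member_b"),
    ]),
    Die("subprogram", "main", (0x401000, 0x401040), [
        Die("inlined_subroutine", "helper", (0x401010, 0x401020)),
    ]),
])

# A unit that only describes types: skipped as a whole.
types_cu = Die("compile_unit", "types.h", None, [
    Die("structure_type", "struct big", None, [Die("member", "x")]),
])

scopes, seen = [], []
collect_scopes(code_cu, scopes, seen)
type_scopes, type_seen = [], []
collect_scopes(types_cu, type_scopes, type_seen)
```

In this model the members of `struct point` and everything below the types-only unit are never visited, while `main` and the inlined `helper` are still found.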
22:22
DWARF constructs can easily be shared between units, especially when using the dwz DWARF compressor. But Valgrind would not notice sharing of line tables or abbrevs between units.
22:42
For both line tables and abbrevs, we now reuse them when the previous CU used the same tables. And finally, we introduced lazy reading of the abbrevs.
23:02
Often we don't need to read more than the top of the DIE tree to determine whether we even need to read the rest of the tree. But we would first create all the abbrevs for all the DIEs in the unit up front,
23:23
so we could read all the DIEs if we wanted. Now we lazily read the abbrev we need for each DIE. We do cache the abbrevs, because they are shared between DIEs. But if we don't actually read all the DIEs for a CU, and with all the other changes
23:47
we often don't, we might not read all the abbrevs. And this was also the main difference between the two DWARF readers. The simple one only had enough logic to read the top of the unit tree to determine
24:05
where the debug line table was. And now, in the case where we only need to read the top of the tree, we will only read the first abbrev and not the others.
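Lazy abbrev reading with a cache can be sketched as follows. This is an illustrative Python model rather than Valgrind's code: `LazyAbbrevTable` and `read_uleb128` are hypothetical names, and the byte string is a hand-made two-entry abbrev table. The layout follows the .debug_abbrev encoding (code, tag, children flag, attribute/form pairs ended by a 0/0 pair), but DWARF 5's DW_FORM_implicit_const, which carries an extra value, is not handled.

```python
from typing import Dict, List, Tuple


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 number; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


class LazyAbbrevTable:
    """Parse abbrev declarations only as far as a lookup requires."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0        # how far the table has been parsed so far
        self.done = False   # set once the terminating 0 code is reached
        # code -> (tag, has_children, [(attribute, form), ...])
        self.cache: Dict[int, Tuple[int, bool, List[Tuple[int, int]]]] = {}

    def _parse_next(self) -> None:
        code, pos = read_uleb128(self.data, self.pos)
        if code == 0:
            self.done = True
            return
        tag, pos = read_uleb128(self.data, pos)
        has_children = self.data[pos] != 0
        pos += 1
        attrs = []
        while True:
            attr, pos = read_uleb128(self.data, pos)
            form, pos = read_uleb128(self.data, pos)
            if attr == 0 and form == 0:
                break
            attrs.append((attr, form))
        self.cache[code] = (tag, has_children, attrs)
        self.pos = pos

    def get(self, code: int):
        """Return the abbrev for code, parsing further only if needed."""
        while code not in self.cache and not self.done:
            self._parse_next()
        return self.cache[code]


# Hand-made table: abbrev 1 is a compile unit (tag 0x11, has children) with
# DW_AT_name/DW_FORM_string and DW_AT_stmt_list/DW_FORM_sec_offset;
# abbrev 2 is a subprogram (tag 0x2e) with DW_AT_name and DW_AT_low_pc.
ABBREV_BYTES = bytes([
    0x01, 0x11, 0x01, 0x03, 0x08, 0x10, 0x17, 0x00, 0x00,
    0x02, 0x2E, 0x00, 0x03, 0x08, 0x11, 0x01, 0x00, 0x00,
    0x00,
])

table = LazyAbbrevTable(ABBREV_BYTES)
cu_abbrev = table.get(1)
parsed_after_first = len(table.cache)
```

Asking only for abbrev 1, which is all that is needed to read the top of the unit tree, leaves the second declaration unparsed; it is decoded and cached on the first `table.get(2)`.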
24:23
So what can be done to improve things more? There are still two DWARF readers, but using the fully fledged DWARF reader really should not be slower now. So we should really clean up the code and get rid of the simple minimal one.
24:47
Hopefully DWARF 6 will introduce something like a multi-level line table. That might save us having to read the DIE trees.
25:02
The idea is that the line table itself will have the level of inlining in it. But we have to wait for a complete proposal and, of course, for the compilers to generate it.
25:25
Maybe we could be even more lazy in the reading. That would be a somewhat bigger rewrite.
25:40
But we should know when an address range or block is used, and we could then read the associated unit only on first contact. And if we have a .debug_aranges section, we could actually use that to
26:06
quickly get only that unit. But if there isn't a .debug_aranges section, then we would have to build an address range to CU table up front, which might make things slower again.
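A minimal Python sketch of the "read the unit only on first contact" idea, assuming an address range to CU table is already available (for example decoded from a .debug_aranges section). The `LazyUnits` class, the `load_unit` callback and the example ranges and offsets are all hypothetical; this is a proposal from the talk, not something Valgrind is known to implement.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple


class LazyUnits:
    """Map addresses to compile units, loading each unit on first use."""

    def __init__(self,
                 aranges: List[Tuple[int, int, int]],
                 load_unit: Callable[[int], object]) -> None:
        # aranges: (start, end, cu_offset) with end exclusive.
        self.ranges = sorted(aranges)
        self.starts = [r[0] for r in self.ranges]
        self.load_unit = load_unit
        self.loaded: Dict[int, object] = {}

    def unit_for(self, addr: int) -> Optional[object]:
        i = bisect.bisect_right(self.starts, addr) - 1
        if i < 0:
            return None
        start, end, cu_offset = self.ranges[i]
        if not (start <= addr < end):
            return None
        if cu_offset not in self.loaded:
            # First contact with this unit: read it now and remember it.
            self.loaded[cu_offset] = self.load_unit(cu_offset)
        return self.loaded[cu_offset]


load_calls: List[int] = []


def fake_load(cu_offset: int) -> str:
    """Stand-in for the expensive step of reading a unit's DIE tree."""
    load_calls.append(cu_offset)
    return "unit@%#x" % cu_offset


units = LazyUnits([(0x401000, 0x401100, 0x0), (0x402000, 0x402800, 0x5A)],
                  fake_load)
first = units.unit_for(0x401010)
again = units.unit_for(0x4010F0)
```

Two lookups inside the same range trigger only one load, and the second unit is never read unless an address in its range shows up.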
26:30
So maybe we could do the var info reading only when we are reporting errors.
26:46
It's kind of the same as the above. It would be a pretty big rewrite. And the problem is that the var info is the wrong way around.
27:03
It goes from variable name to location per program scope. And the locations can be addresses, but also a constant value, or a value in a register, or even an expression combining those:
27:26
it's partly in a register and partly at this address on the stack, for example. And I don't know how to efficiently construct an address to variable mapping from that,
27:43
also because the variable locations are really per program scope. And of course, the value of a variable might not actually be in memory.
28:05
So maybe we could do this only for global variables. That might be useful. Another interesting idea is maybe stealing the chunks idea
28:25
from the Valgrind debuginfo server and doing that with the debuginfod-find implementation. But it is somewhat hard to see how that works with the caching
28:46
if different programs use only parts of the same debuginfo file. And finally, we could make debuginfod-find long running,
29:01
so it keeps the connection open, which is what GDB does. It reuses the connection to the debuginfod server to more quickly download any new debuginfo files it needs. That might save some time on the first run. And it would be a workaround for the SIGCHLD issue, because we wouldn't shut
29:23
down the debuginfod-find executable anymore. Okay. Questions?