Valgrind and debuginfo - TIB AV-Portal

Valgrind and debuginfo

Formal Metadata

Title

Valgrind and debuginfo

Title of Series

Number of Parts

287

Author

License

CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/57128 (DOI)

Publisher

Release Date

Language

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

With debuginfo, Valgrind can provide more useful information about the issues it finds. But until recently it was sometimes hard to get at the debuginfo, and Valgrind startup could be really slow while parsing the debuginfo. With the introduction of debuginfod support, getting the debuginfo is much easier, if your distribution supports it. And the parsing of debuginfo has been improved dramatically. This talk will explain how debuginfod integrates with Valgrind and how the debuginfo parsing was improved.

FOSDEM 2022, part 208 / 287 (29:43)



Transcript: English (auto-generated)

00:08

Hi, I'm Mark Wielaard. I hack on Valgrind, I live in the Netherlands, and I work for Red Hat, where I help maintain Valgrind for Fedora, RHEL, and Red Hat Developer Tools

00:24

products. For the last Valgrind release, I helped Aaron integrate debuginfod support, which is a new way to find debuginfo files, and I also improved the DWARF reading speed, so that

00:41

instead of seconds for larger programs, it only takes a few hundred milliseconds at startup. So I would like to talk a bit about why and how Valgrind uses debuginfo, which issues I faced, and which other improvements we could make.

01:01

So why debuginfo? Valgrind can mostly do its work on binary code and addresses, but the user will likely appreciate symbolic function names, source files, and line numbers. Specifically, the user will provide Valgrind with symbolic function names in suppression files, and Valgrind

01:25

will want to report addresses as function names, and when possible, with source code files and lines. When Valgrind reports an issue, the backtrace is the most useful piece of information, because

01:45

it not only tells where an issue was encountered, but also how we got to that location. But these days, compilers often partially inline functions, if that is cheaper than

02:01

calling them. Especially with link-time optimization, this is done a lot, sometimes multiple levels deep. In that case, it's really helpful to show a virtual backtrace: this function was called from another, but instead of calling it, the compiler inlined this code into the

02:23

caller, or even the caller's callers. This is so useful that we enable it by default with the --read-inline-info=yes option. And finally, it would be nice if we could

02:43

report the variables corresponding to the addresses, or maybe registers, that are being manipulated by the program. But that is fairly expensive, and it isn't really accurate enough, which is why --read-var-info is off by default. We used to have an experimental

03:07

tool, sgcheck, which was supposed to be a stack and global array overrun detector based on the var info, but it didn't really work that well, so it has been removed,

03:23

but it would be really nice to figure out a better way to use the variable location information, to make it less expensive and more accurate. Sadly, I don't really know how to do that. Anyway, for this talk, I'm ignoring unwind tables. They are described

03:51

in the DWARF standard, but in practice implemented slightly differently as EH tables, which are always available, and they work on the address level anyway. I'm also

04:07

ignoring symbol names that might be mangled and need to be demangled to make sense to the user. That is also interesting, and we added Rust symbol demangling in the last

04:24

release, but it doesn't really add to the debuginfo itself. And we also ignore non-DWARF debuginfo, like the PDB Windows debug format that's used by Wine programs,

04:46

which you can run under Valgrind. Valgrind has a PDB reader, but I don't know anything about that format. So, what do I call the debuginfo? Basically,

05:11

three parts. First we have the symbol tables. We always have the dynamic symbols,

05:23

since Valgrind only works on dynamically linked executables, and these are the exported functions that can be called between libraries and executables. This is often a small table, and only covers a few functions, the exported functions, but it is useful. Then there's the

05:49

symtab, which holds the symbols that were needed for linking the program, like the internal static function names. But the symtab is often moved into a separate debug file, because it isn't

06:05

really needed at runtime anymore. But if we can get at it, then the symbol tables give us a simple mapping from address range to function name. A bit more interesting are the debug line tables. These are also often moved into a

06:28

separate debug file. Technically it's a directory and file table, plus a simple program producing a matrix from addresses to file, line number, and some other properties of the instructions at

06:46

a certain address. Like the symbol table, this provides us with a simple direct mapping from an address to a source file and line number.
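Both lookups described here come down to a search in a table sorted by address. The following is a minimal sketch of that idea, not Valgrind's actual data structures: the symbol entries, line rows, and addresses are made up for illustration, and a real line table carries more per-row properties than file and line.

```python
import bisect

# Hypothetical symbol table: (start address, size, function name), sorted by start.
SYMBOLS = [
    (0x1000, 0x40, "main"),
    (0x1040, 0x20, "helper"),
    (0x1100, 0x80, "compute"),
]
SYMBOL_STARTS = [start for start, _, _ in SYMBOLS]

# Hypothetical line table matrix: each row applies from its address up to the
# next row's address; LINE_END marks the end of the covered sequence.
LINE_ROWS = [
    (0x1000, "hello.c", 3),
    (0x1010, "hello.c", 4),
    (0x1040, "util.c", 10),
]
LINE_STARTS = [addr for addr, _, _ in LINE_ROWS]
LINE_END = 0x1060


def symbol_for(addr):
    """Return 'name+0xoffset' for the symbol covering addr, or None."""
    i = bisect.bisect_right(SYMBOL_STARTS, addr) - 1
    if i < 0:
        return None
    start, size, name = SYMBOLS[i]
    if addr < start + size:
        return f"{name}+0x{addr - start:x}"
    return None  # addr falls in a gap between symbols


def line_for(addr):
    """Return (file, line) for the row covering addr, or None."""
    if not LINE_STARTS or addr < LINE_STARTS[0] or addr >= LINE_END:
        return None
    i = bisect.bisect_right(LINE_STARTS, addr) - 1
    _, filename, line = LINE_ROWS[i]
    return (filename, line)
```

With tables like these, turning a raw address from a backtrace into "function (file:line)" is two binary searches.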

07:04

Till DWARF 5, it isn't a completely independent table. You first need to know the compile unit, and then the file table corresponds to the main directory and file name, which is kind of

07:28

inconvenient. Because the last part of the debug info is the compile unit, which contains

07:41

the debug information entry (DIE) trees. So the DIE tree describes the program scopes, function arguments, variable locations, types, and lots more of the program.

08:07

So the line tables and symbol tables are fairly simple, but the DIEs are described as trees per compile unit. Originally, DWARF was

08:30

designed to describe a compiler that handles one compilation unit at a time.

08:40

So you compile one source file at a time, you link the object files together, and for every object file you have a DIE tree describing how that source file was compiled to object code. These days you not only have compile units, but you can also have

09:17

separate type units or shared partial units. And when using link time optimization,

09:32

some of the units might actually describe debug information entries from completely separate source files. And to make things even more complicated, at least for a DWARF reader,

09:50

is that the description and encoding of the DIEs, the abbrevs, are separate from the actual data in the .debug_info section. It does make some sense, because you often have the same kind of

10:08

attribute, so it's useful to describe the format of the data once, and then have the data in the actual tree just reference the abbrev that describes the encoding. There are lots of

10:38

particular things in the DIE trees: where to find the ranges describing the program scopes, function entries,

10:47

location lists describing where to find variables, and the types of those variables, so that we

11:01

know which size they are. Important, at least for us, is the size of a variable. When we don't want any inline information and no variable information, we use a simple

11:24

DWARF reader that only reads the top of the DIE tree to extract the compile unit source file and a reference to the debug line table. But if --read-inline-info or --read-var-info is enabled,

11:43

we use a different, full DWARF reader that reads the whole DIE tree looking for the interesting DIEs. In practice, we used to always use the full DWARF reader, because we want to read the program scopes to tell us which function is inlined into another.

12:06

So much of the DWARF reader speedup came from skipping all the non-program scope entries, which would only be interesting for the variable location information, which is off by default.

12:25

So where do we find the debuginfo? Sometimes you can find all the debuginfo in the binary itself. The dynamic symbol table is always there, sometimes the symtab. And if you

12:42

are using a program you just compiled with -g, everything else is also in the executable. But especially for distro binaries and system libraries, the symtab and the debug sections are all moved into separate debug files. If you then install the debug packages for the distro,

13:09

they can be found through some standard build-ID based path names, or through the .gnu_debuglink section in the binary. And that provides a file name,

13:24

which you can then look up using a standard search path. Valgrind comes with an experimental Valgrind debuginfo server, valgrind-di-server, which serves compressed

13:42

debuginfo files in chunks from the current working directory. It is completely manual: you have to find and store the debug files by hand. It was implemented to run Valgrind on a remote mobile device with limited storage, so you can serve the debug files,

14:03

or parts of debug files from your workstation. I'm not sure how many people are actually using it, and the documentation is fairly minimal. But if you use it, it would be nice to know.
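For the separate debug files installed by distro debug packages, the build-ID based path name mentioned above follows a simple convention: the first two hex digits of the build-ID become a directory and the rest becomes the file name. A minimal sketch, assuming the common /usr/lib/debug root (the helper name and the default root are choices made for this example, and distros may use additional search directories):

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def build_id_debug_path(build_id_hex, debug_root="/usr/lib/debug"):
    """Map a hex build-ID to <root>/.build-id/xx/yyyy....debug."""
    build_id_hex = build_id_hex.lower()
    if len(build_id_hex) < 3 or not set(build_id_hex) <= HEX_DIGITS:
        raise ValueError(f"not a usable hex build-ID: {build_id_hex!r}")
    return (
        PurePosixPath(debug_root)
        / ".build-id"
        / build_id_hex[:2]
        / (build_id_hex[2:] + ".debug")
    )
```

The build-ID itself is stored in a note section of the binary, so the same binary always maps to the same debug file path regardless of where it is installed.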

14:21

And finally, there is the new debuginfod support, which is kind of the opposite. It works completely automatically. If the debuginfod-find utility is installed, and the DEBUGINFOD_URLS environment variable is set

14:45

to a debuginfod server, Valgrind will just find the debug files. So Aaron wrote the support for Valgrind 3.18.1.

15:07

It spawns the debuginfod-find utility if it is installed.
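The same lookup can be tried by hand, which helps when checking whether a server knows about a given binary. The sketch below is an illustration in Python, not Valgrind's C implementation: it only runs debuginfod-find when DEBUGINFOD_URLS is set and the utility is on the PATH, and it assumes the utility prints the path of the cached file on success. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def debuginfod_find_argv(build_id_hex):
    """Command line asking debuginfod-find for the debuginfo of a build-ID."""
    return ["debuginfod-find", "debuginfo", build_id_hex]


def find_debuginfo(build_id_hex, env=None):
    """Return the cached debug file path, or None if it cannot be fetched."""
    env = os.environ if env is None else env
    if not env.get("DEBUGINFOD_URLS"):
        return None  # no server configured, nothing to ask
    if shutil.which("debuginfod-find", path=env.get("PATH")) is None:
        return None  # utility not installed
    result = subprocess.run(
        debuginfod_find_argv(build_id_hex),
        env=dict(env),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # server does not have it, or the request failed
    return result.stdout.strip() or None
```

Returning None in every failure case mirrors the idea that missing debuginfo is not an error: the tool simply falls back to what it already has.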

15:24

It puts the debug files it gets from a server into a cache. It also caches negative lookups so that it quickly knows not to ask for more,

15:47

or for the debuginfo again, on future runs, and the cache is shared with other utilities like GDB, elfutils, binutils, SystemTap, et cetera,

16:04

which is handy if you use GDB and Valgrind on the same binary. Some distros now set DEBUGINFOD_URLS by default; specifically, Fedora 35 does. Debian also has an official debuginfod server, as do various other distros.

16:26

And you can also try to use the federating server, debuginfod.elfutils.org, and that URL also gives some more background information on debuginfod,

16:43

which client programs support it, and which distros have official or experimental debuginfod servers. At the moment, we still have a bug: when the debuginfod-find utility is

17:03

spawned, it might generate a SIGCHLD signal when it dies, which might arrive at the program that Valgrind is running instead of Valgrind filtering it out.

17:24

Hopefully it's solved by the time this talk airs. So the Valgrind DWARF reader was slow, which was exposed by the debuginfod support.

17:42

Suddenly there was always debug info for everything. And it was pretty bad. If you have a simple C++ hello world program linked against libstdc++,

18:02

this is on Fedora, where it is built with LTO enabled and then post-processed by dwz, a DWARF compressor, you would get 100 megabytes of debuginfo.

18:27

And so just running a C++ hello world under Valgrind took 12 seconds.

18:42

Luckily, I could reduce the reading time, so it took only 400 to 500 milliseconds.

19:01

And then I stopped trying to make it faster because it's kind of in the range of overhead that Valgrind has anyway. So why was the debug info reader so slow?

19:29

Basically, Valgrind was too eager to read all DWARF information, even when some data could be skipped, and it could not reuse some data that was shared.

19:47

So we now fully skip CUs and children of DIEs without addresses when we use --read-inline-info.

20:04

We are only interested in those DIE subtrees that contain function descriptions, the program scope entries. So we can simply skip any subtree or whole CUs if they aren't associated with any

20:27

addresses, because a function is always associated with addresses. We partly did that already, but when we did, and we didn't know the size of

20:42

a subtree, and we often didn't because the subtrees are variable in length and the data is encoded in a way that we cannot know how big it is till we read the data.

21:07

We would skip those entries by reading them, but then we also interpreted all the data and we actually stored it in case we had var info enabled.

21:28

But that wasn't really useful. So now we really skip those subtrees. And if we don't know the size of the subtrees, we might still need to read the data,

21:44

but we don't try to interpret it or store any references to it. The same is kind of true for reading the line tables. Some units only have source files to show where types are defined, but no associated

22:06

function addresses. We used to read all those line tables and store those source files anyway, and now we don't, which helps.
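
Editor's note: the "read but don't interpret" idea can be illustrated with a minimal sketch. It assumes a heavily simplified encoding with three kinds of attribute forms (fixed-size, ULEB128, and NUL-terminated string); the form names and the `skip_attributes` helper are hypothetical and are not Valgrind's actual code. The point is that variable-length data still has to be walked to find where it ends, but nothing needs to be decoded into values or stored.

```python
# Hypothetical, simplified form names; real DWARF defines many more forms.
FIXED_SIZE = {"data1": 1, "data2": 2, "data4": 4, "data8": 8}


def skip_uleb128(buf, pos):
    """Advance past one ULEB128 value without computing its value."""
    while buf[pos] & 0x80:  # continuation bit set: more bytes follow
        pos += 1
    return pos + 1


def skip_string(buf, pos):
    """Advance past one NUL-terminated string without copying it."""
    return buf.index(0, pos) + 1


def skip_attributes(buf, pos, forms):
    """Return the offset just past one entry's attributes, storing nothing."""
    for form in forms:
        if form in FIXED_SIZE:
            pos += FIXED_SIZE[form]
        elif form == "udata":
            pos = skip_uleb128(buf, pos)
        elif form == "string":
            pos = skip_string(buf, pos)
        else:
            raise ValueError(f"unknown form: {form}")
    return pos
```

When every form of an entry is fixed-size, the loop reduces to adding constants, which corresponds to the case in the talk where the size is known up front.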

22:22

DWARF constructs can easily be shared between units, especially when using the DWZ DWARF compressor. But Valgrind would not notice sharing of line tables or abbrevs between units.

22:42

For both line tables and abbrevs, we now reuse them when the previous CU used the same tables. And finally, we introduced lazy reading of the abbrevs.
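
Editor's note: reusing a table when the previous CU pointed at the same one can be sketched as a one-entry cache keyed by the table's offset. This is only an illustration under that assumption; the `PreviousTableCache` class and its `parse` callback are hypothetical names, not taken from Valgrind.

```python
class PreviousTableCache:
    """Keeps the table parsed for the previous unit (hypothetical helper)."""

    def __init__(self, parse):
        self._parse = parse        # assumed callback: offset -> parsed table
        self._last_offset = None
        self._last_table = None
        self.parse_count = 0       # for illustration: how often we really parsed

    def get(self, offset):
        # Only parse when this unit refers to a different table than the last one.
        if offset != self._last_offset:
            self._last_table = self._parse(offset)
            self._last_offset = offset
            self.parse_count += 1
        return self._last_table
```

A one-entry cache is enough when consecutive units share tables, which is the case the talk describes; it does not help when units that share a table are interleaved with units that do not.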

23:02

Often we don't need to read more than the top of the DIE tree to determine whether we even need to read the rest of the tree. But we would first create all the abbrevs for all the DIEs in the unit up front.

23:23

So we could read all the DIEs if we wanted. And now we lazily read the abbrevs we need for each DIE. We do cache the abbrevs because they are shared between DIEs. But if we don't actually read all the DIEs for a CU, and with all the other changes

23:47

we often don't, we might not read all the abbrevs. And this was also the main difference between the two DWARF readers. The simple one only had enough logic to read the top of the unit tree to determine

24:05

where the debug line table was. And now in the case we only need to read the top of the tree, we will only read the first abbrev and not the others.
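
Editor's note: lazy reading with caching can be sketched as follows. The sketch assumes the abbreviation data can be modelled as an ordered sequence of (code, forms) pairs that is decoded one declaration at a time; the `LazyAbbrevTable` class is a hypothetical illustration, not Valgrind's implementation.

```python
class LazyAbbrevTable:
    """Decodes abbreviation declarations only as far as lookups require."""

    def __init__(self, declarations):
        # 'declarations' stands in for the undecoded abbreviation data:
        # an iterable of (code, forms) pairs in on-disk order (an assumption).
        self._pending = iter(declarations)
        self._cache = {}
        self.decoded = 0  # for illustration: declarations decoded so far

    def lookup(self, code):
        if code in self._cache:
            return self._cache[code]
        # Decode further declarations until the requested code shows up.
        for next_code, forms in self._pending:
            self._cache[next_code] = forms
            self.decoded += 1
            if next_code == code:
                return forms
        raise KeyError(code)
```

If only the top entry of a unit is read, only the declarations up to its abbreviation code are decoded, and later lookups for already seen codes hit the cache.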

24:23

So what can be done to improve things more? There are still two DWARF readers, but using the fully fledged DWARF reader really should not be slower now. So we should really clean up the code and get rid of the simple minimal one.

24:47

Hopefully DWARF 6 will introduce something like a multi-level line table. That might save us having to read the DIE trees.

25:02

The idea is that the line table itself will have the level of inlining there. But we have to wait on a complete proposal and of course the compilers creating it.

25:25

Maybe we could be even more lazy in the reading. That would be a somewhat bigger rewrite.

25:40

But we should know when an address range or block is used and we could then read the associated unit only on first contact. And if we have a .debug_aranges section, we could actually use that to

26:06

quickly get only that unit. But if there isn't a .debug_aranges section, then we would have to build an address range to CU table up front, which might make things slower again.
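
Editor's note: an address range to CU table of the kind mentioned here can be sketched as a sorted list searched by bisection. The sketch assumes non-overlapping ranges given as (start, end, cu_offset) tuples with an exclusive end; the function names are hypothetical and nothing here reflects how Valgrind would actually implement it.

```python
import bisect


def build_range_index(ranges):
    """Sort (start, end_exclusive, cu_offset) tuples for binary search."""
    ordered = sorted(ranges)
    starts = [r[0] for r in ordered]
    return starts, ordered


def find_cu(index, addr):
    """Return the CU offset whose range contains addr, or None."""
    starts, ordered = index
    i = bisect.bisect_right(starts, addr) - 1  # last range starting at or below addr
    if i >= 0:
        start, end, cu_offset = ordered[i]
        if addr < end:
            return cu_offset
    return None
```

Lookups are cheap once the index exists; the cost the talk worries about is building it when no precomputed range section is available, since that requires visiting every unit first.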

26:30

So maybe we could do the var info reading only when we are reporting errors.

26:46

It's kind of the same as the above. It would be a pretty big rewrite. And the problem is the var info is the wrong way around.

27:03

So it goes from variable name to location per program scope. And the locations can be addresses, but also a constant value, or a value in a register, or even an expression combining those.

27:26

It's partly in a register and partly at this address on the stack, for example. And I don't know how to efficiently construct an address to variable mapping from that.

27:43

Also because the variable locations are really per program scope. And of course, the value of a variable might not actually be in memory.

28:05

So maybe we could do this only for global variables. That might be useful. Another interesting idea is maybe stealing the chunks idea

28:25

from the Valgrind debuginfo server and doing that with the debuginfod-find implementation, but it is somewhat hard to see how that works with the caching

28:46

if different programs use only parts of the same debuginfo file. And finally, we could make debuginfod-find long running.

29:01

So keep the connection open, which is what GDB does. It reuses the connection to the debuginfod server to more quickly download any new debuginfo files it needs. That might save some time on the first run. And it would be a workaround for the SIGCHLD issue because we wouldn't shut

29:23

down the debuginfod-find executable anymore. Okay. Questions.