Debugging concurrency programs in Go
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/62050 (DOI)
FOSDEM 2023, 58 / 542
Transcript: English (auto-generated)
00:05
is going to talk about the most painful thing I ever did in Go, which is debugging concurrent programs. A round of applause for Andrei. Hi, can you hear me well? Nice. I'm very
00:25
pleased to see all of you here at FOSDEM in person, finally, after all this COVID. Today I will talk about debugging concurrent programs in Golang, and first a little bit about myself.
00:41
My name is Andrei. I'm a software engineer originally from Ukraine, currently, unfortunately, living in Austria. I'm a big fan of sports, gymnastics, CrossFit, and different debuggers, etc. The interest in parallel programming has grown dramatically in recent years, and the added complexity of
01:05
expressing concurrency has made debugging parallel programs even harder than debugging sequential programs. Every day at work I feel like I go through these eight stages
01:24
of debugging myself: that can't happen; that does not happen on my machine; that should not happen; why does it happen; oh, I see now; I feel I know what the problem is;
01:44
then, how did it ever work? In the last couple of days I saw a PR saying that some code had not been working for two years, and I thought, who wrote this? And then, oh wait, it was me. So the classical approach
02:05
for debugging sequential programs is very straightforward: we repeatedly stop and set breakpoints, we go step by step, sometimes we print something,
02:22
sometimes we continue, rerun, etc. This style is usually called cyclic debugging. But the problem is that, unfortunately, parallel or concurrent programs do not always have
02:41
reproducible behavior, even when they run with the same inputs on the same machine. The output can be radically different, and it's hard to predict when this difference will occur. For example, when you run some program, as you can see,
03:07
it's a very dummy one, but the output is different each time I run it on my machine. Sometimes it's the same, but sometimes not. I spent lots of time reading books and articles
03:23
and watching videos on YouTube, always trying to find an answer to my question. OK, we have books on how to write code, we have books on how to write tests. But how to debug code? There are no books on that, and no books on how to debug concurrent programs either.
03:46
So, to start explaining how I usually do it, let's briefly recall what a goroutine is. A goroutine is just an abstraction. By the way, it is a
04:04
struct which handles a goroutine under the hood inside Go. Usually goroutines are multiplexed onto multiple OS threads, so if one blocks while waiting for some
04:22
call, others can continue to run. The design also hides many complexities of thread creation and management, so Golang does it on its own, which is nice. And to create a goroutine is very easy: just
04:45
prefix your function call with the go keyword and that's a new goroutine, nothing complicated. By the way, who knows why they named it goroutine? Maybe somebody has ideas? Yeah.
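As a small sketch of what "prefix your function with the go keyword" looks like in practice: the names `greet` and `runGreetings` are hypothetical, and the `sync.WaitGroup` is only there so the caller waits for the goroutines to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// greet is an ordinary function; nothing about it is goroutine-specific.
func greet(name string, out chan<- string) {
	out <- "hello, " + name
}

// runGreetings launches one goroutine per name and gathers the results.
func runGreetings(names []string) []string {
	out := make(chan string, len(names)) // buffered so senders never block
	var wg sync.WaitGroup
	for _, n := range names {
		wg.Add(1)
		go func(name string) { // "go" in front of a call starts a goroutine
			defer wg.Done()
			greet(name, out)
		}(n)
	}
	wg.Wait()
	close(out)
	var res []string
	for s := range out {
		res = append(res, s)
	}
	return res
}

func main() {
	// The two greetings may arrive in either order.
	fmt.Println(runGreetings([]string{"gopher", "FOSDEM"}))
}
```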
05:08
Why not just call it a coroutine? OK, nice: so in each language we can just replace the first letter. Yes and no. They named it that way, at least from what I read,
05:29
because threads, coroutines, processes, and so on are not an accurate description of what goroutines do. A goroutine has its own simple model of how it's
05:42
executed, etc., and that's why they named it that way. Cool. So, next question, before I share my experience: how do you think I can debug my concurrent program?
06:02
Nice, nice. Can you repeat what the answer was for the stream? Thank you. Can you repeat the question? I mean, if you have an answer from the room, can you quickly repeat it so it's recorded on the stream? OK, yeah, will do. So let's repeat:
06:25
how can I debug my concurrent program? The gentleman suggested using prints. Nice, yes, that's also an option, by the way. OK, any other ideas? OK, yes.
06:47
Yeah, it's a good idea, nice. So just to repeat for people who are watching: the ideas were using a debugger, Delve, using strace or tracing,
07:02
using tests, etc. So my first assumption was: OK, the playground, let's play a little bit. A few years ago, when I started writing this talk, to be honest, there was a limitation: the playground worked only with GOMAXPROCS=1, so it always
07:26
reproduced my program's output, but right now it more or less simulates local development. OK, I have more bright ideas: maybe we can just color the logs, I don't know, visualize goroutines,
07:43
why not? So here's a funny package. What it does is just print different goroutines with different colors, like this. So if you need to do something very quick, you can just figure out which goroutine has which color, etc.
08:08
Yeah, returning to being serious: there is an interesting article, quite old, which one of my friends from Ukraine wrote a few years ago. He decided to visualize
08:24
how all this goroutine scheduling works, with these fancy pictures. It's also a very good article to read, highly recommended. Another idea is to try to print how Go
08:41
schedules events. There is an environment variable which can print you some extra information. And, of course, using debuggers. Today I will focus a little bit on Delve and a little bit on GDB. So, next question: can I set a breakpoint inside
09:11
a goroutine? Any ideas? Yes? No? Yes. So the answer is yes. Typically you can set a breakpoint
09:21
inside a goroutine, you can jump into this goroutine and see what's inside, and yeah, it's very
09:40
channels. So if I decide to send a message to a buffered channel of size four, it's very nice that you can set a breakpoint and print the channel, and Delve has very
10:00
fancy metadata which shows you even the current channel state. So you see, I sent one: it's the first item, plus some meta information, also useful. Then, if I add another one and step to next, you see that now I have two elements in the channel.
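The channel state inspected in the debugger above can also be observed from Go itself with `len` and `cap`. The following is a small sketch assuming a buffered channel of size four as in the example; the helper `fill` is hypothetical.

```go
package main

import "fmt"

// fill sends the given values into a buffered channel of the given capacity
// and records len(ch) after each send. Callers must not pass more values
// than the capacity, otherwise the send would block forever.
func fill(capacity int, values ...int) []int {
	ch := make(chan int, capacity)
	lens := make([]int, 0, len(values))
	for _, v := range values {
		ch <- v // does not block while len(ch) < cap(ch)
		lens = append(lens, len(ch))
	}
	return lens
}

func main() {
	fmt.Println(fill(4, 10, 20)) // prints [1 2]
}
```

A breakpoint placed on the `ch <- v` line is a convenient spot to print `ch` in a debugger and compare it with these numbers.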
10:24
And the small problem: usually, if I want to send a message to a channel from the Delve CLI, unfortunately it's not supported. Here's the issue I created, and there's a comment that, yeah, we can fix it. I hope we will fix it some time.
10:50
Yeah, so you can't set it. Technically it's possible, but it's not there. I mean, it could have the same semantics as set, and Delve would handle it.
11:08
OK, now let's focus a little bit on how we can debug goroutines. If you're inside a goroutine and you want to print the state of the goroutine, there's a command,
11:23
goroutine, which prints the current goroutine where you put your breakpoint. But if you have lots of goroutines, there's an interesting feature I really use a lot. But let's step back a little bit: there's another idea and implementation. You can
11:47
use these profiler labels. It's inside the pprof package: you can call pprof.Do and run your code inside it with a context, and it will mark your goroutine with a label.
12:02
And usually you use these labels for profiling, so you can open pprof profiles and see some different metrics, but you can also use them with Delve, which is super cool. So,
12:22
if you label your goroutines with labels like this, or if you use middleware, you can also do it. I mean, if you run a web server, you can use this
12:40
middleware (I post the link on the next slide), and it will automatically add labels to all your handlers, which is nice, so you can see which handler you are currently in. Because if you print goroutines, even in Delve, you will see lots
13:02
of unreadable information. But if you just need to focus on, I don't know, login goroutines, or goroutines which do something with your database, you can label them in the same manner as you do with pprof. And you can also do it directly, by the way:
13:21
this library which I mentioned is a very small one, and it also supports setting labels, it's just a wrapper, so a very handy one. Then, if you run the goroutines command with -l inside the Delve debugger, it will print the goroutines. This is just a very simple hello world which has this main
13:44
goroutine and a few other goroutines without any labels, etc. But then I created another project, inspired by one article, and here you can print
14:10
all goroutines which are related to your label, page. Also, you can go to the docs and find different group-by options and, I don't know, filters, so it's very handy. And once you find your
14:28
goroutine, you can switch to this goroutine, if you didn't know. You can also print or list the source code, and you can set a new breakpoint. It's very nice.
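The labeling approach described above can be sketched with the standard `runtime/pprof` package. The label key `page`, its value `login`, and the handler name `handleLogin` are assumptions chosen for illustration, not the names used on the slides.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// handleLogin is a hypothetical handler used only for illustration.
func handleLogin(ctx context.Context) string {
	// pprof.Label reads a label back from the context.
	page, _ := pprof.Label(ctx, "page")
	return page
}

// labeledCall runs handleLogin with the label page=login attached to the
// current goroutine for the duration of the call.
func labeledCall() string {
	var got string
	labels := pprof.Labels("page", "login")
	pprof.Do(context.Background(), labels, func(ctx context.Context) {
		// Goroutines started in here inherit the label set.
		got = handleLogin(ctx)
	})
	return got
}

func main() {
	fmt.Println(labeledCall()) // prints login
}
```

With a breakpoint inside `handleLogin`, listing goroutines with labels shown (the `-l` option mentioned in the talk) should make the labeled goroutine easy to spot among the unlabeled ones.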
14:41
And yeah, you can also use this demo project. It's not mine, and it's written more for GoLand, but to run it you just need this small tweak: you need to pass some build flags and the debugger tag, otherwise this library will not work. Then you can repeat
15:04
everything I did. I highly recommend playing with that, so when you need it you will already have everything you need. Regarding GDB: yeah, I played a little bit with it,
15:22
it doesn't quite support what I need for Go. It has this info goroutines command, but as far as I remember you can't filter goroutines, and the output is not readable,
15:43
especially this part. And I decided not to waste my time, to be honest, because you can just use Delve for such problems rather than playing with GDB. Cool. So,
16:04
next: it's not only with a debugger that you can find your problems. One important problem in the Golang world is deadlocks. With a deadlock, the program usually gets stuck on a channel send operation which waits forever,
16:27
for example, for someone to read the value. It's nice that Golang supports detection of these situations, compared to other languages; for example, Python doesn't support this deadlock detection,
16:43
which makes such problems hard to debug. And if you need real-world examples, you can look at this very interesting library, go-deadlock.
17:01
Using this library, lots of deadlocks were also found in CockroachDB, and there are lots of interesting examples of how mutexes can be handled properly, how to write them properly, etc. This library is an entire separate discussion.
17:29
Returning to our case: I put in two slides with this very simple example. Sometimes you have this conflicting access and you get these data races, and
17:48
I saw it a few times in some open source projects, but usually people do not do it. So I highly recommend running your CI pipeline with this -race flag, especially the tests. It helps; just
18:05
always run with this flag, and it will print whether there are data races or not. The -race flag cannot always find all data races: some common ones, yes, but sometimes no.
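A minimal sketch of the kind of conflicting access the race detector reports, shown here in its fixed form. The function `countTo` is hypothetical; removing the mutex lines turns `counter++` into a data race that `go run -race` or `go test -race` would typically flag.

```go
package main

import (
	"fmt"
	"sync"
)

// countTo increments a shared counter from n goroutines.
func countTo(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // without the lock this is a data race
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countTo(1000)) // prints 1000
}
```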
18:23
But I highly recommend adding it to your project, so never skip it. So now I have seven rules for you on how to unblock yourself when you get stuck
18:47
on something and you don't know how to debug it. First: never assume a particular order of execution. When you're writing concurrent programs, try to always think about
19:03
not running it in a particular order; this especially matters for benchmarks and tests. I also saw it lots of times: when you run
19:23
go test, by default, as you may know, it runs tests in parallel, but usually people say, no, run them sequentially, and that's not a good idea. Another piece of advice, more about design than writing code: try to implement any
19:46
concurrency logic at the highest level possible. Try not to pass around lots of channels, lots of goroutines, etc. Try to keep the logic separate and the concurrency separate.
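One way to read these first two rules in code, as a sketch under assumed names (`square` and `mapConcurrently` are hypothetical): the business logic stays a plain function, the goroutines live in a single top-level helper, and results are stored by index so nothing depends on execution order.

```go
package main

import (
	"fmt"
	"sync"
)

// square is plain sequential logic with no channels or goroutines.
func square(n int) int { return n * n }

// mapConcurrently is the only place that knows about goroutines.
func mapConcurrently(in []int, f func(int) int) []int {
	out := make([]int, len(in))
	var wg sync.WaitGroup
	for i, v := range in {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			out[i] = f(v) // each goroutine owns exactly one slot
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(mapConcurrently([]int{1, 2, 3}, square)) // prints [1 4 9]
}
```

Because `square` has no concurrency in it, it can be tested and debugged with ordinary sequential tools.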
20:07
Yeah, don't forget, as I said, that go race does not always help, and deadlock detection only fires when the whole program freezes, not when only a subset of goroutines gets stuck.
20:23
As the gentleman suggested, you can use strace and different tracing tools, which can help you see whether you are waiting for some resource, like reading a file or network access. It's more low level, but it's very useful. I showed it in another talk, but you probably know about it: you can use
20:48
conditional breakpoints, which help you cover cases especially when it's a concurrent program, so you can catch only your case and not click next on every goroutine. As I said, you can use
21:07
the scheduling tracer, you can use go-deadlock, and, last but not least, use a debugger. Don't forget about it, it's also very handy, and with every release, every version, I see how
21:21
debuggers are adding new stuff, which is nice. Cool. So I have a few references, because covering everything in 25 minutes is hard. I will post the slides so you can
21:41
read everything carefully, no need to take pictures of it. And thank you. Thank you. Any questions? Are there any questions? Yeah, while you're thinking: if you want to donate
22:07
to Ukraine, just let me know. A few of my friends are fighting right now, so we can help directly, if you'd like. Thank you. I have a question: have you tried using tools such as rr or Hermit,
22:35
which try to execute the program in a deterministic fashion? You mean backwards? Yes, they
22:41
can do a recording of the execution and then replay it, but the point is that the recording is deterministic. Yeah, I use it for sequential debugging, never for concurrent debugging. I mean, maybe it's possible, but in my case what I just showed covered it. Of course there
23:00
are other cases. I will try. If you are leaving the room, try to stay quiet for a second, do not talk; chairs are OK, so we can still hear any questions.
23:22
Well, there are no more questions; that means your talk was very clear. Thank you, and a lot of applause.