Getting Unstuck: Using the Scientific Method for Debugging
Formal Metadata

Title: Getting Unstuck: Using the Scientific Method for Debugging
Series: Ruby Conference 2017 (53 / 69)
Number of parts: 69
License: CC Attribution - ShareAlike 3.0 Unported: You may use, change, reproduce, distribute, and make the work or its content publicly available for any legal, non-commercial purpose, in unchanged or changed form, provided you credit the author/rights holder in the manner they specify and pass on the work or content, including in changed form, only under the terms of this license.
DOI: 10.5446/37764
Language: English
Transcript: English (auto-generated)
00:12
I'm Carolyn Taymor, and I'm here to talk to you about getting unstuck using the scientific method for debugging. I'm a software engineer at Pivotal Cloud Foundry, and I want to talk to you about
00:24
my favorite tools for debugging. So who here has gotten really stuck while debugging? I see a majority of hands, which is great because I think it's a pretty universal experience and I think if you haven't yet, you will someday. I know I definitely have.
00:41
So I want to share with you today my favorite tool in my toolbox for how to deal with that moment when you're really, really stuck. It's the scientific method, a process of making your debugging a science experiment. So today we're going to talk about the what, the how, the why, and the when.
01:01
What is the scientific method? How do you use it for debugging? Why is it such a valuable tool? And when do you know to pull it out of your toolbox? What is the scientific method? The scientific method is just a fancy term for the process of doing science, which is really cool because I think science is really cool.
01:23
So it starts with a step of gathering knowledge. What is already known about your topic? If you are researching stars or frogs, it's what research have other scientists already done? What is known about your research area? And then you start asking questions, which are really what don't we yet know about this
01:44
topic? What is there still to learn? What did previous research find that is really weird and interesting, and that you want to dig into more? And then you make a hypothesis. A hypothesis is just an educated guess.
02:03
It's a statement about what you think might be the answer to your question. When you're doing scientific research, you phrase your hypothesis in terms of a null hypothesis, which is to say you phrase it as the statement that there's no correlation between two variables, and then you try to disprove it to provide evidence
02:22
that there is some statistical correlation between two variables. And then you design an experiment. What information do you need to disprove your hypothesis? What information can you collect which will give you some sense that it's incorrect? And as you're designing your experiment, it's important to be very, very detailed
02:44
because when you're doing scientific research, your research doesn't carry any weight until it's been replicated by other scientists. So you need to be explicit enough in every step that someone else can do the exact same thing. And then you run your experiment following your experimental procedure and you take really
03:04
good notes because if you don't take notes on what you see, then you'll have a really hard time figuring out what you learned. And I really think the step of taking good notes is part of where the scientific method really shines as a tool for debugging.
03:21
After you've run your experiment and you've taken good notes, then you come to a conclusion. Did what you observe about your topic disprove your hypothesis? Maybe it didn't disprove your hypothesis, which might lend some support for the idea that the hypothesis might be true. Or it might have shown that your question was really irrelevant and uninteresting.
03:43
Or that your hypothesis is just like really off target and you don't really know. And all of those are great results because then you can come back to the gathering knowledge step and you know a little bit more than you did. Even if what you know is, that's not an interesting research question. Or this hypothesis is definitely not true.
04:02
Those are useful things to know. And then you also share your knowledge out. What good is scientific research if it stops in your lab or your living room or your office? When you're doing scientific research, this looks like publishing your results in a peer reviewed journal. And we'll talk a little bit later about what this looks like when you're doing debugging
04:22
on your software. So how do you debug with the scientific method? Well, the first step, and it runs throughout the whole process, is: write it down. That's part of what makes this such a useful tool. I don't think it matters a lot how you write it down.
04:41
Sometimes I use my notebook. I'm more of a stickers-on-notebook person, and stickers-on-laptop, so you can see my cool stickers. The whiteboard is a really perfect tool if you're collaborating with others, if you're pair programming, if you're talking with other teammates. I like stickies if I'm working with people who are a little bit resistant to using this
05:00
method and then I can make my experimental procedure on sticky notes at my desk and my pair can be less enthusiastic about using the scientific method and it works great. My favorite way of writing it down right now is actually just in my text editor. It's right where I'm doing my debugging. It's really integrated with my process flow and it's really easy to copy information
05:25
out of code or out of logs straight into my notes and then it's really easy to copy relevant pieces of those notes into a bug tracker or into Slack or other ways of sharing out with people so it works really well. I don't actually think that you have to write it down.
05:41
I know I said write it down, and I'm going to tell you that again and again, but I think the crucial step is that you in some way take some of the information that's floating in your head and move it out of your head in some concrete way. So I love writing. It works really well for me, but I think you could get the same results by talking into a tape recorder and being thoughtful about what you're saying.
06:04
Like what would you write down? You could say that out loud. I know that not everyone loves writing as much as I do maybe. So then you start with that first step, gathering existing knowledge. What does this look like when you're debugging your software? It looks like doing a brain dump of everything you know about your bug and
06:25
you want to start with the user-facing impact. How did you notice this bug? A customer went to the admin page and got a 500, and they sent us an email with not enough information. And you want to write down everything else you've discovered in the process of working on this bug.
06:41
I often forget about this when I'm starting debugging and then I only pull it out when I'm really stuck. So I've often been working on the bug for an hour or a day or a week. So I've learned some things and forgotten some of them. But you want to write down everything that you remember that you know. And you want to include like weird log lines you've seen, other strange behavior that seems maybe related,
07:01
just sort of stream of consciousness, write it all down. And as you start writing it down, I think it comes kind of naturally that you start asking questions. Because at first you write down the things that are definitely happening. And then you start writing down the things that you think are maybe happening, but you're not really sure. So for instance, maybe my service is spamming the database and
07:24
that's why the whole thing's falling over. I'm not sure. That's a great question to write down. And you might have two or three questions. You might have 20. You might have an overwhelming number of questions. But once you start getting a few questions, hopefully one of them will start being interesting or
07:42
you can sort of give up and be like, my gosh, I have seven million questions, I'm just gonna pick one. And so that's sort of the second half of this, is you pick a single question. And a couple of other interesting questions to ask when you're thinking about those questions are, how do I know what I think is true?
08:00
So I stated my facts before, the things that I know are true. How do I know they're true? Are they true? Potentially the most valuable question you can ask is, is the thing I think true actually true? That's a great question to pick as your one question to start with. You can come back to the others, so you don't have to be afraid that they're going away forever. They're still on your paper.
08:22
So once you've picked a single question, then you make a hypothesis about it, an educated guess. When I'm using the scientific method for debugging, I sometimes play a little fast and loose with it. I think of it as a general framework, not a rigorous scientific approach. So occasionally I frame it as a null hypothesis. But sometimes I just frame it as a more general statement about what I think
08:44
is happening in the system, a guess to the answer to that question. And I think it's helpful either way. If you have more than one idea about what's happening, that's okay too. You can just pick one, flip a coin, it doesn't matter. You just need some statement against which you can test.
09:04
So then you design an experiment. And it's important, again, to be really detailed. Here, it's less that you're gonna publish your experiment in a peer-reviewed journal for other people to replicate, unless you have a different type of bug than I do. But you'll wanna refer back to it for yourself, for teammates, and so
09:23
it's important to write down in detail the steps that you wanna take. And the reason that this is a helpful step is because it's much easier to say, what information do I need to disprove my hypothesis? I don't have to fix the bug. I just have to prove that the problem is not that my service is spamming the database.
09:42
That's all I need to find out. And as you're designing your experiment, you also wanna write down what you expect to see. What log lines, what behavior if you restart the service, what would you expect to see if your hypothesis is true or if it's false?
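To make that concrete, here is a sketch of what such a written-down experiment might look like for the "service spamming the database" question from earlier (the procedure and numbers are invented for illustration):

```
Hypothesis: my service is spamming the database, and that's why
everything is falling over.

Experiment:
  1. Restart the service on the test system.
  2. Watch the database's connection count for five minutes.
  3. Load the admin page once.

Expect if true:  connections from the service climb well past the
                 configured pool size.
Expect if false: connections stay at or below the pool size.
```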
10:01
And then you run the experiment, and you take really good notes. What do you take notes on? You wanna take notes on what you expected to see. Did you see all those things you wrote down in your experimental procedure? And you definitely wanna write down all the things that you didn't expect to see, and especially the "my gosh, I had no idea my software could
10:21
do that and I really don't know why" things. Those are also really important to write down. I often find other unrelated bugs while I'm following this process. Definitely write those down in your bug tracker; don't fix them now, don't get distracted, those are for later. But write them down. You have new bugs. It's, yeah, that's software.
10:40
Not sure it's great. One of the things that's really helpful in this taking notes process is I love to grab annotated log snippets. So you don't need to read the details of what the logs here are doing. But the interesting thing is I grabbed a chunk of logs with all their timestamps from the logs of the system that was taking too long.
11:02
It was timing out, and I didn't know why. And then I took a note of: hey, the time between the beginning of this process, the staging, and the end of the creation phase took 52 seconds. And I know I have a three-minute timeout for a process of which this is only a small part. And in a healthy system, this takes three or four seconds.
11:22
I don't know what's going on, but it's interesting. This is helpful because if you just grab the logs, then tomorrow you're not gonna know why those logs were interesting. So if you just write a short snippet about why the logs are interesting, it's really valuable. This bug was like a month ago, and I still know what it was doing because I wrote two sentences at the top of the log.
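As a sketch, an annotated log snippet like that might look something like this (the timestamps, component names, and log format are invented; the point is the two-sentence note at the top):

```
# Staging start to end of the creation phase took 52 seconds.
# The whole operation has a 3-minute timeout, this is only a small
# part of it, and on a healthy system it takes 3-4 seconds.
2017-11-28T14:02:11Z [stager] staging started
2017-11-28T14:02:40Z [stager] creating container
2017-11-28T14:03:03Z [stager] creation phase complete
```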
11:44
So then, you come to a conclusion. Was your hypothesis disproved? Do you feel like you know what the cause of your bug is? Maybe you still have no idea what the cause of your bug is. And that's actually great, because before you had 6,000 possibilities, and
12:03
now you have 5,999, and that's fewer ways your software could be breaking, and it helps you move forward. If you figured out how to solve your bug, that's great. Like, go on and fix it, and don't forget the share-out phase that we'll talk about in a moment. But if you haven't figured out why your software's breaking,
12:23
your hypothesis was disproved, or it was proved in a way that you still don't really know what's going on, then you circle back to the gathering knowledge phase. Because you now know more about your bug. You know that your hypothesis is untrue, and that's not what's actually going on. And you've probably, because you've taken detailed notes on exploring a corner of
12:44
the system that's related to your bug in some way, you probably have learned a whole lot of other things. So you may have new questions. You may have new knowledge that prompts new questions. You may also have no new questions, but you can then refer back to the questions that you set aside before.
13:00
And maybe one of them looks a heck of a lot more interesting, or a heck of a lot more likely as the cause. And then there's the share what you learned phase. So what is important to share when you're debugging? There's a lot of things that you can learn when you're debugging. Certainly all those new bugs that hopefully you put in your bug tracker.
13:21
You can share those with your team, your product manager. You may have seen problems that others may see later. So I sometimes will be working on a bug, and I'll discover that the issue is not actually a bug in the code. It's that the system was misconfigured, and another customer may come along and
13:42
configure their system in the same way that we know won't work. So it's really helpful to tell my teammates: this is what I saw, and here's the answer. And now, next time you see the same thing, you can have an idea in your head of what the problem is. You may have things that you need to share with another team.
14:00
It may not be your software that had the bug in it. It may be another team's software, and that's a great thing to share. And my friend Ray Krantz helped me see that one of the most helpful things you can share from this process is your experimental procedure. Because you just wrote down a playbook for
14:21
how you solve really hard bugs in your software. Which can be a great onboarding tool for a new person who doesn't have any idea how to solve bugs in what is now their software, but they don't know it because it's only their third day on the job. So it can be a really great teaching tool as well. So, story time.
14:43
I wanna share with you a little bit of how I've used this on some interesting bugs that I had. So I was working on a project, and we had a process where we were taking some customer input, customer data, and shelling out to node to run some code on it in a JavaScript sandbox.
15:03
I know this sounds really wild and probably like a terrible idea, but we were doing it for pretty good reasons. And we had a customer who was reporting that they would start this process, and then it would just hang for, like, five hours. And we expected that sometimes it could take a little while,
15:21
it could take a minute, it could take two minutes. Our web app was designed to handle that. We had a spinning little bar to show that we were doing asynchronous stuff in the background, and don't worry, it'll return. And five hours later, no returning. So I knew vaguely what section of the code base it was in. I mean, vaguely, like: it's in this third of the code base.
15:42
And so my first question that I wrote down was, what part of the life cycle is hanging? And I thought, maybe what's happening is there's something funky in the JavaScript that's running in the sandbox, and it's not returning. I don't know, I know more Rails, JavaScript, like what we're doing here is kind of wild, maybe this is the problem.
16:03
And we didn't have very good logging, so I couldn't tell from our logs what part of the system it was in. So I wrote down a procedure: I'm going to add log lines. We were able to get data from the customer that was similar enough to their production data that we could replicate the bug on a test system, which was great.
16:23
And so I added a bunch of log lines, just like: we're at this part, we're about to shell out to the node step, we're shelling out to node, we got back from JavaScript. Just "where in this life cycle are we?" log lines. And I restarted the Rails server, and it didn't hang in the JavaScript code.
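A minimal sketch of what those lifecycle log lines might look like (the tag and the method names here are hypothetical):

```ruby
Rails.logger.info("[sandbox] preparing customer data")
payload = prepare(customer_data)

Rails.logger.info("[sandbox] about to shell out to node")
result = run_in_sandbox(payload) # the step we suspected was hanging
Rails.logger.info("[sandbox] got back from JavaScript")
```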
16:43
It never got there. So this hypothesis was not true, and I had learned something new. So then the question was, maybe the data's too big. We knew that our test data was a lot smaller than our customer's data. We had seen this before. Other teams were testing with larger data sets than we were, and
17:02
we had this vague sense that maybe our customer's data was even bigger. And so we thought, maybe, when the data's too big, shelling out to node hangs. And so we tried it again with smaller data, in the exact same situation, except with a much smaller data set.
17:22
And that worked. And so we're starting to get a sense that this is really what the problem was. And then we circled back to the gathering knowledge stage, and we did more research. And we found out that node has a buffer limit that was way smaller than our customer's data set, which was way bigger than we had ever imagined.
17:42
And so we realized that the problem was that we were not piping our data properly and never flushing the stream, so it was just sitting in the node buffer. And we were able to fix the bug by properly using Open3 and not forgetting to flush our data.
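Here's a minimal sketch of that kind of fix, assuming a hypothetical sandbox.js script. The broken version wrote to the child's stdin without ever closing it and without draining stdout, so once the pipe buffers filled, Ruby and node each blocked waiting on the other, which looks exactly like hanging forever:

```ruby
require "open3"

def run_in_sandbox(input)
  Open3.popen3("node", "sandbox.js") do |stdin, stdout, stderr, wait_thr|
    # Drain stdout and stderr in background threads so neither pipe
    # buffer can fill up and block the child process.
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }

    stdin.write(input)
    stdin.close # flush the data and send EOF so node knows input is done

    status = wait_thr.value
    raise "sandbox failed: #{err_reader.value}" unless status.success?
    out_reader.value
  end
end
```

Open3.capture3("node", "sandbox.js", stdin_data: input) does the same plumbing in one call and is usually the simpler choice.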
18:04
So I hope this sort of shows how you can use this cycle. Your first question might not be the relevant one, and that's what's useful about it. So why is the scientific method helpful for debugging? Well, the first reason is collaboration, and it's great for getting unstuck.
18:26
And once you're unstuck, it helps you keep moving forward. Why is the scientific method a great tool for collaboration? Well, it can help you and your teammates get on the same page.
18:42
If you have been working on a bug for a while, and you're frustrated, and you go to a teammate and ask them for their help, your notes are a really valuable tool for helping them come on to the bug in an efficient manner. You can tell them just what you're working on right now, and you don't need to tell them the entire history.
19:01
You have your notes, so you can refer back to it when you need to, like when they become relevant. You also can avoid telling them all the rabbit holes. If I'm telling a co-worker just stream of conscious about a bug I'm stuck on, I often go through the process that I've gone through to get here over the last
19:22
two days, and that has a lot of dead ends that aren't useful information for my co-worker, and so you can refer to your notes, realize their dead ends before you tell your co-worker this information, and just skip the dead ends. It's also really helpful because, I think I switched those slides.
19:43
Well, good notes do make it much easier for your teammates to help you, because of that onboarding. But getting your teammates on the same page is also really helpful if you're disagreeing about what the cause of the bug is. I do a lot of pair programming. I spend most of my day pair programming.
20:02
And sometimes when my pair and I are working on a bug, we have really different understandings of what the bug is. And we might have really different questions, we might have really different hypotheses. And so by picking this procedure where we have to agree on one question and one hypothesis,
20:21
it forces us to communicate clearly about what we think is happening. And it gives us a really generous way to give way to each other without feeling embarrassed or feeling like we're not being respected. Because we can say, okay, we came up with 20 questions. We have to pick just one.
20:40
So let's investigate your question. If it turns out to be correct, that's great, we'll fix the bug. And if it's not, then we can come back to my question. And it gives you a really generous way to communicate when you're sort of really disagreeing. We talked a little bit about why good notes make it so much easier for your teammates to help you.
21:01
So it also helps you by getting unstuck. And the reason it does this is that it narrows your focus. And so you get unstuck with a little rocket ship, I love that. Writing down what you know helps you organize your thoughts. When you're feeling really stuck, your thoughts are often just going in circles
21:23
and they're getting a little overwhelming and you just kind of don't know what's going on. And by writing down your thoughts and then being able to look at them afterwards, that sort of externalization process is really helpful for organizing your thoughts. Because you look at all the things that you stream of consciousness dumped.
21:41
And then that gives you a little bit of emotional space to step back from your feeling terrified that you can't solve this bug. And it lets you notice which things you still think are important after you've put them down on paper. And it turns out you might find some of them are not important and you might be like, that one that I wasn't even really thinking of is
22:01
really, I think that's where my question is. I think that's the problem. One other way it really helps you get unstuck is that the question "how do I disprove this hypothesis?" is actionable. "How do I fix this really overwhelming bug?" is not actionable. There's no clear action you can take.
22:20
But very often, when you have narrowed your focus down to, taking our example from before, "I think that my service is spamming the database," then you can say: okay, I can look at how many connections there are from my service to the database. And it's much clearer what information you need to collect once you've narrowed your focus down that much. It's just much easier to work out what information you need to disprove this tiny question than to fix the whole bug.
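As a hedged sketch, that check might look like this in a Rails console, assuming a PostgreSQL database (pg_stat_activity is Postgres-specific, and your service's application_name will differ):

```ruby
# Group current database connections by the application that opened them.
rows = ActiveRecord::Base.connection.select_all(<<~SQL)
  SELECT application_name, state, COUNT(*) AS connections
  FROM pg_stat_activity
  GROUP BY application_name, state
  ORDER BY connections DESC
SQL

rows.each do |row|
  puts "#{row['application_name']} (#{row['state']}): #{row['connections']}"
end
```

If the hypothesis is wrong, the count for your service stays at or below its connection pool size, and you can cross that question off and move on to the next one.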
22:43
And so it helps you move forward. Indeed, moving forward: once you've gotten unstuck, it also helps you keep going forward. Once you've gotten past that frozen place, you can keep going.
23:02
One of the most valuable ways that the scientific method is really helpful is that it prevents you from repeating yourself in two ways. The first is when your teammate comes in and you've asked them for help. And they're like, hey, did you try restarting it? And you're like, yeah, I spent all afternoon restarting it and
23:20
that didn't work, and do you think I'm an idiot? And you know that your teammate's trying to help, because you've asked the same question of someone else, but you're really pretty frustrated about it. I hear some laughs, so I think other people have had that experience. And so with the scientific method, you have your notes and you have your experiment, so you have your great notes from when you spent all afternoon restarting it yesterday.
23:41
And your coworker gets to see the results, what happened. Because that's what they really wanna know. They don't think you're an idiot who didn't try restarting it. They just wanna know themselves what happened when you restarted it. And so you have your notes, and they can read your notes, and then you can skip that whole process and jump to where you are now,
24:01
which is eight hours later than when you were trying those restarts, with new knowledge. It's also great because it prevents you from repeating yourself in a second way. Who here has been working on a bug for a while, and then all of a sudden you realize you're doing the same thing you did three days ago to try and debug it?
24:21
Again, I see a number of hands. I've definitely done this. And so having that record of what questions you've asked, what experiments you've run as you're debugging helps you not go back and do the same things pointlessly because you're lost in this fog of my gosh I've been working on this bug forever and I don't really know. And it just all blurs together and is it three days or is it ten years?
24:42
I don't know. So it helps prevent that repeating in the fog. It's also great because each iteration moves you forward in a way that's really observable and concrete. Every time you've answered your question and you've followed your experiment and you come back to gathering knowledge, it's much clearer than if you aren't taking notes on it and
25:01
you aren't following sort of a framework. It's much clearer to see that you are making progress. Even if you now, instead of 6,000 potential reasons for your bug, only have 5,999, and that's still way too many. But it's really clear that it's one less, and so it helps you not get frustrated and it helps you move forward.
25:23
So when should you use the scientific method for debugging? Another story. So this method really started to crystallize in my mind when I was working on a project where we were shipping a Ruby on Rails app as a VM image for five different infrastructures as a service.
25:42
And each one had to be packaged differently. We were using as our VM image a standard Ubuntu image from Canonical. And we were using Packer and Chef and a bunch of other tooling to put all our stuff on the VM image. And we decided that we wanted,
26:02
instead of using the standard Ubuntu image from Canonical, to use a VM image that was based on that same image, but that another team was building. And they had done all kinds of security hardening, so we could get all their work for free, right? So the best general procedure we could think of for
26:20
this was: go into Packer, which is a tool for building VM images, and swap out the URL from the Ubuntu image to the one the other team was building, and try and build it, and try and boot it, and hope it works. And let me tell you, it doesn't work that way. And we spent about a month doing this and sitting there being like, okay,
26:43
the VM didn't boot, we can't even SSH in, what's going on? And so it was really a month of the whole team being incredibly stuck and incredibly frustrated, and we learned a lot, and we got it done eventually. But I started pulling together these tools of take notes when you're
27:01
frustrated, and throw out a variable, or focus on one variable. All these tools that I had individually, I started pulling together into thinking of it like the scientific method. Because we just kept getting stuck, and I had to develop ways for us to move forward, because we needed to move to this new image for our VM.
27:21
So I think there's two types of getting stuck, and I actually think the scientific method is really helpful for both of them. This idea of two types of stuckness, and that when you're debugging or just building software, or really just living life, you often are alternating between them, comes from my friend Jesse Alford. So the first type of stuck is when you have too much information.
27:44
And you can recognize that you have too much information, and that's why you're stuck. Because you feel overwhelmed, or bewildered, or you're saying to yourself, it could be one of any thousand things. Or you're noticing impossible things that definitely cannot happen in your software, you're really sure it can't do that, but it seems to be doing it anyways.
28:04
And are there leprechauns in your software? I don't know. Maybe it's poltergeists. The other type of stuck is when you have too little information. I can't think of anything. I have no more ideas. You feel frustrated, you feel stalled, and the scientific method
28:24
has aspects which will help with both. So when you have too little information, writing down everything you know is a way of realizing you actually have some information. And so it's very helpful. And when you have too much information, when you pick a single question and you pick a hypothesis, you're narrowing your focus.
28:43
And so even though it could be one of a thousand things, now you have just one little thing to look at. So it's actually really helpful in both kinds of being stuck. And you're usually alternating between the two as you go through a debugging process. And the scientific method sort of goes with you the whole time and can be your friend for both.
29:02
So we've covered the what, the process of doing science, the how, with lots of writing, why the scientific method is so helpful for collaborating, getting unstuck, and moving forward, and when to use it, when you're feeling frustrated or overwhelmed on really complex bugs.
29:22
I hope the next time you're stuck, you try gathering knowledge, asking questions, making a hypothesis, designing an experiment, and running it, taking really good notes, coming to a conclusion, circling back to gather knowledge again, and sharing out with your team, and
29:42
definitely make sure to write it all down. I have a reference guide for this on my website that you can refer back to while you're debugging. I'm not huge on slides as a way of sharing; I think they're very low fidelity. So the guide can be really helpful for referring back to.
30:02
And I would love to hear if you find the scientific method helpful, or have any improvements on it for your debugging on Twitter. And thank you, I think we have a few minutes for questions if folks would like. Yeah, so the question is, how do you know when to stop?
30:21
I talked about spending a month on this bug, and like how do you know when it's something external? I think that's a really hard question, and I think that it has a lot of judgement call involved in it. I would definitely not recommend spending a month on a bug without going and talking to other teams and other people.
30:41
We were doing a lot of going and talking to other teams and other people who knew more about this during that process. If you're spending more than a few hours stuck before at least having a conversation with a co-worker, or a rubber duck, or something, that's probably too long. I don't have any clear guidelines. I think that it's definitely a judgement call based on your software,
31:07
and based on your team, and your product team, and how important is this bug? And maybe we thought it was an important bug when we thought we could fix it in two hours, but it's not that serious if it's gonna take us a week. It's really a judgement call, yeah.
31:22
Yeah, so the question was, do I estimate in advance about how long it will take me to solve the bug? Because often, as engineers, we underestimate how long it will take us to fix the bug. We think we can fix it in an hour, and that's totally unrealistic. I don't usually try and make concrete estimates with bugs.
31:42
I generally try and develop sort of a working framework with my team, with my product manager, with my team lead if I'm not the team lead, or with engineering leadership if I am. It's sort of a general framework of: at what point should we start
32:01
checking in pretty regularly? So on my current team, that's usually two days. If a bug is taking us more than two days to fix, then I try to have a daily conversation about how it's going. Are there other people or teams that need help, or that could help? Is it still worth keeping going on this bug? But I think that's really something that develops with each team.
32:24
And so it's a conversation to sort of have with your co-workers about what is our team's cultural understanding of when should we start having regular conversations about how long a bug is taking. Yeah, other questions? Okay, well, I know nobody will be sad about going to lunch a couple minutes
32:42
early. Thank you so much.