
Best Practices for Debugging


Formal Metadata

Title
Best Practices for Debugging
Title of Series
EuroPython 2017
Number of Parts
160
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers
Publisher
Release Date
Language
English

Content Metadata

Subject Area
Genre
Abstract
Best Practices for Debugging [EuroPython 2017 - Training session - 2017-07-10 - Sala del Tempio 2] [Rimini, Italy] Debugging is a daily activity of any programmer. Frequently, it is assumed that programmers simply know how to debug. However, programmers often have to deal with existing code that does not work. This tutorial attempts to change that by introducing concepts for debugging and corresponding programming techniques. Participants will learn strategies for systematically debugging Python programs. We will work through a series of examples, each with a different kind of bug and with increasing difficulty. The training will be interactive, combining one-person and group activities, to improve your debugging skills in an entertaining way. Contents: syntax errors versus runtime exceptions; getting file and directory names right; debugging with the scientific method; inspection of variables with print and introspection functions; using an interactive debugger; pros and cons of try/except; delta debugging.
Transcript: English (auto-generated)
Welcome, everybody. When I was 15 years old, I was writing a computer game, a little bit like this one. I wrote it on a C64 computer in assembly language.
Is there anyone here in the room who has actually tried doing that? One or two. A few people, great, congratulations. Well, this was my first bigger program written in assembler. And the way it went was: I wrote a couple of lines,
tried to run the program. Usually the computer would crash. And so I switched it off, waited a few seconds, switched it on again, loaded the compiler, loaded my source code, and the entire process repeated.
Needless to say, you won't program very fast this way, because my average debug-run cycle was about 10 minutes. And this led to a lot of frustration. At the moment, my game is 24 years behind schedule.
Only later did I learn that on the C64 there were devices like this one: a small cartridge that you could plug into the back of the computer. It had a button on it; you push it, and you jump right into the compiler,
can edit your code, and continue where you stopped a few seconds ago. But I was totally unaware of that. And basically, the aim of my talk is to help us avoid pitfalls like this one in Python projects.
Now, I'm not going to talk about anything fancy or totally new here, because what I want to address is a common problem that I have observed happening when people are new to Python
or are a little bit more advanced in Python. After you've mastered the Python basics, you pretty soon figure out that there are plenty of libraries. For instance, if I want to do data analysis, then I need to find a library for data analysis, like Pandas. If I want to do, I don't know,
signal processing or Fourier transformation, then I Google a library for that and find SciPy, for instance. Or if I want to do web development, then I will find Django, and so on. These things on the left side, they are easy to find. Even if I don't know them,
I may have a hunch that something like this must exist. But the ones on the right side, if I have no idea that something like interactive debuggers exist, then I won't be looking for one. The same goes for automated testing, and the same goes for all the tools
in the Python ecosystem that help support and maintain our code. And I want to shed light on these dark spots. So if you walk out of this room and think, hey, I haven't learned anything new,
then consider this an additional safety check, pretty much like pilots before starting an airplane: they have a checklist that they go through. Okay, do we have clearance from the tower? Are there any other planes on the runway? Do we have enough fuel for an emergency landing? And so on and so on.
And I think it's good to have something like that in a programming project too, and this is why I call it best practices. My talk is split into three parts. I would first like to talk about debugging, then show an example about testing, and then about maintenance.

Debugging.
When you talk about debugging in Python, the first thing that comes into your mind might be print. Now, print is something that I consider a bit problematic,
even though I do it a lot. Because it's like shooting holes into a building to see whether there's a fire inside. Every time you add a print statement to a piece of code, there is a risk that when deleting the print statement, you delete one line too much without noticing.
And this is why it's worth keeping in mind that there are other debugging techniques: for instance, the interactive debugger, or logging. The standard logging module is really an excellent way
to produce diagnostic information. If your program is bigger than a couple of screen pages, then logging becomes superior to print after a while. You need to know your introspection functions like dir, type, isinstance, and a couple of their cousins. And I put code review here as a best practice
that helps with debugging a lot. We had a tutorial on debugging yesterday. If you were attending some other part of the conference, the exercises are there on GitHub. You can try them out by yourself.
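To make the logging and introspection points concrete, here is a minimal sketch; the logger name, the load_level function, and the input file are my own illustrations, not code from the talk:

    import logging

    # one-time configuration; raise the level to logging.WARNING later to mute debug output
    logging.basicConfig(level=logging.DEBUG)
    log = logging.getLogger(__name__)

    def load_level(filename):
        log.debug("loading level from %s", filename)
        # to drop into the interactive debugger right here:
        # import pdb; pdb.set_trace()
        with open(filename) as f:
            return f.read()

    # the introspection functions mentioned above:
    level = load_level("level1.txt")  # hypothetical input file
    print(type(level))                # <class 'str'>
    print(isinstance(level, str))     # True
    print(dir(level))                 # all attributes and methods of the object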
But what about the really tough bugs? If something just does not work, and you stare at the code and go on staring at it without any progress? I'm sure that this has happened to most of you
in the room; at least it still happens to me after 28 years of programming. And I have figured out something very nasty over the years: it's not that I'm lacking some additional best practice for debugging here, because nine out of ten times the problem is inside my head rather than in the computer.
And fortunately, there are ways to fix that. So I would like to add to those best practices of debugging four very elementary things: sleep, talk, read, and write.
Most of the time, if I spend more than 15 minutes on a bug and I don't find the solution, then I'm probably tired. Often it helps to talk to another person and explain what you are doing, to realize what the problem really is. Maybe I'm looking in the wrong spot,
and explaining usually helps. Sometimes my knowledge is limited, and reading the manual of the library, this time for real, helps. And if all of these fail, writing down what the problem is, formulating a couple of hypotheses
or at least ideas what the problem might be, could lead to progress. Most of the time I'm lazy and take a break, and this has solved lots of bugs in the past for me.

Testing.
The check icon here is actually a bit of a provocation because this suggests something that automated testing does not do. Testing actually does not prove correctness of your program. I know that many Python developers,
they love automated testing. I like to write automated tests and run them; it gives a feeling of achievement. But there's a pitfall, and the pitfall is the one in the bottom right corner. There is always a possibility,
if my tests pass, that both the code and the tests are incorrect, even if I try hard to keep my tests as simple as possible. So tests by themselves do not prove correctness of the code, but they have the potential to prove the presence of bugs.
So if I see a failing test, I know that somewhere something is wrong. Now, how can we actually write good tests? Let's imagine we are writing the game again, this time in Python, not in assembly; that grew too tiresome.
We have a figure that is pushing these blue boxes around. It can only push one box at a time, cannot push any boxes through the wall, and so on, until it reaches the exit here in the bottom right. Now, what could a test for this situation look like?
The first thing that you can think of is a fixture in pytest. I learned today, when talking to the pytest core developers, that it's a good idea to place fixtures in a file called conftest.py, because they get automatically imported into all your test files.
Now fixtures are actually pretty straightforward. You decorate a function with the pytest fixture decorator and the name of the function will be available as a variable in all your test functions if you put it there as a parameter.
So in that case, we could have a level parameter available in our test that contains a parsed version of our example game situation. Actually, I did something additional here: I parametrized the fixture, so there are two versions
of the playing field supplied to the test, one with empty spaces and one with dots on the playing field. So I can have two or more fixtures in one by parametrization. And then we can use this in a test function.
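A minimal sketch of what such a parametrized fixture in conftest.py could look like; the level layout and the parsing are my assumptions, not the actual code on the slides:

    # conftest.py -- fixtures defined here are picked up by all test files automatically
    import pytest

    LEVEL_WITH_SPACES = (
        "#######\n"
        "#@ $ .#\n"
        "#######"
    )
    # the same level, with dots instead of empty spaces
    LEVEL_WITH_DOTS = LEVEL_WITH_SPACES.replace(" ", ".")

    @pytest.fixture(params=[LEVEL_WITH_SPACES, LEVEL_WITH_DOTS])
    def level(request):
        """Return a parsed playing field; tests using this fixture run once per variant."""
        return [list(row) for row in request.param.splitlines()]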
I like to group my test functions into classes. With pytest, this is fortunately a lot easier than it used to be with unittest, and I have less boilerplate code. So I can write a normal test function with just an assertion that is self-sufficient,
or I can use the level parameter here. Note that I'm not importing level anywhere; it gets automatically filled in by pytest. And this test function will generate two tests for me, one for each of the variants in the fixture.
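Roughly, such a test class could look like this; the assertions are illustrative and assume the fixture sketched above:

    # test_level.py
    class TestLevel:

        def test_self_sufficient(self):
            # a plain test function with just an assertion
            assert 2 + 2 == 4

        def test_figure_is_on_field(self, level):
            # 'level' is injected by pytest from conftest.py, no import needed;
            # this one function generates two tests, one per fixture variant
            assert any("@" in "".join(row) for row in level)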
What else can we do? The third most important thing that I would like to emphasize about automated testing is test parametrization. So we can have one test function try multiple examples, like here.
With this parametrize decorator, we say we would like to try out all the examples in the list, like having a move that goes first up and then left, after which the playing figure should end up on square (2, 2).
It's even possible to build failing tests with this, or rather tests that we expect to fail. So we still write only one test function, but with this one we would generate eight tests in total. It saves a lot of code.
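A sketch of such a parametrized test; the move sequences are invented, and move() is a hypothetical stand-in for the game function the talk actually used. With the two-variant fixture above, four examples would indeed yield eight tests:

    import pytest

    from game import move  # hypothetical game logic under test, not the talk's module

    @pytest.mark.parametrize("moves, expected", [
        ("UL", (2, 2)),    # up, then left: figure should end on square (2, 2)
        ("LU", (2, 2)),
        ("RRD", (4, 3)),
        # a case we expect to fail, marked explicitly:
        pytest.param("XXXX", (0, 0), marks=pytest.mark.xfail),
    ])
    def test_moves(level, moves, expected):
        # 4 examples x 2 fixture variants = 8 generated tests
        assert move(level, moves) == expected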
The code becomes actually very readable and if we end up in a situation where our test code is ridiculously easy to read, much easier than the code we are testing, then we are on the right track. So if we execute this code by writing pytest,
this is another thing I learned this morning: the dot inside py.test has been deprecated, and we can use pytest without the dot in the middle. So we see that all the tests execute. This test actually uses a window,
and we see 34 passed tests in the entire test set, not only the ones that I showed on the slide; there are a few more running in the back, plus two that are expected to fail because I marked them with the xfail decorator.
Now, how much testing code should you write? In my opinion, this depends quite a lot on the size of your project. If your project is small and prints an obvious result anyway,
then maybe a manual test is enough, unless you want this to be continuously integrated. Sometimes I still write test code in the main block of a small Python module. I can turn this into automated pytest functions quite easily later if the project grows, add some fixtures
as the thing grows further, and if the program keeps growing and growing, then at some point it might be helpful to switch on tools like Jenkins or Travis for continuous integration, or tox for testing multiple Python versions. When I speak about the size of the project,
this can mean different things. It could be absolute volume in lines of code but it could also be the expected lifetime of the project so if I expect code to be maintained for two weeks, then I would not worry too much about testing if I'm writing a throwaway program.
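For a small throwaway module, test code in the main block, as just mentioned, could be as simple as this sketch; the function is a placeholder of my own, and the assertions are easy to promote to pytest functions later:

    # small_module.py
    def parse_level(text):
        """A small placeholder function standing in for real functionality."""
        return [list(row) for row in text.splitlines()]

    if __name__ == "__main__":
        # quick manual checks instead of a full test suite
        assert parse_level("ab\ncd") == [["a", "b"], ["c", "d"]]
        assert parse_level("") == []
        print("all checks passed")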
If the program needs to be highly dependable, so if it needs to be extra safe, then doing more tests and reviews and things like that is also a good idea.

In the final part of my talk, I would like to elaborate on maintenance. Python has a fairly sophisticated ecosystem of maintenance tools, and they serve the single purpose of keeping your code in good shape, with PEP 8 being like a layer of paint on your program,
as many of you have probably heard in Anand's talk a while ago. Making beautiful code is a virtue, and Python has nice tool support to help you with that. Instead of picking a few must-have tools,
I tried to throw in some that keep recurring, and Git being in the middle is not a coincidence. If there's anyone not using Git or version control yet at the moment, this conference is a good starting point to learn it, because you won't be getting anywhere
without version control. But there are many other tools as well. Some of them are interchangeable, like PyScaffold, recently surpassed in my personal ranking by Cookiecutter. There is Sphinx for documentation, virtualenv and pyenv for managing
your Python installations and libraries, Pylint and PyFlakes for watching your coding style and so on and so on. Now, what can you do to keep an overview of all of these tools?
Now, I would like to mention just two possibilities here. One of them: Magdalena Rutter is going to give a talk this Friday where she's going to present an overview of all the different configuration files that you can find in a well-maintained Python project.
So, this is on Friday afternoon. If this is still too far away in the future, I recommend you take a look at coala. Some of you may have noticed that there's a coala flyer in the conference bag. I visited the coala booth yesterday,
and the developers actually gave me a quick introduction, and I was able to run coala within five or ten minutes. So, coala is a framework that basically hosts many linting tools.
That means tools like Pylint or mypy or other tools that analyze the quality of your code, and not just for Python. And I thought, how awesome is that? I can check not only my Python files, I can also check my HTML templates and JavaScript code and whatever
has been accumulating in my project, and get everything from one tool that tells me how good it is. So, how does that work? coala brings its own configuration file that mainly contains a list of the tools
that you want to switch on for a given type of file. For some reason, coala calls these different linters bears. So, you have a list of bears in this configuration file, which is, in my opinion, kind of cute, and I want to say thanks to the developers
for doing that. I really appreciate it. And you can put in some additional parameters. What you can do when you have this .coafile set up is simply write coala --ci, and coala starts scanning your entire code base
recursively, analyzes all the Python files, and comes up with a huge list of comments and suggestions for potential improvements. This ranges from sorting imports to style checks, or even using a static type checker on Python.
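A minimal .coafile along these lines might look as follows; the sections and bear names are examples drawn from coala's documentation, not from the talk:

    # .coafile -- one section per file type, each with a list of "bears" to run
    [python]
    files = **/*.py
    bears = PEP8Bear, PyLintBear

    [javascript]
    files = **/*.js
    bears = JSHintBear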
So, please feel encouraged to try coala out. I'm not brave enough to call this a best practice yet, because it's a rather new tool, but I hope to make that statement next year.
Now, this is the overview of testing, debugging, and maintenance practices that I wanted to give here, and there's one more thing that I would like to do. I got into this topic and liked it so much that I wrote a book about it, and I've got three copies with me
that I'm happy to give away. So, after the Q&A session, I'm holding a bag open here. If you put in a piece of paper with your name on it, I will draw three lucky winners at the end of the session, and you can read more about best practices there,
and some of the content like the debugging tutorial you will find on my GitHub profile. That's it for my talk. Thank you very much for your attention.
Okay, so we have some time for questions. Any questions? Okay. Hi. One thing I noticed that you didn't really mention
was virtualenv, and kind of the mismatch of packages between repositories, pip, and compiling from source. Could you repeat the question, please? So, it seems virtualenv was a bit missing. Surely it's a good practice to use virtualenv
so you have the correct versions of the packages. So, what's the best practice for getting the versions of packages right? Well, use a requirements or dependencies file. Do we have requirements.txt here at the top of the cloud?
So, yes, this is the number one way to go for getting the right versions, because pip can deal with it, conda should be able to deal with it, and it saves you some trouble.
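For illustration, a requirements file with pinned versions might look like this; the package names and version numbers are arbitrary examples, not recommendations from the talk:

    # requirements.txt -- install with: pip install -r requirements.txt
    pandas==0.20.3
    scipy==0.19.1
    requests==2.18.1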
Any more questions? Ah, sure. So, one thing I was going to say about the requirements thing: I work with software where the version number is not fine-grained enough, so we actually use git commits for the actual checkout.
But it relates to this: if you don't pin exact versions in your requirements file, and in the future packages get updated and functions get deprecated, it ties into your maintenance issue, in that future users won't be able to use your software. So, it's just a key point there,
in case you've not really pointed that out before in talks or in your book. It's an interesting one. Yeah, this is one of the tougher problems. So, I'd be a bit careful with leaving a certain version number
in your dependency file, like, forever. I wouldn't do that. Rather, check it from time to time, or check a few different ones when a newer one comes out, because I would not feel comfortable with leaving the version number empty all the time,
especially not if you are planning to automatically deploy the code. If you are running the program manually then okay, I'd feel comfortable with it but not if you have any automated pipeline running in the back of it. Is your book also available as an e-book?
Yes, but unfortunately you have to pay for it. But it is, yes. Fair enough. Okay, no more questions then. So, that's it. So, applause. Thanks.