Another pair of eyes: Reviewing code well
Formal Metadata

Title: Another pair of eyes: Reviewing code well
Title of Series: EuroPython 2016
Part Number: 95
Number of Parts: 169
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/21081 (DOI)
EuroPython 2016, 95 / 169
Transcript: English (auto-generated)
00:00
Hello everyone. I'm happy to introduce Adam Dangoor. He's a Python developer coming from Bristol and he will present a talk about another pair of eyes reviewing code well. Thank you. Thank you everyone. Hello, I'm Adam and I'm going to talk about code review.
00:24
But why am I talking about code review? Well, the inspiration for this talk is really a bunch of people I used to work with who are mostly contributors to the Twisted project. And I was really lucky, I got to experience the really rigorous approach to review that's also present in Twisted and I feel that I've become a much better programmer thanks to that.
00:44
And I've also had some really bad code review experiences. The kind of experiences that make you not want to go into work the next day. And I never ever want to make someone feel like that. But when it's your role to judge someone else's work, it's really easy to upset them if you're not careful.
01:02
And I guess that some of you have experience with code review, maybe even most of you. Because it's kind of becoming a fundamental part of the software engineering process. But maybe because of that, it's often seen as boring grunt work and not a skill that you can hone and get better at.
01:22
But at my job, I spend almost a third as much time reviewing code as I do writing it. But when I go to programming forums or Twitter or read blog posts about how to become a better programmer, I see very very little about how to become a better reviewer.
01:40
And when a programmer starts out, they're often paired with a mentor to teach them how to code or given kind of like easy fix bugs that they can get trained on. But we're really expected to jump in at the deep end and know how to review code without ever having been taught. So today I'm going to talk about these points, what code review is, how
02:04
I review code, some common pitfalls and traps that people fall into when they're reviewing, tools which can help make your life easier, and how code review can make you a better programmer. So what is code review? It's a practice where someone looks over someone else's code and they look for things which can be improved.
02:25
And we usually do this before merging code because we want to catch bugs as early as possible. Yes, we've got automated tests, but code review can sometimes be the final gatekeeper before code hits the real world. And we also use it to catch non-bug defects which can harm the project such as performance issues or maintainability problems or something like that.
02:48
But code review isn't just about making the one change set that we're reviewing better, it's about making all of the software that we ship better. And it's important to think about how we can use code review to improve ourselves and the contributors that we work with.
03:03
And I saw a talk last year at the Docs conference by Paul Rowland, who I think is also at EuroPython, and he was talking about how we should stop trying to be code ninjas and rockstars, all those other kind of things that you see on job posts, and we should start trying to be code nurses and code tree surgeons.
03:23
And I kind of see the reviewer's role as just like that. It's not as thrilling as the ninja feature author shipping something at midnight, but it's really important to the health of the project. And all of this happens within the context of the project's goals, which usually includes shipping code fast, or at least fast enough.
03:44
So let's back up and go over quickly how we get to the review stage. And in the last few companies I've worked at, it goes something like this. Eleanor, our programmer, is new to the project. She's brand new and she's looking to make her first contribution. So she picks a task from the task tracker. I'm using GitHub here, but it could be Jira, Trello, whatever you use.
04:04
And this one is written by me. The name of the user's latest employer needs to be shown in our app's new prototype feature, user profiles. So she creates a branch so that she can make changes to the code without worrying about breaking it. And using branches for development is a great way to create like a low risk environment for people to experiment in.
04:25
And it lets a reviewer easily look at the proposed change, just the difference between what's been written and what's currently there. And so Eleanor starts searching around the code and she finds that there's already a method to get the date-sorted employment history. Great.
04:41
And there's already a function that you can see here that returns a dictionary for a user's profile, which is later handled by a view layer and shown in the app. So she adds some new code that she thinks might work. You can see those two new lines: it takes the date-sorted employment history and displays the first item of that list in a new field called latest employer.
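Her change might look something like this sketch. All the names here (get_date_sorted_employers, the profile dictionary keys) are invented for illustration; the talk's slides show the real code, which isn't reproduced in this transcript.

```python
# Illustrative sketch of Eleanor's change; names are invented, not from the talk.

def get_date_sorted_employers(user):
    """Existing helper (assumed): employment entries, most recent first."""
    return sorted(user["employment"], key=lambda job: job["start"], reverse=True)

def user_profile(user):
    """Build the dictionary later handled by the view layer."""
    profile = {
        "name": user["name"],
        "twitter_handle": user["twitter_handle"],
    }
    # Eleanor's two new lines: surface the most recent employer.
    employers = get_date_sorted_employers(user)
    profile["latest employer"] = employers[0]["employer"]
    return profile
```

Run against her own profile this happily shows her latest employer, which is why the change looks done at a glance.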
05:03
And then she runs the application locally and she checks her own profile and it shows her latest employer, Bilbao Inc. So we know that this works. Eleanor has seen that it works. Do we need to spend time reviewing it? Well, in my opinion, all code that's written as part of a team should be reviewed.
05:23
I've heard objections to this, particularly when the change set is trivial or if it's absolutely urgent. And I'm going to propose that very, very little of what we do is actually trivial. Programming is very hard, at least for me, and authors have an inherent bias towards thinking that their code is good.
05:43
Even if we waste a few minutes reviewing a change that really is trivial, we save hours later debugging the bug that was shipped when someone thought their code was really simple but it actually had a typo. And even if we're absolutely certain that the code can't break, a review means that at least two people know where this new code is.
06:04
And if the code is really urgent to ship and there really isn't someone around to review it, maybe it is worth merging. Maybe you've shipped a data-destroying bug and you really need to revert it right now. But at the very least, I'd recommend that you do a post-merge review. Whatever that is, I won't go into it.
06:25
So, back to Eleanor. She submits her code for review and she's nervous because it's the first contribution to the project that she's made. But of course she doesn't mention that and she puts the code up for review. So, now we've got to decide who does the review.
06:42
Who gets permission to merge code is a whole problem that I won't go into but it's got to be someone with that permission. But at least where I've worked, anyone can do it. Anyone can review code and merge it and it's just like a trust-based system. And as I mentioned, as programmers we're biased towards thinking that our code is working. But we're also biased towards thinking that our code is readable.
07:03
Our comments are readable, our variable names are understandable. So the ideal reviewer is someone who has had no part in writing the code. And if we choose someone who's recently worked on the changed files, then they're likely to have really great context which will help them find a high number of bugs.
07:23
And according to a Microsoft study that I read, the most useful review comments, so I guess the most bugs found in most cases, come from people who've actually previously reviewed the modified files. But when we choose people who are recent authors or recent reviewers, that conflicts with our goal to share knowledge with the team.
07:43
And that might be okay, but it's at least something to consider. But a good workaround for that problem is to allow developers who are unfamiliar with the code to do a review without designating them as the gatekeeper. You have someone else say, this looks good to me, let's merge it. In fact, when I join a team, I love to jump straight into reviewing as much code as possible.
08:05
And when there's no pressure of knowing that I have to take responsibility, I'm not the one saying it has to look good, someone else will do that. Then it's a great way to learn how things are done. It teaches you, let's say, about a project's quality bar and standards and how the process works from writing code all the way through to shipping it.
08:23
So, say I want to review Eleanor's changes, this is how I'll go about it. I start by reading the spec. Without reading the spec, there's no way that I can know if the code meets all the requirements. Hopefully, the spec's in an issue tracker and it has descriptions and examples of how the new functionality might be used or how to reproduce the bug if there's a bug that we're fixing.
08:46
And then I check the CI, whatever that is, whatever tool you use. And I hope that it's passing, but sometimes we know that CI isn't always passing, so at least I check that there's no new issues. And CI really should be a signal of whether something that was already there is broken, so it's good to check that first.
09:04
And then next, I look for evidence that the spec's been implemented. And ideally, that means looking at the test changes. If there's a bug to be fixed, I want to see that there's a passing test which would have failed if the bug were present. And if there's a new feature, then I want to see tests that cover the whole spec.
09:23
And if it seems like the spec is ambiguous and I've interpreted it differently to the author, then this is a good time to start a discussion about what we want. That could be with the author or whoever's responsible for shipping the product. Maybe the product owner or project manager, whoever it is in your organization. And then I look at the implementation.
09:41
I think about whether it has functionality that isn't tested. And that's important because even if the code works now, we want it to be safe from regressions. And I kind of think about which missing test cases could convince me of that. And then I think about whether there's any risk that the changes could break existing functionality. Yes, we've got CI, but it's not always perfect, and it's often a good thing to think about.
10:04
And then I think about, is the new code going to be easy to maintain? And that can range from things like, you know, are the variable names understandable, but also I have to think about what the structure of the code is. Is it really branchy? Does it have lots of side effects? Is it idempotent?
10:21
Is new code in the place that I'd expect to find it? And these are all the things that you think about when you're writing the code. So you get a lot of the advantages of pair programming, but it's asynchronous. And then I check that everything user-facing is documented. That might not be relevant in your project, but it is in mine. And I'm also, at the same time, while I'm reviewing the code, I'm looking out for new techniques and patterns and tools that I don't know yet.
10:45
And it's always really nice to say thanks to the author when you learn something from reading their code. And then I'll write a comment about anything that might be improved. And I'll ask questions and start a discussion about anything that I'm unsure about. And at the end I'll give a conclusion.
11:02
And usually it's one of these options. I say great, it looks good to me, I'll merge it. Or please make the requested changes and then you can merge it. Or make the requested changes and then resubmit for another review. And then sometimes in a rare case I've picked up a review that I'm not qualified for and I ask that someone else looks at it. And I'd say that the first step to getting better at reviewing all those previous steps
11:25
I've mentioned is to take it seriously, like we often do the rest of our craft. If you learn about software engineering practices from blog posts, then search for blog posts. If you learn from formal studies, then look for formal studies about code review. And importantly, ask for feedback on your reviews, just like you ask for feedback on your code.
11:45
So, I look at Eleanor's new code and I immediately spot a problem. The Twitter handle field just above is title case, but the new field is sentence case. And so I know that the pull request is not up to scratch, and I comment and reassign the issue to Eleanor.
12:04
So what are her next steps? Well, she sees my comment and she kicks herself for making such a trivial mistake. Why can't I do anything right, she thinks. Maybe I'm not good enough to be here. And then she searches the code base for latest employer, the name of the new field. And she sees that it's written just like she wrote it, everywhere else it comes up in the code.
12:26
Beginners don't have the confidence to protect themselves from feeling hurt by code review. And some experienced developers don't either. So what we should do as a community is be wary of imposter syndrome and use code review as a forum to make this industry a nicer place.
12:46
But you don't want to let bad things be merged just to avoid hurting someone's feelings. One tip is to always say one nice thing about the code. And when someone's made an effort, there's always something nice you can say. And it's good to always address the patch and not the person.
13:03
At a previous company, I was lucky enough to be thrown in at the deep end of a project. And that project had quite a steep learning curve. But it took me some time to get into the mindset that my code was being reviewed and not me personally. But what else could be better with this review? Well, Eleanor doesn't know what to do next if you remember that she was searching around the code base.
13:24
And one of our goals, the project's goals, is to ship code quickly. So by not making it clear exactly what she's got to do next, I've slowed down the process and I've added unnecessary work for her. And also, you might spot that there's a bug a few characters later.
13:40
Basically, if that list from get employers is empty, then I'm going to get an IndexError when I try to access the first item. But I stopped my review when I saw the first problem. And that means there's got to be a whole extra round of review, which delays our shipping the code. And of course, sometimes there's a trade-off when you're thinking about when to stop. If a patch is really large and based on a completely wrong assumption, then scrutinizing every line probably isn't worthwhile.
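The empty-list failure, and one possible guard against it, can be sketched like this. This is a minimal illustration; the function name is invented and the guard shown is just one way to fix it, not necessarily the fix Eleanor ships.

```python
def latest_employer(employers):
    """Most recent employer, or None for an empty employment history."""
    if not employers:  # without this guard, employers[0] raises IndexError
        return None
    return employers[0]
```

A user with no employment history at all is exactly the kind of case a reviewer can spot that a quick manual check of your own profile never exercises.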
14:05
So, after a back and forth and a comment about the index error and so on, Eleanor resubmits this code. And I'd say that's a code review success. We stopped a bug getting into production. It's great. And so, I start my next round of review and I comment that a try-except clause has performance overhead that we don't want.
14:25
And that the message line is greater than 80 characters, which violates PEP8. And finally, I mention that using a getter for the list of employers (remember that function she found and used) isn't very nice, and she should use a property instead. And on the surface, these might seem like reasonable comments, but it really is not a great review.
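For reference, the getter-versus-property suggestion looks something like this in Python. The class and attribute names are invented for illustration; both versions return the same data, the property just allows plain attribute access.

```python
class User:
    def __init__(self, employment):
        self._employment = employment

    # Existing getter style that Eleanor found and reused:
    def get_employers(self):
        return list(self._employment)

    # The reviewer's suggested property style: same data, attribute access.
    @property
    def employers(self):
        return list(self._employment)
```

Whether a stylistic migration like this belongs in this review at all is the talk's next point.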
14:48
Because as a reviewer, I need to consider the context and I need to consider what's important right now. Sometimes it's security, sometimes it's maintainability, sometimes it's speed. Who knows? And this right here is prototype code if we remember the spec.
15:02
So, there's no need for me to be overly concerned with tiny, tiny performance issues. So, I've kind of wasted some time with that suggestion. And as for PEP8? Well, it's a guide. It's not a rule set. And if you want to follow a guide, fine. But if you want rules, then I totally recommend using a tool or set of tools to enforce them. And that's partly because humans are very bad linters. We miss things that tools catch.
15:25
And it's partly because if we use tools, then we can free ourselves up to do more important things. And flake8 is one of my favorite tools. It combines a lot of tools and it enforces the PEP8 guidelines, some of them as rules. And then there's coverage.py, which checks your line coverage. And requires.io, which checks for vulnerabilities in dependencies.
15:44
And landscape.io, which looks for common code smells and mistakes that people often make. And there are even tools for docs. If you're using reStructuredText, there's doc8, which lints your documentation. And there's Sphinx's link checker and spell checker. And there are probably similar checkers for Markdown and AsciiDoc and whatever toolchain you use.
16:02
But sometimes these tools will suggest changes that are actually worse. My personal preference is to always just follow the tool's suggestions. And overall your code is probably going to be nicer. And you save the cognitive burden of linting by eye. But if you really want, most of these tools allow you to ignore a particular issue.
16:20
Like here I've told pylint, don't worry that the function name is too long. And if a suggestion keeps coming up in your project and your team don't like it, you can just disable some of the checks in most of the tools, or change them. And if you use GitHub, then you can integrate these tools with pull requests, so you can actually make it so that code can't be merged if the tools find problems.
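The suppression comments themselves are real directives in pylint and flake8, though the surrounding code here is invented: pylint honours a per-line `# pylint: disable=...`, and flake8 honours `# noqa`, optionally with a specific error code.

```python
# pylint's invalid-name check (its default naming regex also caps name
# length) can be silenced for just this one definition:
def a_really_quite_excessively_long_descriptive_function_name():  # pylint: disable=invalid-name
    return 42

# flake8's equivalent per-line escape hatch is "# noqa", ideally with a code:
answer = a_really_quite_excessively_long_descriptive_function_name()  # noqa: E501
```

Disabling a check project-wide instead belongs in the tool's config file, so the decision is recorded once rather than argued in every review.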
16:43
And finally, Reviewable is a really powerful tool which has great features that GitHub doesn't have. Like, say you're doing a review, you can say, I just want to see the changes since I last did a review. So if we go back to my review, we see a suggestion of an unrelated change: changing an existing getter to a property.
17:03
Now if you want an unrelated change, it's probably best to just make an issue in your issue tracker and deal with it later. A review might be a great place to spot required unrelated changes, but not to request them. Because asking for unrelated changes slows down shipping the code and also makes the next review round less pleasant.
17:20
In fact, as a reviewer, I should request easily reviewable changes. But what are reviewable changes? Well, statistically, more defects are found when change sets are shorter. And as a reviewer, you can ask that a patch is split up if it's difficult to review. And you might be able to help decide where the splits should be.
17:43
So say someone refactors some code before adding to it, you might ask that the refactoring is in its own separate pull request. And when you spend a lot of time reviewing code, you start to get a good idea of what's going to be easy to review, and you'll start to write code like that. And writing easy to review code means reviewers can be more effective
18:01
and find more bugs, so your software is better. And when we focus on making a reviewer's life easier, we write code that can be extended without changing what's there too much. Otherwise known as maintainable code. And a trivial example of this is the trailing comma, the "Silicon Valley comma". If I add a comma at the end of the last line, after Basilica, then
18:22
when I add a new item, I get to see that just one new item's changed. Great. And the same goes for docs, but I'll skip that because we're running low on time. And I'm going to say that the best reviews aren't just laser-focused on providing the bare minimum needed to ship, but they also don't add unnecessary blockers to merging.
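The trailing-comma point can be sketched like this; only "Basilica" is named in the talk, the other item is invented.

```python
# With a trailing comma after the last item, appending another entry later
# shows up in the diff as exactly one added line, instead of one changed
# line plus one added line.
employers = [
    "Bilbao Inc",
    "Basilica",  # <- trailing comma keeps future diffs to one line
]
```

The same habit applies to dict literals, function argument lists, and imports split across lines.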
18:40
And when you explicitly state that a suggested improvement is optional, and you can let the author choose whether, let's say, to learn a suggested tool or to just skip it and ship now, then that can be really useful. And those priorities can be adjusted depending on the context. And let's say if you've got a brand new developer to the project,
19:02
you're onboarding someone, well, you can be super picky and get them in sync with the rest of the team, or you can be easy on them because you want them to feel happy and get started quickly. And so every review comes with a mini, how would I have done this? When we read someone else's approach to the answer to that question,
19:21
and we notice that their approach is better, or at least better in some way, then we learn techniques and patterns from them. And I find that teams which value review get really much better reviews back, and they get reviews done more quickly. And my favorite example of this is Twisted's high score board. So you get a score every month, and you get some points for submitting,
19:44
but you actually get the most points for doing a review. Thank you very much for having me. I haven't put them up yet, but I'll put my slides up there,
20:02
and I'll probably tweet about it or something. Hopefully you'll find a link later. So are there any questions from the room? We have a few minutes for questions. I have just a quick one. What's that score thing that you showed in the end? That's Twisted. So Twisted is an open source project.
20:22
It's like a networking framework, and they want to encourage people to contribute. So just as a bit of fun, they've got a high score board. Who's been the best contributor this month, or the most prolific contributor this month? So you get 100 points or something if you submit some code,
20:43
and 500 points if you close a ticket, but you actually get 1,000 points if you do a review, because they have a backlog of code that's been submitted and not reviewed, and so there's no point having new code if the code that's already been submitted isn't merged. Hi. Great talk.
21:00
Thank you. So when you review code, do you also suggest to check it out and test it, like if it's more complex, or is that not part of your job as a reviewer? How do you stand on this? So I don't always do it. Sometimes the tests hopefully tell me enough. If there are no tests, sometimes there's a good reason that code isn't tested.
21:22
Let's say it's prototype code, we're going to throw it away, or maybe it interacts with an external service, and it would just be really costly to build a fake for that service. What I like to have is a reproducible manual test case, let's say if it's Elasticsearch. Connect to Elasticsearch, check this in the web interface after you run my code or whatever,
21:42
and I like to get my editor out, I like to get my web browser out, run it, and check that that works. And also, if I've got a suggestion, and I'm not 100% sure about it, I like to modify the code, see if my suggestion might work, at least in a prototype stage, or see if I can break the tests by modifying the implementation,
22:03
or make the tests still pass by modifying the implementation in a way that the functionality would break, and then I've found a flaw in the tests. So I do get my editor out. Thanks for the great talk. I have a question regarding the size of reviews, how to handle it, because if you have a feature that needs to be implemented,
22:24
but the change set would be too large, how do you handle a case like that? Do you make earlier pull requests with only a partial implementation of the feature? Or what would you suggest? So I like to always keep master or trunk, or whatever you call your main branch, working.
22:41
And so one way that people do it is you make a branch for the whole feature, and then that's just empty, it's like master, and then you make branches off that, which have all the partial features, and you put those branches into that first branch, I should have given them names, and then eventually when you're happy with it all, you've got this huge diff, and you merge that into master.
23:06
Hi. What about the time spent? What do you suggest? You can go really deep, like reviewing every line, testing every line. Cool. So it depends on a few things. One, I never want to be tired in a review.
23:22
As soon as I'm tired, maybe like 45 minutes in, my review is useless, I'm going to miss bugs and stuff. But as for scrutiny, it really depends on the context again. If I'm reviewing something right at the core of my project, it needs to be great, I'll spend as much time as needed. Or if it's, let's say, an API,
23:40
that's going to be really tough to change later because users are going to depend on it, then I'll also be really, really deep. If it's something in the middle, or like I mentioned before, prototype code, or something like that, then I can say more like trust, or okay, I've opened this up in a web browser and it all seems to work, so fine. Or at least something a bit closer to that.
24:00
And like I said at the end, if it's a beginner programmer on your team, then maybe you want to be really picky just to help them along, or something like that. It depends on the person as well. Last question. Oh, sorry. So imagine you have like several issues, and all those issues can be resolved
24:21
in maybe one or two lines of code. Do you favor like several commits, or just one commit? So different projects do it in different ways. Some people like to review the commits. It's quite nice when you've got a set of commits to be able to just revert one, or to see a history, why did that come up?
24:41
That's the advantage of commits, but usually in my reviews, at least where I work now, I don't deal with commits. So that's a different practice that lots of people do, but I don't. So I'm sorry I don't have a great answer for that. So we are running out of time. Thank you very much for your insights.