Effective Code Review
Formal Metadata

Title: Effective Code Review
Title of Series: EuroPython 2016
Part Number: 44
Number of Parts: 169
Author: Dougal Matthews
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers: 10.5446/21127 (DOI)
Language: English
Transcript: English (auto-generated)
00:00
So our next speaker: give a big round of applause for Dougal Matthews and Effective Code Review. Thank you. Okay, so, well, first of all, thanks all for coming and for having me here. I've been thinking about code review for quite some time now, so I'm quite excited
00:20
to talk about it and to start a discussion about it. That's really my main goal for today: if people are talking about code review more, then I've had some success. So first of all, just to give you some context of where I come from — what I do isn't actually too important for this talk, but it's useful to know where I'm approaching code review from.
00:41
I work for Red Hat, where I'm working on the OpenStack project, which is the bottom logo, and within OpenStack I work on TripleO. The key thing about all of these is that everything I do is open source and public, so all the code reviews I do are also public and open. This means that some of the things I'm going to say are specific to open source, I guess, and maybe don't apply in a closed environment,
01:01
but I think a lot of the things still do apply. I also run the Python Glasgow user group in Scotland, which is completely unrelated, but I just like to show off our cool logo whenever I can. That's an inside joke, which I can explain to anyone later if they don't understand it. I'm also doing a bit of an experiment in this talk: I've scheduled some tweets, and that's my Twitter account —
01:23
the O in Dougal is a zero. They'll be tweeting out some relevant links during the talk. If there aren't two out by now, then it's not working. So yeah, we'll see afterwards, but if you're looking for references to any of the papers or blog posts that I mention, they should be shared on my Twitter.
01:42
First of all, I'd just like to try and gauge the audience a little bit and see where people are at with code review, what you're doing. And to get a baseline: how many of you will raise your hand if I ask you to raise your hand? Okay, that's pretty good. I guess we're talking about 90%, so everything can be adjusted against that from now on.
02:01
And who's doing regular code reviews at work? By this I mean as either the author or the reviewer of a change. Okay, that's a good percentage — we're looking at about 75% there, maybe. And who is doing them outside of work? So maybe open source or a side project, something like that, again as author or reviewer. Okay, so that's a lot less; we're looking at maybe 15 to 20%.
02:23
Finally, who actually does code review and looks forward to doing it? Okay, so we pretty much had everyone doing some kind of code review. But then — and this is for the benefit of the video, I guess, and nothing else — the number of people actually looking forward to doing code review
02:41
is probably down to about 20%. And this is the one that is the most interesting to me. I find code review incredibly valuable; there's so much you can get out of it, and it's a really useful process. But yeah, a lot of people don't really enjoy it. So I'd like to start the discussion and see how we can improve that, how we can make it a better experience for everyone. I also did a straw poll on Twitter not too long ago,
03:00
and I found that, of all the people that responded, everyone is doing code review for up to 25% or up to 50% of their day — of their development time, I think. But that's a huge chunk of time. So if we can improve code review, we can improve what we build. Or perhaps more importantly, we can improve how much people enjoy their jobs.
03:23
And for anyone that's not doing code review: this talk is really aimed more towards people that are, but I'll quickly give a couple of quotes which will hopefully be useful if you need to persuade someone on why you should be doing code review, with some references to papers and blogs. So there are just three very quick slides. But as far as I'm concerned,
03:41
the only real reason not to be doing code review is if you're a sole developer on a project. And actually — oh sorry, there's a trick that I've used before. If you're doing an open source project, say on GitHub, rather than committing directly to master, if you open a pull request you might find that occasionally some people wander by and do a review before you merge your code, especially if you've got watchers on your project, because they get notifications when you open a pull request
04:02
on your own repository. So first, this is a quote from the book Code Complete by Steve McConnell. He says that the average defect detection rate is 25% for unit tests, 35% for function tests, 45% for integration tests,
04:21
and in contrast, the average effectiveness of design and code inspections is 55 and 60%. When he talks about defects here, he means bugs or problems with the code. And he's basically claiming that code review is the most effective of all these approaches for improving your software. I'm not sure I completely buy these numbers, but even if it was just on par with your testing,
04:40
that's quite an impressive amount of benefit that you can get. Next, there's a quote by Jeff Atwood. So he wrote this in a blog post. And basically he said, the only hurdle to doing code review is finding a developer that you trust to actually do the code review. And then once you've started, he says, I think you'll quickly find that every minute you spend in a code review
05:01
is paid back tenfold. And this one really tackles the assumption — a lot of people say they just don't have time to do code review. There'll be some upfront investment, of course, but once you actually get into the process of it, you will find that you reap the benefits over time. And then finally, from a paper, Measuring Defect Potentials
05:21
and Defect Removal Efficiency — academic names are great — they found that formal design and code inspections often top 85% defect removal and average around 65%. So again, this is a huge number, and this one is quite good because it's from an actual academic paper, and they share some of the data they used, so there's actually quite a lot of method behind these numbers.
05:41
The other two are more anecdotal, as far as I can tell. So, with that aside, hopefully you're all convinced now to do code review — or, since most of you are doing it anyway, we can move on. But if anyone is actually still doubting the process for some reason, I can bombard you with references and links, so just come and ask me afterwards.
06:00
One of the things about code review that's quite interesting is trying to compare the expectations people have with the outcomes of actual code reviews. And I think this is reinforced slightly by these quotes: you'll notice two of them were talking about finding defects and bugs. I think the reason they talk about that is that it's much easier to measure bugs than other things, but there are actually a lot of other subtle side effects
06:21
and outcomes from code review, rather than just thinking of it as a bug hunt. In the paper Expectations, Outcomes and Challenges of Modern Code Review, they said that while finding defects remains the main motivation for reviews, reviews are far less about defects than expected,
06:40
and provide additional benefits such as knowledge transfer, increased team awareness, and creation of alternative solutions to problems. What they actually did — this research was done in partnership with Microsoft Research, or it might have been just Microsoft Research, I'm not sure — is they took some teams at Microsoft and quizzed the developers beforehand,
07:01
sorry, a questionnaire rather than a quiz. They asked them why they do code review and what they expect to get out of it, and the top result was finding defects in code — most of them said that's why we're doing code review. And then, after the code reviews had happened,
07:23
the research team manually classified hundreds of reviews to work out what the actual outcome of each review was. And these are the results they got: finding defects was the outcome only 14% of the time, so it only came in fourth in terms of the actual outcomes of reviews.
07:43
So it's actually a lot lower down than you would perhaps have expected. I only called out two of the numbers because the paper had a graph in it and it was quite hard to read the values, but they explicitly mentioned those two numbers, so that's why I only have them. The order is at least correct. So yeah, you can see that above defects
08:00
you're getting code improvements. The difference between defects and code improvements is that the code submitted for review was perfectly valid and worked, but it maybe wasn't the optimal solution depending on your context. Maybe it wasn't performant enough, or maybe there was an API that could have done it in a more elegant way — there are tons of options.
08:20
Then 'understanding' means a shared understanding in your team about what is going on with the code base. If you're not actually reviewing changes as they come in, the only way you really find out what's going on is by looking at your git history, which I don't think is particularly fun, or you're just trying to use something and find that, oh, this function's changed. That's kind of annoying.
08:40
So yeah, it's just about making sure there's a shared understanding, and that everyone really knows how the project is developing and evolving over time. And they even found that social communication, which is obviously a very important thing in any team or community, was also rated higher than defects. So with these goals in mind,
09:04
rather than treating code review as a bug hunt, we've got to think about how we can maximize all these other benefits as well. Obviously finding defects is still an important part of it, but we shouldn't focus only on that, because it doesn't seem to be the main actual outcome. In code review, we definitely have two distinct roles: the author and the reviewer —
09:22
or you can have multiple people on either side of the review process. And if I go back to why a lot of people seem to have a negative view of code review, I think it's often because the process comes across as almost adversarial,
09:41
but this person is just getting in your way by giving you feedback. And I mean, it's kind of a tricky problem to solve this. I'm not really sure why it happens. I think it's definitely more prevalent in some situations than others, and people that tend to be more experienced with code review are quite open to the feedback, and it works a lot better. But I think one of the reasons we maybe see this is I don't like the name for it.
10:02
I don't like code review as a name. I don't really think we'll manage to change it because that ship has sailed. But I, at least in my head, I like to think of it more as a code discussion or a code collaboration, just like me thinking more about how we can work together as a team, and it becomes an iterative process that you're both involved in.
10:24
Yeah, so it's just really about collaboration and coming up with a better solution together. You both essentially have the same goals in the long term, although it can sometimes seem a bit of a conflict at the time when you're trying to figure out what the best solution is to reach that goal. And so what I'll do now is I'm just gonna take
10:42
a look at authoring changes, and then we'll go to looking at reviewing changes after, and just sort of a few bits of unsolicited advice, I suppose, from either side. And you can pick and choose which you think is useful. This first one, I think, is really important, and perhaps overlooked, because you might think that this comes before
11:01
the code review process, which it does really. But it's always important, especially in open source projects, to talk about what you want to do first. If you don't have an issue or something describing what your goals are, and you've not discussed it with the maintainers, then you don't really know if it's even applicable or relevant to the project. Obviously, this is something that you can judge yourself.
11:20
If it's a typo, that's a no-brainer: you should just send over a pull request. Or sometimes you might find that you can express what you want to describe better in code. But you just have to be prepared that your pull request or your code review might be rejected, simply because you haven't made sure everyone's on the same page. And I think it's also important — I mean, in a work environment,
11:41
you would expect to have a requirement description or something before you start working on the code. So it's just making sure you know what you're doing before you write it. It's always good to adhere to the project guidelines. This is pretty obvious, but a lot of people don't actually tend to look for them. This is also perhaps a problem with the projects, where a lot of them either don't have them
12:00
or they make them too hard to find. But you can always take a look and see how the project works in general — look at other reviews which are waiting, or commits that have been merged — and just try to fit in with how they're working; it makes the process a lot better. So if they're always adding tests and documentation with everything, and they're testing on different platforms, then it's always good to make sure you're doing the same and fitting in
12:21
with their development style. Providing context is really important with your commit — or with your change, sorry — for review. I find that reviewing code can be really quite difficult. Often you'll receive a diff for a project that you've maybe not looked at for a few months, so you need to bring back the mental context
12:41
of what the project was doing before, then figure out from the diff what they are trying to do now, and get that all together in your head. It can be really quite tricky. So you can get around this by writing good commit messages and good pull request descriptions: why are you making the change? What do you want? What does it do? Just try to be as descriptive as you can, and help them out.
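To make that concrete, a commit message with that kind of context might look something like this — my own invented example, not one from the talk (the feature and issue number are hypothetical):

```
Cache parsed templates between page renders

Building large sites was slow because every page re-parsed its
template from disk. Keep parsed templates in an in-memory cache
keyed by file path, so each template is parsed at most once per
build.

Fixes #123
```

The subject line says what the change does, and the body explains why it's wanted and how it works — which is exactly the context a reviewer would otherwise have to reconstruct.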
13:02
And if you're missing the context for this image, by the way, it's an acoustic guitar tuning peg. I don't know why. This is sort of related to the previous slide: keeping your changes small and contained. This is about limiting the context required. You don't want somebody to have to understand
13:21
the full breadth of a project just to review something; you want to make a small change and keep it focused on the actual topic. Some people like to do a two-line fix in a file and then send you over a thousand-line diff, because they went around moving things, sort of thinking they're tidying up. And that may be perfectly acceptable,
13:40
but it's better to submit these as two separate review requests. And there's this tweet I saw a while ago, when I was preparing this talk, and I just noted it down; it basically says: code review — 10 lines of code, nine issues; 500 lines of code, looks fine. And some developers might look at this and think,
14:02
okay, I'll send a bigger pull request, that sounds like an easy way to get my code merged. But that's definitely not the right way to think about it. And I think the reason you get more feedback on a smaller change is basically, it's much easier for the reviewer to get that change into their head and just really get their brain wrapped around it. Whereas if you're trying to understand a much larger change, you end up missing things
14:21
because you're so focused on the overall impact of it that lots of subtleties get missed. And this has always been kind of a gut instinct for me — that smaller patches were easier and more efficient to review. But there's actually quite an interesting paper on this: Investigating Code Review Quality —
14:43
Do People and Participation Matter? They do. And they found that larger patches lead to a higher likelihood of reviewers missing some bugs. As I said, I think this is a fairly self-explanatory and obvious one, but it's good to see it backed up with data, because they spent some time looking into the public Mozilla bug tracker
15:02
— Bugzilla, which is what Mozilla uses for bugs — and they combined the bug data with the reviews to find the trends: bigger patches were introducing more bugs, and so on. So yeah, smaller diffs are good,
15:21
and touching fewer files is good. When you actually submit your review, I think it's good to think of that as the start of a conversation. You'll find that some people will submit a review and wash their hands of it — they'll say that they're done. So they'll send it over and say, okay, can you merge this please?
15:40
Merge this, I'm done. And that just really shuts down the conversation. It's much better to send over the code and say, hey, can you have a look at this and tell me what you think? You're basically inviting feedback and making that feedback loop much clearer. I've had plenty of pull requests and things where people just do not want the feedback at all, and it breaks the whole process,
16:00
and eventually you find it stalls and things don't really progress from there. One handy trick, when you're opening this pull request — and I'm going to keep saying pull request, but I just mean code review in general; I'm not specific to GitHub at all — is that I quite often like to go
16:21
and actually do a quick review of it myself. So I'll go and scan it on GitHub or on Gerrit, and I often find mistakes in my own patch at that point. I'm not sure why; I think maybe it's the change of context from an editor to the code review tool that puts me in the right mindset. So that's just a small tip which I recommend. And once your change is up,
16:43
you really need to try and relinquish ownership. This tweet was a response to the Twitter poll that I ran about how much time people spend doing code review, and to me it was just a sign of a really toxic culture. I'm not sure if it's the fault of the reviewers
17:00
or the authors in this case, or both — it's kind of hard to tell without more context. But as the author of a change, it's really important to try to think analytically and objectively about your change when you put it up. And I think this can be quite difficult for people sometimes: you feel quite protective of it, you're quite proud of it, you get kind of tied up
17:20
with the, hey, my code works — and then when somebody starts to give you feedback on it, it can feel quite upsetting. The way I like to think about it is: if you look at your code from six months ago, you'll probably find it horrific. So just try to think about the code you're writing today the way you will in six months. And so, really, I think reviewing code is pretty difficult.
17:44
So the whole thing is about trying to make it easier for the reviewers to review your code and give you feedback. Just try and make the whole process go a lot better, help them to help you. And then when we look at it from the reviewer's point of view, they're kind of like the gatekeepers
18:02
for this project, at least for code coming in. So they have a responsibility to make sure that the code coming in is good and suitable for the project, just as the authors have a responsibility to try to make
18:22
the best contributions they can for the project. And the key thing really is that you want to make sure you're sharing the responsibility. I personally see the reviewer or reviewers of a change as being as responsible for it as the author is. And I think this is where some tools like blame are a bit broken, because that will quite often
18:43
just get you the author of the change, depending on how the code was actually merged and submitted in the end. It would be interesting if we could integrate better with code review tools, so you'd run blame and be told: okay, these three people are behind this change, or these two people are behind this change. It's better to share the responsibility, as they say.
19:04
This one is definitely for open source maintainers. The title here is in reference to a quote — I don't know where it came from originally, and I'm getting my words mixed up, but it says something like:
19:21
contributions are like puppies. Everyone likes puppies, but you need to look after them, you need to walk them, and you need to feed them. Similarly with code: when you are given it, you need to make sure it's documented, you need to make sure it's tested, and you need to make sure it keeps working, forever. So as a reviewer or a project maintainer, you should really feel like you can
19:42
quite happily reject the change — just say no: sorry, I can't maintain this, I don't have the bandwidth to maintain this. There was a case on one of the projects I maintain, MkDocs, where I had a pull request, and the change came in, it looked reasonable,
20:01
it wasn't very big, but I said to the author, sorry, I don't think this fits in the project at this time. Their response boiled down to: all I'm asking you to do is just merge it. And I said, no, actually, you're asking me to make sure this is something that keeps working until we're able to remove it from the project — and that can be quite difficult in open source projects, because you never really know what people are using.
20:25
It's really important that everyone reviews. Sometimes people assume that it's better for just the seniors in a project to review, and the juniors just write the code — maybe you feel like they can't be held responsible for the reviews.
20:40
And this is actually technically true. There is a paper — I've lost the reference; I have a paper copy at home and couldn't find it online — but basically, they found that while senior reviewers did a better job of finding defects and giving feedback, the best way for juniors to level up and become more useful, to become seniors essentially, was to do code reviews.
21:01
So you need everyone to do it. And you've also got to remember there's a whole lot of other benefits, like the shared understanding of knowing what's going on. If your juniors don't know about this, then they're all going to stay juniors, I think. And the other thing is, if you ever have a split between who reviews and who doesn't,
21:20
you can find that it creates a divide between the two sides, and it can become more of a competition against each other. So just make sure everyone is included in the process. And one of the ways you can join in more with an open source project is to review as well as submit changes — maintainers will absolutely love you if you just review a few other open
21:41
pull requests or patches whenever you submit a change. So with everyone reviewing, it's really important to keep them all on the same page. It can be a very frustrating process for a contributor if you submit something, get one review, update it, and then get another review from someone else that maybe takes you back
22:01
to where you were originally. So just really try to make sure you've got a shared understanding of what is expected from your contributors. You can make this clearer with review guidelines, and you can also have things like review checklists that your reviewers all use and follow as a guide. But you've got to be careful with checklists, because you really need to make sure
22:21
that you avoid anything that can be automated. There are a number of things that you can automate in code reviews — by this I mean making sure that your code style passes whatever linting you want to use, and making sure that your tests are run. If you as a reviewer are having to do these steps manually, you're wasting your time,
22:40
and it's fairly easy to set these things up with something like Travis, which is free for open source projects, or with one of the tons of open source tools which you could run yourself as well. And one of the best things about automating this is that if you run, say, PyFlakes or PEP8 against the code and there's negative feedback, it's much better if it comes from Jenkins
23:02
or from an automated CI tool rather than from a human. If your reviewers are manually commenting on code style, then they're just wasting their time; it's far more nitpicky and just drives everyone crazy. And the nice thing about using an automated tool is you can actually have a shared hatred of that tool
23:21
for nitpicking all of you. But really, the automation is about removing the bike shed. For those kinds of discussions where people might debate code style: if it's not tested automatically, then ignore that comment, as far as I'm concerned. If you want a stricter style guide about something, then add a new flake8 extension,
23:40
a plugin, something like that. And if you're not familiar with what I mean by bike-shedding, I won't explain it right now — just go to bikeshed.com. Basically, people will debate the simplest details rather than focusing on the actual problem at hand. And I think someone's picking a color already.
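As a concrete sketch of that kind of automation — my example, not a config from the talk — a minimal Travis CI setup for a Python project that runs linting and tests on every pull request could look something like this:

```yaml
# .travis.yml -- a hypothetical minimal setup; adjust the Python
# versions and tool choices to match your project.
language: python
python:
  - "2.7"
  - "3.5"
install:
  - pip install flake8 pytest
  - pip install .
script:
  - flake8 .   # style nitpicks come from the bot, not a human
  - pytest     # run the test suite
```

With something like this in place, style feedback arrives as an automated check on the change itself, and reviewers can spend their time on the things that can't be automated.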
24:02
Yeah, so moving on — there's not much to say about this; I have a slide for it just because I think some people will assume it's one reviewer per change. But really, I think you should consider, or at least try, multiple reviewers. In OpenStack we have many reviewers — a minimum of two.
24:20
I find it really useful, and I think it really multiplies the benefit of some of those side-effect goals that you get from code review, about understanding what's going on. And one of the other things that's nice about multiple reviewers is — actually, that's a slightly sad point, I'll come back to it. When you're reviewing, it's important
24:41
to make sure you're feeling fresh; you really want to do frequent, short reviews. There's a paper, The Impact of Code Review Coverage and Code Review Participation on Software Quality, and they found that code reviews peaked at around 200 lines per hour.
25:00
A bunch of other papers seem to suggest about 400 lines of code an hour, so it seems to vary a bit — but it was lower than I expected. The important thing is just to make sure you're timeboxing this and not driving on when you're mentally fried, because you just won't be doing anything constructive. Personally, I find my best time for code review
25:21
is just after I've started drinking my strong morning coffee — I'm ready to go and I've not looked at my email yet, so I'm not very distracted — and I'll spend maybe an hour or so doing reviews then. When you're giving your feedback in a review, make sure that you are being constructive
25:41
and giving constructive criticism, and that you're also giving praise to people in reviews. It's very easy to get tied up with giving people negative feedback, pointing out all the problems. It's really nice if you can point out something that you learned in that code review, and I think it really helps make the process a lot more positive.
26:01
Interestingly, when I was practicing this talk a few weeks ago — sorry, I got distracted by an email notification — one of my coworkers actually said something on a review I'd submitted. He was like, hey, I learned something today: I didn't know you could have a class with just a docstring; you don't need the pass statement as well.
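For anyone who hasn't come across that tip, here's a quick illustration — my own example, not the code from that review:

```python
# A docstring is itself a statement, so it can be the entire body of a
# class (or function) -- no placeholder "pass" is needed.
class UploadError(Exception):
    """Raised when an upload to the server fails."""

# Without a docstring, an empty body does still need "pass":
class RetryError(Exception):
    pass
```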
26:21
And, I don't know, it made me feel quite good about it, so I was more encouraged to try and share more knowledge in the future. And this kind of follows a similar line: make sure you're being polite and aware of the tone when you're talking in code reviews. This can be tricky, and it's very easy,
26:41
if you're in a rush, to misword things. So in this example, if I said to somebody, oh hey, why didn't you do that? — that sounds reasonable, because you can hear my tone; I sound fairly relaxed. But written down, it reads much more like: why didn't you do that?! I'm not very good at a mean voice, but you get the idea.
27:03
And so you could alternatively say something like: could we do this instead? It's just about being a bit nicer in how you ask things, and it'll make some of the negative feedback easier to receive. And it's really important to make sure you're asking questions rather than telling. So, worse than either of those examples, you could say something like, you should do this,
27:21
and just tell them. But they might have had a good reason for doing it that way — maybe you just missed it — so you should always ask. And this one I hope is really obvious: you should never be harsh, and never get personal, in your code reviews. I think this is probably less of a problem in a corporate or work environment, because there tends to be a good social dynamic
27:41
between people, or at least a working social dynamic between two people, and there are always managers and so on to sort out any problems. But it's definitely more of a problem in open source communities. The Python Packaging Authority has a good code of conduct — for projects in general, basically — that covers code reviews, so I'd encourage you to use it. I found out about it because
28:00
I'd seen the Cookiecutter Python project using it. I do plan to add it to all my projects, but I've not quite got there yet. So yeah: writing code is hard, and reviewing is also hard. So you want to really try and help out the authors of the change — again, it's helping them help you.
28:21
And that kind of brings together the two different sides that I have in this conversation: it's about helping each other to come out with the best result, and working together. Automate what you can, to save time and reduce the burden of doing automatable tasks yourself. And be kind to yourself by timeboxing your code review — make sure you're restricting how long you spend doing it.
28:45
Something I missed from earlier, but I just remembered, so I'm just gonna say it now. One of the real benefits I find with doing reviews is that you can actually, you can learn a lot about a project. So quite often when I'm starting to work on a new open source project and I want to make some contributions for some reason, I'll actually start doing reviews
29:01
before I ever submit something. I'll be fairly careful about my feedback, because I don't really understand what's going on yet, but you can quite often make useful comments, and you start to see the benefits of that shared understanding and shared learning in the process. So a great way to become familiar with a project is to look at the reviews of what's happening and see what the current reviewers are saying
29:20
and what they're expecting from other people. And it's also incredibly useful if you can start reviewing like that as well. When I told people about this talk, a lot of them asked me whether I was going to do a comparison of different code review tools — like, which is the best one to use? Personally, I don't find that very interesting. If you want to look at screenshots
29:40
or read the documentation about these, that's fine. And really, I'm only qualified to talk about GitHub and Gerrit — I use them extensively; I know of the others, and I'm sure there's a whole bunch more I don't know about, but I've never really used them. The one thing I think you should really expect from your code review is to make sure you're reviewing before the merge.
30:01
So rather than, say, committing directly to master in a GitHub repo — or sorry, in a Git repo — make sure that you have somebody review before that happens. That could be in a pull request on GitHub, or in a Gerrit code review, for example. The reason for this is that the feedback loop is just too delayed if you try to review after it's been merged,
30:22
and the effort to actually take the feedback and make updates is just much harder. You'll find that people will do the review and say, oh, we should have done all this — but then eventually it gets forgotten about and no one does it. So you need to keep people motivated, and only let the code in after it's been updated for this feedback. But to quickly talk about the two that I do use:
30:44
GitHub, most people here I imagine probably know, so I'll say very little about it. It's got a very loose and undefined workflow; the labels are kind of useful for triaging and things, but it's actually quite hard to do any kind of detailed or complicated workflow in your code reviews. It's also got a simple UI which works quite nicely,
31:01
but some things would be very hard to do. For example, if you wanted to consider doing multiple reviewers, you'd have to have some kind of strange convention to make sure two people had looked at it, and it just wouldn't really work that well. But yeah, everyone has GitHub which is a great benefit, and it's also very easy to use and quite appealing as well. Gerrit on the other hand is pretty much the opposite to GitHub.
31:23
It's open source, it's kind of ugly, and it's hard to use, but you have very well-defined workflows and rules in it, and you can do much more — you can have multiple reviewers, for example. When people do a review in Gerrit, they'll explicitly give a vote. So you'll write your comments,
31:42
and then you'll pick minus two, minus one, plus one, or plus two — it's fairly obvious which are negative and positive. The nice thing about that is you can give some feedback like: oh, I spotted this typo, but it doesn't really matter, I'll still give you a positive review; but if you happen to be updating this
32:00
for some other reason, it'd be good to take it into account. So it allows you to give the small feedback, which is useful to have, without being a real nitpicker and blocking the review just because of that. One of the really interesting things about having all of this data attached to Gerrit is that there's a huge amount of information out there online,
32:22
and this is one of the things that I would quite like to dig into at some point — so this is me looking for like-minded people, I suppose. In OpenStack, for example, there are around 330,000 reviews in Gerrit, I think; I can't remember exactly, and the number is going up all the time, obviously. And you can pull down all this data.
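As a sketch of what pulling that data can look like — my own illustration, not something shown in the talk — Gerrit exposes review data over a REST API, so a few lines of Python are enough to start exploring. The endpoint below is the public OpenStack Gerrit as it was at the time, and note that Gerrit prefixes its JSON responses with a `)]}'` guard that has to be stripped before parsing:

```python
import json
import requests

# Ask a public Gerrit instance for a page of recently merged changes.
resp = requests.get(
    "https://review.openstack.org/changes/",
    params={"q": "status:merged", "n": 25},  # n = page size
)
# Gerrit prepends ")]}'" to JSON responses to defeat XSSI attacks;
# drop that first line before handing the rest to the JSON parser.
body = resp.text.split("\n", 1)[1]
for change in json.loads(body):
    print(change["_number"], change["subject"])
```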
32:41
It gives you like 20 gig of JSON or something, but I'm sure there's something we can learn in there, and maybe we can build some kind of feedback loop. And there's a whole bunch of other projects using Gerrit in the open — Android uses it, WordPress uses it. So maybe we could — I don't know, I'm sure there is something to be learned there — or you could just have fun looking at the data
33:00
and making pretty graphs. I think it would just be quite an interesting thing to do. And that basically wraps up my thoughts about code review. I'm happy to take questions. These are my contact details. And, sort of related: we're going to do an OpenStack open space tomorrow, so if anyone's interested in more about
33:20
the OpenStack code review workflow, they can come and ask there, or just grab me afterwards. Yeah, hello, thank you. One thing to note if you're looking at that: the O in dougal is a zero.
33:41
I should have picked the mono one for this one. Hi there. Great talk. I wanted to know,
34:01
as a beginner who's working on a project alone, what's a good way to find people to review my code? Yeah, that's, I mean, that is a very difficult problem. It's kind of the eternal problem of, say, an open source project, is how do you get more people to look at your code? How do you get more people to use your code? I had wondered if there was some way we could have a way of networking people
34:24
that are working as individuals, and we kind of have a loose agreement to try and review each other's code. That could be quite interesting, and it'd also get more people involved. But yeah, I don't really have a good solution there. All right, thank you.
34:44
Hi, I wanted to ask you: do you think that reviewing kind of meta issues — like rejecting merges because the commit messages are not clear enough, or the commits are not squashed — is being too picky,
35:01
or does it make sense in your environment? I think you just need to establish some clear guidelines in your project. In most of my side projects, I wouldn't be too picky about commit messages. But in OpenStack — and this is one of the nice things about Gerrit — your commit message is actually a file that comes up for review. So people
35:21
review commit messages as well, and make sure that they are useful. I think there's definitely some value in it, but it depends how good everyone is at keeping their Git history nice and clean. Hi, thanks for the talk. I also happen to work on OpenStack,
35:41
so I was wondering — maybe it's just me who misses things in the review comments. For instance, there's a large patch set that's in progress for a review, and there's interesting discussion, deep analysis. There's one interesting comment on patch set 25 that's a long paragraph, and then it's just buried in there.
36:00
On a mailing-list-based patch workflow, you can link to a specific comment. However, in Gerrit it's just buried and sitting in there, and six months later, if you want to refer to it, you have to link to the whole review, and then there's a whole bunch of 25 CI jobs
36:22
listing their status, so it's just very messy. How do you handle that? I mean, this is a specific problem with Gerrit, I think. So for people that don't know: when you submit a change to Gerrit, someone does a review, and then when you submit another revision, the comments they added on the first revision are hidden. So sometimes you might actually do another revision
36:41
before you see the comments, so you need to actually go and look for them, or spot the email notification or something. Yeah, I think it's just a problem with Gerrit's UI — it needs to expose this better. And actually, I found out recently that you can search for all your draft comments that were never submitted, and I've got a whole bunch of review comments which I thought I had submitted that are just sitting there.
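(If you want to check for the same problem yourself, the search being described should be expressible as a Gerrit change query along the lines below — an assumption on my part, as the exact operator depends on your Gerrit version, so check its search documentation:

```
status:open has:draft
```

That should list open changes on which you still have unsent draft comments.)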
37:00
I found this last week, so that was a bit of a nightmare. I was like, oh, that's why they didn't listen to that. Yeah. Thanks. It's more of a whine, I guess. Yeah. Okay, any other questions?
37:25
So I was just wondering what your thoughts are on code review versus pair programming. Yeah, so this is quite an interesting one. I work remotely for Red Hat, which means pair programming is a slightly different thing,
37:41
in that we sometimes do it sharing a screen terminal and maybe a voice chat at the same time, and that can work quite well — but it means I've not done a whole lot of pair programming, essentially. I think you can basically get the same result out of the process. But in OpenStack, at least, we still need to have other reviewers
38:00
just for the simple fact that the change won't be merged until you have enough reviewers, so if you pair program it, it doesn't necessarily help you out in OpenStack, but I can see that in some situations you could just replace code review with that. I mean, if there's only two people working on a project, then there's not much point doing a formal code review if you're doing pair programming. I think the result could be the same.
38:22
Yeah, it'd be interesting to know if anyone's got experience comparing the two, which one they find more effective.
38:43
Thanks. Yeah, just a short response to your question, because where I work we have an assignee for code review each week. So before we start working on a new feature, for example, we come together and discuss the implementation,
39:02
so you avoid getting feedback after three days of work saying, yeah, I don't agree with your implementation or your choices. So I found it to be really useful to work on a feature together in the way
39:20
that you plan it together and discuss it before you actually submit your pull request. Yeah, no, that works very well. I think, again, this is maybe an artifact of the fact that I'm a remote worker, but what I quite often do is put my code up for review very early on — in Gerrit, you can mark it as work in progress — and that allows people to have a rough look at it
39:42
from a high level, but not do a proper review, and then you can maybe get some feedback a bit earlier, which is another good approach to avoid wasting, say, five days or three days of development. But yeah, if you're doing scrum meetings, that's always a good time to bring up these kinds of questions as well. Anyone else?
40:06
Thanks for the talk. Did you maybe try to implement trunk-based development, and did you have experience with that, and code review? So, trunk-based development is just when
40:22
every change just goes straight into the trunk. Yeah, everything goes to master — small changes all go straight in. That's essentially what happens in OpenStack: every change goes directly into the master branch, unless you are backporting to a previous release, something like that. So yeah, I guess we do do that, and it seems to work pretty well. I've never really been a fan of long-running branches.
40:41
I find that they cause more problems — and they're the alternative, I guess. I don't know; I'm not sure if I'm missing something else about trunk-based development. Okay, and what size are your average commits, like in lines of code? That's a good question. If I looked at the data in OpenStack, I could tell you.
41:04
I guess I'd like them to be a couple hundred lines at most. Occasionally, if you're moving things around, you can end up with really big changes, but that tends to be a bit easier to review. Yeah, I'm not sure. I would have to look at that and see. Thanks.
41:35
Hi, thank you for the talk. I wonder if you have some experiences or ideas
41:41
or comments about reviewing things other than code. Like, I don't know, in OpenStack we have blueprints and specifications, and sometimes code style changes and so on. Yeah, so, I think it can be good,
42:01
but you've got to be really careful, and this is a problem I think is happening in OpenStack. In some projects, when you want to make a change, there's basically a repository with just reStructuredText files, and you submit a document describing what you want to do. The feedback process on that tends to take a really long time, because people really start to
42:20
go into far too much detail on the specs. So I think that tends not to work so well, for some reason. Documentation, I guess, is a bit more like code, and that seems to be fine, because you're actually documenting what something does; whereas when you're reviewing a spec, a lot of people are trying to interject how they think it should work, and it almost gets to the point
42:40
where people are debating the implementation, and to satisfy the code reviews that you're getting, you almost need to write out pseudocode for everything you want to do, and it just becomes this big, duplicated effort. I think that's why we've seen a lot of OpenStack projects move away from this really detailed spec process, or at least adopt a lighter alternative for smaller features.
43:04
Hi, great talk, by the way. I just wanted to ask you something. You stated — and I totally agree with you — that code review is not just a one-off ping-pong thing, but a dialogue, a continuous dialogue. But where do you draw the line?
43:22
So how far do you go with this dialogue in order to either reject a pull request or merge the pull request? Sure, so I guess it depends on the situation, and I mean, you have to really define it based on your project. If you're a sole maintainer of the project, then you can simply think, do I want this change?
43:40
Do I want to maintain this change? And then draw the line there. But if you're working in a company, then you're probably thinking more along the lines of: what does the business need? Or in a large community, you've got to put on your community hat and think: what does the community want? Does this match the community's expectations? It might not match exactly what you want from a feature, but the community matters more there.
44:00
So there's not really a simple way to do it; you've just got to make sure you're taking into account the group that you are working within. Okay, thank you, Dougal. So that's lunch. Don't forget about the massages.