Agile Metrics - Velocity is NOT the Goal

Formal Metadata

Title
Agile Metrics - Velocity is NOT the Goal
Number of Parts
170
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify, reproduce, distribute, and make publicly available the work or its content, in unmodified or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or its content, including in modified form, only under the terms of this license.

Content Metadata

Abstract
Synopsis: Velocity is one of the most common metrics used, and one of the most commonly misused, on agile projects. Velocity is simply a measurement of speed in a given direction: the rate at which a team is delivering toward a product release. As with a vehicle en route to a particular destination, increasing the speed may appear to ensure a timely arrival. However, that assumption is dangerous because it ignores the risks that come with higher speeds. And while it's easy to increase a vehicle's speed, where exactly is the accelerator on a software team? Michael "Doc" Norton walks us through the Hawthorne Effect and Goodhart's Law to explain why setting goals for velocity can actually hurt a project's chances. Take a look at what can negatively impact velocity, ways to stabilize fluctuating velocity, and methods to improve velocity without the risks. Leave with a toolkit of additional metrics that, coupled with velocity, give a better view of the project's overall health.
Transcript: English (auto-generated)
So we got a few more people coming in, but we'll go ahead and get started. It'll give us some more time for question and answer at the end. So this session is Agile Metrics, Velocity is Not the Goal, which should be evident by the large text on the screen.
I am Doc Norton. Given name is Michael, but Doc is my nickname, and pretty much everybody calls me that. I am Global Director of Engineering Culture for Groupon. And prior to that, I did an awful lot of Agile coaching, going into organizations and helping them make the transition towards Agile.
And this is based on a lot of that experience and things that I learned there. So we start off with something fairly simple. I've got a question for you. I'm hoping somebody in here will actually answer it.
What's velocity? In an Agile context, what is, in a software development context, what is velocity? Anybody? No one knows. Beg your pardon? Method?
It's a metric, sure, it is a metric. It is a measure of work units over time. So for every team, it doesn't matter if you measure your throughput just by saying,
hey, we've got X number of stories that we get done in an iteration, or we've got X number of points that we did in an iteration, or we did X number of ideal days, or work hours, or frogs and crickets. It doesn't matter what that unit is, right? How many of those things did we get done in that iteration, the one weeks, two weeks, four weeks,
however long your iterations are? But it is also, and I think this is very important, it's also a lagging indicator. Does anybody know what a lagging indicator is? No? So basically, you do a thing, you take a measurement,
that measurement tells you the results of the thing that you did in the past, right? So this is an indicator that comes out after a series of events happens. So maybe some other examples of lagging indicators.
Well, this gives us an example of one. The unemployment rate is a lagging indicator, right? If the unemployment rate is rising, it means that our economy has been in decline, right? So first, the economy's in decline, then we see unemployment being affected, right?
Lagging indicators are good for detecting long-term trends, but they're actually not good for predicting into the future. What else is velocity? What's also the measure of a complex system, right?
It doesn't matter what your process is, if it's XP or if it's Scrum or if it's Crystal or if it's AgileUP or if it's some other, you know, ad hoc variation thereof, you have an awful lot of factors that go into determining what the velocity is. You've got a backlog of stories
of various size and complexity. You've got different members of the team. You've got dependencies in other groups in the organization. You have all kinds of things and all kinds of steps that you have to go through before you actually get that number at the end of the process, right?
So by way of analogy, something else that is a lagging indicator of a complex system is your body weight, right? We measure our weight, maybe you do it daily, maybe you do it weekly, maybe you do it never,
but it's basically an outcome of things that we did previously. So what are some things that could affect body weight? Calories, so how much food you consume, right?
Calories also in terms of how much energy you expend. What else? The amount of exercise that you get, height, genetics, right? Yeah, your existing physical health,
the environment that you're in, where you work, the amount of stress that you're under, your social network. Your social network can actually influence your body weight. If the people that you hang out with tend to be thinner, you tend to be thinner. And it's not necessarily because you choose
to hang out with people that are like you, you actually become like the people that you hang out with. Does any given body weight mean that you're actually healthy? Not necessarily, right? You can be very svelte and you can have extremely high blood pressure.
You could be a diabetic, you could be at risk for kidney failure or any other number of things, right? You can also be someone who carries a little additional weight and actually have phenomenal respiratory system and actually be in very good overall physical health.
So our weight is just one thing that indicates our general health. So by this analogy, does any given velocity mean that a team is healthy or not healthy, that a team is doing well or not doing well?
And the answer to that is no. Now if I, well we'll get to this a little bit later. So I wanna talk about velocity a little bit and some of the flaws that are kind of baked into the way that we use velocity.
So the first thing we're gonna do is a tale of two velocities. So we have A and B, these two separate velocities, right? And before I get into this too much, a small disclaimer. I'm not comparing team A's velocity
to team B's velocity. I would never advocate that. The factors that go into determining the size of a story and everything else are complex enough and different enough from team to team that comparing one team to another, just you're not comparing the same things, right?
So let's imagine that for this use case, this is the same team in two different dimensions, A and B, all right? So let's look at their velocity in dimension A.
Here's the pattern, 10, 11, nine, 10. And in dimension B, here's the pattern, seven, 14, six, 10. So in four iterations, this is how many points they got done per iteration. So how would we determine the velocity
for the next iteration? Any ideas? How do you guys do it at your work? So is everyone here, do you practice agile? Can you just raise your hand if you practice agile?
Okay, can you raise your hand if you use velocity? Okay, can you raise your hand if you will absolutely refuse to answer a question if I ask it to the audience? Okay, that's fair. So for those of you that said yes, I do use agile
and yes, we do have velocity and did not raise your hand for I refuse to answer any questions, can one of you tell me how would your team figure out what the next iterations of velocity is gonna be for either one of these? What's a common practice?
Okay, average for the past three iterations. That is a very common technique, right? A more simple technique is what we call yesterday's weather. And in yesterday's weather, all we do is we look at what was the velocity in the prior iteration and we assume that tomorrow, next week, next iteration
is gonna look very much the same. I believe that this was probably a technique that was invented somewhere around Palo Alto, California where every single day's weather is exactly the same and so yesterday's weather actually makes sense
as opposed to Cleveland, Ohio where I grew up or here where clearly it can be 80 and sunny one day and 55 and rainy the very next. But still, a common technique is to use yesterday's weather and say all right, well if we got 10 points in our last iteration, we'll probably get 10 in the next and we're gonna plan our work based on that.
So if we do that for these two teams, what happens? Well for both of them, the next iteration using yesterday's weather is going to be 10 points. So you mentioned average of the last three iterations, so a rolling average, right? If you use tools like Pivotal Tracker,
Pivotal Tracker does this for you and it actually averages the last three iterations and says gee, I think your next velocity's gonna be X. So looking again at this team and these two varying dimensions, if we use a rolling average of the last three iterations, what's the next velocity that we're gonna get?
Still 10. Now looking at the team in dimension A and looking at the team in dimension B, do you have different levels of confidence
in our estimate of 10, right? Mathematically, yeah. These are all very tight within the range of 10. These are all over the place, right? My confidence here is much higher than my confidence here.
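As a sketch, the two forecasting rules just described, yesterday's weather and a rolling average of the last three iterations, can be written in a few lines. The iteration histories here are the ones from the slides:

```python
# Velocity histories for the same team in two "dimensions" (from the talk).
dimension_a = [10, 11, 9, 10]
dimension_b = [7, 14, 6, 10]

def yesterdays_weather(history):
    """Forecast the next iteration as a copy of the last one."""
    return history[-1]

def rolling_average(history, window=3):
    """Forecast the next iteration as the mean of the last `window` iterations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

print(yesterdays_weather(dimension_a))  # 10
print(yesterdays_weather(dimension_b))  # 10
print(rolling_average(dimension_a))     # 10.0 (mean of 11, 9, 10)
print(rolling_average(dimension_b))     # 10.0 (mean of 14, 6, 10)
```

Both rules predict 10 for both histories, which is exactly the point of the example: a single-number forecast hides how different the two teams' track records are.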
So how might we represent that? What I often do is I take a standard deviation across the entire body of knowledge that I have about this team's velocity history and I apply that to my estimate for the next iteration.
So we look at these, I'm not gonna go into how standard deviation is calculated. I used to do that in this talk and it basically was, it's five minutes to explain it and the truth of the matter is you take a series of numbers, you put them in cells,
you highlight them and you say hey Excel, give me a standard deviation and it gives you the standard deviation. But if we do this for these two teams, what we see is the standard deviation for this team in dimension A is 0.7, the standard deviation for the team in dimension B is 3.1. Now, whether I use rolling average or not
or I use yesterday's weather, I recommend the rolling average. I just simply add and subtract the deviation and now I get a range. So now rather than saying yeah, you know what, we think we can get 10 points in the next iteration, we actually are saying hey, we think that it's somewhere between this and this
and as we project that out into the future, instead of saying yeah, you know what, this is gonna be done on exactly November 7th at 4:30 in the afternoon according to our current velocity, we're now saying it's gonna be done somewhere between May and never
or whatever that range is. So we're being a lot more honest about where do we think we're gonna land with this thing, when do we think that we're gonna be able to get it done. Unfortunately, management usually wants us to go faster.
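The range estimate described above can be sketched in a few lines; using the population standard deviation reproduces the 0.7 and 3.1 from the slides:

```python
from statistics import pstdev  # population standard deviation

dimension_a = [10, 11, 9, 10]
dimension_b = [7, 14, 6, 10]

def forecast_range(history, window=3):
    """Rolling-average forecast plus/minus one standard deviation
    taken over the team's entire velocity history."""
    recent = history[-window:]
    estimate = sum(recent) / len(recent)
    spread = pstdev(history)
    return estimate - spread, estimate + spread

low_a, high_a = forecast_range(dimension_a)  # roughly 9.3 to 10.7
low_b, high_b = forecast_range(dimension_b)  # roughly 6.9 to 13.1
```

The same point forecast of 10 becomes a narrow band for the steady team and a wide one for the erratic team, which is a more honest thing to report upward.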
They don't like the range and they want us to go faster. In some cases, they bought the agile in a box. They would like that to be installed now, thank you and when they read the three page intro to agile manual, it said your team will go faster and they closed the book and they set it down.
So they want you to go faster and it doesn't matter what their management style is, it doesn't matter if you need to work on Saturday, I'm gonna need you to come in on Sunday,
we're gonna have to hit those velocity numbers or if they're super chipper about the fact that we've got a great opportunity to get some more velocity here. It doesn't matter how it's done, whether it's done with a coercive intent or with just the intent of, you know, inspiring the team, in either case, in any case,
if we set a target, we invoke the Hawthorne effect. Is anyone here familiar with the Hawthorne effect? Couple of folks, okay, all right, cool. So, I'll tell the story of it, how it came to be.
So, Hawthorne Lighting Company, I believe out of Chicago, Illinois, they wanted to do a study to determine the effect of ambient lighting on workers in manufacturing environments. And by the way, this is actually their office
for the Hawthorne Lighting Company. I am all for open office space, this is not what I'm thinking when I say that. So, they wanted to run an official experiment. So, what they did was, they brought some folks in and they talked to the employees and they said, listen, here's what we're gonna do.
We wanna figure out the effects of ambient lighting on our people in terms of productivity. So, we've already measured a baseline, we know what your productivity currently is, and I'll be honest with you, I don't know what that measure was, I don't know if it was, you know, forms filled per hour or what,
but they had a baseline for productivity. So, we're gonna be making adjustments to the lighting on the work floor and we're just gonna measure and see what happens. So, they went ahead and they increased the lighting. So, they made the room much brighter. And they took their measurements
and what do you suppose happened? Productivity went up and they said, well, this is great news. Not only have we cracked the code on how to make our people more productive, but it turns out that the solution
also leads to more business for us. You need a brighter workspace to have more productive employees. And just as they were about to have the celebration, someone said, hey, hey, hey, this is supposed to be scientific and stuff, right? Shouldn't we like try the other thing?
Okay, all right. Yeah, yeah, yeah, you're right. So, they went ahead and they lowered the lighting. They lowered it back below what was the original standard. If when the lighting goes up, productivity goes up, what do you think happened when the lighting went down?
Productivity went up, huh. Yeah, something's wrong. So, they scrapped the experiment and they set the lighting back to the normal level.
And what do you think happened? Productivity went up. Now, over a few months, it settled back down to what was kind of the normal levels, right? So, what's the story? What does the story have to do with velocity? What's the lesson that we can actually draw from this?
That's right. My advancer just broke.
That which is measured will improve, right? And the reality of it is that which is reported on will improve, but the caveat there is at a cost. And the challenge is that that cost is often hidden, right?
So, we told the workers that we were going to be measuring their productivity and we told the workers that we expected productivity to change based on the lighting and lo and behold, it did. But it went up three times. Now, either everybody in the organization
was absolutely sandbagging before the experiment started or there were compromises that took place in order to achieve that end goal. Maybe they were skipping lunch. Maybe they were coming in early. Maybe they were leaving late. Or maybe the quality of the product was slowly degrading,
but we weren't measuring that. We were only measuring output, right? So, when we report on a metric like velocity, just that alone, especially if we set targets for it or we set expectations for it, we automatically invoke the Hawthorne effect. Now, this particular study was actually debunked.
As you can imagine, I kind of tell the story in a way that makes it clear that they weren't really sure what they were doing. But many similar studies have been done since. The Hawthorne effect ended up, the name stuck, but it's actually other studies that have been done since then that have shown that yes, this actually does occur.
Something else that happens when we set a target. We invoke Goodhart's law. Is this changing? This is not changing. There we go. We invoke Goodhart's law. So, is anyone familiar with Goodhart's law? All right, I didn't expect as much, right?
Hawthorne effect is something that I know that people have heard of. So, Charles Goodhart was chief economic advisor for the Bank of England in the 70s. And he was working on some white papers, actually doing some analysis and critique of not only the financial standards and policies
of the bank itself, but was also looking at the impact of some government regulations that were happening. And he warned as they were looking to create targets for certain areas of the bank in order
to drive additional business, he warned them that any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
So, what does that mean, right? Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes. Loosely translated into layman's terms, in other words, after I think about this for quite a bit, I'm pretty sure it means when a measure becomes a target,
it ceases to be a good measure. The moment that you take something, a trailing indicator like your body weight or velocity or unemployment, and you set a target for it, you negate the metric.
The metric was information, and now it's something else. Now it's a target that you're going after, right? And because you've done that, the target itself is meaningless. If the measure is meaningless, the target can't mean anything either.
So, does that sound right? I see a lot of people kind of squinching their faces, right? That seems a little odd, right? Well, wait a minute. But it was still a measure. It was still good. So, let's talk about what that kind of looks like. These are perverse incentives. A perverse incentive is basically
when you set a target for things, you get an outcome that you did not expect that is actually counter to the intent of the target in the first place. So here, we see that average call
time is extremely important. And we see how we are improving average call time. Is that really the intent of the metric to completely tank our customer service in order to achieve better call times?
No, right? So it's an unintended result contrary to interest of the incentive makers. So this is just the general problem, right? When we take something like velocity, first of all, it's a trailing indicator of a complex system. It's difficult to actually project into the future with it.
And if we set targets, we actually devalue it. And we create behaviors that we actually don't want. So if we look at this in our field, aside from just velocity, what are some other things in our industry
that create perverse incentives? Additional examples. How about rewards for number of bugs found? If we pay people, if we pay our QAs bonuses for finding additional bugs, has anybody been in an environment where this has happened? What happens?
The bug count goes up. So that's a good thing. We wanted the bug count to go up. But if you dig into it, what's actually happening is these bugs are now nuanced. The same bug is reported three different times with slightly different language and with slightly different requirements.
We start getting into, now we're classifying things that were clearly new feature requests as bugs. Why? Because I get paid more for bugs. I don't get paid more for feature requests. My opinion, my personal opinion about how things should work starts to bleed into the system, not
our collective agreement. Because again, if I can classify it as a bug and my bug counts go up, I get paid more. Are we helping the system get better? We're not. Rewards for code coverage. And obviously, rewards for higher velocity. We end up with more brittle code, lower test coverage,
more bugs, et cetera. So if we've got all of these different challenges, what do we do? Well, Deming tells us it doesn't matter. The quantitative goal doesn't matter. What matters is addressing the method by which our goals are
being achieved. If we want to increase velocity or stabilize velocity or any impact velocity in any way, don't look at it. Look at root causes.
What is making these things happen in our environment? So let's take a look at variable velocity to begin with. So you're on a team, and in one iteration, it's 4 points.
The next, it's 10. And then it's 20, and then it's whatever. So our team in dimension B had a pretty wildly variable velocity. What are some causes of variable velocity?
Complexity. So just how difficult it is to actually deliver a feature. And yes, right. So how difficult it is to actually estimate
either work effort or duration or whatever you actually estimate in. What are other things that could actually affect velocity? Changing team. Yeah, yeah. Changing the team size. Technical debt. So yes. So the fact that we were moving at a pretty good clip
and now we've built up this debt, it's getting harder and harder and harder to actually stay at that clip. Did you have your hand up? No? What's that? Competence. Yeah, yeah. How good are the different individuals on the team?
And the environment that you're actually in. Yeah, yeah. So all of these things can affect velocity, right? So I think we covered some of these, right? So time poorly spent can affect velocity. In meetings, about meetings. The time that you need to actually spend getting
your environment to work again so that you can actually do the thing that you need to do. The time that it takes to actually get the QA environment functioning so that you can run the test suite. That's all time poorly spent. That's all waste that, depending on what we're doing in that iteration, can impact the velocity. Dependency on other teams can impact our velocity.
So maybe we're doing great. In the last couple of iterations, we were delivering stuff that was basically all within our control. But in this iteration, man, we need to interface with the API team. We actually need that schema change done in the database. And those two teams, they're still doing that waterfall or whatever thing
it is that they do. And it's not on their roadmap right now. And they're not going to hit the deadline that we need. Therefore, we're not going to hit our deadline. And therefore, this work's not going to get done in this iteration. And it might not even get done in the next iteration. We talked about complexity and poor story composition. So we've got these stories that are way too big.
We've got these stories that are way too small. And it's very difficult for us to get consistency when the team is basically working on tiny little things, tiny little things, tiny little things. And then all of a sudden, this great big gluttonous feature comes moving through the team.
Too much work in progress. They're asking us to just get too much done. And our solution to that is to try to do all the things. I see this very, very often on, and I have nothing
against Scrum, but I see this very often on Scrum teams, especially early adopters. Because here's what happens, especially if you're following the original Schwaber or Beedle rhetoric from the original books. There's this concept of commitment, that you are going to commit to getting something
done in this iteration. And it's actually a fairly healthy idea, but it's not often healthily implemented. And so it ends up being a target. And so we sit down and we say, hey, there's three of us on this team. We've got 12 points worth of work to get done in the next week. I'll take four.
You take four. She'll take four. Let's rock and roll. And so I have two or three stories that amass to four points. And she's got the same thing, and he's got the same thing. And we all begin all the things at once. I need to get all these done. I start this one and, well, I'm not sure.
I set that aside and I start the next one. But you know what? This is the one that I actually need the schema change for. I don't know if the database team is ready. I'll shoot them an email. And now, OK, that can sit. So I guess I'll start this third one then. And all right, coming the next day, I'm going to go back to the first.
So I'm trying to do all my things. And she's trying to do all her things. And he's trying to do all his things. When we get to the end of the iteration, all the things are 80% done. 80% is zero. Nothing got completed. So often, because we have this concept of commitment, we try to do too much.
So I will do things with teams. I'm going to walk through a couple of things that I do. So one of the things that I'll do with teams is we'll actually try and assess
if a behavior is producing the outcome that we desire. So I'll use scatter diagrams to do this. The first thing I'm going to do, this one is just auto prices by year. So basically, sale price and age of vehicle, up to eight years.
As you look at this, do you see any kind of a correlation between sale price and age of vehicle? Yeah. There's a pretty clear negative correlation here.
As the age goes up, the price goes down. The older the vehicle gets, the lower the price is. So that's just to establish how we read these things. So this is velocity by complexity. So what we have is the points actually
able to be achieved by the team and the complexity of the code measured in terms of cyclomatic complexity, which cyclomatic complexity is, for all intents and purposes, it's the number of logic branches in your code. And if you want to know more about that,
we can talk about that off stage after this. But effectively, cyclomatic complexity indicates how easy it is to maintain this code. And we can see another correlation.
The more complex the code, the harder it is to maintain, the less the team is able to deliver. So we can actually have the discussion, then, about, hey, you know what? If we reduce the cyclomatic complexity of our code, while there's an upfront investment, we should start to see an improvement in our velocity
over time. And we can actually measure that and say, yes, that is true. It is actually happening. And if it's not happening, then we can stop with this effort. This one, I think, is interesting.
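As an aside, cyclomatic complexity as described above (roughly, the number of logic branches plus one) can be approximated with a small AST walk. This is a rough sketch, not a full implementation of McCabe's metric; a real tool such as radon handles many more cases:

```python
import ast

# Rough sketch: count branch points (if/for/while/except/boolean ops)
# in a piece of code and add 1, approximating McCabe's cyclomatic
# complexity. `elif` clauses appear as nested ast.If nodes, so they
# are counted too.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source):
    tree = ast.parse(source)
    branches = sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))
    return branches + 1

simple = "def f(x):\n    return x + 1\n"
branchy = (
    "def grade(score):\n"
    "    if score >= 90:\n"
    "        return 'A'\n"
    "    elif score >= 80:\n"
    "        return 'B'\n"
    "    return 'C'\n"
)
print(cyclomatic_complexity(simple))   # 1
print(cyclomatic_complexity(branchy))  # 3
```

Higher numbers mean more independent paths through the code, and so more to understand and more to test when changing it.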
So this is for a team that I worked with. This is velocity by value. Value being, in this case, what the product owner did was said, hey, this feature, if we get it out to production, is going to net us x thousand dollars per month or whatever it was. And so we had an actual quantitative way of saying,
yes, this feature is worth this much. I've also seen teams where the value of a story is done in one, two, three, four, five. Five is really high value. One is really low value. And it's ordinal, but not cardinal. But in this case, it was actual dollar amounts.
And so here's velocity. And here's value. This is the value getting higher. This is the velocity getting higher. Is there a correlation here? There is not. Should there be? So I get that a lot. No, there shouldn't be.
I think there should. Setting aside that there are always technical stories that may not have a business value, for the most part, when we're looking at agile processes, one of the things that we're doing is saying, OK, which of these stories is the highest value to the organization?
And we are sequencing our work based on that. If we are sequencing our work based on that, it should be when we get more done, we've delivered more value. If we get less done, we've delivered less value. There should be some kind of a correlation between these two things if we're sequencing that way.
And this team was sequencing that way, or at least theoretically was. And so seeing this allowed us to actually engage in a conversation about why is this actually happening? What's going on here? And it turned out that while the product owner was putting things in value order, the team wasn't necessarily executing in value order.
And this exposed that and allowed us to have a conversation about it. All right, this is velocity by code coverage. Velocity, coverage, right? So the higher the coverage and higher velocity.
So there's a correlation here, an actual positive correlation: the better the code coverage, the higher the velocity. Any theories why that would be? It almost seems counterintuitive, right? Confidence? Yeah, yeah. So the argument is that we don't have time to test. We don't have time for the developers
to write automated tests because that takes too long and we've got to get this stuff done. But when we actually tracked it over time, this is the relationship that we found, at least for this team. What happened was they were able to get much faster feedback
on what they were building. They had a much higher confidence level. They were able to actually move at a faster pace. So while in the beginning the testing seemed to be slower, over the course of time, and it wasn't an extended period of time, we could clearly see that there was a correlation between coverage and velocity.
So these are things that we can track to help teams get an idea of how their behavior actually impacts velocity, right? So looking at saying, we want to go faster, let's stop testing, but when we actually start measuring it, we discover, gee, if we test, we go faster.
Caveat to all of this, Friedman's thermostat. Is anybody familiar with Friedman's thermostat? What is this? All right, so, Milton Friedman, another economic advisor, this time to President Ronald Reagan.
He had basically written a paper warning about making false assumptions based on good evidence, right? And so he told a story, and the story goes something like this.
So I want you to imagine that you have no prior knowledge of a home's heating and cooling system, right? You've never seen anything like this before. You've lived in stone huts, and whatever the temperature was outside,
that was the temperature inside, give or take, right? But you've heard about this village that's just on the other side of the hill, and they have this weird contraption that they've put into their homes, and they claim that it makes the home more comfortable.
It's got this little knob on the wall. You adjust the knob. The home gets warmer or it gets cooler inside, and there's something to do where it burns a bunch of fuel in order to make this happen.
So your job is to go and evaluate this and decide if we should put these things in our homes. So you go and you gather the data. Remember, you've never seen this thing before in your life. As you look at that data, what is the logical conclusion?
Should you put these in your homes or should you not? Well, based on the data, when the outside temperature increases, the temperature of the home stays exactly the same. No change.
When the outside temperature decreases, the temperature of the home stays exactly the same. Again, no change. So the external temperature seems to have no impact on the internal temperature. But what you did find was the more fuel that was burned
the more variability there was in the external temperature. So clearly you shouldn't put these things in your home. You're just gonna burn fuel and make it hot or cold outside.
So the lesson there simply is that correlation is not causation, but it sure is a hint. It's easy for us, if we don't know enough about the system, to draw false conclusions from true data. And that was Friedman's warning in his thermostat allegory. So I want to get into something that I think is actually of really high value for teams. So if you don't do scatter diagrams for your teams, I totally understand, I get it. I think they're helpful.
I think it allows us to have conversations about what's the reality, right? We can kind of take some mysticism out of is testing slow us down or speed us up? Should we pair or not? All of these types of things. But let me show you a tool that I think is really helpful for teams. Cumulative flow diagram. How many of you have actually seen these? All right, I've given this talk over the years,
more and more people have seen them. So to kind of set the stage here, we've got a backlog. The stories that are in this backlog are completely irrelevant. I don't even know what they are anymore. What I want you to note is that we have one, two, three, four, five stages for this particular team. They've determined these are the stages of their work.
There's stuff that is ready to begin. There's stuff that is in progress. There's stuff that is in testing. It's ready for approval from the business. And then there's stuff that's actually been deployed. So this is kind of the steps that they go through to get for their work. And all we do is at the end of every stand up,
I don't know, scrum meeting, huddle, whatever you call it. If you've got some daily meeting, and if you don't, fine, set a calendar reminder. Just count the number of items in each row or sum the number of points or whatever it is that you do that determines your velocity, do that same thing for each row in your flow.
Throw that into a spreadsheet and have it do a stacked area graph. And that stacked area graph will look something like this without the arrows all over it. Let's walk through this a little bit. I'm gonna actually zoom in on this.
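The daily counting step might look like this as a script. The board snapshot and story IDs here are invented; the stage names follow the lanes described above:

```python
import csv
from collections import Counter

# Hypothetical board snapshot: story id -> current stage.
board = {
    "story-1": "Deployed", "story-2": "Deployed",
    "story-3": "Ready for approval",
    "story-4": "In testing", "story-5": "In testing",
    "story-6": "In progress",
    "story-7": "Ready to begin", "story-8": "Ready to begin", "story-9": "Ready to begin",
}
stages = ["Ready to begin", "In progress", "In testing", "Ready for approval", "Deployed"]

# After each stand-up, count the items sitting in each stage...
today = Counter(board.values())
row = [today.get(stage, 0) for stage in stages]

# ...and append the row to a CSV. A stacked-area chart over the accumulated
# rows in any spreadsheet is the cumulative flow diagram.
with open("cfd.csv", "a", newline="") as f:
    csv.writer(f).writerow(row)
```

If the team sizes stories in points, sum the points per stage instead of counting items; the chart reads the same way either way.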
So what we're looking at here, this is work that's actually deployed. This is the work that's for review from our product owner. This is the stuff that's in testing. This is the stuff that's in development. And this is the stuff that's actually ready to be worked on. So that's our columns that were in our graph.
This tells us all kinds of stuff. One, I can see things like the change in scope for the overall project. Much different than your standard kind of project burndown chart. Standard project burndown chart, when you add more scope to it, the whole chart rises up and the end date just kind of floats off
a little further into the distance. But we can't actually see that change in scope. Here, we can clearly see changes in scope and when they happened. I can also see how much work is current for each of these areas of my flow.
I can see how quickly things are getting done. I can see how much work is remaining to be done at any given point. I can see the amount of work in process. How many things have been started but are not yet in production? I can see how big the backlog is. Is it getting bigger, is it getting smaller? I can see the cycle time.
The cycle time being from the moment that the team picks it up to the moment that it is actually ready for review. This is the amount of time it takes for the team to execute the work. I can also see the lead time. The lead time is the amount of time from hey, wouldn't it be nice if
to hey, ain't that nice. This is from the moment that we had the idea to the moment that the idea is actually in production. And cycle time versus lead time actually becomes, can be, very important for teams. All right, so we kind of get the ideas of what's in this thing.
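In script form, the two measurements differ only in which timestamp you start from. The dates and field names here are hypothetical, just to show the subtraction:

```python
from datetime import date

# Hypothetical timestamps for one story; field names are an assumption.
story = {
    "idea_logged": date(2014, 5, 1),   # "wouldn't it be nice if"
    "work_started": date(2014, 6, 2),  # the team picks the story up
    "deployed": date(2014, 6, 6),      # "ain't that nice"
}

# Cycle time: team pick-up to done -- the part the team controls.
cycle_time = (story["deployed"] - story["work_started"]).days

# Lead time: idea to production -- what the business actually experiences.
lead_time = (story["deployed"] - story["idea_logged"]).days
```

Averaging those two numbers over every story in the flow gives the per-team figures the next example talks about.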
So a little bit of a story about cycle time, lead time. I was actually working with an organization and they were really struggling with we just don't feel like we're getting enough done. We just don't feel like we're moving fast enough. And so the focus was on the delivery team. And it was, we need more velocity.
Give us more velocity. And it was, you devs need to move faster and QA's gotta be more ready. And rrr, right. And so I came and I looked at it and I said, all right, well let's start measuring this. And what we found was that the lead time on a feature was on average 40 days
from hey, wouldn't it be nice if to hey, ain't that nice. The cycle time was on average four days. So one tenth of the time it took
for a feature to get delivered was the actual team doing work. The other 90% was something else. So if we pushed and we pushed and we cajoled and we did everything that we possibly could to squeeze another 25% of throughput out of this team,
our cycle time would go from four days to three days. And dude, our lead time would go from 40 to 39. You would not notice the difference in terms of overall delivery.
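Putting numbers on that squeeze (the 40-day and 4-day averages are from the story; treating the throughput gain as a straight cut in cycle time is a simplification):

```python
lead_time_days = 40   # idea to production, on average
cycle_time_days = 4   # team pick-up to ready for review, on average

# Time the work spends waiting somewhere other than with the team.
wait_time_days = lead_time_days - cycle_time_days  # 36 days

# Squeeze 25% more throughput out of the team: cycle time drops by a quarter...
improved_cycle = cycle_time_days * 0.75  # 3 days

# ...but the wait is untouched, so lead time barely moves.
improved_lead = wait_time_days + improved_cycle  # 39 days
```

Nine-tenths of the lead time is unaffected by anything the delivery team does, which is the whole point of the story.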
And that team would be suffering miserably as a result. So it was a very easy way to show it's somewhere else in the system. And then we could actually look at the lanes and determine where in the system is it. So let's look at how we do that. I'm gonna actually show you a velocity chart which is the average velocity for a team,
the velocity for a team over time. So this is 10 different iterations. When you look at this, what can you tell me about this team? Is there anything you can tell me about the people that are on the team, the behavior of the team? Anything?
Yeah, so they got these spikes. They seem to produce a lot. It looks like maybe four iterations or so and then all of a sudden a bunch of stuff gets done. Yeah, then it drops drastically after that. In fact, here we got a zero and here it's not quite a zero but still it's low.
We can also see because this is actually our trend line, they are getting faster over time so no problems with this team. But we don't know what's going on here. We don't know why this is happening. And as people look at this chart, they'll say things like, oh well, you got developers on vacation
or they got too much work in process or so and so's lazy and I've seen him and he just hoards all the work and doesn't get stuff done. So we start, it's all conjecture. We don't have any data that tells us what's going on. So we're gonna look at this exact same team
and the exact same data from a different perspective. We're gonna look at the exact same information but we're gonna put it into a cumulative flow diagram. Can anybody tell me what's wrong with this team? This is work that is actually complete.
This is work that is ready for review. This is the work that's ready for testing. This is the work that is in development and then this is our backlog. Any guesses? Yeah, say it again.
Doesn't get fed by the backlog. The challenge with this team is their product owner. We're beating up on this team to deliver faster. We're not sure what's going on here.
But if we look at this, the backlog sits flat for a long while and then all of a sudden spikes, right? At the exact same time, we've got this glut that is building up here in ready for approval, right? These two streams are fairly consistent, steady.
That means that the team is working on a certain amount of work, getting it done, handing it off, picking up some more, handing it off. They appear to actually have a fairly decent level of efficiency within the team and the testers seem to be responding fairly well.
Okay, all right, good, all right, good. But then it sits right here and then all of a sudden, all of this stuff gets approved. All of this stuff gets approved and at the exact same time, what happens to our actual overall work demand?
It also goes up. So I've approved a ton of work that you guys did over the course of the last four iterations and I've dumped a bunch of new work on you. Their product owner is a traveling salesman, right? He's probably head of national sales
but he's still out on the road. He still meets with their key clients, their key customers, their top individuals. He doesn't have time. He's out on the road to evaluate all that stuff that you're doing. So when he gets back into the office, he does two things. He quickly evaluates everything and says, good, good, good, good, good, good. And says, oh, and by the way,
customers A, B, and C said they absolutely positively have to have this feature or they're gonna go to our competitor. Here's new work, right? At one point, the team literally runs out of things to do. The backlog gets to absolutely zero. He's not even back yet, right? This is actually iteration seven.
He's not even back yet. So over the phone or via email, he dumps a bunch of new stuff and then comes back in the next iteration and says, oh, by the way, yeah, yeah, this stuff's all good, right? There's another challenge here, obviously. If you're working on stories in iteration one and they actually get approved in iteration four, assuming two-week iterations,
that's six to eight weeks after you did the work. If he says this is no good, it's practically a do-over, right? And it turns out that he was the one that was telling the team they needed to move faster.
So the cumulative flow can be extremely helpful. Now, in this case, I'm showing that it's a product owner. But what we wanna look at is, let's say that there's something going on where we're getting these gluts in the development queue. Let's have the conversation. Why is that happening? What's going on there? What are other things that are happening? If it turns out that we say, gee, it seems to me that every time we send a story
to team X because we need them to do something for us, then it all gets hung up and waits forever. Huh, interesting. Let's create a lane for sent to team X and a lane for came back from team X, and let's see how long that actually is.
And if the glut moves from our development lane to our team X lane, now we know that is, in fact, the problem, and we have data. We've got a pretty graph that shows that this is a challenge, right? So we can have a much better conversation.
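Measuring a lane like that is just subtracting transition timestamps. The lane names follow the "team X" idea from the talk; the dates are invented:

```python
from datetime import datetime

# Hypothetical lane-transition log for one story: (lane entered, when).
transitions = [
    ("In development",   datetime(2014, 6, 2, 9, 0)),
    ("Sent to team X",   datetime(2014, 6, 3, 14, 0)),
    ("Back from team X", datetime(2014, 6, 12, 10, 0)),
    ("In testing",       datetime(2014, 6, 12, 16, 0)),
]

# Time spent in each lane = gap until the story entered the next lane.
time_in_lane = {
    lane: transitions[i + 1][1] - when
    for i, (lane, when) in enumerate(transitions[:-1])
}

waiting_on_x = time_in_lane["Sent to team X"]  # where the glut actually sits
```

Aggregate that per lane across stories and the slow lane shows up as the widest band on the cumulative flow diagram.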
My advice is that you measure many things. I wouldn't say don't measure velocity. I think it's actually a fine metric. Just don't abuse it, right? Use cumulative flow diagrams. Use scatter diagrams. I'm gonna walk you through something quick here.
So here's a team. Here's their velocity over time. As we look at this, what do we know about this team? What's going on with these folks? Start off, they've got good velocity. It's getting really good. It's getting even better, and then all of a sudden, it just starts to tank, right?
Do we know what happened? We can't. We can't know by just looking at velocity. We could look at the cumulative flow. That might help us out, but if we track code quality, we can see that the quality of the code is doing fairly well and starts to tank
while velocity continues to go up, and then we hit this maximum threshold, the quality of the code is really low, and we can no longer keep up, and as they finally just say, all right, you know what? Let's get back to reality here. They start to improve the code base,
and velocity normalizes off. We can also look at the average number of hours worked. In this case, this team was a consulting group, and so they were billing their hours to the client, so that's why they were tracking them. We didn't ask them to do that, but it's not a bad thing to do for a team as long as they don't feel that it's oppressive.
We could actually see the number of hours that they were putting in, so we can see that there's this inflection point where they're putting in 50-something hours a week, almost, and yet their velocity is going down. They just can't keep up anymore. What's that?
So in this case, we just use cyclomatic complexity as an indicator of quality. There are a number of things that you can do. I actually, my next talk, which is up next in, I don't know, room eight, is actually on the technical debt trap, and I talk about a bunch of metrics that you can track,
but besides cyclomatic complexity, you can look at code coverage, which is not really an indicator of quality as much as it is an indicator of discipline. You can look at afferent and efferent coupling. If you're in .NET, there's a maintainability index that the .NET environment will actually generate for you, and it's a heuristic based on
a number of different criteria, but it'll actually say, hey, this code is highly maintainable or it's not very maintainable, right? So that was basically what we ended up doing was something along those lines. So there's one other thing that we measured for this team that I thought was really, really important. We actually measured team joy. The way we did it with this team was, there was a, we wrote some porcelain for git,
and every time you did a check-in, not only did you put the story number from Tracker in brackets, right, but then you also put a tilde and a one through five, and the one was basically, I hate this code, I hate my life, and five was,
dude, we should totally open source this so everyone can see it. Right? And what we found, it's a little subtle in here, but what we actually found was that joy was more of a leading indicator. Developers feel it in their gut. They will tell you something is going wrong here,
and an iteration or two later, you'll look at other results and go, hmm, something went wrong here. Right? So this is a good thing to track, team joy. Some real-world examples quickly. This is one of our dashboards at Groupon for application performance.
I've smudged everything out that I need to so that you can't see stuff and things, but basically the point here is that we look at how our applications are performing at any given time in production. We've got thresholds for that, so we can actually see what's happening, right? So this is actually part of our measure of quality as well. You know, can it scale? Does it do what we want it to do?
This is velocity with standard deviation. I've zoomed in on it quite a bit, but what we have is this is the velocity for that iteration in a bar chart. The brown is what Pivotal Tracker says your next velocity is going to be, and then the yellow that is around it is actually standard deviation applied, right?
So we can see, okay, Tracker says that it's gonna be 11 points, but according to this, it's actually gonna be somewhere between seven and 17 or whatever it works out to be, and then we actually do a burn-down and apply that standard deviation. So you can also see, well, as far as we know, we're gonna be done at this date, but based on velocity and standard deviation,
it's somewhere within this range. Obviously, cumulative flow diagrams, this is an actual cumulative flow from one of our teams. Team Joy. So this is, we quantify the answers, right, one to five, and then we can actually show for this team
here's the overall average answer, et cetera, right? Department Joy, so this is something else that we do at Groupon. This, we actually send out a survey once a month. It's 12 questions that are designed to ascertain employee engagement. If you wanna know more about that,
I'm glad to answer it. I've actually got a ton of information on this, and there'll be some notes at the end of the slide deck as well. So when we get the slide deck off the conference site, there'll be links to this particular survey, but we send it out every month, and we trend it over time, and we look at how are employees doing
in terms of engagement, and we actually work with managers on what are things that they can be addressing in one-on-ones, what are things they could be doing in terms of actual personal growth for the members of their team, et cetera. So I've got one last thing that I actually wanna say to the group,
and that is that metrics are not for managers. Metrics are for teams. Track this information, share it with the team. Have the team involved in the conversation.
This should be part of a retrospective. How did we do? Here's the data that shows how we did. How do we feel about that? Is that what we expected? What should we do collectively to change this and move forward? If the data is for the manager and is used as a way of, you know,
a tool for saying, hey, you're not doing good enough, or, yeah, I think you're doing well enough, there's too much damage that's being done, right? That's it.
I think, yeah, I think we're like right on the button as far as time goes. Anybody's got questions? I'm glad to meet down here and chat.