Principles of Component Design.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 150 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/51498 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Transcript: English(auto-generated)
00:04
Ready? Sound on? Cool. All right, so, do you see that red dot? Why is it red? It's dangerous.
00:38
I like that answer. Watch out. Take me to Cuba.
00:43
Yeah, I've had the security people restrict me from going into airports because I'm carrying lasers with me. And they have, you know, a sign on them that says dangerous. But why is that red? I have a green laser.
01:03
Hey, you know, you got to have lasers. All right, I carry three of them with me. One of them is red. I always keep them in my backpack with the batteries out of them or turned around, actually. So this is my red laser. You can get a red laser for $12. There it is, red laser. Nice. And I got a green laser.
01:30
This one's a nice one. A pretty bright one, too. Not bright enough to do anything interesting, but bright. There's a little adjustment in here. You can cut a component out and put a different resistor in here.
01:44
And then you can get almost a quarter watt out of it. The battery drains pretty fast. But you can pop a balloon with it. I mean, that's cool to do. But this is my favorite laser. And I like this one because, well, can you see that?
02:12
I mean, a little tiny violet dot. Do you see that up there?
02:21
Let's see if I can show it to you here. Whoa, that's nice and bright on my thing here. That's nice and bright. But over there, you can hardly see it. Can you see it on this thing? No. Can you see it here? No, not too much. No, can't see it there much. But I found this once. It's just a marker. And the little laser, it doesn't do too much to the marker, but the lid.
03:03
And look, it's yellow. This is not blue. And then see that little orange thing there? Watch this. I got to aim it real carefully. Whoa! It's orange. What kind of laser is this?
03:22
This is a laser of some other color. My glasses are the kind that turn dark in the sunlight.
03:45
I don't know if you can tell, but now they're all dark. And I can't see any of you. This is an ultraviolet laser. They don't bill it as such. You can buy them on Amazon, $17. They say it's a violet laser. They lie. It's much cooler than a violet laser.
04:05
It's an ultraviolet laser. Everybody has to have an ultraviolet laser. It's completely useless as a laser pointer. But you can draw nice little pretty pictures on your glasses with it. I can write my name. Alright, whatever. Look at the code on the screen. Anybody recognize that code?
04:36
I can't see it because my glasses are all dark. Anybody recognize that code?
04:42
That's PDP-8 code. That's what code looked like in 1970. That's the kind of code I was writing when I was a slightly older teenager. And then on into my 20s as well. And this statement right there.
05:04
Anybody know what that means? That's the memory address at which this program would be loaded. We used to write the address of our program into our program.
05:21
This program would be loaded at address 200. And we'd put the data at address 300. How about that? It made perfect sense. Of course you would know where your program is going to get loaded. Who else is going to decide that for you? You the programmer. You had control over memory. Now, this works fine. I'll show you a sequence of pictures here.
05:47
Let's see. Yeah, that's a nice one. Hello. No, that's not the one I wanted. That's because nowadays when you open up documents, it opens up every document that's been opened by that application.
06:10
Don't save the doc. That's the one I wanted right there. Imagine you're a programmer who's writing this kind of code and you've got a program like this one.
06:25
My program lives here. It starts at address 200. And there's a subroutine library that somebody else has written. Now, by the way, subroutine libraries were not real common. You usually wrote your own subroutines back in those days. But after a while, a few guys would write some useful subroutines and you'd think, you know, I should have those in my program too.
06:46
And you'd think, well, I'll just compile them in with my code. And that's what we used to do. We'd just take the source code and jam it together. What was the problem with that? We were talking about the 1970s here. Those programs were contained on paper tape.
07:06
Paper tape was read at 50 characters per second if you were lucky. And so increasing the size of your source code lengthened your compile time by minutes.
07:21
So after a while, these subroutine libraries got too long to continue to add to your source code. So what we did is we would compile the subroutine library and we would load it at location 1200. Then what we could do is we could have a binary file which got loaded at 1200,
07:42
and we could have our program at 200, and we'd have a little file that had all the symbols in it. So the symbols would tell us which subroutine was loaded where. So we knew that the get subroutine was at 1205, and the put subroutine was at 1210, and so on.
08:04
And our symbols would get compiled in with the program, the subroutines would be loaded separately, and everything worked fine. What's the problem with this? When's the last time you saw a program that stayed small?
08:21
They grow. And after a while, well, let's see. I got another picture here. Yeah, that's that one.
08:41
Oh yeah! A program that got too big. It overwrites the subroutines. This doesn't work. When your program overwrites the subroutines, your program still thinks that the subroutines are there. So when it calls location 1205, it actually jumps into some arbitrary part of your code.
09:02
You didn't know this happened, of course, until you finally debugged it and realized, oh, my program's gotten too big. What's the solution to this? You're programmers, you can come up with a solution to this.
09:21
Jump around the subroutine library. Right? You put a jump right there to jump over here. Of course, the subroutine library gets bigger, too. And so after a while, let's see if I've got that picture right. How about that one? Yeah, that's the one.
09:43
After a while, you get that problem. We actually faced these problems. We actually had to deal with this stuff. And do you know how we solved it? We came up with relocatable code. We said, you know what?
10:06
This idea of putting the absolute address in the program is killing us. What we'd really like to do is compile our programs without telling them where they're going to be loaded. And then we will tell the loader where to load them.
10:22
Now, there's a problem with that. Because that means that your binary files cannot really be binary. They have to have codes in them to tell you that certain addresses are not actually addresses. They are offsets. And the loader has to find every address that's marked as an offset and add the start address to it.
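The relocation step described here can be sketched in a few lines. This is a minimal illustration, not the format of any real loader: the `Word` record, the `relocatable` flag, and the three-word image are all hypothetical, and the addresses are written in octal on the assumption that the "200" and "1200" in the talk follow PDP-8 convention.

```python
from dataclasses import dataclass


@dataclass
class Word:
    value: int
    relocatable: bool = False  # True: value is an offset from the load address


def load(image, base):
    """Return the absolute memory words for an image loaded at `base`.

    Words marked relocatable get the load address added; all others
    are copied unchanged.
    """
    return [w.value + base if w.relocatable else w.value for w in image]


# Hypothetical image: a jump target at offset 2, a plain constant, a halt code.
image = [Word(2, relocatable=True), Word(7), Word(0)]

at_200 = load(image, 0o200)    # the same binary loaded at octal 200
at_1200 = load(image, 0o1200)  # and again at octal 1200
```

The same image works at either address because only the marked word changes. That marking is the extra "code" the binary has to carry.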
10:50
But that works. We had relocatable loaders. We made these relocatable binaries and loaders. But now you've got another problem. Because how do you know where the subroutines are?
11:00
If the subroutines are going to be moving all over the place, how do you know where the subroutines are? So now you have to add more information to the binary file. And this binary file that you thought was just the binary of your program is now becoming this very bizarre file. It's got all kinds of gunk in it.
11:21
Now you have to put the names of the subroutine library into the binary file of the subroutines. And you have to show what address they will get loaded at. And the relocatable loader has to remember where it put those addresses. And then in the program, you have to have another thing in there that says,
11:43
hey, I need to know where this program is going to be loaded. And the loader has to link the subroutine library to the program. Computers were slow in those days. Disk drives were very slow.
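A linking loader of the kind described might look roughly like the sketch below. The module layout (`size`, `exports`, `imports`) and the names `get` and `put` at offsets 5 and 10 (octal) are assumptions made for illustration, loosely echoing the addresses mentioned earlier in the talk.

```python
def link_and_load(modules, start):
    """Place modules one after another from `start` and resolve imports.

    `modules` maps a module name to a dict with:
      'size'    - number of words the module occupies
      'exports' - {symbol: offset within the module}
      'imports' - list of symbols the module needs
    Returns (bases, resolved): the load address of each module, and for
    each module the absolute address of every symbol it imports.
    """
    symbols = {}
    bases = {}
    address = start
    # Pass 1: assign load addresses and record where each exported symbol lands.
    for name, module in modules.items():
        bases[name] = address
        for symbol, offset in module["exports"].items():
            symbols[symbol] = address + offset
        address += module["size"]
    # Pass 2: look up every imported symbol (a KeyError means it is unresolved).
    resolved = {
        name: {symbol: symbols[symbol] for symbol in module["imports"]}
        for name, module in modules.items()
    }
    return bases, resolved


modules = {
    "program": {"size": 0o100, "exports": {}, "imports": ["get", "put"]},
    "library": {"size": 0o40, "exports": {"get": 0o5, "put": 0o10}, "imports": []},
}
bases, resolved = link_and_load(modules, 0o200)
```

If the program grows, the library simply lands at a higher address and the imports are resolved again, so nothing gets overwritten.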
12:03
And there wasn't a lot of disk memory. You were lucky if you had 10 megabytes of disk. And the seek arms took a long time. And the rotational latency was high. And so link times took a long time. Especially as we got more and more libraries and more and bigger programs.
12:24
Link time could take an hour. Anybody have a link that took an hour? Anybody here working in the 80s? Oh, no, you got one now. You must be a C++ programmer. That says so right on his shirt. Oslo C++ user group.
12:42
Long link times. Even in the 70s, we had that problem, although it was much smaller programs. So the solution to that was to separate the link phase from the load phase. We would link as a second compile step.
13:02
And that would produce a final relocatable file that had all the linkages resolved. And then we could load it at runtime. And that was relatively fast enough. And for years and years and years, we lived with this two-step process. Compile down to binary files.
13:21
Then link all the binary files into an executable. And then you could load the executable. And that solved the problem until the 90s. In the 90s, something happened. It was called Moore's Law. What's Moore's Law? The speed of processors will double every 18 months.
13:44
Now, apply that from about 1970, when the speeds of our processors were half a million instructions per second. And keep that going forward until the 1990s. Well, that's 20 years. How many 18-month cycles is that?
14:00
Well, that's about 15 cycles. So we have a 2 to the 15th increase in speed. Think about that. 2 to the 15th increase in speed. What is that? That's an increase of about 32,000. That's about right, too. Because we went from about a half a megahertz to 2.8 gigahertz. Well, maybe 1 gigahertz by the late 90s.
14:23
By that time, disks were going faster, too. We'd spun them up a lot faster. The heads weren't moving as far. And we were getting a lot more bits around the rim, too. We could get, what, hundreds of megabytes on a disk.
14:41
And somebody had the bright thought that, well, we don't have to do the link separately anymore. We could do the link at the same time we load. Anybody remember ActiveX? Anybody remember OLE? What does DLL stand for?
15:06
Dynamically Linked Library. Which means it's linked at load time. The link step got moved back into the loader. And we have the situation we have today.
15:20
Virtually every... Who's a .NET programmer in here? Look at that, a lot of people. Java programmers, raise your hands. Not so many. How come? How come it's all .NET? Oh, maybe it's because it's a .NET-y kind of conference, huh? So in .NET, you got DLLs, dynamically linked libraries.
15:44
In Java, you got JAR files. But they're still dynamically linked libraries. Same idea, same purpose. In C++, if you're doing the Microsoft thing, you've still got DLLs. If you're doing a Unix thing, you've got shared libraries. They are still dynamically linked libraries.
16:01
Our mode of operation nowadays is to dynamically link our libraries. That's how we got here. How many DLLs do you have? Guys with solutions, Visual Studio solutions. How many projects?
16:21
60. That's not bad. Who's got more than 60? Oh, look at that. Who's got more than 200? And do you know why you separate your code into different DLLs? Let me ask that question differently.
16:41
When you deploy your application, do you gather up all the DLLs and just ship the wad? If you do, then why are you dynamically linking them? Statically link them.
17:01
Why would you bother with dynamic linking if you're just going to take all those DLLs, gather them up into one gigantic wad and throw the big wad in a directory and say, well, there's my system. Why dynamically link if you're not going to dynamically deploy? Why did we come up with dynamically linked libraries?
17:22
We came up with dynamically linked libraries so that we could dynamically deploy our applications. Why? Well, we're going to get to network speed. Hang on a minute. Because network speed has a huge impact on this whole thing. Mid-90s, I've got a client. He's got a 250 megabyte executable.
17:44
In the mid-90s, that was a big program. Now it's nothing. But then, 250 megabytes was a big deal. You could not fit it on a CD. This was a CAD system. He shipped it to companies like Ford. Ford would use it to design gears and levers, stuff like that.
18:03
He statically linked it. He would deploy the executable to his clients by burning it on several CDs. If he changed one line of code, he had to recompile, relink,
18:21
re-burn all the CDs, and deploy all those CDs to all his clients. And you can imagine that cost him a fair bit of money. I went there in the mid-90s, and I encountered them at a time when they were trying to chop up their executable into this new idea, DLLs.
18:44
Because they realized that if they had dynamically linked libraries, then they could change a line of code, and ideally, you'd only have to ship that DLL. You could email it. Back in those days, that was a big deal.
19:01
You couldn't email 250 megabytes in those days. Nowadays you can, as long as the guy you're talking to has got a reasonable email server. But back in those days, emailing 250 megabytes was impossible. So they could email the 100 kilobytes of a DLL. And so that was very, very desirable for them.
19:20
They worked for months and months and months, chopping their application up into little tiny bits, turning them all into a bunch of DLLs, and that's when they realized their critical mistake. The critical mistake was that chopping your system up into a bunch of arbitrary DLLs doesn't do you a damn bit of good if they all depend on each other.
19:40
If they all depend on each other, then you can... Was anybody just in Scott Meyers' talk here? The last talk he just gave an hour ago. He was talking about the problem of keyholes.
20:01
The problem of keyholes is that we arbitrarily constrain someone for no good reason. Just arbitrarily constrain them. So, for example, have you ever seen a text box on a GUI that was just too short and you had to type a bunch of stuff in it and it wouldn't let you resize the window, it wouldn't let you scroll in any way,
20:24
you just had to kind of type blind, or maybe the text would scroll, but you wouldn't be able to see the beginning of it? He was mentioning the keyhole problem, and I note that something just happened here. Apparently I'm not allowed to not touch my computer for more than five minutes.
20:44
I must apparently touch my computer, otherwise I will be punished. What was I talking about? Oh, yeah, DLLs. So this guy, he put all these DLLs together, he forgot that DLLs depend upon each other.
21:01
His goal was to be able to touch a line of code and just ship that DLL that was affected, but he found that all the pound includes, C++ guy knows what I'm talking about here, all the pound includes formed a horrible network, and he had to recompile and redeploy everything anyway. They went out of business.
21:23
The purpose of my talk today, now that I'm getting around to it, is to talk about components, the problem of component design. And the first thing we're going to do is define a component. What's a component? Component's a DLL. When I talk about the word component, what I mean is DLL,
21:43
very particular kind of DLL, a dynamically deployable DLL, DLL, I almost said DNA, a dynamically deployable DLL. Why would we want to dynamically deploy? Well, because we'd like to be able to change one line of code,
22:00
just ship the one DLL that changed and ignore all the others. What's DLL hell? A term invented, I believe, by Microsoft to describe their own situation and was to be completely cured by .NET. Anybody remember that line? .NET cures DLL hell.
22:25
No, it doesn't cure DLL hell. DLL hell is the problem that we've got all these little components with different version numbers, and nobody knows which ones should go together, so we invent these module maintenance tools like Maven. What are you guys using in .NET?
22:41
What's the tool that lets you keep all of your DLLs in line so that you know to download version one of that one and version three of that one? Do you have a tool for that? What? NuGet! Like the soft, chewy center? Never mind, I'm not going there.
23:10
The graph you see on the screen is a graph of x squared. This is just the x squared graph.
23:22
But it's also something else. It's the number of dependencies in a system, the theoretical maximum number of dependencies, given a certain number of modules. And you can see that the number of modules increases linearly and the number of dependencies increases with the square.
23:41
The theoretical maximum number of couplings, which I show here, is proportional to the square of the number of modules. Now, of course, we would never create a system that has every possible dependency in it.
24:03
Or would we? Look at this curve. This curve is the productivity of a team in comparison to the number of modules.
24:21
By the way, this is completely arbitrary. I just generated a one over x squared curve here. This is not me collecting actual data. This is just me recollecting my experience with development teams. They go slower and slower and slower over time. Who's had this happen to them? You start out going really fast. You can conquer the world.
24:43
A year later, you're slogging through some kind of horrible wetlands. And you don't know what the heck has gone wrong, but estimates that used to be on the order of one week are now three months long. And by the way, you blow all those estimates anyway and introduce more bugs than you fix. That's the kind of problem that we have
25:01
as we proceed along a development path. And one of the reasons for that is this accumulation of dependencies. Why? Well, that's the theoretical maximum dependency between modules.
25:21
This is the theoretical minimum. If you're going to have an interconnected set of modules, there have to be some dependencies, and the minimum set of dependencies is a tree structure. Oh, you can do somewhat better with dynamic linking if you want to, but for the most part, you're going to have a small number of dependencies. How many do we have?
25:41
One, two, three, four, five, six. Six out of seven modules, whereas here you've got, well, I think that's half of 49. It can't be half of 49 because that would be half a dependency. Maybe it's just plain 49. I don't know what it is. It's a large number. It's some relative of n squared.
26:01
Maybe it's one-half n squared plus one-half n. Something like that. It's a very large number of dependencies. Now, we don't want this. We do want that. We strive very hard to get here, but then some schmuck does that.
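The counts being estimated here can be pinned down under one explicit counting convention. The sketch below assumes dependencies are directed and that a module does not depend on itself. Other conventions give different constants, but all of them grow with the square of the module count. The function names are made up for this illustration.

```python
def max_dependencies(n):
    """Every module depends directly on every other module (directed edges)."""
    return n * (n - 1)


def max_coupled_pairs(n):
    """Distinct pairs of modules that could be coupled, ignoring direction."""
    return n * (n - 1) // 2


def min_dependencies(n):
    """A tree: the fewest edges that still connect all n modules."""
    return n - 1


# Seven modules, as in the diagram being discussed.
counts = (min_dependencies(7), max_coupled_pairs(7), max_dependencies(7))
```

For seven modules this gives 6 dependencies for the tree, 21 possible coupled pairs, and 42 directed dependencies in the fully connected case.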
26:24
Visual Studio doesn't allow this inside a solution. Inside a solution, the DLLs cannot have cycles between their graphs. That's good. Don't want cycles. Between separate solutions, there's no guarantee. So if you have multiple solutions,
26:41
or if you're linking with things that come out of different solutions, you can still have cycles in the graph. If you get cycles in the graph, it looks like it adds only one extra dependency, but actually it adds more because six now depends upon two, because one depends on two, and dependencies are transitive. So six also depends upon four and five,
27:02
and six depends upon three and seven. In fact, six depends on all of them. So the number of dependencies multiplies dramatically as soon as you have a cycle.
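The effect of that one extra edge can be checked by counting transitive dependencies. The graph below is an assumption reconstructed from the description (module 1 depends on 2 and 3, 2 on 4 and 5, 3 on 6 and 7, and the added edge runs from 6 back to 1). The actual slide is not available here.

```python
def reachable(graph, start):
    """Return every module that `start` depends on, directly or transitively."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def total_dependencies(graph):
    """Sum of transitive dependencies over all modules, excluding self."""
    return sum(len(reachable(graph, node) - {node}) for node in graph)


# Assumed tree: module -> modules it depends on directly.
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}
# The same tree with one extra dependency from 6 back to 1, forming a cycle.
cyclic = {**tree, 6: [1]}

before = total_dependencies(tree)
after = total_dependencies(cyclic)
six_depends_on = reachable(cyclic, 6) - {6}
```

In this assumed graph, one added edge takes the transitive total from 10 to 20, and module 6 goes from depending on nothing to depending on every other module.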
27:22
This is the n-squared graph again. This is also a graph of C++ compile time as you add modules. The compile time and the link time, if you were doing static linking, but even if you're not doing static linking, just the compile time grows with the square of the number of modules if you have a fully connected network of modules.
27:45
What that means is that your pound includes, or your import statements, or your using statements can be traced in a cycle. And if you have that, then you're going to wind up with this big increase in compile time. C++ in particular would do this.
28:01
Java and .NET don't. Their compile time is based on a different metric. They don't go reading source files the way C++ does. Java and .NET read binary files to get their declarations. C++ reads source files to get its declarations. And so if you had a cycle in C++,
28:23
you got punished by a big compile time. And a massive one. It would go up with the square. So you'd add a couple of modules, your compile time would double. And that made us do something about it. Who knows who Ward Cunningham is? A few of you do. Good.
28:41
And the rest of you. He's the guy who invented Wikis. Ward Cunningham invented Wikis. He's the guy who helped Kent Beck invent pair programming, test-driven development, most of the agile stuff. Get to know who Ward Cunningham is. He's a very interesting fellow. Smalltalk programmer from long ago. And I asked Ward once,
29:01
why did Smalltalk die, Ward? And he said, Smalltalk died because it was so easy to make a mess. You C++ programmers. I was a C++ programmer at the time. You C++ programmers are lucky. Your language punishes you if you make a mess.
29:20
Smalltalk didn't punish you. Well, neither does Java, neither does C#. They don't punish you anymore. It's very easy to make a very large mess and get a very tangled structure and not know you're doing it. Fortunately, Visual Studio keeps some of the cycles out of your graph.
29:44
We would like that level of productivity, which is an N log N, rather than this level of productivity. That's an N squared. And one of the ways to help with that is to manage the dependencies between our components.
30:02
Now something happened to us in the late 90s and into the 2000s. Network speeds started increasing dramatically. Nowadays, it's not hard at all to download a gigabyte in a couple of seconds, to upload a gigabyte in 10 seconds. That's pretty easy nowadays. Back in the early days, it was much harder.
30:23
So back in the early days, we thought shipping individual DLLs was going to be a benefit. Nowadays, though, we just kind of gather them all together and ship the one big wad. Why? Well, because network speed is fast enough, we can do it. We can treat our batch of DLLs just like it was statically linked.
30:42
But there's another issue. How many of you work in teams? Oh, look at that. Everybody works in teams. So you come in at 8 in the morning. You got a task to perform.
31:00
You work all day to get all your stuff working. It all works by the end of the day. You check it in, you go home. Come back the next day, all your stuff is broken. Why? Somebody stayed later than you and changed something that you depend upon. And so you work all day long to fix whatever the problem was,
31:22
and you go home and you come back the next day and your stuff is broken again. How many times can you go around that loop? Lots of times. This is a problem of large teams. Large teams will start to step on each other. And of course we've invented tools to help us. We've got source code control systems and we've got all kinds of good stuff to help us with this.
31:42
But we still can step all over each other unless we manage our projects well. So how can we manage our projects well? There is a principle.
32:06
A principle called the Acyclic Dependencies Principle. The Acyclic Dependencies Principle says, if you have a set of components, you would like them to be arranged without any cycles in the dependency graph.
32:23
Now, for a whole bunch of reasons, we've already talked about one of them, which is compile time. We've talked about another, which is just the dependency load. Here's another. Alarm would like to release its version of alarm. The team that's working on alarm would like to make release 1.0.
32:44
They've got nobody that depends on them, or rather, nobody that they depend upon. So they're completely free to make any release they want. So they release their version of 1.0, alarm 1.0. They start working on alarm 1.1. But now alarm 1.0 has been released,
33:02
which means that elevator and conveyor can make their release. And once elevator and conveyor have made their releases, they can start to work on 1.1. But now transport can make its release. You can see what's happening here. The version numbers bubble up from the bottom.
33:22
1.0 gets created here, then there, then there, here, here, and finally there. The version numbers bubble up from the bottom. If you look at it closely, you'll realize that the version numbers follow the exact same path as the build, the compile, because the dependencies are running in that direction.
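The bottom-up release and build order is a topological sort of the dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The edges are assumptions inferred from the order of releases described in the talk, and only the components named so far are included.

```python
from graphlib import TopologicalSorter

# component -> components it depends on (edges assumed from the description)
depends_on = {
    "alarm": set(),
    "elevator": {"alarm"},
    "conveyor": {"alarm"},
    "transport": {"elevator", "conveyor"},
    "control_panel": {"transport"},
}

# static_order() yields each component only after everything it depends on.
order = list(TopologicalSorter(depends_on).static_order())
```

Under these assumed edges, `alarm` comes first and `control_panel` last, which is the same path the version numbers follow.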
33:44
And now, some poor schmuck does this. Who's this? Well, that was me. I did that. I had an alarm subsystem I was working on, and I needed to put a message on the display in the control panel.
34:03
There happened to be a class up here that had a function called display, and I thought, oh, I should just call it. So I called it. This was not in a language that restricted me from cycles. And so it compiled, and everything was fine. It all worked okay. And then the next day, I had a group of angry developers
34:22
come to my cubicle with clubs and tell me, what the hell did you do? I said, well, I just called this control panel display class. It had a display function in it. I needed to put this message on the screen. And they said, you can't do that. Why can't we do that? First of all, what order should we build those modules in?
34:50
You'd like to build them bottom up. But now there's no bottom. So there's no correct build order. If you have cycles in the component graph,
35:01
there is no correct build order for the modules in that system, and therefore the execution of that system is undefined. What does undefined mean? What's the definition of undefined? Works in the lab. Anything undefined will work until you actually deploy it somewhere,
35:23
then it will fail. You can get systems to fail rather badly by having these cycles. This is pretty common in Java. If you have a system in Java that has cycles in it, you can build it, although there's no correct build order. Then you run your tests, and the tests will fail.
35:42
Then you build it again. That will choose a different build order, and maybe the tests will pass. I know of companies that put their build in a loop until the tests pass. But the problem is worse than that, because Conveyor would like to make their release.
36:04
They want to release 1.1. Now, for them to release 1.1, they have to test with alarm 1.1. But alarm 1.1 is waiting for control panel 1.1, which is waiting for transport 1.1, which is waiting for conveyor 1.1, which is the one we're trying to release.
36:22
So there's no way to make the release without checking all of that source code out into one place, integrating, anybody remember integration, the joys of integration? Integrate the whole system, and then make it work. And while you're doing that, you're going to be stepping all over each other.
36:42
So this cycle will bring back the problem of coming in at 8 in the morning and finding that nothing works. But it's worse than that, because in order to test conveyor, I need alarm, which needs control panel, which needs revenue, which needs the database.
37:03
The database takes 45 minutes to load, and then it crashes. I can't run my tests. And the guys in conveyor are saying, why the heck do I need the database? Well, you need the database because of this strange dependency structure. Anybody ever look at the number of DLLs that get loaded and scratch your head and say, how come I need those?
37:24
Anybody in a C++ world ever have a link line and wonder, what's all this stuff on the link line? How come I need all that stuff? You've got cycles in the dependency graph. It's bringing in all kinds of gunk. So the first principle of components is no cycles in the components.
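A "no cycles" rule is easy to check mechanically. The sketch below assumes the same hypothetical component graph reconstructed from the talk, then adds the one offending edge from alarm to control panel; `find_cycle`, `ACYCLIC`, and `WITH_CYCLE` are illustrative names, not part of any tool mentioned by the speaker.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def find_cycle(depends_on):
    """Return a list of components forming a cycle, or None if the graph is acyclic."""
    sorter = TopologicalSorter(depends_on)
    try:
        sorter.prepare()  # raises CycleError when no valid build order exists
    except CycleError as err:
        # The second element of args holds the cycle; first and last node are equal.
        return list(err.args[1])
    return None


# Hypothetical graph: component -> components it depends on.
ACYCLIC = {
    "alarm": set(),
    "elevator": {"alarm"},
    "conveyor": {"alarm"},
    "transport": {"elevator", "conveyor"},
    "database": set(),
    "revenue": {"database"},
    "control_panel": {"transport", "revenue"},
}

# The one call described in the story: alarm now uses the control panel.
WITH_CYCLE = {**ACYCLIC, "alarm": {"control_panel"}}

print(find_cycle(ACYCLIC))     # None: a bottom-up build order exists
print(find_cycle(WITH_CYCLE))  # a closed path through alarm and control_panel
```

A check like this could run in the build before anything is compiled, so the cycle is reported on the day it is introduced.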
37:44
What if you want to do this? What if you really want to call some function from down here that lives up there? How are you going to do it? Well, you could pull out another component. That's one way to do it.
38:00
Here I took that class out of the control panel. I put it in the display component. Then the alarm component can talk to the display component. The control panel can talk to the display component. I can keep the cycles out. That's a common enough technique. Remember, these are all DLLs. So the number of DLLs in your system will start to grow as people want to add cycles to the dependency graph.
38:24
Maybe. Although there is another way to resolve the cycle. You can use dependency inversion. I could put an interface, a display interface, in the alarm subsystem and have the control panel implement it.
38:46
That turns the dependency around and changes a cycle into a straight tree.
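As a minimal sketch of that inversion: the interface lives with the alarm code, and the control panel implements it. The class names (`Display`, `Alarm`, `ControlPanelDisplay`) are assumptions for illustration, and the two "components" are shown as sections of one file only to keep the example self-contained.

```python
from abc import ABC, abstractmethod


# --- alarm component: owns the interface, never mentions the control panel ---
class Display(ABC):
    @abstractmethod
    def display(self, message: str) -> None:
        """Show a message to the operator."""


class Alarm:
    def __init__(self, display: Display) -> None:
        self._display = display

    def trigger(self, reason: str) -> None:
        # Flow of control goes out to whatever implements Display.
        self._display.display(f"ALARM: {reason}")


# --- control panel component: depends on the alarm component, not vice versa ---
class ControlPanelDisplay(Display):
    def __init__(self) -> None:
        self.lines = []

    def display(self, message: str) -> None:
        self.lines.append(message)


panel = ControlPanelDisplay()
Alarm(panel).trigger("door stuck")
print(panel.lines)  # ['ALARM: door stuck']
```

At run time the alarm still calls into the control panel, but the source dependency now points from the control panel to the alarm component, so the cycle is gone.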
39:01
What's OO? What is object orientation? Why do we like it? How come all of our languages are object-oriented languages? We've been doing this for 30 years. We ought to know. Models the real world. Thank you.
39:20
I planted him here so that he would say that so that then I could rip him to shreds. No, that's absurd. The whole idea that OO is a better way to model the real world is just plain nonsense. It's something that some guy concocted a long time ago to convince his manager to spend 12 grand on a C++ compiler because he couldn't figure out any other way to get his manager to do it.
39:42
12 grand? Early C++ compilers cost a lot of money. 12 grand? I'm not spending that for a compiler. Well, it'll help me model the real world. Oh. Okay, then. This whole notion of modeling the real world is just downright silly. What in the world is OO other than a bunch of functions using a bunch of data structures?
40:05
Encapsulated. Okay, fine. Encapsulated. But a bunch of functions using a bunch of data structures. And how is that different from non-OO? The answer to that is, well, it's not easy to describe how that's different. Oh, okay, we kind of put the data structures and the functions together.
40:22
But we always used to do that. Old C programmers used to do that all the time. Data structures and programs always went together. There's a famous book called Algorithms + Data Structures = Programs. Data structures and algorithms working together. So nothing really fancy about OO there. There is one thing that OO gave us that we did not have before because it wasn't safe.
40:46
And that's polymorphism. Very, very convenient polymorphism. We used to have it in C. Device independence in any operating system is an example of polymorphism. If you can write a program and you don't need to know what device that program is going to use,
41:03
it's clearly a polymorphic interface. But that's dangerous in most languages. Or it was back in the day because you had to fiddle with pointers to functions and that was always dangerous. What OO gave us was very, very convenient polymorphism. Polymorphism without thinking about it.
41:21
In Java in particular, all the methods are polymorphic. There's no choice. In C#, you have a choice. You can use that funny virtual keyword. C++ programmers, you've got a choice. You better use that damn virtual keyword. Especially on your destructors. But most of us, we don't even pay attention anymore. All our functions are polymorphic.
41:41
We don't even think about it. Why? Because when a function is polymorphic, something amazing happens. The flow of control goes down towards the derivative. But the source code dependency goes back towards the base. We can take a source code dependency and turn it around without changing the runtime dependency.
42:08
How do you get DLLs? How do you get components? You isolate them. But you have to maintain the runtime dependency.
42:22
Visual Studio people, are you using ReSharper? Who's using ReSharper? Look at that, everybody. Does Visual Studio know about ReSharper? No. Does Visual Studio call ReSharper?
42:44
Yes. The flow of control goes from Visual Studio into ReSharper. There are function calls in Visual Studio that make their way into ReSharper. But there is no source code dependency that moves from Visual Studio into ReSharper.
43:01
Because they've inverted the dependencies. You can create DLLs that your application will call. But your application doesn't know they exist. And you do that by inverting dependencies. Turn them around. So this is one nice way to do that.
43:20
Now the control panel is a plug-in to the alarm system. The alarm system doesn't know the control panel exists. The control panel is a plug-in. And the alarm system would accept any kind of plug-in. A lot of different things could implement this display function here.
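A rough sketch of the plug-in idea, assuming a hypothetical `AlarmSystem` that accepts anything with a `display` method. `ControlPanel` and `Pager` are invented plug-ins used only to show that the host never names them.

```python
from typing import Protocol


class Display(Protocol):
    """What the alarm system requires of a plug-in (structural typing)."""

    def display(self, message: str) -> None: ...


class AlarmSystem:
    """Host side: knows only the Display protocol, never a concrete plug-in."""

    def __init__(self) -> None:
        self._plugins = []

    def register(self, plugin: Display) -> None:
        self._plugins.append(plugin)

    def raise_alarm(self, message: str) -> None:
        for plugin in self._plugins:
            plugin.display(message)


# Hypothetical plug-ins; in a real system these would live in separate components.
class ControlPanel:
    def __init__(self) -> None:
        self.shown = []

    def display(self, message: str) -> None:
        self.shown.append(message)


class Pager:
    def __init__(self) -> None:
        self.sent = []

    def display(self, message: str) -> None:
        self.sent.append(f"page: {message}")


alarms = AlarmSystem()
panel, pager = ControlPanel(), Pager()
alarms.register(panel)
alarms.register(pager)
alarms.raise_alarm("smoke in shaft 2")
print(panel.shown, pager.sent)
```

Adding a third kind of display means writing a new class and registering it; `AlarmSystem` itself is not edited.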
43:43
So we could have lots of different things that we alarmed. What would you rather depend upon?
44:02
A component that was stable? Or a component that was unstable? Trick questions. Everybody knows you want to depend on something that's stable.
44:23
But now let me define stability. Is my laser stable? It's not changing. But is it stable?
44:41
Stability is not a Boolean. Stability is a continuous variable. And it is defined as the amount of work required to make a change. If it takes a lot of work to make a change, it's stable. If it takes very little work to make a change, it's unstable. That is unstable.
45:01
Because it doesn't take much work to make a change, it's unstable. This, well, I won't say that that's stable. Because it wouldn't take much work to upset it. And this whole stage doesn't feel very stable to me. So I may have to be careful about the way I move. So let me ask the question again.
45:23
What would you rather depend upon? Something easy to change or something hard to change? Modify the source code. What would you rather depend upon? A module whose source code was hard to change or a module whose source code was easy to change?
45:43
Same answer. You want to depend upon the thing that's hard to change. And the reason behind that is very simple. Do you mind depending on string? No. Why don't you mind depending on string? Because if those guys ever changed string, there'd be hell to pay. They would suffer more than you.
46:02
That's the equation that we're talking about here. You are happy to depend upon something if it will hurt the authors of that thing more to change it than it hurts you. That's the equation. You are happy to depend upon things as long as they're not likely to change or at least the bastards are going to pay if they change. So, all right.
46:21
We don't want to depend on things that are easy to change. Think about that very carefully. I don't want to depend on something easy to change. Do we design parts of our system so that they will be easy to change?
46:43
Yes. What parts of our system do we most want to be easy to change? GUIs are volatile. They change for no good reason at all.
47:02
People will change GUIs just because they feel like it. There will be some committee that will be formed to say, you know, our system looks old. What the hell does that mean? Our system looks old. We need to give it a facelift. The marketing people have decided we've got to have a whole new look and feel. Not going to change any behavior, just a look and feel to the GUI.
47:23
The GUI has to be easy to change. The modules that hold that source code have to be easy to change. And that means that none of your other modules should depend on the GUI. No source code dependency should land on the GUI. Your components should not have dependencies that land on the GUI.
47:42
GUI components have to depend upon the application. Application components cannot depend on the GUI. Otherwise, you wind up with systems where the GUI is hard to change. How many of you test your systems, automated tests, through the GUI?
48:02
Ooh, got a couple people. You write test code through the GUI, you are depending on something that's supposed to be easy to change. You will make it hard to change if you test your system through the GUI. I have a client who has 15,000 tests that all go through the GUI. Same client, by the way. Same one that went out of business.
48:24
15,000 tests through the GUI. He had so many tests, he didn't know what they were anymore. He just knew they all had to pass. If somebody touched the GUI, a thousand of those tests would break, and he couldn't find the time to fix them. So he came up with a real simple rule. What do you think that rule was?
48:42
Nobody touches the GUI. They made the GUI hard to change. The other thing, of course, that could happen there is that you could lose your tests. You spend a man-year putting together a nice automated test suite that goes through the GUI, and then somebody decides, ah, we need a facelift on our site.
49:02
Throw out the old GUI, put in a brand new GUI, and all those tests are gone. And you get to rewrite them again, but you've got other systems you've got to test. Don't test through the GUI. Don't do anything through the GUI. All dependencies point away from the GUI. What other things do we want to be easy to change?
49:25
The database. We want the database to be easy to change. We want to be able to make changes to the database without it rippling through the whole application. All dependencies should point away from the database. Put the database in a component with all dependencies pointing outwards.
49:41
Put the GUI into a component with all dependencies pointing outwards. I don't want to depend on anything that is unstable. How can we measure instability?
50:02
See this guy up here? That component up there? Is he stable or unstable? He's got lots of incoming dependencies. No outgoing dependencies. He's hard to change. If I make a change to him, it impacts all those guys.
50:20
So this component here is responsible to those guys. Oh, and this component here is independent? It doesn't depend on anybody. It's responsible and independent. It's an adult. Stable. Adult.
50:40
This guy. He depends on lots of other components. Nobody depends upon him. He's irresponsible and dependent. He's a teenager. He's unstable. These are the two extreme kinds of components. At the two sides of the component spectrum,
51:00
you've got the adults that are highly stable, and you've got the teenagers who are highly unstable. The unstable ones are easy to change. That's where we want to put all the volatile code. The stable ones are hard to change. We can measure that stability by creating a metric.
51:23
That metric is a metric that I call I. I is equal to the fan out, the number of outgoing dependencies, divided by the fan in plus the fan out. And if you think about that for very long,
51:40
you'll realize that I is a metric that goes from zero to one, zero being stable, one being unstable, zero being an adult, one being a teenager. It's all about the dependencies. And now we can rephrase the principle to say this.
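The metric is simple enough to write down directly. This is a minimal sketch; the function name `instability` and the handling of a component with no dependencies at all (the talk does not cover that case) are assumptions.

```python
def instability(fan_in: int, fan_out: int) -> float:
    """I = fan_out / (fan_in + fan_out): 0.0 is fully stable, 1.0 fully unstable."""
    total = fan_in + fan_out
    if total == 0:
        # Assumption: an isolated component is reported as 0.0, since the
        # ratio is undefined and the talk does not say how to treat it.
        return 0.0
    return fan_out / total


print(instability(fan_in=3, fan_out=0))  # 0.0  the "adult": depended upon, depends on nothing
print(instability(fan_in=0, fan_out=3))  # 1.0  the "teenager": depends on others, nobody depends on it
print(instability(fan_in=1, fan_out=3))  # 0.75 somewhere in between
```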
52:06
Every dependency in a component graph should point at something more stable than it is. Or, if you use the metric, these arrows should point in the direction of decreasing I,
52:22
decreasing instability, increasing stability. And you can do the math on just the fan ins and fan outs and verify that that's correct. Why was the cycle a bad idea? Something stable depended on something unstable.
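To "do the math" on a whole graph, one can compute I for every component and flag any arrow that points at something less stable than its source. The graph below is a hypothetical reconstruction from the component names in the talk, and fan-in and fan-out are counted per component edge, which is a simplifying assumption.

```python
def instability_by_component(depends_on):
    """Compute I = fan_out / (fan_in + fan_out) for each component in the graph."""
    fan_in = {name: 0 for name in depends_on}
    for deps in depends_on.values():
        for dep in deps:
            fan_in[dep] += 1
    result = {}
    for name, deps in depends_on.items():
        total = fan_in[name] + len(deps)
        # Assumption: a component with no edges at all is reported as 0.0.
        result[name] = len(deps) / total if total else 0.0
    return result


def violations(depends_on):
    """Return (source, target) edges where the target is less stable than the source."""
    i = instability_by_component(depends_on)
    return [
        (src, dst)
        for src, deps in depends_on.items()
        for dst in sorted(deps)
        if i[dst] > i[src]
    ]


# Hypothetical graph: component -> components it depends on.
DEPENDS_ON = {
    "alarm": set(),
    "elevator": {"alarm"},
    "conveyor": {"alarm"},
    "transport": {"elevator", "conveyor"},
    "database": set(),
    "revenue": {"database"},
    "control_panel": {"transport", "revenue"},
}
WITH_CYCLE = {**DEPENDS_ON, "alarm": {"control_panel"}}

print(violations(DEPENDS_ON))  # []
print(violations(WITH_CYCLE))  # [('alarm', 'control_panel')]
```

In this reconstruction the only offending arrow is the one that created the cycle: alarm (I = 1/3) reaching up to control panel (I = 2/3).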
52:45
But that leaves us with a problem. And the problem is this. What's that guy? Stable or unstable?
53:02
He's really stable. He's sitting down here at the bottom of the graph. Everything over here depends on him. He's very hard to change. If I touch that component, all these other components will be affected by it. If for no other reason than the release number changes.
53:24
That means that stuff down here at the bottom of the graph is very difficult to work with. But there's an escape to that. Back to polymorphism. How can you make something easy to extend
53:44
even though it's hard to modify? You make it abstract. Abstract classes can be extended trivially without having to modify them. You can add new features to a system
54:00
if those new features live inside of derivatives, of base classes. So the final principle is this. We would like abstractness to increase as we go down these arrows. Stuff up here? Concrete. Unstable and concrete.
54:20
Stuff down here? Abstract and stable. Abstractness becomes more and more prevalent as we go down this tree. In fact, we could say that abstractness is a number
54:40
which is the number of abstract classes divided by the total number of classes in a component. And if you did that, you get this number A which goes from 0 to 1. 0 being concrete, 1 being entirely abstract, composed of nothing but interfaces. And then we can do this very interesting thing. We can say that for any particular component, A plus I should equal 1.
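A small sketch of the abstractness ratio, assuming that "abstract" means a class with at least one unimplemented abstract method, which is what Python's `inspect.isabstract` reports. The example classes are invented for illustration.

```python
import inspect
from abc import ABC, abstractmethod


def abstractness(classes) -> float:
    """A = abstract classes / total classes: 0.0 is concrete, 1.0 is all abstract."""
    classes = list(classes)
    if not classes:
        raise ValueError("a component needs at least one class to measure")
    abstract_count = sum(1 for cls in classes if inspect.isabstract(cls))
    return abstract_count / len(classes)


# Hypothetical contents of one component.
class Display(ABC):
    @abstractmethod
    def display(self, message: str) -> None: ...


class Logger(ABC):
    @abstractmethod
    def log(self, message: str) -> None: ...


class ConsoleDisplay(Display):
    def display(self, message: str) -> None:
        print(message)


class Alarm:
    pass


print(abstractness([Display, Logger, ConsoleDisplay, Alarm]))  # 0.5
```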
55:10
Either it's abstract, where A is a 1, and stable, where I is a 0. Or it is unstable, where I is a 1, and concrete, where A is a 0.
55:22
A plus I equals 1. The magic formula of components. A plus I equals 1. Now, you've got the adults up here, which are abstract, and everybody depends upon them, so they're stable. You've got the teenagers down here. You've got everybody that's got no incoming dependencies,
55:41
but they're very concrete. What do you got here? This is the line A plus I equals 1. This is where we'd like all our components to sit if they can't sit at one of those two endpoints. Why? Well, what's up here?
56:01
Highly abstract. Nobody depends upon it. An interface that nobody implements. Useless. This is the zone of uselessness. We do not want our components going that way. What's down here? Very concrete. Everybody depends upon it.
56:21
Database schemas. Concrete. Everybody depends upon them. Fun to change. We don't want things down here. This is the zone of pain. We'd like our components to be as far from those two points as possible. Ideally, if we could get them here and here, that would be best, but it turns out the components are persnickety that way,
56:42
so at least we would like them to be sitting along this line, or close to the line, which leaves us with one last metric. D. How far away is the component from the line? Well, D could be the absolute value of A plus I minus 1.
57:02
You can do the math on this. It's not very difficult. D is a metric that goes from 0 to 1. 0 means right on the line, 1 means at one of the two bad endpoints. If you want to know which endpoint, you can take the absolute value signs off, but I don't care. You can measure D by looking at the fan in and the fan out of a component,
57:22
measuring its abstractness, doing the math. It's not very difficult math to do, and find out whether your component sits nicely on that line. If it does, then it is as abstract as it is depended upon.
57:41
If it doesn't, that means either it's very abstract and not depended on, or very concrete and heavily depended upon, both of which are bad ideas. There are lots of tools that will automatically calculate these metrics for you. If you've ever used NDepend, that will calculate the metrics for you. If you've ever used any of the other static analysis tools,
58:03
they can generate all these metrics, I, D, all of them for you, so that you can look at your components and see if they fit. And then, think about what should be easy to change. Things that are easy to change should be in teenagers. Things that are hard to change should be in adults that are abstract.
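For readers without such a tool, the distance metric can be sketched in a few lines. The function names are assumptions; the formula D = |A + I - 1| and the meaning of the sign come from the talk.

```python
def distance(abstractness: float, instability: float) -> float:
    """D = |A + I - 1|: 0.0 is on the line A + I = 1, 1.0 is at a bad corner."""
    return abs(abstractness + instability - 1.0)


def nearest_bad_corner(abstractness: float, instability: float) -> str:
    """Drop the absolute value to see which way a component is off the line."""
    signed = abstractness + instability - 1.0
    if signed > 0:
        return "zone of uselessness"  # abstract, yet nobody depends on it
    if signed < 0:
        return "zone of pain"  # concrete, yet heavily depended upon
    return "on the line"


print(distance(1.0, 0.0))            # 0.0  abstract and stable
print(distance(0.0, 1.0))            # 0.0  concrete and unstable
print(distance(0.0, 0.0))            # 1.0  e.g. a database schema
print(nearest_bad_corner(0.0, 0.0))  # zone of pain
print(nearest_bad_corner(1.0, 1.0))  # zone of uselessness
```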
58:24
Concrete teenagers hold the stuff that's easy to change. Abstract adults hold the stuff that's hard to change. Any questions? We started with PDP-8 assembly code and got here.
58:43
Any questions? No? Good? Oh, damn. It's all right. It's okay.
59:03
Oh, two versions. Yeah, two versions, same component. You kill the programmers. Yeah. Don't have multiple versions of the same component in your system, please. That's DLL hell. Anybody else? No.
59:22
Where do you place the string? Very, very good question. String class sits right there. It's in the worst possible place, but nobody ever changes it, so we don't care. This is all about the stuff we are actively developing, the stuff that is being changed. So we pay very close attention to the stuff that's in our libraries that are changing.
59:43
The stuff that we get from other libraries that's not changing, or the stuff that's in our old libraries that's not changing, we're not going to pay attention to that here. A lot of that stuff may live here, and that's okay. None of it's going to live here. Well, some of it might. Anybody pulled out any dead code?
01:00:00
Abstract classes that nobody implements? Yeah, okay, so maybe you'll see some stuff there. But string, vector, a lot of the libraries live down here, and we don't care because they're not volatile. Think of another axis coming out from this graph towards you. That's the axis of volatility. This is a slice where volatility is close to one.
01:00:22
The stuff where volatility is zero, we don't care about. Anybody else? Yo, in the back.