How we do language design
Formal Metadata
Number of Parts | 110
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/51133 (DOI)
NDC Oslo 2012 (91 / 110)
Transcript: English (auto-generated)
00:02
I'm really happy to be here. I'm here in Oslo with my fiancee. Her grandparents emigrated from Norway over to Seattle, and we're having a great time here. I'm scared of heights. This is the tallest conference I've ever been to. And if you look out there, there's a six-meter drop to the conference floor.
00:21
Ever since I was skydiving once, the main chute didn't open properly. I had to go for the emergency chute, and it was a bit of a mess. And since then, I've decided to keep close to the ground. So if I'm a bit nervous, that's the explanation. I'm a program manager at Microsoft. We have this job title, Program Manager.
00:43
No one outside Microsoft really knows quite what that means, but I'm going to tell you, because this talk is about how, who, and what we do in language design at Microsoft. I've divided it into three areas. The how, meaning the people, the process, the meetings,
01:02
the shipping pressures, because those have a really strong influence on what we do. A bit in the middle about why, kind of the philosophy. Why do we even do language design at Microsoft? Why not leave the languages static without changing them? But then the bulk of the talk is the what of language design.
01:21
And I've tried to express in these slides the rules of thumb, the principles we use, and illustrate all of them with concrete code examples. Please stop me at any time to ask questions. I'd love to have questions. Who are the people?
01:41
We have, say, how many of you can read C Sharp? All of you. How many of you can read VB? Pretty much all of you. Well, the C Sharp language design team on the rest. I colored it green, because that's the color of C Sharp.
02:01
We have seven people all told. The Visual Basic language design team, seven people, but there's considerable overlap. Let's start with C Sharp. Of course, it comes from Anders Hejlsberg from Denmark. He did Delphi at Borland. He created C Sharp at Microsoft.
02:20
He's the overall chief architect for the .NET languages. The overall strategy, the chief final decision about what features get in them and why. Over there in the middle, he attends both meetings. Mads Torgersen, another guy from Denmark. He's in charge of the C Sharp spec.
02:42
He drives the C Sharp language design meeting. We meet once a week on Mondays for two hours. I'm here, Lucian Wischik, the Visual Basic spec owner, and I drive the Visual Basic language design process. I also prototype new language features in the compiler,
03:03
and I fix the bugs in those prototypes. Actually, I did the prototype for the async language feature, and once it transitioned from prototype into the product team, our QA branch logged a total of 700 bugs in the feature,
03:21
which is a very humbling experience. I thought, yeah, I've probably got five or 10 bugs in there. A long way from the truth. A few other people. Eric Lippert here in the C Sharp design meetings. I don't know how many of you read his blog, but all of you should read his blog.
03:40
It's called Fabulous Adventures in Coding, and it's fantastic. Peter Golde here. He was from the original C Sharp design team with the first version of C Sharp back in 2001. At the moment, he's the chief architect for implementing the VB compiler,
04:00
figuring out how to architect it. Visual Basic language design is really a democracy. That is, we don't have a single gatekeeper like Anders for C Sharp. Instead, we keep hammering out the ideas and the syntax until we arrive at a consensus or at least a majority vote.
04:24
I'm the guy who takes the minutes from the meetings so I can write whatever I want in the minutes. So technically, I have the ultimate say, but really, I don't use it. I have to say as a disclaimer, this is all personal opinion. I started in this job back in 2009.
04:42
I don't think anyone really knows how to design a language. There certainly isn't a corporate official Microsoft policy on how to design a language because the managers and executives who run a company are so far removed from the day-to-day process of designing a language,
05:01
which is a much more creative thing. Nevertheless, the process of language design at Microsoft is totally driven by the pressures to ship a product, part of a commercial product, which we drive every two years or three years or however frequently.
05:22
That does create definite problems, right? If we identify a flaw in the language and we want to fix it, or if we identify a place where we want to add a warning, but some of our users have put on treat warnings as errors,
05:42
and if we add a warning, then suddenly their project starts getting 100 errors when they upgrade. They say, why the hell are we paying Microsoft however many thousand dollars for a new version of this product when it breaks my code? So we have to stick to a very careful bar of not breaking anyone's projects
06:01
when they upgrade to the next version of the language. The design cycle really starts a few months, sorry, the design cycle for the next version starts a few months before the current language is shipped. When the feature's kind of winding down, when only a few bugs, the most critical bugs are being fixed,
06:21
that gives us the time to try out new prototypes to explore what happens. And we think of it in terms of stones and pebbles. Each year we have the budget and the time and the effort to fix maybe a few of the big stones, the major features, and within that we'll try and sneak in
06:42
whatever other pebbles that we can fit in. There's a strong tension, you know? The management make it very clear that we have to ship within a certain deadline. We as the language designers just want to put more and more and more into the language. And I think it's brilliant that we have this pressure
07:02
to say, hey, enough's enough. Just get it out of the door, because ultimately everything is a pointless exercise unless we get it into the hands of people rapidly to use the feature and see how it goes. And just how much can we do in each release?
07:24
Rule of thumb, one or two of these major new features. In this release it was async and win8 support. Beyond that, we fix as many of the pebbles as we can. But the number of pebbles we can fix is really limited by quality. If the quality isn't up to par,
07:42
if there are too many bugs, really we have to fix those before we're even allowed to put in the features that we want. By the way, the product Visual Studio is a huge code base. I wanted to tell you how many lines of code it was,
08:01
so I started trying to sync the entire enlistment. But it turns out I don't have any hard disks big enough to fit it all on, so I couldn't do that. Each night the build machines build the product. It takes, I mean, you can imagine, Microsoft, we have the most powerful machines we can get building the product. It takes all night.
08:21
Each morning we wake up and we have a fresh version of the tree that started syncing and building at 9 o'clock the previous night. We have, what, 20,000 QA tests just for the Visual Basic language. That's far too many for us to run.
08:41
We run about 1,000 of them automatically on each check-in. The rest of the 20,000 are run by our QA department on a weekly basis. Then it takes them several days to analyze the results to figure out if the breaking changes really were breaks or just glitches, and then another few days
09:01
for them to feed it back on to the developers. I said what we can do is based on how much time we have to do it. Here, I tried to give you a feeling of what we do actually do. What I did was I went through the entire check-in history
09:22
for the product for this cycle, Visual Studio 2012. I tried to get a feeling of how much we did and where and how many program hours were spent on each area. I also went through every single bug in our bug database and our work item database
09:42
to try and get a good feeling for what was logged. I made a spreadsheet of it. This chart on the left is a fairly true diagram of where we spent our effort. The biggest chunk was the big stones I told you about. Async and Windows 8 development.
10:03
Beyond that, bugs. How many bugs? I think there were about 40 bugs. That's the top category up here that we fixed. IDE, I think a huge part of the value of our language is the tight IDE integration.
10:22
IntelliSense, debugging, edit and continue. There was a lot of effort put into the IDE. Partly because we have to get the IDE reflecting what's changed in the language. Partly we have to improve it, make it snappier, give you better typing performance. Actually, there was a heck of a lot of performance work
10:43
that we did. I counted a total of 34 separate fixes. Finally, once we've done all of those things, language changes. These are the pebbles that I talked about. Some of them came from us.
11:00
Some of them came from requests from people. Some of them came from us going to conferences. And the people come to us afterwards and say, hey, I have this particular need. And we try to address it. So how do we actually figure out what these bugs and perf and
11:20
language changes are? For the bugs, a huge portion of them come from Watson. Whenever Visual Studio crashes and it says, do you want to send feedback to Microsoft? And you click yes, we get an automatic entry into the bug database. Some of the automated infrastructure looks at the call stack for
11:41
this crash, figures out which product caused the crash, which particular team caused the crash, then it gets fed straight into us. And what we do is we have buckets. We see how many times this particular Watson bucket has been hit.
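The bucketing idea described here can be sketched roughly as follows. This is only an illustration in Python, not how Watson is actually implemented: the report format, the `bucket_key` rule (top stack frame as the signature), and the module-to-team table are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from the module at the top of a call stack to the owning team.
MODULE_OWNERS = {
    "vbcompiler": "VB language team",
    "cscompiler": "C# language team",
    "editor": "IDE team",
}

def bucket_key(call_stack):
    """Use the topmost frame ("module!function") as the bucket signature (an assumption)."""
    return call_stack[0]

def owning_team(key):
    """Route a bucket to a team based on the module name before the '!'."""
    module = key.split("!", 1)[0]
    return MODULE_OWNERS.get(module, "unassigned")

def triage(reports):
    """Count hits per bucket and return (bucket, team, hits) tuples, most-hit first."""
    hits = Counter(bucket_key(stack) for stack in reports)
    return [(key, owning_team(key), count) for key, count in hits.most_common()]

# Each report is a call stack, innermost frame first (made-up data).
reports = [
    ["vbcompiler!BindExpression", "vbcompiler!BindStatement", "editor!OnKeyPress"],
    ["editor!Repaint", "editor!OnIdle"],
    ["vbcompiler!BindExpression", "vbcompiler!BindLambda", "editor!OnKeyPress"],
]

for key, team, count in triage(reports):
    print(f"{count:3d}  {key}  ->  {team}")
```

Sorting by hit count is what lets a team work on the most frequently hit bucket first.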
12:00
We treat crashes the most seriously of all. We kind of stop work on the other features if there's a bug that we have to address. Actually, when we first released Visual Studio 2010, there was a bug in its Watson reports, which meant that every single Watson report came from the Watson subsystem.
12:24
That really didn't give us much help. We rushed out a fix in Service Pack 1. So the statistics we have really date from Service Pack 1 of Visual Studio 2010. Perf Watson, that's the next thing. Perf Watson is an optional add-in that you can download.
12:43
I hope as many of you download it as possible. It was built into the release candidates for Visual Studio 2012. Our user interface experts tell us 200 milliseconds is the limit, beyond which you start to notice that it feels sluggish.
13:00
So if you have Perf Watson, whenever the Visual Studio user interface thread is non-responsive for more than 200 milliseconds, it gives an automatic collection of what exactly it's currently doing at this moment. What are the threads in flight? What are the call stacks on each of them? What are the outstanding tasks?
13:23
What extensions do you have loaded? Because actually, I think a lot of the bugs that we get come from extensions and add-ins that are in the background. And again, it figures out which team was at fault. It sends an automatic email. I think we made about 34 fixes to Perf.
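The 200-millisecond rule can be illustrated with a small sketch. It assumes, purely for illustration, that the user interface thread records a heartbeat timestamp every time it processes a message; the function name and the data are hypothetical and this is not Perf Watson's actual mechanism.

```python
RESPONSIVENESS_LIMIT_MS = 200  # the threshold quoted in the talk

def find_stalls(heartbeats_ms, limit_ms=RESPONSIVENESS_LIMIT_MS):
    """Return (start, duration) for every gap between heartbeats longer than limit_ms."""
    stalls = []
    for previous, current in zip(heartbeats_ms, heartbeats_ms[1:]):
        gap = current - previous
        if gap > limit_ms:
            stalls.append((previous, gap))
    return stalls

# Made-up heartbeat timestamps in milliseconds.
heartbeats = [0, 50, 100, 450, 500, 1200]
print(find_stalls(heartbeats))
```

A real collector would capture the call stacks and loaded extensions at the moment a stall is detected; this sketch only finds where the stalls are.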
13:42
In most cases, that meant pick a more efficient algorithm. Pick an algorithm that uses less memory, for instance. That kind of fix. In some cases, it meant using async. It meant pushing work onto a background thread, maybe popping up a progress dialog
14:00
so that the user interface thread remains responsive. Connect. Connect is the best place to go for people to file language bugs. When they type something in and the language doesn't behave how they expect.
14:22
We're pretty conservative when it comes to changing the language, right? As I told you, every breaking change... I'll give you a concrete example of this. In C Sharp, if you fail to return a value on all code paths,
14:44
it gives an error message. In Visual Basic, it doesn't. It doesn't even give a warning. The reason is that the Visual Basic language says that all variables, including the implicit return value, are initialised to their default value. So if you don't have a return statement, it just returns the default value.
15:06
Okay, that makes sense. But we figured that mostly this indicates a bug. So we added a warning which says, warning, you have failed to return a value on all code paths.
15:21
We shipped the release candidate with this warning in place. We got a whole bunch of irate emails. For instance, we got one email from a guy who said, look, I upgraded to your blasted release candidate. I have a solution with 200 projects. It just generated it fast.
15:42
It generated 5,000 error messages. Here, I made an Excel spreadsheet of every error message that we got. Well, it took megabytes for this error message spreadsheet to come to us. We analysed it. We looked through it. This guy had turned on treat warnings as errors.
16:00
90% of his errors were "not all code paths return a value". And we looked through his code because he shared that with us. And about 10% of those cases were honest-to-goodness bugs in his program. We thought, hey, that means we did a good thing. We found bugs in this user's program.
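Working through the rough figures quoted here makes the trade-off concrete. The percentages are the approximate ones from the talk, so the results are estimates only.

```python
total_errors = 5000          # error messages the user reported after upgrading
missing_return_share = 0.90  # share that were "not all code paths return a value"
real_bug_share = 0.10        # share of those that turned out to be genuine bugs

missing_return = round(total_errors * missing_return_share)
real_bugs = round(missing_return * real_bug_share)
false_alarms = missing_return - real_bugs

print(missing_return, real_bugs, false_alarms)
```

So roughly 450 genuine bugs were surfaced, at the cost of about 4,050 build-breaking messages on code that was behaving as intended.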
16:21
The whole point about warnings is to find bugs in users' programs. No. The point about warnings is to give value to users. If we have made it impossible for him to upgrade without investing a huge number of man-hours to upgrade code that probably someone else in his team wrote,
16:41
maybe someone who's left his team already has written, we're not helping him. We're actually preventing him from taking advantage of all of the other IDE, performance, user interface enhancements. So we really have to do a very careful trade-off. Anyway, most of the Connect bugs really are honest-to-goodness bugs
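The same effect can be reproduced in miniature with Python's standard `warnings` module, used here only as an analogy for the Visual Basic compiler option: a caller who promotes warnings to errors is broken by a new version that merely adds a warning. The two `library_call_*` functions and `run_user_build` are hypothetical names for illustration.

```python
import warnings

def library_call_v1(x):
    """Old version: silently does its job."""
    return x * 2

def library_call_v2(x):
    """New version: same result, but now emits a warning about a likely bug."""
    warnings.warn("not all code paths return a value", UserWarning)
    return x * 2

def run_user_build(library_call):
    """Simulate a user who has turned on 'treat warnings as errors'."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # every warning is raised as an exception
        try:
            library_call(21)
            return "build succeeded"
        except UserWarning as exc:
            return f"build failed: {exc}"

print(run_user_build(library_call_v1))
print(run_user_build(library_call_v2))
```

The behaviour of the code has not changed between the two versions; only the diagnostics have, and that alone is enough to block the upgrade for this user.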
17:01
that we do fix. Also, people put language suggestions in Connect. Sorry, just looking through my notes. In total, we made about 35 fixes to language bugs, my spreadsheet told me.
17:21
Some from us, some from Connect. A bunch of these really were mini feature requests. But if we tell management that it's a bug, then they let us fix it. And we're passionate about fixing the language, so we try to sneak it in that way. UserVoice. How many of you use UserVoice?
17:41
I see five hands that have gone up. Please use UserVoice. That's the real best way for you to give feedback to Microsoft on how to change. Do you know what the number one item on... I don't know how many of you have downloaded Visual Studio 2012, either the beta that was released a few months ago,
18:01
or the release candidate that was released a few weeks ago. I see one hand, two, okay, a number of hands. The number one UserVoice item after the beta was "the greyness of it looks ugly". By an overwhelming majority of votes,
18:22
our user interface team took that into account and changed away from the grey theme and added colours back into the icons into something that was a lot quicker. They really turned on a dime. Within a few weeks, they were able to change around and change the thing based almost entirely on UserVoice. We really appreciate it.
18:40
If you look on UserVoice now, the number one item is... The release candidate was released a week ago. It had all caps for the menus. The number one item on UserVoice now is, we don't like all caps for the menus. Please change it. Okay, they've put in an option to change it. And really, everything we... not everything we do,
19:03
a huge amount of what we do is driven by requests and scenarios. Actually, what we try to do, I've written here balance. We listen to what users say they want. But in all of this, we're trying to understand the underlying subtext of what they need.
19:26
And we're trying to look at the market trends, what's happening in the enterprise and consumer space, figure out how do the users' requests relate to the market needs, and how can we implement a general purpose language feature
19:41
that's clean and well designed and hopefully borrowed straight from Microsoft Research so that they've done all of the difficult work to figure out how to do it properly to satisfy what they've been asking for. Groucho Marx has this quote,
20:00
"I don't want to belong to any club that would accept me as a member." Eric Lippert, of the blog, on the C Sharp language design team: "I'd never want to use a language that contains all the features I ask for." I think a hugely important part of language design is learning when to say no.
20:26
I've already told you one case where adding warnings is a problem. But look at it this way. Every new language feature that we introduce is a huge amount of work. It takes days or weeks for everyone who uses it just to learn about it,
20:43
to learn how to use it properly. It takes the team's architect that time to learn about it, then he has to teach the rest of his team how to use it. He has to establish team coding guidelines on where to use this language feature correctly, where not to use it.
21:00
The var keyword in C Sharp with type inference when you have local variables and you just want to use var to infer what type it is. In most conferences I go to where I talk about this kind of thing, there's someone who puts his hand up and says, you're from the language design team, could you please tell the rest of my team back in my company
21:23
that they shouldn't be using var? I always disappoint these guys. We wouldn't add var to the language unless we thought it had useful uses, so we never answer them directly. But every team has to come up with its guidelines of whether or not it thinks var is the right thing to do.
21:43
Actually, our guidelines say never use var when you're giving a talk or producing training material, because it's too hard to read. People really need to see what the types are explicitly in order to understand what it's doing.
22:01
Yeah, then we have to write the training material. External speakers have to give talks, write training material, add new chapters to their books for every new language feature. We'll only add a language feature if we think its benefits will outweigh everyone's cost
22:21
to learn the new language feature. As Eric Lippert puts it, every language feature starts out by default with minus 1,000 points. Only if we can see lots and lots of good features to it to bring us above zero will we even consider it.
22:41
So, how does a language mature gracefully? Sometimes people ask us the question, why do you have Visual Basic as well as C Sharp? Why not just ditch one, switch over to C Sharp? Or more generally, why do we continue going with old languages?
23:03
Why don't we wipe the slate clean? This is the American Continental Congress when they were figuring out how to start America. Why don't we just wipe the slate clean? Start with a new constitution or a new simple language that's clean and proper and just has a few well-defined concepts that are suitably expressive.
23:24
My favorite language, Modula-3. About as powerful as C Sharp 1 or C Sharp 2. It had a complete language specification written in English so it was readable. It was just 22 pages long.
23:40
It had modules, generics, type inference, very sophisticated type inference, far beyond anything that we have. Beautiful language, very succinct. It never really kicked off, unfortunately. Actually, why did it never take off? It was based on Modula-2, which was moderately popular,
24:04
based on Pascal, which was ubiquitous in academia. I think a huge part of a language's assets are in the installed code base of people who use it. Their existing training, their existing skill sets,
24:22
their existing code base, their most important asset. It would be terrible for us to go along and screw it up. That's why C is still the most popular language, I think, because, well, there's so much written in it. It's not just the customers and the users' assets that are existing already.
24:42
We have a huge number of assets. IntelliSense, tools, refactorings, IDE support. Those take, I don't know, for a language like C Sharp or VB, I think they take three years to write from scratch if you can look at the previous code base
25:01
and you know what it did and you can re-implement it by looking at the old code base and you're redesigning everything in order to help you with IntelliSense and you have the same team members on the team who wrote them in the first place and are re-implementing them. Three years. Why do I say that? Because at the moment at Microsoft, we're in the process of re-implementing everything from scratch in VB.
25:23
We're re-writing VB in Visual Basic. We're re-writing C Sharp in C Sharp. We're re-writing all of the IDE with a view to making it better. That's how long it looks like taking us. The code name of this project is Roslyn, and a few days ago, we released a new preview of it.
25:44
Another question to ask is, well, if we don't want to ruin people's assets and we don't want to add new warnings or new features that break their code, why not just freeze the language in stasis and make tiny, tiny changes to it, or no changes at all, or only bug fixes, like LaTeX, I suppose.
26:05
I think that's a bad idea, too, because really, our job is to safeguard the user's assets in the language, all of the training and whatever they have. If we leave a language to stagnate,
26:21
we're not safeguarding their assets because all of the time, they're calculating, hey, when do I have to jump ship to the next thing? PHP, JavaScript, Python, I don't know what. If we continue to innovate on the language, make it the best place, make it the place where new things are happening, people will get the confidence that they're backing the right horse.
26:43
They're sticking with the correct language. Also, we have to respond to language pressures. Sometimes, new paradigms come along. LINQ, declarative programming, functional programming, immutable variables, F Sharp type providers that really are a better way of writing programs.
27:03
And if our languages are stuck, imagine if Visual Basic were stuck in the BASIC of 1964. We've only got GOTO and GOSUB. You don't even have procedures. Disaster, it would not go anywhere. You couldn't use it. Language innovations come along that we have to put in
27:21
in order to make the language stay attractive. Here, I put an example actually from Basic. If we strip away the module and the sub,
27:40
this kind of useless junk, and we stick just to the core body of this code, exactly this code runs on QBasic from, what, 1984 or so. It hasn't changed that much. We really have achieved the ability to mature the language gracefully, I think. God, it has been hard.
28:01
I told you that we have 20,000 QA tests in our test suite. Visual Basic started in the early 90s with an implementation. Not a specification, just an implementation. If we want to learn how the language works, we have to look at the source code.
28:20
Of course, you can never understand something by looking at the source code. You can only understand it with unit tests and QA tests and just having an enormous number of people spending an enormous amount of time hammering at it. C Sharp has a much better place to start with. It has a mostly accurate language specification.
28:42
When I say mostly accurate, I think we've had about 150 language bugs filed where the C Sharp specification is incorrect over the past year or two. We're finding all this because we're re-implementing the C Sharp language in C Sharp. And the re-implementers say,
29:02
well, one of our design principles in re-implementing it was let's re-implement it from the spec, not from the compiler source code, because we want to re-architect it properly. And every time that they implement it from the spec and it fails the unit tests, then they say, hey, the spec was wrong. Well, that's where we're going.
29:21
Visual Basic hasn't just weathered the recent changes. It's weathered a huge number of changes: structured programming, type systems, object orientation, events, which are kind of the opposite of object-oriented. They're saying no longer use inheritance, use event handlers. Functional, declarative.
29:41
All of these have been stuck onto it. And we've somehow tried more or less to make it work, tried more or less to stop the corner cases from being aggravating. We've tried to add new features in such a way that they combine productively with what was there before.
30:03
And the big question is, if we want to mature a language gracefully, how exactly do we do it? At this stage of the talk, I'm switching to examples. The rest of the talk will be examples of code to talk about how each piece of code
30:20
illustrates a particular rule of thumb or a particular language design principle. I should say that each piece of code that I put here represents at least one language design meeting's worth of argument. It is argument very often. Normally, it represents many language design meetings worth of hotly contested debate, discussion.
30:43
In fact, I'd say I have as many rules of thumb here as I do have examples. That shows they're not really rules of thumb. Nevertheless, we'll see how we go. I think LINQ is a fantastic example of the best paradigm of adding to a language.
31:04
Why is that? LINQ was tech-transferred from an experimental Microsoft Research project called C-Omega. It was done in Microsoft Research in Cambridge.
31:22
We have such brilliant people at Microsoft Research, and that's going to be a theme in more of what I talk about. Because they do it, they can look at academia and they can see what else is happening there in the research universities. They can figure out how to refine it into a language design.
31:41
They can figure out what are the fundamental mathematical principles underlying it. And if you start from clean mathematical principles, then there's a good chance that they'll just fit together well in the language that you build with it. Then we tech-transfer it from the research into the product team.
32:01
LINQ was mostly... The tech transfer was mostly driven by Erik Meijer and Anders. How did we do it? We added new foundational computer science features straight out of undergraduate teaching material. We added lambdas, extension methods,
32:22
the ability to quote methods for expression trees, all very standard in academia. Then we added new functionality as a library. In this case, we added a set of functions like .join, an extension method called .join, which takes the lambdas.
32:43
So we're using the foundational computer science features in the library. And then beyond that, we add language syntax, which is just a syntactic shortcut for regular language features that invoke the library.
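The three layers described here (foundational features, then a library built on them, then syntax that is shorthand for library calls) can be sketched in a few lines. This is a Python analogue, not C#: the `Query` class, its `where`, `select` and `join` methods, and the sample data are all hypothetical names invented for illustration. The comments show the C# query form and the method-call form it is shorthand for.

```python
class Query:
    """Hypothetical library layer: plain methods that accept lambdas."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        return iter(self._items)

    def where(self, predicate):
        # Keep only the items for which the lambda returns True.
        return Query(x for x in self._items if predicate(x))

    def select(self, projection):
        # Map each item through the lambda.
        return Query(projection(x) for x in self._items)

    def join(self, inner, outer_key, inner_key, result):
        # Pair up items from both sides whose keys match.
        inner_items = list(inner)
        return Query(
            result(o, i)
            for o in self._items
            for i in inner_items
            if outer_key(o) == inner_key(i)
        )


# Made-up sample data.
customers = [
    {"id": 1, "name": "Ada", "city": "Oslo"},
    {"id": 2, "name": "Bo", "city": "Bergen"},
]
orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 1, "item": "pen"},
    {"customer_id": 2, "item": "lamp"},
]

# C# query syntax:   from c in customers where c.City == "Oslo" select c.Name
# is shorthand for:  customers.Where(c => c.City == "Oslo").Select(c => c.Name)
oslo_names = list(
    Query(customers)
    .where(lambda c: c["city"] == "Oslo")
    .select(lambda c: c["name"])
)

customer_items = list(
    Query(customers).join(
        orders,
        lambda c: c["id"],
        lambda o: o["customer_id"],
        lambda c, o: (c["name"], o["item"]),
    )
)
```

Because the top layer only names methods, any type that supplies methods of the same shape could sit underneath it, which is the property the talk goes on to call composing with the rest of the language.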
33:03
Gee, if we could express most of the language in terms of syntactic shortcuts into simpler language features, we'd have a much easier time. We can't. It never quite works like that. Why not? Well, in this example of LINQ,
33:21
we have to add IntelliSense. If it's just a syntactical shortcut to a low-level thing, that totally doesn't help with IntelliSense. We have to architect the compilers to support IntelliSense, really. If we have to make the language perform quickly enough for people who have a thousand single-line lambdas
33:44
in a single piece of code, we really have to optimize it. That might shape what language support we can have for the lambdas. There are a whole bunch of constraints that mean that that's not quite true. But a really important principle is,
34:00
if it's syntactic shortcut and it uses foundational computer science principles, then there's a really good chance that it will compose with the rest of the language. By compose, I mean you can use it. It integrates properly with the rest of the language. And the great thing of adding syntactic sugar, which binds to a blessed language pattern,
34:22
is that in this case, people can provide their own LINQ-like things, like the reactive framework. How many of you have played around with the new async feature? Could you put your hands up? That's not many of you. That's about a quarter, a sixth of you.
34:42
I'm giving a pair of talks on Friday about the async feature. It's a major new change. I think that every single programmer, every single level of programming, every single area of programming will benefit from the async language feature. So I encourage you to turn up to the talk.
35:01
I'm going to gloss over the details of the feature but just talk about what it means with respect to what we've changed in the language. We added two new keywords. One is the async keyword, which says this method is going to be involved in asynchronous computation, whatever that means. And the other keyword is this await keyword,
35:23
which says, hey, something's going to take a long time. Maybe I'm downloading stuff from the network. Maybe I'm loading something from disk. Let's not block the UI thread. Let's figure out how we can release the UI thread and return where we left off later.
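What "release the thread and return where we left off later" means can be shown with a small sketch. It uses Python's `asyncio` as a stand-in for the C# and VB feature, so the event loop plays the role of the UI thread; `read_async`, `handler` and `other_work` are hypothetical names, and the short sleep stands in for a network or disk request.

```python
import asyncio


async def read_async(log, delay):
    # Stand-in for a slow network or disk read.
    log.append("read started")
    await asyncio.sleep(delay)
    log.append("read finished")
    return "data"


async def handler(log):
    log.append("handler: before await")
    # The handler is suspended here; the event loop is free to run other work.
    c = await read_async(log, 0.01)
    # Execution resumes here once the result is available.
    log.append(f"handler: resumed with {c}")


async def other_work(log):
    log.append("other work runs while handler is suspended")


async def main():
    log = []
    await asyncio.gather(handler(log), other_work(log))
    return log


if __name__ == "__main__":
    for entry in asyncio.run(main()):
        print(entry)
```

The log shows the other work running between "read started" and "read finished": the awaiting method gives the thread back and is continued later, rather than holding the thread while it waits.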
35:41
Well, I'll talk more about it on Friday. The important thing is that we added, well, what did we add? How did we add this new feature? We started with tech transfer. Async started with a whole bunch of things. It started with F-sharp async workflows,
36:03
another thing from Microsoft Research that has made its way into the product. It started with the Task type, which also came from Microsoft Research. It came from the Axum prototype that was done by an incubation effort at Microsoft. Incubation is like a startup,
36:20
mini startup company within Microsoft that just explores stuff. It came from the Async Enumerator by Jeffrey Richter and the Robotics Runtime, CCR, done in the robotics group in Microsoft. Yeah, we take all of those ideas and we figure out, some of them already provided us the foundational computer science things.
36:40
In other cases, we figure out, yeah, this relates to a foundational computer science concept that's been around since the 50s, I think, called call-with-current-continuation, call/cc. Everyone who has written Scheme knows about that. There's no way we could implement the full feature like we did with lambdas. We had to implement a simplified version of call/cc
37:04
that's appropriate to this particular, that does enough for what we're looking for here. After that, we added a bunch of library features, and after that, we added syntactic shortcut, like await is syntactic shortcut
37:21
for calling those library features. Actually, that's not quite true. It's not purely syntactic shortcut. It became very clear that there were some things that could not be done purely as syntactic shortcut and did require a fundamentally new control flow primitive that we added.
37:41
But in all, I have to say that we followed pretty well the same underlying principle. User reaction to this has been, actually, let's step back. User reaction to LINQ was a bit dubious at first. They said, LINQ, I can see how this relates. I'm used to using SQL.
38:01
I'm going to use LINQ well. But lambdas, oh, they're a bit esoteric. Most programmers won't know how to use lambdas. Two years on, they're saying, hey, I love lambdas. For async, we're getting lots of debate. Oh, is that the right thing? Oh, this is complicated. We're also getting some positives,
38:22
like people saying, I refuse to upgrade. Oh, we released Visual Studio 2010. We released async first as a community technology preview so that people could see it, try what it's like. That was an important thing, to get your feedback, to persuade Microsoft that it was a good thing
38:41
by the people in the public saying it's a good thing. Then we can go to management and say, hey, everyone likes it. Also, it's a completely new thing that isn't yet part of undergraduate training materials. It's not like anything you've ever seen. And we wanted to get it out there and learn from you to make sure we've shaped it in the right way.
39:04
Well, the response was mixed. Some people saying, oh, it looks complicated. Oh, it doesn't do what I think it does. Oh, you chose the wrong keywords. Other people saying, hey, I love it. Other people saying, hey, I love the async CTP so much
39:21
that I'm not going to upgrade because that would break the CTP. It doesn't break it anymore. We fixed that. But there was a lot of passion around it. Since too few of you have done async, I won't talk to this slide because it doesn't get in.
39:43
But let's talk about some of the keywords we added. We added this async keyword. We did user groups. People said, oh, async. I know what async means. It means that the code runs on a background thread so that it doesn't interfere with the user interface. Nope. That's not what we used it to mean.
40:01
We used it to mean something different. We used it to mean something closer to its Greek etymological roots. I'll talk more about what that means. We thought that it's worth making a feature that's slightly harder to pick up on hour one or day one of your use in return
40:21
for a feature that's more useful to you in the three years after it's been introduced or two years or one after you've been using it. You're using it regularly. We designed for the long haul, not for the immediate ease of use, I think. Pretty importantly, we have this keyword.
40:43
C equals await f.readAsync. What that means is first it executes the library routine readAsync. In this case, it probably kicks off a request maybe to a network or something. Then the await keyword makes it block
41:02
until the network request has come back. Actually, it doesn't make it block. That was a common misconception. It says do that and then schedule the rest of my method to continue after the data is available. People are saying, that's the wrong keyword.
41:20
You should have done a different keyword. You should have called it yield until f.readAsync. That would represent more clearly what it's doing. It's yielding until the result of that thing is ready. Or maybe they said it should have been f.readAsync and on. And on means continuing with. Or maybe we should have done await with open parentheses
41:42
then f.readAsync. We didn't. Why didn't we? We tried to pick syntax that's exactly the right size for what its job is. It's neither too short, which would make people use it too frequently, nor too long, which would make people use it too rarely.
42:06
Here was an example that someone wrote to me. He said, can you add this yield each keyword to the language? Let's talk about what that is. Iterators. They've been in C Sharp for a while. We added them to Visual Basic in this release.
42:21
Visual Basic has a slightly different syntax for them. It has this iterator modifier on the methods to say, hey, this method is going to be an iterator method. And then Visual Basic just uses the yield keyword. C Sharp calls it yield return. Do you know why C Sharp called it yield return instead of just yield?
42:43
It's because it had to. It's because if it was just called yield on its own and C Sharp doesn't have this modifier, then it would have been a breaking change. If someone already had a variable called yield or a type called yield, it would be a weird type,
43:03
but they might have. Therefore, they couldn't have done it. Therefore, they had to have a compound keyword, yield return. In general, we can always add compound keywords, but we can never add single keywords to the language unless, like we've done here in Visual Basic, we added this modifier to the method.
43:23
Visual Basic has a different syntax. People asked us, hey, if you're doing C Sharp iterators again, would you have done this syntax like this? Probably, yeah. We think it's a nicer syntax having the modifier up there because it tells people up front what kind of method they're reading. And it lets you get by with a simpler keyword.
43:42
Visual Basic also, because it was done more recently, it was implemented at the same time as Async was implemented, it uses the same compiler implementation under the hood as Async did. Therefore, it was better featured. That's why VB allows you to have iterator lambdas, allows you to have yield inside a try-catch block,
44:02
which C Sharp doesn't do because it had attempt one at the implementation. If we could redo it from scratch, would we change the implementation of C Sharp to be closer to VB? Probably, yeah. Although it would be slightly slower in the common case.
44:21
That's why we haven't done it. Actually, we still have outstanding bugs in C Sharp iterators because the C Sharp implementation is just so complicated that we never figured out how to implement them. Then, when I went to implement the Visual Basic version of iterators, the QA team ported over the C Sharp unit tests of iterators over to Visual Basic.
44:41
They said, hey, it's not behaving the same. I thought, oh no, what's wrong? I looked through it carefully. It turned out it was initially a C Sharp bug because it doesn't quite do finally blocks correctly in all cases. Just too difficult to change. Anyway, someone asked us. They said, I hate having to do this for each thing.
45:01
Here I'm just iterating over a binary tree. For each child in the left branch, yield that child. Next. Why, he said, can't we just have a built-in thing yield each which takes an IEnumerable and yields each of them, which is just syntactic shortcut for the other thing. F Sharp has this syntactic shortcut.
45:23
We didn't do it because the cost of this is quadratic. We think, don't add a language syntax for something that performs badly because if it's in there in the language, people will use it. If it takes a few characters, they think it's quick to implement.
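The cost being described can be made concrete with a small sketch. It is written in Python rather than VB or C#, and `Node`, `in_order`, `left_chain` and the `counter` argument are hypothetical names for illustration. Each level of the recursion re-yields every value coming up from below, so a value at depth d is yielded d + 1 times. On a completely unbalanced tree of n nodes that adds up to n(n+1)/2 yields, which is the quadratic worst case; a balanced tree would cost roughly n log n.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node, counter):
    """Yield values in order; counter[0] tallies every yield at every depth."""
    if node is None:
        return
    for v in in_order(node.left, counter):
        counter[0] += 1
        yield v  # re-yield each value from the left subtree
    counter[0] += 1
    yield node.value
    for v in in_order(node.right, counter):
        counter[0] += 1
        yield v  # re-yield each value from the right subtree


def left_chain(n):
    """Build a degenerate tree in which every node has only a left child."""
    root = None
    for i in range(n):
        root = Node(value=i, left=root)
    return root


counter = [0]
values = list(in_order(left_chain(100), counter))
total_yields = counter[0]  # 100 values cost 100 * 101 / 2 = 5050 yields
```

A one-keyword `yield each` would hide those inner loops without removing the work they do, which is the concern raised here about short syntax for something that performs badly.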
45:44
For goodness sake, people still think that X plus equals one runs quicker than X equals X plus one. No, it doesn't. Actually, it runs slightly slower if it's a dynamic thing. That's because in the dynamic case, it has to check if it's an event
46:02
because event plus equals has a different meaning from integer plus equals. Anyway, that's a side topic. But it's a very clear user perception that the shorter the syntax is, the faster it performs, and we can't encourage that if it's a bad thing. INotifyPropertyChanged.
46:21
Everyone's been asking us for some kind of auto notify keyword because they hate having to write boilerplate code all of the time. I'd love to sit here and say, hey, we've implemented it, but I'm not going to. We've been trying so hard. We have not figured out, for goodness sake, how to implement this properly. We've not figured out how to implement it using foundational computer science principles
46:44
in a way that extends nicely into the language, in a way that isn't totally tied just to this one particular example. If we added a feature that was tied to INotifyPropertyChanged, it would be terrible because then another framework would come out that has IDifferentNotifyPropertyChanged,
47:02
and the language could no longer work for that. WPF came out with dependency properties, and our prototypes for auto notify properties did not work with WPF dependency properties. We escaped that one. It's lucky that we didn't do something until too late.
47:23
People say, what if we did macros? Couldn't we do macros to support this? Like C and C++ have had macros for ages. We are so reluctant to add macros. If you look at people using C++ and template meta-programming, every single C++ program you look at is its own domain-specific language.
47:45
People abuse macros and templates in order to create something that looks and feels like a different language. We don't want to go there. We want to make something in C sharp and VB that everyone who is an expert in C sharp and VB can read without having to learn an entire different idiom or pattern
48:04
that's arrived through the use of macros. Here's another example of picking the right syntax, which we have failed to do. In F sharp, if you write let readOnly equals 15,
48:20
or let x equals 15, it creates a readOnly variable by default. In F sharp, if you want to have a mutable variable, like we're familiar with, you have to stick extra typing in. We're developers, we're lazy, we don't like to do the extra typing, so we will, if we have a choice in F sharp,
48:41
make things immutable. Immutability is a better way of programming. It leads to drastically fewer bugs. Actually, when we're rewriting VB in VB and C sharp in C sharp in the Roslyn project, all of our data structures are readOnly data structures,
49:04
just because we know that readOnly is the only sane way to write large projects with few bugs and that work in multi-threaded environments. F sharp, in this case, has what we call a pit of success. We've made it easy for programmers
49:22
to stumble into the correct way of doing things. In VB and C sharp, in the case of readOnly variables, unfortunately, we have not done that. This is how you write a readOnly property in Visual Basic. This is how you write a readWrite property in Visual Basic. We've made the wrong thing easy.
49:41
Well, we're still working on this to see what we can do better. If you have any ideas, please, please, please tell us. Async, I don't want to talk in detail about this because I'll leave it till Friday.
50:00
I sent around an internal memo saying, void returning async methods considered harmful. As we looked through the feedback we got from user talks and so on, we discovered that this particular corner of the async feature had far more than its fair share of warts. We had so many language design meetings.
50:20
We tried to look for principles that could guide us on what to do. In the end, we had no principles here. We just had to say, look, here are 20 advantages to this feature. Here are 20 disadvantages to the feature. Which one outweighs the other? Well, hey, come on Friday and you'll see which one won.
50:42
Sometimes we just have to fall back on this list of pros and cons and rely on gut feeling. Here was another example. I said, start with a foundational piece of computer science and put it into the language. Here's a cautionary tale.
51:03
I'm talking about nullables here, and I'm using VB because VB nullables are slightly different from C sharp. The academic principle of nullables was for each type T, we lift it to this nullable form, T question mark,
51:20
and we have this new value, nothing, or null in C sharp, and like in SQL, it means I don't know. If you say, dim X as nullable integer equals nothing, it means, hey, X is an integer, but I don't know which integer X is.
51:42
Then, again from academia, we lift each operator from, if we had an operator plus, which goes from a type T plus type U gives a type V, then we automatically generate a lifted form, T question mark plus U question mark gives a V question mark.
52:01
In this case, integer question mark plus integer question mark gives an integer question mark, and if either of the operands were nothing, if you have I don't know plus five, what's the answer? I don't know. That's why the answer of nothing plus five is nothing,
52:22
or null plus five is null in C sharp. Okay, that makes sense. Visual Basic took this to its clean mathematical extreme. It said, let's lift all the operators. Let's lift the equals operator. So, if Y equals nothing, then WriteLine "nothing".
52:43
What does that do? If Y equals nothing, the value of Y is I don't know. If I don't know is equal to another variable whose value I don't know, then do something. Is a variable whose value I don't know equal to another variable
53:02
whose value I don't know? Answer, I don't know, because we lifted it for the equals operator. If I don't know, then print nothing. Should we print that thing? No, we shouldn't, because we don't know if it's true or not. So, in Visual Basic, this will not print anything. In C sharp, the same code will print something.
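The lifting rule and the two equality behaviours can be sketched side by side. This is a Python model, not VB or C#: `None` stands in for Nothing or null, and `lift`, `lifted_add`, `vb_style_equals`, `csharp_style_equals` and `branch_taken` are hypothetical helper names. It assumes, as described above, that the VB comparison is lifted like every other operator while the C# comparison gives an ordinary true or false.

```python
import operator


def lift(op):
    """Lift a binary operator on T, U -> V to one on T?, U? -> V?."""
    def lifted(a, b):
        if a is None or b is None:
            return None  # "I don't know" in, "I don't know" out
        return op(a, b)
    return lifted


lifted_add = lift(operator.add)

# VB-style equality: lifted like every other operator, so the result may be unknown.
vb_style_equals = lift(operator.eq)


def csharp_style_equals(a, b):
    # C#-style equality: always a plain True or False, even for two nulls.
    return a == b


def branch_taken(condition):
    """An If statement runs its body only when the condition is definitely true."""
    return condition is True


y = None
vb_prints = branch_taken(vb_style_equals(y, None))          # False: nothing is printed
csharp_prints = branch_taken(csharp_style_equals(y, None))  # True: the line is printed
```

The arithmetic case (`lifted_add(None, 5)` is `None`) is the part most readers find natural; the difference only shows up once the same rule is applied to a comparison used as a condition.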
53:25
I think that was a design mistake in Visual Basic. We went too far for the mathematical neatness. What we should have done is say, if 99% of users look at this code and expect it to behave in a certain way, it'll jolly well behave in that way. That's what C sharp did.
53:41
That's what we should have done. Warnings were a bit similar. I talked about the value of adding warnings. The rule of thumb we tried to have is, obviously, we don't break existing code as much as possible, but if we're designing a new feature and we're designing whether to add a warning for it,
54:01
if users write something and 95% of the time it's probably an error in their code, then that's a fine place to give a warning. But if there are 20% of users who write it for a legitimate case, then we wouldn't add a warning for it, because we can't hurt the legitimate use of a feature at the cost of saving some other portion of users
54:26
from tripping over themselves. We'd love to add more warnings to the language. We really would. But as I said, we have these people who have slash warn as error. Gee, that's given us such a headache,
54:40
because what warn as error has done is, if you do treat all warnings as errors, you have opted your project into all future warnings that we might have. In effect, if you check that checkbox, you're saying, hey, when I leave my company and I leave my code base behind, the next person who takes over is going to take what I left him.
55:02
In other words, opting in to every single warning is going to opt in to every single dodgy piece of code that I wrote. That's a difficult thing to do. We shouldn't have this forced buy-in for all future things. So we thought, ah, we've already got slash warn as error.
55:23
Let's add another feature which says, hey, although you had warn as error, let's exempt certain warnings from that. But then we think, hey, what if we don't like that? What if certain warnings can't be exempt? What I've done feels quite a lot like
55:41
the science fiction books I used to read that were written in the 70s. The hero and the heroine are trapped in the airlock by the evil villain. And the evil villain has programmed in an override. And they say, computer, open the airlock. The computer says, I'm sorry, I can't open the airlock.
56:00
They say, computer, override the block. I need it to be open. The computer says, I'm sorry, the override has been disabled. They say, computer, override the override. The computer says, okay, the override has been disabled. They say, computer, open the airlock. The computer says, okay, I've opened the airlock.
56:23
What they've got is a measure, a countermeasure, and a counter-countermeasure. If we get that far, we've made the wrong language call. We need to stop it there. If we've got too many countermeasures, we've just gone down the wrong path. How much time do I have?
56:41
Very little time. Let's wrap it up. The library principle. Question. How many times is each framework used? I reckon the average number of uses for each framework, if we rule out the .NET framework and we rule out the C++ standard library,
57:03
is probably close to about 0.9. In other words, I believe more frameworks are written than are actually used. I think we as computer scientists have a huge, huge tendency to write frameworks, and as language designers, that's even worse because we have the huge tendency not to write frameworks, but to write language features.
57:23
The principle I try to go in with is this. If you want to do something, do it as a documentation page that explains how to do it. If you can't do that, do it as an example that they can download and see how it works. If you can't do that, do it as a template.
57:41
Is it like a file new project template that people can use? If you can't do that, do it as a library that they can download from NuGet. If you can't do that, put it into the framework. If something can't be done within the .NET framework, then, and only then, should you consider adding it to the library.
58:01
Well, to the language. All of this is strong, strong pressure not to add to the language, and we get that a lot. People saying, for goodness sake, why do you keep adding to the language? Every addition you make is crazy. I used to know C Sharp in 2004. I don't know it anymore.
58:21
As I've talked about, I think we have to keep adding. I picked this photo. It's an illustration of the Library of Alexandria, the great repository of all of the knowledge of the entire world at the time. It was unfortunately burnt down. People had to start from a clean slate. I've told you why I think starting from a clean slate
58:42
is a really bad thing in computer languages. And I've told you why despite that we try not to add it, and if we have to add something, I've told you what principles we use that I think will make it safe to add.
59:00
Thank you. Now, I've almost run out of time. I have run out of time. If you have questions, please come up here. The next talk starts in 20 minutes, so there's time. I'll be talking a lot about Async on Friday, and I'll be around all day on Friday
59:21
if you want to come find me somewhere in the hall. Thank you.