C# Language Internals
Formal Metadata
Title: C# Language Internals
Number of Parts: 133
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifier: 10.5446/49606 (DOI)
NDC London 2016, talk 112 of 133
Transcript: English (auto-generated)
00:06
All right, good afternoon, everyone. Welcome to my talk about C-Sharp language internals. How many of you were in my talk yesterday about C-Sharp 6? It's going to be a very similar recipe today. We're going to take a look behind the scenes.
00:21
We're going to take a look at the code that the compiler generates for various features. And we're also going to take a look at some of the Roslyn source code. Roslyn is the former code name of the compiler as a service, which means that all of the compiler source code of C-Sharp and VB is now available on GitHub. So there's no secrets anymore on how all of that stuff works.
00:42
And today, the goal is really to sort of stimulate your curiosity about how things work. I've always been born as some kind of a curious guy. And I think it's pretty important for good developers to kind of understand what's happening below your feet. Like when you're writing some piece of code and you're leveraging the latest and greatest
01:01
piece of language features available, that you know what the performance profile is, how it works, that kind of stuff. And it will also help to impress junior people on the team. OK, so let's actually get started. The formula for this type of talks is that I present you with a menu. And I don't know where we will end up,
01:20
because you get to choose what we'll talk about here. So I think we'll have time for four pieces. Otherwise, we'll sort of get some indigestion if we do a little bit too much here. But I have many of those talks online. So you can find other talks that will fill in the gaps for things that I can't talk about here today. I also have a couple of courses at Pluralsight
01:42
that talk about C-Sharp language internals. So you can actually watch those two courses over there. Each of them is like four and a half hours. So you will have nine hours of compiler internals fun. That was before the whole thing was open sourced. So that's just by looking at compiling something and looking at the generated code in the compiler. Today, I may do a refresh of those language internals
02:04
presentations online as well, using Roslyn as the source code to sort of really drill into the details. So I have a couple of appetizers, a couple of starters, a couple of mains, a couple of desserts. And I think the main, the regular order of those things is, of course, to start with the starters.
02:21
So I have two of them there that you can choose from. I have something about Array Initializers, a C# 1.0 feature, and Mysteries of Transparent Identifiers, which is something that was introduced in C# 3.0. And it may be really mysterious because you've never heard about it. So which one should we go for?
02:45
Identifiers? What's the question? That's going to be later. Yeah. That's the starters, you know. We still are sitting at the table and we sort of need to warm up a little bit. So we'll start with the appetizers, okay?
03:02
Okay. I see. Okay. Appetizers, so transparent identifiers it is? Okay, sounds good. Who knows what a transparent identifier is? Okay, good. All right. It's something that was introduced in LINQ,
03:20
Language Integrated Query. And let me actually go to the language spec and search for transparent identifier here. And there we go, Section 7.16.2.7, transparent identifiers. Clearly, it has to do with query expressions on page 215. By the way, you can find the spec on your installation of Visual Studio. It's just sitting in one of those folders there
03:42
under VC#, Specifications. You can actually find this document. It's also online, so you can just get it there. Let me just try to go to transparent identifiers here. And what's a transparent identifier? We'll take a look in the code in just a second. But you see over here, it says that certain translations of query expressions
04:04
actually inject so-called range variables with transparent identifiers, which in the spec are denoted by an asterisk. And then let me actually go for an example here. You will actually know, most of you likely know how LINQ translates, right? If you use the from keyword, the where keyword,
04:22
the select keyword, what it does. Everyone has some good notion of what's going on here. So if you take a look over here, 7.16.2.2: from T x in e. Not that many people know that you can write that in LINQ, actually, it's one of those corner cases. But you can write from int x in some collection,
04:43
which will insert some kind of cast. And so you will actually see that that recursively translates into from x in e.Cast of T, where T is the type you specify. So you see translation patterns here in the specification. And that's the same for things like where and select. For example, over there in the middle, you see from customer c in customers where c.City is London.
05:02
Very appropriate. I didn't change it before the talk, but it was always London in the spec. And then at the bottom, you see customers cast to Customer, Where, c goes to c.City equals London. So that's like the translation pattern. Now for some mysterious reason, in certain cases, we need to translate those things using a star.
05:22
And that's where it gets mysterious. From x1 in e1, from x2 in e2, do something. A lot of people will likely know that double, you know, multiple from clauses result in a SelectMany. SelectMany is kind of LINQ's way of doing a join between collections, if you will. More like a Cartesian product, if you will.
05:42
But you see like how this sort of translates here. And you can already read some interesting properties out of this. Like the select many here will have a collection selector, which computes E2 and it depends on X1. So that means you can use X1 in expression E2. Your second collection can be dependent
06:01
on the value of the outer collection. So you can actually have something that, for example, can sort of traverses an object relational kind of link. And you can sort of go from products to suppliers, for example. So that's why that works. But now if you take a look further down, you have things like the let keyword over there as well.
06:23
And you again see that mysterious star appearing on the left-hand side. And those are the so-called transparent identifiers. So what does this thing do? And I'll show you in the code in just a second. We'll focus on the let clause because the select many is very similar but it's just a little bit more verbose to look at.
06:41
If you take a look at this let clause, it allows you to introduce intermediate variables, right? Let allows you to abbreviate certain things that you don't want to type many times inside your query expression. So let me give you an example. And because we're sort of dealing with internals here, let me just do everything in Notepad today.
07:03
So here we go. Just an empty skeleton here. Let me just put four spaces on the clipboard because that will become my tab key. It's the way I do things. So let me just create a collection called xs. It has two, three, and five in it. It doesn't matter what's inside. I should have pasted twice, sorry.
07:22
Var ys equals from x in xs. And now say, well, I need to do a let y that depends on, oops, x times two. And I want to have something like z which is y plus one. And then ultimately I want to do a select of x times z minus 42 divided by two or whatever.
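Roughly what's being typed in Notepad at this point, as a sketch:

    // (assumes using System.Linq at the top)
    var xs = new[] { 2, 3, 5 };

    var ys = from x in xs
             let y = x * 2      // intermediate value introduced with let
             let z = y + 1      // another let, depending on the previous one
             select x * z - 42 / 2;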
07:41
Like it doesn't matter, it's a language demo. Actually that can sort of be simplified; it's not meant to be optimized code. But like, it doesn't mean much. But now if you sort of think about this translation pattern, what do we have to do? Where do we store those variables y and z
08:02
inside my query expression? Well, if you think about the select, if there would just be a select you would know what's happening. It would be xs dot Select of x goes to something. Let me use y here because otherwise there was no reason to introduce y. But like, how do we do this
08:21
with all those let things inside? And that's where transparent identifiers come in. If you go back to the spec, it basically says over here, somehow in a mysterious way, we want to select from star in e and that's not a star of SQL, right? It's not like select star, it's something different. Notion of a transparent identifier.
08:41
And then you can kind of infer what's happening on the right hand side here. Take a look what's happening there. X goes to new x comma y equals f. We're sort of creating an anonymous type that sort of creates a tuple, a pair of everything that's in scope right now. And then that anonymous type becomes star.
09:03
What does that mean? If you go one level further, say that in the dot dot dot, you're referring to x or y, we have to traverse to x and y properties of this anonymous type. So let me now do the manual translation in Notepad and you will see what I mean.
09:21
The compiler will translate this whole thing here, and let me call it zs, because otherwise I have naming conflicts here. We will translate this first let into Select of x goes to new x comma y equals x times two. The next one will become a Select of t
09:40
and t is a transparent identifier. I have to come up with names now. That's something the compiler will do. It will come up with names for those tuples, those pairs of things. And so what it's gonna do now is it's gonna say, what's gonna be in scope later? Let me actually use x over here as well to make it a little bit more interesting.
10:00
Well, we wanna have everything we had so far, namely x and y available downstream in my query expression. But in addition to that, I also wanna have z available and z is actually y plus one. Where do I get the y from? By dotting into t onto y.
10:20
So it will translate it into t dot y plus one. Then how does the next select work? Well, I could also have used the where here, where larger than something. Then let's ignore what's next here, whatever. Select one. So like here, I wanna have a where.
10:40
How do we translate this where? Well, what's gonna flow into the where is gonna be an instance of that anonymous type that I projected in the last select. What's that gonna look like? It's again gonna be some compiler made up name to refer to that thing. That star in the spec means dear compiler, make up a name to refer to that whole scope, if you will,
11:05
of variables that are in scope right now. And so here I need to do x plus y times z minus 21 is larger than zero. Let me first write this thing. And you will see, of course, this can't compile possibly. And then let me do select at the end of t goes to one.
11:21
This can't compile because where does x, y and z come from? Well, x comes from whatever t has and where does x sit? X sits in this t, but underneath the t to get to the x over here. So it becomes t.t.x, y becomes t.t.y, z becomes t.z.
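Putting that manual translation together, roughly (a sketch; the compiler's own names for the t's are unspeakable, as shown below):

    var zs = xs
        .Select(x => new { x, y = x * 2 })           // first let: pair up x and y
        .Select(t => new { t, z = t.y + 1 })         // second let: wrap that pair and add z
        .Where(t => t.t.x + t.t.y * t.z - 21 > 0)    // x, y, z reached through the transparent identifiers
        .Select(t => 1);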
11:43
So that's how your query translates. All those t's in here are transparent identifiers that the compiler made up. Now, what's the lesson out of this? This query looks quite innocent with all of its let clauses, but for every value that comes out of xs,
12:00
you will allocate no more and no less than two anonymous type instances, which are heap allocated classes, which put pressure on the garbage collector. And in fact, if you take a look at the Roslyn source code, the compiler itself tries to avoid LINQ as much as possible for critical pieces of code. Of course, it doesn't matter
12:20
if this is a LINQ to SQL query and it will execute in a SQL database, those allocations never happen because it translates the query to some SQL language or to whatever language your LINQ provider supports. If it's a local thing, then it can become an issue in certain cases, because we are sort of hiding here the cost of sort of allocating all of those things.
12:42
So you may want to start off with that and when you profile your application and you find that that's a bottleneck, you know, the number of allocations of anonymous types, then you may want to do something else here, okay? But so very, very simple, simple example of transparent identifiers. Now, for those of you who are interested in finding out how all of this stuff works,
13:02
well, first of all, let me comment everything out here and compile this whole thing to show you those transparent identifiers. So I'm just going to compile it here. Who wants to, whoopsie daisy, yeah, using system.link, obviously. And I like things to be alphabetic. That's my OCD, so here we go.
13:23
Who likes to use ILDASM? Who likes ILSpy? Who wants ILDASM? ILDASM? Okay, good. Then we'll go hardcore today. I like it. Here we go. You see indeed two anonymous types and you see that they have weird names in them.
13:41
The first one is a tuple of X and Y. The second one is one that has a transparent identifier in it. And here it's just called transparent identifier zero and Z. So that's kind of that mythical value that that's my T that I had over here. This transparent identifier is the name
14:00
they came up with for this T. It's transparent identifier zero. We have some weird characters in front of it. The reason we put weird characters there is to make it such that you can't possibly type this in C-sharp yourself. Those variable names are untypable. Like, you know, you can't have them because they will not pass the lexer phase.
14:20
They will say less than is not expected here. And then if you take a look at demo over here and the main methods, where I have all my stuff sitting, I told you if you ask me to do ILDASM then you're up for this kind of torture for the whole day. But like, you know, things you see over here is all of those anonymous types.
14:41
So, for example, you see here at the very bottom that we have a call to select, which actually has a function that maps an anonymous type over here, which you can see the whole shape of. It's an anonymous type that nests an anonymous type inside of it with two ints and another int. That's X, Y, and Z being bundled up in some data structure.
15:01
And you see it projected to an int32. That's because I'm doing select one at the end. So that's the nesting that's sort of happening here. You also see the where method over there. The where method is also doing very similar things. I'm not even sure, like, what it would look like in ILSpy, so let me just open it over there as well.
15:20
If they do a good job decompiling stuff, it should look like my original link query. If they fail decompiling this, it will be much more interesting. And clearly they fail decompiling it, so it's more interesting. So we could start digging into those things, and one of those things is going to be whatever where takes in over here,
15:42
the where filter where I have all of that t.t.x and all of that kind of stuff. So if I try to go here, you actually see, if you squint, this is t.t.x plus t.t.y times t.z minus 21 larger than zero, okay? It's just the names are a little bit more heavy weight
16:03
than just t, okay? So that's the appetizer. You may already have an indigestion, but I'm sorry. You know, you've got the whole a la carte menu, so you get what you ask for, okay? All right, let's go to starter. Everyone warmed up now? Yeah, bumped up?
16:21
Okay, good. Starters. We have the case of the switch, we have event synchronization acrobatics, we have efficient foreach loops and expression tree essentials. Expressions or foreach? Expressions. There's more for expressions than foreach? Okay, let me just briefly say
16:43
what the other things are about, because you know, you can do some research by yourself. If you do a switch and your switch has a bunch of labels in it, like you know, for example, switch zero, one, two, 17, 18, 19, 101, 102, 103, you have clusters of switches,
17:01
clusters of values. The CLR has a jump table construct called the switch instruction. And so what the C# compiler is gonna try to do is actually generate a binary decision tree to actually use that switch instruction as much as possible, but without too many holes in it. So if you have a switch zero, one, two, and then whatever the values are that I said,
17:21
you can patch up the switch table with a whole bunch of gaps in it, but that's very inefficient because your code gets very big. So what the compiler will do is build a binary decision tree, you can find that code easily inside the Roslyn compiler source code. Event synchronization acrobatics, very briefly again, if you have an add and a remove handler
17:41
that you didn't specify yourself, the compiler will insert some synchronization algorithm there so that your adds and removes are actually thread safe. You will actually find something like a while loop with, you know, Interlocked.CompareExchange to make sure that you don't lose event handlers when you hook them up and remove them. Before C# 4.0 it was actually using a lock
18:00
on the this instance, which is kind of a big no-no. So that has been improved. Efficient foreach loops, very briefly: the compiler disregards the fact that the specification says that every foreach loop sort of uses the IEnumerable interface or the IEnumerable pattern. So it recognizes things like if you foreach over a string,
18:20
it will build a regular for loop that indexes into the string, character by character. If you iterate over, say, an array, it will become the equivalent of writing for int i equals zero i less than the length i plus plus, that kind of thing. So if you write a foreach over a regular array, it's still going to be as efficient as, you know, using a manual for loop there.
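For example, roughly how the array case gets lowered (a sketch):

    int[] values = { 1, 2, 3 };

    foreach (int v in values)
        Console.WriteLine(v);

    // is lowered to roughly the same IL as:
    for (int i = 0; i < values.Length; i++)
    {
        int v = values[i];
        Console.WriteLine(v);
    }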
18:41
And then finally, if you have enumerators that are implemented as structs over there, we'll try to avoid boxing. So if you have to call the MoveNext method and all that kind of stuff, it will not do any boxing there. That actually works for a lot of collection types in the BCL, which return enumerators that are struct types. So let's take a look at expression tree essentials.
19:02
Expression trees, as you may know, were actually also introduced in C# 3.0, and are this thing that actually enables you to write a piece of logic and capture it as data structures at runtime. And what do things like LINQ providers do? They take those data structures at runtime and translate them,
19:21
translate them to some foreign language like SQL or what have you. Kind of glad that we started with transparent identifiers and that we also talk about LINQ. Because I can actually show you expression trees now by simply going over here and putting AsQueryable, and all of a sudden we're talking about expression trees. Now, why is that?
19:41
Because the translation of this query now is no longer gonna have xs be IEnumerable of T, but xs will be IQueryable of T. What's IQueryable of T? It has exactly the same API surface as IEnumerable of T in terms of extension methods for LINQ, like Select and Where and all of that stuff. The only difference is that the signature is slightly different.
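Roughly the two Where signatures he's about to contrast, paraphrased:

    // Enumerable: the predicate is a delegate.
    public static IEnumerable<TSource> Where<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate);

    // Queryable: the predicate is an expression tree wrapping that same delegate type.
    public static IQueryable<TSource> Where<TSource>(
        this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);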
20:03
How is the signature different for those guys? It's different in the sense that it takes in expression trees as opposed to lambdas. Let me show you that first in System.Core. If I don't put AsQueryable there, which is just to show you the difference here.
20:21
If you don't put askQueryable, let me just find the where method because for some reason most people got the where method the easiest. If you take a look at the where method here, you see it takes in an IEnumerable, it returns an IEnumerable, and the filter is taken in as a func of TSource to bool. Very simple thing. Now, if I go to the isomorphic API,
20:42
has exactly the same shape on IQueryable of T, you will see the same signature modulo one little difference. And the difference is gonna be that predicate. Just keep your eye on that corner and I'll try to sort of do a non-jarring motion here to get to the Queryable equivalent so that it looks exactly in the same spot.
21:01
So, focus on that predicate on the right-hand side and let me just go all the way to the right here. Whoops. You've seen the difference. It's now an expression of func of TSource to bool, which means if you assign a lambda expression to the expression tree, the compiler goes down a vastly different code path to actually represent your code no longer as a piece of IL,
21:24
but as a piece of IL that at runtime will construct a data structure that captures the original intent of the user. Did you follow that? Okay. That's called quotation. Let me now show it outside the context of LINQ to sort of show you the difference and then we'll take a look at the Roslyn compiler source code
21:41
on how it actually does that. So, let me now simplify this demo. Let me just create demo two here by copying demo.cs to demo two.cs, notepad demo two.cs. Over here. There we go. I need to add one more thing.
22:01
Expression trees are defined in System.Linq.Expressions. Over here. And now let me do the following. Let's say I want to have a Func of int. Let's do the very simple thing, the most simple thing you could possibly imagine. Let's say I want to have a Func of int called f, and I want to do nothing goes to 42.
22:21
Function that always returns 42. Very nice. Now, I can also do the same thing, say func of int wrapped in an expression, call it e, and do nothing goes to 42 as well. What's important here, well, actually let me try to obey to my OCD over here a little bit more like this.
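Roughly what ends up on screen, equal signs aligned (a sketch; assumes using System and System.Linq.Expressions):

    Func<int>             f = () => 42;   // a delegate
    Expression<Func<int>> e = () => 42;   // an expression tree, same right-hand side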
22:40
Whoops. Okay. You see clearly what's happening here. The syntax on the right is exactly the same. Actually, if you would take a look at my source code, it's always like this. Like, you know, I try to sort of align all the equal signs in my code. People hate me for that, but whatever. But so if you take a look here,
23:01
the right-hand side is exactly the same, and that's a property that we refer to as homoiconicity. We didn't invent that. It's just a Greek word, homoiconicity. And that's something you can also look for, quotations, which are seen in LISP, introduced in 1958 over there, the notion of quotation.
23:20
So we're sitting on the shoulders of giants here. But the right-hand side, you don't notice what's happening. It's the same with the LINQ query. You write exactly your same LINQ query, but in one case, it's assigning predicates and selector functions and all of that kind of crap to functions. In the other case, it's assigning them to expressions of functions, expressions of delegate types.
23:41
Now, the code is going to be vastly different for those two cases. The first case, the compiler is literally going to generate a method that returns 42. It's going to construct a delegate that points to that method. In the second case, it will translate it to something vastly different. So let me actually show you that now over here.
24:03
Again, you know, show of hands, ILDASM or ILSpy? ILSpy, oh, okay. We sort of become more like, you know, mellow already. Okay, fine. You know, so demo two will go in ILSpy. Let's hope it can't decompile it,
24:21
but I have faith that it's not that good. Indeed, it's not that good, as you can see over here. So my very first line has turned into this mysterious blah. What's this mysterious blah, by the way? When you create lambdas and you do something like, you know, nothing goes to 42, assign it to a variable,
24:40
it's actually going to create a static somewhere that will actually cache that delegate, because you may want to use it multiple times. So that's this null checking pattern. If we don't have this thing yet, then we will actually create it, okay? So that's the compiler optimizing your code so that you don't create new delegate instances every single time you go there.
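A rough sketch of that caching pattern (hypothetical names; the real field and method get unspeakable compiler-generated names):

    private static Func<int> cachedDelegate;      // compiler-generated static field

    private static int GeneratedBody() => 42;     // compiler-generated method holding the lambda body

    // at the use site:
    Func<int> f = cachedDelegate ?? (cachedDelegate = new Func<int>(GeneratedBody));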
25:01
So say that this thing is sitting in a tight loop, we'll actually move this instantiation logic out of the loop and actually instantiate it once. And you know, if it's already there, we can just keep using it, right? The second piece of code, I don't know how I did that. Apparently there's a shortcut I'm not aware of. Whatever. But the next piece of code
25:21
has turned into this mysterious thing. So the first line here, well actually let me show the first line a little bit more. Like where's my value 42 here? My value 42 that I'm returning from the function has turned into over here, returned 42 in some methods that's compiler generated. Again, you see that less than, larger than,
25:42
and all of that kind of blah, that's the compiler making up a name for you. So when you see less than, larger than, it's the compiler making up something. The letters that you see after them are kind of ordered by feature. So this is B, which means it's one thing that's quite early, okay? Compiler synthesized names
26:01
are sort of continued to grow in the alphabet. So transparent identifiers were starting with H, B is for anonymous methods introduced in C sharp 2.0. So you can sort of guess which language introduced it by looking at the letter. So this is a pretty ancient compiler generated trick here. So it returns 42 as we expect.
26:20
And so we just construct a delegate to point to that. That's the case of the very first line. This line of code here, no surprises. Modulo, maybe, this lazy instantiation pattern. The second piece of code actually turns into Expression.Lambda of Func of int, of Expression.Constant of 42, type int, and then no parameters.
26:41
and then no parameters. So it's basically capturing my logic as an expression tree by calling into runtime factory methods that will return a data structure that now at runtime I can take a look at. And I could say, I wanna execute this thing, say, in Excel. So I could take a look at the expression tree and translate it to VBA macro or whatever, right?
27:02
I wanna execute it in SQL server. I will translate it to T SQL equivalents. So that's what expression trees allow you to do. Let me just make it a little bit more complex now, just to show you the kind of thing to expect. So let's now say we have a function from X to something, X times two plus one, over here, X times two plus one,
27:23
same syntax, homoiconicity. Let me rebuild this thing. And I'll go back to our ILSpy. The first case is not interesting. It'll just be a compiler generated method having the body of my method in there. But the expression tree now has become a lambda,
27:41
which does Expression.Add of Expression.Multiply of my parameter multiplied by the constant two, plus the constant one, okay? So you see what it's generating. It's generating an isomorphic data structure to the original syntax tree that the user has written. And it can do that for method calls, property lookups,
28:01
constructor calls, all of the things that C-Sharp 3.0 actually had, okay? How does it do that? If you're interested in looking at the compiler source code, now I need to try to find it. Here it is actually, quite lucky that it's just sitting underneath there. There is this class called the Expression Lambda Rewriter.
28:21
And let me actually try to look for one thing. Say we're looking for what will happen if I do method calls. I could actually try to do a method call here. Well, whatever. Does anyone know a method call? Well, toString.length. Nonsense of course, but fine.
28:42
It's a compiler demo. It doesn't have to have any meaning. So like this is a method call. The compiler will translate that into an Expression.Property for the Length and an Expression.Call for ToString on the parameter expression. So it will build that syntax tree, if you will. So how does it actually call that factory method?
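Roughly the tree being built for x.ToString().Length, as a sketch (reflection lookups are used here for readability; the emitted code loads the MethodInfo via an ldtoken instruction, as he shows next):

    ParameterExpression x = Expression.Parameter(typeof(int), "x");

    Expression body =
        Expression.Property(
            Expression.Call(x, typeof(int).GetMethod("ToString", Type.EmptyTypes)),  // x.ToString()
            "Length");                                                               // .Length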
29:02
If I recompile this thing, and we take a look again at the generated output here, I'll show you exactly what I'm looking for. This thing is doing an expression like that: a Multiply of a property lookup with a call to int.ToString. By the way, if you look at ILSpy,
29:21
they take the liberty of inventing keywords. Like the compiler is generating over here a load token instruction, which is an instruction that IL has, to get a reference to the metadata for a method. This is actually what allows it to create a method info that points to the toString method. So my runtime expression tree can know
29:41
which method I'm trying to call here. So this methodof does not exist in C#. We kind of wish it would exist. But the syntax is kind of a little tricky to sort of be able to refer to many things. So we don't have a methodof keyword, but they sort of take the liberty here to decompile this compiler generated code into something called methodof, which does not exist.
30:02
But where does the compiler get this call method from? Well, let me show you. If you go into compilers here, and we just search for call over here, and actually I changed this code very recently, as you can see there, because I'm working on expression trees right now to do some rewrites for them. But you see over here,
30:21
when we visit a call expression in the original user-specified code, we're gonna turn it on its head. We're actually gonna turn it over here into a call to a C Sharp expression factory named call, that's my method, and then passing in a bunch of things.
30:40
If it's static method, we'll pass a null for the first argument, otherwise we'll translate the receiver, which is everything to the left of my dot. We are gonna pass in a method info for the method, that's gonna be IL code that at runtime will compute the method info. And then we'll also get a bunch of parameter bindings for the different parameters
31:00
that are passed to that particular method. So you see that here, it's actually generating the code. If I actually go into the C# expression factory, and I drill a little bit deeper, it creates a static call to a method with the name specified, Call in this case, on the C# expression type, which is actually a reference to a type
31:23
that's called Microsoft, or well actually that's my extension that I've done recently. Let me try to find the original, sorry for that. Let me see where's the original. This is the original actually. I'll explain what I've been noodling around with
31:42
in just a second. But here you see it will generate a call, a static method call to a method called Call, sorry, it gets quite call-y, like Call in double quotes, on which type? On the Expression type, which is a well-known type that's defined in System.Linq.Expressions.
32:01
So it's generating System.Linq.Expressions.Expression.Call of, then, the recursive translation of the arguments. So that's how it sort of translates your simple-looking code into code that at runtime can reconstruct what you wrote at compile time. So that's how expression trees basically work.
32:21
Now what have I been working on? I don't have an internet connection here. Oh actually, luckily I still have this thing available. Expression trees do not support any language constructs that were introduced after C# 3.0. That's because they were created in C# 3.0 for LINQ, and after that they haven't really been maintained
32:41
to sort of get new functionality. And so if I actually run a little project here, which I've been working on in collaboration with the Roslyn team, potentially as a feature that may ship at some point, if you go, for example, to this tool,
33:01
nevermind what it is, it's a tool called Roslyn Path which I've just created for this purpose, and it's sitting all available on my GitHub homepage. But say that you have something using named parameters over here, that something was introduced in C Sharp 4.0, and that's something that was not supported
33:21
in expression trees which were introduced in 3.0. So if you compile this today using the C Sharp compiler, it will actually say, I can't do this. An expression tree cannot contain named argument specifications. So the thing I've been working on is actually adding support for expression trees
33:41
that include all the language features up to and including C Sharp 6.0, including statement bodies. And so if you take a look here now and I compile this with a reference to this new assembly, Microsoft.CSharp.Expressions, you will actually see, and that's the code you are looking at, the one that I sort of changed,
34:00
you see that this is now generating an expression tree that looks like this. It's an expression tree that has a call, and now the arguments there are parameter assignments that have references to the names of the parameters. Nevermind exactly what it is, but you see it becomes a data structure that at runtime you can inspect and you can figure out exactly what the user wrote.
34:22
They wrote a substring, they want the length to be evaluated before the start index, and then you have to shuffle them around to call the underlying methods. You know this is not the same as reordering those things, right? CSharp always evaluates sub-expressions left to right. So if you throw an exception in length, you will never evaluate the start index, for example.
34:43
So the side effect ordering is actually as it's written lexically, unlike for example C++ where it's undefined, where you can actually evaluate the arguments in whatever order you like. The compiler can make its own choice there; in C#, it's predictable. It's always left to right. So we can't just reorder those things.
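A small sketch of that left-to-right rule with named arguments (F and G are hypothetical helpers):

    // Arguments are evaluated in the order they are written, not in parameter order:
    string t = s.Substring(length: F(), startIndex: G());
    // F() runs before G(); if F() throws, G() is never evaluated,
    // even though startIndex comes first in Substring's signature.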
35:01
And so what we can do now with expression trees with this work I've been dealing with is for example take a whole algorithm here, right, you know a whole prime sieve calculation with async and await and all of that kind of stuff and actually compile it into an expression tree. And it looks quite the bestiality over here, like you know or beastie or whatever the word is,
35:21
like I think I'm confusing words here. Can you cut that? Like you know it's very beastie. But so like you know there's a whole bunch of stuff here and you see it's an async lambda which has statements in it, variables, you know all that kind of stuff, right? So how do we get that? Well exactly the same way as we got it in CSharp 3.0.
35:43
It's like you assign a lambda to an expression tree type. Expression of func of int to task of list of int in this case. That's just the signature of my function that computes prime numbers, right? And now this whole thing can be represented as an expression tree at runtime. You can look at it, translate it, optimize it,
36:01
do whatever you want to it, okay? So that's expression trees. If you're interested in seeing this project evolve, which you know I'm actively working on right now, it's over here on my GitHub. Just find github.com slash bartdesmet and you will find this whole thing which has gotten quite big, you know? Because one thing, let me just finish on that note
36:23
with expression trees. One thing that's super powerful about expression trees is even if the user assigns a lambda to an expression tree and you say, gosh that's interesting, I can optimize it, but I wish I could also execute that code at runtime.
36:42
Like it's not too late. The C# compiler has run, it's generated a data structure at runtime that contains the original intent of the user. There's this beautiful very small API on the surface but a huge amount of code underneath it which is called e.Compile. And this will at runtime generate the IL code
37:00
that's equivalent to whatever you had written assigned to a function, okay? So at runtime you can change your mind and say, oh, we got an expression tree, let me now compile it at runtime and generate efficient code which will get JIT compiled and all of that stuff. So this compile method is very interesting. By the way, we've added something.
37:20
It's not in the .NET framework yet but in CoreCLR where you can actually pass to compile a flag that will say whether at runtime we need to compile it to IL code or whether we have to interpret it. The interpreter can be faster in certain cases if your code is smaller and you invoke it not as many times. So we have some more flexibility there.
37:41
So even though this API existed in CSharp 3.0, we're still actively adding things to it to actually make it better. And by the way, what would be the type of this thing? Of course, it would be a func again and that would be my compiled equivalent of the original delegates but now compiled at runtime. Very powerful capability.
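In code, roughly (a sketch):

    Expression<Func<int, int>> e = x => x * 2 + 1;   // captured as data

    Func<int, int> f = e.Compile();                  // IL generated and JIT compiled at runtime
    int result = f(20);                              // 41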
38:01
Now you can sort of do meta-programming, can generate code at runtime and then compile it as if it's as efficient as something that the user has written originally in CSharp. So that's expression trees in a nutshell. And basically it's just a compiler transforming whatever you write
38:20
into those factory method calls, and you see we have factory method calls on Expression for every single thing you can imagine. If the user writes a plus, it becomes Expression.Add. If the user writes multiply, it becomes Expression.Multiply. Array, index, length, whatever, right? All of those kinds of things. So a huge thing which is kind of the language specification in method calls, if you will.
38:42
If you have a checked keyword around your add, it becomes AddChecked, for example. So a whole bunch of things here. Okay, let's go back to the menu and take a main course. This was already a pretty heavy starter, but you know, we're quite hungry today, so a main course.
39:01
We have something about anonymous method closures. We have something about state machines, something about dynamic, and something about the lock keyword. Your choice. Dynamic? Dynamic, state machines, ah, the dilemma, you know, there's so much that I can tell here.
39:22
I think the majority was saying dynamic, I'm afraid. Okay, so let's do dynamic. Quite interesting. It's kind of interesting how you guys sort of pick things that sort of all line up because dynamic uses expression trees underneath. Like, did you know? Okay, I didn't pay anyone in the audience
39:42
to pick things in the right order, of course. The other things, let me just briefly point them out. Closures are the things that occur when your lambdas refer to variables in the outer scope. So for example, in this case, if I have something like var two equals two,
40:01
and then I sort of substitute it here for two, and let's now ignore all the expression stuff. And at this point, of course, I'm always in a bad situation because now I have to change my indentation because now it looks quite funky. No pun intended for the func keyword there. But you see now I'm capturing a variable from the outer scope. That's a closure.
40:21
What does that mean? It doesn't mean that we are substituting two for the value two, because if afterwards over here, I'm saying that two is now re-declared as three, just like at some point there was a US state that mandated that the value for pi would be three and no longer 3.14.
40:41
You can find a law that was passed in the 1800s where people said 3.14, that's crap, it needs to be the constant three. So now I'm redefining two to mean three. If I execute this function, it will use the value three. So it doesn't capture the value at the point you reference it, it captures the value or uses the value at the point you use it.
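A minimal sketch of the capture he's describing:

    var two = 2;
    Func<int, int> f = x => x * two + 1;   // 'two' is captured, not copied

    two = 3;                               // reassign the captured variable
    Console.WriteLine(f(1));               // prints 4: the lambda sees the current value of 'two'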
41:02
So if I call that same function now over here with some value like one, it will use the value three instead of two over here. That's a closure. And that will actually allocate this two, which looks like a local variable. This will be allocated on the heap alongside my delegate
41:21
so that if you sort of return this delegate to somebody else, when the stack frame is long gone, you still have a reference to that value. So this sort of hoists up local variables to become fields in classes, okay? So that's closures. State machines, there are two places
41:40
where we use state machines, both for iterators, the yield return keyword in C# 2.0, as well as for async and await. For those of you who went to my talk yesterday, you've seen some state machine voodoo for async and await. And there's a lot of talks online about that. And demystifying lock, the only interesting thing about the lock is that the pattern that we emit for the lock keyword
42:02
has actually changed in C# 4.0 to take advantage of new methods on System.Threading.Monitor. Most specifically, it's actually using TryEnter and Exit as opposed to Enter and Exit. You can find some interesting blog posts about why this was actually changed to deal with some very interesting corner cases.
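A sketch of the C# 4.0+ lowering of lock (obj) { ... }, using the Monitor overload that reports whether the lock was taken (obj stands for the lock target; Monitor lives in System.Threading):

    bool lockTaken = false;
    try
    {
        Monitor.Enter(obj, ref lockTaken);
        // ... body of the lock statement ...
    }
    finally
    {
        if (lockTaken) Monitor.Exit(obj);
    }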
42:21
So let's now talk about dynamic binding. So dynamic is one of those love-hate relationship features that we have. It's very handy, but it's quite a dragon, you know, in the language, because again, it does a whole bunch of things at runtime. So let me actually create a little demo here, demo three.
42:42
And in demo three, we are gonna do something like this. Well, actually, how many of you use dynamic or know what it does? Quite a few, okay. So dynamic is a way of statically typing things that need to be dynamically bound. That was kind of the invention of dynamic.
43:02
What does it mean? We have a new type called dynamic, which at compile time erases into System.Object. It's not a new type that you will find in the BCL. There's no such thing as a System.Dynamic type. Dynamic just means whatever I do on this thing needs to be dynamically bound or late bound, okay?
43:23
So let me actually do something else instead because it will make it more interesting. Static, dynamic. That's the fun thing that you can do with dynamic is you can have static methods that return dynamic and take in two dynamics, A and B, and then inside of it, well, actually, let me use C sharp six here,
43:41
goes to A plus B like this. The same as curly brace, curly brace return. So more concise. So this is a method that takes in two objects and we'll add them together. Now what I can do is actually over here, I can call add with the value one and two
44:01
and out will come three, unless mathematics has changed in the meantime. And I can also have foo bar equals add of guess what? Foo and bar. And I can also have var whatever equals add of date time dot now comma time span from days one or something.
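Roughly the demo being typed here, as a sketch:

    static dynamic Add(dynamic a, dynamic b) => a + b;

    var i = Add(1, 2);                                // 3: int addition
    var s = Add("foo", "bar");                        // "foobar": string concatenation
    var d = Add(DateTime.Now, TimeSpan.FromDays(1));  // DateTime + TimeSpan via op_Addition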
44:23
These are all things I can add together. But this Add method doesn't know what I'm gonna pass to it. It takes in two dynamics, and you can assign anything to dynamic. And at runtime, this plus, not a method but an operator, has to be resolved to the underlying things
44:41
that will actually make this work. In the first case, it needs to become an add instruction. In the second case, it needs to become a call to String.Concat. In the third case, it needs to become a call to op_Addition declared on DateTime, with two arguments, one for the DateTime and one for the TimeSpan. So it needs to do overload resolution, if you will,
45:02
and binding steps but now at runtime. Now what's the interesting thing about this? The interesting thing is if I call this add method multiple times with the same set of types passed in, it will sort of asymptotically get almost as performant as statically typed code.
45:21
Why is that? Well, let's actually take a look behind the scenes now. Okay? So so far, kind of makes sense. Dynamic says: whatever, you know? You just dot and do whatever on this a and b, and then we'll figure it out at runtime. And if it doesn't work, we'll get what is essentially a compiler error thrown at runtime as an exception, okay?
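A small illustration of that compiler error at runtime (the exception type comes from the runtime binder; the member name is deliberately bogus):

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

class LateBoundError
{
    static void Main()
    {
        dynamic d = "hello";
        Console.WriteLine(d.Length);   // binds fine at runtime: string has a Length property

        try
        {
            d.NoSuchMethod();          // compiles, but cannot be bound at runtime
        }
        catch (RuntimeBinderException ex)
        {
            // The message is essentially the error the compiler would have
            // given you at compile time.
            Console.WriteLine(ex.Message);
        }
    }
}
```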
45:40
So what really happens there: if I compile this thing now, demo3.cs, and we'll stay in ILSpy for now. It's still compiling, but here we go, demo3. If I take a look at demo3, and again it doesn't succeed in fully decompiling things, which is good for our purposes, you see indeed that I have Main
46:01
calling my Add method multiple times here. And then underneath that, I have this thing, static dynamic Add, with some attributes on top. The decompiler is being a little bit too eager there: in reality, those things are declared as object in the underlying IL code, and we have some custom attributes
46:21
that we put on those things. So if somebody calls that, we can disambiguate: was this method declared to take an object, or does it really want to take in dynamic? This actually gets a little bit more complex if your dynamic is sitting in some weird places. What I mean by that, and I won't show you right now,
46:41
but say that you have a method like static void foo which takes in an IEnumerable of IList of dynamic and let's actually make it more complex. This is not complex enough, yes. IDictionary of int to dynamic. Well, how do we declare this thing?
47:02
How do we convey that, at runtime, this inner position in that generic type is actually dynamic and not object? Because we're going to translate it to the equivalent of object, since the CLR doesn't know about dynamic. So actually let me just compile it, for fun and kicks, while we're at it anyway.
47:21
Just exciting. That's what I do the whole weekend long, right? I don't have a life, as you can tell. But if you take a look over here, you get this beautiful thing: a DynamicAttribute with a Boolean array, false, false, false, true. This is actually doing a depth-first scan of the type structure and seeing
47:40
which positions in my generic type were dynamic and which ones were not. So if you overlay that with a tree that represents the type, IList of IDictionary of int to dynamic, it's sort of one, two, three, four. The fourth type, yeah, I got it right, the fourth position is dynamic. If I switch those things around, it will be the third one.
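In source form, the idea is roughly this (the attribute shown in the comment is what the compiler emits; you cannot write it by hand, and the exact flag layout here is illustrative):

```csharp
using System.Collections.Generic;

class TransformFlagsSketch
{
    // What you write:
    static void Foo(IList<IDictionary<int, dynamic>> arg) { }

    // Roughly what gets emitted: the parameter type becomes
    // IList<IDictionary<int, object>>, decorated with
    //   [Dynamic(new[] { false, false, false, true })]
    // where the flags walk the generic type tree (IList, IDictionary,
    // int, dynamic) and 'true' marks the position declared as dynamic.
}
```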
48:02
So we sort of have to keep track of that, because at runtime this will actually just look like object here. If I actually go into the IL code, hopefully the IL view doesn't get messed up here. But yeah, see, it looks like object here. So that custom attribute is needed for us to know what was dynamic and what was declared as object
48:21
because object doesn't do late binding, dynamic does. So let's go back to the add method now. So what does the add method do over here? In this case, we'll just sort of browse what it's done here as opposed to going into the guts of the whole thing. I can show you a few places that are of interest.
48:42
Now what this is doing here is again a lazy instantiation pattern. If we don't have this beautiful thing with the mangled name o__2.p__0, we will create it. What's that thing? It's a so-called call site. And this call site is parameterized by a function type.
49:03
So let me first show where that thing sits. It sits in here. It's declared to be a CallSite of a Func that takes in a CallSite and two objects and returns an object: CallSite<Func<CallSite, object, object, object>>. What's this piece? Let's start from the right-hand side. This thing should be quite obvious.
49:21
Maybe an overstatement. But the add method took in two dynamics. Well, I'm not, let me correct. I'm no longer talking about the add method. At this point, we are talking about the body of the add method where we are adding two dynamically typed things together. Adding two dynamically typed things together
49:40
is like a function that takes in two objects and returns an object. So that's the last three objects in there. If this were a plus one, it would be object, int, object. So this is what this thing corresponds to. Now what else do we have here? The Func also takes in a call site
50:00
which in a mysterious way will be itself. It will pass in itself to itself every single time. And that's the trick here. So this call site actually does have a so-called target. If you go here, call site of T has a property called,
50:20
actually not a property, a public field. Now people will say, ah, I have to spank you, because public fields are bad. Compilers emit public fields all over the place. So this is a Target field declared to be of the delegate type T; in my case, it will be a Func of CallSite, object, object, object. So what's this Target thing doing?
50:41
Why is it a writable mutable field? It's a writable mutable field because every time we learn about some new set of arguments on how we should execute those things at runtime, we'll sort of replace this thing. If you call the add method with two ints, we will sort of generate an expression tree that says
51:01
if the left-hand side's type is an int and the right-hand side's type is an int, then we've learned that to add two ints together, we have to cast the objects to int and use the add instruction. And we are gonna substitute that into this Target. We're gonna replace the body of the Target by recompiling an expression tree that now knows,
51:21
as if you've written that thing at compile time, that to add two ints together, we have to use an add instruction. This is called a polymorphic inline cache. It's inline because it's at the point where you use it, where the plus occurs in your source code and it's polymorphic because it starts learning about all the combinations of inputs. So when you call it again with two strings,
51:42
what's gonna happen? The Target will have code in it that says: if the two things that are passed in are ints, then we'll just do an Int32 addition. Else, what are we gonna do in the else case? The default code of the Target in the else case, the bottommost else case, will be something that cries out for help.
52:02
This will be the dynamic language runtime that says, my God, I don't know how to do this. And who is it gonna call to help resolve this dilemma? It's gonna call the C-sharp compiler at runtime to do every overload resolution step it would have done at compile time, but now using the runtime types of the arguments.
52:21
It will say: I have two ints here. The C# compiler says: you have two ints, you need to add them together with an Int32 add instruction, Expression.Add of left and right. If you give it two strings, the compiler will say, well, how do I add two strings together? Exactly the same logic as the compiler uses when you just type this in Visual Studio.
52:40
It will say: adding two strings together is a method call expression to String.Concat taking in two strings. The DLR will say, ah, great, two strings; in the future, I will always call a method that does String.Concat, and the compiler will no longer be called at runtime. If you give it two strings again in a tight loop, it will be almost as efficient as just doing String.Concat
53:03
at compile time. And then last but not least, if you give it a DateTime and a TimeSpan, the compiler will say: I can do this, DateTime plus TimeSpan, op_Addition is the underlying method for the add operator; let me add that to the delegate as well. So that's how this whole thing proceeds.
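Putting the pieces together, the body of the Add method is lowered into something roughly like this (field and class names are illustrative; the real compiler-generated names are mangled):

```csharp
using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;
using Microsoft.CSharp.RuntimeBinder;

static class DynamicAddLowered
{
    // Lazily created, cached call site.
    static CallSite<Func<CallSite, object, object, object>> s_addSite;

    // Roughly what the compiler generates for 'return a + b;' with dynamic operands.
    public static object Add(object a, object b)
    {
        if (s_addSite == null)
        {
            s_addSite = CallSite<Func<CallSite, object, object, object>>.Create(
                Binder.BinaryOperation(
                    CSharpBinderFlags.None,
                    ExpressionType.Add,
                    typeof(DynamicAddLowered),   // context type, used for visibility checks
                    new[]
                    {
                        CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null),
                        CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null)
                    }));
        }

        // Target is the polymorphic inline cache: it is rewritten as new
        // combinations of runtime argument types are encountered.
        return s_addSite.Target(s_addSite, a, b);
    }
}
```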
53:22
Now, how does the compiler know at runtime what I wanna do? The answer is, of course, that again, it captures your whole intent in here as a parameter to the call site. And that's called a binder. The call site gets educated by its lifeline. Who should I call if I don't know what to do?
53:42
Well, we'll tell it here. The compiler emitted code that passes in a binary operation binder, which sits where? In Microsoft.CSharp. What's Microsoft.CSharp? It's a subset of the C# compiler in a runtime library that has exactly the same overload resolution rules
54:02
as C# at compile time. And here it basically creates a BinaryOperation binder, which encodes all the semantics needed to resolve at runtime the stuff that we couldn't resolve at compile time. So for example, it says this needs to be
54:20
an add instruction, or rather an Add node. Where are we doing this from? We are doing it from the context of the Demo type. Why is that important? Because at runtime you need to do visibility checks. If I do an add between two things that are private and not visible to me, we have to say we can't do an add between those two things.
54:40
So we need to know where the code occurred that asked to do a plus. That's this typeof(Demo), which allows us to do runtime visibility checks. And then we have the two argument infos here. In this case there's nothing special about those arguments, there are no argument flags, but what are those flags for?
55:00
Take a look at the argument flags. We need to know at runtime: did we know the static type of this argument at compile time? If you do a plus one, we know one is an Int32, and we want to know at runtime that it is an Int32. Is it an out parameter?
55:21
Is it a ref parameter? Is it a named argument? Is it a constant? Is it a literal constant value? We want to know all of that stuff at runtime to do exactly the same steps that you would otherwise do at compile time. And then similarly over here, the first parameter passed into the binder is CSharpBinderFlags.
55:40
And it actually says there's nothing special about this. But what could be special? It could be that over here we are doing this in a checked context. We need to know: do we have to generate code that does overflow checking or not, right? And a whole bunch of other things. So let me actually try to change this. Let me just put the checked keyword around it, like this.
56:03
So now it's a checked a plus b. At runtime, we need to know that the user wants checked arithmetic. So when we generate the add instruction for a and b, we have to generate add.ovf, which is the IL instruction for an overflow-checking add,
56:21
which will throw an OverflowException if you fall out of bounds. So if I go here and refresh this whole thing, I've recompiled it in the meantime, I believe, you see that the add now has a binder over here which says: by the way, this thing needs to be done in a checked context.
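In source terms, the only change is the checked wrapper; the effect on the emitted call site is confined to the binder flags (a sketch, not the literal generated code):

```csharp
class CheckedDynamicAdd
{
    // Same shape as before, but in a checked context:
    static dynamic Add(dynamic a, dynamic b) => checked(a + b);

    // The emitted call site is identical except that the binder is created with
    // CSharpBinderFlags.CheckedContext instead of CSharpBinderFlags.None, so the
    // expression tree built at runtime uses the overflow-checking add (add.ovf)
    // and throws OverflowException on overflow.
}
```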
56:41
that we sort of had at compile time, but the user says, I don't wanna do anything with this compile time knowledge. Use it at runtime, because I'll give you arguments of any kind of type and you have to figure it out at runtime. But if you look at the performance profile, this will actually approach static typing very closely.
57:01
Unlike, for example, if you would use reflection and just do dynamic calls to methods, which will always go through MethodInfo.Invoke. In this case, it actually becomes IL code that stitches together an expression tree, which gets compiled and runs at JIT-compiled speed. I say almost, because of course it has that whole polymorphic checking thing in it.
57:23
If the two arguments are ints, do this. If the two arguments are strings, do this. So we have to build that decision tree of what we're gonna do, and that's of course additional overhead for every call. We have to check what we want to do here. But the core operation itself will not be as slow as you would think it would be.
57:40
Okay. So that's polymorphic inline caches in C-sharp and the dynamic language runtime. All right, and then we'll get a light dessert to end. I have two desserts here. That's not the price of the dessert, but it's, yeah, something. And I also have interpolated strings.
58:01
So which one do we want? What's the other one? The first one, okay, okay, that one. Anyone have a guess what that feature is? What? It could be a color, yes. It's the color of errors in the console window produced by csc.exe.
58:22
No, those are red, which would be FF0000. So that's not it. Any other guess? I'll just give you two guesses. It's a magic constant that appears somewhere. So let's talk about magic constants. Actually, let's have some fun first. A little quiz about magic constants.
58:43
If I do notepad of notepad.exe, why would you do that? What are the first two letters you're gonna see?
59:01
But does somebody know the exact two characters? MZ, yes. Mark Zbikowski. He's the guy who built the DOS executable loader back in the early DOS days. He put his initials there so that when you run this thing, the loader and the operating system can say: this is an executable, or this is not an executable. So MZ: executables, DLLs, okay.
59:22
Next one. If I open one of my C# applications in notepad, the .exe, it's an executable, no secret, it's gonna have MZ at the top, but there's gonna be something else in here. Four initials. Anyone know four initials in here?
59:42
Four initials somewhere. Now everyone's looking like, where are they? Like they're somewhere on the right, like they're not on the screen. No? BSJB. Bill, Susan, Jason, and Brian.
01:00:00
Jason is Jason Zander, who's now the VP of Azure. And B is Brian Harry, who's the guy who owns everything TFS related and ALM related. Those were the four original guys who wrote the CLR over a couple of weekends. Bill and Susan are no longer with us, in the sense of no longer being employed at the company.
01:00:21
But Jason and, yeah, some people get old, right? Now I'll tell you a sadder story. If I open a zip file, let me first create a zip file. By the way, every Word, Excel, PowerPoint document is a zip file now, which contains XML files inside of it.
01:00:42
So let's just do a plain old zip file here, demo3.zip. What is it going to start with? PK, exactly. That's Phil Katz of PKZIP and PKUNZIP, who created one of the original zipping tools. And he was actually somebody who struggled later in life
01:01:03
and unfortunately died far too young. Yeah, not really funny. So yeah, that's PK. So those are the magic strings. Now, this thing is also a magic value, and I'll show you where it shows up. Let me copy demo3.cs, well, copy demo.cs to demo4.cs.
01:01:31
Watch carefully and shout out as soon as you know where the constant occurs, okay? var x equals new, a equals 1, b equals full.
01:01:45
What? No, no, it has nothing to do with a base address or something. My IL code will now contain this constant. Where would we need that constant?
01:02:02
We could have used another constant, that's a little hint. It's a constant you could find in a computer science book. You could find what? The full string? No, the full string doesn't have that in it. Of course, I could put those characters in the string, and then I'm cheating.
01:02:23
I could have done it in a different method. It's not an entry point location. Well, good guess, could be. Somebody at the end?
01:02:40
Type signature? No, no type signature either. I could do something else here, and it would still be there. What are we generating for anonymous types? And this constant would appear in a textbook somewhere, written by Donald Knuth. Why do we need a constant from Donald Knuth's book?
01:03:01
Okay, so we generate over here a class which has a constructor and two read-only properties backed by private fields, because it's an immutable object. We also generate a couple of things as a service. One thing is ToString, actually three things: ToString, to get a nice string representation.
01:03:21
Well, GetHashCode, yes, to get a hash code. It's the seed we use in the GetHashCode of the anonymous types that we generate. It's actually a gigantic prime number; you can look it up. So if I compile this thing, demo4.cs, and I run ildasm on demo4.exe over here,
01:03:41
the anonymous type's GetHashCode has this magic number in it, and some other number as well, actually this one here. One is constant across everything; the other is generated based on the structure of the type, to deal with clashes between types that have sort of the same shape.
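A sketch of what that generated GetHashCode looks like for an anonymous type such as new { a = 1, b = "full" }. The seed value below is made up; the multiplier is the constant being discussed, 0xA5555529, which is -1521134295 as a signed int:

```csharp
using System.Collections.Generic;

sealed class AnonymousTypeSketch
{
    private readonly int a;
    private readonly string b;

    public AnonymousTypeSketch(int a, string b) { this.a = a; this.b = b; }

    public int A => a;
    public string B => b;

    public override int GetHashCode()
    {
        // Illustrative seed; the real one is derived from the property names,
        // so differently shaped anonymous types start from different values.
        int hash = 512832542;

        // For each property: multiply by the hash factor, then mix in the
        // property's hash via the default equality comparer.
        hash = hash * -1521134295 + EqualityComparer<int>.Default.GetHashCode(a);
        hash = hash * -1521134295 + EqualityComparer<string>.Default.GetHashCode(b);
        return hash;
    }
}
```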
01:04:03
But you see over here that constant repeating a couple of times. If I actually go to the source code, I think I did bookmark it somewhere. If not, we can try to find it. Oh, it's actually the last one here. Here we go. Here it is, the hash factor.
01:04:23
That's the prime number that we use for all of those multiplications, in addition to an initial hash. And so that's actually how we get a pretty good hash code, based on some number that's been documented in the literature as being a good factor to use for hashing. Just like in general for hashing, you want to use prime numbers
01:04:42
to reduce the number of collisions and things, but those are some interesting values that somebody on the compiler team, implementing this feature, looked up from one of those textbooks, in this case Donald Knuth's book. I think we actually lost the comment, because in the native compiler there was a comment that referenced where it came from.
01:05:01
The native compiler source code, which unfortunately is not open source, had a reference in it, but I also have a copy of Donald Knuth's book and the page number no longer matched. So we sort of lost where it originally was. Talking about page numbers, I will give you one more challenge. If you're excited about Roslyn, it's all open source,
01:05:23
the compiler source code, all of that stuff, I have bookmarks here too. I have a bookmark sitting right there, and that allows me to navigate in presentations to interesting points of interest, saying interesting twice, but whatever. So I can go to something, like emit or whatever,
01:05:43
I can put a bookmark here in Visual Studio. That's just a standard feature of Visual Studio. Quite often you put them on methods or members to go back to them, but if you insert lines, in Visual Studio or outside Visual Studio, those things no longer match up. One thing that would be quite easy to do in Visual Studio using Roslyn
01:06:04
is build a Roslyn extension that creates semantic bookmarks. You say: I want to have a bookmark to this method declared in this class. You just click the bookmark button, and no matter where you move it, you can ask Roslyn: where does this thing sit now? If you refactor things, if you move things to different places in your code,
01:06:23
different files, of course, if you change namespaces and classes and the names of those things, it would no longer work unless you have some event handler to detect those kinds of things, but you could create semantic bookmarks now by just referring to the symbol name of something declared in your source code. This would not be very difficult to do using the Roslyn APIs,
01:06:43
but it could be very educational, because it will get you right in touch with the syntax tree API and the symbol API. So maybe by the next time I come to London, somebody will tell me: please install this Visual Studio extension, and you'll have a new bookmarks feature.
01:07:02
If I shared the bookmarks I have here out to you, tomorrow they might be invalid, because somebody checks in more code and things start moving. But today, if you create a Roslyn extension, those bookmarks no longer need to be file and line numbers, just the semantic symbol info, so that when things move around, you can still find them.
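A hypothetical core for such an extension, using the public Roslyn APIs (names here are mine; this is a sketch of the idea, not a finished extension):

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;

// A "semantic bookmark" stores a symbol name instead of a file and line number,
// and resolves it against the current compilation whenever it is needed.
static class SemanticBookmark
{
    public static Location Resolve(Compilation compilation, string symbolName)
    {
        var symbol = compilation
            .GetSymbolsWithName(name => name == symbolName, SymbolFilter.Member)
            .FirstOrDefault();

        // The declaration location follows the code wherever it moves,
        // as long as the symbol keeps its name.
        return symbol?.Locations.FirstOrDefault();
    }
}
```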
01:07:23
That would also have helped, of course, with finding this hash constant in the Roslyn source code. To wrap up, I still owe you the other dessert: interpolated strings. If you have any time to look at those, they're new in C# 6. Interpolated strings will reduce to string.Format calls
01:07:43
or something else, depending on what you assign those strings to. You can also assign them to IFormattable and FormattableString; those are the other types you can convert interpolated string literals to. For those of you who haven't seen interpolated strings yet, the last thing I'm going to say today
01:08:00
is actually to just show you an interpolated string. var s equals, well, classically you would do string.Format, let's say, of "{0}, {1}", and then you would put x.a and x.b here. With C# 6, you can actually write var t.
01:08:23
You could already write var t in C# 1.0; that's not something we've added. Like, you know, t was always a variable name that you could use. But you can now write this thing instead, with the dollar sign. And the dollar sign will compute the holes for you and make sure you don't have those placeholders sitting out of whack.
01:08:43
And within the curly braces, you can actually use any language element that you choose. You can make method calls. You can't use statements there; you can only use expressions. But you can now interpolate a string this way. And if you just put it like this, it will become a string.Format call. But if you give it to something else, hold on, that's also invalid,
01:09:06
you can do FormattableString.Invariant, which is a method call that will use CultureInfo.InvariantCulture. In this case, the interpolated string gets converted to a FormattableString, which is a new type in the BCL.
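For reference, the two shapes side by side (variable names follow the demo loosely; the FormattableString path needs .NET 4.6 or later):

```csharp
using System;

class InterpolationDemo
{
    static void Main()
    {
        var x = new { a = 1, b = "full" };

        // Classic formatting:
        var s = string.Format("{0}, {1}", x.a, x.b);

        // C# 6 interpolated string; with a string target this lowers
        // to a string.Format call:
        var t = $"{x.a}, {x.b}";

        // With a FormattableString target, the compiler builds a
        // FormattableString instead, which can then be formatted with
        // the invariant culture:
        string u = FormattableString.Invariant($"{DateTime.Now}: {x.b}");

        Console.WriteLine(s);
        Console.WriteLine(t);
        Console.WriteLine(u);
    }
}
```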
01:09:21
In both cases, the compiler will do something different. In the first case, it will generate string.Format. In the second case, it will do something you can easily find in the source code by just searching for interpolated string. How do you find things? And that's the final note here. You load Roslyn in Visual Studio, you hit Ctrl+comma, and you type interpolated string.
01:09:43
And then you start looking at all the matches. And eventually, you'll find the code that does all this magic. It's quite easy to find, actually, if you go through here. You will find everything that deals with lexing and parsing, the syntax, but you will also find things over here, like conversions.
01:10:01
So here, for example, we will find that this is unreachable. That was not what I wanted to see. But yeah, you get the gist. You just look, and there aren't that many matches for those kinds of features. So you can actually find quite easily where it does all of its magic. It should be here, BoundInterpolatedString.
01:10:21
So like here, you see one example. Jenny, don't change your number. Reference to a song, apparently. But at the bottom there, it has string.format. This is what we make of it, but it depends on a couple of things. So if you look through this code, you will actually see that in some other cases, we'll do something else. Here, we'll actually call string.format.
01:10:41
And this is actually the code that likely has already lowered things, so it doesn't deal with the other cases here. But you can find them; they're actually conversion nodes that do that. So with that, I hope you enjoyed a look behind the scenes. I hope your weekend will be filled with the joy of looking at compiler source code,
01:11:00
which I think would be a nice way to spend your weekend or to watch the snooker, which I will be doing. Thank you very much.