We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Fun with Generics

00:00

Formal Metadata

Title
Fun with Generics
Title of Series
Number of Parts
133
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Benjamin Hodgson is going to sadistically torture C#’s generic type system in the name of entertainment. In a series of increasingly extreme ventures to the edge of sanity, he’ll cover some advanced techniques for de-duplicating code, making APIs more expressive, and using the compiler to find more bugs than you would think possible.
36
61
Thumbnail
1:11:04
73
107
Generic programmingType theoryInterface (computing)Streaming mediaKey (cryptography)OvalComputer fileWritingReading (process)String (computer science)Sound effectPhysical systemFile systemType theoryComputer fileQuicksortImplementationKey (cryptography)Data storage deviceData conversionInterrupt <Informatik>CuboidPlastikkarteAbsolute valueAbstractionBuffer overflowStack (abstract data type)Programmer (hardware)Optical disc driveString (computer science)Interface (computing)Real numberExistential quantificationComputer programmingLink (knot theory)Revision controlCodeStreaming mediaPoint (geometry)Goodness of fitDatabaseRepresentation (politics)Reading (process)Social classDifferent (Kate Ryan album)MiniDiscParsingRight angleGame theoryWebsiteDistanceSoftware developerMereologyCartesian coordinate systemFamilyMessage passingGeneric programmingAxiom of choiceSequelSpacetimePoint cloudNatural languageComputer animation
Interface (computing)Key (cryptography)String (computer science)Streaming mediaReading (process)OvalWritingGeneric programmingInvariant (mathematics)AutocovarianceElectronic mailing listFunction (mathematics)outputFinitary relationParameter (computer programming)VarianceSummierbarkeitCountingPairwise comparisonInterface (computing)Direction (geometry)Electronic mailing listArithmetic meanCoefficient of determinationRight angleGroup actionMultiplication signWord1 (number)Type theoryData storage deviceInvariant (mathematics)Natural languagePhysical systemVarianceParameter (computer programming)Interactive televisionSign (mathematics)Theory of relativitySocial classMessage passingCodeCASE <Informatik>CodeWebsiteComputer programmingString (computer science)QuicksortAdditionBitPoint (geometry)Lie groupNumeral (linguistics)TorusPairwise comparisonSubject indexingComputer fileAutocovarianceProgrammer (hardware)FunktorGeneric programmingFile systemComputer animation
Software developerJava appletScripting languageUsabilityPower (physics)Spectrum (functional analysis)Type theoryPhysical systemCone penetration testFormal languageCompilerVariety (linguistics)QuicksortCodeJava appletPolymorphism (materials science)Parametrische ErregungNumberDirection (geometry)Graph coloringCategory of beingMultiplication signOrder (biology)Spectrum (functional analysis)TheoremFluid staticsPoint (geometry)Computer programmingProof theoryWordIndependence (probability theory)Generic programmingMathematical structureMathematicsExpressionMathematicianPersonal digital assistantSoftware developerTerm (mathematics)Normal (geometry)Maxima and minimaLevel (video gaming)Software testingPattern languageNatural languageOffice suiteStatisticsDiagramComputer animation
Software testingException handlingDefault (computer science)Physical systemTestdatenBuildingOvalThomas KuhnSoftware developerSocial classKey (cryptography)Software testingExecution unitDefault (computer science)DatabaseSQL ServerOrder (biology)Arithmetic meanQuicksortTrailCategory of beingInstance (computer science)Object (grammar)Run time (program lifecycle phase)Representation (politics)CodeComputer configurationPlanningType theoryINTEGRALRight angleSystem callFunction (mathematics)CASE <Informatik>Multiplication signCompilerImaginary numberFigurate numberComputer programmingPoint (geometry)Constructor (object-oriented programming)Cycle (graph theory)Observational studyBuildingSequelNormal (geometry)Slide ruleException handlingSemiconductor memoryState of matterComputer animation
Software developerGame theoryType theoryCoefficient of determinationSoftware testingNatural languageComa BerenicesCodePower (physics)CodeCompilerWritingInformation technology consultingFormal languageVideo gameSocial class19 (number)BitMultiplication signHand fanState of matterQuicksortWebsiteInstance (computer science)Cellular automatonRight angleTerm (mathematics)Compilation albumComputer fileChecklistConstraint (mathematics)Proper mapFluid staticsExistenceSoftware developerPhysical systemPattern languageFilm editingFunction (mathematics)String (computer science)Text editorRun time (program lifecycle phase)System callAbstractionQuantificationParameter (computer programming)Line (geometry)Generic programmingComputer animation
Software developerSoftware testingInheritance (object-oriented programming)HierarchyConstraint (mathematics)CASE <Informatik>CodeLimit (category theory)Physical systemSign (mathematics)Software testingBitType theoryParameter (computer programming)Phase transitionSocial classState of matterExtension (kinesiology)Order (biology)Level (video gaming)CompilerPoint (geometry)ResultantComputer fileBoolean algebraConstructor (object-oriented programming)Fluid staticsCompilation albumRight angleMathematicsSet (mathematics)Session Initiation ProtocolBuildingTable (information)Intelligent NetworkComputer animationMeeting/Interview
Software developerHexagonMaxima and minimaComputing platformState of matterLocal ringInvariant (mathematics)Type theoryRight angleSocial classWindowExterior algebraExtension (kinesiology)GodBitCodeMultiplication signProgrammer (hardware)Level (video gaming)Decision theoryParameter (computer programming)CompilerSoftware testingPhysical systemLine (geometry)Boolean algebraGoodness of fitCompilation albumAsynchronous Transfer ModeSystem callDesign by contractPoint (geometry)BlogQuicksortRun time (program lifecycle phase)Formal languageInstance (computer science)Constructor (object-oriented programming)Slide ruleProduct (business)Computer programmingCodeTransformation (genetics)Insertion lossObservational study2 (number)Table (information)Computer animation
Parametrische ErregungPolymorphism (materials science)Software developerMorphismusFluid staticsSequenceType theoryVector graphicsParameter (computer programming)LengthWritingVector spaceElement (mathematics)MathematicianNumberPhysical systemRepresentation (politics)Personal digital assistantLink (knot theory)Type theoryPhysical systemSocial classAbstract syntax treeCodeNumberNatural numberCASE <Informatik>LengthSound effectPrimitive (album)Right angleVideo game consoleComputer iconExtension (kinesiology)DivisorCodeParameter (computer programming)DiagramVector spaceFluid staticsOffice suiteNetwork topologyBitLevel (video gaming)Computer programmingFood energyElectronic mailing listQuicksortGraph (mathematics)Multiplication signInductive reasoningFormal languageWordSequenceInteractive televisionVarianceException handlingMathematicianGeneric programmingLine (geometry)Time zoneSpectrum (functional analysis)Key (cryptography)Pointer (computer programming)Polymorphism (materials science)Parametrische ErregungCellular automatonInstance (computer science)Object-oriented programmingBoolean algebraPoint (geometry)Interface (computing)
WeightSoftware developerExtension (kinesiology)Vector spaceCASE <Informatik>Information overloadType theoryCompilerPhysical systemParameter (computer programming)Multiplication signInterior (topology)VotingSlide ruleLengthFunctional (mathematics)Cellular automatonRun time (program lifecycle phase)Right angleFunctional programmingSummierbarkeitCasting (performing arts)Buffer solutionNumberAsynchronous Transfer ModePointer (computer programming)Power (physics)CodeProper mapOrder (biology)Object-oriented programmingDisk read-and-write headNatural numberElectronic mailing listSet (mathematics)QuicksortFluid staticsWordProgramming languageTypprüfungWritingSystem callPoint (geometry)Exterior algebraSpacetimeRadical (chemistry)Compilation albumMultilaterationLink (knot theory)Maxima and minimaVideo game consoleCone penetration testComputer animation
Software developerExecution unitType theoryVector spaceLengthGeneric programmingFunctional (mathematics)CASE <Informatik>Dependent and independent variablesPoint (geometry)Different (Kate Ryan album)Parameter (computer programming)Natural numberArithmetic meanElectronic signatureLevel (video gaming)Regular graphMoment (mathematics)NumberBitCartesian coordinate system2 (number)Element (mathematics)SummierbarkeitElectronic mailing listSoftware developerDisk read-and-write headRun time (program lifecycle phase)Sound effectEscape characterDivisorSystem callState of matterVariety (linguistics)Archaeological field surveyForm factor (electronics)Right angleForcing (mathematics)WebsiteMultiplication signMilitary baseVideo game consoleResultantVideo gameLie groupLink (knot theory)Maxima and minimaAdditionComputer animation
Type theorySoftware developerGeneric programmingSurfaceSocial classInferenceClosed setPattern languageFunction (mathematics)Inheritance (object-oriented programming)Matching (graph theory)Type theoryElement (mathematics)Dressing (medical)Point (geometry)Formal languageResultantAddress spaceLevel (video gaming)Revision controlComputer animation
Transcript: English(auto-generated)
So, I'm Benjamin. I work at Stack Overflow. I used to work at Huddle, and there are two other Huddlers giving talks now, so it's very important to me that you put the green cards in the box. It's not a competition. It's a competition. So, I should say at
the start, if you have any questions, just interrupt me, shout out throughout. I think it will be a lot more interesting for me and for you if it's more like a conversation than just me sort of blurting stuff at you. And the odds are that if you have a question,
then someone else also has a similar question, and they'll be grateful to you for asking it. So, please just interrupt me. Similarly, if you become bored or you find my voice annoying, just interrupt me. I won't mind. It will be completely fine. So, yes, so this
talk I'm going to attempt to, well, I'm not going to, I don't think I can answer that question. I don't think it's an answerable question, but I hope I can give you some insight into what types are, really. I think we all work in C sharp, and we kind
of feel like we understand types, but I'm, like, lots of talks about types. If you ask a C sharp programmer about generics, they'll probably think about the collection classes or link or something like that. And if you ask, like, one of the speakers on this week who's talking about F sharp, they're probably going to talk about discriminated
unions and railway-oriented programming. I'm not going to talk about any of that stuff. That's boring. I'm going to give you some examples that I've harvested from the real world, code that I've actually seen and written. I'm going to give you some advice. At the start of the talk, there's going to be some good advice, and then by the end of the talk, it's going to be some quite bad advice that
you should definitely never do in the real world, and my challenge to you is to figure out where the good advice ends and the bad advice begins. So let's just get into it. So here's an interface for, like, a file storage class. There's
a read method that takes a path key and returns a stream, and likewise there's a write method that takes a stream and a path key and will write the. So the idea of this path key thing is that it's like an abstract representation of paths, right? So this interface decouples the person that needs to store files on the disk from needing to know where they're
stored on the disk, right? So if we had a string in here that was the absolute path, then you'd only be able to store things, you know, into a file system. And as you can see, there's an implementation of this that works for file system, for the file system, right? So there's an i file system path
generator, which knows how to turn an i path key into a string, which is the absolute path, and then there's a file system reader and a file system writer that know what to do with these absolute paths, right? But as I said, using that interface, decoupling the users of the file store from
knowing what the path is allows us to implement it in some other ways as well. So here's a version that talks to Amazon S3, right? And so there's a key generator that knows how to turn a path key into an S3 key, which is like the Amazon SDK class, which kind of defines where
something's stored in S3. And then there's some readers and writers, readers and writer classes that know how to talk to Amazon S3. And similarly, there's like, you could use Rackspace Cloud files or, you know, any file storage system of your choice, you could store a blob in a SQL database or something like that if you were stupid. But the point I
want to make is that these three classes, they're very similar-looking, right? Particularly these two. What is it that is different about the implementations of these three classes? Well, the main difference is that the file system one uses a string for its absolute path, and
the AWS has an S3 key, which is a different thing, and the S3 key. So in the bad old days of C sharp 1.0, before we had generics, we'd have had to live with this sort of code duplication,
right? Otherwise, they're pretty much exactly the same. They tell something to generate the key, and then they read the thing out using the reader. I don't mind sort of running around, to be honest.
So if we want to develop a sort of abstraction for these three things, the only thing that varies is the type, and so all we really need to do is use generics. So first, we'll generalize these interfaces, like the path generator and the reader and the writer, into an I path generator, which uses
something called a T path, where T path is going to represent the type of our path that you see. So the file system one is going to be replaced with a file store of string, right? A file store of string. Here, T path will be string, and for the
Amazon S3 one, this T path will be an S3 key, and for the Rackspace one, it will be this little thing that wraps up two strings, right? And so the point I wanted to make was that generics are not just for the collection classes, right? They show up in unexpected places. This file store class
has really nothing to do with collections or any of the traditional uses of generics that .NET programmers are used to, but the generic type system has allowed us to de-duplicate some code and write some code that is easier to show that it's correct, because it has less knowledge
of the details of the path, right? So is everyone happy with how this file store thing works? Incidentally, do you know what these in and out things mean? Many people, come on. Variants and contravariants. A couple
of people know, most of you looking sort of blank. In and out there are called variance annotations. So variance is spelled like this, variance. And they're about the interaction between generics and subtyping, right? So the question variance attempts to answer is when is my generic type a subtype of a related generic
type, right? And so if we've got animals and dogs, then we can see that an iron numeral of animals can be an iron numeral of dogs can be used as an iron numeral of animals, right? An iron numeral of dog is subtype of iron numeral of animal. So for iron numeral, the
subtyping relation goes in the same direction as its type parameter, right? So iron numeral is called covariance, because covariance means it goes in the same direction. I compare it, on the other hand, if you need an eye comparer that can compare two dogs, I
should be able to give you an eye comparer that can compare any two animals quite straightforwardly, right? If I can compare any two animals, I can certainly compare any two dogs. And so the subtyping relationship for eye comparer goes in the opposite direction. Does that make sense? It's a little bit counterintuitive, because we're not used to thinking
about subtyping going sort of upside down like that, but I've got some examples that I hope will sort of illustrate this a little bit better. The mnemonic, incidentally, is in and out, right? And the reason for that is, a type can be covariance if it only produces T's, right? So iron numeral of T will give
you T's sequentially. It will never take a T in as a parameter. If all you've got is an iron numeral, you can't put T's inside it. You can only get T's out. Likewise, eye comparer will never produce a T. It takes two T's and compares them, right? But it doesn't produce a T. So I've got some kind of examples that
I hope will sort of clarify why that is the case. So here's an example of covariance, right? We've got a method here that takes an iron numeral of animals and calculates the average age of these animals, right? List of dog implements iron numeral of
dog, not iron numeral of animal. But because of covariance, because iron numeral is covariance in its type parameter, we can use our iron numeral of dogs as if it was an iron numeral of animals. Yeah? Likewise, the eye comparer of dogs, if we've
got a method that needs an eye comparer of dogs to work, but all we've got is an eye comparer of animals, it should work, right? Because all I'm doing is I'm taking two animals in of whatever type and telling you which one's better, right?
And then if your type, if your interface has T's coming in and going out, then it can be neither covariant nor contravariant. That's called invariant. So eye list, you can index into it and get a T out of it, but you can also add a T to the
end of your eye list, right? So we can't use a list of dogs as if it was a list of animals, because the method might try to add a giraffe to our list of animals, and that shouldn't work, right? You should never be able to add a giraffe to a list of dogs. Right? So that's variance and covariance
and contravariance. So hopefully now you understand it. And so next time you see in and out, you can impress your colleagues with long words like covariance and contravariance. It's kind of an academic thing. But it is relevant
to the real world. So any questions about co- and contravariance? Does everyone understand it now? A few people nodding. Okay. So this is a talk about types. So I wanted to
talk about a great variety of type systems in the world, right? We're mostly used to the C sharp type system, which is quite similar to the Java type system and quite similar to the C++ type system. Over here are the untyped languages, which don't have a compiler and a type checker to assure you that you haven't made a type error. But there's a great deal on the far
end of this spectrum that is foreign to many jobbing developers, right? So Scala and F sharp, I've drawn them over here. You shouldn't really read very much into the particular order that I've put things in, right? But Scala and F sharp I've drawn over here because they're also statically typed and they're kind of based on the C sharp and
Java type system, but they make improvements upon it, right? Particularly in the realm of sort of generics and parametric polymorphism, which I might say a few times is the long word for generics. Scala and F sharp kind of make improvements on the parametric polymorphism end of the C sharp and Java type systems. Haskell is kind of very
strongly leans heavily on the, I'll talk about this more later, leans heavily on the static typed parametric polymorphism thing. And then there are some more, these languages are largely sort of experimental. A lot of them are based on Haskell. But these are what are called dependent type systems,
which maybe if there's time at the end, maybe I'll be able to show you something. Independent type systems, types and values are a lot more sort of flexible and you can check a lot more using the type checker and you can do things like values appearing in types and stuff like that. I'm going
to show you an example of that later on. And Coq is, so many of these are sort of based on mathematical type theory and they've got strong enough type systems that you can kind of express sort of mathematical theorems in it. Coq is intended as an assistant for mathematicians, right, to help mathematicians prove mathematical theorems about various mathematical
structures. And so Coq has a very rich, and a funny name as well, a very rich type system and a funny name that is powerful enough to express all sorts of mathematical ideas. Have you ever heard of the four color theorem? The four color theorem is like some slightly esoteric
mathematical idea that if you've got a map of an arbitrary world, you should be able to color it in with a maximum of a certain number of colors. And the four color theorem was proved using an automated theorem proof of like Coq in the eighties or something. So one of the points
I want to make is that the languages at the far end of this spectrum with very strong flexible static type systems are often quite hard to use and they're not necessarily that practical for day-to-day development. They're often developed in sort of the academia land. Haskell kind of came down from academia, and they
were intended as tools for mathematicians, but we can use ideas from these to help us improve our programming in normal day-to-day programming worlds. At the other end, you know, JavaScript and Python are very easy to use and very easy to get started with, but then not very
powerful in terms of the guarantees that they give you, right? What's interesting is that most of the languages on this end of the spectrum give you a way to turn off the type checker, and the reason for that is that a type system often sort of gets in the way. I don't know if any of you have kind of experienced the C sharp type system
not allowing you to write the code that you want to write. There's plenty of code in the world that is correct and has no duplication and so on but doesn't type check. And so many of these languages, so in C sharp, like dynamic would be an example of a way to turn off the type checker or unsafe casts or something like that.
And that's in, so any time you have to do an unsafe cast, that's kind of, that's generally upsetting because it's an example of the C sharp type system not being good enough to prove the properties you know to be true about your system. What else? It's interesting to me that Scala and Haskell in
particular are trying to make inroads in this direction. They're trying to put more stuff into the type checker in order to allow you to encode more sort of correctness properties about your program statically. Does anyone want to
ask me any questions about my rubbish diagram? All right. So let's get on with another example. Hold on. So are you guys familiar with the test data builder pattern? Have you ever heard of this? The idea of a test data builder is that it's
kind of intended to help you with testing, right? It'll give you a sort of, it'll give you a default instance of a given class but it'll also, with its various properties sort of filled in, but it'll also provide you a way to sort of override those
properties, right? So I've got some code here that hopefully will work. So we've got, this is like a factual sort of HR application, right, which tracks when people have been paid and how much they've been paid and stuff like that. And so a salary payment test data builder, when
you create one and then call build on it, it'll produce a salary payment, this sort of entity class that we're going to say to the database or, you know, do some work with. And if you don't call any of these like with date issued or with recipient methods, then you'll just get a default salary
payment with some dummy values sort of filled in for you, right, just a normal date, time, a particular amount. So, you know, if you need a salary payment but you don't really care what's in it, you would actually give you a means by which to override certain
properties of the thing that it's going to produce, right? Now, this salary payment test data builder, let me go back to the slides, can both create an in-memory representation of your salary payment, right? It'll just return a new one. But it can also, it also knows how to
save salary payments to the database. So I haven't bothered writing it, but there'd be some SQL in here that goes away to SQL server and says, here, here's my salary payment object. I want you to put this in the database so that I can manipulate it, right, for the purposes of integration testing. So if you're writing a unit test, you would
call build, but if you're writing an integration test, you'd call save, yeah? But the problem is that in this imaginary database of ours, there's a foreign key from salary payment to employee, and so you have to make sure that you set the recipient of your salary payment before
you try and save it to the database. So if we go back to the code, we can see that there's some checking that ensures that you called with recipient at some point in the past, right? So when we call with recipient, we set the recipient, but we also note down that recipient has been set to something
other than the default value. It has to be a real employee because it has to satisfy the foreign key in the database, right? And then if recipient was set is not true, we check it when you call save and throw an exception and tell you off, right? So the problem is this, if you create a
salary payment test data builder and then try and call save on it, this is an invalid program. It's only ever going to throw an exception, right? We can tell this just by looking at it, but it's kind of frustrating to me that we can't convince the compiler that this is an invalid program. So let's
kind of set about making sure that you can't call save on a builder that doesn't have an employee that you've set. So the first thing we might think of would be to introduce a subclass that knows how to save a salary payment called salary payment salary pay can't type
salary payment test data saver, right? And it inherits salary payment test data builder, and in the constructor, we require an employee recipient and then we just set the recipient in the
constructor, right? And now we can move the save method down to this subclass so you can't you can't you can't call save on a salary payment test data builder you have to create a salary payment test data saver and in order to create a salary payment test data saver you have to set the recipient
right? And that enables us to get rid of that recipient was set thing. We don't need to keep track of that anymore because you had to set the recipient in order to even get hold of the salary payment test data saver so let's just get rid of all this all these runtime checks
okay? Everyone happy with the plan so far? So up here in the usages, if we intend to call save we have to have created a salary payment test data saver right? And it's telling me off because I didn't
I didn't give it an employee so let's just take this and we don't need to call with the recipient anymore so that works great right? This one down here salary payment test data builder dot save that's not going to work because there is no save method on salary payment test data builder anymore so the only option we have
here is to remove the code that doesn't work how terrible what about this case so we don't need to call with the recipient anymore we intend to save it right? we intend to save the salary payment so let's turn this into a salary payment test data saver
saver unfortunately it's not letting us call save on the salary payment test data saver that we just created it's telling me that the save method doesn't exist so why is that?
does anyone know? yes that's right with amount we called with amount and the declared return type of with amount is this salary payment test data builder but it returns this right? so we know that because we just created the salary payment test data saver it's got to be a salary payment test data saver
so we should be able to call save on it, the save method will be there at one time but the compiler can't figure out because we created the salary payment test data saver and we're returning this we basically haven't explained to the compiler that we're returning the very same instance of the builder
what we really want to say is the thing we return from with amount is the type of something like that right? but there is no way to say this in C sharp so this is kind of this is what I meant when I said that the
compiler, the type system often gets in the way a little bit but can we make it work? well we know that we know that we're returning something what we really want to say is we're returning something of the current type but one way to do this would be to use a type parameter for the return
value of with amount and so on so let's try this let's call it T self right? because the intention is that this T self type parameter is going to be is going to be the type of this so T self and this is going to return T self as well whoops
right? so now it's telling me off because because I'm returning this but the compiler hasn't figured out that this is the same as T self there's not really any kind of constraints on what T self should be T self could be anything so how can we make sure that this is
T self? well let's make this an abstract class abstract and what we really want to do is say to the subclass I want a way to get hold of a T self and the intention is that it's going to be you know the instance
of this protected right? and so we can say return this here return this okay? is everyone clear with where this is going?
so salary payment test data saver is a salary payment test data saver which returns salary payment test data savers from all of its with methods right? and now we need to implement that override which hopefully v sharper is going to do for me and then here
because we specialise T self to be salary payment test data saver the return type of this has to be a salary payment test data saver and so we can return this right? everyone clear with what's going on here? why is it red? I forgot the return
right? and so now we get to write something fun which is class salary payment test data builder is a salary payment test data builder which returns salary payment test data builders and so of course we need to override that this
method and as if by magic our code compiles right? so we can now call test data saver and then call dot with amount and because the compiler because of the magical power of generics
we've been able to it's quite weird though right? this class looks like any code that requires you to write the same thing on a line three times next to each other that's always a bit kind of strange and there are other problems with this right? so for example we could say
create another subclass of salary payment test data builder called the nonsense builder which is a salary payment test data builder which returns I don't know strings but the problem is that the intention of that T self was that it would be the current the actual runtime type of the builder right? so I mean
this is perfectly valid code just you know return foo so can we get compile time assurance that this sort of nonsensical class doesn't exist? what do you think? can you think of a tool
that the language gives us? yeah we could say where up in here so salary payment test data builder T self where T self is a subtype of salary payment test data builder of T self so our code is kind of folding in on itself in a weird
but we made this work the nonsense builder can't work anymore because string is not a subtype of salary payment test data builder of T self right? so we have to it's not quite so this technique here I think in C++ they call it the curiously recurring template pattern
which I quite like it's also known as f-bounded quantification this where clause is called an f-bounds because you know it was invented by professor something in 1971 or whatever and he gave it the boring name of f-bounds I prefer the curiously recurring template pattern
so so this f-bounds thing the where T self is a subtype of this it's you'll see this show up I mean it's frequently used in languages like Scala because Scala developers like static types but what it means if you ever see something that looks like this
what it means is that T self is intended to be the concrete type of the instance right? if you ever see this sort of this f-bounded quantification the problem is that f-bounds don't quite mean the same thing as this imaginary this.type thing
that we wanted to write right? f-bounds do not quite mean the same as self-type and I'm going to illustrate that we can still write a nonsense builder which returns salary payment test data saver right? this type checks it will type check
I'm just going to turn null because I'm not interested so this type checks salary the type parameter here is intended to be the concrete type of this but it's not right? the nonsense builder has no relationship to salary payment test data saver so f-bounds aren't quite as good
as proper self-types okay? but I mean this code's like have you guys ever opened a picture file opened a jpeg in a text editor? and the text editor tries to sort of interpret the jpeg file as ASCII and you just get garbage out I put this code into a .gif file and something very strange happened
this is what came out yeah, I mean like, how many of you think that you could debug this code? I mean I don't think I'm going to understand what's going on I don't think Nick Cage understands what's going on it's very strange, it's like really weird
this code, isn't it? I just don't understand it at all so I want to show you a different way of doing the same thing right? because it's you often learn something about a system by trying to push it to its limits and so that's what we're going to try and do here
the idea this file is the same file but I've kind of rewound it a little bit so we're back quite near the beginning again so we've just got this one salary payment test data builder so the employee was set thing, whether or not you've called with recipients that's a bit of state that's on the builder, I think
so we could probably make the compiler check that state if we found a way of representing the employee was set variable, if we found a way of representing that state within the type system so here's the idea I'm going to write some
type level data, which is a concept which may not be familiar to many of you so I'm going to write a class called true and a class called false and the intention is that these are going to be booleans that live inside the type checker and we're going to use these boolean classes these boolean types so I'm not very interested in the
values of these types what I'm interested in is I'm going to use the types as if they were values but the type checker understands them so the idea is that we're going to use this true and false class to represent whether or not the employee of the salary payment test data builder has been set at this point
right? so let's say has recipients, that's what we're going to name the type parameter here so when you call with recipients no, let's start with with data issued so what is the return type of this with data issued method
it's going to return a salary payment test data builder of the same which hasn't if we've already set the employee on the test data builder here if we've already called with recipients and then we call with data issued then the recipient has still been set right? and likewise if we haven't called with
recipient yet and we call the recipient still hasn't been set so the return type of this is going to be whatever we're going to just keep has recipient exactly the same as it was right? is everyone clear on what this is going to be about? and likewise with amount
has recipient doesn't change but with if we call with recipient the idea is that it should go from having not had the recipient being set to having had it be set right? so the value of has recipient for the return type of with recipient is going to be true right?
and we can't return this what we're going to do is we're going to return a new salary payment blah blah blah and I've got a constructor which I've commented out here so we're going to return a new one
and we're just going to pass in the same values that we had before where have I gone? so it's going to be recipient this.recipients this.this.datashued and this.amount and then so this is basically creating
a new salary payment test data builder with the type parameter updated to true whatever it was before we've now set the recipient so the value of has recipient here whoops for the newly created test data builder is true so what we need to model now is the fact that when you first create a salary payment test data builder it starts out in the
false state right? with no with no so we're just going to with no recipient having been set so we're just going to create a new subclass salary payment test data builder is a subtype of salary payment test data builder of false right?
and yeah let's make this constructor protected okay? I've got some compilers okay there we go so now that we've set about
representing this the state of the test data builder in the type system so remember when the state changes when we set the recipient we get back a salary payment test data builder which knows that the recipient has been set or for which the compiler knows that the recipient has been set excuse me
so now all we need to do is make sure that you can't call save unless has recipients is true so I'm going to in order to maintain the API I'm going to put this in an extension method static class builder extensions
and I'm going to just move this save method public static and it takes this salary payment test data builder for which has recipient is true right?
and then we call builder.build in order to get the payment and then we save the entity so now we get the same result as we did before you can't call .save on a salary payment test data builder for which you haven't set the recipient and so the point I wanted to illustrate here was that
I'm just going to comment this oh dear the point I wanted to illustrate here is that you can convince the type checker of all sorts of things that you may not have expected were possible right? these true and false classes they're meant to represent booleans but they live inside the type system it's kind of like values
that the type checker understands right? normally we expect booleans to be sort of running around in our program at runtime but we can also manipulate them to a certain extent at compile time any questions?
some people are looking a bit confused okay um what about uh it's still not quite yeah okay no I'm not going to talk about that so what I wanted to illustrate with that was um if you think about your data
if you think about the states that your data can transition through and whether or not it's feasible to represent those states in the type system you can actually increase the compilers knowledge of your system in ways that you might not have expected right? and so if you can come up with
types that look like the data that you would normally represent at runtime then there's not really any reason that you can't use them at compile time as well do you see what I mean? yeah yeah sure yeah yeah so the idea is that when you create
when you first create where is it when you create a salary payment test data builder right? so when I call the constructor of this like um like up here right? then what I get is a salary payment test data builder which is for which that has recipient type parameters false
yeah? so it always starts out with the false state that is the recipient hasn't yet been set um then when we call with recipient we get out of it a salary payment test data builder of true that is to say has recipient is now true right? for this salary payment test data builder
and we do that by creating a new salary payment test data builder and copying over the data does that answer your question? so the idea is that when you call with recipient the builder undergoes a transformation from false to true right? and then we
can assert that the value is true using that extension method yeah go on well has recipient isn't an actual type right? it's just a parameter right?
so what we could do is we could say something like class the idea is that this would be like a type level boolean so let's call it bool and then true and false would be instances of bool whoops would be subtypes of bool I should say and then we can
assert that has recipient is a subtype of bool yeah? so um I didn't bother originally showing you that but yeah go on uh huh
yeah you could yeah yeah but let's suppose that we're putting this in an assembly and distributing it right? that's a good question
I don't think you could in C sharp I think you could in a more strongly statically typed language like actor maybe at the end if there's time I'll show you some actor yeah anything else? okay
yeah so I suppose I could rename it to t has recipient let's call it t has recipient yeah? faz recipient okay?
so like ordinarily the sort of checking this sort of checking like have you set the recipient ordinarily that's that's saved until runtime right? but we can convince the compiler to check various sort of invariants of our types if we use the type system in creative
ways um right is everyone else happy? there were quite a few questions so there may be more yeah right right indeed or
yeah yeah yeah that's right you could say where is it yeah yeah yeah that's right so you're saying that you could come along and say this yeah that's that's an alternative way of
oh god yeah that's an alternative way of kind of baking that I mean the problem is another problem is that someone else later could come along and say class you know perhaps and that or or
have you have you seen that daily wtf posts? I don't mind um so yeah there are ways around I don't want to get too far into that because yeah
so so earlier when I challenged you to find the line between good advice and bad advice you reckon we passed that line now well I'm making no comment yeah anything else?
would I recommend using it in production codes? I'll let you be the judge is it reasonable?
I'm going to let you decide I'm not I'm not going to offer an opinion on that yeah there was one at the back was there? were you raising your hand? I thought you were, no never mind anyone else? yeah
not as far as I know Scala has a self-type yeah I don't I think Eric Lippert has written a bunch of blog posts the C-shirt designer or one of the C-shirt designers has written a bunch of blog posts about the fact that people often come up to him and say why doesn't C-sharp have this feature
and he invariably says well because it's quite expensive to implement a feature for a language that's used by hundreds of thousands of programmers around the world right because not only do you have to implement the feature you have to think fully think through the design of how that feature is going to interact with all the other features you have to test it you have to see like how is it going to work
in a certain locale in a certain platform of windows all that stuff and so the design decisions that go into C-sharp aren't just a question of would programmers like this right they're a question of can we afford to build it as well right so yeah I would like it if self-types were available in C-sharp
I can appreciate why the designers chose not to do it
I'm not particularly familiar with code contracts my understanding was that it was it puts runtime assertions into your code and then erases them in release mode is that right? ah right and it uses like an SMT solver or something like that
does it? yeah I don't know I expect you could probably use contracts to do something similar to this but you know the fun of it is trying to do it with the type system right anything else? okay
what's the time? okay 20 minutes so I've got another slide so I've drawn a little graph a little diagram what I'm trying to illustrate here is that generics, parametric polymorphism
they're synonyms and subtype polymorphism are like they're not opposites exactly but they're kind of two different axes but the idea of both is that we should be able to write polymorphic code that is code that works over a number one piece of code that works on a number of types but parametric polymorphism and subtype polymorphism
are very different ways of going about it and so parametric polymorphism you say that my type T can be any type in the entire system so you get a great deal of generality out of it, whereas subtype polymorphism you don't get so much generality because you say my type has to be a subtype of some interface or something like that on the other hand
parametric polymorphism you have to anticipate that that your type is going to be used parametrically whereas for subtype polymorphism you can always come along and add a new subtype of a given type, right? So they're kind of complementary in a way but in that the pros of subtype polymorphism are the cons of parametric
polymorphism and vice versa I've drawn a sort of graph showing where various languages on the spectrum fall and I've also tried to illustrate that I haven't really got time to give you many examples of this but variance is one example of
the surprisingly complicated interaction between modern generic type systems and modern subtyping systems I think Scala I was being polite when I put it outside the complicated zone, Scala's really very much in the complicated Scala's quite complicated and so going back to your point earlier trying to avoid the complicated
zone is one of the key trade-offs in language design, right? You don't want to make your language too complicated for people to understand and some of the languages on the right hand end of the spectrum that I showed you earlier are too complicated for the majority of people in the world they're aimed at academics and professors and stuff
yeah any questions about my rubbish graph? my other rubbish graph? alright let's get onto the fun example so not this can anyone tell me what these two lines do?
what does this program do? come on so it's an exception, yeah? it's an empty list and we're trying to get the first value of it first value of an empty list that's never a sensible thing to do, right? it's an invalid program to try and get the first value of an empty list you can't do it
so wouldn't it be great if we could prevent this invalid program using the type system? wouldn't it be great? so the idea is that we're gonna rather like we had a Boolean bit of type level data earlier we're gonna have a bit of type level data that represents numbers
and we're gonna teach the type checker what to do with numbers the type checker is gonna understand how numbers work and then when you call first on a list we intend to ensure that that list has a length
greater than one so what would type level data for numbers look like? well let's ask mathematicians mathematicians define natural numbers inductively they say fact number one is that zero is a natural number and fact number two is that you can always create the successor of a natural number
and so we can represent this in the type system a class called a type called zero and a type called successor which has an actual number inside it right? so z is zero sssz is three
right? do you understand where this is going? so the idea is that we're gonna have a type parameter rather like we did for hasRecipient we're gonna have a type parameter on our list which goes up by one that is adds an s every time you add something to it so I'm gonna give you a quick recap of linked lists
because linked lists are how we're gonna do it does everyone know how linked lists work? there's nil at the end of your linked list and then there's a sequence of con cells in front of it and each con cell has a pointer to the rest of the list right? and so I've got some codes here's our
let me get rid of that region here's our natural number type z and s that I described earlier and here's a linked list so there's an abstract class called linked list and then there's the nil case which has nothing inside it there's the cons case which has an item and a pointer to the rest of the list
so let us parameterize let's rename it to vec vec for vector we're gonna teach our vector to know how long it is so first off we'll start by parameterizing our vector by some type n which will be the length of the list
n being the natural number right? and so okay so what's the value of this type parameter for the nil case? so z exactly so when you create a nil what you get back is a vector of length 0 what about cons?
so we want to increment something right? which involves calling successor but n could be anything right? you could have cons and then nil so n would be 0 in that instance so you could have cons and then n would be 10 or whatever so n has to be a parameter right?
cons of t and n and so the tail of the cons cell is gonna be a vector of length n so the idea is that you put a vector of length n in and you get a vector of length sn out
yeah? and so now we can write another little extension method static class vec ext public public that's embarrassing static static t oops
t first and it takes a t and an and an n well let's come back to that so what is the type of the parameter of the argument of first? it's a vec of t and what?
so it has to be a vector that's longer than 1 right? so what does that mean? successor of n yeah successor of some n I don't know what n is gonna be you can decide what n is, I don't mind and again we have to treat n parametrically
so we know here that the vector has at least one cons cell in it because the only way to get a vector of s and n is by calling cons by creating cons yeah? so what we need to do unfortunately we have to cast oops can't type
cons of t and n v and then we can get the item from it it's annoying that we have to cast it here right? we know that the only way to get a vector of length s of something
is if it's a cons but the compiler is not clever enough to understand that indeed it's not really true because someone could come along and add another subclass of vec later on um that manipulates the type parameter in a different way so we have to cast it to a cons here in order to get the head out of it in a
proper dependent type system the compiler would know that if you've got a vector of s of something then it must have an item inside it, it must be a cons so let me, I've got some example code just comment this
and so I hope I got this right it's unhappy with something cannot convert from vec vec dot nil int I might have got the
can anyone tell me why it's not type checking it's telling me cannot convert from nil of int to vec of z and int oh I've got these type parameters the wrong way around that's right
this is what it is so there we go, okay, so we've got three items in our linked list three, two, one
and the type of the linked list is sssz which is three so we achieve victory right? and here we've got an empty vector and we can't call first on it it's telling me that blah blah blah it's telling me that there's no extension method which takes a nil, a vector of
length z any questions about vectors? there must be some yeah let's try and write that's a good question let's try and write sum
let's try and add up a vector of ints sorry, how far can we go before we get stuck can we go further than this before we get stuck I guess is your question public static int sum with an n in there
right vec of int and n so supposing that we can do let's resign ourselves to doing a runtime check here right, so if v is vec of int and zero
right, then it must be a nil so we just return zero, there's nothing in an empty list otherwise we want to recursively sum the rest of the linked list and then add the number that we've got on the head of it to the front right so we want to return
return let's so it would be v.head plus sum sum of v.tail so what do we need to cast v to in order to get the head and the tail out well v.c equals
it's a cons of int and well what do we put here
yeah that's an alternative so it would it would terminate here right because we've done the runtime yeah so we can try that, it also won't
work so public static int, we can try overloading, creating an overload for for the case where the vector is empty let's just say vec of int and
z so this is going to be a nil and we just return zero here and then so this must be a vec of sn right and then we can cast it to a cons of int and n
but we still get a type error here and the reason is that it doesn't the compiler doesn't know which overload of sum to call right it doesn't know whether n is going to be zero or something else at this point so it can't tell whether to dispatch overloads are resolved at compile time so it doesn't know
which sum to call here it doesn't know whether to call this one or the one up there yeah so we've got stuck that's the answer we got stuck as soon as we tried to consume a vector using a normal recursive function the C-Shark compiler fell over yeah um
what's the time what time does this end 22 it's now 25 so we can finish the slides or I can show you five minutes of Agda which is a dependently typed programming language vote hands up if you want to see some Agda
lots of people ok let's do some Agda so this is an Emacs buffer Agda's got quite a powerful Emacs mode which I will show you so let's write natural numbers Agda's a sort of it's a functional programming language rather like Haskell but it's got a much stronger type system Haskell
Nat, naturals, so set is Agda's word for type, so nat is a type, right? Types and values are the same thing, so types have types in Agda, and the type of nat is a type. A bit confusing, right, already, but let's go.
Zero is a nat. Nat, this is the inductive definition that I showed you earlier, and then successor takes a nat as an argument and produces a nat, right? So let's define addition on natural numbers. Takes two nats as arguments and produces a nat.
So if we load it into Agda, it says, okay, I'm happy with your definition of nat, now what do you want to do to fill in the definition of plus? Well, let's fill in the arguments first, let's call them x and y. So Agda, what could x be? So Agda's telling me that x could be zero or it could be the successor of something
else, and those are the two cases that I need to fill in. So in the case that x is zero, the sum of x and y is y, and in the case that x is the successor of something, we want to recursively add x and y and then put the successor back
on, right? So that was just a preamble of how to define functions and data in Agda. Let's do vectors. Data vec. So vectors are parameterized by the type of things they contain, which is this a, and indexed by a natural number. So there's a difference between parameters, I'm not going to go into it here.
But there's a natural number in the type of a vector, and so my point is that you need to give, if you've got vec, you need to give it a type and a natural number before you get back a type, right? That's what that type signature means. So there's the empty vector. What's the type of the empty vector?
What do I put here? So this first parameter is going to be a, right? So it's a vector of anything you like. And the length of it is what? Zero. And then I'm going to write cons as an infix operator, like plus was.
So what's the type of cons? Well, we give it an a for the head, and we give it a vector of a and something, n, arbitrary n for the tail, and what we get back is a vector of a and the successor of n.
So Agda is quite happily manipulating natural numbers. It's going to complain here because n is not in scope, so we just say there's a natural number inside there as well, but I want Agda to figure out what the number is, okay? And so now we can quite easily write head.
So head, as we said before, takes a vector of a and the successor of something and produces a something, produces an a. So the length of the vector that you give to head has to be one or greater, right? And it's going to complain because a and n aren't in scope at the moment.
So n, that. But the curly braces just mean I want Agda to figure out the values of a and n based on whatever the runtime, whatever I end up calling it with, right? Head. So let's introduce the parameter.
So Agda, what could x's be? We've told Agda that it has to be length greater than one, so when we ask Agda what x's could be, it doesn't give us a case for the empty vector. It's telling us that it has to be a cons, which is interesting.
The type here is participating in the development of our function, right? We've explained to Agda that it has a length of greater than one, and Agda has told us in response that, hey, it can't be an empty vector in that case. So here we just return x. Let's do adding vectors together, right?
Concatenating two vectors, so let's call it plus plus. What's the type of the function that concatenates two vectors? Well, it takes a vector of a and n and a vector of a and n, and what does it produce? Vector of a and what? n plus n.
So remember, plus is just a regular value level function that we defined. Oh, I've got to go in a moment. But we are able to use that regular function in the type, right? This is what dependent types are all about. Values can appear in types and be manipulated by the type checker just as regular values can and just in the same way as types can.
So I just want to finish this example. So it's complaining because a is not in scope. n and n and that's, let's call them x's and y's.
So what could x's be? Agda's going to say it could be an empty list or it could be a cons. So in the case where it's an empty list, what's the result? What do you get when you concatenate something onto the end of an empty list? y's. What do you get when you concatenate something onto a cons cell? Well, actually, I think we told Agda a fair amount about the type of this function.
So Agda, what do you think the answer is? The type checker's writing our code for us. So is there anything else you guys want to see? I could do take. Take's kind of fun.
So take, you give it a number, a natural number. So n of type nat and a vector. So the vector has to be something longer than, at least as long as the type n, right? As the number n. So you can't take five things from the four element vector. And so we're telling the type checker that it has to be n plus something.
It can't be shorter than n. And what we get out is a vector of a and n. So what's interesting about this is that n is a value, a runtime argument, but it's appearing in the return type. So take, it's going to complain because a and m are not in scope.
So Agda, what could n be? Well, it could be zero, or it could be a successor of something. So in the case that it's zero, we just return a vector of length zero, right? Indeed, Agda can tell us what the answer is,
because there's only one way to get a vector of length zero. In the case that it's the successor of something, we need to be able to pull the item off the front of x. So again, Agda knows that vector can't be the empty vector. And I think Agda can fill it in for us again.
Let's just finish the slides. It'll be very quick. So I think I've shown you that generics are something rather different than what you thought they might be, I hope. This isn't Agda, this is Idris. Idris is a very similar language to Agda.
But I think the point I wanted to show you was that in Agda and Idris, it actually does work. In C sharp, it doesn't work. But look, the elements that we saw in the C sharp version, which didn't work, are present and correct in the Idris and Agda version, and you can manipulate type level data and values can appear in types.
Types and values are the same thing in Agda and Idris, and the result is a much freer and more flexible world. So going back to the question that I started with, what are types? I think I've shown you that, I don't know.
Thank you.