
Standing on a Beach, Staring at the C


Formal Metadata

Title
Standing on a Beach, Staring at the C
Series Title
Number of Parts
163
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify and reproduce, distribute and make the work or its content publicly accessible, in unchanged or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner specified by them and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
C has followed a slow and steady progression from being a high-level assembler to a general-purpose language with a strong systems focus, becoming the lingua franca that ultimately holds the software universe together. In over four decades the idioms and work practices around the language have changed. But only a little. What if we look at the language from the perspective of other trends? Without trying to fake or shoehorn the language into another paradigm, what practices — big and small — can we bring to it from the worlds of C++, OOP, functional programming, TDD and others?
Transcript: English (automatically generated)
Right. Good morning. Well done, nine o'clock. So just a quick question: who is actually using C or C++ actively at this point? Okay, good. Who's using C as C, as opposed to "hey, look, it's a subset of C++"? Okay. Who's using C99? Okay, that gives me a rough idea. So, yeah, I think the talk title probably came before I figured out what the hell I was going to talk about. Actually, no, that's not strictly true. What I was interested in doing was not giving a sort of blow-by-blow look at language features, C99, C11, libraries, bits and pieces, because I haven't been working in it actively for a while. I'm basically just a tourist. I do bits and pieces here and there. I've been involved in C++ standardization, and I'm briefly kind of involved in C standardization, though I must say I was more of an observer than anything else. But a lot of people use straight C, and there is a classic style of C, or a number of classic
styles. One of the things I have found in going from one language to another is that I've learned an immense amount, not just about new languages, but about how I could have been using the older languages I was already working in. Although the culture is helpful in transmitting practices, sometimes the culture gets stuck, and there are certain ideas that don't migrate very far and are seen as foreign to that culture, non-idiomatic. And yet there's a lot that can be said for them. So my interest in this kind of idiomatic approach is reflected in my interest in patterns: the idea of looking at and understanding the context in which something occurs, to try and help you shape a solution, and to try and look at it a little independently of the way that
many people might do just as a habit; in other words, to question it: what are the trade-offs? What is the problem I'm trying to solve here? I'd love to say go out and buy these books if you're interested in patterns. Sadly, I would say that if you're really interested in patterns and you've bought about three other books, then it's time to buy these, because they will answer all the remaining questions, but they're not without interest. One that does prove particularly popular is 97 Things Every Programmer Should Know. And I know it's popular because I just signed a copy five minutes ago, and it's been translated into a variety of languages. It's delightful; I've got a number of books that I can't even read the alphabets of. Very exciting. However, here is one thing I do want to take. I want to take Russell Winder's observation. Very simple idea: know well more than two programming languages. The basis being that cross-fertilization is at the core of expertise. Idioms, problem solutions, that apply in one language may not be possible in another language.
That immediately teaches you about something, something that you may have assumed was a given, a way of doing something that was natural and universal, you suddenly discover is highly contextual. But at the same time, when you encounter this, you discover that
when you try and move idioms from one language to another, sometimes ideas flow very easily. Sometimes they don't, teaching you about both contexts. But what is interesting is where you take an idea from somewhere and you say: look, I can't do it exactly like that, but if I were to adopt this style, it would solve a class of problems. This I found
particularly useful, for example, with functional thinking, as well as originally with object thinking: just being able to say to somebody at the architectural level, I know you're not necessarily using functional programming, but I would consider this approach, and that will be supportable idiomatically, with review and so on, and with it you will change the fundamental runtime properties and development-time properties of your code base. That is what we're saying. We're not just talking about syntax sugaring; we're talking about something a little bit deeper here. So let me just start off in that case with one of the books... well, I say one of the books, I've got three
editions of it: the third, fourth and fifth editions of C: A Reference Manual by Harbison and Steele. It's difficult to know which one was the most defining one for me. I'm going to say probably the third edition, because you can see it's slightly browner at the edge, so therefore it's slightly well-thumbed. But actually, this one is more definitive from a general point of view. It's very well written. The fifth one is perhaps not as well written, but it does cover C99. There's stronger cohesion in the fourth edition, so I've taken this quote from the fourth edition, and this is a subset and approach that I used extensively a number of years ago: "C++ is nearly, but not exactly, a superset of ISO C." Now, of course, what we mean by superset has shifted over time. Since the writing of those words, C++98 has become a standard; we've had C++11 and C++14. We've also had C99 and C11. And these subsets are ever-shifting, but they're mostly stable. "It is possible to write C code in the common subset of the ISO C and C++ languages, called Clean C by some, so that the code can be compiled either as a C program or as a C++ program." Well, what is the virtue of this? When they're
referring to ISO C, they are very much referring to... well, you can argue about whether it should be called C89 or C90; as they're using ISO, we'll call it C90. The idea is that you are able to write in a common subset because you are dealing with a language where, when you recompile it as C++, the semantics do not change, except in a couple of very well-documented edge cases, and you probably shouldn't be there anyway. So in these cases you get a lot better type checking; you get a lot of stuff for free. What is the other motivation for doing this? It allows
your code to be integrated into a larger ecosystem. You can integrate the code more easily if it's written in Clean C. With classic Clean C, if we're talking C90 style, you can integrate with a ridiculous number of APIs and then bridge into the world of C++. A lot of APIs to systems are provided in C; when they say "we have a C API", what they're actually saying is "we have a C90 API", and the chances are it will probably be in the Clean C subset, because they probably have these issues as well. There are only so many support calls you can take from somebody saying "we're trying to build your code with our C++ compiler and it's coming up with these problems". And you can either fight that, or you can say: you know what, it's really not a big deal, we can just change. And this is stuff like steering clear of certain keywords, watching out for a couple of things that are normally dubious practice anyway, clarifying your types; in other words, making it a lot more lint-friendly and so on. So you're getting all of these benefits. But of course, there are constraints. The natural constraint is that C99, and subsequently C11 (C11 is a relatively modest update), have moved off in different directions from the main line of what C++ is doing. So this subset is looking a little more subset-like. There's a bunch of features where you have to make a very clear, conscious decision: am I or am I not going to be working in the Clean C subset? That means I can't use things like variable-length arrays, I can't use designated initializers; there's a whole bunch of stuff you'd have to steer clear of. On the other hand, the ability to participate
in a larger set of tools, for example testing tools, is very helpful: being able to work from C++ and test the C code coming in from the outside, because quite frankly, the C frameworks suck when it comes to testing, by comparison with what is possible in C++. I will just qualify that by saying that my C++ frameworks also suck, but they suck ever so slightly less. They don't need to, but that's a different talk and a different subject. So okay, what about other basic habits? First things: the usual stuff that you pick up from Bjarne Stroustrup. Bjarne hates the preprocessor. He really doesn't like the preprocessor at all. He really wanted to get rid of all uses of the preprocessor; he did not entirely succeed. But there are still loads of old C-style habits that I see creeping into people's code, which they don't need, particularly if they're using something like C99, but even if they're not. So macros: what are they normally used for? Well, a lot of macros get used (and we need to call this into question) for things like inline functions. Now, if you're using C99, inline, you've got it. Work with that
rather than the challenge of macros. It's a very simple observation; I made it a number of years ago, though I probably picked it up off somebody else: in C, if you can read and understand a function macro, it's probably wrong and broken. If you can't read it, then it might be right. In other words, you have to jump through so many hoops just to make sure that you've isolated that argument, and this, that and the other. So if you can read it, that's a bad sign. Just work in the language; go as far as the language will take you. One of the most obvious ones is const, and quite frankly, people have this love of putting stuff in header files.
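The macro-versus-inline point above can be sketched like this (the names are illustrative, not from the talk): the macro needs defensive parentheses everywhere and still evaluates its argument twice, whereas the C99 inline function has ordinary call semantics.

```c
/* A classic function-like macro: needs the defensive parentheses
 * and still evaluates its argument twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* The C99 replacement: an ordinary function, argument evaluated
 * once. 'static inline' sidesteps C99's extern-inline linkage
 * subtleties for a single translation unit. */
static inline int square(int x)
{
    return x * x;
}
```

With a side-effecting argument the difference shows up: `square(next())` calls `next()` once, while `SQUARE_MACRO(next())` expands to `((next()) * (next()))` and calls it twice, which is exactly the sort of hoop-jumping the talk describes.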
They put constants in header files. I have no idea why. I mean, really, I do not have any idea why. Most constants should just be names and most of them should actually be local. They shouldn't be public to everybody. They should be based at the level of the function and passed down as parameters. The stuff that you find in header files,
you find constants that are numeric, integers, floats. You find constants that are strings. Strings do not need to be in header files. Get them out of that, okay? There's no reason they should be up there. Every time you decide to change, when you discover that your constants are not actually constants, you set off a rebuild. So don't put yourself
through that pain. So just quite happily go ahead and use something like const; that will do the right thing, with a couple of linkage qualifications. For floating-point numbers, same deal again. For integers, we may actually want a compile-time constant, at which point const in C breaks down: it does not support that. But one of the classic techniques, in fact, if you go and look in K&R (the New Testament edition), and if you look in Kernighan and Pike's The Practice of Programming, you'll see that they use the enum trick: declaring anonymous enums. It's plain and simple; it gives you integer constants that are in the language and do crazy things like respect scope instead of trashing everything in their path. Just use it. There is no reason not to use it. Your colleagues may come up and say, I could not read your constants. I don't understand
things unless there are hashtags in front of them. I have yet to see somebody refer to this as "hashtag defined", but I think that generation is coming. The point there is: use the enum, because the language knows about it. Everything knows about it; errors will be reported sensibly. Simple transformation: we take something where people decide, hell, I've had enough of typing backslash zero, so I'm going to go and define NUL, in capitals. Use the enum instead; there you go, bang, done, we're in the language. One proviso, and this is one that a lot of people trip up on: please don't use macro naming for non-macros. This is such a habit people get into. It was such a habit that it caused Java people to write their constants in upper case, which is so bizarre. It's a case of: why are you shouting at me? Because it's a constant. That's okay; the compiler will tell me if I do anything wrong. That's what static typing is all about. Why are you shouting at me? It's not the most important thing in the world.
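The alternatives being recommended here can be sketched as follows (the names are invented for illustration); note the lowercase naming, per the proviso about reserving upper case for macros.

```c
/* Instead of #define for every constant: */

/* The enum trick: a genuine integer constant that the language
 * knows about, so it respects scope and appears sensibly in
 * error messages and debuggers. */
enum { buffer_size = 64 };

/* const covers strings and floating-point values; static gives
 * internal linkage, keeping the constant the file's own business. */
static const char greeting[] = "hello";
static const double scale_factor = 1.5;

/* Unlike a const int in C, an enum constant is usable wherever a
 * compile-time constant is required, such as an array bound in a
 * non-VLA declaration: */
static char buffer[buffer_size];
```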
It turns out the reason we use that convention, always remember, is to stop macros trashing the rest of the language. Therefore, if you use it for non-macros, you will be trashed; if you reserve it for macros, you stay out of harm's way. Nice and simple. Those are the basic little bits and pieces, but there's a point here about program structure. I was thinking about, how did I want to
phrase this? I realized that people are always going on about the DRY principle, the SOLID principles, this principle, that principle. Actually, one of the most important principles for me is the principle of locality, or locality of reference. Now, in software, we know about locality of reference: it's to do with access, and it's normally described in terms of storage locations and frequent access. When I refer to it in programming, I'm generally talking about this: locality of access. I'm talking about the screen; I'm talking about the brain. Locality of access: put things together that should be together. This should affect your declarations. A lot of C programmers still write in a Fortran style, where they put all of their declarations right at the top of these huge, honking great functions. There's no need to do that. You can put them at least within the block, but C99 liberalizes this and allows you to declare anywhere. It makes life a lot easier, but this also goes further than that. This is back to the comment
I made about constants. Most constants really have no business being in header files. They are not globally available. They should be somebody's private business, or they should be the result of a function. Most constants, quite frankly, are not constants. Put them somewhere appropriate. In other words, what you think of as a constant, particularly configuration, is actually the answer to a question. What you need to figure out is
what is the question. What function do I ask to give me this value? If that's a local value, that's absolutely fine. It's held within a module. What do we mean by module? Module is not a formally defined term. Because C allows you to do anything you want with the preprocessor in files, you have to impose a structure. A module is an informal
term that we might use to talk about a header and source pairing. It effectively equates to a translation unit, but that's a very compiler-centric view. The header source file correspondence is something that I'd like to say is universal, but it's not
as common as I'd like to see. In other words: here's the public part, here is the private part. I have to say, what affected my C programming most, at around the time that I was doing C, was learning about Ada and Modula-2, because they have a really strict idea of: here is the specification, the interface to what you're going to use, and then here is the body that defines it. Once you have that, it gives you
a very coherent model. What I find is that a lot of C projects are mired in messy headers: oh look, there's constants.h and there's enums.h, or something useless like that. You've got features randomly scattered across the files; it's why grep was invented. Where the hell is my stuff? Now, this idea of modularity leads
us into the question of: well, what is my motivation for this stuff? We can go back to the 1970s, to 1972, where Parnas gives a kind of classic definition of information hiding. We propose that one begins with a list of difficult design decisions, or design decisions which are likely to change. Each module is then designed to hide such
a decision from the others. What he observes in this classic paper is that data representation is one of the most common and most commonly shifting areas of change. Therefore, this idea of organizing modules around data types is a classic one. C has no opinion on this, but because it has no opinion, you can shape it that way. The minute I started
doing my C code like this, instead of the kind of classic procedural view, life got a lot easier. But obviously this takes us into kind of an expectation of objects. Now, I'm going to say that C is not really where you want to do full-on object orientation. And in a lot of people's minds, they kind of go, oh, well, object orientation,
that's all these language mechanisms and so on. We're going to step back from that and try and understand a little more about the motivation. And I will say, quite frankly, this is a very interesting book. It's written in six sections, five of which are unreadable. It introduces and uses a sigma calculus to try and represent
a theory of objects. And they don't even use the regular sigma. I learned a new variant of the Greek alphabet. I thought I knew them all, but now there's apparently another kind of squiggle, and it's not the sigma that is used in statistics for standard deviation. So they use an
obscure notation and an obscure set of stuff. But part one is great. It's just about ideas, and it's written in English. So it's great. So we can fish that out. Object-oriented programming does not have an exclusive claim to all these good properties. They define a bunch of properties. Systems may be modeled by other paradigms. Resilience can be achieved just as
well by organizing programs around abstract data types, independently of taxonomies. So in other words, you don't have to go and create vast inheritance trees and so on to call yourself object-oriented, and indeed some people think of that as a downside or side effect of common object thinking. Data abstraction alone is sometimes taken as the essence of object orientation.
So what are these good properties? People who come across object orientation and then try to take it back to C, and I'm guilty of that. I created some fairly wretched object systems on top of C a long time ago, and those are buried in history and with any luck got deleted.
I think they were fascinating experiments, but they should never see the light of day in production code. And also, in the early 90s, there was a big fetish for books on OO to include a "how to do it in C" section, and they were invariably bad; possibly not as bad as the Fortran one that I saw in one book. If you looked at it, it was one of those cases of: no, no, no, you've taken this too far. This is just bad code. Yeah, you can call it object-oriented if you want, but I'm looking at code that looks bad; I do not want to maintain this. So what can we do from this point of view? What are these wonderful properties? Well, it turns out we can capture the essence of this resilience very easily: ignorance, apathy, selfishness.
These are three fundamental properties of any good modular system. They're not the ones you normally get taught. I'm going to be very clear here, for anybody who's looking at the slides afterwards: these are properties of the code, not of the coder. There's a big difference, depending on which side of the keyboard you're on. Ignorance: don't know. Apathy: don't care. This is what you want from a module. So how's that module implemented? Don't know. Yeah, but how does it work? I don't care. And you should write it so that it does not need to care. It's not simply that you hide the data; you should create it in a way that the hiding does not create a necessity to go beyond the boundary.
And more importantly, selfishness. You see, I used to teach the first two, and then I had kids. A toddler, and I say this in the nicest possible sense, is a perfectly self-centered creature. A two-year-old child: you are the enabler to the rest of the universe. You are the means by which they get biscuits and snacks and all kinds of other things.
They don't know how to turn on the television; you are the mechanism for that. The world is organized around their uses, as far as they see it. It turns out that this is what you want from a module. We have a very common habit, and I see this again with C layering, when we're wrapping up certain APIs or integrating them.
There's this idea of: OK, what we're going to do is wrap up these facilities and offer them to people. It's the wrong way round. You want to write it the other way around: what do you want from this? Be very selfish. What is the usage you wish to support? What's nice about this is that, first of all, it makes the code a lot more usable. Secondly, the interface is always narrow.
Instead of offering a superset of all the possibilities, just wrapping from the ground up, you take it from how people are using it. Now, obviously, the ones that people normally quote are encapsulation, inheritance, polymorphism. They do that in that order, which reflects a particular bias, I think, that comes from statically typed languages. It's better understood this way.
Encapsulation: the hiding of things. Polymorphism: also the hiding of things. This is kind of interesting. Polymorphism is not achieved directly in C; you have to hand roll it. This is where you go into function pointer territory. But the idea is that what you're saying is,
I do not know what kind of object I'm interacting with. With encapsulation, I do not know how something is implemented, and yet I'm using it; I understand how to use it. With polymorphism, it's exactly the same, but you have variability in the types. So polymorphism is more like encapsulation than anything else. But it's a byproduct of inheritance in statically typed languages.
And I'm afraid inheritance really isn't that important. People overhype it. Most of what is good about inheritance is being able to use it to achieve polymorphism. Now, as I said, in C, it's not really the natural space for it in this case. So I'm going to pick on a simple example I normally use.
But I'm going to use this one in C rather than the other languages I normally present this example in. A recently used list. I want to hold a list of integers. OK. But it's not just going to be a regular list of integers.
I would like to organize this so that the most recently added goes to the head. In other words, it's like a stack. And then it also only holds unique occurrences. So kind of like your recently opened files list.
Most recently opened goes to the top. If you reopen something, then it moves to the top but does not occur twice. So it's like a stack and like a set. Now, I could sit here and worry about the representation. But actually, I'm going to organize it in classic abstract data type style. I'm going to do a forward declaration. My preference is normally to always work with a typedef name.
So in that typedef, forward declare that there is a struct recently used list. Give it the full name recently used list. And that is as far as we're going to go with that. We then define everything else around the operations on it. From creation through to destruction. We provide a vocabulary of usage.
Now, there's a bunch of other tricks that we could do if we really care about polymorphism. But I'm not going to go that far. I think people overcomplicate things. In many cases, the classic abstract data type approach is enough. In other words, there will be one implementation of this at runtime, and you can fulfill that either through shared objects and DLLs, or simply with earlier binding at link time.
In many cases, this is sufficient. If it's not, then that's a separate question. But don't put all the machinery in because people don't like dereferencing function pointers and using them like that. It just gets messy. There's an extra level to hide things. But here we've got a very simple way of just checking on things, querying the size.
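Sketched as a single file (interface first, then one implementation), the shape being described might look like this. The function names and the fixed-capacity representation are my guesses, not the talk's actual code; the point is that clients see only the typedef and the declarations, while the struct definition stays private.

```c
#include <stdlib.h>
#include <string.h>

/* The interface: this is all a client ever sees. The struct is only
   forward declared, so its representation stays hidden. */
typedef struct recently_used_list recently_used_list;

recently_used_list *rul_create(void);
void rul_destroy(recently_used_list *list);
void rul_add(recently_used_list *list, int value);
int rul_get(const recently_used_list *list, size_t index); /* 0 = most recent */
size_t rul_size(const recently_used_list *list);

/* One possible implementation, private to the implementation file. */
enum { rul_capacity = 64 };

struct recently_used_list {
    size_t size;
    int items[rul_capacity];   /* items[0] is the most recently added */
};

recently_used_list *rul_create(void)
{
    return calloc(1, sizeof(recently_used_list));
}

void rul_destroy(recently_used_list *list)
{
    free(list);
}

void rul_add(recently_used_list *list, int value)
{
    size_t found = 0;
    while (found < list->size && list->items[found] != value)
        ++found;                                /* existing occurrence, if any */
    if (found == list->size) {                  /* not already present */
        if (list->size < rul_capacity)
            ++list->size;
        else
            found = list->size - 1;             /* full: drop the oldest */
    }
    /* Shift everything above the old position down one slot... */
    memmove(list->items + 1, list->items, found * sizeof(int));
    list->items[0] = value;                     /* ...and put value at the head */
}

int rul_get(const recently_used_list *list, size_t index)
{
    return list->items[index];
}

size_t rul_size(const recently_used_list *list)
{
    return list->size;
}
```

The caller can create, add, query and destroy without ever seeing the struct, so swapping in a linked or heap-grown representation, or a C++ implementation behind the same header, changes nothing on the caller's side.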
I've kept most of the names fairly simple. And I've got an implementation there. Not particularly rocket science. That should probably be size_t. I'm not entirely sure how I got away with that; I'm pretty sure that should be size_t. Oh, you know what?
We can fix that. This is live coding. So, where's that code? There we go. OK, it was always like that; you never saw anything. And I've got a pointer to the items that I want.
There's an implementation. I'm not going to rest on that for any length of time; it's an implementation that does the job. But I can also go ahead and swap out that implementation and provide a C++ version. So now I'm providing a C++ implementation behind a C API. This is what I'm talking about in terms of mixing systems.
And I can find a C++ version that kind of does the right thing. Now what's important about this is not that the C++ code is shorter. What's important about this is that it relies on this very clear separation. I take away the representation from the header file. I only use a forward declaration.
This has become more accepted as a practice. But I will say, I remember the first time I ever used this as a technique, the compiler I was using warned me that I had not elaborated the type at the point of use. It actually regarded this as a warning, which is really difficult if you're trying to achieve zero warnings.
Here I am using a brilliant technique. Look, Niklaus Wirth said so. The Ada guy said so. Everybody says so. I'm using proper modularity; I've got abstract data types, and I'm getting a warning from the compiler. Being able to forward declare something is very, very important if it's going to live on the heap anyway. Now the question is, what if it's not going to live on the heap? Well, I'm going to leave that as a piece of homework
because there is this question of wanting to work with objects that may be smaller while still keeping their representations hidden. But what I'd like to do, first of all, is just clear away the sort of wrapping code that fills and pollutes a lot of header files with lots of structs, which then also have to include the dependent header files.
It just doesn't need to be like that. It can be a lot simpler. Now with this, obviously, we're touching on this question of paradigms and other styles of programming. We've just mentioned the object paradigm, and strictly speaking, what I've really been talking about is abstract data types. They are not identical; they overlap in terms of data abstraction.
But I do urge you, if you wish to use the P word, because people do overuse it, to go and read this paper: Robert Floyd, 1979. This is basically his Turing Award lecture. He's the guy responsible for getting everybody in software using the word paradigm. So it's his fault.
However, if you read it, you end up with a very different picture from what we currently use it as. We use paradigm normally to exclude things, whereas he's actually saying very much the opposite. And when you look at what he describes, I'll come back to that in a moment. He says, I believe the current state of the art of computer programming reflects inadequacies in our stock of paradigms,
in our knowledge of existing paradigms, in the way we teach programming paradigms, and in the way our programming languages support or fail to support the paradigms of their user communities. What is interesting is he wrote this in 1979. And it kind of still feels relevant. But what is more interesting is the way that he uses the word paradigm.
He's saying we need lots. We need to mix. We need to match. And if you read the paper and replace the word paradigm with the word pattern, you get a better understanding. What he's actually talking about is what we would probably now call patterns. In other words, he is looking at the shape of the problem
and then the shape, what can I find in the solution space that fits that? And if you change the problem slightly, sometimes that triggers a change over here. What he's saying is that, you know, avoid wandering around with the one golden hammer trying to hit everything. And this idea of being able to mix and go outside and bring stuff back is really important. Now, the one that a lot of people will then immediately think of,
given I've done the object stuff, is that I should probably be talking about functional programming now, which I am going to. But probably not to the extent that I did yesterday in Functional C++. Who was in my Functional C++ talk yesterday? OK, good. Most of what I said there: copy, paste. There you go. That was cheap. A two-for-one talk. This is brilliant.
OK, right. Most of what I said, obviously not everything. I want to look at some other aspects of the language that perhaps get neglected. But, you know, offers a different way of thinking. I will be using toy examples though. So I'll start off with Peter Deutsch's observation,
"To iterate is human, to recurse divine." Now, there is a problem with recursion, and the greatest problem is not that C doesn't support it. It certainly does; it's a proper stack-based language. It's that everybody is taught recursion using one simple example. What is that example? Factorial. Yeah, exactly the problem everybody feels the need to solve.
This is the big problem in our system: we need a factorial generator. And it's a case of: no, really? It's very, very unconvincing. So if I go ahead, there we go, there's a nice procedural version. I've got lots of state change going on there: I've got a result that's changing.
I've got a counter that's changing. And just for kicks, I've done both changes in one statement. So, yeah, that's a very imperative statement there in the while loop. And so somebody comes along and says, hey, look, functional programming. And they do this. And you think, well, OK, same number of lines of code. But you appear to have gone further out on the right hand side.
It's difficult to see this as a killer, difficult to walk away and go: you've sold me on recursion here. Somebody may come along and say: oh, you're still using statements, are you? No, it's all about expressions, mate. Yes, the ternary operator, my friend. OK, so I've saved a little bit of syntax here,
but this is not compelling. There's a very simple reason: it's not the right idiom for solving this problem in this language. Really, the loop is a better way. In fact, a lookup table is an even better way. And if you're doing really big numbers, then use Stirling's approximation or whatever. It's just not convincing whichever way you look at it. Now, I'm going to show you some alternative ways.
I'm not saying this is how to solve it, but I want you to reframe the problem. You may find this does actually apply in other cases. This is the normal mathematical definition that people work to. This is how they define factorial. Turns out there's another way of defining factorial. It's defined as a series product, which allows me to gratuitously use a large Greek letter in my slides.
You don't normally get away with that in commercial conferences, so I'm quite pleased with that one. Now, OK, so let's go out somewhere. Let's go and have a look at Haskell. How do I calculate factorial in Haskell? Well, it's embarrassingly easy. For a given value n, then what I'm going to do is I'm going to have a list from 1 to n, and then I'm going to multiply them all together.
I want the product of that. Well, that's it. You see, they didn't use recursion; they did a series product. Now, that alone isn't going to teach me anything I can use in C. But if I look at it and break it down a little more, what I'm doing is a fold: foldr, fold right. I'm going to fold an operation in. Which operation? The multiplication operation that you see quoted there.
I take 1, which is our starting value, and fold that into the sequence 1 to n: 1 times 1, then the result of that times 2, and so on and so on, and I get the result out at the end. Now, this compositional approach, this higher-order function approach,
gets us thinking. It's like, actually, this is a little more useful; I might be able to apply this somewhere. We can do stuff with this. We can see a similar approach in Python, where fold is called reduce. I'm going to take the multiplication operator, I'm going to take a range from 1 to n plus 1,
because it's an off-by-one, half-open range. I'm going to start with 1, and apply that. Now, can I do that in C? The answer is yes, and I can write it pretty much like that. I'm going to introduce a range type, a range structure. One of the things I really like in C99 is a feature I stumbled across by accident in the GNU compiler in the early 90s,
when I thought: I wonder if they support struct literals, what C99 now calls compound literals? Logically, they should. I was really excited, and then I tried it on another compiler and realized it was an extension. But, you know, you wait long enough and these things happen. This allows us to organize a much more declarative approach.
If you look at a lot of C code, particularly stuff with structures, you spend a lot of time initializing stuff, setting up a variable just in order to pass it. Well, why can't I just pass it? Why do I have to declare extra struct variables? So here I've got a very clear idea: I'm going to pass in an operation.
I'm going to create a reduce. That's not for free. I'm going to pass in multiply, which we'll come to. I'm going to take a range from one to n plus one in steps of one; I have to be specific about that. And I'm going to start with one. If we open this up, it gets a little bit messier. The problem is a lot of people's instinct,
leaving the typedef of the range aside for the moment, their initial instinct for the reduction will be to say: right, I know how to program this. You wanted to do multiplication, Kevlin? That's brilliant. I'm going to create an enum for that, and I'm going to put in multiply, and I'll probably put in add and divide and stuff like that.
And then I'll use a switch statement, because we all know that's how you really program in C. And it's just like: oh, really? Every time, this just becomes a monstrosity. Of course, if you're being paid by the line of code, this is fantastic. But it's really not a compositional way of thinking about it. A more appropriate approach is to reframe this and say:
no, no, no. What we're going to do is have the range: start, stop, and step. And I'm going to create a multiply operation. Obviously not a very exciting operation; I'm working with factorial here. But it's this idea of being able to separate stuff out. A number of years ago, I wrote a string handling library in C that was based on exactly this approach.
Instead of generating loads and loads of string functions, it asked: what can you do with strings? What do you want to do? You want to search through them. You want to transform them in certain ways. And the idea of being able to pass in a function that did the bit that you wanted, orthogonally, so that it was composable, was really key.
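The reduce described a moment ago might be sketched like this; the type and function names are my guesses at what was on the slides, not the talk's exact code.

```c
#include <stdio.h>

/* A half-open range: start inclusive, stop exclusive. */
typedef struct range {
    int start, stop, step;
} range;

typedef int (*binary_op)(int, int);

/* Fold op across the range, seeded with initial.
   Assumes a positive step. */
int reduce(binary_op op, range over, int initial)
{
    int result = initial;
    for (int i = over.start; i < over.stop; i += over.step)
        result = op(result, i);
    return result;
}

int multiply(int lhs, int rhs)
{
    return lhs * rhs;
}

/* A C99 compound literal lets us pass the range declaratively,
   with no named struct variable set up beforehand. */
int factorial(int n)
{
    return reduce(multiply, (range){.start = 1, .stop = n + 1, .step = 1}, 1);
}
```

The caller composes the behavior by passing the operation in, instead of the library enumerating operations in an enum and dispatching through a switch.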
What are the performance implications of this? Well, unless you have performance problems, there are none. OK? Don't worry about it until you hit it. You are dealing with an extra level of indirection. Basically, every single time. On the other hand, I would like to point out that before you start criticizing any function indirection,
always go and look at the code dropped out for a switch statement. It may shock you in some places. OK? It's not always a nice, easy translation. Sometimes there's a little bit going on that you don't want in there. It also means that we've now got something composable.
Not just shorter, but it's a case of I'm using function pointers in the way in which they were originally intended. It's a compositional approach, as I've described. It's obviously not a new idea. The thing's been in the language for years. But I find that C programmers are more likely to use function pointers if an API tells them to than to introduce them themselves.
In other words, here's a technique that's not just for other people; it's something I can use to reframe stuff. One of the other classic papers to look at: Butler Lampson, Hints for Computer System Design. This comes from 1983. Use procedure arguments to provide flexibility in an interface.
This technique can greatly simplify an interface, eliminating a jumble of parameters that amounts to a small programming language. This is quite important, because you can end up in some cases with ridiculously long argument lists. I always remember looking through the Windows API and being stunned by stuff like CreateWindow, which takes 11 arguments, all of which are scalars.
The opportunity for getting it right is minimal; there's a high probability you will get something wrong. It's a case of: no, really, I want a much simpler way of doing this. Now, again, this takes us into another space, out of the clean C space. I was using Python a moment ago.
So I can create a range. I can use positional arguments. C has positional arguments. Most languages have positional arguments. But I've also got the ability to use keyword arguments, where I can use the name of the argument to specify.
I can put them in any sequence that I wish. This turns out to be surprisingly useful. And sometimes people go, oh, that's a bit strange. Well, if you use command line tools, it's totally not strange at all, because that's what most command line systems are composed of. There's normally a positional model when you use a command line tool. And then there's minus such and such.
And now you've got a keyword argument. Same model, but without the pain of having to work out which argument passing library you want to use. Can we do this in other languages? Well, unless the language supports keyword arguments, people normally revert to using things like option objects and stuff like that. It kind of gets a little bit messy, and there are builder objects and so on. And if you're working in Java or C++,
you might find a lot of these floating around. Turns out we don't have to worry about any of that if you just go back to something I said before. Structs. I can just create structs on the fly. And I can use designated initializers. I can use designated initializers directly to make up for this.
Anything that I do not specify will fall to its default value; in other words, whatever the equivalent of zero is. So there's a little bit of coding there. But let's put it this way: if this feature had been available for something like CreateWindow in the Windows API, it would drop from about 11 arguments to about three or four.
That's quite a radical shift. The idea is that you are naming what you're passing. The other thing you may have seen with long argument lists is that people get all special about how they format them. You know what? I can't really tell what 0, 0, null, 0, 0 means, so I'm going to have to put these arguments one per line,
each with a comment next to it saying what it is. It's just like: oh, dear God, this is where software engineering goes to die. Instead: this is what I want, this is how I want to say it. Look, here is the meaning. I don't have to worry about the order. What's also nice about this is that it is open to change. In other words, if I have an argument that says options
or something like that, and I realize: oh, you know what, I forgot a parameter. That's OK. I just tack it on the end, work appropriately with a zero value for it, and to existing callers it's as if it had been there all along. It just needs a recompile. So it allows you to achieve stability where you might not otherwise have had it.
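As a sketch, with invented field names (this is not the real CreateWindow API, just the shape of the technique):

```c
#include <stdbool.h>
#include <stdio.h>

/* A hypothetical options struct: the fields are made up for
   illustration. Anything the caller doesn't name arrives
   zero-initialized, so the default is "the zero of the type":
   null title, zero sizes, false flags. */
typedef struct window_options {
    const char *title;
    int width, height;
    bool resizable;
} window_options;

void create_window(window_options options)
{
    printf("%s: %dx%d%s\n",
           options.title ? options.title : "(untitled)",
           options.width, options.height,
           options.resizable ? " resizable" : "");
}

/* Callers name only what they care about, in any order:

       create_window((window_options){.title = "demo", .width = 640});

   Adding a new field later keeps every existing call compiling
   unchanged; new callers simply gain a new name to set. */
```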
So we moved away from recursion and found a couple of other interesting compositional techniques that can really shape your thinking. But I want to move back to recursion. Here's the back of K&R, the index. Just to focus: they did one of the better jokes on recursion. C, recursion:
recursion, 269. And there it is, on page 269. See, that's the best; I like that. And one of the examples, and I'm going to presume it's the one on page 86 or 139, I can't remember, is this one. I'm not going to go through it in detail, but it's basically how to print a number out in decimal. Use printf? No.
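A sketch of that recursive number printer, adapted here to write into a caller-supplied buffer rather than to stdout so the result can be checked; like the K&R original, it ignores the INT_MIN edge case, whose negation overflows.

```c
/* Recursively render n in decimal into buf; returns a pointer just
   past the last character written. The recursion is not tail
   recursion: each digit is emitted only after the call for the
   higher digits returns, so the pending digits wait on the stack. */
char *printd(int n, char *buf)
{
    if (n < 0) {
        *buf++ = '-';
        n = -n;            /* overflows for INT_MIN; see note above */
    }
    if (n / 10)
        buf = printd(n / 10, buf);   /* higher digits first */
    *buf++ = '0' + n % 10;           /* then this one */
    return buf;
}
```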
How to print it out properly. Now, it demonstrates a couple of things. It demonstrates non-tail recursion, and it demonstrates something that cannot trivially be transformed into a loop. You can do it, but you need an intermediate data structure to do so. I want to focus on an example,
one of my favorite examples, and I mentioned The Practice of Programming earlier on. It's one of my favorites because I remember stumbling across it the first time I read the book, back when it came out, and thinking: you guys just did grep. A really small subset, OK, so no super-intelligent regular expressions, no pattern-matching state machines and so on, but you just did a simple subset of grep
in under a page. I remember flipping the page over thinking: where's the rest of the implementation gone? Then I went back and read it and went: wow, this is really elegant and really smart. So it's a cool piece of code, and indeed Brian Kernighan selected it a few years later as one of his favorite pieces of code in the book
Beautiful Code. But the thing I want to rest on is that it uses two things. It uses recursion naturally, and it uses mutual recursion, rather than just trivial recursion you can flatten out into a loop, in the statement of how it solves things.
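The code in question, reproduced here from memory and therefore close to, rather than guaranteed identical to, the printed version in The Practice of Programming (const added, in the spirit of the tidy-up discussed below):

```c
/* Matches literal characters, '.', a leading '^', a trailing '$',
   and c'*'. Note the mutual recursion: matchhere calls matchstar,
   and matchstar calls matchhere. */
int matchhere(const char *regexp, const char *text);

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp + 1, text + 1);
    return 0;
}

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp + 1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}
```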
So let's just extract out the three core functions: match, which, given a regular expression and some text, will go off and try to match it anywhere; matchhere, which does the basic logic of looking at whether we are at the end,
whether we have a wildcard, and so on; and then matchstar, to match the wildcard. So we've got that kind of structure. OK, what do I really want to focus on? Well, let me just tidy this up slightly. I'm going to C99 it a little bit,
and put const back in there, where it should be, so a little bit of C99ing. Then I'm going to mess about with the formatting. And now we've got something that, if you look at it closely, has a really important stylistic point.
The first thing to notice is that there are no extra variables declared. The only place where variables are modified is in the context of loops: at the top there's a do-while, at the bottom there's a do-while,
and there is a little bit of state change going on there to advance things. Now, I'm not going to eliminate that with recursion; I'm just going to say, look, the state change is isolated and well understood. But what I'm really interested in is that if you go through this and look really carefully, you will see a particular programming style that is common in functional programming: a head-tail style.
OK, we have a list of things, or a sequence of things, in this case a sequence of characters. We look at the head of something, and based on that we will then go and do something, or work with the tail. Head and tail. If you go through and look, unfortunately the mixed notation makes it not obvious in places: you have * in some places
and [0] in others, and then +1 and so on for getting the tail. But imagine that we have introduced three primitives: head, which takes a const char * and returns a char, in other words the zeroth element; tail, which takes a const char *
and returns a const char *, in other words the position one past the head, so not something you'd normally write; and then nil, just for fun, rather than null, because I'm feeling Lisp-y today. Do that, and you end up with this. Also get rid of those do-while loops; do-while loops carry some weird state,
don't do it. Once you've done that, what's interesting, if you go through this, is you realize that it's this far away from being a piece of pure functional programming. The way that the problem is solved and expressed is really quite nice; you could convert this to a
more Lisp-friendly language at this level. So there are really interesting things that come out of analyzing why and how this works. Now, of course, we know that factorial is not really what the world is all about. We've established that. So I want to look at a couple of other ideas
that we may incorporate, or poke around with, in terms of explorations of style. And I'm going to use the other grand problem of the 21st century: fizzbuzz. Fizzbuzz basically allows us to demonstrate
whether people know how to use conditional logic. It's a very common kata, but it's kind of fun, because it allows us to mess about with certain stylistic questions. The goal of fizzbuzz, if you're not familiar with it, is to take the numbers from one to a hundred and map them:
one and two map to themselves; three is divisible by three, and anything divisible by three is mapped to fizz; four maps to itself; for five we're not going to say five, we're going to say buzz, and anything divisible by five gets mapped to buzz. So it's one, two, fizz, four,
buzz, fizz, seven, eight, and then fizz again. And when we hit something divisible by both three and five: fizzbuzz. OK, so that's the scope of the problem. Not the world's hardest problem. However, there's a lot of fun to be had, so let's have a bit of fun. Naively, we might wander in and go:
yeah, OK, I've got this one. I'm going to allocate some space. So I'm going to pick up some common C programming habits that I see floating around, and I'm not going to blame these on C programmers, or Java programmers, or anything specific. But: oh, malloc. Yeah, how big does the string need to be? I don't know, 20 seems good. So there's an arbitrary number in there.
Notice that it's hard coded, because it's not a universal constant; I just felt like 20 at that point. Your colleague might have chosen 30, or 42, or something like that. It's just some number. And then we go through in a very procedural style; I've got an if, and so on. OK, so I've got all of that,
and then we pass it back and get a memory leak. Or rather, no: we tell the caller, you must look after this. And we get a memory leak. I'll come back to the memory leak in a minute. OK, so we can tidy things up a little. We can set everything to null immediately using calloc, and we feel faintly smug with ourselves. And then we remember that we had this conversation about the ternary operator,
and yes, now we're getting somewhere. In fact, we don't need that calloc anymore: we've got a strcpy-strcat technique. In fact, I could merge the strcat into the strcpy; I've chosen not to do that here.
So this is a lot shorter, a lot tighter. Is it harder to read? Well, I don't know; it's a good question. I find lots of if statements quite hard to read. The point is that it's logic that is difficult to read, whichever way you write it; it's a case of what you're accustomed to. Ultimately it's the logic, the decision being made, that is the boundary you're hitting.
So, we've got that. Now, when I look at that, you know what, I'm feeling really uncomfortable about this. What if the number is too big? What if somebody actually uses n, and it's really large,
and I've not got enough digits? I'm saying 20 should be enough, and you're thinking, well, that's probably 32 bits. Yeah, it probably is, but I wouldn't bank on it. It's a case of: you want to be cautious about this one. So you sit there and go: right, I'm going to use assert. OK, not bad.
But still, I'm a bit concerned about how we're going about this. So I want to focus on the fact that we often talk about UX, user experience. I want to remind you that, as a programmer, you are also a user: you're a user of the code. And I'm going to call this PX, programmer experience. The programmer experience of this code is not great.
It's quite clunky and difficult to use. It puts a lot of responsibility on the caller, and it doesn't really do bounds checking, or it kind of does with the assert. So we might respond to that like this: what I'm going to do now is shift this. We're going to take memory management out of this function,
so the function no longer has a side effect at that level. We're going to take memory management out, copy stuff in, and then return that. The caller is no longer forced into using the heap; the decision is with the caller. This is actually an interesting example of ignorance and apathy in action:
we've allowed the caller to make a decision here, rather than taking it for them. And we do some bounds checking. Actually, we don't; we don't do quite enough bounds checking, and it's really easy to get wrong. This is no longer about the programmer experience; it's about SX, the security experience. The security experience of that code is not great, because this is C; you're going to run into this kind of stuff.
So eventually we decide: you know what, snprintf is our friend. And again, a very declarative style of approach, in other words one that is based on calling a function and making decisions with respect to its arguments. The function is doing a lot of work; it does all of the bounds checking for us, and that comes out quite nicely.
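A sketch of where that ends up; the exact signature and parameter order here are my assumption, not the slide's.

```c
#include <stdio.h>

/* The caller supplies the buffer, so storage decisions stay with the
   caller; snprintf does all the bounds checking, truncating rather
   than overflowing. The decision is expressed declaratively, in the
   choice of format string. (The C standard permits arguments beyond
   those the format consumes: n is simply ignored for the non-%d
   formats.) */
void fizzbuzz(int n, char *result, size_t size)
{
    snprintf(
        result, size,
        n % 15 == 0 ? "fizzbuzz" :
        n % 3 == 0  ? "fizz" :
        n % 5 == 0  ? "buzz" : "%d",
        n);
}
```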
Just for a bit of fun, though, I thought I'd take a different approach. I did this a while back. It's like: well, do we need memory at all? Do we need to worry about the bounds checks? The problem is incredibly well defined, and actually we can do it very simply. If we say a fizzbuzz result, fizzbuzzed, is simply an array of two pointers,
then we can say these pointers to strings will hold our result. In other words, if it's fizzbuzz, element zero will hold fizz and element one will hold buzz. If it's 23, which is prime, I did choose that correctly,
yes, then element zero will hold the string for two and element one the string for three. So the caller has to do a little bit of work, but it's just to put two strings together. That is kind of fun, because you can have a great deal of fun creating a lookup table for it. Yes, it is that much fun. You might not get it right first time; that is a disadvantage.
However, as a side effect, given that we've got no memory allocation for anybody, it's kind of cute, and it's very declarative. What I've done here is declare: here are the digits I'm using, calculated ahead of time; here are the fizzes and buzzes I might be using; and then my result looks at these and says:
well, did we make any decisions about fizz or buzz or numbers? And then we decide on that. OK, there's a small bug in there, so we fix that, and then we tidy up a bit on the typedefs and structs, and we've now got a piece of code that nobody is going to get rid of us for. Our job is secure,
because they're going to be going: good grief, what the hell have they done? I've been functional and I've not allocated any memory. It's like: oh no. OK, so there is a moral to this. You can go too far down that road. Lookup tables are very powerful, and I don't think they're used enough, but I'm not sure they're used appropriately here. There's a different way of looking at this stuff.
That different way is to realize and shape the solution around the nature of the problem, and I don't think people do this enough. In some cases, you realize that, for example, a string result is already bounded, so why are we throwing around an arbitrary char *? I ran into this a number of years ago with a company that took some C code and wrapped it up in C++,
but kept everything as char pointers. I pointed out to them that they shouldn't have been using char pointers in the C code at all, because they were dealing with codes that were known to be no more than 17 characters in length.
Known to be no more than 17 characters means you can put a nul on the end: that's 18 characters. And you'd like to know the length? They were strlen-ing absolutely everywhere, but you've still got a little bit of space to hold the length, because it's fixed. Put that in a struct and you can pass it around. And if you say, what about the cost of copying: the cost of copying 20 bytes really is surprisingly cheap. It really is not a big cost.
people are still thinking in terms of 1970s and 1980s architecture; it really is very cheap to copy this stuff around. So I'm going to say that we know the value of fizzbuzz cannot be longer than the string "fizzbuzz", in terms of number of characters, and that's eight. So we're going to create that, and then we've got a solution that does all the right stuff, is appropriately range checked, and gives us a result that we don't have to free, which satisfies an ignorance, apathy and selfishness point of view, but gets us away from the idea that everybody should just be working in primitive types. You don't have to abstract the type at this level, you just have to name it, and take advantage of the fact that arrays are copied in the context of a struct.
This technique is not used enough, because it's not common in K&R style, but K&R is a very old style. So we can keep it simple. Okay, I'm going to finish with a look at testing, which is slightly foreign to the culture of C;
it's not that people don't test, it's just that of all the programming cultures where people do test, C is about the last to really get testing as a general practice, in spite of the fact that people like Brian Kernighan have been writing about this for decades. The general advice: don't leave it until the end, do it lots, and do it automatically. How am I going to do it? Do I need a testing framework? Well, I don't really need a testing framework; a testing framework makes life easy, but the very basic element of what you need is just this: the ability to express an assertion. That's all you need.
What a testing framework gives you is not the ability to test something you could not already test; it just makes the testing simpler. It makes the execution model of your tests simpler, but it doesn't change what you can test. That's the difference: it changes convenience, but not testability. The problem is, a lot of people, when they see assert, immediately think: oh, right, yeah,
I know, what you mean is I can check the preconditions and postconditions. I don't need to test my code, because I can have that tested: I can put the assertions in my code. Right, let's see how that one works out. So I'm going to take a library function here: bsearch. Binary search, with the usual void pointer noisiness, because this is how we do generics in C. Here's the key, here's the base, this is how many elements I've got, this is how big each element is, and, by the way, here's how we're going to do comparisons. So it's a five-argument monster. And we're going to say, right, okay, what is the result of bsearch? What's it supposed to do, what's it supposed to return? What are we expecting to come out of this? One of two possibilities: it's going to point to the element, or, if it doesn't find it, null. Right, okay, that's nice and clear:
we understand it, we can verbalize it, this is brilliant; that's what a binary search does. Okay, so now we're going to write the assertion. And what the hell are we going to assert? Try writing that; just think about it for a moment. I've calculated the result using a binary search, and now I'm going to assert what? I'm going to assert that it's either null or not null? Well, that's the same as assert true. And you say, well, no, but what if it's there? Well, how do you know it's there? Well, you need to find it. Well, that's what this function does: it's the finder function, that's the one we're in. So how do you know if it's there? Well, maybe you could use a different find technique. What, like linear search? Well, that should speed it up nicely. In other words, you're screwed: you cannot write this postcondition in a way that would be considered acceptable or reasonable by anyone. And your problems are not over yet. What about the precondition? What is the precondition of binary search? That the array is sorted.
How do we know? We do a linear search to check that it's sorted? Okay. You see, this is the kind of idea in design by contract. I get a lot of people advocating design by contract at me, and I presume this is because they don't know very much about it. I started looking at it about 25 years ago, and I think I now understand a lot about it. I understood about 20 years ago why it was not perfect, and then about 15 years ago I understood why it does not work at certain levels. And this is the fundamental thing: certain classes of precondition cannot be expressed directly. It's not that it's not useful thinking; it's just that it's not as useful as you'd hope it would be. It actually applies in a minority of cases. Oh, and by the way, that precondition's not complete. What's the real precondition? The precondition of bsearch is that key and base are valid, non-null pointers. I can check for null, but how do you check that a pointer is valid? You can't check that a pointer is valid within this function in the system of C. It's what we call a Gödel statement: it's not provable within the system, you can only establish it from outside the system of this function. So this is why we care a little bit more about testing: an assert on the inside is not going to do it for us.
So, realizing that we are pushed for time, I'm going to close with a couple of thoughts and a couple of examples. Tony Hoare observed that the real value of tests is not that they detect bugs in the code, which is certainly useful, but that they detect inadequacies in the methods, concentration and skills of those who design and produce the code.
I've got a little example I'll quickly walk through: an array of integers, and I want to convert it to comma-separated value output. So when I pass in an array containing one, two, three, the string I want out is "1,2,3", but I also don't want it to overrun the output buffer.
So I've got an idea of what I want; how am I going to express it? Well, I've got an implementation in C, and an implementation in C++, and both of these are functionally equivalent. How can I make such a claim? I can make that claim because I had tests. Now, the problem is our intuitive idea of how we do tests. We say, and I see this all over C systems, I've got a function, therefore I have a test. What's that going to look like? It's going to look like this: we're going to have test_ints_to_csv, and we're going to have lots of little assertions piling up, and I put dot, dot, dot in there because I lost the will to live, and so will you. And the point is, when you come to maintain it, this is a pain in the backside. So maybe you make your life easier from a maintenance point of view and you put comments in, because that obviously makes things easier. And then you realize, well, maybe I should partition these, because it all looks like a big undifferentiated mass hugging the left-hand side of the screen. So we decide: you know what, I'm going to use block structure, and I'm going to put the comments above the blocks. And then somebody comes along and gives you that
moment of insight and says: you do know that named blocks are called functions? And suddenly you realize: yes, I see it all clearly now. And you choose an appropriate naming convention that allows you to write full sentences that are propositions. Merely saying, hey, guess what, I'm testing this, is not enough. Tell the reader what's going on; tell the reader your ambitions and expectations. And observe that instead of one test per function you end up with many per function, because there are many scenarios of usage. You want to peel out those scenarios: peel out the error cases from the normal cases, differentiate between the different normal cases, and answer the question of what happens when you get an overrun. Does it get truncated? Do you write nothing at all? Or is it undefined? The point is you want to say this to the reader; tell them what to expect. So a test case, a function, should be just that: a case. And we want to end up with test cases that read like a description, that read like a specification. And this is the classic observation from Nat Pryce and Steve Freeman: tests that are not written with their role as specifications in mind can be very confusing to read, and the difficulty in understanding what they are testing can greatly reduce the velocity at which a code base can be changed. The key idea here is that the point of tests is not merely to check that something is correct; when you come back to them, you'll otherwise be struggling to understand what it was that was supposed to be correct. You want to express that. You are saying, here is what we believe to be correct, in a way that you can check
it. And, by the way, it also does the checking: we have these two sides to it. So that was our whistle-stop tour, which is a subset of what I'd really like to talk about, and which goes even further into all these paradigms. But hopefully it's given you an idea of just how dated some of the classic books that people learn from are, and how dated many of the examples are that we see in certain open source systems when we look at the code, and that there are a number of styles out there that can really bring things forward. If you want keyword arguments, bring that in. If you want your code to be properly portable, then identify that subset appropriately. If you want to use a more functional style, then understand what it is that you're expecting from that, and learn more specifically what the tools are: most of them are about locality and isolation, not just recursion for fun. Locality, isolation, separating stuff out; and we see that this also plays into abstract data types, and in both cases you're talking about composability, so that we don't have to end up with large masses of undifferentiated C. Thank you very much.