Procedural Macros vs Sliced Bread

Formal Metadata

Title
Procedural Macros vs Sliced Bread
Series title
Number of parts
8
Author
License
CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
Alex will try to convince us that procedural macros are the best thing since sliced bread. With "macros 1.2" now on stable Rust there's no better time to discover the world of procedural macros and what they can do for us. We've already seen the power of derive macros shoot Serde to one of the most popular crates in Rust. With attribute macros we can go even further with lightweight annotations powering frameworks. And finally with procedural macros the sky is the limit! Let's learn about tips, tricks, and usage of the procedural macro world! Follow us on Twitter: https://twitter.com/rustlatamconf
Transcript: English (automatically generated)
All right, good afternoon everyone. My name is Alex and today I have one goal, which is to convince you that procedural macros are better than sliced bread. Now you might not think that's actually much of a bar to live up to,
but there's a very common phrase saying that this is the best thing since sliced bread. It's used so often, and so many things are compared to sliced bread, that I figured I'd continue the tradition. So to start off, I'd like to tell you about the rise of procedural macros and a little bit of history of how we came from the dark ages of Rust 1.0 to the procedural macros that we have today.
The first thing you'll notice is that back in January of 2014, about a year before 1.0 was released, we have this pull request, the infamous number 11151, landing loadable syntax extensions. Now this was our first foray into having procedurally loaded, defined on crates.io, externally defined,
loadable libraries with which you can modify the syntax in the compiler. Now it turns out that immediately after that, we actually kind of regretted landing it. But this was the proving ground, the staging ground, which we ended up using to fuel a lot of the later decisions about procedural macros. And so these were some critical test cases
that we had along the way to inform us as to how procedural macros are used, how they're most commonly expected to be used in this ecosystem, how to build them, how to make them efficient, all of these nitty-gritty details. And so although we still have lots of legacy support for this, and it's kind of a pain to support these older syntax extensions, they were vital in getting today's system
currently stabilized. The next thing was these two RFCs happening in quick succession, which started to formally design the macro system. This happened about a year after 1.0, and we had these two: one for modularizing the macro system, and one for actually specifying what a procedural macro is. And so I'll be going into more detail about what's going on here,
but you can see that years ago is when we actually started the stabilization and specification process for the macros that we have today. Now, they have been tweaked since they were designed back then, but they're mostly the same in spirit. Next up, in mid-2016, we have the first stabilization of macros, with derive macros.
So many of you might know, this is when Serde first started to compile on stable Rust, and this was a massive boon for the ecosystem. I imagine everyone who's used Rust very, very quickly runs into Serde, and Serde is so reliant on derive macros for its functionality. And then the last thing was when, in about April of 2018, so kind of
the beginning of last year, we started having a call for stabilization for more macros other than just #[derive]. And so these were attribute macros and function-like macros, and they ended up actually being stabilized in October of 2018, and so this is where we are today, and I'm gonna be talking a little bit more about
what the current state of macros is, and then going into all of the attribute, derive, and function-like macros. So it's only fair that we talk about the rise of sliced bread as well, as it's the comparison. Most of this is copied from Wikipedia, but there are two notable points. One is that the inventor of sliced bread is from Iowa. I also come from Iowa, and so I'm very proud of that. I did not know that before, anyway.
The next thing is that in World War II, the U.S. banned sliced bread because it was so difficult to produce, but then they had to undo that because it was too popular and everyone revolted. All right, back to the actual technical content. I'm gonna first tell you a little bit about what a procedural macro is. Now, if you were at David's workshop yesterday, this is probably going to be a bit of a review,
but it'll get us all on the same page. And then I wanna talk a little bit more about how we actually write these procedural macros. So not just what they are, but how you actually use them in the ecosystem. And then finally, some fun stuff coming down the pike and some cool examples of macros. So all right, macros come in three primary forms
in the compiler. The first of these is function-like. They are denoted by this bang and then these delimited insides, and one of the most common ones is println!. Now println! isn't literally a procedural macro, but it's the same thing in spirit. The next is the very familiar derive macros, and so this is everything to do with Serde and every other form of derive in the crates.io ecosystem.
These are annotations on structs and enums that extend them with various functionality. And the last is attribute macros. Now we've heard some talk of wasm-bindgen; I'm not gonna go too much into the details about that, but this is effectively a macro that is defined on crates.io, with a library-ified definition of how it works.
So to actually start out writing a proc macro, you have to tell both Cargo and the compiler what you're doing, because the macro has to be compiled a little bit differently, especially when you're cross-compiling and doing all that crazy stuff. And so all you have to do is open up the [lib] section of your Cargo.toml, write in proc-macro = true, and you're good to go. Now you're ready to start writing some procedural macros.
If you're writing a println!-style macro, a function-like macro, you're gonna use this #[proc_macro] attribute and then take this weird TokenStream thing and return this weird TokenStream thing. And the general goal here is that we have this invocation, like println! of a wheat bread, and we wanna transform that and completely replace it with some other ugly expanded thing
whose contents don't matter too much, but it's something the user's not gonna write or have to worry too much about. And so to connect all the dots here, we can see the function name of the procedural macro is actually connected to the name of the macro itself. The input is what's inside of those parentheses, and so we're gonna get just the quoted "wheat" for this one macro invocation. And then the output is going to replace
the entire invocation, and it has to contain all those tokens, the std::io print, the format, all that good stuff. The next of these is derive macros. And so this is an example of how Serde might actually be defining their derive macro. We can see it's a little bit different, where it's not just proc_macro, but it's proc_macro_derive, and then the name of what's in there.
And the input and output are also slightly different here. A derive macro is only ever used to annotate either a struct or an enum, and so that'll be the input here, and the output is something that we append. So we can see the connections here, where the name of the derive is specified in the attribute, and the name of the function actually doesn't matter too much.
The input is just the struct, not the attribute with it. And then the output does not include the struct. It's just this one impl block. And so the main thing here is that with a derive macro, you cannot modify the struct or the enum. You cannot change what's there. You can only append to it, typically by adding trait impls or other items, or just various other syntactical constructs
that you want associated with your derive macro at that point. The last form is attribute macros, which is the wasm-bindgen kind. This is, again, slightly different from the previous ones, where we have proc_macro_attribute as opposed to the previous forms. And there are also two inputs here, as we'll see in a second. And so the idea here is that we're starting
with this bake function, and we're gonna annotate it with wasm_bindgen(start). And then we wanna transform that by generating some goop, which you don't have to worry so much about. It's got a bunch of crazy stuff, plus the original item itself right there. And when we start looking at the connections, we can see, like with function-like macros, that the name of the function is connected to the name of the attribute.
And then the args, this first argument, is what's provided in parentheses to the attribute itself. And so here, args is just gonna be start and nothing else. And you don't have to provide the parentheses; in that case it'll just be an empty token stream. The input itself is the item which you are annotating. And so it's this entire function bake and all the contents inside there.
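The difference between output that is appended and output that replaces can be sketched with plain strings standing in for token streams. This is a toy model, not the real API: `toy_function_like`, `toy_derive`, `toy_attribute`, `splice`, and the generated text are all hypothetical names, and the real entry points are the `#[proc_macro]`, `#[proc_macro_derive(...)]`, and `#[proc_macro_attribute]` functions quoted in the comments.

```rust
// Toy model: source text stands in for `proc_macro::TokenStream`.

/// The three kinds of procedural macro.
enum Kind {
    FunctionLike,
    Derive,
    Attribute,
}

/// Models `#[proc_macro] pub fn f(input: TokenStream) -> TokenStream`.
/// `print_line` is a made-up function name for the expanded code.
fn toy_function_like(input: &str) -> String {
    format!("print_line({})", input)
}

/// Models `#[proc_macro_derive(Serialize)] pub fn f(input: TokenStream) -> TokenStream`.
/// Returns only the new impl; the struct itself is not part of the output.
fn toy_derive(item: &str) -> String {
    // Naively take the word after `struct` as the type name.
    let name = item.split_whitespace().nth(1).unwrap_or("Unknown");
    format!("impl Serialize for {} {{}}", name)
}

/// Models `#[proc_macro_attribute] pub fn f(args: TokenStream, input: TokenStream) -> TokenStream`.
/// Re-emits the item, because the output replaces it.
fn toy_attribute(args: &str, item: &str) -> String {
    format!("/* glue for `{}` */\n{}", args, item)
}

/// What the compiler does with a macro's output.
fn splice(kind: Kind, original: &str, output: &str) -> String {
    match kind {
        // Derive output is appended after the untouched item.
        Kind::Derive => format!("{}\n{}", original, output),
        // The other two kinds replace the invocation or item entirely.
        Kind::FunctionLike | Kind::Attribute => output.to_string(),
    }
}

fn main() {
    let call = "\"wheat\"";
    println!("{}\n", splice(Kind::FunctionLike, call, &toy_function_like(call)));
    let item = "struct Loaf { slices: u32 }";
    println!("{}\n", splice(Kind::Derive, item, &toy_derive(item)));
    let func = "fn bake() {}";
    println!("{}", splice(Kind::Attribute, func, &toy_attribute("start", func)));
}
```

In this model the attribute macro has to copy `fn bake() {}` into its own output, while the derive macro never sees its struct disappear.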
Now, the output is going to replace the invocation. So like with function-like macros, if we want the bake function to persist, we gotta make sure to put it in the output as well. So we not only have our generated goop, but we also have the bake function serialized straight into the output. So that's the interface that we have
with these three kinds of macros. And the next thing you might be wondering is, what's this token stream thing? Why is the input this weird token thing? A token stream is the lexical foundation for all Rust syntax. It contains atomic tokens. It's not a string, it's not some other serialized form; it's a parsed representation in memory
of each token that you passed in. And this is all defined in the compiler. It's completely unstable internally. You can use it stably externally, but it's all a bunch of weird details. You can sufficiently think about this as a cheaply cloneable, hence the Rc here, list. It's not literally that internally,
but you can very cheaply clone it. You can iterate over it and see what's inside. Now the really interesting part here is what's actually inside of it, which is this token tree. And so the tree aspect comes from this first variant called Group, but I'll go over that in a second. Suffice it to say that these four variants, these four pieces of syntax, are enough to capture all of Rust syntax.
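The "cheaply cloneable list of token trees" mental model can be written down as a minimal sketch. The types below are a standard-library-only stand-in and deliberately not the real `proc_macro` definitions, whose internals the talk describes as unstable; the `sample` and `shares_storage_with` helpers are hypothetical.

```rust
use std::rc::Rc;

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Delimiter {
    Parenthesis,
    Bracket,
    Brace,
}

/// The four kinds of token tree; only `Group` nests.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum TokenTree {
    Group(Delimiter, TokenStream),
    Ident(String),
    Punct(char),
    Literal(String),
}

/// A reference-counted list, so cloning copies a pointer, not the tokens.
#[derive(Clone, Debug, PartialEq)]
struct TokenStream(Rc<Vec<TokenTree>>);

impl TokenStream {
    fn new(trees: Vec<TokenTree>) -> TokenStream {
        TokenStream(Rc::new(trees))
    }

    fn len(&self) -> usize {
        self.0.len()
    }

    /// True when both streams point at the same underlying list.
    fn shares_storage_with(&self, other: &TokenStream) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}

/// Hand-built tokens for `struct Loaf { slices: u32 }`.
fn sample() -> TokenStream {
    let body = TokenStream::new(vec![
        TokenTree::Ident("slices".to_string()),
        TokenTree::Punct(':'),
        TokenTree::Ident("u32".to_string()),
    ]);
    TokenStream::new(vec![
        TokenTree::Ident("struct".to_string()),
        TokenTree::Ident("Loaf".to_string()),
        TokenTree::Group(Delimiter::Brace, body),
    ])
}

fn main() {
    let stream = sample();
    let copy = stream.clone(); // bumps a reference count, copies no tokens
    println!("shared storage: {}", copy.shares_storage_with(&stream));
    for tree in stream.0.iter() {
        println!("{:?}", tree);
    }
}
```

The top-level stream here has three trees, and the third one carries its own nested stream.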
And a bunch more syntax as well. So I wanna break down this derive example and show you what each of these syntax variants corresponds to. And so we have this derive macro, which is Serde's, and then a couple of serde attributes and some fields inside there. So first up, we have a group.
A group represents a set of tokens enclosed by a balanced delimiter. These are parentheses or brackets or curly braces. And so here we have three. We have the brackets on the serde attribute, the parentheses on the serde attribute, and then the curly braces on the struct itself. And so that's the nesting form here. And this is why it's a token tree,
as opposed to just a token or a token list: there's nesting, where the group internally has another token stream of what's inside. The next are identifiers. And so these are keywords. These are variable names. These are struct names. These are type names. These are just any valid Rust identifier, and there are various rules for that.
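How delimiters turn a flat character sequence into a tree can be sketched with a small lexer. This is an illustration only: the `Tree` type and `lex` function are hypothetical, string literals and multi-character tokens are not handled, and the real compiler lexer is far more involved.

```rust
#[derive(Debug, PartialEq)]
enum Tree {
    Group(char, Vec<Tree>), // opening delimiter plus the nested trees
    Ident(String),
    Punct(char),
    Literal(String), // digit runs only, in this sketch
}

fn closer(open: char) -> char {
    match open {
        '(' => ')',
        '[' => ']',
        _ => '}',
    }
}

fn lex(src: &str) -> Result<Vec<Tree>, String> {
    // Stack of open groups; the bottom frame is the top-level stream.
    let mut stack: Vec<(char, Vec<Tree>)> = vec![(' ', Vec::new())];
    let chars: Vec<char> = src.chars().collect();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '(' || c == '[' || c == '{' {
            stack.push((c, Vec::new()));
            i += 1;
        } else if c == ')' || c == ']' || c == '}' {
            if stack.len() < 2 {
                return Err(format!("unexpected `{}`", c));
            }
            let (open, inner) = stack.pop().unwrap();
            if closer(open) != c {
                return Err(format!("`{}` closed by `{}`", open, c));
            }
            // The finished group becomes one tree in its parent stream.
            stack.last_mut().unwrap().1.push(Tree::Group(open, inner));
            i += 1;
        } else if c.is_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            stack.last_mut().unwrap().1.push(Tree::Ident(word));
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let digits: String = chars[start..i].iter().collect();
            stack.last_mut().unwrap().1.push(Tree::Literal(digits));
        } else {
            // Everything else is a single-character piece of punctuation.
            stack.last_mut().unwrap().1.push(Tree::Punct(c));
            i += 1;
        }
    }
    if stack.len() != 1 {
        return Err("unclosed delimiter".to_string());
    }
    Ok(stack.pop().unwrap().1)
}

fn main() {
    let src = "#[derive(Serialize)] struct Loaf { slices: u32 }";
    for tree in lex(src).unwrap() {
        println!("{:?}", tree);
    }
}
```

Anything that is not a delimiter, an identifier, or a digit run falls out as a one-character `Punct`, so an operator written with two characters yields two trees.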
So here we have serde, rename_all, struct, the fields, the types. All those are identifiers; they have no internal spaces. They're just one valid Rust identifier. Next we have punctuation, which is all the little bits of syntax here and there, like commas and pounds and colons and equals. And critically, a piece of punctuation
is one character large. So it's just one Unicode character. Internally, if you have a shift left, which has two less-than signs, that'll be two different pieces of punctuation. And I'm not gonna go too much into details. We can all figure that out later. And the final one here is literals. And so these are numbers.
These are strings in this case. We could have byte strings. We could have, not arrays, well, anyway. So those are the bare values, which are compiled into static memory, and they're just a different class of syntax. So these four pieces encompass everything that you will receive in your procedural macro. You'll be getting a giant list of all that,
and you're expected to produce a giant list of all that. Another feature of the macro system that we've been developing is that we've been modularizing all of these as well. Before we actually had stable macros in 2018, we had to use these funky #[macro_use] attributes.
We had to use other funky attributes. They had weird scoping rules. And so with the recent push for stabilization of macros, we've added new support in the entire module system to correctly scope these, working through the same module system that you're used to. And so we can bring in println, Serialize, wasm_bindgen; we can even re-export. And so typically, as I was saying, you have a dedicated crate for defining a procedural macro,
but you might re-export it from a different crate. So the wasm-bindgen crate will re-export this from the wasm-bindgen-macro crate internally. And one of the nice aspects of this is that macros are their own namespace. And so this means that a single name can bring multiple things, values or types or macros, into scope at the same time.
So when you say use serde::Serialize, that's actually bringing into scope both the trait and the macro. And so it cuts down on the number of imports that you have to write. And it's also a nice thing for library authors, where if you have everything line up very nicely, trait-wise and derive-wise, then you can have one import to bring everything into scope all at once.
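The separate macro namespace can be seen without any procedural macro at all. The sketch below uses `macro_rules!` as a stand-in, with a hypothetical `bake` function and `bake!` macro sharing one name; it does not show the single-import behavior of `serde::Serialize` itself, only that the two kinds of name do not collide.

```rust
// A function in the value namespace.
fn bake() -> &'static str {
    "baked by the function"
}

// A macro with the same name, living in the macro namespace.
macro_rules! bake {
    () => {
        "baked by the macro"
    };
}

/// Calls both; the `!` decides which namespace is consulted.
fn both() -> (&'static str, &'static str) {
    (bake(), bake!())
}

fn main() {
    let (from_fn, from_macro) = both();
    println!("{} / {}", from_fn, from_macro);
}
```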
So all right, that was a bit of a whirlwind tour of what procedural macros are. Let's dive into some internal details of how you would actually write these macros and generate these giant lists and parse these giant lists. Our problem here is we have this println macro we wanna write, we have this input, which is some list of tokens, just some list of syntax that we have to figure out, and we have to produce this std::io println whatnot.
But internally, it's kind of hard to work with all that. So what we morally want here is a parsed representation of what we might be receiving. If we say println with some number of loaves, then we want the format string to be parsed directly into memory, and then we also want some expressions. Now, we don't really know
where this expression type is coming from, but we want arbitrary Rust expressions, because we want the syntax to feel very familiar at that point. And then after we actually parse it, we might do some validation to make sure that the number of curly-brace placeholders matches the number of arguments that we pass in. And then finally, we wanna actually expand it to generate that std::io println business that we were seeing. So it turns out the crates.io ecosystem
has a number of very handy crates for doing exactly this. The first is a crate called syn. What syn does is provide a lot of default parsers and default types that match a lot of Rust syntax, and so we can just replace this Expr with a literal syn::Expr defined on crates.io. We can then use syn; it'll have a lot of nice utilities,
we'll see those in a second, for implementing this parse function, so we can very easily write our parsers and we don't have to parse all of Rust syntax in every single crate and have a few bugs here and there. And then to actually expand this, to actually create some token streams, we can use the quote crate, which we'll be looking at in some detail after this as well.
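The parse, validate, expand shape described above can be sketched end to end using only the standard library. This is a simplification under stated assumptions: a real macro would parse tokens with syn and generate them with quote, whereas here `Invocation`, `parse`, `validate`, `expand`, `toy_println`, and the emitted `print_formatted` call are all hypothetical and work on strings with a naive comma split.

```rust
/// Parsed form of a toy `println!`-style invocation.
#[derive(Debug, PartialEq)]
struct Invocation {
    format: String,
    args: Vec<String>, // stand-in for parsed Rust expressions
}

/// Parse `"fmt", a, b` by splitting on commas. The naive split breaks on
/// commas inside the format string or inside nested expressions.
fn parse(input: &str) -> Result<Invocation, String> {
    let mut parts = input.split(',').map(|p| p.trim());
    let first = parts.next().unwrap_or("");
    if first.len() < 2 || !first.starts_with('"') || !first.ends_with('"') {
        return Err("expected a string literal first".to_string());
    }
    Ok(Invocation {
        format: first[1..first.len() - 1].to_string(),
        args: parts
            .filter(|p| !p.is_empty())
            .map(|p| p.to_string())
            .collect(),
    })
}

/// Check that `{}` placeholders and arguments line up.
fn validate(inv: &Invocation) -> Result<(), String> {
    let placeholders = inv.format.matches("{}").count();
    if placeholders != inv.args.len() {
        return Err(format!(
            "{} placeholders but {} arguments",
            placeholders,
            inv.args.len()
        ));
    }
    Ok(())
}

/// Emit the replacement source text; `print_formatted` is made up.
fn expand(inv: &Invocation) -> String {
    let mut out = format!("print_formatted(\"{}\\n\"", inv.format);
    for arg in &inv.args {
        out.push_str(", ");
        out.push_str(arg);
    }
    out.push_str(");");
    out
}

fn toy_println(input: &str) -> Result<String, String> {
    let inv = parse(input)?;
    validate(&inv)?;
    Ok(expand(&inv))
}

fn main() {
    println!("{:?}", toy_println("\"{} loaves of {}\", count, kind"));
    println!("{:?}", toy_println("\"{} loaves\""));
}
```

Keeping the three stages as separate functions mirrors the talk's split: the parser can be swapped for a real one without touching validation or expansion.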
So as I was saying, syn will be used to parse all of the Rust code that we receive. The way that it mostly works is that a lot of times when you write a macro, you're extending Rust syntax in very small ways. So in a println format, we have custom syntax in the string, but everything after that
is just a bunch of function arguments. So just commas and expressions and all existing Rust syntax. And so the reason syn is so powerful is that it gives you all these parsers which you can leverage, but then you can write the top-level logic for having little pieces of custom syntax here and there, and then use the rest of syn to do the real work of actually parsing an expression or dealing with precedence
and all that crazy stuff. And then one thing I'll talk about a little bit later is this idea of preserving span information. We saw in the previous talk how important spans are for producing errors. And so that's also very important when you're manufacturing all these token streams, because this is exactly what the compiler is compiling. And so we wanna make sure that that span information is accurate
to make sure that the error messages are as good as possible. And then finally, syn is relatively large, in that there's actually a lot of Rust syntax. And because procedural macros are always compiled with your crate, when you type cargo build, you have to actually build a Rust parser on the spot. And so there are a lot of feature gates to aggressively remove code that we don't have to compile.
And so it compiles very, very quickly for most average cases. And to give you a bit of an idea, if you're writing a derive macro, you don't actually have to define anything. We don't need our own invocation type or anything like that. We can just use the utilities provided by syn. So syn gives us this DeriveInput type. It gives us a convenient macro
for parsing directly into it. And we're off to the races. We can immediately take a look at that, see if it's a struct or an enum, how many fields it has, what kind of variants, all those various aspects of the input, and just keep on going. And it goes a bit beyond this, where it's a little more advanced, and you might have seen this in the workshop yesterday had you been there. But there are extra traits for parsing
custom syntax in nice ways. It's similar-ish to nom, but a little bit more ergonomic in a few places, but I'm not gonna go too much into the details here. And so the second half of this is that after we've actually parsed this code, we now need to generate a bunch of token streams. And so that's where quote is gonna come in, where if you take a look at the bare fundamentals,
it's hard to take a look at this token stream and interpret it, so we parse it. But similarly, it's very difficult to create token streams: all you can do is create them from a list, or create an empty one, or add new tokens to an existing one. And so the idea here is that there's a convenient macro called quote, which does a process called quasi-quoting,
which doesn't make any sense to me, but I'll talk a little bit more about it. The idea is you can have these interpolations where you can splice in pieces of syntax that don't always make sense in isolation, but when you look at it as a whole, it creates a valid Rust program. And to give you a bit of an idea, this is producing the syntax let slices = 42.
And so at the top here is what you would be doing with the raw TokenStream and TokenTree APIs. It's basically terrible. It's horribly unergonomic to create each identifier, to create each piece of punctuation. It's very difficult to read this, to modify it, to understand what's going on, whereas below, it's actually pretty obvious we're creating let slices = 42.
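The contrast between the two styles can be approximated without the real crates. In the sketch below, `by_hand` assembles `let slices = 42;` one token at a time, and `toy_quote!` is a hypothetical stand-in built on the standard `stringify!` macro; it is not the quote crate and has no interpolation, it only shows how writing the syntax directly reads compared with building it piecewise.

```rust
#[derive(Debug)]
enum Token {
    Ident(&'static str),
    Punct(char),
    Literal(&'static str),
}

/// Join tokens into source text, one space after each token.
fn render(tokens: &[Token]) -> String {
    let mut out = String::new();
    for token in tokens {
        match *token {
            Token::Ident(s) | Token::Literal(s) => out.push_str(s),
            Token::Punct(c) => out.push(c),
        }
        out.push(' ');
    }
    out
}

/// The manual style: every identifier and punctuation mark spelled out.
fn by_hand() -> Vec<Token> {
    vec![
        Token::Ident("let"),
        Token::Ident("slices"),
        Token::Punct('='),
        Token::Literal("42"),
        Token::Punct(';'),
    ]
}

/// The quoting style: write the syntax itself and capture its tokens.
macro_rules! toy_quote {
    ($($tokens:tt)*) => {
        stringify!($($tokens)*)
    };
}

fn quoted() -> &'static str {
    toy_quote!(let slices = 42;)
}

/// Whitespace in rendered tokens is not significant, so drop it to compare.
fn strip_spaces(s: &str) -> String {
    s.chars().filter(|c| !c.is_whitespace()).collect()
}

fn main() {
    println!("{}", render(&by_hand()));
    println!("{}", quoted());
    println!("same tokens: {}", strip_spaces(&render(&by_hand())) == strip_spaces(quoted()));
}
```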
You also get some nice syntax highlighting in here, because it looks a little bit like Rust syntax, but these are functionally the same, in that neither is actually executing this code. They're producing a token stream that represents that code, which will then be compiled by the compiler later on. And this interpolation aspect comes in when you have local variables, such as the name of some struct you're generating
or the fields that it contains internally. And then you can just do this splicing, where you can have little #name interpolations, or you can do iteration with these; it's sort of a macro_rules!-style syntax. But the idea here is it's all very valid Rust syntax and you don't have to worry so much about what's going on, and it's very easy to read and see what's being generated
as opposed to manually generating each of these kinds of token streams. And so quote ends up being a very, very powerful crate for writing very concise code generators and all these crazy different kinds of expanders, and syn itself comes with all of these. So if you're given an expression, you can both parse it and then create a token stream from it as well; it's all provided for you there.
So all right, the last thing to consider when you're writing a macro is the span information, which I've been talking about. As Esteban was saying earlier, a span is all about where a piece of code came from, and so that's primarily used in diagnostics, but it's also used in things like debug information, for if you set breakpoints on line numbers.
But the idea here is that every token tree, so the groups, the literals, the punctuation that you saw earlier, is annotated with a span. So this is from byte four to byte five, this is on line six, line seven, that kind of stuff. And if you actually erase all span information, so if you take this input and this bakeat macro and you just stringify everything and parse it back out,
you're gonna get a horrible error message here that says, oh, it points to the #[bakeat(375)] attribute, and it doesn't actually point to split out half, and it's very difficult to know where in your code that is actually located. Whereas if you correctly preserve the span information, then you get the normal compiler errors saying, oh, okay, it actually is happening at this line,
at this method call, at this exact location. And so this is something that's very, very important to remember when you're writing procedural macros: you need to make sure that all along the way, span information is preserved as much as possible, because in this case, there's no problem with the macro itself. So the bakeat macro could be producing
completely valid code, but it's the internal code written by the user which is actually the invalid part, which has a syntax error or a type-checking error or a name-resolution error. That's naturally going to happen, and you wanna make sure that your procedural macros don't destroy all the diagnostics coming out of it. And the other really cool thing that you can do with spans is actually manufacture custom error messages.
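Why a precise span matters for a diagnostic can be shown with a minimal model in which a span is just a byte range. The `Span` struct, `span_of`, `render_error`, and the `bake!(bread)` source line are all hypothetical, and the renderer assumes single-line ASCII input; the real compiler's spans and diagnostics carry much more information.

```rust
/// Byte range of a token in the original source.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Span of the first occurrence of `needle` in `src`, if any.
fn span_of(src: &str, needle: &str) -> Option<Span> {
    src.find(needle).map(|lo| Span {
        lo: lo,
        hi: lo + needle.len(),
    })
}

/// Render a diagnostic with carets under the spanned bytes.
fn render_error(src: &str, span: Span, message: &str) -> String {
    let padding = " ".repeat(span.lo);
    let carets = "^".repeat(span.hi - span.lo);
    format!("error: {}\n  {}\n  {}{}", message, src, padding, carets)
}

fn main() {
    let src = "bake!(bread)";
    let message = "expected a specific kind of bread";

    // Span preserved: the carets sit under the offending token.
    let precise = span_of(src, "bread").unwrap();
    println!("{}\n", render_error(src, precise, message));

    // Span erased: all that is left to point at is the whole invocation.
    let whole = Span { lo: 0, hi: src.len() };
    println!("{}", render_error(src, whole, message));
}
```

The second rendering is the situation the talk warns about: once tokens have been stringified and re-parsed, only the outer invocation remains as a location.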
It's a bit of a hack today, but you can create fancy error messages like this, which point directly at a particular input token and have no correlation to what the compiler would do. And so here I'm saying I want a very specific kind of bread, not just a generic kind of bread, to knead at this point. And so this is a great way for macros
that can deduce at parse time what's valid and what's not valid, or whether something is syntactically incorrect, to provide very pointed error messages saying, you didn't annotate this field, or you did annotate this field and you shouldn't have, or you forgot to use this. All that kind of good stuff. So, the last thing that I wanna talk about is a little bit about the future of macros
and some features we have coming down the pike, as well as some examples of cool macros a little bit later. And it would be remiss of me not to talk about the word hygiene during a talk about macros. Now I don't really know what hygiene is myself, but it always comes up when you talk about macros and Rust, and so the best way that I can explain it
is a small example of how it always goes wrong. So let's say you're writing a custom derive, a derive macro, where you're implementing MyTrait for some type that someone passed in. Well, it turns out this might not actually work, because the way macros work today, it's as if you just took those tokens and wrote them right next to the code where your macro was located.
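The "pasted at the call site" behavior can be reproduced in a single file with `macro_rules!`, which resolves item names such as traits at the invocation site in the same way. The sketch is an analogy, not a procedural macro: `bakery`, `Bake`, and both `impl_bake_*` macros are hypothetical, and the `$crate` path used in the second macro is a `macro_rules!` feature that procedural macros do not have.

```rust
mod bakery {
    pub trait Bake {
        fn bake(&self) -> String;
    }
}

// Unqualified: `Bake` is looked up wherever the macro is invoked.
macro_rules! impl_bake_unqualified {
    ($t:ident) => {
        impl Bake for $t {
            fn bake(&self) -> String {
                stringify!($t).to_string()
            }
        }
    };
}

// Qualified through `$crate`, so the call site needs no import.
macro_rules! impl_bake_qualified {
    ($t:ident) => {
        impl $crate::bakery::Bake for $t {
            fn bake(&self) -> String {
                stringify!($t).to_string()
            }
        }
    };
}

mod with_import {
    use super::bakery::Bake; // removing this import breaks the expansion below
    pub struct Wheat;
    impl_bake_unqualified!(Wheat);
}

mod without_import {
    pub struct Rye;
    impl_bake_qualified!(Rye);
}

/// Calls both impls through the fully qualified trait path.
fn baked_names() -> (String, String) {
    (
        bakery::Bake::bake(&with_import::Wheat),
        bakery::Bake::bake(&without_import::Rye),
    )
}

fn main() {
    let (wheat, rye) = baked_names();
    println!("{} and {}", wheat, rye);
}
```

The unqualified macro only compiles because `with_import` happens to import the trait, which is exactly the fragility described for the derive example.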
And so maybe MyTrait wasn't in scope, so where do we actually pull MyTrait from? This might give a compile error unless you have it in scope. As a macro author, we'll fix that. We'll say, oh, this comes from my_crate, it's located here at this path in my_crate, and therefore this will always compile, because you don't have to have it in scope,
you just have to make sure you link to my_crate, but you're probably pulling in my_crate, or the macro through the crate, anyway. But even this doesn't work. I can rename crates in my Cargo.toml for whatever reason I might want, and so although you might call it my_crate, I might call it other_name or something else completely different. And so at this point we actually have no idea
what we're gonna write in this macro here. We have no way to canonically refer to something that we want to refer to. And so this is the whole fundamental problem of hygiene: I want to deterministically refer to this trait no matter what you're linking to, no matter what you called things, no matter what else you've said. And so that problem is very, very difficult
and one that we have not tackled yet, and so all the procedural macros are not hygienic, where they're just taking your tokens and copy-pasting them, and they're effectively best effort, where you kind of just have to write this and hope for the best and make sure that idiomatically your crate's not renamed, or idiomatically no one writes other crazy stuff or tries to re-export macros. And it's a major pain point of macros today
and one we want to continue to improve upon although it's a very large extension relative to where we are today unfortunately. So the next thing is that we've seen kind of the power of Rust's diagnostics. We've seen how helpful they can be, how useful they can be, how bad they can be in some cases but this is one where we want to give
that same level of power to procedural macros written on crates.io. And so this is an example of an error message produced by the compiler, but we were unable to use that in our procedural macro. All we could do was generate the words "mismatched types", and so eventually we're gonna have the ability to have this underlined text, the expected business, the help business,
the note business, warnings, all sorts of official-level diagnostics, and so this is one that's coming down the pike but not stabilized yet. I think it's implemented on nightly right now and you can play around with it, but it's still got a little bit of time before stabilization. And the last thing that I would like to see is kind of a better experience for debugging macros,
and so when you're working with macros you tend to be producing pieces of Rust syntax and you have to produce valid Rust syntax because it's gonna be turned around and parsed by the compiler but if you produce invalid Rust syntax you get a crazy error message like this that doesn't make any sense and so this has to do again with the span information where the spans all point to this original macro
and it's a lot of internal details here, but I would love to see kind of a better system where maybe we can pretty-print in the compiler, or we can have a better experience where you don't have to add random print statements in your macro to see what's coming out. Right now it's just kind of, you have to hope that it was valid previously and then know what you changed if it breaks, so you can kind of look within that delta
to see what's actually broken in this kind of case. So, the last thing that I want to do is kind of give you a bit of an idea of some cool macros that I've seen in the ecosystem. I was just sitting in the workshop yesterday and I realized that all these macros that are just described in that one README are just, I always love seeing the possibilities
of procedural macros. I've been talking a lot about what they are and kind of what they can do at a technical level, but I love seeing examples of how people are doing completely crazy things with macros that I would never even consider. So David showed me this one, which is: what if you had a macro to say that your enum was always sorted? So if I put rye before focaccia, I would get a compile-time error saying, oh, you put this in the wrong place,
please place it after focaccia. Not necessarily a killer feature of Rust but kind of cool that you can write this on crates.io and it's just a completely library-fied crate and it'll never break and you can pull it at any time. Similarly, many of you might have used clap for argument parsing when you're writing a command line program in Rust and it turns out there's this convenient other crate
called structopt where you can slap on derive(StructOpt) and all of a sudden you have an argument parser. You can internally say what the long arguments are, the short options are, kind of some help values, default values, all sorts of crazy configuration, and this is a very succinct way to define a CLI interface and have it parsed already
for you and kind of remove tons of boilerplate. And so this is one that's not really deriving a trait per se but it's still kind of leveraging the ability of derive to be a very lightweight annotation where you're like, oh, I don't know exactly what that is but I can kind of understand what's going on because so much of the syntax surrounding that is familiar and easy for me to read.
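To give a feel for the boilerplate being removed, here is a rough sketch of the kind of hand-written parsing that a derive like this stands in for. It uses only the standard library and is not structopt's actual generated code; the Opt struct, its fields, the flag names and parse_args are all hypothetical and chosen just for this illustration.

```rust
// Hypothetical options for a small command-line tool.
#[derive(Debug, PartialEq)]
pub struct Opt {
    pub verbose: bool,
    pub count: u32,
    pub input: String,
}

// Hand-rolled parser: every flag, default value and error message
// has to be spelled out and kept in sync with the struct by hand.
pub fn parse_args<I>(args: I) -> Result<Opt, String>
where
    I: IntoIterator<Item = String>,
{
    let mut verbose = false;
    let mut count: u32 = 1; // default value
    let mut input: Option<String> = None;

    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "-v" | "--verbose" => verbose = true,
            "-c" | "--count" => {
                let value = args
                    .next()
                    .ok_or_else(|| "--count needs a value".to_string())?;
                count = value
                    .parse::<u32>()
                    .map_err(|_| format!("invalid count: {}", value))?;
            }
            other if other.starts_with('-') => {
                return Err(format!("unknown option: {}", other));
            }
            // Anything else is treated as the positional input argument.
            other => input = Some(other.to_string()),
        }
    }

    let input = input.ok_or_else(|| "missing <input> argument".to_string())?;
    Ok(Opt { verbose, count, input })
}
```

In a real program this would be fed something like std::env::args().skip(1). With a derive-based approach, the struct definition plus a few annotations is meant to replace the whole parse_args function.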
Similarly, the GNOME project has been working on a macro for quite some time, where they have a very complicated runtime system and what they wanna do is define these things called GObjects in Rust. And they have a lot of boilerplate associated with them, a lot of kind of various pieces of interfaces here and there, and the idea is they can wrap it all up
in a nice little macro. And so this isn't exactly Rust syntax, it's more kind of GObject- or GNOME-oriented syntax, but you can kind of see and kind of surmise what's going on at this point. And so this is a cool example of how tons of boilerplate is generated behind the scenes here, and this is very error prone in C, it's very error prone in Rust if you write it by hand,
but here is a nice and succinct way to kind of keep everything nice in one place and understandable and just kind of a cool thing you can do with macros at that point. And it also is showing how you can extend Rust syntax, you don't have to have exactly structs or exactly enums, this is just arbitrary token streams and so class isn't a thing in Rust
but you can make it a thing in your macro if you want. And then one other important thing to point out here is that this is again one of those instances where there's so much code here in the middle, you could have any sort of compiler errors with that, like you could have type check errors, you could have weird integer errors or anything like that, so you wanna make sure that that span information is all preserved here,
so if each slice has a random bug inside of it not associated with the macro, you still have those high quality error messages coming out of it. And then the last thing I thought which was really really cool is David's actually also created a macro called no_panic where you put it on a function and if this compiles, the function doesn't panic. That's kind of, I've heard that a lot
with kind of embedded systems, where they want functions that can never panic or functions that are guaranteed to not panic, for code size reasons or just kind of correctness reasons, and so this is a very lightweight way to do crazy analysis at compile time, in kind of an interesting sense, but the idea is that this is at least a way to say this function never panics at runtime
when I actually compile it, and you can know that statically. So all right, that's what I have. If you have some further questions, these are some great links. The first of which is David's proc macro workshop done yesterday, which has a lot of excellent exercises to kind of get your feet wet, see how these macros work in various different ways and kind of get used to syn and quote and all that.
The reference for procedural macros is actually pretty up to date and it has, not necessarily, it's very technical. It's a reference piece of documentation, and then the syn and the quote crates have excellent documentation for getting started with procedural macros and kind of how to write your own and how to do that. So all right, thank you so much.