
The Power of the "Where" Clause

Formal Metadata

Title
The Power of the "Where" Clause
Number of parts
8
License
CC Attribution 3.0 Unported:
You are free to use, adapt, copy, distribute, and make the work or its content publicly available, in unchanged or adapted form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.

Content Metadata

Abstract
Rust expresses trait bounds using the where clause. It is an essential tool for all generic Rust programming. Yet, many Rust programmers don't know about the full expressiveness of it! This talk guides us through most of the features and shows us how to creatively use where clauses to both keep our code clean and flexible, but also sound and hard to use in a wrong way. This talk will teach us the basic building blocks to not get lost in a forest of constraints. Follow us on Twitter: https://twitter.com/rustlatamconf
Transcript: English (automatically generated)
Well, hello and thank you. First of all, Alex has not told you the full story. This is now a completely valid Rust program and you actually don't need to program anything anymore. You can just have everything generated.
We just lost our jobs. So who am I? I'm Florian. There are Twitter and company links on this slide. I'm actually here for Mozilla: Mozilla sent me here as part of the Tech Speakers program, and this is actually my second time on this stage.
I was here before at RubyConf Uruguay. I spent 10 years in the Ruby community; now I've been in the Rust community for five years, something like that, and I've met a lot of familiar faces. I'm happy to be here again. I started with Rust in 2015, mostly out of personal curiosity. And because I'm not a good solo learner, I immediately started the Berlin user group, which is one of the biggest and most active around the globe.
I'm organizing two Rust conferences, RustFest and OxidizeConf. I can't emphasize enough how great a job this is. And I've been a project member since 2015, mostly in the community team and now also moving to other spaces.
I can definitely agree with Niko that the Rust project is very easy to join in all positions. If you have something that is specifically of interest to you, it's very easy to get in touch with people and get involved.
And this is also my first talk at a Rust conference. I've organized them, but I have never given a talk at one. What I want to do is make you competent at reading and writing where clauses, because they are important but have some subtleties to understand. So, just as a bit of background: what is a where clause? Let's say we have a pretty simple data type.
We have a point. The point has an x and a y value. And we have a function called print_point that just takes a reference to that point and prints it out, nothing important. The most important thing here is that Point implements Debug, because the println! macro requires that of things we print using the question mark syntax ({:?}).
But this version of the function takes just a bare Point. So what if I have squares? A square has an x and y position and a width.
That's enough to describe a square. And I can write a print_square function that takes a square and does exactly the same thing. That's very repetitive, and computers are very good at repetitive work. We are not as good as computers here. And the unifying thing between these two is not that they're shapes.
For the printing function, what we actually need is that both of them implement the debug representation, which they get from the derive(Debug) attribute up there. So we can rewrite that function as fn print, which takes a type S, or rather a borrow of an S, and prints it out. But the important thing is that we need to indicate to the compiler in some way that we only accept types that we can actually turn into this debug representation. That we do using the where clause. So we put there: where S: Debug.
The thing under number three is called the trait bound, and the thing under number one is called a type variable. So now we can create both a point and a square and print both of them out using print.
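The slide itself is not part of this transcript, so the following is a minimal reconstruction of the generic print function as described. The field types and the describe helper (which returns the text so it can be checked) are assumptions for illustration.

```rust
use std::fmt::Debug;

#[derive(Debug)]
struct Point {
    x: i64,
    y: i64,
}

#[derive(Debug)]
struct Square {
    x: i64,
    y: i64,
    width: i64,
}

// Hypothetical helper: builds the debug representation of any S that is Debug.
fn describe<S>(shape: &S) -> String
where
    S: Debug, // the trait bound on the type variable S
{
    format!("{:?}", shape)
}

// One generic function replaces print_point and print_square.
fn print<S>(shape: &S)
where
    S: Debug,
{
    println!("{}", describe(shape));
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let square = Square { x: 0, y: 0, width: 2 };
    print(&point); // compiled as a Point-specific version
    print(&square); // compiled as a Square-specific version
}
```

Calling print with a type that does not implement Debug is rejected at compile time, which is exactly what the where clause is there to state.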
The important part here is that the first print statement internally calls a function named something like print_point, and the second something like print_square. Those are actually different functions. The compiler just generates them for us.
Logically speaking, there's an infinite number of those functions, one for every type that is Debug. But in the case of this program, we actually only compile two. From Rust's point of view, those functions are compiled on demand. We are actually using the point version, and we are actually using the square version.
These two are needed, so these two are going to be generated. There are other places where the where clause can be stated. For example, we can create a struct that is generic and holds two somethings inside, preferably numbers if we're representing a point.
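A short sketch of a where clause on a generic struct, assuming the type variable is called P and is bounded by Debug as in the function example:

```rust
use std::fmt::Debug;

// A generic point: both coordinates share one type P, and P must be Debug.
#[derive(Debug)]
struct Point<P>
where
    P: Debug,
{
    x: P,
    y: P,
}

fn main() {
    let float_point = Point { x: 1.5, y: 2.5 };
    let int_point = Point { x: 1, y: 2 };
    println!("{:?}", float_point);
    println!("{:?}", int_point);
}
```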
And we can also say P needs to be Debug. Most of what's important, though, you can practice entirely on functions. A bit more detail here. So we have a function that has two generics, T and E, and I can express what I want from them. First of all, what the where clause gives me over the shorthand syntax is that I can split things up and state bounds multiple times. I can say I want T to implement a trait called Trait and, separately, a trait called OtherTrait. Or I could have another type, E, that implements Trait + OtherTrait. So I can do both of these. It's functionally the same, but for organizing your code it helps a lot. The important thing in the where clause is that the things on the left are concrete types.
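The two ways of stating bounds can be sketched as follows. Trait and OtherTrait are the names used in the talk; their methods, the Item type and the summarize function are made up for illustration.

```rust
trait Trait {
    fn name(&self) -> String;
}

trait OtherTrait {
    fn size(&self) -> usize;
}

// Hypothetical type implementing both traits.
struct Item;

impl Trait for Item {
    fn name(&self) -> String {
        "item".to_string()
    }
}

impl OtherTrait for Item {
    fn size(&self) -> usize {
        1
    }
}

fn summarize<T, E>(t: &T, e: &E) -> String
where
    T: Trait,
    T: OtherTrait, // the same type variable, bounded a second time
    E: Trait + OtherTrait, // functionally the same, written in one line
{
    format!("{}:{} {}:{}", t.name(), t.size(), e.name(), e.size())
}

fn main() {
    println!("{}", summarize(&Item, &Item));
}
```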
What this says is: I have a function, and for every pair of types T and E that fulfill the bounds on the right, I can compile this function for those two exact types.
Because the left-hand side is an actual type, this for example works: I can write a function that takes a T, where String, the standard string type of the Rust library, implements From<T>. So I can say I take any kind of type that can be turned into a String in that way.
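A minimal sketch of a bound whose left-hand side is a concrete type rather than a type variable. The function name into_string is an assumption; the From implementations used are from the standard library.

```rust
// The bound is on String, not on T: any T that String can be built from is accepted.
fn into_string<T>(value: T) -> String
where
    String: From<T>,
{
    String::from(value)
}

fn main() {
    let from_str = into_string("hello"); // String: From<&str>
    let from_char = into_string('x'); // String: From<char>
    println!("{} {}", from_str, from_char);
}
```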
Where clauses are important for a couple of reasons. For example, they are how we can constrain the type that an iterator returns to us. So again, using just the bare Debug trait and printing stuff out to the console, I can have a function called debug_iter that takes any kind of iterator. The Iterator trait is, again, the standard iterator, but it has an associated type, which is the item that it's going to return. I don't know what the item is, but using the where clause I can at least say the item must be in the set of types that implement Debug and can be printed to the console. And to my knowledge, that was also one of the arguments for introducing the where clause in the first place: to be able to do this.
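A sketch of bounding an iterator's associated type. The talk's version prints directly; this variant collects the debug strings first so the result can be inspected, which is a deviation made for illustration.

```rust
use std::fmt::Debug;

// Accepts any iterator, as long as its Item type implements Debug.
fn debug_iter<I>(iter: I) -> Vec<String>
where
    I: Iterator,
    I::Item: Debug, // bound on the associated type, only expressible in a where clause
{
    iter.map(|item| format!("{:?}", item)).collect()
}

fn main() {
    for line in debug_iter(vec![1, 2, 3].into_iter()) {
        println!("{}", line);
    }
}
```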
So there are a couple of patterns for working with that. When I do generic programming, I'm always talking about this idea of constraints. Take one of the standard library types, for example Result. Result is either "this worked" and gives me a value back, that's the Ok variant of the Result enum, or it gives me an error back. On its own, that doesn't give me a lot. And if you have a look at the standard library implementation of Result, there are a couple of functions defined. There are actually a lot of functions defined on this type.
The most basic ones are, for example, here in the implementation for any kind of value or error. I have two functions, is_ok and is_err, that just tell me which of the variants it was. That needs no knowledge about what T and E actually are. But if I call unwrap on a Result, I'm going to get a panic message that includes the debug representation of the error I had. And there I have an implementation that says impl<T, E> Result<T, E> where E: Debug, and then there are a couple of functions that rely on the error actually being debuggable. The other way around, there are implementations for when T, the value when everything worked, implements Default, so it has a default value: Result gains a function called unwrap_or_default, which doesn't panic but instead, in case of an error, gives me the default value back. So we are gradually constraining the Result type, and the more we know about it, the more functions get unlocked on it.
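The gradual unlocking can be sketched with a simplified stand-in type. MyResult is hypothetical and is not the standard library source; it only mirrors the structure described.

```rust
use std::fmt::Debug;

// Simplified stand-in for std's Result, for illustration only.
enum MyResult<T, E> {
    Ok(T),
    Err(E),
}

// No bounds: these methods exist for every T and E.
impl<T, E> MyResult<T, E> {
    fn is_ok(&self) -> bool {
        matches!(self, MyResult::Ok(_))
    }

    fn is_err(&self) -> bool {
        !self.is_ok()
    }
}

// Unlocked only when the error can be shown in a panic message.
impl<T, E> MyResult<T, E>
where
    E: Debug,
{
    fn unwrap(self) -> T {
        match self {
            MyResult::Ok(value) => value,
            MyResult::Err(err) => panic!("called unwrap on an Err value: {:?}", err),
        }
    }
}

// Unlocked only when T has a default value.
impl<T, E> MyResult<T, E>
where
    T: Default,
{
    fn unwrap_or_default(self) -> T {
        match self {
            MyResult::Ok(value) => value,
            MyResult::Err(_) => T::default(),
        }
    }
}

fn main() {
    let ok: MyResult<i32, String> = MyResult::Ok(7);
    let failed: MyResult<i32, String> = MyResult::Err("boom".to_string());
    println!("{} {}", ok.is_ok(), failed.is_err());
    println!("{}", ok.unwrap());
    println!("{}", failed.unwrap_or_default());
}
```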
And gradually unlocking features based on these kinds of bounds is a common API strategy in Rust and you can see that all through the standard library. Let's talk about another piece of standard library which is the threading API.
I have an example here that does, again, something rather useless. I have a vector and I spawn a thread which takes that vector and counts the elements of it and gives me the result back.
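A minimal version of that example, using std::thread::spawn. The function name count_in_thread is an assumption.

```rust
use std::thread;

fn count_in_thread(data: Vec<i32>) -> usize {
    // move transfers ownership of the vector into the closure.
    let handle = thread::spawn(move || data.len());
    // join waits for the thread and returns Err if the thread panicked.
    handle.join().expect("thread panicked")
}

fn main() {
    println!("{}", count_in_thread(vec![1, 2, 3]));
}
```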
The threading API is generic because you can push anything in and get anything back. So, as a first attempt at writing that on your own, you could come up with something like this. You have the thread spawn function; it has an F and a T type, takes in the F, and returns me the T. What is F? thread::spawn takes a closure and runs that closure, including all the data it closes over, in this case the vector, and returns whatever that closure returned through the JoinHandle. The JoinHandle tells me whether the thread actually ran to completion, whether there was an error or whatever, but it also gives me the result back. The problem is: what fits these slots?
What can I put into that F and that T? The classic problem when spawning a thread is that all the data put on that thread should preferably not reference anything from the context the thread was spawned in. Why? Because both are going to run in parallel, and the data in the first part might be removed, changed, whatever, independently of the second. So I want to have this idea that Niko introduced this morning, of handing complete packages over and also getting complete packages back, and forcing the programmer to actually move everything over, not half of it. The problem is, if I wrote the API like this, passing references would actually be possible, because I have no constraints on F and T other than that they're generic types. They might end up being references; they might be any kind of valid Rust type. So what I can do is use the where clause to express additional things, and one way of expressing what I just said is: you can actually send that stuff over to a thread. Rust has a marker for everything that is allowed to do that traveling. That's called Send.
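A sketch of such a signature, with the Send marker plus the 'static lifetime bound that the talk turns to next. The wrapper name spawn_owned is hypothetical; the bounds are the ones std::thread::spawn itself declares.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical wrapper carrying the same bounds as std::thread::spawn.
fn spawn_owned<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static, // the closure may travel to another thread and borrows nothing local
    T: Send + 'static, // the same holds for the value coming back
{
    thread::spawn(f)
}

fn main() {
    let data = vec![1, 2, 3];
    // move hands the whole vector over to the new thread.
    let handle = spawn_owned(move || data.iter().sum::<i32>());
    println!("{}", handle.join().unwrap());

    // Would not compile: the closure only borrows local, so it is not 'static.
    // let local = vec![1, 2, 3];
    // let handle = spawn_owned(|| local.len());
}
```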
And the other thing is, I can bound this value with a special lifetime called 'static. And the bound of any type plus 'static essentially means it must own all its data. So the party that spawns the thread must give up complete ownership of all the payload it puts on the thread, and after the thread is done, when we want to remove it and throw it away, we also need to be able to bring back everything we want to bring back. So this F: Send + 'static bound expresses this quite neatly. There was an issue in my teaching two or three years ago where people felt that writing + 'static was cheating, because you don't take part in the references game. It's actually a meaningful statement. If you don't want to deal with references, if you don't want to deal with borrows, just state 'static and deal with ownership; that's probably a very valid solution to your problem.
Another problem that might arise when you start working with where clauses and trying out bounds is that we often have wrapper types, and wrapper types sometimes express some kind of expectation about what you put inside. Here again I have a wrapper, again using the Debug trait as the example trait. Now I have another function that takes that wrapper, unwraps it, and takes the inner part. But because I have stated that the wrapper can only have types with a certain bound inside, if I want to write take_inner, I also always have to constrain the generic variable of take_inner as Debug. The function itself doesn't actually use it, though, so that's a problem. I want to write this function; it does nothing more than take that structure and take out whatever is inside. But I do have to state an additional bound just to reiterate that the wrapper already expects the inner part to be Debug and nothing else is allowed. So what can I do about that? There's a pattern in which you write this wrapper so that it can itself contain any type, but you can effectively only create the wrapper with something that is Debug. Let's just zoom in here. The way it works is that you don't allow users to directly construct the type; you only give them a constructor, and that constructor carries the bound that I want to express. What you can then do later, if you actually want to use this bound, for example by putting an inspect method on that wrapper, is just restate it. Because the compiler follows all those type variables through the whole program, the thing that was put into the wrapper is always a type that is Debug, and no one can effectively create any of those wrappers without this bound. I can also refactor this by putting that bound on the impl block instead of on all the functions themselves.
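The wrapper pattern can be sketched like this. The module layout and the private field are assumptions about how construction is restricted; Wrapper, take_inner and inspect are the names used in the talk.

```rust
mod wrapper {
    use std::fmt::Debug;

    // No bound on the struct itself. The field is private, so code outside
    // this module can only build a Wrapper through new.
    pub struct Wrapper<T> {
        inner: T,
    }

    impl<T> Wrapper<T> {
        // The constructor carries the bound.
        pub fn new(inner: T) -> Self
        where
            T: Debug,
        {
            Wrapper { inner }
        }
    }

    // Methods that use the bound restate it once, on the impl block.
    impl<T> Wrapper<T>
    where
        T: Debug,
    {
        pub fn inspect(&self) -> String {
            format!("{:?}", self.inner)
        }
    }

    // No Debug bound needed just to take the value out.
    pub fn take_inner<T>(wrapper: Wrapper<T>) -> T {
        wrapper.inner
    }
}

use wrapper::{take_inner, Wrapper};

fn main() {
    let wrapped = Wrapper::new(vec![1, 2]);
    println!("{}", wrapped.inspect());
    let inner = take_inner(wrapped);
    println!("{:?}", inner);
}
```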
That's for you to decide how you want to do this, but this allows me to write the take_inner function in a way where I don't have to express this bound, because at that point I don't care about it. It's absolutely not needed. Where clauses are primary refactoring targets.
If you want to change your program and make it more flexible or more constrained, depending on what your goal is, your where clauses are your primary refactoring target. Also, don't start out extremely generic. Don't try to write generic code out of the blue.
The pattern that I showed in the beginning, figuring out that two or three functions are basically the same and can probably be moved into one generic function, is a very useful one. The thing I showed in the beginning is probably something you could write immediately, but for any kind of more complex system: start simple, start building up, and build toward genericness. Also, finding the right level is important, though that's a classic in programming. A lot of application programming, outer-edge application programming, suffers from the fact that people try to make it too generic. But for library authors, for giving flexibility and for communicating intent to the outside, this is very useful. So always be aware of where you are and whether that's actually needed. And in the end you might end up writing terrible clauses like this one, from one of my current projects. I'm going to refactor that tomorrow. It's literally a work in progress. Okay, some advanced examples. Traits and bounds can be used to express relationships between types, and this becomes very useful. This is one of my favorite Twitter accounts. It's called Happy Automata, or "vaguely reassuring state machines". It generates state machines like this, and I would like to write one of those state machines myself. State machines usually have states.
Some of those states are terminal. I skipped start states in this example, just because that would be rote and would only make the example bigger. What I can do is write, for example, two traits. The first one is TransitionTo<S>, S being another state.
Expressed in the where clause, S needs to be a State. Self also needs to be a State, so I can make a statement about the type that this trait is going to be implemented on. I give it a function called transition, which takes the current state, actually owning and thereby destroying it, and returns the next state. Another trait, Terminate, can only be called in terminal states; it just removes the state machine and calls it done. And I can create three states. The state machine I'm creating here is basically: there's a start, there's a middle state that I can loop into again, and then there's an end. So I have Start, Loop and Stop. I implement State for them. This is actually an empty trait. It's just a marker to let the compiler know these types are states; I gain no functionality from it. Stop is the terminal state. That's where I'm ending. And then I can implement TransitionTo<Loop> for Start, so I can go from Start to Loop; TransitionTo<Loop> for Loop, so I can go back into it again; and TransitionTo<End> for Loop. Sorry, there's an error on the slide. And I implement Terminate for End, so I can actually stop. That means I cannot terminate the state machine if I have not ended up in the end state. So I make sure that the users of the state machine actually follow through and take this process to the end.
The code here is simple. This is one pattern for writing this. There's a whole blog post on it by a community member called Hoverbear. The setup and the programming are a little involved, but the usage is rather straightforward. The reason I have to annotate the type on the left side, so that I actually state what the next state is going to be, is exactly that I have this Loop state where I can either loop again or go to the end state; this is where I have to tell the type checker which next state I intend. The two commented-out lines in the middle wouldn't compile.
If I tried to terminate while I'm still in the Loop state, that doesn't work, because Loop doesn't implement Terminate. And if I tried to transition from the Loop state back to Start, that doesn't work either. That's not defined.
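The state machine described above can be sketched as follows. The terminal state is named End throughout here (the talk uses both Stop and End), and the run function is a hypothetical driver added for illustration.

```rust
// Marker trait: no methods, it only tells the compiler which types are states.
trait State {}

trait TransitionTo<S>
where
    S: State,
    Self: State,
{
    // Takes the current state by value, so it cannot be used afterwards.
    fn transition(self) -> S;
}

trait Terminate
where
    Self: State,
{
    fn terminate(self);
}

struct Start;
struct Loop;
struct End;

impl State for Start {}
impl State for Loop {}
impl State for End {}

impl TransitionTo<Loop> for Start {
    fn transition(self) -> Loop {
        Loop
    }
}

impl TransitionTo<Loop> for Loop {
    fn transition(self) -> Loop {
        Loop
    }
}

impl TransitionTo<End> for Loop {
    fn transition(self) -> End {
        End
    }
}

impl Terminate for End {
    fn terminate(self) {}
}

// Hypothetical driver walking Start -> Loop -> Loop -> End.
fn run() -> &'static str {
    let start = Start;
    let looping: Loop = start.transition();
    // Loop has two possible next states, so the annotation picks one.
    let looping: Loop = looping.transition();
    // looping.terminate();                      // would not compile: Loop is not Terminate
    // let again: Start = looping.transition();  // would not compile: no TransitionTo<Start> for Loop
    let end: End = looping.transition();
    end.terminate();
    "done"
}

fn main() {
    println!("{}", run());
}
```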
The second example, which also comes out of my work, is about what's stored in databases. Let's say I have a Storage trait. My storage can be queried, for example, for a model. The query function takes the storage, and I also give it an ID. So it takes the storage and reads out the model under this ID. The problem with this definition is that I could try to get anything out of it.
I could try to read strings, vectors, mutexes, whatever, because every type is valid to fill that variable, and this is what constraining gives me: constrain to the things that are meaningful. I can define a trait here called Stores<Model>.
And I can constrain the Stores<Model> trait too: it can only be implemented on storages. Then I can extend my function with a where clause saying that Self actually stores that model. Now I've defined a function that communicates: you can try to query models out of this, but only if it actually stores them.
And you force the implementer to declare to the compiler what's stored here. So for example, I can have a UsersDatabase. This UsersDatabase implements Storage by, I don't know, an SQLite-backed database, Postgres or whatever.
I have a User model and, for example, an Avatar model, so I can have users and avatars in the same database. And then I implement Stores<User> and Stores<Avatar> for UsersDatabase. And this becomes pretty natural: you can only query the things the storage actually stores. And in the end, I can write a program that boils down to something like this. I can connect to my database and I can try to query it.
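A self-contained sketch of this design. Several parts are assumptions made for illustration and are not confirmed by the transcript: the spelling Stores<M>, the load method that query delegates to, the u32 ID type, and the in-memory HashMaps standing in for a real database.

```rust
use std::collections::HashMap;

trait Storage {
    // Only callable for models this storage has declared that it stores.
    fn query<M>(&self, id: u32) -> Option<M>
    where
        Self: Stores<M>,
    {
        <Self as Stores<M>>::load(self, id)
    }
}

// Can only be implemented on storages.
trait Stores<M>: Storage {
    // Hypothetical per-model lookup that query delegates to.
    fn load(&self, id: u32) -> Option<M>;
}

#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Avatar {
    url: String,
}

// In-memory stand-in for a real database backend.
struct UsersDatabase {
    users: HashMap<u32, User>,
    avatars: HashMap<u32, Avatar>,
}

impl Storage for UsersDatabase {}

impl Stores<User> for UsersDatabase {
    fn load(&self, id: u32) -> Option<User> {
        self.users.get(&id).cloned()
    }
}

impl Stores<Avatar> for UsersDatabase {
    fn load(&self, id: u32) -> Option<Avatar> {
        self.avatars.get(&id).cloned()
    }
}

fn connect() -> UsersDatabase {
    let mut users = HashMap::new();
    users.insert(1, User { name: "alice".to_string() });
    let mut avatars = HashMap::new();
    avatars.insert(1, Avatar { url: "avatar.png".to_string() });
    UsersDatabase { users, avatars }
}

fn main() {
    let db = connect();
    // The model type has to be named, because the database stores several.
    let user: Option<User> = db.query(1);
    let avatar = db.query::<Avatar>(1);
    println!("{:?} {:?}", user, avatar);

    // Would not compile: UsersDatabase does not implement Stores<String>.
    // let text: Option<String> = db.query(1);
}
```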
I actually have to state at this point which model I want to query out of it. Here the type checker won't help me, because I've said I want to have multiple options, so I need to decide. So I'm saying: query the User out of that database. But if I tried to query a String out of it, or any other type, it would tell me: no, I don't actually store this. The error message in this case would be that the function exists, but the storage doesn't fulfill the right bounds. So, the conclusion out of all of this.
Getting comfortable with all the stuff that where clauses give you is important. Take it slow, though; don't start writing big ones right out of the gate. Picking exactly which constraints you need where is key. And spending some time on figuring out what you need, potentially over-constraining first and maybe removing some of the constraints later, may help. There's also an API concern around this. If you further constrain a where clause, you are breaking your previously committed API.
If you're widening it, allowing it to be called with more types, you're not breaking the external API. And in all of that there are creative patterns of interplay to be found, with which you can start declaring to the compiler how your systems work.
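The API point can be illustrated with two versions of a hypothetical library function; all names here are made up. Going from v1 to v2 drops a bound, so every existing caller keeps compiling. Going the other way would break callers such as the last one.

```rust
use std::fmt::Debug;

// Version 1: requires both Debug and Clone.
fn describe_v1<T>(value: &T) -> String
where
    T: Debug + Clone,
{
    format!("{:?}", value.clone())
}

// Version 2: the clause is widened by dropping Clone.
fn describe_v2<T>(value: &T) -> String
where
    T: Debug,
{
    format!("{:?}", value)
}

// A type that is Debug but deliberately not Clone.
#[derive(Debug)]
struct NotClone;

fn main() {
    println!("{}", describe_v1(&vec![1, 2]));
    println!("{}", describe_v2(&vec![1, 2])); // callers of v1 still compile against v2
    println!("{}", describe_v2(&NotClone)); // only the widened version accepts this
    // describe_v1(&NotClone); // would not compile: NotClone is not Clone
}
```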
Yeah, thank you. That's it.