OpenBSD network stack evolution

Dealing with the IP checksum and the protocol checksums (foremost TCP and UDP) in the network stack is surprisingly complex. Having stumbled over an unexpected performance penalty from the IP checksum, I always had this area on my mental todo list - and when we stumbled over a really nasty piece of code in pf dealing with these checksums, I re-evaluated and changed the IP checksumming in our stack, for performance and to make better use of checksum offloading to network cards. Changing the protocol checksums in the same way is harder and in the works. ALTQ has been with us for more than a decade - last but not least, Kenjiro Cho and I merged it with pf in 2003. ALTQ has always been a research project, and taught us and the entire community a lot of important lessons. Now it is time to re-evaluate - the entire "glue" between the actual queueing disciplines (of which just two remain, prio and bandwidth shaping) is being redesigned and reimplemented.
Good morning. I'm going to talk about the new queueing in our network stack. And there is something new this year: I tried to keep it short, so we will have a chance to finish in time for lunch. So, what are queues?
Stating the obvious: temporary storage for packets. In the context of the network stack that's usually just a chain of packet headers, the mbufs. The classic way of processing this is just first in, first out.
How a packet goes through the stack: a packet arrives, the network card interrupts, and the interrupt handler takes the packets and places them on the IP input queue; this is assuming IPv4, of course. Later the stack dequeues the packets from the input queue and processes them. For the moment, assume the packet is to be forwarded: some forwarding magic happens, we eventually end up in the output path, and there we have the second kind of queue to care about, the send queue on the interface, where the packet sits until the driver sends it out.
These are the two queues we really care about: the input queue on the input side and the send queues on the interfaces. There are a few more for tunnels and stuff, but they matter less. All these queues are ifqueues.
The ifqueue structure is simple: a pointer to the head of the queue, a pointer to the tail, the current length, the maximum length the queue is allowed to have, a counter for the drops, and some magic for congestion handling that is not relevant for this talk.
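The structure just described can be sketched in a few lines of C. This is a minimal userland model, not the kernel's code: `struct pkt` stands in for the mbuf, and all field and function names here are made up for illustration. It shows the head and tail pointers, the length with its maximum, and the drop counter working together in plain first in, first out order.

```c
#include <stddef.h>

/* Hypothetical stand-in for a packet; the kernel would use an mbuf. */
struct pkt {
	struct pkt *next;	/* next packet in the queue */
	int id;			/* placeholder payload for the example */
};

/* Field names are illustrative, not the kernel's. */
struct fifo_queue {
	struct pkt *head;	/* first packet to dequeue */
	struct pkt *tail;	/* last packet enqueued */
	int len;		/* current number of packets */
	int maxlen;		/* maximum length the queue may reach */
	int drops;		/* packets rejected because the queue was full */
};

/* Append at the tail; returns 0 on success, -1 if the packet was dropped. */
int
fifo_enqueue(struct fifo_queue *q, struct pkt *p)
{
	if (q->len >= q->maxlen) {
		q->drops++;
		return -1;
	}
	p->next = NULL;
	if (q->tail == NULL)
		q->head = p;
	else
		q->tail->next = p;
	q->tail = p;
	q->len++;
	return 0;
}

/* Remove from the head; returns NULL if the queue is empty. */
struct pkt *
fifo_dequeue(struct fifo_queue *q)
{
	struct pkt *p = q->head;

	if (p == NULL)
		return NULL;
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	p->next = NULL;
	q->len--;
	return p;
}
```

Enqueue and dequeue are both constant time, which is why a plain queue like this is so cheap.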
As already mentioned, the classic way of processing is first in, first out. There are two other methods that matter. The first is priority queueing. The idea there is to get lower latency for important packets; what is important depends on some kind of classification, of course. All priority queueing really does is change the order in which we process the packets in the queue. Of course this is only really effective when your machine is under load or even overloaded, because otherwise you drain the queue fast anyway and the latency changes only slightly. It becomes really important when overloaded, because then the higher priority packets get processed and the low priority packets are the ones that get processed late or actually dropped. The other part that people seem to care about is bandwidth shaping. There you have to do a lot more: you actually have to measure the bandwidth taken up by a certain class of packets and delay sending those out if you're exceeding the configured rate. The delaying makes things a little more complicated.
Classification, which I keep talking about, is basically the decision how a packet should get queued and how it should be handled. That is completely independent from the actual queueing, and this is a very common misunderstanding. Classifying just marks the packets, so you can classify in a completely different spot from where the actual queueing happens. The classification just marks the packet with the queue ID, which sits in the packet header, that is, in the mbuf packet header. Of course you could use an external classifier; I prefer to do that through pf, and, well, I might be biased.
The actual prioritization or bandwidth shaping happens when we enqueue and dequeue, and most of it happens at dequeue time. At enqueue time all we have to do is put the packet into the right queue. At dequeue time we have to pick from the right queue, for priority queueing, and at the right time, for bandwidth shaping. And this obviously can only happen at the queues that are there.
Prioritization is really useful on any queue, since all it does is affect the ordering. Bandwidth shaping is really only useful on the outgoing queues. Why would you want bandwidth shaping on the incoming queues? That just doesn't make sense: there is another queue right behind it, so the timing changes again afterwards. The other point is that we only know the outgoing interface once we are at its queue; on the input side we have no idea yet where the packet is going.
Priority queueing: as I told you, it's only really useful under load, when the machine is pretty much overloaded. But once you are in these overload situations you really, really want priority queueing, because you don't want to lose important packets. So what is important? If you run CARP, you certainly do not want to lose the CARP announcements. If someone is doing so much downloading that your line or your machine is saturated and the CARP packets get dropped, the other node does not see your announcements, thinks you're dead and takes over. But since you're actually not dead, you still think that you are master. Then you have a master-master situation, where both machines think they are master, and it's kind of obvious that this breaks things. So you don't want that. The same is true for spanning tree: you don't want to lose those announcements. The other interesting case is that you don't want your administrative ssh sessions to suffer from some user doing big downloads.
Priority queueing really is everywhere. The VLAN header has a priority field, and, surprise, priority queues are pretty much everywhere: each and every switch has 4 or 8 queues per interface, and they can have a very simple, very limited classifier. Many of the better Ethernet cards also have multiple send and receive queues, and it would be nice to be able to use that.
What we have now for priority queueing and bandwidth shaping is ALTQ. ALTQ was a research project, started some 12, 13, 14 years ago, and it was developed outside the OpenBSD tree, outside any BSD tree actually. So it had to be done in the least intrusive way, which led to some design decisions that you wouldn't make otherwise. The primary purpose of ALTQ was researching schedulers. Back then this was all very shiny and nobody really knew what they were doing, especially with bandwidth shaping; something like 10 or 15 different schedulers were tried. This is the reason for the ALTQ design as it is: it's a framework where schedulers can be plugged in, so there are tons of indirections and abstractions. Of course that comes at a price: the code is more complex, and performance really isn't all that good.
How ALTQ works: it replaces struct ifqueue with a structure called ifaltq, which starts with the same layout, and that makes the work a little easier. Of course you need new enqueue and dequeue functions that do something instead of just processing the packets right away. And unfortunately there has to be a separate set of queue macros. Is anybody here familiar with the IF_ queue macros? Yes? They are horrible as they are, and even before ALTQ there were two different versions, the IF_ variety and the IFQ_ variety, and ALTQ added a third variant years ago. Being able to clean that mess up was one of the driving forces behind replacing ALTQ.
ALTQ must be explicitly enabled per interface. Once you enable ALTQ, the enqueue and dequeue functions are replaced; those are function pointers in ifaltq. As already mentioned, the classification has been done by pf since 2003, when Kenjiro Cho and myself merged ALTQ into pf. As for the schedulers, we have the priority scheduler and the two bandwidth shapers, CBQ and HFSC; initially only CBQ, HFSC was added later.
ALTQ has some problems. The separate ifaltq structure has the drawback that each and every network driver has to be modified to support ALTQ. To be fair, this would have had to happen even without the ifqueue change, because the old drivers were just not designed for delayed sending. The input side and all the other queues, for tunneling and that kind of stuff, are not ALTQ capable, because they use the old ifqueue. The configuration got drastically simplified by the pf merge, but it is still pretty complex, and especially for HFSC nobody understands it. That pretty much means that we, first and foremost myself, got that wrong 10 years ago: the pf configuration is drastically simpler than the old separate ALTQ configuration that was there before, but it still is very complex. And there is way too much overhead: forwarding performance suffers just from enabling ALTQ on an interface. That is the cost of all those indirections and abstractions; there is just way too much extra code running. ALTQ itself is about 9 thousand lines of code.
So I started by replacing the priority queueing. The goal for the new priority queueing was for it to be super, super simple, and the overhead should be so low that you can just enable it unconditionally, running all the time. That is actually the current status; that has been reached. How do we do this? We modify struct ifqueue, the enqueue and dequeue macros, and add a handful of helper functions, not more. So it's very small and self-contained. Instead of the one head and tail pointer pair we had in the ifqueue structure, we use an array of 8 of them. The number is not configurable, on purpose. At enqueue time you take the priority value from the packet header, and the dequeue walks the array in priority order.
That's the new ifqueue. All we really did is take the head and tail pointers and make an array of them; the rest remains unchanged. That's the enqueue macro; don't bother with the details. The entire change really is that instead of accessing the head and tail pointers directly, we access the array and just take the priority value from the packet header. So this is really simple and dirt cheap. The locking is very simple for the moment, so we get away with that. It will change eventually, but in other areas first; if somebody wants to help, be my guest, a lot can be done there. The dequeue is slightly harder, but again, the entire change is that we access the array instead of the head pointer directly and do a little loop, starting at the highest priority and going down.
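The array-of-queues idea can be sketched as follows. This is a minimal userland model under stated assumptions, not the actual macros: the names `prio_pkt`, `prio_queue`, `prio_enqueue` and `prio_dequeue` are made up for illustration, and the sketch assumes that a higher index means a higher priority.

```c
#include <stddef.h>

#define NPRIO 8	/* fixed number of priority levels, as in the talk */

/* Hypothetical stand-in for a packet with its classifier-assigned priority. */
struct prio_pkt {
	struct prio_pkt *next;
	unsigned int prio;	/* 0 .. NPRIO-1; assumed: higher is more important */
};

struct prio_queue {
	struct {
		struct prio_pkt *head;
		struct prio_pkt *tail;
	} q[NPRIO];		/* one head/tail pair per priority level */
	int len;		/* total packets across all levels */
	int maxlen;
	int drops;
};

/* Enqueue: pick the slot from the packet's priority, then append as usual. */
int
prio_enqueue(struct prio_queue *pq, struct prio_pkt *p)
{
	unsigned int prio = p->prio < NPRIO ? p->prio : NPRIO - 1;

	if (pq->len >= pq->maxlen) {
		pq->drops++;
		return -1;
	}
	p->next = NULL;
	if (pq->q[prio].tail == NULL)
		pq->q[prio].head = p;
	else
		pq->q[prio].tail->next = p;
	pq->q[prio].tail = p;
	pq->len++;
	return 0;
}

/* Dequeue: a little loop from the highest priority slot downwards. */
struct prio_pkt *
prio_dequeue(struct prio_queue *pq)
{
	for (int prio = NPRIO - 1; prio >= 0; prio--) {
		struct prio_pkt *p = pq->q[prio].head;

		if (p == NULL)
			continue;
		pq->q[prio].head = p->next;
		if (p->next == NULL)
			pq->q[prio].tail = NULL;
		p->next = NULL;
		pq->len--;
		return p;
	}
	return NULL;
}
```

Within one priority level the order stays first in, first out; only the order between levels changes, and the loop is bounded by the fixed number of levels.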
There is no configuration possible, on purpose, and none is necessary. What you do need is classification, and that is the main point of the examples here: in pf you always do set prio with one of those priority values. The old way in ALTQ, using queue names for priority queueing, seemed to be a good idea back then, but it has turned out that it was not a very good idea. The other way the priority gets set: on VLAN interfaces we inherit the priority from the VLAN header. So if you already classified somewhere else and the priority is in the VLAN header, we retain that; if you want to reset it, you explicitly have to do so. And the stack prioritizes CARP, spanning tree and a couple of other things by default. There is nothing you have to do about that; it just works.
But you still need bandwidth shaping. That, unfortunately, is harder. We only need one scheduler; there is no point in having multiple. HFSC is the most flexible one, and it also used to be the hardest one to use. But CBQ can entirely be expressed in HFSC, so there really is no point in a separate CBQ. The configuration is the challenge: it was way too hard, and I haven't really seen any HFSC setup in the wild, apparently because it's too hard to use. On the other hand, CBQ is used a lot, so that is simple enough. But HFSC is the better scheduler in many, many ways. So the challenge is the configuration.
HFSC stands for Hierarchical Fair Service Curve. A service curve consists of a bandwidth called m2, a burst time d in milliseconds, and a burst bandwidth called m1. For the first d ms the queue gets the bandwidth m1 assigned, call this the burst, and afterwards m2. You don't actually have to specify the burst time and bandwidth; then there is no initial burst. The burst can go in either direction; the other direction doesn't make much sense, but it's possible.
An HFSC queue consists of three of those service curves. One controls the minimum assigned bandwidth. As long as you make sure that the sum of the minimum bandwidths specified for all queues does not exceed the interface bandwidth, that is guaranteed bandwidth, for example to make sure that your VoIP traffic doesn't go under the level where you can still hear the person you are talking to. The second one is the target: that's what the scheduler tries to give you, within the limits of the minimum and, third one, the maximum. So we have limits and the target, and you never exceed the maximum. What this also implies is that there always is borrowing going on: in HFSC a queue can always borrow bandwidth from the parent queue if that has bandwidth available, within the set limits.
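The service curve described above is a two-piece linear function, and its arithmetic can be sketched as follows. This is an illustration only, not the HFSC engine: the struct and the functions `sc_bits_by` and `minimums_fit` are hypothetical names, with bandwidths assumed to be in bits per second and times in milliseconds.

```c
#include <stddef.h>
#include <stdint.h>

/* One service curve: m1 for the first d milliseconds, m2 afterwards. */
struct service_curve {
	uint64_t m1;	/* burst bandwidth in bits/s */
	uint64_t d;	/* burst time in milliseconds; 0 means no burst */
	uint64_t m2;	/* bandwidth in bits/s after the burst */
};

/* Bits the curve allows within the first t_ms milliseconds. */
uint64_t
sc_bits_by(const struct service_curve *sc, uint64_t t_ms)
{
	if (t_ms <= sc->d)
		return sc->m1 * t_ms / 1000;
	return (sc->m1 * sc->d + sc->m2 * (t_ms - sc->d)) / 1000;
}

/*
 * The minimum bandwidths are only a guarantee if their sum does not
 * exceed the interface bandwidth. Returns 1 if they fit, 0 otherwise.
 */
int
minimums_fit(const uint64_t *min_bw, size_t n, uint64_t if_bw)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += min_bw[i];
	return sum <= if_bw;
}
```

With m1 = 10 Mbit/s, d = 100 ms and m2 = 1 Mbit/s, the curve allows 1,000,000 bits in the first 100 ms and another 1,000,000 bits over the following second.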
Just like CBQ, HFSC is a tree of queues, so every queue has a parent, and as just explained, there is always borrowing going on.
So, the plan. We want to use the existing core HFSC engine; I'm not going to touch this for now, because the task of replacing the rest of ALTQ is big enough in itself. There are some things that I'd like to do in HFSC, but that really is a completely separate job that can be done later. And we want to remove all the ALTQ glue, because that is where a lot of performance is lost. That leaves us three big parts that we have to do. There is pfctl at the top, in userland, to configure and set up everything; that has to be written for the new configuration language. Then there is the part in pf, in the kernel, actually setting up the queues and so on. And the third part is hooking the actual engine into ifqueue.
Hooking that into ifqueue is actually kind of easy. All we do is add another two pointers to our ifqueue structure. One points to the HFSC struct, which contains all the information that HFSC needs. The other one is for the part that is actually in charge of taking the packets off the queue at the right time, the token bucket regulator. That is separate because there is an idea to be able to use it on its own, to just limit an interface to less than its line rate. I keep thinking that you should not be required to use queues for that; it should be something like an ifconfig setting on the interface. But again, that's the future.
The token bucket regulator really is the one that takes the packets off the queue and controls when they go out. It doesn't do any bandwidth management or stuff like that; that is HFSC's job. HFSC basically tells the token bucket regulator what to do. The enqueue and dequeue functions just look at the pointer to the HFSC struct. If it's there, HFSC is active and we call the HFSC functions. If it's not there, if it's NULL, we just enqueue and dequeue as usual with our old scary macros.
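The pointer check described above can be sketched like this. It is a minimal model, not the real hook: `struct shaper` is a hypothetical stand-in for the HFSC state, and its `hold` flag merely simulates the shaper deciding that it is not yet time to send.

```c
#include <stddef.h>

struct spkt {
	struct spkt *next;
};

/* Hypothetical stand-in for the shaper state hanging off the queue. */
struct shaper {
	int hold;	/* nonzero: not yet time to send the next packet */
};

struct sendq {
	struct spkt *head;
	struct spkt *tail;
	struct shaper *shaper;	/* NULL when no shaping is configured */
};

/* The ordinary path: take the packet at the head. */
struct spkt *
plain_dequeue(struct sendq *q)
{
	struct spkt *p = q->head;

	if (p == NULL)
		return NULL;
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	p->next = NULL;
	return p;
}

/* The shaper path: only hand out a packet when the shaper allows it. */
struct spkt *
shaper_dequeue(struct sendq *q)
{
	if (q->shaper->hold)
		return NULL;
	return plain_dequeue(q);
}

/* The hook: one pointer check decides which path runs. */
struct spkt *
sendq_dequeue(struct sendq *q)
{
	if (q->shaper != NULL)
		return shaper_dequeue(q);
	return plain_dequeue(q);
}
```

Interfaces without shaping pay for one NULL check and nothing else, which fits the goal of keeping the common path cheap.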
The actual work, as I mentioned, the real challenge, is getting the configuration language right. That's really hard. The first attempt, the one we did in 2003, is an example of failure: as I said, I didn't find any HFSC setups in the wild, because it is way too hard to use. HFSC is very powerful, but in the common case all that people want is some kind of fair sharing of bandwidth between some sort of resources. So the common cases should be really, really easy and straightforward; you shouldn't have to bother with service curves and burst times, m1s and m2s. The more complex ones should still be reasonably easy. And as always, I want the rules to read like English: if you speak English and read over the pf rules, you should be able to understand them even if you have never touched pf before, as long as you have some kind of network knowledge.
how does that work and by me sitting in my case it doesn't very very important part of the of the is of found the other developers but the the vital work of paper of the dish of Japan but the slides communication that's 1 small wants I'm not sure what they mean but the presentation for you but that's just in the
So we still want to have the queue setup and the classification in pf, and the classification remains mostly as before: you just set the queue name. If that queue actually exists on the interface the packet ends up on, the packet gets queued there. If it does not exist, we just use the default queue, or if there is no queueing on that interface, we ignore it.
The simple case now; this is the new configuration language, and that took quite some time. This queue does not have a parent, so that's the root queue on the interface, with a given bandwidth. We will probably inherit the bandwidth that the interface is running at, but this is not decided yet. The child queues explicitly name their parent, and the bandwidth given there is the target bandwidth. It's a tree, so a queue can be a child of another queue which is a child of the root queue. And there is one queue that you have to mark as the default queue: any packet coming in that doesn't have an explicit queue assignment goes to the default queue. So that's the simple case.
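The shape of such a queue definition can be sketched as a small data structure. This is an illustration only and says nothing about the real pf.conf syntax or the kernel structures: `struct queue_def` and both functions are made-up names, and the rules checked (one root, one default queue, parents that exist) are the ones stated in the talk.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one queue definition. */
struct queue_def {
	const char *name;
	int parent;		/* index of the parent in the array, -1 for the root */
	unsigned long long bw;	/* target bandwidth in bits/s */
	int is_default;		/* nonzero for the default queue */
};

/* A definition set is usable if it has one root, one default, sane parents. */
int
queue_tree_valid(const struct queue_def *q, size_t n)
{
	size_t roots = 0, defaults = 0;

	for (size_t i = 0; i < n; i++) {
		if (q[i].parent < 0)
			roots++;
		else if ((size_t)q[i].parent >= n || (size_t)q[i].parent == i)
			return 0;
		if (q[i].is_default)
			defaults++;
	}
	return roots == 1 && defaults == 1;
}

/* Depth below the root (root is 0); -1 if the parent chain loops. */
int
queue_depth(const struct queue_def *q, size_t n, size_t i)
{
	size_t depth = 0;

	while (q[i].parent >= 0) {
		if (++depth > n)
			return -1;
		i = (size_t)q[i].parent;
	}
	return (int)depth;
}
```

Naming the parent explicitly in each child is what makes the tree recoverable from a flat list of definitions.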
there is 1 about me that you I need in insert yes they the news that yes I know in view of all again I only got half of what is that deserve
That it is the last one here is coincidence; the default queue could be any of them. For now you actually have to specify a default queue.
In some cases you want to limit the length of the queue. Queueing packets for too long is not really a good idea. Whenever you reach the queue limit, packets get dropped. Dropping packets is a good thing, because that's the way to signal the sender to slow down. So instead of having a thousand packets queued, in most cases it's much better to drop early and tell the sender to back off. So you can control the maximum length.
And here we are adding a burst to the bandwidth. Really simple: the burst bandwidth applies for a couple of milliseconds.
For the minimum, again, you can just go with a minimum bandwidth, but you can also extend that: you can give the minimum and the maximum a burst as well, that kind of thing.
Yes? When can a queue burst again? That's the question I really didn't want to answer, because there is no quick answer. There is a fairly complex algorithm that decides when a queue is eligible for bursting again. Rule of thumb: if it has been running at, say, 80 per cent for a couple of seconds, it can burst again.
The queue assignment remains as simple as it always has been: just give the queue name.
And we are retaining the prioritization of TCP ACKs and lowdelay packets. Internally, the queue name maps to a unique ID.
When we hit the outgoing interface, we check whether a queue with that ID is there. If yes, we use it; if not, we go to the default queue. Another question? Yeah, you can, but you don't have to. OK. Setting up the queues has to happen on a per-interface basis.
The assignment, the classification, is independent from that. At classification time we don't really care whether we have that queue on the outgoing interface. As long as you make sure that the desired queue has the same name on the different interfaces, it's fine; otherwise we go to the default queue.
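The name-to-ID mapping and the fallback to the default queue can be sketched as follows. This is a minimal model under stated assumptions: the fixed-size table, the struct names and the functions `name_to_id` and `select_queue` are all made up for illustration and are not pf's actual interfaces.

```c
#include <stddef.h>
#include <string.h>

#define MAXNAMES 16

/* Hypothetical global table mapping queue names to small integer IDs. */
struct name_table {
	const char *names[MAXNAMES];
	size_t n;
};

/* Return the ID for a name, allocating one if new; 0 if the table is full. */
unsigned int
name_to_id(struct name_table *t, const char *name)
{
	for (size_t i = 0; i < t->n; i++)
		if (strcmp(t->names[i], name) == 0)
			return (unsigned int)i + 1;
	if (t->n == MAXNAMES)
		return 0;
	t->names[t->n++] = name;
	return (unsigned int)t->n;
}

/* The queues configured on one interface, identified by ID. */
struct if_queues {
	const unsigned int *ids;
	size_t n;
	unsigned int default_id;
};

/* At the outgoing interface: use the wanted queue if it exists there. */
unsigned int
select_queue(const struct if_queues *ifq, unsigned int wanted)
{
	for (size_t i = 0; i < ifq->n; i++)
		if (ifq->ids[i] == wanted)
			return wanted;
	return ifq->default_id;
}
```

Because the same name always yields the same ID, the classifier can mark a packet before the outgoing interface is known, and each interface resolves the mark on its own.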
So can you have queues with the same name on several interfaces? Yes, you can, but you don't have to.
If you don't define queues, there is still queueing going on, through the priority queueing, which is always on. So it's not strictly FIFO any more. But if you don't specify anything, the only effect is that the few things that are prioritized by default get preference: CARP and spanning tree. That's the default behaviour you get.
Now the TCP ACK and lowdelay prioritization. This goes back to Daniel Hartmeier, 10 years ago, noticing that when he maxed out the upload on his own line, his downloads became super slow. He looked into why and noticed that the TCP ACKs were delayed so much that the downloads slowed down, with the other side only sending data as the acknowledgements come in. So we want to prioritize TCP ACKs that do not have a payload, and the same is true for packets marked as lowdelay using the type of service field. The way it works is that you give two queues: everything goes to the first queue, except TCP ACKs that don't have any payload and packets marked as lowdelay, which go to the second queue. And that's not the Alps; the picture is actually from Canada.
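The decision between the two values, one for normal traffic and one for empty TCP ACKs and lowdelay packets, can be sketched like this. It is a simplified illustration, not pf's code: `struct pkt_info` and `pick_prio` are hypothetical, while the two constants carry the standard values of the TCP ACK flag bit and the lowdelay type of service bit.

```c
#include <stdint.h>

#define TCP_FLAG_ACK	0x10	/* ACK bit in the TCP flags byte */
#define TOS_LOWDELAY	0x10	/* lowdelay bit in the IP type of service field */

/* Hypothetical summary of the header fields the decision needs. */
struct pkt_info {
	int is_tcp;		/* nonzero for TCP packets */
	uint8_t tcp_flags;	/* TCP flags, valid only if is_tcp */
	uint16_t payload_len;	/* bytes of payload after the headers */
	uint8_t tos;		/* IP type of service field */
};

/*
 * Pick between two configured values: "fast" for lowdelay packets and
 * for TCP ACKs without payload, "normal" for everything else.
 */
unsigned int
pick_prio(const struct pkt_info *p, unsigned int normal, unsigned int fast)
{
	if (p->tos & TOS_LOWDELAY)
		return fast;
	if (p->is_tcp && (p->tcp_flags & TCP_FLAG_ACK) && p->payload_len == 0)
		return fast;
	return normal;
}
```

An ACK that carries data stays in the normal class; only the pure acknowledgements, which are small and keep the opposite direction flowing, get preferred.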
The status: the priority queueing is committed, it is in the tree, and it's even in 5.1. I don't remember when I committed this; I think it was after 5.0. There is going to be a slight change, so if you already use the priority queueing you may have to adjust your ruleset. The bandwidth shaper is work in progress. What I have is the pfctl part, the parser for the configuration language, which as I said is the big challenge, and I have all the low-level stuff in ifqueue. What I'm missing right now is just some glue between the two in the kernel. This is a fairly big diff as it is: it's four thousand lines, and I estimate there are about a thousand more missing. For some reason I seem to be incapable of making small diffs; I don't know why. What is also missing is the ability to watch the queues in action; the tools for that have to be adapted, and I really hope somebody else does that for me. The plan right now is to have everything ready for the 5.3 release. Yes, this is one year from now; this will take some time to get right.
Whining: whenever we change things, there is whining. This time we'll keep ALTQ around for a release or two. Unfortunately some keywords clash, so we will have to rename the clashing keywords for the existing ALTQ, and ALTQ users will have to modify their pf.conf slightly. The queue keyword is one example, the most prominent one. And as I mentioned, this is a big target: once we get rid of ALTQ we can finally clean up the mess in if.h and make it really nice. But, well, that's the future.
course that is an existing keyboard your 1st to cite yes yes of from the other all they all processes in the same same value what what's the the what did this how does that happen I miss the I am it's you 1 is Europe OK we all have to reconsider the inherited the Slovene and of and I I really missed that 1 exactly so yeah it's not yes so we have to change the Harrity now want 1 mapping we all laughed a little map a table or something OK but that the people of the Union not enough explorers ability of these and but thanks for pointing out the course a sentence the fossils the 1 make feasible it can happen in complex next next of new going to lighten up meeting his 2nd and simply click continued deciding on on what who suffer from all of OK
good so it's a simple mapping at least the I'll change that right away on this I didn't do that because the Colts lecture finished I I plan on doing you the not because and of getting getting dominant makes the past even even bigger than it already is that might look at the future when you that you only need to yes but telling it tell you news you're times little so can't do everything at once country everything I wanted it's a matter of time for others as you know I my because we want to get rid of the mass in the highest established and taking it taking part of what we already have to read all the ground it is easier than starting were completely like not on the future not yet the we look at that 1 thing after the other if really you know some would you That's right the hello actually be supported just as an example I'm in the all you and they on the graph a life the money what on the you home automation Japonica's were does make sense this but what the side effect of his wife I ask you is that of words endeavor using after right this is this is 1 of the advantages of this approach of the separate so struck attitude may cost that exactly the following right so this will just work so you all that's
of I can a cut here and the places that really matter this processed slightly different and more efficient I mean this takes off 1 but of course in the in the feature functions from say at Interop you do want you anyway so just modified that you'll no
more questions for you are shy welcome more
not strictly as in if you the and in I I don't in in a perfect world you'd probably get around with all kinds of shit because of world that the of what right you know there is like them so that the which is don't you is there a lot you can see some of the day That's exactly we use this us can be a candidate for buffalo to have this is this all that death of I'm done the