
The ultimate guide to HTTP resource prioritization


Formal Metadata

Title
The ultimate guide to HTTP resource prioritization
Subtitle
How to make sure your data arrives at the browser in the optimal order
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Come learn how browsers try to guess in what order web page resources should be loaded, and how servers use that information to often (accidentally) make your web page slower instead. We look at what resource prioritization is, how it's often implemented terribly in modern HTTP/2 stacks, and how we're trying to fix it in QUIC and HTTP/3. We use clear visualizations and images to help explain the nuances of this complex topic, and also muse a bit on whether prioritization actually has that large an impact on web performance.

HTTP/2 started the move from multiple parallel TCP connections to a single underlying pipe, and QUIC and HTTP/3 continue that trend. While this reduces connection overhead and lets congestion controllers do their work, it also means we no longer send data in a truly parallel fashion. As such, we need to be careful about how exactly we send our resource data, as some files are more important than others for good web performance. To help regulate this, HTTP/2 introduced a complex prioritization mechanism. Browsers use complex heuristics to estimate the importance of a resource and, with varying success, communicate their preferences to the servers.

It has become clear, however, that this scheme does not work well in practice. Between server implementation bugs, questionable browser choices, and bufferbloat in caches and network setups, HTTP/2 prioritization is sometimes more a liability than a useful feature. For this reason, the feature is being completely reworked in HTTP/3 over QUIC. That, however, opens a whole new can of worms. One of QUIC's main features for improving performance over TCP is that it removes "head-of-line blocking": if one resource suffers packet loss, others can still make progress. That is... if there are other resources in progress! What performs well on lossy links turns out to be exactly what you want to prevent on high-speed connections.

Along the way, we also discuss existing options for web developers to influence the browser's heuristics and server behaviour, such as resource hints (e.g., preload) and the upcoming priority hints. Finally, we ask how we got into this terrible state of things to begin with: if people made so many mistakes implementing HTTP/2 prioritization, why didn't anyone really notice until three years later? Could it be that its impact on web performance is actually limited? Or have we just not seen its full potential yet? We make this complex topic approachable with plenty of visualizations and animations. The content is mainly based on our own research (and papers) and that of others in the web community, such as Patrick Meenan and Andy Davies.
Transcript (English, auto-generated)
All right, so: HTTP resource prioritization. I'm going to be honest, it's actually quite a boring topic. And so for today, I decided that I'm going to talk about something completely different, something I think you and I will enjoy much more, which is, of course, food. I love this stuff. I eat it at least once a day, typically with my family, and over time, I've noticed that we tend to eat food in slightly different ways. For example, if this is the meal, what I like to do is keep the best for last. I typically eat the broccoli first, because I don't really like it, and I like to finish strong with the fries. My girlfriend thinks that's a bit stupid. She thinks that by then the fries are all cold and soggy, so she switches it around. And as you can see, in my household we often have a lot of leftover broccoli. My sister is much more strict: she typically devours something in its entirety before moving on to her next victim. It's easy to see why she decided to become a lawyer. And finally my dad, bless him, in his old age he simply doesn't care anymore. He just goes around the plate, picking things up as he goes. Now, you know the old saying: if you turn your hobby into a job, you'll never have fun again.
And that's what I ended up doing, because the way I see it, loading a web page actually looks a lot like eating a meal, at least with the modern protocols. In the old style, HTTP/1, you would open multiple parallel TCP connections, which, in my analogy, would be the same as everyone having six mouths eating at the same time, which is, of course, insanity. Luckily, with HTTP/2 over TCP and the upcoming HTTP/3 over QUIC, we can move to a single underlying connection, which is much more sane. This also means, of course, that we now have to start multiplexing our resources onto this one connection. Sounds simple enough; it's actually quite complex in practice. And I've found four problems with it that I'll be talking about today.

The first problem is: which of the possible options is actually the best? Because, like with the food, we have many different options here. We can send everything back to back, sequentially. We can do some kind of round-robin scheme, splitting the bandwidth between resources. We can even combine these things. For the food, it didn't really matter; it was mostly personal preference. But when loading a web page, this order really matters for how the web page renders and, in turn, for the user experience. So which of these options is best for web performance?
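To make the two basic strategies concrete, here is a minimal sketch in TypeScript of a server scheduling chunks of several resources onto one connection, either sequentially or round-robin. The resource names and sizes are invented for the example; no real server works exactly like this.

```typescript
interface Resource { name: string; bytesLeft: number; }

const CHUNK = 1000; // bytes sent per scheduling turn (arbitrary)

// Sequential: finish each resource entirely before starting the next.
function* sequential(resources: Resource[]): Generator<string> {
  for (const r of resources) {
    while (r.bytesLeft > 0) {
      r.bytesLeft -= Math.min(CHUNK, r.bytesLeft);
      yield r.name;
    }
  }
}

// Round-robin: give every unfinished resource one chunk per turn.
function* roundRobin(resources: Resource[]): Generator<string> {
  let pending = resources.slice();
  while (pending.length > 0) {
    for (const r of pending) {
      r.bytesLeft -= Math.min(CHUNK, r.bytesLeft);
      yield r.name;
    }
    pending = pending.filter(r => r.bytesLeft > 0);
  }
}

// Hypothetical page: the order of the emitted chunks is what differs.
const page = () => [
  { name: "app.js",    bytesLeft: 3000 },
  { name: "style.css", bytesLeft: 2000 },
  { name: "hero.jpg",  bytesLeft: 4000 },
];
console.log([...sequential(page())].join(" "));
// app.js app.js app.js style.css style.css hero.jpg hero.jpg hero.jpg hero.jpg
console.log([...roundRobin(page())].join(" "));
// app.js style.css hero.jpg app.js style.css hero.jpg app.js hero.jpg hero.jpg
```

The total transfer time is identical in both cases; what changes is when each individual resource completes, and that is exactly what prioritization is about.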
Let's try to deduce that using a very simple example. In this web page, the two top resources, the JavaScript and the CSS, are render-blocking. This means they have to be downloaded in full and executed before you can actually render the rest of the HTML page. So it makes sense that we fetch them first, sequentially, back to back. The next two resources, as you can see, are progressive JPEGs. What does that mean? You typically have two ways of encoding a JPEG image. The first one is the typical one: it just loads from top to bottom. The second one, the progressive one, is actually much smarter: it encodes the JPEG in different quality layers, meaning that even if you have just a little bit of the JPEG available, you can already render a low-quality placeholder up front. This means that if you know you're going to have progressive JPEGs, it makes sense to round-robin them, because you can already start rendering all of them in some way or another. On the other hand, if you don't have progressive JPEGs, you can make an argument that it's better to load them sequentially.

You can say something very similar about the HTML as well. If those other things are render-blocking, then why would I send the body down before the CSS? You can't render it anyway. A counterargument would be that even though the browser doesn't render it, it already parses the HTML, discovering new resources, like the images, which can then already be requested. A counter-counterargument is that that doesn't really matter, because the images are blocked behind the CSS and JavaScript anyway; and if your HTML ends up being quite large, you will block everything behind it while rendering nothing to the screen. So it's not immediately evident what the best approach is.

And we can make it worse. Let's add a new resource: an asynchronous JavaScript file at the bottom. This means it's not render-blocking, but it should still be executed as soon as possible. "As soon as possible" JavaScript: you might think, OK, high-priority resource, load this before the images. You could also say: I think the developer put this at the bottom of the page for a reason; it's not high priority; I'm actually going to load it after the images. And then the final cherry on top comes when we finally download our first JavaScript file. We execute it, and what do we find it does? It requests a completely new resource that we didn't even know about: a JSON file. Now, if this is one of those fancy new React or Vue client-side rendered apps, this JSON is probably going to contain the main content of your web page, so it's going to be high priority. On the other hand, if it's not, it might just as well contain some random comments on your blog post that nobody's going to see anyway, so you might actually leave it until the end. And I could keep going, but I think everyone by now understands the gist of this, and that is: a browser simply does not know. It doesn't know how large a resource is going to be.
It doesn't know what the resource is going to end up doing. It doesn't know what kind of sub-resources it's going to require. So the only thing the browser can do is guess, from coarse signals: things like the type of file, its position in the document, and attributes like async, defer and preload, which I'll come back to later on. So what actually happens is that the browser constructs what they call a heuristic: a guess of what is going to be most important.
If we compare the browsers' heuristics, there's a bit of difference, but also some agreement. Most of the browsers think HTML is quite important; so are JavaScript and CSS. But there are differences of opinion about how important fonts are, and especially for fetch there's definitely some disagreement. One very important outlier is the one on the edge there, exactly: the old Edge browser, before they moved to Chromium, which actually failed to specify prioritization at all, at least for HTTP/2.

Now that you've seen this, can you try to predict which of these heuristics is actually going to work best for web pages? Maybe you can. Let's see if you were correct with an example. Here we're going to load the same web page in the different browsers and see which one finishes first. It's Chrome. Then we have Firefox close behind. Then Safari. And now we can wait ten more seconds for Edge to complete. I'm going to play it again, and I want you to focus on the differences, especially between Firefox and Chrome, and how they load fonts and images; the difference is very stark if you focus on fonts and images.

So it's quite different, and I don't know if you would have been able to predict this; I would have had a tough time with it. That's actually because I was kind of lying. I was being a naughty boy, because I only showed you one half of the equation. It's not just the heuristics of the browsers that are different; it's also how they want them to be enforced on the wire. For example, Chrome really likes everything to be downloaded in fully sequential order. That's different from, say, Safari, which does a weighted round-robin scheme: HTML is more important, so it gets more bandwidth, but everything else gets at least a little bit of bandwidth until it's HTML's turn again. Firefox does something a little more complex:
it tries to give more priority to the more important resources, obviously, so the images are left until a bit later. And at the bottom we can see what happens for Edge. It didn't specify anything, so it falls back to the default in HTTP/2, which is fully fair round-robin behavior. And we saw that that was clearly the worst of them all.

But why was that? Well, if you remember, the orange and the green resources were the JavaScript and CSS that were render-blocking. You can see that all the way at the end there, we're still downloading those resources. And as I said, we have to download these in full to be able to execute them and make them usable. If you start round-robining between this kind of resource, you're actually delaying the moment that happens; it comes much earlier if you send them sequentially. This is, of course, very un-nuanced. Modern browsers are much smarter than this. Some of them, for example, have streaming script compilers, which can already parse and compile JavaScript as it comes in. So it's not as bad as you might think, but you still need to wait for that final chunk to come in before you can use it.

Then you might wonder: why are we using round-robin at all? Well, there are some cases in which it's actually quite OK. For example, if you have resources with very big differences in file size. In this case, you have a very large resource that is holding up much smaller resources in a sequential scheme. If you instead do this with a more weighted round-robin, like Safari does, you actually get those smaller resources downloaded quite a lot faster, so they can be used, and it doesn't really delay the big resource all that much in the grand scheme of things.
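A weighted round-robin of that kind can be sketched as follows. The weights and sizes are made up, and this models only the core idea that each unfinished resource gets bandwidth in proportion to its weight; it is not Safari's actual code.

```typescript
interface Weighted { name: string; bytesLeft: number; weight: number; }

// One scheduling round: split `budget` bytes over the unfinished
// resources in proportion to their weights.
function sendRound(resources: Weighted[], budget: number): void {
  const active = resources.filter(r => r.bytesLeft > 0);
  const totalWeight = active.reduce((sum, r) => sum + r.weight, 0);
  for (const r of active) {
    const share = Math.round(budget * (r.weight / totalWeight));
    const sent = Math.min(share, r.bytesLeft);
    r.bytesLeft -= sent;
    console.log(`${r.name}: sent ${sent} bytes`);
  }
}

// Hypothetical page: HTML weighted far above everything else.
const resources: Weighted[] = [
  { name: "index.html", bytesLeft: 20_000, weight: 8 },
  { name: "app.js",     bytesLeft: 20_000, weight: 2 },
  { name: "hero.jpg",   bytesLeft: 20_000, weight: 1 },
];
sendRound(resources, 11_000);
```

Here index.html gets 8/11 of every round, but app.js and hero.jpg still make some progress each turn instead of waiting for it to finish.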
So, the question was: which of these browsers is best, for most web pages? We've now only looked at one example, and from this you can see that maybe there are pages that behave differently in different browsers. So what we really need to do, to do good science, is look at many more different web pages. The problem is that I know of only about five or six studies that have actually done this at scale, two of which are my own. So let's look at a couple of them.

In this graph, each of the lines is a different prioritization mechanism, a different browser if you will. And you see there's actually not that much difference between them. The only one that's really apart from the rest is indeed what Edge is doing, the completely fair round-robin, and that was only on quite bad cellular networks; if we do this on much faster cable networks, for example, the differences become even smaller. Then we did a second study looking a bit more at the theoretical side, taking the specific network conditions out a little and seeing what would happen in an idealized situation. At least for our data set, we could see that Chrome is clearly the best, but also that Safari, in some cases, will be better than Firefox. That's our data.

The third study was from Cloudflare. They're a big CDN, and they've implemented their own prioritization scheme at the server. You could say they combined what Chrome is doing for the higher-priority stuff with what Firefox is doing for the other stuff. It's difficult to really draw conclusions from this, because they don't publish any papers or data sets, but there's one very good quote in their blog post on this: they say it's about 50% faster than what Edge and Safari are doing. So that means the sequential behavior should be better than round-robin in general. We start to see a kind of trend: everyone says Chrome is quite good.

Of course, the next two studies completely contradict this. These are from Google itself, the people who make Chrome, right? The first study says: we compared this to a random scheduler, no logic at all, and we only got faster results for about 31% of the web pages. Then the second study, quite recently, compared this to the fair round-robin scheme. Where we found differences of 50% or more, they find only about a 2.7% difference. And it gets so much weirder, because they also compared this to LIFO, last in, first out: the last requested resource gets sent to the browser first, and they still only see a 3.1% benefit. Right? When I first saw these things, I thought: these guys have made a mistake. This can't be true. They used bad websites, they have bad setups, I don't know what. However, I know some of these people. They're very smart; they really wouldn't make basic errors like that. So it's very difficult for me to just discard these results as nonsensical and say mine were correct.

That means that if you would ask me today which of the browsers is best, my answer would be: I don't know, and I don't think anybody knows. I think many people have opinions, but they don't really have proof. And I agree that's not a very satisfactory answer. What I think is happening is that, indeed, most of these schemes are quite good for most web pages, but we have a lot of huge outliers. Some pages are going to do really well in some browsers, but some are also going to do really, really badly if the browser gets the heuristics wrong. So like most things in web performance, it's going to depend on your very specific web page, and you're going to have to tweak for your use case.
Now, how can you do this in practice? The best thing to do is first verify that you actually have a problem. You can use the WebPageTest tool for this. It will generate you this nice waterfall, and since last year it shows, with the opaque bits there, exactly where the HTTP/2 data for each specific resource is coming in. You can also collapse this waterfall into what they call the connection view, below, which gives you a timeline like the visualizations I've been using in this talk and a real overview of how things are coming in. If these don't match up with what you were expecting, then maybe you have a problem.

There are some ways of dealing with that. You can start switching things around in the ordering of your web page. There are also some client-side features that you can use. We've already seen async; defer is similar, but also allows the JavaScript execution to be delayed. Those are only for JavaScript, though. You can also use the preload feature, which, as I explained before, means that if a resource would request a sub-resource somewhere down the line, you can tell the browser: I know I'm going to need this in the future, you can already start requesting it now. You don't need to wait for, say, the fetch API call. This also allows you to do nice things like loading CSS in a non-blocking way.
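For example, the non-blocking CSS trick works roughly like this: a minimal sketch using the standard DOM APIs, with a hypothetical stylesheet URL.

```typescript
// Fetch the stylesheet at preload priority without blocking rendering,
// then apply it once it has arrived by flipping rel to "stylesheet".
const link = document.createElement("link");
link.rel = "preload";
link.as = "style";
link.href = "/css/non-critical.css"; // hypothetical URL
link.onload = () => { link.rel = "stylesheet"; };
document.head.appendChild(link);
```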
Very cool; it has some problems. There's been a long-standing bug in Chrome where you can actually end up with a priority inversion, where the preloaded resource gets sent before the resource that actually requested it. Andy Davies has a very interesting blog post on that. The other problem is that preload is currently not supported in Firefox, and it's quite unclear whether it will become available any time soon.

The final feature, and I would say the most interesting one, is what they're calling priority hints. These will allow you to manipulate the heuristics on a per-resource basis, by saying something like importance high or importance low, so the browser knows it might have to tweak its guess for that specific resource. You can also set this using the fetch API. This is fantastic; I'm very excited about it. The problem is that this is only implemented in Chrome. I think it was implemented about three years ago; they tested it on several select web pages, but they haven't yet enabled it by default. It's still behind a flag, and it's very difficult to know if it's actually going to be enabled soon, or whether any of the other browsers are going to implement it. It's not looking like that at the moment.
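For reference, here is roughly what using the hints looked like at the time, behind Chrome's experimental flag. The attribute and fetch option were called "importance" in the proposal, but the exact names were still in flux, so treat this as a sketch rather than a stable API.

```typescript
// Proposed markup equivalent: <img src="hero.jpg" importance="high">
const img = document.createElement("img");
img.src = "/hero.jpg";                  // hypothetical resource
img.setAttribute("importance", "high"); // hint: treat this image as important
document.body.appendChild(img);

// Proposed fetch() variant: downgrade a request the heuristic would
// otherwise guess wrong. The cast is needed because the experimental
// option was not part of the standard RequestInit type.
fetch("/comments.json", { importance: "low" } as any);
```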
So maybe you're helped by all of this; I hope you are. But if you're not, there is still a big red panic button you can push to make everything better, which is using server-side overrides. Up until now, we've been talking about what the browser wants: this is what the browser wants to happen, but it is, of course, the server that has to send the data to the browser. So if the server thinks, what you're telling me isn't right, I know better than you, it can just ignore what the browser is telling it and stream the resources in the order it thinks is best. Sounds fantastic; it's actually very complex in practice. And to understand that, we have to move to our second problem of the day: how do we actually communicate what the browser wants to the server?

Again: sounds simple, can get complex. The easiest way of doing this, the original way, in Google's SPDY protocol, the precursor to what eventually became HTTP/2, was very simple. It just said: every resource gets one integer, a priority level or priority bucket, and the server just goes down these buckets in order of priority, serving the resources. Very simple to set up, and it works quite well in practice, but there are some problems. It's, for example, impossible to indicate in this scheme how you would like round-robining to happen, if at all.
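The bucket scheme itself fits in a few lines. This is an illustrative sketch only; SPDY's actual priority range and framing are not modelled here.

```typescript
// Strict priority buckets: always serve from the non-empty bucket with
// the highest priority (lowest number), in arrival order within it.
class BucketScheduler {
  private buckets = new Map<number, string[]>();

  enqueue(priority: number, resource: string): void {
    if (!this.buckets.has(priority)) this.buckets.set(priority, []);
    this.buckets.get(priority)!.push(resource);
  }

  next(): string | undefined {
    const levels = [...this.buckets.keys()].sort((a, b) => a - b);
    for (const level of levels) {
      const queue = this.buckets.get(level)!;
      if (queue.length > 0) return queue.shift();
    }
    return undefined; // nothing left to send
  }
}

const s = new BucketScheduler();
s.enqueue(2, "hero.jpg");
s.enqueue(0, "app.js");
console.log(s.next()); // "app.js" wins: lower bucket number goes first
```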
There's a second problem, which has to do with fairness. Now follow along with me, because this was about eight years ago, when people were very optimistic about everything using a single connection all the time. The use case was: we will have different clients connecting to one CDN node, and the CDN node in turn connects to the back-end server over one big, persistent HTTP/2 connection. And on this connection, we're not just going to multiplex the resources for each client individually, but across the clients: everything shares one big connection. If everybody plays nicely, that works well. But what happens if one of these clients decides to say: hey, everything I'm requesting is actually highest priority? If the origin just follows that, you get a very unfair situation, which is great for the misbehaving client but quite bad for the rest. And if you take this to the extreme, why wouldn't the other clients also start saying this? Which is good, you get fairness again, but also a completely useless prioritization mechanism.

You might be thinking this seems a bit of an esoteric issue, not really important in practice. But if you look at the history of all these things, which I've included there, you will see that this was one of the main reasons they moved from the very simple scheme to what eventually ended up in HTTP/2, which is quite a bit more complex. Because HTTP/2 says: we're going to do everything in a dependency tree. There's no longer just one integer; no, you have a specific place in a resource tree. If you're alone at your level, you get all the available bandwidth at that time; that's how you get sequential behavior. But if you have a sibling, two things at the same level, then you end up round-robining between them. You can also specify a weight for each of the resources to get a weighted round-robin, as you can see here. This is good, and it's also quite easy to communicate to the server: every time you make a request, you just say, hey, this is the parent for this request, and maybe this is the weight, and the server can deduce what the browser wants from that. It also very elegantly solves the whole fairness issue: the only thing the CDN has to do is put a new root node on top of everything and give everything equal weight, and we have a fair bandwidth share without having to mess with the individual clients' priorities. This all seems very sensible, right? That's, of course, the reason it's in the HTTP/2 spec.
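The core mechanics of the tree can be sketched like this: sequential behavior down the tree, weighted sharing between siblings. It's a toy model; real HTTP/2 trees also support exclusive insertion and re-prioritization, which are omitted here.

```typescript
interface TreeNode {
  name: string;
  weight: number;       // 1-256 in HTTP/2
  children: TreeNode[];
  done: boolean;        // has this resource finished downloading?
}

// Compute each node's share of the bandwidth. A node only passes
// bandwidth down to its children once it is done itself (sequential
// down the tree); siblings split their parent's share by weight.
function shares(node: TreeNode, share: number, out: Map<string, number>): void {
  if (!node.done) {
    out.set(node.name, share);
    return; // children wait until this node finishes
  }
  const total = node.children.reduce((s, c) => s + c.weight, 0);
  if (total === 0) return;
  for (const child of node.children) {
    shares(child, share * (child.weight / total), out);
  }
}

// The CDN fairness trick: a neutral, already-"done" root with
// equal-weight children splits bandwidth evenly between clients.
const root: TreeNode = {
  name: "root", weight: 1, done: true,
  children: [
    { name: "client-A.css", weight: 1, children: [], done: false },
    { name: "client-B.js",  weight: 1, children: [], done: false },
  ],
};
const out = new Map<string, number>();
shares(root, 1.0, out); // each client gets 0.5, whatever it asked for
```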
Now, did it actually pan out to work well in practice? Well, no. As it turns out, I don't know of a single CDN that ended up implementing this fairness scheme. None of the CDNs are actually using it in practice, even though it was one of the main reasons to switch to the dependency tree. And it gets worse, because it's not just the CDNs. If you look at what the browsers are doing, they also don't really use all this flexibility. Chrome just builds one very long sequential list. Safari and Edge just add everything as a sibling of the root; the only difference there is what kind of weights they apply. The only one actually using the tree to its full extent is Firefox, and as we've seen before, they don't necessarily get better performance out of it on every design.

If you think that's bad, let's look at the server side, because even today there are servers that don't implement this mechanism at all. Of the ones that claim to, Patrick Meenan and Andy Davies have done some good tests, and they find that only a very small subset actually do it properly; only a very few actually listen to what the browser is saying and serve the resources in the correct order. And the final nail in the coffin of this whole thing is that it's difficult to do server-side overrides, as I said before. Because remember, the use case is: I have, for example, one image that I want to give a higher or lower priority. How exactly am I going to do this? Which tree manipulation I end up having to do depends on the browser. It's possible, but I need to know the details of how each browser internally constructs its tree, and manipulate that. This is actually what Cloudflare does: Cloudflare tries to guess which browser is connecting to it based on the priority tree it sees, and then translates the resources into its own scheme, at least for non-Edge browsers, because Edge isn't sending any priority information, so there it has to use the MIME type to try and determine what is happening. So I'm not saying it's not possible; I'm saying you need a Cloudflare-level engineering team to manage this complexity. This is not something a normal developer can do, and it's normal that they have put this behind their commercial offering.

So I think we can conclude that this was the state of HTTP/2 prioritization around the time we started on HTTP/3. HTTP/3 is the new version of the protocol; it runs on top of QUIC, a new transport protocol next to TCP. And because of QUIC, it's very different: we already had to change quite a lot of things to make HTTP/3.
And the discussion then was: how about we also change the dependency tree setup? Why don't we use something simpler? There was a huge amount of discussion about this, literally for months and months. We ended up deciding: yes, we're going to remove the tree setup from HTTP/3 and switch back to something simpler. Now, this is only the current proposal; it's not the final spec, it's just the way we're thinking about it right now. The idea is to go back to something that looks a lot like what SPDY did, but with some key adjustments. The first is that, as you can see, a couple of these priority levels are now reserved for the server. This means that no matter what the browser ends up doing, the server is always able to do the server-side overrides that we couldn't really do in HTTP/2. The second new feature is that each resource can now get an incremental flag, indicating whether it should be round-robined on the wire or not. That's the main part of the spec.

A second aspect is that we're proposing to communicate this using a normal HTTP header. In HTTP/2, everything was communicated via HTTP/2's binary framing layer, so it wasn't all that visible: it was difficult to debug, and you didn't really see it in normal dev tools. The idea here is that if we just use a normal header, it's going to be much easier to view and much easier to debug. Maybe we can even expose it to JavaScript and let users manipulate it themselves. I personally think this is a very good idea; I'm very excited about it.
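The signal itself is tiny: something like "priority: u=3, i", where "u" is an urgency level and "i" marks the resource as incremental. Here's a toy parser for that shape. The syntax at the time of this talk was still under discussion, and the scheme that was eventually standardized uses the Structured Fields format, which a real parser must handle properly.

```typescript
interface PrioritySignal {
  urgency: number;      // 0 (most important) .. 7; 3 is the default
  incremental: boolean; // true => OK to round-robin this resource
}

// Parse a header value like "u=1, i" into its two components.
function parsePriority(value: string): PrioritySignal {
  const signal: PrioritySignal = { urgency: 3, incremental: false };
  for (const part of value.split(",").map(p => p.trim())) {
    if (part.startsWith("u=")) signal.urgency = Number(part.slice(2));
    else if (part === "i") signal.incremental = true;
  }
  return signal;
}

console.log(parsePriority("u=1, i")); // { urgency: 1, incremental: true }
```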
But like I said, there are many open questions about this concrete proposal. For example, say we have six different resources, all at the same priority level, but one of them needs to be sent sequentially and the rest incrementally. How do you, as a server, actually put this on the wire? There are many different options. You could first send two of the incremental ones and then switch to the sequential one, and then the rest. You could say the sequential one is probably more important (it's probably a blocking resource), so send that one first. You could send a little bit of all the incremental ones, so they can start rendering or processing or whatever, then the sequential one, then the rest. And you could keep going; it's not really clear what the best solution is. You're also no longer able to do what Safari wants with its weighted round-robin; the question is whether we really still need that in a simpler scheme or not. And this is just one of the questions; there are many more. There are many people who don't like the idea of using a normal HTTP header for this, and even the ones who do are not keen on exposing it to the JavaScript fetch API without any special safeguards. There's also still the fairness issue to talk about, right? So this is not literally what is going to be in HTTP/3, but I do think it's going to be a small evolution of this proposal: still something much simpler than we had in HTTP/2, easier to comprehend, easier to implement, and hopefully with fewer bugs, which is great. This means we're done, right?
End of talk. Thank you. However, there is, of course, more. We've only been looking at the HTTP layer so far, but that's not the only thing that influences this, because we're running on top of a transport layer as well. For example, there's a very nice issue here that Patrick Meenan talked about, in an almost two-hour talk nearly entirely on this one issue. The thing is, if your TCP buffers are too large, you can end up in a priority-inversion situation again, where you fill those buffers with low-priority resources and then there's no more room for the high-priority stuff that comes later, so things get delayed again. This means that even with a well-behaving client and a well-implemented back-end server, you can still have prioritization problems.

There's another problem with TCP, called head-of-line blocking. At the HTTP layer we know that we have different files on the same connection; TCP does not. TCP sees everything as a single, opaque byte stream. This means that if we have even a single packet loss in there, TCP has to hold everything back, waiting for that single packet to be retransmitted, before it can deliver packets three and four to the layer above, even though they are for unrelated resources.
This is actually the core thing that separates QUIC from TCP, because QUIC does know that there are different streams on the same connection. It doesn't know whether something is an HTML or a JavaScript file; it just knows these things are completely independent. This means that if QUIC suffers a single packet loss, it can just bubble packets three and four up to the browser, waiting only for the one lost packet to be retransmitted. This is why people sometimes say that QUIC removes TCP's head-of-line blocking, and so is better for performance. And I kind of agree, but there's a lot of nuance there, because if you look at this more closely, it only happens if you are round-robining your resources. You need to have multiple things in flight at the same time. If you're sending things sequentially, this benefit goes away, because of course you can't reorder things within the same file, and you end up with something that looks a lot more like the normal TCP behavior, right?
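A toy delivery model makes the difference concrete. Say packets 1 to 4 are sent and packet 2 is lost; the stream assignments below are invented for the example.

```typescript
interface Packet { seq: number; stream: string; }

// Packet 2 (on the style.css stream) was lost; 1, 3 and 4 arrived.
const arrived: Packet[] = [
  { seq: 1, stream: "style.css" },
  { seq: 3, stream: "app.js" },
  { seq: 4, stream: "app.js" },
];
const lost: Packet = { seq: 2, stream: "style.css" };

// TCP: one opaque byte stream, so nothing past the gap at seq 2 can
// be delivered to the application until 2 is retransmitted.
function tcpDeliver(packets: Packet[]): Packet[] {
  const out: Packet[] = [];
  let expected = 1;
  for (const p of [...packets].sort((a, b) => a.seq - b.seq)) {
    if (p.seq !== expected) break; // head-of-line blocked here
    out.push(p);
    expected++;
  }
  return out;
}

// QUIC-like: streams are ordered independently, so only the stream
// that lost a packet stalls, and only from the gap onwards.
function quicDeliver(packets: Packet[], gap: Packet): Packet[] {
  return packets.filter(p => p.stream !== gap.stream || p.seq < gap.seq);
}

console.log(tcpDeliver(arrived).map(p => p.seq));        // [1]
console.log(quicDeliver(arrived, lost).map(p => p.seq)); // [1, 3, 4]
```

TCP delivers only packet 1 until the retransmit arrives; the QUIC-like model stalls only the stream that actually lost data.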
So this could work out in practice, or it could not. The most interesting thing is that we've been saying the sequential behavior is probably better for web page performance, but on lossy networks it might make more sense to switch to something at least a little bit round-robin-like to get this benefit, right? So QUIC brings a whole lot of new challenges that reflect back up to what we want to happen at the HTTP layer. And this is just one example; there are many, many more. As chance would have it, I just finished a new paper on this; please go and read it, or come talk to me afterwards if you want to know more of the details. Because it's about time for me to start wrapping up, with the fourth and probably most important problem.
And here I say: there is no problem. Why am I saying that? Because, as you may have noticed, I've actually given you some very contradictory information. On the one hand, I've said Edge is 50% slower than what Chrome is doing, and that HTTP/2 server deployments are enormously broken; they do prioritization completely wrong; we have a big problem. On the other hand, you have the sources from Google saying: it doesn't really matter, it's only a 3% difference, who cares? And there are other things I've noticed over the years. I haven't actually seen a lot of developers complaining that, for example, "I've enabled HTTP/2 and suddenly my web page is so much slower on Edge." I haven't seen many of those posts. So it seems people are not actually noticing these things much in practice.

These are completely opposite viewpoints, and only one of them can be right. If it's the right one that is right, that means I've wasted two years of my life researching a non-issue. But if the left one is right, if I am right and Cloudflare is right, that actually means HTTP/2 has been broken for a long time, our websites are quite slow, and nobody has noticed. That's a very difficult thing to sell to a room full of web performance experts. So I've thought long and hard about how to reconcile these two things, and I'm still not there yet. But I do have some conjectures, some things I think are happening.

Personally, I think performance matters. Well, of course performance matters; and I think prioritization matters. But for most web pages, because not everything is using just the one connection yet, and because a lot of other aspects come into play, you don't really see it even when the prioritization goes wrong. You'll primarily see it on very complex pages, and if you test on slow networks.
And sadly, a lot of us are still not really testing on slow networks, let's be honest. The second thing is that even if you have a problem at the network layer, it's often not the bottleneck: if you're shipping five megabytes of JavaScript to a mobile device, it's probably going to be stuck on main-thread processing, so you might not even notice a problem at the network, even if it's suboptimal. It could also be that when something breaks in this prioritization stuff, it breaks very hard: it's very obvious, and people fix it quickly without thinking too much about it. Or, and I think this might be more likely, people have seen problems, but they've been unable to match them to the root cause, which is the prioritization mismatch that's happening, because people don't really know enough about how the system works internally. That's one of the reasons I wanted to do this talk: to hopefully bring a little more insight into this topic out into the world. One of the main reasons, I think, is that we have a very unhealthy browser ecosystem at the moment: so many people are just using Chrome. As we've seen, Chrome tends to do quite well in our tests; I believe it does prioritization the best of all the browsers. So it might be that we're just not noticing this because everybody is focusing on Chrome.

So we're not there yet, and I think we still have some issues to discover with regard to prioritization and what is actually going on at scale. What I do think, and hope, is that QUIC will help us with this. I think QUIC will unearth some of these problems and introduce some new issues, because of the way it works, and that's going to help us make some progress. I also think that what we're going to end up with for HTTP/3 isn't going to solve all our problems; there are still going to be edge cases, of course. But I hope it's going to be simple enough that we can understand it. I also think we'll be able to backport it to HTTP/2, so that we can solve some of the existing issues there. And if that's true, if it all becomes much easier to understand and debug, then maybe, if I get to come back next year and do a new talk on this, I can summarize all of it in just ten minutes, because it's so simple, and I can spend the rest of my talk on something I actually care about, which is Belgian waffles. Thank you.
I think we have two minutes for questions.
Five minutes? OK, five minutes for questions. No questions. Oh, there is? I don't see you. Shout. Oh, there.
Yeah, so the question is: what are the changes you need to make to move from HTTP/2 to HTTP/3?

Yeah. So most of the differences are at the QUIC layer. QUIC changes a lot, so you're going to have a lot of problems with your DevOps setup and your firewalls and opening up ports, that kind of stuff. But at the HTTP layer, not much actually changes, except, of course, for the prioritization stuff. But normally, you as a developer shouldn't have to care about that too much. It's still going to be the browsers that translate their heuristics onto the new scheme. So the move to HTTP/3 should be fairly simple if you have a good DevOps team in place. More questions?

OK. So the question is: with HTTP/3, will the browsers end up changing their heuristics? That's kind of what I tried to get at with the whole head-of-line blocking issue. Because QUIC allows new things to happen, it could be that the browsers say: we have something that works well on TCP, but we've found something new that works better on QUIC, and so they might change some things up depending on what works well in practice. So I think they're going to stay largely the same, but I hope they're going to diverge at some point, because QUIC allows some really cool internal optimizations here as well. Anyone else? Thank you all.