
bootchart2


Formal Metadata

Title
bootchart2
Title of Series
Number of Parts
97
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
A quick exposé on boot time profiling tools, showcasing the slightly-less-lame "bootchart2", a new and improved boot time profiling tool, and other ways to speed things up. Which of the seven+ 'readahead' tools do you want, and why? With a summary of recent fixes and wins in the area, and how to get involved with improving Linux boot performance.
Transcript: English (auto-generated)
It renders with Cairo, to SVG and PNG, and it's a lot nicer. Then Scott James Remnant did a really good thing. Canonical, great, they actually did something. And they rewrote the shell script, which is essentially the moral equivalent of this. I mean, you know, there are more lines, but that's practically what it does.
And he rewrote it in C, so it went a lot faster. So you could get better resolution, better granularity, which is cool. And then bootchart2 came along, which was me. And I rewrote his collector for a couple of reasons. The first thing is the /proc interfaces are not particularly good. So when is the process running? You know, the information in /proc, it's unclear to me what the granularity of that is,
how long, you know, the state information there doesn't seem to me to be very good, basically. And if something can run and then block, and seemingly it's not running, I don't know, some kernel guy would tell me, I'm sure. But there's a better interface, which is the taskstats interface, and that gives you nanosecond-accurate, in theory, time accounting.
It's interesting, the nanosecond values go up in suspiciously uniform, large, integer chunks. But at least it's a lot better than it could be. And the interface is horribly baroque and unpleasant, and it doesn't work. And it's supposed to have task groups or something, you know, so I guess if a process has multiple threads, but that doesn't work either. But if you wander through /proc, you can actually get the information out
at some hideous CPU cost, and you can do it at a relatively high frequency. And that's great, taskstats is on in virtually every kernel, except in Intel's Moblin, where virtually everything is turned off, just in case it's useful. So it allows us to say, better than that, which process has used how much CPU. So in the original bootchart, you could say, you know,
well, something seemed to be busy at this kind of time, but not tell me how much CPU was used by that process. So we can now answer that question. And bootchart2 also integrates pybootchartgui, it throws away the Java thing, and so you can have a much better coupling between those two beasts. So, you know, in the old world, we went from something like this,
and in the new world, we go to something like this, where you see all of these, you know, lots of little tiny things, which now aren't being thrown away, because apparently they took no time. And so, you know, boot udev, udev, udev, all these tiny things, and lots of very narrow, it's just kind of, you know, new and differently broken. So that's great. And you notice that here, you know, boot udev was just taking no time. This is all gray, if you can't see it at the back.
And that's at least slightly blue, which is progress, you see. There's a Tory bias underneath some of this, as you can tell. Anyhow, yeah, which in bulk has lots of lines? Yes, okay, so when you start seeing interesting things happening, you know, there's all these tiny striations here. This guy is a really good guy. Look at this. This SCIM guy is taking serious CPU, you know.
It doesn't actually do anything, or, you know, it's only here to set an environment variable, really, at this stage. But, you know, it really goes through with panache. I think it, like, loads all of the language, you know, catalogs and plug-ins. Anyway, so that needs fixing. The other thing that the original boot chart did, which was very zen, was that when lots of things were happening at once,
it would change the alpha channel of the rendering. So it would show you the percentage of CPU that was being used. So the main effect was, as soon as the CPU got remotely busy, everything became very blurry and transparent. You couldn't see what was taking any time. So we turned that off, and there was an instant win. So now you can see in this channel here, you know, when the CPU is really going at it across umpteen processes,
actually you can see that they're all still doing something, and try and get some idea of the sequencing. And, of course, since it's accurate, you can also see holes. A nice feature that someone could hack for me would be to show nice holes that go vertically when the processor is stalled. And we see some of those. ACPI is causing us some serious grief on some hardware.
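The "holes" idea above can be sketched in a few lines of Python. This is only an illustration, not code from bootchart2: the function name `find_stalls`, the sample layout (timestamp in seconds, busy fraction between 0 and 1) and both thresholds are assumptions made for the example.

```python
def find_stalls(samples, threshold=0.05, min_duration=0.2):
    """Return (start, end) intervals where the CPU looks stalled.

    samples is a time-sorted list of (timestamp_seconds, busy_fraction).
    A stall is a run of samples with busy_fraction below `threshold`
    that lasts at least `min_duration` seconds. Both values are
    illustrative assumptions, not bootchart2 defaults.
    """
    stalls = []
    start = None
    for timestamp, busy in samples:
        if busy < threshold:
            if start is None:
                start = timestamp          # a quiet run begins here
        else:
            if start is not None and timestamp - start >= min_duration:
                stalls.append((start, timestamp))
            start = None                   # the CPU is busy again
    # A quiet run that lasts until the final sample also counts.
    if start is not None and samples[-1][0] - start >= min_duration:
        stalls.append((start, samples[-1][0]))
    return stalls
```

A renderer could then draw one vertical band per returned interval, which is roughly the highlighting the speaker asks for.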
And you can see, you know, the same problem happens in sysfs three times. You know, we take a half-second delay polling a battery that we don't need to poll: when udev starts, when X starts, and when HAL starts. One and a half seconds of sleep. Very easy to see, but it would be nice to highlight that. Another thing that we can do, because we've got actual hard CPU time,
how much time did it take at any given time, is draw cumulative charts of what's taking the CPU. So that's kind of fun. So here you can see, for whatever reason, we're starting Banshee at boot, and it's taking nearly 20% of the time. That's good, huh? Optimization opportunities, but you can also, you know, sort of start to rank these guys, and you can see when they appear. It's unclear to me that it's very useful, but it looks pretty.
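As a rough sketch of how such a cumulative ranking could be computed from per-process CPU counters: the snapshot layout below (one dict per sample, mapping a process name to its cumulative CPU time in nanoseconds) and the function name `rank_cpu_users` are assumptions for illustration, not bootchart2's actual data model.

```python
def rank_cpu_users(samples):
    """Rank processes by CPU time consumed over a series of snapshots.

    samples is a time-ordered list of dicts mapping a process name to its
    cumulative CPU time in nanoseconds (a hypothetical layout). Processes
    already present in the first snapshot are measured relative to it;
    processes that appear later are counted from zero.
    Returns (name, cpu_ns, share_of_total) tuples, biggest consumer first.
    """
    if not samples:
        return []
    baseline = samples[0]
    latest = {}
    for snapshot in samples:
        latest.update(snapshot)            # keep the newest counter per process
    used = {name: ns - baseline.get(name, 0) for name, ns in latest.items()}
    total = sum(used.values())
    ranked = sorted(used.items(), key=lambda item: item[1], reverse=True)
    return [(name, ns, (ns / total if total else 0.0)) for name, ns in ranked]
```

Plotting the running totals per snapshot, rather than only the final ranking, would give the stacked cumulative chart shown in the talk.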
Marketing is everything in software, as you know. Another thing that's pretty good is kernel boot charting. So the old boot chart, actually this is a new boot chart, but the old boot chart suffered this too. You know, you would start, because we knew when the kernel got going. This is kind of zero time.
And then at some very much later time, the boot chart actually started. And so this gap here was just a complete mystery. What happened? Who knows? And Arjan came up with a really nice way of dealing with this, I guess, the initcall_debug stuff. I think it was him. Either way, he used it. And so then you can draw a nice graph,
and you can say, look, the kernel is booting in this period, and this is populate_rootfs, this is the framebuffer init, keyboard, blah, blah, blah. Software resume, which is just a beast of a, you know, the seconds here are now this wide. So one, two, three, four, five, six, seven seconds, maybe? If I can count properly. Anyway, lots of seconds.
And that's the software resume. And of course there was nothing to resume. So, you know, it just carried on in the end. But you can also see some interesting things. So when the system's actually got started, this is udev and the various things happening in either a RAM disk or, I don't know, the main system, it's starting to do modprobes and module loads. You can see then the asynchronous and delayed init bits happening,
I guess, in the kernel there. I don't understand any of that. I just parse the data and render it. Does that make sense, Greg? Perfect. Right. Nice, nice. It's, yeah, okay, but until you look at the rendering,
you know, you have no idea why the kernel is taking eight seconds to start. It seems a reasonable thing to take eight seconds. I mean, you know, it's a lot of code, right? It should take eight seconds. And then when you look at it, you realize it's really dumb. And this is, there's much more detail that's perhaps more useful for the kernel guys in terms of far more granular, you know, allocations of which functions
and which initialization pieces are taking time and how that parallelizes and so on. And that's pretty nice. So all we do is we store it. We do a bit of a worse job of rendering it, but there's this pretty tool in the kernel here, and you pipe your dmesg into this marvelous Perl script. You get an SVG file out, and it looks pretty like that, and it tells you why your kernel is slow to start.
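To make the idea concrete, here is a minimal Python sketch that pulls per-initcall timings out of kernel log text. It assumes the log was captured with initcall debugging enabled and that the lines look like the made-up `SAMPLE` below; the sample values, the regular expression and the function name `slowest_initcalls` are illustrative assumptions, and this is not the Perl script mentioned in the talk.

```python
import re

# Assumed line shape: "initcall NAME+0x.../0x... returned RC after N usecs"
INITCALL_RE = re.compile(
    r"initcall\s+([\w.]+)\+\S+(?:\s+\[[\w-]+\])?"
    r"\s+returned\s+(-?\d+)\s+after\s+(\d+)\s+usecs"
)

def slowest_initcalls(dmesg_text, top=5):
    """Return the `top` slowest initcalls as (name, microseconds) pairs."""
    timings = []
    for line in dmesg_text.splitlines():
        match = INITCALL_RE.search(line)
        if match:
            timings.append((match.group(1), int(match.group(3))))
    timings.sort(key=lambda item: item[1], reverse=True)
    return timings[:top]

# Invented log lines, for illustration only.
SAMPLE = """\
[    0.100000] calling  populate_rootfs+0x0/0x100 @ 1
[    0.900000] initcall populate_rootfs+0x0/0x100 returned 0 after 781250 usecs
[    0.900100] calling  fb_console_init+0x0/0x120 @ 1
[    0.950000] initcall fb_console_init+0x0/0x120 returned 0 after 48730 usecs
[    1.000000] calling  software_resume+0x0/0x200 @ 1
[    8.000000] initcall software_resume+0x0/0x200 returned -2 after 6835937 usecs
"""

if __name__ == "__main__":
    for name, usecs in slowest_initcalls(SAMPLE, top=3):
        print(f"{name}: {usecs / 1e6:.2f} s")
```

Sorting by duration is the crude text equivalent of looking at the widest bars in the rendered graph.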
So there are lots of schools of thought on how you should improve boot time and lots of elite hacks. And sometimes we have customers that come to us and say, oh, we've discovered if we put, you know, like, 15 ampersands in all these scripts, it starts much faster like this. And you're like, yeah, yeah, yeah. And it crashes as soon as you press a key, you know.
Yeah, I mean, some of the race conditions are really exciting. One of the ones that I had was that you're setting terminal settings on the console, and one of the cunning settings is not to receive a signal when you press enter. Like, normally when you press enter, you get a signal, like SIG, whatever. And due to a race condition, we were starting X, which was turning that signal off, and then we were turning it back on by restoring the console state.
The symptom of this was that the first time you ran X and you hit enter, the X server crashed. But could we find it? No. We straced the whole system. I mean, you know, it took forever to work out. It's a feature, you know. It's one of those really nice features that, you know,
it sort of dates back from 25 years ago that no one actually knows or cares about anymore. But, you know, whatever. You have to turn it off. So, you know, all of these wonderful plans, you know, maybe you start udev and you start HAL in parallel with it, and then you start X as well, and something will go wrong on some machine somewhere. Or network management. You know, you can bring up network management much later.
That's fine. But it turns out there are a whole load of apps that aren't really geared up for the network stack to appear, you know, in mid-initialization or mid-flight. Which sucks. They're all bugs. There's no point in not fixing them. But finding so many bugs at once can be, you know, challenging. So, another thing that the people try and do to make it quicker is they think,
well, what the problem is is we have all these shell scripts. You know, what we need to do is throw away the init system and rewrite the init system. And that's probably true. But one approach was to rewrite it all as a single shell script. You know, so everything happens in order. You know, it's beautiful. So we tried this. The Moblin project did this. We tried it. I spent quite a while switching our actually already parallelized
kind of Makefile-based sort of init process to use this cunning new obviously faster way. But it's actually no faster. It turns out that the cuts are elsewhere, you know. The thousand cuts are in each thing that you're starting up and the silly wrappers around them. So, that was pretty serious. And, yeah, I mean, there's just a lot of work
profiling each piece and making it not suck. So, modprobe, you know, was very slow. There was some big silly cache and only Red Hat fixed that. udev was also pretty slow. I think Kay Sievers fixed that at SUSE. Kernel module loading took some kind of monster lock that stopped anything happening. So, you know, you could load lots of modules in parallel, of course, you know, until you actually wanted to load the module,
which, you know, then it would all kind of serialize, I think. HAL, yeah, HAL has been doing incredibly inefficient things and still is, you know. I mean, but, you know, something like a 10-line patch doubled the performance of HAL at startup, which was previously two seconds of CPU or something. And there's another 50% in there from not parsing a 600K file that we don't need to parse 23 times.
Honestly, you know, I mean, if you want glory, you know, it's easy. It's easy here. But the answer is not doing all of this crazy stuff. It's just like taking a profiler to ordinary pieces of code and going, huh, that's dumb. And GConf, similarly, you know, the XML parser, the guy that wrote it didn't understand UTF-8 and did stuff in really silly ways.
And, you know, and it still goes on. There's plenty of fat left, really. Another interesting thing is SSD, well, I guess IO. IO takes a lot of your life away if you're not very careful. And this is the famous five-second Moblin boot chart. I'd like to, you know, this is sort of a historical, you know, interest here.
Yeah. Does it boot in five seconds yet? No? Okay, so this was a high point. You know, Xfce is obviously the future. And with an SSD, life is very easy, it turns out. So SSDs are not only blazingly fast, which throws away, you know, much of your I/O latency, but you can also do stupid things, seeking all over the place. So what you do is you start your huge sreadahead process,
which is, I guess, down here. As you notice, the sreadahead process isn't actually doing anything, because it's an old-style boot chart. But what it's actually doing is batching four threads' worth of I/O requests for everything you need in order. And it does it in the background. So if you really need something and it wasn't in the profile, you get it. So that's SSD, which is all good. Hard disk, you have the opposite problem
that you really, really, really need to read linearly. So, again, this is my sreadahead, and, well, you instrument your entire boot process and you try and catch everything, not just the open syscalls. You catch execs and users and various things. And then you, you know, you kind of chunk it into chunks that seem good to you, in some, you know, mystical way.
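The "read linearly" point can be illustrated with a small Python sketch. It is not how any of the readahead tools in the talk are implemented: the input format (a path paired with its first on-disk block number, which a real tool would have to obtain from the filesystem) and the names `plan_linear_readahead` and `prefetch` are assumptions for the example.

```python
import os

def plan_linear_readahead(files):
    """Order files by their on-disk position so a hard disk reads forwards.

    files is an iterable of (path, first_block) pairs; the block numbers
    are hypothetical inputs that a real tool would query from the
    filesystem.
    """
    return [path for path, _block in sorted(files, key=lambda item: item[1])]

def prefetch(paths):
    """Ask the kernel to pull each file into the page cache, in order."""
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue                       # the file vanished or is unreadable
        try:
            if hasattr(os, "posix_fadvise"):
                # Length 0 means "to the end of the file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        finally:
            os.close(fd)
```

On an SSD the sorting step matters far less, which matches the speaker's remark that flash lets you seek all over the place.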
And then you read a whole lot to start with, and you try and get ahead. Actually, I think this failed. But you try and get ahead of the system, and then you let the CPU go, and you say, go. And then you race ahead trying to catch up so that you're going to be ahead of it all the time so that when it needs, you know, the data, it's there. Yeah, it's quite exciting. Anyway, there's lots of different options,
but it does make a difference. Here's, I think, a relatively lame and relatively old screenshot of the 34-second, this is some openSUSE boot, and that's down to 28 with, again, some different way of doing preload that's, I think, split into sections and stuff. I mean, the punchline is you can see that this disk is not, you know, this is a theoretical gigabit a second bandwidth to the disk,
but the reality is that you're getting, you know, like 14 megabytes a second peak. You know, and that's with the preloading. You know, so there's some serious problems in terms of getting data off disk. The great news is that you have a lot of choice here. You know, there's a lot of preloads out there. This is the SUSE one. I was just showing you the graph there. And it uses SystemTap, which is wonderful.
Is Mark here? No, he hasn't made it. But anyway, I have to advertise SystemTap because I like Mark. And yeah, there's kind of a module that traces the I/O, and the problem is not tracing the I/O, but tracing the I/O that wasn't your readahead so that you don't incrementally build a bigger and bigger and bigger readahead thing that becomes irrelevant. And then there's other versions. So Red Hat have a nice thing. Intel had a nice thing, I think,
based loosely on the Red Hat one, which is now dead. There's no updates for it. It works well on flash, but not hard disks. Ubuntu have ureadahead, which is kind of sreadahead, cleaned up, working a bit better for hard disks. It has, unfortunately, a disastrous copyright assignment policy very close to it, so I would abandon it and stand well away. Otherwise, you'll get burnt.
Blocking reads, yeah, yeah, yeah. So it also requires some hardcore kernel stuff. It actually opens every file that you're going to load in your system in one batch. It's like a loop from one to a thousand. Open all these files, and it does some clever things. It looks at the ext3 filesystem. Actually, most of these look at the filesystem in order by block positions. And my preferred one is the one that I maintain, obviously,
which is at the bottom. And it's quicker, at least for me on my kernel with my work set. So what else can we do? Yeah. So sometimes things are slow, and you really don't know where. You know, there's a lot of CPU happening in this thing. Of course, there are a lot of tools that make that quite easy to do.
You can run Valgrind on it, this kind of thing. But if it's a big mix of I/O, latency, and CPU, then it can be kind of hard to tell what's going wrong. And there's a really nice system call, prctl. And there's a PR_SET_NAME thing you can send to it. And there's only a handful of characters. It likes to truncate whatever you're about to set it to. And the great thing is that you can change your process name, and bootchart2 will adapt.
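A minimal sketch of the renaming trick, in Python via ctypes and Linux only. The helper names `truncate_comm` and `set_process_name` are invented for this example; `PR_SET_NAME` is the constant from `<linux/prctl.h>`, and the 15-byte limit reflects the kernel's 16-byte name buffer including the terminating NUL.

```python
import ctypes
import sys

PR_SET_NAME = 15        # value defined in <linux/prctl.h>
MAX_COMM_BYTES = 15     # 16-byte kernel buffer minus the trailing NUL

def truncate_comm(name):
    """Encode a name and cut it to what the kernel will actually keep.

    Truncation is done on bytes, so a multi-byte UTF-8 character at the
    boundary may be cut in half; plain ASCII phase names avoid that.
    """
    return name.encode("utf-8")[:MAX_COMM_BYTES]

def set_process_name(name):
    """Rename the calling thread; returns True on success, False otherwise."""
    if not sys.platform.startswith("linux"):
        return False
    libc = ctypes.CDLL(None, use_errno=True)   # symbols of the running process
    result = libc.prctl(PR_SET_NAME, ctypes.c_char_p(truncate_comm(name)), 0, 0, 0)
    return result == 0
```

A long-running startup process could call `set_process_name("phase-parse")`, then `set_process_name("phase-render")` and so on, so that each phase shows up under its own name in the chart, which is the slicing the speaker describes next.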
And actually, this is all one process here, but I just switched names so that I could see which domains were taking what time inside it so you can slice and dice and see very easily what's going on. So my good friend, Arjan, has come up with this amazing thing called timechart. You can read about it in this blog. It's integrated in the kernel. It provides the ultra, ultra high resolution everything view
of what's going on in your CPU. And you can render some multi-megabyte SVG file which can't be viewed on anything except Inkscape version, whatever. And it will show you exactly what is happening on your CPU. I think there's always a trade-off between meaningfulness and content. I don't know. I've not used this tool. I'm sure it's very good. It shows you things you can't find elsewhere,
but it might be a bit of overkill when the problems are often quite banal. So, other bits. Oh yeah, there's some magic stuff for SystemTap. Like I say, if you follow the links in the slides, you'll find magic SystemTap stuff that tells you what I/O is happening, who's triggering it, and so on. What processes are taking how much time and so on.
I guess it kind of overlaps. sreadahead, at least mine, dumps all the files and where all the data is coming from. Often, it turns out code, /usr/lib, is a great big chunk of it. /usr/bin, likewise. All those files in /etc. This is a Moblin startup, which is really quite lean. A SUSE Moblin one. You know, like 2.8 meg of files in /etc. Config files right at boot.
And so it goes on. The X team, of course, like great big libraries as well. They have compilers and side compilers, no doubt, to pop in there. So there you are. Bootchart2 reaches the places that other beers don't, you know. And there it is. There's plenty more to do. It needs packaging. It needs packaging for every distro. I've put quite a bit of effort into actually making it work,
even with strange initrds and so on. I'd love to hear your experience if you want to get it working on your distro. How am I doing for time? Oh wow, that's pretty good. Okay, the other punchline is, What? This was a gratuitous advert for Moblin and it's gone horribly wrong. Honestly, I'm distressed about that.
You see, look, Moblin Rocks, gratuitous park. It says so there, look. Let's try again. Hey, that's cool. So, one of the lesser known things about OpenOffice is that the slideshow is using a different rendering pipeline to the editing piece. Or, yeah, details.
Torsten, where is it? Anyway, good, okay. So, you know, it is possible to boot fast. There is really lots of dumb stuff in there. How fast we really can boot is unclear. You know, the promises vary between, you know, negative time, two seconds, five seconds, ten seconds. What I can tell you is that we can ship a machine that will boot in 23 seconds from power on button
of which seven seconds is the BIOS, with a hard disk in it. So, you know, we're getting pretty fast. And, you know, less is more or less, yes. So, if you take a whole load of stuff out, you can boot a lot faster. But, you know, at some stage you're going to have to do the work. So, you know, you can't do everything incredibly quickly.
You can only really win by deferring work and doing this more intelligently. So, thanks to all the people who did it, did the research, mostly not me. Are there any questions? There must be some questions. Actually, I could do a demo of the tool. Dave, shut up. I'll just do the demo first. Is that all right? You know, here's a very good question, no doubt. Now, somewhere here, there was that thing that I just ran that was lurking.
Aha, perfect. Excellent. So, one of the things we do just to scare people when they've seen an old boot chart is to double the size of the thing. So, as soon as you run bootchart2, it looks as if you're twice as slow. That's great, isn't it? Which focuses your mind on accelerating yourself. And if that doesn't work, or, you know, you manage to speed it up too much, you can actually horizontally zoom as well so that, you know, you can sort of reinforce the fear
to, you know, actually get you going. Here you see our, you know, 15-ish second boot, actually on a slightly slower machine. Great chunk of I/O to start with, as sreadahead races the system. Yeah, and then, you know, the other guys really get going. Oh, we draw pretty red lines and things on as well. Actually, that came from Scott,
which is, you know, if you like pretty red lines, it's good. And, yeah, there we are. We get up, you know, pretty quick to the end zone. So, that's nice. What else do we do? Oh, yes, we have this cumulative foo at the bottom, which is quite fun, as I showed you, I think, before. And, you know, and there's all sorts of bugs as well. Unfortunately, you can't see what init was doing before you even started logging,
but, you know, it at least tells you it took a whole lot of time before it got there, you know, so there's a good graph here. What init is doing with the CPU, I don't know, but it stops doing it pretty quickly. I mean, you can see that, so that's good. It's quite well-restrained during the runtime. And, yeah, so hopefully by looking at the gradient here, I mean, obviously the optimal graph is kind of a line
from the bottom here to the top right, you know, that's kind of linear, the CPU used all the time and preferably not very long. But, yeah, it's not quite like that. So, what else can we see? I/O is much more visible, I guess, so when the readahead screws up and we see red stuff happening, you actually see the red, which is kind of nice. What else?
Let me see if I can show you something really silly. Actually, I'm... Oh, there was an MSI machine. OK, let's... I don't know what's happening. There's just too many boot charts, as you can see. Take one a day, keeps the doctor away.
Oh, well, huh. Yeah, so here's something pretty interesting happening in the kernel. I don't know. Yeah, that's... Yeah, well, who knows? Good thing we didn't have the kernel booting stuff turned on for that, hey? So you actually have to add extra parameters to the kernel boot line to do this. There's initcall_debug. It's all documented in the README. So that's pretty good. What else?
Yeah, as I say, the vertical line thing would be really helpful to find some of this more stupid stuff. I'll try and find the stupid stuff. I think here is one. So the HAL daemon at this point is starting up, and this is the half-second sleep that the kernel does when we start reading sysfs on this hardware. And you'll notice that if you look at the...
If you keep your eyes fixed at that point, and then you look at this guy at the top, wouldn't it be nice to have a split pane so you could see this and then scroll this? Anyway, but there's a great dip in CPU use. In fact, nothing is happening around here really at all. It's just kind of gone. So that's lame. So that's the tool. I don't think I can show you anything else.
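Spotting a dip like that by eye is what the chart is for, but the same idea can be sketched over the raw samples. This is only an illustration, not something bootchart2 does: `find_stalls` is a hypothetical helper, the input is assumed to be a list of `(time in seconds, busy fraction)` pairs, and the thresholds are arbitrary.

```python
def find_stalls(samples, threshold=0.05, min_duration=0.3):
    """Return (start, end) spans where CPU use stays below `threshold`
    for at least `min_duration` seconds.

    `samples` is a time-ordered list of (time_seconds, busy_fraction) pairs.
    """
    stalls = []
    start = None      # time of the first idle sample in the current run
    last_idle = None  # time of the most recent idle sample in the current run
    for t, busy in samples:
        if busy < threshold:
            if start is None:
                start = t
            last_idle = t
        else:
            if start is not None and last_idle - start >= min_duration:
                stalls.append((start, last_idle))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and last_idle - start >= min_duration:
        stalls.append((start, last_idle))
    return stalls


# Invented samples at 0.1 s intervals with a stall of roughly half a second.
samples = [
    (5.0, 0.80), (5.1, 0.02), (5.2, 0.00), (5.3, 0.01), (5.4, 0.00),
    (5.5, 0.00), (5.6, 0.90), (5.7, 0.03), (5.8, 0.70),
]
print(find_stalls(samples))
```

A real version would want to look at I/O as well, since a CPU dip while the disk is saturated is not a stall in the same sense.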
David, what's your question? Shoot. Oh, there's a microphone. Sorry. We have four minutes. Is that right? Is there any way to run bootchart at any time other than boot time? Like, let's say my computer is particularly slow,
can I start up data collection and then analyze, say, ten minutes of data? So you can do that. At the moment, it doesn't work very well in bootchart2. There's kind of a bug. Since we did all the kernel parsing, as soon as you run bootchart with the kernel debugging turned on, it goes, ah, let me show you the beginning of time, and then much, much, much, much, much later, let me show you what you're doing now.
So yeah, that needs fixing. But yeah, there's no reason why not, and it should do a good job. There was one other thing. One of the things that used to annoy me about bootchart, and one of the reasons I turned it off, is that if I booted and then stayed in GDM for a while, there was some kind of a parameter that said, keep going until the GNOME system... Right, you just sat there accumulating.
And I just sat there, and so when I logged in, everything was blocked up for minutes while it was doing Cairo SVG stuff. Right, because it's just such a huge image. It's rendering in the background. So is that fixed? That's good feedback. We should fix that. Send me an email. You know, so there are timeouts to this thing, but yeah, we can add some more timeouts.
Why not, why not? So with a fine beard. Yeah, that requires code to make it interactive, but that's a really good idea. Send me a patch, definitely. I love it, I love it. What a good idea. Anyone else? Since it's a new rendering engine,
do you have any chance to also add network usage information in there, since the logging daemon at least in the old bootchart could track it? Yeah, right. We certainly can. Could it track it? Yep, it does. For the systems, yeah. Okay, yeah, that's pretty trivial. I mean, this graph rendering thing is pretty generic. It's easy enough to grab random stuff
out of /proc and render it. Again, patches are all there. I don't have any network problems at boot luckily, so I'm not going to do this for myself. Anyone else? Something you've always wanted to know, you know, I don't know. Aha. Oh, you have to wait for the microphone,
otherwise, you know, I have to repeat it. Would it be difficult to adapt the system to analyze a single program instead of the separate boot steps, let's say a GUI program or some server program that's not very responsive, to improve the performance of that program?
Yes, I suppose so. Potentially it's helpful for looking at the system. There's another very good tool called Sysprof that's probably what you want to use for that. I was meant to put it in my slides. There are some nice patches now so that you can do a Sysprof of the whole boot and you can see where slowness is, which allows you to micro-optimize in entirely the wrong place, which is great.
What? What? You know, malloc is really, really slow. You know, we've got to speed malloc up, you know. It's taking like half the time. Actually, you know, someone had the idea of speeding malloc up before you and you're not going to win, so why not just work out why you're allocating all this memory and freeing it and, you know, there's been a number of such optimizations.
It's unfortunate. So my tip is when you find a performance bug, stick an fprintf in, rerun it, and you will immediately understand why, you know, the human pattern recognition... Yeah, yeah, yeah. Well, okay, Sysprof works well with multi-threaded programs. It will show you all the processes, so it will show you X. If you're lucky, it will show you the kernel.
I think it will take you down into the kernel if the thing's blocking there, and it's pretty cool. It's a really, really good tool. Time out. Perfect. Thank you for your patience. You're all very good.