The History and Future of Crash Dumps in FreeBSD
Formal Metadata
Number of Parts | 31
License | CC Attribution 3.0 Unported: You are free to use, adapt, copy, distribute, and make publicly available the work or its content, in unchanged or adapted form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/45259 (DOI)
Transcript: English (automatically generated)
00:05
So welcome to the history and future of crash dumps in FreeBSD. I'm Sam Gwydir. If you'd like, this is the repo where I have the slides and paper. The paper has changed like one or two paragraphs, but the slides have been completely converted to Keynote.
00:21
So if you're interested in that, you can find it right there. Okay, so who am I? My name is Sam Gwydir. I went to Texas A&M University. I just graduated. I'm a computer engineering and computer science major with a minor in math.
00:42
I've used Unix-like systems for 12 years. That's code for Linux for most of those years. Somewhere in college, I got onto OpenBSD because I was on a security kick. And then I wanted ZFS, and so I was like, okay, FreeBSD. So that's where I've been for the past four.
01:03
So what do I do at work? I work at Groupon. They have a ton of FreeBSD machines running their databases, and they crash sometimes. So logs showed that the crash dumps were larger than our swap partitions. So when I was asked to go look at these things, I was like, okay.
01:24
So we have to figure out what we can do about that. Turns out these dumps are very, very large because we have 32 gigabyte swap partitions, and they were larger than that. And then I went in to look even further, and it turns out not only are some of these swap partitions small, they're nonexistent.
01:42
So there was a bug in the provisioning script, and so there's many machines out there with no swap at all. So how do you get crash dumps without a swap partition or a very small swap partition? But I was still in school, so really what I have to justify is a 14-page paper.
02:03
Why do I have that? Well, if you're sitting in a technical writing seminar, you need papers, and you need paper topics. At the same time, if you guys haven't stumbled upon it, the Unix history repo is really cool. It's a Git repository that combines the history of Unix, AT&T Research Unix, and then later AT&T Unix.
02:26
All the BSDs, or, sorry, not OpenBSD and NetBSD, but 4.3BSD and all that, all the way up to FreeBSD 12. So I decided to combine my two issues at the time. I needed a paper. I needed to learn about crash dumps.
02:42
So I wrote a long history of crash dumps in FreeBSD. So, but what's the technical motivation? I wanted to understand how crash dumps work because I wanted to solve my missing or very small swap problem. Deciding on a solution and avoiding reinventing the wheel was important, because as I looked around, I found all these projects that lived and died.
03:05
Netdump is probably the best example. It's been around forever, and it's still not hit the tree. And Unix history is always fun. So what's a crash dump? Or sometimes called a core dump. That's core memory right there.
03:21
A machine-readable form of the state of a machine at some point in time, usually after a panic. So it's essentially a dump of all memory at the time of a crash. I often use core dump and crash dump interchangeably. Most people, if they say crash dump, mean the kernel, and core dumps are for processes.
03:40
But a core dump is named after this guy here. Each of those magnets contains one bit of information. I don't know how much information this is. I should have read the caption. But that's RAM all the way up to like the 70s. Okay, the history. So the paper contains a long history that's essentially a list of features in crash dumps from 6th Edition Unix all the way to 12-CURRENT.
04:08
So I call it the odyssey of doadump. doadump was added around 6th Edition and has kept the same function name all the way through history. But the paper starts at 6th Edition.
04:22
And it starts with a little introduction to core dumps in crash(8), and then ends at FreeBSD 12-CURRENT, when they added encrypted dumps. So turn to the appendix of that paper in the repo I linked if you'd like to see that. It includes architecture support, feature changes, and larger bug fixes.
04:43
For even more depth, go to the Org mode file on GitHub. It includes individual commits where changes were made, mailing list emails, emails between me and authors, and just a bunch of notes. This is one of my favorite quotes. Rod Grimes helped me write some of this paper.
05:02
And he had a quote. Well, I remember in 1979, I can remember doing a crash dump on a Harris S210 24-bit machine with a line printer. And it only took two hours to print. So speeding these things up is a problem that's been around for a while.
05:22
So here's a quick timeline of the output formats of crash dumps. So originally, like Rod was talking about, they were on line printers. Later, they moved to magnetic tape when UNIX came out. After that came the paging area, which is what they called swap when it came out. And then now we're trying to get network dumping.
05:43
I would love for that to come into the tree, because that's the easiest way to get around no swap. So there's a series of extensions. And this is partly why I wrote this history, because I came across these strange changes. Everybody has this interest in this esoteric area, but they never come to fruition.
06:04
Why is this? If I come up with some crazy idea, is it doomed to follow the same path? What should we do? So you end up looking through the years. This timeline starts in 2004. Net dump started at Duke. It was earlier than 2004.
06:21
Was it really? 1998. Nice. There you go. So it's been how many years? Maybe like seven. OK. Yeah, because I know it's a popular feature. OS X has it, and we're going to talk about that. And that's the same code that, if you were at the Dev Summit, Mark Johnston was talking about,
06:41
but just changed over time. Yeah, so you can see net dump. All these dashed ones are ones that have not yet hit the tree. Next is mini dump, so it's far smaller dumps. It's not the same size as your memory. If I were going to do a full dump on my current machines, I'd have to have 384 gigabytes of swap.
07:03
Not really worth it. After that, in 2008, text dump was added by Robert Watson. And then we're working on compressed dump, which would solve my problem. Encrypted dump is cool and actually made it into the tree. Modular dump is something Rod is actually talking about right now,
07:22
so if you want to learn about that, go across the hall. And then minidumpsz is something Rod also wrote, which is an extension to estimate how large your dump will be before you do it. Okay, so next I'm going to go over the general procedure of how a dump is taken,
07:41
and then I'm going to go over all the features in FreeBSD. A quick how-to, if you wanted to take a crash dump and you want to see how that works, we'll do that. Next, full dumps, mini dumps, which is the current default. What is a text dump, why might you want them, and then a comparison between these three.
08:02
So first, before I say anything, don't do this on your machine. If you're just blindly following and typing things in, please don't do this. Or actually do. If you do it, tell everybody. So you're purposely panicking your machine, so save your stuff. Your machine's probably already set up correctly, but you can set up the rc.conf configuration.
08:28
So dumpdev="AUTO" will find your swap partition, and then you'll dump there as soon as you write that last sysctl and panic your machine. Okay, so how does a dump procedure work?
08:42
This has been the same since 4.1BSD all the way to right now. So most OSes have something almost exactly the same as this. Usually it's even the same names. So at the time of a panic, you dump your RAM through the function dumpsys
09:01
to your swap partition, and then after a reboot, a program named savecore runs, takes the dump from swap, and then loads it into /var/crash. So if you wanted, I guess there's just more ways to panic your machine if you'd like.
09:20
If you want to be fancy, you could use DTrace. Same thing, though. So dumpsys lands all parts of memory on swap in a particular format. So if it's a full dump, it's one format, mini dump another, text dump the last. And then on reboot, savecore writes the dump to the dumpdir for analysis.
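The two-phase procedure described here can be sketched as a toy simulation. This is only an illustration under stated assumptions: the Python functions `dumpsys` and `savecore` are stand-ins named after the real kernel function and utility, the swap partition is a `bytearray`, and the dictionary playing the role of /var/crash is hypothetical.

```python
# Toy model of the two-phase crash dump procedure; not the kernel's real code.

def dumpsys(ram: bytes, swap: bytearray) -> int:
    """At panic time: copy RAM into the tail of the swap device.

    Returns the offset at which the dump starts.
    """
    if len(ram) > len(swap):
        raise ValueError("dump does not fit in swap")
    offset = len(swap) - len(ram)
    swap[offset:] = ram
    return offset


def savecore(swap: bytearray, offset: int, crash_dir: dict) -> str:
    """After reboot: pull the dump out of swap and store it as a file."""
    name = f"vmcore.{len(crash_dir)}"
    crash_dir[name] = bytes(swap[offset:])
    return name


ram = b"kernel state at panic"
swap = bytearray(64)                       # stand-in for the swap partition
var_crash = {}                             # stand-in for /var/crash

start = dumpsys(ram, swap)                 # phase 1: during the panic
saved = savecore(swap, start, var_crash)   # phase 2: on the next boot
```

The point of the split is that the panicking kernel only has to write to a raw device; anything that needs a file system waits until the machine is healthy again.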
09:40
So what do you get in a core dump? Well, there's three types, and they're slightly different. Full dumps and mini dumps are pretty much the same thing with different formats, whereas a text dump is almost another thing altogether. So for the full dump and mini dump contents, there's three files. There's the info file, which is metadata about the dump: the time, panic string, host name.
10:04
Then core.txt. It's the output of a bunch of scripts that run to give you the state of the machine. And the last is the core itself, vmcore. So what does this format look like on swap?
10:22
So dumps are written at the end of swap. So they're written from the end and then kind of going backwards, as opposed to starting at the beginning of swap and writing forward, because programs like fsck, when you boot, assume that they can use the swap because nothing else is using it.
10:40
So they just use the beginning. So there's kind of a dice roll that your dump won't be so close to the beginning of your swap partition that it gets overwritten. Savecore makes sure that it's not corrupted by fsck by checking this leader here. And so if that leader's screwed up, then it just assumes the whole dump is screwed
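A minimal sketch of that layout and check, assuming a dump bracketed by an identical leader and trailer and placed flush against the end of swap. The 8-byte `HEADER` and both function names are hypothetical; the real kernel dump header is a larger structure.

```python
# Toy layout: [ ... free swap ... | leader | dump data | trailer ]
HEADER = b"DUMPHDR!"   # hypothetical 8-byte header, not the real kernel dump header


def write_dump(swap: bytearray, data: bytes) -> int:
    """Place leader + data + trailer flush against the end of swap."""
    total = 2 * len(HEADER) + len(data)
    if total > len(swap):
        raise ValueError("dump does not fit in swap")
    start = len(swap) - total
    swap[start:] = HEADER + data + HEADER
    return start


def dump_is_intact(swap: bytearray, data_len: int) -> bool:
    """Compare the leader with the trailer; a mismatch means something overwrote the dump."""
    trailer = bytes(swap[-len(HEADER):])
    start = len(swap) - (2 * len(HEADER) + data_len)
    leader = bytes(swap[start:start + len(HEADER)])
    return leader == trailer == HEADER


swap = bytearray(64)
start = write_dump(swap, b"A" * 20)    # dump occupies bytes 28..63
ok_before = dump_is_intact(swap, 20)

# An early-boot swap user scribbling over the first 32 bytes clips the leader.
swap[0:32] = b"\xff" * 32
ok_after = dump_is_intact(swap, 20)
```

Because early users of swap start at the beginning, the leader is the first piece of the dump they would hit, so a damaged leader is a cheap signal that the rest cannot be trusted.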
11:00
and just says forget about it. So this is a full dump. This is the full contents of memory at the time of crash. So most of this is going to be zeros probably. It's an ELF format, and before it was ELF, it was a.out format. So if you're into the history bit, go look at FreeBSD 6.
11:23
And so you can see the different parts in dump. So a full dump is like a binary, right? So it's an ELF binary. There's a little bit of a header for savecore, but the rest is just like most other binaries.
11:41
So next is a mini dump. Mini dumps were added in FreeBSD. I don't think I can hear myself. There you go. Okay. Mini dumps were added in FreeBSD 6.2 by Peter Wemm. So they only contain memory pages in use, whereas a full dump will just contain all of memory. So this saves you a lot of space.
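As a rough illustration of why skipping unused pages saves space, here is a small size comparison. The 4096-byte page size, the in-use bitmap, and both function names are assumptions made for the example, not values taken from the talk.

```python
PAGE_SIZE = 4096  # bytes; a common page size, assumed here


def full_dump_size(total_pages: int) -> int:
    """A full dump writes every page of physical memory."""
    return total_pages * PAGE_SIZE


def minidump_size(in_use: list) -> int:
    """A mini dump writes only the pages marked as in use."""
    return sum(1 for used in in_use if used) * PAGE_SIZE


# Hypothetical machine: 1024 pages, every fourth page in use.
bitmap = [i % 4 == 0 for i in range(1024)]
full = full_dump_size(len(bitmap))   # 4 MiB
mini = minidump_size(bitmap)         # 1 MiB
```

The saving scales with how sparse the in-use set is, which is also why a busy machine with a lot of RAM can still produce a mini dump that is too big for its swap.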
12:00
This is much smaller. Modern dumps can still be fairly large. So even though my machines make mini dumps, they're actually still too large to hit the swap. These aren't in ELF format. They're a custom mini dump format. It's its own thing, and it contains these things.
12:23
It's written the same way, as you can see, but I wanted to show it was relatively smaller. Okay, and the last type is a text dump. Text dumps were added by Robert Watson in FreeBSD 7.1. The text dump facility allows you to capture kernel debugging information in a text form
12:43
as opposed to just a binary. The issue is you do have to script it. You have to know what you want ahead of time, and that's sort of the trade-off there is you can get these really tiny text files if you know what you're looking for, but if you don't know what you're looking for, you'll need these mini dumps or full dumps if you wanted to.
13:04
So there's several different files in text dump. Just these different files here. It contains the version that you're on, what your panic message was. The important part is the captured DDB output, so you script that a priori, and so you got to know what you're looking for.
13:21
You can get a pretty good survey, but generally you know what you're looking for when you're using text dumps. So text dumps have a different on-disk format. It's just those files one after the other in a tar archive. The only interesting thing is actually he writes the trailer first
13:43
and then moves all the way to the leader and then writes the whole thing again, and that's because he doesn't know the size before he writes the dump because he hasn't run the DDB scripting. So why would you use one format over another,
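One way to picture writing a dump whose size is unknown up front is to work backward from the end of the device and fill in the leader last. The sketch below is a simplification under assumed names: `BLOCK`, `write_textdump`, and the one-byte block count in the leader are all hypothetical, and in this toy the data blocks simply land in reverse order.

```python
BLOCK = 16  # toy block size in bytes; hypothetical


def write_textdump(device: bytearray, chunks) -> int:
    """Write the trailer, then data backward from the end, then the leader.

    `chunks` yields byte strings whose total count is not known in advance.
    Returns the offset of the leader block.
    """
    pos = len(device) - BLOCK
    device[pos:pos + BLOCK] = b"T" * BLOCK            # trailer goes in first
    written = 0
    for chunk in chunks:
        pos -= BLOCK
        if pos < BLOCK:                               # keep one block for the leader
            raise ValueError("device too small for this dump")
        device[pos:pos + BLOCK] = chunk[:BLOCK].ljust(BLOCK, b"\0")
        written += 1
    pos -= BLOCK
    # Only now is the size known, so the leader can record the block count.
    device[pos:pos + BLOCK] = b"L" + bytes([written]) + b"\0" * (BLOCK - 2)
    return pos


device = bytearray(128)
output = (f"ddb output {i}".encode() for i in range(3))   # size unknown up front
leader_at = write_textdump(device, output)
```

The generator stands in for scripted debugger output that has not been produced yet when the first block is written.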
14:04
a core dump versus a text dump? So both are useful when crashes aren't predicted. So if you're in production, you're not going to pull up an online debugger and figure out what your problem is. You're going to put that machine right back into production, take your core and go work on it so you can debug it offline.
14:21
It also allows archiving of crash data for later comparison if you're trying to figure out, you know, we're having the same issue over and over again. Why is that happening? And so core dumps, you don't need to know what you're looking for ahead of time, like I said, but you do need the source tree, debug symbols, and a kernel built for analysis. Whereas with text dumps, they're less complete,
14:42
but they're very, very small in comparison. Sometimes it's easier to extract information using DDB over KGDB than it is to get it out of a core dump. And then I'm going to go over two more OSes and then one tool that has to do with core dumps
15:03
because the other OSes, I'm interested in those features and porting them to FreeBSD, and then the tool is just useful in production. For OSes, we're going to cover Mac OS X and illumos, and then there's one tool I'm going to cover, Backtrace.io. Sami, who just walked in, if you have more questions,
15:21
go ask him. They're a sponsor of the conference. Okay, so Mac OS X, it's pretty different from the BSD core dump procedure, most of all because they're using Mach-O and not ELF, and they have some neat features. They can dump to the local machine, but it's sort of the default to do it over the network or FireWire.
15:45
They use a modified tftpd daemon from FreeBSD to do net dumps, and then by default they have dump compression, which is really nice, and that's both local and using their net dump.
16:01
And that full procedure, from beginning to end, is available in the paper. I didn't say this before, but the same is the case with the FreeBSD procedures. So illumos, it's not a BSD, but the features are important, especially because they had ZFS before us.
16:21
So they have a couple of features that are really nice. Online dump size estimation. So there's a bunch of different toggles you can set, like compression or doing a live dump, things like this, or what you're going to dump. You can choose between everything, just your processes or just one process,
16:41
and then you can talk to dumpadm, and you can add the -e flag, and it will estimate, okay, here's the settings you set, how large is this thing going to be, and that can be very useful in production. They do gzip compression. They have dump to swap on zvol, which is something I'm very interested in,
17:01
because when you don't have a swap partition but you do use ZFS, you can always add a swap on zvol. And then lastly, they have live dump. I'm sort of waiting to find somebody who thinks this is like the coolest feature, but while writing this, I was really interested in it. So it's useful for production machines where interactive debugging is not possible,
17:22
in particular for hangs, but you can dump on a machine that stays online. And so lastly, we have backtrace.io. So backtrace.io is not an operating system. It's just a tool for curating kernel and user space cores. They have a really cool kind of continuation
17:42
of the mini dump idea. How do you make these cores even smaller? And so they have a thing called a snapshot. And so they take your mini dump and then use some intelligence to kind of figure out what parts of this dump do you need and are relevant to your problem,
18:01
and they'll only take those parts. And snapshots also don't have the problem where you need a copy of the source and the environment to debug them. It allows for debugging on your laptop instead of directly on a crash machine or a very similar environment. So backtrace.io, in addition, has these,
18:24
since it curates everything in a database, you can ask these questions about your panics, like what panic is the most common, and then correlate those by data center, storage controller, hard drive model, timestamps, and more. Then there's a bunch of core dump extension code
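The kind of question described here maps onto a simple group-and-count over crash records. The sketch below is not Backtrace.io's API; the record fields, the sample data, and both helper names are invented for illustration.

```python
from collections import Counter

# Hypothetical crash records; the field names are made up for this example.
crashes = [
    {"panic": "page fault", "datacenter": "dc1", "drive_model": "model-a"},
    {"panic": "page fault", "datacenter": "dc1", "drive_model": "model-b"},
    {"panic": "watchdog timeout", "datacenter": "dc2", "drive_model": "model-a"},
    {"panic": "page fault", "datacenter": "dc2", "drive_model": "model-a"},
]


def most_common_panic(records):
    """Return the (panic string, count) pair that occurs most often."""
    return Counter(r["panic"] for r in records).most_common(1)[0]


def panics_by(records, field):
    """Count (field value, panic) pairs to correlate panics with an attribute."""
    return Counter((r[field], r["panic"]) for r in records)


top = most_common_panic(crashes)
by_dc = panics_by(crashes, "datacenter")
```

Swapping `"datacenter"` for `"drive_model"` gives the same breakdown along a different attribute, which is the correlation step mentioned in the talk.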
18:42
that I'd like to go over. The first is modular dump code. That's something Rod Grimes is working on. So for embedded platforms, you might not want all these extensions. It's starting to get ridiculous, right? There's six. So if you'd like to mix and match, that'd be great on an embedded platform.
19:01
Net dump is after that. Then we're going to do minidumpsz, which is the dump sizing module. Compressed dump, striped dump, and then encrypted dumps. So modular dump is a mix and match of features, right? You may not need net dump, but you might want your dumps compressed.
19:22
R. Grimes, ask R. Grimes for info. His talk contains more information, but if you're here, you're not there. So ask him afterward. So net dump, so I've been corrected, but is this the name? Okay. Okay.
19:41
Yeah, you have to go on the Wayback Machine, actually, to see this page. But there's newer code now floating around with Mark Johnston. I'll have to change this. So this code got picked up at Sandvine, and then later at Isilon, that's where they're using it.
20:00
It was almost part of FreeBSD 9, but that never happened. Mark Johnston's still working on it. Adding encrypted dumps sort of got in the way of upstreaming it. minidumpsz is an online mini dump size estimation tool. It's a kernel module. It's essentially a no-op version of the mini dump code,
20:23
and that allows you to figure out how large your dump is going to be in production, and that's really useful for people like me. So if I was going to figure out how large I can have, if I needed to create larger swap partitions, how large do they need to be? It turns out it's a ridiculous number.
20:42
Oh, I wish I had it. There's another slide here. Sorry, I moved from Beamer to Keynote yesterday, but I missed this slide. Yeah, so I ran this tool. It's 138 gigabytes. I'm never going to make a swap partition that big.
21:01
That's ridiculous. Next is compressed dump. Compressed dump is confusing because we have savecore -z, but that's not what I'm talking about, and so even smart people like this guy here, you might recognize him, get this confused.
21:25
So there's two things we could be talking about if we talk about compressed dump. The normal thing, and this has been around forever, is that the dump process is mostly the same, except when you go from swap to your file system, it gets gzipped, and that's great,
21:41
but what we're talking about is compressing it on the fly as you dump from your RAM to your swap partition, and so if you gzip that on the fly, that could fix the problem I had as well.
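To make the distinction concrete, here is a sketch of compressing page by page on the way to swap, so that only compressed bytes ever have to fit. It uses Python's `zlib` streaming interface as a stand-in for whatever the kernel patch does; `dump_compressed`, the page contents, and the 64 KiB capacity are assumptions for the example.

```python
import zlib

PAGE = 4096  # bytes; assumed page size


def dump_compressed(pages, swap_capacity: int) -> bytes:
    """Compress pages one at a time on the way to swap; fail if swap fills up."""
    comp = zlib.compressobj()            # DEFLATE, the algorithm behind gzip
    out = bytearray()
    for page in pages:
        out += comp.compress(page)
        if len(out) > swap_capacity:
            raise ValueError("compressed dump does not fit in swap")
    out += comp.flush()
    if len(out) > swap_capacity:
        raise ValueError("compressed dump does not fit in swap")
    return bytes(out)


# Hypothetical memory image: 64 pages, mostly zeros, as a sparse dump tends to be.
pages = [bytes(PAGE)] * 60 + [bytes(range(256)) * 16] * 4
raw_size = PAGE * len(pages)             # 256 KiB uncompressed
swap_capacity = 64 * 1024                # too small for the raw dump
compressed = dump_compressed(pages, swap_capacity)
```

Compressing only when savecore copies from swap to the file system would not help in this situation, because the uncompressed image would already have failed to fit in swap.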
22:01
So this code exists. This also sits with Mark Johnston at Isilon. You could get really nice compression ratios on core dumps, right, because they're not meant to be compact, so 6:1 to 14:1 compression ratios are achievable, so if you get a 6:1 ratio and you have a 32 gigabyte core,
22:21
it's now about 5 gigabytes. That's a big deal. The patch shouldn't be hard to apply to FreeBSD 12. Oh, I indicated that this is harder than I should have. This isn't actually that bad. This should happen soon.
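The arithmetic behind those figures, as a small worked example. The function names are hypothetical, and the ratios are the range quoted in the talk rather than measurements.

```python
def compressed_gb(raw_gb: float, ratio: float) -> float:
    """Size in gigabytes after compressing at ratio:1."""
    return raw_gb / ratio


def fits_in_swap(raw_gb: float, ratio: float, swap_gb: float) -> bool:
    """True if the compressed dump is no larger than the swap partition."""
    return compressed_gb(raw_gb, ratio) <= swap_gb


core_gb = 32
low = compressed_gb(core_gb, 6)      # about 5.3 GB at 6:1
high = compressed_gb(core_gb, 14)    # about 2.3 GB at 14:1
```

Applied to the 138 gigabyte estimate mentioned earlier, even the 6:1 end of the range would give 23 gigabytes, which is under a 32 gigabyte swap partition, assuming that ratio actually holds for the dump in question.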
22:41
Okay, so last night I was hanging out with Rod, and I think he told me that there was actually something new I should add to my paper. It's called striped dump. So most of us now, on our zroot, we're running like four disks, each with some swap, and let's say this is, I don't know, 2 gigabytes each.
23:01
So now instead of only being able to write to one of these, you can stripe across them as you dump, and that can be really useful because a lot of us with large RAM machines, if you have 384 gigabytes of RAM, you might have two or four disks in your zroot, but they all have a small amount of swap. If you can activate all of these, you can just add them up,
23:22
and you get four times the swap. This hasn't hit my paper yet. If you'd like to learn about it, there's a long, long, long mailing list thread here. In addition, Joey enhanced about being able to do a text dump and a real dump sequentially, which is not something we're able to do right now.
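A toy version of the striping idea: chunks of the dump are dealt round-robin across several small swap devices, so the usable capacity is their sum. The function names, the one-byte stripe size, and the byte-sized "disks" are assumptions for illustration, not the design from the mailing list thread.

```python
def stripe_dump(dump: bytes, disks: list, stripe_size: int) -> None:
    """Round-robin stripe_size chunks of the dump across the swap devices."""
    offsets = [0] * len(disks)
    for i in range(0, len(dump), stripe_size):
        chunk = dump[i:i + stripe_size]
        d = (i // stripe_size) % len(disks)
        if offsets[d] + len(chunk) > len(disks[d]):
            raise ValueError("swap devices are too small for this dump")
        disks[d][offsets[d]:offsets[d] + len(chunk)] = chunk
        offsets[d] += len(chunk)


def unstripe(disks: list, stripe_size: int, length: int) -> bytes:
    """Reassemble a striped dump by reading the devices in the same order."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < length:
        take = min(stripe_size, length - len(out))
        out += disks[d][offsets[d]:offsets[d] + take]
        offsets[d] += take
        d = (d + 1) % len(disks)
    return bytes(out)


disks = [bytearray(2) for _ in range(4)]   # four devices with 2 units of swap each
dump = b"ABCDEFG"                          # 7 units: too big for one, fits across four
stripe_dump(dump, disks, stripe_size=1)
restored = unstripe(disks, 1, len(dump))
```

With a single 2-unit device the same call raises `ValueError`, which is the situation the talk describes for machines with many disks but little swap on each.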
23:40
This could be useful for people making appliances. Last is encrypted dump. This is something that's available in FreeBSD 12. So your kernel dumps can include your sensitive information, right? You might have passwords or keys sitting in RAM.
24:02
So you might need encryption to protect this information if you're in an untrusted environment. So a guy named, I forget, created encrypted dump. It's a strange name. I don't know where he's from. I think he's Polish.
24:21
So your kernel only supports AES-256-CBC, but the process is just slightly extended, right? So during this part over here, when we go from RAM to swap, we encrypt with a key set in rc.conf, and then later, after we do the normal savecore procedure,
24:42
we then run decryptcore, which I think you have to run manually, and that'll give you your core back. The dumpon(8) man page example is great. It tells you how to create your keys. It tells you how to run the processes. It's easy. The on-disk format is slightly altered from mini dump
25:01
just to add in a kernel dump key and key size to the kernel dump header. The kernel dump key consists of what algorithm you're using, some initialization stuff, and the symmetric key. They changed the panic string, so it's four bytes smaller. I don't think that's going to affect anyone.
25:23
But text dumps are not supported. Only full dump and minidump. This is also not yet in the paper. This was just recently added. So there's the mailing list announcement. And then there's two proposed core dump extensions that I'm sort of gauging for interest. The first one I'm going to do anyway,
25:41
whether or not you're interested, but the second one I'm interested in is, would anyone here by a show of hands be interested in live dump for your production systems? All right, we've got two. Three. Live net dump? Is that what you want? That's why we need the modular dump code.
26:01
Yeah, that'd be cool. So are you interested for the hangs, or what is it for you?
26:29
Because I think OS X has a networked DDB, which is really nice.
26:41
Exactly. Yeah, that'd be cool. No. So your dump won't be consistent. So things get weird.
27:01
Yeah, right. It seems to work pretty well. But yeah, your dump won't be consistent, especially the parts that you're messing with. And it seems to me that what you'd be interested in would change. I don't know. That's why I'm asking this question, is who thinks this is really what they need?
27:21
Yeah, totally. And you could just take another one if your first one didn't work out. Yeah, well, so you don't snapshot it, right? You just start going through it.
27:42
Like there's... Exactly, yeah. Because otherwise you'd have to halt your system to take that snapshot. So you just kind of let things go, and hopefully it works. So yeah, so the main product here of this talk
28:01
is actually the paper, because it started at university. It's a large org mode file, if you're familiar with Emacs. It includes tons of commit messages, emails, and code references. There's a bonus email from Jordan Hubbard that I thought was pretty funny when I asked him for help.
28:20
I'll let you go find it. It includes information not included in the PDF also, because there's some raw notes in there that I wasn't ready to publish at the time of publishing. So the paper actually starts at version five, and there's a bunch of other incomplete sections and stuff I thought wasn't ready for prime time.
28:45
Yeah, so there's also entire files of code in there from when I was trying to figure out what feature was where, and usually the file path as well. I still use that when I'm trying to reference these things. So I did want to take a look,
29:00
just to show you how complete it was. I don't know where I am on time, probably early. Yeah, this is originally a 45-minute talk, so let's see. So it turns out when you're looking at core dump information, there's sort of other things you can glean about architecture support
29:22
in an operating system. So you can figure out questions like when was Alpha support added? Well, FreeBSD 3.0. And then you can see also when it was removed, FreeBSD 7. I didn't add when Luna was taken out, but luna68k, yeah.
29:43
It was added in 4.4, and you can actually see it's not in there, but it would be removed in 12. So different sort of interesting things like that. It was a good part of the operating system to attach a history to, because it's obviously going to be there
30:00
from day one to the day the operating system's done. So that's pretty cool. And so you can go find all sorts of little gems like emails back and forth between me and Rod Grimes and different feature support.
30:20
I've got like two more slides. So here's a bunch of links. First is to my repository. Next is to the Unix history repo. That's where you should go take a look. That is probably my favorite repo on the Internet right now. You can just go. It's just a large Git repo, and each OS is a branch.
30:41
You can see how they're all connected. Next is Rod Grimes doing a lot of the work I talked about here, so you can find his work on kernel dumps there. I'd like to thank Deb Goodkin. I met her at a conference, and that's why I'm here. Rod Grimes for helping me read PDP-11 assembly.
31:02
And Michael Dexter, he actually came up with this idea that I should write this paper. And then at AsiaBSDCon, he was like, you didn't thank me, so there he is. Are there any questions?
31:22
Because you had to network dump it. I don't have the whole history of OS X, but it was definitely in there in the latest version. There was more change. There were two or three versions of XNU that I looked through, and there was a lot more change in that segment of code
31:42
than I thought there would be. But it got cleaned up, which was nice. They added comments. Go ahead.
32:06
Okay.
32:27
Yeah, that's pretty cool. Just for user space processes, though? Okay. Yeah, that would alleviate all the... You wouldn't have to support it in kernel.
32:53
Yeah, that's really cool. You should go talk to those backtrace.io guys.
33:04
We'll see. All I work on is BSD and soon illumos. Oh, yeah. If you could get at it, that'd be great. I'll think about it. First is dump to swap on zvol. If I could get that, I'd be so happy.
33:22
You guys, too, should volunteer on that project. Any more? Sorry.
33:43
Right. I'm not sure I understand. When you say live dump, do you mean the... Oh, okay. So if we can do a minidump size estimation,
34:00
why is it so hard to do a live dump? Ah. That's not a question I know how to answer, really. The live dump, I don't know really what goes into it. It might not be that hard. It's just someone... I think something no one has had interest in or even knew really existed.
34:22
Like I said, I've still yet to find somebody who's like, man, that solved my problem. And I've talked about core dumps for like six months nonstop. No, that... It wins.
35:14
And then it runs. It picks up a kernel. Its only job is to... To dump.
35:21
Yeah, I actually heard about that when Michael Dexter was like, you know what they should do is just load another kernel. And we're like, ha, ha, ha, that's ridiculous. And then we found out about this. Well, so what happens when that panics? There's like 1K of memory for this tiny, tiny kernel.
35:45
There you go. We thought that was so funny, but it would be nice for adding all this functionality because you have this clean state instead of like sort of picking up the pieces when your kernel panics.
36:03
Right. Yeah.
36:45
Yeah, things get weird once you're in a panic context. Is there any other... Any other questions? All right. Well, thanks for coming.
37:00
Sorry I ran a bit short.