
MINIX 3: a Modular, Self-Healing POSIX-compatible Operating System

Formal Metadata

Title
MINIX 3: a Modular, Self-Healing POSIX-compatible Operating System
Title of Series
Number of Parts
97
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
MINIX started in 1987 and led to several offshoots, the best known being Linux. MINIX 3 is the third major version of MINIX and is now focused on very high reliability and security. When you buy a TV set, you just plug it in and it works perfectly for the next 10 years. We are trying to make operating systems as good as that. The current version of MINIX 3 can detect device driver crashes and some server crashes and automatically replace the failed component without user intervention and without affecting running processes. The talk will discuss these aspects as well as new work.
Transcript: English (auto-generated)
OK.
Wait, I haven't given the talk yet. And I have a tendency to talk too long. So I have taken measures, preventative measures. Unfortunately, I am as sick as a dog. But I have a background in theater. And as we say, the show must go on.
And so I'll give it my best shot here. I hope you can understand me, though. Hello? Hello? Am I? Yeah? Yeah, that's OK? OK. I'm Andy Tanenbaum. And I'm going to talk about MINIX 3. And I'm sort of running the project,
but there's a whole bunch of other people, too. And I didn't have enough space to list everybody. But just take my word for it, there's a whole bunch of other people. Know what? Oh, no. Do something. Here we go. OK. Let me start with a brief history of MINIX. It's got a somewhat strange history,
and it's confused a lot of people. In 1975, Bell Labs released Unix version 6. It was the first one that got out of Bell Labs. It was a huge success. John Lions, a professor at the University of New South Wales, wrote a book describing version 6, a little booklet. It's become a real classic. People taught that in university courses
for a number of years. And then AT&T, in its wisdom, when it released version 7 in 1979, put a clause in the contract saying, thou shalt not teach this in courses. It's our product. We want to keep it secret. We don't want anybody to know about it. And so that was a brilliant move by AT&T's marketing department.
That's why Unix rules the world now. In 1984, I was teaching an operating systems course. And since we were forbidden by contract from teaching it, I decided to write a clone of this myself, which is probably a crazy idea. But I did. In 1987, I actually finished it. And I released it. And I wrote a book about it.
And so it got out there. It was mostly for teaching purposes to get around the stupid contract from AT&T. Then there was a second edition of the book 10 years later. And then we got interested in doing research on reliable operating systems. In about 2004, there's another version. The book came out. And then in 2008, everything changed.
I got a grant from the European Union, from the European Research Council. A couple of words about the grant. The grant was two and a half million euros. So this is a fair amount of money. And the goal was to develop a highly reliable operating system. So apparently, they think that reliability
is lacking now if they're willing to give me two and a half million euros to try to make one. So they at least thought this is worth some real money to make one. In 2009, there was a discussion within the EU, which didn't come to fruition. But there was a discussion of having software fall under the standard product liability laws.
If you're a tire manufacturer, and one tire in 10 million explodes, you can't say, tires explode sometimes. That's the way it is. Get used to it. It doesn't work. It's got to actually work all the time. With software, if it doesn't work, people say, it doesn't work. That's the way it is. If software fell under liability laws
like everything else does, all of a sudden, companies would be liable for selling software that doesn't work. That really changes the situation. And I've been interviewed a number of times where people say, well, what difference does it make? I said, suppose they go to Microsoft and say, by law, your software has to work. And they say, it can't be done.
You can't make a law saying, pigs can fly. And if the EU could say, well, Tanenbaum did it. Why don't you do it that way? Then they would say, well, we don't want to. That's not as strong as it can't be done. And so that's the direction we're going. A couple of words about software reliability.
Hackers, like all of us, have very much the view, if God had wanted software to be reliable, he wouldn't have created reset buttons. OK? Your grandma, however, thinks, why isn't it like a TV?
You buy it, you plug it in, and it works perfectly for the next 10 years. Why aren't computers and software like TVs? Because we all know, because the software is changing every 45 nanoseconds and so on. But a lot of people kind of want it to be like that. And so I think if your computer crashes once
every three months, you hit the reset button, you think it's really reliable. Suppose your car stopped working once every three months at random for no particular reason, and you knew from experience what to do. You get out of the car, take the key out, open the hood, close the hood again, get back in the car, put the key in, and it works. You say, no big deal. I think most people wouldn't like that from their car, even though the car manufacturer said,
what's the big deal? It costs you 10 seconds every three months, no big deal. And people wouldn't like that. But we're used to software not working, but grandma isn't. So the question is, can we make software that grandma actually might think is pretty good? I think the main cause of the problems is bloat in software. There's all that code, and it's getting bigger and bigger and so on.
Windows and Linux and everything else are growing like crazy. I saw this article in PC Pro magazine just a couple of months ago. And the headline is, Linus Torvalds says, Linux is bloated and huge. This is Linus saying this. This is not Bill Gates. So if Bill Gates said Linux is bloated and huge,
I'd say, you know, OK. Or if Linus had said, Windows is bloated and huge, OK. But even Linus thinks that Linux has got out of hand. There's certainly nobody who understands all of Windows. Maybe Linus understands all of the Linux kernel, maybe. But things have just gotten, I think, out of hand. So I have the feeling that there's a need to rethink operating systems.
And the research that we're doing is in that direction. We have basically infinite hardware. My little notebook here has 5,000 times the computing power of the PDP-11 I sort of started with. You know, it's smaller also. Not to mention it's 1/50th the price and has 1,000 times more memory and a disk 1,000 times bigger. Nevertheless, booting the notebook
takes about four minutes. And booting the PDP-11 took five seconds. And that machine was 5,000 times slower. What's wrong with this picture? OK. There's infinite cycles. There's infinite RAM. There's infinite bandwidth. And there's infinite bloated, useless, crappy software. And so you've got all this bad software.
And to become more like a TV, I think future operating systems have to become smaller, simpler, more modular, which is very important. On modularity: people have built more complicated things than operating systems. An aircraft carrier is much more complicated than an operating system. But aircraft carriers are very, very modular.
The designers understand that. So like if a toilet gets clogged on the aircraft carrier, it doesn't begin firing missiles. Because the toilet system and the aircraft system, the missile system, are separated very much. They understand you don't want problems in one part going over to another part. And likewise, if an incoming missile is detected,
the toilets don't start flushing. They really understand about building modular systems in most other worlds. And we don't. And make it reliable and secure. And I think self-healing is a major part of this. It has to fix itself. People don't understand it anymore. So that's where our stuff is going. I'm not talking about intelligent design, at least
that's applied to operating systems. I'm well-known for espousing microkernels. And I still believe in that. Ours is about 6,000 lines. L4, I think, is about 10,000 lines. Then there are the industrial systems, Green Hills and PikeOS and QNX; there's lots of industrial systems in avionics
and automotive that are microkernels. And they're all on the order of 5,000 to 10,000 lines of code versus 6 million for Linux. And I think Windows is probably above 100 million now altogether. People have done a lot of studies about how many bugs there are per line of code. If you find a bug, you just fix it. In industry, that's not the way it works.
Many companies have careful bug tracking systems. If you find a bug, you have to report it using some automated bug reporting system. And they get a log of how many bugs they found. And the experience is about 5 to 10 bugs per 1,000 lines of code in industrial projects where they're really careful about quality control. There's a study in FreeBSD, which is very, very good quality control,
three bugs per 1,000 lines of code. That's considered very, very good. At that rate, MINIX would have maybe 18 bugs in the kernel, and Linux would have 18,000. Now, of course, not all the bugs are fatal. Some may be spelling errors in messages and minor stuff. But if you get 18,000 errors, there are going to be some that are going to be important, even though many aren't.
Also, there was a study at Stanford of the Linux drivers. And they have about three to seven times more bugs than the rest of the kernel. Because everybody wants to look at the virtual memory algorithm. That's really cool and neat and fun. But nobody wants to look at the driver for some obscure printer, because it's no fun at all. It's a real mess. It's 100 pages of yuck.
And unfortunately, 70% of the code is the drivers. And they have an error rate three to seven times higher. That's where the problems come in. In Windows, it's known that 85% of all crashes are due to drivers not written by Microsoft. And they get the blame for it. But it's some random guy in Taiwan who was in a big hurry to get his driver out who wasn't very careful. And that's where the trouble comes from. So in a modular design with walls around the pieces,
you get a chance to do this. So our OS runs as multiple user-level processes. So the operating system runs basically in user mode. So here's the basic architecture of MINIX 3. The microkernel at the bottom is about 6,000 lines of code.
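The arithmetic behind those two figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the inputs are the round numbers quoted in the talk (three bugs per 1,000 lines, about 6,000 lines for the MINIX kernel, about 6 million for Linux).

```python
# Rough bug estimate from code size, using the figures quoted in the talk.
BUGS_PER_KLOC = 3  # the FreeBSD study figure: three bugs per 1,000 lines


def estimated_bugs(lines_of_code, bugs_per_kloc=BUGS_PER_KLOC):
    """Return the expected number of bugs for a code base of the given size."""
    return lines_of_code * bugs_per_kloc // 1000


minix_kernel_bugs = estimated_bugs(6_000)      # about 6,000 lines of kernel
linux_kernel_bugs = estimated_bugs(6_000_000)  # about 6 million lines

print(minix_kernel_bugs, linux_kernel_bugs)
```

The estimate scales linearly with size, which is the whole point of the comparison: the bug rate is assumed equal, and only the amount of code running in kernel mode differs.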
And it handles the interrupts, very basic notion of a process, scheduling, IPC, and the clock, and this thing called system. But we've already moved that out of the kernel. So basically, it's just handling the interrupts and the inter-process communication, and a very primitive notion of a process. It doesn't really manage the processes. Somebody tells it, there's a process.
Go run it. And it runs it without really understanding what it's doing. One level up in user mode are the driver processes. And each one is a separate process. So the disk driver is running as a user mode process in protected mode with the MMU turned on, just all by itself as a non-privileged process. And the terminal, and the network, and the printers,
and all these things, running as separate user mode processes with relatively little power. On top of that, there's another layer of user mode processes with the file systems, and process manager, and virtual memory manager, and all this other stuff, again, running as user mode processes. And the top layer are the regular user programs.
But from the kernel's point of view, they're just user processes. The file system is no different than make or the shell. It's just another user process. In all Unix systems, the shell is just another user program. And there's many shells, the Bourne shell, and the Bourne-again shell, and ksh, and a whole bunch of shells, C shell. In Windows and many other systems,
the shell is sort of built into the operating system. It's hard for them to understand how could the shell be a user program. It's that way for many Unix people. How could the file system be a user program? But it is, and the virtual memory manager is a user program, and so on. That's sort of the basic design. It makes it very modular, and there's a number of interesting properties. First of all, the kernel has some calls
that these drivers and servers can make, and these calls are different than the POSIX calls. These are internal calls for the benefit of drivers and servers. We also have the POSIX interface for user programs, but the kernel calls are low-level things like a user process can't read or write an IO port. They have no access to the IO system. If you wanna read or write an IO port,
you gotta ask the kernel, hey, here's a bunch of IO ports. Go read them for me, and it'll check to see if you're authorized to do that, and if you are, it'll go read them for you, and so on. There's stuff for setting interrupt vectors. There's things for copying between address spaces. The file server runs in its own address space. It can't talk to anybody else's address space,
so if somebody says to it, give me a block, it can't actually deliver the block because it doesn't have the authority to get outside of its address space, so it has to ask the kernel: go give him the block. There's all kinds of checks, which I'll describe later. You know, DMA mapping, assigning memory maps, setting up privileges, I don't know, all these things that are internal kernel calls,
about 35 or so of these kernel calls for the benefit of the drivers and the servers. Everything uses the principle of least authority, so they're running as user mode processes. Everybody's got very carefully granted powers, so they can't just do anything. For example, nobody can execute privileged instructions.
They can't touch the privileged registers. They're time-sliced, so if one of these things gets into an infinite loop, it doesn't hang the system. It simply wastes a certain fraction of the time, but not all of it. None of them can touch kernel memory, so they can't corrupt kernel memory due to a bug. They can't touch other address spaces. There's a bitmap per kernel call saying which kernel call you're allowed to do,
so if you're an audio driver, for example, and there's no reason for you to fork, you can't do the kernel call that's sort of the primitive for fork. There's also a bitmap saying who you can send to, so again, if you're the audio driver and you've got no business talking to the network driver, you can't do it. You'll try to send to the network driver, and the kernel will return an error
saying you're not authorized. There's no direct IO. You can't touch IO ports, so the disk driver can't get to the disk. It has to ask the kernel. It's a humbling experience for the disk driver not to be able to touch the disk, but that's life. It says can I please write on the disk, and it says yes, but if you're the audio driver and you say can I write on the disk, the answer is no.
That has the consequence, for example, that if a virus or something should get into your system and take over the audio driver, it can make really weird noises, but it can't take over the disk because when it tries to use the disk, it's told sorry, no permission. So there's a lot of security value in this, too. So the drivers are in user mode. They run as separate processes. They don't have any kind of superuser power.
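The per-process checks described here can be sketched as a small Python model. This is an illustration under stated assumptions, not MINIX code: the class, the three kernel call names, the process names, and the port numbers are all hypothetical, and the send permissions and port grants are kept as plain sets rather than bitmaps for readability.

```python
from enum import IntFlag


class KernelCall(IntFlag):
    """Hypothetical kernel calls, one bit each, as in a per-process bitmap."""
    DEVIO = 1  # read or write an I/O port on the caller's behalf
    COPY = 2   # copy between address spaces
    FORK = 4   # the primitive underneath fork


class NotAuthorized(Exception):
    """Raised when a process asks for something outside its granted powers."""


class Kernel:
    def __init__(self):
        self.call_mask = {}   # process name -> bitmap of allowed kernel calls
        self.send_mask = {}   # process name -> destinations it may send to
        self.port_grant = {}  # process name -> I/O ports it may touch
        self.ports = {}       # simulated I/O port contents

    def register(self, name, calls, send_to=(), ports=()):
        self.call_mask[name] = calls
        self.send_mask[name] = frozenset(send_to)
        self.port_grant[name] = frozenset(ports)

    def write_port(self, caller, port, value):
        # First check the kernel-call bitmap, then the port grant.
        if not self.call_mask[caller] & KernelCall.DEVIO:
            raise NotAuthorized(f"{caller} may not do device I/O")
        if port not in self.port_grant[caller]:
            raise NotAuthorized(f"{caller} may not touch port {port:#x}")
        self.ports[port] = value

    def send(self, src, dst, message):
        if dst not in self.send_mask[src]:
            raise NotAuthorized(f"{src} may not send to {dst}")
        return (src, dst, message)


kernel = Kernel()
kernel.register("disk", KernelCall.DEVIO | KernelCall.COPY,
                send_to={"fs"}, ports={0x1F0})
kernel.register("audio", KernelCall.DEVIO, send_to={"fs"}, ports={0x220})

kernel.write_port("disk", 0x1F0, 0x20)       # allowed: the disk's own port
try:
    kernel.write_port("audio", 0x1F0, 0x00)  # audio driver reaching for the disk
except NotAuthorized as err:
    print("denied:", err)
try:
    kernel.send("audio", "net", "hello")     # audio has no business with the network
except NotAuthorized as err:
    print("denied:", err)
```

Every request goes through the kernel, and the kernel's answer depends only on what that one process was granted when it was registered. A compromised audio driver in this model can still write to its own port, but nothing else.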
The MMU is turned on, so there's no special protection. They don't have access to the IO ports. To do anything, they've got to ask the kernel, which checks. To copy to other address spaces, they've got to ask. They're really not very powerful, which is intentional. There's a bunch of user mode servers. There's one or more file servers. There's a process manager.
There's a virtual memory server, data store, information server, network server. The X server, of course, is always a user process, and the reincarnation server, and I'll talk a bit more about these things in a couple of minutes. Let's start with the file server. That's one of the more interesting ones. Here's a picture of the file server. So here's a user program,
and it wants to read, you know, it does a read. So read is a library routine, and the library routine sends a message to the file server saying, I want, you know, to read, you know, 512 bytes from file descriptor six or whatever, okay? And here's the file system's little cache, these little colored thingies. If we're lucky, then the file system will have the block in the cache,
and it'll call the sys task in the kernel to say, please copy the block to the user space, because it's not allowed to do that. The file, the system task will say, okay, done it, and then the file system replies to the user saying, you know, read completed with, you know, error code if, you know, or no error code. And, you know, so there's like four messages
needed to do this, and you might say, how long is a message? It's about, I don't know, 500 nanoseconds, that order of magnitude. So there's a little bit of overhead, but it's not immense. Now, let's look at a more complicated case when the block isn't in the cache. The user says to the file system, copy block. The file system looks in the cache, can't find the block, sends a message to the disk driver saying, go get the block.
The disk driver, you know, sends a message to the system task saying, here are some IO ports, please write these values on the IO ports. So, you know, get the block. Then it waits. Then there's a notification from the hardware into the disk driver, which is like a message, which says basically, okay, you know, disk, you know,
has finished activity, wants something from you. And so then it goes and checks the, you know, does some reads from IO ports, finds out what the, whoops, finds out what the status is. And if all is well, the disk driver notifies the file system, saying, you know, I got the, you know, I did the IO for you, and so on. And then the file system calls the system task to say copy it to the user space,
and it copies it to the user space, and then it tells the user it's done, okay? So there's like nine messages in here at 500 nanoseconds each, so we're talking about burning up four and a half microseconds in this kind of overhead plus a bit more. But we've actually read a block from the disk, and that takes several milliseconds, okay? So another five microseconds more or less
really isn't a central issue here, okay? Now the process manager: it manages processes. It contains the logic for starting and terminating processes. I think signals are in there too. They had to go somewhere. But, you know, the basic logic for keeping track of things is in this process
called the process manager, okay? This is a virtual memory manager. It has the logic of the virtual memory, but not the mechanism. So the kernel, you know, the virtual memory manager says to the kernel, here is the memory map for this process, and the kernel just says yes, sir, and just takes it and doesn't, you know, if it's stupid, well, it'll crash. It doesn't know, it just follows instructions.
All the logic for keeping track of the pages and setting up the maps and so on, that's all done in this user-level virtual memory manager, okay? So it knows where the free pages are. It knows, you know, who's got what. You know, page faults all get redirected to it. It figures out what to do about the page fault. So all the smarts and all the algorithms are in the virtual memory manager as a user process.
When it's all done, it builds the appropriate page map and just tells the kernel, here's the page map for process number six, okay? The datastore is a little name server where you can say, here, save this, and then you can ask for it later, and there are reasons for that, which I'll come to in a minute. It could be used, and in fact is being used, for recoverable drivers, okay?
Very briefly, what that means is, if you want a system to be reliable and be able to survive a failure in a driver, some of the drivers have a small amount of state, some have a large amount of state, but for example, an audio driver knows what the volume levels are for the audio devices, and it stores those in the datastore, and if the audio driver ever crashes,
it's replaced by a new audio driver, which goes to the datastore and says, you know, give me the appropriate data for me, and then it gives it back the audio levels, and the new driver can set the audio levels to whatever the old driver had set them. The information server is basically for debugging. All the F buttons in MINIX display various debugging dumps,
so you can see what's going on. There's the network server, it's a complete TCP/IP stack, runs entirely in user space, whole thing runs there, and then there's the reincarnation server, which is kind of a fun thing, which most systems don't have. The reincarnation server is about reviving the dead, okay?
It's the parent of all the drivers and the servers, okay? Whenever a driver or server dies, the reincarnation server collects it, and sort of fixes things, okay? And there's a table which it looks at, and that tells it what to do. It also pings the drivers and the servers frequently, so the reincarnation server will go to a driver and say,
hey, disk driver, how you doing? And the disk driver will say, fantastic, I did 57 megabytes a second in the last second, the disk is really humming, we're going great. And then three seconds later, it'll ping the disk driver again and say, how you doing? And the reincarnation server says, hmm, not good.
Okay, disk driver, how you doing? Okay, I'll give you one more chance, okay? Now either you answer or you're toast. And so, at this point, the reincarnation server,
you get three chances, sorry about that. Okay, I'm killing you. So it kills its child, goes and fetches a new one, starts it up again. The new one goes to the data store and says, do I have any state, which we haven't really got for complicated state, but for simple state, disk driver doesn't have any state, so it's easy. It starts up a new one. A new one tries to patch up the pieces,
and it may be, depending on the nature of the driver, you may or may not screw up one or more processes, but you don't screw up the whole system. And in most cases, you don't screw up anything. Well, in some cases, you don't. It's tricky, that's what the research is, for example. The file server knows it gave a command to the disk driver and didn't get an answer, so we have to deal with that. I'll show that in a second.
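The ping-and-replace cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the dictionary used as a data store, and the constant of three missed pings are all hypothetical and are not taken from the MINIX source.

```python
from dataclasses import dataclass, field

MAX_MISSED_PINGS = 3  # assumption: the talk only says "three or whatever the constant is"


@dataclass
class Driver:
    """Toy stand-in for a user-space driver process (hypothetical)."""
    name: str
    state: dict = field(default_factory=dict)
    hung: bool = False

    def ping(self) -> bool:
        # A healthy driver answers; a crashed or looping one stays silent.
        return not self.hung


class ReincarnationServer:
    """Hypothetical supervisor: pings its children and replaces silent ones."""

    def __init__(self):
        self.datastore = {}   # driver name -> small saved state
        self.drivers = {}     # driver name -> current Driver instance
        self.missed = {}      # driver name -> consecutive unanswered pings

    def start(self, name):
        # A fresh driver asks the data store whether it has any saved state.
        driver = Driver(name, state=dict(self.datastore.get(name, {})))
        self.drivers[name] = driver
        self.missed[name] = 0
        return driver

    def heartbeat_round(self):
        """Ping every driver once and return the names that were restarted."""
        restarted = []
        for name in list(self.drivers):
            if self.drivers[name].ping():
                self.missed[name] = 0
                continue
            self.missed[name] += 1
            if self.missed[name] >= MAX_MISSED_PINGS:
                self.start(name)   # discard the old instance, start a new one
                restarted.append(name)
        return restarted


rs = ReincarnationServer()
audio = rs.start("audio")
audio.state["volume"] = 7
rs.datastore["audio"] = dict(audio.state)   # the driver publishes its state
audio.hung = True                            # simulate an infinite loop
rounds = [rs.heartbeat_round() for _ in range(MAX_MISSED_PINGS)]
print(rounds)                      # the restart happens only on the third round
print(rs.drivers["audio"].state)   # the replacement starts with the saved volume
```

The sketch only covers small state, like the volume level in the audio example; recovering components with a lot of state is the part the talk describes as still being research.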
So here's the recovery, for example, for disk drivers. So the user sends a message to the file server saying read from a file, okay? And the file server sends a message to the disk driver saying read block, and the disk driver now crashes. Okay, so what happens next? The reincarnation server detects this, because it doesn't answer its pings, and it fetches a new driver.
You might ask, how does the reincarnation server fetch the disk driver from the disk when there's no disk driver? And the answer is, it has a shadowed copy of the disk driver in memory all the time, so it can always get to the root device from its memory copy. And once you've got a working root device, you can keep all the other drivers on the root device, so you can always fetch the rest of them.
So even a failure of the disk driver isn't fatal. And then, a message is sent to the file server saying, I have some bad news for you. Your friend, the disk driver, has passed away. We're really sorry about that. The file server says, all right, I wonder if there's a new one. So it goes to the data store and says, by any chance, is there a disk driver?
And the data store says, you're in luck. A new one just appeared a couple of microseconds ago. Here's its address, go for it. Then the file server says, oh, new disk driver. I'm trying to write a block here. You think you could do it for me? And the disk driver says, I'd love to write blocks. It's my favorite activity in the whole world, besides reading blocks, is writing blocks.
So yes, I'll do that, and so you're back in business. And the user process doesn't notice this. It's all transparent. You're replacing parts of the operating system on the fly, while it's running, without the user process being disturbed. That's sort of the whole idea. Now, we're not finished with all of this yet. We can only do the simple parts now. But we're working on, if the file server goes down, with all that state, that's much more complicated.
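The recovery path from the file server's point of view might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `DriverDied` exception, the `"disk"` key in the data store, and the `reincarnate` callback that stands in for the reincarnation server publishing a new driver.

```python
class DriverDied(Exception):
    """Raised when a request went to a driver that crashed before answering."""


class DiskDriver:
    def __init__(self, blocks, crashes=False):
        self.blocks = blocks
        self.crashes = crashes

    def read_block(self, number):
        if self.crashes:
            raise DriverDied("disk driver stopped answering")
        return self.blocks[number]


class FileServer:
    """Hypothetical file server that re-resolves its driver and retries."""

    def __init__(self, datastore, reincarnate):
        self.datastore = datastore      # name -> current driver endpoint
        self.reincarnate = reincarnate  # stands in for the reincarnation server

    def read(self, block_number, retries=1):
        for attempt in range(retries + 1):
            driver = self.datastore["disk"]   # always look up the endpoint again
            try:
                return driver.read_block(block_number)
            except DriverDied:
                if attempt == retries:
                    raise                      # give up and report the failure
                self.reincarnate()             # a new driver gets published


blocks = {0: b"superblock"}
datastore = {"disk": DiskDriver(blocks, crashes=True)}


def reincarnate():
    datastore["disk"] = DiskDriver(blocks)


fs = FileServer(datastore, reincarnate)
print(fs.read(0))   # the caller never sees the crash
```

The point of the design is that the pending request lives in the file server, so the user process that asked for the block only sees a slightly slower reply.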
We're not really dealing with stateful things yet, except for small amounts of state, like audio drivers. But this is the basic idea in where we're going. Okay, so the system is self-healing. It detects its own problems, and it can fix itself to some extent. Okay, what about crashes of other drivers? Well, Ethernet is very easy,
because Ethernet is unreliable to start with, that is, it's best effort. If it goes down, you lose some packets. But nobody ever guaranteed that packets were gonna be delivered, so the higher-level protocols, if it was a UDP packet you lost, well, nobody ever guaranteed that UDP packets were gonna be delivered, so nothing's wrong. If it's a TCP packet you lost, TCP itself times out and tries again,
so that's really easy. The printer goes down, you don't know where you were. You're sort of halfway in the middle of a file or something, so all you can do is the printer daemon detects the problem and just prints the file again. There's really no way out. It's theoretically impossible. And audio, it doesn't know where it was, so it can play the song again. So sometimes you can recover completely gracefully,
and sometimes it's hard to do. But still, if the audio driver goes down, and the result is this little small glitch, and then the song plays again from the beginning, it's better than an operating system crash. No, it's not totally transparent. It's very hard to make it totally transparent, but we're trying to get as close as we reasonably can. The kernel has many reliability and security features.
There's less code, so there are fewer bugs. If you're talking three bugs per thousand lines of code, and there are thousands of lines of code less, there's gonna be fewer bugs. It also means the trusted computing base, the stuff that actually has to work, is much smaller than in conventional systems.
There's no foreign code in the kernel. If you go out on your Linux or Windows or whatever PC, and you buy a FireWire card, it comes with a CD-ROM, and you put the CD-ROM in the drive, and guess what happens? Random code written by somebody in Taiwan gets installed in your kernel. It may or may not work. The kid may or may not have been in a hurry.
Management may or may not have cared about the quality of any of this stuff. And there's this foreign code sitting in your kernel now. You really don't want that. It's a very, very bad design. In the MINIX design, you'd have a new user process started with the FireWire code in it, which might or might not work, but if it didn't work, it wouldn't take the system down. It would just take FireWire down. I think that's the modularity principle.
I think in principle, a better design. We also have static data structures. RAM is cheap, and so there's no malloc in the kernel. Tables are all fixed size. The number of processes that you can have is fixed by a constant in some header file, so I think it's 256 processes. If you ever run above that, it says full,
but that means no malloc and no free and no memory leaks, so we waste a little bit of RAM by over-dimensioning various tables. It seems in this day and age, with memories, you can't buy a computer with less than a gigabyte, so we waste a couple of thousand bytes on bigger tables. It's really not a bad trade-off.
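A fixed-size table of this kind can be illustrated with a short Python sketch. The name `NR_PROCS` and the value 256 are assumptions (the talk only says "I think it's 256"), and a real kernel would of course use a static C array rather than a Python list.

```python
NR_PROCS = 256  # assumed compile-time constant


class ProcessTable:
    """Toy process table whose storage is allocated once and never grows."""

    def __init__(self, size=NR_PROCS):
        self.slots = [None] * size   # over-dimensioned up front, no later allocation

    def allocate(self, name):
        """Return the index of a free slot, or None when the table is full."""
        for index, slot in enumerate(self.slots):
            if slot is None:
                self.slots[index] = name
                return index
        return None                  # "full": the caller has to cope with this

    def release(self, index):
        self.slots[index] = None     # the slot is reused, nothing is freed


table = ProcessTable(size=2)
print(table.allocate("fs"), table.allocate("pm"), table.allocate("extra"))
```

The trade-off is the one stated in the talk: some memory is wasted on slots that are never used, and in exchange there is no allocator in the kernel that could leak or fail in unexpected places.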
Of course, moving bugs to user space doesn't reduce the number of bugs, right, the same amount of code. It just makes them less powerful. A bug in a user space process can do less damage, so we have the same number of bugs as everybody else does. It's just that these bugs are sort of emasculated, you know, with these castrated bugs up there, which, you know, can't do very much.
Okay, IPC reliability and security. We have fixed-length messages, although we're having to rethink that a little bit now. So there's no buffer overruns. Initially, we had a rendezvous system. You know, you sent somebody a message, he did a receive, the message was transferred, and everybody's happy. We're having to move away from that for reasons I'll say in a second. This had no lost messages, there's no buffer management.
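To make the fixed-length idea concrete, here is a minimal Python sketch using the standard `struct` module. The layout (two integers and a 24-byte payload) and the names are assumptions chosen for the example, not the actual MINIX message format.

```python
import struct

PAYLOAD_SIZE = 24
MESSAGE_FORMAT = "<ii24s"                        # source, type, payload (assumed layout)
MESSAGE_SIZE = struct.calcsize(MESSAGE_FORMAT)   # always 32 bytes


def pack_message(source, mtype, payload=b""):
    """Build a message of exactly MESSAGE_SIZE bytes, refusing oversized payloads."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in a fixed-length message")
    # struct pads a short payload with zero bytes up to PAYLOAD_SIZE.
    return struct.pack(MESSAGE_FORMAT, source, mtype, payload)


def unpack_message(raw):
    """Decode a message, rejecting anything that is not exactly MESSAGE_SIZE bytes."""
    if len(raw) != MESSAGE_SIZE:
        raise ValueError("message has the wrong length")
    return struct.unpack(MESSAGE_FORMAT, raw)


raw = pack_message(source=9, mtype=2, payload=b"read block 7")
print(len(raw), unpack_message(raw)[:2])
```

Because every message has the same size, the receiver can reserve one buffer of that size and no sender can overrun it.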
It was a very simple scheme. Interrupts and messages are unified, so if an interrupt happens, it's turned into a message by the low-level kernel, and the driver just gets a message saying, you know, message from disk hardware, and then it's up to the disk driver to, you know, figure out what to do next. There's one problem we've run into with reliability.
The client sends a message to the server, and the server tries to respond, but the client has died, that hangs the server, because it can't do the respond, okay? And so we're having to go over to asynchronous messages a little bit, which I don't like, but it's really necessary in case, to avoid hanging servers. You know, everybody worries about, you know, sick servers,
but we also worry about sick clients, and so we may have to go over that. Anyway, the drivers are, we regard drivers as untrusted code. In every other system, the drivers are regarded as trusted code, and experience shows that we're right. They are, it is untrusted code. And bugs and viruses and stuff can't spread from module to module easily,
because they're so isolated. None of the drivers or operating system processes can touch the kernel data structures, so they can't mess them up, okay? If a bad pointer occurs, you know, and in C, bad pointer errors are very common, it wipes out one driver. It crashes, and its parent, the reincarnation server, is informed via a signal that one of its children has died.
It says, oh, you know, the disk driver has died, look up in the table, what am I supposed to do? It runs a shell script, the shell script is likely to, you know, log the event somewhere, possibly take the corpse, you know, the a.out, you know, the core dump, and save it somewhere for future debugging. It might send an email to the administrator, you could set it up to send an email to the manufacturer, there's all kinds of things you could do.
It's running a shell script. When it's all done, it starts the driver again. Okay? If a driver or a server gets into an infinite loop, then when the, you know, the reincarnation server says, how you doing? There's no answer, because it's in this infinite loop, and after three or whatever the constant is, number of tries, then it kills it and starts a new one.
And so, we can survive infinite loops inside critical components of the operating system. I mean, nobody else can do that. And again, because these things do not run as superuser, they're regular user processes running at the lowest possible privilege level, they can't do a lot of damage if something goes wrong. Let me describe memory grants, which is an important concept that we have.
You know, file server and some other processes need to access memory from another process. So if you say to the file server, read a block, it's gotta write the block into your address space. But it can't touch your address space, because it's just a humble user process.
So how do we do this? Every process that wants to have somebody else write into its address space, builds a memory grant table, okay? And it makes a kernel call saying, here is my memory grant table, here's the starting address, and it's got so many entries in it, okay? So the kernel knows where everybody's memory grant table is. And it could make in its memory grant table
an entry saying, the disk driver, process nine, may write in my memory from bytes 1400 to 1499. So it puts in the exact amount accurate to the byte. There's no page alignment required, okay? And then, when the user wants to talk to the file server,
or the file server wants to talk to disk driver, it passes an index saying, you have the authority to write in my address space using memory grant number one, okay? And then, when the disk driver, you know, wants to write into the file server's address space, it says to the kernel, I want to use this, you know, the memory grant to write into his address space.
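A rough Python model of a grant table and the kernel-mediated copy may help here. All names (`Grant`, `Kernel`, `copy_with_grant`) are hypothetical and the real kernel interface differs; the sketch only shows the byte-accurate check and the fact that the kernel, not the driver, performs the copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    grantee: int   # process that may use this grant
    start: int     # first byte the grantee may write
    length: int    # number of bytes covered, no page alignment needed


class Kernel:
    """Toy kernel that owns all memory and checks grants before copying."""

    def __init__(self):
        self.memory = {}        # pid -> bytearray (that process's address space)
        self.grant_tables = {}  # pid -> list of Grant (None marks a revoked entry)

    def register_grants(self, owner, table):
        self.grant_tables[owner] = table

    def copy_with_grant(self, caller, owner, grant_id, offset, data):
        table = self.grant_tables.get(owner, [])
        if not 0 <= grant_id < len(table) or table[grant_id] is None:
            raise PermissionError("no such grant")
        grant = table[grant_id]
        if grant.grantee != caller:
            raise PermissionError("grant belongs to another process")
        if offset < 0 or offset + len(data) > grant.length:
            raise PermissionError("copy falls outside the granted bytes")
        begin = grant.start + offset
        self.memory[owner][begin:begin + len(data)] = data   # kernel does the copy


FILE_SERVER, DISK_DRIVER = 3, 9
kernel = Kernel()
kernel.memory[FILE_SERVER] = bytearray(4096)
grants = [Grant(grantee=DISK_DRIVER, start=1400, length=100)]  # bytes 1400 to 1499
kernel.register_grants(FILE_SERVER, grants)
kernel.copy_with_grant(DISK_DRIVER, FILE_SERVER, 0, 0, b"block data")
grants[0] = None   # the owner erases the grant once the operation is done
```

After the last line, any further attempt by the driver to use grant 0 is refused.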
And then the kernel checks the table to see, is this guy authorized, and which bytes is he authorized? And if so, the kernel does the copy accurate to the byte. So it can protect memory down to the byte granularity, but having the kernel do the copying instead of the other guy doing the copying. Now this introduces a little bit of overhead, it's true. But it protects the address space, so you can say to somebody,
you can only write in this little piece of my memory and no more. And after the operation is completed, you know, normally you'd erase the grant, so he can't use it again. Okay, so this protects memory down to the byte level granularity. Fault injection. We ran experiments injecting faults into drivers
to see how good it was. And so there's a program you can get which generates faults in binary code to test it. And we injected 800,000 faults in each of three different, I think it was network ethernet drivers, okay? So this fault injection program runs in real time,
as another process, you know, using ptrace kind of calls. It generates bugs in your code. It doesn't just generate junk, what it does is it analyzes the code, understands the machine architecture, and then it does things like, if you said, you know, move, you know, one register to another register, it will swap those.
So it simulates the error if you wrote, you know, i = j and you meant j = i. So it's a very, you know, it's simulating semantic errors that programmers might make. It looks for conditional branches and changes the condition. So it finds a branch less than, it turns it into a branch less than or equal to. This simulates the error, you know, of changing for (i = 0; i < n; i++)
to, you know, for (i = 0; i <= n; i++). A typical error a programmer makes, this program generates errors that are common programming errors. And we injected like, you know, 800,000 faults in these things. The way we did it was we injected 100 at a time, so we took the driver, put 100 faults in it,
changed the 100 pieces of code, and we waited one second to see if it crashed. Many times it didn't crash because it never executed the code we messed up in that one second, or the way we messed it up wasn't really important. If it didn't crash, we injected another 100 faults and repeated the process, and then we waited until it crashed, okay?
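The two mutations mentioned in the talk can be mimicked on a toy instruction list in Python. This is a simplified illustration under assumed names: the real tool rewrites machine code in a running driver, whereas `mutate` and `inject_faults` here only operate on tuples, and `sum_items` shows why the changed branch condition crashes.

```python
import random


def mutate(instruction):
    """Apply one of the two fault types described in the talk."""
    op, a, b = instruction
    if op == "mov":
        return ("mov", b, a)   # operands swapped: i = j becomes j = i
    if op == "blt":
        return ("ble", a, b)   # branch condition changed: < becomes <=
    return instruction


def inject_faults(program, count, rng):
    """Return a copy of the program with up to `count` instructions mutated."""
    mutated = list(program)
    candidates = [i for i, ins in enumerate(program) if ins[0] in ("mov", "blt")]
    for i in rng.sample(candidates, min(count, len(candidates))):
        mutated[i] = mutate(mutated[i])
    return mutated


def sum_items(items, faulty=False):
    """Loop with bound i < n, or with the injected bound i <= n."""
    n = len(items)
    limit = n + 1 if faulty else n
    total = 0
    for i in range(limit):
        total += items[i]      # raises IndexError when i == n
    return total


program = [("mov", "i", "j"), ("blt", "i", "n"), ("add", "i", "1")]
print(inject_faults(program, 100, random.Random(0)))
```

Calling `sum_items([1, 2, 3], faulty=True)` raises an `IndexError`, which plays the role of the driver crash that the reincarnation server would then clean up.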
Well, we eventually got 18,000 crashes out of this stuff, but the operating system never crashed despite running two and a half million trials. We found a lot of things. We found bad hardware sometimes in the 18,000 crashes. We found ways to hang the PCI bus that couldn't be recovered other than pulling the plug out. We found all kinds of wonderful things,
mostly crappy hardware, but the operating system never crashed in all these trials, so we think it's relatively robust because we crash the drivers all the time, but not the operating system, okay? So that's sort of the basic technical story. There's a lot of academic projects where they build a little tiny kernel and it does something to do whatever,
but it only works when the grad student is there and it can't actually run anything because it's too much work to do all the rest of the stuff, but we've actually made a real effort, maybe too much of an effort, to make it an actual sort of usable UNIX system, okay? It's not state-of-the-art. It's not nearly as fancy as UNIX or as a Linux or BSD
or any of those things, but it's a usable, recognizable UNIX system, so you could actually believe it could work. For the screen, there's X11, and we have a simple desktop program, EDE. We don't have kernel threads yet, although it's on our agenda, so we can't run things like GNOME and KDE, but EDE gives you a desktop if you want,
but it's our experience that a lot of programmers just want X11 and a bunch of xterms, and that we have. We have a bunch of shells, Bash and the public domain Korn shell and Z shell and other shells. We've got compilers for C and C++ and Python and Perl and PHP and a number of the other sort of standard languages. We actually have two C compilers.
We have GCC, which is a big pain, and we have ACK, our own compiler, which is, in some way, it's not as fancy, and it only implements ANSI standard C. I wish GCC implemented ANSI standard C, but it doesn't, and it's much, much faster. We can build the entire operating system, all of MINIX, the kernel, all the drivers in user space,
all the servers in about eight or nine seconds with ACK. So that's a relatively fast build for 125 compiles and 11 links in less than 10 seconds. Editors, we have Emacs, we have nvi, Vim, Vile, NEdit, you know, standard editors.
Photos, we have ImageMagick, JPEG package, XV. Utilities, we have all of the Version 7, you know, standard real Unix utilities. We also have all the GNU utilities. We also have the BSD utilities. We put them in different directories. So if in your path, you put /usr/gnu as the first thing in the path, then it'll find the GNU utilities first,
and you're getting GNU cp and GNU ls and GNU everything. If you put, you know, the BSD directory in the front of your path, then you get the BSD utilities as first choice. If you don't put either of these, you get our utilities as the first choice, the Version 7 utilities. So you can sort of choose which environment you want by setting up your path with the appropriate directory in the beginning.
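The effect of path order can be demonstrated with Python's `shutil.which`, which searches a list of directories the same way a shell does. The directory layout below is created in a temporary folder purely for the demonstration; the directory names are assumptions and the sketch expects a Unix-like system.

```python
import os
import shutil
import stat
import tempfile


def make_tool(directory, name):
    """Create an empty executable script so the lookup has something to find."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(command, directories):
    """Return the first match for `command`, searching directories in order."""
    return shutil.which(command, path=os.pathsep.join(directories))


root = tempfile.mkdtemp()
gnu_dir = os.path.join(root, "usr", "gnu")   # stand-in for the GNU directory
bsd_dir = os.path.join(root, "usr", "bsd")   # stand-in for the BSD directory
bin_dir = os.path.join(root, "bin")          # stand-in for the default utilities
gnu_ls = make_tool(gnu_dir, "ls")
bsd_ls = make_tool(bsd_dir, "ls")
default_ls = make_tool(bin_dir, "ls")

print(resolve("ls", [gnu_dir, bin_dir]))   # GNU first
print(resolve("ls", [bsd_dir, bin_dir]))   # BSD first
print(resolve("ls", [bin_dir]))            # default utilities only
```

Whichever directory comes first in the search list wins, which is all that is needed to switch between the three sets of utilities.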
We have some stuff for the web, Apache, Dillo, Links, Lynx. We don't have Firefox, we'd like that, but I think it needs kernel threads, and it's a very big, hairy program, and I don't think it's very portable, but we'd like to have it. If somebody wants to port Firefox, that'd be wonderful. We'd really love it. Mail programs, Pine, Poptart, XM, Sim, and so on. Database, Postgres, SQLite, MySQL client,
we don't have the server. QEMU, which is nice, you can emulate, you can run Windows 98 on top of Minix, and it runs at the same speed Windows 98 ran on the hardware. We can't run Win 7, it's just too big and complicated, it barely runs on current hardware. But Windows 98 ran on 60 megahertz Pentium 1s,
and QEMU slows you down by a factor of 10. So take a modern machine, it's like a 300 megahertz Pentium 2 and Windows 98 runs perfectly on that. So it's running at a better speed than Windows 98 ran on the hardware it was designed for.
Then we have MPlayer and NetHack, we have Subversion, it's about 600 programs available for it. So it's not a full-blown system with thousands of packages, but it's enough to prove the point that it's doable, and you could actually use it. Does anybody care, does anybody use this stuff? Well, we have, of course, we log the traffic to the website, minix3.org.
Here is the number of visits per month we've had over the last year. It's running about 25,000 visits a month, okay? We've had 1.6 million visits since the site went up in 2005, about four years ago, actually, at the end of 2005. Compared to Linux, this probably isn't very much.
But for those of you who work on other open source or academic projects, you know, 1.6 million visits is probably a fair number compared to other comparable open source, you know, low-budget kind of projects. Downloads per month, it looks like we're running, I don't know, 12,000 downloads a month for the last year. So we've had 610,000 downloads since, you know, like about four years.
So again, I mean, compared to Linux, this is nothing. But compared to other open source projects of other, you know, kinds of things, 600,000 downloads, you know, there is some interest in there, in the world, okay? The current team is me. I've got five PhD students working on this. There's one postdoc. I've got two student assistants.
I've got three paid full-time programmers. I've got a couple of master's students doing work on it. We had four Google Summer of Code students last year. And there's various volunteers around the world working on all kinds of things. So this is sort of the group. This is an ad, which is why I came here, of course. We're always looking for help to work on new things,
volunteers, for example, to port programs. Porting programs, you know, I don't know. Everybody here has probably taken a course in software engineering somewhere along the line, and all of you forgot it the day after the exam. It's just amazing how much free software is so crappy. It's not portable. You know, you run ./configure, and it always fails.
Always, you know? You know, what happens? It was looking for, I don't know, Perl 5.2.3.26.9.4B, and if that wasn't there, it gives up, even though the application you're trying to install doesn't use Perl at all, okay?
You know, these configure scripts, configure scripts of 20,000 lines of incomprehensible code generated by a program, which itself was generated by a program. It's just not the way to do things when the application in question didn't even need this facility. So porting programs, you know, typically it's one or two lines have to be removed from it,
but it takes some effort to figure them out, so that's a problem. Porting libraries of all kinds would be useful. We have a wiki adding documentation to the wiki would be, you know, fantastic. All kinds of things are not well-documented. Translating the wiki into other languages is very welcome. Now, the real thing, I actually have money from this EU grant to hire more people.
I spent most of the money on the PhD students and so on, but I'm looking for a fourth full-time paid programmer. So if you're looking for a job or you have a job now you don't like, you know, you're doing web design for a company making bathroom equipment or something and you don't really like the job much, and you'd rather do kernel hacking and be paid for it, this is your chance, so get ahold of me, you know.
So I'm looking for a fifth programmer for another related project, so basically I have two openings for paid programmers in Amsterdam. So if you're interested, try to see me later or at least mail me your CV. You can find my, just type my name into Google, you can find my homepage pretty easily. So you can do all this hacking as a full-time paid job,
Amsterdam's a lovely city, and so on. There's more on my homepage, or just Google me and find the homepage. What's the current work? We're working on a live update. We wanted to replace pieces of the system while it's running, so, you know, if you can replace it after a crash, doing it intentionally is, you know, easier in some sense, although in some ways it's harder.
Like in most operating systems, if you're using, you know, version 6.2.4.13, and there's now a .14 that's come out, they tell you reboot the system. We don't want to reboot the system, we want to go over to the new version while it's still running and while all the processes are running. That, you know, there you are, you know, your TV used to be, you know, hardware,
now your TV is software, it's got a little box, you know, which is, you know, 40 megabytes of code running in there, and there it's the Super Bowl and it's the last couple of minutes, and you get this message, rebooting, and it takes seven minutes, and it reboots with the new version of the operating system and you've missed, you know, the final couple of goals, you know, you sort of want to replace the system, you know, while it's running.
Banks don't like going down and so on, so we think it's important to try to be able to update the system, component for component, while it's running and while things are going on, and during the update process it might slow down a little bit for a couple of seconds, but not have to have a reboot. Multi-core has become very common, and we've got a multi-server operating system.
The operating system itself runs as many processes. Suppose you had a lot of cores and you had a lot of processes. Well, you know, it might occur to you, gee, we have a lot of processes and we have a lot of cores, can't we sort of make that match together somehow, like run each process on its own core, for example? So, but it's not so easy, but we're looking at that whole area of sort of multi-server meets multi-core.
We're rethinking the file system. The current file system is basically the Multics file system from 1965. The only major change, really, is in Multics they use the greater than sign as the separator, and now we use the slash and Windows uses the backslash. But other than that, it's the Multics file system, essentially. It's now 2010, let's rethink the file system, you know, so we're looking at that.
And we're trying to recover stateful services, which is a bit tricky, okay? The license, which is always, you know, an issue, it's the BSD license, which says do whatever you want, except sue us, but other than that, do whatever you want. I know there are religious wars. I was a keynote speaker at the Linux conference in Australia a couple of years ago,
and I didn't have the slide then, and somebody asked me at question time, like, what's the license? And I said, it's the BSD license. I was expecting tomatoes to be headed in my direction. There was a huge cheer from the audience. A Linux group cheering the BSD license on, anyway. But for better or worse, it's the BSD license. We have a lot of GPL packages, but they're all sort of add-ons. You know, in theory, if they got annoyed at us,
we could put all the packages on a separate CD-ROM, and you'd have to have two CD-ROMs, but whatever. Okay, positioning of MINIX. We're trying to show that multi-server systems, it's not about microkernels, it's about multi-server systems, show they're reliable, demonstrate that drivers belong in user mode, highly reliable and fault-tolerant applications, possibly future $50 single-chip,
small RAM laptops for the third world, you know, the one laptop per child, the next generation of that, embedded systems. We have a logo, that's a raccoon, wild raccoon, everybody has to have an animal logo. Small, it's cute, it's clever, it's agile, and most important, it eats bugs.
Okay, conclusion. Current systems are bloated and unreliable. It's an attempt to build a reliable operating system. The kernel's quite small. The OS runs as a collection of user processes. Each driver is a separate process. The operating system components have restricted privileges.
We replace the drivers on the fly. We have a website, minix3.org, go there and find all kinds of stuff. We have a Google News Group. You know, you can talk to us and you can answer questions and the whole thing. We have a wiki, please contribute to it. I also have CD-ROMs. I didn't realize the scale of this meeting, so I didn't bring very many. The feeding frenzy, but you can download it,
make your own CD-ROM, only these are the official ones with, you know, our little thing on it, but it's the same thing on the website. So I have the CD-ROMs and whatnot. Okay, questions.
Hello. Are there any similarities between what you're trying to achieve with what Chrome OS is trying to achieve? With which is trying to achieve? Is there, are there any similarities between the design philosophies that you're taking
and the ones that Chrome OS are taking or are they completely different? Chrome OS, Google Chrome OS. Chrome. Crow? Chrome. Chrome OS. Chrome, Chrome, Google. Chrome, I think, is trying to run basically only one program that's their own browser. And for many people, that may be enough.
All they want is a reliable browser. I mean, internally, I don't know, I haven't seen the source yet of that, but it's a similar kind of idea, only this is a real operating system. You can run all of the Unix stuff on it and not just the browser. So there's philosophically a certain difference, but you know, I don't know what their architecture is of Chrome, so it may or may not be technically similar,
but the idea of not so big, you know, I think that's probably in there, yeah. You said that per message, you have a delay of about 500 nanoseconds. Saying that it's not much delay, but we have people, especially from the audio guys, that nagged us that we have a delay of five microseconds
and would like to get it to two microseconds or one microsecond. Grandma doesn't know what a microsecond is. I mean, if you're the guy who wants to get every drop of performance out of it, you know, no, this is not for you. But I used to have, when I got my first 60 megahertz Pentium One,
I thought that was fantastic, because it was still 100 times faster than my old PDP-11, and you know, there's some people, whatever you have, it's not enough. You know, this isn't for that crowd, I'm afraid, but if you want a reliable system, that's sort of our direction. But no, the last drop of performance, this is not gonna be it. I don't know how much the performance loss is. We never measured it. The L4 guys who have a similar kind of system
have actually measured, they've really done a lot of work to optimize it, and they say that the microkernel approach cost them about five to 10% in total performance. And so, if five to 10% is much too much for you, then this is no good.
First, thanks for the presentation, because it was really interesting, and well, quite funny, actually. I have a question in the line of the previous question regarding the performance. You go for a pure microkernel approach, but how about the hybrid approach, and how do you actually place yourself
with respect to the level of performance that you can achieve using a hybrid approach? Again, performance has not been on our agenda, that if you have a three gigahertz machine, and it behaves, I've never seen this experiment, but if you went to the average grandma or regular user, and the grandma calls up Dell and says,
I want a new computer, and Dell said, we have two models. We have the high-speed model that runs at three gigahertz and so on, and it crashes once in a while, and we have the two gigahertz model, which never crashes, I bet they'd sell a lot of the two gigahertz models. I mean, I don't have any actual data, but I have just a very strong feeling that there's an awful lot of people
who'd give up a third of their performance, or maybe, I'd give up personally 50% of my performance, if you could guarantee it would never crash. Right off the bat. So, the people who want highest performance, this is not your thing, but I think an awful lot of regular people don't care about that. Things are fast enough now.
Yes, also thanks for the presentation. What happens when the reincarnation server goes down? Can you recover from that? No, we're dead in the water. In theory, we could have three copies of it running, and have a triple modular redundancy scheme, where all three copies were checking on each other, and if any two could outvote the other one,
so if one of them failed or began acting weird, the other two could kill it, and then start a new one, so that the techniques for doing that, triple modular redundancy, are well understood, but the reincarnation server is so simple, in terms of its code, it's only 10 pages of code, that we haven't had any problems with it, but we could go to TMR if that became an issue.
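The voting step of triple modular redundancy is easy to sketch in Python. The function name and the way outputs are compared are assumptions for illustration; the talk makes clear that MINIX does not currently do this.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (agreed value, indices of replicas that disagreed with it)."""
    if len(outputs) != 3:
        raise ValueError("triple modular redundancy needs exactly three outputs")
    value, votes = Counter(outputs).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no two replicas agree, nothing can be trusted")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty


value, faulty = majority_vote(["restart disk", "restart disk", "do nothing"])
print(value, faulty)   # the two agreeing replicas outvote replica 2
```

The replicas listed in `faulty` are the ones the other two would kill and restart, in the same way the reincarnation server replaces a failed driver.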
So, you compared to standard operating systems, and said you want to boot faster. How fast does Minix reboot? I don't know, the actual booting of the basic operating system is under maybe 10 or 15 seconds.
Most of the boot time, normally, when a computer boots, it starts up doing the BIOS checks of the hardware, and for Minix, nearly all the time is the hardware checks, and by the time the hardware is actually finished, and starts booting Minix, it's only a couple of seconds, so it might be 30 seconds to boot,
of which 22 seconds or 25 seconds are the hardware checks of the memory, and so on, but the actual booting of the operating system is fairly short, a couple of seconds. A question about scalability. You mentioned that a lot of permission checks are done with bitmaps. Can you, on compile time for your kernel system,
go beyond 32 servers or devices, or is it just that 32 devices are enough for the world? At the moment, it's in fact a bitmap with 32, although there's no reason we couldn't make it a bitmap of 64, but it has to do with which servers you can talk to, and the number of servers is limited, the number of servers and drivers is on the order of 15 or 20, something like that.
So far, we haven't run into that limit, but if we ever ran into the limit, we'd have to change something from being a 32-bit number to a 64-bit number, but it's not, you know, if you have lots of user processes, that's not an issue, that user processes see the POSIX interface, and they can't send messages to each other, they just have the normal POSIX interface. If they wanna talk, they use pipes.
It's only a question. Now, if you had more than 30 or 40 drivers running, then we'd have to have more bits in the bitmap, but it wouldn't be very hard. I mean, if words were 64 bits, which we don't support, you'd have more bits, but you could have two words. So it's a small point, but it could be dealt with.
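A permission bitmap that grows by adding words, as suggested in the answer, could look like this in Python. The class name `SendMask` and the 32-bit word size are assumptions made for the sketch.

```python
WORD_BITS = 32   # assumed word size, matching the 32-entry bitmap in the answer


class SendMask:
    """Bitmap of endpoints a process may send to, stored as 32-bit words."""

    def __init__(self, nr_endpoints=32):
        self.nr_endpoints = nr_endpoints
        # One word covers 32 endpoints; more endpoints just means more words.
        self.words = [0] * ((nr_endpoints + WORD_BITS - 1) // WORD_BITS)

    def _check(self, endpoint):
        if not 0 <= endpoint < self.nr_endpoints:
            raise IndexError("endpoint outside the bitmap")

    def allow(self, endpoint):
        self._check(endpoint)
        self.words[endpoint // WORD_BITS] |= 1 << (endpoint % WORD_BITS)

    def may_send_to(self, endpoint):
        self._check(endpoint)
        word = self.words[endpoint // WORD_BITS]
        return bool((word >> (endpoint % WORD_BITS)) & 1)


mask = SendMask(nr_endpoints=40)   # needs two words
mask.allow(3)
mask.allow(35)
print(len(mask.words), mask.may_send_to(3), mask.may_send_to(4), mask.may_send_to(35))
```

Going from one word to two only changes how the bit is located, which fits the remark that the limit is a small point that could be dealt with.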
What happens when a data server goes down, and maybe afterwards, a driver? The data store? The data store. The data store is in the TCB. You know, I talked about a smaller TCB, and a smaller TCB means we're trying to get the amount of trusted stuff as small as possible, okay? The TCB includes the reincarnation server, the data store, the kernel itself;
there's a certain amount, maybe 20,000 lines of code, that's absolutely crucial to how the system works, and the data store is indeed part of that. Again, because the data store doesn't have a lot of traffic going to it, you could have triple modular redundancy, having three data stores, and they could vote, and so on, so at the cost of a little bit more overhead, you could make that reliable
if that were really important. We don't currently, because the amount of code in there is only a couple of pages, and it's fairly simple and fairly reliable, but one could do that, you know, for even more reliability. Yep. Given you're talking about TVs and things, any work on non-x86 ports, say ARM, for example?
We've tried to get an ARM port, but you know, one of the problems, unfortunately, with volunteers is we've had two people start the ARM port and both sort of get bored, you know, and then drift off, and so our ARM port isn't finished yet. So if someone wants to do an ARM port, we'd be really happy about that. To get into the embedded world in a serious way, we'd probably have to get the ARM port actually finished, and you know, I read somewhere recently
that more than 75% of all the Linux code was written by paid programmers working for IBM, Red Hat, and a couple of other companies; the story that the users contribute stuff is, you know, a nice story, but it isn't actually the reality. So if somebody wants to work
on the ARM port for us, we'd be very, very grateful. That's why I'm trying to hire another paid programmer: the volunteers have other things in their lives sometimes, and you know, the programming sort of becomes second choice. So I think we need the ARM port to get to the TV world and the embedded world; we don't have it yet, although we've tried.
If an aircraft carrier launches a missile, and the reincarnation server is going to restart the process of the kernel, then it's going to launch a new missile. So can I prevent this from happening? Is this programmable in a way that I can say, for this specific kernel process, I do not want this to happen,
that it will retry all the time? Yes. As I said, when a driver or a server dies, the reincarnation server gets a message that it died. It then looks up in a table what it's supposed to do, and what it normally does is run a shell script, and that shell script could be set up to say, stop, don't start it again, just send a message to so-and-so or an email or whatever,
or it could restart it, so you could easily put into the script: don't restart this one, take some other action. Hello, okay, so most people that write software have the experience of having a QA department that files five identical bugs of the form, I clicked this button and it crashed, then I clicked it again and it crashed,
and then I clicked it again and it crashed, and it seems great to restart, for example, the disk driver, but then the file system is told to send the same request that crashed the last one, so what are you doing to make sure that these ripple failures don't... If the failure is a true algorithmic failure and the driver is inherently incapable
of converting the linear block address to the correct head, sector, track, and cylinder, then there's no way we can fix it on the fly by restarting it, but our experience and everybody else's experience is that most errors that you run into are transient errors caused by weird timing or two things happening at the same moment or something like that, and that the true algorithmic bugs, where it never works,
typically get caught before they ship the thing, so we mostly can recover from what are effectively transient errors, and that works most of the time. What is possible, which we don't have but wouldn't be hard to do, is if you had two copies of the disk driver or three copies that were actually different code, different algorithms written differently,
the reincarnation server could start a different one: after number one crashes, run number two next. So the mechanism is there if you have a backup. We can't debug the code and fix it on the fly, but we do have a way to run a different one if one wanted to do that, but in practice, that isn't needed. What does Minix do, or what can it do,
to protect against hardware that does a bad DMA, or a driver maybe that programs DMA incorrectly? I mean, if the programmer has... If the code in the driver is incorrect and it's programming the DMA wrong all the time, there's nothing we can do about that. I mean, somebody's got to fix the code. If there's another driver available, maybe an older one, we could run the older one or something,
so we can't fix program bugs on the fly. Hardware errors, of course, we can't do anything about. If the DMA controller is actually broken, well, nobody can fix that, short of, of course, putting a new controller in. The best we could do is we conceivably could have a series of drivers: run this one, and if that one fails, the second choice is to run an older version.
That we can set up easily; the shell script just says, on the first failure, run this driver, and on the second failure, run that driver, and so on. So that mechanism's easy. Okay?
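The per-component recovery policy described in these answers (look up what to do when a driver dies, run a different driver on each successive failure, or stop restarting altogether) can be sketched as a small lookup table. This is an editorial Python illustration, not the actual reincarnation server or its shell scripts; the component names, the action strings, and the default of a plain restart for unlisted components are all assumptions.

```python
# Hypothetical policy table: component name -> ordered recovery actions.
# The Nth entry is used on the Nth failure; the last entry is reused after that.
POLICY = {
    "disk_driver": ["run disk_driver", "run disk_driver_old", "stop and notify"],
    "one_shot_driver": ["stop and notify"],  # never restart this one
}


def next_action(policy, component, failure_count):
    """Pick the recovery action for the Nth failure (1-based) of a component."""
    if failure_count < 1:
        raise ValueError("failure_count starts at 1")
    # Assumption: components without an entry are simply restarted.
    actions = policy.get(component, ["restart"])
    index = min(failure_count, len(actions)) - 1
    return actions[index]


for n in (1, 2, 3, 4):
    print(n, next_action(POLICY, "disk_driver", n))
print(next_action(POLICY, "one_shot_driver", 1))
```

Under these assumptions the disk driver is restarted on the first failure, replaced by the older version on the second, and left stopped with a notification from the third failure on, while the one-shot component is never restarted.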