
NetBSD/mips


Formal Metadata

Title
NetBSD/mips
Series Title
Number of Parts
24
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, copy, distribute, transmit, and adapt the work or content for any legal and non-commercial purpose, provided you attribute the author/rights holder in the manner specified, and pass on the work or this content, including in adapted form, only under the terms of this license
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
Since NetBSD 5 was released, the support for MIPS on NetBSD has been completely revamped. It is now one of the more advanced ports of NetBSD. This talk is an overview of what has changed, the current state of MIPS support, and a brief look forward to what else is coming. Subjects to be covered:
- Why? (big embedded space, large amounts of memory, etc.)
- Quick introduction to the MIPS architecture
- Overview of XLR/XLS/XLP
- Overview of what changed (toolchain, SMP, pmap, PCU, compat32, new CPU support, use of MIPS features, fast softint)
- Design decisions: why N32 by default? why no separate mips64?
- Major features: 64-bit address space, CPU abstraction, dynamic fixups (changing indirect calls to direct calls), splsw, UVM changes, fast software interrupts, SMP (for NetLogic XLR/XLS/XLP), mostly lockless pmap, choosing a new page size
- COMPAT_NETBSD32: networking, filesystem mounting, 32-bit systems, N32 kernels
- Effects on NetBSD in general: PCU, direct-mapped UAREAs, COMPAT_NETBSD32, common pmap for TLB-based MMUs
Transcript: English (automatically generated)
Okay. Yeah, no, that's probably fine; it doesn't matter if this ends up in the recording. Hello everybody. Our presenter is Matt Thomas. He's in California at the moment; he couldn't travel today. He has a great and very detailed presentation for you about NetBSD/mips and a bunch of the changes that he's been making. So I will hand you off to him, and I'll just step in to help with fielding questions when we get to that point. And off we go.

So basically I've been doing MIPS stuff since 1988, when I first worked on the pmax. So yeah, MIPS is kind of in my blood. About six years later, NetBSD imported the 4.4BSD-Lite support for the pmax, which I actually have never looked at. Finally, generic MIPS code arrived when the new NetBSD/mips port was added in 1998. Since then there have been a lot of other MIPS ports, but most of them were done between 1999 and 2003, and pretty much everything has stagnated since then.
For those who are not familiar with MIPS, let me give you a brief overview. The first MIPS ISA was MIPS I, implemented by the R2000 and R3000. It was basically 32-bit, with branch delay slots, load delay slots, and all those other things. They had a really simple TLB, and since these were sold as uniprocessor parts, nobody actually thought of having atomic locking instructions. Later on, MIPS III came out with the R4000, which was the first 64-bit MIPS and added the LL/SC locking instructions. Later on, that evolved into MIPS64, and along with it came MIPS32, its 32-bit subset. So those have 32-bit locking instructions and pretty much the same TLB as MIPS III; the later revisions are all pretty much the same, with perhaps slight differences.
Now, the MIPS address space is split into basically kernel and user address spaces. And one significant thing about the MIPS architecture is that addresses are signed, which means that kernel addresses are negative and user addresses are positive. In addition, the kernel address space is split further: part of it is direct-mapped, and part of it is TLB-mapped, which means you need a TLB entry to translate the virtual address to the physical one. Now, the direct-mapped segments only map the first 512MB of physical memory, split into regions that map it either cached (KSEG0) or uncached (KSEG1). And pretty much all the 32-bit MIPS systems only put devices in the first 512MB; I've never actually encountered a MIPS that had device addresses outside of 512MB.
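The direct-mapped window arithmetic he's describing can be sketched in a few lines. This is an illustration with invented helper names, not NetBSD source, but the segment bases and the 512MB limit are the architectural values:

```c
#include <stdint.h>

/*
 * Illustrative sketch of the classic 32-bit MIPS kernel segments.
 * KSEG0 (0x80000000) and KSEG1 (0xA0000000) are both direct-mapped
 * windows onto the first 512MB of physical memory; KSEG0 accesses go
 * through the cache, KSEG1 accesses are uncached.
 */
#define MIPS_KSEG0_BASE 0x80000000u
#define MIPS_KSEG1_BASE 0xA0000000u
#define MIPS_KSEG_SIZE  0x20000000u   /* 512MB */

/* Physical address -> cached direct-mapped virtual address. */
static inline uint32_t mips_phys_to_kseg0(uint32_t pa)
{
    return MIPS_KSEG0_BASE | pa;      /* pa must be below 512MB */
}

/* Physical address -> uncached direct-mapped virtual address. */
static inline uint32_t mips_phys_to_kseg1(uint32_t pa)
{
    return MIPS_KSEG1_BASE | pa;
}

/* Recover the physical address from either window. */
static inline uint32_t mips_kseg_to_phys(uint32_t va)
{
    return va & (MIPS_KSEG_SIZE - 1);
}
```

Because these are just bit operations, no TLB entry is ever consumed for KSEG0/KSEG1 accesses, which is exactly why a 32-bit kernel is stuck once memory grows past 512MB.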
The 64-bit address space that was added with the R4000 is split into four segments: user (XUSEG), supervisor (XSSEG — nobody really uses supervisor mode), a kernel direct-mapped segment (XKPHYS), which is analogous to KSEG0/KSEG1 but encompasses all of memory, up to 59 bits' worth, and then another 62 bits' worth of kernel TLB-mapped space (XKSEG). So you have pretty much all the address space you could want. Again, as I said, addresses are still signed: positive addresses are user, negative addresses are kernel. Now, interestingly, since addresses are signed, the 32-bit segments are sign-extended, so it turns out that KSEG0 and KSEG1 still exist: they are at the top of the 64-bit space, such that they're at FFFFFFFF.8000.0000 and up.
So the 64-bit kernel space that was added sits below KSEG0/KSEG1. As I said, XKPHYS direct-maps all of physical memory, but the way MIPS does it, you have what are called CCAs, or cache coherency attributes. They say whether an access is memory-coherent, uncached, and so on, and you have up to eight of them. XKPHYS is designed such that it splits into eight segments, one for each CCA, so every piece of physical memory is mapped eight ways and you can access it via whichever CCA you want. Each of those windows bounds the total amount of physical memory at 2^59 bytes, but most systems only implement 40 or 44 bits of physical address anyway, so that's no real limit.
Now, MIPS when it originally came out was 32-bit only, and that ABI was called the O32 ABI; it's pretty much what's still used even today. Then the R4000 came out with the 64-bit architecture. They extended O32 in various ways and came up with O64, which is pretty much O32 widened for 64-bit use. Over time they really wanted to do better, because with O32 you cannot pass enough arguments in registers, and since avoiding memory accesses can lead to a significant increase in speed, they came out with the N64 ABI, which among other changes allows you to pass 8 arguments in registers rather than 4.
They also realized that most programs don't actually need a 64-bit address space, but people still wanted the improved ABI. So they came up with N32, which is like N64 except pointers are back to 32 bits: pointers and longs are 32 bits, long long is still 64 bits, and you still get the 64-bit registers and the eight register arguments. I had wanted to tackle the 64-bit problem for a long time; I just never got around to it until somebody paid me to do it.
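The practical payoff of N32's "64-bit registers, 32-bit pointers" split is that pointer-heavy structures stay half the size they would be under N64. Here is a hedged, host-independent illustration using fixed-width stand-ins for what an N32 versus an N64 compiler would emit (this is a model, not MIPS-specific code):

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Model of the same structure under two ABIs, using fixed-width
 * stand-ins so the example works on any host:
 *   - under O32/N32 (ILP32), pointers and longs are 4 bytes
 *   - under N64 (LP64), pointers and longs are 8 bytes
 */
struct node_n32 {          /* as an N32 compiler lays it out */
    uint32_t next;         /* a pointer: 4 bytes under N32 */
    uint32_t flags;        /* a long:    4 bytes under N32 */
    uint64_t count;        /* long long: 8 bytes in every ABI */
};

struct node_n64 {          /* as an N64 compiler lays it out */
    uint64_t next;         /* a pointer: 8 bytes under N64 */
    uint64_t flags;        /* a long:    8 bytes under N64 */
    uint64_t count;
};
```

For a kernel full of linked lists and trees, that 16-versus-24-byte difference, repeated everywhere, is a real cache-footprint win — which is a large part of the argument for an N32 kernel.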
So that's the lay of the land. And basically, they wanted 64-bit NetBSD on the RMI (NetLogic) parts: the XLR, XLS, and XLP. These are very big chips. They're multi-core and multi-threaded, and they can address a lot of memory. The ones that I've been using have multiple cores, up to 32 threads per chip, and with the XLP you can actually tie multiple chips together, so you can get well over a hundred threads, with plans for second-generation parts that go much larger still. So that is a very big machine to run NetBSD on, and that required a lot of changes.
They use 40-bit physical addresses, up to one terabyte of RAM, and unlike most of the earlier CPUs, which only implemented a subset of the physical address space, the newer parts implement it fully. They also have multiple 10G Ethernet interfaces on the XLR, and multiple PCIe ports on the XLS and XLP. On top of that they have offload engines: security acceleration, a compression engine, and on the newer chips a regular-expression matching engine.

Now, when I started this, NetBSD/mips was effectively 32-bit only. Kernels only used direct-mapped KSEG0/KSEG1 addressing, which pretty much meant the 64-bitness of the hardware was never going to be used, and usable memory was limited by that 512MB direct-mapped window.
So the existing code simply could not handle the systems I wanted to support. And of course, it was uniprocessor-only, which wastes the XLR. So obviously what I wanted was to use a 64-bit kernel address space, to be able to use lots of memory — not only one gig, but sixteen gig or more — and multiprocessor support. And as part of multiprocessor support, I had to work on kernel preemption, because that is almost free once the MP work is done, as was mentioned in the earlier talk about the sparc64 and 64-bit PowerPC work.
We never really liked the O64 stuff, and N64 and O64 are pretty much the same from the kernel's point of view, so I avoided creating a separate mips64 port. Since the register handling is all conditionalized, a lot of the code is simply identical between N32 and O32. We use the standard assembler macros that were defined by SGI — load a pointer, load a long, load a register — so the right-sized load just happens in the right place. Now, there are times where you actually have to know what ABI the kernel is compiled for, and you also have to know what ABI a given process uses. For the kernel, I added a CPP macro that simply tells you the ABI, instead of the obscure predefined symbols you'd normally have to test. The kernel itself uses the N32 ABI, and all user programs are fine with that, because, face it, /bin/ls does not need a 64-bit address space. Nor do 99% of programs. Maybe once in a while you're compiling something that needs it, but for the most part it's just not needed.
Now, the other significant change from our O32 port is that the XLR has no FPU. I would have had to add a lot of floating-point emulation, which is a lot of work and effort, so instead I changed things so we could build the system soft-float. And that's perfectly fine for an embedded platform that doesn't even care about floating point.

The other thing is that if you have a 64-bit kernel with a 32-bit userland, all the infrastructure that pokes around in kernel memory — ps, netstat, things like that — has to cope with the kernel's data structures being different from its own, and we had to add a bit of infrastructure for that.
But it's really not that big of a deal. Then the other big advantage of having a single userland is that you can boot either a 32-bit kernel or a 64-bit kernel under it. So this lets you pick whichever kernel is more appropriate for your environment. Now, the thing that makes this work on the 64-bit kernel is a really cool feature called COMPAT_NETBSD32. A lot of people run 32-bit programs under a 64-bit kernel, and it works pretty well — but it was very incomplete. A lot of syscalls were missing, and some of the syscalls that were present were only partially emulated. So one of the things that had to be done, obviously, was to fix that. First of all, I had to deal with the various mount structures that are used by the various filesystems; they're all different in all the worlds. Also, the routing socket was really not 64-bit clean, and it had to become 64-bit clean. Those were the biggest items for getting a complete 32-bit program environment, because the routing socket isn't just used by the route command: ifconfig uses it too, and it runs deep. The kernel as a whole is getting more 64-bit aware,
but we still have all these 32-bit ports — sparc, powerpc, i386, and the rest — so for the new interfaces that we keep adding, we try to define the data structures so that they're the same whether compiled 32-bit or 64-bit. And that helps a lot. But there are still a lot of other interfaces that require explicit conversion one way or the other, especially if they contain pointers. And as I said, there's also O32 compatibility mode, which runs the old O32 programs that were compiled years ago, so a 64-bit kernel can run binaries going way back.
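The pointer problem he mentions is the crux of COMPAT_NETBSD32. The following is a hypothetical sketch inspired by the netbsd32 approach — the type and function names are invented, not the actual NetBSD source — showing why structures containing pointers need explicit widening:

```c
#include <stdint.h>

/*
 * Sketch: a 64-bit kernel translating a syscall structure handed in
 * by a 32-bit process.  Pointers are the painful part: in the 32-bit
 * layout they occupy 4 bytes, so the kernel keeps a parallel "32-bit
 * view" definition and copies field by field, widening each pointer.
 */
typedef uint32_t user32_pointer_t;   /* a 32-bit user pointer */

struct iov32 {                 /* what the 32-bit process passed in */
    user32_pointer_t iov_base;
    uint32_t         iov_len;
};

struct iov64 {                 /* the kernel's native view */
    uint64_t iov_base;
    uint64_t iov_len;
};

/* Widen one entry; every pointer-bearing field needs this care. */
static void iov32_to_native(const struct iov32 *in, struct iov64 *out)
{
    out->iov_base = in->iov_base;   /* zero-extend the user pointer */
    out->iov_len  = in->iov_len;
}
```

Interfaces whose structures contain only fixed-width, same-size fields need no such shim at all — which is exactly why new kernel interfaces are designed to look identical under both ABIs.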
The other thing about doing 64-bit addresses: NetBSD/mips used a standard two-level page table, and that's fine for a 32-bit address space with 4K pages, but no more. Now, obviously you want something larger than that. You can do a four-level table, which is what Linux does, but I settled on a three-level page table, which gives you about a terabyte of address space, which seems to be sufficient. And if you go with an 8K page, that grows to about sixteen terabytes: a bigger page is an easy way to get a larger address space without adding another level of page table and slowing every lookup down. Now, given that most of our programs use a 32-bit address space, the code is optimized so that those still only do a two-level lookup, even though the kernel is compiled for 64-bit addresses. So apart from the one additional page allocated for that top-level page table, there's really no overhead: an N32 program and an O32 program both go through the same two-level lookup, and only a real 64-bit program pays for its 64 bits of address space.
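The sizing trade-off can be put in numbers. This is back-of-envelope arithmetic, not NetBSD's actual pmap code; the exact reachable size depends on the PTE size and how the top level is used, which is why the talk's round figures (one terabyte, sixteen terabytes) differ somewhat from the raw formula below:

```c
#include <stdint.h>

/*
 * Each page-table level resolves log2(page_size / pte_size) bits of
 * virtual address, so the reachable space is:
 *     page_size * (page_size / pte_size) ^ levels
 * page_shift and pte_shift are log2 of the page and PTE sizes.
 */
static unsigned va_bits(unsigned page_shift, unsigned pte_shift,
                        unsigned levels)
{
    unsigned index_bits = page_shift - pte_shift; /* bits per level */
    return page_shift + levels * index_bits;
}
```

With 8-byte PTEs this gives 39 bits (512GB) for 4K pages at three levels and 43 bits (8TB) for 8K pages at three levels — the page-size bump buys sixteen times the address space for free. With 4-byte PTEs, the classic 4K two-level table covers exactly 32 bits, which is why it sufficed for so long.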
XKPHYS can be used to access memory without needing a TLB entry. This is really useful when you do cache flushes and the like: it requires no TLB entries at all, and all physical addresses are just directly accessible. An example of where that matters: the PCIe configuration space is 256 megabytes. If you map that with 4K pages, you burn an enormous number of TLB entries; going through XKPHYS instead, you can avoid all of those TLB misses entirely.
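As a rough illustration of that TLB-pressure point (an invented helper, not kernel code):

```c
#include <stdint.h>

/*
 * Mapping a region with fixed-size pages needs one mapping per page,
 * each competing for a TLB slot; an XKPHYS access needs none at all.
 */
static uint64_t pages_for(uint64_t bytes, unsigned page_shift)
{
    return bytes >> page_shift;
}
```

The 256MB PCIe configuration window would need 65536 mappings at 4K granularity, or still 64 even with 4MB pages, against a TLB that typically holds only a few dozen entries — so the direct map is the only way such a region is touched without constant misses.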
One of the other issues with a large address space is that a lot of DMA devices — common PCI devices like EHCI USB controllers — can only address the first four gigabytes. So obviously, when you're doing I/O, you need to make sure the buffer is within the first four gigabytes the device can reach, and if not, you have to bounce it. So you really want to partition your memory between pages that are below the four-gigabyte line and pages that are above it. And I'll get into why that's significant later on.
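The partitioning idea can be sketched as follows (the names are invented; this is not the UVM freelist code):

```c
#include <stdint.h>

/*
 * Sketch: physical pages are sorted onto separate freelists so that
 * DMA allocations for 32-bit-only devices (e.g. EHCI) can be
 * satisfied from memory the device can actually reach, without
 * falling back to bounce buffers.
 */
enum freelist { FREELIST_LOW, FREELIST_HIGH };  /* below / above 4GB */

static enum freelist freelist_for(uint64_t pa)
{
    return pa < (1ULL << 32) ? FREELIST_LOW : FREELIST_HIGH;
}
```

Ordinary allocations then prefer the high freelist, preserving the scarce low memory for the devices that need it.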
Another trick: wherever we can, we keep physical addresses rather than virtual ones, so we can take advantage of XKPHYS. In the TLB miss handler, for example, you're not passing a mapped virtual pointer to the page table around; the reference is a physical address accessed through XKPHYS. Which means I don't have to worry about taking a nested TLB miss while walking the page table — do I fault, do I not fault — and that simplifies a lot of code. It still compiles down to a simple load instruction.
The other advantage is booting: an N32 kernel is a 32-bit ELF binary. If the bootloader doesn't understand 64-bit ELF, you would have been screwed, but old bootloaders are perfectly happy with an N32 kernel, so you can avoid problems with the firmware. We hit exactly that using the bootloader that RMI supplied with their eval board — it just didn't understand 64-bit ELF. And rather than modify the firmware, we just load the N32 kernel and all is well.
Now, what's the difference between an N32 kernel and a full 64-bit kernel? The amount of TLB-mapped kernel address space. N32 is basically limited to one gig; for the 64-bit kernel the architectural limit is so large it doesn't matter. In reality we cap it at something sane, typically two gigs, and that seems to be sufficient.

The one problem that we have is that there are lots of CPU variants with different TLB designs and different needs — different registers they use, and so on. So we have a whole bunch of parameters that tell us what we need to do: a jump table which has things like the TLB routines, so if you switch CPU types it tells us the various things that need to be done; and a locore atomic vector that, among other things, remembers whether we should be using LL/SC — load-linked/store-conditional — or the restartable-sequence technique for the kernel's atomic sequences. If your box is not capable of doing MP, there's no reason to use LL/SC for a lot of that, because it has more overhead. So unless you have a system that really is capable of being multiprocessor, we default to the restartable sequences, which have much less overhead.
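The boot-time selection between the two atomic styles can be sketched like this. All names are invented; the LL/SC path is modelled with a GCC `__atomic` builtin standing in for real `ll`/`sc` assembly, and the uniprocessor path is a plain sequence that real kernels make safe by restarting it if an interrupt lands inside:

```c
#include <stdint.h>
#include <stdbool.h>

/* MP path: a genuine compare-and-swap (stand-in for ll/sc). */
static uint32_t cas_mp(volatile uint32_t *p, uint32_t old, uint32_t new_)
{
    __atomic_compare_exchange_n(p, &old, new_, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return old;                      /* value observed, as CAS does */
}

/* UP path: plain load/compare/store -- cheap, and "atomic enough"
 * on a uniprocessor when the kernel restarts the short sequence if
 * it is interrupted partway through. */
static uint32_t cas_up(volatile uint32_t *p, uint32_t old, uint32_t new_)
{
    uint32_t cur = *p;
    if (cur == old)
        *p = new_;
    return cur;
}

/* Chosen once at boot from the CPU description. */
static uint32_t (*mips_cas)(volatile uint32_t *, uint32_t, uint32_t);

static void atomic_init(bool mp_capable)
{
    mips_cas = mp_capable ? cas_mp : cas_up;
}
```

Every caller just uses `mips_cas()`; only the boot code knows which flavor is behind it.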
And then we also have what we call the locore switch: some of the helper routines — your TLB miss routine, the quick kernel interrupt path, and the like — can be swapped per CPU type, and the locore switch lets us select those. And finally, each system has a different interrupt wiring — how many interrupts there are and where they go — and we abstract that per platform as well.
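The operations-vector idea behind all of this can be sketched as a struct of function pointers. The names are invented for illustration, not the NetBSD symbols:

```c
#include <stdint.h>

/*
 * Sketch of a per-CPU-variant operations vector: each supported CPU
 * family fills one in at boot, and the rest of the kernel calls
 * through it.  (A later fixup pass can rewrite the hot call sites
 * into direct calls.)
 */
struct mips_locore_ops {
    void (*tlb_invalidate_all)(void);
    void (*tlb_invalidate_asid)(uint32_t asid);
    void (*tlb_write_entry)(uint32_t idx, uint64_t hi, uint64_t lo);
};

static const struct mips_locore_ops *cpu_ops;

static void locore_init(const struct mips_locore_ops *ops)
{
    cpu_ops = ops;   /* selected from the boot-time CPU probe */
}

/* A stub implementation, standing in for one CPU family. */
static unsigned flush_count;
static void stub_flush_all(void) { flush_count++; }
```

Generic code then writes `cpu_ops->tlb_invalidate_all()` and never needs to know which CPU family it is running on.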
Now, the flip side of all this abstraction is that if you want one kernel to run on multiple systems, you're going to have an indirect call at every one of these operations. Hang on one second. There's a question. What's the question, Paul? — Hey. On the previous slide you had the splsw. Is that just the legacy name, or? — Oh, those are basically the software priority levels, which are things like IPL_NONE, IPL_VM, and so on, and the splsw is what maps them onto the hardware priority levels. But I mean, you can think of SPL and IPL together; they're still referred to as SPL.
Basically, indirect calls have a cost. We all know that. And these are routines you're calling all the time, so it would be nice to avoid doing the indirect call. So what I've done is add a fixup pass: early in boot, a routine walks the kernel text between designated start and end symbols, literally looking at each instruction. Where it finds an indirect call through one of these operation vectors, it figures out where the call actually goes and rewrites it into a direct call. If an instruction doesn't match the patterns it knows, it doesn't care — it just leaves it, and that call keeps going through the vector as before. Of course, you have to remember the caches: after patching, you flush so the instruction fetches actually see the new code, and it all interacts correctly. The same thing has to happen when you load a kernel module, because a module would otherwise use the indirect vectors; so modules get the fixup treatment as they're loaded. I've tried this on, I'd say, five or six models, and this approach works well. It's only sane on architectures with fixed-length instructions, though — I don't even want to think about doing this for x86.
Is anyone else doing this? No? Nobody here, at least — not for a variable-length architecture. Except Andrew, and he has a different sort of fun than we do. Now, on to the SMP changes.
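The patching he describes is feasible on MIPS precisely because every instruction is a fixed 32-bit word. Here is a toy model: the `jalr`/`jal` encodings below are real MIPS, while the surrounding scan logic is invented and much simpler than a real fixup pass (which also has to account for the register-load preceding the call and the delay slot):

```c
#include <stdint.h>

#define OP_SPECIAL 0u
#define FUNCT_JALR 0x09u
#define OP_JAL     3u

/* Does this word encode "jalr" (indirect call and link)? */
static int insn_is_jalr(uint32_t insn)
{
    return (insn >> 26) == OP_SPECIAL && (insn & 0x3f) == FUNCT_JALR;
}

/* Encode a direct "jal target".  The 26-bit field is a word index;
 * the top 4 address bits come from the PC, so the target must live
 * in the same 256MB region as the call site. */
static uint32_t encode_jal(uint32_t target_addr)
{
    return (OP_JAL << 26) | ((target_addr >> 2) & 0x03ffffffu);
}

/* Patch every recognized indirect call in a text range (toy: assumes
 * they all resolve to the same known target). */
static unsigned fixup_text(uint32_t *text, unsigned ninsns,
                           uint32_t target_addr)
{
    unsigned patched = 0;
    for (unsigned i = 0; i < ninsns; i++) {
        if (insn_is_jalr(text[i])) {
            text[i] = encode_jal(target_addr);
            patched++;
        }
    }
    return patched;   /* caller must then flush I-cache */
}
```

On x86, where instruction boundaries cannot even be found without full disassembly, this kind of blind scan-and-patch would be hopeless — which is the point made above.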
The pmap was actually not that hard to make MP-safe. It was mostly a matter of changing to mostly lockless algorithms, though it was still a lot of development. Again, as I said before, a lot of it is conditional code, only used on machines capable of running with multiple CPUs; on a uniprocessor you just get the standard thing, nothing special. I did have to add a machine-dependent hook — hardware-dependent hook, I guess — for TLB invalidation, because with multiple CPUs you have to have a hook for doing TLB shootdowns to the other processors. And there's a cross-call, an IPI, that says, I need to reschedule this CPU. But surprisingly, there's not much more to it than that: you're still doing the same checks — does this mapping live here? yes, got it — and so on.
One nice property of MIPS is that the TLB is software-managed, so the TLB miss handler is ours to optimize, and every instruction you shave off it counts. I wondered how hard it would be to rework, and it turned out to be a lot easier than I thought; I was able to get rid of a lot of complexity. The other advantage of a larger page size is that it takes pressure off the TLB: the same working set needs far fewer entries. And by restructuring the handler — for example, folding two reads into one — I was able to actually cut out seven instructions. That may not seem like a lot, but when you're taking millions of TLB misses a second on a busy system, it adds up.
Not a huge win each time, but it compounds. Now, as I said, you have to deal with the different ISA revisions: MIPS32, MIPS64, and so on. They're all pretty much the same, just with different extras. There's MIPS32R2, which gives you the ins/ext instructions for inserting and extracting bitfields; and the MIPS64 parts each have their own idiosyncrasies, so we have per-variant support for those. If you want to be able to use those features, you just say so in the kernel configuration. And then, of course, there are a few systems that require special handling, which we keep isolated so the common code stays clean.
Part of the answer is — oh, we've got another question. Okay. — The last two entries you have for the XLS, what's the difference? — [largely inaudible] Those are variants within the product line; the low-end parts cut some of the caches and interfaces down. [inaudible]
[inaudible]

...it means that if you're going to use direct mapping, that's what they expect, and it just makes life a little easier.
The things that use direct mapping in 6.0 include the pages used by the kernel memory allocator, so those never need a TLB entry at all, and in particular you don't have to map and unmap them. This requires UVM to manage these pages separately, so that they aren't mixed in with pages that have to be TLB-mapped: we now have pages that are direct-mapped, and then you have the rest of the pages.

Now, moving on to the next bit: virtual cache aliasing.
The MIPS primary caches are virtually indexed, and the problem shows up whenever a cache way is larger than the page size. Typically what happens is something like a 32K cache split into ways larger than the 4K page: the same physical page mapped at two different virtual addresses can then land in two different cache lines — you're aliasing a cache line, and you can end up reading stale data.
And different CPUs differ here: each CPU model describes whether its I-cache and its D-cache can alias, and we handle each case accordingly — the XLP, for instance, is organized so that it has no cache aliasing at all. So handling this per-CPU solves a whole bunch of problems
and complications we had in the past. Now, you can prevent aliasing entirely by taking your page size and making it greater than or equal to your way size: then two mappings of the same page always fall in the same cache set, and you can never have an alias problem. Or you can take care of it as long as you manage your mappings very carefully, and that's done by splitting your freelists such that pages are managed by color — by which cache color each page has. Now, the good thing is that NetBSD's UVM already knows how to do page coloring. So not only do you have multiple freelists, each freelist also has to manage its pages per color, with its own paging targets and all those fun things that go with free memory.
Part of the problem was that there was just one global freelist per color. So if you wanted a page of a particular color, there were times that you'd just not be able to allocate that color, and you'd spend time scratching through your page queues for nothing. In addition to this, as I mentioned before, for DMA you also have to track which pages sit below the four-gigabyte line. So the solution I added is another level of grouping: pages are selected not only by color, but also by bucket, and you can think of these as individual collections of pages. Each group has its own targets — its own free and active page counts — trackable at any point. Now, the nice thing about this is, one, the accounting comes from each of these groups, so you get much more fine-grained control; and two, it gives you a scheme in which you can keep pages local to a particular CPU, which is easier and cheaper than having every CPU contend on one global list.
Now, it turns out that for MIPS you might have a dozen or two dozen colors, whatever the cache geometry gives you — that's fine. One cautionary tale here is the HP PA machines, which of course had hundreds of colors, and that was bad; I'm not worried about that issue on MIPS. One of the other issues is kernel I/O: when the kernel has to touch a user page, I have to make sure I don't create a kernel mapping at a virtual address of the wrong color, or I'd introduce the very alias I'm trying to prevent. This is one reason the direct map is so useful: I can just refer to the page through it
and not worry about it at all. Now, the other thing is, there are other places that need this support, like the uareas — the kernel stacks; it's not only the allocator. One of the problems is that your kernel stack normally lives in TLB-mapped kernel address space, so every context switch has to make sure the TLB can reach the new kernel stack. That's a bit of a cost. So what helps is that I can directly map the uarea through XKPHYS or KSEG0. That avoids that overhead — it gains you about another two percent — and it means the kernel stack is no longer in pageable kernel VA at all, which in turn eliminates a whole class of TLB faults. And this isn't only usable by the MIPS port: other ports with a direct map can use it too. I just haven't had it in for more than a couple of months yet, and more of that is coming soon.
And again, the idea is to have more and more common code. The most significant piece there was for floating point. Of course, you want lazy FPU switching: a thread does floating point, and we only switch the floating-point context when somebody else wants the FPU. You would think that all the ports that do this get it right — they don't. Everybody implemented it separately, and there were bugs. So one of the things we did — along with some of the other developers — is a common framework that handles the lazy switching, where all a port supplies is how to load, save, and release the unit's context. The thing that turned out really well is what we call PCU, per-CPU units. It's used for floating point, and it's also used on PowerPC for all sorts of functional units, and other ports use it as well. Basically, you have three routines supplied by the port for each unit: save the state out of the unit, load the state into the unit, and a release call that says: the next time you touch this unit, you're going to have to fault and load it back.
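The shape of that framework can be sketched as follows. The names and signatures here are invented for illustration — they are not NetBSD's actual pcu(9) API — but the save/load/release split and the lazy-ownership logic are the idea being described:

```c
#include <stddef.h>

/*
 * Sketch of a lazy per-CPU-unit (PCU) framework: each lazily
 * switched unit (FPU, DSP, ...) supplies three callbacks, and the
 * framework tracks which thread's state currently lives in the
 * hardware.  State is saved only when another thread touches the unit.
 */
struct pcu_ops {
    void (*state_save)(void *owner);     /* HW registers -> owner     */
    void (*state_load)(void *owner);     /* owner -> HW registers     */
    void (*state_release)(void *owner);  /* force a fault on next use */
};

struct pcu {
    const struct pcu_ops *ops;
    void *owner;            /* thread whose state is in the unit */
};

/* Called from the "unit unusable" trap when a thread touches it. */
static void pcu_acquire(struct pcu *pcu, void *thread)
{
    if (pcu->owner == thread)
        return;                            /* still ours: no work */
    if (pcu->owner != NULL)
        pcu->ops->state_save(pcu->owner);  /* lazily evict old owner */
    pcu->ops->state_load(thread);
    pcu->owner = thread;
}

/* Tiny stand-in callbacks for demonstration. */
static int saves, loads;
static void demo_save(void *o)    { (void)o; saves++; }
static void demo_load(void *o)    { (void)o; loads++; }
static void demo_release(void *o) { (void)o; }
static const struct pcu_ops demo_ops = { demo_save, demo_load, demo_release };
```

A thread that keeps using the unit pays nothing after the first fault; the save happens only when ownership actually changes, which is the whole point of doing this once, correctly, in common code.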
And that's all I have — unfortunately, just about on time.
Any other questions? I think they're all scared.
Anyone else?
Thank you very much.