
BIND 10: DNS by Cooperating Processes


Formal Metadata

Title
BIND 10: DNS by Cooperating Processes
Title of Series
Number of Parts
90
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
A quick look at the details of how BIND 10 provides DNS with a collection of cooperating processes. Unlike BIND 9 or other DNS servers today, BIND 10 does not operate via a monolithic process that does everything. Instead, it uses individual processes to provide specific functions. This provides a number of benefits, such as scalability and improved fault tolerance and security, but requires additional software to hook it together, and has a few tricky corner cases.
Transcript: English (auto-generated)
So I realized as I was putting my slides together, and then I came to the conference, and I saw that everyone had quite fancy slides, and mine are quite boring bullet points, so I thought I'd put on a fancy shirt to try to make this a little bit fancier here, a little more professional-looking. All right, here we go. So this talk doesn't cover every aspect of DNS or every aspect of BIND10.
That would be more like a half-day of presentations. I'm just going to talk about what I think is the interesting architecture of BIND10. So, all right, what is BIND? Well, BIND is DNS software, and if you don't know what DNS software is,
you're probably in the wrong room. Right now, our main product is BIND9, and we estimate that that's used by about 80% of DNS servers in the world. That doesn't mean that 80% of queries are handled by BIND9. It doesn't mean that 80% of zones are supported by BIND9, but about 80% of servers.
So that's hundreds of thousands of servers around the world, possibly millions. And BIND10, of course, is the next version of BIND, and it's a complete redesign, and it has a lot of radical changes. So a lot of it's kind of experimental, and we're going to get a lot of interesting comments on it as people start to use it.
Now, my company is ISC, the Internet Systems Consortium, we're a non-profit company in California, and we make BIND. We do a bunch of other stuff, too. We make ISC DHCP, oh, there we go. We run one of the root name servers, we run F. We do secondary name servers for non-profits,
and we do all kinds of good stuff for the Internet. Right, so architecture. I always have a hard time with architecture because I'm never really sure what it means. It's kind of high-level design. But in the case here, I think architecture is basically just
the general principles behind how the software is built. So with BIND8, now I say recent history because, of course, before BIND8 there was BIND4, and there was even older versions in the murky past. But starting with relatively recent, I think BIND8 is from the 1990s.
It was monolithic, meaning that there was a single application which did everything, single CPU, which we now call cores, but at that time they didn't have the concept of a core. They called them CPUs. And the quality was typical of Internet software of that era,
which means that there was tons of bugs which made it crash, it was a security nightmare, and so on. BIND9 was a complete rewrite from scratch, and it was still a monolithic design, which meant that there was a single application which did everything, but it did support threads which gave you some benefit
for having multiple cores. Because of the time when BIND9 was created, this was 1998 when it was designed, 1999, around that time frame, threading support outside of, say, Solaris at that time was mixed. So they had to design it so that it would be able to run without threads.
So the architecture is complicated a bit because it has a task model which can optionally support threads. But basically it's a multithreaded monolithic application. And in order to improve the quality of the software because of all the problems that had happened with BIND8, they used a lot of the concepts from the design by contract approach.
And this is basically where you define preconditions and postconditions for every function, which is kind of documenting exactly how it's supposed to be called and what the conditions are supposed to be when it exits. And you also set up some invariants for your data structures, which is sort of what the properties of that data structure have to be at all times.
And at the time it was quite solid engineering, so this prevents a lot of bugs from creeping in, and it means that if there is any kind of problems that it won't be a security issue.
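As an illustration of the design-by-contract idea described here, the following is a minimal Python sketch. It is not BIND 9 code: the `ZoneTable` class, its fields, and the `add_records` method are hypothetical names chosen for the example. It shows a precondition, a postcondition, and a data-structure invariant, each enforced with an assertion that stops execution when it fails.

```python
class ZoneTable:
    """Toy table of per-zone record counts (hypothetical, for illustration)."""

    def __init__(self):
        self._counts = {}
        self._total = 0
        self._check_invariant()

    def _check_invariant(self):
        # Invariant: no count is negative, and the cached total always
        # equals the sum of the per-zone counts.
        assert all(c >= 0 for c in self._counts.values()), "negative count"
        assert self._total == sum(self._counts.values()), "total out of sync"

    def add_records(self, zone, n):
        # Precondition: a non-empty zone name and a positive record count.
        assert isinstance(zone, str) and zone, "zone must be a non-empty string"
        assert n > 0, "n must be positive"
        before = self._total

        self._counts[zone] = self._counts.get(zone, 0) + n
        self._total += n

        # Postcondition: the total grew by exactly n.
        assert self._total == before + n, "postcondition violated"
        self._check_invariant()
        return self._total
```

Calling `add_records("example.org", 0)` violates the precondition and raises `AssertionError` instead of continuing with bad state. Note that Python strips `assert` statements when run with `-O`, so a real implementation would likely use explicit checks.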
Now, in the past I've given talks talking about BIND10, where I've listed for pages and pages all the different problems with BIND9, and I've been told I shouldn't do that. And I shouldn't, because BIND9 is a quite solid piece of software. However, if there wasn't room for improvement, we wouldn't be doing BIND10. And so there are some problems with the architecture that BIND9 has.
The problem with the design-by-contract approach that I mentioned is that it's actually quite brittle. If there is a problem discovered, then the software terminates. That's safe. That's probably what you want to have happen if the other option is having your program write over random pieces of memory
or be in an unexpected state or something like that. But as an administrator, it can be quite frustrating when your software falls over. Another problem with the architecture is what I call shared fate. And because you have a single daemon, which is doing all of the functions that you need to run a DNS server,
if anything goes wrong with any of them, then everything is gone. This is especially important for busy recursive resolvers, which get tens of thousands of queries per second and have caches in a couple gigabytes. And what that means is that if your server crashes,
it takes you several minutes for it to recover all that state about the Internet, and you're suffering under reduced performance during that time. So the restart is quite a painful process. And the final problem with the architecture of BIND9 is that it doesn't scale that well.
You can get it to scale up to between, say, four and six cores, but beyond that, additional cores don't help. They don't really hurt. It's not like you have a performance drop-off if you run it on 12 cores or something. It'll be slightly less performant than, say, six or five cores, but it's not a major problem. But given the way that Moore's Law seems to be going,
we seem to have reached the maximum instructions per clock, and we seem to have reached the maximum clock rate. So really, scaling in the future looks like more and more cores. So in order for us to perform well, say, ten years from now, we need to be able to handle hundreds of cores.
So we decided we're going to write BIND10. There's a few requirements which went into the architecture, which I'll talk about. I'll talk about the actual architecture in a couple of slides here. So one of the issues is that we want it to be customizable, both what I call out-of-the-box. You just install it, and then you can change it
and have it run only the pieces that you want. The other requirement is that you need to be able to add or change the code easily. I mentioned the scalability to tens or hundreds of cores. And then the obvious one is robustness. We want to avoid these problems that we had with BIND9. We want to limit this fate-sharing that I just discussed, and we want to, if a problem is discovered,
to allow the system to recover as much as it's possible to do so. So the basic model that we use is cooperating processes. We were kind of inspired by Postfix, the mail server, which runs a number of different programs to actually do all the mail processing. We see this in a bunch of other domains as well. Browser vendors have started to put your different tabs in different processes.
They put your plug-ins in a different process. And it gives you the advantage of isolation. In the browser context, it's because they don't trust the content. That's not the problem in our case, but it's a similar approach. And the idea is it makes each task much simpler.
It does one or two things. And the full service is done by these processes cooperating. So a DNS server not only has this simple functionality of getting a query and sending an answer, but it also has to do maintenance of zones by transferring them in from the primaries and transferring them out to the secondaries and so on.
We've also built in DHCP functionality. So you can run BIND10 and have a DHCP server as part of that. We did that to provide all the good basic functionality work that we're doing in BIND10. So the logging system, the configuration mechanisms, and all that are also now available for our DHCP product.
And all these different things work together to provide a full service. And a big advantage is that you only have to run the processes that you actually need. So if you're not doing dynamic DNS, if you don't update your zones in real time, for example, then there's no need for you to run that application. We had a bug a few years ago where there was a coding error in the code path of the dynamic DNS
which would cause the server to crash even if you weren't using dynamic DNS, even if you had an ACL installed. So bugs happen, and it's a good basic administration security policy to not run code that you don't need. Right. How does it work? We have a master process which starts everything up.
It starts up our message passing daemon. It starts up our configuration daemon. And then it starts whatever services you've configured. So as an administrator, you configure the system, and it does all this management for you. So it's basically zero cost for you. If you're just an administrator, you don't have to care about all these things that are running. The system just provides services.
Now this initialization process also will restart processes that die. And we have a very simple back-off algorithm. So if you have a process that can't restart for whatever reason, like the resource that it needs, like the interface or something is unavailable, or there's some problem with your security rules so that the system won't let it start, it's not going to use all your CPU just spinning trying to restart it.
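The restart behaviour described here can be sketched roughly as follows. This is an assumption-laden illustration, not the BIND 10 implementation: the talk only says the back-off algorithm is "very simple", so the exponential doubling, the `backoff_delay` and `supervise` names, and the `max_restarts` limit are all made up for the example.

```python
import subprocess
import sys
import time


def backoff_delay(consecutive_failures, base=1.0, cap=60.0):
    """Seconds to wait before the next restart attempt (doubling, capped)."""
    if consecutive_failures <= 0:
        return 0.0
    return min(cap, base * 2 ** (consecutive_failures - 1))


def supervise(argv, max_restarts, base=1.0, cap=60.0, sleep=time.sleep):
    """Run argv, restarting it when it exits non-zero; return the delays used."""
    delays = []
    failures = 0
    while True:
        exit_code = subprocess.call(argv)
        if exit_code == 0:
            return delays  # clean exit: stop supervising
        failures += 1
        if failures > max_restarts:
            return delays  # give up after too many failed attempts
        delay = backoff_delay(failures, base, cap)
        delays.append(delay)
        sleep(delay)  # wait instead of spinning the CPU on restarts


if __name__ == "__main__":
    # Hypothetical child that always fails, standing in for a daemon whose
    # interface or permissions are unavailable.
    failing_child = [sys.executable, "-c", "raise SystemExit(1)"]
    print(supervise(failing_child, max_restarts=3, base=0.1, cap=1.0))
```

The point of the growing delay is that a process which cannot start is retried less and less often. A fuller version would presumably also reset the failure count once a process has stayed up for a while.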
And then, of course, when the system shuts down, everything gets cleaned up. We've got a... Because we have all these processes that are working together to provide functionality, they have to communicate with each other. We ended up building a custom message bus, which we hate, but we're kind of stuck with it.
Nothing that we found met our requirements. If anyone has any really good suggestions for using a message bus, we looked at quite a few, and nothing met our license requirements. We have to have something that's BSD compatible. We need something that works on C++ and Python at least, and it needs to be something relatively simple. So anyway, we built one of our own.
It's based on Unix domain sockets. For security of our message bus, we have a gateway process to prevent users who aren't authorized from accessing the system, but once you're in, you're in. We decided they're cooperating processes.
These aren't wild beasts on the Internet, so we don't have authentication between the tasks. We actually have also a separate type of IPC, which we use to pass open file descriptors around. This is a problem that arose because if you're doing a zone transfer, the request will come into your authoritative server because they all sit on the same port and IP address.
That's the design of the protocol, but we wanted another process to actually do that work. So in modern Unix, as it turns out, you can freely pass open file descriptors between processes, and the documentation says you can do this in Windows, too, but we don't actually support Windows yet. Downsides: complexity, of course.
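To make the file descriptor passing described above concrete, here is a small Python sketch using the standard library's `socket.send_fds` and `socket.recv_fds` (Python 3.9 or later, Unix only). It is not BIND 10 code. For simplicity both ends run in one process, socket pairs stand in for the real TCP connection and the real inter-process channel, and the `hand_off`, `receive_hand_off`, and `demo` names and the request format are hypothetical.

```python
import socket


def hand_off(channel, conn, request):
    """Send an open connection plus the request bytes over a Unix socket."""
    socket.send_fds(channel, [request], [conn.fileno()])


def receive_hand_off(channel, bufsize=4096):
    """Receive the request and rebuild a socket object around the passed fd."""
    request, fds, _flags, _addr = socket.recv_fds(channel, bufsize, 1)
    return request, socket.socket(fileno=fds[0])


def demo():
    # Stand-ins (assumptions): client/server_side play the connection that
    # arrived at the authoritative server; auth_end/xfr_end play the Unix
    # domain socket between the two cooperating processes.
    client, server_side = socket.socketpair()
    auth_end, xfr_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    hand_off(auth_end, server_side, b"AXFR example.org")
    request, conn = receive_hand_off(xfr_end)
    server_side.close()  # the receiver now holds its own descriptor

    # The receiving side answers on the connection it was handed.
    conn.sendall(b"zone data for " + request.split()[1])
    conn.close()

    reply = client.recv(4096)
    for s in (client, auth_end, xfr_end):
        s.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

The receiver gets a new descriptor number that refers to the same open connection, so it can reply to the client directly even after the original holder has closed its copy.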
It's much simpler to just pass a data structure around than to actually have to send that across to another process. Most of the complexity is at the global level. Individual tasks, because they're only doing one or two things, it's not very complex for them, so it's not like when you're working in the code, it's not a bunch of interconnected, highly coupled things
that you have to know everything about. And as I said, administrators shouldn't have to care about this. Now, when you do a ps, you're going to see a lot of processes. We've tried to name them in a way so that it's not scary, and it looks reasonable, and you're not wondering what all these weird things are doing.
I think we actually do a little bit better than other things, like Postfix, or other systems where you have no idea, you see a weird process. It's like, is it part of my mail system? I don't know. So we don't have that. There is slight performance overhead, of course. Our message bus, which we hate, is not very fast, but that's not a real serious limitation because this doesn't affect our fast code path.
The query code doesn't actually get sent, or queries hit code that doesn't actually touch this message bus. It just gets handled by a single process. Another downside is it's kind of weird for administrators. It's not something that they're used to, and administrators are conservative and suspicious, and they hate change like all of us do.
So I expect we'll see a little pushback, and I look forward to the rants on Slashdot. Yeah, so what's the status? We're just finishing our beta. We have a release candidate which is coming out in a few weeks, and then hopefully we'll put out the actual release shortly after. I don't expect any.
I mean, we have tons of bugs to fix, but nothing that should prevent you from running it in operation. So it's ready for production use. There's a link there. You can go to the webpage and check it out. You can download tarballs or get it from Git, or read all the documentation, all that good stuff.
Anyway, that's it. So this is great news. We'll have a new BIND release in February. Does someone else have a question?
Is it possible to transfer the configuration files that you've got already from BIND9? Saved by the bell, no. You can answer. Not right now, no. That's one of our high priority things to do, but we needed to get the functionality done.
So not yet. We will be making tools for that. While Python is popular among programmers, Perl is still highly popular among system administrators. Do you plan to support Perl bindings as well?
Do we plan to support Perl? Well, the system itself was written in a combination of C++ and Python. We have no intention of making a... We have no real intention of making a Perl library.
However, if there's demand, we'll do it, basically. Is there another question? Oh, over there. The microphone is coming. Yeah, we need to use the microphone. We're recording, so... The DHCP server, is that new code, or is that just the old...
The DHCP server is also new code. OK, so it can do more than 150 queries a second. Yeah, that's actually sponsored development. It's not quite in as advanced a state as the DNS code. It's being actively developed. It's about an alpha-level release, I'd say.
It's being sponsored by Comcast, and their main requirement is performance. I can do another 15 minutes about that, if you want.