
FreeBSD Unified Deployment and Configuration Management


Formal Metadata

Title
FreeBSD Unified Deployment and Configuration Management
Subtitle
A practical approach to managing highly heterogeneous installations
Number of Parts
24
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
When we needed dozens of storage, processing and front-end machines for a prototype of a new cloud media service, we developed a cost-effective, but technically challenging hybrid strategy of purchased, rented dedicated and rented virtual servers. FreeBSD was an easy choice thanks to its performance, reliability, and unparalleled ease of management on a per–node level. However, while the number of infrastructure–level tasks kept growing and we needed to scale through beta and release stages, there was an obvious need to reduce complexity. After a year of tentative design and experimenting with partial solutions, we started implementing in November 2011, the result-in-progress being something we call unified configuration management (and deployment), bringing immediate returns on time invested. The talk focuses on a new unified approach to deploying and managing modern versions of FreeBSD across a wide variety of technical and administrative circumstances: different countries, data centers, hardware, access policies, boot methods, networking, support contracts, machine roles, etc. While avoiding any popular Linux-centric CM systems, such as Puppet, Chef, and CFEngine, we achieve very low complexity by leveraging rc(8), loader(8), glabel(8) and other existing instruments, such as pkgng, to their potential as necessary. The cornerstone is keeping configuration and deployment versioned and unified — same across all cases, with no duplication of common parts and very simple specification of per-role/per-case peculiarities. The approach spans everything from installation and booting to managing third-party and custom site-specific software. The method is being actively developed and applied in production environment of a popular online music service.
Transcript: English (auto-generated)
Hey guys, when you're all done, the power cords and extension cords that aren't yours, could you pick them all up and bring them down and put them on this table, please? Thank you.
Hey guys, my name is Andrew. I'm an ex-systems engineer. Many years ago I was quite active in the ports tree, as any good FreeBSD systems engineer is. But life happened, and as time passed I got into a very interesting situation: being the sole systems engineer on a sort of large-scale project, but with a zillion other tasks along with it, and just 5 to 10 percent of my time dedicated to the sort of tasks I'm used to doing, or was used to doing. On one hand that's a very unfortunate position: whatever you know how to do best, designing systems, architecting, making ports and packages, all that stuff, you now have only a small portion of your time for it. But on the other hand, you don't have any excuse to do routine stuff, to waste your time, to do things that can be automated. When you really have little time, when you're really forced to do things efficiently, you find yourself in the fortunate situation of being forced to come up with efficient solutions.
So this story starts like many others. We're currently a medium-sized company with a
kind of moderately sized private cloud, which handles a large assortment of tasks associated with online media services: ingesting content, processing it, and streaming large amounts of it. Currently most of the computing power is spent on a music service, so that's ingesting, copying, encoding, extracting audio features, and then of course streaming to lots and lots of clients.
So it's by no means anywhere near the sort of scale Yahoo or Facebook or any of that kind of companies are, but it's a fair amount of processing, it's currently petabytes
of storage, almost a hundred gigabit of aggregate transfer capacity and teraflops of processing. So to give you just a bit more stats, it grew to four countries now, Western Europe,
Eastern Europe, North America, 10 cities, more than 10 data centers, we have to deal with a number of very different service providers, different support contracts, SLA levels ranging
from very agile and hands-on support to no support at all. And still it's just around a hundred machines, so they're fairly powerful, many of them carry a lot of storage attached to them.
It's about 20 really distinct hardware configurations and about a hundred mostly large-capacity hard drives. So that's where all the petabytes of data are located.
Yeah, it boils down to several dozen local networks, different network types, depending on the service provider, whether we own it or not, about seven types of out-of-band consoles.
Currently it's one operating system, I'm sure you can guess it, but the sort of tasks we're dealing with will probably warrant more operating systems eventually. Still, it's currently about five boot types: local hard drive, local USB flash, and several kinds of network boot. And this is what really forces us to do stuff efficiently: there's just one systems engineer, one network engineer who of course is 90% occupied with networking tasks, and one field engineer.
So initially, when we were just starting, we solved all the problems the usual
way. The machines we owned, which were nicely co-located in just one data center, we basically made a cluster out of them. It was network booted with an NFS root, and, well, thanks to the work of Brooks Davis and many others, that works quite nicely right out of the box; you don't have to do a lot. And the shared-root configuration, where anything you add on any box is immediately shared with all other boxes, worked really nicely.
So that setup was working very well for us, not demanding too much attention. As for the leased servers, we went with the usual route of set up once and kind of forget, at least for some time, because it becomes increasingly difficult to update everything and change configuration in time unless you employ some kind of external automation. So when things started scaling, we briefly considered Puppet and other configuration management systems like it. I had tried to deal with them before that, but they always seemed, maybe to a guy who hadn't really spent a lot of time with them, like an unnecessary level of complexity. There was also the usual way of using a lot of in-house scripts to generate basically all the configuration files and everything else, and deliver them to all the machines. But obviously, if that route is to be considered seriously, you'd better go with a standardized solution like Puppet. So priorities were quite obvious.
We were scarce on manpower and we still needed extremely high reliability and performance, higher than a lot of out-of-the-box ready-made solutions provided.
So of course, if you want to store a few petabytes of data, you can just purchase a solution from NetApp or Isilon, and that would solve a lot of management problems. We personally dealt with NetApp appliances and they are really easy to deal with.
But, for example, we needed streaming, processing and storage. NetApp gives you just storage, Isilon gives you storage and streaming, but no processing. And when we started, we wanted to be cost effective.
That's not so critical now, but we wanted to be cost effective and slipstreaming all these three major tasks into one custom FreeBSD-based solution seemed very logical at the time. And in hindsight, it still looks that way.
So there was a period of agony. We had to maintain configuration while starting to deploy more and more boxes in different locations, with different networking setups, different SLA types, different KVM types, whatever.
So there was that, and we needed to scale internal processing. A lot of different factors, and there needed to be one answer to all of them. So: is there a solution, custom or existing, to ease everything at once?
And so I talked to a lot of systems engineers, very seasoned, very experienced. And when you lay out the whole range of problems and look for one single answer to all of them, at least in part, most of them look at you and say, well, are you an idiot? There's a solution to this, and there's a solution to that: you need Puppet, you need an enterprise or open-source out-of-band management solution, you need this other solution for deployments. And, you know, in fact FreeBSD doesn't do automated deployments as well as this or that Linux. So the agony was: is there one answer to all of that? And I believe we found our own, well, maybe my personal holy grail.
And it boils down to just a few simple methodologies, a very simple ideology. So if some of you are expecting rocket science or a lot of buzzwords, I'm afraid you will be disappointed.
There's very little code, actually very little stuff to talk about, just a few simple methods of doing things. I call it unified configuration management and unified deployment, and it's basically the same thing. I'll just describe the current status, as opposed to the long road that led to it.
So what's unified? It's when you have exactly the same root file system and basically all the configuration everywhere.
It sounds difficult to do in production, but then again, looking back at the common-root NFS-based setup where it works really nicely, it ceases to sound fantastical.
So it requires a bit of sorcery. We decided to go with Git, and currently there's one main Git repository covering the whole root file system.
There are some custom scripts kept elsewhere, in a separate repository under our local project name. And every administrator or external user who wishes his home directory to be distributed through the whole private cloud needs to convert it to Git as well.
And from there it's really, when you achieve that, then it really becomes a straightforward solution, a straightforward fully distributed solution.
Git is really one of the nicest tools for doing true master-master sync. Yes, it's manual, but that gives you unparalleled flexibility, and it has very powerful conflict resolution. And when you think about it, if any of you have had experience with OpenLDAP or any enterprise solution that kind of does master-master, you have to spend a lot of time setting them up, and when something goes wrong you'll spend sleepless nights debugging it. Compared to that, a Git-based, file-system-based registry of configuration doesn't seem like a crazy solution.
And we didn't deploy any symlinking or file-copying solution on top of Git. A lot of people use those: you keep your Git-based or Subversion-based checkout somewhere to the side and symlink to it, so that it seems more orderly.
But all you really have to do in Git is disable showing untracked files, and then you can keep the Git checkout right in the live production directory; the one-line setting is sketched below. What that gave us is this: when we slipstreamed dozens of very different configuration types, different rc.conf files, the configurations of all the services we employ, HTTP and everything else, into a single repository, a single repo, a single branch, that really concentrated all the complexity in one place. So instead of, when you want to find out how that machine differs from this one, having to log in to two different machines or check out two different branches or repositories, you just look at a single piece of configuration; depending on how exactly it's done, you either see it all in a single file or in two files lying right next to each other.
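For reference, a minimal sketch of that untracked-files setting, run inside the live root checkout (this is the stock Git option, nothing custom):

    cd /
    git config status.showUntrackedFiles no   # "git status" stops listing the rest of the filesystem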
But of course the question is: what if you have hundreds or thousands of machines, with a lot of different roles assigned to each of them? It's not a big HPC cluster where everyone is doing the same job. It's really a private-cloud production setup, so depending on location and hardware configuration you have boxes dedicated to ingestion, to processing, to streaming or to log collection.
So how do we keep it whole, how do we keep the roles in one place without them infringing on each other? Well, first things first, I thought about how to store roles.
A role is a very simple concept; in Unix the basic synonym is a group, in Postgres we have roles, in Solaris there's a separate concept of roles. It's a very widespread concept, and I needed it not just for users but for machines. I thought about implementing it by keeping a separate file mapping host names to roles or something like that, but then it struck me that at our scale, hundreds, thousands, maybe even tens or hundreds of thousands of machines, we could really just use the passwd and group files, the standard Unix infrastructure for users and groups. And that worked really nicely: you place every host name there, every host name is assigned an eponymous group, and then you have groups of hosts signifying their roles. So web server hosts would be assigned to a group named, whatever, webhost.
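As an illustration (the host names, role names and GIDs below are made up), the group file then looks something like this: every host gets its own eponymous group, and role groups simply list their member hosts.

    web01:*:2001:
    web02:*:2002:
    ingest01:*:2003:
    webhost:*:3001:web01,web02
    ingesthost:*:3002:ingest01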
Just one file that is kept separate, needed for machines to learn who they are when they boot up, is what I call the map of awareness. Because at boot, when each machine has exactly the same root, you have to ask yourself: how does it know what host name to attach to itself, and, if it's not booted on a DHCP-enabled network, what IP address to assign to itself? So we decided to go with just one map, one file, that ties host names, which are the same as role names, because every host has a unique role of its own apart from all the others, to one or more MAC addresses. So when you boot, you wonder who you are, you look for your MAC, and when you find it in that map, you instantly know which host you are and which basic role you have; all the other roles you get from /etc/group and the other configuration files.
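A rough sketch of what that could look like; the file name, its format and the lookup are my own illustration of the idea, not the exact implementation:

    # /etc/awareness.map (hypothetical): MAC address -> host/role name, one line per interface
    00:25:90:aa:bb:01  web01
    00:25:90:aa:bb:02  web01
    00:1b:21:cc:dd:03  ingest01

    # at boot, a small sh snippet can match a local MAC against the map:
    mymac=$(ifconfig -a | awk '/ether/ { print $2; exit }')
    myname=$(awk -v m="$mymac" '$1 == m { print $2; exit }' /etc/awareness.map)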
So how do we get away with having, for example, one rc.conf on a hundred very differently tasked boxes? With rc.conf it's really a breeze, because it's basically a shell script; configuration files that are easy to convert to this setup, one config file for a hundred differently tasked boxes, are what I call role-aware.
It's a shell script, and the only problem is that you need to know how it's evaluated. It's not sourced just once, so you have to really understand, especially at boot, how exactly it is sourced and what happens there. But when you do, it boils down to a solution somewhat like this.
There's a common part, and obviously 80 or 90% of the configuration, depending on the setup, is common to all boxes: you basically want ntpd and some other parameters on every one of your boxes. But then comes the interesting part. When you want your web server role to have a specific parameter enabled, what we currently do in rc.conf is just define a function named after the role and place anything role-specific in it. If you want a specific hack enabled on just one host, you can do that too, since every host is also a role of its own. And at the end of rc.conf, or anywhere else you know will be sourced just after rc.conf, we do just this: for each of your roles, look for a function named like that, and if it exists, execute it; a sketch follows.
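A minimal sketch of the idea. The role_ prefix, the example settings and the way the role list is obtained (from the host's group memberships) are my own rendering, not the exact code from the talk:

    # /etc/rc.conf (sketch): the common part first
    ntpd_enable="YES"
    sshd_enable="YES"

    # role-specific bits live in functions named after the roles
    role_webhost() {
            nginx_enable="YES"
    }
    role_logcollector() {
            syslogd_flags="-a 10.0.0.0/8"        # example flag only
    }

    # at the very end of rc.conf (or in anything sourced right after it):
    # the host is a user in passwd, so its roles are simply its groups
    for _role in $(id -Gn "$(hostname -s)" 2>/dev/null); do
            type "role_${_role}" >/dev/null 2>&1 && "role_${_role}"
    done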
So that's kind of the flux capacitor, all the "complicated" code that went into enabling this setup. nginx.conf, for example, is not like that. I'm not sure how many of you have ever used nginx, but its configuration syntax is sort of reminiscent of Apache httpd, lighttpd and other web servers. It's not role-aware in the sense that you can't do really crazy stuff within it; from within a single config file you can't really ask "am I that host, so should I do this?". But what you can do is define all the servers, all the types of servers you need, put them in one file, and use the fact that on each of your hosts only the relevant parts of that configuration file are actually exercised.
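A sketch of what such a shared nginx.conf can look like; the server names and paths are hypothetical, the point being simply that every role's server block lives in the one file and each host only ever serves the names that point at it:

    events { worker_connections 1024; }
    http {
        server {
            listen       80;
            server_name  stream.example.com;
            root         /vol/media;
        }
        server {
            listen       80;
            server_name  ingest.example.com;
            client_max_body_size 2g;
        }
    }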
Well, actually, another example that I didn't put in my slides is the new hastd daemon, the highly available storage daemon. What I like about it is that its configuration syntax specifically supports multi-host configuration. If you ever tried setting up hastd, then you know that you can put the configuration of two or more of your hosts in one file, and each part of the configuration file will be applied exactly where it's needed.
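For illustration, a minimal hast.conf sketch in that spirit (host names, addresses and the local provider are made up); the per-host "on" sections are what lets the same file live on both nodes:

    resource shared0 {
            on storage01 {
                    local /dev/gpt/hast0
                    remote 10.1.0.2
            }
            on storage02 {
                    local /dev/gpt/hast0
                    remote 10.1.0.1
            }
    }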
Most configuration files, though, are problematic. syslog.conf, for example, is something I call role-unaware: especially if you're using the stock syslog daemon, it's very difficult to use a single configuration file across the whole infrastructure, particularly when you have a dedicated log collection server.
So what we do in this case is keep two configuration files: one common to most nodes, which just sends all or most log messages to the log collector, and a second file for the log collector itself.
Both of them are kept in the same Git repository, so you can always look them up. And we have an rc.conf-based workaround: on all the boxes the default syslog.conf is used, but if the machine is assigned the role named logcollector or whatever, then in rc.conf we just point syslogd at the other configuration file. (I'm not sure I got the exact flag right on the slide; I wrote it from memory.)
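A hedged sketch of that arrangement; the forwarding destination, the role hook and the exact flags are assumptions of mine, not the files from the talk:

    # /etc/syslog.conf (the default, on every box): ship everything to the collector
    *.*        @logs.example.internal

    # rc.conf role hook on the collector: accept remote logs and use its own config file
    role_logcollector() {
            syslogd_flags="-a 10.0.0.0/8 -f /etc/syslog.conf.collector"
    }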
The hard case is fstab. Obviously fstab is needed at boot, so you can't use an rc.conf-based workaround, and what we settled on is to just keep it empty. Just keep it empty, and we specify where to boot from in loader.conf instead, using the loader variable vfs.root.mountfrom, or something like that.
And what we discovered while doing that, what's interesting about the setup, is that machines don't care whether they booted from the network or locally. If you keep the configuration files properly in sync, they just don't care. So when new machines arrive, or when the boot drive on any one of them goes toast, they just start booting from the network. And we discovered that if you have that parameter in loader.conf saying "mount root from the local hard drive", but the kernel doesn't find that local hard drive and the box was booted from the network, from NFS, it just ignores the line. So it's a very useful fallback: if you're booting from NFS and the local hard drive is present, you mount root from it; if it's not, you don't. For all the partitions other than the root one, we just use a single script that does a very simple thing: it looks at what's available and mounts it where needed.
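A minimal sketch of what such a mount-whatever-is-available helper could look like; the /vol mount-point convention and the skipped label are my own assumptions:

    #!/bin/sh
    # mount every labeled UFS partition except the root under /vol/<label>
    for dev in /dev/ufs/*; do
            [ -e "$dev" ] || continue
            label=${dev##*/}
            [ "$label" = "root" ] && continue
            mkdir -p "/vol/${label}"
            mount -t ufs "$dev" "/vol/${label}" 2>/dev/null
    done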
So here we come to actually how to do deployment efficiently. In most cases it's not a problem when you have like a single type of infrastructure, just rented boxes or just NFS booted boxes or just anything.
It's not a big problem, but when you have a lot of different scenarios at once, you have to come up with something kind of different.
So what I wanted to do is to find some kind of setup that kind of mimics what appliances do, like all the beloved Juniper devices, NetApp, all the other FreeBSD and Linux and custom OS-based appliances.
They're very straightforward in that you load some kind of single image into them and they just work. You have to tweak configuration, but they just work. If you want to upgrade, you sort of just load a new configuration, a new image, and they just continue to work.
You reboot, they just work. So what I came to is we don't do embedded, so all machines basically have at least several tens of gigabytes of space.
So it came down to an image the size of two 10-gigabyte partitions plus one 4-gigabyte swap partition. What you do is find a drive that's suitable for booting, whether it's USB flash or SSD or HDD, and partition it, we use GPT for that, into at least three partitions. Two of them are for root, basically for redundancy and upgrades, one goes to swap, and anything that's left on the drive, it might be a 4-terabyte drive, is made into a UFS partition as well.
And for labels we use a scheme like /dev/ufs/<serial number of the hard drive>.
So yeah, that's loader.conf. In loader.conf you basically just specify /dev/ufs/root, and then if you booted from NFS and your host does not have that partition at the moment of boot, it just falls back to the NFS root. If it does find that partition, it mounts it. If you're not booted from NFS, then you probably have that root partition and it's all very good.
That's how we keep a single loader.conf across the whole infrastructure, both the NFS-booted part of it and the locally booted part.
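In concrete terms that is roughly one line; a minimal sketch, assuming the local root partition carries the UFS label "root":

    # /boot/loader.conf, shared by every box
    vfs.root.mountfrom="ufs:/dev/ufs/root"
    # on an NFS-booted box with no such partition, the kernel falls back to
    # the NFS root it was booted from, so the same file works everywhere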
So if you get a new box, or some sort of hardware adjustment, all you have to do is update the map that assigns MAC addresses to hosts, place the new hosts into your passwd and group files, assign whatever roles you need them to have, and adjust your DHCP, DNS and basically your networking infrastructure. And then, on the new box, however you got into it, whether it's the mfsBSD-based rescue environment on Hetzner or your self-made NFS boot environment, you just find a suitable hard drive, partition it according to the script, it's just five lines of shell, and untar a recent image of a root partition from any other box onto it. You can use SSH for that: if you have a tar archive ready, you just cat it over SSH and untar it live.
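A rough sketch of what that partition-and-populate step can look like; the device name, sizes, labels and image path are made up, and the boot-code lines are an assumption for locally booted boxes:

    # partition a fresh drive (GPT): boot code, two 10 GB roots, 4 GB swap, data
    gpart create -s gpt ada0
    gpart add -t freebsd-boot -s 512k ada0
    gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada0
    gpart add -t freebsd-ufs  -s 10g ada0        # root A
    gpart add -t freebsd-ufs  -s 10g ada0        # root B, for upgrades
    gpart add -t freebsd-swap -s 4g  ada0
    gpart add -t freebsd-ufs          ada0       # whatever is left
    newfs -U -L root /dev/ada0p2                 # appears as /dev/ufs/root
    newfs -U -L DRIVESERIAL /dev/ada0p5          # per-drive label from its serial number

    # pull a recent root image from any other box and unpack it in place
    mount /dev/ufs/root /mnt
    ssh donor.example.internal 'cat /data/root-image.txz' | tar -xpf - -C /mnt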
Or, well, if rsync handled extended attributes and all the other stuff nicely, then we would probably use that instead, but it's extremely straightforward.
So when you have it deployed then there comes a question of upgrades. So we have three levels of upgrades. The most disruptive one is full upgrade, it's not completely finished at the moment.
The idea is: you have the currently mounted root partition, you untar a complete new image onto the second one, and then you pivot by changing the UFS labels on these partitions. Or do it any other way, but changing UFS labels worked nicely for us.
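A very rough sketch of such a pivot (partition devices and label names are hypothetical); tunefs wants the file systems not mounted read-write, so in practice this happens right around the reboot or from the rescue/NFS environment:

    tunefs -L root-old /dev/ada0p2    # demote the partition currently labeled "root"
    tunefs -L root     /dev/ada0p3    # promote the partition holding the new image
    reboot                            # loader.conf still says ufs:/dev/ufs/root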
So I'm just not sure what happens, where the kernel would be taken from, but it's just, for us it's kind of a bit of an incomplete setup, but mostly it works.
At least the userland will always be fresh this way. Obviously there's a desire to use rsync, or maybe pkgng if we get to a point where our base system is representable in the form of packages.
Or maybe a custom, on-site-configured freebsd-update server, something like that. But for now, well, it's actually not 24GB of information: the image is just about 2.5GB, and when compressed it's like 700MB.
So if you have like a 100-megabit connection, then it's really easy to load it fully and untar it fully each time.
The second, less disruptive level of upgrade is the package upgrade, where you obviously upgrade all the packages you use, and that's fairly straightforward thanks to pkgng. We don't have to keep the ports tree on any of the boxes or compile anything on them; all the compilation is done just once.
And what you can do is dynamically assign any box to ports building: you just check out the ports tree, build the ports, and then you basically have the new image and the new package repository.
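For illustration, what consuming such an internal repository can look like with pkgng; the repository name and URL are invented:

    # /usr/local/etc/pkg/repos/internal.conf
    internal: { url: "http://pkg.example.internal/${ABI}/latest", enabled: yes }

    # on any box:
    pkg update -r internal
    pkg upgrade -y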
It was our focus to make the environment fully distributed. Obviously it's never perfect, but mostly it is. And then the least disruptive way to upgrade, and the one that goes on
continuously is just git pull of the root partition, the custom scripts and home directories. That's now done semi-automatically, but I believe it can be automated just very close to real time.
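The semi-automatic step is essentially a loop like the following; the "allhosts" role group and the ssh arrangement are assumptions of mine:

    # pull the root checkout on every host in the (hypothetical) "allhosts" group
    for h in $(getent group allhosts | cut -d: -f4 | tr ',' ' '); do
            ssh -n "$h" 'cd / && git pull --ff-only' &
    done
    wait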
So that would constitute a very nice, well, low data but very consistent, very reliable distributed file system, if you allow me to call it that.
So when you have all that, you edit any configuration file on any of your boxes, wherever you're comfortable logging in, whichever box is closest to your current location. Just log in, edit, commit and push. And you don't need to think about, you know, any centralized template-based puppets at all.
You just work with your end configuration files, with your current configuration files. You don't need to think how they will be generated and stuff. You just edit and commit and push and they apply verbatim to all your other boxes.
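So the day-to-day workflow is roughly the following; nothing here is specific to our setup beyond the fact that the live root is the checkout:

    cd /
    vi etc/rc.conf                                # edit the real, final file
    git add etc/rc.conf
    git commit -m "enable foo for the webhost role"
    git push                                      # other boxes pick it up on their next pull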
And if you have a complicated situation, obviously Git has very powerful instrumentation to resolve that. So with the benefit of having the whole system almost perfectly distributed, you end up with a very scalable solution. I'm pretty confident that scaling up to tens or hundreds of thousands of boxes would be pretty seamless, because all you have to do is tweak how the Git-based configuration gets distributed. You don't need to do it from a centralized place; you just make a hierarchical, or maybe a randomly distributed, infrastructure.
You know, check your neighbors and transfer to them or from them if they're new, sort of. And it's also scalable in terms of human resources. So we just, we have very few people, we'll be glad to hire a few dozen more, even today.
But while we're limited, we're confident that it's scalable even now. Because the load on operations is really low at the moment.
It's mostly non-routine stuff: a new type of problem, yeah, you have to accommodate it in custom scripts or whatever; a new type of hardware, yeah, you have to think about that and accommodate it. But routine problems, at that point, started demanding several orders of magnitude less of our attention.
You don't have these hundreds of different configuration files; you just have everything in one place. But there are problems, of course. One of them is that Git is really kind of beautiful until you really get to know its guts. When you really start working with it, leveraging the more obscure parts of its functionality, well, you start hating it. And it's also not really designed to support file system versioning.
It's designed for code versioning; it would be trivial to add permissions and file modes to it, but apparently whoever is behind current Git development is really against that.
Well, that might be understandable, but I think it should really be reinvented. It's a very nice master-master synchronization, fully distributed solution, and in many respects Git opened the eyes of many developers and systems engineers to how things could be done.
But now, with that experience, it can be reinvented, I think, from scratch into a really better solution. rsync obviously does not support many FreeBSD-specific, and in general OS-specific, features.
It's just a universal tool and we would be using it much more if it had better support for all the attributes and other file system-specific stuff.
I think the most important problem is that any daemon author, any application author, should start thinking that his application will not be run on a single machine. If it's a good one, people will use it in companies where they don't have just one box. Even for really simple tasks, one box is not enough, and ten boxes are not enough these days; usually for moderate tasks you have dozens or hundreds of boxes, at least, starting from that.
So if you design your software with that in mind, you have to think about how that poor guy with the systems-engineer badge will be managing the configuration of your daemon or whatever, and accommodate role-aware stuff in your configuration parser, sort of like Pawel did with hastd. From the start he knew that hastd cannot be run on a single machine: it's a storage synchronization daemon, it doesn't make sense on a single machine. So if it's run on several machines, why shouldn't we have the opportunity to keep the configuration of all the machines in one configuration file? That sort of mindset is really something I think many software authors should awaken to.
So still the result I think is pretty simple. There's no, as I promised, no rocket science.
Just a couple of very simple tricks and a couple of obscure issues that you just need to stumble upon once and learn them. But it's pretty foolproof. You can have... Well, you can recover really easily whatever you do.
You have everything in one place, so you don't have weird out-of-sync issues. And you can have different versions of your Git repository on different machines. In fact I do think it's useful to have slightly different versions of configurations, unless that's security-related.
When you scale, your programs and your machines will crash, and when you have data showing that this configuration crashes machines more often than that one, whether it's rc.conf or a particular kernel configuration or kernel version, it really helps. It helps more than having just one single configuration across the whole infrastructure.
So, yeah, I think that's about it. If you have questions, shoot. And for those of you who manage a lot of machines and think that most of what we're doing is pretty nonsensical, I'd like to hear that from you and maybe have a chance to respond.
Did you evaluate union-mounted NFS roots? Yeah. So the problem with NFS is that we went from a single data center to a lot of them.
And, well, you can't keep a single NFS root in one central place because NFS is very sensitive to latency; even like 20 kilometers really messes things up, and we got to 700 kilometers. Obviously you can have NFS roots everywhere, but then that's a layer of complexity. NFS basically works really nicely; I just wanted a single solution that also works for our rented hosts, which we don't have an opportunity to boot from NFS. And the only way I could do that was to come up with a local boot solution which is compatible with NFS. Another thing with NFS is that it's really unstable: NFS roots work nicely, but when you saturate your network link, obviously you have problems. Basically, when you do a lot of access to NFS file systems, your system might become unresponsive. Placing a union mount on top of that would, I think, only exacerbate the issues. But it's a nice solution.
Do you do anything to try to... when you make a change on one host and it spreads out to all of them, doesn't that basically mean that one host can compromise all servers?
Yes, security. My approach to security is: if you don't have your infrastructure doing anything useful efficiently, then there's no security to speak of. So this whole thing is really about getting it to do something useful without having to work 24 hours a day on it.
From there I'm currently just starting to look into security. I don't think it should be put before functionality, when you don't have any functionality.
But yes, cloud security is really kind of a messed-up topic, when you have to have machines accessing data and functions on other machines without you being involved interactively. It's not an easy problem, I agree, especially with the resource constraints, when you don't have any more people to throw at it. But we're working on that; I have some ideas, maybe for the next BSDCan, something like that.
So for me, how do you structure it?
At that point, I'm obviously not familiar with that kind of scale, but at that point I would ask why the configuration has to change so often.
So in our example, you have to have what basically we've been posting, so you're not taking the effort, but you can't.
Yeah, I would just separate the really dynamic, high volume stuff into something separate. We need three applications, rather than the role of the replication. Cool, yeah, that solves it.
Can you talk a little bit more about how you managed the configuration for Nginx? Well, yeah, we had three different web servers with different configuration files. Each of them had sections with server name, server name 1, server name 2, server name 3.
And then when we wanted to convert it to a cloud, well, sort of single configuration files, we just put those sections into one configuration file and then put it on each box. So in each box, only the section that matters to it is really invoked.
In many cases, we may even manage to condense it even further, so some server roles got merged.
Well, we have some configuration in nginx. Actually, we do also have the problem with SSL keys: some hosts have them, some don't. So in that case we also use the rc.conf workaround, where we have nginx.conf wrappers which just include other nginx sub-configuration files. And that solves it. It's a bit of a... yeah, it's not that nice, but it works quite simply.
Well, yeah, it's currently in a kind of in-between state where it's semi-automatic, in that I basically have a command that invokes...
Well, it's just on the order of hundreds of machines. I have a command that invokes 100 SSH commands on all the machines if I want them all updated. So in that part it's currently manual. What I'm working on in relation to the
security topic is kind of a way for machines to talk to each other securely, where each machine has its own host account, so I want to leverage that into kind of secure remote procedure calls, but it's not fully automated yet.
I think that's pretty... What doesn't scale? I mean, the simultaneous... No, we can. Why? Don't we? No, that's not the limit. It's 32 bits, I think, currently, or 64 even.
You might have other scaling problems when you go into that side. Yeah, and there's a nice hash in them, but obviously that's just a way to do it simply without writing anything custom when you don't have time to...
We don't detect, we don't have compromised hosts, at least... You can't say you don't have compromised hosts. Yeah, obviously we do. I mean, everyone does have compromised... Especially people who run on Linux.
Yeah, okay. So do you have compromised hosts? Everyone does. Yeah, so you see? I have compromised hosts to get compromised. Okay, so do you have compromised hosts that you don't detect? Yeah. Okay, so obviously we probably do too. No, we don't...
We'll probably skip from the shell when we're probably detecting that. Well, yeah, we'll deal with security later, but... Actually it would probably be easy to detect somebody committing something into it, because I assume somewhere on a workstation there will be a copy, and in the pull you can see the history. And you can just get it online. Yeah, yeah. I don't think it's perfect, but it's easier to catch somebody doing that.
Well, yeah, but actually... I agree, it would be easier for you and me; I mean, we just log in and push. But for people who really break into boxes, I think it doesn't matter to them whether you use Git or not. You know what, every couple of minutes you just have a cron job doing a git reset, and poof, the changes magically disappear.
But then you can't make that change without... Well, if you really... I think it's important to treat security as important, but as an issue that doesn't need to get in your way before you have something useful. Because if you start with security, then you'd better not touch computers at all.
The other thing I see here is reinventing the wheel with several other systems.
Yeah, I don't... Yeah. You've got the binary update interface, which is active. And you've got CFEngine, Puppet, Chef, all of those systems that can deal with configuration files from a central control perspective.
Yeah. But how does it conflict? I mean, I don't require having the opportunity to edit anything anywhere; it's a feature that you can disable. And then, I guess, you can use freebsd-update instead of tar over SSH.
Well, we use pkgng instead of portmaster, which is also a standard solution. And I just didn't want CFEngine or Puppet, because you can really do very well without them.
Yeah, if you don't need that security compromise situation where you can push anywhere from anywhere, you just disable it. I don't know. I would be worried about reinventing if I really had something complicated.
At the moment there's like five lines of code and okay, I'm fine if I reinvented something with them. It's just so simple that I don't care. If I piled up Ruby, Python, whatever scripts in thousands of lines, then yeah, I would be worried.
Did I waste a month of my time doing that? Currently it's too simple to worry about that, I think. Yeah, 140.
The other concern is that puppet also comes everywhere, and that's the thing that it has to do.
If the whole Ruby system is running, you have your own code. You need to do that. Any time, if it's big enough, you need to do the integration management. If it's big enough, it should be something that you can use to do that. And it's not part of it.
Yeah. Well, what I'd like to see is operating systems incorporating at least part of those solutions. Actually, reinventing the wheel is what bothers me personally, as I see every company, every setup, reinventing it, whether they use Puppet or not;
they still have a lot of custom stuff. And sort of pushing some of that back to the operating system where I think it's supposed to be, because, well, the operating system is not a kernel, it's a repository of shared effort, something like that.
So, it would be nice. Role-aware configuration files would be nice, or maybe something else entirely that just makes all these solutions easier.
Anything else? I think it's the final session, and see you all around. Thanks for coming.