Squeezing efficient and small Nix containers into Nomad
Formal Metadata
Number of Parts | 28
License | CC Attribution 3.0 Unported: You are free to use, adapt, copy, distribute and make publicly available the work or its content, in unchanged or adapted form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/61028 (DOI)
NixCon 2022, part 11 of 28
Transcript: English (automatically generated)
00:00
So this talk will be about squeezing efficient and small Nix containers into Nomad, so it kind of relates to what we had before. I'm known as MagicRB online and Richard Brežák in real life. I've been working as a contractor at Serokell for just about a year.
00:23
And since September, I am a full-time bachelor student at the VU in Amsterdam. And since roughly June, after convincing them, I've been working on the Nomad bring-up at Serokell. So if you don't know what Nomad is, the other word for Nomad is Kubernetes.
00:42
It's essentially the same thing, just by a different company, and it works in a slightly different way. If I had to summarize the difference between Nomad and Kubernetes, it would be in how they handle many different types of workloads.
01:01
If you want to deploy things differently, Nomad has a switch for that. So it's very extensible and works in a different fashion than Kubernetes, which also means it's a bit buggy, since it's distributed and stuff. But it's getting better really fast, and it's actually quite stable at this point.
01:21
So I think that, even though we're NixOS people, containers are kind of the future. When you put NixOS on plain VPSs, you have to worry about topology. You have to worry about cross-service interaction at configuration time, because even though the module system should ensure that if you change
01:42
one option for NGINX, it doesn't affect Apache, that's not always the case. And also, VPSs generally tend to be too big if you want to buy them efficiently. So if you want to separate your services onto different systems, you're wasting resources.
02:00
So that's what containers can help us with. For example, these are the statistics from Serokell. You can see that we have two build servers, which, as you would expect, have large disks and are mostly full. Then we have the server, which is basically
02:22
something which I can't name. And generally, everything else is below 50% disk usage-wise. And memory-wise, most things are too, except for a few which are above 50%, which in this case is just Vault
02:42
and the miscellaneous server, which just holds a lot of random things we didn't have any other place to put. So we could easily consolidate everything into containers and save a lot of resources that we're not really very careful about using. So the effect of switching, in my opinion,
03:02
is less work for engineers, because you don't have to worry about the topology, the IPs and stuff like that. You don't get the interactions between services that I mentioned. And with better packing, you can have fewer machines and therefore save some money. And then we have a meme.
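As a rough illustration of the "better packing" argument, here is a minimal Python sketch of first-fit-decreasing bin packing. The service names, memory figures and machine size are made up for the example and are not the statistics shown in the talk; a real scheduler such as Nomad weighs more dimensions than memory.

```python
from typing import Dict, List


def pack_services(memory_mb: Dict[str, int], machine_mb: int) -> List[List[str]]:
    """First-fit decreasing: put each service on the first machine with room."""
    machines: List[List[str]] = []  # service names per machine
    free: List[int] = []            # remaining memory per machine
    # Largest services first; ties broken by name to keep the result stable.
    for name, need in sorted(memory_mb.items(), key=lambda kv: (-kv[1], kv[0])):
        if need > machine_mb:
            raise ValueError(f"{name} does not fit on any machine")
        for i, room in enumerate(free):
            if need <= room:
                machines[i].append(name)
                free[i] -= need
                break
        else:
            # No existing machine has room, so "buy" another one.
            machines.append([name])
            free.append(machine_mb - need)
    return machines


# Hypothetical workload: six services that would otherwise each get a VPS.
services = {"vault": 2500, "misc": 2200, "ci-agent": 1500,
            "web": 700, "wiki": 600, "mail": 400}
placement = pack_services(services, machine_mb=4096)
print(len(placement), placement)
```

With these assumed numbers, six one-service machines collapse into two, which is the kind of saving the slide is pointing at.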
03:21
So, like we were talking about during the last talk, in my opinion dockerTools and ociTools are fundamentally flawed. It's nice that we can interact with existing runtimes, but we're stuffing Nix store paths into layers, and in the kernel there is a limit of 128
03:40
layers for overlayfs, which essentially everyone uses. That's a bit troublesome, because it's common to have way more store paths than 128, and therefore you have to somehow choose a heuristic, but that's complicated. And also, to achieve a system where you're reusing the layers
04:00
between different containers, you would have to somehow do the stuffing into layers outside the Nix store, so you can keep state between individual containers. With the current solutions, so ociTools and dockerTools in Nixpkgs, if you, for example, wanted to run a Hydra container, the last layer will end up being 1.2 gigabytes in size.
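To make the layer problem concrete, here is a minimal Python sketch of one possible heuristic: give the most widely shared store paths their own layer and lump everything that does not fit into the final layer. The popularity ranking, the store path names and the sizes are assumptions for illustration; this is not the code used by dockerTools or ociTools.

```python
from typing import Dict, List

MAX_LAYERS = 128  # the limit quoted in the talk, taken as given here


def assign_layers(sizes: Dict[str, int], popularity: Dict[str, int],
                  max_layers: int = MAX_LAYERS) -> List[List[str]]:
    """One layer per popular store path; the remainder shares the last layer."""
    ranked = sorted(sizes, key=lambda p: (-popularity.get(p, 0), p))
    if len(ranked) <= max_layers:
        return [[p] for p in ranked]
    head = [[p] for p in ranked[: max_layers - 1]]
    tail = ranked[max_layers - 1:]  # everything else ends up in one big layer
    return head + [tail]


# Hypothetical closure: 300 store paths of 4 MiB each (names are placeholders).
sizes = {f"/nix/store/{i:03d}-pkg": 4 for i in range(300)}
popularity = {p: 300 - i for i, p in enumerate(sorted(sizes))}
layers = assign_layers(sizes, popularity)
last_layer_mib = sum(sizes[p] for p in layers[-1])
print(len(layers), len(layers[-1]), last_layer_mib)
```

In this toy closure the last layer holds 173 of the 300 paths, so changing any one of them means shipping the whole layer again, which is the effect described for the Hydra container.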
04:22
So if you change a config file, you're copying 1.2 gigabytes over the network. Also, overlayfs has to do a lot of work to merge folders which it doesn't know anything about, but we do know that store paths are, first, immutable, and second, they diverge at a very specific point
04:42
in the directory hierarchy. So we can simplify lookup a lot. Then the next question is, what do we want to put inside the containers when we have figured out the store paths, which I'll come back to later? So one option is NixOS, which we already have, but NixOS is not really aiming to be small.
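The point about store paths diverging at one known level of the hierarchy can be sketched in Python as follows. Because every store path sits directly under /nix/store, finding which mounted path a file belongs to is a single dictionary lookup on the first component below the store, rather than a search through stacked layers. The store path name and the host location are hypothetical, and this models only the idea, not how any particular runtime implements it.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

STORE = PurePosixPath("/nix/store")


def resolve(path: str, mounted: Dict[str, str]) -> Optional[str]:
    """Map a path inside the container to its source via one dict lookup."""
    try:
        rel = PurePosixPath(path).relative_to(STORE)
    except ValueError:
        return None  # not under the store at all
    if not rel.parts:
        return None  # the store directory itself
    root = mounted.get(rel.parts[0])  # e.g. "<hash>-hello-2.12"
    if root is None:
        return None
    return str(PurePosixPath(root).joinpath(*rel.parts[1:]))


# Hypothetical mount table: store path name -> where it lives on the host.
mounted = {"abc123-hello-2.12": "/host/nix/store/abc123-hello-2.12"}
found = resolve("/nix/store/abc123-hello-2.12/bin/hello", mounted)
missing = resolve("/etc/passwd", mounted)
print(found, missing)
```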
05:02
It never was. By default, it uses systemd as the init and system manager, which can be changed. I'll get to that. And also it's constantly changing. So if you want to do some large changes to it, it's hard to keep up, and you have to do an RFC and everything, which is annoying.
05:21
So I tried this initially, just by getting rid of it. You end up with essentially a Dockerfile, which is just a bash script. You have to create, being storage, and all the other stuff. You also need a process reaper, so normally people recommend dumb-init, and a lot of other things. So you end up with a Dockerfile, essentially.
05:45
We have the module system, so why don't we use it? And that's exactly what Arion tries to do. I have a few issues with its configuration, and I'll go over them. So we essentially just have to say hi
06:00
to bash scripts again. The core of the module system is a bash script, since Arion, to get rid of systemd, utilizes bash to do the setup. The option boot.isContainer in NixOS is quite fun. If you want to figure out what it does, you have to grep the Nixpkgs repository
06:22
to see where it's being accessed, which I don't think is ideal. It should be the other way around, but I see why it's not. Then, this is just a personal thing, but I think that the services in NixOS are fundamentally flawed, in that the nginx module and the Apache module and a lot of other modules do way too many things,
06:41
and it's really hard to know what they're actually doing. You have to read the 1.2K lines to actually understand it, and even then it's not easy to understand. So when I'm doing stuff, I go a bit more minimal. And then we have the return of bash, where things are done by bash which should not be, like setting up /etc/passwd or /bin/sh and stuff like that.
07:05
Oh, /bin/sh is actually being set up in Arion. These things should not be done by bash. They should be done by the module system. We have it for that reason. And then just a minor gripe: I don't think that putting what I would consider physical configuration, like the ports
07:21
or whether to use the host, inside the config for the container itself is a good idea. It should be separate from the container. So the question is whether Arion should be adapting NixOS. In the normal world, we see Debian, Arch, Ubuntu (and I have a typo there) versus Alpine.
07:40
And I don't think I see Debian being adapted into something which we can put into containers. Debian is just taken if you need it. Generally the recommendation is to use Alpine, as far as I know. So on our end, we have NixOS and kind of nothing. And that's what I've been doing for a year, kind of slowly. I have this distribution called NixNG,
08:02
maybe blood, the name is debatable. It's a NixOS-inspired GNU/Linux distribution, which was made for containers specifically. Currently it uses runit, which isn't great, but that will change someday. It's stable, it works. It can run a lot of things.
08:22
The modules, they generally correspond to NixOS modules one-to-one. And as for the testing, we talked about NixOS tests in VMs. If we got support for user namespaces in the Nix build,
08:40
we could do it through containers with this. So the point of NixNG is that it's an experimentation ground, and it has been for me. So I invite all of you to come and do something weird. I don't really mind the weirdness in the repository as of now. It's way smaller,
09:01
so you don't need to change as many things, and you're also less likely to break something if you change something. I think the deviation from NixOS is good at this point, because NixOS has gotten quite large, and I'm not saying that it's bad. I'm saying that we need to experiment with new things again, and that's what you can do here. So what about the layers?
09:21
Let's get back to that. There's this thing called "erase your darlings", which essentially means that you do not have anything in the root FS but the Nix store, and the NixOS boot process sets up everything else, and we can do that in the containers themselves using bind mounts. So we can run this in either Kubernetes,
09:41
which I'm not familiar with and don't like, or we can run it in Nomad, HashiCorp Nomad, I mean, and Nomad has a notion of task drivers. So the nice thing is that you can take literally anything, even a punch card reader, and hook it up to Nomad and it will run it. It doesn't care. So I've built an experimental driver
10:02
which takes the containerd driver by Roblox, tapes Nix onto it and hopes it works. It actually works, though. It has a few bugs, but those bugs come from the original task driver, not my changes, and it's still great as a proof of concept.
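The bind-mount approach mentioned earlier could look roughly like the Python sketch below: starting from an empty root file system, each store path in a container's closure becomes one read-only bind mount. The dictionary keys follow the mount entries of the OCI runtime specification (destination, type, source, options); the closure contents are hypothetical, and this is not the code of the experimental driver itself.

```python
from typing import Dict, List


def bind_mounts(closure: List[str]) -> List[Dict[str, object]]:
    """Turn a closure (list of store paths) into read-only bind mount entries."""
    mounts: List[Dict[str, object]] = []
    for path in sorted(set(closure)):  # de-duplicate, keep a stable order
        if not path.startswith("/nix/store/"):
            raise ValueError(f"not a store path: {path}")
        mounts.append({
            "destination": path,        # same location inside the container
            "type": "bind",
            "source": path,             # taken straight from the host store
            "options": ["rbind", "ro"],  # store paths are immutable
        })
    return mounts


# Hypothetical closure of a small service; the hashes are placeholders.
closure = [
    "/nix/store/aaa111-glibc-2.35",
    "/nix/store/bbb222-hello-2.12",
    "/nix/store/aaa111-glibc-2.35",  # duplicates are harmless
]
mounts = bind_mounts(closure)
print(len(mounts), [m["destination"] for m in mounts])
```

Since nothing is copied into layers, two containers that share a store path simply share the same host directory, and changing a config file only adds one new small store path.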
10:21
With the simplicity of Nomad's task drivers, we can do literally anything. We could use bubblewrap. We can even do it in Rust, since it's all gRPC. We can adapt the Docker driver in Nomad itself, which is the most stable out of them all, since you can actually force Docker to run a root FS if you're really smart about it.
10:41
And we can also just directly call runc or any other option you can think of. And what we get by using Nomad is interop with Docker containers for absolutely nothing. Nomad's design allows all types of tasks, if the task driver is set up correctly, which isn't hard to do,
11:01
to interop via the network and via the file system, and they can also mount volumes over NFS. So we don't lose any functionality. We just lose some size and annoyance. And as for the future, well, the task driver needs work, a lot of work.
11:20
It works. I've been using it. It's been running at Serokell for a while in a state of not doing anything, and it's stable, but most work is needed in NixNG itself. I think that a rename is due, because the name doesn't really roll off the tongue. And also we need a new init system, because runit is not amazing, since it boils down to bash scripts again.
11:43
My idea was to write something new and experimental. We could use lots of processes communicating over something standardized like MQTT, for example. And then the processes would ask the others to do something
12:01
and we could easily replace them one by one with other languages or other implementations, as long as we hold to the spec. So that's all. And I have to thank Serokell, of course, for allowing me to be here. My friend from Bratislava who helped with the initial design of this presentation, two people from the NixOS Discord server,
12:21
one for proofreading and one for running a test version of this thing for a while. My dad, who doesn't know anything about Nomad, for listening to my rants. And my friends at uni for suffering through the initial versions of this talk, which was just me looking at my notes and trying to say something. So yeah, that's all. You can contact me on Matrix or via email
12:42
and the laptop just died. Okay, if you want to contact me, then come talk to me. I'll give you my email and my Matrix and everything. So that's all. Thank you. Thank you.
13:01
Good timing. So that's very energy efficient, actually, because we're not using any more power than necessary, apparently. Are there any questions? No questions? Yes. I think you briefly mentioned that you were using modules that look kind of like,
13:22
that mirror the existing NixOS modules. But I guess they're based on Nomad service runners in some way. No, it's separate. So if you have the users.users option in NixOS, NixNG does the same thing, but in its own way.
13:40
Since I'm not using systemd, I can't rely on systemd. But the modules are close enough that you could feasibly make a transpiler from NixOS configurations or modules, from a subset of the NixOS option set, to the options of NixNG and get it to somehow work. If you look on my GitHub for,
14:02
for example, the Hydra module, there is a header, "MIT licensed code begins here", and then 500 lines later, "ends here". So I just literally copied part of the NixOS module into my own module. It's really similar. Generally only the config changes, because you're not using systemd. Okay, so the interface is about the same,
14:24
but the implementation is different. Yeah, you can keep the interface about the same, but you don't have to, that's the nice thing. Yeah, thanks. Any other questions? No? Okay, thank you again Richard. No, no, it just looked like you wanted to raise your hand.
14:42
I'm sorry, thank you again Richard.