Scaling Up Flakes
Formal Metadata
Number of Parts | 28
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61027 (DOI)
NixCon 2022, 12 / 28
Transcript: English (auto-generated)
00:01
Let's get started with the last talk. I hope you still have some energy. Wow. All right, so this talk is about, well, it's called Scaling Up Flakes. And it's about fixing one of the issues that
00:22
is preventing flakes from being stabilized, namely a problem that a lot of people run into when they try to flake-ify their project. So if you have some big repositories, then it turns out that flakes don't work very well. Because as it turns out, Nix has the tendency
00:42
of copying entire repositories to the Nix store every time. And what a surprise, that doesn't scale. So this talk is about fixing that. And that takes away one of the big blockers for stabilizing flakes. So just a reminder of what flakes are for people
01:02
who have never seen them. So a flake is basically just a source tree, like a Git repository, that has a file named flake.nix in it. And this is a standard way of packaging projects that have Nix expressions in them. So where you previously might have a default.nix or a shell.nix, flake.nix standardizes that.
01:24
It gives a way for these flakes to have dependencies on other flakes. So you don't need to have this sort of monorepo anymore. And in addition to dependencies on other flakes, flakes, of course, have so-called outputs,
01:41
which are arbitrary Nix values like packages or developer environments or Hydra jobs or whatever you want to export. The flake format doesn't really care about that. And like I said, source trees can be Git repositories, but also tarballs or paths in a local file system.
02:02
But usually they're Git repositories. So this talk is actually not really about flakes. It's about fetchTree, which is sort of the operation that underlies flakes. fetchTree fetches a so-called input, which could be a Git repository or a path.
02:23
And yeah, so what's the problem? So the problem is large source trees, because Nix currently copies every flake or every tree to the Nix store. So regardless of what you do with the flake,
02:41
it ends up being copied to the Nix store. So if you do something like nix flake metadata on an arbitrary flake, as you see over here, that ends up in the Nix store. So if that is a multi-gigabyte Git repository, then every time you run Nix, it will first copy
03:05
all those gigabytes of data to the Nix store. If you make one character edit to your repository, well then you end up with a new multi-gigabyte copy in the Nix store. So this is not very scalable. And this is not just your top-level flake,
03:22
it's also all the dependencies, like Nixpkgs. So you end up with a lot of copies of Nixpkgs in your Nix store. And disk space may be cheap, but it's not that cheap. Yeah, so in particular, this makes hacking on large flakes very slow.
03:41
So Nixpkgs is not even that huge of a repository, but it's already big enough that hacking on Nixpkgs in the flake way, using the new CLI, so, say, nix build .#hello,
04:01
takes several seconds because it needs to copy the entire Nixpkgs flake to the Nix store. And yeah, like I said, it costs a lot of disk space. So for example, here, 93 gigabytes, and that's actually on ZFS, which compresses things automatically. On file systems like ext4,
04:22
which doesn't compress things and which has big cluster sizes, a Nixpkgs copy might be a few hundred megabytes. And of course, like I said, this multiplies every time you make any change to this flake, so that's terrible.
04:43
If the evaluator works on these copied source trees, you get error messages that refer to the Nix store rather than to the original location. Now, often there might not really be a sensible original location,
05:01
because if, say, you're using a GitHub flake, that flake didn't exist in your local file system anyway, so it's not really clear what sensible error message it should give, but an error message that just says /nix/store/<hash>
05:20
-source, yeah, good luck figuring out what flake that actually is. So it's not very user-friendly. But in the particular case of when you're hacking on a local Git flake, if, for example, you add a new source file, but you forget to git add it,
05:41
it won't be copied to the Nix store, because Nix tries to be sort of hermetic. And so you get a mysterious error message. You get no such file or directory, and you're left scratching your head and thinking, but this file exists in my repository, so what's going on here?
06:00
So why does Nix copy everything? Because I was lazy. But the underlying reason is that Nix wants to ensure reproducible and hermetic evaluation. So that was actually one of the main goals for flakes.
06:20
So in the pre-flake days, if you had your project, you might rely on the NIX_PATH environment variable to find Nixpkgs. And then two developers clone this repository, and they both try to build it. They might have a different version of Nixpkgs
06:42
on their NIX_PATH. So they get a different result. And that is what we really wanted to avoid. So Nix was good at ensuring reproducible builds, but it wasn't very good at ensuring reproducible evaluation. So the easy way to ensure that evaluation
07:04
was unaffected by any sort of, say, pollution you might have in your Git work tree, like unadded files, or references to files outside of the tree, like ../somefile.nix,
07:21
was to just copy everything to the Nix store. And yeah, that was pretty easy, but it doesn't scale. So the better solution is something called lazy source trees, which I've been working on for a while. It's in a pull request, it's ready for review now,
07:43
so everybody is welcome to try to find bugs in this. And the main idea is to not copy source trees to the store, but of course we still need to maintain hermetic evaluation.
08:02
And we want a couple of other advantages that I'll show. Now I'll talk a bit later about the implementation details but the idea is that we have a sort of, internally inside of Nix there is a sort of virtual file system abstraction
08:20
for accessing the source code of flakes, and different types of flakes, like a GitHub flake or a local file system flake, have different accessors to lazily read files like Nix expressions.
08:41
So this is what it looks like in practice. So yeah, if I'm hacking on the Nixpkgs flake, so a Git work tree, yeah, now a nix build, yeah, it's a lot faster, 0.2 seconds, so it's not copying the Nixpkgs flake to the Nix store anymore. Okay, great. So that's awesome.
09:03
So what does this mean? Well, let's look at the REPL. Let's fetch a tree from GitHub, namely the PatchELF repository. So these are the trees with pre-lazy trees.
09:28
So fetchTree no longer returns an attribute set with an outPath referring to the Nix store, but instead it has this sort of magic value, which is actually, if you print it in the REPL,
09:42
it will say, yeah, github:NixOS/patchelf/<this particular revision>/. So this magic value actually, yeah, is a reference to, I mean, so it provides provenance, so it knows where it came from,
10:01
so it can provide useful references. But so this outPath is, yeah, an internal value, like I said, so nothing gets copied to the Nix store unless you ask for it. Now, if I do something like add the string
10:20
/README.md, I get a value that represents, yeah, the file README.md inside of that source tree, which is still lazily fetched. So still nothing has been copied. Now, only if I do something like an antiquotation, like a string, so quote, dollar, tree,
10:44
so if I'm trying to pass something as an input to a derivation, well, then the derivation, of course, needs to be able to read the source tree, so it needs to be in the Nix store, so then it will get copied. But if you don't do that, then it doesn't happen.
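The copy-on-demand behaviour described here can be illustrated with a small Python sketch. This is not Nix's implementation: `LazyTree`, its `store` dictionary and the `copies` counter are hypothetical names invented for illustration, and the dictionary merely stands in for the Nix store.

```python
from dataclasses import dataclass, field


@dataclass
class LazyTree:
    """Hypothetical stand-in for a lazily fetched source tree."""
    origin: str                                # provenance, e.g. "github:NixOS/patchelf/<rev>"
    files: dict                                # path -> contents (stands in for the remote tree)
    store: dict = field(default_factory=dict)  # stands in for the Nix store
    copies: int = 0                            # how many times the tree was materialized

    def read(self, path: str) -> str:
        # Evaluation-time reads go straight to the tree; the store is untouched.
        return self.files[path]

    def to_store(self) -> str:
        # Called only when something (such as a derivation input) needs a real copy.
        key = f"/store/{self.origin}"
        if key not in self.store:
            self.store[key] = dict(self.files)
            self.copies += 1
        return key


tree = LazyTree("github:NixOS/patchelf/<rev>", {"/README.md": "PatchELF\n"})
readme = tree.read("/README.md")    # lazy read, nothing copied yet
copies_after_read = tree.copies
tree.to_store()                     # e.g. triggered by string interpolation
tree.to_store()                     # a second request reuses the existing copy
```

Reading a file leaves the copy counter at zero; only the explicit `to_store()` call materializes the tree, and it does so once.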
11:04
So now, since we have all this provenance information, so we know where these path references actually came from, we can get better error messages. So, for example, if I try to do this nix build,
11:20
and I, so I created this file, my package, but I forgot to git add it, so it will actually say that. So it will say access to path is forbidden because it is not under git control. Maybe you should git add it. Or if it really doesn't exist, then it will say that. It will say path, my package does not exist
11:42
in this Git repository. Yeah, so much nicer error message. But also in stack traces and user-thrown errors. So, for example, if I have a NixOS configuration where I'm referring to Steam, which is not allowed,
12:03
so previously I would get an error message like package steam in /nix/store has an unfree license, and now I get package steam in github:NixOS/nixpkgs/<revision>, blah, blah, blah, has an unfree license.
12:21
So, yeah, this is much nicer. Okay, but that's not all. We now have a builtins.patch operator, which takes a source tree, which is lazy, and applies some patches to it, and returns a tree that is also lazy.
12:42
So previously, if you wanted to, for instance, patch Nixpkgs, well, at first you need to materialize Nixpkgs to disk, which takes a few hundred megabytes, and then you need to apply a patch, and then that gets copied to the Nix store, so it takes another few hundred megabytes. So it's very slow, and it takes a lot of disk space.
13:02
But this operation, yeah, since it takes a lazy source tree and returns a lazy source tree, it doesn't materialize anything to disk unless you ask for it. So here I have this patched, which is a source tree that takes the PatchELF source tree
13:22
and applies a patch to it. I'm using the word patch too often. Maybe I shouldn't have used PatchELF as an example here. That's a bit ambiguous. Yeah, and then when I do builtins.readFile to read a file from that source tree, then it will actually look whether it has a patch
13:43
that applies to that particular source file and apply it in memory. And yeah, all right, now a little bit about the implementation, how this works, because that actually has something to do with
14:02
paths no longer being an absolute path in the local file system, but they are tuples of (input accessor, path), and an input accessor is this thing that refers to, for instance, a GitHub repository or a Git repository or Mercurial or whatever,
14:27
and the path here is the path underneath that source tree, so in that flake, usually. So it has a bunch of operations on this file system abstraction, like for example, readFile,
14:42
which, given one of these paths inside this flake, will return a string. Yeah, and there are a bunch of implementations, like the input accessor for the local file system, for zip files, for applying patches. Yeah, and a couple of others.
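As a rough illustration of the abstraction just described, here is a minimal Python sketch of an accessor interface and of a path represented as an (accessor, path) pair. The class and method names (`InputAccessor`, `MemoryAccessor`, `SourcePath`, `read_file`, `path_exists`) are assumptions chosen for this example and are not taken from the Nix source; the in-memory accessor simply stands in for a GitHub, Git or local file system backend.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class InputAccessor(ABC):
    """Hypothetical interface: a read-only view on one source tree."""

    def __init__(self, description: str):
        self.description = description   # provenance used in messages

    @abstractmethod
    def read_file(self, path: str) -> str: ...

    @abstractmethod
    def path_exists(self, path: str) -> bool: ...


class MemoryAccessor(InputAccessor):
    """Stand-in backend that serves files from a dictionary."""

    def __init__(self, description: str, files: dict):
        super().__init__(description)
        self.files = files

    def read_file(self, path: str) -> str:
        if path not in self.files:
            raise FileNotFoundError(
                f"path '{path}' does not exist in {self.description}")
        return self.files[path]

    def path_exists(self, path: str) -> bool:
        return path in self.files


@dataclass(frozen=True)
class SourcePath:
    """A path value: the pair (accessor, path inside that tree)."""
    accessor: InputAccessor
    path: str

    def __truediv__(self, name: str) -> "SourcePath":
        # Appending a component keeps the same accessor, so nothing is fetched.
        return SourcePath(self.accessor, self.path.rstrip("/") + "/" + name)

    def read(self) -> str:
        return self.accessor.read_file(self.path)

    def __str__(self) -> str:
        return self.accessor.description + self.path


acc = MemoryAccessor("github:NixOS/patchelf/<rev>", {"/README.md": "PatchELF\n"})
root = SourcePath(acc, "/")
readme = root / "README.md"
```

Because every path carries its accessor, an error or a printed path can name the tree it came from rather than an anonymous store location.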
15:01
So for example, suppose I'm evaluating something from the local file system but not a Git work tree, so just a path flake, that will use the FS input accessor with the top of my flake as the root, and FS input accessor applies access control,
15:24
so if you try to do dot dot slash to escape out of the root, that doesn't work. So yeah, this ensures that you cannot gain access to something that is not part of the flake.
15:43
Git work trees are actually also implemented using FS input accessor, which also has, yeah, a set of files that you're allowed to access. In the case of a Git work tree, those are the files that are under Git control.
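The two access-control checks described above (no escaping the root, and an optional set of allowed files) might look roughly like the following Python sketch. `FSAccessor`, `AccessDenied` and `demo` are hypothetical names, the error wording only paraphrases the messages quoted in the talk, and the allowed set is passed in by hand rather than obtained from Git.

```python
import tempfile
from pathlib import Path


class AccessDenied(Exception):
    pass


class FSAccessor:
    """Hypothetical local accessor confined to `root`, with an optional allow-list."""

    def __init__(self, root, allowed=None):
        self.root = Path(root).resolve()
        # None means "no allow-list" (plain path flake); a set models the tracked files.
        self.allowed = None if allowed is None else set(allowed)

    def _resolve(self, rel: str) -> Path:
        target = (self.root / rel.lstrip("/")).resolve()
        # Reject anything that resolves to a location outside the root, e.g. "../x".
        if target != self.root and self.root not in target.parents:
            raise AccessDenied(f"path '{rel}' is outside of the source tree")
        return target

    def read_file(self, rel: str) -> str:
        target = self._resolve(rel)
        key = target.relative_to(self.root).as_posix()
        if not target.exists():
            raise FileNotFoundError(f"path '{key}' does not exist in this tree")
        if self.allowed is not None and key not in self.allowed:
            raise AccessDenied(
                f"access to path '{key}' is forbidden because it is not under "
                "Git control; maybe you should 'git add' it")
        return target.read_text()


def demo():
    """Try a tracked file, an untracked file, a missing file and an escape attempt."""
    results = []
    with tempfile.TemporaryDirectory() as d:
        Path(d, "flake.nix").write_text("{ }")
        Path(d, "my-package.nix").write_text("{ }")   # exists, but not in the allow-list
        acc = FSAccessor(d, allowed={"flake.nix"})
        for rel in ["flake.nix", "my-package.nix", "missing.nix", "../outside"]:
            try:
                acc.read_file(rel)
                results.append("ok")
            except AccessDenied:
                results.append("denied")
            except FileNotFoundError:
                results.append("missing")
    return results
```

Checking for existence before consulting the allow-list is what lets the accessor distinguish "this file is not tracked" from "this file really does not exist".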
16:01
So if you try to access a file that doesn't exist, it can give you a useful error message. That's how it's implemented. Now, GitHub repositories, so here there's a big change. So GitHub flakes used to be implemented by downloading a tarball, unpacking it, and copying it to the Nix store.
16:21
So now it downloads the zip file, and it doesn't unpack it, and it certainly doesn't copy it to the Nix store. So there is a zip input accessor that will directly extract files from the zip file, and the reason for using a zip file instead of a tarball
16:41
is that zip files are random access, and tarballs are not. So this saves a lot of disk space and CPU time, depending on your system. All right, and similarly, there is a patching input accessor that wraps an arbitrary accessor,
17:05
and yeah, so for example, if you do a readFile operation, it will call the underlying readFile, and then apply a patch to it. All right, so this does cause some incompatible changes.
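Before going into those changes, the zip and patching accessors just described can be sketched in Python. This is only an illustration under stated assumptions: `ZipAccessor`, `PatchingAccessor` and `make_zip` are hypothetical names, the archive is built in memory instead of being downloaded, and a "patch" is modelled as simple (old, new) text replacements rather than a real unified diff.

```python
import io
import zipfile


class ZipAccessor:
    """Reads single members straight out of a zip archive, without unpacking it."""

    def __init__(self, data: bytes):
        self.archive = zipfile.ZipFile(io.BytesIO(data))

    def read_file(self, path: str) -> str:
        # The zip central directory allows jumping to one member directly;
        # a tarball would have to be scanned from the start.
        return self.archive.read(path.lstrip("/")).decode()


class PatchingAccessor:
    """Wraps any accessor and rewrites file contents in memory when they are read."""

    def __init__(self, inner, patches: dict):
        self.inner = inner
        # path -> list of (old, new) pairs; a simplified stand-in for real patches
        self.patches = patches

    def read_file(self, path: str) -> str:
        text = self.inner.read_file(path)          # call the underlying accessor
        for old, new in self.patches.get(path, []):
            if old not in text:
                raise ValueError(f"patch does not apply to '{path}'")
            text = text.replace(old, new, 1)
        return text


def make_zip(files: dict) -> bytes:
    """Build a small zip archive in memory for the example."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        for name, content in files.items():
            z.writestr(name, content)
    return buf.getvalue()


base = ZipAccessor(make_zip({"README.md": "hello world\n",
                             "src/main.c": "int main() {}\n"}))
patched = PatchingAccessor(base, {"/README.md": [("world", "Nix")]})
```

Since `PatchingAccessor` only needs a `read_file` method on the object it wraps, it composes with any other accessor, and files without a patch pass through unchanged.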
17:21
The main one is that flake lock files no longer have a narHash attribute, at least not for all inputs. So this is a backwards compatible change, so if you have a flake with a narHash attribute in it, that still works, but if you take a new flake
17:41
and try to run it on an old Nix, it will complain about the missing narHash attribute. And the reason is that computing the narHash attribute is too expensive, because by definition, that requires the entire source tree to be read, and that, of course, was the exact thing that we were trying to avoid.
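The cost argument can be made concrete with a short Python sketch. Note the assumptions: `tree_hash` is a simplified content hash over all files and is not the real NAR serialization, and the dictionary returned by `lock_by_revision` is only an illustrative shape, not the exact lock file format.

```python
import hashlib


def tree_hash(files: dict):
    """Hash every path and every byte of a tree; returns (hex digest, bytes read).

    Simplified stand-in for a NAR hash: the point is that all contents must be read.
    """
    h = hashlib.sha256()
    bytes_read = 0
    for path in sorted(files):
        data = files[path]
        h.update(path.encode() + b"\0")
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
        bytes_read += len(data)
    return h.hexdigest(), bytes_read


def lock_by_revision(rev: str) -> dict:
    """Lock an input by its revision alone; no file contents are touched."""
    return {"type": "git", "rev": rev}


files = {"a.nix": b"{ }", "b.nix": b"1 + 1"}
digest, bytes_read = tree_hash(files)
changed_digest, _ = tree_hash({**files, "b.nix": b"1 + 2"})
lock = lock_by_revision("0123abcd")
```

A content hash notices any one-byte change but costs a full read of the tree, whereas locking by revision costs nothing at evaluation time and relies on the version control system's own hash.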
18:01
So it now relies on other attributes to lock the input, like the git revision. There are some other incompatibilities, so it's not super likely that you run into them, but so, for example, path values, since they no longer necessarily have a representation
18:23
in the local file system, toString no longer necessarily gives a useful result. So if you do something like toString tree, and tree is not a local file system flake, then you get something like underscore virtual,
18:44
and, yeah, this doesn't affect non-flake code, or code that isn't using fetchTree. Since both of those things are experimental, we can sort of get away with changing the behavior here.
19:03
Yeah, similarly, so there is a breaking change that occurs kind of frequently, but we're working on a fix for that: since flake inputs are no longer in the Nix store, lib.isStorePath will return false, whereas previously, it returned true,
19:22
and even though it can be coerced into something that is in the Nix store, if you're doing this as a type check, as certain types in the NixOS module system do, it might break.
19:41
Okay, that's what it currently does. There are some future improvements like, so Git repositories are actually still copied to the Nix store, except when you're using a working tree, but so in the future, maybe we can use libgit2
20:01
in the same way as we're using libzip, and yeah, that would be very nice. Yeah, and then, yeah, merge it. Okay, that's about it, any questions?
20:28
Good, it's, ah, there we go. Are there questions? Oh, wow. So one question I have is these virtual, like the underscore virtual underscore path kind of types,
20:43
are they gonna be unique or, because you had like underscore virtual underscore dash four or something, right? So for two different previously stored paths, will you have unique virtual paths? Yeah, so I guess the underlying question there
21:02
is, is this completely reproducible? And currently it is not, so that number four could actually be something different. So I have actually some ideas about it, or I had an idea and then I forgot it, but, ah. So follow up, during one evaluation,
21:21
will they at least stay unique or? Yeah, yeah. Just not over evaluations, all right. There's two in the group here. So without the NAR hash, but with Git revisions, for instance, it's relatively easy to get hash collisions, right?
21:41
Won't that mean that you can no longer guarantee that if two people build it with the lock file, you'll actually get to see the same thing? I mean, you can have a Git repo that produces something with a Git collision, right? I didn't entirely hear it, but it's about the NAR hash and can you still guarantee that it's the same thing?
22:01
Yeah, because we're using SHA-1 for Git revisions instead of SHA-256 for NAR hashes, right? Say again. So it's easier to get a hash collision with Git revisions, right? Right, yes, it is.
22:23
Yes, that's true. And I would sort of say that's a problem for Git to solve. So maybe they're going to switch to something else. I understand they've been thinking about it.
22:41
I mean, if it's good enough for Git, then it's sort of good enough for us. So maybe we can still have a flag to really force locking by having a NAR hash there. Because that actually has some other useful properties like it makes the source tree substitutable
23:03
from a binary cache. Without a NAR hash, you cannot substitute it. Okay, is this the last step before removing the experimental flag? No, for that we should do a NAR hash.
23:25
So I haven't taken a lot of questions from this side. I'm sorry, so I will now. There is an evaluation cache for Flakes that allows you to bypass the evaluation
23:42
and just fetch from the database the result for some attributes. Can you restore that feature with the lazy file system or can you just disable the lazy thing and force to upload everything so that you have all of the hashes and can still access that evaluation cache?
24:00
I heard part of it. So it was about the evaluation cache. I mean, so the evaluation cache still works. How does it work if you don't know if the inputs are the same? Oh, well, I mean using the same mechanism as lock files. So the top-level flake should be locked in some way.
24:22
But if you have a, right, so you don't get evaluation caching on a dirty Git work tree. But that was already not the case. Well, that's sort of similar to this NAR hash issue.
24:42
So maybe you could have a way, an attribute to force locking.
25:20
In principle, yes, that is possible. Yeah, so you could have a clean Git tree and then you start modifying after Nix has determined that it's clean. I think that falls under the don't do that category. Yeah, yeah, actually, yeah, you're right.
25:49
But in that case, at least the NAR hash would still reflect the dirty state. Right.
26:04
Thank you, Eelco. I put away the mic already, so I'm sorry. All right, thank you.