
Sneaking Nix at $work and become a hero, hopefully


Formal Metadata

Title
Sneaking Nix at $work and become a hero, hopefully
Title of Series
Number of Parts
14
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year: 2017

Content Metadata

Subject Area
Genre
Abstract
This talk explores ways to introduce Nix into an existing infrastructure (at work), based on real-life experience.
Transcript: English (auto-generated)
All right. Our next talk is by Jonas. He's been a contributor since 2015. He organizes the Nix London Meetup with Thomas Hunger and is now working as a Nix contractor.
Thank you. Thank you. All right. So today I want to talk to you about using Nix at work. The title is a bit facetious, but basically I want to show how you take a codebase from work or from a customer and Nixify it, and there are different stages that you have to go through. Just for a quick history: in September last year I set out to work on Nix as a contractor, but there wasn't a lot of work around yet. After a while I met with Tweag; they have really cool customers and I was able to do a lot of Nix work with them. Now I've joined Tweag, and they do sort of R&D, applied computer science, so it's kind of an R&D lab for other customers. All right. So the program is to show you the different stages: the first phase is nix-shell, then you package your things, and finally you set up your CI.
Okay. So I can't really show the customer's code, so I built a little app. I took some of the source code from a project called TodoMVC and then Nixified it, and you can find all of the source code over there. So if I go a bit quickly through the slides, you can always have a look at the repo. The file structure is basically one backend and one frontend: the backend is a Haskell project with two components, and the frontend is just a bunch of JavaScript. And yeah, the first thing is nix-shell. You just drop a shell.nix into the project, and your colleagues ask you, what is this file? And you say, don't worry about it. Then at some point they suddenly have problems with system dependencies, and you're like, oh, you could try to fix it, or you can install Nix and just run nix-shell and it's going to be fine. Basically the shell.nix file looks something like this: you import Nixpkgs and then you create a derivation, and the important part is the build inputs, where you put all the system dependencies that you need. A last feature of nix-shell is that you can run some bash script on entering the shell, which in this case sources the .env file that typically contains environment variables for the project.
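A minimal shell.nix along those lines might look like this. This is a sketch only; the name, the package choices, and the .env convention are illustrative, not taken from the talk's repo:

```nix
# shell.nix - minimal sketch; the exact build inputs depend on your project
{ pkgs ? import <nixpkgs> {} }:

pkgs.stdenv.mkDerivation {
  name = "my-project-shell";  # hypothetical name
  src = null;                 # not relevant: the shell is never actually built

  # system dependencies your colleagues would otherwise install by hand
  buildInputs = [ pkgs.nodejs pkgs.ghc ];

  # runs when you enter the shell; sources project environment variables
  shellHook = ''
    test -f .env && source .env
  '';
}
```

Entering the environment is then just `nix-shell` from the project root.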
So that's kind of the version zero. One thing you might notice is that you have to put a name, and you have to put a source, which is always null or some path, but it's not really relevant. So I think we should introduce a mkShell function that just simplifies the process, and that also pulls in all of the build inputs from the different packages that you're going to define later on. Here I'm using an overlay, so I can pretend it's already in Nixpkgs; it's in the project, and my intention is to submit a PR at some point, or if you want to do it, you're welcome. The other thing that's important, and that can trick you quite easily, is that if you run nix-shell in --pure mode, lots of the tools you usually have available are not going to be there. And if you customize your bashrc, suddenly you're going to have failures. That's why the first line is quite common; you find it in most Debian setups. Then I recommend adding a second line, which basically skips the rest of your bashrc if you're in pure mode, or in any nix-shell actually. You can even start running things in the nix-shell without really stepping into it by using the --run flag, and then you can plug this into your CI: you build inside nix-shell, and it's not really pure, but it kind of does the job. Okay. So that's the first stage, and it works, but it's not really pure.
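The two bashrc guard lines mentioned above could be sketched like this (the first is the usual Debian interactivity check; the second relies on the IN_NIX_SHELL variable that nix-shell sets):

```bash
# ~/.bashrc (sketch)
[ -z "$PS1" ] && return           # Debian default: stop for non-interactive shells
[ -n "$IN_NIX_SHELL" ] && return  # stop inside any nix-shell, pure or not
# ...personal aliases, prompt tweaks, PATH customizations below this point...
```

With this, nothing below the guards can shadow or break the tools a nix-shell provides.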
And I would say the main issue is that it's not composable, because one of the really cool things about Nix is that you can make a derivation, then compose it and make another derivation. That's really the power of Nix and something you don't find with Dockerfiles, for example. So the next step is to build packages. I'm not really going to go into how to make packages; the main recommendation I have is to read Nixpkgs, there are lots of examples in there. But basically it looks a little bit like this: you have a package name, then you specify the source, and then you have different types of inputs, which you should probably learn about because not everyone knows them. The nativeBuildInputs are tools that you use at build time but that are not going to be part of the outputs.
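As a sketch, the three kinds of inputs can be declared like this; the package name, source path, and specific dependency choices are illustrative:

```nix
{ pkgs ? import <nixpkgs> {} }:

pkgs.stdenv.mkDerivation {
  name = "backend";  # hypothetical package
  src = ./backend;

  # used at build time only; never part of the output's runtime closure
  # (this distinction matters for cross-compilation)
  nativeBuildInputs = [ pkgs.pkgconfig ];

  # needed at build time AND by anything depending on this package
  propagatedBuildInputs = [ pkgs.zlib ];

  # ordinary libraries and tools available during the build
  buildInputs = [ pkgs.openssl ];
}
```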
This is important, especially for people who do cross-compilation. Then there are the propagatedBuildInputs, which are tools that you might use for the build but that you also want installed afterwards, so they kind of come along with the package, like binary dependencies. And then there are the buildInputs that probably everyone knows. You also want to learn about the different types of phases and the different types of outputs. So for this project we have some Haskell packages, and I'm not going to go too much into detail, but right now in this space you have multiple tools: there's cabal2nix and stackage2nix; you can also try to use just the Haskell packages that are in Nixpkgs; or you can use the stack tool from the Haskell community, which has a --nix mode, but basically all it does is pull GHC from the Nix store, that's it. Or you can swap things around and create a derivation that invokes stack; it's maybe impure, but you can maybe control the dependencies a little bit better. And for the Node part, with Martin (where is Martin? Ah, there he is) we worked on a project called yarn2nix. Yarn is a Facebook project that tries to replace npm, and the good thing about it is that it generates a lock file that contains the hashes of all of the packages that you depend on. Sorry? Of the packages downloaded from npm, yes. Okay, I have a question. Okay. So the cool thing about this is the following: the first phase was to do it like any other language, with a yarn2nix tool that generates a Nix file from your lock file. But then we realized that this operation was actually pure, because all the inputs are the hashes from the lock file, so now we can import the generated Nix file from the derivation's output, and we don't need to compute any checksums. Which means that basically your Yarn package doesn't have any sha256 hashes; that's the magic. What it does is take the name and the version from the package.json (so you can see the name attribute is missing here), and you don't have hashes for the rest of the dependencies. Okay, so this is a slide I
wanted to finish earlier, but I didn't get the time. It actually picks up on one of the talks we saw earlier: instead of making one derivation where I run the tests, I tend to create multiple derivations, one that has the build outputs and another that has the tests. Because Nix is composable, you can do that, and it's really nice, because sometimes you just want to build your project and you don't want to run the tests again, maybe because they're integration tests and take a long time to run. But if you change the doCheck attribute, you're forced to rebuild, so this is a way to be more flexible. Okay, so one thing that's missing here: you see the src attribute is actually pointing to your current folder, which is where you have all your node_modules, and all of your source code is going to be copied into the Nix store at build time. But you don't want the node_modules folder to be copied into the Nix store, so there's a built-in called builtins.filterSource. You pass it a function and an absolute path, which is where you have your source code, and it gives you back a store path. The function gets the absolute path of each of the files: it goes through the files of your project and invokes the function, and if it returns true, it keeps the file. So you get the absolute path to the file and the file type, which is a regular file, a directory, or maybe a symlink. And in Nixpkgs there's one tool that exists, called lib.cleanSource, which basically removes the result symlink that you would get from a nix-build. If you don't do that, you run nix-build, then you run nix-build again, and it rebuilds, because the result of the previous build was copied into the source, so you're kind of in an infinite loop.
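A sketch of such a filter, keeping node_modules and the result symlink out of the store; the predicate name and the package name are illustrative:

```nix
{ pkgs ? import <nixpkgs> {} }:

let
  # keep a file unless it is node_modules or a previous nix-build result
  keep = path: type:
    let base = baseNameOf path;
    in base != "node_modules" && base != "result";
in
pkgs.stdenv.mkDerivation {
  name = "frontend";                     # hypothetical
  # filterSource calls `keep` with each file's absolute path and its type
  src = builtins.filterSource keep ./.;
}
```

For just the result-symlink case, `src = pkgs.lib.cleanSource ./.;` does the same kind of filtering out of the box.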
And it's nice, but it doesn't really compose, so typically you would have to rewrite it or wrap it in a function. So what I propose is that we change a couple of things in filterSource. First, I think the function should return true if we should remove the file, so invert the boolean logic, because it's more natural. If you look in Nixpkgs (you're not going to see it here, okay, just trust me), the cleanSource function is actually doing exactly that: it checks a lot of conditions and then adds a negation in front, so I think it's more natural to have it this way. And then I have a second function that allows you to compose these filters, so that you can take the cleanSource from Nixpkgs and then add your own special cleaning functions. All right.
So now we have packages, and there's one last thing we need to do to make it really nice, and that's pinning Nixpkgs. What's happening right now is that you have your project with all your derivations being built, but each colleague might be on a different channel, so they might actually get different results, which kind of defeats the point of Nix, reproducible builds, well, one of its points anyway. So I went through different phases of finding the best way to pin Nixpkgs. The trivial version, version zero, is to use the built-in called fetchTarball: you say fetch Nixpkgs from this commit on GitHub, and that's it, you get back the source code and then you can import it. It works pretty well; the only issue is that every time you invoke nix-build or nix-shell it's going to try to re-download it. One solution is to upgrade to Nix 1.12 and add the sha256, but then for your colleagues who are still on Nix 1.11 it's not going to work anymore, because the sha256 attribute is not supported there. So the next idea was, okay, maybe we can use fetchFromGitHub, which we love and use all over Nixpkgs, and just fetch the source like that. That also works really well, until you set the NIX_PATH to this file, because then you're importing Nixpkgs from itself and you have an infinite loop. Okay, so then the next idea. This one you're not really supposed to read, it's a bit crazy, and it was invented by a guy called Taktoa, who I don't think is here today. Basically he reimplements fetchTarball: in this line he finds Nix's own config file, a secret file, well, I didn't know it existed, and it contains references to all of the tools that were used to build the Nix utility. So without importing Nixpkgs you already have access to gzip and tar, you wrap this in a derivation, you put a sha256, and you're good. Until you run nix-build with the sandbox option set to true, because these paths come from the Nix tarball, but they're strings, not real paths, and that breaks the whole scheme. I was a bit disappointed, because I really like how convoluted it is, and there's a special place in my heart for this kind of hack. So in the end, I think we should just have a compatibility layer for fetchTarball that switches on the Nix version, and that's it. That's the best way I've found currently.
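The compatibility layer he describes could be sketched like this, switching on builtins.nixVersion; the URL revision and the hash are placeholders, not real values:

```nix
# fetchTarball that passes sha256 only when the Nix version supports it
let
  fetchTarball' = { url, sha256 }:
    if builtins.compareVersions builtins.nixVersion "1.12" >= 0
    then builtins.fetchTarball { inherit url sha256; }  # 1.12: fixed-output, cached
    else builtins.fetchTarball url;                     # 1.11: no sha256 argument
in
import (fetchTarball' {
  url = "https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz";  # placeholder rev
  sha256 = "<sha256>";                                            # placeholder hash
}) {}
```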
All right, so now we have the source of Nixpkgs. Okay, I have a question from Domen. All right, so Domen made it work, and he's going to show us. Clever. Okay, cool.
So that's one of the reasons I wanted to give this talk: to get this kind of feedback. Okay, so one more to-do slide. Basically you should really have update scripts, because now you're probably tracking 17.09 and you want to get the latest security updates. To make that easy you need scripts that you can invoke either by hand or from CI. That's really the last step that we need to standardize, because otherwise all the good work that's being done by the security people is kind of wasted. Okay, so we have the source, but there's one last step: the import. We're all familiar with this import <nixpkgs> {} pattern, with an empty attribute set at the end, and actually you're supposed to set some stuff in there. You're supposed to set the config to an empty set, and the overlays, because if you don't fix the config, each user can have their own config and then it's not pure anymore. The same goes for the overlays; plus, overlays are cool, so you should really use them. All right, so now we have our Nix files in the repo. Each of the different components has a default.nix, and then in the overlay you callPackage each of these components. The top-level default.nix is the one that ties everything together.
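The import he describes, with the config and overlays set explicitly, might look like this; the file paths are illustrative:

```nix
# nix/default.nix - sketch: pinned Nixpkgs with explicit config and overlays
import (builtins.fetchTarball
  "https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz") {  # placeholder rev
  config = { };                            # ignore the user's ~/.config/nixpkgs config
  overlays = [ (import ./overlay.nix) ];   # hypothetical overlay adding your packages
}
```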
The Nixpkgs src is the pinned version from before, and release.nix just re-exports everything from default.nix that you're interested in building for your project, because default.nix contains all of Nixpkgs plus your project's packages.
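A release.nix along those lines can be very small; the attribute names here are illustrative:

```nix
# release.nix - re-export only what the CI should build
let
  pkgs = import ./default.nix;       # all of Nixpkgs plus the project's packages
in {
  inherit (pkgs) backend frontend;   # hypothetical project attributes
}
```

The CI then only ever needs to run nix-build against this one file.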
And that's it. All right, this is just a nifty trick I found out last week. Usually you have a scripts folder with tons of scripts specific to the project, and one of the things you can do is pass -I nixpkgs=nix, where nix is the folder that we have here, right? Now you're reusing the same setup that you were using before, and you're not diverging again in the packages that you're using. The only downside is that you have to invoke your scripts from the root of your repo. All right, so the last step: set up the CI. The general approach is just going to be nix-build release.nix, and that's it. I think most of the logic should go into the Nix files, and then maybe later on you add some impure stuff, for example pushing Docker images or talking to Kubernetes or something like that. The other thing is that you're arriving at an existing place where they already have a CI in place. One thing you'll see from this list is that Hydra is missing. I actually tried all of these CIs, and I just want to go quickly through all of them and show you the advantages and disadvantages.
So, Travis. Travis was the first CI to have Nix support after Hydra, and it does a good job; it probably works better on smaller projects, but it's a bit hard to debug and sometimes a bit slow. But it works all right. Then there's CircleCI 2.0. It's Docker-based, you can restart your builds, and you can SSH into the container, so it's kind of handy sometimes for debugging things. The only issue is that they have an immutable cache. If you want to keep the Nix store around for the next build, the problem is that you need to give it a unique cache key, and if that key changes, you have to re-download your whole Nix store. I'm a bit annoyed by this, actually; I don't know if there's a better way to do it. Basically the two are working against each other, and that's actually quite common between the Nix store and other caching systems: if they're not perfectly aligned, if they don't have the exact same notion of hashing, I would say, I'm not really sure, it makes things a bit difficult. In the Travis case, I think they just load the latest cache from S3 and then re-dump it back, and most of the time it works all right. So it's actually an advantage for us that they're not too pure.
All right, so GitLab is also really cool. It's agent-based, so you can run the agent on NixOS and have GitLab manage and schedule all of the jobs, and you have good control over the targets. The agent is already packaged in Nixpkgs. The only downside is that you need to move all your source code to GitLab, so if you already have your own workflow around GitHub or some other source control, it's not very handy.
But overall, GitLab on its own is also pretty good. Then there's Jenkins. I don't know if I need to talk about it; they're actually making a lot of effort to make it work, but even last month I was still trying to get it working and spent a lot of time just fixing things. Maybe it was because I tried to run it on Kubernetes, I don't know. Okay, the last one is Buildkite. It's a commercial thing, but the agent is open source and already in Nixpkgs; they control the dashboard and the job scheduler, and it seems to work pretty well. It's configured with a pipeline that you describe in a YAML file, but you can also generate the YAML dynamically. So the next thing I want to explore is how I can split up the builds: invoke nix-build, find all of the outputs that are going to be built, maybe generate the YAML from that, and then shell out the builds. I don't know. If you have any ideas, let me know after the talk.
All right, so now we have a CI, or one of these, but don't forget about the binary cache, especially if you're doing development at scale. It would be really nice if, most of the time, you could take the builds from the binary cache; you would just save a lot of time. Also, in some of these cases, when you're scaling the number of nodes, it's really nice to be able to bring up a node from the binary cache, because otherwise you're just going to have every node rebuilding from scratch, and that takes two hours. So to do that, what you do is... well, I didn't know about the Nix 1.12 features, which look really, really cool, so I had to build my own little wrapper that basically invokes nix-push and then, in this case, sends everything to Google Cloud Storage.
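A wrapper like that might be sketched as follows, using the pre-1.12 nix-push tool; the bucket name is made up, and the exact flags of his wrapper are not shown in the talk:

```bash
#!/usr/bin/env bash
set -euo pipefail
# build the release and capture the resulting store path(s)
out=$(nix-build release.nix --no-out-link)
# export the closure as a local binary cache directory
nix-push --dest /tmp/binary-cache "$out"
# sync the cache to Google Cloud Storage (bucket name is hypothetical)
gsutil -m rsync -r /tmp/binary-cache gs://example-nix-cache
```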
And it works pretty well. The only issue is that when the NAR files are big, like Docker containers or something like that, it can take a lot of time to push. I don't know if there's room for optimization or something, but it would be nice. Sorry? Don't push Docker containers. Yes. But that's not enough. So that's the first side, and the second side is setting up the client. Right now you have to change the system config to allow fetching from the binary cache, unless you're in single-user mode, I think; it's actually different depending on whether you're in single-user or multi-user mode. So now you have to ask your colleagues, are you using single-user mode, and so on. It makes things a little bit complicated. But as we learned recently, with Nix 1.12 this is going to be solved. All right. So maybe one last bit, how am I on time? All right. I feel like I've spoken enough already.
All right. So, as mentioned before, I think one of the cool things about Nix is the composability. One of the things you can do is take your package, in this case the frontend, and just put it into another derivation that's going to produce a Docker container. And your Docker container is going to contain just the minimal amount of stuff that you need to ship into production, just the runtime dependencies. In this case, you have the index.html, some JavaScript, and then Caddy, which is a web server. All right. And this one is a nice hack, but I'm running out of time: you can reuse the NixOS module system to build a container.
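The Docker part can be sketched with dockerTools from Nixpkgs; the image name, the frontend attribute, the Caddy flags, and the port are all illustrative, not taken from the talk's repo:

```nix
{ pkgs ? import <nixpkgs> {} }:

let
  frontend = pkgs.callPackage ./frontend { };  # hypothetical project package
in
# the image closure contains only runtime dependencies:
# Caddy plus the built frontend assets, nothing else
pkgs.dockerTools.buildImage {
  name = "todomvc-frontend";
  config = {
    # serve the frontend's output directory (flags are illustrative)
    Cmd = [ "${pkgs.caddy}/bin/caddy" "-root" "${frontend}" ];
    ExposedPorts = { "2015/tcp" = { }; };
  };
}
```

nix-build on this file produces a tarball that `docker load` can ingest directly.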
And that's just pushing the containers. So that's it. We started with nix-shell, then we did the derivations, then we set up the CI, and now we're good. Thank you. Is there time for questions?
Yeah. All right. Yeah, over there. Yes, a small question about this filterSource example and the node_modules folder. Did you ever run into any problems with that? Because, for example, some packages may also declare bundled node modules, and they may use slightly different packages than the upstream versions. I don't know if you've ever run into trouble with that. Not that I recall. Okay, because I've seen a couple of packages that really require the bundled node_modules folders to work. Oh, right. So yeah, you may want to make that optional, because in general it's a good thing that you clear out the mess, but sometimes you actually need the node_modules folder that is in your project. Okay. Thanks. So, I can't quite believe that I'm about to defend Jenkins, and I feel dirty already for doing that. I'm not a fan of Jenkins, but we do use it at
REA-JEK, because we have Nix builds which aren't pure. And it's actually working pretty well for us. Everything is defined in a declarative way, we have all the plugins pinned, and it's working really well. I'm not loving Jenkins, but it's pretty easy to set up and to use for Nix stuff. So that's my experience. So I think I agree with that. What Jenkins did is introduce a Jenkinsfile that you can add to your repo, and this part is very declarative now. They also did a lot of cleanup work, so the default setup now integrates with GitHub more properly, because there was lots of work to do there, for example just making sure that it pushes the build status to the GitHub PR and things like that. So that's much better. But you still have a snowflake problem, where the config of Jenkins itself is not declarative. Okay, maybe it is declarative. All right. So, have you had to think much about
distribution as a channel? What I mean by that is that those source references with ./. become very painful, at least in my experience with this kind of problem, when you actually want to distribute your software to other people using Nix. I was wondering if you encountered that, or if you have any thoughts on that, or whether you want me to clarify and actually explain what I mean. So in general, what I have is a self-contained repo, where the artifacts I'm pushing out are Docker containers and things like that, so maybe I'm not encountering the same issue. Any other questions? Thank you. There's one last question. Just about the slide that you ran out of time for, will you be around after the closing? Yeah, sure. I'm going to publish the slides online. I think actually getting a full NixOS system with services to run in Docker is rather interesting. All right. So it generates the NixOS file system, but you don't have systemd in the container. So it would be nice if we could solve that, actually. And I agree. Yeah. All right. Any further questions? Thank you.
Thank you.