Untrusted CI
Formal Metadata
Number of Parts | 19
License | CC Attribution 3.0 Unported: You may use, modify, and reproduce, distribute and make publicly available the work or its content, in unchanged or modified form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/50692 (DOI)
NixCon 2019 (17 / 19)
Transcript: English (automatically generated)
00:12
Hey, welcome. I'm Florian, and I need to press
00:26
elsewhere. There we go. Hi, I'm Florian; the internet knows me as flokli. Apart from doing Nix packaging all day, I'm interested in build pipelines, infrastructure,
00:44
low-level user space stuff, networking, and tinkering with hardware in general. At work I'm a site reliability engineer at [unclear]. By the way, they are hiring, so if you're interested in that, just reach out. Today I'm going to talk about untrusted CI and how to use post-build hooks to get automatic caching of untrusted builds.
01:08
I'll be talking about CI in general: what you want a CI to do and how you want it to behave; about Nix binary caches in general, how to use private caches, and how to handle signatures of
01:22
those builds in the private caches; the limitations of simple implementations; a proposed solution and how it improves things in general; and future ideas on what to do with it. So how do we want a CI to behave? Well,
01:41
in general it should lint, analyze, build, test and package your project. It should do that on each commit, to assist developers in their workflow while they are iterating on a PR. So you especially want it to run on PRs, to discover all breakages, or at least most of them, before they reach master.
02:06
But most importantly, you want the CI to be fast. If you're waiting 30 minutes or an hour for your tests to pass or fail, and you're basically blocked during that time,
02:20
that's just a huge problem and massively decreases developer productivity. With a small project that's not so much of a problem, but as projects grow, build time likely does as well, so still having a snappy CI becomes more and more challenging.
02:44
When using Nix to provide those dependencies, or to build the entire project, you can make use of binary caches. In fact, we already do most of the time: there's cache.nixos.org for all the nixpkgs packages built by Hydra,
03:02
except unfree packages and packages that are currently failing, but that's another story. However, in your project there might also be other packages not generically suitable for nixpkgs, because they are domain-specific, custom overrides, or non-free packages, and you still want to cache those in your CI pipeline. So
03:23
what you then do most of the time: you might have a Hydra or whatever does your builds, but in general you go with a private cache of some sort that is added on your developer machines so they can make use of it. It's either self-hosted, based on a bucket in some cloud, or entirely managed.
03:47
So I'm going to talk about how to set up those caches quickly. You generate a signing key pair on one machine. On all machines that use this cache, you configure your
04:04
NixOS configuration or your nix.conf to point to the public endpoints to download the binaries from, and you add the public key part of the signing key pair. To upload, you use some sort of nix copy command
04:21
eventually, or you expose the entire Nix store of some machine to the others. In general, nix copy supports ssh-ng to copy to another machine, HTTP(S) with HTTP PUT to upload, and S3 buckets, and there's an in-progress PR to push to GCS buckets as well.
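To make the client-side half of that setup concrete, here is a minimal Python sketch that renders the two nix.conf settings involved, `substituters` and `trusted-public-keys`. The helper name `render_nix_conf`, the cache URL and the key are placeholders for illustration and do not come from the talk. The key pair itself would be produced by something like `nix-store --generate-binary-cache-key`, which the sketch does not run.

```python
from typing import Sequence


def render_nix_conf(substituters: Sequence[str], public_keys: Sequence[str]) -> str:
    """Render the nix.conf lines that make a machine use and trust a cache.

    Public keys use Nix's "<key-name>:<base64 data>" form. Only the public
    half belongs here; the secret key stays on the signing machine.
    Note that these settings replace the defaults, so any upstream cache
    that should stay enabled has to be listed as well.
    """
    for key in public_keys:
        name, sep, data = key.partition(":")
        if not (name and sep and data):
            raise ValueError(f"not a '<name>:<base64>' public key: {key!r}")
    return (
        f"substituters = {' '.join(substituters)}\n"
        f"trusted-public-keys = {' '.join(public_keys)}\n"
    )


if __name__ == "__main__":
    # Placeholder endpoint and key, for illustration only.
    print(render_nix_conf(
        ["https://cache.example.org"],
        ["cache.example.org-1:PLACEHOLDERBASE64="],
    ))
```

The same two values could equally be set through the NixOS configuration; the sketch only shows which pieces of information every consuming machine needs.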
04:44
So, assuming your project has a default.nix with a dependencies attribute containing all the dependencies of your project, you might end up doing something like this: you somehow get a list of all the dependencies and all the Nix store paths that are part of
05:06
the dependencies of your build, or of your entire project, and then you issue a nix copy command, if you don't expose the Nix store. So what are the limitations of this naive approach? It might work in a lot of cases.
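The naive approach can be sketched as a short Python script that builds one attribute, lists its closure and pushes it. This is an illustration, not the script from the slides: the attribute name, the cache URI and the key file location are assumptions, and `nix sign-paths` is the Nix 2.x spelling of the signing command (other releases may name it differently).

```python
import subprocess
from typing import List


def build_command(attr: str) -> List[str]:
    # Build one known attribute of ./default.nix and print its output path(s).
    return ["nix-build", "default.nix", "-A", attr, "--no-out-link"]


def closure_command(roots: List[str]) -> List[str]:
    # List the closure (the paths themselves plus everything they reference).
    return ["nix-store", "--query", "--requisites", *roots]


def upload_commands(paths: List[str], cache_uri: str, key_file: str) -> List[List[str]]:
    # Sign with a key file the CI user can read, then copy to the cache.
    return [
        ["nix", "sign-paths", "--key-file", key_file, *paths],
        ["nix", "copy", "--to", cache_uri, *paths],
    ]


def naive_upload(attr: str, cache_uri: str, key_file: str) -> None:
    def capture(cmd: List[str]) -> List[str]:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return result.stdout.split()

    # If the build fails, check=True raises and nothing below ever runs,
    # so already-built intermediate dependencies are never uploaded.
    roots = capture(build_command(attr))
    paths = capture(closure_command(roots))
    for cmd in upload_commands(paths, cache_uri, key_file):
        subprocess.run(cmd, check=True)
```

Everything hinges on one known attribute building successfully, and the key file has to be readable by whoever runs the script.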
05:23
But sometimes there are drawbacks. You might not have all the build's dependencies available at a central location, so you can't just call nix-build -A on some magic attribute, because there are scripts invoking Nix by themselves: you have shell scripts calling nix-shell, or you have Bazel shelling out to nix-build to build other packages.
05:47
Or you have IFD (import from derivation) and don't really know at first what you're going to end up building. Of course, you can track those manually in your .nix file and make sure that you cache everything
06:02
you want, that you basically catch all the packages you want to build, make sure your CI builds them, and then start the actual build process. But that's all quite laborious, and it doesn't get better. Another problem is that if one of those packages fails to build,
06:22
the approach of waiting for the output path and then copying over the whole transitive closure just won't kick in: it never gets a chance to upload the intermediate dependencies if you have not specified them before. So you might end up bumping a higher-level thing, it fails to build, and
06:44
all those other dependencies that you also needed to rebuild for some reason just don't end up in the cache, because you never reached the end point of actually building the leaf package in your dependency tree. Another problem is that the upload is another manual step in your CI pipeline.
07:05
Very likely you end up with code dealing with all the signing and uploading inside your CI code itself, which in theory should only say "I'd like to build this thing", and then it should be cached. You don't really want to mess with working out what to upload and then manually calling the upload. It
07:22
should just work, and it shouldn't bloat your pipeline code. And another problem, which I personally find a bit of a bigger one: the binary cache is added and used as a substituter on all developer machines, and probably even production machines. With the CI having write access to it, and developers or external contributors being able to change the scripting
07:47
inside your CI pipeline, it's very easy to extract the signing key. You basically don't want to have a backdoor, some way to pollute the cache.
08:03
So that's all not so nice. While one and two might just decrease the cache hit rate, and three might just be annoying, three and four together, for the reasons mentioned, basically require some sort of approval process for PRs, at least for external PRs.
08:25
And that's all not very nice, and it negatively impacts the cache hit rate and the round-trip time for developers. So how do we solve this? There's one way to solve it that I'm going to propose:
08:43
together with multi-user Nix and a recently introduced Nix feature, you can basically fix this. What you do is: you have a CI user that runs your regular build process. It uploads the build recipe
09:02
to a privileged Nix daemon. Oh, those animations work, nice. This Nix daemon instructs all the builds to happen as some temporary, unprivileged, sandboxed build users, and afterwards it takes care of persisting the results to the local Nix store.
09:24
Assuming you have no local privilege escalation on that machine, or some weird hash collisions, this effectively prevents regular CI users from manipulating the local Nix store. In a non-multi-user Nix installation, basically all three of those concerns would be running as the same user,
09:42
so the regular CI user could in practice modify the Nix store in some weird ways in some cases. So multi-user Nix in that case solves a lot of those concerns and isolates this.
10:06
It's the default on NixOS, but it's not the default on a lot of the hosted CIs. If you have Travis or Jenkins, or your Docker-based CI, there is basically a shell script that you call to install Nix, and then you end up with a
10:22
single-user Nix installation. Yeah. No, it's one way to configure Nix in a certain case. Okay, so with that we kind of solved the direct access to the Nix store, but we did not solve the signing part. If we go with the bash loop approach we saw previously,
10:47
we still end up signing in the context of the CI user. So the CI user can still change things before uploading to the remote cache, and that's something we don't want,
11:04
because this way the user can still extract the signing key, and if they have some way to access the S3 bucket or similar, they could modify the stuff we sign and basically get code execution on other machines. So we don't want to do this.
11:22
As I said, with Nix 2.3 there's a way around it. You can configure a post-build hook, which gets triggered for each realized derivation, even the intermediate ones. In multi-user Nix it runs in the context of the Nix daemon, so as a privileged user, and
11:43
you don't have the problem of exposing the key to the CI user; instead the hook runs as, say, the root user. There are some side notes regarding this: normally you don't want to execute nix copy there, because it's blocking.
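Since the hook blocks the build while it runs, one option is a hook that only records what was built and returns immediately. The following Python sketch assumes it is registered through the `post-build-hook` setting in nix.conf and relies on the `OUT_PATHS` environment variable that Nix passes to the hook (space-separated output paths). The spool directory and the one-file-per-path layout are hypothetical choices for this example, not something the talk prescribes.

```python
#!/usr/bin/env python3
import os
import tempfile
from pathlib import Path
from typing import List, Mapping

# Hypothetical location; it should be writable only by the user the
# Nix daemon runs the hook as, not by the CI user.
SPOOL_DIR = Path("/var/spool/nix-upload")


def queue_outputs(env: Mapping[str, str], spool_dir: Path) -> List[Path]:
    """Record every freshly built output path as one small spool file.

    Nothing is uploaded or signed here, so the hook returns quickly and
    the build is not held up by network transfers.
    """
    out_paths = env.get("OUT_PATHS", "").split()
    if not out_paths:
        return []
    spool_dir.mkdir(parents=True, exist_ok=True)
    queued = []
    for store_path in out_paths:
        # Write under a hidden temporary name, then rename atomically, so a
        # concurrent reader never sees a half-written entry.
        fd, tmp_name = tempfile.mkstemp(dir=spool_dir, prefix=".tmp-")
        with os.fdopen(fd, "w") as handle:
            handle.write(store_path + "\n")
        final = spool_dir / Path(store_path).name
        os.replace(tmp_name, final)
        queued.append(final)
    return queued


if __name__ == "__main__":
    # Does nothing unless Nix invoked us with OUT_PATHS set.
    queue_outputs(os.environ, SPOOL_DIR)
```

Because the hook runs for every realized derivation, intermediate results are recorded even when a later build step fails.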
12:00
You basically want to queue the upload to happen in some other process, so you don't block the main build process. So let's look back at the limitations I spoke about. The CI user doesn't have any direct access to the local Nix store anymore and
12:22
doesn't have access to the signing key, so there's no way to produce a modified, signed artifact under the original store path, which effectively fixes number four. As I said, in some cloud environments users might still be able to alter files in the cache, because it's just a cloud setting that this machine is allowed to
12:42
access this bucket. But as they cannot produce valid signatures, substitution won't happen: Nix will verify the signature, realize it's the wrong signature, not substitute from there, and fall back to building locally.
13:01
By moving the uploading logic away from the CI pipeline into the generic post-build hook in a multi-user Nix configuration, we also fixed three, because we don't need any manual scripts inside our CI process. And because the post-build hook is triggered on each derivation realized in the Nix store,
13:20
no matter how we end up building it, we also solve two and one. We don't need to manually maintain another list of dependencies; we just cache all intermediate builds. So, as I said, the above architecture will automatically upload all builds happening on a certain machine into the binary cache,
13:45
and it can be entirely described in the CI build slave image that you run on your cloud provider, without the need for any cache-related configuration in the build pipeline itself. That means it's currently most suitable when you provide your own self-hosted builders,
14:01
because multi-user Nix requires multiple users, and setting those up outside the repo most of the time means you can't use a lot of the hosted CI solutions that have some sort of shared runners, because there's just no way to set up other users there.
14:21
It's just often not possible. But depending on your threat model, you could still start using post-build hooks in a single-user Nix setting, which will at least solve limitations one, two and three that I spoke about.
14:45
Another problem is that running Nix inside Docker requires privileged containers, because some of the sandboxing features currently don't work there and fail, so it might be unsuitable for some container platforms.
15:01
Another problem is that the official NixOS Nix Docker image doesn't provide a multi-user installation; it's based on Alpine and the shell script installing Nix. But as I said, depending on the platform you're running on, you could go with multi-user Docker containers, privileged ones as well. So, TL;DR: use post-build hooks to upload to the cache instead of other hacks.
15:30
Future plans. When new machines are spun up in ofborg, you often hit another node in the next ofborg run, and you have to wait again for dependencies to be compiled.
15:43
So one way to fix this might be to have ofborg use not the official cache.nixos.org bucket but another bucket that all the builders share. And we would not have to worry about paying too much money for it, because we could just nuke it: nobody's really relying on it, and we can rebuild it,
16:02
either by garbage collecting or by throwing it away completely every few weeks. What I'd also like to see is nicer tooling in general and documentation on how to use it, and a daemon to handle the asynchronous uploads.
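As a rough idea of what such an upload daemon could look like, here is a small Python sketch, not the speaker's code. It assumes a hypothetical spool directory in which each regular file holds one store path, skips hidden files as in-progress writes, and leaves failed entries in place for a later retry. The `nix copy --to` invocation is real; the cache URI is a placeholder, and signing is left out (it would have to happen in this privileged process, for example via the destination store's signing settings).

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List


def nix_copy(cache_uri: str) -> Callable[[str], None]:
    """Return an uploader that pushes one store path with `nix copy`."""
    def upload(store_path: str) -> None:
        subprocess.run(["nix", "copy", "--to", cache_uri, store_path], check=True)
    return upload


def drain_once(spool_dir: Path, upload: Callable[[str], None]) -> List[str]:
    """Upload every queued store path once; keep entries whose upload failed."""
    uploaded = []
    for entry in sorted(spool_dir.iterdir()):
        if entry.name.startswith("."):
            continue  # partially written entry, pick it up on a later pass
        store_path = entry.read_text().strip()
        try:
            upload(store_path)
        except (subprocess.CalledProcessError, OSError):
            continue  # leave the entry in the spool so it is retried
        entry.unlink()
        uploaded.append(store_path)
    return uploaded


def serve(spool_dir: Path, upload: Callable[[str], None], interval: float = 5.0) -> None:
    """Poll the spool forever; meant to run as a long-lived service."""
    while True:
        drain_once(spool_dir, upload)
        time.sleep(interval)
```

A call such as `serve(Path("/var/spool/nix-upload"), nix_copy("s3://example-bucket"))` would run the loop; polling keeps the sketch short, and a real service might prefer file-system notifications or a proper queue.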
16:24
There's a nix copy PR to support uploading to GCS. That would also help in some cloud environments, but it's not strictly related to post-build hooks. And, as I said, more documentation in general on how to integrate this with CIs, so maybe some NixOS module
16:43
describing how to wire it all together for your own self-hosted machines. I have some code ready that I would like to open source, but it's really not much code. And also maybe a GitHub Action template. With GitHub Actions, as far as I know, you cannot have multiple users,
17:01
but you could at least get the single-user part set up with post-build hooks. And some blog posts describing this in a more readable fashion than a slide. That's it from my side. Thanks. Any questions? Question: Have you looked into implementing a post-build hook script
17:35
that can upload the source tarballs and the patch files that were used for the build to a content-addressed tarball cache?
17:41
Answer: No, but it's an interesting field for experimentation. Question: When the post-build hook fails, does your build fail? Answer: I
18:04
don't know, to be honest. Question: And the logs of the build hook, say if I run this from Hydra, where do the logs of this
18:21
hook go? Will they end up in the log of the build? Answer: I think it's the Nix daemon logging it. Can you get the mic?
18:47
Audience member: The hook's output always goes to the user's terminal. If the hook fails, the build succeeds but no further builds execute. And the hook executes synchronously and blocks other builds from progressing while it runs.
19:02
Question: Okay, you mentioned possibly garbage collecting the S3 buckets. Does such a function actually exist, with nix-collect-garbage or a tool to garbage collect the buckets? Answer: I think there's a Perl script, which is a bit old,
19:22
but I don't think we're actually running it on cache.nixos.org, and we could dogfood this script on the ofborg bucket. All right, so if there are any more questions, feel free to
19:51
hit me up after the talk. Thanks.