
Deployment to hardware


Formal Metadata

Title
Deployment to hardware
Subtitle
A multi pipeline challenge
Number of Parts
490
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Our project takes a fun, road-following app which leverages a basic neural network and deploys it to real hardware with an OSTree update system. This has meant managing a variety of different CI runners: GPU, aarch64 and x86_64. These have a variety of different dependencies and drivers, and interface with a number of services and caches. I will focus on how we constructed and developed our CI pipelines to build, test and integrate a number of disparate components to produce images and push updates into an OSTree server to be deployed over the air onto our hardware.
Transcript: English (auto-generated)
Oh dear. Hopefully everyone can hear me now; I've rather lost my train of thought. Oh yes: the speed controller for the wheels is driven over GPIO and I2C. Then there are some other things we need to think about. We want to do updates of this, we want to do continuous delivery, and you also need quite a lot of configuration on the device, particularly around the network. So in order to do that we need to integrate the system. Obviously you've got the software that runs on top: we happen to have a little Python app that runs some machine learning. The reference software for this device isn't very far away from that, but we wanted to write it as a proper app rather than as a Jupyter notebook, so we've done that and we've published it. For that app to work you need some machine learning assets, which can be really big files, so they can be awkward. Obviously you need drivers, obviously you need the right kernel, and as I said before you need to integrate all that configuration in. So then we come to deciding how we want to test and deploy this. There are a few things I want to talk about. One of them is the trade-off between wanting to test something in a really representative way and wanting to get really fast feedback. We have our unit tests and things like that when we're developing our app, and they're really good, but they're usually meant to run generically, and they're not the integration tests. You want your integration tests to be as accurate as possible while also being really fast; that was one of the trade-offs we wanted to minimise.
Something like Docker is really nice for that: it's really nice for bringing your server environment down to your laptop so that you get feedback sooner. But it doesn't work in some key ways for our use case. Things like sorting out the network, if you need all that configuration baked in: Docker tends to let the Docker host worry about that. The same goes for the kernel; Docker leaves that to the host, which is really sensible when you're deploying to servers, but it's not what we want here. And then there are simple things like sensors, such as the camera or, in our case, the I2C bus driving the wheels. There was a talk earlier today about using QEMU in the cloud to do a good job of that without having the hardware involved. I really liked that, and I'd recommend it as a nice way to do this even earlier, but again it's not entirely representative; QEMU is only so flexible, so there's another trade-off there. Another question is what tool we're going to run in our CI and what tool we're going to use to prepare what we deploy. We've just talked about why we don't want to use something like Docker, but Docker does all of the integration work up front: each layer describes how you integrate things together, so all of the integration is done by the time you have your image. It doesn't integrate everything we want, but it is a good tool, and we do use it for a lot of our CI on this project. Things like APT and DNF are fantastic for deploying just the little bit that you want, but it can be very hard to test that integration because they do the integration on the board, and if something goes wrong, falling back and undoing what a package did can be quite tricky. That's why we have the full-disk-image or full-partition creation tools: your Buildroot, your Yocto, your BuildStream. They all have different trade-offs, but essentially they all spit out the same thing. We're using BuildStream, but that's not really important to this talk; it's just that class of integration tool.
Before we go much further, in order to pick what CI/CD we want to use: one of the things I've been talking about is testing. We want to be able to test our integration, and we want it to be reproducible. The other thing is an A/B system: the system should be able to upgrade itself, and if there's a problem with what it's trying to upgrade into, it should be able to fall back. The idea of A/B is that you have your A system, you upgrade to your B system, and if something goes wrong when you boot into the B system, you can reboot back into your A system and try again with something else.
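As a rough illustration of that A/B idea, and not of how any particular updater implements it, the decision after booting into the B slot looks something like the sketch below; the state file, slot names and health check are all invented for the example.

```python
# Toy sketch of the A/B decision described above; slot names, the state file
# and the health check are invented for the illustration.
import json
import pathlib
import subprocess

STATE = pathlib.Path("/var/lib/ab-demo/state.json")  # hypothetical state file

def healthy() -> bool:
    """Stand-in health check: did the robot's app service come up?"""
    return subprocess.run(["systemctl", "is-active", "--quiet", "robot-app"]).returncode == 0

def confirm_or_rollback():
    state = json.loads(STATE.read_text())          # e.g. {"booted": "B", "known_good": "A"}
    if healthy():
        state["known_good"] = state["booted"]      # B has proven itself; keep it
    else:
        state["next_boot"] = state["known_good"]   # fall back to A on the next boot
    STATE.write_text(json.dumps(state))
```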
Then we come to the deployment tools. LAVA is more about testing, and how you deploy during testing. Something like Mender is a really good choice: it has that A/B behaviour, it's a very mature system, it's something Codethink have used, and we rate it. Then you've got things like OSTree, which was relatively new to me when we started doing this work, and Aktualizr, which I only discovered quite recently. Aktualizr builds on top of OSTree and can do more, but if you want it to do more, finding FOSS servers for it to talk to is a space we're still trying to understand.
OSTree creates a content-addressable store of everything you're going to deploy, and it has commits which can be used a little bit like Docker layers. We're currently using the commits relatively naively, and I think there's a lot of room to use them much better; it's on my to-do list to store different parts of the system in different commits, layered on top of each other, in the same way that with Docker you don't have to rebuild everything. Like Mender, OSTree doesn't do any integration on the board: it just goes and gets what it wants and boots into the new system, and it can boot back into the old system if you tie it up with U-Boot correctly; it doesn't necessarily do that by itself. The way OSTree works is that, as I said before, you have content-addressable storage of every single file on your system, and the way the files are laid out is recorded in a Merkle tree, which is also a content-addressable concept. By comparing two Merkle trees you can very quickly work out the difference between two deployments, very quickly fetch just what you need, and then reboot into the new one.
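To make that concrete, here is a tiny Python sketch of the idea: when every file is addressed by a hash of its content, comparing two trees reduces to comparing two small indexes, and only the files whose hashes differ need to be fetched. This illustrates the concept only; it is not OSTree's actual object format.

```python
# Conceptual sketch only: why comparing two content-addressed trees is cheap.
import hashlib
import pathlib

def tree_index(root: pathlib.Path) -> dict:
    """Map each file's relative path to a hash of its content."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*") if p.is_file()
    }

def files_to_fetch(old: dict, new: dict) -> set:
    """Only files that are new or whose content hash changed need downloading."""
    return {path for path, digest in new.items() if old.get(path) != digest}
```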
I had only vaguely heard of OSTree and didn't really know much about it at all, but there are actually some really cool things you might have heard of that use it. Flatpak uses two interesting technologies: it uses OSTree and it uses Bubblewrap. What Flatpak does is let you run a series of programs that each might use slightly different versions of a particular library. Instead of keeping a whole library checked out in several different places with slightly different versions, OSTree is able to know just the slight differences between them, and it can then tell Bubblewrap, when you run this program, to make it look like it's running in a chroot with those particular libraries. It's a lot like Docker, but it's not Docker, so that's really cool. Flatpak is quite well used; I don't know if you all know about it, but it's really popular in the gaming-on-Linux community in particular. Fedora Atomic, and now Fedora Silverblue, also use OSTree. One thing I meant to say before is that you can set your OSTree deployments up to be immutable, so even if a root-privileged program wanted to go and mess with part of the OSTree deployment, it wouldn't be able to. That's really good because it means you always know you're booting into the same thing. This is sometimes known as atomic: you have atomic states, hence Fedora Atomic, which is now evolving into Silverblue.
As for Aktualizr, there is an option in the Automotive Grade Linux Yocto project to use it, and if you do that you can point it at an OSTree server and get the OSTree functionality. Aktualizr can do more than that, but currently it points you at a free service rather than a FOSS service, as far as I understand it. So, OSTree on embedded, just as an overview of what I've said: you don't have to put everything inside your OSTree deployment, but you can put an awful lot in. Generally you don't put your bootloader in, but apart from that, if you want everything in there, you can. As I said, the updates are based on the changes between your Merkle trees, so the device only needs to download exactly what it wants, and it can very quickly work out what it needs to get from the OSTree server.
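On the device side the update itself is usually just a couple of commands against the ostree CLI. A minimal sketch, assuming the deployment's origin is already configured to point at our remote and branch:

```python
# Minimal device-side sketch: pull only the missing objects for the configured
# origin, stage the new deployment, then reboot into it.
import subprocess

def update_and_reboot():
    subprocess.run(["ostree", "admin", "upgrade"], check=True)
    subprocess.run(["systemctl", "reboot"], check=True)
```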
That's great for your device being able to update quickly, but it's also great for your development workflow. While we were setting up this bot and getting our little Linux distribution working, we used this in our development environment; it wasn't me, but we had a lot of luck with it, which is quite interesting. That brings me to the next point: one of the things I really like about Docker is that the developer gets a lot of gain from using the same tool that the continuous deployment people use. Having your developers used to the same tools as the deployment people and the testing people helps with that DevOps idea of all of us using similar tools and being able to help each other out. I think that's a really nice thing: instead of having to learn three different things for each bit, I can use the same tool at all of those different stages. And when we make incremental changes, we're deploying a finished thing rather than deploying something that still needs integrating, which is really important.
I've talked about the A/B updates, so how do we make our commit? We run our integration tool, the integration tool spits out a filesystem, and we commit that into an OSTree repo. We can then mirror that onto a server somewhere, and our bot can see that there's a newer commit and update itself.
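A minimal sketch of that publish step, shelling out to the ostree CLI; the repo path, branch name and server location below are placeholders rather than our actual setup:

```python
# Rough sketch of the publish step: commit the filesystem the integration tool
# produced, refresh the repo summary, then mirror the repo to the server.
import subprocess

REPO = "ostree-repo"
BRANCH = "robot/devel/aarch64"          # illustrative branch name
ROOTFS = "build-output/rootfs"          # what the integration tool spat out

subprocess.run(["ostree", f"--repo={REPO}", "commit",
                f"--branch={BRANCH}", "--subject", "CI build", ROOTFS], check=True)
subprocess.run(["ostree", f"--repo={REPO}", "summary", "--update"], check=True)
subprocess.run(["rsync", "-a", f"{REPO}/", "ostree-server:/srv/ostree/repo/"], check=True)
```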
A developer could also check out the in-production commit, deploy it locally into a little chroot, go and tweak something, and then with a really quick little command create their own little OSTree server, log into their local robot and tell it to point at that particular server. They get that really nice developer workflow: they don't have to go and rebuild anything, they can just have a bit of a play, and once they're happy, after a few iterations, they can see exactly what they changed. Something I often end up doing is logging into my developer board, making a change, finding it wasn't quite right, making another change, and by the time I've jumped around a couple of files and finally got it working: what exactly did I need to change? I can remember the last couple of things I changed, but did I also change something else? Because of the way OSTree is laid out, you can get the diff between the different commits and know, when you fold that back into your build system, that you've caught everything. Another thing I spend a lot of time on is reflashing a whole SD card or a whole board; because this is quick, I'm not doing that, so I'm saving lots of time.
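That developer loop can be sketched the same way; the refs and paths here are purely illustrative, and the two-ref form of `ostree diff` follows the upstream examples:

```python
# Sketch of the developer loop via the ostree CLI; refs and paths are made up.
import subprocess

REPO = "ostree-repo"

# Check the in-production commit out into a local tree to chroot into and tweak.
subprocess.run(["ostree", f"--repo={REPO}", "checkout",
                "robot/production/aarch64", "scratch-rootfs"], check=True)

# Later, list everything that differs between production and the experiment,
# so the changes can be folded back into the build system.
subprocess.run(["ostree", f"--repo={REPO}", "diff",
                "robot/production/aarch64", "robot/experiment/aarch64"], check=True)
```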
I've basically talked through this already. This is the CI/CD devroom, so that's what I ought to be talking about; up to this point I've been describing how I test and how I deploy, but here's a bit more structure. I think of it as three chunks. We have things like creating our Docker images: that's one git repo, and when you change the Docker image definitions in it, that triggers some CI to make a new set of Docker images. Then we have an infrastructure repo that records how we set up our infrastructure in Ansible; when you change that, the idea is that it goes away and continuously deploys our infrastructure. We've been focusing on deploying to the robots rather than to our own infrastructure, but we're getting there, and we record what we did even if it isn't all automatic just yet. Then we've got the three core bits of our own software area, our base components; but we also use a lot of upstream software, the kernel and all sorts of other things, and lots of those have their own CI and CD, so there's an awful lot of software there and I wasn't sure how to include it on this slide.
I don't want to dwell on the machine learning, because there was a really good talk earlier that's on my list of things to watch after FOSDEM, and I'm really interested to see how they did it. But I talked to some colleagues who have a bit more experience here, and they seem to think this approach would work and scale well, and they've seen similar things in other places. What we do is have a git repo that holds all of our training data, plus instructions on how to use our app to create models, and we store all of the training data using Git LFS, which seems to work pretty well. The CI for that pipeline then creates a commit for our models in our model store and uploads it using Git LFS, and that can trigger some other CI. We also have our app, which has its own CI.
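For reference, pointing Git LFS at the large training files, as mentioned above, is only a couple of commands; the file patterns and paths are just examples:

```python
# Example of tracking large training files with Git LFS.
import subprocess

subprocess.run(["git", "lfs", "install"], check=True)
subprocess.run(["git", "lfs", "track", "*.jpg", "*.npz"], check=True)  # writes .gitattributes
subprocess.run(["git", "add", ".gitattributes", "training-data/"], check=True)
subprocess.run(["git", "commit", "-m", "Store training data via Git LFS"], check=True)
```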
Then we bring it all together in a third layer: the integration part of the project. We have a git repo that contains all of our BuildStream .bst files, though it could just as well be a Yocto project, and whenever you update that it triggers the CI for this project; when you merge to master, it deploys to the robot. We'd like to have more happen there, we'd like to expand that. Another thing I'd really like is for the CI of those other repos, as they change and their pipelines succeed, to automatically trigger a speculative build of our system, but we're not quite there yet. As for the integration pipeline itself, at the minute we trigger BuildStream to build everything, then we have a series of test steps, then we push out to our OSTree server, and we also publish the whole-disk image.
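One way that speculative-build idea could be wired up, if we get to it, is for the final job in an upstream repo's pipeline to call the integration project's pipeline trigger; the GitLab URL, project ID and token variable below are placeholders invented for the example:

```python
# Placeholder sketch: kick off a speculative integration build via GitLab's
# pipeline trigger API when an upstream repo's CI succeeds.
import os
import urllib.parse
import urllib.request

GITLAB = "https://gitlab.example.com"
INTEGRATION_PROJECT_ID = "1234"          # hypothetical project ID

payload = urllib.parse.urlencode({
    "token": os.environ["INTEGRATION_TRIGGER_TOKEN"],   # a pipeline trigger token
    "ref": "master",
    "variables[SPECULATIVE]": "true",
}).encode()

urllib.request.urlopen(
    f"{GITLAB}/api/v4/projects/{INTEGRATION_PROJECT_ID}/trigger/pipeline",
    data=payload,
)
```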
If you haven't already deployed the system and you can't use OSTree to update incrementally, you need to go and grab a whole-disk image. We're using GitLab (this is just a screenshot from the GitLab web interface), but nothing I've said is really specific to GitLab. I quite like GitLab and could talk about it at length, but if you use some other git server, some other CI server, or some combination, I don't think it makes much difference; some bits will be slightly easier and some bits slightly harder. So how does all of our CI come together, how is it made to work? As I just mentioned, it's in GitLab, and you can go and have a look. Some of the things we use from GitLab are the Docker image registry, although lots of people provide Docker image registries, and a GitLab bastion runner, which we have to look after ourselves, but it's very small and it can create all sorts of different runners with your chosen cloud provider or your own local provider. We have GPU runners for doing our machine learning, Arm runners for doing our integration work, and at various points we need various sizes of x86 machine, for things like making Docker images; our GPU runners are currently on x86 as well.
So what does GitLab not give us, what did we have to do ourselves? One of the realities of using snazzy hardware is that your hardware vendor will have some bits of software or some artifacts with licences that differ from what you'd like, or that are a bit awkward, so sometimes you can't just share everything the way you would want to. We'd like to be able to share all of our intermediate artifacts, and we'd like it to be really easy to share all of our finished artifacts, but the reality is that at the moment we can't, particularly for the intermediate artifacts; we think we'll be able to publish our full disk images, but we're not quite there yet. So we need a way to control our artifacts. There are lots of different ways to do that; as I said before, in GitLab you need a bastion to do this, and other CI providers have different trade-offs.
Then there's the OSTree server: we have an OSTree server, and it occurred to me recently that you might be able to use something like GitLab Pages for it, because an OSTree server is essentially serving static content, but I haven't tried that out. We also need a build cache: if you're using something like BuildStream or Yocto, you don't want to build everything from scratch every time, you only want to rebuild what's changed. The joy of tools like BuildStream and Yocto is that they have a very good idea of exactly what has changed, so they can have reasonably strict cache keys and only reuse things they're really sure haven't changed. So no, we don't rebuild everything every time.
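The strict-cache-key idea can be illustrated in a few lines of Python; this shows the concept, not BuildStream's or Yocto's actual hashing scheme:

```python
# Illustration of a strict cache key: an artifact is reused only when the key,
# derived from everything that could affect the build, is identical.
import hashlib
import json

def cache_key(source_digest: str, build_config: dict, dep_keys: list) -> str:
    payload = json.dumps(
        {"source": source_digest, "config": build_config, "deps": sorted(dep_keys)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

# Any change to the sources, the build configuration or a dependency's key
# yields a different key, so a stale artifact is never picked up by mistake.
```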
So what's our solution for running OSTree? At the moment our proof of concept is a Docker container with rsync in it, sharing a volume with an nginx Docker container. As I said before, everything in an OSTree repo is essentially static, and almost all of it is also essentially immutable, because with content-addressable storage the file name is a key to the content, so 99% of the repo can be cached incredibly well. While this setup is very naive and needs to be better, it actually doesn't need to be much more complicated: Flatpak serve a very large amount of data, they have a series of caches around the world that all mirror back to a smaller server, and those caches only need to update themselves when someone makes a new release of a piece of software they serve. So while this isn't the way you'd want to push out to a really big production fleet, it doesn't need to be wildly more complicated, as Flatpak have demonstrated.
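Because the objects are content-addressed, even a naive static file server can mark almost the whole repo as cacheable forever. A toy sketch, where the repo path and port are arbitrary and a real deployment would sit behind nginx as described above:

```python
# Serve an OSTree repo as plain static files.  Objects under objects/ are
# content-addressed (the file name is a hash of the content), so they can be
# cached essentially forever; refs and the summary change, so they are not.
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

class OSTreeRepoHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        if "/objects/" in self.path:
            self.send_header("Cache-Control", "public, max-age=31536000, immutable")
        else:
            self.send_header("Cache-Control", "no-cache")
        super().end_headers()

if __name__ == "__main__":
    handler = partial(OSTreeRepoHandler, directory="./ostree-repo")  # assumed repo path
    ThreadingHTTPServer(("0.0.0.0", 8000), handler).serve_forever()
```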
Some notes and a summary. I'm really pleased with most of what we've done. As I said, the OSTree server is a bit naive: there are certain race conditions where it might not be perfect, but the OSTree client running on the bot would notice that and complain rather than try to boot into a broken deployment. Flatpak do have a series of tools they use to manage their servers, and I'm very interested in seeing whether I can adapt those tools for our use case, or learn to use them better, so that I don't have to. Looking forward: our integration pipeline currently has one big Arm worker that builds everything for us. It would be really nice to use the Remote Execution API, so that we could have a little farm of servers our CI could hand work off to; then our CI would only need to trigger a much smaller Arm server, and a series of those could all share resources. But the main things are the OSTree tooling and the OSTree server. And then I'm on to any questions.
We have a pool of devices that watch a server. All that GitLab stage does is check out the OSTree commit from our integration tool and put it onto the server. What we could do with is understanding what's already on the server, so that if you had a couple of different versions, a couple of different references, on the server, they would all cohabit nicely. But I don't think that's the role of the integration tool; the integration tool's job isn't to know about all the other systems, so I don't think that needs to happen inside our integration, although almost everything else does happen inside the integration: that very first pipeline creates everything. Does that answer the question?
OK, so I see this question as being about that kind of longer-standing, fleet-level solution. Something like Mender does this really well; OSTree is the deployment mechanism rather than the fleet-management mechanism. If you want to understand how many of your devices have actually received an update, you need something different, something else on top. We haven't aimed at that. My suspicion is that it will end up getting tied in quite closely with the application itself, but not necessarily.
No, we build a full disk image, so you get a zip file that contains a .img file. We're using BuildStream, which is very much like Yocto: in the same way that Yocto can create you a full distribution and a full disk image, we use BuildStream to do the same thing.
The question was whether we can validate what we're doing before we roll it out to everyone, and again that falls under that kind of fleet-management thing I was talking about. One thing that would be nice to do in our testing stage is to have a hardware-in-the-loop board that actually shows you can roll forward and roll back again. You could potentially do that with a number of boards, and you could also potentially do that with QEMU. I think that's us. Thank you very much.