Yet Another repoman


Formal Metadata

Title: Yet Another repoman
Subtitle: How We Do CI at oVirt
Number of Parts: 611
Author: Anton Marchukov
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Production Year: 2017

Content Metadata

Abstract: Repoman is a tool developed in-house and used as a core tool in oVirt CI and release processes. It aids the process of integrating RPM packages from multiple sources into a single repo. It is made to be self-contained, so it is easy to use from CI. Come and see what our use cases at oVirt are and how we use repoman to solve them. Being developed with an abstraction in mind, it might be helpful to you too.
Transcript: English (auto-generated)
Hello, everybody. Our next speaker is Anton Marchukov with Yet Another repoman. Please give him a warm welcome. Hello, everybody. Welcome to this talk about repoman.
So an important thing to understand: repoman is like a Superman, but one that works with packages. A few words about me: I'm a software engineer working at Red Hat and I'm a member of the oVirt community infra team.
oVirt is an open source virtualization platform based on the KVM hypervisor. We are the infra team, so we are responsible for all the infrastructure that this project uses for builds and releases. We are also doing continuous integration and, recently, continuous delivery as well.
This tool was developed by us to make that possible. So what is the challenge, why did we do it? If you Google it, you will find other repomans that don't necessarily deal with packages. The thing is, oVirt is quite a large application.
It consists of multiple parts, packaged as RPMs, and we really have to deal with a lot of RPMs, and also ISOs by the way. Sometimes we process thousands of RPMs a day,
and we somehow need to make usable composes out of this, so users can install them, testers can test them, and so on. Obviously we don't have infinite resources, so we need to do it efficiently. That's why this tool was created.
And as you see, it has nothing evil inside, because we made this tool for CI. You are supposed to run it as a standalone application on the command line, or use it from Jenkins, for example. And if you do it like that, it's important that this tool doesn't use a database.
There is no database. There is also no cloud inside, it doesn't have any REST interfaces, and it doesn't open network sockets that you are supposed to deal with or interface with. And there are no daemons inside, so it's not evil, and also no daemons.
So it's not some daemon that runs in the background that you need to manage. It has no state inside at all. If you run it, it always runs from scratch, because if you have Jenkins, you already have a way to define and run jobs. Jenkins stores their configuration and you can store their artifacts.
That's why this tool doesn't keep anything inside. And yeah, you pretty much guessed it, it's a good old console application. Once you install it, you just use it from the command line, or, since Jenkins is able to run command line applications, you can run it from Jenkins.
It's currently distributed as RPMs; it's a Python application. But it uses some command line tools like createrepo, which comes from yum, and rpmsign, and those are not available as Python modules, so we can't just satisfy them as PyPI dependencies.
That's why it's an RPM, but if you have everything installed, you can also access it as a Python module, because it is an actual Python module. Okay, so what it does: the idea behind the tool is that you define some sources, then you can define some filters, and then it stores the result in the output store.
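To make that sources, filters and store idea concrete, here is a minimal conceptual sketch in Python. The class and method names below are invented for illustration; they are not repoman's actual API.

# Illustrative sketch only: the names here are made up to show the
# sources -> filters -> store idea described in the talk, not repoman's real code.
from abc import ABC, abstractmethod


class Source(ABC):
    """Anything that can list artifacts (local dir, HTTP page, Koji tag, ...)."""

    @abstractmethod
    def list_artifacts(self):
        """Return an iterable of artifact paths or URLs."""


class Filter(ABC):
    """Decides which artifacts from the sources actually get added."""

    @abstractmethod
    def apply(self, artifacts):
        """Return the subset of artifacts to keep."""


class Store(ABC):
    """Destination that lays artifacts out as a usable repository."""

    @abstractmethod
    def add(self, artifact):
        """Place one artifact into the repository layout."""


def compose(sources, filters, store):
    """Pull artifacts from all sources, run them through the filters, store them."""
    for source in sources:
        artifacts = list(source.list_artifacts())
        for flt in filters:
            artifacts = list(flt.apply(artifacts))
        for artifact in artifacts:
            store.add(artifact)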
It's made pretty abstract in the code, it uses classes, and of course you can override them. Currently, as an input source it supports the local disk filesystem, and that's primarily how we use it.
It's also able to get packages from the internet just by using HTML pages with HTTP links. Any web page that has links to artifacts like RPMs or ISOs will do. There is also a way to recursively crawl pages and look for artifacts, but that's a configuration setting.
It's also able to understand some particular HTTP applications, like Jenkins. If you pass a link to a Jenkins artifact page, it understands that it's Jenkins and it will just download the artifacts.
It will not parse the full HTML of Jenkins. We also support Fedora build systems such as Koji, which is the primary Fedora build system, and we also support Copr links. So if you know what Copr is, the Fedora third-party repositories, it will understand those. And for Koji there is particular support: if you have a Koji tag, you can pass the Koji tag and it will download stuff from that tag.
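As an illustration of the plain-HTTP source just described (any page with links to RPMs or ISOs), here is a small sketch using only the Python standard library. It shows the general idea, not repoman's implementation.

# Illustrative sketch, not repoman code: gather RPM/ISO links from a plain
# HTML page, which is roughly what the plain-HTTP source does.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ArtifactLinkParser(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith((".rpm", ".iso")):
                self.links.append(urljoin(self.base_url, value))


def list_http_artifacts(page_url):
    """Return absolute URLs of RPM/ISO artifacts linked from page_url."""
    with urlopen(page_url) as resp:
        html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")
    parser = ArtifactLinkParser(page_url)
    parser.feed(html)
    return parser.links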
Then the filters: basically, to the inputs you apply filters to filter the RPMs. I think the one we mostly use is the only-missing one. It allows you to, for example, take two input sources, use one as the primary, and from the second only fill in what is missing.
Because it understands the versions and the names of RPMs, it is able to take only what is missing. But you can also keep only the latest versions, or you can filter by regex. And then it just stores everything on disk.
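A rough sketch of what an only-missing style filter does, under the assumption that a name-version key can be derived for each package; repoman reads the real package metadata, the file-name trick below is only for illustration.

# Illustrative sketch of an "only missing" style filter; names and structure
# are invented here, not taken from repoman itself.
import os


def package_key(path):
    """Very rough name-version key taken from a file name like
    'artifact-a-2.0-1.el7.noarch.rpm'; real tools read the RPM header instead."""
    base = os.path.basename(path)
    return base[: -len(".rpm")] if base.endswith(".rpm") else base


def only_missing(primary_artifacts, secondary_artifacts):
    """Keep every package from the primary source, and from the secondary
    source only those packages the primary does not already provide."""
    have = {package_key(p) for p in primary_artifacts}
    fillers = [p for p in secondary_artifacts if package_key(p) not in have]
    return list(primary_artifacts) + fillers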
For the store, it currently only supports the disk filesystem. And it's able to split the content, as I will show you a bit later, into RPMs and ISOs. Again, it's pretty abstract: if you have other types of artifacts, you can define your own classes inside, and maybe send us a patch.
It really was meant to be a general artifact tool. And this is a real use case, this is how we use it for the oVirt experimental repo. The oVirt experimental repo is an oVirt repository that you can install oVirt from. But the thing is that it's basically updated on every commit.
Here on the left you have some jobs from our Jenkins. The names are not important, but the thing is that there are a lot of them, because oVirt consists of many sub-projects and each of them is maintained by its own team. They all build artifacts, but we as the CI team need to make one oVirt-wide repository out of all of this.
Essentially, when something is built from a code commit, we need to take it and place it into this large repository. And this is where we use repoman. And as you see here in the structure, repoman understands the distributions and understands the versions and the artifacts.
You see that it created an rpm sub-directory and an iso sub-directory, depending on what content you give it. Then it also created sub-directories for all the distros it saw in the packages. This happens automatically in repoman, and that's why it's useful. And then the "latest" here is basically the latest continuous delivery compose, the very latest based on commits.
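The layout described here, with rpm/ and iso/ top-level directories and per-distro sub-directories for RPMs, could be sketched like this. The exact path scheme is an assumption for illustration, not repoman's verbatim output.

# Illustrative sketch of the kind of layout the talk describes; path details
# are assumed, not repoman's exact behaviour.
import os


def destination_dir(repo_root, artifact_name, distro):
    """Pick e.g. <repo_root>/rpm/el7/ for 'vdsm-4.19.1-1.el7.x86_64.rpm'
    or <repo_root>/iso/ for an ISO image."""
    if artifact_name.endswith(".rpm"):
        return os.path.join(repo_root, "rpm", distro)
    if artifact_name.endswith(".iso"):
        return os.path.join(repo_root, "iso")
    raise ValueError("unsupported artifact type: %s" % artifact_name)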
And as I said about efficiency, one particular thing that repoman does is use hard links. It will create a hard link to the file if it hasn't changed.
That's why we can really produce hundreds of repositories daily. If we were copying all the files, we would run out of space pretty soon. Here, for example, we have the previous repository that was built earlier, and you see that basically two new package versions were built.
For example, for artifact A we have version 2, and the same for B. And then when we use this only-missing filter and ask repoman to take this and just add the new stuff, it will basically hard-link all the artifacts that are the same, so they won't occupy any extra space. And since they are hard links, if you delete this one, that one will stay.
They are separate file names, but no extra disk space is occupied. That's very useful for us, just because those repos take a lot of space and we only mutate part of them. But it also causes problems if you store them on different filesystems.
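A minimal sketch of the hard-link-or-copy behaviour just described, assuming the previous and the new repo normally live on the same filesystem (hard links cannot cross filesystems, which is the problem mentioned above).

# Illustrative sketch (not repoman code) of the hard-link-instead-of-copy idea:
# reuse an unchanged artifact from the previous repo by hard linking it.
import os
import shutil


def place_artifact(src, dst):
    """Hard-link src to dst when possible, fall back to copying otherwise."""
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    try:
        os.link(src, dst)          # no extra disk space, just another name
    except OSError:                # e.g. a cross-filesystem link attempt
        shutil.copy2(src, dst)     # costs real space, but always works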
And it can't really use symlinks right now. And that's pretty much it. So, just to describe where you can find additional information: this is my Twitter, feel free to just ask questions there and I will reply to them.
And this is our infra team mailing list. If you have any questions regarding oVirt, how we do CI, whatever, or repoman, just use this mail. Currently the primary repository is our Gerrit. That's because we really do continuous integration for repoman itself: we test it on different Fedoras and OSes and we build the RPMs there.
So we use our own infrastructure, but there is a GitHub mirror. It's basically mirrored from Gerrit, so feel free to use it. As a Python project, we have documentation on Read the Docs, and it's a pretty standard Python application. And if you want current RPMs, they are in the oVirt RPM repository.
That's basically it. So if you have any questions... yeah, this is just the copyright slide, because I took pictures from the internet. I will also be available at the oVirt booth, we have an oVirt booth here. So whenever I'm not at my second talk, feel free to come by.
Thank you very much, Anton. If you have questions, I'll come around with the mic. Thanks. So, if there is no daemon... No, it's not a daemon.
Yeah, if there is no daemon, my question is: how frequently do you run repoman to create repos? We run it, I think, several times. It is a Jenkins job, so it's triggered by commits to the code. We have maybe 50 repositories where people may commit.
And each of them builds its artifacts; yeah, those are all Jenkins jobs. So it happens several hundred times a day, and for each of them repoman runs. So it's hundreds of runs per day. It doesn't really create a lot of load.
It just reads the RPMs, parses the metadata, and then composes the final repository, and when possible it creates hard links. Does it also read the RPM contents, or only the headers? It reads only the headers. We use the RPM library to read only the beginning of the RPM, where the metadata is.
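As a rough illustration of reading only the header with the Python rpm bindings (a sketch of the idea, assuming the rpm module is installed; repoman's actual code may differ):

# Sketch of reading only the RPM header metadata, not the payload; this is an
# illustration of the idea from the talk, not repoman's actual code.
import os

import rpm


def read_header(path):
    """Return (name, version, release) from an RPM file without reading its payload."""
    ts = rpm.TransactionSet()
    # Skip signature checks so unsigned or locally built packages can be read too.
    ts.setVSFlags(rpm._RPMVSF_NOSIGNATURES)
    fd = os.open(path, os.O_RDONLY)
    try:
        hdr = ts.hdrFromFdno(fd)
    finally:
        os.close(fd)
    return hdr[rpm.RPMTAG_NAME], hdr[rpm.RPMTAG_VERSION], hdr[rpm.RPMTAG_RELEASE]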
Okay, thanks. Hello. My question is: what's the reason for using Jenkins to produce RPMs? I mean, I know that Red Hat has a lot of different tools for doing that. Yeah, but this is not really a Red Hat product.
We have RHV, which is a product by Red Hat based on oVirt, but oVirt is a community project, and the community infrastructure uses Jenkins. Yeah, we are reviewing other tools, but Jenkins works pretty well for us, and it really has everything we need right now.
So yeah, we are a Jenkins shop. Any more questions? Well then, thanks a lot. A warm round of applause for Anton.