Implementing a Build Manager in Ada
Formal Metadata
Number of Parts | 287
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/57023 (DOI)
FOSDEM 2022, 170 / 287
Transcript: English (auto-generated)
00:05
Welcome to the presentation "Implementing a Build Manager in Ada". Why a new build manager when several solutions already exist? I have been using Jenkins for more than 8 years now, with more than 30 projects and several build nodes.
00:23
Jenkins is really nice and powerful but it is slow and it needs a lot of memory. More annoying is the fact that it requires Java on build nodes. The Java virtual machine must also be quite recent and up to date with the Jenkins server.
00:43
Jenkins suffers from several security vulnerabilities which means that you have to update your server regularly. I decided it was time for me to get rid of Jenkins and write a build manager fully in Ada.
01:01
The main requirements are security and performance. Let's build a safe design in Ada. I need a command line interface because that's easier for me to control the build engine. But I also need some web interface that I can look at on my desktop and mobile phone. I like a nice interface.
01:26
I need a flexible architecture for build nodes to be able to use SSH, Docker, virsh, so that I can start and stop a container or a virtual machine. What must a build manager know? Well, a lot of things really.
01:45
It needs to know the list of projects with their source control methods. It must know the recipes to build the project. The recipe defines the steps that the build manager must perform to build the project, all the steps.
02:00
There can be several recipes for the same project. The project has some dependencies that you want to track. This is useful to trigger a build of another project when the project is changed. The build manager must know a list of build nodes with different systems, different architectures.
02:23
Then we have the build information to track the build results. Each build also contains some metrics. The build manager must connect to build nodes, so it needs credentials. It also needs some API secret keys when you want to publish the build results.
02:41
And you have more secret keys if you want to sign a build. What must a build manager do? Again, a lot of things. First, it must probe the source repository for changes to trigger builds. Then it must schedule builds when changes are detected.
03:02
It has a build queue. From the build queue, it must launch a build and execute the build steps described in the recipe. It can launch them locally or remotely. While executing the build steps, it must control and track the execution. It also collects the build results with logs, and collects build metrics, code coverage, unit test execution.
03:27
It must publish the build results whether the build failed or not. Of course, it must send build notifications. And well, it's always nice to have reports on the projects.
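The launch, track and collect loop described above can be sketched roughly as follows. This is a minimal Python illustration, not Porion's implementation: `BuildResult`, `run_recipe` and the two-step recipe are hypothetical names, and the Python interpreter stands in for real build tools so the example stays portable.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class BuildResult:
    """Outcome of one recipe run: overall status plus the collected logs."""
    success: bool = True
    log: list = field(default_factory=list)


def run_recipe(steps, timeout=60):
    """Run each step (an argv list) in order and stop at the first failure."""
    result = BuildResult()
    for argv in steps:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        # Keep the output of every executed step so it can be published later.
        result.log.append(proc.stdout + proc.stderr)
        if proc.returncode != 0:
            result.success = False
            break
    return result


# Hypothetical two-step recipe (assumption: real steps would call build tools).
recipe = [
    [sys.executable, "-c", "print('configure')"],
    [sys.executable, "-c", "print('make')"],
]
result = run_recipe(recipe)
```

Running the steps remotely would only change how each command is launched; the tracking and log collection stay the same.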
03:41
What must a build manager protect? A build manager has access to sensitive data. First, there is the source code if you run proprietary projects. Often, you may use API secret keys to use various external services. The API secret keys must be protected.
04:02
The build manager has access to various credentials, either to get the sources or to connect to the build nodes. Sometimes you have to sign the build result and this is in general protected by a secret key or a password. Finally, you have the build results and the build logs.
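Build logs are one place where sensitive values can escape. One common precaution, shown here as a minimal Python sketch and not as something the talk describes Porion doing, is to mask every known secret value before a log line is stored. The function name `mask_secrets` and the sample secrets are hypothetical.

```python
def mask_secrets(line, secrets, mask="********"):
    """Replace every known secret value found in a log line with a mask."""
    # Longest first, so a secret that contains another one is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:
            line = line.replace(secret, mask)
    return line


# Hypothetical secrets known to the build manager for this build.
secrets = {"s3cr3t-api-key", "hunter2"}
safe = mask_secrets(
    "curl -H 'Authorization: s3cr3t-api-key' https://example.invalid", secrets
)
```

This only catches secrets that appear verbatim; an encoded or transformed value would still pass through.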
04:22
You must not leak an API secret key through a build log, because that build log is published somewhere. Let's have a look at some numbers. First, a word about the cost. I was able to write this project in a reasonably short time, less than 30 weekends of my free time.
04:46
The project is written mostly in Ada with some HTML and a little bit of TypeScript. TypeScript is used on the web interface. The project has 32,000 lines of Ada code. Half of it is generated code.
05:07
There are 43 Ada packages and 30 Ada private packages. The database contains 19 tables.
05:23
Let's have a look at the architecture. The project has two binaries, a web server binary and a command line binary. Both are written in Ada. The web server is built on top of Ada Web Application and Ada Web Server.
05:42
The porion command line utility is built on top of the Ada Database Objects library. The command line will connect to build nodes to run the build recipes. On the file system, we will see a config, tmp, logs and project directory.
06:01
An SQLite database contains information about projects, recipes, builds and so on. For the secret and sensitive information, I'm using the Ada Keystore, which uses several encryption keys. The key store is locked either by your user password or by your GPG key.
06:22
The project sources are extracted in the project directory. Here is a detailed view of the porion command line tool. It runs on Unix or BSD systems and uses SQLite for the database. We could easily switch to MySQL or PostgreSQL but SQLite is easier to configure and set up.
06:46
The tool is built on top of several Ada libraries. Most of them are part of the Ada Web Application framework. The porion lib block is shared between the command line and the porion web server.
07:02
On top of that we have the set of commands provided by the tool. You see on the picture the two generation tools that generate a big part of the Ada code. Dynamo generates support to access database tables in Ada and the Advanced Resource Embedder generates Ada helper packages.
07:25
The server architecture is very similar. It uses Ada Web Server within the Ada Web Application framework so that we can benefit from the application permission layer provided by AWA. We use the same porion library and some specific web access operations.
07:45
The same Ada code generation tools are used. Let's have a look at the UML to Ada code generation. I have described the database model in UML. There are 19 tables organized in 5 packages.
08:04
For the UML I've used ArgoUML, which is an old Java tool. It was migrated from tigris.org to GitHub. It's still maintained. ArgoUML works well to define the UML class model.
08:21
Dynamo reads the ArgoUML file and it generates the Ada packages from the UML classes. A quick overview of Dynamo, which I presented 3 years ago. You write your design using a UML class diagram, but you can also describe it by using a YAML model or an XML model.
08:45
The Dynamo tool reads those files and it generates documentation, Ada packages and SQL files. The code generation works for the 3 databases, PostgreSQL, MySQL and SQLite. And now to access the database you only need to use the generated Ada packages
09:03
and write your application. Dynamo generates 14,000 lines of Ada code in 6 Ada packages. That is one more package than in the UML model, because I have an XML file that contains a series of SQL queries that I want to map in Ada.
09:23
All this handles SQL insert, update, delete and of course SQL queries. When you do a select to retrieve a list of data, this is put in Ada.Containers.Vectors. This is part of the generated code.
09:42
Other objects of the model are using reference counting, so there is nothing to do to manage memory. Let's have a quick look at the UML model in ArgoUML. We have a set of packages. The name of the project is porion, the root package is porion
10:02
and we have child packages where we could put different models. The project package contains a model which describes the project itself. So we will see the class diagram.
10:22
Here we have a project class which has a table stereotype. In this table we have several attributes. Some of these attributes are marked by some stereotypes
10:41
and the stereotypes in fact help the Dynamo code generator to drive the generation. So table is a main stereotype, pk is a stereotype to identify a primary key and auditable is a stereotype that instructs the code generator to generate audit support,
11:05
so each time the name record is modified we will track changes in the database. We also have project dependencies with a dependency table
11:23
and we have some relations in that table with other tables. Let's have a look at the others. I cannot show everything, but here we see the build recipe and here the database table that holds the build result.
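How stereotypes such as table, pk and auditable can drive a generator may be easier to see in a small sketch. The Python below is only an illustration of the idea, not Dynamo's actual model format or output: the classes `Attribute` and `Table`, the column names and the SQL types are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    sql_type: str
    stereotypes: set = field(default_factory=set)


@dataclass
class Table:
    name: str
    attributes: list


def generate_ddl(table):
    """Emit a CREATE TABLE statement; the 'pk' stereotype marks the primary key."""
    columns = []
    for attr in table.attributes:
        column = f"{attr.name} {attr.sql_type}"
        if "pk" in attr.stereotypes:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {table.name} ({', '.join(columns)});"


def audited_columns(table):
    """Columns whose changes the generated code would have to record."""
    return [a.name for a in table.attributes if "auditable" in a.stereotypes]


# Hypothetical model of the project table (names and types are assumptions).
project = Table("project", [
    Attribute("id", "INTEGER", {"pk"}),
    Attribute("name", "VARCHAR(255)", {"auditable"}),
])
ddl = generate_ddl(project)
```

The same model description can feed several outputs (SQL schema, access code, documentation), which is what keeps them consistent with each other.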
11:48
Here we can see how we store the unit tests which are executed and identified and here we have the build metrics which are collected by the system.
12:04
Here we have the build nodes representation and so on. Everything is written in UML and it helps in the code generation. Let's look at the benefit of using UML. When I started the project, I did not get the UML database model right the first time.
12:28
I've made several iterations: you add new tables, new relations, new attributes, then it becomes too complex and I moved tables into other packages.
12:42
Code generation is fast and it's easy to change the UML model and to rebuild the Ada code. The good thing is that a change in the UML model can break the Ada compilation but you can detect issues quickly because this breaks the compilation.
13:02
You only have to fix it and it's done, almost. Last point is that it keeps the consistency between the Ada generated code and the SQL database which is good. Well, refactoring in Ada is really safe and when it compiles again
13:21
you are pretty sure and confident that it will work. Let's have a look at the build queue scheduler. The build queue contains a list of recipes that must be executed on the build nodes. The role of the scheduler is to keep that list of recipes ordered,
13:41
but we want to minimize the number of builds and take into account project dependencies. We have a set of four projects: projects B and D depend on project A. Project C depends on project B. The build queue contains the recipes for project A, then C and then D.
14:03
If we add B at the end of the queue, then after B is rebuilt this will trigger recipe C again, which is not good. The build scheduler will reorganize the queue to avoid that. We still build A, but now recipe C is put at the end, after B.
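The cost of the naive ordering can be checked with a small simulation. This Python sketch uses the four projects from the example; the rule that a finished build retriggers its already-built direct dependents is an assumption made for the illustration, and `count_builds` is a hypothetical helper.

```python
from collections import deque

# Dependencies from the example: B and D depend on A, C depends on B.
DEPENDS_ON = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"A"}}


def count_builds(queue, depends_on):
    """Count the builds executed when finishing a project retriggers every
    already-built project that depends on it directly."""
    pending = deque(queue)
    done = set()
    count = 0
    while pending:
        project = pending.popleft()
        count += 1
        done.add(project)
        for other, deps in depends_on.items():
            if project in deps and other in done and other not in pending:
                pending.append(other)
    return count


naive = count_builds(["A", "C", "D", "B"], DEPENDS_ON)      # B appended last
reordered = count_builds(["A", "D", "B", "C"], DEPENDS_ON)  # C moved after B
```

With B appended at the end, C is built twice, giving 5 builds; with C moved after B, each project is built once, giving 4.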
14:23
The first step to do this is to load the build queue in an Ada vector. There is the queues vector declaration, some variables for the query definition. We set up the filter on the query to only get the recipes of the current build node
14:42
and we only need to call the list procedure. Now we have our recipes A, C, D in the vector. We define a comparison function between two queue entries. Its job is to decide whether the left element must be executed before the right element.
15:07
I will not show the body of that method because it is too complex for this presentation. With our comparison function we just instantiate the generic sorting package and we have the sort queue package that we can use.
15:23
Now we only have to append our new recipe to the current list and call the sort procedure. And now we have a list which is sorted on the build execution order. We just have to iterate over it, set up the order number in the database and save the entry.
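The append, sort and renumber sequence can be sketched as follows. The body of the real comparison function is not shown in the talk, so this Python version substitutes a simple stand-in: entries are compared by dependency depth first and by their original queue position second. All names here (`depth`, `make_compare`, `schedule`) are hypothetical.

```python
from functools import cmp_to_key

# Dependencies from the example: B and D depend on A, C depends on B.
DEPENDS_ON = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"A"}}


def depth(project, depends_on):
    """Length of the longest dependency chain below a project."""
    deps = depends_on[project]
    return 0 if not deps else 1 + max(depth(d, depends_on) for d in deps)


def make_compare(depends_on, position):
    def compare(left, right):
        """Negative when left must be executed before right."""
        by_depth = depth(left, depends_on) - depth(right, depends_on)
        if by_depth != 0:
            return by_depth
        return position[left] - position[right]  # keep queue order on ties
    return compare


def schedule(queue, new_recipe, depends_on):
    """Append the new entry, sort, and return the order number of each entry."""
    entries = list(queue) + [new_recipe]
    position = {name: index for index, name in enumerate(entries)}
    entries.sort(key=cmp_to_key(make_compare(depends_on, position)))
    return {name: order for order, name in enumerate(entries, start=1)}


orders = schedule(["A", "C", "D"], "B", DEPENDS_ON)
```

A project is always deeper than anything it depends on, so this comparison is a total order that places dependencies first. For the example it yields A, D, B, C, with C after B as described above.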
15:46
Let's look at another problem. How can we configure the database on a fresh installation? The idea is to embed the SQL schema in the binary. The SQL schema is generated by Dynamo in a plain text file
16:02
and this is useful when you have to create a database manually. But a plain text file is not easy to use from the program. Instead it would be nice to have an array of strings with each string being an SQL statement that creates a single table.
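Why an array of statements is convenient can be shown with a short sketch. The Python below uses the standard `sqlite3` module; the two-table schema and the helper names are assumptions, and the split on `;` is deliberately naive (it would break on a statement containing a literal semicolon).

```python
import sqlite3

# Hypothetical schema text, standing in for the generated SQL file.
SCHEMA_TEXT = """
CREATE TABLE project (id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE build (id INTEGER PRIMARY KEY, project_id INTEGER, status INTEGER);
"""


def split_statements(text):
    """Split the schema into one string per SQL statement (naive split)."""
    return [part.strip() + ";" for part in text.split(";") if part.strip()]


def create_database(connection, statements):
    """Execute the statements one by one on a fresh database."""
    for statement in statements:
        connection.execute(statement)
    connection.commit()


statements = split_statements(SCHEMA_TEXT)
db = sqlite3.connect(":memory:")
create_database(db, statements)
tables = sorted(
    row[0] for row in db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

With the statements already separated, the installer only needs a loop, and a failure can be reported against the exact statement that caused it.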
16:23
That's the purpose of the Advanced Resource Embedder. It will do the job for us. It generates 2,000 lines of code in three Ada packages. Let's have a look at this second code generator.
16:41
So the goal of ARE is to embed various files or contents in the binary. We have a set of configuration, help files, web files or whatever we want to put in the binary. Finally we define a set of rules. The goal is to tell ARE how to manage those files, iterate them
17:03
and make their access available in three programming languages, C, Ada and Go. The tool reads the files and generates some C, Ada and Go source. The generated source becomes part of the compilation of the final program
17:21
and then it contains the files. ARE generates simple Ada code, at least far simpler than Dynamo. It can generate Ada types, but for Porion I prefer to define the type myself in a parent package. The Porion resource package defines the content array and content access types.
17:45
This is the way we will represent our array of strings. Then ARE generates the Porion resource schema package and it has only one function, Get_Content. The body of the package contains the static read-only strings that represent the SQL schema.
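The principle of generating source that carries file contents, with a single lookup function, can be sketched in a few lines. This Python example is not ARE and does not reproduce its rules or output; `generate_module`, the resource name and the SQL statements are hypothetical.

```python
def generate_module(resources):
    """Return Python source that embeds each resource as a tuple of strings
    and exposes a single get_content lookup function."""
    lines = ["# Generated file, do not edit.", "_CONTENTS = {"]
    for name, parts in sorted(resources.items()):
        lines.append(f"    {name!r}: {tuple(parts)!r},")
    lines.append("}")
    lines.append("")
    lines.append("def get_content(name):")
    lines.append("    return _CONTENTS.get(name)")
    return "\n".join(lines) + "\n"


# Hypothetical resource: the schema, already split into statements.
resources = {
    "create-schema.sql": [
        "CREATE TABLE project (id INTEGER);",
        "CREATE TABLE build (id INTEGER);",
    ]
}
source = generate_module(resources)

# Load the generated source the way a build would compile it into the program.
namespace = {}
exec(source, namespace)
content = namespace["get_content"]("create-schema.sql")
```

Because the contents end up as constants in the generated source, the final program needs no schema file at run time.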
18:05
Well, writing a build manager is not easy. We have a lot of information to take care of, many operations to perform. A secure build manager is even harder because we have all the security aspects.
18:20
Storing and keeping sensitive information is not an issue with Ada Keystore, but using that in a build recipe is more complex. Ada really helps, but I always find that you must think a lot more about your design and how you organize your packages and types.
18:42
If I compare with other languages like Java, the design phase is longer. It's not a matter of knowing the language or not, it's a matter of specific or more constraining rules that you have in Ada, like no circular dependencies or forward type declarations.
19:04
Code generation was key for this project for me. There is a database layer that is now fully mapped in Ada with generated code. And there is the ARE embedder which simplifies the installation of Porion.
19:20
The Ada database layer was the key to simplify the project. This helps to implement complex algorithms. I described only a short algorithm but the library contains several others which are interesting. This project is new. It lacks many features if you compare with other build managers,
19:42
but Ada allows complex features to be implemented quickly and I will not stop there. Thank you for listening. Thank you to my wife, who allowed me to record this presentation. It's a fun project. You are welcome to use it. You are welcome to contribute.
20:01
You have the source and if you want, please submit a pull request. Thank you.
20:28
And we are live with Stéphane Carrez. We have only three and a half minutes for a lot of questions. So let's please be quick. After that, people can join us in the live stream.
20:40
The first question from Fernando is that you mentioned you migrated from another system, namely Jenkins, because you wanted a lighter one, among other goals. Yes, I had two goals: a light system, and avoiding the dependency on Java on build nodes. So today I don't depend on Java on build nodes. And a light system, because Jenkins was running on my server
21:07
and it was using more than one gigabyte of memory, and now the server is using around 50 megabytes of memory, which is a lot less in fact.
21:20
It's huge. Now also in terms of performance, in terms of web performance, I have seen that web requests are really much faster in Ada compared to Java in fact. And this is a big win for me. So mission accomplished. Yes. What about BSD systems?
21:41
That's the second question. Have you tried them? Have you needed to make any changes? On BSD, I am building Porion on NetBSD with the latest compiler that Fernando has updated and it works.
22:01
So in fact, the Ada Utility Library has been ported to various systems, NetBSD, FreeBSD, Windows, and I rely on it for some specific operations that are required to, for example, launch a process,
22:22
get the output of a process and run external things, external API. These operations are not standardized by Ada in fact. So I'm using the Ada Utility Library, which integrates all this and has been ported to various systems in fact.
22:44
All right. So you didn't have to make a lot of changes to Porion itself? No, no, no. Mission accomplished again. Yes. What happens with the database schema in case you need to migrate to a new schema? Okay.
23:00
In general, I never modify the SQL schema. I always modify the UML model and the code generator then generates a new schema. For now, Dynamo does not generate a migration schema. It does not generate migration queries.
23:24
So I have to do it by hand. But by looking at the previous schema and the new schema, it's quite easy to know which columns and which tables are missing and so on. So this is done by hand.
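The manual comparison of the previous and new schema can be pictured with a small sketch. This is not a Dynamo feature; the Python below is a hypothetical helper that only detects added tables and columns (no renames, type changes or removals), with made-up table and column names.

```python
def diff_schema(old, new):
    """Compare two {table: {column: type}} mappings and list the statements
    needed to add what is missing from the old schema."""
    statements = []
    for table, columns in new.items():
        if table not in old:
            cols = ", ".join(f"{c} {t}" for c, t in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for column, sql_type in columns.items():
            if column not in old[table]:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {column} {sql_type};"
                )
    return statements


# Hypothetical previous and new schemas.
old_schema = {"project": {"id": "INTEGER"}}
new_schema = {
    "project": {"id": "INTEGER", "name": "VARCHAR(255)"},
    "build": {"id": "INTEGER"},
}
migration = diff_schema(old_schema, new_schema)
```

The output is a starting point for a hand-written migration, which still has to deal with data conversion and anything that was dropped or renamed.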
23:41
Okay. And last question, is the generated code high quality? Do you like it? It generates Ada. So yes, it is. That's all the time we have. We'll still wait for another minute or two.