Cappulada: Smooth Ada Bindings for C++
Formal metadata
Number of parts | 561
License | CC Attribution 2.0 Belgium: You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/44180 (DOI)
FOSDEM 2019, 149 / 561
Transcript: English (automatically generated)
00:06
Yes, thank you everyone for coming. Today I'm going to talk about Cappulada, our Ada binding generator for C++ code. So, what is our objective? We want to use Ada components in C++ environments, or maybe use C++ libraries with Ada.
00:28
While writing a binding is technically relatively easy, if you have a large API it's quite cumbersome and you need to do a lot of stuff manually that could be automated.
00:40
And there are already existing solutions: GCC provides a binding generator, GtkAda uses a binding generator for the GTK code, and there's a project on GitHub called Headmaster which generates bindings for C code. So, why do we want to reinvent the wheel? Well, all of these solutions have drawbacks that we need to overcome.
01:07
So, the GCC binding generator generates uncompilable code, which is quite bad because you would have to build a post-processor anyway that fixes the errors GCC makes.
01:20
Also, it doesn't support things like templates, and some symbols are just left out because GCC says, well, this isn't a valid Ada symbol. And if we wanted to fix GCC itself, we would probably have to maintain our own fork of GCC, which isn't really feasible in the long run.
01:44
Also, GtkAda: this is really, really project-specific, and they just generate GTK bindings from GTK's own specifications. They also have no C++ support. The same goes for Headmaster, which is really a C-only project.
02:07
So, what do we want to do? Well, we want to automatically generate Ada bindings from C++ headers. We want an API layout that keeps the types and that has a semantic mapping. So, if I have a namespace called A, a class called B, and maybe a nested class called C, then if you use that in Ada, this should also be A.B.C.
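As an illustration of the A.B.C mapping just described, here is a minimal C++ sketch. The nesting mirrors the talk's example; the helper `to_ada_path` is hypothetical and only shows the intended naming, it is not part of Cappulada.

```cpp
#include <cassert>
#include <string>

namespace A {
class B {
public:
    class C {};  // nested class, reachable in C++ as A::B::C
};
}  // namespace A

// Hypothetical helper (not from the tool): turn a C++ qualified name into
// the dotted Ada package path asked for in the talk, "A::B::C" -> "A.B.C".
std::string to_ada_path(const std::string& cpp_name) {
    assert(!cpp_name.empty());
    std::string out;
    for (std::size_t i = 0; i < cpp_name.size(); ++i) {
        if (cpp_name.compare(i, 2, "::") == 0) {
            out += '.';
            ++i;  // skip the second ':' of the scope operator
        } else {
            out += cpp_name[i];
        }
    }
    return out;
}
```

Calling `to_ada_path("A::B::C")` yields the string `A.B.C`, which is the package path an Ada user of the binding would be expected to write.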
02:28
We don't want any further packaging around it; we really want to use the binding as natively as possible. Also, we need to automatically generate all the import linker symbols, and we have to detect name collisions, because if
02:45
I have a variable that is called begin in C++, this probably won't work in Ada because it's a keyword. So, we have to get that out of the code or rename it in a valid way. And also, since we are often using SPARK, the generated bindings should be as SPARK-compatible as possible.
03:02
So, if we use function pointers in C++, we won't get SPARK code, but for a normal class that doesn't use these things, SPARK is possible. So, what are the C++ features that are hard? Object orientation. Okay, we have that in Ada, but we want static and dynamic dispatching to work, so we
03:24
have to look in C++ at whether we have a vtable and whether we use virtual functions, and whether we need to use a tagged type in Ada. Also, yeah, we want to use C++ templates, which is a concept that doesn't exist in Ada.
03:40
So, this is going to be interesting. Also, variadic templates, which means we have a C++ template that has a not-yet-known number of arguments, and you can instantiate it with as many arguments as you want. And we want to automatically call constructors and destructors. So, if you have a record in Ada, you can just take it and it's there.
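The constructor and destructor behaviour that a binding has to reproduce can be sketched in plain C++. The class `Resource` and the counter `live_objects` are hypothetical names used only to make the implicit calls observable.

```cpp
#include <cassert>

// Hypothetical counter so construction and destruction can be observed.
static int live_objects = 0;

class Resource {
public:
    Resource() { ++live_objects; }    // runs when the object is created
    ~Resource() { --live_objects; }   // runs when the object leaves scope
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};

int objects_alive_inside_scope() {
    Resource r;              // constructor is called implicitly here
    assert(live_objects >= 1);
    return live_objects;     // the value is read while r is still alive
}                            // destructor is called implicitly here
```

A plain Ada record has neither of these implicit calls, which is why the generated binding has to arrange for them explicitly.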
04:03
But in C++, the class needs to be constructed by the constructor and also if it goes out of scope, it needs to be destructed and if it goes out of scope in Ada, this doesn't happen so we have to fix this somehow. This is our architecture. So, the upper layer is the data and the lower layer are
04:24
our components. We have the C++ headers and just parse them with libclang because it's easy. That gives us an abstract syntax tree. So, basically, the C++ code is a tree-like structure in memory. We have a converter where we enrich this data.
04:42
So, we have to instantiate the C++ templates on our own because the instances are not in this tree. And if we find a template and we find a template use, we have to create an instance of this template that binds to the actual instance that is generated by GCC when compiling the C++ code.
05:06
With our intermediate representation, where all this data exists, we can then call the generator, which takes the data and just generates Ada snippets, and those snippets are then put together into a complete Ada specification.
05:21
So, how does this look? We have a small class: the namespace and the class, and we have a public member and just the constructor. The generated code is a package A that represents the namespace and a package B that represents the class.
05:42
And since we don't want to have state in our classes but want to have objects, the class object itself is a limited record which then contains the member variable, and the constructor is an auto-generated function that returns this type.
06:02
And the pragma CPP_Constructor is from GNAT; it tells GNAT, okay, this is our constructor, and here we have to supply the constructor symbol from the object file. And then, if we call this function, the C++ constructor gets called and returns us an initialized object.
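The small class described above might look like the following C++ sketch. The member name `value` and its type are assumptions, since the slide is not part of the transcript. The comment block gives only the rough shape of the Ada side as described in the talk (the names `Class` and `Constructor` are illustrative), not actual tool output.

```cpp
#include <cassert>

namespace A {          // becomes an Ada package A
class B {              // becomes a nested Ada package B
public:
    int value;         // public member (name and type are assumptions)
    B();               // constructor, imported in Ada through its linker symbol
};

B::B() : value(0) {}
}  // namespace A

// Rough shape of the generated specification, reconstructed from the talk:
//
//   package A is
//      package B is
//         type Class is limited record
//            Value : Int;
//         end record;
//         function Constructor return Class;
//         pragma CPP_Constructor (Constructor, "<constructor symbol>");
//      end B;
//   end A;
//
// "<constructor symbol>" stands for the mangled name found in the object
// file; under the Itanium C++ ABI, A::B::B() is mangled as _ZN1A1BC1Ev.
```

Because the record is limited, an Ada client cannot copy it by assignment; it obtains an initialized object only through the constructor function.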
06:27
How do we fix the naming problems I mentioned earlier? First we apply Ada casing. I don't know if this is in the standard, but it's good practice.
06:41
So, uppercase at the beginning of each word, with words delimited by underscores. Also, if you have invalid things like leading underscores or double underscores, we just insert an X, and keywords are also prepended with an X.
07:04
Otherwise, if we have unmappable characters like this German umlaut, we replace it with a UTF-8 sequence and hope that nobody else in the C++ code ever used this variable name.
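The renaming rules listed above can be sketched as a small C++ function. This is an illustration under stated assumptions, not Cappulada's implementation: the function name `to_ada_identifier` is hypothetical, the exact placement of the inserted X and the `X_` keyword prefix are guesses, the keyword list is a small sample, and non-ASCII characters such as umlauts are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical sketch of the escaping rules from the talk.
std::string to_ada_identifier(const std::string& cpp_name) {
    assert(!cpp_name.empty());
    // Deliberately incomplete sample of Ada reserved words.
    static const std::set<std::string> ada_keywords = {
        "begin", "end", "type", "package", "record", "access", "is"};

    std::string out;
    std::string lowered;  // Ada is case-insensitive, so compare in lower case
    bool word_start = true;
    for (char c : cpp_name) {
        const unsigned char u = static_cast<unsigned char>(c);
        lowered += static_cast<char>(std::tolower(u));
        if (c == '_') {
            // Leading and doubled underscores are invalid in Ada:
            // insert an X in front of the offending underscore.
            if (out.empty() || out.back() == '_') out += 'X';
            out += '_';
            word_start = true;
        } else {
            // Ada casing: upper-case the first character of each word.
            out += word_start ? static_cast<char>(std::toupper(u)) : c;
            word_start = false;
        }
    }
    // Reserved words get a prefix (the exact form "X_" is an assumption).
    if (ada_keywords.count(lowered) != 0) out = "X_" + out;
    return out;
}
```

With these assumptions, `begin` becomes `X_Begin`, `_foo` becomes `X_Foo`, and `a__b` becomes `A_X_B`. As the talk notes, any such scheme can still collide with a name that already exists in the C++ code.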
07:22
Yeah, so that's a slightly more advanced thing. We have a C++ template. I hope everyone knows what C++ templates are and how they work a little bit. So, they are basically just code that isn't in an object file, but once you instantiate it, this code gets copied
07:45
with integer as type, and then if you instantiate it again, it gets copied again, and those are then two different things. And for Ada, we have to somehow map this. Since we don't have templates, we just use some naming, so the class A gets A and then underscore T
08:04
for template and then underscore int for the integer here and the other thing is package A underscore T and underscore T for B for the instance of B. And then we have two separate things which we can use from Ada and if
08:21
we call something from this package, this instantiation of the template will actually be used. So, what are the challenges? Well, the name escaping: as I said, you can put just about any name into C++, and we have to hope that nothing collides after our renaming.
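The template naming described above (A, then `_T`, then the argument) can be illustrated as follows. The second instantiation with `char` and the helper `instance_package_name` are hypothetical; only the `A_T_Int` pattern comes from the talk.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <type_traits>

template <typename T>
class A {
public:
    T value;
};

// Each instantiation copies the template code for one concrete type.
template class A<int>;
template class A<char>;  // hypothetical second instance

// The two instances are unrelated types, so each needs its own Ada package.
static_assert(!std::is_same<A<int>, A<char>>::value,
              "each instantiation is its own type");

// Hypothetical helper: derive the package name for one instance,
// e.g. ("A", "int") -> "A_T_Int".
std::string instance_package_name(const std::string& tmpl, std::string arg) {
    assert(!arg.empty());
    arg[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(arg[0])));
    return tmpl + "_T_" + arg;
}
```

An Ada client would then use the package for `A<int>` and the package for `A<char>` as two separate, unrelated units.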
08:44
That's because the C++ identifier rules are a complete superset of the Ada identifier rules, so you can do just about anything that isn't allowed in Ada. Also, real C++ generics. The problem is that C++ generics, aka templates, are a compile-time construct, while Ada generics are a runtime construct.
09:07
So, if you wanted to use templates as you use them in C++, you would have to add some preprocessor to Ada that does that, but I also use Ada because it doesn't have a preprocessor.
09:21
And using Ada generics to do things that are usually done at compile time isn't really working, so you can't use Ada generics to map to templates. And we have circular dependencies in C++ code, so you can generate a forward declaration, use it, and then declare something else.
09:46
I have an example for this. So, you have a forward declaration class X. This just tells, okay, we have a class X that is something. And then we have a class Y that uses a reference to this class X. And then we have the class X defined that again uses the class Y.
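The circular dependency just described can be written out in C++ as below. Member names are assumptions; the structure (forward declaration of X, Y holding a reference to X, X containing a Y) follows the talk.

```cpp
#include <cassert>

class X;  // forward declaration: "there is a class X", details follow later

class Y {
public:
    explicit Y(X& x) : x_ref(x) {}
    X& x_ref;  // a reference only needs the forward declaration of X
};

class X {
public:
    X() : y(*this) {}
    Y y;       // X contains a complete Y object, so Y must be defined first
};

// On the Ada side, as explained in the talk, the package for Y would name
// the X package in a "limited with" clause and use only an access type to
// X's class type, while the package for X can "with" Y normally and use Y
// as a regular object.
```

The asymmetry matters: only the side that holds a reference can be expressed with the limited view; two classes containing each other by value would not compile in C++ either.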
10:04
So, we have a circular dependency because both these classes depend on each other. In this case, we only have a reference, so we can fix this with a so-called limited with. Those are two files, so imagine a line here which divides those files.
10:25
And this limited with gives us in Ada the possibility to do some kind of circular dependency. So, we can define an access type of X.class which is defined in the X
10:41
package, but we can only define an access type, so we can't use X.class directly. But we have our reference that is used here. And then we can just use Y in the X package and use it regularly as an object. So, the conclusion of this, we can map complex C++ scenarios in Ada. Also, advanced features such as templates can be used to some extent.
11:11
We think that this is sufficient for real-world use in most cases, but some features are just not doable or really, really, really hard to do.
11:23
So, yeah, it's still under heavy development. There are quite some bugs we need to fix, and, for instance, we currently don't have array support, which is quite an important feature that we need to add. But we support namespaces, classes, templates, and typedefs, and you can find the project
11:43
on GitHub if you want to help us in doing all these other hard things. And now I have a small demo for you. Yeah, you can see this.
12:01
So, we have a simple C++ library called number that is just a template of the class number, and it has a value. Is this large enough or should I maybe make it larger?
12:27
And you have a function add that adds something to the value and a function value that returns the value. And now we want to use this in Ada. So, since this is a dummy library and there's only this header file, we need to trick the
12:49
GCC to instantiate our template because we need the compiled code at the end to link against Ada. So, this is a dummy class that just tricks GCC into actually creating an object
13:03
that contains all these functions; otherwise they would just not exist. Yeah, and we need to create an object, so we have to instantiate this once in a .cc file so we get something to link against.
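A header-only template like the demo library produces no object code until it is instantiated. The sketch below reconstructs the described class (the exact names and signatures are assumptions) and uses a standard C++ explicit instantiation to force the code into an object file. This is a swapped-in technique: the talk uses a dummy class for the same purpose.

```cpp
#include <cassert>

// Reconstruction of the demo library as described: a class template
// holding a value, with add() and value(). Names are assumptions.
template <typename T>
class Number {
public:
    explicit Number(T v) : value_(v) {}
    void add(T other) { value_ += other; }  // adds something to the value
    T value() const { return value_; }      // returns the value
private:
    T value_;  // private in C++, but still part of the memory layout
};

// Explicit instantiation: asks the compiler to emit the constructor, add()
// and value() for Number<int> into this translation unit, so that there
// are real symbols to link against from Ada.
template class Number<int>;
```

For example, constructing `Number<int>` with 5 and calling `add(4)` gives a `value()` of 9, matching the "five and four" run in the demo.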
13:24
This is something you probably don't need if you use a real-world library. And yeah, let's create our example. So, we call it a number and we name it example, which is just the top-level package name it gets.
13:48
And now we have our example.ads, which uses Interfaces.C for obvious reasons. It uses SPARK, and we have some type renames and more type renames.
14:01
And there's our package number which is a template instance of int. We have our private value here. This happens because we don't want to alter values in Ada that are private in C++, so we want to somehow keep this mapping.
14:22
But we still need to have this value in the record, otherwise the C++, can you see the cursor? Yes. Otherwise the C++, the GCC will create a wrong memory layout so we have to keep this.
14:40
And we have the constructor with a symbol and we have an add function with a symbol. Let's scroll a little bit. And we have the add function and the value function and there's our dummy class starting but this isn't really interesting. And now we have our example program.
15:07
So this is just boilerplate code that reads command line arguments and here we use it. So we say okay, A is an example number g int class and it is created by the constructor with the value of x.
15:25
And then we add the second argument, the y to this object and then we just return its value. So and here we put out the value and if I compile this I get a little
15:44
program called add which will just add two numbers and like five and four and we get nine. So yeah, that's it.
16:02
So thank you, that's, yeah, I'm finished. Any questions? Yeah, well, one technical, maybe two, so I'll start with perhaps the more straightforward one. So you are mapping, I noticed you are mapping C++ classes to packages, not like
16:21
because you could technically try to go with type and then just make a mapping like that. Is that because class in C++ also provides a namespace and you want to have this? Yes. Because otherwise you could have different mapping. Yes, because the add function for example in C++ you call this add function with class name or with object name dot add.
16:47
So you want to keep that mapping? Yeah, because if you don't do that and you have two classes that implement add, you would collide in Ada if you don't encapsulate this. In principle it's solvable, but that I agree with.
17:04
The generics, I mean templates and generics are not the same, sure, but wouldn't it be, I mean you can, as far as I can see, implementing templates as generics as like generic package could be actually even simpler.
17:22
Going the other way around would be way more complex. Well, the problem is templates are compile time and Ada generics are run time. Still you can emulate behavior. I thought about the thing. Because it kind of logically expected your two pieces of code, like template, that
17:42
you would make generic package and then instantiation of this package with this specific type. Because that's the main use, the intended use on the other side and intended use on the C side. It's true that then people abuse this stuff, like horribly abuse it, especially in Ada, to make proper multiple inheritance.
18:04
In other, the generics. That's not what they were designed for, I don't understand. But still it seems to me kind of the nature of mapping. Yeah, that's true, but due to the technical differences it's really hard to map that correctly because you have to move something that is usually happening in run time to the compile
18:24
time and then you lose the possibility, for example, to instantiate Ada generics with run time variables. That is not going to work with C++ templates at all. Yeah, but you are generating Ada code for C. I agree that if you were...
18:42
No, no, we are only generating the bindings. Yeah, the bindings. That's what I mean. If your task was the opposite, that would be a severe limitation. You cannot, with C++ templates, really completely emulate Ada generics. But it seems to me that Ada generics should be able to give you most of the templates of C++.
19:01
Can we talk after? Okay, yes. What's the most complex or the most C++ project that you have to run? We are building this to use Ada in the Genode OS framework, which is an operating system framework built on C++ and uses all the extensive templating and stuff.
19:23
I would say this is quite complex. The Genode guys are sitting right there, so you can ask them about the Genode project. Yeah, I would say this is the most complex thing we are currently doing. It's not working completely, but some things are already mappable.
19:41
You had a question? Okay. Qt is something else because they have their own preprocessor stuff.
20:10
The problem is, in C++, if you create just a class, as you have seen here, it doesn't have a vtable, which means the memory mapping starts at zero.
20:31
And if you add a tag, the memory mapping starts at byte eight. The same is the other way around.
20:41
If you use a vtable, for example due to a virtual function, the memory mapping in C++ starts at byte eight, which means that C++ classes that use virtual functions are tagged types in Ada. And those that are not using virtual functions, or are not derived from classes with virtual functions, are not using tagged types in Ada.
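The layout difference described in this answer can be checked with a short C++ sketch. The struct names and the helper `value_offset` are hypothetical, and the exact offset is ABI-dependent: on common 64-bit ABIs the vtable pointer occupies the first 8 bytes, but the C++ standard does not guarantee this.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Plain {
    int value;                    // no virtual functions: no vtable pointer
};

struct WithVirtual {
    virtual ~WithVirtual() {}     // a virtual function adds a vtable pointer
    int value;
};

// Hypothetical helper: byte distance from the start of the object to its
// 'value' member.
template <typename T>
std::ptrdiff_t value_offset(const T& obj) {
    return reinterpret_cast<const char*>(&obj.value) -
           reinterpret_cast<const char*>(&obj);
}

static_assert(!std::is_polymorphic<Plain>::value, "maps to an untagged record");
static_assert(std::is_polymorphic<WithVirtual>::value, "maps to a tagged type");
```

For `Plain` the offset is 0; for `WithVirtual` it is greater than 0 on the usual ABIs, which is the shift that an Ada tag also introduces. That is why the two kinds of class map to untagged and tagged types respectively.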
21:09
Yes. Yes. Small detail, why didn't you use the Interfaces.C package? It seems you map int to
21:24
integer, while in Ada you have a package especially intended for that mapping called Interfaces.C. We use Interfaces.C for the types. We just rename them. The C++ int was mapped to Ada's Integer, not Ada's Interfaces.C.int.
21:50
Here it is mapped, we convert it to our example.int, which is a subtype of interfaces.c.
22:05
The point is that we rename all these types because we want to keep our own namespace. If you generate bindings for a project, everything that comes from here is in its own namespace, so we rename it.
22:34
I've done an Ada function that returns this struct and got this struct in a C++ binding, so the other way around, and this worked.
22:46
I can't say why it shouldn't work. In theory, you should just be able to put a struct in the function. Can you add pragma convention to the structure? Yes. So if I go down here, convention C++, yes.
23:17
C++ is quite weird. Don't you fear that you import the weirdness of C++ into Ada?
23:25
To some extent, yes. On the other hand, I'm not able to rewrite in Ada everything that I use that is currently written in C++. I'm probably better off importing some things and saving some time. Thank you.