The C2 programming language
Formal Metadata
Number of Parts | 90 | |
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/40329 (DOI) | |
Transcript: English(auto-generated)
00:00
Hi, welcome, thank you for coming. My name is Bas van den Berg, and I want to talk about a new language I'm working on. It's called C2. It's a variant of C, hence C2. The purpose of this talk is to inform other developers of this new language, and
00:20
maybe get some feedback or ideas, hopefully even patches, that would be best, but we'll see. So this project started in June last year. I program a lot of C, and I was wondering, well, I like it a lot, but it has some things that I'm not so happy about.
00:45
So probably there are a lot of people here who program C, and maybe like it, as I do. So I thought, a lot of open source is made in C, but development in C is not as fast as I would like, so maybe we can improve that somehow.
01:04
So the question is, how can we make it better and more productive to program in C, and more fun? So let's look at the points, let's analyze the C language, not C2, just the old C.
01:22
The code base is huge, there is lots of open source code, millions of lines. I don't know how many C developers there are in the world, maybe also a million. It has good runtime performance, and for me the most important thing: the abstraction domain is very nice, it's very good at what it does.
01:43
There are lots of other languages that do other things, generating websites or applications or widgets, but for some things C is still unbeaten, like kernels, drivers, low-level code. I really wanted to keep that.
02:02
There were also some things that I felt were not so good. C is probably even older than me, and every day I learn new things about C, even though I have been programming in it for 15 years already. Last week I found out this is possible, you can just swap the buffer and the index,
02:23
it doesn't matter. It surprised me, I've been programming this language for 15 years. And the next one, that's some type definition. You probably need a Nobel Prize to read this. One other thing that bothered me is that you always need a lot of tools to just build
02:44
a C program. A normal C project has multiple C files, so you need something like a makefile to build it, not just a compiler. And the preprocessor in the compiler is heavily used, or abused.
03:00
The thing that I found most bothersome is that I had to type a lot: forward declarations and header files, while languages like Java don't have that. So why can't we make a C compiler that just understands that? So those were the main points I'm trying to address with C2.
03:25
So in this presentation, I will show you some core aspects of C2 with the mandatory Hello World, of course, and three concepts. This is the Hello World in C2.
03:44
It's almost the same as a normal Hello World in C. There are four differences, I'll briefly go over them. Every file in C2 starts with this declaration package. Every file belongs to a package, and a package can consist of multiple files if they just
04:01
have the same package statement. The second one is, we don't have includes. They're allowed, but they're not used. Instead we have the use statement that does the same thing, but it doesn't copy-paste the entire stdio.h into your file.
04:23
So when the C2 compiler compiles it, it reads like eight lines. It doesn't read 800 lines. The third thing is the function keyword. I found it very nice to prefix all the functions with func.
04:40
All the types are also prefixed with the keyword type. The last thing is the change to the type system, that types are always contiguous. So in C you have the one part here, and the other parts behind the variable name. So in C2 it's always a contiguous string.
05:04
Then the first concept, it's the multipass. There are three things here. It's a function main, it's a function getNumber, and a type. This is the typedef. The order of these three doesn't matter. You can do them in any order, the compiler will just figure it out.
05:23
So you see the type is used here, and it's defined here. This function is defined here, and used here, so that doesn't matter. So you don't need forward declarations, and they're even forbidden. They'll just give compile errors.
05:40
The second concept you need when you don't have includes is packages. We saw them briefly. So this is a file that uses the package utils. And these two files are inside the package utils. And public is also a new keyword. One defines the type Buffer, and the other one is a function here.
06:01
For this, the calling side, it doesn't know where they are. It just knows they're in some package utils. If you move these files around, it doesn't matter. So you can just use this one, the type, or you can use the log function. You can use the prefix, like the C++ namespace operator.
06:24
You can also leave it out if you don't want to type too much. As long as there's no conflict, and there might be other packages you use here that also have a log statement, then it's okay.
06:42
If there are conflicts, you have to use this. The third concept is the recipe. When you don't have include statements where you specify the file, but
07:01
have a use statement, like here, the compiler needs a search space. So the compiler needs to know where to look for these files. So for every C2 project, the compiler compiles a target, an executable.
07:20
It doesn't compile the separate files individually. So normally in Make, you just call GCC for this file, and GCC for that file, with a whole long string of defines behind it. That's not the case anymore. You just run the C2 compiler, and it will look up the recipe file,
07:41
and it looks like this, and it will start running these things. And the funny thing I found is that this looks like a CMake file, if you're familiar with that, only it's inside the compiler. So here we have an executable, it's called example1, it has two files.
08:02
And you can build it. The second one is a library, it has some defines. So this is like -D (minus capital D) on the compiler command line, for people who know that. And they're used when compiling these files. There's also an export function, or entry,
08:22
that dictates to the compiler which symbols of these files to export outside. So they're visible, and you can use them if you're linking to this library. So instead of just using the language to do one thing, and then using a linker script to modify symbols, which is quite hard,
08:42
you can just specify that in the language. And that's the basic thing about C2, it's more holistic than C. C, you just have one thing, you just compile one file into a .o file, and that's it.
09:00
C2 tries to do more than that, it does the whole flow. And my to-do list is quite long, especially with tooling, because I want to integrate those tools inside C2. Like astyle sort of stuff goes into the compiler tool set.
09:28
Now the current state, C2 is, if you look at the file here, the body of the method is almost the same, it's the same as C.
09:42
So it's still quite a complex language to parse. And I knew the LLVM project and the Clang compiler that comes with it. So I thought, well, this is almost the same, I just hack in the package keyword and the use statement, and that's easy, and we reuse the other stuff.
10:00
Well, that proved to be quite difficult, because when Clang parses this, it tries to, yeah, it already knows that there is a printf function because it's in here, while C2 just parses it and then analyzes it afterwards.
10:20
So the whole code base of, well, the analyzing and the syntax tree code base for Clang was unusable. So we are using LLVM to generate, well, we generate IR code, LLVM IR code so that we can use the backend to generate the assembly stuff.
10:43
We did use some Clang components, especially the preprocessor. We modified it slightly. We're also using the diagnostic engine, which is really nice. But the whole middle part, the analyzer, the parser, the semantic analyzer, we stripped it out.
11:01
And replaced them with our own version. So currently, parsing into the AST, the abstract syntax tree, is quite far along. So we understand what the language wants to do. We know, hey, this is a function call, and this is a return statement, and so on. The analysis part is just starting, so
11:20
it will work as long as you don't make any mistakes. But if you do, we don't analyze it, we just generate faulty code. Another thing is that, because there is such a huge C code base, I wanted to be able to generate C code from C2 files. So that's also in, and that's quite far along, because we don't need to analyze.
11:41
We can just convert it and let the C compiler sort it out. That's no problem. And the IR code generation has just started. The hello world will work, and it will compile, but that's about it. So it's quite a lot of work to mix all the symbols,
12:01
because if you look at this one, when we have this type, so first we parse all the files. And then, so because the C2 compiler always compiles the whole project, so it parses all the files and then starts analyzing all the files.
12:23
So it knows it has some function here. It knows when it's parsing this one, it will just note, hey, I have a statement here. It appears to be something here that's probably a type, something here that's probably a name. Let's go on. And then when analyzing, it will look at the global symbol table and say,
12:42
hey, buffer, that should be a type. Yeah, I got one, and it's public, so that's okay. And it will go on and on until it's done. If a symbol like a type or a function or a variable doesn't have the public keyword, its scope is within the package.
13:03
So this file could use, if there's no public here, it could be used from within this utils file, because it's in the same package, but otherwise it can't be. And for optimization, we use the same optimization as C static keyword. It would be as if this is in one single C file and
13:24
all the non-public functions are prefixed with static. So they have a really local scope, and the compiler knows, hey, this is only used in here, so I can compile it or optimize it or inline it.
13:41
How am I doing on time? Four minutes. Four minutes left, okay. So another thing is because we have this independence between the order, we can just create a refactor tool that's on the agenda.
14:05
To analyze this, it will get three entries, and you can just drag them to reorder them. And you can even drag them to another file, say, hey, I want this one. I want to merge this with this one. That's done. And as you can see, this code, it doesn't have to be modified that way.
14:23
You can just do that. You can also call a sort function there, so I want to sort them by type or by public or non-public, and you just reorder it and that's done. You have to recompile. The tricky thing currently is that I put the scope at 1,000 files.
14:44
When I compile a kernel, I see 1,000 .o files, so that's about the largest thing I can get. So I have to parse 1,000 files at the same time. So that requires 1,000 parsers, 1,000 analyzers, 1,000 preprocessors. So that's quite a lot of memory.
15:00
Well, that's not really an issue here, but it might be on some, if you have really huge code. So we have to look at saving the intermediate results into some cache file and then continuing, and that's currently also a work in progress. But for now, we use a single thread to just parse all the files and save them.
15:32
So we have a website. There's not much on it, except a bigger document with a lot of explanations about the language. And we have a GitHub account.
15:40
There are two repositories. One is the C2 compiler. The other one is a modified Clang archive. So thank you for coming, and if you have any questions.
16:01
Yes, two questions. How would you handle scoping for macros and constants? And I would take errno as an example when it comes to modules. And the second one is, you showed a pretty horrifying signature in C. Could you show what it would become in C2? Can you repeat the first question? Sure. So how would you handle scoping for macros and constants?
16:21
Macros and constants. Macros are a pain in the ass. Because normally the preprocessor is abused to just inline code there. So a macro will work. So there are three types of macros. One is just a feature switch: I want to build with OpenGL or without.
16:42
You check those with ifdef. Those are currently in the file here. Constants you can replace with a const int. That will work like C++ instead of C. The third one is a macro like MAX(a, b). It's more like a replacement.
17:01
Those can only be done with include statements. So what you have to do, you can include a file as well, but you can only include it with certain information. So if you have a file in there, or some forward declaration, it will not work because you only can specify the macro in there.
17:20
Thank you. And for the signature? The signature of what? You had a horrifying signature in C. Of a function, for example. Yes. Would it be really different in C2? Because the reason it looks horrible is that it's pretty complex. Yeah, I don't think it will change a lot.
17:41
Except for the types, the types are easier. You cannot define a, I don't really have an example here. You can replace this with struct and then define a struct as possible. You can, but you have to do it in several steps. You have, first say I define a type function, and it's this.
18:01
And then I point it to that.