A deep dive inside the Rust frontend for GCC

Formal Metadata

Title
A deep dive inside the Rust frontend for GCC
Title of Series
Number of Parts
542
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Started in 2014, the gccrs project is working toward creating an alternative compiler implementation for the Rust programming language. At the moment, the project targets the 1.49 version of the language and hopes to catch up once that milestone is reached. In this talk, we will explore some of the components inside gccrs, as well as dive into some of the hurdles encountered during the project's lifetime. Finally, we will explore ways to cross-pollinate with the Rust community, in order to help and benefit both projects. Specifically, we will dive into some ways we plan to share components with rustc, and how to achieve that: namely, we will look at how we plan on integrating the Polonius project to perform borrow-checking inside gccrs, what our efforts with running the rustc 1.49 testsuite are, and what we need to achieve to start being useful to the Rust-for-Linux project.
Transcript: English (auto-generated)
Okay. Hello, everyone. Can you hear me okay? Good. Okay. My name's Arthur Cohen. I'm a compiler engineer at Embecosm, top left, and today I'm going to talk to you
a little bit about Rust GCC. So first of all, a little summary. We're going to get into what is GCC, because this is a Rust dev room. I assume at least some of you have never used GCC, which is good for you.
It's good for your health. Then what is Rust GCC? So why do we make it? Why... I mean, working on it. Who was stupid enough to even think about reimplementing a Rust compiler from scratch? Then how do we do that? So I'm going to get into some of the steps of our compilers, our parser, our intermediate
representation, and all of the extra fun Rust stuff that we have to handle, because it's a really complex language. Then I'd like to get into our workflow, the community. So all of the contributors, how we work together, our merging process, GitHub, all of that, and all that interesting stuff that comes with it.
And finally, some sort of future questions. What are we going to do? What are our goals? When are we going to stop? And so on. Okay. So what is GCC? GCC stands for the GNU Compiler Collection. It's sort of a very, very big program that contains multiple compilers from multiple
languages and that all share the same backend. So the same sort of assembly, emission, and optimizers, and so on and so on. One fun thing about GCC is that it's very old. It's 30 years old. Maybe more.
It's written in C++11, so that's great. As I say, it's multiple languages in one. So you've got a C compiler, a C++ compiler, Fortran compiler, so on and so on, and we're trying to add Rust to it. And if you know a little bit about how rustc works, rustc is called a frontend.
It sort of does its thing and then talks to LLVM to generate code. And that's what's good about LLVM is you can use it as a library. You cannot do that with GCC. So you have libgccjit, which is sort of an attempt at having a library for GCC, which is quite recent.
Or you can do like Rust GCC does, which is create the compiler in tree. If you've been following sort of the Rust in GCC story, you'll know that rustc_codegen_gcc, the project by Antoni, actually uses libgccjit. And that's a way better idea than Rust GCC. But let's keep going.
So what is Rust GCC? It's a full implementation of Rust on top of the GNU tool chain. So as I said earlier, this means that we're actually reimplementing the compiler from scratch. So we started from sort of nothing and kept adding and adding stuff until today.
The project was originally started in 2014. So just for one quick bit, I think at the time libgccjit did not exist. So it's not as bad an idea as it seems to add it to the GCC tree.
And originally in 2014, if you know a bit about the history of Rust, you didn't have a stable version yet. Rust version 1.0 released in 2015. So that meant that in 2014, there was a lot of churn within the language. If some of you were here at the beginning, you remember maybe the tilde pointer, the @ symbol
that was used for a lot of stuff, the garbage collector, and so on and so on. So eventually, the project had to drop. Because even though he was very, very into it, one developer could not just keep up. It was revived in 2019, thanks to multiple people. First of all, Open Source Security, and then
Embecosm, who are the two companies sponsoring this project. It receives contributions from many GCC and non-GCC developers. So I'm going to talk about that a bit later. But we do have some people that have been working on GCC for a very long time helping us. And I'd like to thank them, but more on that later.
So why do we do that? The goal is to upstream it with mainline GCC. So that means that whenever you're going to put your favorite Linux distribution, install GCC, you're going to have GCC Rust with it. It's an alternative implementation of Rust. We hope that it helps maybe draw out and sort of drive
the specification effort, and that we can help the rustc team figure out some pieces where the language isn't as clear as they'd like it to be. It reuses the GNU toolchain, so GNU ld, GNU as, GDB. But it does reuse the official Rust standard library,
so libcore, libstd, and so on. And because of the way GCC is sort of architectured, once you get to that common GCC backend and common GCC intermediate representation, you can basically reuse all of the plugins that have been written for GCC ever.
And that means a lot and a lot and a lot of plugins, security plugins, stuff like the static analyzers, so you might have heard about that, the LTO, which is not really a plugin, but we can make use of it, CFE, CFI security plugins, and so on.
We also hope that, because we're writing it in C++, that means we can backport it to previous versions of GCC. And hopefully, that will help some systems get Rust. And then, because GCC, as I said, is much older than LLVM, it has support for more architectures and more targets
than LLVM. It had, now, you guys have the M1 Mac, and we're still far in that. So technically, thanks to GCCRS, you'll now be able to run Rust on your favorite Soviet satellite and so on. And there's a link for that.
The slides are on the talks page, and there's a lot of frequently asked questions. So that's sort of the milestone tab that we put together in each and every one of our weekly and monthly reports. And the takeaway from here is that the effort
has been ongoing since 2020, and even a little bit beforehand, and we've done a lot of effort and a lot of progress. Right now, we're around there. So we have upstreamed the first version of GCC Rust within GCC. So next time, when you install GCC 13,
so sorry for the people on Ubuntu. That's in like 10 years. But next time you update GCC, you'll have GCCRS in it. You can use it. You can start hacking on it. You can please report issues when it inevitably crashes and dies horribly. And yeah, we're sending more and more patches upstream and getting more and more of our compiler,
whose development happens on GitHub, towards and into GCC. So currently, what we're working on is we have a base for const generics. So I'm not going to get into details on that. Just a cool feature of Rust that's
not present in a lot of languages, except C++, and we're getting them working. We're working hard on intrinsics. So those are functions declared in the standard library but implemented by the compiler. They are very LLVM dependent, and we're running into some issues doing the translation.
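To make the const generics mention concrete, here is a minimal sketch in Rust. The names `sum_array` and `Buffer` are made up for illustration and do not come from the talk or from gccrs. Note that this feature was stabilized in Rust 1.51, so it is newer than the 1.49 target mentioned in the talk.

```rust
// Hypothetical helper: the array length N is a compile-time parameter.
fn sum_array<const N: usize>(values: [i32; N]) -> i32 {
    let mut total = 0;
    for v in values.iter() {
        total += *v;
    }
    total
}

// Hypothetical type that carries its size as part of the type itself.
struct Buffer<const N: usize> {
    data: [u8; N],
}

impl<const N: usize> Buffer<N> {
    fn new() -> Self {
        Buffer { data: [0; N] }
    }

    fn capacity(&self) -> usize {
        self.data.len()
    }
}

fn main() {
    println!("{}", sum_array([1, 2, 3]));
    println!("{}", Buffer::<16>::new().capacity());
}
```

A compiler has to treat `Buffer<16>` and `Buffer<32>` as distinct types, which is why the feature touches both the type checker and code generation.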
One big thing we're doing is some work towards running the Rust test suite. So because we want GCCRS to be an actual Rust compiler and not a toy project or something that compiles a language that looks like Rust but isn't Rust, we're striving to, I mean, we're trying really hard to get that test suite working.
And we're almost, I think, almost done with compiling an earlier version of libcore, so 1.49, which was released a few years ago. So a quick overview of our pipeline. Basically, for a Rust compiler, if you don't know anything about compilers, that's fine. What you're going to do is you're going to do a parsing
step. So you're going to take the Rust code, and you're going to turn it into a data structure, which is sort of a tree, which is called an abstract syntax tree, AST. Then we're going to run an expansion on that. So any time we're going to see a macro, we're going to expand it and then replace it by its expansion. Name resolution, that's basically
taking any use and linking it to its definition, and so on. We're going to do some more transformations on that AST, and then finally type check it. And then we can do a lot of error verifications, linting, so stuff like the warnings you get when you have an unused value and that you can prefix it
with an underscore, for example. Finally, when that's done, we lower it to the GCC intermediate representation. So that's sort of similar to the last step of rustc, where it gets lowered to LLVM IR. So as I said, we have an AST. We have an HIR. The advantage of having these two high-level data structures
to represent Rust code is that we can desugar the AST. So remove the syntactic sugar that you have in Rust source code to have a simpler representation within the compiler. So one example, for example, is that the difference, as you know, between methods and function calls
is you got like self dot method. But within the compiler, it doesn't make any difference. A method is just a function call with an extra argument. So that's how we represent them in the HIR. And we sort of do these other transformations, such as removing macros, because at this point, they've already been expanded, and we
don't care about them anymore. And finally, as I said, the last intermediate representation is called GENERIC. And it's not generic at all. It's just the name. And it's the GCC intermediate representation. So one thing I'd like to get into is macro expansion. And the reason I want to get into that is because, I mean,
I wrote most of it in GCCRS. So I'm the one you have to blame if it stops working when you try GCCRS. So as you know, macros in Rust are typed. So you can have expressions, statements, path, and so on. And someone has to do that checking. And so that's part of the macro expansion part.
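As a small illustration of what "typed" means here, the sketch below uses three fragment specifiers from `macro_rules!`. The macro and function names are invented for this example; only the specifiers `expr`, `ty` and `path` are part of the language.

```rust
// `$e:expr` only matches a complete expression, kept as one unit.
macro_rules! double {
    ($e:expr) => {
        $e * 2
    };
}

// `$t:ty` only matches a type.
macro_rules! zero_of {
    ($t:ty) => {
        <$t as Default>::default()
    };
}

// `$p:path` only matches a path such as `answer` or `some::module::item`.
macro_rules! call_path {
    ($p:path) => {
        $p()
    };
}

fn answer() -> i32 {
    42
}

fn main() {
    // The expression fragment is not re-split: this is (1 + 2) * 2.
    println!("{}", double!(1 + 2));
    println!("{}", zero_of!(u8));
    println!("{}", call_path!(answer));
}
```

Passing a type where an expression is expected, for example `double!(u8)`, is rejected during expansion, and that is the checking the expander has to perform.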
And as I said, macros are sort of like function calls. You just expand them, and then you paste the AST that was generated, and you're done. And actually, in Rust, you got repetitions in your macro. And that's extremely annoying to take care of. So repetitions, if you've ever written them,
they're unreadable, but they're very useful. You have sort of these operators, which are the Kleene star, question mark, and plus sign, which allow you to specify, well, I want between 0 and infinity of something, at most one of something, one or more of something. And because Rust is a very well thought out language,
it's actually got ambiguity restrictions to make sure that no matter how the language evolves, your macro is not suddenly going to become ambiguous. And so again, someone has to do that checking and make sure that your macro is not ambiguous. So that's me. So here, this is probably like a very basic macro
that you've maybe written or used or whatever. It's a macro that does an addition, and that takes any number of arguments. You can see in green, I've highlighted the repetition sort of operator, marker, thingy. And it basically expands to E plus adding
the rest of the expression. So that's a macro to make tuples. So basically, you're going to give it a list of arguments on the left, a list of arguments on the right, and it's going to make a list of tuples. The thing I like to point out here is that whenever you don't have the same number of arguments, if you're
merging repetitions together, it's actually going to, well, it's going to go bad, and you have to check that. And again, on really complex macros, making sure that your merged fragments are actually the same number of repetitions and so on, it gets very hard and very tedious. And Rust macros are sort of a language
within the language that needs to be taken care of. And as just one last example on how fun Rust macros are, for the ambiguity restriction, for example, you can't have a keyword after an expression because that keyword might become a reserved keyword, might be another expression.
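The slides are not part of this transcript, so the following is only a reconstruction of what the addition and tuple macros described above might look like. The names `add` and `zip_tuples`, and the `;` used to separate the two argument lists, are assumptions.

```rust
// Addition over any number of arguments, using the `*` repetition
// (zero or more). The empty case expands to 0.
macro_rules! add {
    () => { 0 };
    ($e:expr $(, $rest:expr)*) => {
        $e + add!($($rest),*)
    };
}

// Pairs up two lists. `$a` and `$b` are merged inside one repetition,
// so both lists must repeat the same number of times; otherwise the
// expansion is rejected at compile time.
macro_rules! zip_tuples {
    ($($a:expr),* ; $($b:expr),*) => {
        [$(($a, $b)),*]
    };
}

fn main() {
    println!("{}", add!(1, 2, 3));
    println!("{:?}", zip_tuples!(1, 2; "a", "b"));
    // zip_tuples!(1, 2; "a") would not compile: the repetition counts differ.
}
```

The `,` after `$e:expr` and the `;` after the first list are accepted because both tokens belong to the small set that may follow an expression fragment, which is the ambiguity restriction the talk refers to.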
A lot of good reasons for why it's an ambiguity. And the thing here is if you look at the second sort of matching, second matcher in that macro, you can see that the operator means it's going to appear between zero and one time. For the third matcher, it's going to happen,
it's going to appear between zero and plus infinity times. Same for the fourth matcher. So the macro sort of checker has to move forward and make sure that in the case where two doesn't appear, three doesn't appear, and four doesn't appear, the thing after that is allowed in the set of restrictions.
In that case, it's not because, well, it's the same as above, so we have to error out. It gets really annoying. And there's more checks that are Rust specific that we can't really copy paste from the other languages in GCC. So for example, you got privacy in Rust. So you know how you mark your functions as public,
or just leave them as private. But you got fun privacy. So you can have a function that's public in a path, so in a module, but not in another one. You can have a function that's public for your parent module, but not anymore. You can have a function that's public for the entire crate, but not for users of that crate. And yeah, lots of stuff.
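The visibility levels listed above can be sketched as follows. The module and function names are invented for this example.

```rust
mod outer {
    pub mod middle {
        pub mod inner {
            // Public inside `outer` and its children, but not outside of it.
            pub(in super::super) fn visible_in_outer() -> u32 {
                1
            }

            // Public for the parent module `middle` only.
            pub(super) fn visible_in_parent() -> u32 {
                2
            }

            // Public for the whole crate, hidden from users of the crate.
            pub(crate) fn visible_in_crate() -> u32 {
                3
            }
        }

        pub fn from_middle() -> u32 {
            inner::visible_in_parent() + inner::visible_in_outer()
        }
    }

    pub fn from_outer() -> u32 {
        // Calling `middle::inner::visible_in_parent()` here would be rejected.
        middle::inner::visible_in_outer() + middle::from_middle()
    }
}

pub fn from_crate_root() -> u32 {
    outer::from_outer() + outer::middle::inner::visible_in_crate()
}

fn main() {
    println!("{}", from_crate_root());
}
```

Each call site has to be checked against the restriction on the item it names, which is a separate pass from parsing or type checking.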
Same, you've probably come across unsafe. So unsafe is a keyword that sort of unlocks superpowers and segfaults, and basically at the language level, it's just a keyword. So whether we're dereferencing a raw pointer or an actual safe pointer like box,
it doesn't matter to the parser or the AST. But we have to go afterwards in the HIR on that type-checked representation and make sure that what we're dereferencing, well, if we're dereferencing something of type raw pointer, it can only happen in an unsafe context.
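A minimal sketch of the distinction: both functions below use the same `*` syntax, and only the type of the operand decides whether an `unsafe` context is required. The function names are made up for illustration.

```rust
// Dereferencing a `Box` is a safe operation.
fn read_via_box(value: i32) -> i32 {
    let boxed = Box::new(value);
    *boxed
}

// Dereferencing a raw pointer is only accepted inside `unsafe`.
fn read_via_raw(value: i32) -> i32 {
    let ptr: *const i32 = &value;
    // Without the `unsafe` block the compiler rejects `*ptr`.
    // The pointer is valid here because `value` is still alive.
    unsafe { *ptr }
}

fn main() {
    println!("{}", read_via_box(7));
    println!("{}", read_via_raw(7));
}
```

Because the parser sees the same dereference expression in both cases, the check can only run once types are known.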
Finally, macros are lazy. So if you're from Haskell, you know what that means. It means basically you're gonna expand them as they go before expanding the arguments given to them. The fact is, macros are not lazy because you got some built-in macros that need to be expanded eagerly.
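The difference can be observed with two built-in macros. This is only an illustration of the behaviour described above, with function names chosen for the example: `stringify!` leaves a macro call in its argument untouched, while `concat!` needs the nested call expanded first to obtain a literal.

```rust
// The inner `concat!` call is never expanded: it is turned into text as-is.
fn unexpanded() -> &'static str {
    stringify!(concat!("a", "b"))
}

// `concat!` only accepts literals, so the nested `stringify!` call
// has to be expanded before `concat!` itself can produce its result.
fn eager() -> &'static str {
    concat!("hello, ", stringify!(world), "!")
}

fn main() {
    println!("{}", unexpanded());
    println!("{}", eager());
}
```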
And so when you just spent like three months rewriting the expansion system to make sure that they're expanded lazily, and you realize that built-in macros need to be expanded eagerly, well, it gets really annoying. Finally, code sharing between crates. So if you've had the misfortune of writing C or C++, you know you have to write headers, basically declaring your generic functions,
your public functions, and so on. How do you do that in Rust? The answer is you don't, the compiler does it for you. And basically what it's doing is it's putting some metadata magic in the ELF format, so the object file. And it's gonna encode and serialize all of your exported macros, the generic function,
the generic types, the public macros, and so on and so on. Again, more fun stuff that no one in GCC has done. Maybe GCC Go and we have to figure out. Finally, the type system in Rust is extremely safe, complex, and powerful, as you know. There's lots of fun stuff like the never type,
generic associated types, and so on. You got sum types. And the fact is these constructs are not really present in any of the other languages within GCC. So that's stuff that we sort of have to figure out, figure out how to, first of all, implement them, and then how to compile them,
and translate them to the GCC internal representation. Finally, the last fun bit, you got inline assembly in Rust. It's not the same format as GCC's inline assembly. So we have to do the translation. And if you look at rustc_codegen_gcc, because Antoni is much farther advanced than us
in sort of the backend term, it's a very fun, like, thousand lines of code to translate from Rust's inline assembly to GCC. As I said, I'm going to talk a little bit about contributing, reviewing, and so on,
our workflow, basically. So the workflow for GCC-RS is inspired by Rust's workflow. All of our development happens on GitHub. Our communication, messaging, and so on happens on Zulip. And we use the bors bot to merge our PRs. But at the same time, because we're a GCC project, we have an IRC channel, we have a mailing list,
and we accept patches sent on the mailing list, and so on. So the, sorry, the idea about that is that no matter your sort of background, whether you're a new, very young Rust developer who's only used GitHub, or, sorry Thomas, a dinosaur
who's used IRC and mailing lists, you can send patches, and we'll accept them, review them, and make sure that your contributions get accepted. So GCC development is hard. I made that experience firsthand, because I'm not an IRC and mailing list kind of guy.
I'm a GitHub kind of guy. And sending patches via email, getting reviews, submitting them, and so on, it's very, very hard. In GCC, you got a fun thing that, on your commits, you have to add changelogs. They have a specific format. They're annoying to write. They're very helpful, but they're annoying to write.
To send, actually, patches to get reviewed by GCC, you have to use git send-email. So, sort of something that sends the email for you, and sends the patches in the meantime. Because I wanted to, you know, make sure I didn't break anything, wasn't gonna, I don't know, blow up my computer,
I decided to try git send-email to my own personal address the first time. The one thing I didn't realize is that git send-email automatically adds every contributor to the CC list. The first time I sent patches, I actually pinged like 150 people three times,
leaked my personal email address. That's fine, no one yelled at me. And so I removed the option to automatically CC people, and so when I actually sent the patches, no one was CC'd. When patches were getting reviewed, the authors weren't aware that their stuff was getting reviewed. Very fun. So, yeah, we do that.
I got used to git send-email. I'll do that for you. If you submit commits on GitHub, pull requests and so on, we'll take care of handling that. We have lots of continuous integration to make sure that your commits pass the weird GNU coding style, to make sure that they respect the changelog format,
to make sure that they build and pass the tests and so on. And we're actually working on a little bot to generate the changelog skeleton for you. Furthermore, because of the way GCC works, development happens in stages. So right now we're in stage four. So basically between January and May,
you're not allowed to make changes to common GCC parts. And this is a very good idea. We try to avoid breakage of the common structure of GCC that's gonna affect the most languages. But that also means that we have some patches that we cannot merge until May.
And so, again, GCC-RS takes care of that. We have a staging branch and so on. We keep track of the stages for you. You can merge your stuff. We'll do it for you. And make sure you don't get annoyed by that. So is that working? Are people happy to contribute on GCC-RS? Yes, I think so.
In 2022 we've had over 50 contributors. That's mostly code contributors. We've also had people helping us with the Git stuff, the email stuff, CI stuff and so on. But I'm not counting here the people reporting issues because there's a lot more than that. We have a lot of students working on GCC-RS
which I'm really proud of. I actually started as a Google Summer of Code student on GCC-RS and now I'm a full-time engineer on it. And we've got multiple internships that are also coming that way. So, for example, we'll have a full-time, six months internship to take care of libproc this year.
As I said, we also have a lot of GCC developers helping us. So people helping us with the Git stuff, with the merging stuff and so on. People providing very valuable input. And we have people from the Rust team helping us which is really nice. So people are willing to work with us
on getting the test suite to pass. People that are explaining us how Rust works because it's complex and just helping us not stray far from the path.
So what's coming? When is GCC-RS ready? GCC-RS, to be at least sort of useful, has to be able to compile libcore. So if you're not aware of this, the standard library in Rust is actually three kids in a trench coat where you got the core stuff
that's necessary for things like additions, creating lambdas, iterators, for loops and so on. On top of that, you got the alloc crate which takes care of all of the structures that need dynamic allocation, so your vector, your box and so on. And all of that forms libstd
which is used by most projects right now. There's a lot of unstable stuff in libcore. So that means that even if we target Rust 1.49, we have to actually be able to compile a much more advanced version to compile the core library.
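The layering of core, alloc and std described above can be seen from an ordinary program by naming each crate explicitly. This is a sketch with invented function names; the paths `core::ops::Add`, `alloc::vec::Vec` and `alloc::boxed::Box` are the real locations of those items.

```rust
// `alloc` is not in scope by default, even in a program that uses std.
extern crate alloc;

use alloc::boxed::Box;
use alloc::vec::Vec;
use core::ops::Add;

// Only needs `core`: operator traits and iterators live there.
fn add_all<T: Add<Output = T> + Copy>(first: T, rest: &[T]) -> T {
    rest.iter().fold(first, |acc, x| acc + *x)
}

// Needs `alloc`: `Vec` and `Box` require dynamic allocation.
fn boxed_squares(n: u32) -> Vec<Box<u32>> {
    (1..=n).map(|i| Box::new(i * i)).collect()
}

fn main() {
    // `std` re-exports the same types, so this annotation is accepted.
    let squares: std::vec::Vec<std::boxed::Box<u32>> = boxed_squares(3);
    println!("{:?}", squares);
    println!("{}", add_all(1, &[2, 3]));
}
```

A compiler that can build `core` can already handle programs restricted to the first function's dependencies, which is why libcore is the first milestone.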
Finally, we also have to take care of libproc. If you've never written a proc macro in your life, well, you're missing out. But it's basically a very complex shmuelblick that takes the AST, sends it to a remote process communication, gets an AST back and pastes it.
And we have to implement all of that sort of piping between the crate and the compiler, sending the AST tokens and so on, sending it to a location, all stuff like that. Finally, borrow checking. If you've ever written Rust in your life, which I'm going to assume you have,
you've been sort of held at gunpoint by the borrow checker. And that's really a core part of the language experience. And we can't really be a Rust compiler without a borrow checker. So our aim for that is to reuse the upcoming Polonius project, which is a formalization of the rules of borrow checking,
and make sure that we can integrate it to GCCRS. So the way we're going to do that, again, is make sure we have sort of an internal representation that works for Polonius, create that tiny FFI layer that allows us to speak to Rust from our C++ compiler,
and ask Polonius to do the thing. Finally, we're a part of this year's GSoC. So if any of what I said interests you, there's probably a project you can work on. For example, last year we had a student that ported the const evaluator from C++ over to our front end, meaning that we can do, well, const evaluation now.
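As a rough idea of what const evaluation asks of the compiler, here is a small sketch with made-up names. It uses `while` rather than `for`, since `for` loops are, as far as I know, not accepted in `const fn` on stable Rust.

```rust
// Evaluated by the compiler when used in a const context.
const fn factorial(n: u64) -> u64 {
    let mut result = 1;
    let mut i = 2;
    while i <= n {
        result *= i;
        i += 1;
    }
    result
}

// Conditionals are allowed in const functions as well.
const fn clamp_to_byte(v: u64) -> u8 {
    if v > 255 {
        255
    } else {
        v as u8
    }
}

const FACT_5: u64 = factorial(5);
const CLAMPED: u8 = clamp_to_byte(FACT_5);

// The array length has to be known at compile time,
// so the call must be evaluated by the compiler itself.
static TABLE: [u8; factorial(3) as usize] = [0; factorial(3) as usize];

fn main() {
    println!("{} {} {}", FACT_5, CLAMPED, TABLE.len());
}
```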
So run const functions, do conditionals, for loops, and so on, in const context. This year's GSoC at least include the following four projects.
So adding a better debugging experience for a high level intermediate representation, adding proper Unicode support, proper metadata exports, so that's stuff like the dylib, rlib, cdylib, and so on, formats that you'll find when you're exporting Rust libraries. And finally, better error handling for the user of GCCRS,
and starting to integrate the rustc error codes to allow us to pass the rustc test suite. There's a lot of tooling around GCCRS. So there's a test suite that takes like four hours that we run each night. There's a test suite generator
because it's a thousand lines of code. So to make sure that, to make sure, well, we don't pass any of the test suite for now, but we have it. So there's a Blake-3 cryptographic library, which is quite nice, and doesn't rely on the standard library. There's making sure we can compile libcore 1.49, making sure we can try and compile
all of the rustc test suite, and we're running that every night. We have a generator for that, as I mentioned. We have a website, a dashboard for the test suite. We have a report generator because they're annoying to write as well. And we got cargo-gccrs, which will allow you to, well, instead of doing cargo build, use cargo-gccrs build to build your code with Rust,
with gccrs. And all of that tooling is written in Rust for two reasons. The first one is it's much better than C++. The second one is wouldn't it be so freaking cool to compile our own tools with our own compiler?
And three reasons, actually. The most important one is to get people from the Rust community to contribute to those tools. Actually, if you're interested in helping gccrs in one way or another, a good thing would be to start working on that tooling. And it's all just fun stuff. The web dashboard is Tokio and async
and a Rocket database and so on. So not database, API. I'm not a web dev. So if you're interested in that, feel free to come and contribute. Finally, can we rewrite GCC in Rust? Maybe. For bootstrapping purposes,
so make sure that we have a full bootstrapping chain. You can read a lot of papers on that, trusting trust and so on. We'll have to write that compiler in Rust 1.49, which is gonna be annoying. It's still a ways off. And I'd like to really point out that the goal of gccrs is not to break the ecosystem.
So we wanna make sure that whenever someone compiles one of your crates with gccrs, they're not actually blaming you for the failure that's going to happen. And yeah, that they report the issue to us because we're not a proper Rust compiler yet and you shouldn't have to suffer for our hubris.
The community, we got mugs. If you do pull requests, we'll send you a mug. People that have helped with the compiler got this one. People that have helped with the merge got the one on the right. Lots of links. You can attend them. As I said, maybe I didn't say it,
but we have monthly and weekly calls on Jitsi. You can attend them even if you're just interested in listening in. We have an IRC channel, a website, and so on. The goal is to make compilers fun. The goal is to get contributions from everyone, from the GCC community as well as the Rust community. We have Google Summer of Code.
There's lots of stuff for you to work on. We got good first PR issues. If you're interested in compilers, come talk to us. We don't bite. We got reports every week. We shoot out, shout out contributors. So if you do a pull request, we'll tell you about it. We'll tell people about it.
We got monthly calls. Do you have any questions? Hey. Hi. Awesome project.
Thank you. So you mentioned one of your goals was to help develop a spec of Rust with the rustc team. Can you share more about that? Well, there's nothing really started. It's just that you have the Rust reference at the moment, and it tells you how Rust works from a user point of view but not specifically from a language point of view.
And at the same time, we don't want a Rust standard like you have with C or C++ where it gets really annoying to get features done. So there are efforts from people like Mara Bos and Josh Triplett and so on to have a Rust specification. And one of the goals of GCCRS is to say, well, we've had trouble with that, because that's not how it is in the reference,
or it's not explained well enough. And we had to look at the Rust C source code or try it out to figure out how that works. Stuff like dereference chains, what type actually gets used for a method call, and so on and so on. So yeah, this is just we can point out and say,
well, maybe that could take some tweaking, because that's not, yeah. Do you have a list already of stuff like that? It's mostly type system stuff. I have some on macros. There's not really a formal list. I think we have some, like we have an actual list somewhere. But yeah, I don't have it in my head right now, sorry.
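To make the "dereference chains" point concrete, here is a minimal sketch of the kind of behaviour an alternative compiler has to reproduce exactly. The types `Inner` and `Wrapper` and the function `demo` are hypothetical names made up for this illustration; they do not come from the talk or from gccrs. The sketch only shows standard Rust method lookup: the receiver's own type is tried first, and the compiler follows `Deref` only when no method is found there.

```rust
use std::ops::Deref;

// Hypothetical types, used only to illustrate method lookup.
struct Inner;

impl Inner {
    fn describe(&self) -> &'static str {
        "Inner::describe"
    }

    fn only_on_inner(&self) -> &'static str {
        "Inner::only_on_inner"
    }
}

struct Wrapper(Inner);

// Wrapper dereferences to Inner, so Inner's methods are reachable
// through a Wrapper value.
impl Deref for Wrapper {
    type Target = Inner;

    fn deref(&self) -> &Inner {
        &self.0
    }
}

impl Wrapper {
    // Same name as Inner::describe: lookup finds this one first,
    // because the receiver type itself is tried before any deref step.
    fn describe(&self) -> &'static str {
        "Wrapper::describe"
    }
}

fn demo() -> (&'static str, &'static str, &'static str) {
    let w = Wrapper(Inner);
    (
        w.describe(),      // resolved on Wrapper
        (*w).describe(),   // explicit deref: resolved on Inner
        w.only_on_inner(), // not on Wrapper, so found through Deref
    )
}
```

Calling `demo()` returns one string per call site, naming the method that was actually selected.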
Thanks so much. Two questions, perhaps related. First on performance, I wondered if you had any numbers at all on the performance comparison and what your goals are for that. And secondly, I'm kind of surprised by how much you re-implemented in terms of the IRs.
Was that an intentional decision, or was that because it needed to be in C++? Or why not effectively consume more of the Rust stack and then replace LLVM with GCC at the bottom? So regarding performance, we're much faster because we do much less. But we actually don't know about performance yet.
We haven't measured it, no benchmarks. We have a ton of stuff missing. The code we emit, we're not trying to optimize it for Rust yet, or at least not all the time. So yeah, we're just not there yet. It's going to happen eventually. Regarding the internal representation,
consuming the rustc stuff is difficult. Even if rustc is a very well designed compiler, there is some stuff that makes sense only in a rustc context. And that's also one of the things with Polonius that we're trying to work on, is that it does depend on some rustc-specific stuff.
So we do aim to contribute to Polonius and make it so that it's a little bit more compiler agnostic, I want to say, but not just to help us, just for it to make more sense and maybe be used by even more languages, who knows. But yeah, sorry, we needed representation.
Yeah. I know it's still too far away, but is binary reproducibility a target of this? No, not really, sorry. It would be difficult. The Rust ABI is not stable. rustc changes its internal formats and representations.
I don't want to say often, but it does happen. And it would be really difficult to keep up with that without a stability guarantee or a specification of that. It's really not one of our aims. Thanks for the talk. I was wondering about your cargo re-implementation.
Wouldn't it be easier to have a command line compatibility with rustc and then plug that thing into cargo to tell cargo, don't use rustc, use GCC Rust? So it's not a cargo re-implementation. It's a cargo subcommand.
So it's the same as cargo fmt, for example. How it actually works is that we intercept the rustc command line, as you mentioned, and instead of saying, well, fork and start rustc, we start gccrs. And on top of that, we do argument translation, so stuff like --edition=2018 for rustc
is gonna become -frust-edition=2018 for gccrs. So we have that list and we do the translation and then just launch gccrs and pipe the result back to cargo.
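The argument translation described here can be sketched roughly as follows. This is only an illustration, not the actual code of the cargo subcommand: the function names `translate_arg` and `translate_args` are hypothetical, the only mapping taken from the talk is the edition flag, and passing every other argument through unchanged is an assumption made to keep the example small.

```rust
/// Translate one rustc-style argument into its gccrs-style equivalent.
/// Only the edition flag mentioned in the talk is handled; everything
/// else is passed through untouched (an assumption of this sketch).
fn translate_arg(arg: &str) -> String {
    match arg.strip_prefix("--edition=") {
        // "--edition=2018" becomes "-frust-edition=2018"
        Some(edition) => format!("-frust-edition={}", edition),
        None => arg.to_string(),
    }
}

/// Translate a whole intercepted rustc command line, argument by argument.
fn translate_args(args: &[&str]) -> Vec<String> {
    args.iter().map(|arg| translate_arg(arg)).collect()
}
```

A real wrapper would then launch the compiler with the translated list, for instance through `std::process::Command`, and forward its output to cargo; that part is left out here.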
Thanks for the great talk. And one question, or maybe a tip, I don't know whether it's one, but is there a project or some possibility to transform the LLVM IR to the GCC IR? Because if there is, then you could maybe run some tests on it, like creating the IR via normal rustc and then your variant, and then you compare the IRs.
I think there is a project like that. I can't remember which way around it is, if it's an LLVM compiler that takes in GCC IR or a GCC front end that takes in LLVM IR. I think something like that exists. I don't know much about it. I think it's not very famous or anything,
but it could be interesting. Hello.
Do you have a link with the Rust for Linux project? Because if I remember, Linux is compiled with GCC, right? Yes, so one of the big, big, big, big, big targets of gccrs is for it to be able to at least help or be usable in Rust for Linux.
Linux is compiled with GCC a lot. You also have efforts to compile it with Clang. At the moment, what Rust for Linux does is use rustc, so an LLVM toolchain, but it is one of the sort of goals
of the project to, yes, be able to have a fully compilable Linux project even using Rust and C in the kernel, but yeah.
Thank you. I would guess that while reimplementing such a complex project basically from scratch, you probably have a really good chance
of finding some mistakes in the upstream, in the original implementation. So do you contribute back to the upstream in such cases? And maybe you remember some such examples. Thank you. So I don't have sort of these specific examples
in my head, sorry, but we do have, as I said, we did find some sort of stuff that didn't make a lot of sense in the spec, in the specification, sorry, the Rust reference that might have been fixed and so on, but yeah, whenever we see something that doesn't,
to us, make a lot of sense or that deserves some explanation, we try and let people know about it. We try and contribute back to the rustc project. We're really not treating rustc as sort of a competitor or anything, and we do want to improve it. gccrs is built by people that love Rust
and that wanna push it forward in our own way. And for bugs, regarding like rustc bugs, gccrs treats rustc as sort of the overlord. So whenever rustc does something, we do the same thing. We don't wanna sort of argue about what is correct Rust
and what is not correct Rust. rustc is the Rust compiler. It's the Rust implementation. When you ship a Rust version, you ship the compiler, the library, the sort of the language is all of that, those three projects. So yeah, we just try and stick with that as a reference
and we don't wanna step on any toes. Yep, unfortunately, that's all the time we have. I think we had a few more questions, but maybe we could do it in the hallway. Okay, let's thank our speaker.