
Porting RISC-V to GNU Guix


Formal Metadata

Title: Porting RISC-V to GNU Guix
Subtitle: A year in review
Number of Parts: 542
Author: Efraim Flashner
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract:
GNU Guix is a from-source distribution with binary substitutes available. It is also a functional package manager, meaning that all the inputs are hashed and the build results are placed in their own destination folder. Guix also does its best to minimize bootstrap seeds, instead relying on a few cross-compiled bootstrap binaries used to build all other packages on the system. This presents some interesting bootstrapping issues, especially for newer architectures, as we need to recreate the bootstrap path as it may have existed years ago in order to support programming languages. Some languages, like Node.js or OCaml, only need support backported a few versions. Others, like Java, need more than a decade. Rust needed an alternate implementation of rustc to be bootstrapped, and Haskell currently isn't on the roadmap.
Transcript (English, auto-generated)
Okay, so hello everybody. I am Efraim Flashner. I work on porting RISC-V to GNU Guix, or GNU Guix to RISC-V, depending on which way you look at things. I've been involved with Guix as a packager, and with Guix in general, since about 2015.
Over the years I've ended up touching just about everything in the codebase, almost by accident; it just ended up happening. My first big project, I guess, was porting Guix to AArch64, which was similar in a lot of pieces, but different.
This slide I meant to fill in a bit, but I'm not very good with drawing on the computer. So, some quick things about Guix. At the heart of it, it's a transactional package manager.
That means that anything you do, you can undo and roll back. In terms of porting, that gives you the chance to build things, see how they break, then build them again and see how else they break. And in the end, when it works, you don't have to worry about all the broken stuff polluting your environment.
You're using just the pieces that work, every single time, and only the pieces you've actually put in. Everything is built in a container; again, everything is self-contained while it's being built.
And everything is built natively. There is support for cross-building just about everything, but when we're actually installing programs, everything is built on native hardware. This picture is a little small and actually a little old; let me slide this down a bit.
This one shows the actual bootstrap. It's a little dated; it's changed a bit since. But you start from a couple of statically compiled binaries, and in this picture you download the GCC 4.7 source code and compile it into the bootstrap GCC.
A DAG is a directed acyclic graph: you can follow the inputs from one package to the next and see exactly how everything is built.
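In Guix terms, that graph is right there in the package definitions. Here's a minimal sketch (the derived package is made up for illustration): inputs are references to other package objects, so following them from package to package walks the distribution's DAG, the same graph that "guix graph hello" renders.

```scheme
(use-modules (guix packages)
             (gnu packages base)    ;hello, glibc
             (gnu packages gcc))    ;gcc

;; Each input is itself a package object, so 'inputs' and
;; 'native-inputs' are the edges of the distribution-wide DAG.
(define hello-with-explicit-inputs
  (package
    (inherit hello)
    (name "hello-with-explicit-inputs")
    (inputs (list glibc))            ;run-time edge
    (native-inputs (list gcc))))     ;build-time edge
```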
Going back to the first talk, which I suggest everyone watch (I really enjoyed it): our self-hosting comes from a couple of cross-compiled static binaries from another architecture running Guix,
which we then use to bootstrap everything up from the start. So it's the same glibc bootstrap here and GCC bootstrap here; for whatever reason we've flipped the graph around the other way, so now it reads from bottom to top.
We have our glibc bootstrap at the bottom, aiming up, and from there we actually bootstrap everything. I haven't used Gentoo myself, but from what I understand of its stage 1 and stage 2,
you start from just about nothing and it builds itself up, the way Linux From Scratch does, until you reach a fully functional and, as much as you want, optimized system,
with binaries built with -O2 that you then use to build everything else in the system. So, coming at it from the distribution side: RISC-V, what's it similar to?
What sorts of problems did I run into with the actual porting? The AArch64 boards, at least initially, were a little easier to get.
ARM already had partners building boards, and you could get somewhat expensive ARMv8 boards in various places. The RISC-V boards I've picked up have mostly come through Kickstarter. One way I found it radically different from ARM is that anyone can put together a chip, have it made, and sell it,
so there are just so many more options available for the chips and for the boards. On one hand, looking at online discussions, sometimes it seems like RISC-V will save the world,
that it's a unicorn farting out rainbows, the end-all be-all. And from the pure packaging side, it's just another architecture that not a lot of people are using yet. The truth really is somewhere in the middle, as all things seem to be.
My impression so far is that it's seen much faster adoption than AArch64. I bought my first AArch64 board in 2017, I think.
It was ARMv8, the 8.0 revision specifically. If you go out and buy an ARMv8 board today, it's still the 8.0 architecture. Other than the Apple M1s, which are almost 8.5, ARM just finalized the ARMv9 architecture
and everyone's still selling ARMv8.0 boards. RISC-V, by contrast, really is exploding everywhere. And as a project, Guix aims to make it available as another architecture that Guix runs on.
Other comparisons I've run into: a lot of software doesn't have any sort of just-in-time compilation for RISC-V. A lot of JIT support seems to be hand-coded assembly, or added after the fact.
A lot of the time, when building software, you end up with something just saying: RISC-V isn't supported for just-in-time compilation, your architecture isn't supported, pass the no-JIT flag.
PowerPC64 was in a similar spot: you still have the Apple G5s, which were big-endian PowerPC64, and now we've switched to little-endian, and you end up in the same situation of: there just isn't support,
no one's written it for it yet. You also come across a whole bunch of packages where the usual autoconf configure-make-make-install dance fails because the bundled files just end up being too old.
I apologize, this one's a little small again. You go and run configure and look at the timestamp: it was last updated in 2014, the last time upstream grabbed the stock config.guess file.
It's a file used by plenty of packages, and this one just says: yes, I recognize that your uname comes back as riscv64, but I don't know what to do with that. So as part of packaging everything, we do go through and have to update these files.
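The usual Guix fix is a small extra build phase. Here's a sketch (this fragment goes inside a hypothetical package's arguments; it assumes fresh config.guess and config.sub scripts are available from a native input):

```scheme
(arguments
 (list
  #:phases
  #~(modify-phases %standard-phases
      ;; The bundled config.guess predates riscv64; overwrite it and
      ;; config.sub with current copies before running configure.
      (add-before 'configure 'update-config-scripts
        (lambda* (#:key native-inputs inputs #:allow-other-keys)
          (for-each
           (lambda (script)
             (install-file (search-input-file
                            (or native-inputs inputs)
                            (string-append "/bin/" script))
                           "."))
           '("config.guess" "config.sub")))))))
```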
That's where I'm starting from with some of this. There's been a lot of work by the other distributions, by many people, to get RISC-V support into all of the toolchains
and into the major programming languages, and really bring it to the point where you just go and create some bootstrap binaries and things start building. We've been very lucky that it's been such a big effort
and everyone's so excited about it: we built the bootstrap binaries and everything more or less just started working. Other than a few versions we weren't using because they were a little too old, as things built up, they pretty much just kept on going
and working correctly. One thing: I mentioned that in Guix everything is built containerized. When it's installed, it's not installed into the FHS, the Filesystem Hierarchy Standard layout.
We don't use the regular prefixes. So there's no /usr/lib here, no /lib. As part of, I assume, the multilib change
which happened years ago in most Linux distros, a lot of packages end up installing into /usr/lib plus the architecture and ABI,
falling back to plain /lib or /usr/lib otherwise. All of our packages instead get installed into their own hashed prefix. So for example, this one is from GCC.
So instead of looking in /usr/lib/gcc (I forget the actual path), we're not going to find libgcc.so, which is used for linking pretty much everything, there. I actually had to comment out the startfile prefix spec
so that GCC would correctly link against itself after it had been built. It was an odd situation: most things built, and then occasionally something would just fail and say, I can't find libgcc. And you'd say: you just built everything else,
where is it? Then I'd add the library explicitly, and suddenly it could find it. Eventually it turned out that, as part of taking everything from its own special prefix and putting it together in the build environment, when I added libgcc explicitly
it got added into the library path and everything found it; without it, it was supposed to be brought in by GCC itself, by the binary saying: oh yes, and here's libgcc if you need it. This was a little gotcha that got me for a couple of months, and I started adding ugly hacks
around the Guix code base saying: when you're building on RISC-V, add libgcc here, and add it in over there. I ended up with a paragraph-sized note explaining that, after bootstrapping everything
and getting to the final GCC used to build everything, we had a special one just for RISC-V that said: in addition to everything that's normally there, we also need libgcc. Looking back at it, it feels silly to have added all that, but some of it is live and learn, and getting things to work.
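Those hacks looked roughly like this sketch (the helper is hypothetical; target-riscv64? and GCC's "lib" output, which carries libgcc_s.so, are real):

```scheme
(use-modules (guix utils)        ;target-riscv64?
             (gnu packages gcc)) ;gcc

;; Hypothetical helper from the "ugly hack" era: splice libgcc into a
;; package's inputs, but only when building for riscv64.
(define (extra-riscv64-inputs)
  (if (target-riscv64?)
      (list `(,gcc "lib"))   ;GCC's "lib" output provides libgcc_s.so
      '()))
```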
This slide is, I guess, a little dark and maybe hard to read, but I've found upstream has mostly been happy to accept patches for RISC-V support. Some of the patches that I've applied to make things work
were grabbed straight from upstream's newer versions or pull requests, or adapted from other ones, on the theory of: it worked there, it'll work here too, or this is about the same. So, as part of bootstrapping everything from source:
we support Rust, as we aim to support pretty much every programming language. The current upstream installation instructions for Rust are: download Rust and use rustup to manage Rust. In the distributions, at some point it's in, and then you use one version to build the next. But as far as getting Rust in for the first time,
coming from source only, our options were really to follow the OCaml path back to the very early days and build a thousand copies of it until we reached something modern. Luckily for us, there was an effort called mrustc,
which aims to implement enough of the rustc binary to rebuild rustc, and from there the rest of Rust; in this case, Rust 1.54.
This was the pull request I had saying: hey, it also works on RISC-V. Everything just works? Yes: after splitting up the build instructions and building everything, it all just went, and after 56 hours I had Rust 1.54.
On my x86 machine it took maybe four or five hours; here it took 56 hours. The machines are getting faster, and luckily there was no hand compiling.
There was no interacting with it in the middle. Guix made it easy to just say: here are the build instructions, take the pieces and go. I was always curious, so I added time in front of everything, and 56 hours later: here you go, it just works.
So let me skip that one. Inside Guix, it was: add the patch, and then just tag the supported systems, saying yes, it also works on RISC-V.
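In a package definition that looks roughly like this sketch (the package, URL, and hash are made up; patches and supported-systems are the real fields):

```scheme
(define-module (my packages example)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system gnu)
  #:use-module ((guix licenses) #:prefix license:))

(define-public example
  (package
    (name "example")
    (version "1.0")
    (source (origin
              (method url-fetch)
              (uri "https://example.org/example-1.0.tar.gz")
              (sha256
               (base32
                "0000000000000000000000000000000000000000000000000000"))
              ;; Carry the RISC-V patch until it lands upstream.
              (patches (search-patches "example-riscv64-support.patch"))))
    (build-system gnu-build-system)
    ;; Tag the systems the package is known to work on.
    (supported-systems '("x86_64-linux" "aarch64-linux" "riscv64-linux"))
    (home-page "https://example.org")
    (synopsis "Hypothetical example package")
    (description "Hypothetical package illustrating how RISC-V support
is tagged in Guix.")
    (license license:gpl3+)))
```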
Going back to the previous slides about similarities with AArch64 and with PowerPC: certain things just aren't fully tested or fully supported, and I ended up making the same mistake here.
I changed a note from "it only works on these architectures" to "it works; it's not expected to be super efficient or fast, but it gets the job done". But I left in "it may support i686 soon" and completely left out PowerPC.
No mention that it's coming, or in progress, or that it probably works, or even that it just works on all the 64-bit machines we have. Completely forgotten. So I fall into the same traps too sometimes. As far as other language support:
for Go, on most architectures, we start with the Go 1.4 release from Google and then use that to build newer versions. There's an issue with GCC Go (I forget the exact interaction; it got fixed later),
so we're using GCC Go 10 to build Go 1.16 to build newer Go versions. I've done it on x86-64, which, for this type of thing, is where I normally end up doing most of the testing:
not cross-compiling, but using GCC Go for x86-64 to build Go 1.16 to build newer Gos, just to check that the process works. Then: okay, let's do it on actual RISC-V. There was some issue with the test suite
that I need to fix up. The actual building, before the testing, takes 12 to 20 hours; I'm not sure why that part takes so long. It's just one of those things.
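In package terms the chain looks something like this sketch (the variable names are illustrative; Guix's real definitions differ in detail):

```scheme
(use-modules (guix packages))

;; Sketch: a Go 1.16 variant bootstrapped with gccgo instead of the
;; usual Go 1.4 ('go-1.16' and 'gccgo-10' stand in for the real
;; package variables); newer Go versions then build on top of it.
(define go-1.16/gccgo-bootstrap
  (package
    (inherit go-1.16)
    (native-inputs
     (modify-inputs (package-native-inputs go-1.16)
       (replace "go" gccgo-10)))))
```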
For Node, we fell back into our bootstrapping trap: there's some sort of circular dependency between llhttp and Node itself in later versions, which is what I'm actively working on. I think Node officially got support for RISC-V
in 16 or 18, partway through the cycle, so I have to backport it to 14, which we currently have packaged, and then again to 10, so that we can move forward with it.
Java. Java's a problem. I'm not sure we're ever actually going to get Java support; it's one of those things that's going to take a really long time. Someone can correct me, but I believe official upstream RISC-V support was added in Java 18.
We build Java using the previous version, going back version by version. Through GCC 5 there was a Java compiler as part of GCC, after which it got removed, and we used that for a while,
but it turned out that Java compiler also needed a Java compiler to compile itself. After some software archaeology (luckily not by me; someone else worked on this one), we managed to package early versions of GNU Classpath from, I want to say,
the year 2000 or so, and used that to build IcedTea, the then-free version of Java 1.6, 1.7, 1.8, and used that to build all the OpenJDKs. So backporting RISC-V support
all the way back through everything... we already have a hard enough time keeping AArch64 working on some of the earlier versions. I'm hesitant to touch that one. Haskell is one of the ones we've looked at; it comes up every six to eight months:
can we do better on this one? I've spent a fair amount of time looking at the binaries on the Haskell download page, going back years and years; they have GHC 0.29 listed as an option for downloading. Every version of GHC needs an earlier version of GHC,
all the way back to the beginning. There were alternate implementations in the early days, but they can't actually build GHC itself. So for other architectures we do just say: okay, we've chosen a point, grabbed the official binary released by GHC,
and used that to build all future versions. There's currently nothing for RISC-V from upstream Haskell. We could add support, but currently our Haskell build system
doesn't support cross-building; it mostly hasn't come up, and no one's done it yet. We could cross-build from x86-64 to RISC-V for that, but then you're left with: if you want to build anything using Haskell, you need a second computer to build it.
So we looked briefly into whether we could cross-build the other way, from foreign to native. Beyond the interesting thought experiment of having to build an entire second architecture in order to build the one you're actually using, we haven't made any progress on that one; it seemed more like a thought experiment than something we
thought we would end up going forward with. So, going back to RISC-V itself. Like all Linux distributions, Guix targets a base architecture, which is great for distributing binaries but not great for running optimized binaries.
So we have a flag called --tune. For hello, which just prints "hello, world", it's obviously not useful, but for plenty of other programs it is. When starting the RISC-V port, I followed the process that Debian and Fedora
(and I assume others) took, and targeted the GC extension combination. As time goes on and we get more extensions, we'll be able to use the --tune flag to say: yes, I'm happy using the baseline, or that's what I have,
or: actually, I have these other extensions, rebuild some of the software so that I can make use of the better, more advanced hardware with the extra extensions. We've already had
good use of the --tune flag in high-performance computing, with people on newer machines targeting x86-64-v4 as a sub-architecture. We're still going through and trying to find programs that are worth tagging, ones that will actually run enough faster to be worth it,
but certainly plenty of math applications do very well with it. So everything's in place to add more sub-architecture support for RISC-V as it comes.
We just have to actually get access to the hardware and hopefully test it before just saying: yeah, it'll work fine. It probably would just work fine, but the mechanism is there, and it works fairly well for everyone who's actually using it.
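Opting a package into --tune is a one-line property in its definition. A sketch with a hypothetical package (the tunable? property is the real mechanism):

```scheme
(use-modules (guix packages))

;; Sketch: mark a (hypothetical) math library as benefiting from
;; micro-architecture-specific rebuilds.  Users can then run e.g.
;;   guix install fast-math --tune=x86-64-v4
(define-public fast-math/tunable
  (package
    (inherit fast-math)
    (properties `((tunable? . #t)
                  ,@(package-properties fast-math)))))
```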
So, I was wondering if there are any questions, any comments. I seem to have an extended Q&A period here at the end of my talk. I'm happy to talk more about how Guix interacts
with other architectures, about anything, or share other fun stories of porting software to work on RISC-V. Yes? GCC is gaining a Rust front-end.
Have you looked into using that one for RISC-V? Okay, so the question was: GCC is gaining the Rust front-end, and have I looked into it for the bootstrapping part? I haven't looked into it yet for bootstrapping.
My understanding is that, similar to mrustc, it aims to more or less implement the rustc binary itself. Currently, when we're building Rust programs,
we say: okay, upstream says "cargo build" this, so we run "cargo build", and then cargo itself invokes rustc and the libraries: this part is here, that part is there. And we can do the same thing either with rustc itself
or with the Rust front-end from GCC. For bootstrapping Rust, I'm not sure yet whether it's something we would slot in. At first thought, I could see slotting it in instead of mrustc itself,
taking the mrustc infrastructure that says here are the pieces we need to build to rebuild upstream rustc, but using GCC's Rust front-end to actually build everything faster and more efficiently, which would also solve some of the problems on other architectures. And as Rust gets into the Linux kernel,
we're definitely planning on using the GCC Rust front-end for that.
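For context, the cargo-driving part mentioned above looks roughly like this in a package definition (a sketch: the tool and crate names are illustrative, while cargo-build-system and #:cargo-inputs are Guix's real mechanism):

```scheme
(use-modules (guix packages)
             (guix build-system cargo))

;; Sketch: Guix runs "cargo build" itself, wiring in each crate
;; dependency explicitly instead of letting cargo download it.
(define-public example-rust-tool/guix
  (package
    (inherit example-rust-tool)          ;hypothetical base package
    (build-system cargo-build-system)
    (arguments
     `(#:cargo-inputs (("rust-libc" ,rust-libc-0.2))))))
```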
Yes? Okay, so the question was: has there been progress on building desktop environments, and not just languages? So I have built... let's see, I'm thinking through all the source code bits. I personally run Enlightenment on my laptop.
That one doesn't work yet, but that's because we use an older version of LuaJIT as an input, and I just need to tell it: no, for RISC-V, use Lua itself instead of LuaJIT; or: no, really, I'm not giving it to you, don't look for it, and don't error if it's not there. Other than that, Enlightenment should work,
which I know has many people using it. As far as major desktops: let's see, Xfce just builds fine. MATE just builds fine. GNOME... I'm sorry, say that again?
Running it is another thing; I have not actually tested running it on the hardware itself yet. So far I've been focused on just getting everything to build first. As far as actually running Guix on the hardware:
Guix runs in two ways, either as the entire operating system or on top of a foreign operating system. So far, I've been running Guix on top of whatever Linux happens to come from the vendor,
and using that as the base. As for testing, it shouldn't be too hard to test with QEMU, either on RISC-V or from an x86-64 machine using QEMU's emulation.
As far as building an actual image for one of the boards, I ran into some issues with different boards needing specific partition offsets. The GPT version of fdisk
helps create the partitions in a scriptable way, assigning the magic partition codes and things like that, so it's very doable in Guix. This one was an actual problem I ran into:
I burned through all of my spare SD cards. I was building everything on the RISC-V boards, and then I needed a fresh card to actually create an installable image and install it, and I had run out of them.
But I've finally picked up more, and I'm at the point now where I can try it. The first plan is: here's an upstream image, either from a vendor or from another Linux distribution; I flash that onto the SD card.
That gets me all the partitions I need. Then I run "guix system init" onto the SD card that's already partitioned correctly. It'll install U-Boot over the existing U-Boot, and the OS over the entire root partition, and then it should either work or not.
And if it doesn't, then we're into the why not. But assuming everything just works, everything should come up, and I'll be able to better test the graphical applications.
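The input to "guix system init" is an operating-system declaration along these lines (a minimal sketch: the bootloader variable and device paths depend on the board, so treat them as assumptions):

```scheme
(use-modules (gnu)                       ;operating-system, file-system
             (gnu bootloader u-boot))    ;board-specific U-Boot variants

(operating-system
  (host-name "riscv-board")
  (timezone "Europe/Brussels")
  ;; Board-specific: pick the U-Boot variant matching the board and
  ;; point it at the SD card device.
  (bootloader (bootloader-configuration
               (bootloader u-boot-sifive-unmatched-bootloader)
               (targets (list "/dev/mmcblk0"))))
  (file-systems (cons (file-system
                        (mount-point "/")
                        (device "/dev/mmcblk0p2")  ;root partition
                        (type "ext4"))
                      %base-file-systems)))
```

Pointing "guix system init" at a file like this and at the mounted root partition then installs the bootloader and the system, as described.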
Currently I'm using the SiFive HiFive Unmatched board. After a year-long wait or so
(it was a long lead time with Mouser), it did finally arrive, and it shocked the local computer guy when I showed up and said: I need a case and a CPU, and I have this. He's like: oh, what else do you need? Graphics card, peripherals? Nope, I have a RISC-V board. And he's like: you're killing me again.
He's the one I've gone to in the past when it's been: okay, I need a graphics card, and it has to be ten years old and work with the Linux-libre kernel. Sometimes it just becomes: work with the pieces that I have. So that's the main machine that I've been doing a lot of the work on.
I also have the original VisionFive board, which has been helping a lot with the building, and I recently got the VisionFive 2, which so far seems to be at least as powerful as the HiFive Unmatched. I got the one with 8 GB of RAM,
compared to the 16 in the HiFive Unmatched, and a CPU that goes up to 1.5 GHz instead of 1.2-ish. We had some Guix meetup days before FOSDEM, and I actually hooked it up to the projector
and just started building, from nothing up to hello. Things were building, and people kept asking: oh, what are we building now? Where are we up to now? GCC itself takes six to eight hours to build.
From the bare bootstrap binaries up to hello, which is the first quick-to-build package after GCC and binutils and everything else you actually build everything with, it was somewhere between
16 and 20 hours of building on RISC-V. On x86-64, the entire process, I think, was four to six hours, maybe a little faster. It's taken a little longer now that the bootstrap path has been extended.
One of the other architectures I worked on for fun was 32-bit PowerPC, and that one was really more because I had it than because anyone uses it.
With that one, GCC 10 actually takes more than 24 hours to build. So RISC-V is actually useful hardware, unlike the 32-bit PowerPC. May I ask: is there cross-compilation in Guix?
I'm familiar with Nix, so I would expect the initial approach would be to use cross-compilation rather than native builds. Does Guix not have cross-compilation, or was it just a decision to try native? Because, of course, that would be way faster. Yeah, it would be way faster. So the question was:
you're familiar with Nix, which does have cross-compilation, and you're wondering whether Guix also has cross-compilation, which would be much faster than compiling natively. So, Guix does have cross-compilation. You can cross-compile binaries and then use Guix to basically send them
to the board and run them there, and everything just works. As far as actually building with it, though, everything gets built from the native binaries: the initial bootstrap binaries were cross-compiled, and they were then used as the native binaries
to build everything else. But I have used cross-compiling. For Node, with its basically two rounds of backporting, the plan was to backport to Node 14
and then cross-compile from x86 to RISC-V and ask: does Node 14 work with this patch that I've written? Once it does, backport it the rest of the way to Node 10. I haven't decided there whether the actual building
would be faster cross-building Node 10 as the test or starting from native as the test. So Guix does have cross-compilation, and we do use it for creating images
or creating binaries for other architectures, but as far as building from it or using it as a stepping stone, that's really only in the very initial stage, when the architecture is first added.
Okay. So I think that's it. Thank you, everyone.