
X11 and Wayland: A tale of two implementations


Formal metadata

Title
X11 and Wayland: A tale of two implementations
Series title
Number of parts
254
Author
License
CC Attribution 4.0 International:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language

Content metadata

Subject area
Genre
Abstract
In this talk I will outline my journey implementing my X11 window manager `hikari` and the corresponding Wayland compositor shortly after. `hikari` is a stacking window manager/compositor with some tiling capabilities. It is still more or less work in progress and currently targets FreeBSD only but will be ported to Linux and other operating systems supporting Wayland once it has reached some degree of stability and feature completeness. This talk covers:
* a brief explanation regarding differences between X and Wayland
* some of `hikari`'s design goals and motivation
* choice of programming language
* an overview of libraries that were used
* tools for ensuring code quality and robustness
* obstacles
* resources that helped me to implement the whole thing
Transcript: English (auto-generated)
All right, welcome everybody.
We're about to start the next talk, which is about a topic that I personally know very little about, so I'm really excited and looking forward to learning a bit about window managers, which is definitely interesting. So I'm very happy to introduce raichoo, who's going to talk about his self-written X11
window manager. And yeah, he's going to talk a little bit about his experience implementing it and what he learned on the way. So please welcome raichoo. Wow, fancy. Hi, I'm raichoo.
And this talk is basically terrifying me because this presentation is given with software that I've written. It's in an early alpha state, so yay. Pretty terrifying situation. Anyway, I'm going to talk a little bit about my experience with X11 and Wayland and implementing
hikari, which is my window manager slash compositor. And another interesting thing is that this topic basically spawns a lot of people who have a lot of opinions. It's kind of weird that people have very strong opinions on that, but maybe I can give you some interesting, maybe informed information about what's basically going on.
So yeah, I said this talk is a little bit about hikari. I'm going to talk a little bit more about X11 and Wayland. But first of all, I want to tell you why I basically started doing what I did in the last one and a half years. So I wanted to build a window manager for some reason and later on a compositor.
So I've been spending the last one and a half years looking at X11, looking at Wayland, those different protocols, and roughly spending nine months working with each one of these. And so I've written this whole window manager, which basically does things like moving your windows around, resizing them, displaying them and all this stuff,
and gives you abilities to manage your windows. So yeah, I've written this thing from scratch, and I was largely inspired by things like cwm and herbstluftwm. So I wanted to have something that is keyboard-driven.
I'm a Vim user. I want to have shortcuts and fancy things for everything; I don't want to use my mouse. So I want to be able to do that. So fast navigation and stuff like that. And I want it to waste very little screen space. I will show you what I mean by wasting very little screen space.
So this is basically my aesthetic. So this is a terminal, and it has a one-pixel border. And every pixel means something. So the white border tells me this window has focus. That's basically all I want to know, all I want to see. And I don't have title bars, which consume a lot of space, except when I have title bars.
So I built something where, when I press mod, it shows me the information that I need. This is fish, my shell. It's on the first workspace, and it's in the group shell. I'm going to talk a little bit about what these groups mean. And as you can see, it tells me, OK, this is the window that has focus.
So this is inspired by cwm, which has this concept of groups. You can put windows inside of groups and display them independently. And you have groups one to nine. And I wanted to be able to have independent groups, because I started using them as workspaces, which just kind of defeats the purpose. But I wanted to be able to group windows together
in an arbitrary way. So when you open another window and open another one, you can see that when I press mod that these frames turn orange, that this belongs to a group, and I can cycle between windows inside of a group. And there is another thing, like now I
started a root shell group. This is a different group, so I can cycle between the groups. So this is something that I wanted to have for some reason, and it turned out that it very much fits my workflow. So yeah. Also, I'm not a big fan of tiling, except when I want to arrange my views.
And so I built something that works like herbstluftwm, where I can tile all these views and skim through them. And this is configurable. So you can write your layouts in something that kind of resembles JSON. It's UCL, which is the Universal Configuration
Language, which is used by FreeBSD, by the way. So yeah. This is basically what drove me to do this. And I also wanted to have minimal dependencies, like have a very small set of libraries that I'm using. And I wanted to be energy efficient. I tried to be as energy efficient as possible, because yeah, it saves a lot of time.
It gives you a lot of time with your computer, because battery time increases drastically. And also, I wanted to target FreeBSD, because that's the operating system that I'm using. I will, at one point, support Linux and other operating systems, but right now, it's FreeBSD only. Yeah, so it has those two implementations.
I wrote this thing basically twice, which means I spent twice the time on one thing. Why not? Yeah. So what were these different approaches? So we basically have X11, which most of you people probably know. And then we have this new thing, the like 10-year-old Wayland.
And both of these things are basically protocols, like TCP. So it's a protocol with which your application can describe: I want to look like this, I want to work with events in a certain way. And there are certain implementations, like the Xorg server.
And there's also this Wayland thing that I'm going to talk about. But first, let's talk a little bit about X here, and the X Window System, which is the implementation that most of you are probably using, except when you're using a recent GNOME, which is Wayland by default. So this thing looks like this. So we have the kernel.
We have this kernel mode setting stuff. We have evdev, which gives you keyboard, mouse events, and stuff like that. And in the middle, we have this X server. And it's basically responsible for rendering all this stuff. And then we have a bunch of clients up there. So this could be your terminal. This could be your screen locker and your window manager, because your window manager is basically
just a client to the X server. It's kind of like that. And this step is optional. This is something they added fairly recently, maybe like 10 years ago, I don't know, maybe even more than 15 years ago: this compositing thing. So when you press a button, you generate an event.
The X server figures out which client it goes to, sends that one to the client. The client does things. It figures out what it wants to look like, sends this back to the X server. Then the X server sends things to the compositor. The compositor bunches everything together and brings that back to the X server.
And the X server basically displays all this stuff. So we have a lot of things going on here. And yeah, you can see there are a lot of processes involved and a lot of communication between all these things. So what does a window manager look like? This is like the most simple implementation that I've found. It's TinyWM, written by Nick Welch.
And it fits the slide. I don't know if you can read that. You shouldn't, but it's just to give you an impression of how easy it is to get started with a window manager. I basically had my first working implementation of hikari after a week. And I started using it on a daily basis after that,
which I think is kind of impressive because the platform gives you so much. There are so many mechanisms that it provides that you basically have everything in your X server. And yeah, you get a lot of the stuff for free. So now I want to be able to talk to this. I said talk to the X server.
I said it's a protocol. And there are different ways to speak to the X server. And the old way, like the old people did, is called Xlib. And all these API calls are pretty synchronous. You write a request, then you wait, then you read the response,
and over and over again you do that. So you can see you waste a lot of time waiting. A lot of applications are really waiting like a long time before the X server responds. It's kind of annoying. So people came up with XCB, which is the foundation of a lot of things; Xlib is basically built around this XCB thing now. But with XCB you can write, write, write, write,
then wait for the responses and just consume all of these responses, which is a lot faster. And I went with this XCB and it gave me a sort of really fluffy window manager feeling compared to others that were using Xlib.
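The difference between the two request styles can be illustrated with a small toy model. This is only a sketch in Python, not real Xlib or XCB code: `FakeConnection`, `fetch_synchronously` and `fetch_pipelined` are hypothetical names, and the model assumes that one blocking wait delivers the replies to everything that has been written so far.

```python
class FakeConnection:
    """Toy stand-in for a display connection (hypothetical, not XCB)."""

    def __init__(self):
        self._queued = []       # requests written but not yet answered
        self._replies = {}      # cookie -> reply
        self._next_cookie = 0
        self.round_trips = 0    # how often the client had to block

    def request(self, name):
        """Write a request without waiting; return a cookie for its reply."""
        cookie = self._next_cookie
        self._next_cookie += 1
        self._queued.append((cookie, name))
        return cookie

    def reply(self, cookie):
        """Return the reply for a cookie, blocking once if it is not here yet."""
        if cookie not in self._replies:
            # One blocking wait answers every request queued so far.
            self.round_trips += 1
            for queued_cookie, name in self._queued:
                self._replies[queued_cookie] = f"reply to {name}"
            self._queued.clear()
        return self._replies.pop(cookie)


def fetch_synchronously(conn, names):
    """Write, wait, read, and repeat: one round trip per request."""
    return [conn.reply(conn.request(name)) for name in names]


def fetch_pipelined(conn, names):
    """Write all requests first, then consume all the replies."""
    cookies = [conn.request(name) for name in names]
    return [conn.reply(cookie) for cookie in cookies]
```

In this model, four requests cost four blocking waits in the synchronous style and a single one in the pipelined style, while both return the same replies. Real connections are more complicated, but the shape of the saving is the same.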
So yeah, I went that way. I have to pack in a lot of stuff, so hopefully I'm not going too fast. So I want to talk a little bit about some interesting things that I discovered when working with X. So let's think about how I order windows in a stacking window manager.
So I basically put them on top of each other, and then the X server has to render them. Certainly you need to have some sort of ordering there, in which order the X server is rendering them, and you want to cycle through them. So these, I have this concept of these groups that I showed you and the X server has no idea about them
so it doesn't know when I go to the next window. I couldn't do that because it doesn't know what groups are. So what I had to do is essentially, I had to re-implement all this functionality in my window manager and synchronize them. The X server now has an ordering of windows, and my window manager has an ordering of windows,
and I'm not the only one doing that. Basically every other implementation I looked at among the more modern window managers is doing the same thing. So I had to re-invent the wheel basically, which is a bit annoying. And there's another thing.
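The duplicated bookkeeping described here can be sketched as follows. This is a hypothetical Python model, not hikari's code: `FakeXServer` stands in for the server's own stacking order, and `GroupingWindowManager` keeps a second copy of that order plus the group of each window, because only the window manager knows what a group is.

```python
class FakeXServer:
    """Stand-in for the server's stacking order (bottom first, top last)."""

    def __init__(self):
        self.order = []

    def map_window(self, window):
        self.order.append(window)

    def raise_window(self, window):
        self.order.remove(window)
        self.order.append(window)


class GroupingWindowManager:
    """Keeps its own stacking order so it can cycle within a group."""

    def __init__(self, server):
        self.server = server
        self.order = []        # mirror of the server's order
        self.group_of = {}     # window -> group name (unknown to the server)

    def manage(self, window, group):
        self.order.append(window)
        self.group_of[window] = group
        self.server.map_window(window)

    def raise_window(self, window):
        # Every change has to be applied twice to keep both orders in sync.
        self.order.remove(window)
        self.order.append(window)
        self.server.raise_window(window)

    def cycle_group(self, group):
        """Raise the bottom-most window of a group and return it."""
        members = [w for w in self.order if self.group_of[w] == group]
        if not members:
            return None
        self.raise_window(members[0])
        return members[0]
```

Calling `cycle_group("shell")` repeatedly rotates through the windows of that group only, and after each call `wm.order` and `server.order` still agree. Keeping the two lists synchronized is exactly the reinvented wheel mentioned in the talk.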
Just think about this: I want to move window two up. The thing is that the X server basically has just one giant buffer, and then your client just sends some primitives to the X server and it draws things there. So it just draws this in one buffer.
And maybe you've seen this. If you raise this window, then this portion of the screen needs to get redrawn. And so what the X server does is it sends an expose event to window two, to the client of window two, and then it generates all these primitives, like draw a line, draw a circle,
and draw some text over there, and it just redraws them. So sometimes you really see this effect where, when your computer is on power management, you can watch it redraw itself. And I learned to accept that, but it gets better with compositing,
but it's still not pretty. It's really something that annoyed me at some point. And with modern toolkits it's even like this: they basically draw everything into a pixmap and hand this pixmap over to the X server and say draw this, but don't touch it, just draw it, please.
And you can think about how much traffic you can generate when you have this giant pixmap. At my screen resolution, one frame of the entire screen is basically like 10 megabytes, and let's talk about network transparency here. That's an interesting thing to think about. But yeah, this is basically what happens.
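The "like 10 megabytes" figure is easy to check with some arithmetic. The speaker's actual resolution is not stated, so the resolutions below are assumptions chosen for illustration, as is the 4 bytes per pixel (a common 32-bit format).

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed full-screen frame in bytes."""
    return width * height * bytes_per_pixel


def bytes_per_second(width, height, frames_per_second, bytes_per_pixel=4):
    """Raw traffic if every frame were shipped as a whole pixmap."""
    return frame_bytes(width, height, bytes_per_pixel) * frames_per_second


# Assumed example resolutions, not taken from the talk.
full_hd = frame_bytes(1920, 1080)   # 8,294,400 bytes, about 8.3 MB
qhd = frame_bytes(2560, 1440)       # 14,745,600 bytes, about 14.7 MB
```

At 60 frames per second, the Full HD case alone comes to 497,664,000 bytes per second, which shows why pushing whole pixmaps over a network connection does not scale well.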
And this was a kind of annoying thing with X that I uncovered. And another thing: I don't know if you can read the comment, but this is code from awesome that I saw. Before, I said that the X window manager is basically just a client.
And so it sometimes has to ask for the keyboard. Like when I open a view and I want to change the group it's in, then I want to be able to type stuff here. And so I have to grab all those keyboard inputs because I want to like get all the screen events,
get all the keyboard events and write that into a buffer. So this is something I saw in awesome, and awesome is basically begging the X server: give me that keyboard. It's doing that up to a thousand times, and each time it waits for a millisecond. So it basically tries for a second to please give me that resource.
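The retry pattern described here looks roughly like the sketch below. It is a Python illustration of the idea, not the code from awesome: `try_grab` is a hypothetical callable that stands in for a single keyboard grab request and returns whether it succeeded.

```python
import time


def grab_with_retry(try_grab, attempts=1000, delay_s=0.001, sleep=time.sleep):
    """Ask for the keyboard repeatedly, pausing briefly after each failure.

    With the defaults this gives up after roughly one second in total.
    """
    for _ in range(attempts):
        if try_grab():
            return True
        sleep(delay_s)  # another client may still hold the grab
    return False
```

The `sleep` parameter is only there so the loop can be exercised without actually waiting. The awkward part is the design itself: the process that is supposed to manage input has to poll for it like any other client.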
Even though the window manager should be the thing in charge, it's basically begging for resources here. And this felt wrong when I wrote it, but yeah, it's also what most window managers seem to deal with here. That's a problem when you have a middleman. Yeah, and so this is the conclusion from
when I implemented this in X. I basically said, okay, wow, it's really easy to come up with a window manager. It just takes like a week to get something roughly working. But all these graphical user interfaces, they kind of evolved. They basically all do off-screen rendering,
then just shove around pixmaps and stuff like that. Maybe there could be something better. You have a gazillion extensions. That's also fun because at one point you will discover a client that will use an extension that you never heard of before, and it will do something weird, which is also fun.
And then you have to look up in all of these other different window managers how they are dealing with it. And it's not pretty. And X is a global namespace. Like every client at every point in time can become a key logger, can become a screen recorder, and just like send your stuff over the wire.
You can have a lot of fun with that, believe me. So from a security standpoint, that's not good. Yeah, the window manager is a client. It has to beg for things like the mouse or keyboard. And you also duplicate a lot of functionality and you have ugly screen artifacts. So I was basically a bit fed up with this
and thought, well, there's this new thing called Wayland. So why not look at that as well and see how that works? So this is the architecture of Wayland. And we basically just take out the entire X thing. Now we just have clients and all these clients do off-screen buffering.
And now the client just says, hey, use this buffer. And we all use shared memory here so it's not going over the wire or anything. I just write into this buffer and then I tell the compositor, please display this. And the compositor takes care of all the input events. So I don't have to beg for my keyboard anymore. The compositor controls it.
Which also makes this, from a security standpoint, a lot more interesting because now you can say, okay, I will just deliver this keyboard event to this client. The other ones don't see that. And the other ones think they are the only things that exist in the universe. Like this is real UI isolation. You cannot build something that records
the content of any other screens, which is pretty awesome. And every frame is perfect. This is really something that came to me previously. Wayland really revolves around the notion of a frame. It's like, you know, the compositor decides
when to redraw things. It's not like draw a line, draw a circle, draw some text, and in between I could just draw a frame and flicker and do screen tearing and all this stuff. This just doesn't happen with Wayland. Everything is like super smooth. Really, you don't want to go back once you've seen that. It's pretty impressive.
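The idea that the compositor only ever sees finished frames can be modelled in a few lines. This is a loose Python sketch of commit-based, double-buffered state, not the Wayland API: `Surface`, `draw`, `commit` and `compose` are hypothetical names chosen for illustration.

```python
class Surface:
    """A client surface with pending state and a committed frame."""

    def __init__(self):
        self.pending = []    # what the client has drawn since the last commit
        self.current = ()    # the last complete frame, visible to the compositor

    def draw(self, primitive):
        """Draw into the pending buffer; nothing becomes visible yet."""
        self.pending.append(primitive)

    def commit(self):
        """Hand over the finished frame in one step and start a new one."""
        self.current = tuple(self.pending)
        self.pending = []


def compose(surfaces):
    """The compositor picks the moment and reads only committed frames."""
    return [surface.current for surface in surfaces]
```

However often `compose` runs between two commits, it never observes a frame with the line drawn but the circle still missing, which is the half-drawn state that shows up as flicker in the primitive-by-primitive model.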
So, and there's also stuff like damage tracking. And if you want to read more about how Wayland and Wayland compositors can do things, I really encourage you to read that blog post by emersion. Probably butchered his name, sorry for that. But that's really interesting stuff. I have to hurry a little bit here.
So how did I write this thing? Obviously, I need to be able to write this Wayland protocol stuff. So I chose to use wlroots, which is the foundation for Sway, which is basically i3 for Wayland. And it's like the 50,000 lines of code you need to write anyway. So I thought, yeah, I don't want to write those. Thank you.
And I used that thing. And it's basically now, for a lot of compositors, it's the foundation. And it's very well-written stuff. So you should check that out. But it was released after I started working on the X implementation, so no harm done here.
And if you want to look at the most simple Wayland compositor you can look at, there's TinyWL. It ships with wlroots. It's around 1,000 lines of code. That sounds like a lot, but keep in mind, that's a compositor, a server, and a window manager, like three things in one. And Cage is also something interesting to look at if you want to learn and have different resources.
It's like a kiosk thing for Wayland. It also uses wlroots. And basically, all of these toolkits that you see, they all support Wayland out of the box. So if your client is written in GTK or Qt or Clutter, whatever that is, I'm not educated, I don't know,
or SDL, they all have Wayland backends now. So you can basically transparently switch to all these things without even noticing, which I found was pretty neat. Firefox works, Thunderbird works, just set this environment variable. It's a bit flaky at times, so I'm kind of glad this thing didn't crash. It doesn't crash that often, but it could happen at the worst of times.
There's mpv, a video player, and wl-clipboard makes my Neovim happy. And if you want to be able to have X applications, you can do this with Xwayland. So yeah, had to hurry here a little bit. Basically, it's a lot less complexity
and looks way better. There's a lot of cool stuff going on there. Yeah, I have roughly the same amount of code here, which I think is pretty neat because it does so much more. You have more responsibilities, things like screen lockers, you have to implement that. So now you wonder, what kind of programming language
did I use to implement this? And this probably divides the room into yay and ugh. But I basically did this for good reasons. Like I said before, it's 50,000 lines of C code. And there were other people trying to do that in Rust, and they basically said, okay, it's too hard. We can't do that.
This is from the Way Cooler compositor, which is awesome for Wayland. And they basically said, okay, we can't do this. It's too much work. And I don't want to rewrite 50,000 lines of code in Rust. I basically don't have the time for that, even though it would probably be interesting to do so. So yeah, I did address sanitizing.
This is a very cool thing to check if you have things like double frees or use-after-frees; ASan is pretty cool. And I used a lot of DTrace. I can show you the script later on that basically keeps track of all the memory allocations that I have, so that I can detect leaked memory. So yeah, that's basically it.
If you want to get a hold of me: I'm on Mastodon, I'm on Matrix. You can write an email to me or just join our hikari chat room or get in contact with me at the acmelabs assembly. Thank you. Right on time. Thank you.