Reinventing Home Directories

Formal Metadata

Title
Reinventing Home Directories
Title of Series
Number of Parts
44
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer

Content Metadata

Subject Area
Genre
Abstract
Let's bring the UNIX concept of Home Directories into the 21st century. The concept of home directories on Linux/UNIX has little changed in the last 39 years. It's time to have a closer look, and bring them up to today's standards, regarding encryption, storage, authentication, user records, and more. In this talk we'll talk about "systemd-homed", a new component for systemd, that reworks how we do home directories on Linux, adds strong encryption that makes sense, supports automatic enumeration and hot-plugged home directories and more.
Transcript: English(auto-generated)
I'm going to take away your home directories now. I'm a bit sick, so I hope I'm not going to lose my voice in the middle of this talk.
But yeah, I'm talking about something that has not been merged into systemd yet, but hopefully will be. It's about reinventing home directories, as the title says, because, I mean, reinvention is what we do with systemd, right? Anyway, I think I have good reasons for this stuff, and I hope to explain why.
And I hope to also explain what precisely I have in mind. So what am I talking about? I'm talking really about your home directories, right? $HOME or tilde, whatever you want to call it. It's that stuff that is paired with an entry in /etc/passwd and makes up your user record
and your user dataset, if you will. After working with Linux and Unix for a long time, I came to see quite a few problems with the approach we have had so far.
I'm pretty sure every one of you knows how this works, right? Like on Unix with /etc/passwd and /home, some directory, right? I don't need to repeat how that works. I saw a couple of problems with this. First of all, the way we currently manage user directories on Linux basically
requires a writable /etc to create a user, right? So far, we always had this goal that if a system's configuration is not supposed to change, /etc should be read-only, ideally even immutable, and hence the requirement
that you need a writable /etc whenever a user is created or removed or modified kind of sucks, because a user's existence is not really configuration, at least in my view, right? So the concept mixes state and configuration: I believe that the user record is state, it's not configuration,
and the rest in /etc is generally understood to be configuration. So there's this philosophical problem already that, yeah, /etc mixes state and configuration, and then this propagates into the problem that /etc needs to be writable. There's also this problem with /etc/passwd and home management that user ID assignments on Unix need to be propagated between systems, right?
There is a lot of infrastructure in this world for LDAP databases, for how user records are distributed among many systems, how NFS has user ID translation daemons and whatnot, because everything is bound to user IDs, like numeric Unix user IDs, these 32-bit values where zero means root, and the meanings of those in the context of Unix network file
systems always need to be propagated between systems, and that to me sounds so backward because it basically means that every system you have only exists within a very specific institution or organization, because only within that very specific institution or organization
you can say that user Lennart has numeric user ID 555556 or something, right? So yeah, I think the classic model worked as long as you had few systems, right? As soon as people tried to scale it to more systems, they came up
with this massive complexity that is LDAP and things like that, and they end up managing this user ID that nobody actually really cares about, which is just an artifact of the implementation and not something people should really care about. And in the world of today, where we scale to the internet, right, user IDs are completely an artifact of the 1970s, right? Like if I have a Google account or Facebook account or whatever account on the internet,
there is no user ID assigned to it because it makes no sense, nobody wants that, right? I mean there are reasons why user accounts have usernames, there are reasons why they have passwords, but the reasons for them having a numeric user ID on a specific system are a little bit weird. So yeah, I see that as a problem. There's another problem.
This model, the way we deploy it right now, means no encryption, right? Home is generally not encrypted by anything that is directly related to the user, right? You might have full disk encryption, right? You might have encryption of /home as a whole, right? But this is system-wide, if you will.
It's not specific to the user, and that's weird, right? Because right now, at least on my laptop, the way I installed Fedora a couple of years ago, I got full disk encryption, so I boot up, I type in a password that is my password, but also it's not my password because it's actually my laptop's password,
right? So I authenticate against my system. That decrypts the hard disk. Now this is the password that actually matters, and after it has booted up, I have to authenticate a second time into my actual user account, and that password doesn't really matter that much because at that point, all the data of the home directory is already fully visible to anything on the system that ever wants to see it.
So I see there is this really weird discrepancy, this mismatched encryption: the stuff that's actually associated with the user account is not what protects your data. It's the other stuff that protects your data, which is system-wide, so that if we actually
had multi-user systems, where multiple users use the same machine, they would have to share that password, and that makes no sense. The only reason we got away with this so far is that most laptops are effectively single-user systems, even though the Unix concept can support more.
So I think this stems from, yeah, nobody wanted to touch the way Unix user management worked, and so we came to this discrepancy, and we got away with it because we effectively have single-user systems. I also see the problem that, yeah, /etc/passwd only knows Unix passwords, right?
Passwords as you know them. It doesn't know anything else, right? Everything else you might want to authenticate with these days, YubiKeys, whatever else, cannot be in /etc/passwd because it's not extensible, right? So anything more modern, even if you want to authenticate with a pattern, right, which is what phones do and which you might want to do on your laptop too if you have a touchscreen, doesn't really fit into that model because, yeah, there's
no way that's available there. It's not extensible. This is actually a massive problem, right? Like /etc/passwd, this database was designed in 1985 or something like this and has not been modified once since then. There has been this extension, the shadow database, which added like five more fields
or something like this, but everybody wants to add their own fields. And this is like fucking ugly if you ask me, because they created these sidecar databases, right? These databases that extend the user record, but they're not actually stored in /etc/passwd, in the user database itself, but at some other place.
And that other place is something very different depending on where you're looking. So for example, /etc/shadow was the first sidecar database that was introduced, and it contains information about the actual password used and account validity restrictions, things like that.
But then the accounts daemon is this GNOME thing. They wanted to add a picture for each user, so they have a sidecar database for that and a couple of other things. Samba has one because they wanted to have a GUID like in a Windows context. They created their own one, it's also somewhere in /var. There's SSH, which wanted to have the authentication key, right? Like the authorized_keys thing in your home directory, right? Which is part of the user identity, it's how you authenticate.
It's like your password in many ways. But no, we can't put it in the password file because it's not extensible, so we put it in the home directory. And then, yeah, SSH tries to read from that, and it's highly problematic because, yeah, you have a privileged daemon that goes into the unprivileged user directory, and it's dangerous as shit. So then we have PAM limits, which is about resource management, right?
People want to be able to set limits on specific users. They want to make sure that some users can have more processes than others. The way they do it is they came up with their own configuration file, which is not even a database but a configuration file in /etc/security/limits.conf or something like this. So, yeah, we ended up with this, yeah, simply because it's not extensible,
we ended up with a distributed thing that you can't even manage centrally, like even if you use LDAP and try to manage this centrally, it usually covers like a fraction of this, usually the Unix records and maybe the shadow stuff, but then as soon as you go into like sharing pictures of users with LDAP,
that's, yeah, that's strange territory. Yeah, this is kind of the same thing. There is no resource management doable with /etc/passwd. And I think resource management matters a lot, right? Because not all users are equal. So you want to be able to assign specific resources, memory,
disk space, whatever else to users, and yeah, /etc/passwd can't express this. Then they created another secondary database for this, which is the quota database on disk, but yeah. So as soon as you think about multiple users, you also want to think about the differences between multiple users and how many resources you assign to each.
But we can't really do this with /etc/passwd except through this mess of sidecar databases. So this is, yeah, so far a number of the problems. I think there are a lot more. These are kind of the problems that I saw and wanted to fix with what I'm talking about here. So let's talk a little bit about the focus of what I'm suggesting here.
First of all, it's only about human users, right? I don't care about system users in this context, right? System users being the stuff that daemons run under. This is about human users like me and you, right? The stuff that you have on a laptop. It's not so much the stuff that you have on a server. The goals that I want to be able to deliver
with what I'm talking about here are, first of all, migratable home directories, right? Migratable home directories means that you can take a home directory as one unit from this laptop to your new laptop or to your third laptop, and it will always work and be self-consistent and comprehensive and hence migratable, right?
So this basically means that as little as possible should leak into the system surrounding it, particularly when it comes to configuration, which is very different from what was there before, right? Before, home directories and user accounts were not migratable, right? Because, as mentioned, we have all these sidecar databases and there's no way
you can sensibly migrate all that metadata that is distributed in all these configuration files in /etc and all the weird sidecar databases in /var and whatnot from one system to another. I want to go to this way where home directories are, yeah, somebody didn't turn off his phone. I want this to go to the way where they're truly migratable, right?
Like all the way to the point where you have a USB stick like this one and that this can be a home directory and you plug it into my laptop and that can work there and I can take it out and put it into another laptop and work there, but that this USB stick actually is my user account, that it is my home directory and it's truly and natively migratable
without any magic bullshit of propagating configuration into the system, right? So it's about isolation, it's about unification, everything into the home directory itself and cleaning this up. I already mentioned this a bit. It's about self-contained home directories, right?
So right now, as mentioned, if you want to have a user account, you have files everywhere. You have an entry in /etc/passwd. You have an entry in /etc/shadow. You probably also have an entry in /etc/group and /etc/gshadow. You have the home directory itself. You have all the sidecar databases. The goal with this is, yeah, self-contained home directories,
that everything that is on this stick comprehensively describes the home directory. By the way, I'm putting this stick here always in your face, but it's not supposed to be just about the stick, right? The fact that we can have home directories on a stick is kind of a nice side effect of what I'm doing, but my main focus is actually not that.
Most people probably want to have the home directory on the laptop. I know this and that's what I want myself. I just want to make the point that it should be that migratable that you can have it on a stick and it kind of makes sense. So yeah, self-contained home directories, right? The metadata about the user should not be distributed across the system with all its information about the user itself, with all the resource management and stuff.
It should be part of the home directory itself. This then means that the mere existence of the home directory's file where everything's stored should synthesize the user account in full, right? It should make sure that if you call getpwnam()
or something like this, the classic Unix APIs to query a user, the mere existence of the user file store should synthesize everything else, right? In a way, you could say, I want that, yeah, you know Unix, everything's a file,
that yeah, users should be a file too, right? Where you just have one concept in the file system and everything else comes from there during runtime and is not propagated persistently anywhere else. One of the goals that I also have is that UID assignments
should be a local artifact, right? I want that if I stick this USB stick into this laptop, I might get one UID; if I stick it into another one and that UID is already used by another account there, I get a different one, and it shouldn't matter, right? This is a hard problem. We have to cut some corners to implement this,
but yeah, I want it to be a local artifact. When this home directory is bound to this specific laptop, it gets a user ID assigned and then that's how it is, but if it's bound to a different laptop as well, it might get a different one there, so that the propagation of the actual numeric UID value is not necessary and all this infrastructure
of centralized databases like LDAP and stuff to stabilize this is not necessary. I also want a unification of the user password and the encryption key, right? This is what I already mentioned earlier. So far, we had the encrypted password in /etc/shadow, like this Unix password string, $6$ something,
and we usually have the hard disk encryption, like the LUKS password. I don't want this to be two different things. I want this just to be the same thing, right? If you're capable of decrypting a home directory, that's good enough as authentication that you're also allowed to log in, and the reverse should be true as well, right? If you're allowed to log in, only then should there be any chance
for you to access the home directory. So yeah, I want these two concepts to just be merged and be the same thing. Also, and this is really important, we should have in 2019 extensible user records, right? So that people can put in these user records whatever they want, right?
And we propagate that through our APIs to whoever wants to know this and yeah, so that we are not stuck with struct passwd, right? This five-field structure defined in 1985 or something, but people can stick in there whatever they want. If they have some weird requirement for their own company, they can put something there
and if other people have different requirements, they can put something else there, but we agree on a common basic vocabulary that is highly extensible. Also, one of my really important goals is actually to lock LUKS on system suspend. So nowadays, I'm pretty sure that first of all,
most of you probably use hard disk encryption on your laptop and second of all, I think most of you probably don't even turn off your computers anymore at all, right? Usually you just close the screen and that suspends the system, and then when you come back, you just authenticate again. But this systematically, I mean, it kind of defeats the encryption that you have there
because basically the way it's currently implemented, you use full disk encryption. So while the system is up, suspended or running, in that context the decryption key is in memory, right? So if I go through customs into a country
that I don't trust and I have the laptop with me and it's suspended and they pick up the laptop, in that memory they'll find the decryption key for the hard disk, and that's something I think we should not do, right? And this is something that people might not find important, but I think it's actually one of the most important things in this entire approach, which is, yeah,
when you suspend the system, the decryption keys need to be removed from memory, right? So that I know for sure that if somebody steals my laptop in suspended state because that is the most common state that they will probably steal my laptop in, they should not be capable of getting any access to my hard disks, right?
Again, this is something you really, really should care about because so far it basically defeated all kinds of encryption that you had, because yeah, as long as you do system suspend, and every one of you does that, if somebody steals your laptop, they can extract everything they want. So doing this is actually much harder than it sounds at first, because the reason
why we're not doing this on general systems right now is that, yeah, if you use full disk encryption then the operating system itself is encrypted, right? So if you actually flush out the cryptographic keys when you go down for suspend and then need to reacquire them on resume, who's going to ask the question if it can't even
be loaded into memory because it itself resides on that encrypted partition that you want the password for, right? So that's a bit of a chicken and egg problem and nobody solved this so far. With this, I wanna solve this, right? So that when I suspend my machine, the operating system itself is independent of the individual home directory so that we can suspend the home directories independently of the system itself
and hence solve the problem. Also, I kind of indicated this already, I think we need to move to more modern ways of authenticating yourself to the system. Passwords, great, but also maybe we should do something better. So yeah, one thing I want to be able to deliver from day one of this systemd-homed thing
is YubiKey support, right? I mean YubiKey in the sense of how you would use the word Google for search engine. It's not just about YubiKeys, it's about anything that is an authentication token that implements PKCS#11, right? And I think we should do it properly, meaning that we actually use the cryptographic properties that the YubiKey provides and hence,
yeah, if you don't have the YubiKey that there's no way in the world, the cryptographic key for unlocking your account and for accessing the encrypted data on disk can even be retrieved. So that's the goals that I have. There's some complication with all of this.
SSH logins, right? I mentioned that I wanted that user authentication and decryption of the hard disk is the same thing, right? This is inherently incompatible with how SSH traditionally works, right? Like because as mentioned, if you authenticate via SSH, it goes via the authorized keys file
in the home directory. So if you want to authenticate with something that is inside of the home directory so that you can access the home directory, where does the decryption key come from to access the home directory, right? You follow what I mean? It's the chicken-and-egg problem. You're reading the key that allows you access out of the encrypted thing, which you can only access after you have authenticated, right?
So my answer to that is we don't allow SSH logins as long as you have not logged in first locally. Because I mean, allowing SSH logins the way they're traditionally done means that inherently the home directory of the user
needs to be accessible before the user logs in. And I think that's something I really don't want to have, right? I really want that when I'm logged out here, neither I nor anything else in the system can even access my home directory. That includes OpenSSH reading the keys from there. So yeah, my answer to this is SSH logins
into systems that use this can only be done after I logged in locally, which decrypts the hard disk, makes it accessible, and then SSH can access it and everything's fine. Also, by the way, the authorized_keys file is actually something that I think should be, and I even implemented this already, part of the user record,
like the extensible user record, so that, yeah, it's just there. And yeah, this already actually works. Disk space assignments are a bit of a problem, right? We probably should come back to this one a little bit later when I talk about how the storage
of what I'm doing is actually done. Another problem is UID assignments. I kind of already indicated that. So the problem is, if we want the UID to be specifically local to the system, right, it might happen, because the range of user IDs is very small, it's just 32-bit, that there will be collisions, right? If I had my home directory on this laptop and then take out the USB stick and put it in another laptop,
then somebody else might have already used the UID that I used here. Yeah, this is very likely because the space is so small. Ideally, the kernel would help us with this, right? If we had something like shiftfs, it wouldn't matter what's actually on disk: at the moment where I log in, I get assigned the UID and then I mount
virtually all the files to the right user ID. We can't do this, so we go for the next best thing, which is chowning recursively if we have to, and we try really hard to avoid having to do this if we can. Thankfully, chowning recursively is surprisingly fast, actually, if we actually have to do this.
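As a rough illustration of the recursive chown fallback mentioned here, the following Python sketch re-owns a directory tree to a newly assigned UID and GID. It is only a minimal sketch under stated assumptions: the function name `chown_recursive` is made up for this example, and nothing in the talk says the real implementation looks like this.

```python
import os


def chown_recursive(root: str, uid: int, gid: int) -> int:
    """Re-own `root` and everything below it; return how many entries were touched.

    Hypothetical helper for illustration only. os.lchown() is used so that
    symbolic links are re-owned themselves instead of being followed.
    """
    os.lchown(root, uid, gid)
    changed = 1
    # os.walk() does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lchown(os.path.join(dirpath, name), uid, gid)
            changed += 1
    return changed
```

Changing ownership to an arbitrary other user normally requires root privileges, so a real caller would run this as a privileged service at the moment the home directory is bound to the local system.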
And I'm kind of okay with the behavior of this because, I mean, we try very hard to make the UID assignment stable. So for example, we hash it from the username, but that's never going to be enough because the namespace is so small. But this basically means that probably in real life, if it's just me, Lennart, and I have three machines
and I take this one out and move between them, I will probably never need the chowning, but it's there to make this safe. And ideally, sooner or later, we don't have to do the chowning anymore. Yeah, LUKS locking, I already mentioned that this is kind of complicated, but there's also a couple of other complications.
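The stable-UID idea from the UID discussion (derive the UID from a hash of the username, and pick a different one only when it collides locally) might be sketched as follows. The hash function, the UID range, and the linear probing are all assumptions chosen for this example; the talk only says that the UID is hashed from the username.

```python
import hashlib

# Assumed range for illustration; the talk does not name a range.
UID_MIN, UID_MAX = 60001, 60513


def pick_uid(username: str, uids_in_use: set[int]) -> int:
    """Return a UID that is stable per username unless it is already taken locally."""
    span = UID_MAX - UID_MIN + 1
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % span
    # Probe from the hashed slot onwards until a free UID is found.
    for offset in range(span):
        candidate = UID_MIN + (start + offset) % span
        if candidate not in uids_in_use:
            return candidate
    raise RuntimeError("no free UID left in the assumed range")
```

On machines where the hashed slot is free, the same username maps to the same UID and no re-owning of files is needed; only a collision forces a different UID and, with it, the recursive chown.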
If we do the locking on suspend, where we remove the cryptographic keys from memory as we go into suspend mode, we need something that queries for the password again when the machine resumes. Because until that happens, the home directories are locked, completely locked and cannot be accessed because there is no cryptographic key known for them. So we actually need cooperation for this from the UI.
Everything else that I'm doing here I can do without any UI involvement. But in this case, we need some cooperation from the UI, specifically meaning that when the system comes back from LUKS locking, your display manager, like GDM, whatever you use, will have to re-authenticate the user
because, I mean, effectively they mostly already do this, but they do it the wrong way right now, because mostly the way they do it is that the screen lock that is turned on on suspend actually runs under the identity of the user itself. This doesn't work if we suspended the home directory of the user because we can't run code
under the user's own identity until the home directory is resumed again. So what we need here is that GDM, for example, is patched so that when the system comes back from suspend, it asks for the password again so that we can unlock the home directory and then switch back to the user's login session.
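The lock-on-suspend flow described here, together with the idea that being able to derive the right key is the authentication, can be modelled with a toy Python class. This is purely a conceptual sketch: `HomeVolume` is a made-up name, no real disk encryption or LUKS is involved, and the key derivation parameters are arbitrary assumptions.

```python
import hashlib
import hmac
import os


class HomeVolume:
    """Toy stand-in for an encrypted home directory (no real encryption happens)."""

    def __init__(self, password: str) -> None:
        self._salt = os.urandom(16)
        # Stored "on disk": allows checking a derived key without keeping the key.
        self._verifier = hashlib.sha256(self._derive(password)).digest()
        self._key = None  # the key lives in memory only while unlocked

    def _derive(self, password: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), self._salt, 100_000)

    def unlock(self, password: str) -> bool:
        """Login and decryption are one step: a wrong password yields no key."""
        key = self._derive(password)
        if not hmac.compare_digest(hashlib.sha256(key).digest(), self._verifier):
            return False
        self._key = bytearray(key)
        return True

    def lock(self) -> None:
        """Called on suspend: overwrite and drop the in-memory key (best effort in Python)."""
        if self._key is not None:
            for i in range(len(self._key)):
                self._key[i] = 0
            self._key = None

    @property
    def accessible(self) -> bool:
        return self._key is not None
```

In this model the display manager's job after resume corresponds to calling `unlock()` with a freshly entered password before any code runs under the user's identity.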
So I know this is a lot of material so far. Let's now actually have a look in the actual stuff I did on the actual code concepts. Yeah, there are two new concepts.
One of them is I wanna go for JSON user records. Like everybody knows JSON. JSON is like what the internet people all do. I think we should just start using that for our user records. Yeah, the reason why JSON is because it's just the most basic thing. It's supposed to be machine readable. It's not supposed to be so much something you write. It's something the computers write for you
and things like that. And it's also what probably most of the internet-facing user database is probably used anyway when they exchange information about users across the network. So that's the first thing. JSON user records. It's supposed to be a superset of the NSS. NSS is a name service which of Glipsy, it's like a generic term,
Glipsy uses for struct passwd and struct group, which for the ones who don't know Unix so much, that's how user records are done by Glipsy since 1985. These JSON user records are supposed to be queryable via violin interface. Violin, for those who don't know,
is a very simple IPC system. It just uses JSON and it's supposed to be the most trivial IPC in the world, so that as long as you have your own JSON parser, you don't actually need any other code base. I have a slide later on why Varlink is the right answer here and not D-Bus.
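To make the "JSON parser and nothing else" point concrete, here is a minimal sketch of how a Varlink call is framed on the wire: one JSON object followed by a NUL byte. The method name `io.systemd.UserDatabase.GetUserRecord` and its parameters are assumptions modeled on systemd's user database interface, not something shown in the talk, and no socket is opened here.

```python
import json


def encode_call(method, parameters):
    """Serialize one Varlink method call: a JSON object terminated by a NUL byte."""
    message = {"method": method, "parameters": parameters}
    return json.dumps(message).encode("utf-8") + b"\0"


def decode_messages(buffer):
    """Split a byte buffer on NUL bytes and parse each complete JSON message.

    Anything after the last NUL byte is an incomplete message and is ignored.
    """
    *complete, _rest = buffer.split(b"\0")
    return [json.loads(chunk) for chunk in complete if chunk]


# Assumed method and parameter names, for illustration only.
call = encode_call(
    "io.systemd.UserDatabase.GetUserRecord",
    {"userName": "foobar", "service": "io.systemd.Multiplexer"},
)
```

A real client would write these bytes to a UNIX socket and read the reply with the same framing; that part is left out because the socket path depends on the system.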
There are good reasons for this. So, yeah. The idea is also that from these JSON user records that are queryable via a Varlink interface, we can convert back and forth between them and the classic ones. In one direction, of course, it's lossy
because the JSON stuff can have any kind of metadata while struct passwd and struct group cannot. What does this specifically look like? This is an example. Now, if I had a laser pointer, I actually could show you. I think some of this is pretty much self-explanatory, right? Like username, you guessed it, is the username. Disposition is just, yeah, ignore that,
but nice level, for example, configures that when this user called Groby logs in, all his processes get nice level five, right? It's pretty self-explanatory. Member of means, yeah, this user's supposed to be a member of the group wheel. I think, yeah. What's interesting here is the binding thing. Binding is supposed to bind a specific user record
onto a specific system. The reason why this is a sub-object and not part of the main thing is simply because depending on specific systems, resource management is probably gonna be different, right? Like if I have a user record for Lennart on my desktop, my super beefy desktop, and I wanna put a size limit on it,
it should be specific to that desktop because it has so much more memory than my laptop might have, right? So the binding concept here, basically, if you see that hex string, that's actually the /etc/machine-id. It basically just says, yeah, the following properties apply only if this is bound to a specific system. And then, yeah, GID is the GID, UID is the UID.
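As a rough sketch of the binding idea, the snippet below keeps the machine-specific properties in a sub-object keyed by machine ID and merges them in only on the matching system. The field names (`userName`, `niceLevel`, `memberOf`, `binding`, `storage`, `imagePath`), the machine ID, and the numeric IDs are illustrative assumptions, not a copy of the slide.

```python
def effective_record(record, machine_id):
    """Return the record with the binding for this machine merged in, if any."""
    merged = {key: value for key, value in record.items() if key != "binding"}
    # Properties under "binding" apply only on the machine whose ID matches.
    merged.update(record.get("binding", {}).get(machine_id, {}))
    return merged


THIS_MACHINE = "15e19cf24e004b949ddaac60c74aa165"  # made-up machine ID

record = {
    "userName": "foobar",
    "niceLevel": 5,
    "memberOf": ["wheel"],
    "binding": {
        THIS_MACHINE: {
            "uid": 60232,
            "gid": 60232,
            "storage": "luks",
            "imagePath": "/home/foobar.home",
        }
    },
}

local_view = effective_record(record, THIS_MACHINE)
foreign_view = effective_record(record, "0" * 32)
```

On the bound machine the merged view carries a UID, GID and storage details; on any other machine those keys are simply absent, which is the point of keeping them out of the main object.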
What else you see there is like meta information about how the actual storage is done. We'll talk about this a little bit later. This continues. Privileged is a special section that contains the privileged information. It's basically what /etc/shadow is. The reason why it's a sub-object is so that we can strip it
if somebody's not supposed to see it, right? So the idea basically is that all the information that is in /etc/shadow, traditionally, plus all the new stuff that people wanna come up with, should be in a specific sub-object so that whenever somebody who shouldn't see this information wants to see the record, we can automatically remove it.
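A minimal sketch of that stripping step, assuming the secret material lives under a `privileged` key: unprivileged callers get a copy of the record with that one sub-object removed. The field names and the placeholder hash are illustrative assumptions.

```python
import copy


def view_for(record, caller_is_privileged):
    """Return a copy of the record, dropping the privileged section if needed."""
    view = copy.deepcopy(record)
    if not caller_is_privileged:
        # Everything /etc/shadow-like sits in one sub-object, so one pop suffices.
        view.pop("privileged", None)
    return view


record = {
    "userName": "foobar",
    "memberOf": ["wheel"],
    "privileged": {"hashedPassword": ["$6$placeholder$notarealhash"]},
}

public_view = view_for(record, caller_is_privileged=False)
admin_view = view_for(record, caller_is_privileged=True)
```

Because the function works on a deep copy, the stored record is never modified, whichever view is handed out.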
Anyway, I don't wanna go into too much detail about the structure of these records. There's plenty of documentation about this. What's also interesting is you can sign these. We'll talk about this later, why that's interesting. So that was concept A. Extensible JSON user records plus an API for how to query them, kinda replacing getpwnam but also compatible.
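The compatibility direction can be sketched like this: a JSON record is projected onto the seven classic struct passwd fields, and converting back shows what got lost. The JSON field names and the fallback values are assumptions for illustration; only the passwd field names come from the classic C structure.

```python
from collections import namedtuple

# The seven fields of the classic struct passwd.
Passwd = namedtuple(
    "Passwd", "pw_name pw_passwd pw_uid pw_gid pw_gecos pw_dir pw_shell"
)


def to_passwd(record):
    """Project a JSON user record onto struct passwd (lossy)."""
    name = record["userName"]
    return Passwd(
        pw_name=name,
        pw_passwd="x",  # the hash never appears in the passwd entry
        pw_uid=record["uid"],
        pw_gid=record["gid"],
        pw_gecos=record.get("realName", ""),
        pw_dir=record.get("homeDirectory", "/home/" + name),
        pw_shell=record.get("shell", "/bin/sh"),
    )


def from_passwd(pw):
    """Build a JSON user record from struct passwd (nothing extra to recover)."""
    return {
        "userName": pw.pw_name,
        "uid": pw.pw_uid,
        "gid": pw.pw_gid,
        "realName": pw.pw_gecos,
        "homeDirectory": pw.pw_dir,
        "shell": pw.pw_shell,
    }


record = {"userName": "foobar", "uid": 60232, "gid": 60232,
          "niceLevel": 5, "memberOf": ["wheel"]}
round_trip = from_passwd(to_passwd(record))
lost = sorted(set(record) - set(round_trip))
```

After the round trip, `niceLevel` and `memberOf` are gone, which is the lossy direction described above.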
Concept B is I want home directories to be encrypted LUKS loopback files in /home, right? So that if I have a user called foobar, traditionally there was /home/foobar and it was a plain directory. I want that there now is /home/foobar.home
and it's a file and it's actually a LUKS loopback image, and then when the user logs in, what do we do? We set it up via the LUKS stuff as a loopback file and then we finally mount it after validating that everything's okay. These extensible user records are supposed to be stored inside of that image
inside of a file, .identity. There's nothing magic about that. This stuff is supposed to be managed by a tiny daemon, and it's really small because it doesn't do much, that is called systemd-homed.service. What does that thing do? Okay, I have not much time left.
There's so much more awesome stuff in this but let me finish this quickly. So what systemd-homed actually does is it's just the thing that sets up these home directories as you log in, right? And it also provides NSS interfaces so that the old APIs all work, right? So that you can do getent on the shell for example
to get the user database and then automatically find all these user records and everything works. This concept B, like this encrypted LUKS stuff, relies on concept A, but concept A you can actually use without it, right? So if you just care about the extensible user records that's totally fine and you should by the way
because even if you have LDAP and these more classic user databases you might want to be able to hook into systemd resource management, because all this new stuff is hooked into systemd's user management. Like for example there's pam_systemd that has been around for a while, but what it does actually is it takes the JSON record,
pulls out all the resource control that makes sense, like the nice level as you have already seen, the classic resource limits, but also environment variables and that kind of stuff, and sets it for all the persons that log in. There is also support in systemd-logind.service which kinda does the same thing when you log in, but it applies it to the cgroup stuff, right? Like so that it applies to the user as a whole, right?
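As a sketch of what "pulling the resource control out of the record" could look like, the function below collects the per-session settings from a JSON record without applying them. The field names `niceLevel` and `environment`, and the `KEY=value` list format, are assumptions for illustration; this is not the actual pam_systemd logic.

```python
def session_settings(record):
    """Collect settings a login hook could apply for this user's session."""
    # Environment entries are assumed to be "KEY=value" strings.
    environment = dict(
        entry.split("=", 1) for entry in record.get("environment", [])
    )
    return {
        "nice": record.get("niceLevel", 0),
        "environment": environment,
    }


record = {
    "userName": "foobar",
    "niceLevel": 5,
    "environment": ["EDITOR=vim", "PAGER=less -R"],
}
settings = session_settings(record)
```

A login hook would then apply these values to the new session; a record that carries none of these fields simply yields the defaults.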
So if you are into LDAP and provide these extensible user records, just by doing that you get this integration and get all the fancy resource management you ever dreamt of. Yeah, so I think this, I kinda already explained this.
Yeah, I care about encrypted home directories in loopback files. I care about encrypted home directories on block devices, like this USB stick for example, so that we have something truly migratable there. While this is what I tell people I'm going for, right, like the encrypted LUKS stuff, actually the systemd-homed component does not just support that as a backend, it actually has four more backends, like classic plain directories for compatibility,
Btrfs subvolumes, fscrypt, CIFS and LUKS. But again, LUKS, that backend is the one I wanna push people towards because it's the only thing that's actually fully secure, because not just the payload of the user but also the metadata is, yeah, encrypted
and it's fully featured and all these kind of things, and industry standard and stuff like that. I mean, this is after all an exercise in not coming up with too much new stuff but just taking what's already there, and that is LUKS for example, and just sticking it together in a nicer way. But anyway, my time's over. I'm fully aware that I totally did not finish
all the beautiful slides I have there. These slides have been online for a while but I probably should take some questions at least. Yeah, somebody has a question there. Can you, like there's a mic. Thanks. So in the user, the JSON structure,
there was a list of groups for the user, so that needs to be somehow verified by the system, right? So how do you approach that? Because a user can change whatever groups he is in, right, and then he will automatically gain access on whatever systems he logs in to, right? I mean, the explanation is in this thing, in the signature.
So basically the user records are supposed to be changeable only by privileged people, like the administrator or whoever manages the user record, and then you're supposed to sign them, and each system will only accept user records that are signed by the right people, where the signature can be verified and the key is known for this, right?
And this is actually, it's not an option. Like at least if you manage stuff with systemd-homed, it will insist that everything is signed. Like we live in a world where cryptography is everything, right? So yeah, this stuff will refuse to let you log in with a random USB stick. It will only allow you to log in with random USB sticks that have a user record on them that is signed
by some key that is recognized by your specific laptop and you can recognize as many as you want, right? So you can change that file as much as you want but if you do, you just lock yourself out because you can't sign it because you don't know the private key being used.
Yeah, so the question was whether you need privileges to change the SSH key of your own user account. Yes, you do. It's the same way as it always was for changing your password. That also requires privileges. This is hidden from you usually by setuid and it's the same way we can hide this
through PolicyKit. But yes, I mean there's no way that code paths that run under your user identity can make these changes, and I'm pretty sure that's how it should be. If there is a desire to allow unprivileged users to change some parts of the user records, it always needs to go through a privileged component, that's PolicyKit or whatever else, to allow this.
So if I got you correctly, you said before that you need to log in first in your account before being able to do an SSH login to that. Let's say I'm a university student and I have an account on the university lab and I'm at home and I want to access my account on the university lab computer
but of course the system was rebooted overnight, so I cannot travel 200 kilometers. How would we handle that use case? Would it be possible for instance to store the SSH key as a LUKS key and to change OpenSSH to unlock my home? I mean that's not, I mean SSH doesn't deal in passwords. Like it gives you a challenge and you're supposed to respond to that
and that kind of stuff means that the only way you can do things is if the other side you're authenticating against is an active program, right? That's not how LUKS works. With LUKS, you authenticate against a disk image, right? So this is inherently incompatible. You cannot use, like at least I don't, I mean I don't see how you could use SSH authentication
to decrypt a home directory. That is semantically incompatible. So my answer to this is if you really want that this system can come up on its own and decrypt your data, don't use this stuff. This is about security. This is not about giving up any kind of access to your data.
This is supposed to protect your data from the system as much as possible. So that when you're not logged in, it's really cut off, like mathematically cut off. And regarding SSH, what about having like SSH in reverse
as your, so that your local home folder that you already decrypted and opened is used over there, and then the credentials inside of that are somehow unlocking the homes over there, you know what I mean? Yeah, I mean this of course always works, right?
Like I mean, yeah, if you, so, you know for the YubiKey support, right? Like I'm not sure if you know YubiKey stuff, like there's this generally accepted API for accessing security tokens, it's called PKCS#11. And there is a project inside of Red Hat, and that's actually I think included in Fedora and everything, which allows you to propagate PKCS#11 devices
across the network through SSH for example, right? And with that kind of stuff you can make all the security, like the encryption keys and all that stuff that is available locally, also available on a remote system to do that kind of propagation. I was interested in this because I wanted to make it possible, like I have a YubiKey in this device
and I would ideally like to just use that to authenticate through SSH onto another system that runs systemd-homed, so that, yeah, the crypto key is sanely encrypted through that. So there are certainly things possible there, right? Through this PKCS#11 forwarding. I'm not sure how realistic that is in the short term,
but I'm kind of, yeah, let's see how that works. But again, I mean, the other way, right, that we were just talking about, like logging in over SSH into my laptop, I'm not so sure that's that much of a common case because I mean, a laptop is a client, it's not a server, right? Like you usually log in from it into other systems, you usually don't log so much into your laptop
from the outside, it's like philosophically, yeah. Sorry, having, oh, on the other system, I mean, sure, yeah, I mean, there's the solutions.
Like I mean, one thing that is actually kind of nice is you could even share these LUKS files via NBD, and then they show up on other systems and then you can log in there and it will just work, because yeah, a USB stick is not very different from NBD or something like that. But that's out of focus for everything. I mean, my main focus really is, I want my own laptop to be finally secure so that I can suspend it.
I mean, this is kinda like the, this is a summary after all, right? I want these problems to be solved finally because we never could solve them and I want to solve for my own machine because I care about the security and these kind of things, right?
So if you remove the LUKS encryption key on suspend, does that mean that you have to log out? So are all processes terminated? And then follow up question, if the processes are not terminated, there might still be some private data in the memory in running applications, right? Sure. Okay.
So yeah, I mean, there's data and there's data. There's crypto keys and there's your private data. Yes, if like, I mean, if I would, if I steal your laptop and try to extract the memory, the first thing I'm looking for is having the crypto keys. That's the one we definitely should kick out. People use encrypted swap and these kind of things. I don't know, maybe there's something more to solve there
so that as much as possible, the kernel flushes out all the memory onto disk into swap or something and then it's encrypted and stuff like that. But that's outside of my area of expertise and what I can solve. I just want to solve like the really obvious stuff. Yeah, but anyway, when the system goes down, we will suspend the LUKS device
and that basically means that any process accessing it at the time will hang while it does that until we resume it again. This sounds ugly and is ugly, but also it's not much of a problem because we go down anyway into system suspend. There's not gonna be any process that might hang
because the CPU is halted anyway. So if I want to have my home directory migratable but also as a backup, let's say I back this up to a separate disk, then I also need to back up the signing key of the signature file, the identity file,
the key that was used to sign this so I can go back into that system if I have to restore it. I mean, this stuff, it doesn't like, ultimately it's just a LUKS file system. You can use the classic LUKS tools like cryptsetup to look into this and do whatever you want. And LUKS of course has no understanding of user records or signed user records.
For a backup, it really doesn't matter that you retain the original signing key there. The signing key controls who can log in. It does not control who can access the data if he knows the password. So for backup, it's totally sufficient to just take that one file and back that up.
By the way, this is something people should never forget. Because this means that the user record and the home directory all become one file, you can just take that file from one laptop to another laptop and you don't need to do anything at all anymore. It just pops up there and it's there. And for the backup case, I think it would even make a ton of sense that if the backing file system of /home is something like XFS or Btrfs or something,
you can do a reflink copy every day or something of that one file and there you go. You have a perfect time machine kind of stuff. I mean, this opens up really nice possibilities as soon as we unified everything into one file because yeah, in Unix most things are focused on one file. Anyway, I think my time's mostly over
but maybe one more question there? Okay, how do you plan to deal with different distributions and versions of distributions because we have different versions of software and even if you use common home directory,
it would not be a working configuration. So, I mean, systemd-homed is a part of systemd, that's kind of the goal, right? Unless people patch around this heavily in the distributions, hopefully it's kind of gonna be the same thing in all the distributions.
All this stuff that I'm explaining here is not supposed to replace or modify the existing users that you might have, right? This is an add-on, right? So that if you have a home directory like this, great. It will show up in your user database and everything's great and because it's in this version, it's gonna be compatible. But if you're talking about the software that is inside the home directory,
people kind of solved this because NFS shared home directories always worked like this already, right? And it's a mess and I don't care so much. It's a problem I'm not gonna solve for you, right? I'm not gonna make sure that every version of Emacs packaged by any version of any distribution can read the same configuration files.
That's something other people have to solve. But I at least want to make it possible that the home directories are truly migratable. And yeah, how migratable they then are between the distributions, that's not my problem to solve. At the lower levels, they will be perfectly migratable because, I mean, LUKS, if I format a LUKS volume on Fedora and then run it on SUSE, it just works.
They did not break compatibility and since this is really just LUKS, yeah. Any very last question or are we done? If you...