We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

MLS meets Matrix

00:00

Formal Metadata

Title
MLS meets Matrix
Title of Series
Number of Parts
287
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
MLS (Messaging Layer Security) is an upcoming IETF standard for messaging systems. In this talk, I will discuss some of the work in integrating MLS with Matrix. This will include: - why we would want to use MLS - making MLS work in a decentralised messaging system - how MLS concepts can fit into Matrix - what will be needed to migrate to MLS
Matrix (mathematics)Maximum length sequenceEncryptionStandard deviationSystem programmingImplementationGroup actionElement (mathematics)Network topologyYouTubeBinary fileKey (cryptography)BootingLevel (video gaming)Message passingInformation securityDifferent (Kate Ryan album)RootStandard deviationGroup actionEncryptionMessage passingRaw image formatConfidence intervalMatrix (mathematics)ImplementationInformationNetwork topologyNumberSet (mathematics)View (database)MathematicsElement (mathematics)Flow separationPhysical systemServer (computing)Binary treeClient (computing)Point (geometry)Public-key cryptographyBijectionLevel (video gaming)Link (knot theory)VideoconferencingType theoryWeb pageInterprozesskommunikationUniform resource locatorKey (cryptography)Ratsche <Physik>Doubling the cubeTraffic reportingCASE <Informatik>State of matterUniform boundedness principleCommitment schemeReliefMultiplication signOptical disc driveArmMetric systemArithmetic progressionDiagramEngineering drawingComputer animation
Key (cryptography)Network topologyMessage passingEncryptionMaximum length sequenceLinear mapMatrix (mathematics)Event horizonSoftware bugElement (mathematics)Operations researchTrailInformationObject (grammar)Codierung <Programmierung>Binary fileIdentity managementKey (cryptography)Server (computing)Multiplication signPublic-key cryptographyLinearizationMathematicsEvent horizonDifferent (Kate Ryan album)Element (mathematics)Connected spaceMatrix (mathematics)Network topologyInterrupt <Informatik>Codierung <Programmierung>Fault-tolerant systemRaw image formatLink (knot theory)InformationMessage passingOperator (mathematics)EncryptionLatent heatRootSequenceResultantTrailHeegaard splittingNormal (geometry)Water vaporSmartphoneNeuroinformatikHome pageService (economics)CASE <Informatik>BlogArmCoefficient of determinationObject (grammar)MereologyDemosceneStreaming mediaExecution unitComputer animation
Key (cryptography)Maximum length sequenceBinary fileObject (grammar)Message passingCodierung <Programmierung>Identity managementNetwork topologyMatrix (mathematics)Event horizonEncryptionMereologyCloud computingLimit (category theory)Key (cryptography)Optical disc driveArmMultiplication signFlow separationRevision controlMechanism designMatrix (mathematics)RootExecution unitMetric systemPublic-key cryptographyType theoryCodeEvent horizonMessage passingFormal verificationPoint (geometry)Order (biology)Computer configurationWebsiteCASE <Informatik>Commitment schemeLoginSign (mathematics)QuicksortIdentity managementCombinational logicMatching (graph theory)GodMereologyDifferent (Kate Ryan album)Raw image formatDemo (music)AlgorithmNetwork topologySemantics (computer science)EncryptionFlagField (computer science)IdentifiabilityMathematicsCiphertextClient (computing)Computer animation
Key (cryptography)Maximum length sequenceBackupInformationRevision controlMatrix (mathematics)Information securityMereologyImplementationBackupFigurate numberInformation securityTouch typingComputational complexity theoryMatrix (mathematics)Workstation <Musikinstrument>Chromosomal crossoverCodeSoftware testingRevision controlLattice (order)Network topologyKey (cryptography)EncryptionQuicksortMultiplicationArmInformationMathematics1 (number)Public-key cryptographyMessage passingClient (computing)AlgorithmRing (mathematics)Similarity (geometry)Multiplication signSymmetric matrixMereologyIntegrated development environmentImplementationSingle-precision floating-point formatPerformance appraisalSymmetric-key algorithmQuadrilateralWeb pageGoodness of fitComputer animation
Standard deviationMatrix (mathematics)Coefficient of determinationComputer animationMeeting/Interview
Revision controlPlanning2 (number)Matrix (mathematics)Right angleEncryptionMeeting/Interview
EncryptionBit rateMessage passingMeeting/Interview
Computer animation
Transcript: English(auto-generated)
Hi, I'm Hubert Chatty, and I'm going to be talking today about MOS and how it can be used in matrix.
So in this talk, I'm first going to describe what MOS is, why it's better than Ohm and Megohm, and how it works. And then I'm going to be talking about how we can use MOS in matrix, what challenges there are. And then I'm going to talk about how
MOS interacts with other things in end-to-end encryption in matrix. So what is MOS? MOS stands for Messaging Layer Security. It's an upcoming ITF standard, so similar to how TLS,
or Transport Layer Security, is a standard for doing client server encryption. MOS is a standard for end-to-end encryption in messaging systems. And it's designed for group messaging rather than one-to-one messaging.
So in matrix, we use Ohm and Megohm. And Ohm is a one-to-one messaging system based on the double ratchet system introduced by signal. And the way we use it in groups is that basically, or the way we use it for groups
is basically we create a whole bunch of one-to-one channels. So by using something that's designed for groups, we hopefully get better performance in large groups. So you may have noticed in matrix, right now if you encrypt a message in a very large room,
then sometimes it takes quite a long time, possibly several minutes. Another thing about MOS is it knows about group membership. So you know who is in the group
when a message was sent. So MOS has had many different contributors. And it's been analyzed by several people. And it's been the subject of academic research.
So it gives some confidence in the security around MOS. There are also multiple implementations. Of course, MOS hasn't been finalized yet. So all those implementations are in progress. And some of them are based on different drafts of MOS.
And not all of them have been tested for interoperability yet. But the goal is that eventually there should be several different implementations that people can choose to use. And they should all be interoperable with each other.
So I'm going to give a very high level overview of how MOS works. If you want some more detail, you can watch this video of a talk that I did for Element last year. The link to the video is on the FOSDEM page.
So I don't have to type the URL in. But it's about a 45 minute talk. It goes into some more detail. It's still a somewhat high level overview. But it should touch on most of the points to give you a better understanding of how MOS works.
But for today, I'm just going to give a very high level overview. So basic idea is that there is a binary tree associated with each MOS group. And the group member, in this case, I'm talking about each device that's in the room,
will have a leaf node associated with the device. And each leaf node will have a public key stored in that leaf. Of course, the device itself will know the private key
for that public key. And each internal node will have a public key as well, or it will be marked as blank. And the idea is that each descendant of that internal node will know the private key for that public key. Or they'll be marked as unmerged.
And then members can update their own key as well as the keys for the nodes between their leaf and the root. And the root node is never marked as blank. So it always has a private key associated with it.
So here is a binary tree with eight members. So in this case, Alice here has her own personal private key. And she will also know the private key for these three nodes.
Bob will have his own private key as well. And he will know the private keys for these three nodes as well. And Carol will know the private key for these three nodes, and so on. So the tree gets updated whenever there is any group membership changes.
So whenever someone joins or leaves, or if someone just wants to update their key. When someone is added, then what happens is that the person who adds the new number
will fetch what's called the user's init key, which is just a set of public keys that the user publishes. And then when they're added to a tree, someone will fetch one of those keys and use that as that member's leaf node.
And then they'll be sent a welcome message. Welcome message basically just includes all the information needed to generate their view of the MOS tree.
And it's encrypted with their public key. And then commits message is sent to the room. The commit message is just a message that includes all the information for all
of the other current members of the room to apply the same changes to the tree so that they end up with the same tree that the person who modified the tree ended up with so that everyone has the same view of the room.
And this is encrypted, and it can be decrypted by all of the current members, but not by any members who are removed. And then when you change the tree, then you create what's known in MOS as an epoch. And then messages that are sent to that room
are encrypted using that epoch. So yeah, the private key at the root, along with information from the previous epoch, is used to derive several other secrets.
So it's used to derive the encryption keys, which are used to do the actual encryption of the room messages. And each sender gets their own encryption key. And it's also used to derive the initialization secret, which
is used to derive the secrets for future epochs. And then there are several other secrets that are derived, which I won't get into in this talk because they're for specific purposes.
So the way it works is you have the init secret from your current epoch. Someone modifies the tree. You get the private key from the root node. Combine that with the init secret to derive a joiner secret.
The joiner secret is used to derive the next init secret, encryption secret, and the other secrets. The init secret is then used with the root key of the next epoch to derive the next joiner secret. And the joiner secret is used to derive
the next init secret, encryption secret, and so forth. So there is a problem with MOS and matrix in that MOS assumes that we have a linear sequence of epochs. So if we go back here, this is all a very linear thing.
But in matrix, we don't have a linear sequence of events because people are on different home servers. And there could be latency between the home servers. There could be net splits. And so people on different home servers could be modifying the MOS tree at the same time.
So for example, we have a room with Alice, Bob, Carol, and Dan. And this is what the MOS tree for users looks like. And then Alice and Bob are on one home server. And Carol and Dan are on another home server.
And then for some reason, whatever reason, there's an interruption to the connection between the two home servers. So Alice and Bob can keep talking to each other. And Carol and Dan can keep talking to each other. But they won't see each other's messages. Alice and Bob won't see Carol and Dan's messages
until the connection between the home servers is restored. But they can also make their own modifications to the MOS tree. So for example, Alice invites Erin to the room and adds her to the MOS tree. And then Alice decides to leave the room.
So at the end of this, this is what the MOS tree will look like. So Erin gets added as a new node. And then Alice leaves, so her node gets blanked. On the other side, Carol adds Frank to the room.
Dan updates his key. And then Frank adds Grace. So this is the result at the end of that. Frank and Grace are new nodes, and Dan has a new key. So then when the connection is restored,
then we have two different trees. And we need to figure out what to do here. So what I've been doing for the last little while is figuring out how to make MOS work in a more decentralized situation. And what I've ended up doing is
I've added a layer on top of MOS. And I allow epochs to fork, but we keep track of who created them. So in MOS, an epoch is just identified by a counter, basically. So we have epoch 1, epoch 2, epoch 3, et cetera.
And so obviously, if you have multiple people modifying the tree, then, for example, you'll have multiple epoch 2s, which will get confusing. But if you keep track of who created them, then you can have, say, Bob's epoch 2 or Carol's epoch 2.
Then you know exactly which tree you're talking about. So now, if you have multiple epochs and you want to merge them together, then you pick one of them to be what I call the base epoch.
And then first of all, you make it agree with the matrix room membership. So if people have joined or left the room, then you add nodes or you've removed nodes. And then you merge in the changes from the other epochs. And you do all of this using just normal MOS operations.
So you don't need to make any changes to how MOS works. All you need is some extra bookkeeping information. So I have a more detailed explanation of how that works in a document. And I have it linked from the FOSDEM page,
again, so you don't need to write down that whole link. So if we go back to our situation with Alice and Bob, we have these two trees. We want to merge them together. So we'll pick one of them. In this case, we'll pick this tree to be our base.
And then we merge in the changes from this tree. And the changes are that Aaron has been added and Alice has left. So in MOS, we do removes first and then adds. So we remove Alice, leaving a blank node,
and then we add Aaron. And since there is a blank node left by Alice, we put Aaron in that blank node. And then so this will be the resulting tree. And for example, if Bob had also updated his key, then we would also update his key in the merged tree.
So now we have that. How do we actually do MOS over matrix? So MOS defines a binary encoding for messages. So all we have to do is base64 encode that and wrap it in a JSON object, just like we currently
do with OM and MEG-OM. MOS, since it knows about different users, it has an identity. So for matrix, we could just use some sort of combination of the user ID and device ID.
So for example, we could just concatenate them with some sort of separator. It doesn't really matter exactly what we do as long as it fits within the limits of MOS. Next, we have init keys, which, as you will recall, are basically the initial public keys
used in the MOS trees when someone gets added. And basically, they have the same semantics as OM one-time keys. So they're just keys that you publish a bunch of, and then someone claims them.
And when a key gets claimed, then no one else can claim that key in the future, which is basically how one-time keys work with OM. So we can just reuse that situation, that mechanism
in matrix. MOS even proposes the equivalent to OM fallback keys, where it says there is a situation where a user could run out of init keys because they've all been claimed.
So it says that you could have a special init key that gets reused when it's claimed rather than disappearing, which is just the same as how OM fallback keys work. So welcome messages can just be sent as a normal to device
event. In my MOS demo that I've written, I use a special event type to flag it as a welcome message. And then I just base64 encode the welcome message in that event. Another way to do it would be to just use a normal m.room.encrypted event,
and then use a different algorithm field to indicate that it's a welcome message. And then for messages sent to rooms, we can just do the normal thing of sending a m.room.encrypted message.
Of course, the algorithm would be whatever identifier we use for MOS. The ciphertext would just be the normal MOS encrypted message. And then there's the extra bookkeeping that we need to do. So we need to identify who was the creator for the epoch
that the message was encrypted with. And then if it's a commit message, then we need to indicate which other epochs this commit resolves.
So there are still a few open questions. So MOS assumes that you see all of the commit messages in order, and you can decrypt them in order. But in matrix, it just gives you the most recent messages. And if you want to see older things, then you can backfill.
So if you're in matrix, you leave for a while, go to sleep, for example, and then wake up the next morning and open your matrix client again. Then your client will receive the latest few messages from each room.
And then if you want to see older messages, then you scroll up and it will load the older messages. But that won't work with MOS commit messages because in order to decrypt the recent messages, you need to decrypt all the previous commit messages.
So we would need some sort of mechanism to fetch the old commits. So one way to do that would be to just identify the event ID that contains the commit for the epoch that each message is from so that you can just
fetch that event. And if you need another epoch to decrypt that event, then you fetch the previous one and so forth. And that could work, but it might be too slow. We'll have to see. If it is too slow, then we might need
to add some sort of special API. And then there are some parts of MOS that haven't been investigated yet. I'm not really expecting much problems with some of these. Some of these might cause problems
in the decentralized situation, but they could be things that we just disallow use of in matrix.
So we'll have to see. So what about other parts of end-to-end encryption? So verification, cross-signing, device keys. So MOS can use ED25519 keys, just like OM. So we could just reuse the existing device keys.
However, that would mean that we need to extract the private key from libOM, or we would need to make libOM understand MOS, which might not be so nice. Another option would be to just add another ED25519 key.
And then in that case, we would need a different algorithm name for that new key, so maybe something like ED25519 MOS. But that won't be a huge issue. Verification would mostly be the same.
So device keys would be signed and cross-signed in the same way. And the verification methods were designed to be flexible in the type of keys that they could work with. So that would be fine. MOS can also use ED448 keys, which are sort of a stronger version of ED25519 keys.
So we might also want to switch to that. Or we might not. We might want to just stick with ED25519 keys. If we do switch, then it probably makes sense to also switch or cross-sign in keys,
because it doesn't make sense to sign a strong key with a weak key. So that might cause some changes to happen. But again, it won't be a huge issue.
There will definitely be some changes to key backup that need to be made. So we need to determine what's the minimal information that we need to keep about each epoch. And I mean minimal in sort of a security sense.
We want to make sure that if someone gains access to the key backup, that they can't do more than just read messages.
So we need to figure out what information needs to be kept. We need to figure out, for example, do we want to back up all the private keys that are associated with the internal nodes, or the leaf nodes in the tree, or just the shared secrets that were derived from the joiner secret.
We also need to change backups so that it can handle multiple algorithms and multiple versions. So right now, with key backup, you can just have a single backup, which is tied to a single encryption algorithm.
But if we're migrating to MLS, then you'll probably have both mega-ohm rooms and MLS rooms, and you want to be able to back up both keys. So we'll need some API changes for that.
And then in key backup as well, there is a thing to determine which is the better key. So if you have, for example, multiple devices backing up the same key at the same time, or if you have an existing key in the backup and you want to back up
another version of it, then there are some situations where you want to replace the existing backup with the new backup. So we need to figure out whether any of that still applies to MLS.
So Quad S shouldn't really be affected directly by MLS because it's just a symmetric encryption. It doesn't really touch any of the rooms stuff.
So in conclusion of MLS, or at least a modified version of it, works in Matrix. I've implemented a demonstration code for it, which is linked in the FOSDEM page. It works over Matrix.
It still needs to be reviewed for security, and it needs to be updated to the latest draft of MLS. And of course, eventually, when MLS has finalized the standard. There's still some details to work on for getting MLS to work in Matrix, but most of the hard parts
are done. And we also need a good MLS implementation to use. The MLS implementation that I've done wasn't written to be secure. It was written to be a test environment. And it's written in TypeScript, so it's probably not
of use to C++ clients. So we'll need to probably evaluate other implementations of MLS and see which ones can be used in the decentralized situation.
But yeah, that will probably have to wait for more implementations to show up and for the interoperability testing. So thanks for watching my talk.
We have quite a few questions here and some very interesting questions. We can really see that this is Matrix Devroom
and that people care about standards because Matrix is all about you need to respect the spec. And the most important question is about MLS or DMLS rather because DMLS is decentralized MLS. So the standard that is being worked on at the IETF is MLS.
Is DMLS still going to be a standard? Yeah, so DMLS is just the name I use for my work on it. I don't really have any plans per se
for getting it into the standard, although it would be nice. And if I can get it in somehow, it probably won't be in the first version that's being worked on right now, but maybe in a future version possibly. All right, oh, we only have 30 seconds left.
Would MLS, if DMLS was used, for example, in Signal, could we have end-to-end encryption with Signal even if Signal is not speaking Matrix? Or would we have end-to-bridge encryption? We would still need to have something in between that would translate the messages because, well, I
don't know how Signal works under the hood, but I believe I would assume that it