Post Quantum Cryptography in Voice/Video over IP

Formal Metadata

Title
Post Quantum Cryptography in Voice/Video over IP
Alternative Title
Secure voice/video over IP communications today and tomorrow thanks to post-quantum encryption !
Title of Series
Number of Parts
542
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Transcript: English (auto-generated)
I've been contributing to the Linphone project for the past 10 years, more or less, and I'm going to talk about the introduction of post-quantum cryptography in our voice over IP softphone. So, quickly, the agenda: some context, then we'll dive into the ZRTP protocol,
and then how we had to modify it to use post-quantum cryptography, and then a few words about hybrid post-quantum and classical key exchange, and some conclusions. So, first, some context, with some advertising for Linphone first.
It's a project which has been around for more than 20 years now. It's available on lots of platforms. The idea is that we have a common library and then, on top of that, different applications for different platforms. It tries to use SIP standards as much as possible, and everything standardized,
RFCs and so on, for audio, video, and instant messaging. We also provide secure group messaging. It's based on a derivative of the Signal protocol that we presented three years ago. We also provide a SIP proxy, which is called Flexisip, also open source. Everything is open source. And I encourage you to use our free SIP service, which is sip.linphone.org.
So, I don't know if you're familiar with VoIP, but basically you have two streams of data. The first stream is the signaling path, which connects the endpoints together.
And then you have the media stream, which actually carries the data, video and audio. And this one, we have to encrypt. So, how it works: there is an RFC for that, and the protocol is called SRTP. And SRTP is symmetric encryption. So, so far, we are not very concerned by quantum computers.
The main problem with that is that it requires external key management. So, we have to exchange our symmetric keys. For that, we have three choices. The historical one is called SDES. With this one, the keys are transmitted in the signaling path. Which, if the signaling path is protected, which is normally the case with TLS, is okay.
The only weakness is that the proxy gets access to the symmetric keys. So, we are not actually end-to-end encrypted. Basically, people running the service could decrypt your media stream. So, there is another one, which also has an RFC, which is called DTLS-SRTP.
Basically, with this one, you perform a TLS handshake on the media stream. Actually a DTLS handshake, because it's over UDP. And this one works well, but you have to deploy a PKI. And you have to manage certificates for all of your clients and everything.
So, it's a bit heavy, and also, you still have to trust someone. You trust certificates, sure, but still. And then there is another one that we favor. Well, all three are available in Linphone, but the last one, which is called ZRTP, is the one we'll focus on today.
With this one, on the media path, you perform the ZRTP protocol, which is based on Diffie-Hellman, using either elliptic curve or plain Diffie-Hellman. This one requires no third party, which is good. The only small thing is that you have to do some kind of spy thing: you have to tell a secret code on the phone,
as you are talking with each other. For the end user it's a bit of an annoyance, but you have to do it only once in the whole call history with your other endpoint. So, we think it's acceptable for users. Obviously, one has to get involved in security, but normally, it works.
Experience tells us that people focused on security tend not to be driven away by this small drawback of the protocol. So, it's an RFC which is now more than 10 years old. It was mainly written by Phil Zimmermann, the guy behind PGP,
who has always focused on avoiding third parties. And it provides different properties. I won't explain the key continuity and so on, because that part is unchanged, and we'll focus on man-in-the-middle attack detection.
So, first, a small reminder of what Diffie-Hellman is. Basically, it's a protocol which is completely symmetric. Both parties generate a key pair, then they exchange public keys, and with their own secret key and the other side's public key, they get a shared secret. So far, so good. It's kind of easy.
The drawback is that it's obviously vulnerable, like many key exchange protocols, to man-in-the-middle attacks. A man-in-the-middle attack is basically someone putting herself in the middle and exchanging keys with both sides. So, the two sides cannot know.
Basically, Alice cannot know that Eve is sending her the key. She thinks that Bob is sending the key, and she performs the exchange, and at the end, what you get is that Alice has a shared secret with Eve, and Eve has another shared secret with Bob, but Alice is convinced that she exchanged keys with Bob, and she has no way to actually detect this.
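To make the exchange and the attack concrete, here is a minimal sketch in Python. It is an illustration only: the group parameters `P` and `G` are toy values chosen for this example (far too small for real use), and the helper names `keypair` and `shared` are hypothetical, not part of ZRTP or Linphone.

```python
import secrets

# Toy group parameters (assumption for illustration, NOT secure).
P = 2**127 - 1   # a Mersenne prime
G = 3

def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    priv = secrets.randbelow(P - 3) + 2   # private exponent in 2..P-2
    return priv, pow(G, priv, P)

def shared(priv, peer_pub):
    """Combine our private key with the peer's public key."""
    return pow(peer_pub, priv, P)

# Honest exchange: both sides end up with the same value.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()
honest_alice = shared(a_priv, b_pub)
honest_bob = shared(b_priv, a_pub)

# Man in the middle: Eve replaces the public keys in both directions.
e1_priv, e1_pub = keypair()   # key pair Eve shows to Alice
e2_priv, e2_pub = keypair()   # key pair Eve shows to Bob
alice_secret = shared(a_priv, e1_pub)   # Alice believes this is shared with Bob
eve_alice = shared(e1_priv, a_pub)      # Eve shares it with Alice
bob_secret = shared(b_priv, e2_pub)     # Bob believes this is shared with Alice
eve_bob = shared(e2_priv, b_pub)        # Eve shares it with Bob
```

In the attacked run, Alice and Bob hold two different secrets, each known to Eve, and nothing in the exchange itself reveals that to them.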
Well, she has actually some ways. Okay. Yeah, sorry. Where are we?
So, the ZRTP handshake starts with a first phase of discovery. What happens is that both endpoints exchange their capabilities, their choice of preferred algorithms, things like this, and then start the actual ZRTP handshake. So, first, you have one Commit packet. I won't go into detail for now.
And then you actually perform the Diffie-Hellman exchange. So, Alice sends her key, Bob sends his, and from this they both compute the shared secret. Adding the transcript of the whole communication, they generate s0, which is the base secret,
the output of the ZRTP handshake. From s0, they derive the SRTP keys, which is what we are trying to do here, and they also derive something called the SAS, the short authentication string, which will be compared vocally over the phone, because Alice and Bob are actually talking to each other.
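The derivation chain described here (shared secret plus transcript gives s0, and s0 gives the SRTP keys and the SAS) can be sketched as follows. This is a simplified illustration, not the exact key derivation of the ZRTP RFC: the function names, the labels, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import hmac

def derive_s0(shared_secret: bytes, transcript: list[bytes]) -> bytes:
    """Bind the key-exchange result to every packet seen during the handshake."""
    h = hashlib.sha256()
    h.update(shared_secret)
    for packet in transcript:
        # Length prefix keeps packet boundaries unambiguous.
        h.update(len(packet).to_bytes(4, "big"))
        h.update(packet)
    return h.digest()

def derive(s0: bytes, label: str, length: int = 32) -> bytes:
    """Derive an independent value from s0 for a given purpose (hypothetical labels)."""
    return hmac.new(s0, label.encode(), hashlib.sha256).digest()[:length]

# Both endpoints feed in the same secret and the same packets (placeholder bytes here).
transcript = [b"hello-alice", b"hello-bob", b"commit", b"dhpart1", b"dhpart2"]
dh_result = b"\x01" * 32
s0_alice = derive_s0(dh_result, transcript)
s0_bob = derive_s0(dh_result, transcript)

srtp_key = derive(s0_alice, "srtp master key (hypothetical label)", 16)
sas_hash = derive(s0_alice, "sas (hypothetical label)")
```

Because the transcript is hashed in, an attacker who alters any handshake packet changes s0 on one side, and therefore the keys and the SAS derived from it.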
So, the end of the protocol is just some updates and writing to the cache for the key continuity mechanism; it's not really interesting now. After that, the SRTP streams actually start, and they can talk. And once they start to talk, once in the call history, they will do this vocal SAS comparison.
What is the SAS comparison for? Basically, if they want to detect a man-in-the-middle attack, Alice has to ensure that she is using the key that Bob sent, and Bob wants to know that the key sent by Alice is the one he actually got. So, as they are talking, they could basically read their own keys to each other.
But a key is a few hundred bytes, so it's a bit long to read a few hundred bytes of hexadecimal string over the phone. No one would do that. So, what we do instead is derive this short authentication string, which is only four characters and is actually derived from 20 bits.
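As a rough illustration of what "four characters from 20 bits" means, the sketch below truncates a hash of s0 to 20 bits and renders them as four symbols of 5 bits each. The alphabet, the `b"SAS"` prefix, and the function names are assumptions for this example and are not taken from the ZRTP specification. The second function estimates how likely it is that at least one of a number of random candidates lands on one given 20-bit value.

```python
import hashlib

# Any 32-symbol alphabet works for the illustration (assumption).
ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"

def sas_from_s0(s0: bytes) -> str:
    """Render the top 20 bits of a hash of s0 as four 5-bit symbols."""
    digest = hashlib.sha256(b"SAS" + s0).digest()
    bits20 = int.from_bytes(digest[:3], "big") >> 4   # keep 20 of the first 24 bits
    return "".join(ALPHABET[(bits20 >> shift) & 0x1F] for shift in (15, 10, 5, 0))

def hit_probability(tries: int, bits: int = 20) -> float:
    """Chance that at least one of `tries` random values equals a fixed `bits`-bit target."""
    return 1.0 - (1.0 - 2.0 ** -bits) ** tries

sas = sas_from_s0(b"\x42" * 32)
possible_values = 2 ** 20   # 1,048,576 distinct strings
```

With these numbers, a million random candidates match a given SAS with a probability of roughly 61%, and a few million make a match close to certain, which is why a string this short needs extra protection in the protocol.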
And this SAS is also derived from s0, which is the output of the protocol. The only problem with that is that you can actually produce a SAS collision, because the SAS is very short. How would it work? At the beginning of the protocol, as soon as Alice has sent her public key to Bob,
Bob is able to compute s0, because he has his own secret key, and so he is able to compute the SAS. So, what one could do is this: Eve first performs the ZRTP exchange with Alice. She gets SAS 1, and then she receives Bob's public key.
Once she has Bob's public key, she can generate a huge set of key pairs until she finds a SAS that collides. Basically, she tries a lot of her own key pairs. The SAS is only 20 bits, so if you generate around a million keys and try all of them, you are likely to find a collision on the SAS. So, to prevent this,
Bob is forced to send a Commit packet first. In the Commit packet, we do not have a public key, but a hash of the public key. So Alice receives, for example, the hash of Bob's public key and stores it,
and then when Bob sends the public key, she just hashes Bob's public key and compares. That way she's sure that Bob did not wait to receive her public key, and could not generate millions of key pairs to find a collision on the SAS. So, this is quite effective,
and so far so good. Now we want to switch to post-quantum cryptography. The problem with post-quantum is that in the NIST call for standardization, they required all the algorithms to be key encapsulation mechanisms, and not Diffie-Hellman. A key encapsulation mechanism is a bit different,
because the two sides are not the same. In Diffie-Hellman, the two sides were doing exactly the same thing: both generating keys, exchanging public keys, and then computing secrets. Here, we have one side generating a key pair, the other side encapsulating a secret with the public key, and the first side is then able to decapsulate the secret that was encapsulated.
So, it's not symmetric, so we cannot switch directly from Diffie-Hellman to a KEM form of key exchange. Obviously, a KEM is still vulnerable to man-in-the-middle attacks, because nothing has changed. You can still put someone in the middle who performs the exchange with each side,
without them knowing. So, what we have to do is adapt ZRTP and change the actual handshake a little bit, the central part of the protocol. s0 is still derived from the exchanged secret and a transcript of the whole conversation.
I put only the Commit and the two other packets here, but you also have the Hello packets and so on. So, in the Commit packet, which used to hold only the hash of the second packet from Bob, Bob will now insert his public key. Why do we do that? So that Alice can encapsulate the secret. At this point, Alice receives the public key from Bob.
She encapsulates the secret, but at this point she's not able to compute s0, because she's missing the second packet from Bob. So, she sends back the ciphertext, the output of the encapsulation, and at this point she has the shared secret
from the key encapsulation, but she cannot compute s0. Bob now retrieves the shared secret, and he can compute s0, but he already committed to the DHPart2 packet that he has to send to Alice, so he still cannot manipulate the final secret, s0. And what is in this packet? It's just a random number that is used once, a nonce.
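The commitment idea used here, sending a hash first and revealing the committed packet later, can be sketched in a few lines. This is a generic commit-and-reveal illustration using SHA-256; the function names and the 32-byte nonce size are assumptions, not the actual ZRTP packet format.

```python
import hashlib
import hmac
import secrets

def commit(packet: bytes) -> bytes:
    """What the committing side sends first: only a hash of the packet."""
    return hashlib.sha256(packet).digest()

def verify(commitment: bytes, revealed: bytes) -> bool:
    """What the other side checks once the real packet arrives."""
    return hmac.compare_digest(commitment, commit(revealed))

# Bob picks his nonce packet up front and commits to it.
nonce_packet = secrets.token_bytes(32)
commitment = commit(nonce_packet)

# ... Alice replies with her ciphertext; only then does Bob reveal the packet ...
ok = verify(commitment, nonce_packet)

# If Bob tried to swap in a different packet after seeing Alice's reply, the check fails.
forged = verify(commitment, secrets.token_bytes(32))
```

Since the committed value is fixed before the other side's contribution is known, the committing party cannot search for a value that steers s0, and therefore the SAS, toward a chosen result.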
Okay? So now, another problem is that we don't want to rely only on a post-quantum algorithm, because we know that sometimes they get broken, like, for example, SIKE, which was broken quite late in the standardization process.
So, it might happen again in the future, or not. To protect against this possible weakness, we still want to use a mix of a post-quantum and a classical algorithm. So, we'll use both at the same time, and in order not to complicate the protocol too much,
the idea is to have one version of the protocol which is Diffie-Hellman, and another one which is a key encapsulation mechanism. And the protocol won't know whether it's using a mix or not, because probably in the future, at some point, we'll be confident in some post-quantum algorithm, and then we'll stop using the classical one,
maybe, or not. But still, the protocol should not have to be modified at that point. So, the protocol is designed to use a KEM interface without even knowing whether it is a mix of classical and post-quantum, or just post-quantum, or several post-quantum algorithms. So, first, we have to make a KEM interface from Diffie-Hellman.
This is quite a standard construction. You can directly use the Diffie-Hellman key generation to generate a key pair. Then, you can send your public key to the other side. The other side will encapsulate.
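A KEM interface built on top of Diffie-Hellman might look like the minimal sketch below: the ciphertext is simply a fresh ephemeral public key, and the shared secret is a hash of the Diffie-Hellman value together with the public values exchanged. The toy group (`P`, `G`), the function names, and the exact hash input are assumptions for illustration, not the construction used in any specific library.

```python
import hashlib
import secrets

# Toy group parameters (assumption for illustration, NOT secure).
P = 2**127 - 1
G = 3

def _to_bytes(n: int) -> bytes:
    return n.to_bytes(16, "big")   # every value below P fits in 16 bytes

def keygen():
    """KEM key generation is just Diffie-Hellman key generation."""
    sk = secrets.randbelow(P - 3) + 2
    return sk, pow(G, sk, P)

def _kdf(dh_value: int, pk_recipient: int, pk_ephemeral: int) -> bytes:
    """Hash the DH value together with the public values of the exchange."""
    data = _to_bytes(dh_value) + _to_bytes(pk_recipient) + _to_bytes(pk_ephemeral)
    return hashlib.sha256(data).digest()

def encaps(pk: int):
    """Generate an ephemeral key pair; its public half serves as the ciphertext."""
    eph_sk, eph_pk = keygen()
    secret = _kdf(pow(pk, eph_sk, P), pk, eph_pk)
    return eph_pk, secret

def decaps(sk: int, ciphertext: int) -> bytes:
    """Recompute the same DH value from our secret key and the ciphertext."""
    pk = pow(G, sk, P)
    return _kdf(pow(ciphertext, sk, P), pk, ciphertext)

sk, pk = keygen()
ct, secret_sender = encaps(pk)
secret_receiver = decaps(sk, ct)
```

Wrapped this way, a classical exchange exposes the same three operations (key generation, encapsulation, decapsulation) as a post-quantum KEM, so a protocol written against that interface does not need to know which kind it is driving.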
How would the other side do that? It would just generate a key pair for Diffie-Hellman, compute the Diffie-Hellman shared value, hash it with the transcript of the exchange, and send back its public key to the other side. So, the encapsulation is quite obvious,
and the same goes for the other side. And then we combine two or more algorithms together. So, one would just build a hybrid from a classical Diffie-Hellman, or an elliptic curve Diffie-Hellman, with a post-quantum one. This way of doing it was published by Nina Bindel, sorry, I'm not sure how to pronounce the name,
a few years ago. So, it's a bit convoluted, but if you want more details on why we are doing this, I encourage you to read the paper. It's quite interesting. So, basically, when you generate a key pair, you generate key pairs for a set of algorithms. In my example, it's only two, but you can do more than that. And you send both public keys,
or all the public keys, concatenated, to the other side. The encapsulation just splits the public keys to retrieve the individual ones, and performs the encapsulation on each component. Then you use HMAC to combine your results,
chaining it. So, first you combine key one, and then key two, and you can add several layers there. The final step is to use a transcript of all the public keys you received, and the decapsulation is completely symmetric. The paper from Nina Bindel is quite clear
on why these steps are needed. I have no time to explain it here. A few more words. We also tweaked the protocol packets, because in the Diffie-Hellman form, the maximum size you can get is around a few hundred bytes,
but if you start using Kyber, for example, or HQC, the ones we used, you'll reach several kilobytes. And several kilobytes, you cannot reliably send in one datagram over UDP. It probably won't arrive. So, what we had to add is a way of fragmenting the ZRTP packets.
So, it's the kind of classical way, just as DTLS or other protocols using UDP do it. The only thing is that we made it in such a way that when packets are fragmented, the header is modified, but if fragmentation is not needed, the packet remains exactly the same as the old packets.
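The general idea of fragmenting only when needed can be sketched as follows. The 6-byte fragment header used here (total length, offset, fragment length) is a made-up layout for the example and is not the actual ZRTP fragment header; a real format would also mark in the packet header whether a packet is a fragment.

```python
def fragment(message: bytes, max_payload: int) -> list[bytes]:
    """Split a message only if it does not fit; small messages are returned untouched."""
    if len(message) <= max_payload:
        return [message]   # identical to what an older implementation would send
    total = len(message)
    fragments = []
    for offset in range(0, total, max_payload):
        chunk = message[offset:offset + max_payload]
        # Hypothetical header: total length, offset, chunk length (2 bytes each).
        header = (total.to_bytes(2, "big")
                  + offset.to_bytes(2, "big")
                  + len(chunk).to_bytes(2, "big"))
        fragments.append(header + chunk)
    return fragments

def reassemble(fragments: list[bytes]) -> bytes:
    """Rebuild a fragmented message; fragments may arrive in any order (no duplicates assumed)."""
    total = int.from_bytes(fragments[0][:2], "big")
    buffer = bytearray(total)
    received = 0
    for frag in fragments:
        offset = int.from_bytes(frag[2:4], "big")
        length = int.from_bytes(frag[4:6], "big")
        buffer[offset:offset + length] = frag[6:6 + length]
        received += length
    if received != total:
        raise ValueError("missing fragments")
    return bytes(buffer)

big_message = bytes(range(256)) * 10          # 2560 bytes, e.g. a large public key
pieces = fragment(big_message, 1000)
rebuilt = reassemble(list(reversed(pieces)))  # simulate out-of-order arrival
```

Leaving small packets byte-for-byte unchanged is what lets an endpoint that knows nothing about fragmentation keep talking to a newer one.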
The objective was to be able to start deploying the new version of ZRTP while still keeping compatibility with the old one, with old deployments. So, how it's done in the end: we use crypto libraries. liboqs, which is from the Open Quantum Safe project,
which basically collects all the NIST candidates, and Kyber also, which is no longer a candidate, in a convenient way. And we use libdecaf and Mbed TLS for the ECDH and HMAC functions that we need. So, we packed it all in an independent module.
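To give an idea of what a module combining classical and post-quantum secrets has to do, here is a minimal sketch of a chained HMAC combiner: each component secret is folded in turn into a running key, and the transcript of the exchanged public values is mixed in last. It is a simplified illustration of the chaining idea only, not the exact construction from the published paper or from the Linphone module, and the names and the all-zero starting key are assumptions.

```python
import hashlib
import hmac

def combine(shared_secrets: list[bytes], transcript: bytes) -> bytes:
    """Chain the component secrets through HMAC, then bind the result to the transcript."""
    key = b"\x00" * 32   # fixed starting key (assumption for the sketch)
    for secret in shared_secrets:
        # The output of one layer becomes the HMAC key of the next layer.
        key = hmac.new(key, secret, hashlib.sha256).digest()
    return hmac.new(key, transcript, hashlib.sha256).digest()

# Placeholder inputs: one classical and one post-quantum shared secret,
# plus the concatenation of the public keys and ciphertexts exchanged.
classical_secret = b"\x11" * 32
post_quantum_secret = b"\x22" * 32
transcript = b"pk_classical|pk_pq|ct_classical|ct_pq"

hybrid_secret = combine([classical_secret, post_quantum_secret], transcript)
```

The point of combining rather than picking one secret is that the result should stay unpredictable as long as at least one of the components remains unbroken; more layers can be added by appending to the list.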
So, the ZRTP library uses this module, but the module is actually completely independent from it. So, if anyone wants to directly use this hybrid mechanism, mixing varieties of post-quantum and classical exchanges,
it's fully available. You can combine more than two algorithms. It is written in C++, and in our ZRTP implementation, we deployed it with some preset combinations. So, well, you can see them: we tried to mix algorithms with more or less
the same level of security. So, mixing Kyber512 with X25519, for instance. And it is, as I said before, fully compatible with the older version. So, the deployment is progressive.
It's basically decided in the agreement phase at the beginning. If both parties support this version of ZRTP with these algorithms, they will use it. If one is old and doesn't support it, they will just fall back on classical Diffie-Hellman or elliptic curve Diffie-Hellman.
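The fallback behaviour can be pictured as picking the first algorithm in a preference list that the other endpoint also supports. The sketch below is only an illustration: the algorithm identifiers are descriptive placeholders rather than the real ZRTP type codes, and the actual selection rules of the protocol are more detailed.

```python
def choose_key_agreement(local_preferences: list[str], remote_supported: list[str]) -> str:
    """Return the first locally preferred algorithm that the peer also supports."""
    for algorithm in local_preferences:
        if algorithm in remote_supported:
            return algorithm
    raise ValueError("no common key agreement algorithm")

# Placeholder identifiers (assumption), ordered from most to least preferred.
new_endpoint = ["Kyber512+X25519", "X25519", "DH3072"]
old_endpoint = ["X25519", "DH3072"]

between_new = choose_key_agreement(new_endpoint, new_endpoint)
with_old = choose_key_agreement(new_endpoint, old_endpoint)
```

Two updated endpoints end up on the hybrid exchange, while a call involving an older endpoint still succeeds with a classical algorithm, which is what makes a progressive rollout possible.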
So, just to show what it looks like. First, you have the ZRTP handshake going on, and the call is starting. And once the call has started, if it's the first time the two endpoints are calling each other, you'll get a pop-up that asks you to confirm the security string.
So, both parties will just confirm it. You just say it on the phone. It's written like "you have to say this". The other one confirms that you said what they were expecting you to say, and you confirm it. Then this will be saved in the ZRTP cache, and you'll never be asked to do that again.
At any time during the call, you can check the call stats and see what kind of algorithm was used to perform the exchange. So, on this screenshot, you see that it was using Kyber512 and X25519. Here are some links,
in case some of you download the presentation, to the Linphone website, directly pointing to the GitLab where you can find the source code of both ZRTP and our post-quantum crypto module, and to the publication from Nina Bindel explaining how to hybridize several KEMs.
Here we are. Thank you for your attention. So, we've got time for questions, and I've got one question on Matrix. The question is: why is post-quantum encryption not enabled in the precompiled Linphone SDK?
Sorry, I didn't hear. Why is post-quantum encryption not enabled in the precompiled Linphone SDK? It is now. It is now? It is now. For the record. Yes, sorry.
Given that we're dealing with threat actors that might be capable of, you know, cracking quantum cryptography. Sorry. Okay, given that we're dealing with threat actors that might have a lot of resources, it seems like one particular attack vector might be to essentially use real-time deepfake technology
to intercept the vocal SAS comparison. Do you see any particular mitigation for an attack like that? Well, this kind of attack has already been studied and published, and basically what came out of what I found is that it's kind of easy
to use a speech synthesizer to synthesize the voice of someone else. The main problem there would be to insert the SAS at the right moment in the conversation without adding a huge delay, so that people can still talk. If you add, like, a two to three second delay because you have to analyze the signal and buffer it to be able to insert your own SAS,
people won't talk with a three to four second delay; there is no way people will be able to talk. I agree, I think it's going to be very difficult to do something like that in real time, but your solution looks really, really solid otherwise, so it looks like that might be one of the weaker aspects of it.
So far, I've been trying to monitor the publications on the subject, and I have never found anyone able to publish an actual working attack on ZRTP, really. So it might happen at some point. That's great, thank you.
Thank you. I think I missed it, but then, in this particular method that you are doing, is it actually trusting the middle server that you're using, or is it using keys from another, like, a phone or something? SIP, assuming. Is this running with the SIP protocol, you said?
I'm sorry, the sound is very low. Hello? Better, yes. So I wanted to ask if this was being used with a mobile phone to connect to the SIP server, and then use post-quantum cryptography as you demonstrate.
Can you go back two slides, please? Yeah, so the phone, is it actually trusting the server which is running, or is it end-to-end, with the actual key being checked with the other host? Yeah, this is the main point of ZRTP:
the idea is to not trust anyone, not even your server. So the server is only in charge of connecting the two phones, and then the media goes directly from one to the other. The media path goes straight from one phone to the other, and it doesn't go through the server. And that's why the ZRTP exchange is performed on the media path,
and not on the SIP signaling path. When you establish a connection, you actually go through the ICE protocol, if you're familiar with that, which basically finds a way to connect directly, because in the end you don't want the media to be relayed, because you lose too much time. You have to send media packets directly from one endpoint to the other endpoint.
Hi. You said that you have to compare the SAS only once. Is it once per phone or once per user? It's once per endpoint. So basically, in each endpoint, you have a cache of previous exchanges:
each time you complete a ZRTP exchange, you keep some shared secret that you'll use next time. And so during the exchange, at some point, you compare the shared secrets, and if they are the same, you use them to compute the SAS, which is always available.
You can always ask to compare the SAS, but it won't pop up, because the protocol knows that you performed the exchange before. But it's just from one phone to another phone. This cache is not shared. Okay, so in practical terms, if I buy a new phone and install the same app with the same account, I have to do it again? You have to do it again with all your correspondents. Okay, thanks.
We've got time for one last question. Is there any other question? If not, thank you for your talk. Thank you.