Introduction to Mix Networks and Katzenpost
Formal Metadata
Number of Parts | 102 | |
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/43237 (DOI) | |
Chaos Communication Camp 2019 (part 50 of 102)
Transcript: English (auto-generated)
00:15
here is from two lovely gentlemen, Dave and Leif. In the Fahrplan, you might have seen Mo, but that changed.
00:24
So we have this talk by David Stainton and Leif. And the title is Introduction to Mix Networks and Katzenpost. And this is about a new anonymity movement and how mix networks actually work. I have no clue, so I'm thrilled. Hopefully, you are as well. And here are Leif and Dave.
00:54
OK, I'm going to give a brief introduction. David has a lot to say about mixnets. I'm going to talk more generally about anonymous communication
01:01
first to sort of give some context. And the first question I want to ask and answer is, what does the word anonymous really mean? If we look at Wiktionary, of course, the good source of truth here, it shows us that anonymous means without a name. It has a few different definitions there.
01:22
None of these really apply to privacy-enhancing technologies, anonymous communication technologies. The closest thing in this definition is actually about Wikipedia, which says anonymous contributors of Wikipedia.
01:40
So edits on Wikipedia that are not logged in are often called anonymous edits. Now they call them IP edits, which is more accurate. But a lot of people still call them anonymous edits. And those edits are much worse for privacy in a lot of regards than if you were to log into Wikipedia. You could have multiple pseudonyms.
02:01
And if you do that, then your edits are not linkable to each other, which I'll talk more about in a moment. So, oh, wrong computer. This is an interesting project you can find on GitHub at github.com/edsu/
02:23
anon that tracks so-called anonymous edits on Wikipedia and identifies who's doing them. And there's a thing called Congress edits, which showed edits made by people in the US Congress and lots of other places as well.
02:41
So let's see. So when people talk about anonymity in the context of privacy-enhancing technologies, they're really usually talking about unlinkability in some form, which is a lot more precise but still too vague, because it doesn't define what is unlinkable to who.
03:02
In the literature, you can find a lot of more specific concepts, like sender anonymity, receiver anonymity, location anonymity, third-party anonymity. These are all about preventing people from linking who is talking to who or who sent a message. Then there's some more interesting properties,
03:22
like sender unobservability and receiver unobservability, which is to avoid having somebody be able to observe that a user has sent a message at all, regardless of who it's to. All of these are about making events unlinkable from the perspective of some adversaries with specific capabilities.
03:42
So we talk about those capabilities in threat models. And threat models should define adversary capabilities and say what a given tool is trying to protect against. Sometimes they're not well-defined, and sometimes they are very well-defined,
04:01
but they're not well understood by the users that rely on the software. I think when looking at threat models, it's useful to invert them. Instead of thinking about the adversaries that it is protecting against, consider the ones that it doesn't protect against. So if you have an example of somebody is editing Wikipedia,
04:20
they post an edit that somebody doesn't like, who can link that edit to another edit that the user has made or link it to an IP address? There's a couple of different adversaries that clearly can. There are the operators of Wikipedia. And there are people who are able to observe the user's traffic to the site.
04:41
Wikipedia is encrypted, of course. But say you make this edit from work, and it's about your company. Somebody at your ISP or your employer is looking at this edit, they don't like it, and they wonder if somebody at the company did it. If they have NetFlow data that records all the TCP connections, they can see who sent that amount of data
05:02
to Wikipedia at that time. And they can clearly say, this user probably made that edit. So those are two very different adversaries, the site admin and anybody able to observe the user can do that, because they can see the other end as well, because that's public if you're making a public post. Also, anybody who could compromise
05:21
one of those two groups, the site admins or somebody who can observe the user. So the most well-known anonymity tool today is Tor, which I'm sure everybody here has probably heard of. There are pictures like this that show how Tor works.
05:41
There's actually layers of encryption on each of those green lines. So David made another picture here that shows the layers of encryption. So you pick a path through the Tor network, and you connect to the first hop there, the entry guard, and extend to the middle, and extend to the exit,
06:01
and the middle can only see that the traffic's coming from the entry and going to the exit. It doesn't know the client or the destination. And likewise, the entry guard knows about the client and the middle, but not the exit or the destination. So Tor's threat model, somewhat well-known, is in their 2004 USENIX paper.
06:23
Oops. Yeah. This is the first paragraph of the Tor threat model, which says that a global passive adversary is the most commonly assumed threat when analyzing theoretical anonymity designs, which at the time this was written, mixnets were something that were being researched a lot,
06:45
and the Tor network was very new then. And they say, instead of trying to protect against that, we're going to try to achieve more by having a lower latency system. We'll talk about the properties of mixnets and why they made these trade-offs.
07:01
A global passive adversary at the time was considered to be pretty unlikely. It was kind of a theoretical thing. Of course, today we know that there are entities that observe all traffic passing by lots of different points around the world. And they're not even just passive. They do active attacks as well. And unfortunately, there isn't a general purpose anonymity
07:22
system that's designed to protect against those sort of adversaries still today. But even worse than that, there's much weaker adversaries that can also often do what Tor is trying to prevent them from doing. The second paragraph of the threat model
07:40
explains that actually it's not just global passive adversaries, but people who can just see both ends of the connection. They could confirm that that's happening. And in the case of posting a public message, that's everybody can see one end of the connection. You can see when a blog comment was posted or when a Wikipedia edit was made or so on.
08:01
So going back to the example earlier with those two different adversaries, the site admins or somebody who's monitoring your traffic, like your ISP or somebody on the Wi-Fi, your employer, a university, they could actually, in some scenarios, still tell who made that comment on a blog, even if you're using Tor. Even if they don't observe anything
08:22
except for your connection. If you have a local adversary at your university, your employer. So that's not great. So am I saying we shouldn't use Tor? Actually, there was supposed to be another slide in here that said, why use Tor? And I've got a lot of reasons why I still use Tor,
08:42
even though it's not got the strongest threat model. And I recommend people use it for lots of things. The reasons are that it provides location anonymity from sites if they aren't observing your network connection. It provides some browsing privacy from adversaries that do observe your local connection if they aren't observing the sites.
09:03
A single-hop VPN could do both those things, but it's much weaker because the anonymity set is much smaller and the VPN is a single point of failure that could be observed and see all of your source and destinations, link them all together, regardless of where you travel to. So I think Tor is, it's not able to defend against somebody
09:24
that sees both ends of the connection, but it is a lot better than any other alternative for a general purpose anonymity system. The Tor browser also has some great anti-tracking features and there's hidden services. Tor is great. I don't want to sound like I'm saying otherwise.
09:41
It's just, we'd like to have something stronger. And so that's mix networks. Mix networks were proposed, I think, maybe in 1979 even. The paper about it came out in 1981. This is the first paragraph of David Chaum's paper, Untraceable Electronic Mail, Return Addresses,
10:01
and Digital Pseudonyms from 1981. So how do mixnets work? They're kind of like Tor. They have the layers of encryption, but there's a big difference, which is that they are reordering the messages. All the messages are fixed size. So we have four messages.
10:20
They're going into a mix. Inside the mix, we don't see what happens if we're an external observer. There's a layer of encryption that's removed and new messages come out in a different order that are bitwise unlinkable from the inputs. The mix, of course, can link them. So you actually want to have more than one mix.
10:42
You could think about the whole mix network as a single mix, logically, though. So messages go in, messages come out, and you can't explain that. So I guess I'm gonna turn it over to David here.
11:02
Okay, so Leif mentioned the threshold mix strategy. I didn't, actually. Well, the threshold mix strategy, say we have a threshold set to four, so this mix wouldn't send any messages until it accumulated four messages. And so that means that these output messages have a 25% chance of being linked
11:23
with one of the input messages, which is a pretty weak security margin. So if we were gonna use this in a public deployment in a real-world scenario, we would want to set the threshold to like 10,000 or a million. We can also use what are called continuous-time mix strategies, which don't have a concrete bound
11:44
on the anonymity set size. For example, we're gonna talk about Katzenpost. It uses the Poisson mix strategy, which was first published in the Loopix paper, and it's a continuous-time mix, so messages come in and out of the mix at random times, and users set the delay to a random delay.
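To make the continuous-time idea concrete, here is a minimal sketch of a single Poisson-style mix. It assumes each message carries an independent delay drawn from an exponential distribution, as in the Loopix design; the function name `poisson_mix` and the `mean_delay` value are made up for illustration and are not taken from Katzenpost.

```python
import random

def poisson_mix(arrivals, mean_delay, rng):
    """Simulate one continuous-time mix.

    arrivals: list of (arrival_time, label) pairs.
    Each message is held for an independent exponential delay
    (hypothetical parameter: mean_delay, in seconds).
    Returns (departure_time, label) pairs sorted by departure time.
    """
    departures = []
    for arrival_time, label in arrivals:
        delay = rng.expovariate(1.0 / mean_delay)  # rate = 1 / mean
        departures.append((arrival_time + delay, label))
    departures.sort()
    return departures

if __name__ == "__main__":
    rng = random.Random(2019)
    arrivals = [(i * 0.1, f"m{i}") for i in range(10)]
    for when, label in poisson_mix(arrivals, mean_delay=1.0, rng=rng):
        print(f"{when:6.2f}s  {label}")
```

Because the delays are random and independent, messages generally leave in a different order than they arrived, and there is no batch boundary for an observer to count.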
12:03
So continuous-time mix strategies, in this case, would, well, I mean, so let's talk about the architecture. You have to get the public key material of all the mixes to be able to send a nested encrypted packet through the mix network. So Mixminion, which was a project before the Tor Project,
12:28
they used the pool mix, and they had a PKI system similar to Tor. So Katzenpost uses a directory authority system similar to Tor and Mixminion, but we plan in the future to improve its design.
12:42
Right now, it's not Byzantine fault-tolerant, but it is slightly decentralised in its voting protocol. So once clients gain access to all the connectivity information, all the key material, they can send these nested encrypted packets through the mix network. Okay, so this is a great paper
13:02
that discusses mix strategies. So there's many different trade-offs for mix strategies. There's performance and anonymity trade-offs, and this paper takes a functional look at the different trade-offs for various mix strategies. But one thing they all have in common is they all add latency. And so, whereas Tor tries to route messages
13:24
through the network as quickly as possible. And actually, that might not be the most accurate way to say it, because Tor actually routes cells, which are pieces of a stream. So I want to discuss some attacks and some defences,
13:41
and we have a lot to cover, and I thought I would do it by talking real fast. That's kind of a talking strategy to get all the information in here. So, most of the attacks we're gonna talk about today apply to all communication networks. They don't just apply to mix networks. All these attacks apply to Tor
14:01
and other anonymity systems as well. So it's pretty useful, but this n-1 attack is really specific to mix networks. And to briefly describe it, if we have this threshold mix, and the adversary controls some routers upstream or downstream, but they don't control the mix itself. If they see the target message enter the mix,
14:21
they can always just send their own messages into the mix, and make sure it hits its threshold so the messages are shuffled and sent out. And they know what their own messages will look like coming out, but so the one message they don't recognise is obviously the target message. So they've just traced the message through one hop through the network. And this attack would have to be repeated
14:40
through all the hops in the route in order to completely compromise the unlinking between sender and receiver. So that's one example of an attack, and it's called an n-1 attack, but there's many variations on this attack. And this is a pretty good paper to read to get a kind of overview on how to apply it to different types of mix strategies.
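The single-hop n-1 attack described above can be sketched as a small simulation. This is a toy model under stated assumptions: a packet is represented as a hypothetical `(outer, inner)` pair of random identifiers standing in for the ciphertext before and after one layer is removed, the mix pool is assumed to be empty when the target arrives, and `ThresholdMix` and `n_minus_one_attack` are illustrative names, not Katzenpost code.

```python
import random
import secrets

def make_packet():
    """A packet as seen on the wire before and after one mix: (outer, inner).

    Only whoever built the packet knows which inner value belongs to which outer value.
    """
    return (secrets.token_hex(8), secrets.token_hex(8))

class ThresholdMix:
    """Buffers packets and flushes them, shuffled, once the threshold is reached."""

    def __init__(self, threshold, rng):
        self.threshold = threshold
        self.rng = rng
        self.pool = []

    def receive(self, packet):
        """Return [] while filling up, or the shuffled inner values on a flush."""
        self.pool.append(packet)
        if len(self.pool) < self.threshold:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        return [inner for _, inner in batch]

def n_minus_one_attack(mix, target_packet):
    """Fill the rest of the batch with attacker packets and return the unrecognised outputs."""
    own = [make_packet() for _ in range(mix.threshold - 1)]
    known_inner = {inner for _, inner in own}  # the attacker built these, so knows them
    output = mix.receive(target_packet)
    for packet in own:
        output = mix.receive(packet)  # the last call triggers the flush
    return [m for m in output if m not in known_inner]
```

With a threshold of four, the attacker sends three packets after the target, and the single output they do not recognise is the target's message after this hop.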
15:03
So Ania Piotrowska is the main author behind the Loopix anonymity system. Katzenpost is based on a lot of the design from her paper, which includes the Poisson mix strategy, this idea that we can make decoy traffic trade-off with latency.
15:20
So we can make the latency a bit lower if we're willing to exchange some bandwidth overhead and send decoy messages. But there's a... If I could interrupt for a moment. Something that I neglected to say here was that the original mixnet projects that were deployed in the late 90s, early 2000s had extremely high latencies,
15:42
and were sending email only, which made them not very useful and thus not a very big anonymity set. And the renaissance of mixnet research in the last decade or maybe a little more than a decade has sort of arrived at thinking that maybe we could actually have much lower latency while still having very strong anonymity.
16:01
So that's where the work is today. Yeah, did you want to mention the anonymity trilemma? I think you have that later. But yeah, there's this paper about the anonymity trilemma, which, as a big oversimplification, says there's three things you'd like: really strong anonymity, low latency, and low bandwidth overhead.
16:22
And Tor is kind of on the low bandwidth overhead, low latency side, so the anonymity is not as strong. And the new thinking is that you could have a bit more bandwidth overhead of sending a lot of cover traffic and have reasonably low latencies still with much stronger anonymity.
16:42
Yeah, so in the Loopix model, let's see, let's not talk about n-1 attacks for the Poisson mix strategy in this talk. So there's various types of topologies for mixnets, and David Chaum's first paper published in 1981 covers this cascade topology.
17:02
But one interesting difference between Tor and mixnets is Tor has to have lots of relays, and you have to have a big route permutation space, because you want route unpredictability so that your adversaries can't predict your route. And for mixnets, we just don't need route unpredictability. Everyone could be routing through the same four hops,
17:23
and it would work fine. Except it wouldn't scale, so you need more so it can scale. But if there's four hops, you could still provide strong anonymity for however many users they have capacity for. Yeah, so it doesn't scale, it doesn't have high availability, and so 10 years ago or so, people thought that free route was a good alternative to cascade,
17:42
because you at least have high availability. If one of these nodes in the network has an outage, you can route around it. So free route is a topology where any, you can make a path that goes from any relay to any other relay. You can completely pick your path freely, hence free route. Yeah, so one of the downsides is that actually
18:04
if Alice is sending a message to Bob, and Bob is sending a message to Alice, and they're using the same three relays, but in a different order, the messages are gonna intersect on one of these mixes, and when they do, the source of the message will be a distinguishing characteristic. So the message won't actually be mixed.
18:21
What we mean to say is that the anonymity sets on the mixes will be split, and they'll be smaller sets, so a smaller security margin. And so these academics came up with the stratified topology, also known as the layered topology. And so here we have a diagram with three layers,
18:45
layer one, layer two, layer three, and they can only send to the next layer to the right. And in Katzenpost and in Loopix, we have providers at the beginning and end of the route. Providers are a superset of a mix. They allow message queuing for later retrieval
19:02
and network services that you can interact with. But there's other approaches as well. If you think about it, if there was a compromised mix on each of these layers, then the more messages you send, you would choose a new route for every message you send,
19:20
and so you would be increasing the probability of eventually choosing a bad route. So in our threat model, a bad route is defined to be a route in which every mix in your entire route is compromised. Then it's game over, right? So if we have one honest mix in your route, you still have this unlinking property between input and output messages.
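The growth in risk with the number of messages can be worked through numerically. The sketch below assumes a simplified model that is not from the talk: each layer has some fraction of compromised mixes, the client picks one mix per layer uniformly and independently for every message, and providers are ignored. The function names and the example fractions are illustrative.

```python
from math import prod

def bad_route_probability(compromised_fraction_per_layer):
    """Probability that one randomly chosen route consists only of compromised mixes."""
    return prod(compromised_fraction_per_layer)

def at_least_one_bad_route(compromised_fraction_per_layer, messages):
    """Probability that at least one of `messages` independent routes is fully compromised."""
    p = bad_route_probability(compromised_fraction_per_layer)
    return 1.0 - (1.0 - p) ** messages

if __name__ == "__main__":
    layers = [0.1, 0.1, 0.1]  # assumed: 10% of mixes compromised in each of three layers
    for n in (1, 100, 1000, 10000):
        print(n, round(at_least_one_bad_route(layers, n), 4))
```

Under these assumptions a single route is bad with probability 0.001, but the chance of having picked at least one bad route keeps climbing as more messages are sent, which is the concern raised above. If any one layer has no compromised mixes, the probability stays at zero.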
19:41
So if we could instead, another approach to dealing with that sort of compulsion attack threat model is to have various cascades be distributed by the PKI so clients can choose whichever one they want.
20:00
So when you allow clients to choose routes though, you need to make sure the choice is not a distinguishing characteristic of that one client. Otherwise there could be fingerprinting attacks. So this is why a gossip protocol is probably not good to distribute information about the network. This is why we have a PKI that distributes the consensus document
20:20
that covers the entire network to all the users. So that everybody has uniform knowledge about the current state of the network and one user's choices about their path selection will be indistinguishable from other users. Yeah, and this not only applies to mixnets, but it applies to Tor and I2P, and all anonymous communication systems have the potential to have route fingerprinting attacks,
20:41
which we would like to avoid. But I feel like the most important attack category that we wanna protect against is statistical disclosure attacks. And all types of statistical disclosure attacks are important to have some defense against, but mainly we are trying to provide partial defense
21:00
against long-term intersection attacks. So we can abstract away the entire mix network as if it's a single mix, a single router, where there's some input messages and some output messages. So if, for example, Alice one goes offline, Bob one and two might be observed to receive 20% fewer messages. So if that's the case,
21:21
it's now obvious that some metadata was leaked, right? Alice one was previously sending the messages when she was online. So this statistical disclosure attack applies to all communication systems. And so all our communication systems leak some amount of metadata. An expanded version of this diagram is this.
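The Alice-goes-offline observation can be sketched as a small simulation that treats the whole network as one black box. Everything here is an illustrative assumption: the contact lists, the one-message-per-round sending pattern, and the name `observe_rounds` are invented for the example and do not describe real traffic.

```python
import random
from collections import Counter

def observe_rounds(senders, rounds, rng, offline=frozenset()):
    """Count messages delivered to each recipient, as a passive observer would.

    senders: dict mapping a sender name to the list of contacts they write to.
    Every online sender sends one message per round to a randomly chosen contact.
    """
    received = Counter()
    for _ in range(rounds):
        for sender, contacts in senders.items():
            if sender in offline:
                continue
            received[rng.choice(contacts)] += 1
    return received

if __name__ == "__main__":
    senders = {
        "alice1": ["bob1", "bob2"],
        "alice2": ["bob3"],
        "alice3": ["bob1", "bob3"],
    }
    rng = random.Random(7)
    before = observe_rounds(senders, 2000, rng)
    after = observe_rounds(senders, 2000, rng, offline={"alice1"})
    for bob in ("bob1", "bob2", "bob3"):
        print(bob, before[bob], after[bob])
```

The observer never sees inside the network, yet the recipients whose volume drops while alice1 is offline are exactly her contacts.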
21:42
So if clients are receiving messages directly from the mix network, then a passive adversary on the right, the vertical line represents a part of the network they're viewing, right? They can see all the clients send messages into the network, and all the messages come out. And they can make these long-term statistical predictions or sort of assumptions about
22:02
which client is talking to who. And however, if we replace the edge of the network with providers that have many queues for each client, then this statistical information that's leaking here on the right is a lot less specific about which client. So if each provider has, say, 10,000 message queues
22:24
for like 10,000 other users, then this would be leaking a lot less information. So Katzenpost also has clients retrieve messages later from providers, and they use a traffic-padded protocol to do so. So each of these clients receives the same amount of information.
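The idea that every retrieval looks the same size can be sketched as follows. This is not the Katzenpost wire protocol: `RESPONSE_SIZE`, `retrieve`, and `unpack` are hypothetical names, and the sketch only shows padding. A real deployment would also encrypt the response so the length header is not visible to an observer.

```python
RESPONSE_SIZE = 64  # hypothetical fixed response size in bytes

def retrieve(queue: list[bytes]) -> bytes:
    """Return exactly RESPONSE_SIZE bytes, whether or not a message was waiting."""
    message = queue[0] if queue else b""
    if len(message) > RESPONSE_SIZE - 2:
        raise ValueError("message too large for one response")
    if queue:
        queue.pop(0)
    header = len(message).to_bytes(2, "big")  # length 0 means the queue was empty
    return (header + message).ljust(RESPONSE_SIZE, b"\x00")

def unpack(response: bytes) -> bytes:
    """Client side: strip the padding again."""
    length = int.from_bytes(response[:2], "big")
    return response[2:2 + length]
```

A client with one queued message and a client with an empty queue both receive a response of the same length, so response size alone does not reveal who has mail.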
22:42
Jean-Paul, in this example, he has one message in his queue, and it's indistinguishable from the other retrievals. So that's a little bit about statistical disclosure attacks. There's actually a lot of literature about it. So we use the Sphinx packet format.
23:03
It's specifically designed for decryption mix networks, but it has also been used in the design of low-latency systems like HORNET. So Sphinx was designed as a drop-in replacement for the packet format in Mixminion. It never was deployed as such.
23:20
We have a slightly modernized version of it in Katzenpost where we use newer cryptographic primitives. And I don't have time to talk about all the details, but basically, since we don't have an interactive bidirectional communication channel, we're using these Sphinx packets
23:41
that are being transformed as they traverse the network. We don't have forward secrecy properties, so we are more vulnerable to compulsion attacks. Actually, this is interesting because, in this one particular threat model, Tor is a bit safer than mix networks.
24:01
So when we talk about overlay networks, we usually refer to the wire protocol that you're actually talking to the machines over as a link layer, right? This is connecting the clients to the mixes and the mixes to the other mixes. And in Tor, they use TLS, and in our mix network, we use a Noise-based cryptographic protocol,
24:20
and if we just ignore that we have this cryptographic link layer for a minute, if you were to grab some Tor ciphertext from like an ntor handshake, and you wanted to get the police to legally compel a Tor relay operator to decrypt it so you can find out where the next hop is, if you were to try to do that with Tor,
24:41
it would be difficult to make such an attack successful since you have these ephemeral keys being destroyed every few minutes by every hop in the circuit. And in mix networks, there is some opportunity to perform these attacks with legal action, police raids, or just compromising the mixes in the route. To clarify a little bit: the
25:03
compulsion attack here is possible because mixes have to use a fixed key. You don't set up a session like in Tor, where you create a stream and extend to each hop, so that before you actually send your request through Tor, your HTTP request or whatever you're sending, you're doing an ephemeral Diffie-Hellman
25:22
with each of the intermediate hops. You have key material that can be thrown away at the end of that stream or that circuit. And in mix networks, you don't have this state. Each message is an independent thing. So the mixes can't throw those keys away after each message, because you learn them and you have to use the keys
25:41
that are current. So instead, the keys are rotated at some key rotation interval. And during that interval, this compulsion attack is possible, where somebody could go through, find a message they want to de-anonymize, and tell each hop: you have to decrypt this or we'll kill you, or something. So that's the compulsion attack.
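The relationship between the key rotation interval and the compulsion window can be sketched as an epoch-based key store. This is an illustrative model only: `MixKeyStore` and its methods are hypothetical, the random bytes stand in for a real mix private key, and removing a dictionary entry in Python does not guarantee the bytes are wiped from memory.

```python
import os

class MixKeyStore:
    """Holds one key per rotation epoch and erases keys from past epochs."""

    def __init__(self, rotation_seconds, keep_previous=0):
        self.rotation_seconds = rotation_seconds
        self.keep_previous = keep_previous  # extra old epochs to retain, if any
        self._keys = {}

    def epoch(self, now):
        """Map a timestamp (in seconds) to its key rotation epoch."""
        return int(now // self.rotation_seconds)

    def key_for(self, now):
        """Return the key for the current epoch, erasing expired ones first."""
        current = self.epoch(now)
        for old in [e for e in self._keys if e < current - self.keep_previous]:
            del self._keys[old]  # key erasure: old packets become undecryptable
        if current not in self._keys:
            self._keys[current] = os.urandom(32)  # stand-in for a private key
        return self._keys[current]

    def can_decrypt(self, packet_epoch):
        """A compelled operator can only help for epochs whose key still exists."""
        return packet_epoch in self._keys
```

For example, with `MixKeyStore(rotation_seconds=3 * 3600)` (three hours, used here as an example value), a packet captured in epoch 0 can be decrypted under compulsion only until the mix rotates into epoch 1; a shorter interval shrinks that window.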
26:01
Key erasure is the main defense we have. In 2002, George Danezis published a paper, Forward Secure Mixes, where you can interact with a specific mix key which is destroyed immediately after your packet traverses that route. However, you're leaking extra metadata to that mix. So you're saying, I'm the same entity as last time
26:20
that interacted with you and now I'm doing it again and you're destroying the key for me, thanks. So this is kind of a trade-off because on the one hand, you're leaking extra metadata to this one mix. On the other hand, it destroys the key sooner. In Katzenpost currently, the key rotation schedule is set to three hours, but we plan to soon set it much lower,
26:41
maybe under an hour. And so this is our main defense, this key erasure. And there are other partial defenses against compulsion attacks. Here's another paper about it that's pretty interesting. It's got deniable routing, multipath routing steps and things like that. So I don't want to go into any detail about these
27:02
but I just want to mention that there are other avenues of thought regarding mixnet design. And so Amir Herzberg is one of the authors of these two papers and they have a different kind of strategy. I mentioned the multicascade topology before
27:22
and so it's somewhat related to the compulsion attack in the sense that, like I mentioned before, if you send your messages through the mix network with a new route for each message, you don't necessarily want to increase the probability of choosing a bad route.
27:40
So you could use one route for a while and then switch your route. So our mix network's an overlay network and so what we mean by that is we're using IP, right? We're using IPv4, we're using TCP, and we're adding some protocol layers on top of that. So we can have a custom automatic repeat request,
28:03
error correction scheme to retransmit messages if they get lost in the mix network and things like that. And so Trevor Perrin helped us a bit. Yawning Angel designed our wire protocol. It's based on the Noise Protocol Framework.
28:22
Peter Schwabe also was very helpful in communicating about NewHope-Simple and Kyber and was involved in creating the post-quantum key encapsulation mechanisms that we use. So we have a post-quantum cryptographic link layer which I think is pretty cool
28:41
thanks to Yawning Angel and these other people. So this is the anonymity trilemma we mentioned before. There's this inherent trade-off between latency, bandwidth, and the strength of anonymity that is offered by the system. So I just want to mention,
29:00
in Katzenpost, or in the Loopix design in general, using this Poisson mix strategy on the mixes, clients aggregate several Poisson processes and they send out traffic which has legitimate traffic mixed with decoy messages. And the reason they do this is so that you can tune the mix network
29:23
by modifying parameters in our PKI. So it's distributed to all the clients and the clients can set their traffic to these parameters and then everyone's traffic more or less looks the same. And there's a kind of trade-off between how much decoy traffic is sent
29:43
and how much latency there is and then the strength of the anonymity. So here's an example of a drop message. Alice chooses a random provider in the network to send it to and the provider sees that it's a drop message so it drops it. So it's not fooling an adversary
30:01
that's on the destination provider. It's only indistinguishable to a passive adversary watching the network in this case. And here's a loop message. So a loop message in Katzenpost is sent to this loop service on the destination provider and Alice uses this single-use reply block.
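The client behaviour described above, several Poisson processes emitting real, loop, and drop messages, can be sketched as follows. The function name `client_schedule`, the rate values, and the choice to send a loop decoy when a payload slot fires with nothing queued are assumptions of this sketch, not the Katzenpost implementation.

```python
import random
from collections import deque

def client_schedule(rates, outbox, duration, rng):
    """Build a send schedule from independent Poisson processes.

    rates maps "payload", "loop", and "drop" to events per second.
    Returns a time-ordered list of (time, kind, message) tuples.
    """
    events = []
    for kind, rate in rates.items():
        t = rng.expovariate(rate)  # exponential gaps give a Poisson process
        while t < duration:
            events.append((t, kind))
            t += rng.expovariate(rate)
    events.sort()

    schedule = []
    for t, kind in events:
        if kind == "payload":
            if outbox:
                schedule.append((t, "real", outbox.popleft()))
            else:
                # Nothing queued: send a decoy so the timing pattern is unchanged.
                schedule.append((t, "loop", None))
        else:
            schedule.append((t, kind, None))
    return schedule

rng = random.Random(7)
outbox = deque([b"hello", b"world"])
rates = {"payload": 0.5, "loop": 0.2, "drop": 0.2}  # example values only
schedule = client_schedule(rates, outbox, 60.0, rng)
```

The send times depend only on the published rates, never on whether Alice has anything to say, which is why everyone's traffic looks more or less the same to a passive observer.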
30:24
It's a mechanism within the packet format that allows anonymous replies. So the reply is sent to Alice without the provider knowing Alice's location on the network. I think maybe a little more should be said on this slide here of the single-use reply block. The light blue line there, the reply, that's a route that Alice has chosen
30:42
and sent that to the provider at the bottom left. So they can't see what the return route will be. Alice picked that and gave them this SURB, a single-use reply block, which is sort of like a self-addressed stamped envelope that says, hey, this is how you reply to me. And so this is the mechanism for anonymous replies.
31:01
But these are only good for the duration of the key rotation interval, which is this trade-off about how long you have to do a compulsion attack. So SURBs, in older mixnet designs, were actually used for human replies, but in our current design, they're more for automatic replies from services
31:22
where you're not gonna use a SURB to reply the next day. It's gonna be used soon, within the same key rotation interval. Right, that also reminds me of what Leif said: since we're using SURBs with the Poisson mix strategy, the client, Alice in this case, actually chooses the delays, all the forward delays
31:42
and all the reply delays. And so she actually knows the rough estimate of the round-trip time, or fairly accurate unless there's computational overhead at each hop or so, but she knows the artificial delay placed by each mix for the round-trip, which is helpful so that if she doesn't get the reply, she knows it was lost in the network.
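Since Alice picks every per-hop delay herself, her round-trip estimate is just a sum. A minimal sketch, assuming exponentially distributed per-hop delays as in a Poisson mix; `sample_delays`, `reply_deadline`, and the `per_hop_overhead` and `slack` parameters are hypothetical names and values for illustration.

```python
import random

def sample_delays(hops, mean_delay, rng):
    """Draw one exponentially distributed mix delay (seconds) per hop."""
    return [rng.expovariate(1.0 / mean_delay) for _ in range(hops)]

def reply_deadline(forward_delays, reply_delays, per_hop_overhead=0.05, slack=2.0):
    """Estimate when a reply should have arrived.

    The artificial delays are known exactly because the client chose them;
    per_hop_overhead and slack are rough allowances for processing and
    network time, which the client cannot know precisely.
    """
    hops = len(forward_delays) + len(reply_delays)
    expected = sum(forward_delays) + sum(reply_delays) + hops * per_hop_overhead
    return expected + slack

rng = random.Random(3)
forward = sample_delays(hops=3, mean_delay=1.0, rng=rng)
reply = sample_delays(hops=3, mean_delay=1.0, rng=rng)
deadline = reply_deadline(forward, reply)
```

If no reply has arrived `deadline` seconds after sending, Alice can treat the message as lost and schedule a retransmission.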
32:04
And so there's, let's see, drops, loops, and Lambda P could be a forward message or perhaps a loop if the queue is empty. So here we have another diagram where Alice is retrieving a message
32:21
from a rendezvous provider. So instead of rendezvousing on the provider you directly connect to, you wanna exchange messages using somewhere else on the network, because you want some unlinkability property. You want to retrieve your messages using the mix network as communication transport to retrieve the messages. So, I mean, it's a fairly simple design
32:42
and it's different than the Loopix paper. The Loopix paper is not actually presenting us with an anonymous communication system in the sense that Leif was talking about where we have location-hiding properties. Here, we in a sense want a kind of mutual distrust property where if Alice and Bob
33:01
are talking to each other and if Alice is compromised, she shouldn't learn Bob's location on the network, or the adversary compromising Alice shouldn't learn Bob's location. And so they can use these intermediaries. What I wanted to explain really quickly is that if Alice's behavior is predictable,
33:23
there's an active confirmation attack. So if she retrieves a message from this rendezvous provider on the network and it's compromised, then the adversary could cause an outage on the network so that the reply might not ever get to Alice. And so if Alice then retransmits,
33:43
it might use the same sequence number or so, and so the adversary just had a positive confirmation that Alice is on a certain half of the network, because they've created an outage for half of the nodes that can receive the reply. So the adversary gets to learn which half she's on, and so if you have some background in computer science,
34:00
you can easily see this attack would happen in logarithmic time. It's basically like a binary tree search. And so instead we want to randomize the retransmission delay. This might create usability problems, but we can't have client behavior predictable. And so Katzenpost internally,
34:21
it's a series of queues connected in a pipeline, but providers can have services, and we have a plugin system, so you can add services to the network. So we really want to collaborate with other developers and academics as well, and if developers would be interested in creating new messaging systems, we support that.
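Returning to the confirmation attack described a moment ago, the logarithmic cost is easy to see in a sketch. The model below assumes a perfectly predictable client that reacts whenever its half of the network is cut off; `locate_client` and `randomized_retransmit_delay` are hypothetical names, and the exponential jitter is only one possible way to randomize the retransmission delay.

```python
import math
import random

def locate_client(nodes, client_reacts_to_outage):
    """Halve the candidate set each round by cutting off half of the nodes.

    client_reacts_to_outage(cut) is True when the client visibly
    retransmits because its own node was in the cut-off half.
    """
    candidates = list(nodes)
    rounds = 0
    while len(candidates) > 1:
        cut = candidates[: len(candidates) // 2]
        rounds += 1
        if client_reacts_to_outage(cut):
            candidates = cut
        else:
            candidates = candidates[len(cut):]
    return candidates[0], rounds

def randomized_retransmit_delay(base_delay, rng):
    """Add exponential jitter so retransmissions are not tightly predictable."""
    return base_delay + rng.expovariate(1.0 / base_delay)

nodes = [f"provider{i}" for i in range(1024)]
alice_node = "provider700"  # hidden from the adversary
found, rounds = locate_client(nodes, lambda cut: alice_node in cut)
print(found, rounds, math.log2(len(nodes)))
```

Against a predictable client the adversary needs only about log2(N) outages. Randomizing the retransmission delay makes each observation noisier, so a single outage no longer gives a clean yes-or-no answer.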
34:41
We have messaging systems that are usable now in Katzenpost, but we also want to support making new messaging systems. So it could also be used to transmit cryptocurrency transactions, obviously, because that's just a message.
35:01
And so Ania, the author of Loopix, also wrote a paper about it, and you would want other people transmitting their cryptocurrency transactions too. You want to hide in a crowd of other people doing the same thing. Mixing other services on the same network wouldn't really create
35:20
this end-to-end unlinkability property you want. You're only really being mixed with the other people in the network. So you might be thinking, why does Zcash need this? Isn't it already anonymous? The anonymity that Zcash provides keeps you from linking transactions together, but it doesn't provide sender unobservability. And that's what the mixnet can provide.
35:41
So Zcash, like many things, could potentially have statistical disclosure attacks against it, where you see when somebody sends a message and that's a useful piece of information. And if you're a mixnet Zcash client, people can see that you're connected to the network, but they couldn't see if you're sending right now.
36:03
Okay, so what's next? I think maybe we need to skip some of these slides if we want to have time for questions. Okay, right, yeah. These slides are a bit too much detail, but after them, I have some paper recommendations.
36:21
Hold on. Let's just skip all these things, you don't need it. The Moral Character of Cryptographic Work by Phillip Rogaway is a really excellent essay. And we want other cryptographers and computer scientists and people like that to think more about helping society and not just furthering their academic careers
36:40
and publishing papers that have no meaning. So we'd like cryptographers to collaborate with us and with other computer scientists and to do practical things that would affect society positively. He mentions Chaum81, Chaum's 1981 mixnet paper, 13 times in this paper.
37:04
And really what he's trying to point out is that we should think about protecting the metadata that's leaked, right? Cryptographers focus too much on confidentiality or on these zero-knowledge proof systems or whatever fancy thing they're working on, but they don't focus enough on protecting the metadata that we're leaking.
37:23
I also wanted to mention the authors of the anonymity trilemma paper wrote a technical report which is published on their website, and it is called Beyond MixNets, and it talks about hybrid networks that are pretty interesting, have some interesting performance security properties
37:40
that seem to be slightly better than mix networks. You can enhance a mix network using a hybrid network scheme by adding secret sharing, offline secret sharing. And I think this paper is pretty cool. Privacy notions, it's pretty interesting. It talks about threat models for anonymous communication networks
38:01
and what privacy notions you expect from these systems and how exactly to articulate that. So that's our talk. Are there any questions? Before questions, I'll answer a couple of questions I anticipate. One is: what's the state of the project today?
38:21
Katzenpost, David, and other people have done a lot of work on it. There's code you can run. There's a test network that is run by David. So it's not providing real anonymity properties. It's not a thing you should start using today unless you wanna hack on it and play with it. There's a simple text-based client. There's a lot of work left to do,
38:42
but it's being very actively developed. Another frequent question is what is this delay, this lower latency but still not as low as Tor? It's not low enough to type in an SSH shell or something, but I think David really doesn't like to answer that question because it has to be tuned, and there's a lot of decisions to make about how low the latency should be,
39:00
but I think we could say on the order of seconds, not minutes, is what we're anticipating is gonna be reasonable. The other question I frequently get is about censorship circumvention, and yes, our software works with Tor onion services, and you can use it with Tor and use pluggable transports.
39:21
Tor's got that stuff covered already. Thank you for this super interesting talk. We do have some more minutes for questions, so there is one microphone angel up in the front. If you'd like to ask a question,
39:42
which I would like you to do, then please step up to this lovely gentleman and talk into his microphone, and in the meantime, while you do that, is there a question from the internet? Dear signal angel, do you have anything for us? That seems to be not the case. Therefore, please go ahead with your question.
40:02
Sure, you started by saying that Tor's threat model is not for a global passive adversary, and it seems that the threat model that Katzenpost and mixnets have does cover that, and perhaps goes stronger because you talk about compulsion attacks, but maybe does not go to what we might say is fully a global active adversary.
40:20
Can you sort of draw the line of what you think your threat model is and that boundary? So, first of all, in the Katzenpost model, the receiving side of messages is a provider, so if you wanted to do a statistical disclosure attack with a high amount of statistical information being leaked,
40:40
it's helpful to compromise the providers so you can see which message queue receives messages. The other active attack on mixnets I can think of is an n-1 attack, where you're attacking the mix strategy, and we have some partial defences against that, but it's not a full defence.
41:01
One aspect of the Loopix design is that the individual mixes are sending loops, so they could potentially detect if there's an n-1 attack happening. If messages that they send through a certain relay aren't actually ever making it back to them, they can see that that's happening, and what to do about that is,
41:20
well, kind of an open question, but the PKI operators, the people, the equivalent of Tor directory authorities could have some policy decisions to say, we're not gonna use that relay anymore because their ISP is facilitating some sort of attack.
41:41
Yeah, I think the... I mean, the compulsion attack is the other active attack, right? So with mix networks, we still get to have these security properties even if we only have one honest hop in our route. So that's a kind of defence by design, so to speak. Yes, please, go ahead.
42:00
Just a quick remark about the threat models: threat models can be classed as strictly stronger and strictly weaker, which was kind of the question before, right? I don't know if that clears anything up, but maybe David can say something about that.
42:22
About stronger versus weaker threat models? I mean... Yeah, because the last question was also about compulsion attacks that Tor is better against. Okay, it's almost an unfair comparison because Tor is easily broken by much weaker adversaries and all these other attacks. It is true it's stronger just for compulsion attacks,
42:42
but I think for mixnets, since we can have partial mitigation, we can make the window of time very small in which the adversary can actually compromise the key material and make use of it. And, you know, another defence against that is we can put our mixes in different continents
43:01
and different countries around the world, make the man's job really hard, right? To trace your path through the network. But I think that point really does kind of illustrate that you can't always say that threat models are necessarily stronger or weaker than one another. They can be better or worse in different dimensions.
43:31
Yes, please, one more question from the audience microphone or a remark. Just to say, because a global passive adversary, there was also some question about that.
43:42
I mean, you can, the active one usually can inject and delay and stuff like that because you didn't quite get on that one. Thanks a lot for your remarks. David, would you like to respond? We pretty much covered the active adversary
44:00
when we talk about N minus one attacks and compulsion attacks. I mean, there's some other attacks that can be done. I mean, certainly we don't have a defence against denial of service and things like that. We have partial defence, but yeah. Is there another question? Yeah, hi, I'm wondering what's your opinion on speed and scalability because in my opinion,
44:23
one of the major drawbacks of something like Tor is that it's just too slow to use it on a daily basis and it would be much more secure if everybody could use it on a daily basis without having such drawbacks. Okay, well, that makes me think of a lot of different things to respond with.
44:40
One of them is that mix networks tend to not be super low latency, tend to be maybe medium latency. Some of the historical designs are high latency, like Leif mentioned. But we think that mix networks are very efficient for scaling up, much more than Tor in the sense that we don't need lots of nodes in our network to scale.
45:01
We don't need lots of route unpredictability. We just need enough nodes to handle the traffic capacity. And currently, we have two public key operations per Sphinx packet, and I mean, there's some computational overhead, but it's manageable. So I think mix networks scale very well,
45:20
probably to millions of users. And for a different type of application, right, it's probably not gonna be video conferencing. It's probably gonna be lower bandwidth applications, maybe high bandwidth applications would work over longer periods of time. This is not a thing that you're going to browse the web over in real time, exactly.
45:40
You could use this as a transport for downloading web content. You could have, and that's something that various people are considering, but it's not a replacement for Tor, which ultimately gives you TCP connections, where you expect to have pretty low latency.
46:01
Another interesting side note is that Tor has exit nodes, and generally speaking, mix networks don't have exit nodes to the rest of the internet. We could have services at the edge of the network that perhaps go and retrieve something for you, like maybe some web content, or you could have an offline browser that. You could browse the web the way Richard Stallman does, where you send an email to a process
46:21
and it replies with a snapshot, you know. Yeah, so we get to control the entire network, and we can put decoy traffic everywhere if we don't exit, right? So it's one of the advantages. Thank you for this detailed response. Do we now have a question from the internet?
46:44
That is not the case, so please give a warm round of applause for this very interesting talk to David and Leif. Thank you.