
Cat & Mouse: Evading the Censors in 2018

Formal Metadata

Title
Cat & Mouse: Evading the Censors in 2018
Subtitle
Preserving access to the open Internet with circumvention technology
Title of Series
Number of Parts
165
Author
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The deepening of global Internet infrastructure comes accompanied with an invigorated capacity and intent by adversaries to control the information that flows across it. Inextricably, political motivations and embedded power structures underlie the networks through which we interpret and understand our societies and our world - censorship threatens the integrity of the public sphere itself. The increasing technical sophistication of information controls deployed by censors in adversarial network environments around the world can be uniquely viewed and researched by circumvention tool providers, whose work continues to preserve access to the open Internet for all communities. Through this presentation, we endeavour to share insights gained from the front lines of this technical contest.
Transcript: English (auto-generated)
So, without stealing further time from Keith McManaman, who is coming here from Toronto,
who is working at Psiphon, a censorship circumvention NGO, and speaking about evading the censors in 2018, so a censorship year round-up: I'm very happy to have you here and to see your talk now on the last day of Congress.
Thank you.
Hello. Thanks, everyone, for coming this afternoon. Hope you had a fantastic Congress this year, I know I did.
Thanks for sticking around to the final sessions. For many of you, this will be the last talk you see until next year, so I hope it's worthwhile. And to everyone watching the stream online, hello and welcome.
My name is Keith McManaman. I'm an analyst at Psiphon, where we operate a circumvention network that's used worldwide by tens of millions of people, and we provide free open-source circumvention tools for Windows, Android, and iOS. Yes, there is a circumvention tool that's running whole-device VPN for iOS in Psiphon.
Its accessibility, free-ness, localisation, and overall network resilience have made Psiphon a widely adopted circumvention tool, which provides a
decent sample size of internet users, and therefore a reasonable barometer of circumvention tool usage in a country, which makes it an apt vantage point from which to analyse the impacts of internet censorship.
In my work, the kinds of questions that I'm interested in concern the social and political dynamics of information controls in different places: for example, the trends in the censorship legislative environment, political cycles, social unrest and social movements, emerging discourses in the media and online.
How do these factors add up and determine what content is accessible, and how does that shape people's online behaviour and their use of circumvention tools, including Psiphon? This is an overview of what we'll be talking about today.
I'm going to go over the basics of censorship technology and how it's deployed. I'll talk about some of the circumvention methods and technologies that are in use today.
I'll recap some notable events from the past year, and then talk about some notable trends that we've observed in this environment. Just a short note on framing and metaphors. The cat-and-mouse game is terminology that's widely used to describe the
interplay between the circumvention tool providers and the censors. Sometimes you'll hear a militaristic kind of framing, like the battle for the free internet, or the technological arms race.
I just want to say that there's no Sylvester and Tweety here; there's nothing madcap or wacky about it, as you will see. So what is internet censorship?
It's the control or suppression of what can be accessed, published or viewed on the internet. I just took this definition from Wikipedia. It comes in many different manifestations and forms. I'm going to be focused on the digital interceptive forms of censorship, which is what
circumvention tools are designed to deal with. But as you can see, there are other very important categories that have increased in their prevalence in recent years.
Specifically, there is the shift from direct interceptive forms of censorship, sometimes referred to as the first generation of information controls, to the second and third generations, which is characterised in the excellent Access series by Ron Deibert
and his colleagues at the Citizen Lab, which is really the seminal work on that transition. So this is what we'll focus on for this talk.
Censorship is preventing you from treading all of these fascinating, wonderful paths. And it does that by taking advantage of certain features in the way the internet works.
They're able to do that because the censors control all connections across the international gateway of the respective country. Through the information ministries, they control the internet service providers, and they possess powerful methods of detection.
Increasingly, the internet censorship space is enabled by private sector actors. The technologies that allow you to maintain national blacklists and to sort and filter different types of traffic have become much more affordable
for national governments to purchase, run, and deploy at scale. The methods that we're going to go over vary in their complexity and their resource intensity. This is something called the OSI model of computer and telecommunication
systems. Suffice to say that censorship can happen at all layers, from the application layer all the way down to the physical infrastructure. So one of the lower-level tactics is IP address blocking.
Censors can learn the IP addresses of the sites that they want to block and add those to a blacklist of forbidden IPs, so requests to those addresses will be discarded. Here's a simple diagram of how that looks.
So when you attempt to visit a site that's blacklisted, you'll either get a connection reset or a 404 error. The weakness of IP address-based blocking is that a lot of IPs are not static. In many cases, they're hosted on content delivery networks, which are ephemeral in a way.
They shift from place to place, and a site's IP address could constantly be migrating to a new location. So it's not effective, and it's often a lot of work to maintain.
It also comes with a high risk of collateral damage: you'll tend to block other content that's hosted on the same IP, and generally this works better for blocking specific apps rather than specific content.
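As a minimal sketch of the IP blocklist check and its collateral damage, assuming a hypothetical blocklist and hosting table (all names and addresses below are made up and use documentation ranges):

```python
import ipaddress

# Hypothetical blocklist of forbidden IPs (documentation address ranges).
BLOCKED_IPS = {ipaddress.ip_address("203.0.113.10")}

# Hypothetical hosting table: unrelated sites can share one IP, e.g. on a CDN.
HOSTING = {
    "banned-news.example": "203.0.113.10",
    "recipes.example": "203.0.113.10",
    "weather.example": "198.51.100.7",
}

def is_blocked(dest_ip):
    """Return True if traffic to dest_ip would be discarded."""
    return ipaddress.ip_address(dest_ip) in BLOCKED_IPS

def collateral(target):
    """Other sites that become unreachable when the target's IP is blocked."""
    ip = HOSTING[target]
    return sorted(s for s, a in HOSTING.items() if a == ip and s != target)

print(is_blocked(HOSTING["banned-news.example"]))
print(collateral("banned-news.example"))
```

The second function is the point of the sketch: the censor asked for one site, and every other tenant of that address goes down with it.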
In the same vein, URL blocking involves a blacklist of forbidden URLs, and when you request one of them, your
request will similarly be rejected. Port blocking also works the same way. A censor can choose a certain port that they don't want to allow any traffic through, and similarly, you would not be able to connect to those endpoints.
Okay, DNS hijacking, sometimes called DNS poisoning or DNS spoofing. This involves the DNS lookup process, that is, how a URL is resolved into an IP address. Because that's controlled from a highly centralized
vantage point, the censor can actually intercept your DNS resolution request and deliver a page of their choosing, basically, instead of the page that you've requested. Typically, that involves a block page of some kind, saying that, you know, the site
you've requested to visit is forbidden, but they can also even deliver a malicious page pretending to be the page that you've requested but actually isn't. There was a case in China before Wikipedia was HTTPS enabled, SSL enabled.
If you requested the page for Tiananmen Square, the Wikipedia article, they were actually delivering a kind of sanitized version of that site instead of the legitimate article. Of course, HTTPS adoption kind of prevents blocking specific subpages nowadays.
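The DNS hijacking step can be sketched as a resolver that answers honestly except for names on a list. This is a toy model under assumed data: the record table, the hijacked names, and the block-page address are all hypothetical.

```python
# Hypothetical DNS records; addresses come from documentation ranges.
REAL_RECORDS = {
    "encyclopedia.example": "198.51.100.20",
    "weather.example": "198.51.100.7",
}
HIJACKED_NAMES = {"encyclopedia.example"}  # names the censor wants to intercept
BLOCK_PAGE_IP = "203.0.113.1"              # server hosting the block page

def honest_resolve(name):
    """What an uncensored resolver would answer."""
    return REAL_RECORDS[name]

def censored_resolve(name):
    """Answer with the block-page address for hijacked names."""
    if name in HIJACKED_NAMES:
        return BLOCK_PAGE_IP
    return honest_resolve(name)

print(censored_resolve("encyclopedia.example"))
print(censored_resolve("weather.example"))
```

The client has no way to tell from the answer alone that it was redirected, which is why the same mechanism can serve a block page or an impostor page.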
So if you're in Iran, for example, this is a page that you might see that says you can't go to this site, but here are some great other sites that you can visit. In Saudi Arabia, this is a block page that you would see that's put there by the information ministry.
So in both cases, there's a clear kind of accountability, someone that you can contact, an email address, someone that you can contact about your inability to access that content. But oftentimes, your request will just fail to complete. You might get a 404 error, and there's not a clear indication of is this site banned content,
is there a problem with the content provider, or is there a problem with my own internet connection? So some kind of ambiguity as to why you're not able to visit that content.
Keyword filtering is kind of an escalation because it allows the censor to filter URLs based on keywords anywhere in the path name. Again, pre-HTTPS, that was a bit more relevant, because with TLS- or SSL-enabled connections you
can't see into the path name, only the domain name. And it also allowed them to block new or unknown pages that are related to that type of content rather than having to discover the domain and the IP address and add it to
the blacklist manually. They also have the ability to blacklist or whitelist entire protocols, say HTTPS, if they can't see into it. This is something that happened in Iran in the 2013 elections. It was a gradual escalation between the circumvention providers and the censors there, which culminated
in eventually only HTTP traffic being whitelisted. And obviously, I come back to the term collateral damage. That really is something that can break a lot of other essential internet services and make the internet essentially unusable.
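To illustrate why keyword filtering was more relevant before HTTPS, here is a small sketch. It assumes a hypothetical keyword list and models an on-path observer who can read the whole URL over plain HTTP but only the hostname over HTTPS (the hostname typically still leaks, for example through the TLS server name).

```python
from urllib.parse import urlsplit

BANNED_KEYWORDS = {"protest"}  # hypothetical keyword list

def visible_to_censor(url):
    """Return the part of the URL an on-path observer can read."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Path and query are encrypted; only the hostname is visible.
        return parts.hostname
    query = "?" + parts.query if parts.query else ""
    return parts.hostname + parts.path + query

def keyword_blocked(url):
    """True if any banned keyword appears in what the censor can see."""
    text = visible_to_censor(url).lower()
    return any(keyword in text for keyword in BANNED_KEYWORDS)

print(keyword_blocked("http://news.example/2018/protest-report"))
print(keyword_blocked("https://news.example/2018/protest-report"))
```

With HTTPS the censor is left with an all-or-nothing choice per hostname, which is the pressure that leads to whitelisting or blacklisting whole protocols.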
Deep packet inspection. This is a term that some may have heard spoken about through the Congress earlier in the week. This is basically a high-level processing method that allows the censors to look throughout
the content of a web request, in the header, in the inner traffic, as well as the URL, for certain keywords and other specifications that match a repository of blacklist arguments, and choose to block that traffic.
So with the keyword filtering and deep packet inspection, the censors need to process a lot more data. It's much more resource-intensive. And it really depends how deep they want to dig. And as I mentioned at the beginning, the technology has gotten much more widely available,
cheaper and easier to implement, and more effective. Traffic fingerprinting is something that's enabled by that, because even without knowing the domain or the IP address or being able to see it through encryption, the censor
can record what a browsing session looks like and create rules for how the user sees that page or if they do. Because encryption doesn't change that technical configuration. And so they can block a page based on its size, load time, and other kind of technical
details, which would even allow them to block, say, specific sub pages of Wikipedia that are HTTPS-enabled. They might incidentally block some other page that follows that specification, but that's kind of the trade-off that's being made. And I will come back to this, but just to mention, VPN traffic, SSH traffic, though
they are encrypted, have a very obvious signature size and shape that's identifiable from a network perspective and can be fairly easily fingerprinted, which is definitely a vulnerability.
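A rough sketch of fingerprinting by size and load time, and of the trade-off mentioned above. The recorded fingerprint, the tolerances, and the observed sessions are all invented for illustration; real classifiers use many more features.

```python
# Hypothetical fingerprint recorded for one page: (bytes transferred, load seconds).
FINGERPRINTS = {"blocked-article": (48_200, 1.3)}

def matches(observed, fingerprint, size_tol=0.05, time_tol=0.5):
    """Compare an encrypted session's size and timing to a stored fingerprint."""
    size, secs = observed
    fp_size, fp_secs = fingerprint
    return (abs(size - fp_size) <= size_tol * fp_size
            and abs(secs - fp_secs) <= time_tol)

def classify(observed):
    """Return the label of the first matching fingerprint, or None."""
    for label, fingerprint in FINGERPRINTS.items():
        if matches(observed, fingerprint):
            return label
    return None

print(classify((48_900, 1.1)))  # the targeted page, slightly different load
print(classify((12_000, 0.4)))  # clearly different traffic
print(classify((47_500, 1.6)))  # an unrelated page that happens to look similar
```

The third call is the incidental block: nothing in the content was inspected, so any page with a similar size and timing profile gets the same label.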
Now I'm going to switch tracks and talk about some circumvention methods. So, to each of the censorship methods that I discussed, there's kind of a circumvention answer, and it escalates from there.
So if your DNS is being poisoned, then you could switch to an open DNS resolver or a third-party DNS resolver. You've often heard of people switching their DNS to 8.8.8.8, which is the Google DNS, or 1.1.1.1, the Cloudflare DNS. It's like, Google and Cloudflare maybe aren't going to censor us, you could argue, or it's
at least better than trusting your ISP if you're in China or Iran or something like that. If you're a content provider and you think that your domain is blacklisted or the IP of your domain is blacklisted, you can migrate or mirror your blocked domain to a new one.
I mean, you're always racing the censors in that case, like chances are they can discover your new site just as fast as your readers can, but that's another way of kind of evading the lower-level censorship techniques. Another circumvention method you can use is by connecting to a web proxy.
So first you connect to some other website that's not on the blacklist, and from there you use that as your vantage point to kind of browse the open internet. Of course, you can use a VPN, and you can use other circumvention tools like Psiphon or Tor, which I will tell you more about.
So SSH, this is a protocol that's used to communicate with servers and administrate them. It's great because it's encrypted. Any man in the middle that's trying to look at this request is only going to see something that they can't interpret, but again, because of its regular size and shape
from a network perspective, SSH can be fingerprinted using off-the-shelf technology. Same thing with VPN. And so for most censorship regimes, it's easy enough to block all VPN traffic in
and out of the country just by flicking the switch that says we're not going to allow VPN. And increasingly this year we've seen during, say, politically important moments like elections or public demonstrations, that the censors will utilize this ability and leverage that
over the networks they control. Which brings me to OSSH. OSSH is an obfuscated protocol; it stands for obfuscated SSH. There are ways that you can innovate on the existing SSH tunnel to make it as much
as possible indistinguishable from random bytes of generic web traffic. So rather than looking like this strange encrypted thing that the censors can pick out and block, it's designed to blend in with all the rest of the web traffic that's going on. And there are a lot of different things that you can do to sort of change the exact configuration
that it follows so that it's as random as possible. And some of the things that you can do are insert random packets alongside the tunnel, like random web traffic both ways. You can vary the packet size, the packet interval, and other kind of ways of making
that as amorphous as possible. Again, back to the concept of collateral damage: a censor that endeavors to block something that's indistinguishable from random web traffic, based on certain features that they identify, will probably incidentally block some generic
web traffic as well, which is a calculus that they're always going to have to make. What deep packet inspection is doing is it's scanning deep into every web request, but that process, as I mentioned, is quite resource-intensive.
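The idea of varying packet sizes and adding random bytes can be sketched as follows. This is not the OSSH protocol; it is a toy framing, with hypothetical parameters, that only shows how padding changes the size pattern while the payload stays recoverable. In a real design the length fields would themselves be encrypted.

```python
import os
import random
import struct

def pad_frames(payload, max_chunk=64, max_pad=32, rng=random):
    """Split payload into frames of varying size, each followed by random padding."""
    frames = []
    i = 0
    while i < len(payload):
        chunk = payload[i:i + rng.randint(1, max_chunk)]
        pad = os.urandom(rng.randint(0, max_pad))
        # Header: 2-byte data length, 2-byte padding length (network byte order).
        frames.append(struct.pack("!HH", len(chunk), len(pad)) + chunk + pad)
        i += len(chunk)
    return frames

def unpad_frames(frames):
    """Recover the original payload by dropping headers and padding."""
    out = b""
    for frame in frames:
        data_len, _pad_len = struct.unpack("!HH", frame[:4])
        out += frame[4:4 + data_len]
    return out

message = b"x" * 500
frames = pad_frames(message)
print(len(frames), [len(f) for f in frames][:5])
```

The padding is the overhead cost that comes up later in the talk: every random byte added for cover is a byte of throughput the user does not get.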
So generally, the censor can only look at the first subset of packets, try to make a decision based on what their categorization of that traffic might be, and decide to either let it pass or filter it. So what circumvention technology is trying to do is make that more computationally intense
for them, and it really depends how deep they want to dig, and they do risk kind of slowing down general internet performance in the country if they do that. Another technique, called meek, or domain fronting, basically involves routing
traffic through what's referred to as high-value domains, so typically large infrastructure pieces of the internet, and hiding the real request inside the TLS encrypted connection.
For example, forcing traffic through CDN data centers that typically get a different blocking treatment because they are large infrastructure components of the internet that a lot of essential services require to run on. This is a diagram just showing how that request is passed along.
And if you're interested in learning more, I'd encourage you to refer to the paper David Fifield and colleagues worked on, including some Psiphon developers.
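A minimal model of what each party sees in a domain-fronted request, assuming hypothetical domain names. No network traffic is sent; the sketch only separates the fields that are visible on the wire from the field hidden inside the TLS connection.

```python
def fronted_request(front_domain, hidden_host, path="/"):
    """Describe what each party sees in a domain-fronted HTTPS request."""
    return {
        # Visible to an on-path censor: the DNS query and TLS server name
        # both name the high-value front domain.
        "censor_sees": {"dns_query": front_domain, "tls_sni": front_domain},
        # Visible only once the CDN decrypts the request: the real destination.
        "cdn_sees": {"host_header": hidden_host, "path": path},
    }

def censor_allows(request, blocklist):
    """The censor can only match on the name it can actually observe."""
    return request["censor_sees"]["tls_sni"] not in blocklist

req = fronted_request("cdn-front.example", "relay.example")
print(censor_allows(req, {"relay.example"}))      # hidden host is on the blocklist
print(censor_allows(req, {"cdn-front.example"}))  # blocking the front itself
```

The second call is the collateral-damage calculus again: to stop the fronted traffic the censor has to block the front domain and everything else that depends on it.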
What are some vulnerabilities of circumvention tools? The censor can attempt to disrupt your distribution. If people can't get your apps, then they can't use them. So one thing that we do is we have multiple redundant kind of distribution methods.
The censor can always block your website where you have your applications available for download. They might even blacklist the Play Store or the Apple App Store, or in some countries those are embargoed and not available anyway. So one of the innovations that we use at Psiphon is email autoresponders.
Basically you can send a request to one of a number of generic email addresses, and the return email you will get has links to secure cloud-hosted download sites, and even the APK or EXE file
as an attachment. The censors might also be able to enumerate your servers one by one, even if you have thousands and thousands of servers. If they have enough people running enough discrete copies of your software, you have to make sure that they can't catch up with all your endpoints before you roll
them over. The Psiphon network is fairly ephemeral. No IP addresses are really static, and the servers are constantly turning over. That really protects us against that vector of attack. Another thing is no individual copy of the software is ever going to know more than
a very, very, very small subset of the servers, like maybe 1% or less. Protocol-based attacks are interesting. Psiphon is using what we call a multi-protocol architecture, which basically protects against the blacklisting of one or even a few protocols, because there's always redundant transport
methods that the traffic can use. And then as I mentioned, we do various traffic obfuscation methods to be resilient to traffic fingerprinting as well. In terms of transports, what makes a protocol relevant?
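The defence against server enumeration described above can be sketched like this. The uniform random assignment, the server count, and the subset size are assumptions for illustration, not the network's actual distribution scheme.

```python
import random

def assign_subsets(num_servers, num_clients, subset_size, seed=0):
    """Give each client copy a small random subset of server IDs."""
    rng = random.Random(seed)
    return [set(rng.sample(range(num_servers), subset_size))
            for _ in range(num_clients)]

def enumerated(subsets):
    """Servers a censor learns by running all of these client copies."""
    found = set()
    for subset in subsets:
        found |= subset
    return found

# 1000 servers, each copy knows 10 of them (1%); the censor runs 20 copies.
subsets = assign_subsets(1000, 20, 10)
print(len(enumerated(subsets)), "of 1000 servers discovered")
```

Under these assumptions the censor's coverage is bounded by copies times subset size, so the defender's job is to rotate servers faster than that product can grow.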
So, one, is it effective? Does it work? Does traffic get through? Is it able to actually transport enough data? Like through actual throughput? Secondly, resilience. For how long is it going to work before it gets figured out and blocked?
Another thing is it should have low overhead. You can't insert too much extra data into the tunnel. And lastly, not placing too much demand on users. For instance, peer-to-peer traffic requires users to actually do something to run themselves
as a proxy node in a network, and that could affect scalability and even performance. The reason I say that is because some circumvention methods, experimental new methods that have been discovered and worked on, are excellent, but they're not as easy
to scale from tens or hundreds of people to tens of millions of people, especially not rapidly. And some of the examples I'm going to show will show you how this network really has the ability to rapidly scale itself in critical events, and that
keeps people connected to the open internet. Just a small note on network data as well. I'm sure everyone in this crowd has at one time or another used, or regularly uses, a VPN,
and not all VPN providers are created equal. Not all of them are to be trusted. So I want to just make a note on how to be privacy-conscious as a VPN provider. You're not technically anonymous from your VPN provider, because at the end of the day,
you are agreeing to tunnel all the traffic from your device across some third-party servers that you don't know, and you don't really know what they're doing with your traffic, and you have to click that button that says, I trust this provider. So, with Psiphon, we make sure that we don't log anything.
The only data that we're privy to is statistics that come from the network, aggregated network statistics, and no personally identifying information on any users. You can know where people are without collecting their IP addresses, because you can do the
geo-IP lookup on the client side and discard their IP address without it ever having to leave their device. And another great feature that Psiphon offers is you just download an application, you don't have to register, you don't have to provide your email address, your phone number,
credit card, et cetera. So, what this data allows us to do is make some conclusions about the censorship environment that's being faced in different places, and try and make sense of how our network
protocols are being affected by those dynamics. It allows us to see how the software is performing and how that could be improved, and it also allows us to ensure that we stay one step ahead of the censors.
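The idea of knowing where users are without collecting their IP addresses can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual implementation: the prefix table, the function names and the report format are all hypothetical, and a real client would use a proper geo-IP database.

```python
from collections import Counter

# Hypothetical prefix-to-country table (documentation address ranges only).
# A real client would ship or query a proper geo-IP database instead.
GEO_TABLE = {"203.0.113.": "SD", "198.51.100.": "IQ", "192.0.2.": "IR"}


def country_for(ip: str) -> str:
    """Map an IP address to a country code using the toy table above."""
    for prefix, country in GEO_TABLE.items():
        if ip.startswith(prefix):
            return country
    return "ZZ"  # unknown


def client_report(ip: str) -> dict:
    """Runs on the device: only the country code is put into the report."""
    return {"country": country_for(ip)}


def aggregate(reports) -> Counter:
    """Runs on the server: sees country codes only, never addresses."""
    return Counter(r["country"] for r in reports)


reports = [client_report(ip) for ip in ("203.0.113.7", "203.0.113.9", "192.0.2.44")]
daily_users_by_country = aggregate(reports)
```

The point of the design is that the address is discarded before anything is reported, so the aggregated statistics cannot be turned back into a list of individual users.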
This is a real-time map showing where Psiphon users are in the world. There are at least some users in every country. I've highlighted Sudan in the center there, because of the recent blocking event that
started on December 19th. It involved basically centrally orchestrated blocking of all the major social media platforms, Facebook, Twitter, WhatsApp, and within a matter of days, as you can see, we've gone up to half a million users a day there. Interestingly, a lot of VPN tools are not available in Sudan because of the sanctions,
economic sanctions. So, I think that's another factor driving adoption, and it works. It's not the first time that we've seen a rapid spike in Psiphon usage in response to
social media blocking. There was a case earlier this year in the summer, starting about mid-July in Iraq, where there were protests in Basra and the southern regions, and again, the government reacted by blocking Facebook, Twitter, WhatsApp, basically essential social media and communication platforms
that people rely on. Anyone from the MENA region, you know WhatsApp is life, and a lot of other regions too. And we were above 4 million users a day over that time period.
This is a snapshot of the protest period that began near the end of December last year in Iran, where, thanks to, I guess, good overall network performance
in the country, where sometimes VPN connections or other circumvention methods like the Tor network aren't as reliable, Psiphon has a fairly good reputation, and we reached
a peak of 14 million users a day after they blocked Telegram, arguably the only essential instant messaging service that was still left uncensored by the authorities there, as well as Instagram, same case.
And so basically there was a country-wide demand for ways to stay connected with that, and that represents almost a fifth or even a quarter of all internet users in Iran using Psiphon during this time period.
Another note on scalability is that the network is pushing, at the peak, about 1.4 petabytes of data per day. On networks that are known to have an average connection speed of something like 2 Mbps,
that's pretty impressive. And again, I did mention the challenges of scaling peer-to-peer type circumvention services. One of the advantages of having a kind of cloud-based centrally managed system is that
we are able to provision servers rapidly when there are incidents like this. Iran has been some of the most challenging, some of the most sophisticated, some of the most aggressive censorship that we've encountered over the past year.
And that comes from their motivation not just to block content, but also to block the methods that people use to get around the filtering there, which has become in the past decade a regular part of going on the internet.
Telegram was finally banned via a court order which also stated that Telegram must be blocked in such a way that no Iranian can access it, not even with circumvention tools. And so there was a push from the internet providers there, which are...
In some countries you might see a lot of heterogeneity in the blocking enforcement, as in, say, the talk about the Telegram blocking in Russia, which was given by Lanny yesterday; his research showed that there was some varied compliance or delayed compliance
from some internet providers there. In Iran, the ISPs seem to be very centrally controlled, and so the blocking rule was basically implemented country-wide. So these statistics are showing daily users of Psiphon and of a Telegram client that was
deployed integrated with our library, called TelegramDR, which within the first day was already up to close to a million users.
This is a shot of Tor usage during the same time. You notice Tor direct connections start to drop off in favour of bridges. But that's maybe 10,000 users a day, compared with 10 times, 100 times.
This is an example of the advantage of the multi-protocol architecture that Psiphon is using for transports. These are hourly connections shown by protocol group, and you can see when one
or two types of protocols get knocked out, the balance of connections is picked up by the other transports that we use. So effectively, without blocking all the protocols, the network will remain resilient.
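The failover behaviour described here can be sketched in a few lines. This is a simplified illustration, not the actual client logic: the protocol labels and the `is_blocked` callback are hypothetical stand-ins for real transports and real connection attempts.

```python
def connect(protocols, is_blocked):
    """Try each transport in order and return the first one that gets through.

    `is_blocked` is a hypothetical stand-in for a real connection attempt;
    it returns True when the censor has knocked that transport out.
    """
    for name in protocols:
        if not is_blocked(name):
            return name
    return None  # every transport is blocked


# Hypothetical transport labels, used only for illustration.
PROTOCOLS = ["obfuscated-ssh", "tls-fronted", "quic-like"]

# The censor knocks out two of the three transports.
blocked = {"obfuscated-ssh", "tls-fronted"}
chosen = connect(PROTOCOLS, lambda p: p in blocked)

# Only when all transports are blocked does the client fail to connect.
none_left = connect(PROTOCOLS, lambda p: True)
```

As long as at least one transport in the list is still reachable, the client ends up connected, which is why the balance of connections shifts between protocol groups instead of dropping to zero.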
This is another example showing China during the 19th Party Congress, which happened last October. Basically beginning in July of that summer, WhatsApp voice and video calls were beginning
to be blocked, and only messaging was working. That sort of drove the adoption of lots of different circumvention tools, but there was simultaneously an order to ban VPNs in China as well, which was slowly being orchestrated, and
even complied with by parties like, say, Apple, which removed VPN apps from the app store. Finally, in about mid-September, WhatsApp was blocked completely, and you can see that
our usage there started to increase a lot. Then at the beginning of the Party Congress, there was actually an attempt to filter Psiphon based on protocols, and the protocols that were targeted were not connecting successfully or were taking a long time to connect, but nonetheless, we were able to sustain usage
through that time period to make sure that people had open access to information. Okay, so just a recap of some of the trends that have been noticed over the past year. For sure, deep packet inspection is getting cheaper and easier to implement, possibly
even able to look deeper into web traffic without sacrificing performance. I saw an article from Radio Free Europe/Radio Liberty, which was
about a meeting of, basically, the DPI filtering providers in Russia. The newest spec was that they wanted these systems to be able to run at scale on a one terabit per second feed of data, which, I mean, those are really shockingly high numbers.
That's possibly the next generation of this technology that we're going to see. As I mentioned, there is a crackdown not just on content, but on circumvention tools and VPNs, with VPNs especially becoming a lot easier to block, not just for the
notoriously sophisticated censoring nations like China and Iran, but for anyone. Any government can really afford a DPI box nowadays, and it really just has a switch that lets you turn off VPNs if you want to.
Another good point is that collateral damage is becoming less reliable as a deterrent. So the example that I showed from Iran, we observed that, so this is just an anecdote on the same story, basically, even when you have encrypted communication on the internet,
some part of that transaction happens unencrypted, it's called the TLS handshake. So basically when you're going to communicate encrypted to that server, you're going to say, hey, there, server, I'd like to talk encrypted with you. The server is going to say, okay, I can talk TLS 1.1, 1.2, 1.3, and then you agree
and then you talk encrypted. So Iran blocked traffic based on some TLS handshakes that were suspected of being used for circumvention. The TLS handshakes that Psiphon was using at that time were emulating some of the most
ubiquitous and common TLS handshakes that are used online, like Google Chrome, Firefox, Chrome Android, the most widely used web browsers in the country. And when this filtering rule was implemented, users reported problems with those central
internet services, like your web browser starting to break, because of filtering rules that were deployed to target one specific thing, but not surgically enough. That said, it seems that censors in 2018 are willing to sustain large amounts of collateral
damage or block unreasonable amounts of benign web traffic going in and out of the country, just in an effort to enforce their blacklist rules.
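To see why blocking on the unencrypted part of the handshake causes collateral damage, here is a toy sketch. It assumes a simplified, JA3-style fingerprint (a hash over a few fields the client sends in the clear); the field values, client names and blocklist are illustrative assumptions, not observed data or any censor's real rule set.

```python
import hashlib


def handshake_fingerprint(version, ciphers, extensions):
    """Hash the plaintext handshake fields into a short fingerprint.

    Simplified, JA3-style construction for illustration only.
    """
    raw = ",".join([
        str(version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
    ])
    return hashlib.md5(raw.encode()).hexdigest()


# Illustrative handshake parameters (not captured from real clients).
popular_browser = (771, [4865, 4866, 49195], [0, 10, 11])
circumvention_tool = popular_browser  # the tool emulates the browser's handshake
other_client = (771, [49199, 49200], [0, 10])

# The censor adds the tool's fingerprint to a blocklist.
blocklist = {handshake_fingerprint(*circumvention_tool)}


def censor_allows(handshake):
    """Return True if the handshake's fingerprint is not on the blocklist."""
    return handshake_fingerprint(*handshake) not in blocklist


tool_allowed = censor_allows(circumvention_tool)
browser_allowed = censor_allows(popular_browser)
other_allowed = censor_allows(other_client)
```

Because the tool and the browser send identical plaintext fields, a rule keyed on that fingerprint cannot tell them apart: blocking one blocks the other, which is the kind of breakage users reported.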
Another thing that we've seen not in 2018 but in previous years is the willingness to even block entire CDNs. So using kind of critical infrastructure pieces as a way of concealing or obfuscating circumvention
traffic may not be the most reliable method going forward. Another thing that has been observed is that censors are beginning to block only certain IP ranges within a CDN that are suspected to be involved in circumvention
traffic, so they are able to block just part of a CDN instead of the entire domain. So that's something to look out for in the coming year. So, by way of conclusion, what can you do? One, don't settle for partial internet.
Anything less than the entire open internet is not the world wide web. Secondly, you can blow the whistle on censorship if it's safe for you to do so. Thirdly, you should use free open source circumvention software and support it.
Fourthly, come work with us. If you're a researcher, if you're a developer, if you're a media provider, we can work together and we would love to collaborate. And lastly, especially app developers, you can use our open source libraries. They're all on our GitHub. If you want to add some censorship resilience to applications that you're working on, then
that's something that is highly encouraged, and we would love to speak more about that. And we have about ten minutes for questions, so I will leave it at that. Thank you very much.
And I already see people lining up at the microphones, and we also have questions from the signal angels, so we just hurry to your questions. Microphone one, please. Hi, I wonder, did you mention, I see the biggest threat from censorship is from big tech giants
like Google, Facebook, and their manipulative algorithms. So this is like the biggest threat, and you kind of didn't mention, I think this would be like, I think this should be the focus, like how do we bypass this? Because these tech giants become more powerful than any like head of any state, and they're
like really, you know, I experienced just days ago, we started a petition, which was totally shadow banned on Twitter and Facebook. So this is the biggest threat from censorship, even bigger than state parties.
I mean, you haven't mentioned anything, what's your perspective on that? I think I did mention at the very beginning kind of the shift from first generation, that's interceptive forms of censorship, to the second and third generation of information controls, which is like, one, being able to have legal mechanisms that enforce the blocking
of content, and even the takedowns of content from, say, major social networks like Facebook and Twitter. And other apps too, Telegram has kind of administrators on their channels and groups now that are accountable to sort of local laws.
That's something that circumvention technology doesn't expressly deal with at this time. We are more concerned with maintaining access to the sort of the web platform itself when it gets blocked. But sure, that's definitely a concern.
I would say an even greater and more difficult concern in this day and age than simply being able to access content is being able to preserve content that's up there. So that's not something that I have an easy solution for, but definitely something that I'm gravely concerned with. Sure.
Democracy can be pre-programmed by the algorithms and censorship. So this is kind of the biggest threat from censorship, as I say, but thank you. Thanks. So next one from the Signal Angel, please. As far as I know, it's implemented in Psiphon.
I'm not aware of any other clients that use it, but I believe it's an open source transport, so probably there are some people out there using it. And I should add to that, it's not always guaranteed to be the same thing.
Like, it doesn't stay the same for long, and probably other implementations of it have different ways of obfuscating the traffic specifically.
We have more questions on Microphone 2. Thanks for your talk. I have two questions. One is, if I understand right, the end user software is open source. How about the server components? Can I run my own VPN provider?
And do you plan to open source the software from the server side? Second question would be, like, I can imagine that running an infrastructure that serves so many users at this scale is pretty costly.
So what is the business model if you don't need to register? Is it donation-based? Where do you get the money to pay the bill at the end of the month? Okay, first question, thank you for the question.
First question first, Psiphon is open source, the client software, server software, or server code, it's all in our GitHub. Theoretically, you could compile your own circumvention client and run your own network. If you have some servers at your disposal and you want to do that, then there's nothing stopping you from doing that.
Second question, sure, it's definitely a challenge maintaining a user base this large on a free service, so some of the ways that that's supported is, one, we have an app that allows you to subscribe for premium service for, like, the sort of not explicitly censored countries
like the United States, or European Union, whatever. People that feel conscious enough that they want to support internet freedom for others by supporting the network, that's really encouraged.
We also work with international broadcasters that have a mandate to support internet freedom around the world, and we can work with them to basically provide circumvention technology that helps them deliver content into closed societies, and in exchange, we have a way of supporting the free users.
Thank you. I would take the Signal Angel again, because the others can also come to the front later. Have you seen countries with a surprising number of users, countries which are not usually considered to have heavily restricted internet access? Pardon me, could you repeat the question? In your statistics, do you have countries where the number of users in that
country surprised you because that country is not usually considered to heavily restrict internet access? Yes, absolutely. Plenty of western countries have pretty significant Psiphon user bases, maybe tens, maybe hundreds of thousands of users. But even here in Germany, or in the UK, it's not to say that, like, countries that are
known to be free internet, open internet countries, it's not to say that every network in that country is free and open. Plenty of institutions, workplaces, universities maintain a pretty aggressive blacklist of some
types of content, or some applications, so that's something that I see as a key driver of circumvention usage in those countries. Thank you. Microphone 4, please. Let's suppose I'm a censor who likes to use traffic fingerprinting to censor, and
I wonder how efficient it is if I want to keep the number of false positives low. Are there any studies on the effectiveness of traffic fingerprinting?
I want to filter the bad traffic, but to keep good traffic safe. Thanks for your question. Traffic fingerprinting is known to be fairly imprecise. I mean, I don't have any studies I can reference, but anecdotally, within the internet
freedom community, what people are saying is that it's becoming better. But yeah, collateral damage is a calculation that every censor needs to make, and in some cases they don't mind, and in some cases they don't know what other traffic is being
filtered as a result of the measures that they're implementing. And it remains in many ways kind of a whack-a-mole approach. There was a study last year that was on specifically machine learning-enabled censorship,
which was extremely imprecise, it had a lot of false positives. It was something like 80 per cent successful, the researchers claimed, which obviously is not enough to deploy at internet scale. We have one question at microphone one.
Hi. Hello, hello, hello. So sometimes I wear your hat, and sometimes I put on another hat where I'm your enemy. I won't go into the reasons why. I sometimes find that throttling can be more effective than blocking, because a user
might get bored downloading and waiting for a YouTube video to load and go away and do something else, which leaves the bandwidth available for other people. So I'm wondering if you've ever seen that tactic being used, where, rather than actually blocking a website and giving back a message saying 404 or connection reset or something, you just make it unusably slow.
You're a sysadmin. Are you a sysadmin? Of course. Thanks for your question. Yes, that's definitely a tactic. I think also trying to diffuse the accountability for censorship is another reason that they
do that, because there's sort of no guarantee that it's not a problem with the content provider's site or something. There have been lots of examples, I mean, I can share with you afterwards, of internet throttling used on specific domains to sort of, like in China, for example, famously kind
of not exactly blocking your connection, but just throttling you to a completely unusable degree. I would say when that's deployed on specific domains, circumvention tools still work effectively.
When it's deployed on a network scale, then there's not too much that we can do. Or like an internet shutdown, that's one case where it doesn't matter how robust and resilient the circumvention software that you're using is if no-one has an internet connection, with one notable exception in Psiphon history, where we kept networks online
in the case of an internet shutdown. Does that answer your question? Thanks. Thank you, and we have one more question from the signal angel. Have you had governments or other organisations try to fight you legally for enabling their
users to circumvent the content filters? No. That is a nice, short answer. So we have one more question here, or three more in the audience. I take microphone two, please. You mentioned domain fronting as an effective way for high-value domains.
When Google and Amazon stopped tolerating domain fronting, have you and the organisation been in touch with them? We were part of some discussions on the issue, since we notably do use that as a
technique. Psiphon has never done domain fronting through Google or Amazon. I mean, to go back to the example of a circumvention method that works for tens or hundreds of users but may not work for tens of millions of users, this is a
perfect case in point, I think, for that. It's like, sure, it's a cool trick. Maybe it doesn't require a lot of technical sophistication to put Google.com in the domain fronting. If I have tens of millions of users, that's potentially going to sabotage the person's domain that I'm using. Any domain fronting that is done on the scale that the Psiphon network uses is
done under close collaboration and a formal agreement, not in the informal way that it was being done by so many different app developers, which I think is the reason that
it was eventually cracked down upon. And so, it does violate the terms of service of those domains. So, yes, does that answer your question? So, the person on microphone 4 has now disappeared.
Not coming back? Okay. So, thanks a lot, Keith, for your presentation and for answering all those questions, and you're still here for a few offline questions. That's really nice. So, thanks to all of you for being here. Thank you. Thanks, Keith.