
Investigating the Practicality and Cost of Abusing Memory Errors with DNS


Formal Metadata

Title
Investigating the Practicality and Cost of Abusing Memory Errors with DNS
Title of Series
Number of Parts
109
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In a world full of targeted attacks and complex exploits this talk explores an attack that can be simplified so even the most non-technical person can understand, yet the potential impact is massive: Ever wonder what would happen if one of the millions of bits in memory flipped value from a 0 to a 1 or vice versa? This talk will explore abusing that specific memory error, called a bit flip, via DNS. The talk will cover the various hurdles involved in exploiting these errors, as well as the costs of such exploitation. It will take you through my path to 1.3 million mis-directed queries a day, purchasing hundreds of domain names, wildcard SSL certificates, getting banned from payment processors, getting banned from the entire Comcast network and much more. Speaker Bio: Luke Young (@innoying) - is a freshman undergraduate student pursuing a career in information security. As an independent researcher, he has investigated a variety of well-known products and network protocols for design and implementation flaws. His research at various companies has resulted in numerous CVE assignments and recognition in various security Hall of Fames. He currently works as an Information Security Intern at LinkedIn. Twitter: @innoying
45:07
Transcript: English(auto-generated)
undergraduate student at a big university that had nothing to do with this research. I'm a security engineer originally from Minnesota. I'm currently working in the Bay Area. Say that again? Is this better? Sorry about that. I'm also
the founder of Hydrant Labs LLC, which has graciously funded this research. It's funny how you can do that when you're the only employee. In case you guys didn't catch that, that means unlike a lot of the speakers out here, I'm not hiring unless you want to work for a 19-year-old kid for
minimum wage. If you do, there's my e-mail address. Also if you have any questions about the research or you'd like to send me legal threats or both, that's my e-mail, or you can snail mail things to the WHOIS info on the project domain, which will be listed at the end. So as usual, we're
going to start with a quick rundown of what I'll be talking about today. I'm going to talk about what a bit flip is and the history of their exploitation. After that I'll get into bit squatting which is a specific type of bit flip exploitation. Finally we'll move into my research on bit flips via bit squatting and I'll finish things up with a complete code release, partial data dump, followed by
Q and A. So what's a bit flip? A bit flip occurs when a bit flips from a 1 to a 0 or a 0 to a 1. It's a pretty simple concept. It can happen for a variety of reasons. Heat, electrical problems, radioactive contamination, cosmic rays, among others. I'm not going to focus very much on what causes a bit flip. I'm going to instead focus on how
we can actually exploit them. However, we will take a quick detour into the history of bit flips. In 2003 a paper was published by Princeton University titled "Using Memory Errors to Attack a Virtual Machine." In the study they literally took a 50 watt light bulb, put it over a memory module to intentionally induce bit flips and then used it to
escape the JVM. Since then a variety of research has been done into bit squatting which I'll get into a bit later. However, it wasn't until 2014 that a paper from CMU was published that investigated the use of DRAM flushing to intentionally induce bit flips. Many of you have probably heard of this by its more common name, Rowhammer. In 2015 Google's Project Zero team showed that it was
possible to exploit a bit flip to actually gain kernel privileges. Now what is bit squatting? I keep saying this word. Bit squatting is a name coined by Artem Dinaburg. It refers to a specific exploitation of bit flips via purchasing domains that are one bit away from the
legitimate domain. In the hopes that a bit flip will occur and the user's traffic will be directed to your domain instead of the original intended domain. So let's take an example. We take the domain CNN.com. You can see the binary representation of it right there. If the last zero in the second N or in the first N flips from a zero to a one,
you can see it changes to CON.COM. So normally these domains aren't actually registered and so the request will fail silently or the user won't ever actually notice something happened. And so the idea is with a bit squat to purchase these domains instead. Now how do you actually come up with all the possible bit squats? You can't
just flip every single bit. So if we take a legitimate domain name, let's say www.defcon.org and you can see the binary representation of one of the letters in E for example, we can actually throw away any bit flips that occur in the first bit because in seven bit ASCII it's always going to be zero. And in most cases we can throw away any flips that occur in the third bit depending on where
you're indexing from. That's because this represents the case of the character and in domain names this is irrelevant. Unfortunately, this doesn't leave us with six possible bit flips. That's because domain names only have certain valid characters. Primarily A through Z, 0 through 9 and dashes. In
the case of E this gives us five possible flips: U, M, A, G and D. Now I mentioned there were some exceptions to that rule with case. Sometimes, if a flip occurs in a character, it changes to a different kind of character. So for example, an N can flip to a dot or vice versa, or a slash can flip to an O. So these can actually happen where a flip
will occur and it will still complete to a valid domain. Or for example, a subdomain can change, as in wwwndefcon.org. This was actually researched at DEF CON 21 by Jaeson Schultz from Cisco. He explored the possibility of bit flips in the new gTLDs such as .exchange, .cloud. So using this
information we can find there's actually 43 possible top level registerable bit squats of www.defcon.org. I've written a tool called BF look up that will generate a list of all possible bit squats of a given domain. It will be linked on the project site which I'll show at the end. Now I want to quickly note some of the previous bit squatting work before my
research. During DEF CON 19, Artem Dinaburg first coined the term bit squatting and more recently Robert Stucke and Jaeson Schultz conducted more extensive research into bit squatting which was actually the inspiration behind project bit flip. And what is project bit flip? Project bit flip is my research into how do you actually exploit a bit squat. It
makes a lot of sense of how a bit squat can occur and how it gets sent to potentially the wrong person but how do you actually exploit that to pivot somewhere. So let's take an example. Let's imagine an internal site, say a corporate wiki or continuous integration system that includes jQuery from the jQuery CDN, code.jquery.com. Since your
browser is loading this page for the first time it doesn't know who code.jquery.com is so it's going to initiate a request to the user's DNS resolver. This would be Google DNS, Comcast, OpenDNS, etc. Now suddenly a bit flip occurs. code.jquery.com has become code.jqueezy.com
instead. This could have happened in memory on the device running the browser. It could have happened in the memory of the NIC. It could have happened in transit when a checksum was recalculated or even in the memory of the DNS resolver itself. Now the DNS resolver, if we assume this is a cold request, doesn't actually know who's in charge of jqueezy.com so it's going to ask the DNS root via an
authoritative name server lookup. It's going to send that. The DNS root is going to answer with ns1 and ns2.bitfl1p.com. Now what is this domain suddenly? Bitfl1p.com is the domain I purchased to act as the control domain for the entire site. And it has a 1 in it because someone else already
owned bitflip.com. So these answers get sent back to the DNS resolver which the DNS resolver is then going to use to send a query to the project bit flip server. It's going to ask for code.jqueezy.com. Now this is also where this is going to begin to deviate a bit from a standard DNS question
and answer. We're actually going to send back two answers to the DNS resolver. One for code.jquery.com and one for code.jqueezy.com. Now the reason we do this is we don't actually know where in the process the bit flip has occurred. So we don't know what answer the DNS resolver is expecting. So we send both back to them and let the DNS resolver pick the correct one and ignore the other packet.
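The two-answer idea can be sketched in a few lines of Python. This is only an illustration of the logic, not the talk's DNS server: the names `Answer`, `SQUAT_TO_ORIGINAL` and `build_answers` are hypothetical, the addresses are placeholders from the documentation range, and no real DNS packets are built. Each answer gets its own address so that later traffic shows which spelling the resolver accepted.

```python
from typing import NamedTuple


class Answer(NamedTuple):
    name: str     # owner name of the A record
    address: str  # IPv4 address returned for that name


# Hypothetical data: one registered bitsquat and the legitimate domain it shadows.
SQUAT_TO_ORIGINAL = {"jqueezy.com": "jquery.com"}

# Placeholder addresses from the documentation range (RFC 5737).
ORIGINAL_NAME_IP = "192.0.2.10"  # handed out under the legitimate name
FLIPPED_NAME_IP = "192.0.2.20"   # handed out under the bitsquat name


def build_answers(qname: str) -> list[Answer]:
    """Return one answer per plausible spelling of the queried name."""
    qname = qname.lower().rstrip(".")
    for squat, original in SQUAT_TO_ORIGINAL.items():
        if qname == squat or qname.endswith("." + squat):
            prefix = qname[: len(qname) - len(squat)]  # e.g. "code."
            return [
                Answer(prefix + original, ORIGINAL_NAME_IP),
                Answer(qname, FLIPPED_NAME_IP),
            ]
    return []  # not one of our bitsquats


for answer in build_answers("code.jqueezy.com"):
    print(answer.name, answer.address)
```

A resolver that was asked for the legitimate name keeps the first record, and one that was asked for the flipped name keeps the second.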
The DNS resolver is then going to send this back to the browser which it will then use to issue an HTTP GET request. Now if you were attentive, a few slides back you may have noticed those two different answers actually had different IP addresses. And this is because it allows us to determine which type of bit flip occurs most commonly because this HTTP GET
request will be triggered for a different IP address depending on which flip was accepted by the DNS resolver. Now this HTTP GET request is going to get sent to the project bit flip server. Now this is also where we get to deviate a bit from previous research. Instead of just answering this with a 404 or 200, we're going to be a little mean. We're going to
send back a 301 Moved Permanently. This has the effect of permanently caching the bit flip in the browser's cache so that subsequent page loads are then directed to the project bit flip server even when a bit flip hasn't occurred. And we're also going to send it to a unique subdomain of bitfl1p.com. Now the reason for this is the browser is not
going to know who's in charge of that subdomain. So when it gets the request, it's then going to have to issue another DNS question. It's going to send that to the DNS resolver which will send it through to project bit flip. This allows me to directly tie a specific user's browser with their DNS resolver because you may have noticed at this point there's no
way to tell which browser initiated a specific query, because they all originate from the DNS resolver. Following this we have a pretty standard path. The answer gets back to the browser. The browser issues an HTTP GET for jquery.js. Project bit flip receives it and we answer with a 200. And we send back this tracking
JavaScript. Wait a second. I asked for jquery.js, not tracking JavaScript. Too bad. I'm not jQuery. So instead we're going to send back this, which the browser, believing it originated from code.jquery.com, is going to faithfully execute in the context of your internal site. Whether it be a continuous integration system, internal wiki, basically anything
with any important data on it. Or any site that a user would be tricked into entering credentials into believing that they are on the original site. Now how do you actually build all of this? This is great in theory but actually answering all these queries is complicated. The second tool I'm releasing is BF DNS. It's a Golang DNS server
specifically designed to answer bit squat DNS queries, along with BF www, which is the configuration that I used to answer all these queries, along with a bunch of PHP scripts (gross) that are used to answer the actual tracking JavaScript. I keep mentioning this tracking JavaScript. What
does it actually track? Basically everything that you can track with JavaScript. It pulls the user's installed plug-ins, user agent, time zone, language, referrer, the document title, the screen size, the resolution, the current URL, the do not track cookie, and the installed fonts via Flash, and then it also pulls the local IP addresses on the system via WebRTC. Some of you guys may
have seen this from the BeEF talk. With WebRTC, the Session Description Protocol actually contains all the local IP addresses installed on the system. And so JavaScript has access to those even if they aren't the route that was used to access the site. And these are internal LAN IPs, in addition to external IPs if
your computer is directly connected to the internet. It also pulls the cookie names and a SHA-256 of their value. You could actually pull the cookie values. I don't want to get sued so I just pulled the SHA-256 hash of them. Now we need somewhere to host all of this. So we need to select a host. We need somewhere that supports multiple IP addresses so
that we can answer each question with a different IP address. We also want IPv6 support so that we can evaluate IPv6 usage and also somewhere that bandwidth is really cheap in case a bandwidth spike occurs if a big DNS resolver were to cache one of our results such as Google DNS or Comcast because then it would be serving it to their millions
of customers and that would result in a lot of traffic. Finally I wanted somewhere that was a smaller company in case there were letters or legal threats sent to them. Somewhere that would actually look out for their customers and wouldn't just say go away, pick another host. I ended up settling on a host called RamNode: small VPS, 3 terabytes
of bandwidth a month. It cost about 15 bucks a month. Finally we need some domains other than the project domain. So I don't know how many of you know a college student but we're really lazy. Whatever option requires the least amount of work is the option we're probably going to take. So rather than building a list of domains from a bunch of data sets like the Alexa 500 or other similar sets, I fired up a
web proxy and browsed the internet for a day. At the end of the day I looked at all the top sites that I'd hit and looked for ones that would have interesting data. We got the mic falling over. Sorry. Hopefully. Nope. I took care to
only grab sites that would yield interesting data and try to explicitly avoid any sites that would have data such as HIPAA or PCI. The first site I bought was googleusercontent.com. It serves images for Google sites. The main reason I purchased
it is because it's a really, really long domain name. So there's a lot of opportunities for a bit flip to occur in memory especially when browsers copy the domain name multiple times. In fact here's all of the possible bit flips of it or rather here's the 72 of them I was able to buy. There are 79 possible valid ones. Now, I hadn't actually set up a proper
server at this point. So I pointed them at the VPS and ran netcat on port 80. Those of you that are in IT have probably had what I like to call an oh shit moment. You accidentally run rm -rf on the wrong server. You accidentally shut down all but the server you wanted to. Something like that. This is one of those moments. For those
of you that can't read the tiny text up there, that's a request for mail-attachment.googleusercontent.com. As it turns out googleusercontent.com serves all mail attachment downloads for Gmail and Google Apps. Oops. To make
it even better, by their very nature those links are valid without session cookies, meaning that for each misdirected request I could go grab that attachment myself if I wanted to. So I decided to actually look a little bit more at the authentication. It turns out it serves not just mail attachments but the authentication for Google,
Google fonts, Google cache pages and Google translated pages. I'm sure there's no valuable data in any of that. Moving on from there I decided to take a look at Amazon. Specifically cloudfront.net. If you're familiar with CDNs, it serves a lot of popular sites such as ESPN, Amazon.com itself among a whole bunch more. There were
43 possible bit squats, four of which were already registered. So I registered the rest of them. Moving on from there continuing with the Amazon theme I took a look at amazonaws.com. It serves pretty much all AWS services as subdomains of amazonaws.com with the exclusion of CloudFront.
This includes Amazon S3, Elastic Load Balancer and EC2. Interestingly this is one of the few domains I came across in my research where a lot of the bit flips were already owned. The other one being Akamai, who actually owned all of their bit flips. In this case Amazon owned 33 out of the 38
possible bit flips. However, the rest were registered by someone else except for one. AmazonAWS.com was, of course... I wasn't actually satisfied with a single bit flip so I decided to buy subdomain bit squats where the dot changes to an N of Amazon S3, EC2 and Elastic Load Balancer. Moving on from there
I decided to take on doubleclick.net. Those of you familiar with ad networks know that this serves Google's ad network and it primarily serves them over JavaScript which makes it a great target. There were 45 possible bit squats, 19 of which were already registered. So I
registered the other 26 of them. Moving on from there I hit apple.com. Being a short domain name a lot of these were actually valid sites and not actually owned as bit flips so I was actually only able to grab one of them, AppleG.com. But continuing with the Apple theme I took a look at iCloud.com next. If any of you have an iOS or OS X device
as I'm sure a lot of you do, your device will check in with iCloud.com regularly for backups, contacts and a lot more. Additionally most Apple accounts have an iCloud.com e-mail address which is delivered to this domain. Moving on from there I decided to look at web dev. I don't know how many of
you have ever heard of jQuery. jQuery is a compatibility layer that makes IE suck less among other things. It's used by over 70% of the top 10,000 sites making it a really good target. I registered the 15 available ones of that. Continuing with the web dev theme I hit disqus.com. It serves comments for about three quarters of a million sites.
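The candidate lists behind these purchases come from enumerating one-bit flips of each domain, as described earlier in the talk. A minimal sketch of that enumeration follows. It is not the BF look up tool itself, and the function names are made up for illustration. It only checks hostname syntax, not whether a TLD exists or a name is registerable, so its counts will not necessarily match the numbers quoted in the talk.

```python
import string

# Characters that may appear in a hostname: letters, digits, dash, and the dot separator.
VALID_CHARS = set(string.ascii_lowercase + string.digits + "-.")

# Bit 7 is always 0 in 7-bit ASCII, and bit 5 only toggles letter case
# (or yields control characters for digits, dash and dot), so both are skipped.
USEFUL_BITS = (0, 1, 2, 3, 4, 6)


def char_flips(ch: str) -> list[str]:
    """Valid hostname characters exactly one bit away from `ch`."""
    flips = {chr(ord(ch) ^ (1 << bit)) for bit in USEFUL_BITS}
    return sorted(c for c in flips if c in VALID_CHARS)


def is_valid_hostname(name: str) -> bool:
    """Syntax check only: no empty labels, no leading or trailing dash."""
    labels = name.split(".")
    return all(
        label and not label.startswith("-") and not label.endswith("-")
        for label in labels
    )


def bitsquats(domain: str) -> list[str]:
    """All syntactically valid names one bit flip away from `domain`."""
    domain = domain.lower()
    found = set()
    for i, ch in enumerate(domain):
        for flipped in char_flips(ch):
            candidate = domain[:i] + flipped + domain[i + 1:]
            if is_valid_hostname(candidate):
                found.add(candidate)
    return sorted(found)


print(char_flips("e"))
print("con.com" in bitsquats("cnn.com"))
```

Running `bitsquats("www.defcon.org")` also yields names such as `wwwndefcon.org`, where the flip turns a dot into an N.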
And finally the peak, there's 24 purchased. The peak of my web dev phase, Google-analytics.com. I'm guessing a lot of you know what this is. It serves analytics for pretty much every site ever. It's the most widely used website statistics service in the world.
Interestingly I wasn't the first person to have this idea. 53 of the bit squats were already registered by another party. However I was able to grab the remaining 10 shown there. Hopefully you're beginning to see a theme here. Moving on, sfdcstatic.com is the Salesforce CDN. There were 42
possible bit squats. I bought all of them. I'm going to pick up the pace a little bit here. aspnetcdn.com, Microsoft's AJAX CDN. It serves a lot of Microsoft sites and a lot of jQuery plug-ins. Another 38 domains registered. googleapis.com, it's Google's JavaScript CDN. Another 22 out of 39 registered. gstatic.com. This
one's a fun one. It's Google's static content hosting. It serves things like the Chrome Internet connectivity test. So when you plug your computer into the network and suddenly Chrome says oh you're online now, it's hitting gstatic.com. Additionally it serves things like the Chromecast log-in page along with a lot of other stuff. The other thing
that makes this interesting is that this is one of the domains that was purchased by both Artem Dinaburg and Robert Stucke in their research yet the domain was freely available, not purchased by any other entities or by Google itself. I was able to grab 19 out of the 30 possible bit squats. Finally to finish things up, Facebook's CDN,
YouTube's CDN, Twitter's CDN and there we go. Now, that was 337 domains. I know you all were keeping count. You're probably asking yourself how did I pay for all of those? Coupons. I don't know if any of you have seen the TLC show Extreme Couponing. That was a documentary of my life for
about a month. Eventually I actually ran out of coupons and I found 1&1. 1&1 has a nice advertisement on the site for 99 cent domain names. If you go and try to buy one, the first one is 99 cents and then when you try to buy the second one, it charges you $11 which doesn't seem quite
fair. So if you start the process in an incognito window and then log in halfway through, it's still 99 cents. So after doing that, about five minutes per domain, I got this email. Dear Luke Young, you have exceeded the limit of our current special offer. Further orders placed under this offer will be canceled. Sincerely, security team. And that's when
I stopped buying domain names. The final statistics: I bought 89 domains from GoDaddy, 255 from 1&1 at an average cost of $1.62 per domain coming to a total of $545. Now
the next thing I realized is I was actually missing out on a lot of data. Most of this traffic was coming in over SSL. So I decided to buy SSL certificates too. Those of you that are familiar with SSL probably know that I need what's called a wildcard SSL certificate. This is because I
don't know the subdomains that are going to be requested for each of these. You also probably know that wildcard SSL certificates can be expensive. A wildcard SSL certificate from DigiCert is $595. Of course you get bulk discounts and other things. But if you just do easy math there, 337 domains at $595 each, that's over 200
grand which I don't exactly have lying around for a fun little side project. I think I was able to find a solution. Some of you guys may have heard of them from other talks. StartSSL has a kind of unique model. You can pay a one-time fee and then get SSL certificates from them at no additional cost except for EV certs. So at a cost of $60 I was
able to get wildcard SSL certificates. Now, because this is such a unique model, StartSSL has a very manual process. They don't have an API for requesting certificates or anything like that, which makes requesting the certificates really, really annoying for 337 domains because you have to do domain verification on all of them.
But it also makes it more likely that someone's going to notice what you're doing. In fact, 17 of the domains were flagged for manual review by StartSSL who then approved all of them. By four different employees of the company I might add. Here were the 103 certificates that I
was issued. Now, those of you that are attentive, you might have noticed that I bought 337 domains yet I only got 103 certificates. There's a good reason for that. I'll try to answer this email. My login certificate was revoked for quote abuse. Please contact us. Reaching out to StartSSL I
received the following quotes. I'm sorry but for high profile names only the name owner should be able to get certificates for it and those resembling them closely never issued. Followed shortly by quote most certificates really shouldn't have been issued to start with. Oops. Here's the
actual excerpt from the StartCom certificate policy and I'm paraphrasing here. StartCom will not issue certificates whose domain names might be misleading or of well known brands or names that are part of requested host names such as Googleme.com. So I'm guessing the fact that I own domains with their example in it was probably not so great. I went
back and forth with Eddy Nigg, the CTO of StartCom, letting him know about the project and suggested revoking only the Google certificates. Here was the timeline on that. Yep. So I don't know how many of you have ever actually been to a standup comedy show but one of the golden rules is that you
don't take your phone out during someone's set. Mostly because you get made fun of by the comedian but also because it's really rude. I was at a comedy show with some friends and my phone started vibrating like I was getting a phone call. And then it kept vibrating for five minutes straight. When the set ended I took out my phone and I saw this. And in the end they revoked 81 certificates with two
emails for each revocation. Now I'd like to note that I had full wildcard certificates for all these domains for two months before they were added to the certificate revocation list. Not that that even matters. Some of you may remember this from Heartbleed: a lot of browsers like Google Chrome don't even check the
certificate revocation list, meaning that if I were malicious I could continue to use these certificates at the risk of being sued by StartCom. I chose not to do that because I don't want to get sued. However they did leave 22 certificates unrevoked. When I inquired about why they didn't revoke these I received the response that everything we haven't revoked so far was considered not so problematic and hence we left
them to expire naturally. Now you'll notice all the domains that are remaining seem to not contain the trademarked name of whatever company it is. That's likely the reason they were left behind; however, I don't know that for a fact. Now moving on to the future. The EFF has just announced their free automated certificate authority called
Let's Encrypt. Using something like that, exploitation may be a lot easier, as it's a completely automated process and you don't have to worry about your domains getting flagged for manual review. They currently don't support wildcard certificates but are looking into it. You could also use a much larger provider
like DigiCert at a much higher cost depending on how much money you have to throw away at this. Now I suppose those certificate revocations kind of beg the question did anyone else even notice? Well the first and only public case of someone noticing was by a third party it was x8x.net he noted that all the bit flips of gstatic.com were suddenly
registered along with some other domains by quote the same individual with name servers at bitflip.com so at least someone is having fun. I thought that was it until I went to a friend's house and I went to check on the site and I saw this. I quickly logged into my server console and found that the server was alive and well. It turns out the
error that I was seeing was with DNS resolution. I figured I had just broken my DNS server or that it was handling packets wrong, so I hit the ever trusty Google DNS check and it worked. This is where things got a little bit weird. After a bit of further investigation and a few panicked calls to friends in other states I verified that I wasn't
actually crazy. Comcast's DNS server was refusing to answer requests for bitflip.com. Further investigation revealed that it's only for A and AAAA records of the subdomains of bitflip along with the root domain. Now here's the weirdest part. The requests were still getting forwarded to my server; however, my answers were not getting forwarded back to
clients. This makes it really hard to pinpoint exactly when they put this block in place and I still don't know to this day. I tried to reach out to the business class support line but never really got an answer. A short while after noticing the Comcast incident I got an email that my credit card payment for the server was declined for the month. A quick call to my
bank and I was even more confused. They said they were approving the transaction and that I had adequate funds. I reached out to RamNode via a ticket that was escalated directly to the CEO (that's the advantage of picking a small company), who reached out to Stripe, their payment processor. He received a response that quote we have reason to believe that card has been associated with fraudulent activity. What
makes this odd is that card was only used with about five vendors. I can count them all on one hand, and only one of them, this server, was purchased through Stripe. What makes it really odd is that Stripe, not my bank, was refusing this transaction. I actually reached out to Stripe on my own, independent of RamNode, and they said quote we are indeed
blocking at an RN due to a level of risk on this card that we're not willing to take. I know this is a very vague reason but for security purposes I'm limited in how much information I'm able to give out. Still don't have a solid answer on this one. Needless to say I paid for the month on another card and then paid through the end of August ahead of time in case it were to ever occur again. Now I want
to take a second and have everyone look at this slide here and see if you can tell me what's wrong with this slide. I put my bank on there. About two hours after the Def Con CD slides came out, I started getting password resets. Whoever
is doing this, I don't bank with them anymore. Nice try though. Now moving on to the part that you guys are all actually probably here for, the data. The first question people ever ask when I tell them about this is, is this actually even a problem? Is there even traffic to these bit flips? Let's start with a simple graph. This is the
traffic I received during a section of the time the server was running. Now some of you probably can't see that because it's a little small so I'll give you a number. I received over a million DNS queries every 24 hours for over a month. Now you'll notice that that graph is kind of all
over the place. There are a lot of reasons for that: different servers cached my results for longer, and I actually transferred some of these domains away over time, so it's not the cleanest data set. Now of those million DNS queries, about 4.8% of them resulted in corresponding TCP
connections. Of those TCP connections, 85% of the initiated SSL connections completed the handshake and issued an HTTP request. Now those of you that were paying attention earlier might have noticed that I had 22 certificates out of the 337 domains. Those numbers just
don't add up. The best answer I can give you on that is a lot of people have misconfigured systems that are actually accepting SSL certificates issued for the wrong name. The SSL certificates I was serving were valid; they're just not for the domain that the user requested. Now what about those HTTP access logs? The server handled about
2.4 million requests and I was able to determine that for repeat users, so users that had cached the result because of that 301 redirect, it remained cached for an average of 4.33 requests. So this means that after a flip occurred, wherever it may have
occurred in the request process, you continue to try to access jquery.js from me for the next 4.3 times that you visited that site. I can also tell you all sorts of interesting things from this traffic. I can tell you the most common languages that users had on their system. The graph on the left is the HTTP Accept-Language header and the
graph on the right is the languages returned from the JavaScript. Interestingly it seems like the JavaScript executed on Chinese machines much more than any other language. Not really sure why that is. It's also interesting that most of the traffic had a Chinese
language set, and this actually doesn't match up with standard data sets of the most common Accept-Language headers. So for some reason bit flips were occurring more commonly on Chinese computers. I can tell you things like the most common screen resolution or IPv6 adoption which was
abysmal, about 1.67% of queries. I do have to note that I wasn't actually forcing users to try IPv6 so this was just clients that preferred IPv6 over IPv4. I can tell you things like browser usage. Wikipedia's traffic over February is on the left. Their
breakdown of browsers versus the traffic I received is on the right. You'll see a significantly larger chunk is coming from Chrome. It could be because Chrome is doing more memory copies in each of its requests. It could just be because more of the users happen to be running Chrome. Along with a larger proportion of IE. I can tell you things
like OS usage. Same statistics, Wikimedia on the left and my traffic on the right. A large portion of traffic coming from Windows 7 and Windows XP with a much smaller portion coming from OSX and iOS. I can tell you things like cookies. I
have acquired 240,000 cookie names and values. Now these cookies could be anything from Google Analytics tracking data to actual session tokens for say your Gmail login. The top cookies were from Google Analytics, Baidu and for some reason weather.com. So I have literally thousands of users'
zip codes that they always look up weather for saved in my data set. I don't know why. I can tell you interesting things like here is someone browsing Amazon.com looking for an iPad and their session cookies were sent happily to me. Anyone want
to buy an iPad? We can look at things like here is someone trying to log into OAuth and their session cookies being sent to me along with their OAuth token. Or things that are more interesting like iPhones checking in to download, in this case just checking into the app store or trying to download an actual app from me. Some of you may
have seen some of the Hacking Team material. Their research was into serving malicious applications to users that could be utilized in something like this where a malicious application is sent back instead of the application the user intended to download. As long as it's signed with a valid certificate, iOS will accept it and install it. Or we could
just have iOS devices asking me for software updates. I actually did some research into this and you could serve them different software updates instead. I can also do things like look at the HTTP Referer value. Here were the top Google searches that I pulled from the Referer values. For some
reason a lot of people really want to know about birthday gifts for their wife. I have no idea why. The other thing I pulled was local IP addresses. I was able to pull 158,000 IP addresses off varying systems. 12% of
those were non-private IP addresses. This means that those systems were likely directly connected to the internet; a lot of them are IPv6 addresses, which by their very nature don't go through a NAT. I can tell you the most common local IP addresses, super fascinating graph: they're pretty much all 192.168, primarily consumers.
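The private versus non-private split described here can be reproduced with Python's standard `ipaddress` module. This is only a minimal sketch, not the tooling used in the talk: the function names `classify` and `non_private_share` and the sample addresses are made up for illustration, and the real data set may have been bucketed differently.

```python
import ipaddress
from collections import Counter


def classify(addr: str) -> str:
    """Bucket an address string as loopback, link-local, private, or non-private."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:  # checked before is_private, which also covers these ranges
        return "link-local"
    if ip.is_private:  # e.g. 10/8, 172.16/12, 192.168/16, fc00::/7
        return "private"
    return "non-private"


def non_private_share(addrs: list[str]) -> float:
    """Fraction of addresses that fall outside the reserved/private ranges."""
    counts = Counter(classify(a) for a in addrs)
    return counts["non-private"] / len(addrs) if addrs else 0.0


# Hypothetical sample, not taken from the released logs.
sample = ["192.168.1.10", "10.0.0.5", "8.8.8.8", "2607:f8b0:4005:805::200e"]
print(non_private_share(sample))
```

Run over the local addresses collected by the JavaScript, a count like this would give the kind of percentage quoted above.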
Where it gets more interesting is we look at other traffic. For example, here's the SMTP traffic that my server received. And you can't really see it out there. It's kind of small text. The numbers on the side are 20,000, 40,000 and 60,000. That's how many requests I was receiving each day. That's
someone trying to send an email, a bit flip occurs and it gets sent to me instead. Now if any of you have ever worked on mail systems, you probably know that SMTP doesn't really work well with TLS. Pretty much everyone uses self-signed certificates and even if you weren't using a self-signed certificate, I actually own the certificates for these
sites as well. So it doesn't even make a difference. And you'll notice that big jump there at the end up to about 60,000, that was June 19th where Hotmail cached my entry for iCloud.com and sent all emails that were intended to go to iCloud users to me instead. Oops. In continuing with
the SMTP theme, I was looking through the traffic and I set up a query that would show me where the origin BGP route came from. And I found that 38% of my traffic was coming from AS13414 which is owned by Twitter. Primarily those two
subnet ranges. And you'll notice that pie chart there shows what the DNS requests coming from them were for. They're all for iCloud.com. As it turns out, these are mainly MX and A record lookups and they were resulting in SMTP connections. These DNS queries resulted in about 390 connection attempts per day and based on this info, I'm
guessing that what was happening was a bit flip was occurring and Twitter was trying to send password reset emails, promotional emails and other things to users with an iCloud email address and sending them to me instead. I reached out to Twitter who said after some discussion it looks like we're trying to restrict outbound traffic from our network to bit flip domains. This should address the specific
problems you outlined without having to worry about the domains and who owns them. Now, how did I actually run all these queries? I wrote a tool called BF Splunk. It's source types for Splunk and a bunch of queries. It will be listed on the site at the end. I'd love to say Splunk sponsored this talk but I never actually called the sales
person back. Oops. Now, how do you actually remediate this? Buy your bit flips. You probably already buy the typos of your domain. Buy your bit flips as well. It's really not that much more expensive and it can actually save your users a lot of problems. You don't have to actually answer them. You don't have to do anything. Just own them so that someone else
doesn't. The other thing you can do is use ECC memory and set up an RPZ for internal flips. However, that doesn't protect your users. That just protects users within your network. Now, before I get into the data release, I want to talk about what I did with the domains. I reached out to all the companies whose domains I purchased for this talk and offered to transfer the domains to them for free. The
first company to get back to me was Salesforce. I wanted to take a second to really give props to their team. They had a response with authorization to transfer the domains in under two hours and the domain transfer process was initiated in under 24. Next company was Apple. The domains were transferred in about two days. Following them was Amazon AWS with the domains
transferred in a little over two weeks due to some scheduling issues on my end. Facebook followed them closely by a tad over two weeks. Microsoft, they're being transferred right now. Now, those are the companies that accepted the transfers. Some companies like Twitter actually said that they
didn't want the domains. Here was the exact quote. We don't actually try to prevent bit flipping attacks by registering all the nearby domains due to the fact these attacks are relatively rare and that we own a lot of domains so this would be quite an undertaking. So we're not interested in acquiring the domains you have. Please just maintain possession of them until they expire, which they
have. The next company was Google, which after three weeks had a very similar response saying their domains team was not interested in purchasing them. Those 154 domains are now up for purchase. Now, the companies that did actually transfer them, some of them didn't actually do it right. Here's the WHOIS info for two of the domains. They're
still pointing the name servers to me even though they own the domain. So I'm still getting all the user traffic even though they own the domain. I'm not going to call out which companies they are because I just noticed this about ten minutes before the talk so I haven't given them a heads up. Now, on to the data release. I'm happy to
say I'm releasing complete logs in JSON format. So complete DNS logs for every single query I received. They contain the source, the port, the qname, qtype, pretty much all that. The info is self-explanatory on the site, along with anonymized web server logs which contain things like the user agent, the Accept-Language header, the
HTTP host they were intending for, the method. They do not contain the URL or the URL query and they contain a hash of the source address so you cannot tie it back to a particular user. In the same vein, I'm releasing anonymized SSL and SMTP logs which both contain the same hashed source IP. And finally, here's my contact info along with the
project website which contains all these data dumps. At this point, I think I have a very short period of time for some questions if anyone has any. Did anyone offer a bounty
or anything to help you cover the cost? No. And I wouldn't have accepted it if they did because it's already a gray area legally, and accepting money for it or making a profit off of it can push you a bit further. I did get some care packages from companies though. Question for you. Did you do anything
to look at double bit flips to get an estimate for just how frequently these happen? Is it one out of every billion queries, trillion queries, what order of magnitude? So you're referring to when two bit flips occur? Yeah, using the ratio between two bit flips and one bit flip, you could figure out the ratio between one bit flip and zero. I did not. However, if you look at the
data, in a lot of the queries I got where a bit flip had occurred, lots of other bit flips occurred in the actual query, so the URL would be malformed as well, indicating probably that those users have really messed up memory. So I don't have statistics on that. It would be very interesting to look into that. Any other questions? Does
enforcing name server bailiwick have any effect on your responses? Can you repeat the question? Does enforcing
name server bailiwick, does that have any impact on your responses? I do not know. Any other questions? How did you
figure out who to contact in these companies to get a quick response? How do you know who to e-mail in Google or Microsoft about this, just being someone on the outside? So all of these contacts out to the companies were sent to the security@ alias or to whatever information they had
listed if I just Googled the company name and security. A few of them did have to get pinged through personal connections to make sure they actually got handled. All right. Thank you very much.
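
For anyone who wants to act on the "buy your bit flips" advice from the remediation part of the talk, the candidate names can be enumerated mechanically. The following is a rough sketch under stated assumptions, not the speaker's own tooling: the function name `bitflips` is invented here, only the labels to the left of the final dot are flipped (the TLD is left alone), and each result is only checked for hostname syntax, not for whether it is registered or even registrable.

```python
import string

# Characters allowed in a hostname label (letters, digits, hyphen).
VALID = set(string.ascii_lowercase + string.digits + "-")


def bitflips(domain: str) -> list[str]:
    """Return hostname-valid names that differ from `domain` by one flipped bit."""
    name = domain.lower()
    head, _, tld = name.rpartition(".")  # assumption: never flip bits in the TLD
    found = set()
    for i, ch in enumerate(head):
        if ch == ".":
            continue  # keep the label structure intact
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            # Skip case-only changes (DNS is case-insensitive) and invalid characters.
            if flipped == ch or flipped not in VALID:
                continue
            candidate = head[:i] + flipped + head[i + 1:]
            # A label may not start or end with a hyphen.
            if any(l.startswith("-") or l.endswith("-") for l in candidate.split(".")):
                continue
            found.add(f"{candidate}.{tld}")
    return sorted(found)


for name in bitflips("gstatic.com"):
    print(name)
```

Flipping the lowest bit of the leading `g` (0x67) gives `f` (0x66), so `fstatic.com` is among the results. A defender could feed the output to a registrar search or to an RPZ for internal resolvers.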