We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

CRYPTO AND PRIVACY VILLAGE - Geolocation and Homomorphic Encryption

00:00

Formale Metadaten

Titel
CRYPTO AND PRIVACY VILLAGE - Geolocation and Homomorphic Encryption
Serientitel
Anzahl der Teile
322
Autor
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
DatenmissbrauchKryptologieHomomorphismusChiffrierungBitChiffrierungMereologieHomomorphismusTouchscreenURLTwitter <Softwareplattform>
SoftwareCodierungNotebook-ComputerProjektive EbeneProzess <Informatik>CodierungComputerspielGruppenoperationHackerNotebook-ComputerOpen SourceVorlesung/Konferenz
DatenmissbrauchMultiplikationsoperatorURLResultanteFigurierte ZahlMapping <Computergraphik>BitServerComputeranimation
Konfiguration <Informatik>ServerDatenmissbrauchPunktRandwertDatensichtgerätClientRechenbuchQuick-SortClientPunktSchnittmengeServerMapping <Computergraphik>ProgrammbibliothekGoogolDifferenteInformationKonfiguration <Informatik>Reelle ZahlAggregatzustandFlächeninhaltRechenbuchSoftwareentwicklerComputerspielGüte der AnpassungApp <Programm>ElementargeometrieURL
HomomorphismusChiffrierungKonfiguration <Informatik>DatenmissbrauchKryptologieFigurierte ZahlZahlenbereichKryptologieWellenpaketRechenbuchMathematikerinMathematikMultiplikationsoperatorChiffrierungBitKonfiguration <Informatik>
HomomorphismusZahlenbereichE-MailKryptologieAdditionChiffrierungDifferenteAbfrageChiffrierungWort <Informatik>E-MailKryptologieMapping <Computergraphik>ResultanteZahlenbereichMultiplikationsoperatorHomomorphismusTermExpertensystemMathematikAdditionPhysikalisches SystemDienst <Informatik>VerknüpfungsgliedKette <Mathematik>SoftwareYouTubeKryptosystemMomentenproblemGüte der AnpassungNeuroinformatikGoogolZweiServerMAPRechenbuchMultiplikationCodierungProdukt <Mathematik>
GeradeKoordinatenServerChiffrierungBitEinfach zusammenhängender RaumHomomorphismusOrdnung <Mathematik>Computeranimation
AppletKryptosystemSkalarfeldHomomorphismusKryptologieGanze ZahlChiffrierungPhysikalisches SystemClientServerAdditionInverser LimesKryptosystemImplementierungOrdnung <Mathematik>Physikalisches SystemKryptologieBenutzerbeteiligungMultiplikationProgrammbibliothekRechenschieberZahlenbereich
MorphingAbfrageLambda-KalkülChiffrierungRepository <Informatik>ProgrammbibliothekSchlüsselverwaltungPublic-Key-KryptosystemOrtsoperatorCodierungDatumsgrenzeZahlenbereichMapping <Computergraphik>Natürliche ZahlURLPolstellePhysikalisches SystemDicke
ZahlenbereichCAN-BusSchlüsselverwaltungAbfrageZahlenbereichDigitalisierungTwitter <Softwareplattform>Inverser LimesServerKryptologieKryptosystemDifferenteMessage-PassingProgrammbibliothekPublic-Key-KryptosystemVorlesung/Konferenz
ZufallszahlenMathematikChiffrierungAbfrageRechteckRichtungQuellcodeServerDiagrammURLDifferenteCASE <Informatik>Rechter WinkelNegative ZahlComputeranimationVorlesung/Konferenz
Exogene VariableAbfrageObjektorientierte ProgrammierspracheFlächeninhaltExogene Variable
AbfrageServerServerInformationHackerDatenbankLoginVorlesung/Konferenz
Notebook-ComputerCodierungVerschlingungFehlermeldung
Kryptologie
Weg <Topologie>Mailing-ListeInformationPhysikalischer EffektElektronische PublikationDemo <Programm>Vorlesung/Konferenz
InformationsmanagementServerCAN-BusInverser LimesAbfrageClientKanal <Bildverarbeitung>Quick-SortDifferenz <Mathematik>AbstandServerProgrammbibliothekURLFehlermeldungPunktKryptosystemFastringImplementierungArithmetisches MittelAdditionZirkel <Instrument>
MereologieZahlenbereichServerBitMultiplikationsoperatorURLZahlenbereichTotal <Mathematik>Konfiguration <Informatik>Gemeinsamer SpeicherMAPPhysikalisches SystemQuadratzahlHash-AlgorithmusKardinalzahlElementargeometrieServerStrömungsrichtungTerm
Hash-AlgorithmusQuadratzahlServerClientBitMinkowski-MetrikRadiusURLHash-AlgorithmusElementargeometrie
Twitter <Softwareplattform>DreieckKryptologieAbstimmung <Frequenz>Shape <Informatik>RechenbuchClientServerShape <Informatik>DifferenteFächer <Mathematik>ProgrammbibliothekAbstimmung <Frequenz>Rechter WinkelMultiplikationsoperatorKartesische KoordinatenHomomorphismusVektorpotenzialChiffrierungDreieckRechteckTotal <Mathematik>RechenbuchPhysikalisches SystemTwitter <Softwareplattform>ZahlenbereichKryptosystemKryptologieSkalarfeldQuaderMixed RealityMathematikerinPunktgitterDifferenz <Mathematik>Quantisierung <Physik>AdditionElementargeometrieMultiplikation
Transkript: Englisch(automatisch erzeugt)
Please welcome our next speaker, Nick Dorian, on geolocation and homomorphic encryption. All right, good morning. Can everyone hear me correctly? Okay, awesome, awesome.
Okay, so today we're going to talk about geolocation and homomorphic encryption. My name is Nick Dorian. You can find me on Twitter, GitHub, maybe the Darknet, using that screen name. And this is my first DEFCON, so... Thank you, all of you, for being part of my first DEFCON.
Okay, so first, a little bit about me. I currently work at McKinsey. It kind of has this very traditional, old-fashioned vibe to it, but currently we use... Not to make a whole pitch, but currently we use React, Node, Docker, kind of modern stuff. Plus, if your grandparents have ever been like,
why don't you stop playing video games and get a real job? Your grandparents will know what you do now. But I like to think that I have this hacker ethos from my previous roles with One Laptop for Child, Code for America, kind of like open source groups. And I think, despite working on some very serious projects, kind of my more famous work is Fortran.io, if any of you have been there.
But anyway, today we're not going to talk about any of that. We're going to talk about this really interesting idea that goes to something we use every day, mapping and geolocation. And pretty much all the time you're doing this, probably you did this today already, you get a GPS signal down to your phone or device, you ask a server,
where's the nearest place that I'm trying to get to, and it comes back with a bunch of results. But also, if you think about it, it's a little bit creepy that, especially if you Google location history on, I can go back years and figure out what restaurant did I go to, what was I doing this day, that kind of thing.
So, yeah, so is anyone like a mapping geospatial developer person in real life? Okay, great. Okay, so if you've done this before, or even if you haven't, there's really two different ways that people go about doing this. The first one is make a client-side app, everything's client-side,
and to find places that are near you, or in that area, you have to download all the geodata down to the client, and then they make some sort of calculation based on a JavaScript spatial library. And there are two reasons that people, especially companies, don't like to do this.
One of them is that if it's a big data set, for example, let's say you were trying to figure out the precinct where I vote, that needs to be very accurate, needs to be very precise, and just downloading the data set for my state would be hundreds of megabytes. So that's not something we can do on the client side. And then for something like the Google example that I showed earlier,
if I'm looking up burrito places, Google doesn't want to share its whole data set of burrito places in my area. So instead, what most people do is the server option, this option B, where this client gives up its geolocation data, sends it over, and then the server decides what's good for that. So this works for all those different examples I mentioned before,
where there's a lot of data, you can't really download all the data, or it's proprietary information. The weak point is you have to give up your location. And it makes sense, because how are they supposed to tell where you are or what's near you if you haven't given them your latitude and longitude?
It's not possible. But what if there was a third option? So I'm not a mathematician or a cryptographer by training. I assume we do have a number of math and cryptography people here. But when I was reading about this a couple of years ago, I was surprised to hear about this thing, which is apparently a real thing, called homomorphic encryption.
And when I try to describe it to someone, it sounds a little bit like magic. But it's just advanced math that took people a long time to figure out, and I'm not prepared to fully understand. But essentially what it means is people can do calculations on numbers while they're encrypted without ever knowing the underlying value.
So there are two classical examples of why you might try to do that, which is what I have been hearing before I came into this mapping problem of it. So the first example is like a stock example. Like, say you're about to sell many thousands of dollars of stock,
and there are different fees and things that are going to be incurred by different people in your supply chain. But the moment you get the word out there, I'm going to sell $3,000 of this stock, other people are going to start speculating against you. Like, there's all these micro traders or high-frequency trading. So you don't want to leak that information while you're doing the calculation,
but you do want to know what the final calculation is. So the concept in that example is you might send the number encrypted, they do math on it encrypted, and then you decrypt the result on your side. The other example that I've seen, this is a commercial product, but I don't really remember whose it is,
is that if you think about it, Google has access to all the plain text of all of your e-mails, because when you do a search, Google needs to be able to search that text. And so there's this particular software or service where they encrypt all of your e-mails on the server, and you send kind of an encrypted search query into that
to search your e-mail without revealing what is your e-mail and what are you searching. So this is relatively new in terms of, like, math. The first, like, partially homomorphic system was described in the 1970s. So the reason it's called partially homomorphic is that it can do addition
or it can do multiplication, but it can't do everything in general. So it was only in 2009 that some researchers who now work at IBM published the first fully homomorphic encrypto system. So if you look at the code, it kind of goes down very far, down to, like, assembly of different logic gates and things like that.
So it's very intimidating, and we're not going to talk about that today. Instead, I used something called the Pilyr cryptosystem. Is there a cryptographer here who can correct my pronunciation of Pilyr cryptosystem? Oh, thank goodness. Okay. I get to be the expert. This is great.
Okay, I watched a YouTube video this morning, and they said Pilyr, so let's go with it. So why are we not using homomorphic encryption magic all the time? Because then everyone would have things. First thing is people are already pretty okay with sharing data. Like, you didn't feel uncomfortable telling Google that you were in Las Vegas.
And especially if you trust the business with their money. Like, the example I gave earlier of the stock market, like, makes sense, like, in terms of, like, an academic paper, but then it's like, wait a minute, why am I doing thousands of dollars of trades with a company that I don't even trust? Like, what if they get the result, and then they wait five seconds,
and then they broadcast it, but they've actually done some, like... So I'm not really... So it seems like a really interesting idea, but because of the practicalities, and then also because it's very expensive on a computation, speed, and time level, it hasn't really taken off as, like, a general purpose thing.
But the example we're going to talk about today, I think, is a little bit more possible or more relevant use of homomorphic encryption. And it's involving your geolocation. And the example that I'm going to do is sort of like a treasure hunt or geocaching kind of example.
So here you are on the road, and you're like, hey, I think that I've solved the puzzle at the end of the road. I think I've found it. But for whatever reason you're being watched, you don't want to give away your location. You're, like, being very secretive. So the server can't tell you where the finish line is, either, because otherwise you would say, hey, I'm in Las Vegas.
Am I winning? And they'd say, no, you're only winning if you're in Colorado. And then they'd be like, okay, great, now I'm going there. Right? So the server can't reveal the answer, and you, for whatever reason, don't want to share your coordinates. And then there's this other person here with the spy device that I'm saying, is watching your connection or something,
trying to see if someone's going to share where they are or where the finish line is. So in order to make this work, like I said before, I chose the Pilyr cryptosystem not because, it's partially homomorphic, so it can do addition and multiplication by other, like, non-encrypted numbers,
but it can't do something like x squared, multiplying a number by itself, just because of however it is architected. There are some other systems out there. While I was preparing these slides, I saw there's a web assembly library. The main reason behind me choosing the libraries I did was, I wanted there to be a server and a client
that could communicate using the same thing, and the bottleneck on that is finding, you know, some cryptographer who's made, like, a JavaScript implementation of this. So I found the Pilyr cryptosystem. This goes back to the 1990s or so, and there's a few different implementations out there. I was pretty happy with how it worked out here.
And like I mentioned, there's limits to what you can and can't do within these systems. So okay, I'm going to show you some code now. I hope there were, I don't, yeah. So here's, I'm going to show you some code now. So you're on the treasure hunt, and you've loaded this library. You create a key pair. So I know, I think a number of us have used PGP, GPG.
So if you really like public-private key pairs before, now you can have public-private key pairs for each one of your numbers that you use. So now you have a lot more keys. In this example, I generated a key with only 128 length, but I've since updated the GitHub repo.
Now it's 1024 length. You can pretty much set it to any size. It just takes a little longer for it to generate your public-private key. So you're at this location, you know where you are, just with latitude and longitude, and you generate a key, and then you want to send up the latitude and longitude encrypted.
Now, for whatever reason, the JavaScript library I was using only accepts them as positive integers. I think I asked earlier, and there's some mapping people here, but not a lot, and generally the way things we do latitude and longitude these days is we have negative 90 be the south pole, positive 90 north pole,
negative 180 be the international dateline going west, and then wrap around to plus 180. So I had to make everything into positive integers, so I have like a little ackee thing there, and then encrypt them. So that gets set up to the server,
and the latitude and longitude are now 617-digit numbers. Like, it looks just like a number, but it's actually an encrypted, kind of like when you send a PGP message, there's a blob of crypto text, and you're like, what is that? This is kind of a number that, in the pillier crypto system, represents your latitude or your longitude.
And, in fact, I assume some of you have done JavaScript trying to work with the Twitter API, and already the number of tweets is above the limit of a JavaScript number. So if you deal with a 617-digit number, there's a special JavaScript library called BigNumber
that helps you deal with these crazy big numbers. And another thing is there's some kind of nonce or something in the system so that the server can't just guess your latitude and longitude and generate the same number with the same public key. Instead, they're just going to keep generating lots of different numbers
and not be able to find out if this is bigger or smaller than it. So, in this case, I'm going to kind of spoil the source code for you. In my example, the winning location is Colorado because it's a rectangle, and I figured it would be a good, simple example.
So if you're inside Colorado, I should be able to measure your offset to the north, south, east, and west, and two of them should be positive and two of them should be negative, right? If you're in the center, kind of this rectangle diagram here is what I'm talking about. You should have north be positive, south be negative, and so on.
But then I thought, wait a minute, what if I send my location to the server and it calculates the difference from its target location? It's basically telling me how far away I am from the target. So I also multiply each one of them by a different random offset.
So you might know which direction you're in relative to the rectangle, but you don't know how far away you are from it because it's randomized all of them. So the only one who can actually decrypt and parse the response is the initial user.
So they get it back, they find out that they're still south of the target area, but they don't have to reveal that to anyone else. And even if our hacker friends go over and look at the server, even if they get inside the server and look at the logs, the database,
see any kind of information that the server ever had, they're not able to decrypt the original latitude and longitude. So I tried to make a Jupyter notebook for all of you, but when my code hit Jupyter, it created kind of a catastrophic error. This is like a 90s astronomy joke I thought of this morning.
Did anyone? Okay. Anyway, so it didn't do well on Jupyter. If someone wants to look at my code, I have a bitly link and a long link. If someone wants to see that. But you might say, wait a minute, Nick, in the first example, you're just showing places that were near me. I don't want to do a treasure hunt. I don't want to do that.
I want to find places that are near you. And this is something I'm really interested in finding, a really high-tech crypto way to handle it, except Chipotle isn't really a good example. So I actually last year met with someone who's in the Syrian American Medical Society, and they were saying that there are these hospitals
that they have to keep track of and send information and supplies back and forth, but they don't want there to be a list anywhere where you can just download a KML file and cause problems for people, because this is very high risk. And I thought because there's a risk that they're being targeted by Russia,
it's probably better if this list is never created. And also because this is just an experiment, I didn't want to dive into it. But as like a demo, I'm like, this is kind of something we should try to aspire to. Is there a way that people can keep track of their resources and know what's the place closest to them without revealing everything?
So I can't use the Pilear crypto system anymore because it only uses addition and subtraction. There's actually a really funny error if you try to multiply. It's a good luck with that error. Just a nice implementation by whoever wrote that library.
But it led me to think about this problem of like, if the server can calculate the distance and sort the closest one to the furthest one for you, they could do that for themselves as well. They could create millions and billions of points all around the world and find out which point is closest to you and say, I know where you are. And if the server can't sort by distance,
how is it going to do anything other than send you all the data and all the distances? And if you've ever spent some time with like a protractor and a compass, you know that if you get the distance from a few different points, you can triangulate where they are. So this would actually be kind of interesting cryptographically,
but practically if someone sends you three requests, like, oh, what's near me, what's near me, what's near me, now I know all of your locations. So instead there's something called a geohash. Has anyone seen a geohash before? It's been on XKCD a couple times. So the idea behind a geohash is there's 26 letters and 10 numerals,
so a total of 36 options. And what you do is you divide the world up into a six by six grid, and then you divide that square up into a six by six grid, on and on until you have the level of precision that you want. So where we are right now in Las Vegas is this geohash.
It's something you can kind of share kind of quickly and copy back and forth, and if you want it to be less precise, you just take letters and numbers off. So I'm interested in maybe the right way of implementing the system is for you to tell the server, this is my current geohash, can you give me the locations near me organized by a geohash
that's a little bit longer than my current location, if that makes some sense. So you're telling them, this is how much I'm willing to share in terms of location. How much further are you willing to go on your locations when you share them with me? And then I thought about it a little bit more and I realized there's a problem.
If I'm in one of those geohash squares somewhere like in the center, then I just send that one geohash square and I'm fine. But if I'm at the corner, I have to send, and my search radius is big enough to go into the neighboring grids, then I have to ask for four grids, and then it will be very obvious to whoever is listening in
that I'm somewhere near that corner if I'm asking for all of them. So instead, I think probably the right way to implement this would be for the client to suggest a random offset. So sometimes it's going to be one or two or three or four grid squares,
but it's meaningless because they've just randomly offset that location. So it's not clear to the server why they've chosen that offset or where they are within that space. So future ideas. Right now I just told you I had the example of Colorado,
which is just a rectangle. I think there's a potential for other shapes. Just even using addition, subtraction, and scalar multiplication, you could easily do a y equals mx plus db to make a triangle, right? So a mix of triangles and boxes to make up larger, more interesting shapes.
There's some different JavaScript libraries that I mentioned that could support this and other geo-calculations. I know there's a talk later today about post-quantum encryption. Do we have any post-quantum crypto fans here? So one thing that that is using is using lattice-based encryption,
which for reasons I don't fully understand, again, not a mathematician, lattice-based encryption is really good for homomorphic applications like this. So as that gets more popular and more widespread, these homomorphic applications could also benefit from that. There's also something, if you do research into homomorphic encryption,
there's like a voting system. Has anyone seen like a homomorphic voting system thing before? So like we said before, you're able to do calculations on numbers you can't see. So there's a crypto system. Okay, yeah, I'm running out of time. But yeah, there's a crypto system where you can add up the votes totals
without knowing people's votes. Okay, I'm sorry I ran out of time. Do we have time for questions or that's like it? Okay, that's it. But I still am alive as a person. I'm here, and you can email me on Twitter or GitHub. You can figure out where I am. Okay, thank you so much.