
SITCH: Inexpensive, coordinated GSM anomaly detection


Formal Metadata

Title
SITCH: Inexpensive, coordinated GSM anomaly detection
Title of Series
Number of Parts
93
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
It's recently become easier and less expensive to create malicious GSM Base Transceiver Station (BTS) devices, capable of intercepting and recording phone and SMS traffic. Detection methods haven't evolved to be as fast and easy to implement. Wireless situational awareness has a number of challenges. Categorically, these challenges are usually classified under Time, Money, or a lot of both. Provisioning sensors takes time, and the fast stuff usually isn’t cheap. Iterative improvements compound the problem when you need to get software updates to multiple devices in the field. I’ll present a prototype platform for GSM anomaly detection (called SITCH) which uses cloud-delivered services to elegantly deploy, manage, and coordinate the information from many independent wireless telemetry sensors (IoT FTW). We’ll talk about options and trade-offs when selecting sensor hardware, securing your sensors, using cloud services for orchestrating firmware, and how to collect and make sense of the data you’ve amassed. Source code for the prototype will be released as well. The target audience for this lecture is the hacker/tinkerer type with strong systems and network experience. A very basic understanding of GSM networks is a plus, but not required. Bio: Ashmastaflash is a native of southeast Tennessee and a recent transplant to San Francisco. He entered the security domain through systems and network engineering, spent a number of years in network security tooling and integration, and currently works in R&D for CloudPassage.
Transcript: English (auto-generated)
Thanks for coming out. I'm Ashmastaflash, and today we're going to cover inexpensive, coordinated GSM anomaly detection. More specifically, by inexpensive I mean that the whole goal of the project was to come up with something far less expensive than producing a malicious device. Coordinated means centrally configured: you don't want to
have to pull SD cards on a whole bunch of remote sensors, reconfigure them, re-burn them, and get them back out into the field. So central configuration and software management was really important. And by anomaly detection, specifically what we mean is picking up rogue BTSs and IMSI catchers. So, let's jump in. A little about me. I
started actually getting paid for technology work around 2000, and I've hopped disciplines every few years, kind of changing focus, and now I'm working in R&D for a cloud workload security company. And I don't like talking about me, so that's where we're going to end that. Let's talk
about you. The audience I was writing this for has a background in systems and network engineering and some interest in GSM threat detection, but probably not a huge depth. I mean, if you've got it, then great, but it's not required. I'll give you the crib notes so we can make it through. And a tinfoil hat's certainly not required, but it's not
unwelcome, so go ahead and put it on now and let's party. So I said that I'm working in R&D now. I really love my job, and as such: this has nothing to do with my day job. So if you don't like this, or if you do something with this and get in trouble, I completely
disavow whatever it is that you do with this. And don't go talking to my boss about this, just come talk to me if you don't like it. So here's what we're going to cover. First off, why you should care. Then the current threat and detection landscape, the original project goals, two iterations of the sensor, and the service architecture, because
it's kind of a split architecture setup. Then future plans for the project, and that's where I kind of beg you for your help and pull requests. Then a nod and a hat tip to prior art, and Q&A. Why should you care? Because invasions of privacy are bad even when they're unnoticed. Yeah, that's true, and this all is
is kinda, kinda vague, so specifically what are we looking at? Um, what's the worst that could happen with a compromised cell phone conversation in your CFO's office? Uh, it could have a financial impact on your company. Um, in the right CFO's office, you could even be looking at something like, uh, insider trading or market manipulation with the right phone
conversation. So, um, these devices are so small and so easy to hide and so inexpensive, you know, can you really trust your ficus? Adjust your tinfoil hat. Um, and the second side of this is it's, uh, with an IMSI catcher, you can also determine, uh, whether or not a specific person is within a domicile. So, if, uh, with
one of these devices, someone could walk up outside of your house and they could get a listing of all IMSI numbers. Now, IMSI numbers are the ones that are burned into your SIM, in your phone, that's attached to your account. So, that identifies you as an individual. Uh, if you can take a listing of those from everybody inside of a house,
process of deductive reasoning, you can determine who is home. So, it's a little bit spooky and, uh, and it's not that expensive to carry off. Uh, the terminology, uh, baseline for the talk. Uh, software defined radio, uh, I had one of those in my pocket, but I gave it away. Uh, the, uh,
It's using software to perform your signal analysis, typically with a USB dongle that has a software-controlled tuner. In this case we're using RTL-SDR devices, the super cheap, $20 to $40 units. ARFCN: absolute radio frequency channel number. I
may just refer to that as channel number from here on out, given that this isn't a GSM in-depth talk, just because it's easier to wrap your head around. Think of it almost like a television channel. CGI, cell global ID, is a globally unique identifier for the BTS. It's comprised of a mobile country code, a
mobile network code, a location area code, and a cell ID. All of that comes from the BTS. And like I said earlier, the IMSI is what's burned into your SIM, and that's what identifies you as an individual. Here's a visual aid to help wrap your head around GSM addressing with regard to the cell global ID. Every mobile country code has a
number of subordinate mobile network codes; within each of those you have multiple location area codes, and within each of those you have multiple cell IDs. So let's talk about threat and detection. First, a drink of water.
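As a rough illustration of the addressing hierarchy just described, the four parts of a cell global ID can be modeled as a small record and grouped by country code, network code, and location area. This is only a sketch: the class and function names, the colon-separated string form, and the sample values are assumptions made for the example, not anything taken from the SITCH source.

```python
from typing import NamedTuple


class CellGlobalId(NamedTuple):
    """One cell global ID, split into the four parts named in the talk."""
    mcc: str      # mobile country code (kept as text so leading zeros survive)
    mnc: str      # mobile network code
    lac: int      # location area code
    cell_id: int  # cell ID within the location area

    def __str__(self):
        # Hypothetical display format; GSM itself does not mandate this string form.
        return f"{self.mcc}:{self.mnc}:{self.lac}:{self.cell_id}"


def build_hierarchy(cgis):
    """Nest cell IDs as {mcc: {mnc: {lac: [cell_id, ...]}}}."""
    tree = {}
    for cgi in cgis:
        by_network = tree.setdefault(cgi.mcc, {})
        by_area = by_network.setdefault(cgi.mnc, {})
        by_area.setdefault(cgi.lac, []).append(cgi.cell_id)
    return tree


if __name__ == "__main__":
    # Made-up sample cells, for illustration only.
    cells = [
        CellGlobalId("310", "410", 100, 1),
        CellGlobalId("310", "410", 100, 2),
        CellGlobalId("310", "260", 7, 9),
    ]
    for cell in cells:
        print(cell)
    print(build_hierarchy(cells))
```

Keeping the country and network codes as strings is a deliberate choice in this sketch, since a network code of "01" and one of "001" would otherwise collapse into the same integer.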
We'll go over malicious devices, how you know that these malicious devices are in play, and what's currently on the market to detect them. A hacked femtocell is a trusted part of the provider's network. We saw some really good talks at DEF CON 21 about, you know, hacking
them, both for use as an IDS and for some nefarious purposes. With a hacked femtocell, you can gather IMSIs and you can also record the phone calls and SMS traffic going across it. Your phone has no idea if it's good or evil; your phone
is just going to attempt to attach to it. And the evil BTS: evilsocket had a great blog post on how to build one very cheaply. With my ham hands for scale, this is the size of the SDR that's necessary to build one. You can kind of see how this could fit in your ficus,
right? And that's the largest device in the system. So with that, coupled with a Raspberry Pi 3, you can build an evil BTS and record phone and SMS traffic. Again, this is the same case as with the femtocell. Your phone doesn't know if it's good or evil,
it's just going to try and talk to it. That's a GSM thing. So, indicators of attack. How do you know when something weird is going on? ARFCN over threshold: remember, think of it like a TV channel. If a channel all of a sudden goes loud, over a threshold that you determine by a short period of observation, you can set a threshold alert for when it gets over that. ARFCN outside of
forecast: spoiler alert, the tool we're going to use has the Holt-Winters algorithm built in, so that you can have a confidence band over time. So if something that's typically quiet all of a sudden gets a little bit louder, it may not be a threat to you, but it may be something nearby. A channel
all of a sudden getting louder may indicate that someone's trying to broadcast on the same channel. Unrecognized cell global ID: there are databases you can download with the GPS coordinates and all the metadata for these cell global IDs. And it's useful for determining your
location. If you don't have a GPS chip, you can kind of make that determination based on where the tower is. Gratuitous BTS re-association: this is something you would determine by observing the behavior of a cell radio. If all of a sudden you have a stationary radio that starts associating to another BTS
or a bunch of other BTSs, that's unusual; for a stationary radio, you're not going to see a lot of that behavior. If you're walking around, it's supposed to happen like that with your cell phone, but if it's sitting in one place, you really shouldn't be hopping towers a whole heck of a lot. And if you have the GPS location of a tower by the cell global ID and the
BTS is broadcasting a cell global ID of something that should be in, say, Orlando, and that cell shows up in Vegas, either someone's absolutely awful at their job of configuring BTSs or it may be something malicious. So, current detection methods. Both Pwnie Express and Bastille
Networks have an offering of which this is a subset. Among open source options, FakeBTS is a really cool project. It served as the original inspiration for this. It's a collection of shell scripts that use Wireshark, Airprobe, and Kalibrate to determine whether or not you have malicious cells nearby.
The Android IMSI-Catcher Detector is software that you install on the phone itself, and it interacts with your cell phone's radio to determine if there's any sort of anomalous behavior. And FemtoCatcher is very close in function to the Android IMSI-Catcher Detector, but it's
specifically for catching femtocells, and it's really only effective for phones on Verizon Wireless's network. The original project goals: it's Vegas, I think it's okay to ask what you can get for $100. So that was
the goal: see if I can get the target price under $100 for the first iteration. I wanted a low footprint. For the raw materials I wanted it to be at least as small as this. And for functional targets, I wanted to pretty much use the indicators of attack as a metric of whether or not I would be successful detecting rogue
BTSs, and to centrally manage software and configuration. That was really important to me because I have really big hands and it is such a pain to actually get those microSD cards into the right slot in a Raspberry Pi. I've lost so many and gotten so frustrated having to crack the case back
open. Yeah, I didn't even want to screw around with that. I wanted to be able to drop this thing under a desk, up behind a ceiling tile, pretty much wherever you might find a malicious device, so that you could get good local coverage inexpensively and not have to touch it again. In the
process of this, I collected a lot of hardware. I had a Raspberry Pi 2, a logarithmic antenna, a couple of ODROIDs (a C1+ and an XU4), a galaxy of red and blue and green and orange LEDs, an Intel NUC, an Intel Edison, a GSM modem,
all this stuff. But when you get locked into a serious hardware collection, the tendency is to push it as far as you possibly can. So, that brings us to SITCH: Situational
Information from Telemetry and Correlated Heuristics. And I definitely started with the acronym before I came up with the words to match. So, this is the first iteration of the sensor. I had an RTL-SDR device, and I
wrote a wrapper in Python, using Kalibrate, to get its output into structured data. All of that feeds into the main process. It's also running gpsd to pull accurate GPS readings from a GPS dongle, and using Logstash Forwarder to forward scan logs. Since we have it in a structured format, it's pretty easy to
drop the file, and Logstash Forwarder picks that up and shoots it off to Logstash, Elasticsearch, all that good stuff in the cloud. And I was using a Python tool called graphitesend to send the time series measurements over an OpenVPN channel up to a Graphite instance for tracking. It was effective enough.
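To make the time series path concrete, here is a minimal sketch of what sending per-channel power readings to Graphite can look like using Carbon's plaintext format (one `path value timestamp` line per measurement, by default on TCP port 2003). It uses only the standard library rather than the graphitesend tool mentioned in the talk, and the metric naming scheme, sensor name, and readings are hypothetical.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one measurement as a Carbon plaintext line: 'path value timestamp'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def channel_metric_path(sensor_id, arfcn):
    # Hypothetical naming scheme; the talk does not say how metrics were named.
    return f"sitch.{sensor_id}.arfcn.{arfcn}.power"


def send_to_carbon(lines, host, port=2003, timeout=5.0):
    """Send pre-formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Made-up channel -> power readings standing in for one scan.
    scan = {128: 120345.2, 190: 98211.0}
    lines = [
        carbon_line(channel_metric_path("sensor01", channel), power, 1470000000)
        for channel, power in sorted(scan.items())
    ]
    # Printed rather than sent, so the sketch runs without a Graphite server.
    print("".join(lines), end="")
```

In the first sensor iteration this traffic went over its own OpenVPN tunnel, so a call like `send_to_carbon(lines, host)` would have been pointed at an address reachable only through that tunnel.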
I talked Verizon into sending me a femtocell to set up in my apartment. When you start it up, I mean, they never really consistently start at the same speed. Sometimes you'll be waiting for 40 minutes for it to get a GPS fix. But when it does go live, it's pretty plain to see. Honestly, this graph is a little bit smoothed
out. It's normally spikier than this. I went back in history in Graphite, and Graphite had already kind of smoothed things out for me. But it's very clear, very apparent when this stuff goes live, because it gets very loud and your phone attaches to it and then, ta-da, you're on a part of Verizon's trusted network. So, remember that slide earlier? Here it is in table form. So,
these are our functional targets. ARFCN over threshold is a big yes, as well as ARFCN outside of forecast. The tool that we're using, called Kalibrate, produces a list of nearby channels and gives you a power rating for each. It's typically used for
determining your clock offset, because SDR devices are notorious for being drifty. And the RTL-SDR devices are especially notorious, depending on temperature. So those are the sensors; you actually can't get away with running those things with the lids closed. They get way, way too hot. And the price was a hundred dollars. Like, it
was right at about a hundred dollars, not counting the case. I mean, the case was kind of necessary for the trip out here. But just the raw components you can get for about a hundred dollars. And it's pretty effective. The problem is, you're looking at about seven minutes' worth of resolution. It takes seven minutes to scan 850 MHz GSM using a Raspberry Pi 2. And
you can actually have a pretty important conversation in less than seven minutes. So, good first iteration. This actually happened after I submitted my CFP; I was able to kind of prove out what I was thinking. So this is like late April, and I was thinking, eh, I could kind
of roll with this, just ride on this, and maybe it'd be fine and cool. And then I started looking at the source code and I was really, really not happy with it. I was self-conscious. There were a few problems with this. So, what's wrong with Mark 1? Main was single-threaded. And
when you're pulling data from two separate devices, you can end up with some interesting situations if you've got to wait on your GPS to get a fix and then do your seven-minute scan of 850 MHz GSM. It's this sort of additive problem. You can really end up with some ridiculously long scan
times, especially if you're indoors trying to get a GPS fix. Another thing I didn't like is that there were two secure channels for delivering the information. It's inefficient; it's just more crap to manage. I really wanted to reduce those to one encrypted channel. So, now I'm going to start the demo. And I'm going to start it early in the presentation because it takes
a while; this stuff is kind of bandwidth dependent. I'll explain a little bit more about that. Check this out. Is this thing on? Alright. So, this has got an RTL-SDR device,
a GSM radio, a Raspberry Pi 2, and just some stuff to support that. And this thing's being provisioned from zero using the orchestration stuff I was talking about. There we go. Alright. So, while
we're waiting on this thing: there's a service that I'm using to orchestrate what I'll loosely call firmware, although maybe we'll have a discussion later on what firmware actually is. I'm going to call it firmware. It's just a bunch of Python code. But as for
what actually sits on the device, there's a service called Resin. Resin has built an image to put on your Raspberry Pi that runs Docker. I can't remember the version of Linux that it's based on, so I'm not going to promise something up here. But what it does is call home to the service and pull Docker images
of whatever you commit. So this is what your deployment pipeline looks like using SITCH and Resin. Your actual user effort is that you do a git commit of your code and a git push to Resin's repository. Everything below the orange bar is all managed
by Resin. If your build completes... and I'd like to mention that if you do not do unit testing, you are going to hate your life. You will pull your eyeballs out of your head, because it takes a few minutes. It's almost like, hey, here's Python, but now I have to compile it and wait and wait. So get good at unit testing, and then make that part of your commit. The commit
hook will run a Docker build on your code. If your build is successful, it accepts the commit and moves the image into Resin's registry. Then your device, which polls the Resin service about every minute, sees that there's a new container image, pulls it down, and restarts. And you
don't have to touch the thing to do software updates, which is really nice if you're sticking these things up in attics and all over the place. So, as far as service-side software goes: we've talked a lot about what's actually running on the sensor; now, what's running on the service side. Most people in here, if you're a
sysadmin, you're probably familiar with Logstash, Elasticsearch, and Kibana. It's a fantastic open source toolset, and it's super versatile. It's a part of this, as well as Carbon and Graphite for the time series database and for statistical calculation. And I'm using Tessera because,
as much as I love Graphite, Graphite's graphs are really not pretty. You need something to go on top of it. And graphite-beacon is probably the simplest tool I found for just measuring and looking for things outside of bounds in Graphite. It was so nice that somebody didn't over-engineer something. It's simple: you configure it, set it up, and fire it off. So that's what I chose. Vault is
a really cool tool from HashiCorp, and what it does is secret management. You can load certs and credentials in there, and we have the keys for accessing those loaded into environment variables on the device itself. So you can do your
credential rotation against Vault, and then you just bounce your whole application through the Resin user interface, and everything comes back up and gets its credentials. All those credentials are written on the sensor to a RAM disk, so that if somebody does jerk the power, it's at least a little more difficult. I know, with physical contact, all of your security should be
considered null, but at least it makes it a little bit more difficult to uncover your crypto material. Resin is the service that I use to manage the software, and Slack is where the notifications come out, because at least you can do it over IP and you're not relying on SMS when GSM may or may not be a friendly area. So on the
service architecture side, the first thing the information hits is the inbound information processor. In this case that's Logstash. For document
retention, everything's stored as structured data in Elasticsearch, and the web-based portal is kind of a combination of Kibana and Tessera. The time series database is Graphite, and analysis and alert generation right now are shared by that
other tool, graphite-beacon, and some stuff that's coming directly out of the sensor. The sensor is actually smart enough to do some alerting on its own; that stuff is caught by Logstash, which kicks it out straight to Slack. Like I said, the external alerting service is Slack, and
there's a user. So, the intelligence feed. If you're going to make a determination on the location of all of these GSM towers, you don't want to do your own site survey and then compile your own database; you really want to look and see if somebody else has already done that. The OpenCellID database is out
there and it's super useful. The only thing it didn't contain that I really wanted was the carrier name, which you can determine using the MCC and MNC parts of the cell global ID. So thank god for Twilio and their free pricing API, because you can just pull all of that down. The API key is free, and the way that this works is
all Docker, because once you start using Docker for something, you just want to use it for everything. So I have this Docker container that I can run as a job: it goes out, pulls down the OpenCellID database, merges that with the information from the Twilio pricing API, and writes the result out into files based on MCC. The reason it's sliced up is because that
database file is so huge that you want to have it broken up. Knowing the country that you're operating in, you should be able to determine the mobile country codes that you need to download. So it reduces the download size. But, truth in advertising: as much as I want this live demo to work, it is a lot of information, and it may or
may not be able to download everything in time. It probably wouldn't be the first time a live demo fell over at DEF CON. So, if you insist. So let's talk about the Mark 2
sensor, and the improvements I wanted to make before I showed anybody this ugly baby of mine. There's a component, the SIM808 collector,
that interacts with a GSM modem, in a way that's somewhat similar to the way the Android IMSI-Catcher Detector works by interacting with your phone's GSM components. Everything that you see in green is its own thread off of the main
process. That way you can concurrently run collections against your GSM modem as well as your RTL-SDR device, so that you don't have to wait seven minutes for one and then do the other; I just decided to forgo all of that. Everything that you see in blue is a first-in, first-out buffer, so all of this scan information goes into the enrichment buffer and the enrichment thread picks it up.
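The thread-and-buffer layout described here can be sketched with Python's standard `threading` and `queue` modules. This is an assumption-laden illustration, not the SITCH code: the collector functions just replay canned readings instead of talking to a modem or an RTL-SDR, and all of the names are invented for the example.

```python
import queue
import threading


def collector(name, readings, scan_queue):
    """Stand-in for a device collector thread: push each reading onto the FIFO."""
    for reading in readings:
        scan_queue.put({"source": name, "reading": reading})


def enricher(scan_queue, enriched, stop_marker):
    """Pull scans off the FIFO in arrival order until the stop marker arrives."""
    while True:
        item = scan_queue.get()
        if item is stop_marker:
            break
        item["enriched"] = True  # a real enricher would look the scan up in a database
        enriched.append(item)


def run_pipeline(sources):
    """Run one collector thread per source, all feeding a single enrichment thread."""
    scan_queue = queue.Queue()  # first-in, first-out buffer
    enriched = []
    stop_marker = object()
    worker = threading.Thread(target=enricher, args=(scan_queue, enriched, stop_marker))
    worker.start()
    collectors = [
        threading.Thread(target=collector, args=(name, readings, scan_queue))
        for name, readings in sources.items()
    ]
    for thread in collectors:
        thread.start()
    for thread in collectors:
        thread.join()
    scan_queue.put(stop_marker)  # every collector has finished; let the enricher drain and exit
    worker.join()
    return enriched


if __name__ == "__main__":
    # Canned readings: the fast source reports often, the slow one rarely.
    results = run_pipeline({"gsm_modem": [1, 2, 3], "rtl_sdr": [10]})
    for item in results:
        print(item)
```

Because each collector has its own thread, a slow source only delays its own readings; the enrichment thread keeps consuming whatever the faster source has already queued.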
The enrichment thread compares that against the enrichment database that you pull down based on the MCC file. The MCC files all get shoved up into AWS. It doesn't have to work like that; it'd be simple enough to retool it to work off of an HTTP server, but AWS was just easier, so that's what I did. And the emitter can emit
straight to scan logs, which are picked up by Logstash Forwarder, or you could point it at the Logstash server itself. I felt a lot more comfortable having it work with Logstash Forwarder, because Logstash Forwarder can run its own buffer if you end up with a loss of communication. It just seemed like the smarter thing to do, to not have everything just pipelined up in memory on one
of these small devices. And everything goes up to Logstash over that single channel, no longer using OpenVPN. Logstash has some great output plugins that you can use to take the structured information that's coming in and spit it right out to Graphite. So coalescing those two paths just
makes things seem simpler to me. So, this is kind of a block diagram of what goes on inside the sensor. For a Kalibrate scan, everything goes into the enricher thread. The enricher thread picks it up
from the queue, and it can fire alerts on its own, based on a threshold that you set in the environment variables in Resin. Resin is a service that manages it and pushes out environment variables for running your program, so you can set a device-specific threshold depending on where in the building it is, because you don't want to set the same threshold everywhere. I mean, that
wouldn't work. And it also sends individual events, or individual structures, for ARFCNs and metadata, and the original scan document containing your timestamp and all that other good stuff. It's a little more interesting when you start pulling from the SIM808 module, which is your GSM modem. The enricher thread gets it and does a comparison against the enrichment database, which is
kind of sizable, but it does do a little in-memory caching for a little while, just so you don't have to keep hitting disk for everything that comes through. And it can do alerts on changes in the primary cell global ID, and it can do alerts on the cell global ID not being
in range based on the geolocation that's coming down through the feed. What I kind of want to draw your attention to is that this calculation is actually happening on the Raspberry Pi. So the idea is that you should be able to stand this stuff up and have a fairly small compute overhead compared to some other services, because a lot of the compute's happening on the
device itself. So doing stuff like geospatial calculations, stuff like that, you don't have to do all of that stuff. Uh, something I failed to mention earlier: remember I said that there's about a seven-minute delay on getting results with an RTL-SDR device? When you throw one of those little GSM
devices into engineering mode, every few seconds you get a list of all of your nearby cells by preference according to the GSM. So the RTL-SDR device is more of an objective observation: I see these channels, here's the tower. But interrogating the GSM modem actually tells you
what it prefers. So the stuff that's a little more GSM-heavy, why do I prefer this tower over another, it takes care of all of that, and you can just query the GSM modem and ask it, what do you prefer the most? And you can tell when your primary changes, and you cut the resolution from around seven minutes down to just a few seconds. Woo! Uh, so this
is what you see in Slack after the thing gets started and gets warmed up. These alerts are for things like not being in the feed database, other stuff like that, and you also get alerts from
graphite-beacon when anomalies are detected, when things fall outside of the forecasted expectation for your time series measurements. So, here's where we return to the demo and see if these things are actually going to behave for us. So, I don't think I get a
drum roll up here, but I hope the anxiety is palpable. Woo hoo! Just truck this over there,
sorry, this is, somebody told me not to do it like this, and if I'd have had enough sense to listen. What? Alright. Yeah, thanks. Where did you go? Alright, so. Here in a minute,
you're going to have me some Jack Daniels and I'll know I just need to walk off stage. Let's try mirroring for the
win. Alright, can you see that? Okay, so it actually was able to download all of the feed database and everything. I'm going to take a drink. Live demo, y'all! I don't want that.
Alright, so, this is what it looks like in Resin. And with Resin, you can, um, actually, okay, truth in advertising. One of these I plugged in in the speaker's green room just because I was afraid that it wouldn't have enough time to download all of the things. And the one that
I plugged up a few minutes ago, let's see how far along it is. This one's called Misty Mountain, isn't that beautiful? Okay, so, yep, still downloading. Uh, depending on bandwidth, I mean, it can take a little while. The initial download, so you've got a couple of minutes at the
beginning, when you pop in the SD card, it reformats it to work right for Resin's operating system. And then it dials home to the Resin service and it starts pulling your Docker image down. This is actually a lot smaller. Originally, I tried doing this with GNU Radio and oh my god, that thing is a monster. So you start dealing
with image sizes over 2 gigs and Raspberry Pis struggle with it. It is my hope that someday I can get GNU Radio trimmed down enough, because I think that, and especially the GNU Radio GSM project, Piotr Krysik put that together. So if you're looking for something fun to play around with, I highly recommend
that. I was hoping to get that originally worked into this, but I think ARM's gonna have to get a little bit more powerful with the stuff that you can buy off the shelf before we'll actually be able to get GNU Radio working, at least the way that I need for this project. But check out gr-gsm if you have a minute. It's some awesome stuff. Now let's go back to the working sensor. Alright. So in
the startup process, you see downloading the application, installing the application, it pulls everything down from the feed, gets your secrets from Vault. Yeah, this one started up just fine. Oh, man, that's
great. Heh heh heh. Not, it's, it's not that I thought that it would really not work but superstition, you know, the same reason you don't do a, do a change window on a Friday afternoon. So to Sarah, let's see what that looks like. Let's see if we have anything for Defcon yet. And
we do have a little bit. So these are time series measurements and, honestly, it looks a little ugly. This is probably due to my configuration of Graphite. Let's find a resolution that looks decent. There we go. That's a
little bit better. And you can see the channels that are being tracked by ARFCN. And this is kind of the RX level; this is the measurement that's produced by the cell radio itself. Here's, yeah, this is the Holt-Winters anomaly stuff. This doesn't get really interesting until you actually have a measurement period by
which you can start to look at, uh, Holt-Winters is super cool. You could probably do this with standard deviation. But I was, oh, Holt-Winters is a bigger word, right? And it's free. It's already baked in. There's another buzzword if anybody's playing bingo. So Holt-Winters is super cool in that it takes seasonality into account. So if you, not that you would ever see this in
the real world, but if Monday afternoons are really hot, you know, then it'll accommodate for that. You just have to let it see a Monday afternoon or else you get that. So we're even tracking the affinity, which looks like it may have made a change. That's interesting, isn't it? So the cell made an
affinity change shortly after coming online, which is pretty cool, from ARFCN 180 to something higher than that. 238. Yeah, that's cool. Anyway, we're all kind of discovering this. Honestly, I was pretty afraid of
turning this thing on at DEF CON and blowing up the service by throwing so much information into it, but it's behaved surprisingly well. So let's see. And here's my Kibana server. Oh, no results yet. And that's because my time
range is crap. Alright. There's that. Let's trim that down a little bit to two hours. And there you can see
we started getting stuff coming through. And scans of all types. And this is all structured data, so if you wanted to build something on top of it just to interrogate Elasticsearch and pull these results out, go nuts. All this stuff is going to be released open source after the talk. So I hope that somebody out there enjoys this thing after how much time I spent doing it. So, let's
return to the demo and see if I can figure out this mirroring thing again. Alright, so summary of Mark 1
and Mark 2 functionality. So like we discussed earlier, ARFCN over threshold and outside of forecast worked great with the first one. It was just really slow to return results. Seven minutes: you've already told them your
stock trading tips and somebody else has them too. With the Mark 2, we hit all of our objectives. ARFCN over threshold and outside of forecast, of course, because we were still using Kalibrate. Unrecognized cell global ID, able to pick that up. Gratuitous BTS re-association, able to pick that up as well. BTS detected outside of range, you can do that as well.
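The three GSM-modem checks listed here (unrecognized cell global ID, change of primary cell, BTS outside of range) can be sketched roughly as follows. This is an illustration under assumptions, not the SITCH implementation: the feed layout, the `MCC:MNC:LAC:CellID` key format, the alert names, and the use of the haversine great-circle formula for the range check are all choices made for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def check_cell(cgi, feed, sensor_pos, max_km, last_primary=None):
    """Return a list of alert names for the cell the modem currently prefers.

    cgi          -- cell global ID string (hypothetical "MCC:MNC:LAC:CellID" format)
    feed         -- dict mapping cgi -> {"lat": ..., "lon": ...} (assumed layout)
    sensor_pos   -- (lat, lon) of the sensor
    max_km       -- per-device range threshold, e.g. read from an environment variable
    last_primary -- the previously preferred cgi, if any
    """
    alerts = []
    if last_primary is not None and cgi != last_primary:
        alerts.append("primary_changed")
    entry = feed.get(cgi)
    if entry is None:
        alerts.append("cgi_not_in_feed")
    elif haversine_km(sensor_pos[0], sensor_pos[1], entry["lat"], entry["lon"]) > max_km:
        alerts.append("bts_out_of_range")
    return alerts
```

All of this is cheap arithmetic and dictionary lookups, which fits the point made in the talk that these checks can run on the sensor itself rather than on a central service.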
And the price was 150 bucks. So, considering that this, if you buy this at list price, not the price, like they've got a great deal in the vendor booth, you should all go and buy one of these, these are so expensive and they've got a great deal running in the vendor booth. They didn't pay me to say that. But with one
of these, yeah, I think you're looking at maybe around 600, 650 dollars. With this, a Raspberry Pi, and a GSM radio, you can build an evil BTS. For about 150 bucks, you can build a sensor to detect when these things are around. So, the original goal of having something that was easy to
deploy, you know, I mean, you just pop the SD card in, make sure everything's plugged up, you ship it out, plug it up wherever and leave it alone, let it collect its stuff. And to have it less expensive than the, uh, call it a win. So, there's that. Uh,
going forward, this is what I'd kinda like to do with it. Automatic device detection. Something I shielded you guys from was all the environment variables you have to configure. Some of them you want to have to configure, you know, like, what is the key to retrieve all of my information from Vault, right? You don't want your certs just hanging out there, so
there's that. I'd like to do device and service heartbeats, because right now that's something that you just kinda have to infer, because you'll start getting Graphite alarms. I'd like to get more specific with that. GNU Radio, like I said earlier, I would love for
GNU Radio to be the core of this if I could figure out a way to make it run quickly and, honestly, to run it all on a Raspberry Pi, because running that sample rate and doing GSM processing is pretty intense. But if you do go pure SDR, then not only gr-gsm, but you can start playing around with ADS-B broadcasts from aircraft,
looking up FPV drones, all sorts of fun stuff. And maybe even running connectors for Ubertooth One and YARD Stick One, because those are kind of fun things to play around with, and if you never have to touch the thing except to install new hardware, why not, right? So here's a hat tip and thanks
to our prior art. DIY Cellular IDS, and Traffic Interception and Remote Mobile Cloning with a Compromised Femtocell. That served as the original inspiration that kind of got me thinking in this direction. Because you can get a femtocell for 250 bucks, or you can social engineer one out of Verizon for pretty, I've been with you
guys for so long. If that argument has never worked before, it worked for me. I've been your customer for so many years and I have crap reception in my little apartment. So 250 bucks or 600 bucks, it's pretty cheap to be able to do some positively evil things. And last year, DaKahuna and
SatanKlawz put on a great intro to SDR in the Wireless Village. It was a 101 track that I really enjoyed, and it kind of set me down the road of trying to figure this problem out. FakeBTS served as the original functional inspiration for this, kind of the interaction between Wireshark and Airprobe, and, unfortunately, it's a little too intense to run on
ARM, so that's why I kind of had to write my own little hacked-together thing. And How to Build Your Own Rogue GSM BTS for Fun and Profit. Simone Margaritelli, thank you. If evilsocket is here, I want to buy you a beer. If you're not, I owe you one. It was a really well-written blog post on how to simply set up an evil
BTS using one of these and a Raspberry Pi 3 and a battery pack and a GSM radio, and that helped me to kind of quantify the price and the footprint and ease of access for parts. So, thank you, evilsocket. That was a huge help for this talk and gave me a really
good, solid target to shoot for. Thanks, uh, GNU Radio, you know, as much as I wish it could have made it in here. It actually worked pretty well on the Intel NUC, but those things are kind of pricey. You are not gonna beat anybody on price using an Intel NUC. But GNU Radio runs pretty well on that for this purpose, and Piotr Krysik was really helpful, helped me to
get up to speed on the GSM stuff. And Kalibrate: Kalibrate's the core of this, and without Kalibrate, it really probably wouldn't work very well. So, hat tip and thanks to all of the prior art. And thanks to all these fools. John Minarek made a not small investment in test hardware. He was one of my
first beta testers. So, John, if you're out there, thanks a bunch. Maybe not. Gillis Jones, super helpful. Great advice. Christian Wright and Dave Doolin suffered through this thing. And there were a lot of silent contributors. They didn't necessarily want to be associated with a DEF CON talk, but I don't know why. I have
no problem with it. But I got a bunch of really useful information on GSM networks from some really helpful people in the background. Anyway, we can do Q&A now or we can take it off stage. Yep, I'm
gonna release it as soon as I get to a reasonably secure network. And on your DEF CON CD there's a whitepaper with the links in it. Alternatively, my handle is ashmastaflash. Check me out on Twitter, and I'll post there, and I may see if I can squeeze an email through Full Disclosure as well. Ashmastaflash. Two A's.
Alright, thanks a whole bunch everybody.