
RECON VILLAGE - Targeted User Analytics and Human Honeypots


Formal Metadata

Title
RECON VILLAGE - Targeted User Analytics and Human Honeypots
Subtitle
Predictive user targeting
Series title
Number of parts
322
Author
License
CC Attribution 3.0 Unported:
You may use, adapt, copy, distribute and make publicly available the work or its content, in unchanged or adapted form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language

Content Metadata

Subject area
Genre
Abstract
Many significant breaches have resulted from adversaries knowing who to target, how to target them and where to target them. Most corporations are not effectively using the largest collection of targeting data that is available on the public internet and fail to build and refine data-driven threat models using the information that our adversaries are using against us. Targeted User Analytics and Human Honeypots is a research project I am working on to identify and model targeting methods with the hope of tipping the scales in our favor to defend our networks, users and critical systems. LinkedIn is the largest collection of Business Social Networking data available to “unauthenticated” persons on the public internet. With the right techniques this data can be mined to identify and enrich targets. The purpose of my talk is to present targeting techniques through a use case and to demonstrate the value of other enrichment methods involving data sets that are widely available or collected from corporate security tools. The end result is analytics that predict who will be targeted and why they are more likely to be compromised if they are targeted. This will allow for proactive action to be taken to defend users and our assets.
Transcript: English (automatically generated)
The next session is a lightning talk by M. Bison. The title is up there on the screen, it's Targeted User Analytics and Human Honeypots. It's a research project, and because we're tight on time with this one, I will hand immediately over and let you introduce yourself. Yeah, I've got to take this shot. I'm not one much free-air, but here it goes.
So my talk is Targeted User Analytics. I think you will find it interesting. I've been busy trying to explore targeting processes that adversaries use,
both APT and the less sophisticated adversaries, and how social media is leveraged. And that was really the key driver of this. A little bit about me: I'm a beekeeper, as you will see. I have a few strange hobbies. This is one of them. And M. Bison is Rich Wickersham. You can follow me on Twitter, that's who I am.
I didn't want to take a bunch of attention prior to the talk, so that's why I waited until this time. I've been in the field for almost 20 years. I've been conducting OSINT since before we really had a name for it, I guess. I've got a strong background up and down the stack.
In security, I'm busy in cloud, chasing the misconfigurations that have kept us all gainfully employed currently. I really get a lot of enjoyment out of deconstructing adversary attack methods, so, you know, I have my things that I like to do. So, jumping into my talk, my hypothesis is that our adversaries are using LinkedIn,
we know this, to target our users. Data-driven targeting models can be built, and I built one. Targets can be externally enriched using breach data and other open-source methods, and there is internal enrichment data that is only available to us to defend,
and this is what we need to do. So, I had to put a shirtless Vladimir Putin in here. This was for my amusement, and hopefully you find it funny, but users are still the weakest link, but it's only certain users, it's not everyone. It's those that have a large public footprint,
those of us, and all of us in the room that have a large breach data set, this is highly problematic, and also the behaviors that are demonstrated by those users. I call them habitual clickers, but you need to know who they are in your organization. If they map to your models of systems you're trying to protect,
you need to work against that, and the most important thing is the adversaries have every method and every type of data available to them, and as defenders, we can't use it all. This is a problem. So, targeted user analytics, the manual workflow, this is the thing that my whole deck tracks to.
This is where I started this project, and this is what you can refer to after the fact to basically build your own project along the lines of what I did. So, I kicked this, this was in my backup slides, but I wanted to kick it up to the front after chatting with a few people,
and just to show the due diligence that was done in terms of the preconditions and legal pieces and what you can do, and the fact that I was unauthenticated when I did any work against LinkedIn. I thought that was a key thing, and we had a legal precedent for that,
and also the way I leveraged breach data was a pull from Have I Been Pwned. Hopefully everybody's doing the same thing and assigning ones or zeros to data sets that are available based on how I rated it. So, the use case for generating my data was SWIFT. I think that that's the largest money-moving system
in the world. I've been fortunate enough to have built out a SWIFT customer environment, and it was right at the time that the Bangladesh and Vietnam incidents became public, so I got to look deep at those. I had a great headwind. I was able to really build a secure environment.
One of the things I ask when I build that environment or when I do anything is who has access to the system, and more importantly, who do the adversaries know that has access to my SWIFT system? So, if they know who to target, we've got a problem. So, I went ahead and I footprinted my company, and the results were shitty, basically.
So, the next question I asked was, well, how do we look compared to the rest of industry? So, I did seven more, and the results were bad across the board, and at the same time as this, I was working on a graduate paper at UVA, and I needed a topic, so I was like, well, why don't I build
a conditional probability model for this? And that's what I did. So, a couple things about SWIFT. SWIFT is a secure platform. It's pretty well-built. There's been a lot of work done against it, but the customer implementations of SWIFT are not always good. We know this, and when you have
a weak customer environment, it opens you to two attack vectors, one being platform users that access Alliance Access through a web interface, for example, and the other being infrastructure administrators. If you've got root to the platform, then you can do some serious damage, and both user types have been targeted by adversaries,
and we've seen this, and of course, I think they will be more successful in companies with a lower security budget. The $50 firewalls are not gonna cut it, and SWIFT realized this as well, so they built their customer security program, and that was a good move, I think.
So, the seed files that I built are the keyword files, so I used the same methodology that a good recruiter would use. I have a couple of friends that are recruiters. I have people trying to recruit me. I sort of get a feel for what they're doing and how they're doing it,
and in the case of this seed file, you wanted to identify a target with access to SWIFT. A higher number of matches indicates a target that moves large amounts of money, so that's the target acquisition that we're looking for, and these are the seeds. So, utilizing the seeds actually creates more seeds, which is interesting.
This is a simple query you can run in Google. I've been told it's Google dorking, which I thought was Google-fu. That was funny, and this is due to my lack of automation in building something simple here. So, you can select a Fortune 500 or Fortune 100 and run this search right now and see what you find, and in a base model, five matches out of 32
would be a successful basic match. I've put weights on all my variables. I'd suggest you do the same depending on how you're building this approach out, and again, you can play with advanced search too. You sort of learn some things when you're doing this. Fewer variables are friendlier in terms of returning results.
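The seed matching just described can be sketched in a few lines. This is a minimal illustration, not the speaker's actual tooling: the seed keywords, the weights, the function names and the sample profile are all hypothetical, and only the "five matches" threshold comes from the talk. Matching here is a plain substring test, so a short keyword can also hit inside a longer word.

```python
from typing import Dict, Tuple


def seed_match_score(profile_text: str, seed_weights: Dict[str, float]) -> Tuple[int, float]:
    """Return (number of seed keywords found, sum of their weights)."""
    text = profile_text.lower()
    # Case-insensitive substring match of each seed keyword against the profile.
    matched = [kw for kw in seed_weights if kw.lower() in text]
    return len(matched), sum(seed_weights[kw] for kw in matched)


def is_basic_match(match_count: int, threshold: int = 5) -> bool:
    """Base rule from the talk: five or more seed matches counts as a match."""
    return match_count >= threshold


# Hypothetical seed file (keyword -> weight); the talk's seed file has 32 entries.
SEEDS = {
    "swift": 3.0,
    "cross-border payments": 2.0,
    "foreign exchange": 2.0,
    "cash management": 1.5,
    "treasury": 1.0,
    "wire transfer": 1.0,
}

# Hypothetical public profile text.
profile = "Treasury analyst. Skills: SWIFT, cash management, foreign exchange, wire transfer."

count, weight = seed_match_score(profile, SEEDS)
print(count, weight, is_basic_match(count))
```

Keeping the raw count and the weighted sum separate lets the count drive the simple threshold while the weights rank the profiles that pass it.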
If you feed too much into a search engine, it will give you crummy results. Try two or three terms out of your seed. So, what you can expect to harvest from LinkedIn. Relevant skill and probability match percentage achieves an initial targeting accuracy score. Geolocation, where the SWIFT infrastructure is located.
This is important because you will know where the treasury team is, and also, you can use this same data to de-anonymize targets. So, the people that had good OPSEC and said they worked in the financial services industry, you can line up their skills and you can figure out what company they work for. So, geo is also important.
Another variable is where you move money to. So, that's indicated by skill type, which is cross-border payments, forex, foreign exchange, things like that that are in the seed. You wanna know, or an adversary would wanna know, how do I get the money out? How do I get it out to the point where I can take it or cash it out in a casino somewhere?
We're in a casino, so that's a good analogy. And that happened. So, org chart mapping of relationships to targets. I've assigned a value. I've got five values based on title and that allows you to build a phishing use case
or a lateral movement use case or a targeting use case. So, external enrichment. Once you've identified or acquired your targets, more OSINT. A Google search is really effective with a unique name. A unique name and your geographic location is super effective in targeting
and that's, again, highly problematic. The primary key in finding breach information about somebody is their private email address, of course. Your company email address is not useful, but the Yahoo address you had, the Hotmail address you had, there's a lot of pwned data sitting behind that.
So, you find that and then you run that against Have I Been Pwned for a one or zero value and you create your data set based on that. So, that'll help to tell you what the bad guys have about your users and I'm listing some of those sample data fields that are really easy to find, again, with Google search.
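The one-or-zero enrichment step can be sketched as follows. This is a minimal illustration that assumes the breach lookups have already been done and saved (for example from a breach-notification service); it makes no network calls, and the function name, addresses and breach names are hypothetical.

```python
from typing import Dict, Iterable, List


def breach_flags(emails: Iterable[str], breaches_by_email: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each address to 1 if it appears in at least one breach, else 0."""
    # Normalise case so "Alice@..." and "alice@..." are treated as the same address.
    lookup = {addr.lower(): names for addr, names in breaches_by_email.items()}
    return {email: int(bool(lookup.get(email.lower()))) for email in emails}


# Hypothetical saved lookup results: address -> list of breach names.
RESULTS = {
    "alice@example.com": ["BreachA", "BreachB"],
    "bob@example.com": [],
}

flags = breach_flags(
    ["Alice@example.com", "bob@example.com", "carol@example.com"],
    RESULTS,
)
print(flags)
```

Storing only the flag, rather than the breach contents, keeps the defender's data set small and avoids handling the leaked data itself.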
Google is the can opener of the internet. So, internal enrichment, this is the most important thing for me and for us, I believe. These are the defender actions. This is what we need to do. A great metric to have is to understand your total user base and then what percentage of that user base
were you able to discover. That's a metric you can start with. Identifying the habitual clickers. You need to know who they are. You need to start feeding that information into your phishing tests. You need to correlate this data
with your targeted user analysis and lots of phishing simulations need to be done. Correlating targeted users with past breaches. You should do this. You may find something. You may connect the dots. Bad OPSEC equals more breaches. Email and spam phishing metrics.
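The two starting metrics mentioned here, the share of the user base that could be discovered externally and the overlap between discovered users and habitual clickers, reduce to simple arithmetic and a set intersection. A minimal sketch, with hypothetical function names, user IDs and counts:

```python
from typing import Set


def discovery_rate(total_users: int, discovered: Set[str]) -> float:
    """Percentage of the total user base that was discoverable from outside."""
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    return 100.0 * len(discovered) / total_users


def priority_users(discovered: Set[str], habitual_clickers: Set[str]) -> Set[str]:
    """Users who are both externally discoverable and known repeat clickers."""
    return discovered & habitual_clickers


# Hypothetical data: IDs found through external harvesting, and IDs that
# repeatedly clicked in internal phishing simulations.
discovered = {"u1", "u2", "u3", "u4"}
clickers = {"u2", "u4", "u9"}

print(discovery_rate(400, discovered))
print(sorted(priority_users(discovered, clickers)))
```

The intersection is the group worth feeding into the next round of phishing tests first.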
The bad guys aren't gonna burn a zero day. They're gonna take the simple approach. You can look for trends. You can look for the commodity approaches that will be taken against you. Also, targeted users are the canaries in the coal mine. They're where we wanna spend our effort because they're who the adversaries know about. They're who's going to be targeted first.
They have data sets to use against these folks. Let's eliminate the password from the equation. Of course, MFA, I'll keep saying this repeatedly in my deck, and examine the password reset criteria and social engineering risks. So human honeypots and operationalizing defenses.
We have the same harvest data as our adversaries. We have the same enrichment data. We know who the targets are and we've separated the weakest links in our companies. So let's make it easy for the bad guys to discover these folks. And I asked myself, what would Arnold do? Let's terminate their attack strategy.
Let's make it easy for the bad guys to discover the targeted users. Let's create honeycreds, honeygroups, honeypots, infrastructure that's easily enumerated. Let's make our jobs easier. Let's add pwned passwords to password history. You might have to dump the hashes and match them because we can't take that pwned password.
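Matching dumped hashes against a pwned-password list comes down to a set lookup. A minimal sketch, assuming both sides are hex digests produced by the same hash algorithm; SHA-1 is used here only to generate sample data, and a real directory dump would typically be in a different hash format. The function names and account names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List


def accounts_with_pwned_hashes(account_hashes: Dict[str, str], pwned_hashes: Iterable[str]) -> List[str]:
    """Return the accounts whose password hash appears in the pwned-hash list."""
    # Hex digests are compared case-insensitively by upper-casing both sides.
    pwned = {h.strip().upper() for h in pwned_hashes}
    return sorted(acct for acct, h in account_hashes.items() if h.strip().upper() in pwned)


def sha1_hex(password: str) -> str:
    """Helper used only to build the sample data below."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


# Hypothetical dumped hashes (account -> hash) and a hypothetical pwned list.
accounts = {
    "svc_backup": sha1_hex("Password1"),
    "jdoe": sha1_hex("correct horse battery staple"),
}
pwned_list = [sha1_hex("Password1").upper(), sha1_hex("123456")]

print(accounts_with_pwned_hashes(accounts, pwned_list))
```

Any account returned here is a candidate for a forced reset, and its hash a candidate for the password history list.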
Let's enforce multi-factor auth. We've got to do that. Let's disrupt them. Let's feed this data into our targeting phishing platforms. Let's provide more OPSEC training. I'll get into that. I've got a training through shaming slide in here too. Let's monitor the targets. User behavioral analysis is a new area
that I think can be folded into this. And let's create fake LinkedIn personas and target skills. Do this at your own peril, but it will work. So simple correlation and data to model. I've covered most of these and I want to blast through the deck so you get the whole thing
and get into the data harvest. So in my harvest, and I had to redo a lot of this because this project started really 18, 19 months ago. 223 total users were harvested with a high probability of access to SWIFT. Seven Fortune 100 companies.
32 users is the average enumerated size of the treasury and cash management team per company. Users in four companies leak partners and it was as bad as some users saying, I have this set of skills and I move money to these places. I couldn't believe it when I looked at it, but I found it multiple times, so it became believable.
Domestic and international leakage, that's problematic. You know, when you map that against the attack path of, like, the Lazarus team or Lazarus Group. Members of anti-fraud teams were easily enumerated at three target companies. That's bad because the bad guys like to watch the watchers to make sure they haven't been caught.
And we can't have that. The physical location of the cash management teams, I've gone through that. And again, I was able to de-anonymize those with financial services listed in profiles. One potential security officer role was leaked in a profile. That's a very high level of access.
So I'm hoping that was a fake profile. And other interesting points that I like to point out, and this is relevant to all targeting models. If somebody does something very specific, they're gonna be doing it in the next job and the job after that. This data is valuable for years.
Almost two years ago, some of the people I found, they switched jobs. They've cleaned up some of their OPSEC, but it's easy to connect the dots. And you've gotta think about the adversary space knowing these people and having already connected the dots. So this is my shame slide. All right, I'm not going to run a live demo.
That looks like a career limiting move. So I think it might be the sort of thing that you wanna do and maybe expose your users to and then explain this type of attack model to them. It might be a teachable moment. You won't correct the past, which is already going to haunt them and us,
but you might be able to correct going forward. So seed-based harvest, people with, I believe, a high probability of access to the target system per company. These are the ones anonymized. Again, you wanna anonymize this type of data. And then my average seed match per user,
some were better than others. And this is pretty fun when you look at it this way. Geo data via LinkedIn, locations, 11 locations in total of treasury teams. So I was able to find that in every single company. That was relatively easy. Thank you, Google. Partner location and forex skill.
So somebody that's leaking partners and gives you a skill that they might move money outside of the country. We got plenty of that. Added enrichment, I did just a few of these just to prove that it worked with Have I Been Pwned. And I ran out of time because I had family vacation and I was getting yelled at
for spending too much time on the laptop on the sunny days by the beach. That's what you do if you're a researcher. So I basically proved the model out with a few. So operationalizing the data. As I told you, I've assigned a value based on role and company. And I wanted to show how the spear phishing works and I wanted to be sure to use the Hamburglar
because I find him amusing. So anyway, this is showing enumeration, targeted spear phishing. And basically you can generate this whole model based on the data that I got. The other thing was operationalizing the geo data and I wanted to put something simple together
showing the relationships between the companies. An action that I've taken from yesterday that I need to do is obviously start learning Maltego and saving myself hours and hours of needless work. But it was still fun because I had the Hamburglar. So observations based on data and industry predictions.
Users have poor OPSEC. We can't do anything about that. We can correct to a point, but that's all we can do. I really wish LinkedIn had the damage that Facebook had because they're not forcing security reviews on people to turn on those new things that have been turned on within the last two years.
So I noticed in my data set that the profiles were leaking way more geo data 18 months ago than they are now. So there've been some things that were tweaked at the LinkedIn level, but not at the user level. People are still leaky in terms of writing things themselves in there.
So I think as cryptocurrencies lose value, financial institutions will be more aggressively targeted. Sure thing is always better than a volatile thing. And APT adversaries with an interest in disrupting the global financial system probably have collections using an approach that's way more sophisticated than what I put together
with a simple search engine. And low capability actors are also using this process for W2 phishing. I picked SWIFT over W2 phishing because when I started looking at W2 use cases against this, I was like, I can't even put this up here. I thought it would have been a dangerous thing to do.
And that, but this certainly does work. And I think this is a primary vector of attack and targeted phishing for W2s. So, and you can guarantee that those are low. There's a lot of low dollar environments with weak security protections there. So that's something we've got to work to.
So this is again, the basics. We need to predict who will be targeted, whether or not they're being targeted and the likely success rates if the user is part of the attack chain. Breach data is going to continue to grow. There's nothing we can do about it. We've got to model against it in order to be effective.
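One way to combine "who will be targeted" with "likely success rate" is to multiply the two estimates and rank users by the product. This is a sketch under stated assumptions, not the model from the talk: it treats the probability of being targeted and the probability of clicking when targeted as separate per-user estimates, and the function name and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_by_expected_risk(p_targeted: Dict[str, float], p_click: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank users by P(targeted) * P(click | targeted), highest first."""
    # Users with no click history are given a click probability of 0.0 here;
    # a more cautious default may be appropriate in practice.
    scores = {user: p * p_click.get(user, 0.0) for user, p in p_targeted.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical estimates: targeting probability from the external footprint,
# click probability from internal phishing-simulation results.
p_targeted = {"u1": 0.8, "u2": 0.5, "u3": 0.2}
p_click = {"u1": 0.25, "u2": 0.5, "u3": 0.5}

for user, score in rank_by_expected_risk(p_targeted, p_click):
    print(user, round(score, 3))
```

In this sample the most exposed user (u1) is not the highest risk, because a moderately exposed habitual clicker (u2) outranks them.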
We don't have the same weapons going into the fight as the adversaries, which is a huge problem. And I don't know what we can do about that, aside from break the rules, but we would not want to do that. So again, where do you focus? You focus on an achievable objective.
Leadership, system administrators, financial systems and programmers. What's important to protect your company. Don't try and do your whole company. You'll kill yourselves unless we automate some of this. So that's just that. I got one minute, so I did it about right. So credit given my family,
I ate up a ton of their time this summer. Troy Hunt, because we could do one and zero pointers as opposed to trying to take your breach data and look at it. He's done a huge, huge job of making us able to perform our jobs and the folks that reviewed my presentation. I've got backup slides that are gonna be very effective
for you to go back to your blast room real fast. Why is the job seeker the most likely person to click the phishing email? I thought this was an important precondition. I've written a graduate paper for this. I'll eventually publish it out. Mike, it'll be important to you. The arguments against, I mean, I don't really think that any of them are valid
unless you don't have data that's worth protecting. So additional seed files and things that I sort of played around with. W2s I think is actually the worst thing and the bigger problem, W2 spear phishing. And then I was able to time lock that. I was actually looking at this almost two years ago.
A friend of mine's company got hit, and with a query I was able to pick two people that I thought for sure were the ones that got hit. That was interesting and I started looking deeper at LinkedIn too and looking for maybe personas in there. That's another talk and then I broke down
the attack chain and why we could prevent that attack chain from completing by using a process like this. So that's it and if you guys have some questions and stuff, I think we're at the end. Thanks David and thank you for coming today. We'll be golden off at the end,