
RECON VILLAGE - skiptracer: Ghetto OSINT for Broke Hackers


Formal Metadata

Title: RECON VILLAGE - skiptracer: Ghetto OSINT for Broke Hackers
Number of Parts: 322
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract:
Initial attack vectors for recon usually involve utilizing pay-for-data APIs (Recon-NG) or paying to use transforms (Maltego) to get data-mining results. skiptracer instead uses basic Python web scraping of PII paywall sites to compile passive information on a target on a ramen-noodle budget. The modules allow queries for phone numbers, emails, screen names, real names, addresses, IPs, hostnames, breach credentials, etc. This demo will go over the basic outline of using the script, the problems and pitfalls of dealing with scrapers, and how it will help you collect relevant information about a target to expand your attack surface.
Transcript (English, auto-generated)
How's it going, everybody? I apologize for being a little screwed up. I had a late night last night, but so who am I? I'm illwill. I'm xillx on Twitter, GitHub, and Pornhub. Got a couple sites out there.
And all this started from a site called tracksomebody.com, which I used to run. I ended up killing the domain last year, so this is a project that spawned from that website. As most of you here know, OSINT is basically collecting all the information from different public sites without a paid API.
Part of the reason why I did this project is that a lot of the other programs like this make you pay money for an API to get the information. So I tried to do it a little cheaper for people who don't want to pay.
So scraping sites is kind of a gray area. A few courts ruled that it's not illegal to scrape sites, but the people that you scrape from could try to mess with you a little bit.
But as far as I know, this is legal. I'm not a lawyer. So tools commonly used would be, one, Maltego. They have the paid version, and they have the community version on Kali, which isn't as great. You can make a lot of transforms for them.
But it's a really good tool if you want to pay for it. Another one, Recon NG. Kind of a similar path that I'm using for this, but a lot of their stuff requires also APIs to get information from. Some free, some paid.
And then just the internet, Googling stuff, Google Dorking, Bing, a bunch of different sites like that. Why is scraping better than APIs? First, it's cheap, as I said. You don't have to pay a monthly fee or a yearly fee for x amount of times that you can scrape from their site.
There's no limitations. Again, you don't have that top level where you have, say, 10,000 search queries. And if you burn through them in the course of a month or a year, you don't have to keep on upgrading. And then most of it's getting the results that we want. So if we have a site that doesn't have an API to it,
we could basically scrape the page, pull all the information off of it, and just display it as we need. As I said before, I used to run tracksomebody.com. There's just basically a site that had JavaScript. You would basically choose any of the top level URLs
and basically you would just type in whatever query that you had, a phone number, email, screen name, real name and address. So I let it expire because I didn't think I was actually going to make something out of it, but here I am. Right now, in its present form,
if you still want to check out the web version, it's at this URL. You can do basically most of the stuff that you can do in Skip Tracer, you can do right from the web. So if you're just on your phone or something like that, you could just go there. It has pretty much everything that you can do, plus license plate look up.
The license plate itself, it doesn't give you any information on the person. It just gives you the VIN number, the make, model and all that. But basically, you just go there, you press any of the buttons up top, put in your query, and it just pops open a bunch of tabs. Doesn't work well on Chrome because they block it because of spyware
and all that stuff that messes with it. Internet Explorer, of course, lets everything happen. So I started off this project, I was working on something, I came across Beautiful Soup, which is a Python library that allows you to get the needle in the haystack pretty much.
You say, I want this, this, and this from this page, and I want to export it to whatever I want to use it for. So I had it for another project, and then I was like, okay, well, why don't I try to do the track somebody the same way? Now the code, if anybody ever asks you to upgrade to Python 3,
just punch them in the dick. I tried making this PEP 8 compatible and all that stuff. It just turned it into a hot mess. So I'm still working on it. Right now, in its current form on GitHub, it's kind of messed up. So I've got to work on that today
to release it out. So basically, yeah, if anybody tells you that this is machine learning or AI: it's just if statements. It's not anything special or magical.
So what it is, in its current form, is just basic scraping. You create a connection to the website; once you get that connection, you get the source of the page, and then BeautifulSoup is just gonna parse the results out for you. You just say, okay, I want something in between these HTML tags and something between these,
and it just spits it out all for you. So Skip Tracer is the project that came out of all this. Basically, I'm gonna go over the couple items that it goes over. So right now it's kind of like a framework.
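The scrape flow just described (connect, grab the page source, let BeautifulSoup pull out what's between chosen tags) can be sketched roughly like this. The URL comment, CSS selectors, and sample HTML below are made up for illustration, not taken from any real skiptracer plugin:

```python
# Minimal BeautifulSoup scrape sketch: given fetched page source,
# extract just the text between the tags you care about.
from bs4 import BeautifulSoup

def scrape_names(html):
    """Parse page source and return the text inside hypothetical result tags."""
    soup = BeautifulSoup(html, "html.parser")
    # "I want something in between these HTML tags" -> select + get_text
    return [tag.get_text(strip=True) for tag in soup.select("div.result span.name")]

if __name__ == "__main__":
    # In real use you'd fetch first, e.g. html = requests.get(url, timeout=10).text
    sample = '<div class="result"><span class="name">Jane Doe</span></div>'
    print(scrape_names(sample))  # ['Jane Doe']
```

The same pattern repeats for every plugin: only the URL and the selectors change.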
I started this originally with like just a single Python script. Now I have an actual framework where you just basically put in the search query. The first one would be the phone one where you just put in the phone number, and it kicks back everything that it finds on these pages.
So far, I've got five plugins from five different sites, one being WhoCalled, 411info, True People Search. There used to be good sites like OpenCNAM that went to an API; we used to scrape that with no issues. They started going to paid stuff, and most of their stuff was current. Some of these sites are not exactly up to date,
but for the most part, it's pretty current. There is some discrepancies, like say like they just got a phone number like a month ago. It's not gonna show up in this stuff just because they haven't got access to that database yet. So most of these sites here, when you go to the site, it's one of those things where they scam you
into paying like $1.99 or $2.99, and then they just keep on billing you if you forget that you have to cancel it. So what it does is it basically goes to the page. It'll scrape out anything about that person or about that query and just display it for you. Right now, it outputs in JSON file,
so you can actually import that to whatever database or anything that you need to do. Email Recon is basically all those same background check sites. It also has a module for LinkedIn. The module for LinkedIn actually requires you to have a valid user account.
Don't use your personal user account because anybody that pings that, they're gonna get a notification saying so-and-so looked at your profile. So we sign up for like a burner account for that. The MySpace one, of course, people still have MySpace out there from their old emails. Might be good for recon where it was back in the day, nobody really cared about putting up their information
and not scared like Facebook. Hacked Emails, that went down recently for GDPR, basically all the privacy act stuff. So that got removed from the modules recently. Basically, they changed it over first to take down their API, and then they changed it so that
if you were looking up to see if it was your email that was hacked, it will send you an email saying, somebody's querying you for this, is this you? If so, click this link to check it out. So the fallback to that, the Have I Been Pwned, works really well. Troy did a good job of compiling most of this stuff.
It's really good because, for the most part, if you search hard enough, you can find the dumps that this stuff is in. So if it's something where you're doing a pen test and you're trying to get some more information from a target, maybe password reuse might get you in. The whois module basically just looks for anybody that registered a domain
with that email. Sometimes good because a lot of people forget to do privacy on theirs. They'll have their home information, their home phone number attached to it. So it's really good if the person's not paying attention. Username Recon, there's two sites that basically go through all the social media sites,
Facebook, Foursquare, Myspace, all that. The KnowEm plugin only gives you where it came from. So it just says, okay, they have a Facebook account, but it doesn't tell you where it is. A lot of times it's facebook.com forward slash their username anyhow.
Name Check does a little bit better where they actually spit back the URLs where you can go and test and try to see if that actual account is valid. It's not always 100%. Sometimes there's just killed accounts that it may give you a false answer saying that this account exists here, but it really doesn't.
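The "spit back the URLs where you can go and test" idea can be sketched like this: generating candidate profile URLs is just string formatting, and actually requesting each one to see if the account exists is the unreliable part (dead accounts give false positives, as noted above). The site list here is illustrative, not skiptracer's actual list:

```python
# Build candidate profile URLs for a screen name, NameChk-style.
# In real use you'd then request each URL and treat a 200 response
# as "account may exist" -- with the false-positive caveat above.
SITE_TEMPLATES = {
    "facebook": "https://www.facebook.com/{}",
    "twitter": "https://twitter.com/{}",
    "myspace": "https://myspace.com/{}",
}

def candidate_urls(username):
    """Return {site: profile_url} candidates for one username."""
    return {site: url.format(username) for site, url in SITE_TEMPLATES.items()}

print(candidate_urls("kevinmitnick")["twitter"])  # https://twitter.com/kevinmitnick
```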
And then we got the First Name and Last Name Recon. Basically it goes through a couple of the sites that the phone number one goes through. A lot of that information is tied to the person. And these sites, the ones that charge like $1.99, stuff like that, advanced background checks,
they give you a small tidbits of information that you can hook together. In the course of creating this, I found that after I wrote all the beautiful soup to pull that information, if I scrolled all the way down to the bottom of advanced background checks or in the source code of advanced background checks, they actually have a JSON file
that gives you all the information. So all I had to do was just basically parse that information. The regular HTML would give you just the age. The JSON file will actually give you the date of birth or something that's linked to it. So you can use that in the course of trying to find out where somebody is.
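A rough sketch of that embedded-JSON trick: instead of scraping the visible HTML (which only shows an age), pull the JSON blob out of the page source, which carries richer fields like date of birth. The `var data = {...}` marker and the field names below are hypothetical, not the real site's markup:

```python
# Extract and parse a JSON blob embedded in page source, rather than
# scraping the rendered HTML. Marker pattern is a made-up example.
import json
import re

def extract_embedded_json(page_source):
    """Find a 'var data = {...};' style blob in page source and parse it."""
    match = re.search(r"var\s+data\s*=\s*(\{.*?\});", page_source, re.DOTALL)
    return json.loads(match.group(1)) if match else None

sample = '<script>var data = {"name": "Jane Doe", "dob": "1970-01-01"};</script>'
record = extract_embedded_json(sample)
print(record["dob"])  # 1970-01-01
```

When a site ships its data this way, parsing the JSON is both easier and richer than reassembling fields from HTML tags.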
It goes through a couple steps. It basically will ask you what state's zip code or city they're from. Are they male, female? Are they older than 30? And then one of the other sites is just basically it asks you for an age range. So if you kinda know or the first one spit back an age, you could say roughly between, as the example shows,
anywhere from 50 to 60, the state that they're in, and basically spit back any information. It's also good for relatives. So if you're trying to find somebody on social media, they don't have an account, a lot of times you can go to their family member's page and you can basically find them at family functions,
stuff like that, and you can kinda like dig in a little bit easier finding out their location and their relatives. So as I was saying before, the plate lookup, it doesn't give you the exact information. I don't wanna ping into DMV to try to pull that stuff out.
It does, I did find a site that actually gives you the information. When you plug in the plate number and the state, it'll kick back and tell you what the VIN number and all that stuff is. What I was doing before is, there's a couple sites where if you plugged in a plate and you got back the VIN number,
you can actually pull that from dot.gov, and that'll pull the same information that sites like FaxVIN do. So I'm gonna try to do a demo. I don't know how well it'll work; I'm gonna try to do it from my phone because the wifi here is a little bit wonky. So right now I'm kicking back to the older code.
The stuff that's on GitHub right now is kind of janky; it's not working right. But it's basically pretty simple. To get the program started, there's a requirements.txt that you would basically just install from, and it'll download all the different libraries that it uses. The next version,
you're just gonna call it with Python three. And this one right now, which is, it's in a kind of a weird state, a wonky state. why is this thing not going on?
Did it go up? No. I hate Microsoft shit.
Let me just kill this.
All right, so basically the way it is now,
before it was just a command line driven where you can actually put the parameters into the command line. Some people were having problems with it cause they didn't know what to type or anything on there. So we tried to start making a menu driven system that people are a little more familiar with
from different things. Was that, kill me, plus or minus.
Is that good enough or one more?
Okay, so basically when you call the script itself, it's gonna start with a menu. It's gonna give you all the choices. You're gonna get email name, phone number, screen name, plate, and an interactive profile where the profile it does is say like you know some of the information on the person.
You can set it up to say their age, their gender, and any information of where they live. And when you're going into the other items, you don't have to type it in. It will basically save all that stuff for you. So the demo, what we're gonna do is we'll just do like an email. As you've seen in some of the slides,
basically I use Kevin Mitnick's stuff because he's easy to find. So let's see if this actually works. So what the first one does is,
actually scroll back up. What the first one does is it goes to, LinkedIn has a, I think it's a sales feature that they have for people to spam the hell out of you on there. But basically the link that it goes to, so I found it on inteltechniques.com. Somebody had posted something about
that you can pull this information. If you had a valid account and you can log into LinkedIn, you can pull that information. So you'll get stuff like their name, where they work, stuff like that. You'll get sometimes their image. A lot of times if you go from Google, they won't show their actual profile image. But what I'm doing is
I'm actually gonna compile that stuff. Eventually get it into stuff like TinEye and Google reverse image search. So that way you can see if that profile image is used somewhere else. It's in the works, it's not in there yet. The second thing, I guess Kevin doesn't have Myspace anymore.
And then it goes to have I been pwned. So basically it goes through the data on there and it basically pulls all the information or all the different breaches that someone's been in. So like I said, if you can search hard enough, you can get the database. A lot of these are cracked already. So you don't have to go through the process
of cracking this stuff. You just get the hash, compare it to some of the word lists that are out there. And I think hashes.org has a lot of stuff where you can actually compare that stuff and don't even have to crack it. Just use that as your dictionary, pull that stuff up. So as you see, like I was saying before, the JSON file, it will pull up their name, middle name,
date of birth, depending on what's out there. It's going to show all their old phone numbers that they may or may not have over the course of the years. You'll see stuff like their old emails. I think it goes back, probably spans back like 15 years. A lot of these old databases,
there's like a big six companies that have all this data where they aggregate it and sell it and kind of trickle down the line. So it basically, those sites that are $1.99, they're probably most of the same parent company or parent company where it comes from, but they just try to sucker you into doing all that stuff.
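The dictionary-comparison step mentioned a moment ago (use an already-cracked wordlist as your dictionary instead of brute-forcing) might look something like this. Unsalted MD5 is assumed purely for illustration; real dumps vary in hash type and salting:

```python
# Compare leaked hashes against a wordlist: hash each candidate word
# and look it up, rather than cracking from scratch.
import hashlib

def match_hashes(leaked_hashes, wordlist):
    """Return {hash: plaintext} for wordlist entries whose MD5 appears."""
    found = {}
    for word in wordlist:
        digest = hashlib.md5(word.encode()).hexdigest()
        if digest in leaked_hashes:
            found[digest] = word
    return found

leaked = {"5f4dcc3b5aa765d61d8327deb882cf99"}  # md5("password")
print(match_hashes(leaked, ["letmein", "password"]))
```

Services like hashes.org (mentioned above) effectively do this lookup for you at scale.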
So what I did now was just an all. It basically runs through all the modules for email. So we'll go through all those, or you can singly do them just to check if somebody has a LinkedIn. Oops.
Since the name is cached when you enter it the first time, it doesn't ask you again unless you exit back to the main menu; then you can retype it. But basically, yeah, it just pulls it off singularly, or you do all. A lot of people like to do all because you just let it run, aggregate the data, and dump it off for later.
If you hit back to the main menu, you can do other stuff like look up a license plate. A lot of times, FaxVIN, for some reason, the site itself is pretty slow pulling up information. So if I want to do, like, somebody with a license plate "HACKER" that lived in Connecticut,
sometimes it will hit, sometimes it won't. So it will say no results found, but if I hit it again, no results found. Let's try a different, I'll try a Las Vegas one.
So yeah, it's just, I mean,
you'll pull up any information that may be related to it. A lot of times, like advanced background checks, if you pull up an email, it'll pull their whole profile and they'll have the additional emails at the bottom, which you can then use to do a query for. Maybe it's something that's not out there. It's something that's an old email that they don't use anymore, but it's still linked back to other social media.
Yeah, yeah, I mean,
the FaxVIN one's a little bit wonky. I don't know if there's a buffer on there or something; I gotta work on that a little. But a lot of times, if you try to run it again, it'll pull it up, even though it said no results found. Not too sure why.
So what we're doing is we're basically, we're building out a framework where people can submit modules to the GitHub,
which basically pulls any information. So if you know of a good site that you can get X amount of information without any issues, feel free to submit it to GitHub. Let me just pop this back up.
All right, so the demo is semi-good.
So the things on the list to do: again, like I said, we're trying to move to Python 3. It's just a pain in the ass. So hopefully soon we'll get to that point where it's more PEP 8 compliant and all that stuff.
Hopefully they'll kill off Python 2.7, even though it's easy as hell to use and easier to understand. Eventually they're gonna move away from that. So like I said, looking for more plugins. I'm gonna be doing API support for some of the sites, even though some of them are paid or free, it's always good where you're gonna get
that information that's not normally on these free sites. Right now the output is just a JSON file. We're gonna get it to CSV and HTML, like pretty page, et cetera. And again, the GDPR is also a problem.
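The planned JSON-to-CSV conversion just mentioned could be sketched with only the standard library. The field names here are made up for illustration, not skiptracer's actual output schema:

```python
# Flatten a list of scraped JSON records into CSV text using the stdlib.
import csv
import io
import json

def json_to_csv(json_text):
    """Convert a JSON array of flat records into CSV, union of all keys."""
    records = json.loads(json_text)
    fieldnames = sorted({key for rec in records for key in rec})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

data = '[{"name": "Jane Doe", "phone": "555-0100"}]'
print(json_to_csv(data))
```

An HTML "pretty page" export would follow the same shape, just rendering the same records into a table template instead of CSV rows.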
So eventually, hopefully that shit dies down and these sites go back to resuming business as normal. But I wanna give a couple special thanks to some people that helped work on the project. It's basically, I'm not a Python programmer. I just did it for shits and giggles
just to see if I can do it. A lot of these guys listed do Python day to day. They know a hell of a lot more than me. So I just wanna thank them. I think we're almost to the point, okay. So basically, yeah, it's gonna be up on GitHub. Hopefully it'll be updated today. If I don't crash out and die,
just been up for like 24 hours. But I'm gonna work through some of the problems that we have with Python 3; it's just not pulling the information and parsing it correctly. So hopefully today I'll get that updated. Right now, if you go to GitLab with the same URL,
you can pull down a working version. That was the old 2.7, 3.0 hybrid that we had an update. We're basically working off GitLab and then pushing it over to GitHub. So again, feel free if you have any information of some cool site or anywhere
where you can pull the information from. That's not a paid API, that you can just pull the information. Feel free to submit it. If you see that my code is shitty, feel free to shit on it in the comments. But I'm trying to build this out to be like a framework so people can just add and submit stuff as it goes.
Problem with all these sites, they go up and down all the time. Sometimes it's a free site for a while, then it'll go to a paid API because they suckered all those people into using the service and they figure there's nothing better. So again, if you have anything on there, just go to the URL, hit submit, let me know.
Any other issues. But I guess I got a couple minutes for questions if anybody has any. Not yet. I have one that's working, I just haven't put it into the repo yet.
So it does actually pull the information from there. Sometimes it's their age, sometimes it's where they work. But it tries to aggregate that stuff. I just haven't released it yet just because it's a little bit wonky in the way that they have it, but it will be released probably in the next couple weeks.
You got a question? Okay. Yep.
So the LinkedIn one where I'm pulling from that page actually just pings that person directly. It's not actually a search query on LinkedIn itself. So when you hit it, it's just a direct hit as a user. And it just picks it up to say, this person viewed your profile within the last day or something like that,
depending on how security is set up and all that. But yeah, it's not the normal searching through there. It's just basically getting that information and just displaying it on the page. Anybody else have any questions? All right, that'll be good. Thank you.