
Predicting Susceptibility to Social Bots on Twitter


Formal Metadata

Title
Predicting Susceptibility to Social Bots on Twitter
Number of Parts
112
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Are some Twitter users more naturally predisposed to interacting with social bots, and can social bot creators exploit this knowledge to increase the odds of getting a response? Social bots are growing more intelligent, moving beyond simple reposts of boilerplate ad content to attempt to engage with users and then exploit this trust to promote a product or agenda. While much research has focused on how to identify such bots in the process of spam detection, less research has looked at the other side of the question: detecting users likely to be fooled by bots. This talk provides a summary of research and developments in the social bots arms race before sharing results of our experiment examining user susceptibility. We find that a user's Klout score, friends count, and followers count are most predictive of whether a user will interact with a bot, and that the Random Forest algorithm produces the best classifier when used in conjunction with appropriate feature ranking algorithms. With this knowledge, social bot creators could significantly reduce the chance of targeting users who are unlikely to interact. Users displaying higher levels of extraversion were more likely to interact with our social bots. This may have implications for eLearning-based awareness training, as users higher in extraversion have been shown to perform better when they have greater control of the learning environment. Overall, these results show promise for helping understand which users are most vulnerable to social bots. Chris Sumner (@thesuggmeister) is a co-founder of the not-for-profit Online Privacy Foundation, which actively participates in and contributes to the emerging discipline of Social Media Behavioral Residue research. Chris has previously spoken on this area of research at conferences including Black Hat, DEF CON, 44CON, the European Conference on Personality and the International Conference on Machine Learning and Applications.
Randall Wald is a postdoctoral researcher investigating data mining and machine learning at Florida Atlantic University. Following his BS in Biology from the California Institute of Technology, Randall chose to shift his focus to computer science, applying his domain knowledge towards bioinformatics and building models to predict disease. He also studies machine learning for other domains, including machine condition monitoring, software engineering, and social networking.
Transcript: English (auto-generated)
I'd like to introduce Chris Sumner and Dr. Randall Wald. All right. Just checking the audio there. Thanks for making it at this early hour, or for rolling by on your way home; either way, very much appreciated. As the gentleman there said, I'm Chris Sumner, and I'm representing a small charitable organization
called the Online Privacy Foundation, or "privacy" in the U.S. And part of our remit is that we look at behavioral residue, that's the sort of stuff you do. And we look at that in an online context to see if you're giving away stuff without actually knowing that you're giving it away.
So in this experiment, we're looking at susceptibility to interacting with a social bot on Twitter or you could perhaps say a stranger. So before I begin, I was also on the CFP review board panel for DEF CON. So I know there are a ton of awesome talks and there's a couple even running parallel to this one.
So what I wanted to do was just highlight some areas of this talk which some of you may already know about. So if you're familiar with the Web Ecology Project, Tim Hwang, you're familiar with astroturfing, the term "swift boating" and a gentleman called Yazan Boshmaf, then approximately 50% of the presentation may be, how can I put it, a touch on the light
side for you. But if these terms are fairly new, then hopefully it will be pretty interesting. The other thing is ‑‑ oh, yeah, sorry, that's one I added later ‑‑ it also contains about five minutes of maths, and specifically machine learning, which can
get a little bit heavy. So just so you're aware, there is some of that in there. Okay. So, moving on: because we're in Las Vegas, and part of Las Vegas is all about trying to win money ‑‑ or in your case, hack things and win money, or just hack things ‑‑ the goal of this experiment wasn't to pinpoint with laser accuracy somebody
who is susceptible to a social bot on Twitter. It was more to improve the odds in your favor, or at least improve the odds over a baseline performance. So that's what we were really aiming for. So if you're expecting laser precision to find the most susceptible person in an organization,
you'd probably be disappointed. But if you want to move the odds slightly and give yourself a bit more of an advantage, then stick around. Okay. So with that out of the way, if folks think that's not for them, I quite liked the "Dude, WTF in My Car" talk when I was reviewing it.
That looked pretty good. So, starting the presentation for real: this gentleman is called Tim Hwang. And in 2011, he ran a competition called the Social Bots Competition. And essentially how that worked is he had three teams with social bots competing to
win a prize, which I think was a small amount of money and then just props, essentially. So they were given a target audience of 500 Twitter users to go and apply their bots to. All of these Twitter users had something in common: they had tweeted about or had some interest in cats. Okay, so that's a lot of Twitter users. But the social bot teams were scored: they got one point for a follow‑back, three points for a social response, and they got docked 15 points if they were killed by Twitter.
So you know, suspended accounts. So that was the scoring mechanism for the Social Bots Competition. It was described as "blood sport for Internet social science and network analysis nerds." So I'm not going to go into this in too much detail because there's a really nice
Ignite video that lasts about five minutes, where Tim explains the competition, and it's really pretty good. All of the references and what we intended to say are in the speaker notes, so don't worry too much about writing notes. One team won, obviously.
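The scoring scheme just described can be sketched in a few lines. This is only an illustration of the arithmetic; the event names and counts below are made up, not taken from the actual competition code:

```python
# Hypothetical sketch of the Social Bots Competition scoring:
# +1 per follow-back, +3 per social response, and -15 each time
# the bot's account is killed (suspended) by Twitter.
POINTS = {"follow_back": 1, "social_response": 3, "suspension": -15}

def score(events):
    """Sum the points for a list of scoring event names."""
    return sum(POINTS[e] for e in events)

# Example run: 10 follow-backs, 5 social responses, one suspension.
events = ["follow_back"] * 10 + ["social_response"] * 5 + ["suspension"]
print(score(events))  # 10*1 + 5*3 - 15 = 10
```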
The winning team was led by a gentleman called @aerofade on Twitter. And his bot got 198 responses out of the 500, which is pretty good, almost 40%. So we're going to come back to @aerofade and his social bot in just a short while. Now, he was in the audience when we talked about this at Black Hat.
If you're in the audience today, @aerofade, then make yourself known and we'll get you on stage or something. So most research in this area has focused on identifying social bots on Twitter or other social networks. But there's far less research looking at the human factors involved with
who responds to social bots or who interacts with potential strangers online. So as the goon mentioned at the beginning of the talk, last year we were looking at psychopathy. We weren't trying to identify clinical psychopaths; we were looking at psychopathy as a range, to see whether it was possible to improve the odds in your favor of telling whether somebody had certain traits or not, and whether you could potentially reduce the chance of hiring one.
Not saying that that's what it should be used for, but that's what we were looking at. And as part of that research, we thought, oh, this would be really neat to actually
start having a look at social bots as well, because that was around the same time we became aware of the social bots competition from Tim Hwang. So we were like, okay, what can we do? We asked a subset of those users that took part in that experiment if they'd also take part in an experiment where they'd receive an unsolicited tweet at some point in the future.
And you might think, okay, well that taints the experiment to begin with, but people get unsolicited tweets all the time. So we felt that that was quite reasonable. And we had a pool of roughly 700 people tick the box to be part of that part of the experiment out of 3,000. So we had two main questions. The first was, are some users more naturally
predisposed to interacting with strangers or social bots online? And the other was, is it possible to increase the odds of getting a response from a Twitter user based on this, for any social bot, so that the social bot in question can avoid Twitter jail, i.e., account suspension?
So a good question at this point would be, who cares? Now, for the younger people in the audience, you may not remember Diff'rent Strokes, but it's well worth trying to find on YouTube or wherever, because it was an awesome program from my childhood. So one group that are really interested in, or have a lot of vested
interest in this is marketing and sales types. So the reason they're interested is because previously all of their work was based on likes and what have you on Facebook, for example, or follower count. But now that's shifting to engagement because companies paying for marketing services have got wise to the fact that it's easy to
generate fake likes and friend counts and blow them through the roof, but it's much less easy for them to fake engagement. So it behooves them to actually try and create some engagement. The next group that are interested are propagandists, and we'll talk about that in some more detail,
trying to spread a message, or their message, and amplify it through social media. Another group, and this uses a mock-up of a tool called Maltego, which is a fantastic tool, may be interested in taking a group of social users from a particular corporation and trying to identify which of those
groups are more likely to respond to your stranger request or your social bot request. So again, it's weeding out the users who are least likely to respond, and focusing on the ones that are most likely to. Then the other group that are interested in this kind of research are privacy researchers, and this chap is called
Erhardt Graeff at MIT, and he's got an excellent paper called "What We Should Do Before the Social Bots Take Over: Online Privacy Protection and the Political Economy of Our Near Future". And he's concerned, or one of the things he's concerned about in his paper, is that social bots may be able to harvest
a lot of otherwise private information by engaging with users who are, I don't want to say gullible, but are trusting enough to provide certain information which they wouldn't provide if they knew it was a fake account or a robotic account. So he was quite worried about that, we'll talk about that in a little bit more detail.
And final group, social network providers. So Facebook, for example, have their Facebook immune system and it's kind of this constant battle of they're trying to find fake accounts and the fake account creators are constantly evolving to try and beat the Facebook immune system, but they're actually pretty good at it. And also I have to say that Twitter have done a pretty good job, if you look at spam, maybe it was like 2011,
they've done a really good job of throttling that down. So these are the bigger social networks and obviously they've been applying more time to this kind of activity. So we set about creating some social bots to actually go in and have a look at this. So let me go over some of the history
in this sort of sphere of research. So the talk's gonna proceed really with a history of some of the current research, then we'll talk about the experiment method, discuss the findings that we had and just wrap up with some conclusions. That's how it's gonna go down essentially. So Wagner et al, who are the only other people
that we've found doing research into social bots on Twitter, provided this working definition of a social bot: a social bot is a piece of software that controls a user account in an online social network and passes itself off as human. It's a bit of a mouthful, but it is a reasonable
working definition. You might also hear the term Sybil. "Sybil" was coined by a chap called John Douceur at Microsoft Research, and it's essentially the same idea, fake accounts; it derives its name from a lady who had something like 16 different personalities, and there's a movie and a book of the same name.
And some of you might well be going, well social bots aren't anything new because we had chat bots on IRC and all of that sort of stuff and you're right as well about that because this paper here was published in 1994 and there may be earlier ones too. So I mentioned popularity being a driving factor. We've certainly had a lot of that in Twitter
where you've got fake accounts, Justin Bieber's got a whole raft of fake followers apparently. But it does make you look a little bit popular. Who's gonna trust a brand that's only got maybe 10 followers? Maybe not that many. The other thing of course is spam. So you get DMs.
I'm wondering whether this was just a sort of hint that maybe I should lose some weight. But I didn't lose nine pounds using, whatever that stuff is, acai berries. But anyway, that was what you saw a lot of on Twitter, spam, and you're always thinking, who clicks on that stuff, right? The other thing is that you can get some social bots
that are actually quite amusing. So in this one, for example, we've got this chap called Kevin who tweets to his followers, whoever his followers are, had a successful auction yesterday, thank you universe, to which the universe promptly replies, no problem, Kevin, get out there and do your thing. So one bot that I'm particularly sad to have seen gone
was a bot by a gentleman called Neil Codner called the World of Shit bot. So every time you mentioned the movie Full Metal Jacket, it would tweet you back and give you a ton of grief. And it was actually hilarious such that you'd want to go out and just mention Full Metal Jacket
just to receive the grief. But it no longer does that, which is a real pity. If Twitter could somehow work with Neil, I'd love to see that reinstated. Then there was another chap, Nigel Leck, I think in New Zealand or Australia, who got tired of debating climate change deniers. So he created a bot. The clue is kind of in there:
"Turing test" was the name he used for it. So any time somebody was talking about climate change or denying climate change, he would have his bot go and pick out bits of research, reply to them, and try and set them straight. So you could apply that to religion, politics,
or anything you like, to argue on your behalf because you've got other things to do, right? So it's a time-saving device. Another gentleman, and this is well worth taking a look at, was Project Realboy by a chap called Greg Marra. And he was kind of one of the first
to actually start looking at social bots and creating social bots on Twitter that behaved more like humans, really. So well worth checking out. I'm not gonna go into too much detail about that because we'll be short on time. Then, politics. So I always mispronounce Roelof's surname, but Roelof Temmingh, of Paterva fame,
the guys that wrote Maltego; Roelof and Kenneth Geers wrote a paper in 2009 called "Virtual Plots, Real Revolution". And he discusses the concept here of what if both left and right wing blogs were seeded with false but credible information about one of the candidates.
Well, a year later on Twitter, we saw, you know, there was a vote between Martha Coakley and Scott Brown. But shortly before that vote, there was a lot of, I don't know, misinformation or dragging Martha's name through the mud on Twitter,
and that escalated pretty quickly so that it was kind of an overwhelming amount of negative information about Martha Coakley. And the result, although it may not be related to what happened on Twitter, was that Scott Brown won. There was certainly a lot of discussion on the use of Twitter to sort of slate Martha Coakley,
and that got a name, or has been also attributed to a name, and the name for that is Swift Boating. Now, that actually went a little bit better in practice, but that's just everything I do.
So, but that gained its name not from Taylor Swift, but from the Swift Boat US military veterans' campaign, where you've got this sort of thing where you question somebody's military record, and that apparently has some effect on how popular they're gonna be in certain states. So the other thing we've been seeing
is that a bunch of Russian students or bloggers or what have you were paid some money to write positive blogs about President Putin in the run-up to his election. It's not a brilliant example of this, but it is an example, and that's got a term, it's called astroturfing, and that's essentially a fake grassroots campaign
to make somebody look popular and to build support for a message, a political agenda, any sort of agenda essentially. There's a group at the Indiana University who've been working on a project called Truthy which aims to detect sort of political and other sorts of astroturfing
and the generation of memes and sort of activity like that online, and that's worth having a look. They've got some neat videos. You can put in a meme or something like that and look at where it originated from. So they're actually doing some really good work because they feel that what's out there on social media has got a real impact in the real world,
and they need to pay some attention to this space, so really interesting. Then we've got sock puppets. It was HBGary's email hack, when was that, 2010, '11 or something like that, that kind of switched me on to the term sock puppets, and it mentioned that US spy operations
or military operations were looking for sort of fake persona management, and there was some discussion about whether that had been used in the Arab Spring context to help nudge a certain group in a certain direction. The interesting thing about that particular concept is that this was cited in Roelof
and Kenneth Geers' 2009 paper as well. It states: "a large virtual population scattered all over the world and encompassing different socioeconomic backgrounds could be programmed to support any personal, business, political, military, or terrorist agenda." So there's some incentive there.
Next, at Christmas this year, there's a mobile phone provider in the UK that was sending out tweets like this to users who were having problems: "Please follow us and DM your mobile number, post code, and password, thanks." And that was the legitimate account.
That got picked up by a gentleman in Australia called Troy Hunt, who some of you may follow on Twitter or maybe read his blogs, but some excellent blogs, and he was tweeting them and got one tweet back from this account, MyEEcare, which looks a lot like EE but isn't.
It's subtly different. So then Troy Hunt responded to this tweet. I'm not saying Troy Hunt's susceptible, because I responded to that tweet too, saying something like it was nice to see you taking an engaging approach to tackling the problem, to which the MyEEcare folks said,
no, we don't really care, actually. But the point is that if you can get your bot to work on behalf of a brand and get in there before your brand, a number of users aren't actually gonna check whether it's a legitimate one or the non-legitimate one. Then after our Black Hat talk on Wednesday, gentlemen came up to us and started asking questions
about the possibility of misdirecting emergency resources during some sort of unrest. And that made me think, oh yeah, actually there's a professor at Manchester University, Professor Procter, I wanna say, I might be wrong about that. I'll add that to the speaker notes. But he was looking at fake information
in relation to the London riots, and what they'd observed, they were observing rumors. And the rumors were like, there's lots of carnage and there's shops on fire, don't go onto this high street, and that was perpetuating around Twitter at the time. And yet that high street had almost nobody on it, and there was certainly not a riot
and no stores were on fire. So they were looking at how long it takes for a rumor to perpetuate and the truth to catch up. And his statement, which made me chuckle, but I'm sure it's not new, is that the rumor gets halfway around the world before the truth has a chance to put its pants on. So there's another way you could sort of use it.
I mentioned Tim Hwang; he set up an organization called Pacific Social, because part of the social bots experiment was really looking at, okay, you've got the group of 500, and after that, it majorly morphed the social graph. So he was like, oh, can you distort the social graph?
Maybe stitch two disparate communities together. So that was kind of his position on that topic and wanted to sort of research it a little bit more. The other thing he was interested in is something called emotional contagion and happiness buffering. So you take a group of people who are happy, when their happiness levels drop below
a certain threshold, then you start injecting happy tweets into the network, using sentiment analysis, and turn the sad people happy again. So that's one of the things they're looking at. I'm not sure how far they've got. And he also mentioned this concept of sort of social penetration testing. So spread information with small inaccuracies,
see where they're challenged and where they're not, identify who's the most influential but also the worst at evaluating what's real, and target them. Then this chap here, Yazan Boshmaf at the University of British Columbia, studied bots in some depth in relation to Facebook. He had his social bots steal 250 gigabytes
of user data, which ties back nicely to the air graph paper from MIT. So Boshmaf was one of the first to actually say, we need to understand some of the human factors behind who's engaging with social bots and fake accounts. And to that end, there are two groups looking at that: the Secure and Trustworthy Cyberspace Initiative
out of the US and the Insider Threat Project out of Oxford University. So we set about using social bots to find people who will talk with anyone about anything, essentially. For our methodology, we used the winning bot code from the social bots competition, AeroFade's bot. We ran it with some minor modifications
which I'll talk about, but essentially it was the same; it was under an MIT license. We had 610 participants. We got their Twitter information, their personality information and their Klout score, and then a linguistic analysis of their tweets. We divided them into two groups of 305 each and assigned each group a bot.
We did that just so we could get the whole job done quicker, that was all. We gave our bot an image of an old lady, with a bio along the lines of "I've got my own teeth." Whereas AeroFade's bot had a young dude in New Zealand. So we switched some things around.
How the bot works is that it takes a Flickr group, in our case dog fashion, but in his case kitten fashion, posts a picture to WordPress and then tweets. So you'd get a picture like that, put it into WordPress like that, and then you'd get a tweet out saying something like "new blog post." So this is creating some content for the bot.
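As a rough sketch of that pipeline (the helper functions here are our stand-ins, not AeroFade's actual code, which talks to the real Flickr, WordPress and Twitter APIs):

```python
# Rough sketch of the bot's content pipeline: pull an image from a themed
# Flickr group, wrap it in a blog post, then announce the post with a tweet.
# The functions are hypothetical stand-ins for the real API calls in
# AeroFade's bot code; only the shape of the flow is taken from the talk.

def make_blog_post(image_url, group_theme):
    """Wrap a group image in a minimal WordPress-style post."""
    return {
        "title": f"Spotted in {group_theme}",
        "body": f'<img src="{image_url}" alt="{group_theme}">',
    }

def make_announcement_tweet(post_url):
    """Compose the 'new blog post' tweet the bot sends out."""
    return f"New blog post: {post_url}"

# One pass of the pipeline, with made-up URLs:
post = make_blog_post("https://flickr.example/dog123.jpg", "dog fashion")
tweet = make_announcement_tweet("https://oldlady.example/blog/42")
```

The point is just that the blog post gives the tweet something plausible to link to.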
Next we used a service called If This Then That (IFTTT): if the weather went above 20 degrees C in the place where this old lady was supposed to live, then she'd post a tweet. And that looked something like this: "I wonder if I can switch the heating off now. It's 21 degrees C in sunny Bournemouth."
So this is creating some content at least. Then we got our target users, the 305, and began following them. If any followed back, we'd log them in an interactions CSV file. That's how technical it got. No database, it's just CSV.
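A minimal sketch of that logging in Python (the column layout is our assumption; the talk only says interactions were appended to a CSV file):

```python
import csv
import io
from datetime import datetime, timezone

# Append one row per interaction: timestamp, user, kind of interaction,
# and any reply text. The field layout is an assumption for illustration.
def log_interaction(csvfile, user, kind, text=""):
    writer = csv.writer(csvfile)
    writer.writerow([datetime.now(timezone.utc).isoformat(), user, kind, text])

# io.StringIO stands in for open("interactions.csv", "a", newline="")
buf = io.StringIO()
log_interaction(buf, "@someuser", "follow_back")
log_interaction(buf, "@someuser", "reply", "haha what?")
```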
Yeah, let's hear it for CSV files. Exactly. So then we started tweeting. We started tweeting some random shit, basically. And this was not what we tweeted,
but what AeroFade's bot tweeted; we felt those might be interpreted as slightly misogynistic. So we swapped out our tweets where we felt we might get into a little bit of difficulty and had stuff like this instead. And then, you know, like this as well. So these were obvious cues that our bot might not be
the person she's pretending to be. And here, you know, we swapped any references to cats over to dogs. That's essentially all we did. So you might think, yes, that's banal. And yes, it is, because we wanted to keep it as close as possible to the Web Ecology Project. So we only switched dogs for cats and removed anything that might be construed
as slightly misogynistic. So all of the responses to that were, as I say, logged in an interactions file. And then once we'd targeted all of the users, i.e. followed them all, we started asking them questions to see if we could generate a response. This is how AeroFade's bot actually worked. So we had questions along the lines of "Ever milked a cow?"
"What's better, a dog or a cat?" And then we'd look at trying to get some responses, or maintain the conversation if one actually occurred.
AeroFade's bot would just respond randomly from a pool of tweets, whereas our bot actually used an Eliza engine. So you'd get something like this: you say hello, and it'd say, "Hey, how's your day going so far?" But what I liked about AeroFade's bot is that it'd actually be pretty funny sometimes. So we wanted to maintain that and embed some of the random tweets in here.
So it was a bit of a flip of the coin as to what you'd get. So you could get, for instance, "interesting," or "lol, that's what she said." So it could go kind of random, but we also had to consider the problem of ethics. So we put some buffering in there so we could actually look at what our bot was going to tweet out before she tweeted it.
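A minimal sketch of that reply logic (the keyword rules and canned phrases here are illustrative, not the bot's actual tables): a coin flip decides between an Eliza-style keyword response and one of the random-pool tweets, and everything lands in a review queue before it's sent.

```python
import random

# Coin-flip reply selection: half the time try an Eliza-style keyword
# match, otherwise (or when no keyword matches) fall back to a canned
# random reply. Rules and phrases below are invented for illustration.
ELIZA_RULES = [
    ("hello", "Hey, how's your day going so far?"),
    ("you", "How do you feel when you say that?"),
]
RANDOM_POOL = ["Interesting.", "lol, that's what she said"]

def compose_reply(incoming, rng=random):
    if rng.random() < 0.5:  # the flip of the coin
        for keyword, response in ELIZA_RULES:
            if keyword in incoming.lower():
                return response
    return rng.choice(RANDOM_POOL)

# The ethics buffer: replies wait here for a human to approve them.
review_queue = []

def queue_reply(incoming):
    review_queue.append(compose_reply(incoming))

queue_reply("Hello there!")
```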
And this is why: this is AeroFade's bot, James M. Titus. "Do you have any pets, and if so, what?" "Your avi is adorable, your kitty. Now, I don't currently have any pets since my kitty passed away a few years ago." To which AeroFade's bot responded, "lol, that rules."
So, props to AeroFade for making me cry with laughter reading that blog post. Now, we've got a ton of limitations, which you're all thinking about, and you're right. We used basic measures of personality, which you can read about. We had a pretty basic and dumb social bot. Each user got a different question, so maybe the questions had an impact
on whether people responded. As the experiment progressed, more people followed the bots, arguably giving them more credibility, and we had no user follow-up to see if people knew it was a social bot and were just playing along. But either way, people interacted, and the setup was pretty much identical to the Web Ecology Project's, which has already got some research on it as well.
Although we looked at the dimension of personality. So what did we find? Well, we got a 20% response rate, whereas AeroFade's bot got 40%, and that could be for a number of reasons. We targeted a diverse range of people, and they targeted people who liked cats. We also had an old lady;
they had a dude from New Zealand. So there are a number of reasons why that could be. But we had 124 interactions: 39 follow-backs, which could be automated for sure, and 85 replies. The most interactions we had with one user was 10,
two people interacted nine times, and there's a steady breakdown down to 65 people who replied just once. So the difference between us and AeroFade is pretty clear if you look at it on a percentage scale. We also had some interesting, funny events around trolling, which AeroFade had also noticed with his bot.
So we had this interaction: using no more than 10 nouns, and only nouns, describe yourself. To which the user replied: facetious, something that rhymes with "runt" almost 10 times, annoying. And then our bot, with the Eliza engine: "How do you feel when you say that?"
Which gave me a chuckle at the time. That was actually one of the major benefits of conducting this research, is actually having a laugh at the interactions. So, I'm laughing so hard.
Bring it on, essentially, in bot terms. But we also got spotted. So we had some interaction: "What do you do for a living? I help and guide." A pretty clear response here: "I write software for administrative organizations." And she responds with one of AeroFade's random responses: "You're right, and when you're right, you're right."
"You're a bot, aren't you? Granny failing Turing test after one exchange, so singularity is still a fair way off," or something like that. Brilliant. So, okay, looking at personality. Richard Thieme's talk yesterday indicated that people might not be familiar with cultural classics from the 80s.
Well, that's Ferris Bueller from Ferris Bueller's Day Off, and I used him as a flag for extraversion. We found extraverted users were statistically significantly more likely to respond than non-extraverted users. And that's using a scale called the Ten Item Personality Inventory, by Professor Sam Gosling
at the University of Texas. We found that Klout score was significant as well. There were no other significant personality traits, by the way, and we looked at a bunch of them. Klout score, friends count, and followers count all had a statistically significant relationship,
albeit a relatively weak one. So what? Well, in terms of e-learning for corporations, most e-learning around phishing and social networks is a one-size-fits-all approach, and if you think about it, that's targeted more at introverts. There are some papers looking at the effectiveness of e-learning and personality,
and they find that introverts do better when they have less control over the learning experience, while extroverts do worse with less control. So I think there's some mileage in exploring how this relates to developing corporate e-learning experiences. So moving on quickly to data mining and machine learning,
this was our baseline performance: false positives in red, true positives in green. We want to avoid this bit, really, because these are the people who are probably going to try and get us suspended, and aim for this, which is perfection: a precision of 100%. We're going to go light on some of these terms.
And the real aim here for us isn't to achieve perfection. Well, it actually is to achieve perfection, but we're not going to manage it. More realistically, what we want to do is reduce the false positive rate a bit, so that we're spamming fewer people and hopefully getting more of the people we are talking to to respond. So for the next five minutes,
we're gonna go a little bit heavier into some of the machine learning stuff, and then we'll wrap up five minutes after that. And so it's my pleasure to introduce Dr. Randall Wald, who I met here a couple of years ago in the Q&A room. Put my timer here.
Okay, as Chris was saying, basically we had this data, and we wanted to work with it more: build models, understand what's really going on with our users, and see how that can help us make better predictions and get better results. So, very basically, this is what data looks like. You've got instances;
in this case, we have three instances: Alice, Bob, and Charles. Each instance has many attributes. In this case, we see three attributes: the class attribute, the one we care about, and then the independent attributes or features, which are the pieces of information
we can get on new users. So the concept here is we have a training data set, which is labeled, where we know all of these individuals, whether or not they responded to our bot. And we're gonna want to use this to help us build a model which will let us, in the future, say, I have a new person. I have not yet sent any communication to them. Should my bot try to contact them?
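As a toy illustration of that setup (all names and numbers invented; the real features and models come later in the talk), labeled instances plus a deliberately trivial "model":

```python
# Each labeled instance: independent attributes (features) plus the class
# attribute we want to predict. Values are invented for illustration.
training_data = [
    ({"klout": 45, "friends": 300, "followers": 250}, "responded"),
    ({"klout": 12, "friends": 40,  "followers": 30},  "ignored"),
    ({"klout": 60, "friends": 900, "followers": 700}, "responded"),
]

def predict(features, klout_cutoff=30):
    """A deliberately trivial 'model': high Klout means likely to respond."""
    return "responded" if features["klout"] >= klout_cutoff else "ignored"

# A new, unlabeled user we haven't contacted yet:
new_user = {"klout": 50, "friends": 200, "followers": 180}
decision = predict(new_user)  # should the bot bother contacting them?
```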
Should I try to work with them? I'm going to use the features, the independent facts I can find out about this person, to build a model that will let me predict whether or not that individual is going to respond to the bot, and so whether they'd be a good target. So we did two separate experiments on this data. The first was to figure out
which features are most important. The data we're working with here is basically all the users and various demographic properties of them: for example, number of friends, number of followers, Klout score, how long they've been on Twitter, how many tweets per day, that sort of thing. We also took the content of their tweets
and tried to figure out how they use words. Are they using expletives a lot? Are they using "I" and "me" a lot, or "we"? Different types of words like that, to figure out which is most relevant to the problem of whether they'll interact with the bot, whether they'll reply to the bot. So we wanted to figure out first
which of these are most important, and then, once we'd figured that out, to build models that use them to actually predict whether an individual will interact with or reply to the bot. All of our data mining was done using the Weka open-source tool, which is available on all major desktop operating systems.
I don't know if it works on BSD, but it's written in Java and you can extend it; we've actually done some research extending it and adding additional tools to it. It's a very good toolkit, and I encourage you all to download it and play with it if you're interested in going further with data mining and machine learning. Our first results are for interactions.
We wanted to see which individuals were most likely to interact, so we used a bunch of different models, and we came out with three properties of an individual that tell you if they're going to interact with the bot, that is, either follow or reply to it. The first is Klout score, then number of friends and number of followers. As Chris was saying,
those were the features most relevant for interacting with the bot. We also wanted to look at replies alone, though, and found it to be a little different: same concept, and Klout score is still number one, but percent follow Friday is the second most important feature here. That's basically what percentage of a user's tweets
mention the hashtag #FollowFriday, that sort of thing. And this is interesting, because we have these two different datasets working a little differently. Once we figured out these features, we wanted to build models that could actually classify individuals into whether they're going to interact or not, or whether they're going to reply or not. So we built a number of different models
using different features, et cetera, and we found the best model gave us a true positive rate of 61% and a true negative rate of 71%. Or, to put it in easier terms, we see here we have true positives, true negatives, false positives, and false negatives. Now, many of you are going to look at this and say, but your model still gave more false positives
than true positives: if the model predicted that someone is likely to interact with the bot, it's still going to be wrong more often than not. However, compared to the gray boxes, which are what you'd get if you just sent messages to everyone and saw who responded, we're able to eliminate all of those individuals
who would otherwise have gotten a message and weren't going to respond anyway, so we're improving our odds of hitting the people we care about. And remember, every time you send a message to someone who isn't going to respond, there's a chance they'll report your bot. So you want to minimize the number of people in this area to maximize the effectiveness of your bot.
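You can check that arithmetic by combining the talk's roughly 20% baseline response rate with the model's 61% true positive and 71% true negative rates (treating those rates as exact is our back-of-the-envelope assumption):

```python
# Back-of-the-envelope precision: of the users the model flags, what
# fraction actually respond? Rates are taken from the talk; combining
# them this way is an approximation.
base_rate = 0.20   # baseline responders in the population
tpr = 0.61         # responders the model correctly flags
tnr = 0.71         # non-responders the model correctly filters out

true_positives = tpr * base_rate
false_positives = (1 - tnr) * (1 - base_rate)
precision = true_positives / (true_positives + false_positives)

print(f"{precision:.0%} of flagged users respond, vs {base_rate:.0%} baseline")
```

That works out to roughly a third of contacted users responding instead of a fifth, and the false positives do still outnumber the true positives, exactly as the confusion matrix on the slide shows.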
And we did related models for the reply dataset. It's a little more challenging to build a model there, and the performance is a little worse. We also found that different models perform well on different data, and this is important to understand in data mining. You might say, why don't we all just use the best model? Why bother building different models and looking at how they perform?
Different datasets are going to behave differently, and the optimal model will vary. We see similar performance here where, again, there are more false positives than true positives, but we're able to significantly improve the performance of our model, improving the chances that our bot is going to hit the people we care about,
the people who are going to reply to the bot. And that makes for a better bot, one that gets a better response rate and lasts longer before it gets reported on Twitter. So overall, what we found in this part is that we're able to find the features that are most important for interaction and replies to the bot.
We also found that the two datasets, interaction and reply, are different. Klout score, which is related to extraversion, is important for both interaction and reply, but the number of friends and followers matters more for interaction, while percent follow Friday matters more for replies.
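A crude stand-in for that kind of feature ranking (the real analysis used Weka's ranking algorithms, and the data below is invented, so the resulting order just illustrates the mechanism): score each feature by how far apart its means are between responders and non-responders, normalized by the feature's spread.

```python
# Invented labeled data: (features, responded?) pairs.
data = [
    ({"klout": 45, "friends": 300, "followers": 250}, True),
    ({"klout": 50, "friends": 420, "followers": 390}, True),
    ({"klout": 12, "friends": 40,  "followers": 30},  False),
    ({"klout": 18, "friends": 60,  "followers": 55},  False),
]

def rank_features(rows):
    """Rank features by normalized gap between class means (descending)."""
    scores = {}
    for f in rows[0][0]:
        pos = [x[f] for x, label in rows if label]
        neg = [x[f] for x, label in rows if not label]
        gap = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
        # Normalize by the feature's overall range so scales are comparable.
        spread = max(x[f] for x, _ in rows) - min(x[f] for x, _ in rows)
        scores[f] = gap / spread
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank_features(data)
```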
And even though our models are not perfect by any means, and still produce a large number of false positives, they let us build bots that can target those who are more likely to be susceptible. So with that, I'm going to give it back to Chris here. Thanks, Randall.
So wrapping up with some conclusions here. So again, we found that extroverts perhaps presented the greatest risk because they're maybe more impulsive. We can look at that in maybe just a little bit more depth in a second. And bot masters could use machine learning
to improve the performance of their bot. The key here isn't that they take our model to improve their bot, because that was specific to our particular bot. When Wagner et al. looked at the AeroFade bot, they got pretty reasonable performance gains, partly because they had a larger minority class.
However, the idea is: you create a bot, you target a few users, you use machine learning to improve your chances, and then you target the wider audience. That's kind of how we think this could be employed. We're not suggesting people do that, but marketeers have a vested interest in looking into that sort of behavior.
And marketeers, if you ever go to a marketing or digital media conference, are hell-bent on trying to improve engagement and understand click-through rates and all that sort of stuff. They're light years ahead in some respects. Propagandists, of course, could use this to find people who are more likely to respond and interact with them, and maybe to propagate false messages.
And social engineers: Randall mentioned that Weka's got a command-line interface. Well, you can build that model into Maltego, for example, so that you take a bunch of Twitter users and get back the likelihood, shown in a different color here, of whether somebody's going to respond. So you can find potentially the most gullible people in your organization
and reduce your scope appropriately. The other thing, which Yazan Boshmaf mentions, is that this could be used for usable security: helping people understand if they're more susceptible or more at risk from a social bot. We see that with browsers, and now the challenge is, how do you do that in a social network environment?
So you probably want to avoid this fellow, though. Now, training. I mentioned training: focus on the people who are perhaps most at risk. So maybe it's your sales team, who are naturally more extroverted. Then, in terms of future research here,
we'd likely focus on more detailed areas of the Big Five, specifically extraversion, which has many facets. I'm particularly thinking of impulsivity, which seems to be related to people responding to phishing messages. So you could use something called the Cognitive Reflection Test (CRT) to see if that has an impact. Here's an example of one of the questions
in the Cognitive Reflection Test: a bat and a ball cost $1.10 in total; the bat costs $1 more than the ball. How much does the ball cost? Getting that wrong is an indicator of impulsivity. There are three questions in the CRT.
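For the record, the worked answer (the impulsive answer is 10 cents; the correct one is 5 cents):

```python
# bat + ball = 1.10 and bat = ball + 1.00, so 2 * ball + 1.00 = 1.10.
ball = (1.10 - 1.00) / 2   # the ball costs 5 cents
bat = ball + 1.00          # the bat costs $1.05

# Both original conditions hold (to floating-point tolerance):
assert abs((bat + ball) - 1.10) < 1e-9
assert abs((bat - ball) - 1.00) < 1e-9
```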
So also, maybe a target-centric approach for the social bot: ours was a one-size-fits-all, pretty dumb bot, but you could actually start looking at the language of the core group that you're following and do some work in that area. But it's not all negative, necessarily. There's a tall German gentleman by the name of Lutz Finger
who does a lot of work in this field, and you can take a look at some of his videos from the Strata conference. He mentioned that OkCupid had a problem with bots, so they created their own bots, and when they identified a bot, they put it into a replica of the dating site created entirely for bots,
and had those bots talk to the other bots. So now there's apparently an entire area of OkCupid where you've just got what you could describe as some bot-on-bot loving. I have no idea whether that's true or not, but he states it in his video, and I have no reason to distrust him
because he's taller than I am. So, wrapping up on the last slide, this gets back to Rilof and Kenneth Geers' paper: illustrations from the Turing test suggest that sufficient interactivity with a computer should reveal whether it's human or not. But maybe that's going to stretch out, so that you'll need more and more time
to figure out whether it's human or not. Or maybe you don't care. The key thing here is that I think you can apply machine learning to improve the performance of a social bot. And that's a problem the security community, and folks in general, need to start thinking about tackling, because this sort of behavioral residue on social media
could be flagging users who are more likely to respond, and who therefore perhaps need more awareness and more training. So that, ladies and gentlemen, is the end of our presentation. We'll take questions in the Q&A room, I guess.
Thanks a lot.