Inside the Fake Like Factories
Formal Metadata
Number of Parts | 254
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/53127 (DOI)
Transcript: English (auto-generated)
00:22
factories. I'm going to date myself. I remember it was the Congress around 1990 or 1991 or so where I was sitting together with some people who came over to the States to visit the CCC Congress and we were kind of
00:40
riffing on how great the internet is going to make the world, you know, how it's going to bring world peace and truth will rule and everything like that. Boy were we naive. Boy were we totally wrong. And today I'm going to be schooled in how wrong I actually was because we have Svea, Dennis, and Philip
01:03
to tell us all about the fake-like factories around the world. And with that, could you please help me in welcoming them onto the stage: Svea, Dennis, and Philip. Thank you very much.
01:25
Yeah, welcome to our talk, "Inside the Fake Like Factories". So my name is Philip. I'm an internet activist against disinformation and I'm also a student at the University of Bamberg. Hi, thank you that you
01:42
listen to us tonight. My name is Svea. I'm an investigative journalist, freelancing mostly for NDR and ARD, the public broadcasters in Germany. I focus on tech issues, and I had the pleasure of working with these two guys on what was, for me, a journalistic project and, for them, a scientific project. Yeah, hi
02:03
everyone. My name is Dennis. I'm a PhD student at Ruhr University Bochum. I'm working as a research assistant at the Chair for Systems Security, and my research focuses on network security topics and internet measurements, and as
02:20
Svea said, Philip and I are here for the scientific part and Svea is here for the journalistic part. So here's our outline for today. First I'm going to briefly talk about our motivation for our descent into the fake-like factories, and then we are going to show you how we got our hands
02:44
on 90,000 fake-like campaigns of a major crowdworking platform, and we are also going to show you why we think that there are 10 billion registered Facebook users today. So first I'm going to talk about the like button. The
03:02
like button is the ultimate indicator of popularity on social media. It shows you how trustworthy someone is. It shows how popular someone is. It is an indicator of the economic success of brands, and it also
03:22
influences the Facebook algorithm. And as we are going to show now, these kinds of likes can be easily forged and manipulated. But the problem is that many users will still prefer this bad info on Facebook about the popularity of
03:41
a project to no info at all and so this is a real problem and there's no real solution to this. So first we are going to talk about the factories and the workers in the fake-like factories. That there are fake likes and
04:01
you can buy likes everywhere. It's well known. So if you Google buying fake likes or even fake comments for Instagram or for Facebook, then you will get like hundreds of results and you can buy them very cheap and very expensive. It doesn't matter. You can buy them from every country. But
04:22
when you think of these bought likes, then you may think of this. So you may think of somebody sitting in China, Pakistan or India, and you think of computers and machines doing all this, and that they are fake, and also
04:41
that they can easily be detected and that maybe they are not a big problem. But it's not always like this. It can also be like this. So I want you to meet Maria, whom I met in Berlin, and Harald. He lives near
05:00
Mönchengladbach. So Maria, she is a retiree. She is a former police officer, and as money is always short, she is clicking Facebook likes for money. She earns between two cents and six cents per like. And Harald, he was a baker once. He is now getting social
05:25
aid and he's also clicking and liking and commenting the whole day. We met them during our research project and did some interviews about their likes. And one platform they are clicking and working for is PaidLikes. It's only one
05:44
platform out of a universe, out of a cosmos. PaidLikes is based just a couple of minutes from here, in Magdeburg, and they offer that you can earn money by liking on different platforms. And it looks like this. When you log into the platform with your Facebook account, then you get
06:03
in the morning, in the afternoon, in the evening, what we call campaigns. These are Facebook fan pages or Instagram pages or posts or comments, and you can work your way through them and click them. You see the blue bar here: I blurred them because we don't want to
06:23
get sued by all these companies, which you can see there. To take you a little bit with me on the journey: Harald was okay with us coming by for television, and we did a long interview with him, and I
06:43
want to show you a very small piece out of his daily life, sitting there doing the household, the washing and the cleaning and clicking.
07:26
You click and you earn some money. How did we meet him and all the others? Because Philip and Dennis, of course, have a more scientific approach,
07:42
so it was also important not only to talk to one or two, but to talk to many. So we created a Facebook fan page, which we called "A line under a line", because I thought, okay, nobody would like this freely. And then we did a post, this post, and we bought likes, and you won't
08:01
believe it, it worked so well: 222 people, all the people I paid for, liked this, and then we wrote to all of them. And we talked to many of them, some of them only in writing, some of them we just called or had a
08:22
phone chat, but they gave us a lot of information about their life as a click worker, which I will sum up. So PaidLikes itself says that they have 30,000 registered users, and it's really interesting because you might think that they are all registered with like 10 or 15
08:41
accounts, but most of them, they are not. They are clicking with their real account, which makes it really hard to detect them. So they even scan their ID so that the company knows that they are real, then they earn their money, and we met men, women, stay-at-home
09:05
moms, low-income earners, retirees, people who are getting social care, so basically anybody, so there was no kind of bias, and many of them are clicking for two or more platforms. I didn't
09:24
meet anybody who is only clicking for one platform. They all have a variety of platforms where they are writing comments or clicking likes, and you can make, this is what they told us, between 15 euros and 450 euros monthly if you are a so-called power clicker, and
09:42
you do this in a somewhat professional way. But these are only the workers, and maybe you are more interested in who the buyers are, who benefits. Yeah, let's come to step two: who benefits from the campaigns. So I think you all remember this page. This is a screen. If you
10:04
log into PaidLikes, you see the campaigns you have to click in order to get a little bit of money. And by luck we noticed that if you hover over a link, you see in the bottom-left
10:25
corner of the browser a URL redirecting to the campaign you have to click. And you see that every campaign uses a unique ID; it is just a simple integer. And the good
10:42
thing is, it is just incremented. So now maybe some of you guys notice what we can do with that. It is really easy, with this constructed URL, to implement a crawler for data gathering. And our crawler simply requested all campaign
11:04
IDs between zero and 90,000. Maybe some of you ask why 90,000. As I already said, we also registered as click workers, and we saw that the highest campaign ID in use was about 88,000. So we thought, okay, 90,000 is a good value.
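The enumeration described here can be sketched roughly as follows. This is a minimal sketch, not the speakers' actual crawler: the URL template `CAMPAIGN_URL`, the function names, and the injected `resolve` callable are all assumptions made for illustration, and `fake_resolve` is a stand-in so the example runs without network access.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical URL template: the talk does not give the real redirect URL.
CAMPAIGN_URL = "https://crowdworking.example/redirect?campaign={cid}"


def crawl_campaigns(
    ids: Iterable[int],
    resolve: Callable[[str], Optional[str]],
) -> Dict[int, str]:
    """Map each campaign ID to the URL its redirect resolves to.

    `resolve` takes a campaign URL and returns the final target URL,
    or None if the ID does not resolve. Only the target URL is kept,
    never the content of the target page.
    """
    campaigns: Dict[int, str] = {}
    for cid in ids:
        target = resolve(CAMPAIGN_URL.format(cid=cid))
        if target is not None:
            campaigns[cid] = target
    return campaigns


def fake_resolve(url: str) -> Optional[str]:
    """Stand-in resolver for demonstration: only even IDs resolve."""
    cid = int(url.rsplit("=", 1)[1])
    return f"https://www.facebook.com/page{cid}" if cid % 2 == 0 else None


if __name__ == "__main__":
    print(crawl_campaigns(range(0, 6), fake_resolve))
```

In a real run, `resolve` would be an HTTP client call that follows redirects and returns the final URL; passing it in as a parameter keeps the enumeration logic separate from the network code.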
11:25
And for each of these 90,000 requests we checked whether it got resolved or not. And if it got resolved, the redirected URL represents the source that should be liked or followed. And we did not save the page
11:41
sources from the resolved URLs. We only saved the resolved URLs in a list of campaigns. And this list was then the basis for our further analysis. And here you see our list. Yes. This was the point when Dennis and Philip, when they
12:04
came to us and said, hey, we have a list, so what can you find? And of course, AfD was one of the first search queries. And, yeah, of course, AfD is also in that list. Maybe not so surprisingly for some. And
12:25
when you look, it is AfD, and the fan page. And we asked AfD, did you buy likes? And they said, oh, we don't know how we got on that list. However, we do not rule
12:44
out an anonymous donation. But now you would think, okay, they found AfD. This is to be expected. But all political parties, mostly with local and regional entities, showed up on that list. So, we have CDU, CSU, we have
13:06
AfD. But don't think that Angela Merkel or some very big Facebook fan pages showed up. No, no. Very small entities with a couple of hundred or maybe 10,000 or 15,000
13:24
followers. And I think this makes perfect sense, because somebody who already has very many fans probably would not buy them at PaidLikes. And we asked many of them. And mostly they could not explain it. They would never
13:47
do something like that. They were completely stumped. But you have to keep in mind that we only saw the campaigns, the Facebook fan pages. We could not see who bought the likes. And as you can imagine, everybody could
14:03
have done it. Like the mother, the brother, the fan, you know, the dog. So, we would have needed a lot of luck to call somebody out of the blue and have them say, oh, yes, I did this. And there were some politicians who
14:21
admitted it. And one of them did it publicly and gave us an interview. She is a regional politician. And it was after the election, and she was not very
14:41
happy with her fan page. That is what she told us. She was very unhappy. And she wanted, you know, to push herself and to boost it a little bit and get more friends and followers and reach. And then she bought 500 followers. And then we had a nice interview with her about that. I'll show you a small piece.
15:27
That is the theme of media competence, I think. I think that the people who are leading can do that. So, it is not a question of who likes.
15:40
No, but it is a question of who likes. Who can do that? When we are at the table saying that this is the mass or that the mass for us is the [...]
Okay, so you see, the answers are pretty interesting, and I think she was courageous to speak
16:01
out to us. Many others did too, but only on the phone, and they didn't want to go on record. But she's not the only one who answered like this, because of course, if you call through a list of potential fake-like buyers, they answer, like, no, it's not a scam. And I also think, from a legal point of view, it's very hard to show that this is fraud
16:26
and a scam. It's more an ethical problem that you can see here: it is manipulative if you buy likes. We also found a guy from the FDP in the Bundestag, but he ran away and didn't want to get interviewed,
16:44
so I can't show him. He bought, or probably bought; he appeared about 40 times in our list for various Facebook posts and videos, and also for his Instagram account, but we could not get him on record.
17:04
So what did others say? We of course confronted Facebook, Instagram and YouTube with this small research, and they said, no, we don't want fake likes on our platform. PaidLikes has been active since 2012, you know, so they waited seven years, but after our report, at least Facebook
17:25
temporarily blocked PaidLikes. And of course we also asked PaidLikes in Magdeburg; we spoke and wrote with them, and they said of course it's not a scam, because the click workers, they are freely clicking on pages, so, yeah, kind of nobody cares,
17:43
but PaidLikes is only the tip of the iceberg. So we also wanted to dive a little bit into this fake-like universe outside of PaidLikes, and to see what else is out there, and so we did an analysis of account
18:06
creation on Facebook. So what Facebook is saying about account creation is that they are very effective against fake accounts, so they say they remove billions of accounts each year, and that most
18:23
of these accounts never reach any real users, and they remove them before they get reported. So what Facebook basically wants to tell you is that they have it under control. However there are a number of reports that suggest otherwise.
18:42
For example, recently the NATO StratCom task force released a report where they actually bought 54,000 social media interactions for just 300 euros. So this is a very low price, and I think you wouldn't expect such a low price if it were hard
19:05
to get that many interactions. They bought 3,500 comments, 25,000 likes, 20,000 views, and 5,100 followers, everything for just 300 euros.
19:22
So you know the thing they have in common? They are cheap, the fake likes and the fake interactions. There was also another report from Vice Germany recently, and they reported on some interesting facts about automated fake accounts.
19:44
They reported on findings that suggest that people actually use hacked Internet of Things devices to create these fake accounts and to manage them. So it's actually kind of interesting to think about it this way, to say, okay, maybe next
20:05
election your fridge is actually going to support the other candidate on Facebook, and so we also wanted to look into this, and we wanted to go a step further and to look at who these people are, who are they, and what are they doing on Facebook, and
20:27
so we actually examined the profiles of purchased likes. For this, we created four comments under arbitrary posts, and then we bought likes for these comments, and then we examined the resulting profiles of the fake likes.
20:45
So it was pretty cheap to buy these likes. Comment likes are always a little bit more expensive than other likes. We found all these offerings on Google, and we paid with PayPal.
21:01
So we actually used a pretty neat trick to estimate the age of these fake accounts. So as you can see here, the Facebook user ID is incremented. So Facebook started in 2009 to use incremented Facebook IDs, and they used this pattern
21:25
of 1,000 and then the incremented number, and as you can see, in 2009, this incremented number was very close to zero, and then today it is close to 40 billion, and in this
21:44
time period, you can fit a line rather well through all these points, and you can see that the account IDs
22:00
are in fact incremented over time. So we can use this fact in reverse to estimate the creation date of an account where we know the Facebook ID, and that's exactly what we did with these fake likes. So we estimated the account creation dates, and as you can see, we get kind of different
22:22
results from different services. For example, PaidLikes had rather old accounts, so this means they use very authentic accounts, and we already know that because we talked to them. So these are very authentic accounts.
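The reverse lookup from ID to creation date can be sketched as a linear interpolation. This is a simplified sketch under stated assumptions: it uses only two anchor points (counter near zero in 2009, near 40 billion at the time of the talk) rather than a line fitted through many observed accounts, and the exact anchor dates are guesses, not values from the talk.

```python
from datetime import date, timedelta

# Assumed anchor points: the incrementing part of the ID was near zero in
# 2009 and near 40 billion at the time of the talk. The exact dates are
# illustrative guesses.
START_DATE, START_ID = date(2009, 1, 1), 0
END_DATE, END_ID = date(2019, 12, 1), 40_000_000_000


def estimate_creation_date(counter: int) -> date:
    """Linearly interpolate a creation date from the incrementing ID part."""
    span_days = (END_DATE - START_DATE).days
    fraction = (counter - START_ID) / (END_ID - START_ID)
    return START_DATE + timedelta(days=round(fraction * span_days))


if __name__ == "__main__":
    for counter in (1_000_000_000, 20_000_000_000, 39_000_000_000):
        print(counter, estimate_creation_date(counter))
```

A straight line is only a rough model; if registrations accelerated over time, a piecewise fit through more known (date, ID) pairs would give better estimates.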
22:41
Also, like-service A over here uses very authentic accounts, but on the other hand, like-service B uses very new accounts, so they were all created in the last three years. So if you look at the accounts and also at these numbers, we think that these accounts were bots, and on service C, it's kind of not clear: are these accounts bots, or are
23:08
these click workers? Maybe it's a mixture of both. We don't know exactly for sure. But this is an interesting metric to measure the age of the accounts to determine if some
23:21
of them might be bots, and that's exactly what we did on this page. So this is actually a page for garden furniture, and we found it in our list that we got from PaidLikes, so obviously they were on this list for bought likes on PaidLikes, and
23:43
they caught our eye because they had one million likes, and that's rather unusual for a shop for garden furniture in Germany. And so we looked at this page further, and we noticed other interesting things.
24:04
For example, their posts get thousands of likes all the time, and that's also kind of unusual for a garden furniture shop, and so we looked into the likes, and as you can see, they all look like they come from Southeast Asia, and they don't look
24:24
very authentic. And we were actually able to estimate the creation dates of these accounts, and we found that most of these accounts that were used for liking these posts on this page were actually created in the last three years.
24:40
So this is a page where everything from the number of people who like the page to the number of people who like the posts is complete fraud. So nothing about this is real, and it's obvious that this can happen on Facebook, and that this is a really, really big problem.
25:04
I mean, this is a shop for garden furniture. Obviously, they probably don't have such huge sums of money, so it was probably very cheap to buy this number of fake accounts, and it's really shocking to see how big
25:23
the scale of these kinds of operations is. And so what we have to say is, okay, when Facebook says they have it under control, we have to doubt that. So now we can look at the bigger picture.
25:43
And what we are going to do here is we are going to use this same graph that we used before to estimate the creation dates, but in a different way. So we can actually see the lowest and the highest points of Facebook IDs in this graph.
26:01
So we know the newest Facebook ID by creating a new account, and we know the lowest ID because it's zero. And then we know that there are 40 billion Facebook IDs. Now, in the next step, we took a sample, a random sample from these 40 billion Facebook IDs,
26:24
and inside of the sample, we checked if these accounts exist, if this ID corresponds to an existing account. And we do that because we obviously cannot check 40 billion accounts, 40 billion IDs, but we can check a small sample of these accounts, of these IDs, and estimate then
26:45
the number of existing accounts on Facebook in total. So for this, we repeatedly accessed the same sample of one million random IDs over the course of one year.
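The extrapolation from a random sample to the whole ID space can be sketched like this. It is a minimal sketch: the function name is made up, the counts in the demo are illustrative inputs rather than the speakers' raw data, and the confidence interval assumes the sample is a simple random sample of the ID space.

```python
import math
from typing import Tuple


def estimate_existing_accounts(
    sample_size: int,
    valid_in_sample: int,
    id_space: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Scale the valid fraction of a random ID sample up to the ID space.

    Returns (estimate, lower, upper), where lower/upper are a normal-
    approximation confidence interval (z = 1.96 is roughly 95 percent).
    """
    p = valid_in_sample / sample_size
    std_err = math.sqrt(p * (1 - p) / sample_size)
    estimate = p * id_space
    return estimate, (p - z * std_err) * id_space, (p + z * std_err) * id_space


if __name__ == "__main__":
    # Illustrative inputs: 1 million sampled IDs out of 40 billion.
    est, low, high = estimate_existing_accounts(1_000_000, 250_000, 40_000_000_000)
    print(f"{est:.3e} accounts, interval {low:.3e} to {high:.3e}")
```

With a sample of one million IDs the sampling error is small relative to the estimate; systematic errors, such as misclassifying an ID as valid or invalid, are likely to matter more.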
27:02
And we also pulled a sample of 10 million random IDs for closer analysis this July. And now Dennis is going to tell you how we did it. Yeah, well, pretty interesting results so far, right? So we again implemented a crawler the second time for gathering public Facebook information,
27:25
public Facebook account data. And yeah, this was not as easy as in the first case. It's not surprising that Facebook is using a lot of measures to block the automated
27:44
crawling of the Facebook page, for example with IP blocking or CAPTCHA solving. But we could pretty easily solve this problem by using the Tor
28:00
anonymity network. So every time our IP got blocked while crawling the data, we just made a new Tor connection and changed the IP. And the same goes for the CAPTCHAs. And with this easy method, we were able to crawl all the public Facebook data.
28:25
And let's have a look at two examples. The first example is facebook.com/4. So a very, very small Facebook ID. In this case, we are redirected, check the response, and find a valid account page.
28:43
Does anyone know which account this is? Number four? Yeah, Mark Zuckerberg, yeah, that's correct. Yeah, this is a public account from Mark Zuckerberg. Number four, as we already saw, the other IDs are really high, but he got the number
29:01
four. The second example was facebook.com/3. In this case, we are not forwarded, and this means that it is an invalid account, and that was really easy to confirm with a quick Google search, and it was a test account
29:21
from the beginning of Facebook. So we did not get redirected, and it's just the login page from Facebook. And with these examples, we did a lot more experiments, and at the end, we were able
29:40
to build this tree. And yeah, this tree represents the high-level approach of our scraper. So in the... What's that? Okay. He's sleeping. Yeah. We still have time, right?
30:00
Wait. It's... Bad time. Okay. Everyone is waking up again. In the first step, we call the domain www.facebook.com/FID. If we get redirected in this case, then we check if the page is an account page.
30:24
If it's an account page, then it's a public account, like the example four, and we were able to save the raw data, the raw HTTP source. If it's not an account page, then everything is okay.
30:42
It is not a public account, and we are not able to save any data. And if we do not get redirected in the first step, then we call the second domain, facebook.com/profile.php?id=FID, with a mobile user agent, and
31:05
if we get redirected, then again, it is a non-public profile, and we cannot save anything. And if we do not get redirected, it is an invalid profile, and it is most often a deleted account.
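The decision tree just described can be written down as a small classification function. This is a sketch of the logic only: the three callables stand in for the actual HTTP checks, and their names, like the `Status` labels, are assumptions made for illustration.

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    PUBLIC = "public account"
    NON_PUBLIC = "non-public account"
    INVALID = "invalid or deleted"


def classify(
    fid: int,
    desktop_redirects: Callable[[int], bool],
    is_account_page: Callable[[int], bool],
    mobile_redirects: Callable[[int], bool],
) -> Status:
    """Classify a Facebook ID following the tree from the talk.

    desktop_redirects: does www.facebook.com/FID redirect?
    is_account_page:   is the redirect target an account page?
    mobile_redirects:  does profile.php?id=FID redirect for a mobile
                       user agent?
    """
    if desktop_redirects(fid):
        # Step 1: a redirect to an account page means a public account.
        return Status.PUBLIC if is_account_page(fid) else Status.NON_PUBLIC
    if mobile_redirects(fid):
        # Step 2: a redirect on the mobile URL means a non-public profile.
        return Status.NON_PUBLIC
    return Status.INVALID


if __name__ == "__main__":
    # Toy stand-ins for the network checks.
    print(classify(4, lambda f: f == 4, lambda f: f == 4, lambda f: False))
    print(classify(3, lambda f: f == 4, lambda f: f == 4, lambda f: False))
```

Only IDs classified as public would have their page source saved; the other two outcomes contribute to the counts but carry no further data.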
31:22
Yeah, that's the high-level overview of our scraper, and Philip will now give some more information on interesting results. So the most interesting result of the scraping of the sample of Facebook IDs was that one
31:41
in four Facebook IDs corresponds to a valid account, and you can do the math. There are 40 billion Facebook IDs, so there must be 10 billion registered users on Facebook, and this means that there are more registered users on Facebook than there are humans on
32:02
Earth. And it's even worse than that, because not everybody on Earth can have a Facebook account: you need a smartphone for that, and many people don't have one. So this is actually a pretty high number, and it's very unexpected.
32:21
So in July 2019, there were more than 10 billion Facebook accounts. We also did another piece of research on the timeframe between October 2018 and today, or this month, and we found that in this timeframe, there were two billion newly registered Facebook accounts.
32:42
So this is like the timeframe of one year, more or less, and in a similar timeframe, the monthly active user base rose by only 187 million. Facebook deleted 150 million older accounts between October 2018 and July 2019, and we
33:04
know that because we pulled the same sample over a longer period of time, and then we watched for accounts that got deleted in the sample, and that enables us to estimate this number of 150 million accounts that got deleted that are basically older than our sample.
33:24
So I made some nice graphs for your viewing pleasure. So again, of the older accounts, just 150 million were deleted since October 2018.
33:41
These are accounts that are older than last year, and Facebook claims that since then about seven billion accounts got deleted from their platform, which is vastly more than these older accounts, and that's why we think that Facebook mostly deleted these newer accounts, and if an account is older than a certain age, then it is very unlikely
34:07
that it gets deleted. Also, I think you can see the scales here. Of course, registered users are not the same thing as active users, but you can still see that there are many more registrations of new users
34:24
than there are new active users during the last year. So what does this all mean? Does it mean that Facebook gets flooded by fake accounts? We don't really know. We only know these numbers.
34:41
What Facebook is telling us is that they only count and publish active users. As I already said, there is a disconnect between the registered users and the active users, and Facebook only reports on the active users. Also, they say that users register accounts, but they don't verify them or they don't
35:06
use them, and that's how this number gets so high. But I think that doesn't really explain these high numbers, because they are orders of magnitude larger than anything that this could cause.
35:25
Also, they say that they regularly delete fake accounts, but we have seen that these are mostly accounts that get deleted directly after their creation, and if they survive long enough, then they are getting through.
35:43
So what does this all mean? Okay, so you got the full load, which I had over two or three months, and for me, or for us, one very big conclusion was that we have some kind
36:03
of broken metric here, that all the likes and all the hearts on Instagram and the followers, that they can so easily be manipulated, and then it's so hard to tell, in some cases, it's so hard to tell if they are real or not real, and this opens the gate for
36:20
manipulation and for, yes, untruthfulness, and for economic losses, if you think as somebody who is investing money, or as an advertiser, for example, and in the very end, it is a case of eroding trust, which means that we cannot trust these numbers
36:42
any more. These numbers, you know, can be manipulated so easily, and why should we trust this, and this has a severe consequence for all the social networks if you are still in them. So, what can be a solution, and, Philip, you thought about that?
37:02
So, basically, we have two problems. One is click workers, and one is fakes. The click workers are basically just hyperactive users, and they are selling their hyperactivity, and so what social networks could do is just make interactions scarce, so just lower the value
37:24
of more interactions. If you are a hyperactive user, then your interactions should count less than the interactions of a less active user. That's kind of solvable, I think. The real problem is the authenticity, so, if you get stopped from posting or liking
37:47
hundreds of pages a day, then maybe you just create multiple accounts and operate them simultaneously, and this can only be solved by authenticity, so this can only be solved if you know that the person who is operating the account is just one person operating
38:06
one account, and this is really hard to do because Facebook doesn't know who is clicking. Is it a bot? Is it a click worker? Is it one click worker for ten accounts? How does this work? So this is really hard for the social media companies to do, and you could say, okay,
38:27
let's send in the passport or something like that to prove authenticity, but that's actually not a good idea because nobody wants to send their passport to Facebook, and so this is really a hard problem that has to be solved if we want to use social media in a meaningful way.
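The first idea above, making interactions scarce so that a hyperactive user's likes count less, could look something like the sketch below. It is one possible weighting scheme chosen for illustration, not something the speakers or any platform are known to implement: every user gets a total budget of 1, split evenly across all of their likes. The function name and the sample data are hypothetical. As the speakers point out, this does nothing against one person running several accounts.

```python
from collections import Counter, defaultdict

def weighted_like_counts(likes):
    """Score pages so that each user contributes a total weight of 1.

    likes: iterable of (user, page) pairs, one pair per like.
    A user with n likes contributes 1/n to each page they liked.
    """
    likes = list(likes)
    per_user = Counter(user for user, _ in likes)
    scores = defaultdict(float)
    for user, page in likes:
        scores[page] += 1.0 / per_user[user]
    return dict(scores)

# Hypothetical data: "a" likes one page, "b" likes four pages.
sample = [("a", "p1"), ("b", "p1"), ("b", "p2"), ("b", "p3"), ("b", "p4")]
print(weighted_like_counts(sample))
```

In this sketch the single like from the less active user "a" outweighs each like from the hyperactive user "b", which is the effect described in the talk.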
38:45
And so this is what companies could do, and now what you could do. Okay, of course you can delete your Facebook account or your Instagram account and stay away from social media, but this is maybe not a solution for all of us, so I think
39:07
be aware, of course, spread the word, tell others, and if you get more intelligence about that, we are really happy to dig deeper into these networks, and we will
39:24
go on investigating, and so last but not least, it's to say thank you to you guys. Thank you very much for listening, and we did not do this alone. We are not three people. There are many more standing behind and doing this beautiful research,
39:46
and we are opening now for questions, please. Yes, please thank Svea, Phil, and Dennis again. We have microphones out here in the room,
40:06
about nine of them, actually. If you line up behind them to ask a question, remember that a question is a sentence with a question mark behind it, and I think I see somebody at number three, so let's start with that.
40:21
Hi, I just have a little question. Wouldn't a dislike button, the concept of a dislike button, wouldn't that be a solution too for all the problems? So, we thought about recommending that Facebook ditches the like button altogether. I think that would be a better solution than a dislike button, because a dislike button could also be
40:44
manipulated, and it would be even worse because you could actually manipulate the network into down-ranking posts, or kind of not showing posts to somebody, and that I think would be even worse. I imagine what dictators would do with that, and so I think the best option would be
41:06
to actually not show like counts any more, and to actually make people not invest into these counts if they become meaningless.
41:22
I think I see a microphone 7 over up there. Hello, so one question I had is, you assigned creation dates to IDs, how did you do this?
41:44
So, we actually knew the creation date of some accounts, and then we kind of interpolated between the creation dates and the IDs, so you see this black line there, that's actually our
42:01
interpolation, and with this black line we can then estimate the creation dates for IDs that we do not yet know, because it kind of fills in the gaps. Follow-up question, do you know why there are some points outside of this graph?
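The interpolation described in this answer can be sketched as simple linear interpolation between accounts whose creation date is known. This is a minimal sketch under assumptions: it presumes that account IDs grow roughly monotonically with time, and the function name and the reference points are hypothetical, not the speakers' actual data or code.

```python
from bisect import bisect_left
from datetime import datetime

def estimate_creation_date(account_id, known):
    """Estimate a creation date by linear interpolation between known points.

    known: list of (account_id, creation_datetime) pairs, sorted by ID.
    Assumes IDs increase over time; IDs outside the known range are rejected.
    """
    ids = [i for i, _ in known]
    if not ids or not ids[0] <= account_id <= ids[-1]:
        raise ValueError("ID outside the range of known accounts")
    pos = bisect_left(ids, account_id)
    if ids[pos] == account_id:
        return known[pos][1]  # exact match, no interpolation needed
    (id_lo, t_lo), (id_hi, t_hi) = known[pos - 1], known[pos]
    fraction = (account_id - id_lo) / (id_hi - id_lo)
    return t_lo + (t_hi - t_lo) * fraction

# Hypothetical reference points for illustration only.
known_points = [
    (1000, datetime(2018, 1, 1)),
    (2000, datetime(2018, 1, 11)),
]
print(estimate_creation_date(1500, known_points))
```

The more accounts with a known creation date there are, the closer the piecewise line follows the real registration curve, which is why it can fill in the gaps for IDs that were never looked up directly.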
42:22
Cool, thank you. So there was a question from the internet. Did you report your findings to Facebook and did they do anything? Because this research is very new, we just recently approached them and
42:41
showed them the research and we got an answer, but I think we also already showed the answer, it was that they only count and publish active users. They didn't want to tell us how many registered users they have, and they say, oh, sometimes users register accounts but don't use them or verify them,
43:04
and that they regularly delete fake accounts. But we hope that we get into a closer discussion with them soon about this. Microphone 2. When hunting down the buyers of the campaigns,
43:21
did you dig out your own campaign line below the line? No, because they stopped scraping in August, and you stopped scraping in August, and then I started, you know, the whole project started with them coming to us with the list,
43:41
and then we thought, oh, this is very interesting, and then the whole journalistic research started. But I think if we would do it again, of course, I think we would find ourselves. We also found, there was another magazine, and they also did a test, a paid-like test a couple of years ago, and we found their campaign.
44:03
We actually did another test, and for the other test, I know that we also got like this ID, I think, and it worked to plug it into this URL, and then we also got redirected to our own page, so that worked, yeah.
44:21
Thank you. Microphone 3. Hi, I'm Farhan, I'm a Pakistani journalist, and first of all I would like to say that you were right when you said that there might be people sitting in Pakistan clicking on the likes, that does happen, but my question would be that Facebook does have its own ad
44:40
program that it aggressively pushes, and in that ad program there are also options whereby people can buy likes and comments and impressions and reactions. Would you also consider those as fake, I mean, they're not fake per se, but they're still bought likes, so what's your
45:04
view on those? Thank you. So when you buy ads on Facebook, then, so what you actually want to have is fans for your page that are actually interested in your page, so that's kind of the
45:20
difference, I think, to the paid likes system where the people themselves, they get paid for liking stuff that they wouldn't normally like, so I think that's the fundamental difference between the two programs, and that's why I think that one is unethical and one is not really that unethical. The very problem is if you buy these click workers, then you have
45:48
many people in your fan page, they are not interested in you, they don't care about you, they don't look at your products, they don't look at your political party, and then often the people, they additionally, they make Facebook ads, and these ads, they are shown again,
46:06
to the click workers, and they don't look at them, so, you know, people are burning money and more money with this whole corrupt system. So, microphone 2. Hi, thanks for the talk, and thanks for the effort of going through all of this project.
46:25
From my understanding, this whole finding basically undermines the trust in Facebook's likes per se in general, so I would expect now the price of likes to drop, and the pay for
46:41
click workers to drop as well. Do you have any metrics on that? The research just went public, I think, one week ago, so what we have seen as an effect is that Facebook, they excluded PaidLikes for a moment, so yes, of course, one platform is down,
47:03
but I think there are so many outside, there are so many, so I think ... I meant the phenomenon of paid likes, not the company itself, like the value of a like as a measure of credibility is declining now, that's my assumption.
47:23
Yes, that's why many people are buying Instagram hearts now, so yes, that's true, the like is not the fancy hot shit anymore, yes, and we also saw in the data that the likes for the fan pages, they rapidly went down, and the likes for the posts and the comments,
47:42
they went up, so I think, yes, there is a shift, and what we also saw in the data was that the Facebook likes went down; from 2016 on, they are rapidly down, and what is growing and rising is YouTube and Instagram. Now everything is about today,
48:00
everything is about Instagram. Thanks, so let's go to number one. Hello, and thank you very much for this fascinating talk, because I've been following this whole topic for a while, and I was wondering if you were looking also into the demographics in terms of age groups and social class, not of the people who are doing the
48:23
actual liking, but actually, you know, buying these likes, because I think that what is changing is an entire social discourse on social capital in the Bourdieu kind of term, because it can now be quantified. As a teacher, I hear of kids who buy likes to be more popular than
48:43
their other schoolmates, so I'm wondering if you're looking into that, because I think that's a fascinating, fascinating area to actually come up with numbers about. It definitely is, and we were also fascinated by this data set of 90,000 data points,
49:00
and what we did, and this was very hard, was that we tried first of all to look at who is buying likes: is this automotive, you know, what kind of industries, who is in that. So this was doable, but to get more into demographics,
49:22
you would have had to crawl, to click every page, and so we did not do this. What we did was, of course, that we were a team of three to ten people, and manually looking into it, and what we of course saw was that on Instagram and on YouTube, you have many of these
49:43
very young people, some of them I actually called them, and they were like, yes, I bought likes, very bad idea, so I think yes, I think there is a demographic shift away from the companies and the automotive and industries buying Facebook fan page likes to Instagram
50:01
and YouTube wannabe influencers. And I have to admit, here we showed you the political side, but we have to admit that the political likes, they were like this small in the numbers, and the very, very vast majority of this data set, it's about wedding planners,
50:26
photography, tattoo studios, and influencers, influencers, influencers, and YouTubers, of course, yes. Thank you so much. So we have a lot of questions in the room. I'm going to get to you as soon as we can, but I would like to go to the internet first. Do you think this will get better or worse if people move to more decentralized platforms?
50:49
If it gets better or worse. Can you repeat that, please? Would this issue get better or worse if people move to more decentralized platforms?
51:01
Decentralized. Decentralized, okay. So, I mean, we can look at this slide, I think, and think about whether decentralized platforms would change any of these two points here, and I fear I don't think so,
51:22
because they cannot solve the interactions problem that people can be hyperactive. Actually, that's kind of a normal thing in social media, that a small portion of social media users is much more active than everybody else. You have that without paying for it, so without even having paid likes, you will have to consider if social media is really
51:44
kind of representative of the society. And the other thing is authenticity, and also in a decentralized platform, you could have multiple accounts run by the same person. So, microphone 7, all the way back there.
52:02
Hi. Do you know if Facebook even removes the likes when they delete fake accounts? We don't know that. No, we don't know. We know they delete fake accounts, but we don't know if they
52:21
also delete the likes. I know from our research that the people we approached, they did not delete the click workers. They kept them. So, microphone 2. Yeah, hi. So, I have a question with respect to this. One out of four Facebook accounts are active in your test. Did you see any difference with respect to the
52:45
age of the accounts, or is it always one out of four through the entire sample, or does it maybe change over the, like, going from a zero ID to, well, 10 billion or 40 billion? So, you're talking about the density of accounts in our IDs, kind of.
53:04
So, there are changes over time, yeah. So, I think now it's less than it was before. So, now there are less than before it was more, and so I think it was, yeah, I don't know. But you didn't see anything specific that now only in the new accounts, only one out of
53:24
10 is active or valid, and before it was one out of two or something like that. It's not that extreme. So, it's less than that. It's kind of. We have to say we did not check this, but there were no special cases. But it changed over time. So, before it was less, and before it was more, and now it is less.
53:46
And so, what we checked was whether an ID actually corresponds to an account, and so this metric, yeah, and it changed a little bit over time, but not much, so. So, number three, please.
54:01
Yeah, thank you for a very interesting talk. At the end, you gave some recommendations, how to fix the metrics, right? And it's always nice to have some metrics, because then, well, we are the people who deal with numbers, so we want the metrics, but I want to raise the issue whether quantitative measure is actually the right thing to do. So, would you buy your furniture
54:22
from store A with 300 likes against store B with 200 likes, or would it not be better to have a more qualitative thing, and to what extent is the quantitative measure maybe also the source of a lot of bad developments we see in social media to begin with? Even not with bot farms
54:42
and anything, but just people who go for the quick like and say, hooray for Trump, and then you get whatever, all the Trumpists liking that, and the others say, fuck Trump, and you get all the non-Trumpists liking that, and you get all the polarisation, right? So, Instagram, I think they just don't display their like equivalent anymore in order to prevent that, so could you maybe
55:04
comment on that? I think this is a good idea to hide the likes, yes, but I, you know, we talked to many click workers and they do a lot of stuff, and what they also do is taking comments and doing copy-paste for comment section or for Amazon reviews, so, you know, I think
55:26
it's really hard to get them out of the system, because maybe if the likes are not shown and when the comments are counting, then you will have people who are copy-pasting comments in the comments section, so I really think that the networks, they really have
55:43
an issue here. So let's try to squeeze in the last three questions now. First, number seven, really quick. Very quick, thank you for the nice insights, and I have a question about the location of the users, so you made your point that you can analyse by the metadata where, when the account
56:03
was made, but how about the location of the followers? Is there any way to analyse that as well? So, we can only analyse that if the users agree to share it publicly, and not all of them do that. I think a name check is often a very good way to check where someone
56:25
is from for these fake likes, for example, but, as I said, it always depends on what the user himself is willing to share. Internet? Isn't this just a Western version of the Chinese social credit system?
56:41
Where do we go from here? What is the future of all this? Yeah, that's the dystopian, right? So, yeah, after this research, you know, for me, I deleted my Facebook account like one or two years ago, so, you know, this did not matter to me so much,
57:04
but I stayed on Instagram, and when I saw all these likes and followers, and also YouTube, all these views, because the click workers, they also watch YouTube videos, they have to stay on them like 40 seconds. It's really funny because they're hearing techno music, rap music,
57:23
all 40 seconds, and then they go on. But, when I sat next to Harald for two or three hours, I was so disillusioned about all the social network things, and I thought, okay, don't count on anything. Just, if you like the content, follow them and look at them,
57:44
but don't believe anything. That was my personal takeaway from this research. So, very last question, microphone 2. A couple of days ago, the Independent reported that Facebook, the Facebook app, was activating the camera when reading a news feed. Could this be used in the
58:08
context of detecting fake accounts? I don't know. So, I think that, in this particular instance, that it was probably a bug, so, I don't know,
58:24
but, I mean, the people who work at Facebook, not all of them are crooks or anything that will deliberately program this kind of stuff, so they said it was kind of a bug from an update that they did, and the question is whether we can actually detect fake accounts with the camera,
58:46
and the problem is that I don't think that current face recognition technology is good enough to detect that you are a unique person, so there are so many people on the planet
59:01
that there is probably another person who has the same face, and I think the new iPhones also have this much more sophisticated version of this technology, and even they say, okay, there is a chance of one in, I don't know, that there is somebody who can unlock your phone.
59:20
So, I think it's really hard to do that with the current technology, to actually prove that somebody is just one person. So, with that, would you please help me thank Svea, Dennis, and Philip. One more time for this fantastic presentation. Very interesting.
59:43
And very, very disturbing. Thank you very much.