Algorithmic science evaluation and power structure: the discourse on strategic citation and 'citation cartels'
Formal Metadata

Title: Algorithmic science evaluation and power structure: the discourse on strategic citation and 'citation cartels'
Number of Parts: 167
License: CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/34986 (DOI)
Transcript (English, auto-generated)
00:02
Please give a warm welcome to Franziska, Teresa and Judith. Judith, you have the stage. Thank you.
00:42
I believe that scientific performance indicators are widely applied to
01:02
inform funding decisions and to determine the availability of career opportunities. Those of you who are working in science or have had a look into the science system might agree with that. We want to understand evaluative bibliometrics as algorithmic science evaluation instruments to
01:20
highlight some things that also occur with other algorithmic instruments of evaluation. We're going to start with a quote from a publication in 2015, which reads:
01:43
As the tyranny of bibliometrics tightens its grip, it is having a disastrous effect on the model of science presented to young researchers. We have already heard Hanno's talk, and he was basically talking about problems in the science system and the reputation assigned by the indicators.
02:09
The question is, is bibliometrics the bad guy here? If you speak of tyranny of bibliometrics, who's the actor doing this? Or are maybe bibliometricians the problem?
02:21
We want to contextualize our talk within the growing movement of reflexive metrics among those doing science studies, social studies of science, scientometrics and bibliometrics. The basic idea is to say: okay, we have to accept accountability.
02:41
If we do bibliometrics and scientific metrics, we have to understand the effects of algorithmic evaluation on science, and we will try not to be the bad guy. The main mediator of science evaluation, as perceived by researchers, is the algorithm.
03:05
I will not hand over the microphone, but I will hand over the talk to Teresa. She's going to talk about the datafication of scientific evaluation. Okay. I hope you can hear me? No? Yes? Okay.
03:25
So when we think about the science system, what do we expect? What can society expect from a scientific system? In general, we would say reliable and truthful knowledge that is scrutinized by the scientific community.
03:42
So where can we find this knowledge? Normally in publications. So with these publications, can we actually say whether science is good or bad, or whether some science is better than other science? In the era of digital publication databases, there are big data sets of publications
04:06
and these are used to evaluate and calculate the quality of scientific output. So in general, with this metadata, we can tell you who's the author of a publication,
04:26
what the home institution of this author is, or which citations appear in the bibliographic information. This is used in the calculation of bibliometric indicators. For example, if you take
04:47
the journal impact factor, which is a citation-based indicator, you can compare different journals and perhaps say which journals are performing better than others, or whether a journal's impact factor has increased or decreased over the years.
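To make that calculation concrete, here is a minimal sketch of the standard two-year impact factor definition in Python. The journal and all its citation counts are hypothetical; real values come from the provider's database and depend on what it indexes.

```python
# Minimal sketch of the two-year journal impact factor (JIF).
# All input numbers are hypothetical.

def impact_factor(citations_by_pub_year, items_by_pub_year, year):
    """JIF(year) = citations received in `year` to items published in the
    two preceding years, divided by the number of citable items published
    in those two years."""
    window = (year - 1, year - 2)
    received = sum(citations_by_pub_year.get(y, 0) for y in window)
    published = sum(items_by_pub_year.get(y, 0) for y in window)
    return received / published if published else 0.0

# Hypothetical journal: citations counted in 2016, keyed by publication year.
citations = {2015: 420, 2014: 380}
items     = {2015: 150, 2014: 130}   # citable items published per year
print(impact_factor(citations, items, 2016))  # -> 800 / 280 = 2.857...
```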
05:09
Another example would be the Hirsch index for individual scientists, which is also widely used when scientists apply for jobs.
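The Hirsch index is simple enough to sketch as well: a researcher has index h if h of their papers have at least h citations each. A minimal version, with made-up citation counts:

```python
# Minimal sketch of the Hirsch index (h-index): the largest h such that
# the researcher has at least h papers with at least h citations each.

def h_index(citation_counts):
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical publication record.
print(h_index([25, 8, 5, 3, 3, 1]))  # -> 3 (three papers with >= 3 citations,
                                     #       but not four with >= 4)
```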
05:21
So they put these numbers in their CVs and supposedly this tells you something about the quality of research those scientists are conducting. So with the availability of the data, we can see an increase in its usage and in a scientific
05:44
environment in which data-driven science is established, decisions regarding hiring or funding rely heavily on these indicators. And there's maybe a naive belief that indicators that are data-driven and rely on
06:10
data collected in a database are a more objective metric that we can use.
06:21
So here's a quote by Rieder and Simon. In this brave new world, trust no longer resides in the integrity of individual truth-tellers or the veracity of prestigious institutions, but is placed in highly formalized procedures enacted through disciplined self-restraint.
06:43
Numbers cease to be supplements. So we see a change from an evaluation system relying on expert knowledge to a system of algorithmic science evaluation. In this change, there's a belief in a depersonalization of the system and a perception of algorithms as the rule of law.
07:12
So when looking at the interaction between the algorithm and scientists, we can tell that this relationship is not as easy as it seems.
07:29
Algorithms are not in fact objective. They carry social meaning and human agency. They are used to construct a reality and algorithms don't come naturally. They don't grow
07:45
on trees, ready to be picked by scientists and people who evaluate the science system. So we have to be reflexive and think about which social meanings the algorithm holds. When there's code that the algorithm uses, there's subjective meaning in this code and there's agency
08:10
in this code, and you can't just say: oh, this is a perfect construction of the reality of scientific systems. So the belief that this tells you something definitive about the quality of research is misleading.
08:28
So think about the example of citation counts: the algorithm reads the bibliographic information of a publication from the database. Scientists cite papers that relate to their studies, but we don't actually know which of these citations are more meaningful than others.
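A minimal sketch of what such counting looks like makes the flattening visible: every reference contributes exactly one to the tally, whatever its role in the citing paper. The records below are hypothetical.

```python
# Plain citation counting over bibliographic metadata: each (citing, cited)
# pair counts as 1, whether the citation is central or a passing mention.

from collections import Counter

references = [
    # hypothetical (citing paper, cited paper) pairs read from the database
    ("paper_A", "paper_X"), ("paper_A", "paper_Y"),
    ("paper_B", "paper_X"), ("paper_C", "paper_X"),
]

citation_counts = Counter(cited for _, cited in references)
print(citation_counts)  # Counter({'paper_X': 3, 'paper_Y': 1})
# Nothing here records *why* paper_X was cited, only that it was.
```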
08:56
So they're not as easily comparable, but the algorithms give you the impression that they are.
09:04
So relevance is not easily put into an algorithm, and there are different types of citations. Scientists also perceive this use of algorithms as a powerful instrument, and so the algorithm has some sway over the scientists
09:25
because they rely so much on those indicators to further their careers, to get a promotion or get funding for their next research projects. So we have a reciprocal relationship between the algorithm and the scientists and this creates a new construction of reality.
09:46
So we can conclude that governance by algorithms leads to behavioral adaptation in scientists, and an example involving the Science Citation Index will be given by Franziska.
10:05
Thanks for the handover. I'm focusing on reputation and authorship, as you can see on the slide, and let me start with a quote by Eugene Garfield, which says:
10:24
Is it reasonable to assume that if I cite a paper, I would probably be interested in those papers which subsequently cite it, as well as my own paper? Indeed, I have observed on several occasions that people prefer to cite the articles I had cited rather than cite me.
10:44
It would seem to me that this is the basis for the building up of the logical network for the citation index service. So this Science Citation Index described here was actually developed mainly to solve the problem of information retrieval.
11:08
Eugene Garfield, the founder of this Science Citation Index, SCI for short, began to note a huge interest in reciprocal publication behavior.
11:21
He recognized the increasing interest in it as a strategic instrument to exploit intellectual property, and indeed the interest in the SCI and its data successively became more relevant within the disciplines and its usage extended. Later, Derek de Solla Price, another social scientist, called for better research on the topic, as it also signalled a crisis in science.
11:54
He stated that if a paper was cited once, it would get cited again and again, so the
12:00
main problem was that the rich would get richer, which is also known as the Matthew effect. Finally, the SCI in its use turned into a system which was, and still is, used as a reciprocal citation system, and became a central and global actor.
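A toy simulation of this cumulative advantage illustrates the point: if each new citation goes to a paper with probability proportional to the citations it already has, citations concentrate on a few papers. The parameters are purely illustrative, not an empirical model of any real citation network.

```python
# Preferential attachment ("the rich get richer"): each new citation picks a
# paper with probability proportional to its current citations, plus 1 so
# uncited papers can still be picked. Illustrative parameters only.

import random

random.seed(42)
papers = [0] * 100                     # citation count per paper
for _ in range(2000):                  # hand out 2000 citations
    weights = [c + 1 for c in papers]
    winner = random.choices(range(len(papers)), weights=weights)[0]
    papers[winner] += 1

papers.sort(reverse=True)
top10_share = sum(papers[:10]) / sum(papers)
print(f"Top 10% of papers hold {top10_share:.0%} of all citations")
```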
12:21
Once a paper was cited, the probability that it was cited again was higher, and it would even extend its own influence on a certain topic within the scientific field. So it shaped which articles were read and which topics people would do research on.
12:46
So this phenomenon gave rise to an instrument for disciplining science and created power structures. Let me show you one example which is closely connected to this phenomenon I just told you about.
13:07
I don't know if here in this room there are any astronomers or physicists. There are a few. That's great actually.
13:21
On the next slide here, we have a table with a time window from 2010 to 2016. Social scientists from Berlin found that co-authorship within the field of physics grew by 58 on a yearly basis in this time window.
13:47
This is already very high, but they also found another, very extreme case: one paper of roughly 7,000 words which listed about 5,000 authors.
14:06
So on average, each scientist or researcher mentioned on this paper contributed 1.1 words.
14:21
Sounds strange, yeah. Of course you have to see this in a certain context, and maybe we can talk about this later on, because it has to do with the ATLAS particle detector, which requires a lot of maintenance.
14:42
But still, the number of authors per paper, and you can see this regardless of which scientific field we are talking about, has generally increased in recent years. So it remains a problem, especially for reputation, that there is such high pressure on today's researchers.
15:13
Still of course we have ethics and research requires standards of responsibility.
15:22
There are several of these, but here on the slide is one example, the Australian Code for the Responsible Conduct of Research, which says: The right to authorship is not tied to a position or profession and does not depend on whether the contribution was paid for or voluntary.
15:41
It is not enough to have provided materials or routine technical support, or to have made the measurements on which the publication is based. Substantial intellectual involvement is required. So this could be one rule to work by.
16:03
And still this problem of reputation remains, and we hand over to Judith again. Thank you. So we are going to speak about strategic citation now. If you frame this point of reputation like that, you may say: the researcher finds something in
16:31
his research and addresses the publication describing it to the community and the scientific community rewards the researcher with reputation.
16:43
Now the algorithm, which is perceived to be a new thing, is mediating the visibility of the researcher's results to the community and is also mediating the rewards, so the career opportunities, the funding decisions and so on.
17:03
And what can happen now is that the researcher addresses his research also to the algorithm: citing those evaluated by the algorithm whom he wants to support, and also strategic keywording and so on.
17:26
And the one new thing that happens might be a perspective on that: the algorithm is addressed as a recipient of scientific publications. And it is difficult to discriminate between so-called invisible colleges and citation cartels.
17:46
What do I mean by that? 'Invisible colleges' is a term for people who cite each other without working together, maybe not even in the same place, perhaps because they do research on the same topic, so it's only plausible that they cite each other.
18:01
And if we look at citation networks and find people citing each other, that does not necessarily have to be something bad. And we also have people who are concerned that there might be citation cartels, so researchers citing each other,
18:20
not because their research topics are closely connected, but to support each other in their career prospects. And people do try to discriminate those invisible colleges from citation cartels ex post, by looking at metadata networks of publications, and find that difficult.
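One simple ex-post heuristic in the spirit of these detection attempts is to flag journal pairs whose mutual citations make up an unusually large share of their citation traffic. This is only a sketch on hypothetical data with an arbitrary threshold, not the procedure any database provider actually uses.

```python
# Flag journal pairs with anomalously reciprocal citation traffic.
# Citation counts and threshold are hypothetical.

from itertools import combinations

# cites[(a, b)] = citations from journal a to journal b
cites = {
    ("J1", "J2"): 180, ("J2", "J1"): 170,   # heavy mutual citation
    ("J1", "J3"): 12,  ("J3", "J1"): 9,
    ("J2", "J3"): 7,   ("J3", "J2"): 10,
}
journals = {j for pair in cites for j in pair}
total_out = {j: sum(n for (a, _), n in cites.items() if a == j) for j in journals}

for a, b in combinations(sorted(journals), 2):
    mutual = cites.get((a, b), 0) + cites.get((b, a), 0)
    share = mutual / (total_out[a] + total_out[b])
    if share > 0.5:                     # arbitrary threshold for the sketch
        print(f"flag: {a} <-> {b}, mutual share {share:.0%}")
```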
18:45
And we have a discourse on that in the bibliometrics community, and I will show you some short quotes of how people talk about those citation cartels.
19:01
So, for example, Davis in 2012 wrote: George Franck warned us of the possibility of citation cartels, groups of editors and journals working together for mutual benefit. We have heard about journal impact factors, so it's believed that editors talk to each
19:22
other: hey, you cite my journal, I cite your journal, and we'll both boost our impact factors. So we have people trying to detect those cartels, and Manjin and I wrote that we have little knowledge about the phenomenon itself and about where to draw the line between acceptable and unacceptable behavior.
19:42
So we are having moral discussions about research ethics and also we find discussions about the fairness of the impact factors. So Yang et al. wrote, disingenuously manipulating impact factor is the significant way to harm the fairness of the impact factor.
20:01
And that's a very interesting thing, I think, because why should an indicator be fair? So to believe that we have a fair measurement of scientific quality, relevance and rigor in one single number, like the journal impact factor, is not a small thing to say.
20:24
And also we have a call for detection and punishment, so Davis also wrote: if disciplinary norms and decorum cannot keep this kind of behavior at bay, the threat of being delisted from the JCR may be necessary. And so we find the moral concerns on right and wrong, we find the
20:42
evocation of the fairness of indicators, and we find the call for detection and punishment. And when I first heard about that phenomenon of citation cartels, which is believed to exist, I had something in mind which sounded familiar to me because we have a similar information
21:04
retrieval discourse about ranking and power in a different area of society: search engine optimization. I found a quote by Page et al., who developed the PageRank algorithm, Google's ranking algorithm, in 1999, which has
21:29
changed a lot since then, but they also wrote a paper about the social implications of information retrieval by those indicators, by PageRank as an indicator, and wrote that these types of personalized PageRanks are virtually
21:46
immune to manipulation by commercial interests, and that, for example, fast updating of documents is a very desirable feature but is abused by people who want to manipulate the results of the search engine. That was important for me to read because here, too, we have a narration of abuse, of manipulation,
22:07
the perception that the indicator might be fair and people try to betray it. And then, in the early 2000s, I recall having a private website with a public guestbook and getting link spam
22:24
from people who wanted to boost their Google PageRanks, and shortly afterwards Google decided to punish link spam in their ranking algorithm, and then I got lots of emails from people saying: please delete my post from your guestbook, because Google's going to punish me for that.
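To make the ranking logic of the quoted 1999 paper concrete, here is a minimal power-iteration PageRank over a tiny hypothetical link graph; Google's production algorithm has long since moved past this, and the guestbook node stands in for the link spam just described.

```python
# Minimal power-iteration PageRank on a tiny hypothetical link graph.

def pagerank(links, damping=0.85, iters=50):
    nodes = sorted({n for src, dsts in links.items() for n in [src, *dsts]})
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, dsts in links.items():
            if dsts:
                for d in dsts:          # share rank among outgoing links
                    new[d] += damping * rank[src] / len(dsts)
            else:                       # dangling page: spread over all nodes
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
        rank = new
    return rank

# A spammed guestbook pointing at page C inflates C's rank.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "guestbook": ["C"]}
ranks = pagerank(links)
print(sorted(ranks, key=ranks.get, reverse=True))  # C ranks first
```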
22:42
And we may say that this search engine optimization discussion is now somehow settled and it's accepted that Google's ranking is useful and they have a secret algorithm, but it works and thus it's widely used. And although the journal impact factor seems to be transparent, it's basically the same thing, that it's accepted
23:11
to be useful and thus it's widely used, the journal impact factor, the SCI and the like. And we have another analogy: Google decides which SEO behavior is regarded as acceptable and punishes those
23:25
who act against the rules, and thus holds an enormous amount of power, which has a lot of implications, and led, for example, to the spread of content management systems with search engine optimization plugins and so on.
23:43
And we also have this power concentration in the hands of Clarivate, formerly Thomson Reuters, who host the database for the journal impact factor. They decide who's going to be indexed in the Journal Citation Reports, and how the algorithm is implemented in detail in their databases.
24:08
So we have this power concentration there too, and I think if we think about this analogy we might come to interesting thoughts.
24:22
So our time is running out, so we're going to give a take-home message. We find that the scientific community reacts with codes of conduct to a problem which is believed to exist, strategic citation. We have database providers who react with sanctions: journals are delisted from the Journal Citation Reports to punish them for citation stacking.
24:52
And we have researchers and publishers who adapt their publication strategies in reaction to this perceived algorithmic power, but if we want
25:07
to understand this as a problem, we must not only react to the algorithm, but we have to address the power structures. So who holds these instruments in their hands, if we talk about bibliometrics as an instrument? And
25:24
we should not only blame the algorithm, so hashtag don't blame the algorithm. Thank you very much. Thank you for shining a light on how science is actually seen in its publications, and as I started off
25:52
as well, it's more about scratching each other's backs a little bit. I have some questions here from the audience. Please. Yes, thank you for this interesting talk. I have a question. You may be familiar with the term measurement
26:07
dysfunction, that if you provide a worker with an incentive to do a good job based on some kind of metric, then the worker will start optimizing for the metric instead of trying to do a good job. And this is kind of inevitable. So don't you see that maybe it could be
26:26
treating the symptoms if we just react with codes of conduct, tweaking algorithms or addressing power structures, when instead we need to remove the incentives that lead to this measurement dysfunction? So I would refer to this phenomenon as perverse learning, learning for
26:47
the grades you get but not for your intrinsic motivation to learn something. We observe that in the science system, but if we only adapt the algorithms, so take away
27:03
the incentives, you wouldn't evaluate research at all, which you probably don't want. But to whom would you address this call, this demand to please not have indicators? So I give the question back to you.
27:37
Okay, questions from the audience out there on the internet, please. Your mic is not working. Okay, then I go to microphone number one, please, sir.
27:48
Yeah, I want to offer a provocative thesis: I think the fundamental problem is not how these things are gamed, but that we think the impact factor is a useful measurement of the quality of science, because I think it's just not.
28:08
Yeah, I would not say that the journal impact factor is a measurement of scientific quality, because no one has a definition of scientific quality.
28:24
What I can observe is only that people believe the journal impact factor to reflect some quality, and maybe they are chasing a ghost. But whether that's a valid measure is not so important to me; even if it were a valid measure, it would concern me how it affects science.
28:53
Okay, question from microphone number three there, please. Thanks for the interesting talk. I have a question about the 5,000 author paper. Was it
29:03
the same paper published 5,000 times, or one paper with a ten-page title page? No, it was one paper of more than 7,000 words, and the authors and co-authors numbered more than 5,000.
29:26
Isn't it obvious that this is a fake? Well, that's what I meant earlier when saying you have to see this within its context. So, physicists are working with this, with ATLAS, this detector system, and as there were
29:51
some physicists in the audience, they probably know how this works; I do not. But as they claim, it's so much work to operate, and as I said, it requires a lot of maintenance, they obviously have...
30:19
So everybody who contributed was listed?
30:21
Exactly, that's it. And if this is ethically correct or not, well, this is something which needs to be discussed, right? This is why we had this talk, as we want to make this transparent and contribute to an open discussion.
30:40
Okay, I'm sorry guys, I have to cut off here because our mission out there in space is coming to an end. I suggest that you guys find each other somewhere, maybe the tea house or... Sure, we're around. You're around. I would love to have a last round of applause for these ladies, who really shone a light on how these algorithms are working. Thank you very much.