
Towards Data Justice: Social Justice in the Era of Datafication


Formal Metadata

Title
Towards Data Justice: Social Justice in the Era of Datafication
Number of Parts
234
License
CC Attribution - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
We are living in a datafied society in which the collection and processing of massive amounts of data is being used for decision-making and governance across more and more areas of social life. How do we address possible harms and challenges for social justice? Are calls for individual privacy and encryption tools sufficient? This talk will propose a broader agenda to both understand and create social justice in the era of datafication.
Transcript: English (auto-generated)
Hello, everyone. My name is Arne Hintz. We work at Cardiff University in the beautiful country of Wales.
We're based at the Cardiff School of Journalism, Media and Cultural Studies. Recently, we've launched a new space for research and action, the Data Justice Lab. What we want to do here today is to talk a bit about why we started to think about data
in these terms, in the context of social justice, and why we think it's important to advance social justice in the era of datafication, but also then to explain a little bit better what we mean by data justice, and maybe at the end we can have a bit of a discussion
about what to do about this. So, to start, why should we think about social justice in the context of datafication? There are two starting points really for this. One are the consequences of datafication. The other are the typical responses to data-based analysis and surveillance.
Those are the two areas that brought us to think about data justice, think about it in these terms, and think about what to do about that. So let's start with datafication. It's a debate that has been had a lot during these days, also at this conference. What we mean by this is the transformation of our lives into data points,
the collection and analysis of health data, of social media communication, of data about our movements in smart homes, in smart cities, about our consumption habits, our networks and friendships, our political preferences, and it's about the way in which
the exploitation of the resource of big data has become a key factor for economic success and for political control, and a mode of governance, and a new mode of decision-making. Just a few examples to highlight a bit what we mean by this.
For example, in predictive policing, data on neighborhood crime rates, on previous crimes of individuals, and so on, is used to predict who might be a future criminal. Police use programs such as PredPol to tell them where a crime is likely to occur and who is likely to be involved.
And these types of programs are also used to collect information on, for example, activists and protesters. We've done research on how police in the UK use social media data for these purposes, for example to categorize activists into threats and non-threats.
Computer programs calculate a risk score of people who've been arrested and thereby the likelihood of them committing future crimes. And this kind of risk score is used in court to set sentences for convicted criminals and decide for how long they should go to jail.
Now, supposedly, this allows for more accurate, more evidence-based decision-making, but that's not always the case, as we can see, for example, in this picture here, that's from an investigation by a journalist at ProPublica, and it's about risk scores used in the US justice system.
Here we see a black woman with a minor offense who is rated a higher risk than the white guy with a serious criminal record. And that's not necessarily so because race is included in the system, but these programs use factors such as employment, living environment,
previous interaction with the police, crime among family and friends, and so on, which can then serve as a proxy for race, for example. We know that there is systematic discrimination against blacks in the US, we know that black people are stopped and searched far more often than whites, they are incarcerated far more often, and of course, if all this feeds into the program,
then this is what we end up with. One of the problems with these systems is that the algorithms are private and it's not possible for either defendants or for the public to see why they got a particular score. Now, the allocation of services such as insurance is also increasingly based on data.
Health insurers are starting to experiment with offering lower rates if customers measure their health with Fitbits and make the data available. Car insurers offer lower rates if we install a box that measures our driving. The most far-reaching project is planned by the Chinese government.
It's a kind of social credit score where every citizen will get a score based on things like criminal records, spending habits, but also social networks, the kind of information that they post, social engagement, and so on. This is then used to decide whether someone gets a loan, a job, better social services,
access to good schools, access to universities, or is allowed to travel. The idea is that this is then rolled out nationwide in 2020, and they call it a program of social governance to predict and prevent risk. We can also call it a program of social control.
Yeah, but this is the most far-reaching experiment, I think, at the moment. In the context of national politics and elections, the use of data has been discussed a lot recently. Companies like Cambridge Analytica offer their services of voter data analysis
to political campaigns so that voters can be influenced with messages that are targeted to particular voter characteristics and particular vulnerabilities. And the role of this in the Trump election and also in the Brexit referendum in the UK last year has been widely debated.
But more broadly, data also transforms classic national citizenship. The NSA can legally conduct surveillance on foreign nationals but not on US citizens. And so to establish whether a piece of online communication belongs to a US citizen or a foreigner, they identified a number of selectors such as phone number, IP address, language,
the degree of interaction with people inside and outside the US, and so on. And so if many of your Facebook friends are believed to be foreigners and you talk on email a lot with somebody or exchange emails a lot with someone believed to be a foreigner, or you talk on the phone, you check international news websites and so on and so on,
then maybe you're perceived as a foreigner according to the data and you can be monitored. John Cheney-Lippold from the University of Michigan has written about this, and he has called it an algorithmic form of citizenship, a data-based version of citizenship. He says, quote,
it functionally abandons citizenship in terms of national identity in order to privilege citizenship in terms of provisional interpretations of data. It sometimes aligns with the nationality, as in your passport, and sometimes it doesn't. All these examples show that we are categorized according to data assemblages
and our rights and obligations are reconfigured according to these classifications. But data-based categorizations don't necessarily correspond, at least not always, with what we experience in our everyday lives, our offline lives. And often we have no way of knowing how we are classified,
for what reason, and what we can do about this. There's a new field of study in academia, the field of critical data studies, that addresses these problems of datafication. Scholars in this field look at, for example, data-based discrimination or questions of accountability. Who is accountable if someone is arrested or sentenced based on a data score?
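The proxy problem in the risk scores discussed above can be sketched in a few lines of Python. This is a deliberately simplified illustration: the feature names, weights, and scale are all invented, and no real risk-assessment tool works exactly like this. The point it demonstrates is only that race never needs to be an input for the output to track race.

```python
# Toy risk score, for illustration only. Race is not an input, but features
# that correlate with race through uneven policing (arrest counts, recorded
# neighborhood crime, arrests among family) can act as proxies for it.
WEIGHTS = {
    "prior_arrests": 0.5,            # inflated for heavily policed groups
    "neighborhood_crime_rate": 0.3,  # reflects where police look for crime
    "unemployed": 0.4,
    "family_arrests": 0.3,
}

def risk_score(person: dict) -> float:
    """Weighted sum of features, rounded and capped to a 0-10 scale."""
    raw = sum(WEIGHTS[k] * person.get(k, 0) for k in WEIGHTS)
    return min(round(raw, 1), 10.0)

# Minor offence, but a heavily policed environment -> higher score.
defendant_a = {"prior_arrests": 4, "neighborhood_crime_rate": 8,
               "unemployed": 1, "family_arrests": 2}
# Serious record, but a lightly policed environment -> lower score.
defendant_b = {"prior_arrests": 2, "neighborhood_crime_rate": 1,
               "unemployed": 0, "family_arrests": 0}

print(risk_score(defendant_a))  # 5.4
print(risk_score(defendant_b))  # 1.3
```

Because the weights live in a privately held program, a defendant sees only the final number, which is exactly the accountability gap described above.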
Anyway, so that's the first starting point for thinking about data justice, the consequences of datafication. The second for us, really, were the responses to the Snowden revelations and our protection from surveillance. The Snowden leaks highlighted the problems of datafication
just as digital tools became entirely integrated into our lives. We're now talking, of course, about the Internet of Things, about smart devices, artificial intelligence, and so on. And so it's, in a way, a truly historic moment in which all this comes together, the complete datafication of our lives, but also, at the same time, increased awareness and problems of this,
awareness of the problems of this through, for example, the Snowden revelations. At Cardiff University, we conducted research on the implications of the Snowden revelations and the responses to them. And, as we probably know, there have mainly been two types of responses. The first is the technical response: encryption and anonymization tools.
That's been quite successful: there is increased awareness of surveillance through these tools, and increased use of them. But still, it's something that not many people are doing, and even civil society organizations aren't doing it much.
And we think a problem with this approach is that technological self-defense is typically an individual act that each individual is responsible for. It puts the onus on me to protect myself. And it doesn't necessarily focus on the broader transformation of society through datafication.
The other approach has been the legal one, court cases and political advocacy, policy advocacy, rather. Court cases against surveillance regimes have been quite successful. British surveillance policy was practically declared illegal through these court cases. Policy advocacy had some mixed responses and outcomes.
The UK Investigatory Powers Act was pushed through despite significant lobbying by digital rights groups. The EU data protection regulation has been a more promising outcome. But especially, again, this response is limited typically to small expert communities, legal experts, digital rights groups,
not necessarily the wider civil society or the broader public. So what about public debate? Well, media reporting has been largely focused on justifying surveillance in the context of the broader discourse on national security. Public knowledge and debate has been rather limited.
There is unease about surveillance, as our research has shown. People are worried, but they largely resign themselves to the fact that data collection remains obscure and seems impossible to avoid. And finally, the main claim by Snowden and the critics
of the surveillance state has been that surveillance by the NSA and others affects everyone and has to be resisted because of that. And yes, that's a very important point, but at the same time, we also know that surveillance affects people with a particular skin colour more than others, with a particular religion more than others.
It affects activists more than passive citizens. It affects poor people more than the rich. It may affect those living in a certain neighbourhood more than living in another neighbourhood, and so on. And so we need to look at classic questions of discrimination and power relations and social justice to really understand the implications of datafication.
And so we believe that the self-protection and digital rights are important, but are not sufficient to address the more fundamental transformation that we're witnessing, the shift in the fabric and organisation of society through datafication, which impacts on people's ability to participate in society
and some people more than others, which leads to a renegotiation of civil liberties and democracy and a power shift between different forces in society. And so it's not merely a question of protecting my messages against state surveillance, for example, or using data ethically,
but we need to understand these broader transformations in which datafication infuses and changes society, governance and power. And so what does that mean and how do we do that? Okay, so we have started to think about this in terms of data justice.
Now the term data justice has also been used to describe the use of data-driven systems in criminal justice. That's not what we mean here when we're talking about data justice. We're talking about the study and practice of datafication from the perspective of social justice. So social justice here highlights precisely these questions
of equality and fairness in how different communities and individual data subjects are implicated in data processes and how they are positioned in society as a result of datafication, as well as questions of who and what it is that drives the infrastructures that shape the way the world is represented and ordered through datafication.
And an emphasis on social justice, I think, also invites us to consider the nature and role we think datafication and technology generally ought to have in society, and what nature and role it shouldn't have. So by data justice, we're predominantly talking about
reframing the debate on data, and shifting the conversation from concerns primarily with individual privacy to the broader questions of power, equality, and fairness that come with it. And as Arne highlighted, it's about highlighting the fact that data processes are uneven.
The idea of mass surveillance suggests that we're all equally implicated in it as individuals; we want a more nuanced understanding than that, and to think of it instead as what the surveillance scholar David Lyon has described as social sorting.
And we need to emphasize that some groups are more surveilled than others, and surveilled for different reasons, and that we're not all implicated in this equally. And linked to that, to highlight that data processes can discriminate and exclude at every stage of the data process, starting with skewed data sets.
Then in the design of the algorithms themselves: what is actually weighted, what information gets highlighted, and what doesn't. And also in the output: the type of score or profile that gets produced. And also to highlight that data processes create what we might think of
as new stratifications of haves and have-nots, meaning that there is a new power asymmetry between those who have access to the resources to carry out profiling and those who are subjected to those profiles, who are unable even to understand why and how they are profiled the way that they are. And this highlights issues around the types of categories that are used
to classify citizens in different ways, and also, of course, questions of due process and how you can challenge decisions, issues around transparency, and this idea of who actually owns data. So questions of ownership also come up when we start understanding datafication in these terms.
And then also the fact that data processes advance what we might think of as a new politics that's based more and more on prediction and preemption, meaning that we're increasingly governed by what we are predicted to do in the future, what we are profiled as intending to do in the future, rather than what we actually do and how we actually act and who we actually are.
And this has implications of course for our understanding of citizenship, and this distance between our data double and actually who we are and our lived experience becomes incredibly important and very, very political, and for us a key issue of social justice.
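The "data double" logic, where rights follow provisional interpretations of data rather than who we actually are, as in the NSA selector example earlier, can be sketched as a toy confidence score. The selector names, weights, and threshold below are all invented for illustration; this is not how any real system is implemented.

```python
# Toy sketch of 'algorithmic citizenship' (after Cheney-Lippold): treatment
# follows a confidence score computed from behavioural selectors, not the
# passport. All selectors, weights, and the threshold are invented.
SELECTOR_WEIGHTS = {
    "foreign_contacts_ratio": 0.5,   # share of contacts believed foreign
    "reads_international_news": 0.2,
    "non_domestic_ip": 0.3,
}

def foreignness(selectors: dict) -> float:
    """Confidence (0..1) that this data double belongs to a foreigner."""
    return round(sum(SELECTOR_WEIGHTS[k] * selectors.get(k, 0.0)
                     for k in SELECTOR_WEIGHTS), 2)

def may_be_monitored(selectors: dict, threshold: float = 0.5) -> bool:
    # Legal status is assigned to the data double, not the person.
    return foreignness(selectors) > threshold

# A citizen who travels a lot and has many foreign friends scores 'foreign'.
citizen_abroad = {"foreign_contacts_ratio": 0.8,
                  "reads_international_news": 1.0,
                  "non_domestic_ip": 1.0}
print(foreignness(citizen_abroad))       # 0.9
print(may_be_monitored(citizen_abroad))  # True
```

The gap between the score and the person's actual nationality is exactly the distance between the data double and lived experience described above.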
So this is what we're talking about. These are the types of issues we want to highlight when we're reframing the debate in terms of data justice. And there are different approaches to this. One, and this is what we have focused on so far, based on our previous research, is about trying to articulate responses to datafication based on social justice,
so a more systemic and collectivist form of resistance that feeds into a broader social movement than what we might have when we talk about encryption and policy advocacy. So for example, to try and link up concerns between different communities.
We spoke, for example, with a community activist in Bristol who deals with questions around fair housing and feels that surveillance and questions around data aren't really anything to do with them; they're concerned with other things. So there is a kind of outsourcing of this issue to technology activists or digital rights groups. And we feel that one response, by reframing it in terms of data justice,
is to try and overcome this disconnect that we have in civil society between technology activists on the one hand and social justice activists on the other and actually highlight how datafication also comes to play a key part in the types of social justice agendas that, for example, community activists might pursue.
The second approach to data justice has been particularly in development, where there is an emphasis on trying to develop certain principles that can also underpin a sort of data ethics framework. Linnet Taylor at Tilburg University, for example,
has worked on questions around visibility and representation and anti-discrimination. So trying to come up with new principles that can underpin how we should pursue data processes, the handling of the data process, as well as the uses of data. And then there is an approach around data justice
where it's more about examining datafication from the perspective of marginalized communities, highlighting points of discrimination and exclusion. So, for example, Virginia Eubanks of New America has been studying how poor communities in the US engage with data-driven systems, looking, for example, at how decisions about who should receive benefits
are made through data-driven systems and how people also understand and perceive those decisions. So also to try and understand what are the interests at play, who gets rewarded in these systems and who gets punished when you take the perspective of already marginalized communities or disenfranchised communities.
Then we can think of data justice also about applying existing social and economic rights framework that we actually have in place, for example, around anti-discrimination or migrants' rights or rights at work, so labor law, for example, and apply these to datafication,
or at least to examine how datafication either infringes upon or enables our ability to enjoy the economic and social rights that we have. And then there is, of course, also the approach to data justice which is about developing alternative data systems, based on architectures that actually consider social justice
in their very design. So practicing computer science in a way that makes politics explicit. Someone like Jeffrey Johnson from Utah Valley University, for example, talks about this in terms of information justice. But there's also a Design Justice Network, and forms of data activism that are about designing data infrastructures
that actually consider questions of discrimination and inequality in the design, by working with communities who might already be subject to them. So, for example, building things like cooperative alternatives to various forms of businesses. There is the EcoMondo Cleaning Cooperative, for example,
where it's about designing platforms together with the cleaners in the Bronx that allow them to be in charge of decisions made through that platform about their work: what information should and shouldn't be there, and what type of data should and shouldn't be collected. So how can we advance data justice?
Well, one way, we think, is obviously to continue to do research into how these data processes actually work, because we have found a huge lack of public understanding around this. We also work partly in a journalism school, and we find that there is a huge lack of journalistic skill in investigating some of these processes to try and hold these algorithms and data-driven decision-making to account.
So we also think we have to advance data literacy amongst journalists, and also among practitioners like lawyers. And we need to broaden the stakeholders that are involved, to actually connect concerns, so for example to include anti-discrimination groups
and other social justice activists in the debate, and bring in historical perspectives on how inequality and unfairness happen in society when we talk about data. And we think it can be advanced, of course, as mentioned, through further policy developments, either by highlighting certain principles that should underpin data processes
but also applying rights frameworks we actually have in place already to some of these issues. And then to further collectivist design where we actually bring together social justice activists and developers when we think about alternative data infrastructures and to continue to question what interests are at play,
what forces actually drive these datafication processes and also highlight the politics of that. So that's what we want to do with our new initiative that we've set up at Cardiff University called the Data Justice Lab which is this collaborative space for research and practice on the relationship between datafication and social justice.
We had a public launch in March. The lab is directed by myself, Arne Hintz, and Joanna Redden. And it's also about trying to advance a European perspective on some of these debates, because they have tended to be very US-centric, particularly around this issue of data and discrimination.
So to try and also get some context within Europe on this and some European frameworks in place. And the types of projects we're developing are things like the use of big data in governance and for social policy, the development of data scores, issues around datafication in health, issues around the impact on particular communities like refugees
who are implicated in data systems, and the development of alternative forms of smart cities that may consider social justice. So, to sum up what we want to do with data justice and the Data Justice Lab: to reframe the debates to understand data as a social and economic justice issue,
we think we need to think through collectivist responses to datafication that go beyond individual privacy and what we might think of as techno-legal solutionism. We think we have to overcome this disconnect in civil society between technology activists and social justice activists and connect concerns. And from this, it's about nurturing alternative political imaginaries
for what the deal on data actually should be. So thinking about how society ought to be organized, and the social organization of technology within that society, beyond the current dominant understandings of datafication that are based around efficiency and objectivity. So, thank you.
I don't think we have time for questions, but we might.