Emotional Trauma, Machine Learning and the Internet

Video in TIB AV-Portal: Emotional Trauma, Machine Learning and the Internet

Formal Metadata

Emotional Trauma, Machine Learning and the Internet
Title of Series
CC Attribution - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
This is a talk on machine learning, emotional data, and how design affects behavior, specifically around online harassment. Can design and data affect behavior, and mitigate online harassment? The talk will cover two topics: the possibility of creating emotional data corpuses in machine learning, and using machine learning along with users in social media platforms to create transparent, open systems that focus on emotions and conversations.
Hello everyone, I'm
Caroline Sinders, and I'm an online harassment researcher with the Wikimedia Foundation. I'm also an Open Lab fellow with BuzzFeed and Eyebeam. For the past two years I've been studying online harassment inside of social networks. For the past five years I've been studying
online protest, as well as human behavior and the kind of data we create inside of social networks. This talk is about emotional labor plus machine learning. I spend a lot of time studying people on the Internet, and you should laugh at my jokes, because they're great. But one thing I've been thinking a
lot about is the kinds of content we create inside of social networks, the kinds of language and conversations we have. Because it exists online, because it exists inside of technology, any kind of conversation we have is actually data. Our interactions, be they emotional, professional, flirtatious, angry, sad or mean, all exist as a form of data. About two years ago I was working at IBM Watson as a design researcher on chatbot software, and it started me thinking about how much of our language is actually
networked, and how much of it actually exists as data once it's put inside of technology. Very basically, this is how a computer understands language. This is how an algorithm or chatbot software parses out a conversation you're having. It's incredibly literal. In this example, it's a "remind me" request. You actually need to define what "remind me" is: someone has to program it to say that the word, the letters R-E-M-I-N-D, equals "do this task on a calendar". It doesn't actually understand your language; it's been pointed to a series of commands inside of programming. Or even like
this, for example: "How's the weather today?" A weather API plus your location can return things that are denoted as weather, be it a hailstorm, windy, overcast, etcetera, and the chatbot can reply with "Hello, it's" something. But someone has to define this language, someone has to define what weather is, and those definitions can carry any kind of bias, especially when it's something that's not literal, like the various conversations we have. People define through code what a system is, and we live inside of those systems. How much knowledge or agency do we have inside of these systems? This is one of the
problems with machine learning, I'd say with unsupervised machine learning. Unsupervised machine learning is a series of algorithms that work autonomously through code; there isn't a lot of human intervention. Supervised machine learning has much more human interaction, where a researcher can help guide the parameters inside of an algorithm. This is an example that I read about in the Guardian about a year ago, by Leigh Alexander. When you Google the words "unprofessional hair", this is what popped up a year ago, and as you can see it's mainly people of color's hair
styles. On the left is what you get when you Google "professional hair", and you can see it's nearly all white people. I don't think any engineers set
out to make this algorithm racist. They were using pre-existing datasets and images that were already marked professional or non-professional, but that carried the inherent bias we have in our society into a space where we can't actually intervene algorithmically with these results, so there's nothing we can do to stop that. Latanya Sweeney's work on biased algorithms in 2013 was pretty instrumental in pointing out how unintended biases can pop up, specifically, again, in Google. Professor Sweeney is a professor at Harvard, where she works in the Data Privacy Lab. She started googling black-sounding names and saw that they triggered arrest-related ads. She even Googled her own name and the same thing happened: "Latanya Sweeney? Arrested?" Ads
popped up that were about bail bonds, which is the way you're bailed out when you're arrested in the United States. Professor Sweeney has no criminal record, but these ads imply otherwise. Earlier today, in a talk with Jillian York and Matt Stender, they spoke specifically about this kind of algorithmic bias inside facial recognition software, and there is this really great Kate Crawford quote that I'd love to repeat back to you. Kate Crawford is part of an initiative in the United States called AI Now, a nonprofit designed to look at the bias that exists as we create more and more products using machine learning. "Algorithms are being fed certain images, often chosen by engineers, and the system builds a model of the world based on those images. If a system is fed on photos of people who are overwhelmingly white, it will have a harder time recognizing nonwhite faces." So what does that mean? What kind of biased data is
being fed to all the systems that exist inside of our lives? Who made the data? Where did it come from? How big is the dataset? How old is it? How many different genders are represented? And where can we access the data to even begin to look at it and fact-check it? The use of this biased data is already intervening in our daily lives and creating harmful, erroneous results, and it's very literal. This is an example: Faception is an Israeli algorithmic face detection company whose software is being used to start detecting different kinds of faces. The problem with this, other than the surveillance aspect, which is major and awful, is how wrong this kind of data can be. Who's training it? Again, what does the dataset look like? Who is determining whose faces these are? And when there is the added layer of emotional
analysis, who's deciding what happiness is and what sadness is? This creates a particular kind of feedback loop: an older dataset that you're using to create predictions for new data is very likely biased, and in fact a lot of predictive algorithms are using older data
sets, so you get caught in this feedback loop. Blaise Agüera y Arcas, the head of Google's Machine Learning Group, actually wrote this on Medium: "Predictive policing, listed as one of Time magazine's 50 best inventions of 2011, is an early example of such a feedback loop. The idea is to use machine learning to allocate police resources to
likely crime spots. Believing in machine learning's objectivity, several US states implemented this policing approach. However, many noted that the system was learning from previous data. If police were patrolling black neighborhoods more than white neighborhoods, this would lead to more arrests of black people; the system then learns that arrests are more likely in black neighborhoods, leading to reinforcement of the original human bias." So what does it mean that we're using this older data, and that this older data hasn't been course-corrected? How many different products already exist with that problem? The work I'm doing is thinking about how you can use machine learning to start looking at online harassment, but also thinking about what is harassing and who is determining that. This is actually a really funny example. This is one
of my favorite words. It's deeply offensive in America,
but it's a term of endearment in the UK, and as a poorly behaved American I love this word so much. It's also an example of a smart idea gone awry, because it represents a problem in technology called the Scunthorpe problem. The Scunthorpe problem occurs when a spam filter or search engine blocks e-mails or search results because their text contains a certain kind of word. Hotmail implemented this in its systems, and people in Scunthorpe, England
could not register their e-mail if it included the word "Scunthorpe", because it contains that word.
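A minimal sketch of the Scunthorpe problem described above, in Python. The function names and the block list are hypothetical, the harmless word "hell" stands in for an obscene one, and this is not a reconstruction of any real provider's filter. It contrasts substring matching, which produces the false positive, with whole-word matching.

```python
import re

def substring_filter(text, blocked_terms):
    """Flag text if a blocked term occurs anywhere, even inside a longer word."""
    lowered = text.lower()
    return any(term in lowered for term in blocked_terms)

def whole_word_filter(text, blocked_terms):
    """Flag text only if a blocked term appears as a complete word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return any(term in words for term in blocked_terms)

# Hypothetical block list (lowercase); "hell" stands in for an obscene word.
BLOCKED = {"hell"}

# A place name that merely contains the blocked letters is flagged here...
print(substring_filter("Greetings from Shellharbour", BLOCKED))
# ...but not when only complete words are compared.
print(whole_word_filter("Greetings from Shellharbour", BLOCKED))
# The blocked word on its own is still caught.
print(whole_word_filter("go to hell", BLOCKED))
```

Whole-word matching avoids this particular false positive, but it is still a literal rule: it cannot tell an insult from a term of endearment, which is the kind of cultural context the talk goes on to discuss.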
And this is something where design failed spectacularly, and failed in what could be an awful, unintended way. This is one of the causal effects to be considered. Word blocking is often the thing people lean on very quickly when dealing with online harassment. Think about if users could have blocked the word "Gamergate", for example. What would their experiences have been like? Would we have had such a big problem with the online harassment campaign Gamergate? The better
example of why to think about the Scunthorpe problem now is when a company implements this kind of word filtering on your behalf. Hotmail implemented it by removing that word, but no one decided as an individual user or consumer that that's the word they wanted
to filter out, and so we end up with problems like this. So how do we think about the causal effects these kinds of decisions are having
for us? I'm a major advocate of user agency. That's why I work for the Wikimedia Foundation, an open-source company where we actually co-design with the editors to think about different problems that exist inside of our platform. What would it look like if users had more agency inside of these spaces? And from there, can machine learning be used along with co-designing with different kinds of users? As a designer I think about this a lot. How do we design transparently with algorithms? Can that exist, and what would it look like? Can we be transparent about what kinds of algorithms we're using, but also about the decisions we're making? How useful is that in a community,
a large, expansive community across different cultures and languages, where we're trying to set standards? How can you do that with respect? And this is a
really important thing to consider, especially as we put more and more of our data and our time into these really large social networks, which are private companies. They're essentially becoming the commons. That's where we have discussions, it's where we talk about everything. People who have never met before fall in love inside of social networks. Social media is not just
social media, it's a communication tool. What happens inside these spaces is very opaque. It's very private, it's not very public, particularly with harassment. And harassment can be very nebulous: it can be literal, it can be contextual, it can be cultural. These are hard things to teach an algorithm. How do you teach context? How do you teach culture? It's much easier with something literal like word blocking, and it's even better if users can implement that kind of blocking on their own. Social media is a
mixed emotional identity space. One thing I've noticed in the past couple of years is that we've actually moved closer to understanding what harassment is across all different kinds of networks and all different kinds of groups. Take the word "doxxing", for example. Doxxing is much more commonplace now. It's the release of private documents, but
even better, doxxing is slowly being folded into codes of conduct and terms of service agreements on different platforms. Doxxing still occurs, though: it's not enough to fold it into your rules, you have to actually enforce them. And I think it's important to think about how we're moving closer to a more general understanding of what harassment is. So how would machine learning work into this? Doxxing could be just a series of numbers. If you release someone's e-mail or someone's phone number, that's a set number. You could have an algorithm run through and check for those numbers, or look for something that says "phone
number" plus a string of numbers following it. This is something I actually walk a few social media companies through. The idea is that it's supposed to represent a decontextualized
space, which actually doesn't exist on social networks. There's no decontextualized, neutral space. Every space has bias in it. It
carries all the prejudices we have from real life into the space. The reason I like to walk people through
this is that it's more of a design exercise to think about harassment, and how you could use machine learning and design to mitigate harassment. So this is just a regular, decontextualized photo. The user asks to remove the tag. That's great, we can already do that; we can untag. But what if they ask to remove the photo? You go through this series of questions. It uses a series of steps that could exist as code or as a series of design choices,
and be radio dials or whatnot inside an interface. So, why is it uncomfortable? I noticed when I was doing my two years of research into Gamergate that the majority of Gamergate victims would start off talking about the harassment by saying "I feel uncomfortable", meaning this is a good dog whistle. Why do you feel uncomfortable? How expansive is the word uncomfortable? What are all the different meanings of the word uncomfortable? It's not just discomfort; it can mean a variety of different things: "I don't like the photo",
"I look unattractive", "I'm afraid I'll lose my job", "I'm afraid it will upset my family", "I'm afraid it will upset my peers", or "I feel unsafe". Now, this doesn't actually require an algorithmic intervention, because the user is doing the tagging. What's important is what
happens after a user files a report. So we start thinking about ways in which machine learning could actually be used for moderators and for filing harassment reports. What I'm suggesting is an algorithmic intervention that uses supervised machine learning, that uses the knowledge researchers have, with a better framework
using machine learning to isolate new trends that can exist inside of harassment reports. Your users will tell you what's happening and what's wrong on your platform, especially if it's a daily part of their lives. Gamergate victims would file multiple, multiple, multiple reports with the exact word "Gamergate" inside of them. If a supervised machine learning program had been run on these reports, it would have noticed a new word appearing, the word "Gamergate". It would also have noticed the increase in reports happening over that period. That should have triggered any kind of system to say: we have a new trend appearing, and it's not a good one. And this example I'm showing is different ways to think about how these different emotions could be marked in a much
more literal or categorical way. The first, "I don't like my photo": perhaps that is considered annoying content. You just remove it, and you file the report to a section called "annoying". But "I'm afraid I'll lose my job" or "I'm afraid it will upset my peers": maybe that is marked more as abuse. You can then rate the level of abuse and put it on a scale. From there you can think
about which moderator it goes to, and put in an estimated wait time for a response, so the user is actually getting some feedback on the report they filed. The majority of harassment victims don't receive any kind of response on spaces like Twitter when they file a harassment claim. The last one, "I feel unsafe": you could follow up on that as something that is viewed as a dangerous situation. But even more important is that moderators could change
what's marked as abuse, the level of abuse, etcetera, in real time by rereading reports they got, retagging them, and flagging new abusive words they haven't seen, with the system doing this as well. If you think about the fact that a moderator could retag, what that means is
that they're re-teaching the system. The system is learning from the actual input. The system is not doing anything autonomously; it's learning directly from the moderator. Now, this idea I'm proposing would
only work if you have trained moderators. Places like Facebook actually push the majority of content moderation to places like the Philippines. Those moderators often are not trained in the deep cultural context of what they're looking at, nor given any kind of emotional support. The idea behind this is that maybe we could create a way to start better understanding emotional trauma and emotional data inside of these systems, while also letting trained moderators, researchers and ethnographers determine what is happening inside of the space, allowing for more human
intervention as well as a more human response. What this talk is really
about is how machine learning can be seen as a collaborative tool for humans. There are
things machines do really, really well: they probably process data and images faster than humans can. But there are things that humans do extraordinarily well. We're good at sussing things out, we're good at understanding context, we're good at asking questions, we're good at following up, and we're good at not being literal unless we have to be literal. So as a machine learning
designer and researcher, I want to think about what the future of machine learning can be if it's viewed as a collaborative tool, treated as an extension of myself, not viewed as an
autonomous third party, not viewed as an artificial intelligence humanoid, but viewed as an actual tool in my toolkit, where I can also be more involved in the process of training, especially for something like harassment,
when Internet language is changing so, so quickly. If we think about what machines do well, if there is a better way to sort and store specific words, to see and analyze rising words or images and actually compare those to new images arising, maybe we would actually be able to study
meme culture as well as harassment in a much better and more nuanced way, as opposed to waiting for victims to talk very publicly about the kinds of problems they're having on these networks. How do you mitigate and how do you lessen harm to victims? And how do you also make it easier and safer for moderators to look at this harmful material? Slang and harassment vernacular change quickly and often. I mentioned earlier that I was a BuzzFeed and Eyebeam Open Lab fellow. What
it means is that for the past six months I've been studying the rise of the alt-right in the United States, as well as the rise of populism. I've been looking at how the Trump presidency has changed the digital landscape inside of spaces like 4chan and Reddit, as well as how it has changed harassment culture to become a lot more political and a lot more specific. Now, this is where ethnography is really important, but also where what I'm describing comes directly into play. Right before the alt-right subreddit was removed from Reddit, this is what the page looked like. What you see here is specific language that deals with the alt-right, where they're promoting the space as a space of white nationalism. What that means is
that it's an inherently violent space to be in, and it's a political science term that I hadn't seen appear in a space like Reddit before. In Gamergate, they would talk about ethics, etcetera; they would talk about fighting for their own identity as gamers. They didn't use classical political speech in the way that they described themselves. What I've noticed is that it's become a lot more specific, and people aren't afraid to actually talk about the fact that they're white
nationalists, or that they support the idea of a white state and a white identity. Now, this is sort of terrifying, I think. As a researcher, this is also where there's
a silver lining: the last day it was active I was on it, so I scraped all of the
alt-right subreddit as well as The_Donald, and I started looking at word frequency: how often words occur, and how often specific words appeared inside of this subreddit versus other subreddits. This is The_Donald, so you'll see "garbage" and "trans"; "Hillary" is down there, obviously, along with other white supremacist terms, etcetera.
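The word-frequency comparison described above can be sketched roughly as follows. This is an illustrative sketch only, not the speaker's actual pipeline: the function names, the simple tokenizer and the add-one smoothing are all assumptions, and the scraped posts are represented as plain lists of strings.

```python
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")

def word_counts(posts):
    """Count how often each lowercase word appears across a list of posts."""
    counts = Counter()
    for post in posts:
        counts.update(TOKEN.findall(post.lower()))
    return counts

def distinctive_words(target_posts, other_posts, top_n=5):
    """Rank words by how much more often they appear in the target corpus.

    Add-one smoothing on the comparison corpus keeps words that never
    appear there from causing a division by zero.
    """
    target = word_counts(target_posts)
    other = word_counts(other_posts)
    target_total = sum(target.values()) or 1
    other_total = sum(other.values()) or 1
    scores = {
        word: (count / target_total) / ((other[word] + 1) / (other_total + 1))
        for word, count in target.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy corpora standing in for two scraped communities.
community_a = ["alpha beta beta", "beta gamma"]
community_b = ["alpha gamma gamma", "alpha delta"]
print(distinctive_words(community_a, community_b, top_n=1))
```

A ratio like this only surfaces candidate terms. Deciding what a term means, and whether it belongs in a hate speech dictionary, is still the contextual, human tagging work the talk describes.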
And with all this, I started reading all these different blogs and finding insight into different spaces. What I found was that there is a rise in support of particular slang terms. So if we go back for a second, and
if you see this here, they created a guide to
all the different kinds of terms that they use. This is a great guide for indoctrinating new members into the alt-right. So I took a variety of guides I found from the Daily Stormer, which is a
neo-Nazi website that's also linked to from the alt-right subreddit, as well as neoreactionary sites, sites on the rise of new conservative politics, and various alt-right sites. What I created is what I call a hate speech dictionary. This is a small snippet of it. What I'm doing is tagging all the different words that I
see: whether it's a slur about a person, a slang term, or a space or location. Then I'm also tagging whether the word is general alt-right, white nationalist, white supremacist, or neo-Nazi. I'll be
working directly with the Southern Poverty Law Center on determining which words fall into which category. This is actually being used with ProPublica, a journalism site in the United States, for a lot of the big data analysis and scraping that they're doing. So why would I do this, other than subjecting myself to looking at horrifying parts of the Internet? You can laugh if you want to. It's because this knowledge doesn't exist. Before, any kind of machine learning system wouldn't be able to recognize any of these terms as hate
speech, because it doesn't exist inside of a spreadsheet that you can feed to a system designed to analyze hate speech. This takes researchers, it takes contextual learning, and it takes a fair amount of teaching. But now that I've compiled this hate speech database, I can actually use machine learning to help me look for new terms. I couldn't use machine learning first, because it didn't know what it was looking for. But because these terms appear with such frequency next to other words, I can now compare and look for new and emerging words as they appear inside of these different spaces, these different blogs and these different social networks. I can now see any new slang that comes up, and I can hand that off to journalists and researchers. This is an example from Google's Perspective API. Now, I actually like Perspective. The reason I bring this up is that, according to the Anti-Defamation League, this is the most popular white supremacist term or phrase in the world, and as you can see it's only rated 37% toxic. Perspective is a new API that was built to look at toxicity in language. It's not designed to work autonomously, and we can see that in this case, if it were working autonomously, it would fail in a very large and very real way. But Perspective is designed to work with moderators inside the New York Times to help better partition the comments they're getting, so they can go through them faster and more easily and determine which comments are suitable to exist on the Times. It's not perfect, but it is designed for a very specific use case, and that use case is to teach it over time. This is more Perspective. The
bottom one says "have a good day" in Farsi and "I love you" in Chinese. But again, it's a way to start helping partition the different kinds of workload, the volume of load that the moderator is getting. It is not perfect, and they will keep saying that, but even with its incorrect results and false flags,
a human can quickly spot these errors and still work in a faster way, as opposed to working without any kind of tool or extension. [unintelligible] This is
where technology can alleviate a specific kind of work that we're having. It can alleviate the different kinds of problems that exist inside of moderating, especially as the world is in
such tumultuous times. What does it mean to exist on the internet now, with the rise of populism? As an American, I think about this a lot with the rise of the alt-right. As someone who works in online
harassment, I'm curious as to how many conversations I'll have to look at that will be shrouded in political disagreement but are actually conversations really rooted in racism. How do you determine and set standards for that? How do you also prep moderators for what they're about to see? How do you prep a moderator to see any kind of Nazi insignia? How do you make sure that people's workflows are not overrun with this kind of heavy emotional lifting? There are things we can do with technology to
help bear the weight of that, to help bear the load of the emotional labor that moderators face. [unintelligible] to train these systems, but it can help partition the way that we look at these spaces, given how fast different algorithms can parse information. And
the thing about this is: how do we create more spaces that have user-generated definitions of safety? I don't know what Twitter thinks serves as "good quality" when they enacted, across all of our different accounts, the quality filter. I don't know if Twitter is on the same page with me when I think about
"good", because Donald Trump still has a Twitter account. I don't know in what ways they even consider safety, or what ways they consider of high
importance, inside spaces like that. And so, as a researcher who is really rooted in user agency, I wonder: how do we create spaces that have user-generated definitions of safety? How large can that
be? How large can we scale these different kinds of spaces? Thank you.
Thank you. I think we have plenty of time for Q&A. Are you prepared for Q&A? Yes. So, as you know, we have two microphones in the room. Yes, we have one question over there, and the second question over there. OK, the microphone this way, please. Introduce yourself first, and then the question.
Hi, my name is [name unclear],
and I have basically two questions. The first is: you mentioned that you want to generate user-defined spaces where people are protected from hate speech and so on. But I wonder if you run into the problem that you
create a filter bubble: you keep in it those people who are attached to a right-wing ideology in politics, and you can't convince them of the opposite anymore. That's the first question. The second one
is: I was in the US last year and met the author of a book called Weapons of Math Destruction, and the basic theory was that
algorithms used to
predict, for example, policing, or used for decisions of this kind, have to be open source in general, so that we create a kind of ethic of how to create and how to program algorithms. Do you have a judgment on this?
Well, I'm super pro open source, given that Wikipedia is an open-source tool and platform. But I also agree: if there's any way to create agency around algorithms, we should be able to see the way that they're written, but also the data that they're fed. Even before an algorithm goes inside a social network, it has to be written, it has to be created, it has to be trained on models, right? Where are those models? Who trained it? How big was the dataset? How old is the dataset? We need access to all those things. I very much agree with that. Secondly, I'm not advocating for companies to implement any kind of word blocking. The research I was doing was actually to see how you start to think about hate speech in a new way, beyond sentiment analysis. Classically there are five different kinds of sentiment in sentiment analysis, but I don't think hate speech would necessarily fit into anger plus disgust and maybe a little bit of joy. It fits into a different space. Anger, disgust, joy, sadness, and I always forget the last one: those are the categories of classic sentiment analysis, right? But there is so much more to human emotion than disgust, anger, joy, sadness, et cetera, and so on. And your question was about filter bubbles. I think we're already existing in filter bubbles. I think filter bubbles are a problem, but I think we already exist inside of them. I mean, re:publica is technically a filter bubble, if you think about it. You chose to come to this conference for a specific reason: it's in a specific geolocation and it has specific topics that you like. Perhaps your work let you get off to come to this conference for the day. But it's about technology, it's about communities and collaboration, and some people actually like those topics. So how would you reach them? I think it would be really hard to do that. But also, people are allowed to decide and design the spaces they exist in, and I wonder if you think that's a type of filter bubble. Oftentimes it can be safer to be in a filter bubble, especially if you're a person of color
or from a marginalized group. I don't know if I would want
a series of Trump supporters jumping in on my Facebook on certain conversations. That's why I have really high privacy settings, right? I think the problem
of filter bubbles is when we don't get to decide what's in our bubble, when it's algorithmically decided for us, which is something that Facebook does. We also already live in a bubble because we're friending people that we know, and that we probably know somewhat well: we've met, or we have mutual friends. And I think people forget that our actual IRL social networks, our family networks, our work networks, are already technically filter bubbles.
So, the
next question. Where was the next question? Great, thank you.
Hi, I'm [name unclear], and I'm a filmmaker. I'd like to ask, I think, two very general questions
concerning the outlook of your research. When you think about our societies, and the impact that machine learning might have from a political perspective: in your research, are you more concerned with what is actually happening? Because I have the impression that the people who actually program the examples you showed probably have an idea of society, and that idea has a perspective. Can you speak to that?
Right. I mean, I'll try to. And
I think a big problem is when things are
designed with too few use cases or too few personas. What I like to do when I design is to think about what's the worst thing that could happen, what's the worst that can go wrong. So if I were to come up with a product idea that's maybe about helping people work out better, I like to go through all the ways it can be used in the worst possible way, think about the causal effects of what I'm designing, and ask how I avoid implementing that. I don't know how often algorithms go through that kind of iteration, intense QA and intense testing. Part of the problem is that we're having to work with so few really large datasets. Another problem is that a lot of experimental algorithms are made inside of university labs, and they only have access to certain kinds of data, so they're having to use datasets that are really old or not quite large enough. But they're also not designing this stuff for public consumption. They're designing it as a sort of test of an idea, and from there it can be folded into public consumption. I think part of the answer is actually looking at companies that are implementing machine learning into products and asking them to spend a lot of time on the product. The problem with that is that modern capitalism doesn't actually set up any kind of design firm, small company or bigger agency to spend that much time on developing a product that's using machine learning. That's really the main problem. It's also about thinking about data, and consensual data, as a currency. Consensual in the sense of: can users opt in, so they're choosing to give up their data at an earlier part of the process, and it can be trained based off of that? A big thing is that there's just not enough data, but also not enough data that people willingly give up, and not enough data being used in a smart way.
Next
question. OK, so first here, and then there.
Thank you. I'm [name unclear] from [affiliation unclear]. I understand that the context for
your research is the use of algorithms for moderation, so to help people moderate content. Can you also think of different ways this kind of machine learning will help us understand what is happening online and how to react to that? That's one part of the question. And more: I was concerned by the example of the algorithm learning
what is annoying to me and then just censoring that content in the future. I use "censoring" because that's how I see the risk of this: that the algorithm
knows, on the basis of one case where I clicked "I don't like it, this type of thing is annoying", and that might become a problem for myself in the future. Maybe I didn't get the example, but if you could elaborate on that, that would be great. Thanks.
So, I
mean, I didn't necessarily intend for "annoying" to be the basis of, like,
a machine learning section. It's more so in the sense that if you look at any kind of harassment filing, "annoying" is often listed as a thing, and what that means is that it's a low priority in terms of a harassment report. So if you're marking content as annoying, it's not something that they will actually pay attention to. It'll probably just be removed from your timeline or from your view, but it often does get swept under the rug. And then, I think the second part of the question was asking how this can be used beyond moderation. I think using any kind of research-guided machine learning extension would be fantastic for looking at emerging geopolitical trends inside spaces like Twitter. I'm concerned about Twitter's reliability: will we ever see another Arab Spring in a space like Twitter, or will it be shut down and censored by the government? And how can we watch that and, as an international community, support that kind of process that's happening? So I think a good example is being able to use these things to look at the different kinds of digital protest that are starting to exist. A big fear I have, as someone who works in online harassment but has also studied protests, is: is it a protest campaign or a harassment campaign? If you look at it based off of volume, they look almost identical. So with Gamergate, people thought it was a fandom fracturing. People later thought it was one fandom infighting. Gamergaters see themselves as protesters: they see themselves protesting changing
games, they see themselves protesting a society that wants to change games and take them away from them. For victims of Gamergate, it is what it is: it is a harassment campaign that's
incredibly complex. So if we're using tools to look at just very basic analytics, like, I guess, identifying markers, which would be volume, will we limit the ability of protesters when we're trying to solve harassment? That's why I think context is really important, as well as specific researchers who can work inside the systems and alongside the systems. And there are a lot of identifying markers, I think, for harassment campaigns. A big thing you can look at is the interaction history between users: have they ever interacted before? Is one user a brand new user, versus an older user? Twitter did a study and found that most of these accounts are either spam or engaging in harassment. So that's a good example: an account that is really, really young and doesn't have a photo
tied to it could be a bot, or could be engaging in harassment. So a way for people to filter by their own choice, like opting in to filter out new and egg accounts, is a way to cut down on harassment that
isn't algorithmically heavy. It's letting users decide and determine their own privacy settings.
OK, and the next question comes from the audience to the right of me, there, the
second row.
Hi, I'm [name unclear], a design and management student. Thanks for the talk. I was kind of wondering about some of the things that you mentioned. You said that maybe more
care should be taken in the preparation of moderators, so that they're more educated when determining what might be abuse and what not. But you also mentioned that, for instance, studying all of this was kind of scary, which I totally understand. So I'm wondering if you've thought about the potential psychological preparation
for moderators, in addition to the more practical aspects of the preparation, so that they themselves don't just become embittered and sad in the process, and are actually able to do their jobs effectively and still be relatively happy humans.
Totally. I mean, a big thing is looking at what exists inside of the harassment reports and being able to change it up. So, like, being able to mark
something as annoying, for example. If you were looking at something that's rated as highly, highly abusive, being able to switch from "OK, I've looked at highly abusive content all day" to "can I look at annoying content for, hopefully, a week?" as a way of sort of lessening that emotional labor. So one big thing to think about is whether there are ways for people to vary the task, so they aren't having to look at the same level of abusive content every day, like videos of beheadings or child pornography. Images are a lot easier, actually,
to sort of, in a way, I guess, moderate, and they're worse to look at. But if an image has existed before, you can use image hashing and metadata from the image to see if it's a similar or the same image. What's tricky is when it's a brand
new image, so someone still has to look at that, but at least it's cutting down on some of the volume of more abusive content people are looking at. What is harder is words, and being able to rate the words as abusive. That's why I think looking at slang words is a good example. Also, you know, Gamergate is a great example: they used the hashtag with everything, so any abuse
report that has the hashtag in it could go to a special kind of section, as in "this is all the Gamergate stuff". Or, if you go back to the dictionary I was making, being able to look and see that enough of these words exist in it and it's been reported as harassment and abuse: this is probably something talking about white nationalism.
Great.
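The image-hashing idea from this answer (checking whether a reported image has already been seen, so a moderator does not have to look at it again) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the speaker's tooling: the names `average_hash`, `hamming_distance` and `is_known_image` are hypothetical, the images are assumed to be already reduced to tiny grayscale grids, and the "average hash" shown is only one simple kind of perceptual hash. Production systems typically decode and downscale real image files first and use more robust hashes.

```python
from typing import Iterable, List

Grid = List[List[int]]  # hypothetical input: grayscale pixel values, 0-255


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_known_image(pixels: Grid, known_hashes: Iterable[int], max_distance: int = 1) -> bool:
    """True if the image's hash is within max_distance bits of a stored hash."""
    candidate = average_hash(pixels)
    return any(hamming_distance(candidate, known) <= max_distance for known in known_hashes)


# Illustrative 2x2 grids (assumed data, not real images).
reported = [[10, 200], [220, 30]]       # previously reported image
recompressed = [[12, 198], [225, 28]]   # same picture with slightly altered pixels
different = [[200, 10], [30, 220]]      # unrelated picture

known = {average_hash(reported)}
print(is_known_image(recompressed, known))  # matches the stored hash
print(is_known_image(different, known))     # does not match
```

The design point is the one made in the talk: a match lets the system route or suppress a repeat upload automatically, while a brand new image produces no match and still needs a human to look at it.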
Yeah, do we have more questions? Yes, here, the
second row. Here we have a microphone, if you want to use the microphone.
Hi, thank you for your talk. I
was wondering: you mentioned that tools are available to moderators, but as the general public, if you want to interact with the alt-right, or a troll, or perhaps with these groups, and join in a civil dialogue, it emerges that your Facebook or Twitter profile can easily be flooded with backlash from trying to interact with the group. Do you think your research also provides some tools to maybe moderate that, to support interacting safely with these groups while still maintaining your own profile and personal identity online?
I think part of the problem
is that spaces like Facebook and Twitter view the product as a very closed-off product. A space like Wikipedia views
Wikipedia as an open platform that you can write your own templates to exist on, so that's much more of a protocol, in the sense that you can make changes to your personal account and change the way that you interact with it. I think a massive problem is how productized things are. We live in a time of seamless product design, and also the people that work for these companies are hard to find, right? To figure out who they are; they're impossible to contact. These spaces exist as sort of nebulous entities. I think that this could be used to help people perhaps interact, or mitigate the interactions they have. I don't actually interact with the alt-right at all. I did interact with Gamergate early on, when I was studying it, and it did have negative consequences for me. So I think this is kind of a really tricky space, one that I study from afar but also study by looking at the kinds of language they generate online and on the internet. It's important to sort of talk about that, but they are putting out content from this hyper-specific Western perspective, inside of spaces that talk a lot about irony and intentionality. So I wouldn't say that, with the database I showed or any kind of machine learning algorithm, you could say this person is x percent white supremacist. I don't think you can generate those kinds of conclusions from data, nor should you. That goes a lot more into thoughtcrime and policing territory than I'm comfortable with, and it also creates another layer of surveillance that I don't think we need. I do think it is interesting for us to have conversations with the public about what we consider hate speech and what is considered acceptable inside of these spaces. I mean, I think the code of conduct often gets overlooked. So something like doxxing, for example: two years ago that wasn't considered harassment. Now it is widely implemented on
different social networks, and even implemented on the site where doxxing occurs, which is Pastebin, but at least that is a more normalized term. And so thinking about what we are normalizing in these spaces is really
important. I'm not suggesting that we remove white supremacy terms from the internet. I do think it's important for us now, with the rise of populism even in Europe, to talk about what kinds of systems really exist and what is considered acceptable. One point: most social networks are American-based companies, and that also means they're implementing American policy in spaces that are not America. America has a very tricky relationship to the First Amendment, which is the freedom of speech amendment. A lot of companies err on this very libertarian side, where anything can be said on the social network and they should just let anything go. I don't necessarily think that they should create a lot of policy around what can be said. I do think they need to have internal and external dialogue around what kind of speech they are generating, whether people are planning offline attacks, and also what the effects of language are. It's one thing to say you can't talk
about rape on this site. It's another thing to say you can't make rape threats. Those are not the same thing, and I think that's where things get super, super tricky. And that's really the understanding of freedom of speech: often
it's really nebulous, and it becomes an argument about censorship. I don't think social networks need to outlaw the word "rape". I do think they should necessarily have rules that say "don't make rape threats".
One more question? No? Looking...
so thank you you want there's no
thing as a rule if way in