AI VILLAGE - The Great Power of AI: Algorithmic Mirrors of Society


Formal Metadata

Title
AI VILLAGE - The Great Power of AI: Algorithmic Mirrors of Society
Title of Series
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2018
Language
English

Content Metadata

Subject Area
Abstract
Following the progress in computing and machine learning algorithms as well as the emergence of big data, artificial intelligence (AI) has become a reality impacting every fabric of our algorithmic society. Despite the explosive growth of machine learning, the common misconception that machines operate on zeros and ones, therefore they should be objective, still holds. But then, why does Google Translate convert these Turkish sentences with gender-neutral pronouns, “O bir doktor. O bir hemşire”, to these English sentences, “He is a doctor. She is a nurse”? As data-driven machine learning brings forth a plethora of challenges, I analyze what could go wrong when algorithms make decisions on behalf of individuals and society if they acquire statistical knowledge of language from historical human data. In this talk, I show how we can repurpose machine learning as a scientific tool to discover facts about artificial and natural intelligence, and assess social constructs. I prove that machines trained on societal linguistic data inevitably inherit the biases of society. To do so, I derive a method that investigates the construct of language models trained on billions of sentences collected from the World Wide Web. I conclude the talk with future directions and open research questions in the field of ethics of machine learning.
Hi everyone, thanks for coming to my talk today. I'll be talking about how we can uncover bias in different machine learning models and what the implications might be. I'm Aylin, as John introduced me, and I just became an assistant professor; today is my second talk as an assistant professor, yesterday was my first one. Let's see, is this working? No?

Yesterday I talked about supervised machine learning being applied to the language of individuals, so that we can identify these individuals based on their linguistic style, and that is a serious privacy concern. But when we look at language at the aggregate level, the language of society, we see that there are fairness problems there, because this linguistic data coming from society also carries the biases of society with it, and that is what we are going to look at now. Yesterday I talked about a method for identifying the style of individuals, who can also be programmers, so that we can do attribution; this has security-enhancing properties, but at the same time it is very privacy-infringing in some real-world cases. Today I'll be talking about a method for quantifying and detecting bias in linguistic data, or in linguistic machine learning models, and for this I basically adapt the implicit association test for humans to machines. Given the language universal that in every language semantics happens in the same context window, the method I came up with can be used in any language.

So under the umbrella of AI, today we'll be looking at natural language processing and machine learning, in particular deep learning and unsupervised learning. In the past there has been some work on supervised machine learning to see where bias might be happening and how it can be removed, but it is slightly more difficult to understand this in unsupervised machine learning models, since they don't have classification outputs and so on. And again, comparing an individual's language with society's language: when we look at an individual's language, the syntax they use is the most identifying thing, even in source code, but today we'll be focusing on semantics, the meaning in language, at the societal level.

Just a brief summary of this work on security, privacy, and machine learning, where I'm using stylometry, which is the study of linguistic style. Since this is about linguistic style, we can look at natural language, and we can also look at artificial languages such as programming languages. Some examples are identifying the authors of English text, or of English as a second language so that we can identify the writer's native language, or taking translated text and identifying the native language or the translator, as well as underground forum text, where underground forum users engage in business transactions. Even though these are very noisy datasets, we can still identify these authors, and sometimes these are suspect sets for intelligence agencies. Because of that, the tools we developed, which are open source on GitHub and any one of you can download and use, are currently being used by the FBI, by expert witnesses as scientific evidence in court, and by European high-tech crime units, so that they can identify suspects online based on their language. But again, I'm reminding you, this is about individuals' language and the privacy implications it has. In artificial languages we focus on programming languages, in particular Python, C, and C++ source code, as well as binaries.
This work is being used by DARPA, and DARPA being part of the Department of Defense, you might imagine why they would be interested in attribution problems: for example, finding the authors of malicious software or, under repressive regimes, the authors of censorship circumvention software.
Expert witnesses, again, can now use this information as scientific evidence in court, and I've been collaborating with the US Army Research Laboratory, who are still working on programmer de-anonymization and programmer attribution.

But today I want to talk more about fairness and language at the societal level, and I'll start with an example. Do any of you know the author Robert Galbraith? There was this crime novel called The Cuckoo's Calling by Robert Galbraith, but some people suspected that it might not have been written by Robert Galbraith but instead by J.K. Rowling, and after performing stylometric analysis it was shown that it was indeed written by the famous J.K. Rowling of the Harry Potter series. After that, J.K. Rowling said that yes, she is the author of this book, but she wanted to use a man's name because it's a crime novel, and the publisher also thought it would sell better if it was published under the name of a man; at the same time, readers wouldn't know it was written by J.K. Rowling, so she would get a more realistic evaluation of her work. So we can see, even with high-profile people, how bias affects society, even for such an important product that they are publishing.

This reminds me of the interplay between privacy and fairness, big data's evil twins, as I call them. With privacy, we have a serious problem when sensitive information is leaked; with fairness, we have a problem when sensitive information or protected attributes are abused. But in the upcoming slides we are also going to see that privacy does not imply fairness.
OK, let's start focusing on natural language processing models: linguistic models and semantic spaces in machine learning. For example, Google recently released its Cloud Natural Language API, and Amazon, Google, and many other companies and researchers are making these tools available, including for commercial purposes. Such tools are being used by developers, researchers, and just random folks, citizens, and so on.
We can also see that whenever we are dealing with a smart or digital application, if there is any text involved, then usually these linguistic machine learning models are being used as well. For example, in web search, when you start typing something and you see suggestions to complete the sentence, that is sequence prediction or text generation, and at the context level they are also looking at linguistic semantic spaces. The same goes for machine translation; for sentiment analysis, especially for market predictions, to see whether a commercial product is perceived as negative or positive, which of course uses linguistic data, with words as tokens appearing in certain context windows; for named entity recognition; and for text generation, as when you receive an automated call on the phone and the text is usually automatically generated.

Why would this be a problem? Let's look at this example. One of my native languages is Turkish, and Turkish is a genderless language: there are no gendered pronouns, there is one pronoun, "o", and it means he, she, or it. So I translate from English to Turkish: "she is a doctor" is translated as "o bir doktor", he, she, or it is a doctor. Taking the Turkish sentence and translating it back to English, it comes out as "he is a doctor." It's not even asking whether it should be he, she, or it. Let's say it's smart enough to understand that this is a human, so it shouldn't be an "it"; it still isn't offering he or she, it is just choosing "he" as the most likely answer. Another example, to see whether this is just one rare exception or whether it happens at a larger scale: "he is a nurse" translated to Turkish becomes "o bir hemşire", he or she is a nurse, and translated back to English it becomes "she is a nurse." You can maybe also see the difference between doctor and nurse, the prestige or the salaries of these professions. And we see a pattern here: he, she, or it is a professor comes back as "he is a professor", while teacher comes back as "she is a teacher." Is this only happening in English? German is also a gendered language, where not only pronouns but other things are gendered too, so it is more gendered than English and of course than Turkish, and again a doctor is translated as male and a nurse as female. Then we have Bulgarian, where almost everything is gendered, even adjectives, and again the doctor is male whereas the nurse is female.
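As a rough illustration of how you might probe this kind of round-trip translation asymmetry yourself, here is a minimal sketch. It assumes a hypothetical translate(text, src, dst) helper wrapping whatever machine translation backend you want to audit; the occupation list and the round-trip logic are just for illustration.

    # Minimal sketch: round-trip gender-neutral Turkish sentences through a translation
    # system and record which English pronoun comes back. `translate` is a hypothetical
    # helper wrapping whatever backend you want to audit.

    OCCUPATIONS = ["doktor", "hemşire", "profesör", "öğretmen"]

    def round_trip_pronoun(occupation, translate):
        turkish = f"O bir {occupation}."           # "o" is the single gender-neutral pronoun
        english = translate(turkish, src="tr", dst="en")
        return english.strip().split()[0].lower()  # expected: "he", "she", or "it"

    def audit(translate):
        for occ in OCCUPATIONS:
            print(f"{occ:>10s} -> {round_trip_pronoun(occ, translate)}")

    # Usage, with whatever backend you want to test:
    # audit(my_translate_function)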
It has been about 62 years since the term artificial intelligence was coined, along with ideas about what could be done with it, and all the doomsday stories about superintelligence, machines taking over, and so on. But people haven't really been thinking about the immediate problems we might have with artificial intelligence.
We know that when garbage goes into machine learning models, what comes out is usually garbage as well, because if that is the quality of your training data, the output will reflect the same quality. One example of how the text collected for generating semantic spaces and linguistic models can go wrong is Microsoft's Tay chatbot, which was taken down the same day it was introduced because it very quickly turned into a very offensive, racist, biased bot, and Microsoft wasn't ready to account for such cases in the linguistic model. This was driven largely by 4chan; it was basically model poisoning, adversarial machine learning, and it worked very quickly. These were some of the tweets the bot started sending, and of course it was taken down, but we can see how easily bias can be embedded with a strong effect size, in this example within a few hours.

How does bias get into these models? Let's take a step-by-step look. We know that humans are biased. This is not necessarily a bad thing, because there are neutral biases, and at times biases are helpful in some conditions; I'm going to give some examples of those as well. When we are biased and we speak, we reflect that bias in our semantics; we have valence, for example we say that the snake is ugly or this butterfly is beautiful, and these are neutral biases, but they are there. As we speak and form language, we tend to produce similar patterns in the same context windows: let's say we have a negative context window and snake tends to appear in that context window; it is a neutral bias, but it is still a negative bias for snakes. That is called distributional meaning: we see in the statistics that certain words end up, for example, in negative context windows. Machine learning models, especially semantic spaces, look at this distribution of meaning and the co-occurrence statistics, and they learn the co-occurrence statistics of certain terms, or certain people's names, and what they are associated with. This is reflected in the machine learning models; the bias is propagated through this entire process, and at times it is even increased and amplified, not just perpetuated.

How can we measure it? Especially with unsupervised learning, we don't really have a classification output to directly control for and measure these things, unless we look at the model at the construct level, like the intelligence of the model, the understanding of the world that the model has. For humans, the implicit association test has been used as a way to measure the implicit biases we might have. Greenwald, from the University of Washington, came up with the implicit association test in 1998. There is a lot of criticism of this method, but at the same time it reveals some patterns about the world and about humans, including subconscious biases we might not even be aware of. The test basically asks you to associate members of certain societal groups, or certain terms, with certain stereotypical words, and measures how fast you make those associations, for example how fast you associate a butterfly with negative or positive words versus how fast you associate a snake with positive or negative words. When you are doing these associations on a computer, you are asked to click right or left to classify a positive term together with butterfly or snake, and there is a difference in reaction time.
This differential reaction time in associating congruent versus incongruent stereotypes gives you the effect size for the implicit bias we might have. This example is from a girl taking the implicit association test for male versus female and science versus arts, and the bias in general is that males are associated with science whereas women are associated with arts. You can go and take this test online at Harvard's website, implicit.harvard.edu/implicit.
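Schematically, and leaving aside the refinements of the official IAT scoring procedure, the effect size is a standardized difference of mean reaction times between the incongruent and congruent pairings, something like

    % Schematic IAT-style effect size (a sketch, not the exact official D-score):
    % mean latency in incongruent blocks minus mean latency in congruent blocks,
    % divided by the standard deviation of latencies pooled over both blocks.
    d = \frac{\bar{t}_{\mathrm{incongruent}} - \bar{t}_{\mathrm{congruent}}}{\sigma_{t}}

where a larger d means a stronger implicit association in the stereotypical direction.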
I'm showing this because we will see a few of the implicit association tests that I listed here. In my experiments I took these previous experiments, which were designed by experts in social psychology, because I'm not an expert in that area at all, and I wanted to see whether we can replicate these biases, versus what happens when I try random things or things that are not biases, so that I'm not just cherry-picking the biases I want to show you. These are the main categories that were there, and I'm using the same ones to see if they are reflected; another point is that this test has been taken by millions of people over decades. You can also take it in German, for example, if you are German, and since the same tests are based on context and words, they can exist in any language, and we can generate linguistic models in any language as well, so we can apply these tests in other languages, for other cultures or countries.

OK, let's look into the details of generating language models, and another example of where the text to generate these models is taken from. Basically, they crawl the web and take all kinds of text, structured and unstructured, including tweets; here we see some tweets from Donald Trump before he became president, and we see a certain pattern in the text, maybe certain biases, and so on. This is blindly fed into, in this case, neural networks, and the neural network looks at the co-occurrence statistics and the pointwise mutual information in the data, and then produces a semantic space. The common example has 300 dimensions: we basically have a dictionary of a language, where each word is represented in numeric vector form with 300 dimensions, and each dimension is a combination of certain contexts. When we look at these words, we see that similar words are projected to nearby points in that space, for example things about positiveness, things about feelings, things about females, and so on, and based on their vicinity we can understand or answer many questions.

The types of semantic spaces I focused on in my study were word2vec, the algorithm, and GloVe, from Stanford researchers. Word2vec was extremely popular when it was introduced, and the models produced by word2vec, which is from Google research, are used by many developers, researchers, app developers, and so on, so a lot of people use these for their applications. GloVe is similar; it was produced by Stanford researchers around the same time, and these two semantic spaces have about the same accuracy after evaluation, even though it is not very clear how to evaluate these methods; it is like an approximation of an evaluation for semantic spaces. Word2vec is based on Google News data, and we would expect Google News data to be more neutral and objective because it is news data, but we see that this is not the case, based on its co-occurrence statistics. The type of data GloVe uses is Common Crawl, about 800 billion tokens from the internet; it is basically a crawl of the World Wide Web, the language of internet users, I would say, in this case, not society in general but the internet-using population.

OK, what can we do with these word embeddings? First of all, we can understand syntax, we can understand the meaning of a word better, we can perform analogies and get answers, or we can look for semantic similarity.
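To make this concrete, here is a small hedged sketch of loading one of these pretrained spaces and querying it, assuming the gensim library and its downloadable word2vec-google-news-300 model; any comparable embedding would work the same way.

    # Minimal sketch, assuming gensim and its pretrained Google News vectors are available.
    import gensim.downloader as api

    # Large download; 300-dimensional vectors trained on Google News text.
    kv = api.load("word2vec-google-news-300")

    # Semantic similarity between two words (cosine similarity of their vectors).
    print(kv.similarity("doctor", "nurse"))

    # Analogy: Rome is to Italy as Paris is to ... ?  (Italy - Rome + Paris)
    print(kv.most_similar(positive=["Italy", "Paris"], negative=["Rome"], topn=1))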
For example, we can ask questions such as: Rome is to Italy as Paris is to what? And it will be able to answer that it's France. So it has some understanding of language, some understanding of semantics, but there is also knowledge, and maybe even statistics, embedded here, and by statistics I don't mean the co-occurrence statistics, I mean statistics about the world. Looking into the details of these vectors, we have each word listed by frequency, and the first word is usually "a" or "the" or just a comma, and then there are 300 features representing each word in the semantic space. In GloVe we have about 2 million tokens, 2 million words, in this dictionary; in a dictionary you would expect far fewer words, but given that this takes all the words above a certain frequency on the internet, we get entries such as Obama or Michael Jackson and so on.

What can we do with vector arithmetic? When we project these vectors to 2D space, we see, for example (I hope the laser is working; no, because I didn't turn it on, OK, now it's working), that on the lower side there is "brother", which is male, and here we see "sister", and now we see the direction of gender. Once we see the direction of gender, maybe we can look at "king" and find what corresponds to the female version of king, which is "queen" here. So basically we can perform vector arithmetic, with cosine similarity most of the time, or by taking the principal components of these vectors, and try to answer the questions we have in syntactic form, analogy form, semantic form, and so on.

How can I use this information to measure bias in machine learning models? The first thing I came up with was the Word Embedding Association Test, in place of the implicit association test. What I do is quantify the implicit, or in this case actually explicit and deterministic, associations between societal categories and evaluative attributes, which are the stereotypes in this case. I take the similarities between the stereotype attribute words and the words of the two societal groups, and look at the difference between the means of these associations, measured in units of their standard deviation, and that gives me the effect size of a certain bias. We can also measure statistical significance by generating a null hypothesis distribution and checking whether the effect size we are getting is significant or not.
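Here is a minimal numpy sketch of that computation as just described. The target sets X, Y and attribute sets A, B are lists of word vectors (for example pulled out of the embedding loaded in the earlier sketch), and the permutation count is an arbitrary choice.

    # Sketch of a WEAT-style effect size and permutation test over word vectors.
    # X, Y: vectors for the two target groups (e.g. flower words, insect words).
    # A, B: vectors for the two attribute sets (e.g. pleasant words, unpleasant words).
    import numpy as np

    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def assoc(w, A, B):
        # How much more strongly w is associated with A than with B.
        return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

    def effect_size(X, Y, A, B):
        sx = [assoc(x, A, B) for x in X]
        sy = [assoc(y, A, B) for y in Y]
        # Difference of means in units of the pooled standard deviation (bounded by ~2).
        return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

    def p_value(X, Y, A, B, n_perm=10000, seed=0):
        # Permutation test: how often does a random split of X and Y look as extreme?
        rng = np.random.default_rng(seed)
        observed = sum(assoc(x, A, B) for x in X) - sum(assoc(y, A, B) for y in Y)
        pooled, n_x, count = list(X) + list(Y), len(X), 0
        for _ in range(n_perm):
            idx = rng.permutation(len(pooled))
            stat = (sum(assoc(pooled[i], A, B) for i in idx[:n_x])
                    - sum(assoc(pooled[i], A, B) for i in idx[n_x:]))
            if stat > observed:
                count += 1
        return count / n_perm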
For this, the first thing I wanted to start with was looking at neutral stereotypes that are universally accepted, as they are called: for example, flowers being considered pleasant and insects being considered unpleasant (for some reason, apparently, most of the population just naturally, intuitively has this stereotype), or musical instruments being considered pleasant whereas weapons are considered unpleasant. Since this is not dangerous or harmful to society, it is considered neutral, and in this case we see that the effect size is around 1.5 for both of them. This is a large effect size: anything above 0.8 is large, and the highest the effect size can get is two, because it is bounded by the standard deviation here. Both results are statistically significant with high effect sizes, and the same will hold for the upcoming examples: they are all statistically significant with high effect sizes.

Let's look at other major implicit association test categories, such as white people's names versus black people's names, and try to understand whether they are considered pleasant or unpleasant; we get the congruent stereotype, that white names are considered pleasant in this case. When we look at differences between genders, we see that males are associated with career and females with family, and again I'm using the exact same words that the implicit association test uses when it asks you to perform the associations. When we look at science versus arts, again males are associated with science and females with arts. I'm not going to talk about stereotype threat here, but at least we can see that bias is certainly perpetuated by these linguistic models, and the two example models I have here are from 2014; we are in 2018, and these are still the state-of-the-art models used by many people, and they are not updated frequently because they are quite large files and require a lot of data and so on.

Let's look at some other implicit association test stereotypes: for example, young people being considered pleasant whereas old people are considered unpleasant, or physical diseases being considered controllable whereas mental diseases are considered uncontrollable, and we see that stigma reflected in these models; or when we look at, for example, heterosexual versus homosexual individuals, or straight versus transgender, we see those attitudes as well. We can also perform this in German, looking at the main categories after generating the linguistic model, and you can also generate your own linguistic models: for example, you can download corpora online, or use Google Ngrams data from different years, decades, countries, languages, and so on, to analyze what might have been going on in those years. When we look at the most recent version of Google Ngrams for German, we see that we can also replicate the stereotype, the prejudice, against Turkish people: Turkish people went to Germany decades ago, millions of them, as immigrants, and there isn't a very positive attitude towards them, and we are able to replicate that from Google Ngrams data as well by performing the WEAT test.

OK, we saw that semantics and bias are embedded in these models. What about empirical information that doesn't depend on a context or a feeling, but is more about statistics of the world? Can we replicate those as well, or can those be a reason for these biases? For example, let's say I would like to know how strongly a certain name is associated with being male or female. Taylor, for example, is an androgynous name, almost 50/50 male and female, and based on this I can perform a similar computation to see how much a single word is associated with a certain stereotypical group.
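A minimal sketch of that per-word association score, again assuming numpy vectors for the word of interest and the two attribute sets; it mirrors the group-level score above, just normalized per word.

    # Per-word association sketch: how strongly a single word (e.g. a first name or an
    # occupation) leans toward attribute set A (e.g. female terms) versus B (male terms),
    # normalized by the spread of its similarities to all attribute words.
    import numpy as np

    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def word_association(w, A, B):
        sims_a = [cos(w, a) for a in A]
        sims_b = [cos(w, b) for b in B]
        return (np.mean(sims_a) - np.mean(sims_b)) / np.std(sims_a + sims_b, ddof=1)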
For this, I collected data from the US Census Bureau, from 1990 I believe, where they included the gender of people with certain names and how many of them there were. I took the most frequent of these names and calculated their association with gender, female or male, and we get a 0.84 correlation coefficient, so roughly speaking 84 percent agreement between the statistics of the world and the association I'm getting purely from a semantic model, which is very interesting to me. Here is an illustration of that: we see that, for example, Taylor is in the middle, almost exactly 50 percent male and 50 percent female, and the effect size I'm getting is almost zero in this case, whereas Carmen is almost a hundred percent female and Chris is almost a hundred percent male.

What about employment statistics, occupation statistics broken down by gender? The Bureau of Labor Statistics publishes this information every year, and based on this I took the data, I'm trying to see which year it was, it might be 2017, took the occupation names, and looked at their association with gender, and the correlation coefficient was 0.9, which is amazing. When we look at the result (oops, where is my laser, it's being reflected by the mirror, I don't want to blind anyone so I'm not going to use it), on the upper left side we see that programmer is almost a hundred percent male whereas nurse is almost a hundred percent female. And when you search Google Ngrams for "she is a programmer", usually the result you get is zero, because things below a certain frequency are simply cut from the n-grams, and until recently "she is a programmer" was zero; we can see that reflected here as well.

Now we can see that there are different types of bias embedded in semantic spaces, and there might be different reasons why they get into these models, but we can come up with three main categories for the types of bias we are dealing with. The first one is veridical information; gender and occupations was one example. This is not exactly bias, it is basically the statistics we have in the world, and it might be caused by injustices or biases in the past, but we don't have any information about that, we just have the statistics, and the models learn from these statistics. We also see that the universal, neutral biases are embedded in these models, as well as things that can be real prejudice.
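To sketch how that kind of correlation can be checked, here is a hedged example reusing the word_association score and the kv embedding from the earlier sketches; the occupations and percentages below are placeholder values for illustration, not the actual Census or Bureau of Labor Statistics figures.

    # Sketch: correlate the embedding-derived gender association of occupation words
    # with real-world percentages of women in each occupation.
    # `kv` and `word_association` come from the earlier sketches; the numbers below
    # are placeholders, not real labor statistics.
    from scipy.stats import pearsonr

    percent_women = {          # occupation -> percent women (placeholder values)
        "nurse": 90.0,
        "librarian": 80.0,
        "programmer": 20.0,
        "engineer": 15.0,
    }

    female_attrs = [kv[w] for w in ["she", "her", "woman", "female"]]
    male_attrs = [kv[w] for w in ["he", "him", "man", "male"]]

    scores = [word_association(kv[occ], female_attrs, male_attrs)
              for occ in percent_women]

    r, p = pearsonr(scores, list(percent_women.values()))
    print(f"Pearson r = {r:.2f}, p = {p:.3g}")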
By real prejudice I mean things such as black versus white names being considered unpleasant versus pleasant. But what I would like to remind you is that in some cases you do want certain types of biases in your machine learning models, because they can be very useful; it depends on what kind of task you are dealing with.

Some people have suggested fairness through blindness: basically, just remove protected attributes, or debias the system by completely removing the biased component from the vector space, from all words (a rough sketch of that projection idea appears at the end of this part). But this cannot be the right solution; it is just turning a blind eye to the problem. First of all, once we remove this information, we are also removing statistical information about the world. Second, we would end up with redundant encodings, which would not have the same quality as before, and we don't know exactly what we are losing. Another very important issue is proxies: even if you remove protected attributes, there are still proxies through which bias can take place. For example, when automated systems are deciding whether to give loans to certain people, the zip code is a proxy for the address, so even if you remove all protected attributes, the zip code, being a proxy for a certain financial status, gives you the redlining example, where certain people are simply denied loans because of their zip code. And by the way, in law the main criterion for discrimination is the use of protected attributes, so if those protected attributes are removed and you are proxying through the zip code, it is completely OK, right now, to use a system like that.

Instead, I'm suggesting fairness through awareness. First of all, we need to understand cultural bias, and based on this bias we need to understand the protected attributes that come with it, and we also need to understand the machine learning task. For example, in bioinformatics or health informatics we need to make sure that a certain bias, the bias for genders or for certain ethnic or racial backgrounds, is taken into account, because, as one example, for cardiac disease the symptoms of men and women are different and the treatment should be different as well, and we have to make sure we account for this in whatever model we are building. That is why I say that fairness is task-specific.

This work was published last year in Science, and there was a lot of news about it in the media. One piece, in a random screenshot that I took and really liked, says that in 2017 society started taking AI bias seriously. So in 2017 people really understood that there is a serious problem caused by these automated systems and the bias they perpetuate in society; that is already a huge problem to deal with, and it is not happening at a small scale.
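Going back to the fairness-through-blindness idea above, here is a minimal, generic sketch of what removing a bias direction from word vectors by projection can look like. It is an illustration only, not the method of any specific debiasing paper; in particular, estimating the gender direction by averaging a few definitional difference vectors is an assumption made for this example.

    # Generic sketch: estimate a "gender direction" from a few definitional pairs and
    # project it out of a word vector. Illustrative only; real debiasing methods are
    # more careful about which words to neutralize and how to build the direction.
    import numpy as np

    def unit(v):
        return v / np.linalg.norm(v)

    def bias_direction(pairs):
        # pairs: list of (vector for a she-like word, vector for a he-like word)
        diffs = [unit(a - b) for a, b in pairs]
        return unit(np.mean(diffs, axis=0))

    def remove_component(w, direction):
        # Subtract the projection of w onto the bias direction.
        return w - np.dot(w, direction) * direction

    # Usage sketch (kv is the gensim KeyedVectors model from the earlier example):
    # g = bias_direction([(kv["she"], kv["he"]), (kv["woman"], kv["man"])])
    # debiased_programmer = remove_component(kv["programmer"], g)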
OK, what am I going to be working on next? This roughly covers the project I was working on, and I'm trying to wrap up quickly so that you can also ask questions; I was going to mention at the beginning that this could be interactive, but I think it's too late for that now. I'm not going to focus on the singularity, or transhumanism, or when machines are going to gain cognition, because we have much more immediate problems right now, for example in computer vision and joint semantic-visual spaces. These are currently in the news, for example for automated surveillance, and computer vision systems are known to have bias as well, but it is much harder to quantify it when you are not dealing with supervised machine learning. Imagine a system being biased against certain skin colors or ethnicities, and how much of a problem that might become, because a lot of these automated systems are used to identify targets, they are even used in war zones, or they are used all over the streets for things like anomaly detection, and so on. We don't exactly know how these systems work yet, and they might very well be biased, because all the ones we can analyze are showing bias.

What about algorithmic transparency and interpretable machine learning? For example, we know that driverless cars have vision systems as well. Take the classical trolley problem in its current form: a driverless car is going to crash into someone, there will be an accident it cannot avoid, and it has to decide whether it is going to crash into the white male executive right in front of the car or run into the old black lady on the other side. We don't know the answer to this yet, and because of that we have to be very careful about the kinds of products we are building, because we don't want to build the digital analogues of Robert Moses' racially motivated low overpasses. Robert Moses is considered one of the best urban planners, the planner of New York City, but the overpasses he built over the parkways were quite low, so buses could not pass under them; people with lower financial status had to use public transportation, so they would have needed their own cars to get to the beaches on Long Island or the parks that he built. With these low overpasses, people were basically just separated, and this led to decades of segregation. We have to make sure we are not causing the same problems again with the digital products we are blindly putting out there.

For this, I'd like to keep working on a fairness framework to uncover bias in artificial intelligence, come up with ways to mitigate it while preserving the utility of the systems, and come up with fairness algorithms. But there are a lot of privacy and security implications here as well: for example, when these machine learning models go from our phones to the cloud, or from our fitness trackers to the cloud, can we guarantee fairness in a secure and at the same time private way? How can we avoid adversarial poisoning of these systems, and so on. There are many unanswered questions in this area, and it's a very exciting area. I would like to thank all of my collaborators as well; none of this work would have been possible without them, and I'm really grateful to them. Now I think we have a few minutes for questions, so I would be happy to take questions, comments, anything. [Applause]