Artificial intelligence dealing with the right to be forgotten
Formal Metadata

Title: Artificial intelligence dealing with the right to be forgotten
Number of Parts: 644
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/41159 (DOI)
FOSDEM 2018, 248 / 644
Transcript: English (auto-generated)
00:05
OK, welcome our next speaker, Kristina Rosu. OK, thank you. Thank you for the introduction. I'll start, well, I'm Kristina, and I'm going to speak to you about artificial intelligence and the right to be forgotten. First, I'm going to start with a reference
00:21
from Cicero's book regarding Themistocles' choice. He made an interesting choice back in antiquity, and he said something like: if he were to choose between mastering the capacity of remembering things or mastering the capacity of forgetting things, he chose back then to master the capacity of forgetting
00:46
things, because remembering things, in his opinion, was something not so important. And it was more important to be able to forget. And his reasons back then were something like, because everything that he had ever seen or heard
01:02
stuck in his memory, and because nothing that flowed into this man's mind ever flowed out again. So these two reasons that he had back then, I believe they apply also to our current status now, even though it dates back to the antiquity.
01:21
Regarding the paradox of forgetting: if I tell you, for example, everyone in the room, don't think about the pink elephant, what are you going to do? Are you not going to think about a pink elephant? I think you're actually going to think about that elephant. So this is exactly how our psychological mind works.
01:42
And this is a major issue when dealing with how we address these issues regarding technology and the GDPR, as Jurek said before. Well, this paradox of forgetting made Umberto Eco say that an ars oblivionalis does not
02:01
exist. So currently, we have in the status quo so many techniques for remembering things. We buy different stuff, we try to memorize courses at university, and so on. But regarding forgetting, we don't really have a clear technique. How exactly do we get something out of our heads once we have put it in there?
02:23
I mean, if we're thinking about traumas and stuff like that, things get even more problematic. So Umberto Eco stated in one of his works that an ars oblivionalis does not exist. Then why exactly are we talking about a right to be forgotten? What is it?
02:41
So I'm going to tell you first a legal definition. I'm just going to throw it out there, as I assume that you guys have more of an IT background. So this is what a legal definition looks like: it's the substantial right of oblivion and the rather procedural right to erasure
03:01
from data protection. Now, the right of an individual to be forgotten means that he has the legal possibility of obtaining automatically or by request the deletion of personal information that is no longer relevant or useful
03:21
that was posted online by himself or by a third party, even if initially it was legally posted. So I will say that in the spirit of the right to be forgotten, we should look more at the antiquity and how exactly people forget psychologically
03:42
and how exactly are we forgetting in our society than to think about the technique of deleting something or to think about how exactly are we going to make an AI system forget something, which I will actually argue that is not really possible. So first, let's see the European perspective
04:01
on the right to be forgotten. We have here three major things that happened in the past. We have Directive 95/46/EC, dated 1995, on the protection of individuals with regard to the processing of personal data. And this directive
04:24
was not very much used back in 1995. It started to become more important in 2012, when there was the case Google Spain and Google Inc. v. Mr. Mario Costeja González.
04:41
He practically used this directive to make the European Court of Justice offer us the definition of the right to be forgotten, which previously didn't exist very clearly. And then we have the famous GDPR that my colleague first talked about,
05:00
where article 17 actually writes down a right to erasure and a right to be forgotten. Now, I'm gonna speak first about what happened in 2012 with that case law. Mr. Gonzalez noticed that every time he was going on Google and Googling his name,
05:22
there was this problem of him appearing next to two articles from La Vanguardia, from back in 1998, when he had some money issues and his house had to be listed for auction. So he believed that that information
05:41
is not relevant anymore, that the matter had been settled. So at that point, he said that that information shouldn't be there; it's a problem that every time somebody Googled his name, the information came up. So he first asked the newspaper, saying, hey, you should delete those two articles that are there.
06:04
The newspaper denied his request. Then he went further, to the agency that protects privacy, the AEPD. And this agency said as well that it's not possible to delete those two articles
06:22
because those two articles were initially legally posted online. So there is no problem; we have freedom of expression. And then he went to the national court, the Spanish National Court. And in 2012, the Spanish National Court had an issue confronting this request, because it was actually very weird
06:41
that somebody would want to delete information. It was actually the first time this happened at that point. And confronted with this issue, the Spanish court went to the European Court of Justice and asked three questions. The answer to the first question was the fact that between suing Google Spain
07:01
and suing Google Incorporated, we should actually sue Google Incorporated, the corporation from the US. The second answer was regarding the fact that the search engine qualifies as a data controller as it is mentioned in the prior directive from 1995
07:21
that I was telling you about. And the third aspect was actually the definition that currently exists of the right to be forgotten. You can see it on the blackboard. And regarding the GDPR definition, this is going to be the legal definition
07:41
starting from May this year. I'm not gonna forget. I'm just gonna roll this like this. I'm not gonna read, sorry. Okay. Now, the notion of the right to be forgotten,
08:00
even though we speak about it now, it's actually quite old. It dates back to 19th-century France and Germany. We can find it also in Italy and in Spain, in different forms of course.
08:23
Now, here are some statistics on deletion for GDPR compliance. This survey was conducted on 750 IT companies. And this is one of their answers
08:41
that I believe is very relevant: if your company receives a request based on the right to be forgotten, what method would you use to delete the content? One of their answers was basic deletion. And basic deletion actually means taking the file
09:01
and dragging it to the recycle bin. So this would not actually delete anything, in fact, and would not make it possible, in an audit for example, to offer proof of erasure. Now, this is another statistic about the methods
09:20
they are planning to use, among those that currently exist, in order to delete something. And for AI systems, out of all of these, the only thing that could work, in my opinion at least, and I'm not a tech person, is just to burn it, or I don't know,
09:41
use acid or something like that. So it's basically physical destruction because we cannot use cryptographic erasure and we cannot use data erasure in order to make an AI system actually not use a certain information that is inside the AI. Now, I was telling you that back in 2012,
10:03
the Spanish National Court had issues dealing with Mr. Gonzalez's request. After 2012, something miraculously happened. Suddenly, a massive flow of requests
10:20
were made to Google in the name of the right to be forgotten. And the source of this statistic is Google's transparency reports. And the most popular targets for the right to be forgotten were Facebook search results, followed by YouTube and Twitter.
10:42
And this was basically Google's answer to the case law from 2012: they hired dozens of lawyers and asked them to deal with requests made in the name of the right to be forgotten.
11:01
At the national level, a thing I would like to mention is France, whose regulatory agency, the CNIL, was the first to request that Google extend removals globally, not just to Google France but also to google.com. And this thing, which is very interesting,
11:23
society's answer was that people actually protested against this happening, because they were advocates of freedom of expression. And these are some criteria for evaluating deletion requests.
11:43
We can see here the data subject's role in public life, which shows the general tendency to strike a balance between public life and private life. The nature of the information, biased towards an individual's strong privacy interest or towards a public interest.
12:01
The source of the information. Now, if you are unlucky enough to get posted in a very, very important newspaper, then probably your request will not be followed, because the New York Times is cooler than you. And the time. This is practically the passage of time,
12:21
the arguments that Mr. Gonzalez invoked, the fact that it passed too much time, the information is no longer relevant. And this thing is actually particularly relevant for crime records because if somebody committed a crime in the past, that crime is not legally relevant after a certain time.
12:41
And the problem is that if an article was written about that information online, it remains there. And this is a problem. So we have requests on this matter. Now, here are some articles that have been erased from Google search results,
13:03
which I found on Google. And here are the people that were forgotten. Now, let's see what we forgot until now. We forgot that Tim Blackstone, a former porn star and brother of Baroness Blackstone was found guilty of insider trading in 2003.
13:22
And this is like the pink elephant, basically. This is what I'm telling you: that it's not really consistent. We don't really forget something when we see something like this. We see the name, we see the source, we see practically everything. Next, a Spanish court ordered an investigation into allegations that a Saudi billionaire prince
13:42
raped a woman on a yacht in Ibiza in 2008. And an article detailing how, in 2003, the Roman Catholic Church reached an out-of-court settlement regarding a former... yeah, I mean, you can read it there.
14:00
It's, oh, okay. So practically this is an illustration of why the right to be forgotten should have a connection with the psychological way of forgetting. And this is how the paradox of forgetting works. Now, reputation management sites talk about the boomerang effect.
14:20
The fact that if we draw attention to a piece of information, in reality we actually get everybody to speak about it, like that pink elephant I was talking about earlier. And also, if the request is accepted by Google, a notification regarding the fact that that article was deleted in the name
14:42
of the right to be forgotten gets posted. So that draws even more attention. And then another thing that's happening is that there are sites currently that actually collect these kinds of links. So it's like a collection. And Wikipedia also had such a list. So we don't really forget.
15:00
I mean, it would be hypocritical to say that this is actually consistent. Now, let's go even deeper into the technological side of things and see what happens if we try to delete something in a database. And this is valid not just for MySQL; it's valid for most technologies. Now in figure one, we can see here
15:22
the database before deletion. We have five records, C1 through C5. The I is the input and the S is the end. Record three is marked for deletion, and is also linked to a garbage offset,
15:40
which is a collection of deleted records and currently available free space. Now, what happens when we want something to be deleted from the database? The database searches for the data, starting from the input, I, and going to the end, S.
16:00
And if C3 is not found there, the search will show that the information is no longer there. But in reality, the information remains stuck there until we need more free space, and only after a certain amount of time can we speak about a real deletion of that information. So in reality, it's not a real deletion.
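The behaviour described here can be sketched in a few lines: a toy storage engine (illustrative only, not how any specific database is implemented) where deletion just unlinks the record and remembers its slot as free, so the bytes survive until the slot is reused.

```python
# Toy storage engine illustrating "logical" deletion: a deleted record is
# only marked and linked into a free list; its bytes stay in storage until
# the space is reused by a new record.

class ToyStore:
    def __init__(self):
        self.pages = []   # raw storage: one slot per record
        self.live = {}    # record id -> slot index (the "search path")
        self.free = []    # the "garbage offset": reusable slots

    def insert(self, rid, payload):
        if self.free:                   # reuse a freed slot first
            slot = self.free.pop()
            self.pages[slot] = payload  # old bytes overwritten only now
        else:
            slot = len(self.pages)
            self.pages.append(payload)
        self.live[rid] = slot

    def delete(self, rid):
        # Logical deletion: drop the index entry, remember the slot as free.
        slot = self.live.pop(rid)
        self.free.append(slot)

    def lookup(self, rid):
        return self.pages[self.live[rid]] if rid in self.live else None


store = ToyStore()
for i in range(1, 6):
    store.insert(i, f"record C{i}")
store.delete(3)

print(store.lookup(3))   # None: a search no longer finds C3
print(store.pages[2])    # "record C3": the bytes are still there
store.insert(6, "record C6")
print(store.pages[2])    # "record C6": only now is C3 really gone
```

Only when slot reuse overwrites the bytes does the "real deletion" the talk mentions actually happen.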
16:23
And Viviane Reding, the Vice-President of the European Commission, said that it is clear that the right to be forgotten cannot amount to a right of total erasure of history. Now, here we have another interesting statistic regarding the major investors in AI technology
16:43
and the numbers related to the information that is collected from us. As you can see here, Facebook and Google, Google Drive and Google Photos and so on, collect a massive amount of information. Now, they also invest in AI.
17:01
Now, let's think about what goes on with those two facts I just told you. In reality, they take the data we offer so voluntarily and use it to improve their AI systems, to make them better for production.
17:27
Now, let's see what AI is. I made a really long story short; this is AI in very brief. We have here reinforcement learning, but I would like to speak more about neural networks,
17:41
because, as you can see here in the drawing, we have a neural network. We have an input layer, a number of hidden layers, which can be one or two or more, and an output layer. Now, the interesting thing happens in the middle, where there can be just one hidden layer, but here I selected an image with three.
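A structure like the one in the drawing can be sketched in a few lines. This is an illustrative toy with random weights; in a real system the weights are fitted to the training data, which is why that data cannot simply be "removed" afterwards, as it is baked into every weight.

```python
import numpy as np

# Minimal feed-forward pass: an input layer, three hidden layers, and an
# output layer, matching the drawing described in the talk.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 8, 2]   # input, three hidden layers, output

# One weight matrix and one bias vector per layer transition.
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    for w, b in zip(weights, biases):
        x = np.tanh(x @ w + b)   # non-linear activation at each layer
    return x

out = forward(np.array([0.5, -1.0, 0.2, 0.7]))
print(out.shape)   # (2,)
```

Nothing in `weights` corresponds to any single training record; every record nudges every weight a little, which is the root of the erasure problem discussed later.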
18:02
So inside here, when the information gets inside, no matter if we anonymize it or whatever, it goes like this, like it goes cyclical, and in the end, we can get an output that is similar to the input. Like for example, if I'm thinking about an AI
18:20
that refers to a dictionary, or it can be a totally different idea. Now, this is the current status of AI and its performance. As you can see here, it's not so performant, and this is Apple's most advanced Face ID technology,
18:43
which again is not so performant. This is actually an article: these two women were colleagues, and one was recognized as the other, even though Apple said that the probability of that happening is one in a million.
19:01
This is a recital from the EU GDPR, regarding your right to be forgotten, and the fact that we have the right not to be subjected to a decision based solely on automated processing which produces legal effects concerning us, without any human intervention.
19:21
So this applies to AI. Now, erasure of one's data: they tried it. They tried to erase data from AI in order to respect the right to be forgotten, and they came to the conclusion that it may work, though the research done to date
19:43
was conducted on randomly selected information. But if we're speaking about the right to be forgotten, we speak about specific information inside the AI, and that hasn't been tested until now. So we don't know exactly what happens if we say that Mr. Gonzalez shouldn't be there.
20:01
We just know that if Mr. Gonzalez has a characteristic which makes the AI more valuable, something related to a class the AI learned, then the AI will lose performance. So it will impact the AI, and we cannot really erase the data from such a system.
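A toy illustration of this point: the "model" below is just the mean of its training records, so deleting Mr. Gonzalez's raw record after training changes nothing, and only retraining without it removes his influence. This is a sketch of the general idea, not the research the talk refers to.

```python
# Why deleting a record does not delete its influence from a trained model.
# Here the "model" is one number per feature: the mean of the training data.
# After training, the raw record can be thrown away, yet it still lives on
# inside the model; truly forgetting it means retraining without it.

def train(records):
    n = len(records)
    return [sum(r[i] for r in records) / n for i in range(len(records[0]))]

data = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]   # last row: the record to forget
model = train(data)

del data[2]          # deleting the raw record...
print(model)         # ...leaves the trained model unchanged

model = train(data)  # only retraining removes the influence
print(model)         # [2.0, 3.0]
```

And just as the talk says, the retrained model is a different (here, less extreme) model: removing an influential record changes what the system has learned.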
20:22
Another practical thing is that, according to the GDPR, data analysts should now obtain the person's consent every time in order to make an analysis of their data. And this is problematic again, because we cannot really get consent every time
20:43
in order to analyze the data. And if we also add shared environments or cloud computing for improving performance, the problem gets even bigger. Now, functional encryption algorithms don't work on big data. Pseudonymization does not comply with the GDPR,
21:02
because the GDPR says that the same rules apply to pseudonymized data as to personal data. And data anonymization, which means blurring the sensitive data into a derivative form, has little practicality in use.
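As a sketch of why pseudonymization on its own is not erasure: below, a name is replaced by a keyed hash, but the token is stable and re-linkable by whoever holds the secret, which is exactly why such data is still treated as personal. Illustrative code only, not a compliant implementation; the salt value is hypothetical.

```python
import hashlib

# Pseudonymization via a keyed hash: the name disappears from the record,
# but anyone holding SALT can recompute the token and re-identify the person.
SALT = b"server-side-secret"   # hypothetical secret kept by the controller

def pseudonymize(name: str) -> str:
    return hashlib.sha256(SALT + name.encode()).hexdigest()[:16]

record = {"name": pseudonymize("Mario Costeja Gonzalez"),
          "note": "house auctioned, 1998"}
print(record["name"])  # a stable token, not the name
print(pseudonymize("Mario Costeja Gonzalez") == record["name"])  # True
```

Because the mapping is deterministic and reversible for the key holder, deleting the visible name does not make the record anonymous, let alone erased.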
21:21
Now, possible approaches: strategies of obfuscation. This is a strategy that dates from the Second World War, and it reflects the fact that if I'm bombarding you with lots and lots and lots of information, then maybe the relevant information will not stand out anymore,
21:41
it will not be visible. So when we speak about lots of data, this is one strategy they thought about: to bombard situations like this with lots and lots of information so that, in the end, people will not focus so much on the relevant data. So in this case,
22:02
this does not work so well in AI, of course, but it's the main strategy that they are trying to implement in order to comply with the right to be forgotten. Data minimization is actually the only one that can work, because if we don't share our information,
22:20
if companies don't get our information, then we don't speak about that information. So we should advocate for the right to privacy, and this is what I would like you to remember from this talk: that we should be more careful about our privacy, because in reality, if something goes online, it cannot really be deleted, so it will stay there.
22:42
So this is a major problem. Yes, we have the right to be forgotten; yes, we have the GDPR, and it's a wonderful thing in Europe that we will have the GDPR, but in reality we have technological impediments that will not allow us to delete, and also, you've seen what happened until now
23:03
with people that are very, very particularly interested in what Google removes from their search results. Okay, so I would like to advocate practically for the right to privacy: take back control over your data.
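For stored data (though, as noted earlier in the talk, not for AI models already trained on it), one technological approach sometimes proposed is crypto-shredding: encrypt each person's data under its own key, and honour an erasure request by destroying the key. The sketch below uses a deliberately simplistic XOR keystream as a stand-in; a real system would use a vetted cipher such as AES-GCM.

```python
import hashlib, secrets

# "Crypto-shredding" sketch: one key per data subject. Destroying the key
# renders the ciphertext at rest permanently undecryptable, even though the
# bytes themselves are never overwritten.

def keystream(key: bytes, n: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR with a hash-derived keystream: encrypt and decrypt are the same op.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

keys = {"gonzalez": secrets.token_bytes(32)}                 # per-subject key
stored = xor_cipher(b"1998 auction notice", keys["gonzalez"])  # at rest

print(xor_cipher(stored, keys["gonzalez"]))  # decrypts while the key exists
del keys["gonzalez"]                         # erasure request: shred the key
# 'stored' is still on disk, but can no longer be decrypted by anyone
```

This sidesteps the database problem shown earlier (the bytes linger until overwritten) because lingering ciphertext without its key reveals nothing.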
23:22
Remember that we have the GDPR. Let's make Europe state of the art in terms of respecting privacy, think about new technological aspects regarding how to deal with AI, and remember that from the open source community you can invent something
23:41
which can deal with deletion first. And also think about the fact that if, in the future, you manage to invent something that prevents a person, the user, from being recognized, such strategies of not recognizing the person
24:00
would still comply with the right to be forgotten, so if you're thinking about something like that, that would be amazing from the technological aspect. Also, use ethical products that respect your privacy as a user, and I selected here some of them: Mastodon, Matrix, and CryptPad. If you wanna hear about CryptPad, there's a talk at four o'clock.
24:20
Thank you.