PIMMI: a command line interface to study image propagation
Formal Metadata
Title: PIMMI: a command line interface to study image propagation
Number of Parts: 542
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/61910 (DOI)
Transcript: English(auto-generated)
00:08
Hi everyone. Well, I'm very impressed to have such a large audience for such a small tool. But I'm Beatrice, I work at the French Media Lab. And today I'm going to present
00:22
PIMMI, which is a tool to study image propagation. The Media Lab is a lab where social scientists try to, among other things, study the traces that people leave online. And for now, they are quite well equipped with tools to study text. But when they ask
00:47
me, OK, how can I study meme propagation? I'm still struggling to give them answers. So what does it mean to study meme propagation? It means being able to recognize that some
01:06
parts of an image are copied or partially copied. So what this tool does is create clusters of images: it groups together images that are total or partial copies of each
01:27
other. It's able to deal with image transformations, so if the image is cropped or zoomed, and it's able to adapt to the corpus's characteristics. So it will try to make the best of your
01:43
data sets, depending on the number of images you have or the type of images you have. What PIMMI is not able to do is to cluster semantically similar images. So it's not the tool that you are going to use if you want to create clusters of cats and clusters
02:04
of dogs or, I don't know, find images of violence versus images of peace. And it's not able to do some face recognition. So, again, you will not be able to make some
02:20
clusters of pictures of Elizabeth II versus clusters of images of Emmanuel Macron. What you could imagine doing, and we could also imagine working together if you are a researcher
02:43
working on those subjects, is to study the propagation of memes on social networks, as I was saying. But you could also study the usage of press agency photos or stock photos in a press corpus. You could also study the dissemination of fake news
03:05
based on image montage. Or you could study the editorial choices between different media, depending on whether they use the same images or not. So let me do a quick demo of how it looks for now. It's not on the screen.
03:35
OK. Let's forget about that. I'm very sorry. Well, I'll try to make it work.
04:28
OK. Well, it's still not showing all the clusters. But so, we create clusters of images. This is a data set that was created by the French INRIA and that presents some
04:45
degradations of images. So they take an original picture and they apply some filters or they crop the image, to see whether the tool is able to put the images back together. So we can see that we have pretty correct results on that data set.
05:03
And these are our results on some images that we collected ourselves on Twitter using Elon Musk as a query. So we tried to cluster those images. As you can see, we have images of Elon Musk. We are able to group together some images that
05:27
are crops of others. So this is probably the source image of the montage that has been done here. But we can also see that we have some problems with the tool.
05:42
For example, here we have an image in which two images have been assembled together, so we end up creating a cluster that actually merges two different images. But well, that's the state of the tool for now. And now I'll try to come back to my slides.
06:25
OK, so how does it work? For people who work in computer vision, I'm probably going to say some things that are quite basic, but I'll try to make it clear for people who do not do computer vision.
06:42
So it is not based on colors at all. It uses the grayscale of the images and tries to detect points of interest in a picture, and then it uses these local key points as vectors.
07:03
And then those vectors are indexed in a database that is able to perform very quick similarity search.
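As a rough illustration of that pipeline, here is a minimal sketch in Python. SIFT descriptors and a flat Faiss index are assumptions used for the example, not necessarily PIMMI's exact components, and the file names and parameters are placeholders.

```python
import cv2
import faiss
import numpy as np

def describe(path):
    """Local keypoint descriptors computed on the grayscale version of an image."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(gray, None)  # shape (n_keypoints, 128), float32
    return descriptors

# Index the descriptors of a small corpus, remembering which image each one belongs to.
corpus = ["original.jpg", "cropped_copy.jpg", "unrelated.jpg"]
all_desc, owners = [], []
for i, path in enumerate(corpus):
    desc = describe(path)
    all_desc.append(desc)
    owners.extend([i] * len(desc))

index = faiss.IndexFlatL2(128)                     # exact L2 search over 128-dim descriptors
index.add(np.vstack(all_desc).astype(np.float32))

# Each keypoint of a query image votes for the corpus images it matches.
query_desc = describe("query.jpg")
_, neighbors = index.search(query_desc.astype(np.float32), k=5)
votes = np.bincount([owners[j] for j in neighbors.ravel()], minlength=len(corpus))
print(votes)   # images sharing many local keypoints with the query get the most votes
```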
13:36
One limit of the tool, as I said, is that parts of images create clusters that are bigger than they should be. So our plan is to be able to detect
13:43
images that are actually those links between two clusters, to detect that such an image actually contains two images, and to be able to deal with parts of images.
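One way to illustrate the idea of "bridge" images, purely as a sketch of the concept and not PIMMI's planned implementation, is to look for articulation points in a hypothetical image-similarity graph, i.e. images whose removal splits a cluster in two.

```python
import networkx as nx

# Hypothetical similarity graph: an edge means "these two images share enough matched keypoints".
G = nx.Graph()
G.add_edges_from([
    ("photo_A", "photo_A_crop"), ("photo_A", "photo_A_zoom"), ("photo_A_crop", "photo_A_zoom"),
    ("photo_B", "photo_B_crop"),
    ("montage_AB", "photo_A"), ("montage_AB", "photo_A_crop"),   # the montage matches both...
    ("montage_AB", "photo_B"), ("montage_AB", "photo_B_crop"),   # ...photo A and photo B
])

# Articulation points are candidate "bridge" images gluing two sub-clusters together.
print(list(nx.articulation_points(G)))   # ['montage_AB']
```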
14:04
We would also like to show images in their context, to be able to show the tweets or Instagram posts that contain those images, etc., or at least to show additional metadata to the users. And we would like to show the graph of image similarities, because without it the clusters that
14:25
result from that graph are not easy to interpret. And to improve our tool, we need your use cases, because for now we have those two or three databases,
14:49
but we would be very glad to do some partnerships with other researchers to improve the tool. Thank you very much for your attention.
15:01
If you want to look at the slides, we have the references to all the images used and to the papers of the algorithms used by PIMMI. I'm open for questions.
15:25
There was a problem with the sound stream, but it's back online. So yeah, you should repeat the questions for the stream. OK, I'll do it. Yes?
15:40
Something you said actually brought this immediately to mind, because in research from the physics side, solar cell research, there are some 3D plots where you want to compare similarities, but they're not images as you have right now, it's basically data. Is that a use case that is possible? For 3D plots, contour plots, stuff like that?
16:04
Well, trying to find similarities in... Oh, sorry, I have to repeat the question. So the question was, if I understand well, how to reproduce that use case, not on images, but on other types of documents that would be, I guess, some features.
16:32
3D contour plots. And I'd say, well, as long as you can represent your data in the shape of vectors,
16:44
then you're ready to use Faiss to do a nearest-neighbour search in your database. And then you can go for the whole pipeline: create a graph, find communities in the graph, and go for it.
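A rough sketch of that suggestion, with random vectors standing in for whatever features you would extract from your plots; the dimensions, the number of neighbours, and the similarity threshold are all arbitrary choices for the example.

```python
import faiss
import networkx as nx
import numpy as np

# Placeholder feature vectors (e.g. flattened contour plots): 1000 items, 64 dimensions.
rng = np.random.default_rng(0)
vectors = rng.random((1000, 64), dtype=np.float32)
faiss.normalize_L2(vectors)                      # normalise so inner product == cosine similarity

index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)
sims, nbrs = index.search(vectors, k=6)          # each item's 6 nearest neighbours (itself included)

# Build a similarity graph, keeping only strong links, and read clusters off the graph.
G = nx.Graph()
G.add_nodes_from(range(len(vectors)))
for i, (row_s, row_n) in enumerate(zip(sims, nbrs)):
    for s, j in zip(row_s, row_n):
        if i != j and s > 0.9:                   # arbitrary threshold for this sketch
            G.add_edge(i, int(j))

print(sum(1 for c in nx.connected_components(G) if len(c) > 1), "clusters of similar items")
```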
17:07
But I'm not sure PIMMI itself is your tool; the architecture of PIMMI could, of course, be a model. Is there any project, current or ongoing, that the Media Lab has used PIMMI for, or is it still largely in development?
17:26
It is largely in development. Sorry, I'll repeat the question. So, are there some projects at the Media Lab that are currently using PIMMI? And the answer is no.
17:58
Have you considered any other ways of presenting picture similarity or
18:08
using picture similarity? Other types of image similarity, if I understand well? Well, I'd say that that was what I was saying on my second slide.
18:27
There are other types of image similarity, for example semantic similarity. And well, maybe in a few months, if we have a robust architecture, we could
18:44
maybe include some other types of vectorization of images. But for now, well, there are already tools that do that. Like there is something called Clip Server that helps you find similar images
19:10
from CLIP vectors, which are like semantic vectors. So you could use that tool, it's great.
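As an illustration of what such semantic, CLIP-vector similarity looks like in practice, here is a minimal sketch using the sentence-transformers CLIP wrapper; this is separate from PIMMI, and the model name and file names are just examples.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# A CLIP model embeds images into "semantic" vectors: two different cats end up close together
# even though they share no copied pixels, which is exactly what PIMMI does not try to do.
model = SentenceTransformer("clip-ViT-B-32")
images = [Image.open(p) for p in ["cat_1.jpg", "cat_2.jpg", "dog_1.jpg"]]
embeddings = model.encode(images, convert_to_tensor=True)

print(util.cos_sim(embeddings, embeddings))   # pairwise cosine similarities
```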
19:21
Yes? Yes.
19:46
So the question is, is the tool really able to distinguish the thing that is of interest to us, the fact that we are talking about a dog?
20:00
So the tool is only able to find partial copies of an image. So the tool would probably be able to say that all those images contain the same part of the face of a dog. So it would probably be able to group all those images together.
20:21
The problem is that if there are other images in the database that contain the rest of the image, then they would probably also be grouped in the same cluster. So that's why the work we are currently doing on parts of images
20:41
would let us improve the clusters so that they are purified from the rest of the images. And we could have a cluster of the face of that specific dog, and then a cluster of that taco as the second cluster.
21:06
Yes? What kind of clusterization do you use on the graph? Well, for now we have the best results with... excuse me, I'll repeat: what kind of clusterization do you use on the graph?
21:22
For now we have our best results using pure connected components. So actually the sparsification we do on the graph, to reduce the number of links between images, is enough to have separate connected components in the graph.
21:41
And so we take each connected component and that's our cluster. What we would like to do is to try to mix in some Louvain community detection, but actually for now that's not the thing that works best.
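For comparison, here is a toy sketch of the two clustering strategies mentioned, connected components versus Louvain; the graph is made up, whereas PIMMI's real graph comes from keypoint matches between images.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    (1, 2), (2, 3), (1, 3),    # one tight group of near-duplicate images
    (4, 5), (5, 6), (4, 6),    # a second tight group
    (3, 4),                    # a single weak link joining the two groups
])

print(list(nx.connected_components(G)))             # one big cluster: {1, 2, 3, 4, 5, 6}
print(nx.community.louvain_communities(G, seed=0))  # usually splits it back into {1, 2, 3} and {4, 5, 6}
```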
22:15
Yes? I'm not sure I understand the question.
22:23
Can you try to rephrase it? Okay. What things are you looking at to improve the model?
22:44
Well, there are many things we are looking at. For now, we mainly look at techniques to do better graph sparsification in order to find more coherent clusters.
23:07
We are not working so much on the local-descriptors part of the tool for now. Yes?
23:34
Have you considered using a direct link to the Twitter images or social media images online?
23:50
Did I repeat everything? Yes. Well, yes, we would like people to be able to see images in their context because actually
24:01
they won't understand what's happening if they just have the images. They need to see, okay, why was this image published, who answered, etc. So this would probably mean that we need to add at least a link to the post
24:23
or maybe some kind of visualization of it.