
PIMMI: a command line interface to study image propagation


Formal Metadata

Title: PIMMI: a command line interface to study image propagation
Alternative Title: PIMMI: Python IMage MIning
Number of Parts: 542
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
PIMMI is a Python tool that performs visual mining in a corpus of images. Its main objective is to find all copies, total or partial, in large volumes of images and to group them together. Our initial goal is to study the reuse of images on social networks (typically, our first use case is the propagation of memes on Twitter). However, we believe that its use can be much wider and that it can easily be adapted for other studies. The main features of PIMMI are therefore:

- the ability to process large image corpora, up to several million files
- robustness to some modifications of the images (crop, zoom, composition, addition of text, ...)
- adaptability to different use cases (mainly the nature and volume of the image corpus)

A study using PIMMI will generally be broken down into several steps:

1. constitution of a corpus of images (jpg and/or png files) and their metadata
2. choice of PIMMI parameters according to the characteristics of the corpus
3. indexing of the images with PIMMI, yielding clusters of reused images
4. exploitation of the clusters by combining them with the descriptive metadata of the images (see the sketch below)

The development of this software is still in progress and we warmly welcome beta-testers, feedback, proposals for new features and even pull requests!
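For illustration, here is a minimal sketch of step 4 in Python. This is not PIMMI's actual output format, which may differ: it assumes the clusters were exported to a CSV with one row per image, and the corpus metadata to another CSV keyed by the image path; all column names are hypothetical.

```python
# A minimal sketch of step 4 (hypothetical file and column names,
# not PIMMI's actual output format).
import pandas as pd

clusters = pd.read_csv("clusters.csv")  # columns: image_path, cluster_id
metadata = pd.read_csv("metadata.csv")  # columns: image_path, tweet_id, date

merged = clusters.merge(metadata, on="image_path")

# Example exploitation: how many distinct days each cluster of reused
# images spans, i.e. how long an image keeps being reposted.
spread = merged.groupby("cluster_id")["date"].nunique()
print(spread.sort_values(ascending=False).head())
```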
Transcript: English (auto-generated)
Hi everyone. Well, I'm very impressed to have such a large audience for such a small tool. But I'm Beatrice, I work at the French Media Lab. And today I'm going to present
PIMMI, which is a tool to study image propagation. The Media Lab is a lab where social scientists try to, among other things, study the traces that people leave online. For now, they are quite well equipped with tools to study text. But when they ask me, OK, how can I study meme propagation? I'm still struggling to give them answers. So what does it mean to study meme propagation? It means being able to recognize that some
parts of an image are copied or partially copied. So what this tool does is create clusters of images, grouping together images that are total or partial copies of each other. It's able to deal with image transformations, so if the image is cropped or zoomed, and it's able to adapt to the characteristics of the corpus. So it will try to make the best of your
data sets, depending on the number of images you have or the type of images you have. What PIMMI is not able to do is to cluster semantically similar images. So it's not the tool that you are going to use if you want to create clusters of cats and clusters
of dogs or, I don't know, find images of violence versus images of peace. And it's not able to do some face recognition. So, again, you will not be able to make some
clusters of pictures of Elizabeth II versus clusters of images of Emmanuel Macron. What you could imagine doing, and we could also imagine working together if you are a researcher
working on those subjects, is to study the propagation of memes on social networks, as I was saying. But you could also study the usage of press agency photos in a press corpus, or stock photos as well. You could also study the dissemination of fake news
based on image montage. Or you could study the editorial choices between different media, depending on whether they use the same images or not. So let me do a quick demo of how it looks for now. It's not on the screen.
OK. Let's forget about that. I'm very sorry. Well, I'll try to make it work.
OK. Well, it's still not showing all the clusters. But so, we create clusters of images. This is a data set created by the French institute INRIA that presents some degradations of images. So they take an original picture and apply some filters, or crop the images, to see if we are able to group the images back together. We can see that we have pretty correct results on that data set.
And these are our results on some images that we collected ourselves on Twitter using Elon Musk as a query. And so we tried to cluster those images. So as you can see, we have images of Elon Musk. We are able to group together some images that
are crops of others. So this is probably the source image of the montage that has been done here. But we can also see that we have some problems with the tool.
For example, here we have a cluster with two images that have been assembled together, and we create a cluster of actually two images. But well, that's the state of the tool for now. And now I'll try to come back to my slides.
OK, so how does it work? For people who work in computer vision, I'm probably going to say some things that are quite basic, but I'll try to make it clear for people who do not do computer vision.
So it is not based on colors at all. It uses the grayscale version of the images and tries to detect points of interest on a picture, and then it uses these local key points as vectors.
And then those vectors are indexed in a database that is able to perform very quick similarity search.
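For illustration, here is a minimal sketch of that pipeline in Python, not PIMMI's actual code: it uses OpenCV's SIFT key points and a Faiss index, which are the kinds of components described here; file names are placeholders.

```python
# A minimal sketch (not PIMMI's actual code): detect local key points on the
# grayscale image with OpenCV's SIFT, then index the descriptor vectors in
# Faiss for fast similarity search. Requires opencv-python and faiss-cpu.
import cv2
import faiss
import numpy as np

sift = cv2.SIFT_create()

def descriptors(path):
    # Color is ignored: key points are detected on the grayscale image.
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(gray, None)
    return desc.astype(np.float32)  # one 128-dimensional vector per key point

# Index the descriptors of the whole corpus (here, a single image).
index = faiss.IndexFlatL2(128)
index.add(descriptors("corpus/image_a.jpg"))  # placeholder file name

# Query with another image: many near-identical key points between two
# images suggest a total or partial copy.
distances, neighbors = index.search(descriptors("corpus/image_b.jpg"), 5)
```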
As I said, one problem of the tool is that parts of images create clusters that are bigger than they should be. So our plan is to be able to detect images that are actually those links between two clusters, that is, to detect that an image actually contains two images, and to be able to deal with parts of images. And also, what we would like to do is to show images in their context,
to be able to show the tweets that contain those images, or Instagram posts, etc., or at least to show additional metadata to the users. And also, we would like to show the graph of image similarities, so that the clusters resulting from that graph are more interpretable. And to improve our tool, we need your use cases, because for now we have those two or three databases,
but we would be very glad to do some partnerships with other researchers to improve the tool. Thank you very much for your attention.
If you want to look at the slides, we have the references to all the images used and to the papers of the algorithms used by PIMMI. I'm open for questions.
We had a problem with the sound stream, but it's back online. So yeah, you should repeat the questions for the stream. OK, I'll do it. Yes?
What you said actually immediately brings something to mind, because of research from the physics side, solar cell research: there are some 3D plots where you want to compare similarities, but they're not images as you have right now, it's basically data. Is that a use case that is possible? 3D plots, contour plots, stuff like that?
Well, trying to find similarities in... Oh, sorry, I have to repeat the question. So the question was, if I understand well, how to reproduce that use case, not on images, but on other types of documents that would be, I guess, some features.
3D contour plots. And I'd say, well, as long as you can represent your data in the shape of vectors, then you're ready to use Faiss to do nearest-neighbor search in your database. And then you can go for the whole pipeline: create a graph, find communities in the graph, and go for it. I'm not sure PIMMI is your tool, but its architecture could, of course, be a model.
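A minimal sketch of that answer, assuming the plots have already been reduced to fixed-size feature vectors (that vectorization step is the part you would have to design yourself):

```python
# A minimal sketch: Faiss nearest-neighbor search over arbitrary feature
# vectors. The data here is a random placeholder.
import faiss
import numpy as np

features = np.random.rand(10_000, 64).astype(np.float32)  # placeholder data

index = faiss.IndexFlatL2(features.shape[1])
index.add(features)

# For every item, find its 10 nearest neighbors; thresholding these
# distances yields the edges of a similarity graph, on which communities
# can then be detected, as in the PIMMI pipeline.
distances, neighbors = index.search(features, 10)
```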
Are there any projects, current or ongoing, for which the Media Lab has used PIMMI, or is it still largely in development?
It is largely in development. Sorry, I'll repeat the question: are there some projects at the Media Lab that are currently using PIMMI? And the response is no.
Have you considered any other ways of presenting picture similarity or
using picture similarity? Other types of image similarity, if I understand well. Well, I'd say that that was what I was saying in my second slide.
There are other types of image similarity, for example, semantic similarity. And well, maybe in a few months, if we have a robust architecture, we could maybe include some other types of vectorization of images. But for now, well, there are already tools that do that. For example, there is something called CLIP server that helps you find similar images from CLIP vectors, which are like semantic vectors. So you could use that tool, it's great.
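Not that CLIP server itself, but a sketch of the underlying idea with the open_clip library: CLIP embeds each image as a single "semantic" vector, and cosine similarity between vectors then reflects similarity of content rather than copies; the file names are placeholders.

```python
# A sketch of semantic image similarity via CLIP vectors (open_clip),
# as opposed to PIMMI's copy detection. File names are placeholders.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")

def embed(path):
    image = preprocess(Image.open(path)).unsqueeze(0)
    with torch.no_grad():
        vec = model.encode_image(image)
    return vec / vec.norm(dim=-1, keepdim=True)  # unit-normalize

# A high cosine score means "same kind of content", e.g. two cats,
# even if the pictures share no pixels at all.
score = (embed("cat_1.jpg") @ embed("cat_2.jpg").T).item()
```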
Yes? Yes.
So the question is, is the tool really able to distinguish the thing that is of interest to us, the fact that we are talking about a dog?
So the tool is only able to find partial copies in an image. So the tool would probably be able to say that all those images contain the same part of the face of a dog. So it would probably be able to group all those images together.
The problem is that if there are other images in the database that contain the rest of the images, then they would probably also be grouped in the same cluster. So that's why what we are currently doing about parts of images
would let us improve the cluster so that it's purified from the rest of the images. And we could have a cluster of the face of that specific dog, and then a second cluster of that taco.
Yes? What kind of clustering do you use on the graph? Well, for now we have the best results with, excuse me, I'll repeat: what kind of clustering do you use on the graph?
For now, we have our best results using pure connected components. So actually, the sparsification we do on the graph to reduce the number of links between images is enough to have separate connected components in the graph. And so we take each connected component as a cluster. What we would like to do is to try to mix in some Louvain community detection, but actually, for now, it's not the thing that works best.
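A minimal sketch of that clustering step with the networkx library, not PIMMI's actual code; the image ids and match scores are made up:

```python
# Each connected component of the sparsified similarity graph becomes
# one cluster; Louvain community detection is the alternative mentioned.
import networkx as nx
from networkx.algorithms import community

g = nx.Graph()
g.add_weighted_edges_from([
    (0, 1, 0.9), (1, 2, 0.8),  # images 0, 1, 2 end up in one cluster
    (3, 4, 0.7),               # images 3, 4 in another
])

clusters = [sorted(c) for c in nx.connected_components(g)]
print(clusters)  # [[0, 1, 2], [3, 4]]

# The Louvain alternative (networkx >= 2.8):
communities = community.louvain_communities(g, weight="weight", seed=0)
```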
Yes? I'm not sure I understand the question.
Can you try to rephrase it? Okay. What things are you looking at to improve the model?
Well, there are many things we are looking at. For now, we mainly look at techniques to do a better graph sparsification, in order to find more coherent clusters.
We are not working so much on the local descriptors part of the tool for now. Yes?
Have you considered using a direct link to the Twitter images, or social media images online?
Did I repeat everything? Yes. Well, yes, we would like people to be able to see images in their context because actually
they won't understand what's happening if they just have the images. They need to see, OK, why was this image published, who answered, etc. So this would probably mean that we need to add at least the link to the post,
or maybe some kind of visualization of it.