
Similarity Detection in Online Integrity

Formal Metadata

Title
Similarity Detection in Online Integrity
Subtitle
Fighting abusive content with algorithms
Series Title
Number of Parts
542
Author
License
CC Attribution 2.0 Belgium:
You may use, modify, and reproduce the work or its content in unmodified or modified form, distribute it, and make it publicly available for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
How Meta manages to take down millions of pictures, videos, and text posts that violate its community standards, all of them adversarially engineered, in a catalog that counts in the trillions. We'll talk about open-source technologies that embrace vector search, the state of the art in neural and non-neural embeddings, and turnkey solutions.

Content moderation is a problem that affects every service hosting user-uploaded media. From avatars to personal photo collections, the platform holds the responsibility of removing violating content. The problem can be tackled with classifiers, with human moderators, and by comparing media signatures; this presentation is about the latter. Similarity detection is an approach that tries to detect media based on an archive of "definitions" (yes, like antivirus software) of things that have already been classified as violating.

But how do we measure similarity between images from the perspective of a machine (not to mention video/audio clips of different lengths)? The answer is not MD5... We'll talk about how we do it, what technologies you can use too, and how we can leverage a public, crowdsourced archive of signatures to defeat threats ranging from terrorism to misinformation to child exploitation.
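The abstract's point that "the answer is not MD5" can be sketched in a few lines: a perceptual hash tolerates small changes to the media, while a cryptographic hash does not. The following toy average-hash over an 8x8 grayscale grid is an illustration only (the function names and the synthetic image are made up for this sketch); production systems such as Meta's open-sourced PDQ use DCT-based hashes over larger grids.

```python
import hashlib

def average_hash(pixels):
    # Toy perceptual hash: one bit per pixel, thresholded at the mean.
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    # Number of differing bits: a small distance means "similar images".
    return sum(x != y for x, y in zip(a, b))

# A synthetic 8x8 grayscale "image" and a lightly tampered copy.
img = [[(r * 8 + c) * 4 % 256 for c in range(8)] for r in range(8)]
tampered = [row[:] for row in img]
tampered[0][0] = 200  # change a single pixel

h1, h2 = average_hash(img), average_hash(tampered)
print(hamming(h1, h2))  # small distance: the perceptual hashes barely change

# A cryptographic hash, by contrast, changes completely on any edit:
md5_a = hashlib.md5(bytes(p for row in img for p in row)).hexdigest()
md5_b = hashlib.md5(bytes(p for row in tampered for p in row)).hexdigest()
print(md5_a == md5_b)  # False: exact hashes are useless for near-duplicates
```

Matching a candidate image against an archive of signatures then reduces to a nearest-neighbor search over these bit vectors, which is where the vector-search technologies mentioned above come in.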
How Meta manages to take offline millions of pictures, videos and text that violate its community standards, all of them adversarially engineered, in a catalog that counts in the trillions. We'll talk about open source technologies that embrace vector search, state of the art in neural and non-neural embeddings, as well as turnkey solutions. Content moderation is a problem that affects every service that hosts user uploaded media. From the avatars to a personal collection of pictures, the platform holds the responsibility of removing the violating content. The problem can be tackled with clssifiers, human moderators and by comparing media signatures; this presentation will be about the latter. Similarity Detection is an approach that tries to detect media based on an archive of "definitions" (yes, like the antiviruses) of things that have already been classified as violating. But how do we measure similarity between images from the perspective of a machine (not to mention video/audio clips of different lenghts)? The answer is not MD5... We'll talk how we do it, what technologies you can use too and how we can leverage a public, crowdsourced archive of signatures to defeat various threats, from terrorism to misinformation to Child Exploitation.