Computer vision has made impressive gains through the use of deep learning models, trained with large-scale labeled data. However, labels require expertise and curation and are expensive to collect. Can one discover useful visual representations without the use of explicitly curated labels? In this talk, I will present several case studies exploring the paradigm of self-supervised learning – using raw data as its own supervision. Several ways of defining objective functions in high-dimensional spaces will be discussed, including the use of General Adversarial Networks (GANs) to learn the objective function directly from the data. Applications in image synthesis will be shown, including automatic colorization, paired and unpaired image-to-image translation (aka pix2pix and cycleGAN), and, terrifyingly, #edges2cats. The opinions expressed in this video do not necessarily reflect the views of the Heidelberg Laureate Forum Foundation or any other person or associated institution involved in the making and distribution of the video. |