Supervised Learning with Missing Values

Institut des Hautes Études Scientifiques (IHÉS)

Varoquaux, Gaël

Formal Metadata

Title

Supervised Learning with Missing Values

Title of Series

Journée Statistique et Informatique pour la Science des Données à Paris Saclay, 2021

Number of Parts

Author

Varoquaux, Gaël

Contributors

Soussen, Charles (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/54727 (DOI)

Publisher

Institut des Hautes Études Scientifiques (IHÉS)

Release Date

2021

Language

English

Production Year

2020

Content Metadata

Subject Area

Computer Science, Mathematics

Genre

Workshop/Interactive Format

Abstract

Some data come with missing values. For instance, a survey participant may skip some questions. There is an abundant statistical literature on this topic, establishing, for instance, how to fit models without bias due to missingness, and offering imputation strategies as practical solutions for the analyst. In machine learning, where the goal is to build models that minimize a prediction risk, most work defaults to these practices. As we will see, the statistical-inference and prediction settings lead to different theoretical and practical solutions. I will outline some conditions under which machine-learning models yield the best possible predictions in the presence of missing values. A striking result is that naive imputation strategies can be optimal, as the supervised-learning model does the hard work [1]. A challenge in fitting a machine-learning model is the combinatorial explosion of possible missing-value patterns: even when the output is a linear function of the fully observed data, the optimal predictor is complex [2]. I will show how a single dedicated neural architecture can approximate the optimal predictor well under multiple missing-value mechanisms, including difficult missing-not-at-random settings [3].

[1] Josse, J., Prost, N., Scornet, E., & Varoquaux, G. (2019). On the consistency of supervised learning with missing values. arXiv preprint.

[2] Le Morvan, M., Prost, N., Josse, J., Scornet, E., & Varoquaux, G. (2020). Linear predictor on linearly-generated data with missing values: non consistency and solutions. AISTATS 2020.
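The combinatorial point made in the abstract can be illustrated with a minimal sketch. It is not taken from the talk; it assumes a toy setting with two standardized Gaussian features of correlation `rho`, a linear output `y = beta0 + beta1*x1 + beta2*x2` plus zero-mean noise, and values missing completely at random. The function names are hypothetical. Under these assumptions, the best predictor is a different linear function for each missingness pattern, and the number of patterns grows as 2^d with the number of features d.

```python
from itertools import product


def missingness_patterns(d):
    """Return all 2**d missingness masks over d features (True = missing)."""
    return list(product([False, True], repeat=d))


def bayes_predictor_coefs(beta0, beta1, beta2, rho, mask):
    """Coefficients (intercept, w1, w2) of E[y | observed features].

    Assumed toy setting (an illustration, not the talk's own model):
    y = beta0 + beta1*x1 + beta2*x2 + zero-mean noise, with (x1, x2)
    standard bivariate normal of correlation rho, and values missing
    completely at random. A weight of 0.0 is reported for a missing feature.
    """
    x1_missing, x2_missing = mask
    if not x1_missing and not x2_missing:
        return (beta0, beta1, beta2)
    if x1_missing and x2_missing:
        # Nothing observed: predict the mean of y.
        return (beta0, 0.0, 0.0)
    if x2_missing:
        # E[x2 | x1] = rho * x1, so x1 absorbs part of the effect of x2.
        return (beta0, beta1 + beta2 * rho, 0.0)
    # x1 missing: E[x1 | x2] = rho * x2.
    return (beta0, 0.0, beta2 + beta1 * rho)


if __name__ == "__main__":
    # One linear model per pattern, even though the full-data model is linear.
    for mask in missingness_patterns(2):
        print(mask, bayes_predictor_coefs(1.0, 2.0, 3.0, 0.5, mask))
    # The number of patterns doubles with each added feature.
    print(len(missingness_patterns(10)))
```

Even in this two-feature case, the slope on `x1` changes depending on whether `x2` is observed, which is why a single linear model fitted on imputed data cannot in general match the optimal predictor.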