
How to write a scikit-learn compatible estimator/transformer

Formal Metadata

Title
How to write a scikit-learn compatible estimator/transformer
Subtitle
Tips and tricks, testing your estimator, and must-watch related current developments
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
This short, hands-on tutorial shows how to write your own estimator or transformer that can be used in a scikit-learn pipeline and works seamlessly with the library's other meta-estimators. It also shows how such estimators can be conveniently tested with a simple set of tests. In many data science tasks, use-case-specific requirements force us to slightly modify the behavior of some of the estimators or transformers that ship with scikit-learn. Some of the relevant tips and requirements are not well documented by the library, and finding those details can be cumbersome. In this tutorial, we go through an example of writing our own estimator, test it against scikit-learn's common tests, and see how it behaves inside a pipeline and a grid search. There have also been recent developments in the general estimator API that require slight modifications from third-party developers. I will cover these changes and point you to the activities to watch, as well as some of the private utilities you can use to improve your experience of developing an estimator.
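As a rough illustration of the workflow the abstract describes, the sketch below defines a small transformer and runs it inside a pipeline and a grid search. It is not taken from the talk: the class name `ClipOutliers`, its `lower` and `upper` parameters, and the synthetic data are all hypothetical, chosen only to show the usual conventions (hyperparameters stored unchanged in `__init__`, learned attributes ending in an underscore, `fit` returning `self`).

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.utils.validation import check_array, check_is_fitted


class ClipOutliers(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: clip each feature to quantiles learned in fit."""

    def __init__(self, lower=0.05, upper=0.95):
        # Store hyperparameters as given; BaseEstimator.get_params and
        # set_params rely on attribute names matching the argument names.
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = check_array(X)  # validate input and convert to a 2D ndarray
        self.n_features_in_ = X.shape[1]
        # Learned state uses a trailing underscore by convention.
        self.low_ = np.quantile(X, self.lower, axis=0)
        self.high_ = np.quantile(X, self.upper, axis=0)
        return self  # returning self allows chaining such as fit(X).transform(X)

    def transform(self, X):
        check_is_fitted(self)  # raises NotFittedError if fit was not called
        X = check_array(X)
        return np.clip(X, self.low_, self.high_)


# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)

# The transformer is used like any built-in step inside a pipeline, and its
# parameters can be tuned with the "<step>__<param>" naming scheme.
pipe = Pipeline([("clip", ClipOutliers()), ("clf", LogisticRegression())])
search = GridSearchCV(pipe, {"clip__lower": [0.01, 0.05, 0.1]}, cv=3)
search.fit(X, y)
```

scikit-learn's common tests are exposed through `sklearn.utils.estimator_checks.check_estimator`. This sketch has not been verified against them and would likely need additional input validation before it passes.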