Snorkel Beambell - Real-time Weak Supervision on Apache Flink

Formal Metadata

Title
Snorkel Beambell - Real-time Weak Supervision on Apache Flink
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The advent of Deep Learning models has led to massive growth in real-world machine learning. Deep Learning allows Machine Learning practitioners to achieve state-of-the-art scores on benchmarks without any hand-engineered features. These models, however, rely on massive hand-labeled training datasets, which are a bottleneck in developing and modifying machine learning models. Most large-scale Machine Learning systems today, such as Google’s DryBell, use some form of Weak Supervision to construct lower-quality, large-scale training datasets that can be used to continuously retrain and deploy models in real-world scenarios. The challenge with continuous retraining is that one needs to maintain prior state (e.g., the labeling functions in the case of Weak Supervision, or a pre-trained model like BERT or Word2Vec for Transfer Learning) that is shared across multiple streams while the model is continuously updated. Apache Beam’s stateful stream processing capabilities are well suited to this requirement and can support scalable Weak Supervision.
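As a rough illustration of the idea described in the abstract, the following sketch shows weak supervision in plain Python: a few heuristic labeling functions vote on each incoming record, and a per-key counter stands in for the shared state a stream processor would maintain. This is not the Snorkel, DryBell, or Apache Beam API; every function name, heuristic, and the majority-vote combiner here is an assumption chosen for illustration only.

```python
from collections import Counter, defaultdict

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical labeling functions: cheap heuristics that either vote or abstain.
def lf_contains_refund(text):
    return POSITIVE if "refund" in text.lower() else ABSTAIN

def lf_contains_thanks(text):
    return NEGATIVE if "thanks" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return POSITIVE if text.count("!") >= 2 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_refund, lf_contains_thanks, lf_many_exclamations]

def weak_label(text):
    """Majority vote over non-abstaining labeling functions.

    Returns ABSTAIN when no function votes or when the top two labels tie.
    """
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]

def label_stream(events):
    """Consume (key, text) events and yield (key, label, running label counts).

    The per-key dictionary is a stand-in for the keyed state that a stateful
    stream processor would manage on the application's behalf.
    """
    state = defaultdict(Counter)
    for key, text in events:
        label = weak_label(text)
        state[key][label] += 1
        yield key, label, dict(state[key])

events = [
    ("user1", "I want a refund!!"),
    ("user2", "Thanks a lot"),
    ("user1", "ok"),
]
results = list(label_stream(events))
```

In a real deployment, the in-memory dictionary would be replaced by the stream processor's own keyed state, and the majority vote by a learned label model; both substitutions are outside the scope of this sketch.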