arXiv 1711.10160

Snorkel: Rapid Training Data Creation with Weak Supervision

By Alexander Ratner, Stephen H. Bach, et al.

Published 2017-11-28

Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to gr…
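To make the idea of labeling functions concrete, here is a minimal sketch in plain Python. It does not use the Snorkel library or its API. The spam-detection task, the function names, and the label constants are all hypothetical, chosen only for illustration. The combination step is a simple majority vote, which is a stand-in baseline and not the denoising approach the paper describes: a majority vote treats every labeling function as equally accurate and independent, whereas the abstract states that Snorkel handles unknown accuracies and correlations.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical label constants for an illustrative spam task.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


# Each labeling function encodes one heuristic and may abstain.
def lf_contains_free(text: str) -> int:
    return POSITIVE if "free" in text.lower() else ABSTAIN


def lf_contains_meeting(text: str) -> int:
    return NEGATIVE if "meeting" in text.lower() else ABSTAIN


def lf_many_exclamations(text: str) -> int:
    return POSITIVE if text.count("!") >= 3 else ABSTAIN


LABELING_FUNCTIONS: List[Callable[[str], int]] = [
    lf_contains_free,
    lf_contains_meeting,
    lf_many_exclamations,
]


def apply_lfs(texts: List[str], lfs: List[Callable[[str], int]]) -> List[List[int]]:
    """Build a label matrix: one row per text, one column per labeling function."""
    return [[lf(text) for lf in lfs] for text in texts]


def majority_vote(votes: List[int]) -> int:
    """Combine one row of votes, ignoring abstentions.

    Returns ABSTAIN when no function voted or when the top two labels tie.
    This is a naive baseline, not Snorkel's denoising method.
    """
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    top = counts.most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return ABSTAIN
    return top[0][0]


texts = [
    "FREE prize!!! Click now",
    "Meeting moved to 3pm",
    "Hello there",
    "Free meeting",
]
label_matrix = apply_lfs(texts, LABELING_FUNCTIONS)
noisy_labels = [majority_vote(row) for row in label_matrix]
```

In this sketch the last text receives conflicting votes and the third receives none, so both end up unlabeled. Conflicts and gaps of this kind are what a denoising step has to resolve before the labels can be used for training.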
