Back to Glossary
Machine Learning

Weakly supervised learning

Weakly supervised learning is a type of machine learning where the training data is labeled with less accurate, less complete, or less informative labels than those used in fully supervised learning. It aims to train models using these noisy or imprecise labels, alleviating the need for extensive manual annotation.

Explanation

Weakly supervised learning addresses the challenge of acquiring large, accurately labeled datasets, which can be expensive and time-consuming. Instead of precise labels, it utilizes weaker forms of supervision, such as: * **Inexact Labels:** The provided label indicates membership in a broader category, rather than a specific instance. For example, labeling an entire image as containing a 'car' instead of bounding boxes around each individual car. * **Inaccurate Labels:** The labels are noisy and may contain errors. For instance, a labeler might misidentify a 'dog' as a 'wolf' in some instances. * **Incomplete Labels:** Only a subset of the training data is labeled. Techniques in weakly supervised learning include: * **Self-training:** The model is initially trained on a small set of labeled data and then used to predict labels for the unlabeled data. High-confidence predictions are then added to the training set, iteratively improving the model. * **Multi-instance learning:** Each instance is part of a 'bag' with a single label, indicating if at least one instance in the bag belongs to a class. The model learns to identify which instances within a bag contribute to the bag's label. * **Learning from label proportions:** The model learns from the proportion of each class in a set of data, rather than individual labels. Weakly supervised learning is beneficial in scenarios where obtaining high-quality labeled data is impractical. It is used in various applications, including image classification, object detection, and natural language processing. The key is to design algorithms that are robust to noise and can effectively leverage the available weak supervision signals to learn useful patterns.

Related Terms