What is it about?
The task is supervised classification. A data set is collected and labelled by hand, but the labels of some objects are unknown or uncertain. For example, it may not be clear whether an object in a satellite image is a pedestrian, a tree, or a shadow. Or an e-mail might be labelled as 20 percent spam and 80 percent regular e-mail. Can this additional information be used to improve the quality of the classifier? Yes, it can.
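To make the idea of an uncertain label concrete, here is a minimal sketch of how a "20 percent spam" style label can enter training directly as a soft target. It is not the method of the summarised paper; it is a generic one-feature logistic regression, and the function names, the feature values, and the soft targets are all hypothetical, chosen only for illustration.

```python
import math


def soft_cross_entropy(probs, soft_label):
    """Cross-entropy between predicted class probabilities and a soft label."""
    eps = 1e-12  # avoids log(0)
    return -sum(q * math.log(max(p, eps)) for p, q in zip(probs, soft_label))


def predict_proba(w, b, x):
    """Predicted probability of the positive class (here: spam)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def fit_soft_logistic(xs, soft_targets, lr=0.5, epochs=2000):
    """Gradient descent on the cross-entropy loss with targets in [0, 1].

    A target of 0.2 means "20 percent spam"; hard labels are the
    special cases 0.0 and 1.0.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, soft_targets):
            p = predict_proba(w, b, x)
            grad_w += (p - t) * x  # derivative of the loss w.r.t. w
            grad_b += p - t        # derivative of the loss w.r.t. b
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: one numeric feature per e-mail and an uncertain spam label.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
soft_targets = [0.0, 0.2, 0.5, 0.8, 1.0]

w, b = fit_soft_logistic(xs, soft_targets)
print(predict_proba(w, b, 1.5))
```

The design choice worth noting is that the uncertain objects are kept in the training set with their partial labels instead of being discarded or forced into a single class.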
Why is it important?
Google and other companies collect huge amounts of data, and special teams try to label it. Often only a small part (5%) is labelled by humans; the rest may be labelled by an algorithm, which may not be accurate. Can these imperfect labels still help predict the labels of the remaining samples? This problem arises whenever labelling is impossible, time-consuming, or expensive.
Read the Original
This page is a summary of: On a Weakly Supervised Classification Problem, January 2022, Springer Science + Business Media,
DOI: 10.1007/978-3-031-16500-9_26.