Improving Barely Supervised Learning by Discriminating Unlabeled Data with Super-Class
Advances in Neural Information Processing Systems
In semi-supervised learning (SSL), a common practice is to learn consistent information from unlabeled data and discriminative information from labeled data to ensure both the immutability and the separability of the classification model. Existing SSL methods suffer from failures in barely-supervised learning (BSL), where only one or two labels per class are available, as the insufficient labels cause the discriminative information to be difficult or even infeasible to learn. To bridge this gap, we investigate a simple yet effective way to leverage unlabeled data for discriminative learning, and propose a novel discriminative information learning module to benefit model training. Specifically, we formulate the learning objective of discriminative information at the super-class level and dynamically assign different categories into different super-classes based on model performance improvement. On top of this on-the-fly process, we further propose a distribution-based loss to learn discriminative information by utilizing the similarity between samples and super-classes. It encourages the unlabeled data to stay closer to the distribution of their corresponding super-class than those of others. Such a constraint is softer than the direct assignment of pseudo labels, while the latter could be very noisy in BSL. We compare our method with state-of-the-art SSL and BSL methods through extensive experiments on standard SSL benchmarks. Our method can achieve superior results, e.g., an average accuracy of 76.76% on CIFAR-10 with merely 1 label per class. The code is available at https://github.com/GuanGui-nju/SCMatch.
Open Access Status
This publication is not available as open access
China Postdoctoral Science Foundation