Summer Research Fellowship Programme of India's Science Academies

Illustrated self-supervised learning

Sankalp Arora

National Institute Of Technology Kurukshetra, Mirzapur Part, Haryana, 136119

Dr Vineeth N Balasubramanian

Indian Institute of Technology Hyderabad, Near NH-65, Sangareddy, Kandi, Telangana, 502285


When training machine learning models, a sufficient supply of labels is usually the first requirement that comes to mind, but manual annotation is a time-consuming and expensive process. The growing stock of unlabeled data (e.g., all the images on the Internet, free surveys, free text) is substantially larger than the limited number of human-curated labeled datasets, so it would be wasteful not to use it. Label generation is thus a significant bottleneck in the current supervised learning paradigm.

What if we could frame a supervised learning task so that the model predicts only a subset of the information from the rest? Labels for unlabeled data would then come for free, and an unlabeled dataset could be trained on in a supervised manner. This is achievable, because everything needed, both inputs and labels, is already contained in the data. This technique is known as self-supervised learning. Self-supervised learning is used mainly in natural language processing and much less in computer vision models.
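The idea of predicting a hidden subset of the data from the rest can be illustrated with a minimal sketch. It is not taken from this paper: the function name `make_masked_pair`, the `mask_fraction` parameter, and the choice to zero out hidden entries are all assumptions made for illustration. The sketch only shows how an unlabeled sample yields an input and its targets without any human annotation; it does not train a model.

```python
import numpy as np

def make_masked_pair(sample, mask_fraction, rng):
    """Split one unlabeled 1-D sample into a model input and free targets.

    A random subset of entries is hidden (set to zero, an illustrative choice);
    the hidden values themselves become the prediction targets.
    """
    n_hidden = max(1, int(round(mask_fraction * sample.size)))
    hidden_idx = rng.choice(sample.size, size=n_hidden, replace=False)
    targets = sample[hidden_idx].copy()   # the labels come from the data itself
    model_input = sample.copy()
    model_input[hidden_idx] = 0.0         # the model sees only the remaining entries
    return model_input, hidden_idx, targets

rng = np.random.default_rng(0)
sample = rng.random(16)                   # stand-in for one unlabeled data point
model_input, hidden_idx, targets = make_masked_pair(sample, 0.25, rng)
```

A supervised model could then be trained to map `model_input` to `targets` at the positions in `hidden_idx`, exactly as it would be with human-provided labels.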

This paper covers the methodologies and recently introduced techniques of self-supervised learning (SSL), categorized into three sections: image-based, video-based, and control-based SSL.

Keywords: machine learning, representation learning, self-supervision.
