Supervised vs Unsupervised Learning
In this blog, we will walk through Supervised and Unsupervised Learning and answer these questions:
- What is Supervised Learning?
- What is Unsupervised Learning?
- When should we use them?
But before we dive into the details of these two types, let’s first quickly review what Machine Learning is.
Traditional Software Programming
Traditionally, software programming has been about writing a specific set of instructions to be executed in sequence to accomplish a task. It is like having a well-defined recipe for your desired outcome: any person or machine can follow it step by step and reliably produce the correct result. Problems of this kind include performing complex calculations, operating on large data sets, reliably tracking and responding to events, and many more. Because of how a computer’s hardware works at its most basic level (the level of chips), a machine is far more adept than a human at following such instructions without fail, and at speeds no human can match.
The problems mentioned above have been the focus of the software industry for the last few decades, and solving them gave an immense boost to human productivity. But there is still a huge set of tasks that, for the time being, only humans can do well. Recognizing and responding to emotions, and understanding and responding in natural spoken language, are examples of tasks that cannot practically be reduced to step-by-step instructions. Moreover, unlike pre-programmed computers, humans have a remarkable ability to adapt to changes in their environment.
Interestingly, for most of the things we do in life, we humans never receive explicit instructions (and even when we do, we hate to follow them). We can approach any task just like a video game: we make an attempt, we observe the outcome, we use that feedback to adjust our mental ‘model,’ and slowly, over time, we get better at achieving our desired results. The more we ‘train,’ the better our mental ‘model’ becomes. Machine Learning algorithms take a similar trial-and-error approach: they build models from data, without anyone explicitly coding, or even fully understanding, the underlying rules.
Supervised & Unsupervised Learning
In practice, the goal of machine learning is to build a model that takes a particular input X and produces the corresponding desired output Y. To build this model, we start with a mathematical function with specific starting parameters that we believe could work. Then we use our example data, called ‘Training Data,’ to adjust the parameters of this function, thus training our model.
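To make the idea of a function with adjustable parameters concrete, here is a minimal sketch. The straight-line model, the names `model` and `mean_squared_error`, and the four data points are all assumptions made up for illustration; real models usually have many more parameters.

```python
# Hypothetical toy model: a straight line with two adjustable parameters, w and b.
def model(x, w, b):
    """Return the model's output for input x, given parameters w and b."""
    return w * x + b

# Made-up training data that follows y = 2x + 1.
training_inputs = [0.0, 1.0, 2.0, 3.0]
training_outputs = [1.0, 3.0, 5.0, 7.0]

def mean_squared_error(w, b):
    """Average squared gap between the model's outputs and the desired outputs."""
    squared_gaps = [
        (model(x, w, b) - y) ** 2
        for x, y in zip(training_inputs, training_outputs)
    ]
    return sum(squared_gaps) / len(squared_gaps)

# The same function gives very different results for different parameters.
initial_error = mean_squared_error(w=0.0, b=0.0)   # an arbitrary starting guess
adjusted_error = mean_squared_error(w=2.0, b=1.0)  # parameters that fit the data
print(initial_error, adjusted_error)
```

Training, in this picture, is the search for the parameter values that bring the error down from the first number to something close to the second.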
Now, just like a trained human expert, this trained model can produce a good response for future inputs in real-world applications and, in some cases, keep improving itself as it goes (just like humans). Supervised Learning and Unsupervised Learning are two broad categories of algorithms that differ in how we adjust the parameters of our function to build our model.
In supervised learning, the training data contains both the input variables and the correct expected output for each input. We start the training process with specified starting values for the parameters of the modeling function. Then, we compare the output the model produces to the expected ‘correct’ output, and use the measured error as feedback to adjust the parameters so that the error is minimized. The correct output for each input in the training set is typically provided by humans, and thus learning happens under ‘human supervision.’ Regression and Classification algorithms fall under this category.
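The compare-and-adjust loop described above can be sketched with gradient descent on a straight-line regression model. This is only one of many ways to train a supervised model; the function name `train_linear_model`, the learning rate, the number of passes, and the data are assumptions chosen for illustration.

```python
def train_linear_model(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by repeatedly nudging w and b to shrink the squared error."""
    w, b = 0.0, 0.0  # specified starting values for the parameters
    n = len(inputs)
    for _ in range(epochs):
        # Compare the produced output to the expected 'correct' output.
        errors = [(w * x + b) - t for x, t in zip(inputs, targets)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2 * sum(errors) / n
        # Use the error feedback to adjust the parameters.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Made-up labeled training set: each input comes with its correct output.
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]

w, b = train_linear_model(inputs, targets)
print(round(w, 3), round(b, 3))
```

On this toy data the parameters settle close to w = 2 and b = 1, the line the labels were generated from. With a learning rate that is too large the updates would overshoot instead of converging, which is why the value here is kept small.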
Unsupervised Learning algorithms are those that do not involve any human ‘supervision’ in the learning process: there are no samples of correct output for the input variables in the training data. The model instead learns by adjusting its parameters to best capture the underlying pattern or structure in the input alone. Clustering and Anomaly Detection are two popular families of algorithms in this category.
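As a rough illustration of learning structure from inputs alone, here is a minimal sketch of k-means clustering on one-dimensional data. The function name `k_means_1d`, the starting centers, and the data points are assumptions for this example; production code would normally rely on a library implementation.

```python
def k_means_1d(points, centers, iterations=10):
    """Minimal k-means for 1-D data; the cluster centers are the adjustable parameters."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its assigned points
        # (a center with no points stays where it is).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Made-up inputs with no labels attached: two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = k_means_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

Nobody told the algorithm which point belongs to which group. The two centers drift toward the two groups purely because of how the inputs are distributed.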
Which type of learning is right for your problem?
While one could guess that we use supervised learning whenever human-labeled training data is available, labeled data can also be used with unsupervised learning: you are free to run unsupervised learning algorithms on labeled data by simply ignoring the labels. To know which approach is best for you, look at what you are trying to accomplish and which algorithms you would like to use. It is more about choosing the right individual algorithm than the correct type of learning.
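Ignoring the labels can be as simple as dropping them before the algorithm sees the data. The sketch below runs a very basic anomaly check on a labeled data set without ever reading the labels. The readings, the label names, and the two-standard-deviation threshold are all assumptions chosen for illustration.

```python
import statistics

# Made-up labeled data: (reading, human-provided label).
labeled_data = [
    (10.1, "normal"), (9.8, "normal"), (10.3, "normal"),
    (9.9, "normal"), (10.0, "normal"), (25.0, "faulty"),
]

# Drop the labels: the unsupervised step only ever sees the inputs.
values = [value for value, _label in labeled_data]

mean = statistics.mean(values)
spread = statistics.pstdev(values)  # population standard deviation

# Flag inputs that sit more than two standard deviations from the mean
# (the threshold of two is an arbitrary choice for this sketch).
anomalies = [v for v in values if abs(v - mean) > 2 * spread]
print(anomalies)
```

Here the check flags the same reading the human labeled as faulty, but it got there from the structure of the inputs alone. Whether that is preferable to training a supervised classifier on the labels depends on what you are trying to accomplish.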
In the next blog, we will cover different types of algorithms and which kind of learning is best suited to each of them.