You have probably heard of artificial intelligence and model training many times. In this post, we explain what types of training exist and which subproblems artificial intelligence can be divided into based on those types. Let's get started.
Machine learning problems are usually divided into two phases. The first is the training phase, where the algorithm learns and finds patterns in the data. The second is the testing phase, where we give the algorithm data it has never seen and ask it to make predictions. The data is typically split 80-20 or 70-30: around 70 or 80% for training and the remaining 20 or 30% for testing.
For example, if we want to predict bank fraud, we first have to divide our data set into training and testing subsets. In the training phase, we pass the algorithm the training data so that it learns to recognize the patterns that distinguish fraud from legitimate activity. Then, in the test phase, we give it the new data and ask it to make predictions.
Typically, another subset of the data, called validation, is also set aside; we will talk about it later. First, we are going to learn more about the most common types of learning.
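The 80-20 split described above can be sketched in a few lines of plain Python. This is only an illustration: the `train_test_split` helper, the `transactions` data, and the fixed `seed` are assumptions made up for this example, not part of any particular library.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # Shuffle so that neither subset depends on the original ordering.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical transactions: (amount, is_fraud)
transactions = [(float(i * 10), i % 10 == 0) for i in range(100)]
train, test = train_test_split(transactions, test_fraction=0.2)
```

The algorithm would only ever see `train` while learning; `test` stays hidden until it is time to check the predictions.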
As we said in the introduction, there are several types of learning. Machines need a period of learning from the data so that they can later recognize patterns based on that experience. In this post, we will look at both supervised and unsupervised learning. Supervised learning covers two types of problems, regression and classification, while unsupervised learning, broadly speaking, covers one type of problem: clustering.
Let’s look at them in a little more detail.
The first is supervised learning, one of the most common types of learning in machine learning. During the training process, we give the algorithm both the characteristics and the result, that is, the class or value we want it to predict. The algorithm therefore knows both the characteristics and the correct answer, but only during training.
As we mentioned, within supervised learning, we find two types of problems: regression and classification.
In the case of classification, the objective is to assign a specific class to each data record based on its characteristics.
For example, in a spam detection problem, we would give the algorithm a training data set containing both the characteristics of each email and whether it is spam or not. The algorithm should then learn what characterizes the emails labeled as spam and what characterizes those labeled as non-spam.
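As a rough sketch of the spam example, here is a very small classifier in plain Python. It uses a 1-nearest-neighbor rule, which is just one of many possible classification techniques and is chosen here only because it is short. The two features (number of links, number of exclamation marks) and all the data are hypothetical.

```python
import math

def predict_label(train_features, train_labels, query):
    """Return the label of the training example closest to `query`."""
    best_index = min(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    return train_labels[best_index]

# Hypothetical features per email: (number of links, number of exclamation marks)
train_features = [(0, 0), (1, 0), (0, 1), (7, 5), (9, 8), (6, 9)]
train_labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

# New emails the algorithm has never seen; their labels are unknown to it.
new_emails = [(8, 6), (1, 1)]
predictions = [predict_label(train_features, train_labels, e) for e in new_emails]
```

The labels are available only for the training examples; for the new emails, the algorithm has to produce the class itself.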
In the case of regression, the objective is to assign a continuous numerical value to each data record.
An example would be predicting house prices, where we have a series of characteristics of the property and want to predict how much it will cost. That is, here we are not looking for a specific class, but for a numerical value.
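To make the house price example concrete, here is a minimal sketch of a regression with a single characteristic, fitted with ordinary least squares in plain Python. The `fit_line` helper and the sizes and prices are invented for illustration; a real problem would use many more characteristics and noisier data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: house size in square meters and sale price
sizes_m2 = [50, 70, 90, 110]
prices = [150_000, 200_000, 250_000, 300_000]

slope, intercept = fit_line(sizes_m2, prices)

# Predict the price of an unseen 80 square meter house.
predicted_price = slope * 80 + intercept
```

Unlike the classifier, the output here is a number on a continuous scale rather than one of a fixed set of classes.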
The second is unsupervised learning. In this type of learning, we do not know the group or class we want to assign, either in training or in testing.
Typically, problems in unsupervised learning are clustering problems. That is, we want to find patterns in the data that allow us to group the records that have common characteristics.
For example, when we want to carry out a marketing campaign, we segment our clients into groups with similar characteristics and, based on that segmentation, target each group with more relevant and personalized advertising.
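The client segmentation idea can be sketched with k-means, one common clustering technique (the post does not prescribe a specific algorithm, so this choice is an assumption). The customer data, the two features, and the choice of initial centroids are all hypothetical.

```python
import math

def k_means(points, centroids, iterations=10):
    """Basic k-means: alternate assigning points and recomputing centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Hypothetical customers: (age, monthly spend). No labels are provided.
customers = [(22, 30), (25, 35), (24, 28), (58, 210), (61, 190), (55, 205)]
assignments, centroids = k_means(customers, centroids=[customers[0], customers[3]])
```

Note that no class is ever given to the algorithm: the groups in `assignments` emerge only from how similar the customers are to each other.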
There are other types of training that differ somewhat from these, such as transfer learning, reinforcement learning, and active learning, which we will discuss in more detail in other posts.