How to measure and compare the performance of machine learning models

Hey! Welcome to COT. We have written many articles on AI and machine learning, but we have never talked about the metrics used to measure how well a model performs, or how to choose the model that works best for your problem or dataset.

So, this article is for those who do not know anything about it, of course :). My goal is to keep the article simple and not to introduce many new terms that you are not familiar with.

This article will be in two parts.

1. How to measure performance of a model?

2. How to choose the best Machine Learning model?

Let's start. Please keep the terms in italics in mind. They will be useful.


1. How to measure performance of a Machine Learning model?

Generally, the number that represents the performance of a model is called its score in machine learning. There are many metrics for computing it, but we will only talk about two: accuracy for classification, and R squared for regression.

(a) Finding performance of a classification model

Accuracy is a metric used to measure the performance of a classification model. To keep it simple, I will only talk about binary (yes/1 or no/0 output) classification. Finding accuracy becomes easy if you have the confusion matrix for a model. Now, what is a confusion matrix?

What is a confusion matrix?

A confusion matrix represents how well a model performed on test data, but it does not give us a single number like a score (accuracy, in this case). It is a matrix.

Let's understand it better. Below is what a confusion matrix looks like for a binary classifier.


The green color in the image represents the number of correct classifications, and red represents the incorrect ones. As you can see in the image, there are four elements in the matrix. Here is what they mean.

True Positives (TP): Number of instances (rows/examples) whose target values are yes/1, and the model predicts yes/1.

False Positives (FP): Number of instances whose target values are no/0, but the model predicts yes/1.

False Negatives (FN): Number of instances whose target values are yes/1, but the model predicts no/0.

True Negatives (TN): Number of instances whose target values are no/0, and the model predicts no/0.

For example, if you give your model the data of N patients who actually have heart disease, and the model says yes for all of them, then there are N True Positives.
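To make the four definitions above concrete, here is a minimal sketch that counts them in plain Python. The function name `confusion_counts` and the two label lists are made up for illustration; they are not from any library or real dataset.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = yes, 0 = no)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1      # target yes, model says yes
        elif a == 0 and p == 1:
            fp += 1      # target no, model says yes
        elif a == 1 and p == 0:
            fn += 1      # target yes, model says no
        else:
            tn += 1      # target no, model says no
    return tp, fp, fn, tn

# Made-up target values and predictions, for illustration only.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp, fn, tn)  # prints: 3 1 1 3
```

The four numbers always add up to the number of instances in the test data, which is a quick sanity check.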

Now you are ready to learn what accuracy is and how to find it. Just look at the image below, and you will know it. In short, accuracy is the number of correct predictions (TP + TN) divided by the total number of predictions (TP + FP + FN + TN).

If you want it as a percentage, multiply the answer by 100. Higher accuracy is better, but not always. You can read this article to understand why.
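Here is a small sketch of the calculation, using the standard definition of accuracy (correct predictions divided by all predictions). The function name and the numbers are made up for illustration. The second call hints at why higher accuracy is not always better: on a hypothetical dataset with 95 "no" instances and 5 "yes" instances, a model that always says "no" still scores 95%.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of predictions that were correct."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

# Made-up confusion matrix: 6 correct predictions out of 8.
acc = accuracy(tp=3, fp=1, fn=1, tn=3)
print(acc)        # prints: 0.75
print(acc * 100)  # prints: 75.0

# Hypothetical "always says no" model on 5 yes / 95 no instances:
# it never finds a single positive, yet accuracy looks great.
always_no_acc = accuracy(tp=0, fp=0, fn=5, tn=95)
print(always_no_acc)  # prints: 0.95
```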

(b) Finding R squared (R^2) score of a regression model

As you know, regression predicts continuous numbers, so you can't create a confusion matrix for it. A confusion matrix only works if you have classes (yes/no, etc.). So, we use R squared to measure the performance of regression models. We will talk about regression with 2 variables only.

What is R squared, or R^2 in machine learning?

As you know, regression is about fitting a line to data so that it can be used for predictions. That's enough knowledge. Before understanding R squared, you have to consider two cases.

Case 1: 

Assume you consider only one variable for prediction: the target variable itself. How? The mean value of the target will be the prediction for all instances, or examples. It will produce a huge error if the variance is high. Let's call the variance here var(mean); keep it in mind. Section A in the image below shows its graph and the variance formula.

[Note: Variance shows how far the values in the data are from the mean.]

Case 2: 

Now, take 2 variables into consideration for prediction. You will first find the best-fitting line using regression, and then you will measure the variance again, but this time around the fitted line instead of around a single value, the mean. Let's call the variance here var(fit); keep it in mind. You can see the formula and graph in Section B of the image above.

Now you are ready to learn what R squared is. The image below shows the formula for R squared.

Multiply the R squared value by 100 to get a percentage. The higher the R squared, the better the model. It tells you how much of the variance is removed when we consider another variable.
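The two cases can be sketched in plain Python as below. This assumes the formula in the image is the usual one, R^2 = (var(mean) - var(fit)) / var(mean), and that the line is fitted with ordinary least squares. The helper names and the data points are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys):
    """Assumed formula: (var(mean) - var(fit)) / var(mean)."""
    n = len(ys)
    y_bar = mean(ys)
    slope, intercept = fit_line(xs, ys)
    # Case 1: spread of the target around its own mean.
    var_mean = sum((y - y_bar) ** 2 for y in ys) / n
    # Case 2: spread of the target around the fitted line.
    var_fit = sum((y - (slope * x + intercept)) ** 2
                  for x, y in zip(xs, ys)) / n
    return (var_mean - var_fit) / var_mean

# Made-up data: one perfectly linear target, one with a little noise.
xs = [1, 2, 3, 4, 5]
r2_perfect = r_squared(xs, [2, 4, 6, 8, 10])
r2_noisy = r_squared(xs, [2.1, 3.9, 6.2, 7.8, 10.1])
print(r2_perfect, r2_noisy)
```

Whether you divide by n or n - 1 when computing the two variances does not matter here, because the same divisor appears in both and cancels out.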

R squared is not perfect; it has flaws. That is why we have adjusted R squared and the p-value. We will talk about them in upcoming articles.


2. How to choose the best Machine Learning model?

You must be thinking that models can be compared by comparing their performances, calculated using the techniques above. But what if I say you can do better? For that, you have to use something called cross validation.

What is Cross Validation in machine learning?

Normally, when you are training a model, you divide your dataset into two parts. One is called the training set, and the other is called the test set. Most of the time, we take a random 10% of the instances/examples as the test set. We can do better here.

Cross validation says:

(i) Divide your dataset into K equal parts, and use one part as the test set and the remaining K-1 parts as the training set. After that, train the model, and save the score/performance in an array.

(ii) Do it again and again with a different part as the test set each time, until every part has been used as the test set exactly once. The process will run exactly K times in total.
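The two steps above can be sketched in plain Python like this. Everything here is made up for illustration: the function names, the labels, and the very simple "model" that always predicts the most common label seen in training. When the number of rows is not divisible by K, the parts in this sketch differ in size by at most one row.

```python
import random

def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices and split them into k nearly equal parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(train_and_score, n_rows, k=10, seed=0):
    """Run steps (i) and (ii): every part is the test set exactly once."""
    folds = k_fold_indices(n_rows, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # The remaining k-1 parts together form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return scores

# Made-up binary labels, for illustration only.
labels = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1]

def majority_class_score(train_idx, test_idx):
    """Hypothetical model: always predict the most common training label."""
    train_labels = [labels[i] for i in train_idx]
    guess = 1 if sum(train_labels) * 2 >= len(train_labels) else 0
    hits = sum(1 for i in test_idx if labels[i] == guess)
    return hits / len(test_idx)  # accuracy on this test part

scores = cross_validate(majority_class_score, n_rows=len(labels), k=5)
print(scores)  # one accuracy value per part
```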

In the end, you will have an array of scores for different combinations of test and training sets. The array will have K elements, and it can be called the cross validation scores array/list. This is called K-fold Cross Validation. Generally, we use 10-fold cross validation. Now, the question is: how do you compare two models?

It's simple: apply cross validation to all the models you want to try (SVM, Decision Trees, regression, etc.). You will have a cross validation score list for each. Take the mean of each list; the model whose list has the highest mean is the best model. That's it.
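Here is a rough sketch of that comparison in plain Python. The dataset and both "models" are made up and deliberately tiny, just to show the mechanics: one model always predicts the majority label, the other uses a simple cut-off on x. Real models such as SVMs or decision trees would plug into the same loop through the hypothetical `fit` functions.

```python
import random
from statistics import mean

# Made-up data: (x, label). Label is 1 when x >= 10, except two rows
# (x=3 and x=15) that are flipped on purpose to act as noise.
data = [(x, 1 if x >= 10 else 0) for x in range(20)]
data[3] = (3, 1)
data[15] = (15, 0)

def fit_majority(train):
    """Hypothetical model 1: always predict the most common training label."""
    ones = sum(y for _, y in train)
    guess = 1 if ones * 2 >= len(train) else 0
    return lambda x: guess

def fit_threshold(train):
    """Hypothetical model 2: cut halfway between the two class means of x."""
    mean0 = mean(x for x, y in train if y == 0)
    mean1 = mean(x for x, y in train if y == 1)
    cut = (mean0 + mean1) / 2
    return lambda x: 1 if x > cut else 0

def cross_val_scores(fit, rows, k=5, seed=0):
    """Return the list of K accuracy scores for one model."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)  # same seed, so same parts for every model
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = fit(train)
        correct = sum(1 for x, y in test if predict(x) == y)
        scores.append(correct / len(test))
    return scores

models = {"majority class": fit_majority, "threshold rule": fit_threshold}
mean_scores = {name: mean(cross_val_scores(fit, data))
               for name, fit in models.items()}
best = max(mean_scores, key=mean_scores.get)
print(mean_scores)
print("best model:", best)
```

Using the same seed for every model means each one is trained and tested on exactly the same parts, so the mean scores are compared on equal terms.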


If you have any questions or suggestions, you can ask in the comment section below. If you liked the article, please share it with your friends. Bye!
