What Is the Naive Bayes Algorithm? The Different Kinds of Naive Bayes Classifiers, Explained with Real-World Examples

Machine learning is divided into 3 major parts: supervised learning, unsupervised learning, and reinforcement learning. We have covered all of these topics before, explaining each one thoroughly with real-world examples while keeping things as simple as possible. If you are a beginner and don't know where to start with machine learning, you can check out our previous machine learning series here.

If you do a little research into what you need to know to get into machine learning, then almost anywhere you look you will hear one word: maths. Seriously, I am not going to lie, it is true: maths is very necessary for machine learning. You need a good grasp of programming concepts, statistics, calculus, and so on. However, some algorithms do not require such deep maths knowledge and are still very effective, which is why I am starting this algorithm series with Naive Bayes. Later on we will cover more algorithms one by one, and I will also teach you the necessary maths with simple examples, so don't worry. Now let's move on to our algorithm: Naive Bayes.

Naive Bayes is one of the easiest and most efficient algorithms for supervised learning. So what the heck does it do? It is used for classification. For example, given a dataset of fruits, we can use it to predict which fruit an item is based on its properties. If that doesn't make sense yet, don't worry, I will explain it in detail later.

To learn Naive Bayes, you need to cover the following topics in sequence:
1) Conditional probability
2) Bayes' theorem
3) Naive Bayes classifier
After these 3 steps, you will have covered the whole topic of Naive Bayes. So let's get started.

1. Conditional Probability:- Conditional probability is the probability that one event occurs given that another event has already occurred.
Example:-

P(A | B) = P(A ⋂ B) / P(B)

How to read it: the probability of event A given that event B has already occurred.
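As a quick sketch of the formula (the counts are made up just for illustration), you can compute a conditional probability directly from counts in Python:

```python
# Conditional probability from raw counts: P(A | B) = P(A and B) / P(B).
# Toy numbers for illustration only.
total = 100         # total outcomes observed
count_B = 40        # outcomes where B occurs
count_A_and_B = 10  # outcomes where both A and B occur

p_B = count_B / total
p_A_and_B = count_A_and_B / total

p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 0.25
```

Out of the 40 outcomes where B happened, A also happened in 10 of them, which is exactly the 0.25 the formula gives.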

2. Bayes Theorem:- Bayes' theorem lets you find a probability when you already know the probabilities of related events.

Equation:- P(A | B) = P(B | A) · P(A) / P(B)

Derivation of this equation:

P(A | B) = P(A ⋂ B) / P(B)

P(B | A) = P(B ⋂ A) / P(A)

Since P(A ⋂ B) = P(B ⋂ A), rearranging both definitions gives
P(A ⋂ B) = P(A | B) · P(B) = P(B | A) · P(A)
Dividing through by P(B):
P(A | B) = P(B | A) · P(A) / P(B)

So, this is the derivation of the equation. That is why we needed to understand conditional probability first: it shows where this equation comes from.

Let's take a real-world example to understand this. Take a deck of cards; you know there are 52 cards in a deck. Now suppose we have to find the probability that a card is a king, given that it is a face card. What can you do here? You can apply Bayes' theorem to get your answer.

Question:- Find the probability that a card is a king given that it is a face card (a card with a picture on it).

Answer:- You already know there are 4 kings in a deck and 12 face cards. So all you have to do is put the values into Bayes' equation.
P(King | Face) = P(Face | King) · P(King) / P(Face)
               = 1 · (4/52) / (12/52)
               = (1/13) / (3/13)
               = 1/3
This is one very simple example of Bayes' theorem. Now let's understand the different classifiers of Naive Bayes.
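You can check the same card computation in a few lines of Python, using exact fractions:

```python
from fractions import Fraction

# P(King | Face) = P(Face | King) * P(King) / P(Face)
p_king = Fraction(4, 52)         # 4 kings in a 52-card deck
p_face = Fraction(12, 52)        # 12 face cards (J, Q, K of each suit)
p_face_given_king = Fraction(1)  # every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)  # 1/3
```

This matches the intuition directly: of the 12 face cards, 4 are kings, so the answer is 4/12 = 1/3.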

3. Naive Bayes Classifiers:- First of all, we have to understand: what is a classifier? A classifier is a model that assigns objects to categories based on their features, like dog vs cat.

For example:- We have a set of fruits, and we have to find the probability of each fruit given its observed features. Here we are given Fruit = {Yellow, Sweet, Long}: from the dataset, we have to find which fruit is most likely to be yellow as well as sweet and long.


As you can see in the image, the conditional probabilities are given for all the fruits. So how do we apply Bayes' theorem to solve this question? For each fruit, first find the probability of each feature given that fruit, and then multiply those probabilities together (Naive Bayes assumes the features are independent, which is what makes this multiplication valid).

Like, P(Orange | Yellow, Sweet, Long) ∝ P(Yellow | Orange) · P(Sweet | Orange) · P(Long | Orange)

Similarly, you have to compute this score for every fruit:

P(Orange | features) ∝ 0.53 × 0.69 × 0 = 0
P(Banana | features) ∝ 1 × 0.75 × 0.87 ≈ 0.65
P(Others | features) ∝ 0.33 × 0.66 × 0.33 ≈ 0.072

(These scores are only proportional to the true posterior probabilities: the common denominator P(Yellow, Sweet, Long) cancels when we compare fruits, and the class priors are taken as equal here.)

So, here we have the maximum value for banana, which is correct: a banana is long and sweet as well, so our algorithm produced the right output.
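Here is a minimal sketch of that comparison in Python, using the per-fruit feature probabilities quoted above:

```python
# Naive Bayes scores for Fruit = {Yellow, Sweet, Long}: multiply the
# per-class feature probabilities (the naive independence assumption
# is what makes this product valid).
likelihoods = {
    "Orange": [0.53, 0.69, 0.00],  # P(Yellow|fruit), P(Sweet|fruit), P(Long|fruit)
    "Banana": [1.00, 0.75, 0.87],
    "Others": [0.33, 0.66, 0.33],
}

scores = {}
for fruit, probs in likelihoods.items():
    score = 1.0
    for p in probs:
        score *= p
    scores[fruit] = score

best = max(scores, key=scores.get)
print(best)  # Banana (score ~0.65, vs 0 for Orange and ~0.072 for Others)
```

Note how P(Long | Orange) = 0 zeroes out the whole product for orange; real Naive Bayes implementations usually apply smoothing (e.g. Laplace smoothing) so a single zero probability cannot veto a class.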

Now let's see the different variants of Naive Bayes. It has 3 variants:
1) Bernoulli Naive Bayes
2) Multinomial Naive Bayes
3) Gaussian Naive Bayes

1) Bernoulli Naive Bayes:-
If you have a dataset whose feature values are binary, then you can apply Bernoulli Naive Bayes. Whenever the data looks like true/false, positive/negative, or any other binary form, Bernoulli Naive Bayes is the one to apply.

Bernoulli Distribution:-

P(X = x) = P^x · (1 − P)^(1−x)

Example:- P(success) = P
          P(failure) = Q = 1 − P

where x = 0 or 1

If we put these values into our equation, we get:
X = 1 for success
X = 0 for failure

So, P(X) = { P    if x = 1
           { Q    if x = 0

This is our Bernoulli distribution. Very simple: when you have binary-type data, just apply this distribution.
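As a small sketch, the Bernoulli probability mass function is essentially one line of Python (the success probability 0.7 here is just an example value):

```python
def bernoulli_pmf(x, p):
    """P(X = x) = p^x * (1 - p)^(1 - x), for a binary outcome x."""
    assert x in (0, 1), "Bernoulli outcomes are binary"
    return p ** x * (1 - p) ** (1 - x)

p = 0.7  # illustrative success probability
print(bernoulli_pmf(1, p))  # 0.7 -> P(success) = p
print(bernoulli_pmf(0, p))  # -> P(failure) = 1 - p
```

Plugging in x = 1 recovers P and x = 0 recovers Q = 1 − P, exactly as in the two-case form above.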

2) Multinomial Naive Bayes:-

You can apply Multinomial Naive Bayes when your data consists of discrete counts.
Example:- Suppose you have a document and I give you one word and ask how many times that word occurs in it. Feature vectors made of such word counts are exactly what Multinomial Naive Bayes is designed for.

Multinomial distribution:-

P(X1 = x1, …, Xk = xk) = [ N! / (x1! · … · xk!) ] · P1^x1 · … · Pk^xk

In a situation where you are given the probabilities of each outcome and you observe discrete counts of those outcomes, you model the counts with the multinomial distribution.
For example, we are given:

Okay, so we did a survey of the blood groups in one classroom, and then we picked some random students as the sample.
(blood group: number of students)
blood group O: 1
blood group A: 2
blood group B: 2
blood group AB: 1
Now, how many students have blood group O? That is 1 (the data is given); similarly for B (2), and so on. The data given to you is a set of discrete counts, so in this situation you can use the multinomial distribution.

If you put these values into the equation, you get:

X1 = 1, X2 = 2, X3 = 2, X4 = 1
where Xi = number of occurrences (number of students with that blood group)
N = 6, the size of your sample

So, whenever you are in a situation like this, where you are given probabilities and observe discrete counts, you can apply this distribution without a second thought.
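Here is a sketch of the multinomial probability mass function applied to the blood-group sample. The per-group probabilities are not given in the example, so purely for illustration we assume they equal the sample proportions:

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X1=x1, ..., Xk=xk) = N!/(x1!...xk!) * p1^x1 * ... * pk^xk."""
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)  # divide out each xi!
    prob = 1.0
    for x, p in zip(counts, probs):
        prob *= p ** x
    return coeff * prob

# Blood-group sample from above: O=1, A=2, B=2, AB=1 (N = 6).
counts = [1, 2, 2, 1]
probs = [1/6, 2/6, 2/6, 1/6]  # assumed equal to sample proportions
print(round(multinomial_pmf(counts, probs), 4))  # 0.0617
```

The multinomial coefficient 6!/(1!·2!·2!·1!) = 180 counts the orderings of the sample, and the product of powers is the probability of any single ordering.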

3) Gaussian Naive Bayes:

Before we start, where do you apply this algorithm? You can apply Gaussian Naive Bayes when the features or variables are continuous in nature.

When the data consists of discrete counts, you cannot use Gaussian Naive Bayes.

When the features take continuous values whose probability we have to model, we use the Gaussian distribution for each feature. For example, you can apply this to the iris dataset, whose measurements are continuous.

Probability Density Function:-

P(x) = (1 / √(2πσ²)) · e^(−(x − μ)² / (2σ²))

where μ and σ are the mean and standard deviation of the feature, estimated from the training data for each class.
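A minimal sketch of the per-feature Gaussian density used by Gaussian Naive Bayes (the mean and standard deviation here are illustrative, not taken from a real dataset):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    """Gaussian density; Gaussian Naive Bayes evaluates one of these
    per feature, per class, with mu and sigma fit from training data."""
    return (1 / (sigma * sqrt(2 * pi))) * exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Illustrative values: a continuous feature with mean 5.0 and std dev 1.0
# (think of one iris measurement), evaluated at its mean.
print(round(gaussian_pdf(5.0, 5.0, 1.0), 4))  # 0.3989
```

To classify, you would multiply these per-feature densities together (times the class prior) for each class and pick the class with the largest product, exactly as in the fruit example.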
This algorithm is very useful and widely used in supervised learning. It is the first basic algorithm you need to learn to get started with supervised learning. If you want to know which algorithms are used for solving which problems, you can check out this article Here.

I know this is a very long article, but this is the very first algorithm you are learning in machine learning, so it needs to be very clear and well explained. If you have any doubts, you can join our community Here, where you can ask your questions, discuss with many other people, and more. It will help you gain more knowledge and enhance your skills. I will keep the server updated with the upcoming machine learning series, so if you want to learn machine learning from the beginning, don't forget to join. LINK

If you have any doubts regarding this article, comment down below and we will definitely answer. This series continues, so stay tuned. I will see you in the next post; till then, goodbye. Peace.












