Multinomial Naïve Bayes

I am watching lectures on Natural Language Processing. In the week 3 lectures, the professor talks about text classification. I found that writing the formulas out in words helps.

Given a training set, classify each document in the test set into a class.

We need to calculate two kinds of probabilities.
Say we have a training set of 5 documents, with 3 documents of class ‘A’ and 2 documents of class ‘B’.

1) Prior probability of a class, say ‘A’, given the training set

= number of documents classified as ‘A’ / total number of documents in the training set

With the example above, that is 3/5 for class ‘A’ and 2/5 for class ‘B’.
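Here is a minimal Python sketch of that prior calculation; the variable names (training_labels, priors) are mine, not from the lecture.

    from collections import Counter

    # 5 training documents: 3 labeled 'A', 2 labeled 'B' (toy labels, made up here)
    training_labels = ['A', 'A', 'A', 'B', 'B']

    label_counts = Counter(training_labels)
    total_docs = len(training_labels)

    # Prior of each class = documents in that class / total documents
    priors = {c: n / total_docs for c, n in label_counts.items()}
    print(priors)  # {'A': 0.6, 'B': 0.4}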

2) Probability of each word in the vocabulary, given a class

  • Vocabulary (V) – the unique words in the training set; below, V also stands for their count
  • Assume there are 3 unique words in the training set: hello, world, goodbye (so V = 3)
  • All documents of class ‘A’ are merged into one big document, and the same for class ‘B’
  • Probability of the word ‘hello’ given class ‘A’ (see the sketch after this list)

    = (number of times ‘hello’ occurs in documents classified as ‘A’ + 1) / (total words in documents classified as ‘A’ + V)

  • The + 1 in the numerator and the + V in the denominator are Laplace (add-one) smoothing, so a word that never appears in class ‘A’ still gets a small non-zero probability
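A minimal Python sketch of that smoothed likelihood calculation on a toy training set; the documents and variable names are made up for illustration.

    from collections import Counter

    # Toy class-'A' documents, already tokenized (made up for illustration)
    docs_A = [['hello', 'world'], ['hello', 'hello', 'goodbye']]

    vocabulary = {'hello', 'world', 'goodbye'}
    V = len(vocabulary)

    # Merge all documents of class 'A' into one big bag of words
    merged_A = [word for doc in docs_A for word in doc]
    counts_A = Counter(merged_A)
    total_A = len(merged_A)

    # Laplace-smoothed P(word | 'A') for every word in the vocabulary
    likelihoods_A = {w: (counts_A[w] + 1) / (total_A + V) for w in vocabulary}
    print(likelihoods_A['hello'])  # (3 + 1) / (5 + 3) = 0.5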

Then we tackle the test set and figure out which class has, proportionally, the maximum probability. How?

1) Take the prior probability of a class, say ‘A’
2) For each word in the test document, take the probability of that word given the class (‘A’), raised to the power of the word’s frequency in the test document
3) And just multiply all of these together. The class with the highest product wins (see the sketch below)
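Putting the pieces together, a minimal sketch of scoring a test document, assuming the priors from the toy example above and likelihood tables computed the same way (the class ‘B’ values here come from a made-up single document).

    from collections import Counter

    # Priors and Laplace-smoothed likelihoods from the toy example above
    priors = {'A': 3 / 5, 'B': 2 / 5}
    likelihoods = {
        'A': {'hello': 0.5, 'world': 0.25, 'goodbye': 0.25},
        'B': {'hello': 0.2, 'world': 0.4, 'goodbye': 0.4},
    }

    test_doc = ['hello', 'hello', 'world']

    scores = {}
    for c in priors:
        score = priors[c]
        # Multiply in P(word | class) ** frequency for each word in the test doc
        for word, freq in Counter(test_doc).items():
            score *= likelihoods[c][word] ** freq
        scores[c] = score

    print(scores)                       # roughly {'A': 0.0375, 'B': 0.0064}
    print(max(scores, key=scores.get))  # 'A'

One practical note: multiplying many small probabilities underflows for long documents, so real implementations usually sum log probabilities instead; the winning class is the same either way.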