Naive Bayes text classifier using TextBlob and Python

Text classifiers are systems that sort your texts into different classes. In this article we are going to build one such text classifier using TextBlob and Python. If you want to read more about the naive Bayes theorem, read it here.

For this we will use TextBlob, a Python library for simple text processing. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

We will do this in a separate Python environment, and for that we need virtualenv. Install virtualenv with the command below.

sudo pip install virtualenv

Now that virtualenv is installed, the next step is to create a virtual environment for our little project. Run the command below to create it.

virtualenv sent

sent is the name of the environment. The above command creates the environment and installs setuptools in it. Next we need to activate the environment. For this, run the command below:

. sent/bin/activate
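
If you want to confirm that the environment is really active before installing anything, a quick check from Python can help. This is only a minimal sketch: the helper name in_virtualenv is made up for illustration, and it assumes that inside a virtual environment sys.prefix differs from sys.base_prefix (older virtualenv releases set sys.real_prefix instead).

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases expose sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
```

Run it with the python from the activated environment; it should report True there and False with the system interpreter.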

Now install TextBlob using the commands below:

pip install textblob
python -m textblob.download_corpora

The second command downloads the data files (NLTK corpora) that TextBlob relies on. Now look at the script below, which does the sentiment classification for you.

from textblob.classifiers import NaiveBayesClassifier

# Training data: (text, label) pairs
train = [
    ('What an amazing weather.', 'pos'),
    ('this is an amazing idea!', 'pos'),
    ('I feel very good about these ideas.', 'pos'),
    ('this is my best performance.', 'pos'),
    ("what an awesome view", 'pos'),
    ('I do not like this place', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with all this tension", 'neg'),
    ('he is my sworn enemy!', 'neg'),
    ('my friends is horrible.', 'neg'),
]

# Test data: used only to measure accuracy, never for training
test = [
    ('the food was great.', 'pos'),
    ('I do not want to live anymore', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Ramesh is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg'),
]

# Train the classifier on the labelled examples
cl = NaiveBayesClassifier(train)

# Classify a new sentence
print(cl.classify("This is an amazing library!"))

# Let's test the accuracy of the classifier
print(cl.accuracy(test))

You now have a classifier, cl, based on the naive Bayes algorithm, and you can use it to classify your own text. Keep in mind that a text classifier generally needs a large amount of training data, and here we have very little. We also calculated the accuracy of the classifier on the test set.
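
To get a feel for what a naive Bayes classifier computes, here is a from-scratch sketch that uses only the standard library. It is not how TextBlob works internally; it is a simplified variant that assumes word-presence features and add-one smoothing, and it scores only the known words that appear in the text. The helper names (tokenize, train_naive_bayes, classify, accuracy) are made up for this illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its set of words, stripped of edge punctuation."""
    return {w.strip(".,!?'\"") for w in text.lower().split()} - {""}


def train_naive_bayes(examples):
    """Count, per label, how many training texts contain each word."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log score for the given text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    words = tokenize(text) & vocabulary  # words never seen in training are ignored
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        # Start from the prior: how common this label is in the training data.
        score = math.log(n_docs / total_docs)
        for word in words:
            # Smoothed estimate of P(word present | label); never zero.
            score += math.log((word_counts[label][word] + 1) / (n_docs + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the given label."""
    correct = sum(1 for text, label in examples if classify(model, text) == label)
    return correct / len(examples)


if __name__ == "__main__":
    sample_train = [
        ("this is an amazing idea!", "pos"),
        ("what an awesome view", "pos"),
        ("I do not like this place", "neg"),
        ("I am tired of this stuff.", "neg"),
    ]
    model = train_naive_bayes(sample_train)
    print(classify(model, "an amazing view"))
    print(accuracy(model, [("I do not like this stuff", "neg")]))
```

The "naive" part is the loop over words: each word contributes to the score independently of the others, which is rarely true of real language but works surprisingly well in practice.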
Training time also increases with the amount of data. So the usual approach is to create the classifier object once and keep it in memory, so it can be reused whenever required. You can also update the classifier with new data, as shown below.

How to update classifier

new_data = [('She is my best friend.', 'pos'),
             ("I'm happy to have a new friend.", 'pos'),
             ("Stay thirsty, my friend.", 'pos'),
             ("He ain't from around here.", 'neg')]
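# Add the new examples to the existing classifier
cl.update(new_data)
# Check the accuracy again after the update
print(cl.accuracy(test))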
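
Since training gets slower as the data grows, one option is to save the trained classifier to disk and load it back in a later run instead of retraining. The sketch below uses Python's pickle module. It assumes your classifier object can be pickled, which you should verify with your TextBlob version (a custom feature extractor written as a lambda, for example, would not pickle). The helper names and the file name are hypothetical.

```python
import pickle


def save_classifier(classifier, path):
    """Serialize a trained classifier object to a file."""
    with open(path, "wb") as handle:
        pickle.dump(classifier, handle)


def load_classifier(path):
    """Load a previously saved classifier object. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

With these helpers you would call save_classifier(cl, "sentiment.pickle") after training, and cl = load_classifier("sentiment.pickle") the next time you need the classifier.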

It is as simple as that. With the libraries available nowadays, building tools like this is really easy; it is just that we often don't know about these libraries, and so we don't build them.
If you like the article, please share and subscribe.

 


Gaurav Yadav

Gaurav is a Full Stack Web Developer and Blogger. Sportsperson by heart and loves football. He has experience with various frameworks in php, python and javascript. Loves to explore new frameworks and evolve with the trending technology.
