Natural Language Processing using Python and TextBlob. In this article we are going to see how to get part-of-speech tags, noun phrases, sentences and tokens out of a piece of text. We will use TextBlob, which we used before to build classifiers. You can find the previous videos below.
Let's start by installing the required libraries.
sudo pip install virtualenv
Now that we have installed virtualenv, the next step is to create a virtual environment for our little project. Run the command below to create the virtualenv.
sent is the name of the environment. The above command will create the environment and install setuptools in it. Now we need to launch the environment. For this, run the command below.
To install TextBlob, use the commands below.
pip install textblob
python -m textblob.download_corpora
The second command will download the NLTK data files (corpora) that TextBlob relies on for its functionality.
Now that you have installed the required libraries, let's look at the scripts needed to get each of these parts.
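Before moving on, it can help to confirm that the packages are visible from inside the environment. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is ours, not part of TextBlob, and the check only confirms that the packages can be found, not that the corpora were downloaded.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


# nltk is pulled in as a dependency when textblob is installed.
for name in ("textblob", "nltk"):
    status = "found" if is_installed(name) else "missing"
    print(name, status)
```

If either name is reported as missing, the environment is probably not activated or the pip command ran against a different interpreter.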
Part of speech Tagging
from textblob import TextBlob
text = TextBlob("Python is a high-level, general-purpose programming language. I am loving it.")
print(text.tags)
That's it. You will get the POS tags as a list of (word, tag) tuples, as shown below (only the tags for the first sentence are listed).
[('Python', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('high-level', 'JJ'), ('general-purpose', 'JJ'), ('programming', 'NN'), ('language', 'NN')]
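Since the result is an ordinary list of tuples, it is easy to post-process. As a small sketch, the code below filters the tag list shown above by tag prefix; the list is copied from the output, and the helper name words_with_tag_prefix is only an illustration. In the Penn Treebank tag set used here, tags starting with NN are nouns and tags starting with JJ are adjectives.

```python
# Tag list copied from the output above.
tags = [('Python', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('high-level', 'JJ'),
        ('general-purpose', 'JJ'), ('programming', 'NN'), ('language', 'NN')]


def words_with_tag_prefix(tagged, prefix):
    """Return the words whose POS tag starts with the given prefix."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


nouns = words_with_tag_prefix(tags, "NN")       # matches NN, NNP, NNS, ...
adjectives = words_with_tag_prefix(tags, "JJ")  # matches JJ, JJR, JJS

print(nouns)       # ['Python', 'programming', 'language']
print(adjectives)  # ['high-level', 'general-purpose']
```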
Noun Phrase Extraction
Just use the attribute below to get the noun phrases.
You will get the results below.
In the same way, one attribute will give the list of all the words.
Another will give the list of all the sentences.
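As a sketch of the three lookups described above, assuming the TextBlob properties noun_phrases, words and sentences are the attributes meant here: the helper summarize is our own name and works on any object that exposes those three attributes. The demo part only runs when TextBlob is importable, and it also needs the corpora downloaded earlier.

```python
def summarize(blob):
    """Collect noun phrases, words and sentences of a TextBlob-like object."""
    return {
        "noun_phrases": list(blob.noun_phrases),
        "words": list(blob.words),
        "sentences": [str(sentence) for sentence in blob.sentences],
    }


if __name__ == "__main__":
    try:
        from textblob import TextBlob
    except ImportError:
        print("textblob is not installed; run: pip install textblob")
    else:
        blob = TextBlob("Python is a high-level, general-purpose "
                        "programming language. I am loving it.")
        for name, values in summarize(blob).items():
            print(name, values)
```

Each sentence is converted with str() because TextBlob returns sentence objects rather than plain strings.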
b = TextBlob("I havv goood speling!")
print(b.correct())
This will attempt to correct the spelling.
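To give a feel for what a spelling corrector does, here is a rough standard-library illustration. It is not how TextBlob's correct() is implemented; it simply replaces each unknown word with its closest match from a tiny, hypothetical vocabulary using difflib.get_close_matches. Punctuation is left out of the input to keep the sketch short.

```python
import difflib

# Hypothetical toy vocabulary; a real corrector uses a large word list.
VOCABULARY = ["i", "have", "good", "spelling"]


def correct_word(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the word itself if none is close."""
    if word.lower() in vocabulary:
        return word  # already known, keep the original casing
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else word


def correct_text(text):
    """Correct a whitespace-separated text word by word."""
    return " ".join(correct_word(word) for word in text.split())


print(correct_text("I havv goood speling"))  # I have good spelling
```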
b = TextBlob("One plus One is two")
print(b.word_counts['one'])
The result will be 2. Note that word_counts stores its keys in lowercase, so look up 'one' rather than 'One'.
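The same kind of count can be sketched with the standard library, assuming words are lowercased before counting. This is only an approximation of the idea, not TextBlob's implementation, and the regular expression used for splitting words is our own simplification.

```python
import re
from collections import Counter


def word_counts(text):
    """Count words case-insensitively using a simple letters-only tokenizer."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = word_counts("One plus One is two")
print(counts["one"])  # 2
print(counts["two"])  # 1
```

Because the text is lowercased first, "One" and "one" end up under the same key.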
Translate to another language
en_blob = TextBlob(u'Simple is better than complex.')
en_blob.translate(to='es')
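TextBlob("Simple es mejor que complejo.") will be the result. This call needs an internet connection, and newer TextBlob releases deprecated and later removed translate(), so it may only work on older versions.
All of the above information is taken from https://textblob.readthedocs.io/en/dev/quickstart.html#get-word-and-noun-phrase-frequencies; read it for more details.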