Understanding Sentiment Analysis in Python

Uma Khanna
4 min read · Jan 5, 2021

What we as human beings fail to do, we start expecting from machines. So now we have started designing and training machines to detect the underlying sentiment in a text.

Sentiment analysis (or opinion mining) is a natural language processing technique used to determine whether a piece of text is positive, negative, or neutral. It can also be an essential part of market research and of shaping a customer service approach. Not only can you see what people think of your own products or services, you can also see what they think of your competitors. Sentiment analysis can quickly reveal the overall customer experience and your users’ expectations, and it can also get far more granular.

Source: Webcubator Technologies

The steps involved in sentiment analysis are as follows:

  1. Tokenization
  2. Cleaning the data
  3. Removal of stopwords
  4. Classification of words
  5. Applying a supervised algorithm for classification
  6. Calculation

Tokenization: The very first step is to split a paragraph or sentence into tokens, one for each word and punctuation mark. Let’s take the following example, a review of the food served in a restaurant: ‘The food was awesome!’

which becomes “The”, “food”, “was”, “awesome” and “!”.
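As a rough illustration of this step, here is a minimal sketch that uses only Python’s built-in `re` module. The function name `tokenize` is made up for this example, and real tokenizers in NLP libraries handle many more edge cases (contractions, URLs, emoji, and so on).

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one character that is neither a word character
    # nor whitespace, such as "!".
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("The food was awesome!"))
# ['The', 'food', 'was', 'awesome', '!']
```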

Cleaning the data: This step removes special characters from the sentence. Here, that means dropping the “!” and keeping “The”, “food”, “was”, and “awesome”. Hence, the content becomes:

“The food was awesome”.
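One simple way to sketch the cleaning step, assuming the text has already been split into a list of tokens, is to keep only the tokens that contain at least one letter or digit. The helper name `clean_tokens` is hypothetical.

```python
def clean_tokens(tokens):
    """Drop tokens made up only of special characters, such as "!"."""
    # A token is kept if at least one of its characters is a letter or digit.
    return [tok for tok in tokens if any(ch.isalnum() for ch in tok)]

tokens = ["The", "food", "was", "awesome", "!"]
print(" ".join(clean_tokens(tokens)))
# The food was awesome
```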

Removal of stopwords: Stopwords are common words that do not add any value to the sentiment analysis.

source: SlideShare

Now, after the removal of the stopwords “The” and “was”, we are left with only two tokens from the sentence:

“food” and “awesome”
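The filtering can be sketched as follows. The stopword set here is a tiny hand-picked list used only for illustration; in practice, a much longer list would usually be loaded from an NLP library.

```python
# Hypothetical, hand-picked stopword list for illustration only.
STOPWORDS = {"the", "was", "is", "a", "an", "and", "of", "to"}

def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Return the tokens that are not in the stopword set."""
    # Compare in lower case so that "The" matches "the".
    return [tok for tok in tokens if tok.lower() not in stopwords]

print(remove_stopwords(["The", "food", "was", "awesome"]))
# ['food', 'awesome']
```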

Classification of words: Each word is assigned a score. If the word is positive, its score is +1; if it is neutral, its score is 0; and if it is negative, its score is -1. This turns sentiment analysis into a simple numeric calculation.
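A minimal sketch of this scoring rule is a dictionary lookup. The word list below is an assumption made for this example, and any word missing from it is treated as neutral.

```python
# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
WORD_SCORES = {"awesome": 1, "good": 1, "bad": -1}

def score_word(word, lexicon=WORD_SCORES):
    """Return +1, -1, or 0 (neutral) for a single word."""
    # Words that are not in the lexicon default to a neutral score of 0.
    return lexicon.get(word.lower(), 0)

print(score_word("food"), score_word("awesome"), score_word("bad"))
# 0 1 -1
```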

Supervised algorithm for classification: This is a very small example. For analysis at the industry level, we can create a set of pre-classified (labeled) words as per the requirement. For a finer-grained analysis, the words can be ‘bad’, ‘so-so’, ‘okay’, ‘good’, ‘very good’, and ‘awesome’.

Calculation: Using the words we are left with:

food = 0
awesome = 1
Total sentiment = 0 + 1 = 1 (POSITIVE)
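The same arithmetic can be written as a short sketch. The function name `total_sentiment` and the rule that a total of exactly zero counts as neutral are assumptions made for this example.

```python
def total_sentiment(word_scores):
    """Sum the per-word scores and attach a label to the total."""
    total = sum(word_scores.values())
    if total > 0:
        label = "POSITIVE"
    elif total < 0:
        label = "NEGATIVE"
    else:
        label = "NEUTRAL"  # assumed label for a total of exactly 0
    return total, label

print(total_sentiment({"food": 0, "awesome": 1}))
# (1, 'POSITIVE')
```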

Sentiment analysis using Python

We need to install TextBlob, a Python library that offers a simple API for common NLP tasks. TextBlob gives us a measure of polarity, which indicates how positive or negative a statement is on a scale from -1.0 to 1.0. It also gives a measure of subjectivity, from 0.0 (objective) to 1.0 (subjective), which reflects how much the comment expresses personal feelings, views, or beliefs.

pip install textblob
from textblob import TextBlob

Let’s take some feedback and calculate the above-mentioned TextBlob parameters for analysis.

# Six sample reviews of the same meal
feedbacks = [
    "The food was so-so.",
    "The food was okay.",
    "The food was good.",
    "The food was very good.",
    "The food was awesome.",
    "The food was bad.",
]

# Print the polarity and subjectivity of each review
for feedback in feedbacks:
    print(TextBlob(feedback).sentiment)

Which gives the following result:

Feedback Result
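To turn a polarity score into a readable label, one option is a small helper like the sketch below. The name `label_polarity` and the `neutral_band` cutoff are assumptions made for this example and are not part of TextBlob; the scores in the loop are made-up inputs, not TextBlob output.

```python
def label_polarity(polarity, neutral_band=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    neutral_band is a hypothetical cutoff chosen for illustration.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"

# With TextBlob installed, the score would come from something like:
#   polarity = TextBlob("The food was good.").sentiment.polarity
for score in (0.7, 0.0, -0.7):  # made-up example scores
    print(score, label_polarity(score))
```

The width of the neutral band is a judgment call; a wider band labels more lukewarm reviews as neutral.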

With the above analysis, we can easily gauge customers’ reviews of the food served in a restaurant, which helps improve the quality of the food and also supports better marketing strategies against competitors. I hope this was enriching for all the budding data enthusiasts and that I did justice to the title in a very simple way. Just make sure you are providing the correct path and have installed the required libraries beforehand.

Your feedback and suggestions, if any, will be highly appreciated.

Uma Khanna

A data scientist by profession and an autodidact at heart.