IMDB Movie Review Sentiment Analysis

Sai Kamal

Software Engineer

Tags:
NLP

What is Sentiment Analysis?

Sentiment analysis is the process of computationally identifying and categorising the opinions expressed in a piece of text, especially to determine whether the writer's attitude toward a given topic, product, etc. is positive, negative, or neutral. It does this by combining machine learning and natural language processing (NLP). This project performs sentiment analysis on IMDB movie reviews using classical machine learning models, NLP preprocessing, and an LSTM model.


Using LSTM_RNN model

To build the sentiment analysis model with LSTM layers, the following techniques were applied to improve its performance.

Techniques used

  • Collecting data from various sources
  • Text cleaning
  • Balancing data
  • Regular expressions
  • NLP pipeline
    • lowercasing
    • stemming
    • lemmatization
    • stopword removal
    • spaCy library
    • NLTK (Natural Language Toolkit)
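
The cleaning steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the stopword list is a tiny hand-picked assumption, and `crude_stem` is a hypothetical stand-in for a real stemmer or lemmatizer such as those shipped with NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list (an assumption); NLTK and spaCy ship fuller ones.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in", "i"}


def crude_stem(word):
    """Strip a few common suffixes. A stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "ed", "ly", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_review(text):
    """Return a list of cleaned, stemmed tokens for one review."""
    text = re.sub(r"<[^>]+>", " ", text)    # drop HTML tags such as <br />
    text = text.lower()                      # lowercase everything
    text = re.sub(r"[^a-z\s]", " ", text)    # keep only letters and whitespace
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return [crude_stem(t) for t in tokens]


print(clean_review("This movie was AMAZING!<br />I loved the acting."))
# ['movie', 'amaz', 'lov', 'act']
```

The truncated stems ("amaz", "lov") are typical of stemming; lemmatization would instead map words to dictionary forms, at the cost of needing a vocabulary resource.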

Accuracy results using word embedding and LSTM

  • Training_Accuracy = 52.8942115768463 (evidently reported as a percentage, i.e. about 0.529 as a fraction)
  • Test_Accuracy = 0.4898785425101215
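
Before an embedding layer and an LSTM can consume reviews, each review has to become a fixed-length sequence of integer word ids. The sketch below shows one common way to do that in plain Python; the reserved ids `PAD` and `UNK`, the function names, and the vocabulary size are assumptions made for illustration, not details taken from the project.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and unknown words (hypothetical convention)


def build_vocab(tokenized_reviews, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating long reviews and right-padding short ones."""
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


reviews = [["great", "movie"], ["bad", "movie", "bad", "plot"]]
vocab = build_vocab(reviews)
encoded = [encode(r, vocab, max_len=3) for r in reviews]
print(encoded)  # two lists of length 3, ready to be stacked into a batch
```

Every encoded review has the same length, so a batch can be passed to an embedding layer as a rectangular array.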

Developing the model using Machine Learning use cases

Using the same data, we also developed classical ML models, cleaning the data first in order to obtain a generalized model.

Algorithms used

  • SVM (support vector machine)
  • Naive Bayes
  • Decision Tree
  • Random Forest

Techniques used

  • Data cleaning
  • Cross-validation, for more reliable training and evaluation
  • Hyperparameter tuning
    • GridSearchCV
    • RandomizedSearchCV
  • AUC and ROC curve
  • TPR (true positive rate)
  • FPR (false positive rate)
  • classification_report
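
The evaluation metrics in this list all derive from the four confusion-matrix counts. The following sketch computes them for a binary problem in plain Python; the function name `binary_report` and the example labels are made up for illustration, and scikit-learn's `classification_report` provides a fuller, multi-class version.

```python
def binary_report(y_true, y_pred, positive=1):
    """Compute TPR, FPR, precision, F1 and accuracy from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    tpr = tp / (tp + fn) if tp + fn else 0.0        # true positive rate (recall)
    fpr = fp / (fp + tn) if fp + tn else 0.0        # false positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f1": f1, "accuracy": accuracy}


# Invented example: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
report = binary_report(y_true, y_pred)
print(report)  # tpr 0.75, fpr 0.25, precision 0.75, f1 0.75, accuracy 0.75
```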

SVM

Data cleaning
  • Training_Accuracy = 0.9230769230769231
  • Test_Accuracy = 0.7266666666666667

Naive Bayes

Using cross-validation to train the model more reliably
  • Training_Accuracy = 0.9247491638795987
  • Test_Accuracy = 0.7866666666666666
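
As a rough illustration of what k-fold cross-validation does, the sketch below splits sample indices into k folds and averages the accuracy over them. The helper names and the majority-class baseline are assumptions for demonstration only; scikit-learn's `KFold` and `cross_val_score` cover the same ground in practice.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs; each sample is tested exactly once."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]


def cross_val_accuracy(fit_predict, X, y, k=5):
    """Average accuracy of fit_predict(X_train, y_train, X_test) over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        preds = fit_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx], [X[i] for i in test_idx]
        )
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def majority_baseline(X_train, y_train, X_test):
    """Hypothetical placeholder model: always predict the most common training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


splits = list(k_fold_indices(10, k=5))
print(len(splits))  # 5
```

Because every sample lands in a test fold exactly once, the averaged score is less sensitive to one lucky or unlucky train/test split.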

Decision Tree

Hyperparameter tuning
  • Training_Accuracy = 0.9966555183946488
  • Test_Accuracy = 0.7333333333333333
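
Grid search simply tries every combination in a parameter grid and keeps the best-scoring one. The sketch below shows that loop in plain Python. The scoring function is a toy stand-in for "fit a decision tree with these settings and return its validation accuracy", and the parameter names and values are illustrative assumptions rather than the settings used in this project.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Evaluate score_fn on every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(max_depth, min_samples_leaf):
    """Toy stand-in for model training plus validation; peaks at depth 5, leaf size 2."""
    return 1.0 - abs(max_depth - 5) * 0.05 - abs(min_samples_leaf - 2) * 0.02


grid = {"max_depth": [3, 5, 10], "min_samples_leaf": [1, 2, 4]}
best_params, best_score = grid_search(toy_validation_score, grid)
print(best_params)  # {'max_depth': 5, 'min_samples_leaf': 2}
```

A randomized search follows the same pattern but samples a fixed number of combinations instead of enumerating all of them, which helps when the grid is large.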

Random Forest

  • Training_Accuracy = 0.9966555183946488
  • Test_Accuracy = 0.7333333333333333

AUC and ROC score

  • AUC (area under the ROC curve) = 0.8234352773826458
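
For intuition, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name and sample scores are illustrative, and this pairwise version is quadratic in the number of samples, so library routines such as scikit-learn's `roc_auc_score` are preferable on real data.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.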
