What is Sentiment Analysis?
Sentiment analysis is the computational identification and categorisation of opinions expressed in a piece of text, typically to determine whether the writer's attitude toward a given topic or product is positive, negative, or neutral. It does this by combining machine learning with natural language processing (NLP). This project performs sentiment analysis on movie reviews using classical machine learning models, NLP preprocessing, and an LSTM model.
Using LSTM_RNN model
To develop the sentiment analysis model with LSTM layers, several techniques were applied to improve its performance.
Techniques used
- Collecting Data from various sources
- Text cleaning
- Balancing data
- Regular expressions
- NLP pipeline
- Lowercasing text
- Stemming
- Lemmatization
- Stopword removal
- spaCy library
- NLTK (Natural Language Toolkit)
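The cleaning steps listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the project's actual pipeline: the stopword set, the suffix list, and the function names `crude_stem` and `clean_text` are assumptions made for the example, and the suffix stripping is only a stand-in for a real stemmer or lemmatizer such as those provided by NLTK or spaCy.

```python
import re

# Tiny illustrative stopword set (assumption); NLTK and spaCy ship fuller lists.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "i"}

# Crude suffix stripping, used here only as a stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "ly", "s")


def crude_stem(token):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_text(text):
    """Lowercase, strip HTML tags and non-letters, drop stopwords, then stem."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = text.split()
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


print(clean_text("This movie was AMAZING!<br />I loved the acting."))
# → ['movie', 'amaz', 'lov', 'act']
```

The truncated stems (`amaz`, `lov`) show why lemmatization is often preferred when readable tokens matter; stemming only needs tokens from related words to collapse to the same string.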
Accuracy results using word embedding and LSTM
- Training_Accuracy = 52.8942115768463 (reported as a percentage)
- Test_Accuracy = 0.4898785425101215 (reported as a fraction, about 48.99%)
Developing the model using Machine Learning algorithms
Using the same data, after cleaning, we also developed classical ML models with the aim of creating a generalized model.
Algorithms used
- SVM (support vector machine)
- Naive Bayes
- Decision Tree
- Random Forest
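Of the algorithms listed above, Naive Bayes is simple enough to sketch without any library. The block below is a hedged illustration of a multinomial variant with add-one (Laplace) smoothing over bag-of-words token lists. The project does not state which Naive Bayes variant or which features it used, so the variant, the function names `train_nb` and `predict_nb`, and the four toy documents are all assumptions for the example.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit on token lists; return (log-priors, per-class word counts, vocabulary)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {t for tokens in docs for t in tokens}
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    return log_prior, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest smoothed log-probability."""
    log_prior, word_counts, vocab = model
    best_label, best_score = None, -math.inf
    for label, prior in log_prior.items():
        total = sum(word_counts[label].values())
        score = prior
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                # Add-one smoothing avoids log(0) for words unseen in this class.
                score += math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data, not taken from the project's review set.
docs = [["great", "movie"], ["loved", "it"], ["terrible", "movie"], ["boring", "plot"]]
labels = ["pos", "pos", "neg", "neg"]
model = train_nb(docs, labels)
print(predict_nb(model, ["loved", "great", "acting"]))  # → pos
print(predict_nb(model, ["boring", "terrible"]))        # → neg
```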
Techniques used
- Data cleaning
- Cross-validation to train and evaluate the models more reliably
- Hyperparameter tuning
- GridSearchCV
- RandomizedSearchCV
- AUC and ROC curve
- TPR (True Positive Rate)
- FPR (False Positive Rate)
- classification_report
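The idea behind cross-validated hyperparameter tuning can be sketched without any library. scikit-learn's `GridSearchCV` automates this loop; the version below is only an illustration of the principle. A one-dimensional nearest-neighbour classifier stands in for the project's models, and the data, the grid of `n_neighbors` values, and every function name are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Assumes k <= n_samples so that no fold is empty.
    """
    start = 0
    for fold in range(k):
        # Spread any remainder over the first folds so sizes differ by at most one.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = sum(train_y[i] for i in nearest[:n_neighbors])
    return 1 if 2 * votes > n_neighbors else 0


def cv_accuracy(xs, ys, n_neighbors, k=5):
    """Mean accuracy over k folds for one hyperparameter value."""
    fold_scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbors) == ys[i]
            for i in test_idx
        )
        fold_scores.append(correct / len(test_idx))
    return sum(fold_scores) / len(fold_scores)


def grid_search(xs, ys, grid, k=5):
    """Score every candidate value and return the best one with all scores."""
    results = {value: cv_accuracy(xs, ys, value, k) for value in grid}
    best = max(results, key=results.get)
    return best, results


# Made-up 1-D feature values with binary labels (1 = positive sentiment).
xs = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6, 0.35, 0.75]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
best_k, scores = grid_search(xs, ys, grid=[1, 3, 5])
print(best_k, scores)
```

A randomized search follows the same loop but samples a fixed number of candidates from the grid instead of scoring all of them, which helps when the grid is large.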
SVM
- Training_Accuracy = 0.9230769230769231
- Test_Accuracy = 0.7266666666666667
Naive Bayes
- Training_Accuracy = 0.9247491638795987
- Test_Accuracy = 0.7866666666666666
Decision Tree
- Training_Accuracy = 0.9966555183946488
- Test_Accuracy = 0.7333333333333333
Random Forest
- Training_Accuracy = 0.9966555183946488
- Test_Accuracy = 0.7333333333333333
ROC AUC score
- ROC AUC = 0.8234352773826458
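For readers unfamiliar with the metric, the sketch below shows how an ROC curve is built from TPR and FPR values and how the area under it is computed. It is a plain-Python illustration under stated assumptions, not the code that produced the score above: the labels and scores are made up, and `roc_points` and `auc` are hypothetical helper names.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit examples from highest to lowest score; each step lowers the threshold.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example sharing this score (handles ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels and classifier scores for illustration.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(roc_points(y_true, scores)))  # 8 of 9 positive/negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect one, so a value near 0.82 means a randomly chosen positive review is scored above a randomly chosen negative one about 82% of the time.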