Sentiment analysis is a technique used to determine the emotional tone behind words, helping us understand how people feel about specific topics. This project focuses on analyzing movie reviews to classify them as either positive or negative using a Logistic Regression model.
With the rise of online reviews, understanding public sentiment has become essential for businesses and content creators. By training a model on a dataset of labeled movie reviews, we can automate the process of sentiment classification, providing valuable insights into audience reactions. This project showcases the use of machine learning in natural language processing and its practical applications in evaluating user opinions.
- Text Classification: Automatically classifies movie reviews as positive or negative, providing quick sentiment assessments.
- Machine Learning Model: Utilizes a Logistic Regression model, a reliable algorithm for binary classification tasks.
- Easy-to-Use Notebooks: Contains Jupyter notebooks for training the model, making predictions, and analyzing results in an interactive environment.
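
To make the classification idea concrete, here is a minimal sketch of logistic regression over word counts, written in plain Python. It is not the code from the project notebooks. The tokenizer, the `train` and `predict_proba` helpers, the hyperparameters, and the four toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def predict_proba(text, weights, bias):
    """Probability that a review is positive under the current weights."""
    counts = Counter(tokenize(text))
    z = bias + sum(weights[word] * n for word, n in counts.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(reviews, labels, epochs=200, lr=0.5):
    """Fit one weight per word plus a bias with stochastic gradient descent."""
    weights = Counter()  # unseen words read as weight 0
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(reviews, labels):
            error = predict_proba(text, weights, bias) - y
            for word, n in Counter(tokenize(text)).items():
                weights[word] -= lr * error * n
            bias -= lr * error
    return weights, bias

# Toy data invented for this sketch: 1 = positive, 0 = negative.
reviews = [
    "a wonderful moving film",
    "great acting and a great story",
    "a dull boring film",
    "terrible acting and a weak story",
]
labels = [1, 1, 0, 0]

weights, bias = train(reviews, labels)
print(predict_proba("wonderful great", weights, bias))
print(predict_proba("dull terrible", weights, bias))
```

A review is labeled positive when the predicted probability exceeds 0.5. In practice a library implementation with a proper vectorizer would replace these hand-written helpers.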
To get started with the Sentiment Analysis project, follow the steps below to set up your environment.
This project is designed to run in Google Colab. However, if you want to run it locally, you will need to have the following installed:
- Python (3.7 or higher)
- pip (Python package installer)
- Jupyter Notebook (to run the notebooks locally)
- The required Python libraries (install them with pip before running the notebooks)
The dataset for this project consists of 50,000 labeled movie reviews obtained from Kaggle. Because of file-size constraints, only a reduced version of the dataset (25 MB) is included here. For the full dataset, see the following link: Kaggle Dataset.
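
As a rough illustration of how labeled reviews could be read before training, the sketch below parses CSV rows into texts and binary labels. The column names `review` and `sentiment`, the label strings `positive` and `negative`, and the helper `load_reviews` are assumptions, not details confirmed by this project. Check them against the actual file.

```python
import csv
import io

def load_reviews(file_obj, text_col="review", label_col="sentiment"):
    """Return (texts, labels), mapping 'positive' to 1 and anything else to 0."""
    texts, labels = [], []
    for row in csv.DictReader(file_obj):
        texts.append(row[text_col])
        labels.append(1 if row[label_col].strip().lower() == "positive" else 0)
    return texts, labels

# In-memory sample invented for this sketch, standing in for the real file.
sample = io.StringIO(
    "review,sentiment\n"
    '"A moving, beautifully shot film.",positive\n'
    '"Two hours I will never get back.",negative\n'
)
texts, labels = load_reviews(sample)
print(texts, labels)
```

To read from disk instead, pass a file opened with `newline=""` and an explicit encoding, for example `open("reviews.csv", newline="", encoding="utf-8")`, where `reviews.csv` is a placeholder name.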
The Logistic Regression model achieved over 89% accuracy in classifying movie reviews. The notebooks include visualizations and a detailed confusion matrix that break down its performance.
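
For readers unfamiliar with how accuracy relates to a confusion matrix, the following sketch computes both from lists of true and predicted labels. The label lists are made up for the example and are not this project's results. The helper names `confusion_matrix` and `accuracy` are likewise hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Count outcomes for binary labels where 1 = positive and 0 = negative."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1  # positive review, predicted positive
        elif actual == 0 and predicted == 0:
            counts["tn"] += 1  # negative review, predicted negative
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1  # negative review, predicted positive
        else:
            counts["fn"] += 1  # positive review, predicted negative
    return counts

def accuracy(counts):
    """Fraction of predictions that match the true label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0

# Illustrative labels only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
print(matrix, accuracy(matrix))
```

Accuracy alone hides which kind of mistake the model makes. The separate false-positive and false-negative counts show whether errors lean toward one class.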
Distributed under the MIT License. See LICENSE for more information.