Classify movie-review sentiment (positive / negative) — full NLP pipeline, classical ML vs deep learning, and a type-your-own-review Streamlit demo.
Best model — TF-IDF + Logistic Regression · 90.9% accuracy · F1 0.910 · ROC-AUC 0.970
Given the text of a movie review, predict whether the sentiment is positive or negative — the canonical NLP text-classification task. Built on the IMDB Large Movie Review dataset: 50,000 reviews, perfectly balanced (25k positive / 25k negative).
TL;DR — After light text cleaning and TF-IDF (unigrams + bigrams), a Logistic Regression classifier reaches 90.9% accuracy — beating Naive Bayes, Linear SVM and an embedding neural network. The classic result: a well-tuned linear model is very hard to beat on IMDB bag-of-words features.
Four models, one shared balanced 80/20 hold-out, ranked by accuracy:
| Model | Accuracy | F1 | ROC-AUC |
|---|---|---|---|
| TF-IDF + Logistic Regression ⭐ | 0.909 | 0.910 | 0.970 |
| TF-IDF + Linear SVM | 0.907 | 0.908 | 0.970 |
| Neural Net (word embeddings) | 0.896 | 0.897 | 0.959 |
| TF-IDF + Multinomial NB | 0.879 | 0.881 | 0.950 |
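The columns in the table follow the standard binary-classification definitions. As a minimal sketch of how they relate, the snippet below computes accuracy, precision, recall and F1 from confusion-matrix counts, and ROC-AUC from its pairwise-ranking definition. The helper names (`ConfusionCounts`, `classification_metrics`, `roc_auc`) and all the numbers are illustrative assumptions, not taken from the notebook, which most likely relies on library implementations.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary classifier, with 'positive' as the target class."""
    tp: int  # positive reviews predicted positive
    fp: int  # negative reviews predicted positive
    tn: int  # negative reviews predicted negative
    fn: int  # positive reviews predicted negative


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall and F1 for the given counts."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(labels, scores) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This O(n^2) form is for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative counts only; these are NOT the project's confusion matrix.
example = ConfusionCounts(tp=90, fp=10, tn=85, fn=15)
metrics = classification_metrics(example)

# Toy labels and scores, again purely illustrative.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Because the hold-out is balanced, accuracy and F1 land close to each other for every model in the table, whereas ROC-AUC depends only on how the scores rank positives against negatives and not on a decision threshold.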
A Streamlit app classifies any review you type, with a confidence score.
pip install -r requirements.txt
streamlit run app/app.py
Deploy it free on Streamlit Community Cloud; see STEPS.md.
▶️ Live demo: https://sentiment-analysis-mrvnvm7o4iwbdv6zjgabg2.streamlit.app/
Sample predictions:
| Review | Prediction |
|---|---|
| "An absolute masterpiece. Stunning performances…" | 😊 Positive (0.99) |
| "A complete waste of time. The plot made no sense…" | 😞 Negative (0.00) |
| "Gorgeous visuals but the pacing dragged and the ending fell flat." | 😞 Negative (0.36) |
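The scores in the table read like the predicted probability of the positive class, with the label flipping at 0.5: that would explain why a mixed review scoring 0.36 is shown as negative. A small sketch of that mapping follows; the function name `format_prediction` and the 0.5 threshold are assumptions for illustration, not taken from `app/app.py`.

```python
def format_prediction(p_positive: float, threshold: float = 0.5) -> str:
    """Turn a positive-class probability into a 'Label (score)' string.

    Assumes the displayed score is P(positive) and that the label is
    chosen by comparing it with `threshold` (0.5 here, as an assumption).
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability in [0, 1]")
    label = "Positive" if p_positive >= threshold else "Negative"
    return f"{label} ({p_positive:.2f})"


# The three scores from the sample table, passed in as plain numbers.
rows = [format_prediction(p) for p in (0.99, 0.00, 0.36)]
```

Under this reading, a score near 0.5 signals an uncertain call, which fits the mixed-sentiment example.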
The word clouds and distinctive-term analysis are intuitive — great, excellent, wonderful dominate positive reviews; worst, waste, awful, boring dominate negative ones.
The classes are perfectly balanced (so accuracy is meaningful), and review length barely differs by sentiment.
- Cleaning: strip HTML/`<br>` tags, lowercase, remove non-letters, de-duplicate (≈418 duplicates removed).
- Classical: TF-IDF with unigrams + bigrams (30k features) → Logistic Regression / Linear SVM / Multinomial NB.
- Deep learning: a Keras word-embedding network (Embedding → GlobalAveragePooling → Dense).
- Honest evaluation: accuracy, precision, recall, F1 and ROC-AUC on a shared split, plus a confusion matrix and an error analysis of the model's most confident mistakes (mostly sarcasm and mixed-sentiment reviews).
- Deployment: the TF-IDF + Logistic Regression model is serialised with `joblib` and served via Streamlit.
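The cleaning and classical-model steps listed above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the notebook's code: the regular expressions, the helper names (`clean_review`, `dedupe`, `build_model`), the `max_iter` setting and the toy reviews are all assumptions, and only the unigram + bigram range and the 30k feature cap come from the list. The repository stores the vectoriser and the classifier as separate files, whereas this sketch bundles them in one pipeline for brevity.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

TAG_RE = re.compile(r"<[^>]+>")         # HTML tags such as <br />
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # anything that is not a-z or whitespace
SPACE_RE = re.compile(r"\s+")


def clean_review(text: str) -> str:
    """Strip tags, lowercase, drop non-letters and collapse whitespace."""
    text = TAG_RE.sub(" ", text).lower()
    text = NON_LETTER_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def dedupe(texts, labels):
    """Keep the first occurrence of each exact-duplicate review."""
    seen, kept_texts, kept_labels = set(), [], []
    for text, label in zip(texts, labels):
        if text not in seen:
            seen.add(text)
            kept_texts.append(text)
            kept_labels.append(label)
    return kept_texts, kept_labels


def build_model():
    """TF-IDF (unigrams + bigrams, capped at 30k features) + Logistic Regression."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), max_features=30_000),
        LogisticRegression(max_iter=1000),  # assumed setting, not from the notebook
    )


# Toy data for illustration only; the project trains on the IMDB reviews.
raw_reviews = [
    "An absolute masterpiece.<br />Stunning performances!",
    "Wonderful, excellent and great fun.",
    "A complete waste of time. Awful plot.",
    "Boring and awful. The worst film this year.",
    "Wonderful, excellent and great fun.",  # exact duplicate, dropped by dedupe
]
raw_labels = [1, 1, 0, 0, 1]  # 1 = positive, 0 = negative

texts, labels = dedupe([clean_review(r) for r in raw_reviews], raw_labels)
model = build_model().fit(texts, labels)

# Column 1 of predict_proba is the probability of the positive class (label 1).
p_pos_review = model.predict_proba([clean_review("Excellent, a wonderful masterpiece")])[0, 1]
p_neg_review = model.predict_proba([clean_review("Awful, boring waste of time")])[0, 1]
```

Whether de-duplication runs before or after cleaning in the notebook is not stated; here it runs after cleaning, so reviews that differ only in markup or punctuation also collapse into one.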
Ankit_Saxena_Sentiment_Analysis/
├── Sentiment_Analysis.ipynb # Full notebook: cleaning → EDA → 4 models → error analysis
├── data/
│ └── IMDB-Dataset.csv.gz # 50k labelled reviews (gzip; pandas reads it directly)
├── app/
│ ├── app.py # Streamlit demo
│ ├── tfidf_vectorizer.joblib # Fitted TF-IDF vectoriser
│ ├── sentiment_model.joblib # Trained Logistic Regression
│ └── model_meta.json
├── assets/ # Figures (word clouds, comparison, …)
├── reports/ # Written report (DOCX + PDF)
├── requirements.txt · STEPS.md · LICENSE · README.md
git clone https://github.com/AnkitSaxena-AI/sentiment-analysis.git
cd sentiment-analysis
pip install -r requirements.txt
jupyter notebook Sentiment_Analysis.ipynb # the analysis
streamlit run app/app.py # the demo
Ankit Saxena · @AnkitSaxena-AI
Dataset: IMDB Large Movie Review (Maas et al., 2011). For educational use.



