SPAM/HAM Classification using NLP

This repository contains a machine learning model that classifies text messages as spam or ham (not spam) using Natural Language Processing (NLP) techniques. The model is a Naive Bayes classifier trained on TF-IDF (Term Frequency-Inverse Document Frequency) features. The code preprocesses the text, trains the model on labeled examples, and evaluates it using accuracy, precision, recall, and F1-score. The project demonstrates how NLP can be applied to spam detection in text messages.
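As a rough illustration of the approach described above, the sketch below chains TF-IDF vectorization and a Naive Bayes classifier using scikit-learn. It is not the repository's actual script: the six sample messages are made up, and `MultinomialNB` and the built-in English stopword list are assumptions about which Naive Bayes variant and preprocessing would be used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny, made-up dataset for illustration only (not the project's data).
messages = [
    "Win a free prize now, click here",
    "Free cash offer, claim your prize today",
    "Urgent: you won a free vacation, call now",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the notes from class?",
    "I'll call you when I get home tonight",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TfidfVectorizer tokenizes, lowercases, drops English stopwords,
# and turns each message into a TF-IDF weighted vector.
# MultinomialNB (an assumed choice) is then fit on those vectors.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
model.fit(messages, labels)

# Classify two unseen messages.
new_messages = ["Claim your free prize now", "See you at lunch tomorrow"]
predictions = model.predict(new_messages)
for text, label in zip(new_messages, predictions):
    print(f"{label}: {text}")
```

Wrapping both steps in a single pipeline means the same vocabulary and TF-IDF weights learned during training are reused at prediction time, which avoids a common source of train/test mismatch.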

Key Features

  • Preprocessing text data: tokenization, stopword removal, and TF-IDF vectorization.
  • Training a Naive Bayes classifier on labeled text messages.
  • Evaluating model performance using standard classification metrics.
  • Simple, easy-to-follow implementation of spam/ham classification.
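To make the evaluation step concrete, here is a minimal sketch of how the four metrics could be computed with scikit-learn. The true and predicted labels are hypothetical values chosen for illustration, not results from this project, and treating "spam" as the positive class is an assumption.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels: 4 spam and 6 ham messages in the test set.
y_true = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham", "ham"]
# Hypothetical predictions: one spam message missed, one ham message flagged.
y_pred = ["spam", "spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham", "ham"]

# Fraction of all messages classified correctly: 8 of 10.
accuracy = accuracy_score(y_true, y_pred)
# Of the messages flagged as spam, how many really were spam: 3 of 4.
precision = precision_score(y_true, y_pred, pos_label="spam")
# Of the real spam messages, how many were caught: 3 of 4.
recall = recall_score(y_true, y_pred, pos_label="spam")
# Harmonic mean of precision and recall.
f1 = f1_score(y_true, y_pred, pos_label="spam")

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# prints: accuracy=0.80 precision=0.75 recall=0.75 f1=0.75
```

For spam filtering, precision and recall are usually worth reading separately: low precision means legitimate messages are being flagged, while low recall means spam is getting through.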

Dependencies

  • Python 3.x
  • pandas
  • scikit-learn

Usage

  1. Clone the repository.
  2. Navigate to the project directory:
     • cd spam-ham-classification
  3. Install dependencies:
     • pip install -r requirements.txt
  4. Run the main script:
     • python spam_ham_classification.py

Contributions

Contributions, bug reports, and feature requests are welcome! Please feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License.