Abstract: The rapid expansion of digital communication has sharply increased the volume of spam delivered over SMS, email, and social media, often through malicious URLs. Spam messages frequently carry phishing links, fake offers, and harmful advertisements that threaten user security. To address this, the project proposes a machine-learning-based spam classification system that automatically detects and filters spam from multiple communication sources. The system first cleans and preprocesses the text through tokenization, stop-word removal, and normalization, then uses TF-IDF to convert the text into numerical features. Multiple machine learning models are trained on these features to learn the patterns that distinguish spam from legitimate messages. Performance is evaluated using accuracy, precision, recall, and F1-score. The proposed solution aims to minimize false detections, both legitimate messages flagged as spam and spam that goes undetected, and thereby to improve reliability. It is designed to process large volumes of data and to be integrated into real-time applications. Overall, the system is intended to strengthen communication security and build user trust.
Keywords: Spam Classification, Machine Learning, SMS Spam Detection, Email Spam Filtering, URL Analysis, Social Media Spam, Text Classification, TF-IDF, Cyber Security.
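The pipeline outlined in the abstract (preprocessing, TF-IDF features, a trained classifier, and metric-based evaluation) can be illustrated with a minimal sketch that uses only the Python standard library. Everything specific here is an assumption made for illustration: the abstract does not name the models used, so a simple nearest-centroid classifier stands in for them, and the stop-word list, the smoothed IDF formula, the function names, and the toy messages are hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "to", "is", "are", "you", "your",
              "for", "at", "of", "and", "in", "on", "we", "i"}

def preprocess(text):
    """Normalize to lowercase, tokenize on letters/digits, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = {}
    for vec in vectors:
        for t, v in vec.items():
            total[t] = total.get(t, 0.0) + v
    return {t: v / len(vectors) for t, v in total.items()}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def train(texts, labels):
    """Fit IDF weights and one centroid per class (stand-in classifier)."""
    docs = [preprocess(t) for t in texts]
    idf = fit_idf(docs)
    vecs = [tfidf(d, idf) for d in docs]
    centroids = {
        label: centroid([v for v, y in zip(vecs, labels) if y == label])
        for label in set(labels)
    }
    return idf, centroids

def predict(text, idf, centroids):
    """Return the label whose centroid is most similar to the message."""
    vec = tfidf(preprocess(text), idf)
    return max(sorted(centroids), key=lambda label: dot(vec, centroids[label]))

def scores(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy training data (hypothetical, for illustration only).
TRAIN_TEXTS = [
    "Win a free prize now, click the link",
    "Free offer: claim your cash prize today",
    "Urgent! Verify your account at this link",
    "Are we still meeting for lunch tomorrow?",
    "Please review the attached project report",
    "Can you call me when you get home?",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

IDF, CENTROIDS = train(TRAIN_TEXTS, TRAIN_LABELS)
print(predict("Claim your free prize", IDF, CENTROIDS))
```

In practice a library implementation of TF-IDF and of the chosen classifiers would likely replace these hand-written helpers; the sketch is only meant to make the order of the stages concrete. Treating spam as the positive class in `scores` means precision reflects how many flagged messages were truly spam, while recall reflects how much spam was caught.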
DOI: 10.17148/IARJSET.2025.121238
[1] Nandini P Gowda, Jnanashree TR, N Govind Prasad, Vibha Datta, "ML-Driven Spam Classification Model," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2025.121238