Cyberbullying Detection on Social media Platforms using ML and NLP
Abstract: The rapid growth of user-generated content on video-sharing platforms such as YouTube has significantly enhanced global communication while simultaneously increasing the prevalence of cyberbullying within comment sections. Offensive and harmful comments negatively impact users' psychological well-being and degrade online interactions. Manual moderation of such large-scale textual data is inefficient, necessitating automated intelligent detection systems. This study proposes a machine learning-based framework for multi-class classification of YouTube comments into bullying, non-bullying, and supportive categories. The system integrates Natural Language Processing (NLP) techniques including text cleaning, tokenization, stop-word removal, and lemmatization. Feature extraction is performed using TF-IDF vectorization with n-gram representations, along with sentiment-based features. To address class imbalance, SMOTE is applied during training. The classification framework employs XGBoost and Logistic Regression models, which are combined using a stacking ensemble approach to improve generalization and predictive performance. Experimental results demonstrate that the ensemble model effectively captures linguistic patterns in cyberbullying-related content and achieves reliable performance across accuracy, precision, recall, and F1-score. The proposed system provides a scalable and computationally efficient solution for automated cyberbullying detection on YouTube.
Keywords: Cyberbullying Detection, YouTube Comment Analysis, Social Media Monitoring, Machine Learning, Natural Language Processing, Text Classification, Sentiment Analysis.
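The text cleaning, stop-word removal, and n-gram TF-IDF steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the stop-word list, the sample comments, the helper names (`preprocess`, `ngrams`, `tfidf`), and the smoothed IDF variant are all assumptions chosen for illustration, and lemmatization, sentiment features, SMOTE, and the stacking ensemble are omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper does not list its own).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "so", "and", "to", "of"}

def preprocess(comment):
    """Lowercase, strip URLs, keep alphabetic tokens, drop stop words."""
    text = re.sub(r"https?://\S+", " ", comment.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams

def tfidf(corpus, n_max=2):
    """Compute TF-IDF weights per document as a list of {term: weight} dicts.

    tf is the relative frequency of a term within its document;
    idf(t) = ln((1 + N) / (1 + df(t))) + 1 is a common smoothed variant.
    """
    docs = [Counter(ngrams(preprocess(c), n_max)) for c in corpus]
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    df = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in doc.items()
        })
    return vectors

# Hypothetical sample comments, invented for this sketch.
comments = [
    "You are so stupid and ugly",
    "Great video, thank you so much!",
    "Nobody likes you, stupid loser http://spam.example",
]
vectors = tfidf(comments)
```

In this sketch a term that appears in fewer comments receives a higher weight than an equally frequent term that appears in more of them, and bigrams such as "stupid ugly" become features alongside single words. In a full pipeline these sparse vectors would be passed on to the resampling and classification stages.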
How to Cite:
[1] Inbanathan S, Dr. P. Menaka, “Cyberbullying Detection on Social media Platforms using ML and NLP,” International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13225
