International Advanced Research Journal in Science, Engineering and Technology: A Monthly Peer-Reviewed Multidisciplinary Journal
ISSN Online 2393-8021 | ISSN Print 2394-1588 | Since 2014
Volume 13, Issue 5, May 2026

Sentiment Analysis of YouTube Comments Using TF-IDF and LightGBM with MLflow and Flask Deployment

Rohit Santosh Bedse, Diwansing Shivsing Girase, Krishita Shailendra Patil, Kiran Jitendra Patil, Prof. Reema Kalda

Abstract: Sentiment analysis of user-generated content is a pivotal Natural Language Processing (NLP) task with significant applications in audience analytics, content moderation, and brand monitoring. YouTube comments represent a rich but noisy source of opinionated text, presenting challenges including informal language, abbreviations, sarcasm, and domain-specific terminology. This paper presents an end-to-end multiclass sentiment classification pipeline for YouTube comments built on Term Frequency-Inverse Document Frequency (TF-IDF) feature engineering and a Light Gradient Boosting Machine (LightGBM) classifier. The proposed system integrates structured data ingestion, NLTK-based text preprocessing with negation-aware stopword handling, TF-IDF vectorization with unigram-to-trigram representations, balanced multiclass classification, MLflow experiment tracking, and a Flask REST API deployment layer. An HTML, CSS, and JavaScript frontend provides an interactive interface for YouTube video URL input, real-time comment sentiment prediction, and dashboard visualization. Comparative experimentation across nine machine learning algorithms confirmed LightGBM as the optimal model. The proposed system achieved 89.4% overall accuracy and a macro-averaged F1-score of 88.7% on the held-out test set.
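As an illustration of the preprocessing described in the abstract, the following minimal Python sketch shows one plausible reading of "negation-aware stopword handling" (stopwords are removed, but negation words are kept) followed by unigram-to-trigram extraction. It is not the authors' code: the stopword list is a short hypothetical stand-in for the NLTK English list, and the function names `preprocess` and `ngrams` are assumptions introduced here for illustration.

```python
import re

# Hypothetical, abbreviated stopword list (a stand-in for NLTK's English list).
STOPWORDS = {"the", "a", "an", "is", "this", "was", "it", "and", "of", "to",
             "not", "no", "never", "nor"}

# Negation words that are retained even though they count as stopwords,
# so that "not good" is not collapsed into "good".
NEGATIONS = {"not", "no", "never", "nor"}


def preprocess(comment):
    """Lowercase, tokenize, and drop stopwords except negation words."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return [t for t in tokens if t not in STOPWORDS or t in NEGATIONS]


def ngrams(tokens, n_min=1, n_max=3):
    """Return all n-grams from n_min to n_max as space-joined strings."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


tokens = preprocess("This video is NOT good")
features = ngrams(tokens)
print(tokens)
print(features)
```

Keeping "not" lets the bigram "not good" survive as a feature distinct from the unigram "good". In a full pipeline these terms would typically be weighted by TF-IDF, for example with scikit-learn's `TfidfVectorizer(ngram_range=(1, 3))`, before being passed to the classifier; the exact configuration used in the paper is not stated in the abstract.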

Keywords: Sentiment Analysis, YouTube Comments, TF-IDF, LightGBM, MLflow, Text Classification

How to Cite:

[1] Rohit Santosh Bedse, Diwansing Shivsing Girase, Krishita Shailendra Patil, Kiran Jitendra Patil, Prof. Reema Kalda, “Sentiment Analysis of YouTube Comments Using TF-IDF and LightGBM with MLflow and Flask Deployment,” International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.135107

This work is licensed under a Creative Commons Attribution 4.0 International License.