Abstract: Adverse drug side effects pose significant challenges in pharmaceutical research and are among the leading causes of treatment failure and mortality. Traditional laboratory-based evaluation of drug side effects is resource-intensive and time-consuming, which motivates the use of machine learning techniques for efficient and accurate prediction. This study explores supervised learning approaches to drug side effect prediction, leveraging biomedical data and computational models. We combine two feature extraction techniques, Bag of Words (BOW) and Term Frequency-Inverse Document Frequency (TF-IDF), with three classification models: Logistic Regression, Random Forest, and Support Vector Machines (SVM). Experimental results show that the TF-IDF-based models perform best, with Logistic Regression attaining a test accuracy of 80.88% and SVM 80.90%. These findings highlight the potential of machine learning for predicting drug side effects, supporting drug safety assessment, and reducing the risks associated with adverse reactions. We analyze the effectiveness of each model and discuss key challenges, research gaps, and future directions for improving predictive performance in this domain.

Keywords: Machine Learning, Supervised Learning, Feature Extraction, TF-IDF, Bag of Words, Logistic Regression, Random Forest, Support Vector Machines.
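
To make the two feature extraction techniques named in the abstract concrete, the following minimal sketch computes Bag of Words counts and TF-IDF weights in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the three example texts are hypothetical, the whitespace tokenizer is a simplification, and the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1` is one common variant that the study may or may not have used.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace only.
    return text.lower().split()


def bag_of_words(docs):
    # Build a sorted vocabulary, then one raw term-count vector per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    # Reweight the raw counts by a smoothed inverse document frequency,
    # idf = ln((1 + N) / (1 + df)) + 1 (an assumed, commonly used variant).
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
    weighted = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, weighted


# Hypothetical toy texts, not taken from the study's dataset.
docs = [
    "drug caused nausea and headache",
    "drug caused mild rash",
    "no side effects reported",
]

vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)

drug = vocab.index("drug")
nausea = vocab.index("nausea")
print("BOW   :", bow[0][drug], bow[0][nausea])
print("TF-IDF:", round(tfidf[0][drug], 3), round(tfidf[0][nausea], 3))
```

In the first text, "drug" and "nausea" have the same Bag of Words count, but TF-IDF gives "nausea" the larger weight because it occurs in fewer documents. Vectors of either kind can then serve as inputs to a classifier such as Logistic Regression, Random Forest, or an SVM.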


DOI: 10.17148/IARJSET.2025.12427
