Abstract: Suicide is a serious global issue that demands timely intervention. An accurate prediction system built on available data can help identify at-risk individuals early enough to offer support. This study analyzes suicide data to identify the attributes that contribute most to suicide attempts, and aims to predict future attempts with high precision using machine learning techniques. We evaluated three algorithms (Logistic Regression, Random Forest, and Naïve Bayes) and found Random Forest to be the most accurate. The dataset was preprocessed, and important features such as age, gender, mental health history, and socio-economic status were identified. Models were evaluated with stratified k-fold cross-validation, which preserves the class distribution in each fold. The results indicate that ensemble methods such as Random Forest improve suicide attempt prediction relative to the other models tested, which may aid mental health professionals in early intervention. Future research should incorporate more diverse data sources, such as social media and electronic health records, while addressing ethical concerns about privacy and deployment.
Keywords: Suicide, prediction system, machine learning, Random Forest, Logistic Regression, Naïve Bayes, data analysis, mental health, intervention, ethical concerns.
DOI: 10.17148/IARJSET.2024.11729
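
The stratified k-fold evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stratified_k_fold`, the fold count, the seed, and the toy labels are all assumptions made for illustration. The sketch only shows how indices could be split so that each fold keeps roughly the same class ratio as the full dataset, which matters when the positive class is rare.

```python
from collections import defaultdict
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full data.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy, imbalanced labels (assumed for illustration): 10 positives, 40 negatives.
labels = [1] * 10 + [0] * 40
splits = list(stratified_k_fold(labels, k=5))

for train_idx, test_idx in splits:
    positives = sum(labels[i] for i in test_idx)
    print(f"test size={len(test_idx)}, positives in test fold={positives}")
```

With these toy labels, every test fold holds 10 samples, 2 of them positive, matching the 20% positive rate of the full set. In practice, each candidate model would be fit on `train_idx` and scored on `test_idx`, with the scores averaged across folds before the models are compared.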