Abstract: Low-cost genomic sequencing has produced vast amounts of genomic data, creating both opportunities and challenges for interpreting that data in clinical and research settings. In this article, we describe a new software tool for genomic data analysis and disease prediction based on machine learning. The tool applies several supervised learning algorithms to detect associations between genomic markers and disease, and it predicts disease risk together with confidence measures. It also provides data visualization, model comparison, and feature importance analysis to aid interpretation of the results. Tests on example datasets show that the software can identify significant genomic markers and classify disease status with acceptable accuracy. The software is implemented as an interactive web application on Google Colab, giving researchers, educators, and clinicians an immediately accessible platform for applying machine learning to genomic medicine without extensive computational expertise. This work contributes to the emerging field of computational genomics by providing an open-ended system for hypothesis generation and exploratory analysis in genomic studies.
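
As a rough illustration of the workflow summarized in the abstract (supervised learning on genomic markers, a risk estimate with a confidence value, and a feature importance ranking), the following minimal sketch may help. It is not the authors' implementation: the cohort is synthetic, the disease rule is invented, logistic regression stands in for the unspecified "several supervised learning algorithms", and all names (`make_synthetic_cohort`, `fit_logistic`, `predict_risk`, `CAUSAL`) are hypothetical. It uses only the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cohort(n_samples, n_markers, causal, rng):
    """Hypothetical data: SNP genotypes coded 0/1/2 (minor-allele count)."""
    X, y = [], []
    for _ in range(n_samples):
        g = [rng.randint(0, 2) for _ in range(n_markers)]
        # Invented rule (assumption): a sample is "affected" when the
        # causal markers carry at least two risk alleles in total.
        y.append(1 if sum(g[j] for j in causal) >= 2 else 0)
        X.append([v - 1.0 for v in g])  # center genotypes around 0
    return X, y


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Logistic regression trained with full-batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict_risk(w, b, x):
    # Predicted probability of disease for one centered genotype vector.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


rng = random.Random(0)
CAUSAL = (0, 3)  # hypothetical causal marker indices
X, y = make_synthetic_cohort(400, 6, CAUSAL, rng)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

w, b = fit_logistic(X_train, y_train)
risks = [predict_risk(w, b, x) for x in X_test]
accuracy = sum((r >= 0.5) == bool(t) for r, t in zip(risks, y_test)) / len(y_test)

# Feature importance here is simply the absolute model weight per marker.
ranking = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

print(f"held-out accuracy: {accuracy:.2f}")
print("markers ranked by |weight|:", ranking)
print(f"first test sample: risk={risks[0]:.2f}, "
      f"confidence={max(risks[0], 1 - risks[0]):.2f}")
```

In this sketch the "confidence measure" is just the larger of the two predicted class probabilities, and importance is read from the weight magnitudes. A real tool would more plausibly rely on calibrated probabilities, cross-validation, and model-specific importance scores.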

Keywords: Machine learning, genomics, disease prediction, personalized medicine, feature importance, SNP analysis, bioinformatics.


Downloads: PDF | DOI: 10.17148/IARJSET.2025.121255

How to Cite:

[1] Sumukh M, Ramu B, Yashwanth K H, Raziq Pasha, Malashree M S, "Genomic Data Analysis," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2025.121255
