Abstract: Phishing is a critical cybersecurity threat in which attackers deploy fraudulent websites to deceive users into disclosing sensitive personal information, such as banking credentials and passwords. Traditional blacklist-based detection can only flag URLs that have already been reported, so it is largely ineffective against new and rapidly changing phishing URLs. This research proposes an automated detection system that uses machine learning to identify both known and previously unseen phishing websites by analysing 30 structural URL characteristics. A dataset of labelled legitimate and phishing URLs was sourced from the UCI Machine Learning Repository and PhishTank to train the system. Multiple classification algorithms were evaluated, including Logistic Regression, Decision Tree, and Random Forest; the XGBoost (Extreme Gradient Boosting) classifier performed best, achieving a peak accuracy of 96.88%. The final trained model was integrated into a real-time web application developed with the Streamlit framework, providing a scalable and efficient solution for proactive cybersecurity.
Keywords: Phishing Detection, Machine Learning, XGBoost, Cybersecurity, URL Analysis, Feature Extraction.
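To illustrate what analysing structural URL characteristics can look like in practice, the following is a minimal Python sketch that turns a URL into a small numeric feature dictionary. The abstract does not enumerate the 30 features used, so the seven features here, the function name `extract_url_features`, and the `SHORTENERS` list are illustrative assumptions rather than the authors' actual feature set. Only the Python standard library is used.

```python
import ipaddress
from urllib.parse import urlparse

# Illustrative list of URL-shortening hosts (an assumption, not from the paper).
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}


def _is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few hypothetical structural features of a URL.

    Each value is an integer so the dictionary can be converted
    directly into a row of a feature matrix for a classifier.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    host_is_ip = _is_ip(host)
    return {
        "url_length": len(url),
        "has_ip_host": int(host_is_ip),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # Dots beyond the one separating domain and TLD; 0 for IP hosts.
        "subdomain_count": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "has_hyphen_in_host": int("-" in host),
        "is_shortener": int(host in SHORTENERS),
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login"):
        print(example, extract_url_features(example))
```

Feature rows of this kind, paired with legitimate/phishing labels, are what a classifier such as Logistic Regression, Random Forest, or XGBoost would be trained on; the training step itself is omitted here because the abstract gives no hyperparameters or data-split details.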
DOI: 10.17148/IARJSET.2026.13383
[1] Dharshini T, Mrs. P. Shanthi, "Phishing Website Detection Using Machine Learning," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13383