Abstract: This paper presents a multi-modal deepfake detection system that analyzes images, videos, and audio for signs of manipulation. It addresses the limitations of single-modality detectors by combining machine learning and deep learning techniques to identify complex forgeries. The proposed system uses CNN and ResNet models to detect spatial inconsistencies in images, an LSTM over frame sequences to identify temporal anomalies in videos, and Librosa-extracted audio features with a Random Forest classifier to detect synthetic audio patterns. The system is designed to be resilient against adversarial attacks and to deliver accurate, real-time results through a user-friendly, Flask-based web interface. The anticipated outcomes are higher detection accuracy than single-modality approaches, balanced precision and recall, and improved generalization across diverse datasets.
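
As an illustration of the multi-modal design described in the abstract, the following minimal sketch shows one way an uploaded file could be routed to the detector for its modality. It is not the authors' implementation: the function name `route_upload`, the two lookup tables, and the list of file extensions are all hypothetical, and the detector entries are descriptive labels taken from the abstract rather than real model objects.

```python
from pathlib import Path

# Hypothetical extension table: the paper does not list supported formats.
MODALITY_BY_EXTENSION = {
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".mp4": "video", ".avi": "video", ".mov": "video",
    ".wav": "audio", ".mp3": "audio", ".flac": "audio",
}

# Detector descriptions from the abstract; these are labels, not models.
DETECTOR_BY_MODALITY = {
    "image": "CNN/ResNet spatial-inconsistency classifier",
    "video": "LSTM over frame sequences",
    "audio": "Librosa features + Random Forest classifier",
}


def route_upload(filename: str) -> tuple[str, str]:
    """Return (modality, detector description) for an uploaded file name."""
    # Compare extensions case-insensitively so "clip.MP4" matches ".mp4".
    suffix = Path(filename).suffix.lower()
    modality = MODALITY_BY_EXTENSION.get(suffix)
    if modality is None:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return modality, DETECTOR_BY_MODALITY[modality]
```

In a Flask-based interface such as the one the abstract mentions, a function of this kind would typically be called on the uploaded file's name before the matching model is invoked; rejecting unknown extensions early keeps unsupported media out of the detectors.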

Keywords: Deepfake Detection, Multi-modal Detection, Machine Learning, Deep Learning, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM).


DOI: 10.17148/IARJSET.2025.12835

How to Cite:

[1] Varun Kumar G, Dr. Harish G, Dr. Smitha shekar B, "Deepfake Creation and Detection of Multimedia Data Using Machine Learning," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2025.12835
