Abstract: SILENT SPEAK is an intelligent real-time communication system designed to empower individuals with speech and hearing impairments by translating non-verbal cues into spoken language. The system captures hand gestures using the MediaPipe framework and classifies them through a TensorFlow-based deep learning model trained for precision and efficiency. Simultaneously, facial emotions such as happiness, anger, sadness, and surprise are detected using the ResidualMaskingNetwork model integrated via the DeepFace library. These combined inputs are then converted into audible speech through a text-to-speech (TTS) engine, enabling fluid and expressive communication. A user-friendly graphical interface, developed with Tkinter, displays real-time predictions and allows users to interact with the system seamlessly. With its ability to interpret both gestures and facial expressions, SILENT SPEAK offers a comprehensive solution for augmenting communication, supporting inclusive interactions, and bridging the gap between verbal and non-verbal communication in real-world scenarios.
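The abstract describes combining a recognised gesture label with a detected facial emotion before handing the result to the TTS engine. A minimal sketch of what that fusion step might look like is shown below; the function name, emotion labels, and phrasing map are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of SILENT SPEAK's fusion step: merging a gesture
# prediction (e.g. from a TensorFlow classifier over MediaPipe landmarks)
# with an emotion prediction (e.g. from DeepFace) into one spoken phrase.
# All names here are assumptions for illustration.

def compose_utterance(gesture_text: str, emotion: str) -> str:
    """Combine gesture and emotion predictions into a single TTS phrase."""
    # Map detected emotions to short spoken qualifiers (assumed label set).
    emotion_phrases = {
        "happy": "said cheerfully",
        "angry": "said angrily",
        "sad": "said sadly",
        "surprise": "said with surprise",
    }
    qualifier = emotion_phrases.get(emotion, "")
    return f"{gesture_text} ({qualifier})" if qualifier else gesture_text


if __name__ == "__main__":
    # A real pipeline would pass the composed text to a TTS engine,
    # e.g. with pyttsx3:
    #     engine = pyttsx3.init(); engine.say(text); engine.runAndWait()
    print(compose_utterance("Hello, how are you?", "happy"))
```

Unrecognised or neutral emotions simply fall through to the plain gesture text, so the spoken output degrades gracefully when the emotion model is uncertain.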
Keywords: Non-Verbal Communication, Hand Gesture Detection, Emotion Recognition, Real-Time Speech Output, Assistive Communication Technology, MediaPipe, DeepFace, TensorFlow, Human-Centered AI, Multimodal Interaction.
DOI: 10.17148/IARJSET.2025.12436