Abstract: This project introduces a real-time feedback application aimed at enhancing public speaking skills through comprehensive analysis of webcam data. The system evaluates key aspects of body language, such as posture, gestures, and eye contact, along with critical speech metrics including filler word usage, speaking pace, and clarity. By delivering instant, actionable feedback and detailed progress reports, it enables users to systematically improve their presentation skills. The software is built using Streamlit for a responsive user interface and backend, a Convolutional Neural Network (CNN) for analyzing non-verbal communication, Hugging Face models for advanced natural language processing, and Librosa for audio analysis and transcription. Trained on a diverse dataset of annotated public speaking videos, the system is designed for accuracy and relevance while maintaining strict privacy and ethical standards. Extensive testing has validated its reliability, and continuous updates based on user feedback allow the software to evolve with technological advancements and user needs. This AI-powered tool represents a significant step toward making high-quality public speaking training accessible to all.
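The speech metrics named above (filler word usage and speaking pace) can be illustrated with a minimal sketch. This is not the paper's implementation: the filler word list, pacing thresholds, and function name are assumptions chosen for illustration, operating on a transcript such as one produced by the system's audio pipeline.

```python
# Illustrative sketch of transcript-level speech metrics: filler-word
# counting and speaking-pace estimation. The filler list and the
# 120-160 wpm "conversational pace" band are assumptions, not values
# taken from the paper.

FILLER_WORDS = {"um", "uh", "like", "basically", "actually"}

def speech_metrics(transcript: str, duration_seconds: float) -> dict:
    words = transcript.lower().split()
    # Count tokens that match the filler list after stripping punctuation.
    fillers = sum(1 for w in words if w.strip(".,!?") in FILLER_WORDS)
    wpm = len(words) / (duration_seconds / 60.0)
    pace = "slow" if wpm < 120 else "fast" if wpm > 160 else "good"
    return {
        "word_count": len(words),
        "filler_count": fillers,
        "words_per_minute": round(wpm, 1),
        "pace": pace,
    }
```

A real pipeline would feed this from a speech-to-text transcript and combine the result with the CNN-based body-language scores before rendering feedback in the Streamlit interface.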
Keywords: public speaking, real-time feedback, body language, speech analysis, CNN, Hugging Face, Librosa, NLP, audio-visual processing, feature extraction, user interface, Streamlit, Tkinter, machine learning, deep learning, emotion detection, posture, gestures, eye contact, filler words.
DOI: 10.17148/IARJSET.2025.125258