Abstract: Since its earliest applications as a reading aid for the blind and visually impaired, optical character recognition (OCR) has evolved into a vital tool for automated extraction of text from images. This evolution has been driven by the increasing digitization of information and the need to process large volumes of image-based data efficiently. This paper examines the advancement of OCR, focusing on the contribution of Convolutional Neural Networks (CNNs) to improved accuracy in text extraction tasks.
Conventional OCR methods, which relied largely on rule-based strategies and manually engineered features, struggled with variations in font size, style, and image quality, especially against intricate backgrounds. The paradigm shift in computer vision brought about by deep learning, particularly CNNs, has greatly influenced OCR. Inspired by the human visual system, CNNs automatically learn complex patterns and features from image data without the need for intensive feature engineering. This capability has substantially improved OCR performance, allowing systems to handle diverse font types, scales, and even difficult background conditions more accurately.
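As a concrete illustration of this capability, the minimal PyTorch sketch below shows a CNN classifying character crops directly from pixel data; it is not the architecture proposed in this paper, and the layer sizes, 32x32 input, and 62-class alphabet are assumptions chosen for the example.

```python
# Illustrative sketch only: a minimal CNN character classifier, not the
# system proposed in this paper. Stacked convolutional layers learn
# stroke- and shape-level features from raw glyph images, replacing
# hand-crafted feature engineering.
import torch
import torch.nn as nn

class CharCNN(nn.Module):
    def __init__(self, num_classes: int = 62):  # assumed alphabet: 0-9, a-z, A-Z
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # learns edge/stroke filters
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # learns compound glyph shapes
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                 # feature maps derived automatically
        return self.classifier(x.flatten(1))

# Usage: classify a batch of four 32x32 grayscale character crops.
logits = CharCNN()(torch.randn(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 62])
```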
This work reviews the current literature on CNN-powered OCR systems, examining diverse architectures, methods, and language-specific applications. It also describes a new system architecture for reliable and efficient text extraction from images. The proposed design aims to address the shortcomings of existing tools while highlighting the broader societal benefits of advancing OCR technology.
Keywords: Optical Character Recognition, Convolutional Neural Network, Deep Learning, Feature Extraction, Text Detection, Text Recognition, Text Extraction, Pre-Processing.
DOI: 10.17148/IARJSET.2025.12218