Real-time text recognition and text-to-speech processing system for visually impaired

D. Parasar; Y. Jadhav; A. Patel; J. Shah

doi:10.1049/icp.2024.0493

What is it about?

Daily life now moves at a pace that leaves little time to engage with important written information. Reading has shifted from an obligation to a choice, and many people prefer podcasts to novels and other written media because they see them as more time-efficient. Listening also demands less cognitive effort than reading and offers a more user-friendly experience.

Optical Character Recognition (OCR) is a technique for extracting machine-readable text from photographs or PDF files; a text-to-speech synthesizer can then read that text aloud. The proposed methodology converts PDF documents into audio files while eliminating extraneous parts such as headers, footers, page numbers, publisher copyright seals, external references, and similar components. It combines OCR with text-to-speech synthesizers, such as Google Text-to-Speech and Python Text-to-Speech, to turn the written text into speech.
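The clean-up step between text extraction and speech synthesis can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the text has already been extracted page by page, and the function names, the page-number pattern, and the `edge_lines` and `min_share` parameters are all hypothetical. The `to_audio` helper assumes the third-party gTTS package is installed.

```python
import re
from collections import Counter

# Hypothetical pattern for lines that hold only a page number,
# e.g. "12", "Page 3", "3 of 10", "3/10".
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+(\s*(of|/)\s*\d+)?\s*$", re.IGNORECASE)


def strip_boilerplate(pages, edge_lines=2, min_share=0.6):
    """Drop page numbers and lines that repeat at the top or bottom of most pages.

    pages: list of strings, one per page of already-extracted text.
    """
    split_pages = [[ln.strip() for ln in p.splitlines() if ln.strip()] for p in pages]

    # Count on how many pages each top/bottom line appears.
    edge_counts = Counter()
    for lines in split_pages:
        edge_counts.update(set(lines[:edge_lines] + lines[-edge_lines:]))

    # A line counts as a header/footer if it sits at the edge of enough pages.
    threshold = max(2, int(min_share * len(split_pages)))
    repeated = {ln for ln, n in edge_counts.items() if n >= threshold}

    cleaned = []
    for lines in split_pages:
        kept = []
        for i, ln in enumerate(lines):
            at_edge = i < edge_lines or i >= len(lines) - edge_lines
            # Note: a body line consisting only of a number is also dropped.
            if PAGE_NUMBER.match(ln) or (at_edge and ln in repeated):
                continue
            kept.append(ln)
        if kept:
            cleaned.append(" ".join(kept))
    return "\n".join(cleaned)


def to_audio(text, out_path="output.mp3"):
    """Write the cleaned text to an MP3 file (assumes gTTS is installed)."""
    from gtts import gTTS  # imported lazily so the filter works without it

    gTTS(text=text, lang="en").save(out_path)
```

Calling `strip_boilerplate` on a list of page strings returns the body text with one line per page, which can then be passed to `to_audio`. Matching repeated lines only at page edges is a deliberate choice here, so that a sentence in the body that happens to equal a running header is still read aloud.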

Why is it important?

The proposed methodology aims to create a smart environment in which people use advanced systems to improve their quality of life.

Perspectives

The technique keeps headers, footers, page numbers, and similar extraneous elements out of the final audio file. The study aims to assist people who are visually impaired or who otherwise have difficulty reading, by letting them listen to podcast-style audio versions of the texts.
Dr. Ashish D Patel
Shri Sad Vidya Mandal Institute of Technology

This page is a summary of: Real-time text recognition and text-to-speech processing system for visually impaired, January 2023, the Institution of Engineering and Technology (the IET),
DOI: 10.1049/icp.2024.0493.

Contributors

The following have contributed to this page

Dr. Ashish D Patel
Shri Sad Vidya Mandal Institute of Technology
