What is it about?

In ordinary free-form handwritten text, repetition (writing the same stroke several times in the same place), overwriting, and crossing out are very common. In this article, we refer to these three types of writing as “noise.” Cleaning such noisy text to extract the useful text is an important step toward robust recognition. To the best of our knowledge, no work has been reported on cleaning such noise from online text in any script; hence, in this article, we propose an automatic text-cleaning approach for online Bangla handwriting recognition.

Why is it important?

In the domain of handwriting recognition, researchers generally do not consider noisy words for recognition. For noisy online data, some researchers manually pre-process the data to remove the noisy parts of a word before recognition, which is a tedious and costly procedure. To the best of our knowledge, this is the first work of its kind for online recognition.

Perspectives

The current work deals with a difficult script, Bangla. In the future, we plan to analyze the writing styles of other Indian scripts, as well as Latin and Chinese scripts, and to devise algorithms for detecting and cleaning noise accordingly.

Dr. Nilanjana Bhattacharya

Read the Original

This page is a summary of: Cleaning of Online Bangla Free-form Handwritten Text, ACM Transactions on Asian and Low-Resource Language Information Processing, March 2018, ACM (Association for Computing Machinery), DOI: 10.1145/3145538.