What is it about?

This work presents Online-KHATT, a database of natural Arabic online text that addresses the lack of a free benchmarking database for this kind of data. The text was written with a digital pen and without any constraints on the writers.

Why is it important?

The main objective of this work is to build a comprehensive benchmarking database of online Arabic text. Part of this objective is the development of tools, techniques, and procedures for online text collection, verification, and transliteration. Additionally, we built a dataset of segmented online Arabic characters and ligatures with ground-truth labeling, and we present classification results for online Arabic characters using a DBN-based HMM.

Perspectives

The database consists of 10,040 lines of Arabic text written by 623 writers using Android- and Windows-based devices. The text lines are randomly distributed into training, testing, and verification sets that contain 70%, 15%, and 15% of the lines, respectively. We have segmented part of the collected data into characters, each with its ground truth. We have developed tools for data collection (on devices with an electronic pen), verification and correction of ground truths, transliteration, and semi-automated character segmentation. We also present experimental results of Arabic online character recognition using the Online-KHATT database.
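To make the 70%/15%/15% partition concrete, here is a minimal sketch of a random line-level split. It is an illustration only: the function name `split_lines`, the integer line identifiers, and the fixed seed are assumptions made for this example, and the summary does not describe the exact procedure used to build the official Online-KHATT sets.

```python
import random


def split_lines(line_ids, percents=(70, 15, 15), seed=0):
    """Randomly partition line identifiers into three sets.

    Hypothetical helper: the name, the seed, and the use of plain
    integer IDs are assumptions, not the authors' tooling.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")

    ids = list(line_ids)
    # A seeded generator keeps the shuffle reproducible.
    random.Random(seed).shuffle(ids)

    n = len(ids)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * percents[0] // 100
    n_test = n * percents[1] // 100

    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    # The verification set takes whatever remains.
    verification = ids[n_train + n_test:]
    return train, test, verification


# 10,040 is the number of text lines reported for the database.
train, test, verification = split_lines(range(10040))
print(len(train), len(test), len(verification))  # prints: 7028 1506 1506
```

With 10,040 lines, these proportions correspond to 7,028 training lines and 1,506 lines each for testing and verification. Anyone benchmarking against the database should use the sets distributed with it rather than regenerate a split like this one.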

Galal BinMakhashen
King Fahd University of Petroleum and Minerals

Read the Original

This page is a summary of: Online-KHATT: An Open-Vocabulary Database for Arabic Online-Text Processing, The Open Cybernetics & Systemics Journal, March 2018, Bentham Science Publishers, DOI: 10.2174/1874110x01812010042.
