Curras: an annotated corpus for the Palestinian Arabic dialect

Mustafa Jarrar; Nizar Habash; Faeq Alrimawi; Diyam Akra; Nasser Zalmout

doi:10.1007/s10579-016-9370-7

What is it about?

The corpus contains about 56K words (tokens). Every word was manually annotated in context with a set of metadata attributes that describe its orthographic, morphological, and semantic features: part of speech, prefixes, stem, suffixes, dialect lemma, MSA lemma, CODA surface form, gender, number, mode, and an English gloss.
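To make the annotation scheme concrete, here is a minimal sketch of how one annotated word could be held in memory. The class name, field names, and the sample record are assumptions made for illustration; they are not the corpus's actual file format, and the sample values are not taken from the corpus.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AnnotatedToken:
    """One word in context. Field names are hypothetical, chosen to
    mirror the attributes listed in the summary above."""
    surface: str            # the word as written in the source text
    coda_surface: str       # spelling in the CODA orthography
    pos: str                # part of speech
    prefixes: str
    stem: str
    suffixes: str
    dialect_lemma: str
    msa_lemma: str
    gender: Optional[str]   # None when the attribute does not apply
    number: Optional[str]
    mode: Optional[str]
    gloss: str              # English gloss


def lexicon_entry(token: AnnotatedToken) -> Tuple[str, str, str]:
    """Return the (dialect lemma, MSA lemma, English gloss) triple
    that a trilingual lexicon view would be built from."""
    return (token.dialect_lemma, token.msa_lemma, token.gloss)


# Illustrative record only (assumed values, not a corpus entry).
token = AnnotatedToken(
    surface="البيت",
    coda_surface="البيت",
    pos="noun",
    prefixes="ال",
    stem="بيت",
    suffixes="",
    dialect_lemma="بيت",
    msa_lemma="بيت",
    gender="masculine",
    number="singular",
    mode=None,
    gloss="house",
)

print(lexicon_entry(token))
```

Keeping the dialect lemma, MSA lemma, and gloss side by side on every token is what would let the same records serve both as a morphologically annotated corpus and as a lexicon.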

Why is it important?

(i) Language learners can use it as a trilingual (Palestinian Arabic, Standard Arabic, English) lexicon; (ii) linguists can use it for research purposes; and (iii) developers can use it to build IT applications. Dialectal content is increasing rapidly on the web, especially on social media, yet there are no computer applications currently available to process and understand this content, e.g., automatic translation, effective search and retrieval, spell checking, speech recognition, and many others.

Perspectives

The annotation tags are compatible with the LDC Arabic tags, the annotation was done with very high accuracy, and the corpus can be searched and downloaded from http://portal.sina.birzeit.edu/curras/
Dr Mustafa Jarrar
Birzeit University

This page is a summary of: Curras: an annotated corpus for the Palestinian Arabic dialect, Language Resources and Evaluation, December 2016, Springer Science + Business Media,
DOI: 10.1007/s10579-016-9370-7.

Resources

URL
Search and download the Dialect Corpus
Palestinian Colloquial Dialect Corpus

Contributors

The following have contributed to this page

Dr Mustafa Jarrar
Birzeit University
