What is it about?

Low-resource languages are hard to handle because training data is in short supply. In such cases, exploiting 'similar' languages can be beneficial, as we show in this paper for Azeri-to-English translation. We build a large-scale Turkish-to-English system and then exploit lexical similarities between Turkish and Azeri to obtain the best performance for the primary language pair of interest, Azeri-English.
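As a rough illustration of what exploiting lexical similarity can look like, the sketch below maps each word of a low-resource language onto its closest surface match in the vocabulary of a related, better-resourced language. It is not the method from the paper: the function name `adapt_vocabulary`, the string-similarity measure (Python's `difflib`), the `cutoff` value and the tiny word lists are all assumptions chosen for illustration.

```python
from difflib import get_close_matches


def adapt_vocabulary(source_words, related_vocab, cutoff=0.7):
    """Map each source word to its closest word in related_vocab.

    Hypothetical helper: a word with no match at or above `cutoff`
    is left unchanged.
    """
    mapping = {}
    for word in source_words:
        # get_close_matches returns up to n candidates, best match first.
        matches = get_close_matches(word, related_vocab, n=1, cutoff=cutoff)
        mapping[word] = matches[0] if matches else word
    return mapping


# Toy word lists, assumed for illustration only.
turkish_vocab = ["kitap", "okul", "geliyorum"]
azeri_words = ["kitab", "məktəb"]

mapping = adapt_vocabulary(azeri_words, turkish_vocab)
for azeri, turkish in mapping.items():
    print(azeri, "->", turkish)
```

Here a word with a close Turkish counterpart is rewritten to it, so a system trained on Turkish can recognise it, while a word with no close match is passed through unchanged. A real system would need a more careful similarity measure and far larger vocabularies.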

Why is it important?

Researchers working on low-resource languages are sometimes at a loss as to how to build competitive tools. This simple idea can be replicated in other resource-poor scenarios where larger amounts of training data exist for a closely related language.

Perspectives

This is a nice paper that uses a very simple idea to obtain better performance for a low-resource language pair. The approach is easily replicable for other pairs of similar languages.

Andy Way
Dublin City University

Read the Original

This page is a summary of: Translating Low-Resource Languages by Vocabulary Adaptation from Close Counterparts, ACM Transactions on Asian and Low-Resource Language Information Processing, December 2017, ACM (Association for Computing Machinery), DOI: 10.1145/3099556.
