What is it about?
Recent ML models require a large, clean corpus of parallel data, and even then they often cannot handle rare words effectively. Because no large parallel corpus is available for Sanskrit, it is challenging to use ML models to translate it. We nevertheless improve translation accuracy, even under zero-shot conditions, by using morphological patterns (such as Dhatu, Vibhakti, and compound words) and improved filtering heuristics.
Why is it important?
Much work remains to be done in applying ML to the challenges of translating Sanskrit, one of the oldest and richest languages known to the world, given its morphological richness and limited multilingual parallel corpus.
Read the Original
This page is a summary of: Filtering and Extended Vocabulary based Translation for Low-resource Language pair of Sanskrit-Hindi, ACM Transactions on Asian and Low-Resource Language Information Processing, January 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3580495.