Statistical and Syllabification Based Model for Nepali Machine Transliteration

Abstract

Machine Transliteration is an important module in the development of an accurate Machine Translation (MT) system. Machine Translation is the technique of converting sentences in one natural language into another using a machine, whereas Machine Transliteration is the method of converting words in one language into phonetically equivalent words in another. When Machine Translation is unable to translate Out-of-Vocabulary (OOV) words, named entities, technical words, abbreviations, etc., Machine Transliteration transliterates these words phonetically. This paper presents a transliteration system for the English-Nepali language pair that combines the most widely used statistical method with a linguistic syllabification methodology. A model based on syllable splitting has been designed and applied to 19,513 parallel entries, which contain person names, place names, etc. IRSTLM is used to build the language model (LM) and GIZA++ is used to build the translation model (TM), i.e. the word alignment, over the parallel entries. For English-Nepali parallel entries split on a syllable basis, an accuracy of 87% has been achieved.
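The syllable-splitting step can be illustrated with a minimal sketch. The abstract does not give the paper's actual syllabification rules, so the heuristic below (each unit is a run of consonants followed by a run of vowels, with any trailing consonants attached to the last unit) is an assumption made purely for illustration, and the function names `split_syllables` and `to_aligner_line` are hypothetical. It covers only the Latin-script (English) side and emits units separated by spaces, the whitespace-tokenized form that alignment tools generally expect.

```python
import re

# Assumed heuristic, NOT the paper's rule set: a unit is zero or more
# non-vowel characters followed by one or more vowels.
_UNIT = re.compile(r"[^aeiou]*[aeiou]+")


def split_syllables(word):
    """Split a Latin-script word into rough syllable-like units."""
    word = word.lower()
    if not word:
        return []
    units = _UNIT.findall(word)
    if not units:
        # No vowel at all (e.g. an abbreviation): keep the word whole.
        return [word]
    # Matches are contiguous from the start of the word, so whatever is
    # left over is a trailing consonant run; attach it to the last unit.
    consumed = sum(len(u) for u in units)
    if consumed < len(word):
        units[-1] += word[consumed:]
    return units


def to_aligner_line(word):
    """Join the units with spaces so each unit is treated as one token."""
    return " ".join(split_syllables(word))


if __name__ == "__main__":
    for name in ["Pokhara", "Sita", "Ram"]:
        print(name, "->", to_aligner_line(name))
```

Under this assumption, every parallel entry becomes a short "sentence" of syllable units, so the aligner learns unit-to-unit correspondences rather than whole-word ones. A real system would need a matching splitter for the Devanagari side, which is not shown here.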

Cite

Roy, A. K., Paul, A., & Purkayastha, B. S. (2022). Statistical and Syllabification Based Model for Nepali Machine Transliteration. In Communications in Computer and Information Science (Vol. 1579 CCIS, pp. 19–27). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-10766-5_2
