
A survey of automatic Arabic diacritization techniques

Published online by Cambridge University Press:  10 October 2013

AQIL M. AZMI
Affiliation:
Department of Computer Science, King Saud University, Riyadh 11543, Saudi Arabia e-mails: aqil@ksu.edu.sa, reham.imamu@gmail.com
REHAM S. ALMAJED
Affiliation:
Department of Computer Science, King Saud University, Riyadh 11543, Saudi Arabia e-mails: aqil@ksu.edu.sa, reham.imamu@gmail.com

Abstract

Modern Standard Arabic texts are typically written without diacritical markings. The diacritics are important for clarifying the sense and meaning of words, and their absence may lead to ambiguity even for native speakers. Native speakers can often resolve the ambiguity from context; however, many Arabic applications, such as machine translation, text-to-speech, and information retrieval, suffer from the lack of diacritics. The process of automatically restoring diacritical marks is called diacritization or diacritic restoration. In this paper we discuss the properties of the Arabic language and the issues that arise from the lack of diacritical marking. This is followed by a survey of recent algorithms developed to solve the diacritization problem. We also look at future trends for researchers working in this area.

Information

Type
Articles
Copyright
Copyright © Cambridge University Press 2013 
