A finite-state morphological grammar of Hebrew

Published online by Cambridge University Press: 01 April 2008

S. YONA and S. WINTNER
Affiliation:
Department of Computer Science, University of Haifa, 31905 Haifa, Israel. E-mail: shlomo@cs.haifa.ac.il, shuly@cs.haifa.ac.il

Abstract

Morphological analysis is a crucial component of several natural language processing tasks, especially for languages with a highly productive morphology, where stipulating a full lexicon of surface forms is not feasible. This paper describes HAMSAH (HAifa Morphological System for Analyzing Hebrew), a morphological processor for Modern Hebrew, based on finite-state linguistically motivated rules and a broad coverage lexicon. The set of rules comprehensively covers the morphological, morpho-phonological and orthographic phenomena that are observable in contemporary Hebrew texts. Reliance on finite-state technology facilitates the construction of a highly efficient, completely bidirectional system for analysis and generation.

Type
Papers
Copyright
Copyright © Cambridge University Press 2007

