An alternative to synchronous tree substitution grammars

  • ANDREAS MALETTI
Abstract

Synchronous tree substitution grammars (STSG) are a formal tree transformation model used in syntax-based machine translation. A competitor that is at least as expressive as STSG is proposed and compared to STSG. The competitor is the extended multi bottom-up tree transducer (MBOT), which is the bottom-up analogue with the additional feature that states have non-unary ranks. Unweighted MBOT have already been investigated with respect to their basic properties, but the particular properties of the constructions that are required in the machine translation task are largely unknown. STSG and MBOT are compared with respect to binarization, regular restriction, and application. Particular attention is paid to the complexity of the constructions.

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering