
Joint learning of morphology and syntax with cross-level contextual information flow

Published online by Cambridge University Press:  20 January 2022

Burcu Can*
Affiliation:
Department of Computer Engineering, Hacettepe University, Ankara, Turkey Research Institute of Information and Language Processing, University of Wolverhampton, Wolverhampton WV1 1LY, UK
Hüseyin Aleçakır
Affiliation:
Cognitive Science Department, Informatics Institute, Middle East Technical University, Ankara, Turkey
Suresh Manandhar
Affiliation:
Madan Bhandari University of Science and Technology Development Board, Karyabinayak, Nepal
Cem Bozşahin
Affiliation:
Cognitive Science Department, Informatics Institute, Middle East Technical University, Ankara, Turkey
*Corresponding author. E-mail: b.can@wlv.ac.uk

Abstract

We propose an integrated deep learning model for morphological segmentation, morpheme tagging, part-of-speech (POS) tagging, and syntactic parsing into dependencies, using cross-level contextual information flow for every word, from segments to dependencies, with an attention mechanism in the horizontal flow. Our model extends the work of Nguyen and Verspoor ((2018). Proceedings of the CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. The Association for Computational Linguistics, pp. 81–91.) on joint POS tagging and dependency parsing to also include morphological segmentation and morphological tagging. We report our results on several languages. Our primary focus is agglutination in morphology, in particular Turkish morphology, for which we demonstrate improved performance compared to models trained for individual tasks. As one of the earlier efforts in joint modeling of syntax and morphology along with dependencies, we also discuss prospective guidelines for future comparison.

Information

Type
Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. A Turkish clause with labeled dependency relations between morphologically complex words. First line: the orthographic form (“-” stands for “null”). Second line: morphological segments. Third line: morphological tags (-GEN=genitive, .3s=3rd person singular, -ACC=accusative, -PAST=past tense). Dependencies in the article are drawn as arrows from head to dependent and labeled with UD relations (de Marneffe et al. 2021).


Figure 2. The layers of the proposed joint learning framework. The sentence “Ali okula gitti.” (“Ali went to school”) is processed from morphology up to dependencies.


Figure 3. The layers of the proposed joint learning framework working on the sentence “okula gitti” (“(he/she) went to school”). The vectors c, w, e, mt, s, and p denote character, word, concatenated character and word, morphological tag, segment, and POS tag, respectively.


Figure 4. The sub-architecture of the morphological segmentation component. The one-hot vector of each character of a word is fed into a BiLSTM. The resulting vector from each state of the BiLSTM is fed into a multilayer perceptron with one hidden layer and a sigmoid activation function in the output layer. The output of the sigmoid function is a value between 0 and 1, where 1 corresponds to a segment boundary and 0 indicates a non-boundary. In the example, there is a boundary after the character i, that is, the MLP outputs 1 at that time step.
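The boundary-prediction step described in the Figure 4 caption can be sketched as follows. This is an illustrative NumPy sketch, not the article's implementation: random weights and a simple linear map stand in for the trained BiLSTM and MLP, and all names (`boundary_probs`, `segment`, etc.) are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
HIDDEN = 16

def one_hot(ch):
    """One-hot vector for a single character."""
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(ch)] = 1.0
    return v

# Random weights stand in for the trained BiLSTM and the one-hidden-layer MLP.
W_h = rng.normal(size=(HIDDEN, len(ALPHABET)))   # toy stand-in for a BiLSTM state
W_o = rng.normal(size=HIDDEN)                    # MLP output layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def boundary_probs(word):
    """Per-character probability of a segment boundary after that character."""
    states = [np.tanh(W_h @ one_hot(c)) for c in word]   # one state per time step
    return np.array([sigmoid(W_o @ h) for h in states])

def segment(word, threshold=0.5):
    """Split the word wherever the predicted boundary probability exceeds the threshold."""
    probs = boundary_probs(word)
    pieces, start = [], 0
    for i, p in enumerate(probs[:-1]):   # never split after the final character
        if p > threshold:
            pieces.append(word[start:i + 1])
            start = i + 1
    pieces.append(word[start:])
    return pieces

print(segment("okula"))   # with trained weights this would yield ['okul', 'a']
```

With trained parameters in place of the random ones, thresholding the sigmoid outputs at 0.5 recovers the boundary/non-boundary decision the caption describes.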


Figure 5. The encoder–decoder sub-architecture of the morphological tagging component. The top part is the encoder and the bottom part is the decoder. The example is “kahveleri bende içelim” (“let’s drink coffee at my place”).
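The encoder–decoder idea in Figure 5 can be illustrated with a minimal sketch: the encoder compresses a word into a context vector, and the decoder greedily emits morphological tags until a stop symbol. This is a toy with random weights and a mean-pooling "encoder" standing in for the trained network; the tag set and all function names are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

TAGS = ["Noun", "-GEN", ".3s", "-ACC", "-PAST", "<stop>"]
EMB = 8

char_emb = {c: rng.normal(size=EMB) for c in "abcçdefgğhıijklmnoöprsştuüvyz"}
W_dec = rng.normal(size=(len(TAGS), EMB))   # decoder output projection

def encode(word):
    """Toy encoder: mean of character embeddings stands in for the BiLSTM encoder."""
    return np.mean([char_emb[c] for c in word], axis=0)

def decode(context, max_tags=5):
    """Greedy decoder: emit the highest-scoring tag until <stop> (or max_tags)."""
    tags, state = [], context.copy()
    for _ in range(max_tags):
        scores = W_dec @ state
        tag = TAGS[int(np.argmax(scores))]
        if tag == "<stop>":
            break
        tags.append(tag)
        state = np.tanh(state + W_dec[TAGS.index(tag)])  # fold the emitted tag back in
    return tags

print(decode(encode("kahveleri")))
```

A trained model would condition each decoding step on the previously emitted tag, which the `state` update above only gestures at.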


Figure 6. The pointer network used to predict dependency scores between words in a sentence. Some Turkish dependencies are shown to exemplify head–dependent relations.
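The scoring step behind Figure 6 can be sketched as follows: each candidate head–dependent pair receives a scalar score, and a softmax over candidate heads lets the dependent "point" at its head. This sketch uses random vectors in place of learned BiLSTM word representations and an additive (Bahdanau-style) scorer as one common choice; it is not the article's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(2)
DIM = 12

# Random word vectors stand in for learned word representations.
words = ["Ali", "okula", "gitti"]
vecs = {w: rng.normal(size=DIM) for w in words}

# Additive attention parameters (one common pointer-network choice).
W_head = rng.normal(size=(DIM, DIM))
W_dep = rng.normal(size=(DIM, DIM))
v = rng.normal(size=DIM)

def score(head, dep):
    """Scalar attachment score for a candidate head-dependent pair."""
    return float(v @ np.tanh(W_head @ vecs[head] + W_dep @ vecs[dep]))

def predict_head(dep):
    """Pointer step: the dependent 'points' at its highest-scoring candidate head."""
    candidates = [w for w in words if w != dep]
    scores = np.array([score(h, dep) for h in candidates])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                     # softmax over candidate heads
    return candidates[int(np.argmax(probs))], probs

head, probs = predict_head("okula")
print(head, probs)
```

In a full parser, these scores feed a decoding step (and a separate classifier assigns the dependency label), which this sketch omits.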


Figure 7. An illustration of the overall architecture with the cross-level contextual information flows between layers, using the example “Ali okula gitti.” (“Ali went to school”).


Figure 8. An illustration of morpheme-tag cross-level contextual information flow between the POS layer and the morpheme tagging layer. In the example, the context contains only the two previous words. The POS layer also takes the morpheme segmentation embeddings and word embeddings, each with its own attention network; these are omitted from the figure for simplicity.
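The attention over the two previous words' morpheme-tag embeddings described in the Figure 8 caption can be sketched generically: a query from the current word weights the context embeddings and their weighted sum flows up to the POS layer. This is a dot-product attention sketch with random vectors, an assumption for illustration rather than the article's exact attention network.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM = 6

# Hypothetical morpheme-tag embeddings for the two previous words (the "context").
context = np.stack([rng.normal(size=DIM) for _ in range(2)])
query = rng.normal(size=DIM)   # representation of the current word at the POS layer

def attend(query, context):
    """Dot-product attention: a weighted sum of the context tag embeddings."""
    scores = context @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over the context positions
    return weights @ context, weights

summary, weights = attend(query, context)
print(weights)   # two attention weights summing to 1
```

The resulting `summary` vector is what a cross-level flow would concatenate with the segmentation and word embeddings at the POS layer.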


Table 1. Experimental results for different joint models


Table 2. Comparison of Turkish dependency parsing results with other models


Table 3. Comparison of Turkish POS tagging results with other models


Table 4. Comparison of Turkish morphological tagging results (FEATS) with other models


Table 5. The results of cross-level contextual information flow on Turkish


Table 6. The results of the joint model on other languages (English, Czech, Hungarian, and Finnish) with (w) and without (w/o) cross-level contextual information flow. Bold values are the highest scores for the given language


Table 7. Comparison to other models on English, Czech, Hungarian, and Finnish


Table 8. LAS by length


Table 9. UAS by length


Table 10. Statistics for the test partition of the Turkish IMST Universal Dependencies treebank


Figure 9. Dependencies in Turkish relative clauses (in brackets) and with the relativized noun (in italics). They are (a) root-clause object relativization; (b) root-clause subject relativization; (c) relativization out of possessive NPs, here from “man’s car”: “adam-GEN.3s araba-POSS.3s.” Dotted edges indicate the dependency that the “acl:rel” relation is expected to capture.


Figure 10. Unbounded dependencies in Turkish relative clauses (in brackets) and with the relativized noun (in italics). They are (a) long-range subject relativization and (b–c) long-range object relativization. Dotted edges indicate the dependency that the “acl:rel” relation is expected to capture.


Table 11. Results for projective and non-projective sentences


Figure 11. Projective and non-projective tree percentages grouped by sentence length.


Table 12. UAS by coarse POS categories


Table 13. LAS by coarse POS categories


Table 14. Pipeline ablation results


Table A1. LAS by dependency relation


Table A2. UAS by dependency relation