A Note on Misplaced or Wrongly Attached zu in German

This paper deals with the misplacement of the infinitival marker zu ‘to’ in German. While this phenomenon only occurs in certain configurations in the standard language, such as auxiliary fronting, it is common in dialects and shows quite a high degree of variability. I discuss the misplacement of zu in Standard German due to auxiliary fronting, as well as other types of zu-misplacement found in dialects. I propose two parsimonious options for the analysis of the standard language as well as dialect cases, namely, (i) precedence rules and (ii) a special kind of infixing operation that was first proposed in the framework of Categorial Morphology (Bach 1984, Hoeksema 1985). I show that even though the first approach has its merits, the second one is more advantageous.


Introduction.
One of the more memorable quotes by Thorsten Legat, a German ex-football professional notorious for his clumsy style of speaking, goes like this: German is an awful language, at least when it comes to its infinitival morphosyntax. It is not surprising, then, that even speakers less prone to spoonerisms than Thorsten Legat run into trouble in this domain, be it with the parsing difficulties that can be encountered with nested or crossed dependencies in ECM-constructions (Bach et al. 1986) or with mysteries such as the long passive, as in 1. Such examples show case conversion of the embedded object when the matrix verb appears in the passive.¹ This construction shows very high variance in its general acceptability; in particular, judgments vary significantly as to which matrix predicates are acceptable in this construction.
(1) a. wenn Karl den Wagen zu reparieren versucht
       if Karl the car.ACC to repair tries
       'if Karl tries to repair the car'
    b. wenn der Wagen zu reparieren versucht wird
       if the car.NOM to repair tried becomes
       'if one tries to repair the car'
    (Höhle 1978:176)

¹ Crossed dependencies have figured prominently in the theory of grammar since they were taken as evidence that context-free grammars are not powerful enough to express all syntactic dependencies that can occur in natural languages (see Shieber 1985 on Swiss German). This phenomenon led to the construction of a family of new grammar formalisms, mildly context-sensitive grammars, enriched with additional mechanisms, such as function composition, that go beyond the capacity of context-free grammars. In fact, such a device, namely, wrapping rules, is also used in the analysis sketched in this paper.

This paper addresses one such challenging issue, namely, the placement of the infinitival marker zu 'to' in several dialects of German, and also sheds light on what the best analysis of zu in Standard German should be. I am not offering a thorough analysis of zu and all the intricacies associated with its use. Rather, I want to share a new empirical observation and sketch an idea of what a proper analysis of this phenomenon might look like. In a nutshell, the basic generalization is the following: zu is a functional morpheme that can be handed down from the immediately dominated verb of a verbal chain to its next dependent. In technical terms, there are two simple tools to capture this insight, namely:

(i) Precedence statements in their original form, as introduced in Generalized Phrase Structure Grammar (GPSG; Gazdar et al. 1985). This means that dominance (as a hierarchical relation) is dissociated from precedence (as a string-based, linear notion).
(ii) A special kind of infixation operation that handles the (mis)placement of zu. Such an approach was first developed in the context of Categorial Morphology (Hoeksema 1985), in particular by Bach (1984) and Hoeksema & Janda (1988), and proved to be useful beyond the realm of pure morphology.
These tools remain well within the boundaries of a restrictive and formally explicit treatment of (morphological) displacement phenomena. Precedence rules can even be stated for context-free grammars (even though they soon reach their limits), and the wrapping rules for infixation discussed below merely represent a mildly context-sensitive add-on. The choice of one of these two options is mainly dependent on where one wants to draw the line between syntax and morphology. While there is sufficient empirical evidence for treating zu as a syntactically independent element sensitive to constraints on linearization, it might be sensible to keep other displacement phenomena inside the realm of inflectional morphology. The remainder of this article is structured as follows: First (section 2), I discuss the basic empirical facts about zu 'to' in Standard German and dialectal varieties such as Alemannic and Hessian. I also turn to other displacement phenomena that can occur in the morphological domain. Then (section 3), I elaborate on some of the technicalities associated with the proper treatment of these phenomena. In section 4, I offer some thoughts on whether certain cases of the phenomenon under discussion might constitute exploratory expressions in the sense of Harris & Campbell 1995, that is, forerunners of a new grammatical construction. The final section wraps up the main findings of the paper.

The Basic Facts.
Let me now take a closer look at the syntactic behavior of the infinitival marker zu 'to' in German and its dialects. Bech (1955:13) was the first to notice that this element, contrary to what the convention of treating it as a separate orthographic word might suggest, actually fits better within inflectional morphology, as an affix. In current theoretical approaches to German sentence structure, this seems to be the majority position (Vogel 2009:327, note 15). An analysis along these lines is supported by the data in 2 (see Haider 2010:272-273).
(2) a. Er schien gleichzeitig [zu lachen und *(zu) weinen]
       he seemed at-the-same-time to laugh and to cry
    b. He seemed to [laugh and cry at the same time]
    c. anzufangen 'to begin' (lit. 'on=to=catch'); angefangen 'begun' (lit. 'on=ge=catched')

The contrast between 2a and 2b shows that zu in German is obligatorily realized in both conjuncts in coordinations (Bech 1955 refers to this restriction as Statuskongruenz 'status agreement'). In English, by contrast, where the status of (cognate) to as a particle is uncontroversial, this restriction is not operative. In addition, 2c shows that zu and the participial prefix ge- appear in the same structural position in particle verb constructions, that is, between the stem and the (putative) particle. Several other arguments in favor of the German infinitival marker being an affix are discussed by Haider (1993:234-236). These arguments are based on differences between zu and its English counterpart to, which is usually analyzed as an exponent of a functional head position ($I^0$ or $T^0$). In English, but not in German, the negation particle as well as adverbs can intervene between to and the VP, as in 3a,b; 3c shows that in VP ellipsis contexts, the particle must be retained.²

(3) a. He was careful to not destroy the atmosphere.
    b. He tried to carefully disentangle the complex argumentation.
    c. They are [VP laying eggs now], just like they used to [VP _].
    (Haider 1993:234, examples 2a,b,e)

Sporadic older analyses of zu as a functional head have proven to be unconvincing on the empirical level (see the discussion by Haider 2010:273-274), yet this assumption still has its advocates; see, for example, Hinterhölzl (2006:157-158, 2018), who analyzes zu as an aspectual head, and Salzmann (2016), who assumes that zu is a functional head without making particular claims as to its semantic content or contribution.
Of course, in a grammar-theoretic setting where lexical integrity is lifted (which seems to be the standard assumption within the generative mainstream) and even bound morphemes can be considered as syntactic heads, the distinction between functional and lexical categories is somewhat blurred. Thus, the question boils down to which kind of functional category zu is exactly and whether it constitutes a bound or a free morpheme.
Another question, which shall not concern me any further, is whether zu is syntactically active or just ornamental, as has been assumed for nonfinite inflectional markers in general (Sternefeld 2006:92, Rathert 2009). As far as I know, Haider (1984) was one of the first to propose that zu blocks the designated argument in coherent infinitive constructions, and in so doing he also offered a natural explanation for modal sein-passives, as in 4a. With haben-passives, however, he has to assume that deblocking is possible, as in 4b.
(4) a.

In the same vein, Rapp & Wöllstein (2009) distinguish between two variants of zu: one that is responsible for the referential anchoring of complements of factive and propositional verbs and one expletive variant incorporated into $V^0$. Thus, the idea that the infinitival marker, somewhat orthogonally to its morphological status, is a syntactically (or also semantically) active element still has its advocates.³

Let me return to the affix analysis. A problem for this view is posed by data such as 5a,b: They show that in Standard German, the zu-marking is confined to the right edge of the verbal complex. When processes such as fronting of the temporal auxiliary occur, for example, in substitute infinitive constructions (commonly referred to as IPP, that is, infinitivus pro participio), the affix is handed down to the highest verb of the remaining verbal complex, as in 5b. As a result, zu appears on the wrong verb stem, which is unexpected for an affix. This process can be stated in terms similar to Chomsky's (1957) affix hopping mechanism or some alternative device like the one proposed in this paper (see below). This restriction, that is, zu being confined to the right edge, is one of the sources of the so-called Skandalkonstruktion 'scandal construction', exemplified by 5c, where each verb in the right periphery bears the wrong (that is, an unexpected) morphological marking (see Reis 1979, Vogel 2009, Haider 2011, Gaeta 2013).

(5) c. ohne gesungen haben zu können
       without sung.PCPT have to can.IPP
       'without having been able to sing'
    (Vogel 2009:325, example 37)

³ One reviewer correctly notes that zu 'to' can be regarded as syntactically active in other respects as well, for example, by licensing a PRO subject. Of course, there are analyses of control infinitives that do not require this assumption (for example, in an HPSG setting), yet the fact remains that there are several observations that point to this element being more than just a morphological ornament, so to speak.
Remarkably, Dutch is not subject to this restriction, as the contrast between 6a and 6b shows (examples taken from Bech 1963:291-292). The syntactic inertness of zu, which was first mentioned by Merkes (1895), was integrated into Bech's (1955, 1957) topological model of German infinitival constructions and used as a piece of evidence that the occurrence of an upper field is an indicator for coherence.

(6) a. Ich glaube es haben tun zu können. Standard

Reis 1979), Vogel (2009:324) takes 5a,b as an empirical hint for analyzing zu as a phrasal affix that is attached to the last verb of the verbal complex. In his opinion, the first status (simple infinitive) and the third status (participle) belong to word morphology, whereas the second status (zu-infinitive) reflects a morphological property of the verb phrase.⁴

A consideration of German dialects and diachronic facts reveals that misplaced zu is not restricted to perfective contexts (with or without IPP). In the second volume of Otto Behaghel's German syntax, quite a variety of structural types can be found (see Behaghel 1924:308-309). Apart from more regular cases of misplaced zu caused by auxiliary fronting, as in 7a, one also finds configurations where zu attaches to the wrong verb without any reordering having taken place, as in 7b. Further examples of this type from Early New High German can be found in Ebert et al. (1993:397), thus showing that it is a regular grammatical pattern. Finally, as documented by 7c, there are also certain interactions with other dialectal constructions, most notably particle splits that occur in older stages of German and several contemporary dialects (see Schallert & Schwalm 2015 for an overview).

    (Bader 1995:22)
    b. Schämsch di nüüd cho z bättle?
       shame-∅ REFL not come to beg
       'Aren't you ashamed of having come here begging?'
       (Weber & Dieth 1987:244, note 1)

Even though misplacements of zu have mainly been reported for Alemannic dialects, they are also found in other varieties. Further examples from different German dialects (mainly from the central region) are cited in Höhle 2006:67-68. In a survey on particle splits in Hessian dialects that Johanna Schwalm and I conducted, we also found examples of misplaced zu, both in simple cases, such as 9a, and in interaction with particle stranding, as in 9b, the latter corresponding in structural terms to example 7c above.
Note that structures such as 7b above also occur, where zu is attached to the left verb in a left-branching structure (which is assumed to be the base order in a Germanic OV language). An empirical survey of 94 speakers conducted by Schallert (2012) yielded six examples of this structure in Vorarlberg Alemannic, as in 10a; an analogous, albeit sporadic example could also be found in Southern Bavarian, as in 10b.
(10) a. Er ist lieber humplig ham glofa,
        he is rather limping home walked
        als sich vo mir zfahra lo.
        than REFL from me to=drive let
        'He rather walked home limping than let himself be driven home by me.'
        (ID 58; 62/w, Satteins, Vorarlberg)
     b. Mei Våta glap z'gwing kinn
        my father believes to=win can
        'My father believes he is able to win.'
        (St. Veit in Defreggen, Eastern Tyrol; Mayerthaler et al. 1995:55)

Further examples of this construction from a West Central and a Low German dialect are given in 11. Example 11a from Frankfurt shows doubling of zu, once in its regular position to the right, once displaced to the left. Thus, the verb gelasse appears in the typical prefixed infinitive construction selected by certain verbs in West Central German dialects (mainly modals; brauchen 'need' shows a high affinity to this verbal class), alongside the anomalous zu-marking. Note that the Frankfurter Wörterbuch, the source for this example, states that zu appears "häufig in Verdoppelung" 'frequently in doubling' (Brückner 1988:3650), so there can be no doubt that this construction represents a regular grammatical pattern and is not just a production error. Another example of this type, given in 11b, comes from the urban dialect of Berlin. Example 11c is from North Lower Saxon.
(11) a. ich brauch merr deß net zu gefalle zu gelasse
        I need me.DAT that not to please to let
        'I don't need to put up with that' (Brückner 1988:3651)
     b. det brauch er sich nich zu jefallen zu lassen
        that need.3.SG he REFL not to please to let
        'that he needn't put up with that' (Schildt & Schmidt 1986:241)
     c. Und nun sind wir dann wieder angefangen
        and now are we then again started
        eine Neuuberschlickung da vonstatten zu gehen lassen.
        a new.over.mudding there pass.off to go let
        'And now we have again started to pass off an overflow with mud.'⁶
        (ZW1Q3; Averlak, Schleswig-Holstein)

In light of the diachronic and dialect data, there is sufficient evidence that zu mostly attaches to the rightmost verb in the verbal complex, yet in some cases it is handed down to the immediately preceding verb. This means that the long-held generalization (since Merkes 1895), which is also maintained by Gaeta (2013:584) and Salzmann (2016:409, 2019:11), is not entirely correct.⁷

A short typological digression: Misplacement of te is also reported for dialectal/regiolectal varieties of Dutch, as the following example (taken from Pots 2017:128) shows. It features the Dutch progressive construction with the verb zitten 'sit', which selects a te-infinitive; the te-marking can surface on any verb in the right periphery. However, there is considerable variation in terms of the overall acceptability of this positional variability and in terms of the specific contexts in which it can apply. Pots takes this variation as sufficient evidence for a bipartite analysis of te. For speakers who only allow the in situ variant (where the infinitival marker appears on the expected verb, that is, wachten selected by zitten), it acts as a prefix. Conversely, the dislocation configurations are analyzed as instances of clitic climbing (familiar from restructuring verbs in Romance languages such as Italian).
A closer parallel to the misplacements in the German dialects I have presented can be found in Afrikaans, Flemish, and certain varieties of Dutch, where te seems to be able to appear right in front of the whole verbal complex (see Salzmann 2019:43-44 for several examples). Turning back to German and summarizing the data presented so far, one is faced with a somewhat blurred picture: While the various syntactic positions of zu (particularly in the dialects) point to the conclusion that it is a syntactically active element, the coordination facts hint at its status as a prefix (see also Salzmann 2019:38 for some discussion). Note, in passing, that the situation in Dutch is comparable (see Zwart 1993:104

As the examples in 16 from the Early New High German period show, this kind of variation seems to have its roots in older stages of German.

(16) a. das ain yeglicher widersach/ vndersteet seynen wiedersacher
        that a each opponent desists his opponent
        zu belaydigen. beswaren vnd zu raitzn̄
        to insult burden and to irritate
        'that each opponent desists from insulting, burdening, and irritating his opponent'
        (Geiler, Predigten teütsch 144a; from Ebert et al. 1993:397)
     b. der gewonet auch die leute zu reissen und fressen
        who is.used.to also the people to seize and devour
        'who is also used to seizing and devouring the people'
        (Luther, Ez. 19,6; from Haspelmath 1989:297)

In his general grammaticalization scenario that describes the progression from the allative preposition to the infinitive marker, Haspelmath (1989:297) treats the reduction of an item's scope as one of the common grammaticalization parameters. He then takes data such as 16 to indicate reduction of the structural scope of zu (see Lehmann 2015, chapter 4): Whereas it is able to attach to bigger syntactic domains, namely, phrasal conjuncts, in this era, it gradually turns into an element attached to single stems (that is, an affix).

Other Displacement Phenomena.
In his seminal paper on substitutes in the system of nonfinite morphology, Höhle (2006) shows that the examples of the wrongly attached infinitival prefix discussed so far are but an instance of one of several morphological displacement phenomena that occur in the context of complex predicates. Another example can be seen in 17. It is from an East Central German dialect in which werd- 'become' (waen in 17) normally selects a so-called gerundial form of the infinitive suffixed by -e(n), which goes back to an inflected form of the infinitive in the Old High German/Middle High German era. However, in cases where the dependent of werd- 'become' itself embeds another verb, as in 17, the expected gerundial form of the infinitive is replaced by the special substitute form müd 'must'. The gerundial suffix -e(n) required by werd- now appears on the dependent of müd, in this case glün 'sue'. Höhle refers to this form as "supine" since it differs from the regular past participle by truncation of the participial prefix and by its occasional vowel alternations.⁹

(17) mə waen müd glün
     we will must.SUP sue
     'we will likely have to sue'¹⁰
     (Kleinschmalkalden, Thuringia; Dellit 1913; cited in Höhle 2006:66)

Typically, examples of this construction are found in perfective contexts such as 18, which feature the modal verbs müssen 'must' and dürfen 'be allowed to' (the latter is obviously derived from a different ablaut grade than the regular participle); however, there are also examples of this construction in future and passive contexts.¹¹

(18) a. ij håwe musd gi:e
        I have must.SUP go.GER
        'I had to go' (regular participle: gemusd)
        (Oberschwöditz [Trebnitz], Saxony-Anhalt)
     b. du håsd darfd driŋke
        you have been.allowed.SUP drink
        'you were allowed to drink' (regular participle: gedorfd)
        (Trebs 1899; cited in Höhle 2006:57-58)

Let me now return to example 17 above: Even though the gerundial form required by werd- is not realized by müss, it appears on its immediate dependent, glün 'sue' (as shown by the suffix -n instead of the bare infinitive, which shows no suffix in this dialect). Thus, morphological selection requirements are passed down to the next verb, very much the same as with the zu-cases discussed earlier.
A further level of displacement is represented by cases where the most deeply embedded verb satisfies the selectional requirements of both its superordinate verbs, as is shown with the Alemannic example in 19 from Bernese German quoted by Höhle (2006:70). Here, the zu-marked infinitive z'häuffe 'to help' can be interpreted as simultaneously fulfilling the requirements of schiint 'seems' and probiere 'try'. Against the background of the cases of zu-doubling I presented above, one might also wonder whether this example results from syntactic haplology.
(19) dr Hans schiint sine Frunde probiere z'häuffe
     the Hans seems his friends try to=help
     'Hans appears to try to help his friends' (Bader 1995:22)

Further cases of this phenomenon are discussed by Salzmann (2016:428-432, 2019); an appropriate example from Early New High German is quoted in Behaghel 1924:308. The Duden volume mentioned earlier recommends that cases of haplology such as 20, where only one of two infinitives bears the zu-marking, should be avoided (Hennig 2016:1060).
(20) Ich hoffe mich §(zu) erkennen geben zu können.
     I hope me.REFL to recognize give to could
     'I hope to be able to reveal myself.'

Finally, and somewhat orthogonally to the cases I have discussed so far, detachment phenomena can also be observed with finite forms. Famous examples come from Swabian (for example, Steil 1989 and references quoted therein) or East Franconian (Heyse et al. 2007:439), where the finiteness features in complex predicates can occur on the embedded instead of the embedding predicate; this effect is reported for the benefactive verb helfen 'help' (as in 21) and the phase predicate anfangen 'begin' (see also Schallert 2014a:192

(Steil 1989:41)

Morphological displacement with finite forms remains an understudied subject even though it is crucial for a deeper understanding of morphological mismatches triggered by syntactic processes.

Generalizations About zu and Displaced Morphology.
In light of the data that he compiled, Höhle (2006:73) states a generalization about displacement phenomena similar to the ones discussed here. In his view, they are word order-sensitive: They are blocked in left-branching configurations, as in 22a, whereas they occur freely in right-branching ones, as in 22b. As I demonstrated in the preceding section, there is counterevidence to this generalization, at least when it comes to the behavior of the infinitival marker zu.

Höhle (2006:73-74) takes this generalization to hold in disharmonic configurations as well, that is, syntagmas that show partially right-branching and partially left-branching orders, as long as the relevant segment is right-branching. Thus, of the serializations schematized in 23, transfer of V2's selectional requirements onto V3 would be blocked in 23a,b, while being licensed in 23c.
However, this corollary also runs into trouble. One famous instance of the scandal construction, quoted in 5c above and repeated in 24, also features a disharmonic word order, namely, 3-1-2, yet it only partially corresponds to Höhle's generalization. While the displacement of zu applies within a right-branching segment, namely, ⟨haben können⟩, the other relevant segment, ⟨helfen können⟩, is clearly left-branching.
(24) ohne gesungen haben zu können
     without sung.PCPT have to can.IPP
     'without having been able to sing' (Vogel 2009:325, example 37)

More precisely, the relevant generalization seems to be that under certain conditions, a syntactic element $X_n$ that governs a second status (zu) can transfer its selectional requirements to $X_{n+1}$, the category it immediately dominates, a process that is schematically visualized in 25.
The way in which I formulate this generalization is inspired by Höhle (2006), yet my version is less restrictive. Branching direction does not seem to be the relevant factor, as misplacement occurs in left- as well as in right-branching configurations, as shown in 26a and 26b, respectively. Formation of an upper field, that is, fronting of the governing category to the leftmost position of the verbal complex, poses no obstacle for the transfer of zu in 26c, nor does the occurrence of nonverbal interveners (verb projection raising), as can be observed in the Swiss Alemannic example in 27 (from Salzmann 2013b:77).

Let me summarize the discussion in this section: zu-infinitives show unexpected behavior in that they can be misplaced both to the left and to the right within the verbal complex. Such behavior seems to be absent in other areas of infinitival morphology, however, with the exception of the scandal construction (see Salzmann 2019:11-15 for a detailed discussion).

What is the Proper Analysis of zu?
Ever since Höhle's (2006) important contribution, there has been a revived interest in morphological mismatches in the right nonfinite domain, the zu-anomaly just being a small piece of the puzzle. Since the main contribution of my paper is empirical, I do not deal with the specifics of different approaches (see Salzmann 2016:19-23 and in particular, Salzmann 2019 for a recent overview). What is more, those approaches are all problematic in the sense that they are based on the following two assumptions, which have been contested by the data quoted in the preceding section: (i) zu attaches to the rightmost verb of the verbal complex; (ii) Misplaced morphology only occurs in right-branching configurations.
So back to the drawing board. What is the easiest way of capturing the generalization that zu can be handed down to the next dependent verb? Directionality comes into play as a (micro-)parametric option, because this one step can either apply to the right (which seems to be the more common option) or to the left (the less common option).¹² The answer to this question is twofold: First, I discuss precedence statements as a technical means to deal with the (mis)placement of zu (section 3.1). Second, as a more powerful alternative for handling this phenomenon, I use the infixing operations introduced by Bach (1984) as an analytical tool (section 3.2). Finally (section 3.3), I discuss Salzmann's (2013b, 2016) approach to how zu and other cases of misplaced morphology might be treated and address some open problems with his analysis.

Precedence Rules.
The first explicit formalization of precedence rules can be found in the context of GPSG even though attempts at such formalization had been made before (see Gazdar et al. 1985, chapter 3). The basic approach consists of reformulating a context-free production rule such as 28a as an immediate dominance (ID) rule in the format of 28b. The crucial difference between the two formats is that the latter formulation does not make any claims about the linear ordering of the nodes on the right-hand side of the rule, that is, any of the $n!$ permutations of the nodes $B_1, B_2, \ldots, B_n$ is licensed. In their original form, precedence statements are restricted to local trees, that is, a single mother node plus all the nodes it immediately dominates.
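The contrast between the two rule formats can be made concrete with a minimal sketch. It assumes that the daughters of an ID rule are pairwise distinct and represents them as a plain list; the function name and the labels B1 to B3 are placeholders introduced here for illustration, not notation from the GPSG literature.

```python
from itertools import permutations

def linearizations(daughters):
    """Return every ordering of the daughters of an ID rule.

    With no precedence statements in force, an ID rule with n distinct
    daughters licenses all n! orderings of its right-hand side.
    """
    return [list(order) for order in permutations(daughters)]

# Hypothetical ID rule with mother A and daughters B1, B2, B3.
orders = linearizations(["B1", "B2", "B3"])
for order in orders:
    print(order)
print(len(orders))  # 6, that is, 3!
```

A context-free production, by contrast, would correspond to exactly one of these lists.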
As Gazdar et al. (1985:44-45) note, statements like these are part of the definition of the set of trees a particular context-free phrase structure grammar permits. Additional (linear) precedence rules are introduced as local relations between the nodes on the right-hand side. I now give the precise definitions of these concepts in 29, which are slightly adapted from Klenk 1985:39.

¹² Qualifying cases, such as the ones already given in Schallert 2012:252, as "very rare exceptions" (Salzmann 2016:9) seems premature, at least to me. If there is an agreement that the zu-anomaly is a phenomenon in its own right, not just a "grammatical illusion" (Haider 2011), then its directionality ought to be taken seriously.

(29) Definition 1. An ID/LP syntax is a 5-tuple $(V_{NT}, V_T, ID, LP, S)$, where $V_{NT}$, the set of nonterminals, and $V_T$, the set of terminals, are vocabularies with $V_{NT} \cap V_T = \emptyset$. $S$ is the starting symbol, $ID$ the set of immediate dominance rules, and $LP$ the set of linear precedence rules.

Definition 2. An ID rule is a finite, nonempty set of pairs of the form $(A, \langle A_1, \ldots, A_n \rangle)$ with $n > 0$ or $(A, \langle \ldots \rangle)$ (deletion rule), where $A \in V_{NT}$ and $A_i \in V_{NT} \cup V_T$ for $1 \leq i \leq n$. Alternatively, one can notate such rules as $A \Rightarrow \langle A_1, \ldots, A_n \rangle$ or $A \Rightarrow \langle \ldots \rangle$.

Definition 3. A linear precedence rule (LP) is an asymmetric relation $R$ on $V_{NT} \cup V_T$. This means that for each $x, y \in V_{NT} \cup V_T$, $x R y$ implies $\neg(y R x)$. In addition, this relation is transitive, meaning that for any $z \in V_{NT} \cup V_T$ with $x R y$ and $y R z$, $x R z$ also holds. I denote this relation by $\prec$ and its inverse ($R^{-1}$) by $\succ$.

Klenk (1985:40-41) proves an interesting result with regard to the formal complexity of an ID/LP syntax, showing that the sets of context-free languages $L_{CF}$ and those of $L_{ID/LP}$ languages have the same cardinality. However, this does not mean that the two types of underlying grammars are equivalent. In general, it is not possible to devise an equivalent ID/LP syntax for a given context-free syntax directly, that is, without conversion into a modified context-free syntax (ibid.).

Let me now proceed to an analysis of the zu-facts in terms of precedence rules. Linearization statements have been applied to word order properties of languages such as German in general (Kathol 2000) and to complex predicates in particular (Müller 2002). An open question in the context of this problem is how flat or layered the verbal complex is. For instance, the observation that scope-sensitive material occurring within the verbal complex domain (as in verb projection raising structures) seems to allow only narrow readings has been taken as evidence for a layered structure (Haegeman & van Riemsdijk 1986, Salzmann 2011), yet there is also counterevidence (see Schallert 2014a, section 3.2.2 for some discussion). With regard to the special case of the infinitival marker, however, there is no indication that word order variation is associated with differences in interpretation (see Salzmann 2019:21-22). The same holds true for split infinitives in English, albeit for independent reasons, of course: to is a functional head and thus always scopes over the VP.
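Returning briefly to the definition of linear precedence rules, the two properties it requires (asymmetry and transitivity) can be checked mechanically. The following is a minimal sketch under the assumption that an LP relation is stored as a set of ordered pairs (x, y), read as "x precedes y"; the function names and the category labels A, B, C are hypothetical and serve illustration only.

```python
def is_asymmetric(rel):
    """True if no pair (x, y) in rel has its converse (y, x) in rel."""
    return all((y, x) not in rel for (x, y) in rel)

def is_transitive(rel):
    """True if (x, y) and (y, z) in rel always come with (x, z) in rel."""
    return all((x, z) in rel
               for (x, y) in rel
               for (w, z) in rel
               if y == w)

def transitive_closure(rel):
    """Return the smallest transitive relation that contains rel."""
    closure = set(rel)
    while True:
        derived = {(x, z)
                   for (x, y) in closure
                   for (w, z) in closure
                   if y == w}
        if derived <= closure:
            return closure
        closure |= derived

# Hypothetical precedence statements: A before B, B before C.
lp = {("A", "B"), ("B", "C")}
print(is_asymmetric(lp))                      # True
print(is_transitive(lp))                      # False, ("A", "C") is missing
print(is_transitive(transitive_closure(lp)))  # True
```

Stating only the basic pairs and computing the closure keeps a rule set short; whether a given grammar fragment lists the derived pairs explicitly is a presentational choice.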
Note that the approach by Salzmann (2013b, 2016) makes use of linearization statements as well, yet they require quite complex background assumptions: Zu is assumed to be the exponent of a head-final functional projection, and displacement is the effect of local dislocation (in the framework of Distributed Morphology, see Embick & Noyer 2001). Ironically, this approach is not powerful enough because it ignores the misplacements to the left, for which I have given sufficient empirical evidence. Although I fully agree that a linearization approach to zu is on the right track, it can be stated in much simpler terms while still covering much of the relevant data. By reducing precedence rules to the bare bones, so to speak, it is easier to adapt or extend them, thus fitting them to the syntactic model of one's choice.
In the following, I show how the most common serializations can be derived with an ID/LP-syntax. First, the question is how Gazdar et al.'s (1985) notion of a local tree in the above sense can be sensibly applied to the case at hand. As linearization domain (LD) or local tree I consider all verbal heads of the VP-domain, including zu/te (and perhaps other infinitival markers), irrespective of what exact hierarchical relations might hold between them.
(30) LD ⇒ V_1 V_2 V_3 ... V_n
Let me take the three main serializations with respect to the positioning of zu from 26, which are illustrated with the same lexical material in 31. For the time being, I treat the regular placement of zu as in 31a on par with the stranding case in 32. The latter structure results from fronting the auxiliary in the context of the substitute infinitive construction, but I only consider the placement of zu, the modal können 'can', and the lexical verb helfen 'help'. Going back to 31, I am interested in the positions of three elements: the ECM-verb lassen 'let', which I regard as belonging to the category Mod; zu 'to'; and the lexical verb fahren 'drive'. This means that LD := {V, zu, Mod}. The latter label, Mod, covers all verbs that are able to enter a selectional relation with other verbs, that is, show "status government" in Bech's (1955:12) traditional terminology, but are not auxiliaries. Example 31a represents the Standard German system with zu at the rightmost end of the verbal complex, 31b the system of Swiss German and other dialects with dislocation to the right, and 31c the mirror-image counterpart, as represented, for example, by Vorarlberg Alemannic. 13
(31)
The Standard German system can be derived with the precedence rules in 33. LP_1 and LP_2 alone are powerful enough to capture the serializations in 31a,c, which is incidentally the system of Vorarlberg Alemannic: alongside displacement to the left, the Standard German serialization is always possible in this variety (see Schallert 2012, section 8.3.2 for an overview). Of course, ungrammatical serializations, for example, ⟨Mod, V, zu⟩, are ruled out due to LP_2 in the present case.
(33) a. LP_1:
For the system of Swiss German (and other varieties with dislocation to the right) the precedence rules in 34 are needed. Note that LP_4 and LP_5 in 34a,b are the exact mirror image of LP_1 and LP_2 in 33a,b. Once again, ungrammatical patterns are banned by these precedence rules, for instance ⟨zu, Mod, V⟩, due to LP_5.
(34) a. LP_4: Mod ≺ V
     b. LP_5: Mod ≺ zu
     c. LP_6: zu ≺ V
As previous examples have shown, it is not so difficult, with the aid of precedence rules, to establish the correct serialization patterns of zu. However, an analysis along these lines soon runs into trouble with more complex configurations. Consider the misplacement caused by auxiliary fronting in 32 above. Without additional precedence rules for the placement of the auxiliary, there is the problem of overgeneration because ungrammatical serializations such as 35 are not blocked by the rules stated in LP_1 to LP_3.
(35) a. *helfen zu können haben (V zu Mod (Aux))
     b. *helfen zu haben können (V zu (Aux) Mod)
A quite natural solution to these problems would be to posit more elaborate precedence rules, for example, zu ≺ V_n, which translates as "zu always has to precede the verb with the highest index (that is, the most deeply embedded verb)". However, such a rule cannot be stated in the context-free format I introduced in this section. Another obvious problem is posed by the zu-doubling cases discussed in section 2.1. Apart from the fact that precedence rules cannot introduce new material, it is very difficult to formulate appropriate precedence rules for both tokens of zu.
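As a rough illustration of how precedence rules of this kind filter serializations, the following Python sketch enumerates all orderings of a linearization domain and keeps those that violate no rule. The rule set LP_34 is taken directly from 34. The rule set LP_1_2 is an assumption: it is reconstructed solely from the remark that LP 4 and LP 5 are the mirror images of LP 1 and LP 2. The function names and the encoding of a rule a ≺ b as a pair (a, b) are purely illustrative.

```python
from itertools import permutations

# Linearization domain from the text: LD := {V, zu, Mod}
LD = ("V", "zu", "Mod")

# Rule set 34 (dislocation to the right); each pair (a, b) encodes a ≺ b.
LP_34 = [("Mod", "V"), ("Mod", "zu"), ("zu", "V")]

# ASSUMPTION: LP 1 and LP 2, reconstructed as mirror images of LP 4 and LP 5.
LP_1_2 = [("V", "Mod"), ("zu", "Mod")]


def satisfies(order, rules):
    """Return True if a precedes b in `order` for every rule (a, b)."""
    return all(order.index(a) < order.index(b) for a, b in rules)


def licensed(domain, rules):
    """Return all serializations of `domain` that violate no rule."""
    return [p for p in permutations(domain) if satisfies(p, rules)]


right_dislocation = licensed(LD, LP_34)
standard_and_left = licensed(LD, LP_1_2)

# Adding Aux without any rule that mentions it lets it surface anywhere,
# so the starred orders in 35 are not filtered out.
with_aux = licensed(LD + ("Aux",), LP_1_2)
overgenerated = [("V", "zu", "Mod", "Aux"), ("V", "zu", "Aux", "Mod")]

print(right_dislocation)   # [('Mod', 'zu', 'V')]
print(standard_and_left)   # [('V', 'zu', 'Mod'), ('zu', 'V', 'Mod')]
print(all(o in with_aux for o in overgenerated))  # True
```

Under these assumptions, the three rules in 34 single out ⟨Mod, zu, V⟩, whereas the two reconstructed rules admit exactly the two serializations ⟨V, zu, Mod⟩ and ⟨zu, V, Mod⟩. The last line shows the overgeneration problem in miniature: both starred orders of 35 survive as long as no rule constrains the auxiliary.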

Morphosyntactic Infixing Operations.
In the previous section, I showed that the basic patterns of the zu-anomaly can be treated adequately with the aid of precedence rules. It became apparent, however, that such rules soon reach their limits when confronted with the great range of variability in the verbal complex. What is more, an approach along these lines cannot cover cases of zu-doubling. I now want to propose an alternative analysis of the zu-facts in terms of a special kind of infixation. Such an approach was first developed in the context of Categorial Morphology (see the overview in Stewart 2016:22-26). This analysis was originally proposed for dealing with verb raising constructions in Dutch, but it can also be easily extended to the phenomenon under discussion here. Bach (1984) proposes several wrapping rules that operate on a string x of grammatical categories x_1 … x_n. 14 These operations were taken up by Hoeksema & Janda (1988:206-221) to analyze a wide variety of (morphological) infixation processes. Since I am interested solely in the process of prefixation, I focus on the relevant operations given in 36.
(36) a. LWRAP-pref(x, y) = (LREST(x) (y LAST(x)))
     b. RWRAP-pref(x, y) = (FIRST(x) (y RREST(x)))
These operations allow prefixing an element y either to x_n, the last category of x, as in 36a, or to the right rest of x, that is, the first element following x_1, as in 36b. Evidently, such devices are inspired by the typical string methods that are implemented in almost all modern programming languages. Taking Python as an example, the following code snippet splits the string into its first element and the rest. For completeness' sake, I also give the reverse operation in the last row of 37.
(37) >>> s = 'string'
     >>> s[:1], s[1:]
     ('s', 'tring')
     >>> s[:-1], s[-1:]
     ('strin', 'g')
14 In technical terms, one is dealing with a commuting combinator (Cfxy ≡ fyx) that permutes the arguments of a given functor category (Baldridge & Hoyt 2015:1065). This device extends the generative power of a categorial grammar to the level of so-called mildly context-sensitive languages (Vijay-Shanker & Weir 1994).
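Carried over from characters to lists of category labels, the same slicing idiom gives a minimal sketch of the two operations in 36. The list encoding, the use of a tuple to mark the prefix together with its host, and the helper flatten are assumptions made for illustration; they are not part of the original formulation of the wrapping rules.

```python
def lwrap_pref(x, y):
    """36a, (LREST(x) (y LAST(x))): prefix y to the last element of x."""
    return x[:-1] + [(y, x[-1])]


def rwrap_pref(x, y):
    """36b, (FIRST(x) (y RREST(x))): prefix y to the element following x_1.

    Needs at least two elements, since the right rest must be non-empty.
    """
    return x[:1] + [(y, x[1])] + x[2:]


def flatten(seq):
    """Spell out the linear order, dissolving (prefix, host) pairs."""
    out = []
    for item in seq:
        out.extend(item if isinstance(item, tuple) else (item,))
    return out


cluster = ["V1", "V2", "V3"]  # hypothetical three-verb string, as in 30

print(flatten(lwrap_pref(cluster, "zu")))  # ['V1', 'V2', 'zu', 'V3']
print(flatten(rwrap_pref(cluster, "zu")))  # ['V1', 'zu', 'V2', 'V3']
```

For a string of only two elements, the two operations yield the same linear order; they come apart once the string contains three or more verbs.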
The cases where zu attaches to the left, that is, to the first element of the verbal complex, can be handled by defining one further wrap operation that prefixes zu to the first element of the string x_1, …, x_n. I want to call this operation FWRAP; the definition is given in 38.
(38) FWRAP-pref(x, y) = ((y FIRST(x)) LAST(x))
Empirical motivation for such a rule comes from the observation that in Dutch, for instance, verb particles can be stranded at the left edge of the verbal complex, as shown in 39 (from Neeleman & Weerman 1993:435). Crucially, op still constitutes a part of the verbal complex in that no nonverbal interveners can be inserted between it and the following verb.
(39) a. dat Jan het meisje wil opbellen
        that John the girl wants PART=phone
     b. dat Jan het meisje op wil bellen
        that John the girl PART wants phone
        'that John wants to call the girl'
How do standard concatenative morphological operations such as prefixation or suffixation work in this framework? Hoeksema (1985:15) takes categories, simple or derived, to be represented as ordered triples according to the blueprint of 40, comprising a phonological (π_p), a categorial (π_c), and a semantic component (π_s).
(40) L := ⟨π_p(L); π_c(L); π_s(L)⟩
Affixation is handled via two directionally specified application rules; Hoeksema (1985:19) speaks of "cancellation". The categorial dimensions of Right cancellation and Left cancellation are listed in 41a and 41b, respectively. Ordinary zu-prefixation amounts to applying the affix as a functor to a suitable argument, whereby the phonological representations are concatenated (my discussion partly follows Stewart 2016:23). In categorial shorthand, this can be written as follows: V[zu]/V, V ⇒> V[zu]. Thus, a verb such as scheinen 'seem' in German subcategorizes for a category V with the morphological index [zu] (status government), which is itself a derived category. The zu-doubling cases mentioned in section 2.1 (one of them, from Frankfurt German, is repeated as 42) can be derived by a combination of simple application (X/Y, Y ⇒> X) plus FWRAP as defined above.
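To make the interplay of application and wrapping more concrete, here is a small Python sketch. It encodes categories as triples in the spirit of 40, implements right cancellation (X/Y, Y ⇒ X) with phonological concatenation, and applies FWRAP from 38 to the Frankfurt German string repeated as 42. The class Sign, the placeholder semantics, the lexical entries, and the order in which the two steps apply are all assumptions made for illustration. For the two-element string used here, LAST(x) in 38 coincides with the right rest of x, which is what the sketch keeps in place.

```python
from typing import NamedTuple


class Sign(NamedTuple):
    """Ordered triple as in 40: phonology, category, semantics."""
    phon: str
    cat: str
    sem: str  # placeholder only; no semantic claim is intended


def right_cancel(functor, argument):
    """Right cancellation X/Y, Y => X; phonology is concatenated."""
    result, slash, wanted = functor.cat.rpartition("/")
    if not slash or wanted != argument.cat:
        raise ValueError(f"{functor.cat} cannot apply to {argument.cat}")
    sem = f"{functor.sem}({argument.sem})"
    return Sign(f"{functor.phon} {argument.phon}", result, sem)


def fwrap_pref(x, y):
    """38: prefix y to the first element of x, keeping the rest in place.

    The category of the host is left unchanged here (an open point).
    """
    head = Sign(f"{y.phon} {x[0].phon}", x[0].cat, x[0].sem)
    return [head] + x[1:]


ZU = Sign("zu", "V[zu]/V", "ZU")          # zu as a functor over verbs
gefalle = Sign("gefalle", "V", "PLEASE")  # hypothetical lexical entries
gelasse = Sign("gelasse", "V", "LET")

# Step 1 (assumed order): ordinary zu-prefixation by application.
marked = right_cancel(ZU, gelasse)
# Step 2: FWRAP adds the second token of zu at the left edge.
doubled = fwrap_pref([gefalle, marked], ZU)

print(marked.cat)                         # V[zu]
print(" ".join(s.phon for s in doubled))  # zu gefalle zu gelasse
```

The sketch only shows that one application step plus one FWRAP step suffice to produce two tokens of zu in the attested positions; it makes no claim about which verb each token is licensed by.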
(42) ich brauch merr deß net zu gefalle zu gelasse
     I need me.DAT that not to please to let
     'I don't need to put up with that.' (Brückner 1988:3651)
The simple tools offered by Categorial Morphology are sufficient to capture the basic properties of zu in German dialects.

Morphological Displacement as Local Dislocation.
Salzmann (2016) proposes to treat the cases of morphological misplacement phenomena discussed in section 2.2 (the zu-anomaly being but one instance) as the effect of local dislocation in the sense of Embick & Noyer 2001. Whereas processes such as lowering operate on hierarchical structure, local dislocation "operates in terms of linear adjacency" (p. 561). The most famous instance of lowering is observed in languages such as English, where lexical verbs do not move to T°/I°; instead, the finiteness features of this head are realized on the verb, as shown by the contrast between 43a and 43b (Embick & Noyer 2001:562).
(43) a. Mary [TP t_l [vP loudly play-ed_l the trumpet]].
     b. *Mary did loudly play the trumpet.
Salzmann treats verbal complex formation as a PF-phenomenon that comes into play when the ordering of heads of nested verbal projections (as hierarchical representations) has to be determined. Starting with a right-branching base order as in 44a, adjacent heads can be rebracketed and inverted, as in 44b.
The same mechanism is now employed for the derivation of zu, yet there are different kinds of interactions between the two processes (see Salzmann 2013b). The basic idea is that zu heads a left-branching functional projection right above the VP-level, while the base order for the latter projection is taken to be right-branching, by contrast. In 45, the derivations of the different orderings of zu are listed: 45a would be the type of upper field formation discussed by Bech (1963), 45b the regular case with a completely left-branching configuration, and 45c a case of the scandal construction. Finally, 45d represents zu-dislocation to the right (as I discussed earlier, Salzmann does not consider the dislocation cases to the left; the same applies to doubling of the infinitival marker).
(45) a.
The crucial point is that zu-affixation operates after verb cluster formation (at least in this context): "By Local Dislocation, it is affixed onto and inverted with the closest, i.e. linearly adjacent verbal element" (Salzmann 2016:417). The simplest case would be 45d, which corresponds to the right-branching base order of verbal heads he assumes.
To my mind, an approach along these lines offers the possibility of modeling different cases of morphological dislocation in a uniform fashion. However, it comes at a high price because Salzmann makes quite a lot of auxiliary assumptions. First, it is by no means obvious why zu would constitute a left-branching functional head. As mentioned earlier, no claims as to its semantic contribution are made. Second, Salzmann (2016) also has to assume that, as a functional head, zu constitutes a morphological word (in the parlance of Embick & Noyer 2001:577-578) that adjoins to a segment of a complex head, thus a subword, which is in conflict with the requirement that only elements of the same morphological type can be adjoined (2016:417-418, note 9). To circumvent this problem, further technicalities have to be introduced which are in need of proper independent justification.
What is more, an analysis of zu as a functional head once again opens up Pandora's box, so to speak, in that all the problematic configurations (the very reason why such an analysis was dismissed) reemerge (see Haider 2010:273-274). To mention just one example (taken from Haider 2003:93), it is a well-known fact that VP can act as an extraposition site in German, as in 46a. However, in the right periphery, extraposed material has to follow the verbal complex as a whole, as can be seen from the contrast between 46b and 46c.
(46) a. [VP Gerechnet damit]_i hat sie nicht mehr e_i
        reckoned it=with has she not anymore
     b. *dass sie nicht mehr gerechnet damit hat
        that she not anymore reckoned it=with has
     c. dass sie nicht mehr gerechnet hat damit
        that she not anymore reckoned has it=with
        'that she didn't expect that to happen anymore.'
If, however, the cascade of VPs is below FP, one would expect extraposed material to be squeezed in between the (topmost) VP node (that is, the relevant case for my purposes) and FP, as demonstrated in 47. Thus, it has to be stipulated that extraposition comes after verb cluster formation because otherwise local dislocation between zu and its left neighbor from the verbal complex would be blocked.

(47) *um [FP [VP [VP rechnen können] mit so etwas] F° zu]
     in.order.to reckon could with so something to
     'in order to be able to take something like that into consideration'
Worse yet, taking zu to be the head of a left-branching functional projection also leads to more serious problems, as semantically compatible adverbials are predicted to be able to intervene between FP and the VP-domain:
To be fair, there is also the possibility of treating the different displacement phenomena under discussion here as instances of lowering, with zu attaching to the verbal head of its complement. I do not want to claim that such an analysis is impossible, but it undermines the original motivation for Salzmann's approach, namely, treating verbal complex formation as a PF-phenomenon and thus capturing its compactness property (see Salzmann 2013a).
To conclude, the approach of Salzmann couched in a Distributed Morphology setting has the charm of offering a more general analysis of morphological displacement phenomena, yet nontrivial adaptations or modifications are necessary to make it work. As of now, it is beset with conceptual and empirical problems.

Misplaced zu as an Exploratory Expression.
Let me now add some thoughts on misplaced zu from a diachronic perspective. As I have shown, the only relevant context where this phenomenon appears involves the movement of the zu-marked auxiliary to the front of the verbal complex, as displayed in 49 (see also examples 5-6 above; Bech 1955:62 refers to this process as "upper field formation"). Since zu seems to be inert, it ends up with the wrong verb, as it were.
(49) ohne es haben lesen zu können
     without it have read to could
     'without having been able to read it'
As I have shown, the inertness of zu is one of the sources of the so-called scandal construction, where all verbs in the right periphery bear an unexpected morphological marking. While Vogel (2009), Salzmann (2016), and Wurmbrand (2012) treat this construction as a regular part of German syntax, other voices in the literature are more skeptical: Reis (1979) expresses the idea that it belongs to the realm of phenomena that are not fully rule-governed, and Haider (2011) even goes so far as to treat it as a grammatical illusion; that is, a phenomenon that is deemed acceptable by some speakers while in fact it conflicts with well-established grammatical rules and is thus better regarded as ungrammatical. 15
To my mind, these diverging opinions are also informed by two distinct general conceptions of what a theory of grammar is supposed to model (Pullum & Scholz 2001; see also the discussion in Müller 2016, chapter 14): Generative-enumerative approaches (for example, Categorial Grammar, Minimalism, etc.) view well-formed structures as the result of a convergent application of rewrite-rules, whereas model-theoretic approaches treat them as conforming to structural descriptions specified by the theory. Müller (2016:490) describes this contrast succinctly: "the generative side only allows what can be generated by a given set of rules, whereas the model-theoretic approach allows everything that is not ruled out by constraints." Most importantly, the two types of approaches make different claims about gradient acceptability. In model-theoretic terms, (un)acceptability is the cumulative effect of constraint violations, whereas in generative-enumerative terms, it is the impossibility of finding a convergent derivation.
I do not want to claim that one of these two basic conceptions of what a grammar theory is supposed to model is per se better equipped to deal with the zu-anomaly. Instead, I want to offer a different angle on the question why this phenomenon has such an exceptional status. An interesting idea in this regard is expressed by Gaeta (2013:376), who believes morphological mismatches such as the zu-anomaly to be the byproduct of the extension of a new construction (in diachronic terms), which can lead to grammatical conflicts. On a more basic level, misplaced zu in its different guises constitutes a paradigm case of what Harris & Campbell (1995:73) refer to as exploratory expressions:
    By exploratory expressions we mean expressions which are introduced through the ordinary operation of the grammar and which 'catch on' and become fixed expressions and eventually are grammaticalized. Such expressions may originally be introduced for emphasis, for reinforcement, for clarity, for exploratory reasons, or they may result from production errors or afterthoughts. It appears that most initial exploratory expressions are made by applying the rules of grammar in a regular way, but it may be that some perhaps also involve ignoring (breaking) existing rules of grammar. The vast majority of such expressions are never repeated, but a few will come to be used frequently, will gain unmarked status, and will be grammaticalized. It is only when the exploratory expression has been reanalyzed as an obligatory part of the grammar that we may speak of a grammatical change having occurred.
Helmut Weiß (pers. commun.) expresses the opinion that the statement by Harris & Campbell (1995) seems to confuse constructions that are generated via (somewhat) unusual application of grammatical rules (via ignoring or even breaking them) and constructions that result from simple production errors. In his view, ignoring or even breaking existing rules of grammar always implies intentionality. Therefore, it is more plausible that the genesis of the zu-anomaly is due to production errors: While grammatical rules are mostly opaque and thus cannot be broken deliberately, the extension of a certain grammatical pattern through production errors does not imply intentionality on the part of the speaker.
In my opinion, the infinitival marker zu behaves as strangely as it does because it is stuck somewhere in the middle between a particle (free morpheme) and an affix. Of course, this explanation is not sufficient for the other cases of morphological mismatches in the right periphery, let alone detachments of finite morphology (see the discussion in section 2.2), but it might very well be the case that they stem from different grammatical sources altogether. More specifically, the difference between the quote by Thorsten Legat, representing the zu-anomaly in the guise of a production error, and the misplacements in the dialects boils down to how deeply wired they are into the grammar. It is not difficult to find comparable examples for which an interpretation as a simple production error is less likely:
(50) Der entfernte Beitrag war heftig kritisiert worden. So schrieb Proll in Richtung jener Frauen, die über sexuelle Belästigung berichteten, sie würde sich "schämen, damit jetzt zu hausieren gehen".
     be.embarrassed with.this now to peddle go
     'The deleted posting had been criticized heavily. Proll wrote, in the direction of those women reporting about sexual harassment, she would "be embarrassed now to peddle it"'. (derStandard.at 2017)
According to Harris & Campbell (1995:74-75), the following three stages can be distinguished when an exploratory expression becomes established: First, there is the introductory stage, when the expression in question is only used rarely. Then there is the (not very likely) chance that it would "catch on", meaning that it would become more widely used while its unusualness or newness are still apparent. Expressions that have reached this second stage are labeled popular. The last stage, reached only by few, is when the expression becomes fixed, that is, it gains unmarked status. As Harris & Campbell (1995:75) note (and this is crucial), fixation can also have an areal component: "Some areal phenomena apparently develop through the fixing of exploratory expressions." That is exactly what one observes with the different variants of the zu-misplacements, the variant to the right being more widespread than the one to the left or, for that matter, the zu-doubling cases. 16

Conclusion.
This paper had two main goals: On the empirical level, I showed that the discussion about zu 'to' and its cognates in other West Germanic languages suffers from the deficit that not all relevant data are taken into consideration. On the theoretical level, I proposed two simple, yet formally fully explicit devices to handle different cases of displaced morphology. For the infinitival marker, there is sufficient evidence that it does indeed behave like a phrasal affix (Vogel 2009) in that it combines properties of a bound (see the gapping facts or, at least as a preference pattern, coordination) and a syntactically active, free morpheme (see displacement). I showed that a context-free ID/LP syntax is powerful enough to derive some of the basic patterns; yet, ultimately, the wrapping rules discussed in section 3.2, in the form of morphosyntactic infixing operations, are more powerful and flexible, thus also allowing one to model cases of zu-doubling.
16 An anonymous reviewer remarks that the low frequency of misplaced zu might be the reason why this phenomenon remained in the state of being an exploratory expression. However, as the reviewer suggests, the regularity itself might be wired quite deeply into the grammar: "Shouldn't we assume that the rules that emerge in such a situation are very general rules of UG?" I think this idea fits in with the characterization of exploratory expressions as "expressions which are introduced through the ordinary operation of the grammar" (Harris & Campbell 1995:73), that is, they can be seen as an additional window into the workings of grammatical systems.
As for other cases of misplaced or unexpected morphology, it might very well be the case that more powerful tools such as reverse agree (Wurmbrand 2012) or local dislocation (Salzmann 2016) need to be invoked (see the discussion in section 3.3), yet it is clear that the respective analyses have to be adapted to accommodate the hitherto unnoticed or ignored empirical facts about the syntactic distribution of zu presented in this paper. It could also be worthwhile to exploit some of the simpler devices, such as function composition or, in the specific context of Categorial Morphology, substitution as a one-place operation for deriving portmanteau morphs, for example, French du (< de + le; see Schmerling 1983:228-230) or even morphological substitute forms in the verbal complex in their different shapes and guises. As of now, however, I have no concrete proposal along these lines to offer, so these matters have to be left open to future research.
A final reflection: If it is the goal of grammar theory not only to develop reasonably explicit and mathematically elegant formalisms, but also to model the grammatical knowledge of native speakers and its interaction with other cognitive domains, then adequacy criteria from these branches of science also come into play. For that reason, I do not agree with Stefan Müller's (2016:529) position (who quotes a statement to that effect by Carl Pollard) that the formal complexity of a descriptive language should not be the limiting factor: The question at this point is whether it is an ideal goal to find a descriptive language that has exactly the same power as the object it describes. Carl Pollard (1996) once said that it would be odd to claim that certain theories in physics were not adequate simply because they make use of tools from mathematics that are too powerful. [Footnote omitted] It is not the descriptive language that should constrain the theory but rather the theory contains the restrictions that must hold for the objects in question.