French subject island? Empirical studies of dont and de qui

Abstract Dont has been claimed to be an exception to the ‘subject island’ constraint (Tellier, 1991; Sportiche and Bellier, 1989; Heck, 2009) and to contrast with true relative pronouns such as de qui. We provide corpus data from a literary corpus (Frantext), which show that relativizing out of the subject is possible with dont and de qui in French relative clauses, and is even the most frequent use of both relative clauses. We show that it is not a recent innovation by comparing subcorpora from the beginning of the twentieth century and from the beginning of the twenty-first century. We also show, with an acceptability judgement task, that extraction out of the subject with de qui is well accepted. Why has this possibility been overlooked? We suggest that it may be because de qui relatives in general are less frequent than dont relatives (about 60 times less in our corpus). Turning to de qui interrogatives, we show that extraction out of the subject is not attested, and propose an explanation of the contrast with relative clauses. We conclude that in this respect, French does not seem to differ from other Romance languages.


Dont and de qui
French offers several ways of extracting a complement introduced by de, among them dont ('of which') and de qui ('of whom'), but also de quel, duquel, etc. The use of these two expressions obeys syntactic and semantic constraints: dont is restricted to relative clauses, whereas de qui can also be used in interrogatives; de qui is restricted to animates, whereas dont is not: 1
(1) a. un ami dont / de qui je tiens ce récit
'a friend whom I got this story from'
b. les lieux dont / *de qui on se souvient
'the places we remember'
Furthermore, dont cannot introduce a free relative (2a), cannot be in situ and cannot be the complement of a preposition (2b): it is therefore not used in pied-piping.
(2) a. Je me souviens de qui / *dont on m'a dit du bien.
'I remember who(ever) I hear good things about.'
b. cet homme loin de qui / *dont je suis
'the man who I am far from'
To account for these differences, it has been proposed that dont is a complementizer (Godard, 1988; Tellier, 1990) while qui is a wh-pronoun (Tellier, 1991; Le Goffic, 2007).

Syntactic constraints on subject extraction?
The distinction between dont and de qui has played an important role in the discussion about the 'subject island' constraint in generative grammar (Chomsky, 1973; Chomsky et al., 1977). If extraction out of the subject is banned crosslinguistically, why do certain languages allow it? In Italian for example, Rizzi (1982) has shown that extraction out of the subject is allowed (3a), and has related this to the pro-drop parameter (see also Stepanov, 2007). From this perspective, French is a challenging case, since it is not a pro-drop language, contrary to other Romance languages. Godard (1988) has shown that dont allows extraction out of the subject (3b), and this can be related to the fact that it is a complementizer and not a true relative pronoun (see also Sportiche, 1981).
(3) a. Questo autore, di cui so che il primo libro è stato pubblicato recentemente (Rizzi, 1982:61)
'this author, of whom I know that the first book has been published recently'
b. C'est un philosophe dont un portrait se trouve au Louvre (Godard, 1988:47)
'It is a philosopher of whom a portrait is in Le Louvre'
1 In all examples we put in bold the item of which a dependent is relativized. We also apply this convention to examples cited from other authors.
Another analysis is proposed by Heck (2009). He considers that relativization of the complement of the subject is not an extraction: dont is a specifier of the subject phrase and the whole DP is pied-piped to the edge of the relative clause, as shown in example (4a). French dont thus lives a double life, since it permits 'true' extraction of complements of verbs and of object nouns (4b).
(4) b. La fille dont tu as rencontré [le frère _]DP
'the girl whose brother you met'
According to Heck's analysis, no material should intervene between dont and the subject when dont is a specifier (4a): no long distance dependency (5a) and no subject inversion (5b) is supposed to be possible when relativizing a complement of the subject. However, such constraints have not been tested empirically.
(5) a. ?? un homme dont je refuse que le fils vous fréquente (Tellier, 1991)
'a man of whom I refuse that the son dates you'
b. * Colin, dont choque la coiffure blonde peroxydée (Heck, 2009)
'Colin, whose bleached blond hair is shocking'
Turning to true relative pronouns (analysed with wh-movement in the generative tradition), Tellier (1990) claims that French differs from Italian, and that extraction out of the subject is not possible with de qui (6a, 6c), which only allows extraction out of an object (6b). A similar contrast is proposed in Sportiche and Bellier (1989).
(6) b. C'est un linguiste de qui vous avez rencontré les parents. (Tellier, 1991:90)
'this is a linguist of whom you have met the parents'
c. le diplomate dont / ?*de qui la secrétaire t'a téléphoné (Tellier, 1990:307)
'the diplomat of whom the secretary called you'
Again, this contrast has not been tested empirically. Following Chomsky (2008), the subject island constraint only applies to 'true' subjects (7a), i.e. when the verb builds a v*P (a phase). The subject of a transitive verb is in (Spec,v*P) and no element from the subject can move to the edge of the phase, thus blocking extraction. Under this theory, it is possible to extract out of the subject of a passive (7b) or an unaccusative verb, because their subject is in the domain of v in the deep structure (see, however, Legate (2003), who argues that unaccusative and passive vPs are also phasal). Similar approaches (Polinsky et al., 2013; Uriagereka, 2012, among others) assume that extraction out of the subject of transitive verbs is dispreferred cross-linguistically. Applied to French, this theory predicts a difference between (6c) and (7c).
(7) b. Of which car were the hoods damaged by the explosion? (Ross, 1967:242)
c. l'homme dont / de qui la soeur est venue (Sportiche, 2011:30b)
'the man whose sister came'
Using Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994), Sag and Godard (1994) and Godard and Sag (1996) posit no subject constraint for French. They provide good examples with dont (8) and predict long distance extraction such as (5a) to be possible, but discuss neither transitive verbs nor de qui relatives.
(8) le Dr X dont la maison de Le Corbusier peut être visitée
'Dr X, whose house by Le Corbusier can be visited' (Godard & Sag 1996:63)
Sportiche (2011) argues that dont is a weak pronoun, and that there are cases where both dont and de qui are possible (7c); however, he does not discuss the subject island constraint.
Many syntacticians thus assume subtle contrasts between good and bad cases with dont (5) on the one hand, and between good and bad cases with de qui (6) on the other. However, these contrasts rely on linguists' intuitions and have not been tested rigorously.

Aim of this article
Given the lack of convergent data in the theoretical literature, empirical studies are needed in order to choose between a theory that claims that French differs from other Romance languages in obeying a 'subject island constraint' and a theory that claims that it does not. The questions to be addressed are the following: Is extraction out of the subject allowed in French with de qui as well as with dont? Is it restricted to unaccusative verbs and passives? Must dont be adjacent to the subject in this case? The aim of this article is to provide new empirical data on the current use of de qui, and to compare it with dont. We first review Abeillé et al. (2016), which provides two corpus studies on the present use of dont. We then present two new corpus studies using Frantext, a large database of well-edited literary texts by well-known authors. We compare the present-day use (2000-2013) of de qui with that of the last century (1900-1913). We also ran a new experiment on de qui, using controlled acceptability judgements (Gibson and Fedorenko, 2013; Sprouse and Almeida, 2017), and compare it with an experiment on dont reported in Abeillé et al. (2020).
1 Some corpus results on dont
Abeillé et al. (2016) present the distribution of dont in a newspaper corpus, the French Treebank (FTB; Abeillé et al., 2019), and in a spoken corpus of contemporary French, the Corpus Français Parlé Parisien des années 2000 (CFPP2000; Branca-Rosoff et al., 2012). An overview of their results is given in Table 1. They find examples where dont can relativize the complement of the subject noun, and surprisingly, this usage is the most frequent one in the French Treebank overall (9a) and the most frequent one when relativizing a complement of a noun in the spoken corpus (9b). They propose that the lower percentage of dont for subjects in the spoken corpus is due to the lower proportion of nominal subjects (clitic subjects are more frequent in spoken French, and no extraction is possible out of them).
(9) b. [...] des gens dont les parents avaient des grosses fortunes (CFPP2000)
'people of whose parents had great wealth'
Looking more closely at the examples of relativization out of the subject, they found a significant proportion of transitive verbs (35% of all extractions out of subject in the FTB, and 33% in the CFPP2000).
In order to compare dont and de qui in contemporary French, we have searched for de qui relatives in the same two corpora used by Abeillé et al. (2016). There is only one occurrence of de qui in the FTB and 11 in CFPP2000. Leaving aside interrogative uses and dysfluencies, there are only three relatives in CFPP2000: one relativizing the complement of adjective, one the complement of the object noun and one with pied-piping. This is why we turn to a larger corpus.  Table 2. 4 We annotated the de qui relatives for syntactic functions: de qui can relativize the complement of a verb (10a), of a noun (10b), of an adjective (10c) or of a preposition (10d).
(10) d. [...] Jean-Claude Passeron, autour de qui gravitaient [...] quelques membres de la même promotion [...] (Bardadrac, Genette, 2006)
'Jean-Claude Passeron, around whom some schoolmates were gravitating'
When de qui relativizes the complement of a noun, we also annotate the function of the noun. The distribution is reported in Table 3. 5 Strikingly, relativizing the complement of the subject is the most frequent use of de qui relative clauses: 54 cases are relativizations out of the subject, i.e. 26.87%, compared to 15.92% for the complement of a verb and 15.42% for the complement of an object noun. This is even more striking if we exclude subjects out of which it is impossible to extract (clitics and proper nouns). Among 85 candidate subjects, 54 have their complement relativized, i.e. 62.4%. This is similar to what Abeillé et al. (2016) have observed for dont in newspaper texts.
4 The label 'others' includes qui free relatives (le geste craintif de qui cherche secours 'the fearful gesture of who is seeking help'), and free choice uses (qui que ce soit 'whoever').
5 The label 'others' includes relative clauses with several gaps (also called parasitic gaps).
[...] son mari excentrique, de qui la collection orientale dormait au musée de l'Homme. (Pense à demain, Garat, 2010)
'her eccentric husband, of whose oriental collection slept in the Musée de l'Homme'
Contrary to Sportiche and Bellier (1989), Tellier (1991)   transitive verb (11a), some with an unaccusative verb (11b), all with a preverbal subject. Table 4 shows that this is not restricted to unaccusative verbs and passives, contrary to Chomsky (2008) and Uriagereka (2012). The transitive verbs alone represent 42.59% of the verbs among the relativizations out of a subject. They represent 51.74% of the verbs in the relative clauses with de qui overall, and a Fisher's test fails to show a significant difference between the two percentages (p = 0.1516).
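The comparison of proportions above relies on Fisher's exact test. As an illustration only, the following minimal Python sketch computes a two-sided Fisher's exact test for a 2x2 table. The counts are reconstructed from the reported percentages (23 transitive verbs among the 54 relativizations out of the subject, 104 among 201 de qui relatives overall), and the choice to compare subject relativizations with the remaining 147 relatives is an assumption: the text does not say how the contingency table was built, so the resulting p-value need not match the reported one exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability of x first-column cases in the first row, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties with the observed table count.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Assumed counts, reconstructed from the reported percentages:
# 23 of 54 subject relativizations are transitive (42.59%),
# 104 of 201 de qui relatives overall are transitive (51.74%),
# hence 81 of the remaining 147 relatives.
subject_trans, subject_other = 23, 54 - 23
rest_trans, rest_other = 104 - 23, (201 - 54) - (104 - 23)

p = fisher_exact_two_sided(subject_trans, subject_other, rest_trans, rest_other)
print(f"two-sided p = {p:.4f}")
```

The function enumerates every table compatible with the fixed margins, which is cheap for corpus counts of this size; for larger tables a statistics library would be the more practical choice.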
3 Dont relatives in Frantext (2000-2013)
In order to compare de qui with dont, we ran a second study on the same subcorpus (Frantext 2000-2013). Since dont has more than 13,000 occurrences, we randomly selected a subset of 500 occurrences, among which we selected the ones with an animate antecedent, in order to have a meaningful comparison with de qui. Overall, 143 relative clauses had an animate antecedent. We annotated the function of the relativized element and the function of the noun when a complement of noun is relativized. Compare the distribution in Table 5 with Table 3. Here again, relativizing the complement of the subject is possible (12) and it is the most frequent usage of dont, as in Abeillé et al. (2016). Among 73 subjects out of which it is possible to extract (with a common noun), 58 have their complement extracted (79.50%).
(12) [...] la belle Antillaise dont l'effigie orna [...] les boîtes de Banania. (La solitude de la fleur blanche, Roux, 2009)
'the nice Caribbean girl, whose picture decorated the packages of the Banania brand'
As with de qui relative clauses, relativizing the complement of a subject with dont is not restricted to passive or unaccusative verbs, as can be seen in Table 6. The transitive verbs alone represent 24.59% of the verbs among the relativizations out of subjects. They represent 37.76% of the verbs in the relative clauses with dont overall, and a Fisher's test shows that the difference is significant (p < 0.001). However, it would be difficult to claim that transitive verbs are ruled out, since their frequency is non-marginal (this is confirmed by a binomial test).
Heck (2009)'s hypothesis predicts that no material should stand between dont and the subject. No extraction out of a postverbal subject or out of the subject of an embedded clause should be possible. But we find four such postverbal subjects in the corpus (13a-b); and one long-distance dependency (13c).
(13) c. [...] madame Segond-Weber, la grande tragédienne, dont il aime rappeler que les répliques tombaient de sa bouche « comme des fûts de colonne ». (Voix off, Podalydes, 2008)
'Madame Segond-Weber, the great tragedian, of whom he enjoys recalling that the lines fall out of her lips « like pillars »'
In order to compare de qui and dont relative clauses, we exclude de qui relative clauses with pied-piping. If we leave aside relativizing the complement of a preposition (autour de qui 'around whom') (14.93%) and the complement of a noun which is introduced by a preposition (avec le fils de qui 'with the son of whom') (18.41%) in Table 3, the proportion of relativizing the complement of the subject with de qui is 40.30%, which is very close to dont (42.66%). This similarity between de qui and dont (Tables 3 and 5) is confirmed by a statistical test: a Fisher's test fails to find a significant difference between the proportion of relativizations of the complement of the subject in the dont relative clauses and in the de qui relative clauses (p = 0.7156).
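The percentages compared above follow from simple ratios. A minimal Python sketch, assuming a total of 201 de qui relatives and 61 subject relativizations among the 143 dont relatives (both figures are inferred from the reported percentages, not stated as counts in the text), recomputes the share of relativizations out of the subject with and without the pied-piping uses:

```python
def share(count, total):
    """Return count/total as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# Counts marked "assumed" are inferred from the reported percentages.
DE_QUI_TOTAL = 201        # assumed: all de qui relatives, Frantext 2000-2013
DE_QUI_SUBJECT = 54       # reported: complement of the subject relativized
DE_QUI_PIED_PIPING = 67   # reported: pied-piping uses
DONT_TOTAL = 143          # reported: dont relatives with an animate antecedent
DONT_SUBJECT = 61         # assumed: inferred from 42.66% of 143

de_qui_all = share(DE_QUI_SUBJECT, DE_QUI_TOTAL)
de_qui_no_pp = share(DE_QUI_SUBJECT, DE_QUI_TOTAL - DE_QUI_PIED_PIPING)
dont_all = share(DONT_SUBJECT, DONT_TOTAL)

print(de_qui_all, de_qui_no_pp, dont_all)  # 26.87 40.3 42.66
```

Removing the 67 pied-piping cases from the denominator is what moves the de qui figure from 26.87% to 40.30%, the value that is compared with dont.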
Relativizing the complement of the subject is thus possible with de qui as well as with dont. It is the most frequent use of de qui relatives (40.30% if we leave aside pied-piping uses), as it is the most frequent use of dont relatives in our corpus (42.66%). Furthermore, it is not limited to unaccusative verbs, contrary to Chomsky (2008). In this case, transitive verbs are even more frequent with de qui than with dont (a test shows that the difference is marginally significant, p < 0.05). With dont, it is possible with postverbal subjects, contrary to Heck (2009). Thus, there do not seem to be syntactic restrictions on relativizing the complement of the subject in present-day French. Why has it been overlooked? It is true that dont relatives are much more frequent than de qui relatives overall, and that the use of de qui in pied-piping (67 examples) slightly outnumbers the relativization out of the subject (54 examples). Maybe this is why French linguists have overlooked this possibility with de qui. Or maybe it is because it might be a new use of de qui, since these French linguists wrote on dont and de qui at the end of the twentieth century. This is why we consider a previous time period in Frantext. 7
4 De qui relatives in Frantext (1900-1913)
Since linguists of the twentieth century (Tellier, 1990, 1991) have considered de qui unacceptable when relativizing the complement of a subject, whereas our analysis in the previous section showed that this use of de qui is relatively common in contemporary French, we consider here the possibility that it has arisen in the language only recently. This is why we ran another corpus study on Frantext, targeting the period 1900-1913 (179 texts, 7.8 million words). The query de qui in Frantext for 1900-1913 gave us 271 occurrences, which we annotated manually. Their distribution is presented in Table 7: 171 occurrences of de qui are relative clauses, i.e. 63.10% (vs. 44.77% of the occurrences of de qui in Frantext 2000-2013). A Fisher's test shows that there is a significant difference between the two periods (p < 0.001), so that there is a significantly higher proportion of relative clauses for 1900-1913 than for 2000-2013. 8
As in contemporary French, de qui can relativize any kind of complement: the complement of a verb (14a), of a noun (14b), of an adjective (14c) or of a preposition (14d). The distribution can be seen in Table 8.
(14) b. [...] Berthe Hochard, de qui elle a redressé la serviette [...] (La Maternelle, Frapié, 1904)
'Berthe Hochard, whose towel she readjusted'
c. [...] ceux à l'encontre de qui il s'exerce [...] (Du côté de chez Swann, Proust, 1913)
'those against whom it is exercised'
7 We observe that all our examples of de qui relativization out of the subject are from books by the same author: Anne-Marie Garat. Overall, this author uses de qui very frequently: 147 de qui relatives (73.87% of our corpus results) are from Garat. We will see in the next section that other authors also use de qui relative clauses for the complement of the subject.
8 As in Table 2, the label 'others' is for qui (and not de qui) free relatives and free choice uses.
9 The label 'others' includes qui (and not de qui) free relatives, as well as free choice uses.
Among 171 relative clauses, 38 relativize the complement of the subject (15), i.e. 22.22%. This percentage is a bit lower than in Frantext 2000-2013, but a Fisher's test fails to show a significant difference between the two periods (p = 0.3355). However, if we only take into account subjects out of which it is possible to extract (excluding proper names and clitics), the difference with Frantext 2000-2013 is bigger. Among 95 such subjects in the relative clauses, 38 show a relativization of their complement, i.e. 40% (vs. 62.4% for Frantext 2000-2013). We thus conclude that extraction out of the subject has increased with de qui over time, and a Fisher's test shows a significant difference between the two periods (p = 0.003).
(15) b. [...] un légiste inattaquable, mais de qui les victimes tout de même intéressent [...] (Mes Cahiers : t. 9 : 1911-1912, Barrès, 1912)
'an unassailable jurist, but of whom the victims nonetheless arouse interest'
c. [...] toi de qui l'âme est si merveilleusement jumelle de la mienne [...] (Le journal d'une femme de chambre, Mirbeau, 1900)
'you of whom the soul is so wonderfully twin with mine'
There is no difference in the type of verbs involved in those relativizations out of the subject between 1900-1913 and 2000-2013 (compare Table 9 to Table 4). Transitive verbs represent 36.84% among the relativizations of the complement of the subject, a bit less than in Frantext 2000-2013 (Table 10), but this difference is not significant (Fisher's test, p = 0.6678). Transitive verbs represent 49.12% in all de qui relative clauses in Frantext 1900-1913: this is not significantly different from their proportion when relativizing out of the subject (Fisher's test, p = 0.1398).
We also find three examples of relativizing out of an inverted subject, such as (16). Overall, we find a high proportion of relativizations out of a subject with de qui, which does not seem to be an innovation, even if it has increased over time. We cannot find any restriction on the type of verbs involved in these relative clauses (contra Chomsky, 2008), as in Frantext 2000-2013.
5 Dont relatives in Frantext (1900-1913)
We now compare de qui with dont for this second time period. Frantext 1900-1913 has close to 10,000 occurrences of dont. We randomly selected a subset of 1,300 occurrences, and excluded relatives with an inanimate antecedent. This left us with 179 relative clauses. Their distribution is presented in Table 11. As many as 56.42% of these relative clauses show an extraction out of a subject NP (17). This is more than in Frantext 2000-2013 (42.66%) and a Fisher's test shows that this difference is significant (p = 0.0184). Furthermore, among 118 subjects out of which it is possible to extract, 101 have their complement extracted (85.59%). This is more than in Frantext 2000-2013 (79.50%), but the difference is not significant (Fisher's test, p = 0.3199).
'the blessed ones, whose languor had the power to cure any languor'
We also find one case of subject inversion when relativizing out of the subject (18), which contradicts the adjacency constraint proposed by Heck (2009).
(Pierre Loti, L'Inde (sans les Anglais), 1903)
'the prancing monsters, of whom can be recognized the shapes'
As in Frantext 2000-2013, different types of verbs are attested in the 101 relativizations out of a subject (Table 12). Transitive verbs alone represent 33.66% of the verbs, vs. 45.81% of the verbs in all relative clauses with dont. A Fisher's test shows that the difference is significant (p < 0.001). As in Frantext 2000-2013, relativizing with dont out of the subject tends to involve fewer transitive verbs than other kinds of relativization. Table 13 summarizes the distribution of transitive verbs for both periods. There are more transitive verbs when relativizing out of the subject in Frantext 1900-1913 than in Frantext 2000-2013, but the difference is not significant (Fisher's test, p = 0.2897). As far as transitive verbs are concerned, we can therefore see a stable tendency: de qui and dont relativizations of the complement of the subject have fewer transitive verbs than the rest of the de qui and dont relative clauses (significantly so only for dont), but still have a high and non-marginal proportion of transitive verbs. In our corpus, this tendency does not evolve across time.
The distributions of de qui and of dont relative clauses in the time period 1900-1913 are very similar, even if relativizing out of a subject with de qui (41.30% of the relative clauses, excluding pied-piping) is less frequent than with dont (56.42%) (the difference is significant, as confirmed by a Fisher's test, with p = 0.0211). For the 2000-2013 period, the difference between both kinds of relative clauses for relativizing out of subject was not significant: 40.30% for de qui (excluding pied-piping) and 42.66 % for dont.
Overall, our corpus studies show a high proportion of relativizations out of a subject with de qui, roughly similar to the proportion of relativizations out of a subject with dont (contra Sportiche and Bellier, 1989; Tellier, 1990, 1991). Comparing our two periods (1900-1913 and 2000-2013), this proportion is increasing for de qui and decreasing for dont (with an animate antecedent). We cannot find any restriction on the type of verbs involved in these relatives (contra Chomsky, 2008; Uriagereka, 2012): when relativizing out of the subject, transitive verbs were even more frequent with de qui than with dont for both periods. We have also found examples with a postverbal subject, both for dont and de qui, which contradict the adjacency constraint proposed by Heck (2009) for dont. Overall, our data contradict the syntactic constraints proposed by Sportiche and Bellier (1989), Tellier (1990, 1991) and Heck (2009). 10
6 Experimental results
Using literary texts from Frantext, which are very well written and edited, we found that de qui relatives corresponding to the complement of the subject are well attested in both twentieth and twenty-first century French, even though they are less numerous than with dont. One possible reason for the discrepancy between our results and those from the linguistic literature (Tellier, 1991; Sportiche, 2011) is that we have relied on corpus data while theoretical linguists generally rely on acceptability data. To see whether there really is a difference between corpus data and acceptability data in this domain, we ran a controlled acceptability experiment (Hemforth, 2013; Sprouse and Almeida, 2017) on de qui relative clauses. We first report the experiment on dont reported in Abeillé et al. (2018) in order to draw a meaningful comparison.
2020), which we summarize here.
We tested 24 items in the following 2 (subject/object) x 3 conditions: with dont for the complement of the noun, with que instead of dont (ungrammatical control), with clause coordination (grammatical control).
(19) a. dont subject:
Un célèbre pâtissier a mis au point une recette dont la simplicité ravit les apprentis depuis des générations.
a famous pastry.chef has invented a recipe of.which the simplicity delights the apprentices since several generations
b. dont object:
Un célèbre pâtissier a mis au point une recette dont les apprentis aiment la simplicité depuis des générations.
a famous pastry.chef has invented a recipe of.which the apprentices like the simplicity since several generations
c. coord subject:
Un célèbre pâtissier a mis au point une recette, et sa simplicité ravit les apprentis depuis des générations.
a famous pastry.chef has invented a recipe and its simplicity delights the apprentices since several generations
d. coord object:
Un célèbre pâtissier a mis au point une recette, et les apprentis aiment sa simplicité depuis des générations.
a famous pastry.chef has invented a recipe and the apprentices like its simplicity since several generations
e. que subject:
Un célèbre pâtissier a mis au point une recette que la simplicité ravit les apprentis depuis des générations.
a famous pastry.chef has invented a recipe that the simplicity delights the apprentices since several generations
f. que object:
Un célèbre pâtissier a mis au point une recette que les apprentis aiment la simplicité depuis des générations.
a famous pastry.chef has invented a recipe that the apprentices like the simplicity since several generations
We chose transitive verbs that come in reverse subject/object pairs (ravir/aimer 'delight/like'), and nouns denoting qualities (importance, beauté, simplicité 'importance, beauty, simplicity'). We tested 24 experimental items randomly mixed with 24 distractors. We ran an online acceptability rating task (using a 1-10 scale, similar to the grading scale of the French school system) on Ibex (http://spellout.net/ibexfarm/), using the RISC platform (https://expesciences.risc.cnrs.fr/) and social media to recruit 48 participants.
10 Looking for other factors that may govern the choice between dont and de qui relative clauses with an animate antecedent, we annotated several properties: relative clause restrictiveness (appositive/restrictive), antecedent number, antecedent definiteness, verb type (transitive, unaccusative, unergative...) and subject inversion. We ran several generalized linear regression models (Sakamoto et al., 1986). The variable to be explained was always the relative word. Only one factor appeared significant in Frantext (2000-2013): an appositive relative is more likely to have de qui than dont, and another factor was marginally significant: a relative with a transitive verb is more likely to have de qui than dont. In Frantext 1900-1913, however, these factors were not significant.
The results show a subject advantage for dont relative clauses, such that extractions out of the subject (19a) were rated marginally better than extractions out of the object (19b). Both the dont relative clauses and the coordination variants (19c,d) were rated significantly higher than the ungrammatical controls with que.

An experiment on de qui
Using a design similar to that of Abeillé et al. (2018, 2020), we ran an acceptability judgement task, comparing extracted variants with clausal coordination (no extraction) and ungrammatical control (missing preposition de) conditions, both for subject and object (2x3 design). We chose relational human nouns (compagne 'partner', cousin 'cousin') for both subjects and objects, to avoid animacy mismatch, and reversible transitive verbs (fréquenter 'frequent', connaître 'know') so that we had the same verb in subject and object conditions. We had three practice items, 24 target items and 24 distractors. The items and the distractors were randomized in six lists, using a Latin square design. Each participant saw each item in exactly one condition. Participants were asked to rate the sentences from 0 (not natural) to 10 (perfectly natural). We conducted the study on the internet (Ibex, Drummond, 2010) with 28 participants, 23 women and 5 men, aged 18 to 75, who volunteered on the RISC platform (https://expesciences.risc.cnrs.fr/) and over social media.
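The Latin square assignment described above can be sketched as follows. This is a minimal illustration in Python, not the script used in the study: the condition labels and the rotation scheme are assumptions, and distractors and within-list randomization are left out.

```python
from collections import Counter

# Hypothetical labels for the six cells of the 2x3 design.
CONDITIONS = [
    "extraction-subject", "extraction-object",
    "coordination-subject", "coordination-object",
    "ungrammatical-subject", "ungrammatical-object",
]


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating conditions.

    In list l, item i appears in condition (i + l) mod k, so every list
    contains each item once and every item meets each condition across lists.
    """
    k = len(conditions)
    return [
        [(item, conditions[(item + lst) % k]) for item in range(n_items)]
        for lst in range(k)
    ]


lists = latin_square_lists(24, CONDITIONS)
per_condition = Counter(cond for _, cond in lists[0])
print(len(lists), len(lists[0]), per_condition["extraction-subject"])  # 6 24 4
```

With 24 items and six conditions, each list contains four items per condition, so a participant assigned to one list sees every item exactly once and every condition equally often.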
The results of the experiment are reported in Figure 1, the bars indicating 95% confidence intervals. Running maximal linear mixed models (Baayen et al., 2008) with crossed random slopes for subjects and items (Barr et al., 2013), we first compared the extraction out of subject (20a) with extraction out of object (20b) and found no significant difference. We ran two 2x2 analyses, comparing extraction, first, with grammatical controls, and, second, with ungrammatical controls. Again, we found no main effect of grammatical function (subject, object), but in each case an effect of extraction type, such that the de qui relatives were rated lower than grammatical controls (20c,d) (p < .0001) but higher than ungrammatical controls (20e,f) (p < .0001). In both models, there was no interaction effect. This confirms what we found in our corpus studies: there is no penalty for relativizing the complement of the subject noun with de qui, contrary to Sportiche and Bellier (1989) and Tellier (1990, 1991).

Discussion
Our experiment shows that there is no subject penalty for de qui relative clauses, and confirms the acceptability of the de qui relative clauses found in Frantext corpora (sections 2 and 4). If we compare it with Abeillé et al. (2018)'s experiment on dont reported in section 6.1, we notice that there is no subject advantage with de qui, while there was one with dont. This may suggest that the putative contrast in grammaticality proposed by Tellier (1991) (extraction out of the subject is grammatical with dont and ungrammatical with de qui) should be revised as a mere preference: extraction out of the subject is preferred (over extraction out of object) with dont and not with de qui in relative clauses, as far as acceptability judgements are concerned. However, this difference might be due to the experimental materials, and to the animacy match or mismatch between subject and object. In the dont experiment, the baseline (coordination) also displays a subject advantage, while in our de qui experiment there is no subject advantage in the coordination baseline. This is clear in the two interaction graphs below (Figure 2): the two lines are exactly parallel in both experiments, showing no interaction between subject/object condition and extraction, bars indicating 95% confidence intervals.
Another difference between the two experiments is that the de qui relatives are below the grammatical coordination baseline, while the dont relative clauses were above the coordination baseline (Figure 2). This again may be due to the difference in materials between the two experiments: the relevant NP in the coordination condition had a possessive determiner in the dont experiment (the sentence may therefore be ambiguous); in our de qui experiment, we repeated the complement of the noun in the coordination version in order to avoid this possible ambiguity (as can be seen in (20c,d)). The lower acceptability of de qui, compared to dont, can also be explained by its lower frequency: we had more than 13,000 dont relatives vs. 319 de qui relatives (including free qui relatives) in Frantext 2000-2013, and fewer than 200 de qui relatives (again including free qui relatives) in Frantext 1900-1913. The relationship between frequency and acceptability is not a simple issue. As pointed out by Lau et al. (2017), there is no correlation if one considers individual sentences, the frequency of which varies according to lexical frequency and length. Furthermore, unseen sentences (with zero frequency) can be fully acceptable (Featherston, 2005). However, if one considers the frequency of certain constructions, regardless of lexical frequency and sentence length, as we have done in this article, this frequency shows some correlation with acceptability judgements (Keller, 2000; Lau et al., 2017). The overall difference in frequency between de qui and dont relative clauses may explain the difference between our results and those of Abeillé et al. (2020), overall and not specifically in the subject condition. It may also explain the contrasts in judgements reported by Tellier (1990, 1991) between dont and de qui, if rarer constructions tend to receive lower acceptability judgements. We do not know why dont is much more frequent than de qui overall. We hypothesize it may be because dont is a complementiser (Godard, 1988; Tellier, 1991), hence simpler to process than a de qui PP (Kluender and Kutas, 1993). It may also be because dont is a weak pronoun (Sportiche, 2011) and (de) qui a strong pronoun, given a general preference for weak over strong forms.

General discussion
Our corpus studies (Frantext 1900-1913 and 2000-2013) did not confirm the syntactic constraints proposed on subject extraction in French by Sportiche and Bellier (1989), Tellier (1990, 1991) and Heck (2009). Relativizing out of a nominal subject is possible and frequent, both with de qui and with dont. In both periods, there are more de qui relativizations out of the NP subject than out of the NP object (which is considered acceptable in the generative literature), and these relativizations out of the subject also outnumber the relativization of verbal complements. Pied-piping cases aside, the distribution of de qui is similar to the distribution of dont, for which relativizing out of the NP subject is the most common usage. Neither type of relativization out of the subject is an innovation in French, even if the proportion of subjects with de qui extraction has risen, and neither is limited to unaccusative verbs, contra Chomsky (2008) and Uriagereka (2012). On the other hand, attested examples of relativization out of postverbal subjects make Heck's proposal that dont belongs to the subject phrase implausible.
This does not necessarily mean that extraction out of the subject is not constrained in French. Relative clauses are an extraction type, but there are others to be tested. De qui can indeed also be used in wh-questions, and Tellier (1990, 1991) also suggested that questioning out of the subject is not grammatical with de qui.
(21) ?* De qui est-ce que la secrétaire t'a téléphoné? (Tellier, 1990:307)
'Of who did the secretary call you?'
We thus compare de qui relatives with de qui interrogatives in the same corpus, for the same time periods.

A comparison with de qui interrogatives
We found 128 de qui questions in Frantext 2000-2013 (Table 2). After taking out 43 verbless questions, 86 de qui questions are left, 41 direct and 45 subordinate. We annotated them with the syntactic function of de qui and, when de qui is the complement of a noun, with the function of that noun. Their distribution is strikingly different from that of relative clauses (compare Table 14 with Table 3). We do not find any question out of the subject, while we find questions for the complement of the verb (22a) or of the object (22b).
(Dans la main du diable, Garat, 2006) 'First of all, of who do you know that they are gone together there?'
b. Dans la première oeuvre de cette exposition, [...] de qui as-tu utilisé les voix ? (La vie possible de Christian Boltanski, Boltanski and Grenier, 2007) 'In the first artwork of this exhibition, of who did you use the voices?'
In Frantext 1900-1913, we found 70 de qui questions (Table 7). After taking out 17 verbless questions, we are left with 53 de qui questions, 34 direct and 19 subordinate. As for 2000-2013, we observe a striking difference between relative clauses and interrogatives (compare Table 15 with Table 8): we do not find any example of questioning out of the subject. This suggests a contrast between relative clauses and wh-questions, even though for 1900-1913, we did not find any questioning out of the object either. Interrogatives with de qui strongly favour the complement of the verb (23a): this case represents 71.7% of the de qui verbal questions in 1900-1913. We found seven examples of questioning out the predicative
Proust, 1913) 'the interlocutor will know well who you mean' (lit. 'of who you want to talk')
b. De qui était-il la proie ? (Jean-Christophe : L'Adolescent, Rolland, 1905) 'Of who was he the prey?'
While it is true that constructions with zero frequency can have various degrees of acceptability (Featherston, 2005), we can say that our corpus data do not contradict the subject penalty suggested by Tellier (1990, 1991) for de qui questions. Our corpus studies show a striking difference between de qui relative clauses on the one hand, and interrogatives on the other hand. Relativizing the complement of the subject is frequent, even favoured, in our corpus; questioning it is not.
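The counts reported above for Frantext 1900-1913 can be cross-checked with a short calculation. The sketch below uses only figures stated in the text (70 questions, 17 verbless, 34 direct, 19 subordinate, 71.7%); the variable names are ours, and the number of verb-complement questions it derives is an inference from the percentage, not a figure given in the article.

```python
# Figures stated in the text for de qui questions in Frantext 1900-1913.
total_questions = 70
verbless_questions = 17
direct_questions = 34
subordinate_questions = 19
verb_complement_share = 0.717  # "71.7% of the de qui verbal questions"

# Questions containing a verb: total minus verbless ones.
verbal_questions = total_questions - verbless_questions

# The direct/subordinate split should add up to the same number.
assert verbal_questions == direct_questions + subordinate_questions

# Inferred (not stated in the text): number of questions in which
# de qui is the complement of the verb.
implied_verb_complement = round(verb_complement_share * verbal_questions)

print(verbal_questions)
print(implied_verb_complement)
print(round(100 * implied_verb_complement / verbal_questions, 1))
```

Under these assumptions, the reported percentage corresponds to a whole number of questions out of the 53 that contain a verb.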

Are relative clauses special?
Does French obey a syntactic subject island constraint? Our corpus studies have provided numerous examples of extraction out of the subject with de qui relative clauses, and an acceptability judgement task has found no difficulty with relativizing the complement of the subject with de qui. However, we did not find any example of extraction out of the subject in wh-questions. The subject island constraint can be viewed as a specific syntactic constraint, or as the result of cumulative independent constraints. Haegeman et al. (2014) consider several non-syntactic factors that may ameliorate extraction out of the subject, for example indefiniteness. Bianchi and Chesi (2014) also suggest that subjects of thetic sentences do not occupy the same position as those of categorical sentences, and are easier to extract from. While these proposals may explain why some examples are better than others, they do not address the basic contrast we have found between relative clauses and wh-questions. Why are relative clauses special?
We first consider the hypothesis that relativization does not involve extraction, as suggested by an anonymous reviewer. The ease of relativizing may mean that such relative clauses do not involve extraction, contrary to wh-questions. It is true that there are cases of gapless dont relative clauses with a resumptive pronoun in French (Godard, 1988; Tellier, 1991), but not with de qui. In our corpora, we found no de qui relatives with a resumptive pronoun. For 2000-2013, we found five such dont relative clauses (24), and three for 1900-1913. In this case, as noticed by Godard (1988), the resumptive pronoun must be embedded (*celui dontᵢ ilᵢ l'accompagne 'the one of which he accompanies him'), contrary to our cases with extraction out of the subject. Several authors have proposed a gapless analysis, such that the relative PP would be a hanging topic (Giorgi and Longobardi, 1991; Broekhuis, 2006). This analysis would predict that an anaphoric pronoun or a possessive should be possible. Going back to our corpus examples, it is clear that such a possessive is grammatical neither with dont (ex. 12) nor with de qui (ex. 11a), in standard French:¹²
(26) a. la belle Antillaise, dont l'effigie orna les boîtes de Banania.
'the nice Caribbean girl, of which the picture decorated the packages of the Banania brand'
b. *la belle Antillaise, dontᵢ sonᵢ effigie orna les boîtes de Banania
'the nice Caribbean girl, of which her picture decorated the packages of the Banania brand'
c. les ogres de qui la danse barbare vous confisque l'enfance
'the ogres of whom the barbaric dance takes your childhood away from you'
d. *les ogres de quiᵢ leurᵢ danse barbare vous confisque l'enfance
'the ogres of whom their barbaric dance takes your childhood away from you'
Another problem for the gapless hypothesis is that it does not account for connectivity effects: dont relativizes a de-PP, while que relativizes an NP (in standard French). In the experiment reported by Abeillé et al. (2020) (see section 6.1), que was tested as a variant of dont and was judged unacceptable, both for the complement of the subject and the complement of the object; so it is difficult to claim that such dont relative clauses are gapless. In our experiment (see section 6.2), qui was tested as a variant of de qui and was judged unacceptable, in both subject and object conditions, which provides evidence for a PP[de] gap (see Haegeman et al. (2014) for more arguments along the same lines). We thus conclude that our empirical studies show that extraction out of the subject is possible with de qui and dont in relative clauses, but not attested with de qui in wh-questions. A similar difference among constructions has been observed experimentally by Sprouse et al. (2016) for Italian: in two acceptability judgement tasks, extraction out of subjects was rated higher than extraction out of objects in relative clauses, and lower in wh-questions. They conclude that 'this [cross construction] variation requires modification to all existing syntactic theories of island effects'.
¹² In our experimental items (20a, 20b), such a possessive is not possible either:
(25) a. J'ai un voisin de qui la compagne connaît ma cousine.
'I have a neighbour of whom the partner knows my cousin'
b. *J'ai un voisin de qui sa compagne connaît ma cousine.
'I have a neighbour of whom his partner knows my cousin'

A discourse-based hypothesis
The lack of extraction out of subjects in wh-questions, but its high frequency (and high acceptability) in relative clauses, may be difficult to explain under purely syntactic accounts (Chomsky et al., 1977; Chomsky, 2008). Other approaches to the subject island constraint have proposed that processing (Kluender, 2004) and discourse factors (Erteschik-Shir, 2007) play a role. The constraint may thus reflect a non-syntactic difficulty (Chaves, 2013). Erteschik-Shir (2007) and Goldberg (2013), among others, have proposed that the difficulty of extraction does not come from syntactic configurations but from discourse infelicity. Since extraction makes a constituent more salient (it becomes a Topic or a Focus), they claim that extraction is infelicitous out of backgrounded or non-focal constituents. Since the subject is usually a Topic, extraction out of the subject is not ungrammatical but infelicitous. Abeillé et al. (2020) have proposed to revise such an explanation in order to account for the difference between constructions, and suggest that it is not felicitous to focus part of a non-focal constituent. The extracted element in a wh-question is the Focus of the clause, whereas the subject is prototypically a Topic. Focusing (with a wh-question) a subpart of a (Topic) subject is thus dispreferred. This type of approach predicts that relative clauses are not constrained in this respect: the relativized element is not focused and the antecedent of the relative clause can have any discourse status in the main clause. Such a theory only constrains focalizing extractions, such as wh-questions, but not relativization. It does not, however, predict that relativizing out of the subject is more frequent than relativizing out of the object, as it is in our corpus studies. This latter fact may come from non-syntactic factors that make subject relatives more frequent than object relatives cross-linguistically (see for example Roland et al., 2007) and from a more general accessibility hierarchy of grammatical functions (Keenan and Comrie, 1977).¹³ If valid, this proposal should apply to other languages as well, making relativizing out of the subject easier than questioning out of the subject (see Sprouse et al., 2016, for such a contrast in Italian). It may also explain why some wh-questions are better than others, if a subject can be made more focal in a given context.¹⁴
¹³ Psycholinguistic experiments also show that subject relatives are preferred to object relatives, in French (Pozniak and Hemforth, 2015), and cross-linguistically (Holmes and O'Reagan, 1981).
¹⁴ Of course, other factors may play a role as well in other languages. For example in English, it has been well known since Ross (1967) that preposition stranding penalizes extraction out of the subject (27a), even if more acceptable examples have been proposed by various authors (Kluender, 1998, 2004; Chaves, 2013):
(27) a. * Which cars were the hoods of damaged by the explosion? (Ross, 1967:fn. 31)
b. What were pictures of seen around the globe? (Kluender, 1998:268)
c. Which problem will a solution to never be found? (Chaves, 2013:301)

Conclusion
Theoretical discussions of long distance dependencies have claimed that it is more difficult to extract out of subjects than out of objects (Ross, 1967; Chomsky, 1986), even though various factors may play a role (Haegeman et al., 2014). French dont has been claimed to be an exception in this respect (Sportiche and Bellier, 1989; Tellier, 1990, 1991) and de qui to follow the general rule. In a large corpus of literary texts (Frantext 2000-2013), we have found that relativizing out of the subject is possible with de qui as well as dont. Furthermore, our corpus analysis shows that both for dont and de qui, relativization of the complement of the subject noun is the most frequent use of these relative clauses (pied-piping excepted for de qui). It is possible with intransitive as well as transitive verbs, with preverbal subjects as well as postverbal subjects. We compared these data with Frantext 1900-1913 and could not identify any meaningful evolution between the two periods. We ran an acceptability judgement task that confirmed that relativizing the complement of the subject with de qui is as acceptable as relativizing the complement of the object. The syntactic difference between the two types of relativizers (a complementiser for dont vs. a relative pronoun for qui, or a weak vs. a strong pronoun following Sportiche, 2011) does not play a role. We thus conclude that French behaves like Italian in this respect (Rizzi, 1982).
However, looking at de qui interrogatives, we did not find any extraction out of subject. We conclude that extraction out of the subject is sensitive to the construction type: it is possible in relative clauses and more difficult in interrogatives. This difference among constructions has also been observed by Sprouse et al. (2016) for Italian, and found to be difficult to explain by most current syntactic theories. We suggest it is compatible with theories that view the subject island as a matter of syntax-discourse interface (Erteschik-Shir, 2007;Abeillé et al., 2020).
On a more general note, Sprouse et al. (2013) argued that most intuitions reported on English in a major linguistic journal were confirmed by acceptability judgements. However, Linzen and Oseki (2018) argue that it is not the case when testing data from other languages (Hebrew and Japanese). This article suggests using both corpus data and acceptability judgements.