1. Introduction
Natural languages can form filler-gap dependencies, which establish a relationship between a moved element (the filler) and a gap in its base syntactic position (i.e., where the filler is ultimately interpreted).¹ In wh-questions such as (1-a), the filler wh-phrase which book is linked to a gap contained within the complement clause. Relative clauses (RCs) such as (1-b) are also filler-gap dependencies, where the head of the RC is the filler that is linked to a gap.
(1) | a. | Which book did Anna say [that Brian had read __]? |
| b. | That is the book which Anna said [that Brian had read __]. |
Filler-gap dependencies can in principle cross an arbitrary linear and structural distance (Chomsky 1973, 1977), as illustrated in (2):
(2) | a. | Which book did Anna say [that Sunniva thought [that Kristin believed [that Brian had read __]]]? |
| b. | That was the book which Anna said [that Sunniva thought [that Kristin believed [that Brian had read __]]]. |
Although long-distance filler-gap dependencies are possible, it has been known at least since Ross (1967) that trying to relate a filler to a gap inside specific constituents leads to unacceptability. These domains are called islands. Several constituent types have been identified as islands, including subject phrases (nominal or clausal), certain adjuncts, embedded questions (EQs), and relative clauses (RCs) (Chomsky 1973, 1977; Huang 1982; Ross 1967; Stepanov 2007). Examples of these island types are given in (3).
(3) | a. | Subject |
| | *Which boy did you think that [the mother of __] was interesting? |
| b. | Adjunct |
| | *Which boy did Christian talk to Odd [after Anne yelled at __]? |
| c. | Embedded Question |
| | *Which boy did Odd remember [what __ was called]? |
| d. | Relative Clause |
| | *Which cake did you meet the woman [who made __]? |
Since the discovery of island effects, researchers have sought to explain why they arise. A dominant tradition has sought to explain island effects as arising from universal syntactic conditions on A’-movement operations (Chomsky 1973, 1977, 1986, 2000; Cinque 1990; Huang 1982).² The traditional syntactic approach predicts, all else equal, that island effects should be observed with all dependencies that are derived via A’-movement, such as wh-movement and relativization (Chomsky 1977; Schütze et al. 2015).
An alternate functionalist tradition attributes island effects to discourse-pragmatic factors grounded in the information status of different elements in a sentence (e.g., Erteschik-Shir 1973; Goldberg 2006; Kuno 1987; Van Valin 1995). Particulars of individual accounts differ, but most employ the distinction between items that are in focus (those that correspond to or request new information) and those that are backgrounded (e.g., items that are given or discourse-old). The underlying intuition behind many of these accounts is that island effects arise when prominent or focused items are linked to gaps in backgrounded constituents. For example, Goldberg (2006) proposed that all filler-gap dependencies place the filler in a discourse-prominent position, which is incompatible with gaps that fall inside backgrounded constituents. As a result, the account predicts that backgrounded constituents are islands for filler-gap dependencies.
(4) | Backgrounded Constituents are Islands (BCI)
|
| Backgrounded constituents may not serve as gaps in filler-gap constructions. (Goldberg 2006, p. 135) |
In apparent contradiction to the predictions of both traditional syntactic accounts and discourse-based accounts such as Goldberg (2006), recent experimental research suggests that certain island effects may vary as a function of A’-dependency type (Abeillé et al. 2020; Bondevik et al. 2021; Kush et al. 2018, 2019; Sprouse et al. 2016). The extent of cross-dependency variation is, however, not well established. Moreover, the conclusion that different dependency types yield different island effects has been drawn from comparisons across experiments. Few studies have directly compared different dependency types within a single experiment.
The first goal of this paper, therefore, is to more systematically map the empirical landscape in one language, Norwegian, through a side-by-side comparison of island effects with wh- and RC-dependencies.
The second goal of the paper is to evaluate our results against a new discourse-based account of island effects, the Focus-Background Conflict Constraint (henceforth FBCC) put forward by Abeillé et al. (2020), which was developed specifically to account for cross-dependency variation in island effects. To keep the size of the paper manageable, we focus primarily on the FBCC and do not attempt to exhaustively cover how prior syntactic and discourse-based approaches could or would account for our findings.
Before we present our experiment and the results, the remainder of the introduction reviews the FBCC and provides some relevant background on islands in Norwegian.
1.1. The Focus-Background Conflict Constraint
Abeillé et al. (2020) proposed a new discourse-based constraint intended to account for island effects:
(5) | Focus-Background Conflict Constraint (FBCC)
|
| A focused element should not be part of a backgrounded constituent. (Abeillé et al. 2020, ex. 8) |
According to the FBCC, whenever a focused filler is associated with a gap inside a backgrounded constituent, a clash in discourse status occurs, causing the sentence to be infelicitous (rather than syntactically ill-formed). This infelicity results in a decrease in acceptability.³ The FBCC links islandhood to backgroundedness, but unlike Goldberg’s BCI (4), the FBCC is stated in such a way that it does not uniformly treat backgrounded constituents as islands for all filler-gap dependencies. Instead, the FBCC holds that backgrounded constituents are only islands for dependencies where the filler is focalized. Wh-dependencies put the questioned element into focus (Jackendoff 1972) by seeking new information, so wh-extraction from a backgrounded constituent is predicted to be unacceptable. RC-dependencies, however, do not place the filler (the head of the RC) into focus, because the function of a standard RC is to add information to a given entity. Therefore, the FBCC predicts that RC-dependencies into backgrounded constituents should be felicitous.
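The contrast between the BCI in (4) and the FBCC in (5) can be made concrete with a minimal sketch. The code below is only an illustration of the reasoning in the text, not part of either proposal: the class and function names are hypothetical, and discourse status is simplified to two Boolean properties (whether the dependency focuses its filler, and whether the domain containing the gap is backgrounded).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """A filler-gap dependency type (hypothetical helper for illustration)."""
    name: str
    filler_focused: bool  # does the construction place the filler in focus?


# Assumption taken from the text: wh-questions focus the filler, standard RCs do not.
WH = Dependency("wh-question", filler_focused=True)
RC = Dependency("relativization", filler_focused=False)


def bci_predicts_penalty(dep: Dependency, domain_backgrounded: bool) -> bool:
    """BCI (4): any gap inside a backgrounded domain is penalized."""
    return domain_backgrounded


def fbcc_predicts_penalty(dep: Dependency, domain_backgrounded: bool) -> bool:
    """FBCC (5): a penalty arises only if the filler is focused AND the domain is backgrounded."""
    return dep.filler_focused and domain_backgrounded


for dep in (WH, RC):
    print(dep.name,
          "BCI:", bci_predicts_penalty(dep, domain_backgrounded=True),
          "FBCC:", fbcc_predicts_penalty(dep, domain_backgrounded=True))
```

Under these simplifying assumptions the two constraints diverge only for relativization into a backgrounded domain, which is the case the experiments are designed to probe.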
Abeillé and colleagues tested the predictions of the FBCC by investigating the acceptability of wh- and RC-dependencies into nominal subject phrases in English and French, which they argued are backgrounded by default. The authors motivate the backgrounded status of subject phrases using a (corrective) negation test (Erteschik-Shir 1973; Van Valin 1995; Van Valin and LaPolla 1997). The test relies on the intuition that constituents can only be negated or denied if they are contained in the part of the sentence that is asserted/focused. The authors note (p. 19) that ‘[i]n a neutral context, it is more felicitous to negate (part of) the object than (part of) the subject.’ This explains the difference between (6-a) and (6-b).
(6) | a. | A: The football player liked the color of the car. |
| | B: No, the size of the car. |
| b. | A: The football player liked the color of the car. |
| | B: #No, the baseball player. |
As we will see later, it is unclear whether this test reliably diagnoses backgrounded constituents in other constructions, but for the moment we take the distinction at face value. According to Abeillé and colleagues, the relative infelicity of (6-b) indicates that the subject phrase is backgrounded. Therefore, the account predicts that extraction of a wh-filler from inside a subject should result in an island effect. No island effects are predicted, however, for RC-dependencies from the same subjects.
Across multiple experiments the authors investigated the acceptability of English wh- and RC-dependencies with PP fillers (pied-piping, as in (7-a) and (7-b)) and NP fillers (preposition stranding, as in (7-c) and (7-d)) from definite subject NPs.
(7) | a. | Pied-piping from Subject, Wh-question |
| | Of which sportscar did [the color __] delight the baseball player because of its surprising luminance? |
| |
| |
| b. | Pied-piping from Subject, RC-dependency |
| | The dealer sold a sportscar, of which [the color __] delighted the baseball player because of its surprising luminance. |
| |
| |
| c. | P-stranding in Subject, Wh-question |
| | Which sportscar did [the color of __] delight the baseball player because of its surprising luminance? |
| |
| |
| d. | P-stranding in Subject, RC-dependency |
| | The dealer sold a sportscar, which [the color of __] delighted the baseball player because of its surprising luminance. |
Experiments 2 and 3 of Abeillé et al. (2020) compared sentences such as those above with counterpart sentences in which the wh- and RC-fillers were associated with gaps inside NPs in object position (e.g., (8)) and with unquestionably ungrammatical baseline sentences (9).⁴
(8) | a. | Pied-piping from Object NP, Wh-question |
| | Of which sportscar did the baseball player love [the color __] because of its surprising luminance? |
| |
| |
| b. | Pied-piping from Object NP, RC-dependency |
| | The dealer sold the sportscar of which the baseball player loved [the color __] because of its surprising luminance. |
| |
| |
| c. | P-stranding in Object NP, Wh-question |
| | Which sportscar did the baseball player love [the color of __] because of its surprising luminance? |
| |
| |
| d. | P-stranding in Object NP, RC-dependency |
| | The dealer sold the sportscar which the baseball player loved [the color of __] because of its surprising luminance. |
(9) | a. | Ungrammatical Baseline, Wh-question |
| | *Which sportscar did the baseball player love the color because of its surprising luminance? |
| |
| |
| b. | Ungrammatical Baseline, RC-dependency |
| | *The dealer sold a sportscar, which [the color __] the baseball player loved because of its surprising luminance. |
The results of the experiments showed that extraction from object phrases was generally more acceptable than from subject phrases, irrespective of dependency type. The acceptability of extraction from subjects varied by dependency type and by the category of the filler. For wh-questions, both pied-piping and P-stranding dependencies were judged as unacceptable as the ungrammatical baseline (9-a). For RC-dependencies, while P-stranding dependencies were judged as unacceptable as the corresponding ungrammatical baseline (9-b), pied-piping dependencies were judged significantly more acceptable and on par with grammatical P-stranding from an object NP (8-d).
Abeillé and colleagues argue that the results broadly support the FBCC. The unacceptability of wh-extraction from subject phrases is predicted. The authors also contend that the results of the RC-experiments align with the FBCC. Without any auxiliary assumptions, the FBCC predicts that both pied-piping and P-stranding RC-dependencies into subjects should be acceptable. The prediction for pied-piping is arguably borne out in English (and in French). However, the unacceptability of P-stranding is inconsistent with the simple predictions of the FBCC. To accommodate the P-stranding results, Abeillé and colleagues argue that there is an additional constraint, independent of the FBCC, that renders P-stranding (inside subjects) unacceptable. They speculate that the factor could be grounded in processing difficulty. We find the possible explanations proposed in Abeillé et al. (2020) unlikely,⁵ but for the purposes of the paper we remain agnostic as to why there are differences between P-stranding and pied-piping from nominal subjects.
With the caveat above, the acceptability of pied-piped RC-movement from subjects provides suggestive support for the FBCC. As the FBCC is proposed as a general constraint, it is expected to apply beyond subjects to other domains that have been considered islands. The prediction of the FBCC is that—all else equal—any domain that is backgrounded should block wh-dependencies, but should permit RC-dependencies. Our experiment tests these general predictions in Norwegian based on three domains: adjuncts, embedded questions, and (existential) RCs. We also test extraction with P-stranding from nominal subjects as an unacceptable baseline against which to compare the results of the other domains.
1.2. Norwegian
Native speakers of Mainland Scandinavian languages such as Norwegian, Swedish, and Danish are consistently reported to accept and produce filler-gap dependencies into domains that are considered islands in many other languages (see, among others, Christensen 1982; Engdahl 1982, 1997; Erteschik-Shir 1973; Lindahl 2017; Maling and Zaenen 1982; Taraldsen 1982). It has been observed that Norwegian permits filler-gap dependencies into embedded questions and (some types of) relative clauses. The following sentences are examples of such dependencies found in a recent corpus study of children’s books (Kush et al. 2021, pp. 22, 25):
(10) | Embedded Question |
| |
| Han ene typen vet vi jo ikke engang [hva __ heter]. |
| he one guy.def know we prt neg even what is.called |
| |
| ‘That one guy, we don’t even know what __ is called.’ |
| ≈ ‘That one guy, we don’t even know the name of.’ |
(11) | Relative Clause |
| |
| Det er det ingen [som __ vet __] |
| that is it no.one rel knows |
| |
| ‘That, there is no one who knows __.’ |
| ≈ ‘No one knows that.’ |
The acceptability of sentences such as those above in Norwegian (and Swedish and Danish) has led some researchers to posit parametric differences in the syntactic islandhood of EQs and RCs between Mainland Scandinavian on the one hand and languages such as English on the other, where extraction from EQs and RCs incurs a more reliable cost.⁶ According to these accounts, the underlying structure of EQs and RCs in Mainland Scandinavian makes it possible to move out of EQs and RCs without violating locality rules on movement, thus rendering the data compatible with traditional syntactic accounts (Lindahl 2017; Nyvad et al. 2017; Vikner et al. 2017).
Island-insensitivity beyond EQs and RCs is not as well-established. The formal literature has largely assumed that subjects are islands for all filler-gap dependencies in Norwegian. This assumption has recently received support from experiments that have shown that sentences such as (12) are consistently rated as unacceptable (Bondevik et al. 2021; Kush and Dahl 2020; Kush et al. 2018, 2019).
(12) | Subject |
| |
| *Hvilken gutt syntes du at [mora til __] var interessant? |
| which boy thought you that mother.def to was interesting |
| |
| ‘Which boy did you think the mother of __ was interesting?’ |
The islandhood of adjuncts is also less often discussed. A reference grammar of Norwegian (Faarlund 1992, p. 117) provides examples of apparently acceptable topicalization out of tensed (temporal) adjunct clauses in (13).⁷ However, Bondevik et al. (2021) found that while topicalization from conditional adjuncts did not result in island effects, topicalization from reason and temporal adjunct clauses did. This suggests that a more nuanced understanding of the islandhood of different adjuncts may be required.
(13) | Adjunct |
| a. | Det blir han sint [når jeg sier __]. |
| | that becomes he angry when I say |
| | ‘That he becomes angry when I say __.’ |
| b. | Den saken venter vi her [mens de fikser __]. |
| | that case.def wait we here while they fix |
| | ‘That case we wait here while they fix __.’ |
In sum, prior work shows that filler-gap dependencies are in principle possible into EQs and RCs (and perhaps some adjuncts) in Norwegian.
Though dependencies into EQs, RCs and possibly adjuncts are reported, the acceptability of extraction from different constituents may vary by dependency type (wh-movement, relativization and topicalization). The majority of documented examples of extraction from RCs feature topicalization (Taraldsen 1982; see also Engdahl 1997 and Lindahl 2017). In the parsed child-fiction corpus of Norwegian bokmål (part of NorGramBank, see Rosén et al. 2009), Kush et al. (2021) found that all instances of extraction from RCs were topicalization dependencies. Attested examples of extraction from EQs usually feature either RC-movement or topicalization: Kush et al. (2021) found that of the 404 examples of extraction from EQs in their corpus, 319 featured relativization and the remaining 85 examples were topicalization dependencies. Wh-question dependencies are conspicuously absent in most collections of naturally occurring examples.⁸ The lack of any examples with wh-extraction from these domains is potentially surprising given earlier claims that, in principle, nothing blocks such dependencies in Norwegian (e.g., Maling and Zaenen 1982).
Recent judgment studies paint a roughly similar picture: Kush et al. (2018) did not find wh-extraction to be acceptable in Norwegian for extraction from subjects, conditional adjuncts, relative clauses, or complex NPs. A smaller island effect was found for wh-movement from whether EQs. When investigating topicalization on the other hand, Kush et al. (2019) found that contextually-supported topicalization from EQs was acceptable (though topicalization without context did produce an island effect), while judgments of topicalization from RCs were variable. Topicalization from subjects and complex NPs was, however, unacceptable. Interestingly, the authors also found that topicalization from conditional adjuncts did not produce island effects, an effect which Bondevik et al. (2021) replicated. Finally, Kush and Dahl (2020) confirmed that relativization from EQs did not produce island effects.
Given the variation discussed above, we reasoned that Norwegian was a good language in which to systematically test for differences in island effects across dependency type. An added benefit of testing Norwegian is that it may also offer us the opportunity to isolate discourse-based (or non-structural) factors that influence the acceptability of ‘island violations’ and that are independent of syntactic constraints in domains such as EQs and RCs, if those domains are assumed to not be syntactic islands.
3. Results
Participants rated bad filler sentences low (mean z-score = −0.91). The average rating of bad fillers is marked on each interaction plot with the dotted line labeled ‘BF’ to give a sense of the lower bound of unacceptability. Good fillers, which varied in complexity, received an average rating close to z = 0, represented by the dotted line labeled ‘GF’. Aggregated together, all good items (filler and test) were rated close to z = 0.51 (‘GI’ in Figure 1). Ratings on these trials indicate that the participants understood and performed the task as expected. Below we present the results for each of the island types in turn.
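For readers unfamiliar with z-scored ratings, the following is a minimal sketch of how raw judgments are commonly standardized. It assumes that each participant's ratings were z-scored against that participant's own mean and standard deviation, which is common practice in acceptability studies; the procedure actually used here is described in the methods section, not in this one. The function name and the example ratings are hypothetical.

```python
from statistics import mean, stdev


def zscore_by_participant(ratings):
    """Standardize raw ratings within each participant.

    ratings: dict mapping a participant ID to a list of raw ratings.
    Returns a dict of the same shape with z-scored ratings.
    Assumes each participant has at least two ratings that are not all equal.
    """
    standardized = {}
    for participant, raw in ratings.items():
        m = mean(raw)
        s = stdev(raw)  # sample standard deviation
        standardized[participant] = [(r - m) / s for r in raw]
    return standardized


# Hypothetical raw ratings on a 7-point scale from two participants
# who use the scale differently.
raw = {"p1": [1, 4, 7], "p2": [5, 6, 7]}
print(zscore_by_participant(raw))
```

After standardization, both hypothetical participants' lowest, middle, and highest ratings map onto the same z-values, which is why z = 0 can be read as a participant's own average rating.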
3.1. Subjects
Statistical analysis revealed a significant Structure × Distance × Dependency interaction (LMEM: β = 0.52, t = 3.27, p = 0.0037; CLMM: β = 2.24, z = 3.64, p = 0.0003), indicating that the size of the Structure × Distance island effect varied across dependency type. Follow-up analysis revealed significant Structure × Distance interactions for RC-dependencies (LMEM: β = −0.67, t = −5.66, p < 0.0001; CLMM: β = −1.19, z = −2.61, p = 0.0090) and wh-dependencies (LMEM: β = −1.18, t = −10.4, p < 0.0001; CLMM: β = −3.24, z = −6.65, p < 0.0001). The Structure × Distance interaction effect was larger for wh-dependencies than for RC-dependencies (DD = 1.15 v. DD = 0.67, respectively). The difference in size of the interaction effect appears largely driven by the reduced average acceptability of the RC-dependency in the Long-noIsland condition (z = 0.22) compared to the wh-dependency (z = 0.48). The average acceptability of wh-movement from a subject (z = −0.66) and RC-movement from a subject (z = −0.65) did not differ significantly.
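The DD scores reported in this section summarize the size of the Structure × Distance interaction. The sketch below assumes the standard differences-in-differences definition used in factorial island designs: the acceptability drop from noIsland to Island in the Long conditions, minus the same drop in the Short conditions. The function name and the condition means in the example are hypothetical and are not the means observed in the experiment.

```python
def dd_score(means):
    """Differences-in-differences (DD) score for a 2 x 2 island design.

    means: dict mapping (distance, structure) to a mean z-scored rating,
           with distance in {"Short", "Long"} and
           structure in {"noIsland", "Island"}.
    A positive DD means the Island structure lowers acceptability more
    in the Long conditions than in the Short conditions.
    """
    drop_long = means[("Long", "noIsland")] - means[("Long", "Island")]
    drop_short = means[("Short", "noIsland")] - means[("Short", "Island")]
    return drop_long - drop_short


# Hypothetical condition means (NOT the values from this study).
example = {
    ("Short", "noIsland"): 1.0,
    ("Short", "Island"): 0.75,
    ("Long", "noIsland"): 0.5,
    ("Long", "Island"): -0.5,
}
print(dd_score(example))  # 0.75
```

Because the score is a difference of differences, the same DD can arise from a low Long-Island mean or from an unusually high or low mean in one of the other three cells, which is why the condition means and rating distributions are inspected alongside the DD scores.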
Ratings distributions by condition are presented in Figure 2. Ratings across Short conditions were nearly all at the top end of the scale (z ∼ 1). Ratings were differently distributed in the Long-noIsland versus Long-Island conditions. Ratings in Long-noIsland conditions were largely distributed around z = 1, though there was a longer left tail indicating that participants rated the occasional Long-noIsland sentence as degraded. In contrast, ratings in the Long-Island conditions mostly grouped around the lower end of the scale (z < −1), indicating that participants overwhelmingly perceived the sentences as deeply unacceptable.
3.2. Embedded Questions
Statistical analysis revealed a Structure × Distance × Dependency interaction that was at the threshold of significance in the LMEM (β = 0.34, t = 2.05, p = 0.0508) and marginally significant in the CLMM (β = 1.22, z = 1.82, p = 0.0682). Resolving the three-way interaction revealed that while there was an island effect for wh-dependencies as manifested by a significant Structure × Distance interaction (LMEM: β = −0.48, t = −4.33, p = 0.0003; CLMM: β = −1.37, z = −3.59, p = 0.0003), no such effect was found for RC-dependencies. The DD score was larger for wh-dependencies than for RC-dependencies (DD = 0.47 v. DD = 0.15, respectively).
Visual inspection of Figure 1 suggests that the difference in interaction size across dependency type is mostly due to differences in the acceptability of the Long-noIsland conditions (Wh: z = 0.28 v. RC: z = −0.01), not differences between the Long-Island conditions. The average acceptability of wh-movement from an EQ (z = −0.21) is relatively close to the mean acceptability of RC-movement (z = −0.08), and post hoc comparisons revealed that the numerical difference between the conditions was not significant (p > 0.1).
Ratings distributions are presented in Figure 3. Ratings in Short conditions were nearly all high, whereas ratings in Long conditions were more variable. The variable ratings of RC-dependencies in the Long-Island and Long-noIsland conditions overlap completely, confirming that participants did not perceive RC-movement from EQs as marked compared to RC-movement from embedded declaratives. For wh-dependencies, ratings in the Long conditions were also variable, but there was slightly less overlap between the Long-Island and Long-noIsland distributions. On the one hand, participants were slightly less likely to give high ratings to wh-extraction from EQs than to wh-extraction from embedded declaratives. This could be interpreted as evidence for a penalty. On the other hand, if we compare judgments of Long-Island sentences across dependency type, we see that ratings in the Long-Island Wh and the Long-Island RC conditions are nearly identically distributed. This could be taken to suggest that participants did not perceive wh-movement from EQs to be worse, in the absolute sense, than RC-movement from EQs.
3.3. Adjuncts
Statistical analysis revealed a significant Structure × Distance × Dependency interaction in the CLMM (β = 1.36, z = 2.33, p = 0.0199), though the effect was only marginally significant in the LMEM (β = 0.32, t = 1.84, p = 0.0819). Effect sizes differed between the two dependency types (DD = 0.47 for wh-dependencies v. DD = 0.11 for RCs). We ran a separate analysis for each dependency type, which revealed an absence of an island effect for RC-dependencies as manifested by a non-significant Structure × Distance interaction (LMEM: p = 0.5; CLMM: p = 0.9). Visual inspection of Figure 1 confirms the absence of an island effect for RC-dependencies. There was a significant Structure × Distance interaction for wh-dependencies (LMEM: β = −0.41, t = −2.61, p = 0.0214; CLMM: β = −1.35, z = −3.11, p = 0.0019). The interaction is notable, however, in that wh-movement from a conditional adjunct was rated higher on average (z ≈ 0.25) than RC-movement (z ≈ −0.21). Post hoc comparisons revealed that this difference was significant (p < 0.05).
Figure 4 shows that participants’ ratings across Short conditions were generally high (∼+1). The distribution of judgments in Long conditions differed across dependency type. Judgments in the Long-noIsland-Wh dependency condition were mostly high, similar to judgments in Short conditions. Judgments in the Long-Island-Wh dependency condition were more variable. The distribution suggests relatively polar responses across trials with a larger cluster around z = +0.75 and a smaller cluster around z = −1. It seems that the majority of trials were rated around +0.75, suggesting that the sentences were judged acceptable more often than they were rejected. Ratings of Long sentences for RC-dependencies had qualitatively different distributions. Ratings of Long-noIsland sentences had a mode at the top of the scale, but many sentences were rated as less acceptable to some degree. In the corresponding Long-Island-RC dependency condition, the ratings are centered around the midpoint of the scale with substantial variance. If we compare judgments in the Long-Island conditions across dependency type, it appears that participants were more likely to give a high acceptability score to wh-movement from a conditional than to RC-movement, despite the fact that an ‘island effect’ is only observed with wh-movement.
3.4. Relative Clauses
For Relative Clauses, the Structure × Distance × Dependency interaction was significant in the LMEM (β = −0.44, t = −2.29, p = 0.0368) and marginally significant in the CLMM (β = −2.16, z = −1.89, p = 0.0587). Resolving the three-way interaction revealed Structure × Distance interactions for both RC- (LMEM: β = −0.95, t = −9.22, p < 0.0001; CLMM: β = −5.48, z = −5.3, p < 0.0001) and wh-dependencies (LMEM: β = −0.51, t = −2.8, p = 0.0145; CLMM: β = −2.95, z = −3.44, p = 0.0006). The interaction observed for RC-dependencies resembles a standard island effect, such that the Long-Island condition is rated significantly worse than the Long-noIsland condition. The interaction with wh-dependencies does not resemble the typical interaction pattern. First, there is not a significant difference between the average acceptability of the Long-Island and Long-noIsland conditions. The interaction appears to be driven entirely by extremely high acceptability ratings in the Short-Island condition. We attribute the high ratings to the relative simplicity of the structures used in these conditions. As discussed in Section 2.1, we were forced to deviate from a strict factorial design in the Short-Island condition. Therefore it seems inappropriate to use DD scores to quantify the ‘RC island effect’. Instead, the most informative comparison for determining whether there is an island effect is to compare the mean ratings in the Long-Island (z = 0.59) and Long-noIsland (z = 0.60) conditions. We interpret the negligible difference between the two Long conditions as evidence that there is no island effect for wh-extraction from an existential RC.
Rating distributions by condition are presented in Figure 5. Similar to other domains, Short conditions received consistently high ratings. Looking at wh-dependencies, where we observed no island effect, we see that the Long-Island and Long-noIsland distributions are nearly identical, indicating that participants did not distinguish wh-movement out of an existential RC from wh-movement out of a declarative complement clause. Interpreting the ratings of Long RC-dependencies is less straightforward. Participants generally rated sentences from the Long-noIsland condition high, indicating that they judged RC-movement from a declarative complement clause acceptable. Ratings of RC-movement from existential RCs, however, show considerable variation and no clear mode. Insofar as the distribution is clearly different from the Long-noIsland condition, the conclusion that there is an island effect of some sort is supported. It seems, however, that the island effect does not reflect uniform rejection of the dependencies (as seen with movement from subjects).
4. Discussion
We found that statistically significant Distance × Structure effects varied by domain and dependency type. These ‘island’ effects indicate that some extractions resulted in decreases in acceptability that could not be accounted for by main effects of Structure and Distance alone. We found that significant effects (i) often reflect highly variable judgments in the ‘island-violating’ Long-Island condition and (ii) do not always entail that ‘island violations’ are unacceptable in absolute terms. In what follows we discuss effects by domain and how our results align with predictions of the FBCC.
4.1. Subjects
We observed large island effects for both RC- and wh-extraction from the subject phrases we tested. We saw that the size of the island effects differed by dependency type, but we reasoned that the statistically significant differences were not practically or theoretically meaningful in that participants reliably rejected RC- and wh-dependencies into subjects. Regardless of its origins, the subject island effect provides a benchmark for a large, consistent island effect against which we can compare other effects in the study.
We do not draw conclusions about whether the island effect we observed is consistent with the predictions of the FBCC because our items used preposition stranding, which Abeillé et al. (2020) argued was unacceptable for independent reasons. We point out, however, that if preposition stranding causes the problem, the explanation for the unacceptability cannot be that readers could not locate the gap site. The stranded preposition marked the gap site very clearly. It is also unlikely that the explanation can be linked to a preference for pied-piping, since pied-piping is not an option in Norwegian RC-dependencies, and it is not used in wh-questions in standard varieties.
4.2. Embedded Questions
Replicating the findings of Kush and Dahl (2020), we found that relativization of a subject from an EQ did not result in a significant island effect. We observed a significant island effect for wh-extraction from the same EQs, though this island effect was smaller (DD = 0.49) than our subject island effects (DDs = 1.14). Since we replicate the absence of an island effect for relativization, we conclude that the ambiguity between demonstrative relativization and clefting did not have an effect on the acceptability of extraction from EQs.
Although there was an island effect for wh-movement, the effect was largely due to differences in the average acceptability of wh- and RC-extraction from declarative complements. The average acceptability of wh-extraction from EQs was not significantly different from RC-extraction from the same EQs. Further, judgments of wh-extraction and relativization from EQs exhibited nearly identical variability.
If EQs are more backgrounded than declarative complement clauses, the FBCC predicts that we should see a penalty for wh-extraction from an EQ compared to wh-extraction from a declarative complement. A comparable penalty should not be observed for RC-extraction. A proponent of the FBCC might interpret the island effect we observed as consistent with this prediction.
We think, however, that there are also reasons to treat the interaction with caution. First, the small interaction effect could simply be an artifact of a ceiling effect. As discussed, the interaction emerges for wh-dependencies because there is a pairwise difference between the Long-noIsland and the Long-Island conditions, but not between the Short conditions. However, both Short conditions are rated essentially at the top of the scale, where potentially meaningful acceptability differences may be compressed. Second, the average acceptability ratings and their distributions in the Long-Island condition were nearly identical for wh- and RC-extraction. The similarities make it hard to conclude that participants perceived wh-extraction as ‘worse’ than RC-extraction.
4.3. Adjuncts
We found that relativization from a conditional adjunct did not result in a significant island effect, similar to English results from Sprouse et al. (2016).
Wh-extraction yielded a statistically significant island effect, but the effect was small (DD = 0.44) because the mean rating of wh-extraction from a conditional (z ≈ 0.25) was relatively high. It was above the average rating of the good fillers in the experiment and significantly higher than the average rating of relativization from a conditional adjunct. Thus, wh-extraction from conditionals appears to be, on average, ‘acceptable’ despite the island effect. The distribution of judgments confirmed that most participants considered wh-extraction from an adjunct to be acceptable more often than not: Participants rated the sentences near the top of the scale on the majority of trials, though they rated the sentences at the bottom end of the scale on the remaining trials.
We now turn to how our results square with the FBCC. The absence of an island effect for relativization from conditional adjuncts is consistent with the FBCC insofar as the FBCC does not predict island effects for relativization from any constituent. The significant island effect for wh-extraction is potentially consistent with the FBCC.
Once again, we think that the interaction effect, and the judgment distributions underlying that effect, do not unequivocally support the FBCC. We saw that the relatively high mean rating of wh-extraction from an adjunct was the result of averaging over a judgment distribution that had a mode at the top of the scale and a smaller proportion of judgments at or below zero. That is, participants were more likely, on balance, to judge wh-extraction from an adjunct just as acceptable as from an embedded declarative. If conditional adjuncts are uniformly backgrounded, we would expect a reliable penalty for wh-extraction from them: participants should have rated wh-extraction from an adjunct to be less acceptable than from a declarative on a majority of trials. This is not what we see. It seems instead that insofar as there is a penalty, it is observed inconsistently, on a small number of trials.
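The contrast between a uniform penalty and an inconsistent one can be illustrated with a minimal sketch that computes the share of trials rated at or below a threshold. The function name, the threshold of zero, and the trial-level ratings are hypothetical and chosen only for illustration; they are not our data.

```python
def share_at_or_below(ratings, threshold=0.0):
    """Return the proportion of trial-level ratings at or below `threshold`."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(r <= threshold for r in ratings) / len(ratings)


# Hypothetical z-scored ratings (NOT experimental data) for wh-extraction
# from an adjunct: most trials near the top of the scale, a few low ones.
hypothetical_trials = [1.1, 0.9, 1.0, 0.8, -0.7, 1.2, -0.9, 0.95]

low_share = share_at_or_below(hypothetical_trials)

# Under a uniform penalty we would expect low ratings on most trials.
majority_penalized = low_share > 0.5
print(f"share at or below zero: {low_share:.2f}; majority penalized: {majority_penalized}")
```

A distribution of this shape yields a mean that is pulled down by a minority of low ratings, even though most trials pattern with acceptable sentences.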
A proponent of the FBCC could accommodate the inconsistent unacceptability of wh-extraction by letting the backgroundedness of conditional adjuncts vary. Under this interpretation, participants rated wh-extraction from conditional adjuncts acceptable on trials where they interpreted the conditional as part of the focus domain and rejected wh-extraction on trials where they interpreted the adjunct as backgrounded. If variability in backgroundedness is behind the judgment variability we observed, there is a simple prediction: there should be a negative correlation between individual items’ backgroundedness as measured by the negation test and the acceptability of wh-movement from those adjuncts.
15 We have not conducted the experiments to confirm or falsify this prediction, but have made our items and data publicly available on the project’s OSF page to any researchers who are interested in conducting the experiments.
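As a rough illustration of how the predicted correlation could be checked once item-level backgroundedness scores are available, the following sketch computes a Pearson correlation over items. The scoring scale, the variable names, and all values are hypothetical; no such measurements have been collected, and a real analysis might instead use a rank-based coefficient or a mixed-effects model.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-item values (NOT collected data):
# backgroundedness, e.g. a score derived from the negation test,
# and the mean z-scored acceptability of wh-extraction from the same item.
backgroundedness = [0.2, 0.4, 0.5, 0.7, 0.9]
wh_acceptability = [0.9, 0.6, 0.5, 0.1, -0.4]

r = pearson_r(backgroundedness, wh_acceptability)
print(f"r = {r:.2f}")  # the variable-backgroundedness account predicts r < 0
```

A reliably negative coefficient would be in line with the variable-backgroundedness interpretation, whereas a coefficient near zero would leave the judgment variability unexplained.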
Finally, it should be noted that our results, which seem to suggest that wh-extraction from a conditional is largely acceptable, appear to conflict with the results of Kush et al. (2018), where wh-extraction from conditional adjuncts resulted in large, consistent island effects across three experiments. What is responsible for the differences in extractability? We do not have an iron-clad explanation for the discrepancy, but we suspect that lexical differences between items used in the studies may have played a role: The current experiment adapted adjunct items from Bondevik et al. (2021), which differed from those used in Kush et al. (2018) in two potentially relevant ways. First, items in Bondevik et al. (2021) were constructed relative to a context sentence, which may have indirectly led to more ‘natural-sounding’ items than those used in Kush et al. (2018). Second, items in Bondevik et al. (2021) and our study used a very restrictive set of predicates in the main clause. In all Island conditions, the matrix verb was bli (‘become’), followed by an adjective describing an emotional state (e.g., ‘happy’, ‘angry’, ‘nervous’ and ‘surprised’). In Kush et al. (2018) a wider set of matrix predicates was used (‘complain’, ‘sigh’, ‘protest’, ‘worry’ and ‘become happy’). If the matrix predicate influences the possibility of extraction from an adjunct, as suggested by Truswell (2011) and others, the difference in predicate types could be the source of the apparent discrepancy in results. We encourage more systematic investigation of how different predicates influence the possibility of extracting from conditionals and other adjuncts and whether the observed cross-dependency differences in English would be attenuated with different predicates.
4.4. Relative Clauses
Participants rated wh-extraction from an existential RC just as acceptable as wh-extraction from a declarative complement clause. However, they rated relativization from an existential RC as significantly worse, on average, than relativization from a declarative complement. Where judgments of wh-extraction were consistently acceptable, judgments of relativization exhibited a large degree of variation, ranging across the scale from z = −1 to z = +1.
As we discussed in the Materials section, existential RCs are non-presuppositional and are therefore not backgrounded. As such, the FBCC predicts that they should allow wh-extraction. Our results are consistent with this prediction.
The island effect for RC-movement from existential RCs does not follow from any formalized account that we are aware of. According to the FBCC, RC-movement should, all else equal, be permissible wherever wh-movement is possible. Therefore, the source of the island effect must lie elsewhere. We do not have a concrete proposal for what additional factor(s) could be at play, but our results rule out a simple explanation grounded in complexity or dependency length. One possibility is that it is specifically the combination of demonstrative relativization and an existential RC that causes infelicity or unacceptability. If so, we might predict that sentences with eventive relativization would not be judged as unacceptable:
(25) | Jeg likte faktisk ølet som det var mange som hata __ |
| I liked actually beer.def rel it was many rel hated |
| lit. ‘I actually liked the beer that there were many who hated __.’ |
| ≈ ‘I actually liked the beer that many hated.’ |
The variation in judgments also suggests that RC-movement from existential RCs may not be uniformly unacceptable. It is possible that item-specific factors, individual differences, or some interaction of the two modulate acceptability. For example, participants may have struggled (to varying degrees) to accommodate/imagine a supporting context for relativization across individual items (see Chaves and Putnam 2020 for more discussion). Providing a formal foundation for these intuitions should be one goal of future inquiry.