Code switching by phase

We show that the theoretical construct “phase” underlies a number of restrictions on code-switching, in particular those formalized under the Principle of Functional Restriction (González-Vilbazo 2005) and the Phonetic Form Interface Condition (MacSwan and Colina 2014). This reinforces the fundamental hypothesis that code-switching should be studied using the same tools that we use for monolingual phenomena.


Introduction
Since the notion of "phase" was introduced to linguistic theory by Chomsky [1], a rich body of work has arisen that demonstrates its usefulness as a descriptive tool in syntax as well as the interfaces of syntax with other linguistic modules (see [2] for a clear introduction to phase theory, arguments and development).
However, phases have so far been underused in the linguistic study of code-switching. The only articles that we are aware of that use phases productively are [3,4,5]. This is despite the eloquent argumentation by Mahootian [6] and MacSwan [7] that any restrictions we find on code-switching should be accounted for using the same tools that we use to account for any other phenomenon of linguistic competence. Despite the conspicuous scarcity of phase theory in code-switching research, we believe that phases can be very useful in resolving some long-standing empirical puzzles. In this contribution, we aim to show that this is the case with two particularly intricate examples.
The first example is the network of phenomena that are bundled together in González-Vilbazo's Principle of Functional Restriction (in the original: Prinzip der Funktionalen Restriktion) [8]. Among other effects (which we discuss in due course), this principle excludes code-switching between an auxiliary and a participle. In example (1), we have the German auxiliary hast 'have.2' and the Spanish participle contado, and the result is unacceptable to Spanish/German bilinguals.
The second example involves the restrictions on code-switching within the word. Poplack proposes the Free Morpheme Constraint (FMC), according to which there cannot be any code-switching that separates morphemes within the word [9]. This accounts for the ungrammaticality of one of the most famous examples in the code-switching literature:¹

2. *Estoy eatiendo (Spa/Eng)
   am eating [9] (p. 581)

Poplack's constraint metamorphosed into the PF Disjunction Theorem [7] and later the PF Interface Condition (PFIC) [10]. In these theoretical developments, the impossibility of (2) arises as the consequence of contradictory phonetic requirements imposed on one word.
However, there are numerous counterexamples to these constraints, as summarized in [11], which suggests that the FMC as well as the PFIC are tools that are too blunt to provide an adequate analysis. What we find is that something like (3a) is acceptable while something like (3b) is not:
1 Example (2) is cited for the sake of tradition, but it is in fact not a very good example. The verb eatiendo is inflected in the third conjugation and the third conjugation is not productive in Spanish. Consequently, the unacceptability of (2) can be accounted for quite independently of the FMC.
2 In more recent approaches, what we call v here is split into two heads: v provides the structure with a category label and Voice introduces the external argument (see [20,21]). In this sort of framework, Voice would be a phase head while v would not (or rather, v would be a phase head whenever Voice is absent; see also [22] for some discussion of the role of Voice and v across languages). For the purposes of this article, we maintain the simpler structure in which v performs a double function. Adopting the more complex structure would require some readjustments to our analysis but would not change our fundamental proposal.
Phases have been used in three empirical domains. First, phases have been used in the theory of locality: the complement domain of a phase is claimed to be a domain that is opaque to higher probing and therefore any movement must pass through the specifiers of phase heads.
Second, the head of a phase has been taken to be the locus of grammatical features. An early proposal along these lines is [19]. In this article, Marantz proposes that the lexical verb is nothing but an array of semantic/conceptual features; in fact, it is purely a root without a mark for syntactic category. The root becomes what we call a verb as the result of being selected by v, which additionally attracts the root, forming an incorporated structure. Likewise, Chomsky [23] and Richards [24] put forth the idea that all the features that trigger syntactic dependencies originate in v and C (although a mechanism of "inheritance" ensures that T and V do the actual job of setting up dependencies, see footnote 5). These ideas are crystallized in the Phase Head Hypothesis of González-Vilbazo and López [3,25]:
The Phase Head Hypothesis (PHH): The phase head determines grammatical properties of its complement.
In the mentioned paper, word order, prosody and the expression of information structure are all determined by the phase head. In [4], several pieces of evidence are presented that confirm that the assignment of morphological case to verbal complements is dependent on v. The crucial data come from light verb constructions, which are pervasive in code-switching varieties. In these constructions, the verbal predicate is split into two heads: a lexical head, usually with default verb morphology, and a light verb, which can be translated as 'do' and bears all the grammatical properties of the construction. In [3,4], it was argued that the light verb is a spell-out of v:³
3 Muysken discusses parallel examples in Dutch/Turkish code-switching [27]. He claims that the Turkish light verb and the Dutch verb form a unit, while the Dutch verb and the Dutch noun do not. Problems for such an approach are laid out in [28] on the basis of idiomatic constructions. In (ii), we have a VP-idiom from Dutch, geen reet interesseren 'to have no interest at all', combined with a Turkish light verb. The fact that an idiomatic interpretation is possible at all in the bilingual construction points towards a structure in which the Dutch VP forms a constituent to the exclusion of the Turkish light verb.
A third area where the phase notion has stimulated significant research is the interfaces with interpretive systems. Chomsky links the completion of all operations within a phase with a "Transfer" of the information contained in the complement of the phase to the interpretive systems [29]. This "interpretation by phase" hypothesis has been explored in several pieces of work, particularly in the areas of information structure and PF (see [30,31], among many others).
Let us explore the notion of Transfer in more detail. Consider the diagram in (6):

6.
In this example, C is the head of the phase. Once the C head has entered the derivation and all the grammatical properties of C are satisfied, the TP is transferred to the interpretive components. Exactly the same steps take place in the vP phase. Notice that although v is the head of the lower phase, it transfers with T. Additionally, please note that the clausal structure is probably more complex than (6), even in the simplest examples. T should be regarded as a cover term for functional categories dedicated to tense, mood and aspect. Likewise, the complement of v may not be a root but a complex structure that includes the root and functional categories related to event structure.
Let us now consider nominal phrases. We follow the proposal in [32] that the nominal phrase is headed by K (=case) and assume the structure in (7). The structure in (7) represents the following hypotheses: in a nominal phrase, a root is selected by n, a categorizer. n is a phase head. n is selected by Number, which is itself the complement of D, and the latter is a complement of K. K is also the head of a phase (see [2] for a discussion of nominal phases and the phasehood of n). Notice that, as a consequence, the complement of K and K itself are transferred in different phases, as shown in (8):

Transfer
Languages 2017, 2, 9

{[KP K [DP D [NumP Num} {[nP n [√P √]]]]]}
Thus, in (8), D transfers with Num and n while K transfers with the vP. This will be the case for the internal argument. As for the K head of the external argument, it transfers with C.
We are now in a position to formulate our Block-Transfer Hypothesis:
Block-Transfer Hypothesis (BTH): The material that is transferred to the interfaces is sent in one fell swoop.
Bilinguals have multiple externalization systems. They may even have more than one PF, as MacSwan argues [7], although this is a complex matter since PF is itself a complex grammatical module (see [33] for a discussion of PF in bilingual grammars).⁴ Let us adopt the simple assumption that bilingual speakers have multiple PFs. The BTH ensures that, when a structure is transferred, it is transferred in one block to one of the PFs. The consequence of this hypothesis for code-switching is the following: code-switching may take place at phase boundaries but not within the phase. Code-switching within the phase would entail transferring some material to one externalization system while simultaneously transferring some other material to another externalization system. This is precisely what the BTH prevents.
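The logic of the BTH can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions that are ours and not part of the analysis proper: a clause is flattened into a top-down list of heads, C and v are the only phase heads, each head is tagged with the language whose PF would externalize it, and a phase head transfers together with the material above it (so that v travels with T, as described for (6)). The names Head, transfer_domains and obeys_bth are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    label: str      # category label, e.g. "C", "T", "v", "root"
    language: str   # language whose PF would externalize this head


# Assumption for this sketch: only C and v head phases.
PHASE_HEADS = frozenset({"C", "v"})


def transfer_domains(heads):
    """Split a top-down list of heads into blocks that transfer together.

    A phase head closes the block it belongs to; its complement opens
    the next block. So [C, T, v, root] yields [C], [T, v], [root].
    """
    domains, current = [], []
    for head in heads:
        current.append(head)
        if head.label in PHASE_HEADS:
            domains.append(current)
            current = []
    if current:
        domains.append(current)
    return domains


def obeys_bth(heads):
    """True if every transfer block is sent to a single PF (one language)."""
    return all(
        len({head.language for head in domain}) == 1
        for domain in transfer_domains(heads)
    )


# A switch between an auxiliary in T and a participle in v: same block.
aux_participle = [
    Head("C", "German"), Head("T", "German"),
    Head("v", "Spanish"), Head("root", "Spanish"),
]
# A switch between v and its root complement: different blocks.
v_root = [
    Head("C", "German"), Head("T", "German"),
    Head("v", "German"), Head("root", "Spanish"),
]
print(obeys_bth(aux_participle))  # False
print(obeys_bth(v_root))          # True
```

On these assumptions, a switch between T and v is excluded because both heads sit in the same transfer block, whereas a root may differ in language from the v that selects it.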
Before we proceed with the empirical consequences of the BTH, it is worth pointing out that the BTH is not a novel theoretical construct.Rather, it is an explicit formulation of what is a universal implicit assumption.All the literature on phases that we are aware of takes it for granted that phases transfer in one shot and not piecemeal.Interestingly, this property of phases is only clearly visible in code-switching contexts.

The Principle of Functional Restriction
The PFR states the following (our translation) [8] (p. 67):

Functional restriction
a. Let X and Y be functional categories.
b. Let X and Y be members of the extended projection of the same lexical category.
c. Let L1 and L2 be distinct languages.
d. Then:
9.
In other words, it is illegal to code-switch between two functional heads that belong in the same extended projection.The term "extended projection" is borrowed from [35].As Grimshaw describes extended projections, the entire clause up to and including CP is an extended projection of the verb, while the entire KP is an extended projection of the noun.Thus, the PFR is a very strong and general principle: it forbids code-switching between C and T and between any functional categories within the clause; it also forbids code-switching within the nominal phrase [35].Although we think that the PFR, as stated in [8], is too strong, it does make some correct predictions, as we shall see.
The PFR seeks to account for two classes of phenomena: the impossibility of code-switching between an auxiliary and a participle and the impossibility of code-switching between a complementizer and its complement.We discuss here the first type of phenomenon, postponing the other to the end of the section.
González-Vilbazo's Spanish/German bilingual consultants were adamant in rejecting sentences such as the following, with a German auxiliary and a Spanish participle [8]:
The symmetric form with a Spanish auxiliary and a German participle was not rejected as strongly by González-Vilbazo's participants, for reasons unknown to us. We have tested the following sentences with three Spanish/English early bilinguals and their judgments are certain, in any combination of auxiliary and participle (see also [36] for numerous pieces of data):
Although González-Vilbazo does not discuss it [8], the PFR also prevents code-switching between two auxiliaries. It is indeed the case that code-switching between 'have' and 'be' is ungrammatical, according to our Spanish/English bilingual consultants.

14. *El canciller había been running for office
     the chancellor had
15. *The chancellor had estado presentándose a elecciones
                        been presenting.SELF to elections

We conclude that there is indeed a restriction against code-switching between an auxiliary and its complement, thus confirming the PFR. What is interesting to us is that this aspect of the PFR can be fully accounted for within a phase system. Any heads between C and v belong in the same phase and therefore the BTH predicts that they should transfer together to the same PF module. Let us see how with the tree in (16), which exemplifies the relevant aspects of example (12):

16.
We take it for granted that the English auxiliary have has raised to T and not higher (see [37,38] among many others) while the Spanish verb contado 'told' has raised to v and, possibly, a higher category. The structure in (16) shows that the participle and the auxiliary are transferred in the same phase. The BTH says that they should both be transferred to the same PF. Thus, the BTH correctly predicts that (11)-(15) should be ungrammatical.
Interestingly, the BTH can improve on the PFR with regard to the following apparent counterexample: it is in fact possible to code-switch between the be auxiliary and its complement in a progressive construction [36,39]:
Why is there this difference between the perfect and the progressive structures? The only account that we are aware of is MacSwan's [40]. His proposal is that the auxiliary haber 'have' triggers restructuring while estar 'be' does not. The only reliable test for restructuring is clitic climbing, but both auxiliaries accept clitic climbing. Thus, it is unclear whether restructuring is the way to go. Data such as these are certainly beyond the scope of the PFR: in both perfect and progressive aspect, there is only one lexical item and therefore only one extended projection.
Fortunately, the contrast between (11)-(15) and (17) falls directly under phase theory and the BTH. That is because it has been argued in the recent literature that progressive sentences have a phase barrier between T and the vP. Laka argues for this on the basis of examples such as the following Basque sentence [41]:
Example (18a) is a normal transitive sentence in Basque, in which the external argument is marked with ergative case while the internal argument has no case morphology, usually interpreted as absolutive. Let us further assume that ergative in Basque is a dependent case [42,43], assigned to an argument in a structure if this argument is "in competition with" another argument in the same domain. Keeping this in mind, consider (18b). In this example, both arguments appear in absolutive case. This suggests that they are in different domains and therefore not in competition; in our terms, they are in different phases.
Likewise, Harwood has presented several arguments from English that the progressive aspect is unique in that, when present, it acts as the clause-internal phase head, crucially denying its complement phase status [44]. In other words, when progressive is present, it extends the size of the clause-internal phase. This can be seen in, e.g., the scope of ellipsis: the progressive morpheme [-ing] must always be included within the elliptical constituent, whereas the participle [-en] can be in or out (see [45] for the first description of these facts):
Harwood argues in detail that when progressive is merged, the phrase headed by the predicational vP is extended [44]. Thus, in our example in (17), it is not C that is the phase head, but rather the progressive head realized by estar, which triggers a spell-out of its complement.
A possible approach to an analysis of progressive aspect is hinted at by Laka herself [41]. We should understand the progressive aspect as a locative adpositional phrase or case morpheme. There is evidence for a Prepositional Phrase (PP) analysis of the progressive aspect both in the grammar of English and in a broad comparative swath. A PP analysis for the English progressive was suggested by Bolinger [46], and supportive evidence comes from the discussion of the diachronic development of the progressive out of a structure that contained a P head embedding a deverbal noun [47]. For instance, as observed in [46], progressive forms can be coordinated with PPs, e.g., They are already in position and waiting for the call. In Spanish, as Bybee et al. note [48] (p. 130), the auxiliary used for the progressive is estar, which has its origin in the Latin verb stare 'to stand'. While arguably not much of the original meaning of the Latin source is preserved in estar, in Spanish it is estar, and not ser, that is used to express location and temporary state. Finally, Bybee et al. argue that in the majority of the world's languages, a locative component can be identified in the progressive [48]. Thus, John is laughing = John is at laugh, e.g., Dutch: Jan is aan het lachen. We take it that it is a property of human language that a PP structure underlies the progressive aspect.
Some authors have argued that PPs are phases (see the discussion in [2] and references therein). Following this line of thought, we conclude that the progressive aspect delineates a phase barrier. This is shown in (20):

20.

…
Given the cross-linguistic pervasiveness of the PP structure to express progressive aspect and the independent analysis of English developed by Harwood, we surmise that the structure in (20) is a property of the progressive aspect in Universal Grammar. Since the auxiliary and the progressive morpheme are in different phases, it follows that code-switching between them should be permissible. This accounts for the contrast between the ungrammatical (10)-(15) and the grammatical (17), which is not predicted by the PFR.
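The contrast between the perfect and the progressive can be illustrated with a short sketch. It is a simplification under assumptions of ours: clauses are flattened into top-down lists of (label, language) pairs, a phase head transfers with the material above it, and the progressive head (labelled "Prog" here, a hypothetical label) replaces v as the clause-internal phase head, in the spirit of the analysis of the progressive discussed above. The function names are likewise hypothetical.

```python
def transfer_domains(heads, phase_heads):
    """Split a top-down list of (label, language) pairs into transfer blocks.

    A phase head closes the block it travels with; its complement
    starts a new block.
    """
    domains, current = [], []
    for label, language in heads:
        current.append((label, language))
        if label in phase_heads:
            domains.append(current)
            current = []
    if current:
        domains.append(current)
    return domains


def switch_allowed(heads, phase_heads):
    """True if each transfer block contains material from one language only."""
    return all(
        len({language for _, language in domain}) == 1
        for domain in transfer_domains(heads, phase_heads)
    )


# Perfect: auxiliary in T, participle in v; C and v are the phase heads.
perfect = [
    ("C", "Spanish"), ("T", "Spanish"),
    ("v", "English"), ("root", "English"),
]
# Progressive: the auxiliary is followed by a phase boundary at Prog.
progressive = [
    ("C", "Spanish"), ("T", "Spanish"), ("Prog", "Spanish"),
    ("v", "English"), ("root", "English"),
]

print(switch_allowed(perfect, {"C", "v"}))         # False
print(switch_allowed(progressive, {"C", "Prog"}))  # True
```

In this sketch the verdict for the progressive stays the same if v is also kept as a phase head, since the switch still coincides with the boundary introduced by Prog.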
The other type of code-switching that the PFR is meant to prohibit is code-switching between C and TP. This is an area where, once again, contemporary research on phase theory can provide some insight. González-Vilbazo reports that examples in which the complementizer is in one language and TP is in the other are unacceptable to Spanish/German bilinguals [8]:
In example (21), the complementizer que 'that' is drawn from the Spanish lexicon and T, as reflected in the inflection of the light verb haría 'would do', is also Spanish. Example (22) is ungrammatical because C is Spanish while T is German, as reflected in the verb schreibt 'writes'. The examples in (23) and (24) are the mirror image. Identical judgments are reported in [49] for Spanish/English and French/Arabic code-switching. It seems that this is a real fact about code-switching, for some code-switching pairs.
The unacceptability of (23) and (24) appears, at first blush, to be accounted for within González-Vilbazo's PFR: code-switching between C and TP is ruled out because they are members of the same extended projection. However, closer inspection of the data shows that there are a number of challenges to the generalization that code-switching between C and TP is not possible. It seems to be the case that when C is spelled out with a complementizer, the result tends to sound degraded, as González-Vilbazo argues. However, when the subordinate clause is fronted by a wh-phrase, the sentence is fully acceptable (see [50] for a detailed discussion of wh-movement in code-switching):
Generally speaking, as long as there is some content in Spec,C, code-switching seems to be possible:
What (24) and (26) show is not a general prohibition of code-switching between C and TP, but rather a more specific prohibition of code-switching between C and TP if Spec,C is empty. We are grateful to an anonymous reviewer for the suggestion that [51] can be useful in this context. Ott argues that in free relative clauses, as in matrix CPs, the complementizer is transferred together with its complement. This is because, after feature inheritance, C has no uninterpretable features that need to be valued or checked, and Full Interpretation requires that it be removed.⁵ In embedded interrogatives, the interpretable [Q] feature ensures that C remains present in the next phase.
This suffices to account for the difference in acceptability between (21)-(24) and (25). In (25), C does not transfer with the TP and therefore can be spelled out in a different externalization system. In (22) and (24), C transfers with its complement and the BTH requires that C and TP transfer together; if they do not, we obtain an ungrammatical result. As for (25), notice that the complementizer is not a plain featureless complementizer because it is one part of the two-piece formula puesto que that introduces a causal adjunct.
Additionally, when we find code-switching between C and TP, we see that the TP alters its grammatical properties and becomes more similar to the language of C, regardless of how the syntactic terminals are spelled out. Let us reconsider example (26). The wh-phrase is English and all the lexical items in the subordinate clause are English. Interestingly, the word order in the subordinate clause follows an English pattern. In a fully Spanish clause, one expects to see subject-verb inversion in the presence of an argument wh-phrase. This inversion does not take place in (26) because the complementizer in this clause is an English complementizer, which does not trigger inversion. Something similar is apparent in (27). Notice that the word order in the subordinate clause is Aux+Verb. This is unexpected in a German subordinate clause, which is obligatorily head final (modulo embedded verb second, cf. [52]). González-Vilbazo and López argue that the reason why we obtain this unusual word order is that the complementizer is Spanish, which then imposes a Spanish-like word order on the subordinate clause [25].
Notice that the data in (26) and (27) cannot be accounted for with the PFR, which would predict that both would be ungrammatical. Additionally, notice that the most popular approach to the grammatical properties of code-switching, Carol Myers-Scotton's Matrix Language Frame, does not predict these data either (see [53]). In this model, INFL determines the matrix language, and therefore the word order of the main constituents in the clause. However, (26) and (27) show that this cannot be the case; it must be C that decides on this matter. We obtain a double advantage because the claim that "C does it" can be integrated into the well-developed framework of phase theory while the claim that "INFL does it" is an isolated stipulation.

Switches within the Word
Poplack famously claimed that code-switching could only take place between free morphemes and could never take place between two morphemes within the same word [9]. Later research has shown that this restriction is too strong. There are a number of examples in the literature of apparent code-switching within the word:⁶
5 Chomsky suggests that the uninterpretable phi-features and EPP of T are in fact inherited from C [23]. The notion of feature inheritance is introduced in [23] as a way to solve an apparent paradox of this system: (i) syntactic dependencies emanate from phase heads and (ii) T establishes syntactic dependencies although it is not a phase head. The mechanism of inheritance allows for non-phase heads to establish dependencies.
6 Thus, we have to be careful to distinguish between verbal borrowings and code-switched verbs. See [54] for a comprehensive overview of loanword adaptation strategies in the languages of the world. Wohlgemuth identifies three such strategies, ranging from treating the borrowed verb stem like a native one without any morphosyntactic adaptation, as in (28), through applying a verbalizer of some kind so that the loan verb can then be inflected, as in (29) and (30), to using a light verb strategy, as exemplified in (5) above.
On the other hand, it is also clear that not everything goes. These are four generalizations that we can extract from the literature and our own research:
1. It is possible to code-switch between a derivational morpheme and the root. However, it is not possible to code-switch between a derivational morpheme and an inflectional morpheme [8] (p. 131). This generalization can be exemplified using the word cabreiert 'angered':
This word is formed by taking the Spanish root √cabre, to which the German derivational morpheme [ier] has been attached, and it is inflected with German inflectional morphemes. In contrast, words such as the ones in (34) are ungrammatical:
In (32a), we have added German inflection to the Spanish root with an ungrammatical result. Likewise, in (32b), we have added Spanish participle morphology on a German verbal derivational morpheme and the result is equally ungrammatical. Finally, in (32c), a Spanish inflectional morpheme has been added to the German word stuhl. Although the German word is masculine, and the Spanish inflection is the masculine marker, the result is thoroughly ungrammatical.⁷
2. The previous generalization bears one interesting exception: case morphology. The literature shows many examples that suggest that it is possible to add case morphemes to words that belong "in the other language". However, as far as we know, this has never been remarked upon. The following is an example from our own fieldwork with Turkish/German bilinguals:
Notice in particular that the German word Bewerbung is a complex word, which includes the German nominalizing morpheme [ung].
7 Alexiadou et al. [20] and Alexiadou [57] discuss cases of Greek-German code-switching, where examples of the type in (32c) are grammatical. While the German word is feminine, it is assigned a Greek declension class (neuter) and case. Since, in Greek, exponents of declension class, gender, case, and number are all fused, (iii) could be viewed as a sub-case of our generalization 2:
iii. to matratz-i
     the mattress-NEUT.ACC
3. Although morphemes from two languages can be used to build a word, the phonology of the word has to belong to one or the other language. For instance, in example (29), the pronunciation of cabreiert is "German", with both tokens of /r/ pronounced as velar trills. This fact is accounted for by MacSwan's PF Disjunction Theorem [7] (p.230) (later transformed into the PF Interface Condition (PFIC) in [10] (p.191)). MacSwan's idea can be summarized as follows: PF takes the syntactic word as the unit of analysis. The phonology of a language consists of a set of ranked constraints (see [58] for an introduction to constraint-based phonology). No two languages have the same ranking of constraints: German, for instance, has a high-ranked constraint that forces its voiceless stops to be aspirated, but in Spanish this constraint is ranked very low. Bilinguals have two phonologies, which means that they have two rankings of constraints, one for each language. Thus, if a word such as cabreiert is fed to PF, one set of constraints will have to yield to the other. This phenomenon seems to be related to the structure of the phonological word and therefore phase theory is not part of it. However, it leads us to the fourth puzzle, which is indeed dependent on phase theory.
4. The fourth generalization is the following: the derivational morpheme decides the PF of the whole word. That is, in the word cabreiert, the German suffix decides on the phonology of the whole word, and the [r] of √cabre sounds like a German velar trill and not a Spanish alveolar tap. The experimental design in [59] is particularly apt to show the point. Stefanich and Cabrelli use English nonce words such as zarp and elicit from bilingual subjects the production of these words with Spanish morphology, as in zarpeando [59]. The question is how the subjects pronounce the initial segment: as a voiced fricative (English phonology) or as a voiceless fricative (Spanish phonology). Their results show that bilinguals strongly prefer to pronounce zarpeando with a voiceless initial segment: the affix has influenced the pronunciation of the root. This is a puzzle that falls outside of the scope of the PFIC because the PFIC only requires the pronunciation to be homogeneous; it makes no prediction as to whether the root or the grammatical morpheme should win out.
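The fourth generalization can be illustrated with a small toy model. The sketch below is only an illustration under simplifying assumptions: the class `Morpheme`, the function `phonology_of_word`, the three morpheme labels, and the segmentations of cabreiert and zarpeando are hypothetical conveniences introduced here, not analyses taken from the literature cited above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Morpheme:
    form: str
    language: str
    kind: str  # toy labels: "root", "derivational", "inflectional"


def phonology_of_word(morphemes: List[Morpheme]) -> str:
    """Return the language whose phonology applies to the whole word.

    Toy rule illustrating the fourth generalization: the single
    derivational morpheme decides; the language of the root is ignored.
    """
    deciders = [m for m in morphemes if m.kind == "derivational"]
    if len(deciders) != 1:
        raise ValueError("this sketch expects exactly one derivational morpheme")
    return deciders[0].language


# The segmentations below are simplifying assumptions made for this sketch.
cabreiert = [
    Morpheme("cabre", "Spanish", "root"),
    Morpheme("ier", "German", "derivational"),
    Morpheme("t", "German", "inflectional"),
]
zarpeando = [
    Morpheme("zarp", "English", "root"),
    Morpheme("e", "Spanish", "derivational"),
    Morpheme("ando", "Spanish", "inflectional"),
]

print(phonology_of_word(cabreiert))  # German
print(phonology_of_word(zarpeando))  # Spanish
```

The function returns a single language for the whole word, which mirrors the homogeneity requirement of the PFIC, while the choice of the derivational morpheme as the decider encodes the additional generalization that the PFIC by itself does not predict.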
Let us see how these puzzles can be accounted for using our BTH. The first puzzle tells us that the derivational morpheme does not need to be in the same language as the root but it has to be in the same language as the inflectional morphemes. The phase system accounts for this directly under the assumption that a derivational morpheme is a spell-out of a categorial morpheme, v or n. The root, which is the complement of v or n, is transferred independently of the derivational morpheme. On the other hand, the derivational morpheme and the inflectional morphemes are transferred in the same phase. The following diagram shows this with the help of the example cabreiert:

34. [Tree diagram of cabreiert; recoverable node label: InflP]
The second puzzle tells us that there is one type of inflectional morpheme that is independent of whatever inflectional or derivational morphemes it c-commands: case morphology. Phase theory can account for this directly. Consider again the tree in (8). This tree shows that K is the head of the nominal phase. As a consequence, the complement of K transfers independently of K (K itself transfers with a different phase, either the v phase or the C phase).
The examples in (35) solidify this argument. In Turkish, verbs of motion may govern one of two cases in the complement. They can govern locative case, to indicate location, or dative case, to indicate motion into a place. German has a similar distinction, but in this language dative indicates location and accusative indicates motion into a place. Example (35) is a light verb construction in which the light verb is in Turkish and the lexical verb and the complement are in German. Following [3], we assume that the light verb is the spell-out of v. The complement of v is the phrase parkhaus-a/da fahren, which we take to be a root phrase with some default verbal morphology attached to it (alternatively, we can adopt the assumptions mentioned in footnote 2 and take the light verb to be the spell-out of voice and its complement a vP):

36. [Tree diagram of the light verb construction in (35); recoverable node label: DP]

The Phase Head Hypothesis tells us that the grammatical properties of the phase are dependent on the phase head, including morphological case assignment. Notice that the phase head in (36) is a Turkish v; consequently, as predicted by the PHH, the case morphology on the complement follows the patterns of Turkish (locative for location, dative for motion) and the case morphology spells out as in Turkish. Thus, the tree in (36) shows that K and DP belong in different phases and that K belongs in the v phase. This is all in agreement with the BTH.⁸ We want to elaborate a little more on this. K is a member of the v phase, but it is itself the head of the phase which is composed of the nominal projections up to DP. Since K in (38) is Turkish, we predict that the complement DP must have Turkish properties. This seems to be the case: if the DP had a German structure, it would have to include an overt determiner.

⁸ An anonymous reviewer points out that in (37) the root and the K are in different languages, in apparent contravention of the BTH. This is an interesting observation, for which we can only provide a provisional explanation. The BTH seems to make good predictions with constituents that carry the same categorial feature in a broad sense, i.e., they belong to the same extended projection. Thus, the BTH works with the aspect, mood and tense features of the verb or with the nominal features of the noun. However, it does not seem to affect constituents whose categorial feature is distinct from the head of the phase. In this particular case, K is a nominal feature merged within a verbal phase. Despite the BTH, it seems that K does not need to spell out in the same language as the other constituents in the vP phase.
The data in (37) corroborate this prediction. Auer and Muhamedova show that Russian/Kazakh bilinguals can code-switch between a Kazakh case morpheme and a Russian DP, except that the Russian DP becomes almost unrecognizable because all its grammatical properties have to adapt to the phase head K. In example (37), the Russian noun ploshchod 'square' should trigger feminine concord on the adjective stariy 'old' [60]. However, the adjective appears in a default, masculine form. We argue that the reason lies in the case morpheme: the case morpheme is Kazakh and Kazakh has no gender. Assume that concord, like other grammatical properties, is dependent on the phase head. If concord is triggered by K and K comes from Kazakh, concord is not possible. The Russian noun, on its own, cannot trigger gender concord.
37. anau stariy ploshchod'-ti ne-ler-di zöndedi (Russ/Kaz)
this old square-ACC thing-PL-ACC renovated
'This old square and so (were) renovated.' [60] (p.43)
Thus, example (37) confirms our broader generalization that the head of the phase determines the grammatical structure of its complement.
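The reasoning about (37) can be restated as a short toy computation. This is a minimal sketch under stated assumptions: the table `HAS_GENDER`, the class `Noun`, and the function `concord_on_adjective` are hypothetical names introduced for illustration, and the single rule they implement is a deliberate simplification of the claim that concord depends on the phase head K.

```python
from dataclasses import dataclass

# Toy inventory (assumption for this sketch): does the language have
# grammatical gender? The text states that Kazakh has none.
HAS_GENDER = {"Russian": True, "Kazakh": False}


@dataclass(frozen=True)
class Noun:
    form: str
    language: str
    gender: str  # lexical gender of the noun


def concord_on_adjective(noun: Noun, k_language: str) -> str:
    """Gender value realized on an adjective inside the complement of K.

    Toy rule: concord is available only if the language of the phase
    head K has gender; otherwise the adjective takes a default form.
    """
    if HAS_GENDER[k_language]:
        return noun.gender
    return "default"


square = Noun("ploshchod'", "Russian", "feminine")
print(concord_on_adjective(square, "Kazakh"))   # default
print(concord_on_adjective(square, "Russian"))  # feminine
```

With a Kazakh K the lexical gender of the Russian noun is never consulted, which corresponds to the observation that the noun cannot trigger gender concord on its own.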
Let us now move on to the final puzzle. Recall that the phonetic structure of a word must be decided by the derivational morpheme. The solution to this puzzle should be clear by now: the derivational morpheme is the spell-out of a categorizing morpheme (n, v, etc.), and this morpheme is the head of a phase. The PHH tells us that the head of a phase determines the grammatical properties of the phase. It follows that the phonetic properties of the root will be consistent with those of the derivational morpheme. As Embick argues extensively [12], a phase head fixes the phonology and the interpretation of the root, in addition to providing categorial information. Crucially, in word-internal mixing patterns, we expect one phonology and not two, and this is apparently what we find; see [61].⁹
Let us conclude this section. Traditional approaches to code-switching have claimed that code-switching within the word is unacceptable [7,9,10], but empirical counterexamples have been accumulating over the years. We have shown that these empirical counterexamples are not random; on the contrary, they are fully rational within phase theory.

Conclusions
The theoretical construct "phase" has become a fundamental tool in linguistic theory since it was first proposed by Chomsky [1]. In this contribution, we have shown that phases can also be usefully deployed to account for some old puzzles in code-switching data, to wit, the empirical generalizations formalized in the Principle of Functional Restriction and the PF Interface Condition. Additionally, we have shown that phases have a broader empirical scope than the PFR and the PFIC, since they can account for phenomena that neither the PFR nor the PFIC is designed to analyze. The fundamental hypothesis that code-switching should be studied as any other expression of human linguistic competence is reinforced.
Throughout the article, we have argued that the phase system provides some analytical advantages over its competitors. We argued that the BTH has an advantage over the PFR [8] and the Matrix Language Model [53], to the extent that it provides an account for data surrounding code-switches between C and TP as well as the special case of the progressive aspect. We have shown that it also has an empirical advantage over [7,10], which forbid any form of code-switching within the word. We have shown that code-switching within the word is possible and we have also shown that the phase system explains how it can happen (see also [64]). All in all, it seems to us that phase theory, coupled with distributed morphology, is a promising path to take in the analysis of code-switching.

⁸ (continued) Thus, we reformulate the BTH tentatively as follows: all the constituents in a phase that share a categorial feature are sent to the externalization systems in one block.

⁹ An anonymous reviewer suggests that wanna contraction in English is an example of phonetically dependent material which looks upwards in the phonetics, i.e., outside the phase, since it undergoes T-to-C movement and forms a phonetic unit with the matrix verb. This suggests, as the reviewer observes, that the direction in which the material looks in the phonetics is not pre-ordained (downwards, as in our example, or upwards in wanna contraction). The authors of [61] present evidence that wanna contraction is sensitive to prosodic phrasing, and there is currently a lively debate in the literature as to whether or not prosodic and syntactic phases match. Since in the case of wanna contraction there is movement to a higher head, we can assume, following [63] among others, that this movement extends the phase, which would make the analysis of the wanna pattern compatible with our general assumptions.

Thus, in (8), D transfers with Num and n while K transfers with the vP. This will be the case for the internal argument. As for the K head of the external argument, it transfers with C.
19. a. John has been kissed and Peter has been <e> too.
b. *John is being kissed and Peter is being <e> too.
'Driving inside the parking garage'
