Morphosyntactic Features Versus Morphophonological Features in L2 Gender Acquisition: A Cross-Language Perspective

This paper aims to demonstrate the reliability of morphosyntactic versus morphophonological features in the acquisition of L2 gender of inanimate nouns across languages. Based on Anna Kibort's study “Towards a typology of grammatical features” (2010), the current research proposes that the presence of a gendered determiner is more reliable than gendered noun-final morphemes in the process of adjective agreement within the Determiner Phrase (DP) across two languages whose gender systems differ in transparency. To test this hypothesis, the current research compares English second-language (L2) learners of Hebrew and Spanish. Both languages have a binary gender system for nouns; however, Hebrew lacks a determiner with gender value, but provides a plural ending morpheme that encodes both number and gender. In contrast, Spanish has a gendered article that facilitates gender acquisition, but lacks a plural ending morpheme that indicates gender. Thirty-two L1 English–L2 Spanish learners and thirty-two L1 English–L2 Hebrew learners with different proficiency levels completed an adjective-agreement forced-choice task and an adjective-agreement elicited-production task in their respective target languages. The tasks contained Spanish opaque plural nouns and Hebrew transparent plural nouns, highlighting the role of the determiner in Spanish and the role of transparent plural-ending morphemes in Hebrew. The results revealed that L2 Spanish learners performed better on the tasks than L2 Hebrew learners, offering evidence for the relevance of syntactic agreement knowledge over phonological cues in gender acquisition. The current investigation also examined the reliability of noun-ending morphemes in the target process, showing that transparent morphophonological features are less reliable than the determiner. One comprehension task and one production task were conducted to determine the effects of morphosyntactic versus morphophonological features in the target nouns.
The data showed that, overall, the presence of the gendered determiner has the main effect on gender acquisition when the learner has no phonological cues in the input. Additionally, the same results indicated that the consistency of the gender information encoded in the article provided by the input is crucial in the process of noun-adjective agreement. On the other hand, the presence of a transparent plural noun ending is less reliable when Hebrew L2ers need to match the noun to the appropriate gender of the adjective. Taken together, these findings suggest that syntactic knowledge facilitates the acquisition of gender for inanimate nouns across languages.


Introduction
Several studies have examined the importance of morphosyntactic features in the acquisition of L2 gender, stating that L2 learners use gender markings on determiners to establish predictive syntactic agreement relations (Oliphant 1998; Hopp 2013; Grüter et al. 2012; Halberstadt et al. 2018). On the other hand, research on various languages emphasizes the role of noun-ending transparency, which leads to gender acquisition (Szagun et al. 2007; Foote 2014). Therefore, it remains unclear whether the gender information encoded in the determiner is more reliable for L2 learners than transparent ending morphemes in the acquisition of inanimate nouns. To disentangle the role of the two types of morphemes, the present study adopts a cross-language perspective, testing whether the presence of a gendered determiner in Spanish before opaque plural nouns facilitates adjective agreement. The Spanish Determiner Phrase (DP) is compared with the Hebrew DP, which, unlike its Spanish counterpart, has a genderless determiner but contains transparent plural noun-ending morphemes that mark gender phonologically and facilitate gender assignment.
To this end, the present work provides a theoretical analysis of two linguistic systems (Hebrew and Spanish) that provide different gender features that impact the L2 acquisition of gender, and how some features are relevant for the acquisition of gender properties in inanimate nouns across languages. Additionally, the current research is concerned with the role of syntactic knowledge in the acquisition of the gender of target nouns: it proposes that syntactic gender agreement facilitates the acquisition of L2 gender, whereas noun-final morphemes, which are not governed by syntactic rules (Kibort 2010), delay the process of gender acquisition in inanimate nouns across languages.

Languages 2022, 7, 142

Gender Distribution in Hebrew and Spanish: Functional Similarities across Languages
Hebrew and Spanish have a grammatical gender system whereby all nouns are assigned a gender. Conversely, English (the native language of the current study participants) is considered to have a 'natural gender system': the gender of the nouns is distinguished on a strictly semantic basis, the criteria being humanness and the sex of the relevant referents (Siemund 2008).
Hebrew and Spanish are languages with no common ancestor (Dryer and Haspelmath 2013). However, they share several functional categories, and one of them is gender. In Spanish and Hebrew, all nouns are classified in terms of grammatical gender, which is arbitrary and distinct from natural gender. Nouns are divided into two classes, masculine and feminine, and, as such, gender is an inherent lexical feature of nouns (Corbett 1991). L2 learners must track these features in the input to acquire each noun's gender, and, as such, make use of the different clues available to them in the input.
Spanish has morphophonological cues that reveal the gender feature of nouns. The most reliable of these cues occurs as transparent noun-ending morphemes, of which -o indicates masculine gender and -a indicates feminine. However, this is not a one-to-one correspondence: 62% of masculine Spanish nouns end in -o, while only 55.9% of feminine Spanish nouns end in -a in the Davies (2001-2002, 2015-2017) corpus, which means that there are several opaque gender suffixes for nouns, such as -e (calle 'street' (F); puente 'bridge' (M)). In addition, the transparency/opacity of the singular noun-ending morpheme is not affected by the plural suffixation process, which is marked consistently with the suffix -s when the stem ends in a vowel and -es when the stem ends in a consonant. Table 1 shows the transparency/opacity noun gender system in Spanish. The target Spanish nouns of the current study are opaque nouns ending in -e, presented in their plural form. The selection of the plural form is due to the methodological design (see Procedure and Materials section).
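The ending-based cue and the plural rule described above can be illustrated with a minimal sketch. It assumes the simplified rules stated in the text only: the function names are hypothetical, orthographic alternations (for example, z changing to c before -es) are not handled, and the cue is a guess rather than a guarantee, since the correspondence between ending and gender is not one-to-one. The noun papel is an added illustration that does not come from the study's materials.

```python
VOWELS = set("aeiouáéíóú")


def ending_cue(noun: str) -> str:
    """Return the gender suggested by the noun ending, or 'opaque' if none."""
    if noun.endswith("o"):
        return "masculine"  # most frequent masculine ending
    if noun.endswith("a"):
        return "feminine"   # most frequent feminine ending
    return "opaque"         # e.g. nouns in -e carry no ending-based cue


def pluralize(noun: str) -> str:
    """Simplified plural rule: -s after a vowel, -es after a consonant."""
    return noun + ("s" if noun[-1] in VOWELS else "es")


# The plural suffix never adds gender information: the cue is unchanged.
for noun in ["libro", "casa", "calle", "puente", "papel"]:
    print(noun, pluralize(noun), ending_cue(noun))
```

Under these assumptions, calle and puente are both classified as opaque even though they differ in gender, which is why a learner would need a cue outside the noun itself.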
A second source of information in Spanish that can be used to predict gender is distributional (Mariscal 2008). In other words, the gender feature is specified by the elements that are systematically distributed around the head noun. The most informative of these is the definite/indefinite article, which occurs before the head noun more frequently than the other elements. Table 2 illustrates gender and number distribution in definite and indefinite articles. Adjectives in Spanish are also distributed within the DP and agree with the gender of the noun that they modify. The adjective-final morpheme typically represents the noun-adjective internal agreement overtly (see note 1). As with nouns, plural suffixation does not affect the transparency/opacity of the adjective-final morphemes, as Table 3 shows.

Like Spanish, Hebrew distinguishes between the genders of nouns, and every noun is characterized as masculine or feminine (Gonen and Rubinstein 2015). Feminine gender is typically marked through the suffixes -ah or -t. Masculine nouns are unmarked; in other words, nouns lacking a feminine ending are typically masculine. However, some feminine nouns do not have the most frequent feminine ending morpheme, and a very small number of nouns ending with -ah or -t are masculine (Meir 2006). The transparency/opacity of the noun gender system is illustrated in Table 4. Another relevant distinction between Hebrew and Spanish is that, in Hebrew, plural suffixation is mainly determined by the gender of the noun (Levy 1980). Plural forms are gender-marked and, in contrast with singular forms, explicit marking exists in both masculine and feminine plural nouns (Gollan and Frost 2001). Hebrew has both a masculine plural suffix -im and a feminine plural suffix -ot. Almost all masculine nouns form their plural with the morpheme -im and feminine nouns with the ending -ot. Table 5 illustrates the distribution of plural noun morphemes.
There are, however, a few exceptions for opaque nouns (masculine nouns ending with -ah or -t and non-marked feminine nouns) where the pluralization process does not match the lexical gender (eben-F.S 'stone'/ebanim-F.PL 'stones'). The current study measured only transparent plural-ending morphemes so that each group (L2 Hebrew and L2 Spanish) had one reliable cue for the noun-adjective agreement process (see Procedure and Materials section). Within the DP, Hebrew exhibits no agreement with the determiner. The definite determiner is not marked for gender or number. In the following example, the definite determiner does not carry any ending that matches the gender of the noun.

ha kit-ot
the classrooms-F.PL
"the classrooms"

However, a reliable source of information that can be used to predict gender is the adjective-final morpheme. The adjective is always overtly marked with the ending morpheme -ah for the feminine singular, while the masculine form, by contrast, has no such ending. Similarly, the adjective plural morpheme is always -im for masculine plural and -ot for feminine plural, regardless of the irregularity status of the singular noun, without exception. Table 6 shows the noun-adjective agreement in Hebrew.

Table 6. Noun-adjective agreement in Hebrew. [Columns: Gender, Noun, Adjective; table body not preserved.]

In sum, the two gender systems differ in the agreement process between constituents within the DP. The internal agreement process reflects a relationship between two head constituents. The relationship may be seen as a sharing of grammatical features between the elements (Pesetsky and Torrego 2007). The authors argued in favor of the view of Agree as feature-sharing. They proposed that certain features of lexical items appear to come from the lexicon unvalued and receive their value from a valued instance of the same feature, present on another lexical item. For example, in: (

The determiner in Hebrew, unlike in Spanish, does not share a copy of the gender property of NOM, since DET has no gender value. DET does not need to receive any value from NOM. Therefore, there are fewer morphosyntactic features within the DP to determine the lexical property of gender in Hebrew when compared to Spanish.
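The feature-sharing view of agreement summarized above can be made concrete with a toy model. This is only an illustrative sketch, not the formalism of Pesetsky and Torrego (2007): the class `Head`, the flag `has_gender_feature`, and the functions `agree` and `build_dp` are hypothetical names introduced here. A head that carries an unvalued gender feature copies the value from the noun, while a head with no gender feature at all (the Hebrew determiner, on the account given in the text) receives nothing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Head:
    label: str
    has_gender_feature: bool       # does this head carry a gender feature at all?
    gender: Optional[str] = None   # None means the feature is unvalued


def agree(probe: Head, goal: Head) -> None:
    """Value the probe's unvalued gender feature from the goal, if it has one."""
    if probe.has_gender_feature and probe.gender is None and goal.gender is not None:
        probe.gender = goal.gender


def build_dp(det_has_gender: bool, noun_gender: str) -> List[Head]:
    """Build a toy DP and run agreement between the noun and the other heads."""
    noun = Head("NOM", True, noun_gender)   # gender is lexically valued on the noun
    det = Head("DET", det_has_gender)
    adj = Head("ADJ", True)
    for probe in (det, adj):
        agree(probe, noun)
    return [det, noun, adj]


def valued_heads(dp: List[Head]) -> List[str]:
    """Labels of the heads that end up carrying a gender value."""
    return [h.label for h in dp if h.gender is not None]


spanish_dp = build_dp(det_has_gender=True, noun_gender="F")
hebrew_dp = build_dp(det_has_gender=False, noun_gender="F")
print(valued_heads(spanish_dp))  # ['DET', 'NOM', 'ADJ']
print(valued_heads(hebrew_dp))   # ['NOM', 'ADJ']
```

In this toy model the Spanish DP ends up with three heads carrying the gender value and the Hebrew DP with two, which mirrors the claim that Hebrew offers fewer morphosyntactic gender cues within the DP.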
In sum, to disentangle the role of the two types of morphemes in gender acquisition, the current work measured whether the Spanish gendered determiner is more reliable than the Hebrew gendered plural morpheme in the process of adjective agreement, proposing that the morphosyntactic feature of the determiner is more reliable than the morphophonological feature of the transparent noun-ending morpheme in gender acquisition across languages.

Morphosyntactic and Morphophonological Features in Gender Acquisition
The assignment and the agreement properties of the noun are the central notions in gender acquisition. Assignment refers to the lexical property of the noun, including semantic and formal principles (Comrie 1999), and agreement refers to the overt manifestation of assignment choices based on the properties of the noun (Audring 2008). The differentiation of these properties has led several authors to distinguish between noun-final morphemes that manifest assignment (morphophonological features) and final morphemes that express agreement (morphosyntactic features in determiners and adjectives) (Kirova 2016; Kirova and Camacho 2022). Therefore, the difference between morphosyntactic and morphophonological features lies in the fact that the former are governed by syntactic rules, but the latter are not. However, it is not very clear why noun-final morphemes that mark assignment are not governed by syntactic or agreement rules. Kibort (2010) explained this difference by distinguishing between morphosemantic and morphosyntactic features. Kibort argued that both types of morphemes affect the semantic level of the language; however, morphosemantic features do not require participation in syntax, unlike morphosyntactic features. For example, transparent gender noun morphemes do not require a syntactic process to retrieve the gender feature, whereas inanimate opaque noun-ending morphemes require a syntactic mechanism to retrieve gender assignment. Based on this distinction, the current study proposes that morphophonological features are morphemes that are independent of syntactic rules and that the phonological criterion itself allows the word's gender value to be inferred. On the other hand, for a feature to be relevant to syntax means that at least some of its values must be determined through a syntactic relation with another word.
In the case of opaque nouns and gender, the final morpheme requires a syntactic agreement with other elements within the constituent. Therefore, the elements that determine the gender value within the DP are called morphosyntactic features.
Given this explanation, abundant research has shown the primacy of morphosyntactic features over morphophonological features in the acquisition of gender across languages (in Italian, Oliphant 1998; in German, Hopp 2013). Differences in the reliability of gender cues are mainly based on the transparency/opaqueness of morphophonological features across languages. Transparency occurs when the formal assignment of gender allows for an accurate inference of the gender of the noun without having to rely on agreement on other arguments (Audring 2014). On the contrary, noun-ending morphemes that do not follow the most frequent masculine or feminine phonological ending are opaque (Velnić 2020). In Spanish, Kirova and Camacho (2022) proposed that gender acquisition converges with the L1 agreement system only when L2 speakers automatize and rely on the process of syntactic gender agreement. The authors administered a self-paced reading grammaticality judgment task to a group of L1 English-L2 Spanish learners and concluded that participants do rely on morphosyntactic cues as they acquire Spanish gender (for example, participants relied on the determiner la (the FEM) when the noun-ending morpheme was opaque, as in mano (hand-FEM)). They seem to switch from a morphophonological strategy (using gender morphemes on nouns to deduce gender) to a morphosyntactic strategy (using gender morphemes on determiners and modifiers) as their proficiency goes up.

Grüter et al. (2012) investigated whether difficulties in mastering gender in L2 Spanish learners could best be characterized as production-specific performance problems or as issues with retrieving information in real-time language use. One important result of their study is that differences between L1 and L2 lay in the lexical representation of grammatical gender, or gender assignment. The authors explained that differences between L1 and L2 gender were due to the fact that, in L1, there is a strong association between nouns and gender-marked modifiers, most importantly determiners, unlike the process of word-learning within an L2 context. Therefore, the more the L2er focuses on determiners, the more target-like the acquisition. However, languages such as Hebrew that lack gendered determiners to predict gender in the input point to a need to examine L2 gender acquisition in a linguistic system that does not provide a well-demonstrated facilitative element in the target process. Consequently, the present work examines whether the presence of a gendered determiner will facilitate the acquisition of adjective agreement within the DP when the input does not provide a transparent noun suffix.
In Hebrew, there is a lack of studies that investigate the relevance of morphosyntactic features versus morphophonological features in L2 acquisition of the gender of inanimate nouns. However, the study by Armon-Lotem and Amiram (2012) provided evidence to predict Hebrew L2 participants' behavior. The authors investigated the acquisition of the Hebrew gender system with L2 learners, working with participants with different L1s: Russian and English. Importantly, Russian, unlike English, has gender morphology. The authors concluded that Russian L1 speakers paid more attention to formal information due to the syntactic similarities with their L1 agreement system, while L1 English speakers paid more attention to semantic information, given the absence of such agreements in their native language. Since English L1-Hebrew L2 learners appeared reticent to rely on semantic information when facing inanimate nouns, they must find other linguistic resources in the input for the noun-adjective gender agreement process, namely the noun-final morpheme. Since Hebrew has no determiner with a gender value, but provides a salient phonological gendered plural morpheme, the present work investigates whether the comparison of two types of morphological features across languages will shed light on the reliability of cues in gender acquisition. In other words, if the gendered determiner plays a more significant role, the plural noun-final morphemes with gender value in Hebrew will be less reliable than the Spanish determiner in the process of noun-adjective gender agreement. On the contrary, if transparent morphemes are more reliable than the determiner, the agreement process will be more accurate in Hebrew L2ers. Therefore, based on the previous literature, we predict that focusing on the determiner will benefit L2 Spanish learners in noun-adjective agreement.

This Study
The present study tests the hypothesis that L2 learners find morphophonological morphemes less reliable than morphosyntactic morphemes when predicting adjective-gender agreement. The hypothesis is based on previous studies that claim that the syntactic knowledge of gender underlies target-like acquisition in L2 learners of several languages (Italian: Oliphant 1998; German: Hopp 2013; Spanish: Grüter et al. 2012). A promising test case for this hypothesis can be found by examining how L1 English speakers behave when they learn Hebrew, whose gender system provides a salient plural noun-final morpheme with gender value. If the presence of a determiner with gender value is crucial in L2 gender acquisition, I argue that L1 English-L2 Spanish speakers will perform better in acquiring the gender of inanimate nouns than L1 English-L2 Hebrew learners. In addition, if morphophonological information is less reliable in the process of adjective agreement, I predict that L1 English-L2 Hebrew learners will have lower accuracy in assigning adjective gender to plural nouns. Therefore, the research question asks whether morphosyntactic features are more reliable than morphophonological features in the acquisition of the gender of inanimate nouns across languages.

Participants
The sample pool included 128 participants. The experimental groups were 32 L1 English-L2 Spanish learners (L2SP) and 32 L1 English-L2 Hebrew learners (L2HB). The L2 Spanish participants had been studying Spanish for some years after the age of 10. The L2 Hebrew learners had also been studying for some years. All L2 participants were native speakers of English, born and raised in a monolingual English family, and their community also spoke English. The L2 speakers did not speak any other language besides English and the target language. The L2 Spanish learners and some of the L2 Hebrew learners were from a higher education institution on the East Coast, and some of the L2 Hebrew learners were from a theological seminary.
The comparison group consisted of 32 Spanish-dominant Spanish-English bilinguals (L1SP) and 32 Hebrew-dominant Hebrew-English bilinguals (L1HB). All members of the L1 Spanish group were native speakers from a higher education institution in Chile, and the native Hebrew speakers were from a higher education institution in Israel. All participants were in the process of completing undergraduate or graduate studies. Due to the absence of a monolingual population in Israel (most Israelis can speak English reasonably well, as it is a required second language for students in both Hebrew and Arabic schools (Uhlmann 2011)), Chilean speakers and Israeli speakers were all sequential bilinguals, English L2 learners. Participants had completed at least college-level studies. All the participants were born in each target country and had been raised there. The comparison data were collected in Chile and Israel. Table 7 summarizes the demographics of the participants.

The proficiency of the experimental groups was measured by MINT scores (Gollan et al. 2012), taking the results as a continuous variable. There was a total of 68 possible points on the test. All the participants were dominant in English, as shown by their productive vocabulary size. Their MINT scores in English were higher (L2SP, M = 64.5; L2HB, M = 63.4) than their scores in Spanish (M = 25.5) and Hebrew (M = 25.2) for the L2SP and L2HB, respectively. The use of the MINT as a screening measure is justified by previous research on bilingualism (Gollan et al. 2015; Hur et al. 2020). In addition, MINT scores have been shown to be significantly related to other, more complex language measurement instruments (Gollan et al. 2012; Sheng et al. 2014), and proficiency scores obtained with this instrument have been found to correlate with the DELE (Hur et al. 2020).
Since MINT results were not discrete, a histogram with results by L2 group was created to illustrate target-language level within each group. Figure 1 shows the results of MINT across L2 groups.
One two-tailed t-test was run to compare means between L2 Hebrew and L2 Spanish participants. There were no significant differences between means by group (t = 0.32416, df = 838.82, p-value = 0.7459), so the null hypothesis that the means are equal could not be rejected.
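As a rough illustration of this kind of group comparison, the sketch below computes a two-sample t statistic with Welch-Satterthwaite degrees of freedom using only the Python standard library. The non-integer degrees of freedom reported above are consistent with a Welch-type test, but the source does not name the variant, so this is an assumption. The function name `welch_t` and the score lists are hypothetical and are not the study's data; the p-value is omitted because it requires the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of the first mean
    vb = variance(b) / len(b)   # squared standard error of the second mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical proficiency scores for two learner groups (invented numbers).
group_a = [22, 27, 25, 30, 24, 26]
group_b = [23, 26, 24, 29, 25, 27]

t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A t statistic close to zero, as in the reported comparison, indicates that the two groups' mean proficiency scores are similar relative to their variability.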

Procedure and Materials
Participants for each target language completed the following five steps in order: (1) a background questionnaire (2 min), (2) an informed consent form (2 min), (3) the MINT lexical proficiency test (5 min), (4) a forced-choice task (FCT) (7 min), and (5) an elicited production task (EPT) (7 min). The experimental phase took place in a single session, and participants completed it individually. It lasted about 30 min.
The lexical items selected for the experimental tasks come from the textbooks used in each institution's foreign language department: Dicho y Hecho (Potowski et al. 2014) for Spanish and Hebrew from Scratch, Part 2 (Chayat et al. 2013) for Hebrew. For Spanish, we selected eight high-frequency inanimate opaque nouns (four feminine and four masculine) and eight low-frequency inanimate opaque nouns (four feminine and four masculine). Although the Spanish plural noun morpheme does not encode gender information, we used the plural version of the nouns to match the tasks in each language as much as possible. For Hebrew, eight high-frequency plural inanimate transparent nouns (four feminine and four masculine) and eight low-frequency plural inanimate transparent nouns (four feminine and four masculine) were selected. The decision to choose opaque nouns in Spanish and transparent nouns in Hebrew was based on the available gender cues that participants had in the task. In Spanish, the opaque noun is always preceded by the determiner with gender value (los/las) unlike in Hebrew, where the noun is preceded by a genderless determiner, but with a plural noun-ending morpheme with gender value.
It was decided to include lexical frequency as a variable to examine the relevance of different types of cues, irrespective of their frequency. Furthermore, none of the words was a cognate between the respective L2 and English, since cognates that have phonological overlap can facilitate storage in the lexicon and subsequently affect gender acquisition (Amengual 2016). The frequency selection process consisted of counting the number of times that a noun with the above-mentioned features appeared in the textbook. Then, we organized the data in a frequency table that showed, for each noun, how many times the item appears in the book. Finally, we compared the frequency of each noun in the L2 textbooks with that of frequency corpora for Spanish (Davies 2001-2002, 2015-2017) and Hebrew (Linzen 2009), and selected the eight nouns with the highest frequency and the eight nouns with the lowest frequency to use in the experimental tasks. The items included in the tasks are illustrated in Table 8.
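The counting and selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `select_by_frequency`, the tie-breaking by alphabetical order, and the toy token list are all hypothetical, and the comparison against external frequency corpora mentioned in the text is not modeled.

```python
from collections import Counter


def select_by_frequency(tokens, candidates, k=8):
    """Count candidate nouns in a tokenized text and return the k most
    frequent and the k least frequent candidates."""
    if len(candidates) < 2 * k:
        raise ValueError("need at least 2*k candidates to avoid overlap")
    counts = Counter(t for t in tokens if t in candidates)
    # Sort by descending frequency; ties are broken alphabetically
    # (an arbitrary choice made for this sketch).
    ranked = sorted(candidates, key=lambda noun: (-counts[noun], noun))
    return ranked[:k], ranked[-k:]


# Toy example with k=1; the tokens are placeholders, not textbook data.
tokens = ["calles", "puentes", "calles", "llaves", "calles", "puentes"]
candidates = {"calles", "puentes", "llaves"}
high, low = select_by_frequency(tokens, candidates, k=1)
print(high, low)
```

With `k=8` and a sufficiently large candidate set, the same function would return the two eight-item lists described in the text.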

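[Table 8. Items included in the experimental tasks. Table body not preserved; surviving column headers: Language, Masculine.]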

Method
The goal of this task was to examine whether participants matched noun-adjective gender agreement via comprehension abilities using the items shown in Table 8. The task also attempted to measure whether the presence of the Spanish determiner (but the lack of a gendered noun-final morpheme) gave an advantage to L1 English-L2 Spanish learners or whether the presence of plural noun-ending morphemes (but the presence of a genderless determiner) hindered the comprehension of adjective agreement in L2 Hebrew learners.
In the Spanish version, participants were presented with a written sentence containing a gendered article + plural opaque noun construction. In the Hebrew version, learners read a genderless article + gender-marked plural noun. In both versions, after reading the item, participants selected one of two gender-marked adjectives. It was expected that they would select the accurate adjective based on the determiner (in the Spanish version) or the plural noun-ending morpheme (in the Hebrew version). High-frequency nouns were followed by high-frequency adjectives, and low-frequency nouns were followed by low-frequency adjectives. Recall that Hebrew adjective-final morphemes are always transparent, unlike those of Spanish; however, all Spanish adjective-final morphemes in the current task were transparent. Figure 2 shows an experimental trial in the two target languages. The experiment began with six practice items, which included singular animate nouns. The experiment was designed and presented using Qualtrics. Participants scored 1 if they selected the accurate adjective and 0 if they selected the inaccurate adjective.

Experiment 2: Adjective agreement Elicited Production Task (EPT)
Method
The goal of this task was to assess gender agreement in participants' spoken production by eliciting noun-adjective sequences. The task also tested whether the presence of the gendered Spanish determiner (in the absence of a gendered noun-final morpheme) gave L1 English-L2 Spanish learners an advantage, or whether the presence of gendered plural noun-ending morphemes (alongside a genderless determiner) hindered the production of adjective agreement in L2 Hebrew learners.
Before the trials began, participants were told that they would be asked the color of the words that followed. On each trial, participants saw a written word and read it aloud. Then a colored square appeared, and the participant was asked to say aloud the color of the word. For example, in the Spanish version, for the written target noun lápices ('pencils'), the image of a black square was presented. In the Hebrew version, for the written target המִיטוֹת ('beds'), the image of a red square was presented (Figure 3). In the Spanish version, it was expected that the participants would produce the adjective based on the article (los (M.P)), e.g., negros ('black' (M.P)). In the Hebrew version, it was expected that the participants would produce the adjective based on the plural final morpheme (-ot (F.P)), e.g., האַדוֹמוֹת (adomot, 'red' (F.P)). The experimenter provided six practice items, including singular animate nouns, to ensure that participants understood the exercise. The experiment was designed and presented using PowerPoint. Because PowerPoint does not record oral answers, the principal investigator used a digital voice recorder. Participants scored 1 if they produced the accurate gender of the adjective and 0 if they produced any other form. The complete tasks can be found at the links in Supplementary Material.

Results
The analysis presents the results from the Forced-Choice Task (FCT) and the Elicited Production Task (EPT). A generalized linear mixed model (GLMM) was selected because it allows some independent variables to have random effects. A model with random effects allows us to make "broad level" inferences about the larger population of participants (Clark and Linzer 2015). The dependent variable in both tasks was response, which was coded binomially according to whether the participant gave the expected or the unexpected response (1 for expected vs. 0 for non-expected). The independent variables were group (L1, L2), frequency (low, high) and language (Spanish, Hebrew). All the variables were qualitative and were treated as dummy variables; therefore, no data normality test was run. Data cleaning was conducted in Excel and R. First, Excel's filter function was used to check for typos at every level of each qualitative variable, and incorrect labels were eliminated. Second, the na.rm argument was used to exclude missing values when calculating descriptive statistics in R.
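The coding and descriptive-statistics steps described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R workflow; the trial records, the function names `code_response` and `describe`, and the use of `None` for missing responses are all assumptions made for the example, and the data are invented. Note that `pstdev` returns the population standard deviation, whereas R's `sd()` returns the sample standard deviation, so values would differ slightly from an R analysis.

```python
from statistics import mean, pstdev

# Hypothetical trial records: (group, frequency, selected_expected_adjective).
# None marks a missing response. Illustrative data only, not from the study.
trials = [
    ("L2SP", "high", True),
    ("L2SP", "high", True),
    ("L2SP", "low", False),
    ("L2SP", "low", None),
    ("L2HB", "high", True),
    ("L2HB", "high", False),
    ("L2HB", "low", False),
    ("L2HB", "low", False),
]


def code_response(selected_expected):
    """Code a response binomially: 1 = expected, 0 = unexpected, None = missing."""
    if selected_expected is None:
        return None
    return 1 if selected_expected else 0


def describe(records):
    """Return {(group, frequency): (mean, sd, n)} over non-missing responses."""
    cells = {}
    for group, frequency, selected in records:
        score = code_response(selected)
        if score is None:  # skip missing values, analogous to na.rm = TRUE in R
            continue
        cells.setdefault((group, frequency), []).append(score)
    return {
        cell: (mean(scores), pstdev(scores), len(scores))
        for cell, scores in cells.items()
    }


summary = describe(trials)
for cell, (m, sd, n) in sorted(summary.items()):
    print(cell, f"M = {m:.2f}", f"SD = {sd:.2f}", f"n = {n}")
```

Because each response is 0 or 1, the cell mean is simply the proportion of expected responses, which is how the M values in the descriptive results can be read.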

Experiment 1: Adjective Agreement Comprehension Forced-Choice Task (FCT)
Descriptive statistics were run to reveal the performance of the 132 participants distributed in the four groups (L2SP, L2HB, L1SP, and L1HB). Each participant completed 16 FCT test items, which were randomly distributed by the software Qualtrics. Figure 4 illustrates differences in frequency condition by group.
In the high-frequency condition, the L2SP mean (M = 0.83; SD = 0.39) was higher than the L2HB mean (M = 0.56; SD = 0.49). The same held in the low-frequency condition, where L2HB results (M = 0.31; SD = 0.43) were lower than L2SP results (M = 0.70; SD = 0.46). However, results for L2HB in the low-frequency condition were higher than in the other task (EPT). As in the EPT, errors were exceedingly rare in the L1 groups, with a slight advantage in the high-frequency condition for the L1SP group (M = 0.98; SD = 0.12) over the L1HB group (M = 0.96; SD = 0.18). In the low-frequency condition, results for native Hebrew speakers (M = 0.98; SD = 0.12) and native Spanish speakers (M = 0.98; SD = 0.15) were similar.
These results support previous findings of missing morphology production, indicating that, even in cases where surface morphology is never acquired, it is still possible for the learner to determine the syntactic status of the linguistic structure in the target language (e.g., Lardiere 1998; Prévost and White 2000). Nevertheless, mean differences in FCT results between L2ers indicate that the type of morphology cue has an effect on access to grammatical knowledge.
A generalized linear mixed model (GLMM) was selected to examine effects between the variables. The model was run to determine whether the experimental results were affected by the group, language and frequency conditions. The model reinforced the descriptive statistics. A significant main effect was found for group (β = 3.11, SE = 0.52, z = 5.91, p < 0.01), language (β = 3.11, SE = 0.52, z = 5.91, p < 0.01) and lexical frequency (β = −1.90, SE = 0.27, z = −6.83, p < 0.01).
These results replicate previous findings (Grüter et al. 2012; Hopp 2013; Halberstadt et al. 2018), confirming that, within a language, the reliability of the determiner is higher than the reliability of noun-ending morphemes. The current work extends the previous findings, providing evidence of the role of the determiner over noun-final morphemes with gender information across languages.

Experiment 2: Adjective Agreement Elicited Production Task (EPT)
Descriptive statistics were run to reveal the performance of the 132 participants distributed across the four groups (L2SP, L2HB, L1SP, L1HB). Each participant completed 16 EPT test items, randomly distributed using PowerPoint. The results across the four groups and frequency conditions were similar to the FCT results. The main difference from the FCT results was that the L2 Hebrew group performed below chance in the high-frequency condition, and both L2 groups performed below chance (below the cross-sectional white line) in the low-frequency condition.
L1 groups performed at a high level in this task, with a mean accuracy of 96% (SD = 0.18) in the L1SP group and 97% (SD = 0.18) in the L1HB group. In the L2 groups, by contrast, accuracy was significantly lower. However, responses by L2SP (M = 0.57; SD = 0.49) were more accurate than those by L2HB (M = 0.29; SD = 0.49). For the current analysis, it is also useful to calculate the proportion of accurate responses by frequency condition. All groups performed better in the high-frequency condition than in the low-frequency condition, as shown in Figure 5.
The responses of the L2 Spanish group (L2SP) were more accurate than those of the L2 Hebrew group (L2HB), as the means by frequency condition in each group show. L2SP outperformed L2HB in the high-frequency condition: the proportion of accurate responses in the L2SP group was above chance (above the cross-sectional white line), whereas L2HB responses were below chance. In the low-frequency condition, however, both L2 groups performed below chance. The comparison groups for both target languages displayed the expected pattern, with accurate responses across conditions, as shown in Table 3.
A second GLMM (one model was fit for each dependent variable: one for FCT responses and one for EPT responses) was run to determine whether the experimental results were affected by group (L1, L2). The model found a significant effect of group (β = 4.4, SE = 0.23, z = 19.2, p < 0.01), showing that EPT results were affected by whether participants were native speakers or L2 learners. For the L2 language condition (Hebrew vs. Spanish), the model also showed a significant effect on EPT results (β = 1.2, SE = 0.18, z = 6.86, p < 0.01), a relevant finding for the current work: language features influence response accuracy. The lexical frequency condition also showed a significant effect (β = −1.41, SE = 0.18, z = −7.77, p < 0.01). This indicates an association between the independent variable (frequent versus non-frequent item) and the dependent variable (task responses), demonstrating that this condition had a task effect irrespective of target language.
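To help read coefficients like these, the sketch below shows how a Wald z statistic, a two-sided p-value, and an odds ratio can be derived from an estimate and its standard error. It assumes the GLMM used a logit link (the usual choice for a binomially coded response, but not stated explicitly in the text), and the function names are hypothetical. Because the reported β and SE values are rounded, the recomputed z values will differ slightly from those reported.

```python
import math


def wald_z(beta, se):
    """Wald statistic: the coefficient estimate divided by its standard error."""
    return beta / se


def odds_ratio(beta):
    """Convert a log-odds coefficient (logit link assumed) to an odds ratio."""
    return math.exp(beta)


def two_sided_p(z):
    """Two-sided p-value for z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Rounded EPT estimates as reported in the text: (beta, SE).
reported = {
    "group": (4.4, 0.23),
    "language": (1.2, 0.18),
    "frequency": (-1.41, 0.18),
}

for name, (beta, se) in reported.items():
    z = wald_z(beta, se)
    print(
        name,
        f"z = {z:.2f}",
        f"OR = {odds_ratio(beta):.2f}",
        f"p < 0.01: {two_sided_p(z) < 0.01}",
    )
```

Under the logit assumption, a positive coefficient corresponds to an odds ratio above 1 (higher odds of the expected response), and a negative coefficient, such as the one for lexical frequency, to an odds ratio below 1.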

Discussion
The purpose of this study was to investigate whether morphosyntactic features were more reliable for L2 learners than morphophonological features in the process of adjective-gender agreement with inanimate nouns across languages. Given the contrast between the gender systems of the target languages, the study aimed to examine differences between morphosyntactic and morphophonological cues that yield different outcomes in predicting noun-adjective agreement. The current research examined the possibility that transparent noun-ending morphemes are less consistent cues in the acquisition of gender for the target nouns. Overall, the results suggest that syntactic cues are more reliable for L2ers than phonological cues, even though the Hebrew plural final-noun morpheme is highly predictive of the noun's gender (Gollan and Frost 2001).
The results of the two tasks supported the hypothesis by revealing that the L2 Spanish group outperformed the L2 Hebrew group. These results align with the previous study of gender acquisition by Kirova and Camacho (2022). As these authors state, when L2 learners process input in a more sophisticated way, they redefine their gender assignment strategy. They start focusing on the morphology of determiners and adjectives, and they start making better predictions, since these categories are more reliable than the morpheme of the noun. Since Spanish provides determiner-noun agreement within the DP, whereas Hebrew provides a salient plural noun-ending morpheme with gender value, it is possible to state that focusing on morphosyntactic cues facilitates adjective gender agreement. A large body of research on Spanish has demonstrated the strong relationship between the determiner and the transparent noun-final morpheme (Grüter et al. 2012; Halberstadt et al. 2018; Kirova 2016). The present work contributes data showing that syntactic features are, in fact, relevant when predicting opaque noun-adjective agreement relationships, in which the lexical component of gender does not match the surface gender morpheme.
One explanation for this finding involves the DP hypothesis (Abney 1987). Following the Chomskyan framework, the author analyzes noun phrases as DPs headed by the functional category D. D determines the category and the distribution of the elements in the nominal structure. The DP hypothesis has been applied to a variety of languages, including Spanish, where focusing on the determiner is a well-demonstrated strategy for predicting gender features. Conversely, Wintner (2000) claimed that the definite article in Hebrew is an affix, retaining the view that the head of noun phrases in Hebrew is the noun. If that is the case, the affix combines with nominal elements in the lexicon and hence is inaccessible to syntactic processes. If the Hebrew definite article is lexically attached to the noun rather than subject to syntactic rules, L2 speakers need to focus on word-final phonological cues. Although the Hebrew plural final-noun morpheme is highly predictive of noun gender, the problem lies in cases of a clash between the plural noun suffix and the noun's gender, unlike in Spanish, where D always encodes transparent gender information, with very few exceptions (e.g., el agua).
The task results also showed the relevance of lexical frequency in predicting noun-adjective agreement: frequency affects the comprehension and production of gender features, in line with the frequency lag hypothesis (Gollan et al. 2011). The hypothesis states that lexical frequency substantially affects lexical accessibility. In addition, the results yielded differences between the production and comprehension tasks across groups. L2 learners had lower results in the production task across groups. Furthermore, Hebrew L2ers performed at floor level in the EPT low-frequency condition. One explanation for the low EPT results across groups is that lexical frequency has a stronger effect on production than on comprehension. Hur et al. (2020), in a study of gender acquisition, demonstrated that production tasks were more challenging than comprehension tasks and that lexical frequency facilitated response accuracy in production. It is therefore predictable that retrieving lexical gender posed greater difficulties in production when learners faced low-frequency items. In the case of the floor-level results in Hebrew, L2ers' difficulty is compounded by the absence of a gendered determiner, an absence that hinders lexical competition between low-frequency items, unlike for Spanish L2ers. Nevertheless, it is essential to mention that Hebrew L2ers' lower performance on EPT low-frequency items may not accurately reflect their syntactic knowledge of the grammatical features of gender. Higher results on FCT low-frequency items demonstrate that participants have a mental representation of gender features in the target language.
In sum, the results of the present study showed that L2 Spanish learners make use of morphosyntactic strategies when the nouns have opaque endings, suggesting that the acquisition of syntactic knowledge plays a fundamental role in the L2 gender process, irrespective of the transparency of the noun and the word's lexical frequency.

Conclusions
This study investigated the role of morphosyntactic versus morphophonological features in the acquisition of the gender of inanimate nouns by L1 English-L2 Hebrew and L1 English-L2 Spanish learners. Its findings corroborate the important role that the determiner's gender value plays within the DP in L2 gender acquisition across languages.
The current investigation also examined the reliability of noun-ending morphemes in the target process, showing that transparent morphophonological features are less reliable than the determiner. One comprehension task and one production task were conducted to determine the effects of morphosyntactic versus morphophonological features in the target nouns. The data showed that, overall, the presence of the gendered determiner has the main effect on gender acquisition when the learner has no phonological cues in the input. Additionally, the same results indicated that the consistency of the gender information encoded in the article provided by the input is crucial in the process of noun-adjective agreement. On the other hand, transparent plural noun endings are less reliable when Hebrew L2ers need to match the noun to the appropriate gender of the adjective. Taken together, these findings suggest that syntactic knowledge facilitates the acquisition of gender for inanimate nouns across languages.
Funding: This research received no external funding.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (or Ethics Committee) of Rutgers University (Pro2019000235; 3/25/2019) for studies involving humans.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
Publicly available datasets were analyzed in this study. These data can be found at https://github.com/jmarkovitsr/agreement (accessed on 25 May 2022).

Conflicts of Interest:
The authors declare no conflict of interest.

1
There are, nonetheless, several lexical exceptions in which the adjective-final morpheme does not mark gender, for example verde (green). Opaque adjective gender morphemes are not part of the methodological design of the current study.