Second Language Acquisition and the Mastery of Discourse Connectives: Assessing the Factors That Hinder L2-Learners from Mastering French Connectives

Even though the mastery of discourse connectives represents an important step toward reaching high language proficiency, it remains highly difficult for L2-learners to master them. We conducted an experiment in which we tested the mastery of 12 monofunctional French connectives conveying six different coherence relations by 151 German-speaking learners of French, as well as a control group of 63 native French speakers. Our results show that the cognitive complexity of the coherence relation and connectives’ frequency, both found to be important factors for native speakers’ connective mastery, play a minor role for the mastery by non-native speakers. Instead, we argue that two specific factors, namely the connectives’ register and meaning transparency, seem to be more predictive variables. In addition, we found that a higher exposure to print in L1 correlates with a better mastery of the connectives in L2. We discuss the implications of our findings in the context of second language acquisition.


Introduction
Discourse connectives are linguistic elements that encode procedural meaning indicating the coherence relation holding between discourse segments, such as cause (1) or concession (2) (Mann and Thompson 1988; Sanders et al. 1992).
(1) Mary was glad because she won first prize at the science contest.
(2) Mary won first prize at the science contest, but she did not brag about it.
Even though connectives are used in both spoken and written discourse, there are some important differences between the two modes. While only a handful of connectives are used with high frequency in speech, in writing a broader array of connectives are used, although less frequently (Crible and Cuenca 2017). As a result, many connectives are used mostly in writing and their mastery is linked to the degree of competence that speakers have with this mode, as measured by their degree of exposure to print (e.g., Zufferey and Gygax 2020a).
While reading in their first language, readers directly benefit from connectives since they increase the local and global comprehension of a text (e.g., Degand et al. 1999; Van Silfhout et al. 2014), increase processing speed (Murray 1997; Sanders and Noordman 2000) and improve recall of its content (Caron et al. 1988). However, learners even at an advanced stage of language acquisition still lag behind native speakers in their mastery of connectives in L2. Several corpus studies report the difficulties that L2-learners encounter in mastering discourse connectives and specify, for example, that non-native speakers overuse corroborative and additive connectives (e.g., Field and Oi 1992; Milton and Tsang 1993; Granger and Tyson 1996; Tapper 2005), underuse contrastive or adversative connectives (e.g., Ha 2014; Shi 2017), or misuse contrastive or adversative connectives (e.g., Milton and Tsang 1993; Park 2013; Ha 2014). Further, several studies show that texts written by non-native speakers have a lower variety of connectives in comparison with native speaker texts (e.g., Liu and Braine 2005; Don and Sriniwass 2017; cf. Degand and Hadermann 2009).
In sum, even though there is ample evidence in the literature that learners do not fully master connectives, experimental studies that have assessed the factors that complicate or even prevent the mastery of discourse connectives for non-native speakers are quite sparse. This article is an attempt to start filling this gap by assessing the role of frequency and the degree of cognitive complexity of the coherence relation for learners' ability to handle connectives in L2, as these factors have been found to be highly relevant both for children and teenagers acquiring their first language (Evers-Vermeul and Sanders 2009;Crosson et al. 2008;Zufferey and Gygax 2020b) as well as adult native speakers (e.g., Canestrelli et al. 2013;Zufferey and Gygax 2020a).
More specifically, we experimentally measured the ability of non-native and native speakers of French to correctly use twelve French connectives used to convey six coherence relations. Six of the twelve connectives have an overall high frequency in written corpus data, whereas the other six are less frequent. We also included six coherence relations with varying degrees of cognitive complexity according to the categorization put forward by Sanders et al. (1992). We also tested written language proficiency in our participants using lexical and grammatical tasks, and measured learners' degree of exposure to print in their native (German) and non-native (French) languages.
Our experiment provides new insights into learners' mastery of connectives for several reasons. First, by focusing on twelve connectives mostly used in written language, we provide new data on connectives that have not been assessed so far, but that nevertheless are an integral part of written language competence. Second, the importance of cognitive complexity and frequency as factors explaining individual variations between connectives and speakers has, to our knowledge, not been assessed in the context of second language acquisition, and neither have the correlations between the mastery of connectives and the degree of exposure to print. Third, we chose to assess learners' mastery of connectives through a constrained experimental production task, contrary to most previous studies, which have resorted to corpus analyses and text comparisons (such as Field and Oi 1992; Milton and Tsang 1993; Granger and Tyson 1996; Bolton et al. 2002; Tapper 2005; Pit 2007; Ha 2014; Hu and Li 2015).
The paper is structured as follows. We first discuss in detail several factors that could hinder second language learners' ability to achieve a native-like use of connectives, namely the frequency of the connective in corpus data, the overall language proficiency of the speaker, transfer effects from the speakers' L1 and the cognitive complexity of the relation that the connective conveys. We then present a new experiment designed to assess the role of each of these factors.

Factors Explaining L2 Learners' Inability to Master Connectives
Quite intuitively, the overall language proficiency seems to be a relevant factor to explain learners' mastery of connectives, as connectives are lexical items acting at the syntax-discourse interface. Several corpus studies have indeed found that an overall higher language proficiency correlates with a better use of connectives (e.g., Chen 2014;Tazegül 2015). Cho and Shin (2014), for example, found that as the overall language proficiency of learners increased, their tendency to overuse connectives declined. However, connectives remain challenging even for learners with a high level of language proficiency. For example, Yang and Sun (2012) analyzed and compared the use of connectives in argumentative writings by 30 second-year and 30 fourth-year undergraduate Chinese EFL learners. In this study, advanced students showed a better proficiency for cohesive markers such as referential expressions (as illustrated in the example (3)) compared to low-proficiency speakers, but failed to show better use of discourse connectives.
(3) Mary won first prize at the science contest. She was very proud of it.
More recently, Zufferey and Gygax (2017) showed the persisting difficulties of very advanced German-speaking learners of French to master the highly frequent French connective en effet (roughly equivalent to the English 'indeed'). In an online reading and an offline judgement task, advanced learners were not as sensitive as native speakers in detecting the loss of coherence caused by the absence of the connective en effet in confirmation relations, illustrated in (4).
(4) 'Susanne thought that she had lost something. She forgot her purse in the bus.'
Zufferey and Gygax (2017) showed an important aspect of connective mastery: the ability to discriminate relations that can be left implicit from those that need to be marked by a connective, as Crewe (1990) had also noted based on corpus results (see also Van den Bosch et al. 2018 in the context of bilingual third-graders).
Further complication for non-native speakers might be caused by negative transfer from their first language (e.g., Meisuo 2000;Hamed 2014;Don and Sriniwass 2017;Shi 2017;Leedham and Cai 2013). For example, Granger and Tyson (1996) compared corpora of native and non-native speakers and found an overuse of corroborating connectives that they explained in terms of negative transfer effects from their L1 French. Tapper (2005) however concluded that the overuse of corroborating connectives might represent "a shared learner language feature" (Tapper 2005, p. 124), as this effect was found also in a corpus study with L1 Swedish learners. Regarding the comprehension of connectives, Zufferey et al. (2015) further suggested that the ability of advanced learners to detect misuses of connectives linked to transfer effects depended on the nature of the task, namely whether it was measured in an online or an offline task. In a reading task using eye-tracking, learners were as sensitive as native speakers to incorrect uses of connectives, whereas on a grammatical judgment task, they were specifically not able to detect misuses corresponding to correct uses in their L1. These results indicated that negative transfer effects occurred when learners had to explicitly think about rules of correct usage.
In order to examine further other explanatory factors that have not yet been assessed with a population of learners, we will present in the following section findings of L1 research on connective acquisition. The rationale is that factors that complicate the mastery of connectives for non-native speakers might mimic those that complicate the mastery for native speakers.

Factors Affecting First Language Acquisition of Connectives
Several studies reported that even native speakers encounter difficulties when attempting to fully master connectives (e.g., Lamiroy 1994; Zufferey and Gygax 2020a). One possible cause for these difficulties comes from the fact that connectives convey coherence relations with varying degrees of cognitive complexity (Sanders et al. 1992). Indeed, additive relations (5) are cognitively easy, while causal relations seem to be more complex because they require the ability to infer a causal link between segments (6). Concessive relations are the most complex type of relation, as they require the identification of an implicit causal link that is denied (7).
(5) Mary knew several languages and was a trained physicist.
(6) Mary got a leading position because she was a renowned physicist.
(7) Mary got a leading position, but she stayed humble.
The importance of cognitive complexity was found both for children acquiring their first language, who start producing simpler relations before more complex ones (Evers-Vermeul and Sanders 2009), and in the way adults process these relations. For example, Murray (1997) found that adults had longer reading times and lower ratings of coherence for sentences containing concessive connectives compared to causal ones.
Similarly, Crosson et al. (2008) showed that fourth graders of Spanish-speaking background understood addition relations better than concessive ones. However, this effect was only found for high frequency connectives. This frequency effect, central in children's ability to master connectives, has also been found for teenagers and adults (Nippold et al. 1992;Zufferey and Gygax 2020b).
Although these effects have been found to be reliable, some studies have shown variations associated with the degree to which participants were exposed to print (Zufferey and Gygax 2020a). This exposure to print is particularly relevant in predicting correct processing of less frequent connectives.

The Present Experiment
In order to assess the role of the two factors identified for children acquiring their first language and adults, namely cognitive complexity and frequency, we conducted an experiment in which we assessed them and measured the role of exposure to print in both L1 and L2. Based on the findings from the literature reported above, we hypothesized that a greater degree of cognitive complexity might decrease the mastery of connectives conveying these relations. We also hypothesized that more frequent connectives would be mastered better by learners than infrequent ones. Exposure to print, and grammatical knowledge, both important factors for native speakers (Zufferey and Gygax 2020a), can also be expected to be important for learners.

Native Speakers: French Speakers
Thirty-nine native speakers were recruited using the Internet platform Prolific (Prolific, Oxford, UK, www.prolific.co [2020], see also Palan and Schitter 2018). Each of the participants was paid £4.18 for their participation. All participants were students, native French speakers, and showed satisfactory participation (minimum 95% good ratings) in previous studies on the Prolific platform. Twenty-four additional native speakers were recruited through the Department of Psychology of the University of Fribourg in Switzerland. These participants were granted course credit for participation. Together, these two groups formed our control group (n = 63, 40f) with a mean age of 23.1 years (SD = 6.7). For 57 participants French was the only mother tongue, while 6 indicated that they had a second mother tongue besides French.

Non-Native Speakers: German-Speaking Learners of French
The non-native speakers were French-learners, recruited in the final and pre-final classes of three German speaking high schools (i.e., gymnasiums), two in Switzerland (Canton Bern), and one in Germany (federal state of Schleswig-Holstein). The non-native group (n = 151, 100f) had a mean age of 18 years (SD = 1.13). In this sample, 125 of the participants had German as their only mother tongue, while 13 indicated that they have another mother tongue besides German. All of the non-native participants went to school in a monolingual German-speaking school program. All participants were non-native speakers of French.
Note that initially, the sample was composed of 162 participants. However, eleven participants indicated German and French as their mother tongue and were therefore excluded from our sample.
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the ethics protocol was approved by the Swiss National Science Foundation (100012 184882).

Design
In order to assess the mastery of connectives by non-native speakers, twelve French connectives, typically but not exclusively used in the written mode, were chosen for this experiment. The connectives were chosen using the online dictionary of French connectives Lexconn (Roze et al. 2012). The tested relations were: addition, consequence, concession, contrast, cause and condition. These were chosen because they imply various degrees of cognitive complexity. Based on previous studies (e.g., Murray 1997; Sanders et al. 1992; Morera et al. 2017), these relations can be roughly ordered along the following scale of cognitive complexity (left = least complex, right = most complex):
additive < cause < consequence < condition < concession
For each relation, we included the following connectives:
(1) addition: en outre, par ailleurs ('in addition', 'moreover')
(2) consequence: ainsi, c'est pourquoi ('so', 'therefore')
(3) contrast: en revanche, par contre ('in contrast', 'on the other hand')
(4) condition: pourvu que, dans le cas où ('so long as', 'in case')
(5) cause: puisque, car ('since', 'because')
(6) concession: néanmoins, cependant ('nevertheless', 'however')
Within each relation, one of the connectives had a high frequency and the other had a low frequency in corpus data (see Table 1). In order to determine the frequency of each connective, we conducted an analysis of the French web corpus FrTenTen 2017 (over 5 billion words, Jakubíček et al. 2013) using SketchEngine (Kilgarriff et al. 2014).
Table 1. Frequency of the used connectives in the corpus French Web 17 determined by an analysis using SketchEngine. For the connectives "pourvu que" and "puisque" apostrophized variations (i.e., "pourvu qu'", "puisqu'") are also included.
To test the mastery of connectives, we chose a sentence completion task in which the missing element was the connective. A similar methodology was used in previous studies (Nippold et al. 1992; Crosson et al. 2008; Zufferey and Gygax 2020b). To complete each sentence, participants were asked to choose among six connectives (one correct, five foils from the other five relations). As linguistic proficiency may vary even across native speakers and impact connective mastery (Zufferey and Gygax 2020a, 2020b), we also measured grammatical and lexical knowledge in French, as well as the exposure to print in L1 and L2.

To explicitly test the exposure to print in French we chose the French version of the Author Recognition Task (ART-F, Zufferey and Gygax 2020a) as a test for exposure to print in L1. As the participants were learners of French with a minimum level of B1, we assumed that they did not have much reading experience in French. We therefore decided to add, besides the ART-F test, a German version of the ART test (ART-Ger, Grolig et al. 2020) for the learners. As a high exposure to print predicted in previous experiments a better mastery of discourse connectives for native speakers (Zufferey and Gygax 2020a), testing it explicitly for non-native speakers could bring further insight into the acquisition of connectives in L2. In addition, we tested (1) the grammatical competence in French for all participants using the grammar task designed by Zufferey and Gygax (2020a) and (2) the Lextale-test for French (Brysbaert 2013) to measure their lexical knowledge (see Procedure for the order of presentation).

Sentences with Connectives
For the connectives task, 60 items were used. Each item started with a statement with an animate noun phrase, e.g., Alain déteste les chiens ('Alain hates dogs'), which was then followed by a second statement including matching second information, e.g., Son frère les aime bien ('His brother likes them'). Between the two statements a blank was left, and only one connective would render the relation between the two sentences coherent.
As presented in example (1), the only fitting connective is the connective en outre conveying an additive relation, since the relation between the facts that Marie is passionate about gardening and likes sailing is neither contrastive, causal, conditional, nor implies a consequence. Thus, the relations that the other proposed connectives would activate in this context would render the link between sentences incoherent. The other connectives served therefore as foil options in this particular case.
Each of the 12 connectives listed above was the right answer for 5 items, while five other connectives were foils. The foils were chosen in a counterbalanced way among the connectives of the five other relations that were tested during the experiment. For example, the options for the sentences that implied an additive relation were par ailleurs (the correct answer), ainsi, cependant, par contre, dans le cas où, and car (foils). This way we could ensure that each connective was occasionally both a foil and the correct answer. It particularly ensured that the foil options would render the sentence incoherent and only the intended connectives would result in a coherent relation. The items and the design were tested during a pre-test with 50 native French speakers (17f), who were recruited using the Internet platform Prolific (Prolific, Oxford, UK, 2020) prior to the main experiment. Each participant in this pre-test was paid £4.18 for their participation. The pre-test showed that four items did not lead to a clear consensus among native speakers. These four items were therefore replaced in the test (see Supplementary Materials for the final items and their associated connectives). During the experiment, the items were presented in a randomized order. In addition, the options were also presented in a randomized order to ensure that the participants would not prefer any connective over another because of its position.
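The option set for a single item can be illustrated with a minimal sketch. The connective inventory is taken from the Design section; the function name `build_options` and the `foil_picks` mapping are hypothetical, since the exact counterbalancing scheme across items is not spelled out in the text.

```python
import random

# Connectives per relation, as listed in the Design section.
CONNECTIVES = {
    "addition": ["en outre", "par ailleurs"],
    "consequence": ["ainsi", "c'est pourquoi"],
    "contrast": ["en revanche", "par contre"],
    "condition": ["pourvu que", "dans le cas où"],
    "cause": ["puisque", "car"],
    "concession": ["néanmoins", "cependant"],
}


def build_options(target, foil_picks, rng):
    """Return six shuffled options: the target plus one foil per other relation.

    foil_picks maps each non-target relation to index 0 or 1. How the study
    alternated these picks across items is not specified, so this mapping is
    a hypothetical stand-in for the counterbalancing scheme.
    """
    target_relation = next(r for r, cs in CONNECTIVES.items() if target in cs)
    options = [target]
    for relation, connectives in CONNECTIVES.items():
        if relation != target_relation:
            options.append(connectives[foil_picks[relation]])
    rng.shuffle(options)  # randomize the on-screen position of the options
    return options


# Reproduces the example from the text: 'par ailleurs' with its five foils.
picks = {"consequence": 0, "concession": 1, "contrast": 1, "condition": 1, "cause": 1}
options = build_options("par ailleurs", picks, random.Random(0))
print(sorted(options))
```

Drawing exactly one foil from each of the five remaining relations is what guarantees that no foil expresses the same relation as the target.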

Procedure
We used Qualtrics software (Qualtrics LLC, Provo, UT, USA) to design our experiment and the participants performed it online via a weblink. Before starting the experiment, a consent form was presented, and each participant was asked to read it carefully and agree with it.
Participants then moved on to the connectives task, in which they were asked to fill the blanks for each item by clicking on the connective that would render the link between sentences coherent. It was specified that only one option was possible. In order to move on, the participants had to click on an arrow. Only one item and its options were presented at a time. As the main task of the experiment was designed as an offline completion task instead of an online measurement (see Marinis 2012), response times were not recorded, and participants did not have time constraints. The overall duration of a session was around 30 min.
Next, in order to measure the level of exposure to print, participants were asked to complete the French version of the Author Recognition Task (Zufferey and Gygax 2020a). In this task, within 80 names, participants have to indicate which names they recognize as real authors (40 names are actual authors, and 40 names are invented authors). Participants get a point for correctly identifying a real author, and get −1 point for each invented author incorrectly identified. After that, the non-native participants performed the German version of this test (Grolig et al. 2020). In this test, 50 names were actual authors and 25 were invented.
To test participants' lexical knowledge, all participants completed the Lextale-test for French (Brysbaert 2013). In this test, participants had to choose from a list of 84 words which words were, according to them, existing French words. Fifty-six words were indeed real French words, whereas 28 were phonologically plausible but invented.
In order to assess the grammatical competence of the participants, they were asked to complete a grammatical test (Zufferey and Gygax 2020a). In this test, a list of 40 French sentences was presented and the participants were asked to evaluate their correctness. While 20 of the sentences were correct, 20 included grammatical errors typical of written language. The participants evaluated the sentences by moving a continuous scale slider which was labeled "I am sure this is incorrect" on the far left and "I am sure this is correct" on the far right. The sentences were presented one at a time and in a randomized order.
This was followed, for the non-native speakers only, by a detailed self-evaluation of their proficiency in French, as well as learning history questions for French. We created the learning history test based on a modified version of the language history test by Kaushanskaya et al. (2019) and on Li et al. (2014; see also Li et al. 2006). Within this task, the participants had to evaluate the contribution of eight given factors to their own individual acquisition of the French language. The factors were rated by the participants on a scale from 0 to 10 (10 indicating the highest importance). The given factors were friends, family, reading, language method, tv/Internet, radio/music, school, and work. We wanted to explore the link between the subjective importance of these factors and the mastery of connectives. Although purely subjective, an intuitive judgment of great importance for one of these factors might give a hint as to a possible interacting effect with the acquisition of connectives. Finally, demographic questions were asked for all participants, such as age, gender and mother tongue.

Descriptive Analyses of the Proficiency Measures
As mentioned earlier, to obtain a score for the French version of the author recognition task, we awarded +1 point for each real author selected and −1 point for every invented author selected (as in Zufferey and Gygax 2020a). The highest possible score was thus 40. We scored the German author recognition task in the same way, only this time the highest possible score was 50, as the German version included 50 real and 25 foil author names. The outcomes of the Lextale task were analyzed similarly: +1 point was given for every real word selected and −1 point for every invented word selected. The maximum possible score in this task was 56 points. For the grammar task, the answers given on the continuous slider were converted to a scale from 0 to 100, in which 0 indicated the lowest score and 100 the highest score.
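The scoring rule shared by the author recognition tasks and the Lextale task can be sketched as follows. The function name `checklist_score` and the placeholder item names are hypothetical and are not taken from the actual test materials.

```python
def checklist_score(selected, real_items, invented_items):
    """Score a checklist task: +1 per selected real item, -1 per selected invented item."""
    hits = len(set(selected) & set(real_items))
    false_alarms = len(set(selected) & set(invented_items))
    return hits - false_alarms


# Hypothetical placeholder names, not items from the actual tests.
real = {"author_A", "author_B", "author_C"}
invented = {"fake_X", "fake_Y"}

# Two real authors and one invented author were ticked.
score = checklist_score({"author_A", "author_C", "fake_X"}, real, invented)
print(score)
```

Under this rule the maximum score equals the number of real items (40 for ART-F, 50 for ART-Ger, 56 for Lextale), reached only when every real item and no invented item is selected.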
Since the value 0 indicated a good score for a correct sentence and a bad score for an incorrect sentence, we calculated the mean score of the grammar task for each participant using the formula:
Mean(mean(incorrect sentences) + mean(100 − correct sentences))
Mean scores of all proficiency measures are shown in Table 2, and scatterplots are shown in Figures 1-3.
Table 2. Mean scores of the Lextale, the French author recognition task (ART-F), the German author recognition task (ART-Ger) and grammar tasks by native and non-native speakers.


Connective Production Task
In the main task, for every right answer a 1 was recorded and for every false answer a 0 (the mean range was thus 0-1). Table 3 shows the mean scores for each tested connective (see also Figure 4).
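As an illustration of how the per-connective means are obtained from the 0/1 coding, consider the following sketch. The responses are hypothetical toy data, not results from the experiment, and the function name `mean_accuracy` is our own.

```python
from collections import defaultdict


def mean_accuracy(responses):
    """Average 0/1 responses per connective.

    responses: iterable of (connective, correct) pairs with correct in {0, 1}.
    """
    totals = defaultdict(lambda: [0, 0])  # connective -> [sum of correct, count]
    for connective, correct in responses:
        totals[connective][0] += correct
        totals[connective][1] += 1
    return {c: s / n for c, (s, n) in totals.items()}


# Hypothetical toy responses (not data from the study).
toy = [("car", 1), ("car", 1), ("car", 0), ("car", 1),
       ("néanmoins", 0), ("néanmoins", 1)]
means = mean_accuracy(toy)
print(means)
```

Because each response is coded 0 or 1, every mean is a proportion of correct answers between 0 and 1.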

To account for the binary nature of our dependent variable (a correct or incorrect answer for each item), we ran generalized logistic mixed models fitted by maximum likelihood. As advocated by Schreiber-Gregory (2018), we initially checked that all assumptions of logistic regression were met (i.e., appropriate outcome structure, absence of multicollinearity, linearity of independent variables and log odds, and large sample size). As we designed the experiment as a repeated measures design, the assumption of observation independence was not met, but we accounted for this by using a mixed-effects model (i.e., adding random effects for both participants and items to the model).
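In standard notation, a logistic mixed model with random intercepts for participants and items can be written as below. The random-intercept structure follows the final model formula reported in the analysis section; the symbols ($y_{pi}$, $\mathbf{x}_{pi}$, $u_p$, $w_i$) are our own notational assumptions and do not appear in the original analysis.

```latex
\operatorname{logit} P(y_{pi} = 1)
  = \mathbf{x}_{pi}^{\top}\boldsymbol{\beta} + u_p + w_i,
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2), \quad
w_i \sim \mathcal{N}(0, \sigma_w^2)
```

Here $y_{pi}$ is the response (1 = correct, 0 = incorrect) of participant $p$ on item $i$, $\mathbf{x}_{pi}$ collects the fixed-effect predictors, and $u_p$ and $w_i$ are the participant and item intercepts that absorb the dependence between repeated observations.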
For creating the models we followed Baayen's (2008) procedure, using a forward approach: we included fixed effects one at a time and compared each resulting model to the preceding one using a log-likelihood test. The initial model, our random model, did not include any fixed effect. We kept adding effects until the model no longer improved, and only where this was justified by our experimental design. For creating the generalized logistic mixed models we used R (R Core Team 2020, Version 4.0.0), specifically the glmer()-function of the lme4-package (Bates et al. 2015). To assess the fit of the model, we used the anova()-function of the base-R-package (R Core Team 2020) to calculate the χ2 value of the log-likelihood test. In order to obtain p-values we used the summary()-function from the car-package (Fox and Weisberg 2019). As a post-hoc test we computed the least squares means of the fixed effects using the lsmeans()-function (Tukey method for comparing a family of 12 estimates, confidence level: 0.95) of the emmeans-package (Lenth 2020).

Analysis of Frequency, Relation and Language Group
After constructing our null model, which included no fixed effect and only participants and items as random effects, we compared this model to one that had language group as a fixed factor. The model improved significantly (χ2 = 113.6, ∆df = 1, p < 2.2 × 10^−16). We further added relation as an interacting fixed effect and the model again showed a better fit (χ2 = 228.74, ∆df = 11, p < 2.2 × 10^−16). We then added frequency as an interacting fixed effect and the model improved further (χ2 = 416.14, ∆df = 23, p < 2.2 × 10^−16). Since no other fixed or random effect or slope would improve our model (nor would it have been justified by our design), we kept the following model as our final model (in R notation style):
value of main task ~ language group * relation * frequency + (1|participant) + (1|item)
The output of our final model is shown in Table 4. Signif. codes: "***" < 0.001, "**" < 0.01, "*" < 0.05.
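The p-values of the model comparisons above follow from the upper tail of the χ2 distribution with ∆df degrees of freedom. The sketch below is not the authors' code (they used anova() in R). It evaluates that tail probability with a closed-form recurrence that is valid for integer degrees of freedom, and the function name `chi2_sf` is our own.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), starting from
    Q(1, y) = exp(-y) for even df and Q(1/2, y) = erfc(sqrt(y)) for odd df,
    where y = x / 2 and the target is Q(df / 2, y).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Chi-square values and degrees of freedom as reported in the text.
reported = [(113.6, 1), (228.74, 11), (416.14, 23)]
p_values = [chi2_sf(chi2, df) for chi2, df in reported]
print(all(p < 2.2e-16 for p in p_values))
```

All three tail probabilities fall below the bound of 2.2 × 10^−16 given in the text.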
As for frequency, we found significant differences between the mastery of different connectives when comparing language groups (see Table 6 for the details), with the exception of the less frequent connective néanmoins and the highly frequent connective par contre. Interestingly, both groups mastered these two connectives equally well.
When only considering native speakers, the post hoc test mainly revealed that the native speakers mastered the causal relation particularly well, whereas they visibly struggled more with connectives conveying a concessive relation. More precisely, the connectives conveying condition, consequence, and cause relations were mastered significantly better than those conveying concession (see Table 7). Further, as can be seen in Figure 5, connectives conveying a causal relation were mastered better than connectives conveying additive, concessive and contrastive relations.
For the non-native speakers, the post hoc test revealed no significant differences between the different relations. As can be seen in Figure 6, the non-native speakers of our experiment had a similar level of proficiency for all tested relations. However, when considering each relation independently, and comparing the more frequent connective to the less frequent one, the analyses showed significant differences within the relations of consequence (i.e., between ainsi and c'est pourquoi), cause (i.e., between car and puisque), and contrast (i.e., between en revanche and par contre) but not within the relations of concession (i.e., between cependant and néanmoins), condition (i.e., between dans le cas où and pourvu que), and addition (i.e., between en outre and par ailleurs), as shown in Table 8.
Figure 6. Percentage of correct answers by non-native speakers per relation.

Connective Production Task and Proficiency Measures
Since we measured participants' overall language proficiency with the grammar task and the Lextale task, and their exposure to the written mode with the French and German author recognition tasks, we analyzed whether a high score in those tasks would predict a good score in the main task. In order not to violate the assumption of logistic regression that multicollinearity is absent (that is, that the fixed effects are not correlated with each other; Schreiber-Gregory 2018), we first conducted a series of correlation tests on the scores. For each pair of scores, we first checked the distributions and variances. We did so by using plots, the Shapiro-Wilk test of normality (the shapiro.test() function of the stats package, v. 3.6.2; R Core Team 2020), and the Fligner-Killeen test of homogeneity of variances (the fligner.test() function of the same package). The tests and plots showed that the data were not normally distributed and that the variances were not homogeneous. We therefore decided to use Spearman's rank correlation test. We conducted this test with the cor.test() function of the stats package (v. 3.6.2; R Core Team 2020) and the ggpairs() function of the GGally package (Schloerke et al. 2020) in R. As seen in Table 9, the rank correlation coefficients showed several slight and moderate correlations (see also Figures 7 and 8).
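To make the rank-based correlation concrete, the following is a minimal sketch of how Spearman's ρ can be computed, written in plain Python rather than with the R functions named above. The function names and the toy scores are hypothetical and serve only as an illustration; they are not the study's data or code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy scores for five participants (not the study's data).
grammar_scores = [12, 15, 9, 18, 15]
lextale_scores = [55, 70, 48, 80, 66]
rho = spearman_rho(grammar_scores, lextale_scores)
```

Because the coefficient depends only on ranks, it requires neither normally distributed scores nor homogeneous variances, which is the reason given above for preferring it.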
Table 9. Correlation scores between the language proficiency measurements of native and non-native speakers (Spearman's ρ and p-value).
Thus, the absence of multicollinearity could not be guaranteed, and we decided to test the different language proficiency measurements separately, for each language group, as predictors of the mastery of connectives. By doing so, we accounted for the multicollinearity of the measurements as well as for the fact that the language proficiency tasks for native speakers did not necessarily measure the same thing as the tasks for non-native speakers. Indeed, for native speakers a higher score in the grammar task would indicate greater language expertise, whereas for non-native speakers a good score in this task indicated higher language proficiency. As in the earlier analyses, we set participants and items as random factors in all models; the null model in each analysis included no fixed effect.
Before we conducted the analyses for the different language proficiency tasks, we rescaled the scores of the grammar task, the Lextale task, and the author recognition tasks to the range 0-1 and rounded them to one decimal place (hence 0.0-1.0), since this facilitated the model calculation and was justified by the experimental design.
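The rescaling step can be sketched as follows. This is an illustrative Python version under stated assumptions: the function name, the maximum score of 40, and the raw scores are hypothetical, since the maximum scores of the individual tasks are not given here.

```python
def rescale_unit(score, max_score, min_score=0.0):
    """Map a raw score onto the 0-1 range and round to one decimal place."""
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    scaled = (score - min_score) / (max_score - min_score)
    return round(scaled, 1)


# Hypothetical raw scores on a task with an assumed maximum of 40 points.
raw_scores = [8, 33, 40, 0]
rescaled = [rescale_unit(score, max_score=40) for score in raw_scores]
```

Putting all predictors on the same 0-1 range makes the model estimates comparable across tasks: each estimate then describes the change in log-odds between the lowest and the highest possible score.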

Grammar Task
For the non-native speakers, we successively added to our null model (which did not include any fixed effect) the score of the grammar task (improvement of the model: χ² = 17.55, ∆df = 1, p = 2.795 × 10⁻⁵), frequency as an interacting effect (improvement of the model: χ² = 22.21, ∆df = 3, p = 5.909 × 10⁻⁵), and relation as an interacting fixed effect (improvement of the model: χ² = 109.62, ∆df = 23, p = 2.945 × 10⁻¹³). Other fixed effects did not improve the model. The outcome of our final model shows that the grammar task score was a significant predictor of the connectives task for the non-native speakers (estimate = 2.61, SE = 0.72, z = 3.61, p < 0.001; intercept: estimate = −0.61, SE = 0.35, z = −1.77, p = 0.08).
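The model comparisons reported here are likelihood-ratio tests: the χ² value is twice the difference in log-likelihood between the larger and the smaller model, and ∆df is the number of added parameters. As a hedged illustration only (a Python sketch, not the authors' R code), the p-value for a comparison with ∆df = 1 can be recomputed from the reported χ² using the standard library. The function names and the log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(loglik_null, loglik_full):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)


def lrt_p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)),
    where Z is a standard normal variable.
    """
    if chi_square < 0:
        raise ValueError("chi_square must be non-negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


# Hypothetical log-likelihoods, only to show how the statistic is formed.
example_statistic = lrt_statistic(loglik_null=-100.0, loglik_full=-95.0)

# Reported comparison: grammar score added to the null model (chi2 = 17.55).
p_grammar_nonnative = lrt_p_value_df1(17.55)
```

The recomputed value is close to the reported p = 2.795 × 10⁻⁵; a small discrepancy is expected because the χ² value is rounded to two decimals. Comparisons with ∆df greater than 1 need the general chi-square distribution, which this sketch does not cover.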
We proceeded the same way for the native speakers, i.e., we successively added to our null model, which included only the random effects of participants and items, the score of the grammar task (improvement of the model: χ² = 14.41, ∆df = 1, p < 0.001), frequency (improvement of the model: χ² = 15.16, ∆df = 3, p < 0.01), and relation (improvement of the model: χ² = 88.08, ∆df = 23, p = 1.488 × 10⁻⁹). Adding other effects did not improve the fit of the model. The output showed that the grammar task score was a significant predictor of connective mastery for native speakers as well (estimate = 3.06, SE = 0.95, z = 3.22, p < 0.01; intercept: estimate = −0.45, SE = 0.59, z = −0.76, p = 0.45). As can be seen in Figure 9, a higher score in the grammar task predicted a higher score in the main task for both native and non-native speakers.
Figure 9. Relationship between the scores of the main task and the grammar task by native and non-native participants.

French Author Recognition Task
For the non-native speakers, adding the score of the French author recognition task (ART-F) to our null model (which included only random effects and no fixed effects) improved the model's fit (χ² = 6.92, ∆df = 1, p < 0.01), as did adding frequency (improvement of the model: χ² = 8.88, ∆df = 3, p = 0.031) and relation (improvement of the model: χ² = 81.00, ∆df = 23, p = 2.188 × 10⁻⁸). This final model showed that the ART-F score was not a significant predictor of the main task score for the non-native speakers (estimate = 1.45, SE = 0.95, z = 1.53, p = 0.13; intercept: estimate = −0.03, SE = 0.34, z = −0.09, p = 0.93).
For the native speakers, adding the ART-F score improved the model fit in comparison with the null model (χ² = 7.51, ∆df = 1, p < 0.01). Adding frequency and relation improved the model further (for frequency: χ² = 9.97, ∆df = 3, p = 0.02; for relation: χ² = 78.28, ∆df = 23, p = 6.02 × 10⁻⁸). The results show that the ART-F score is a significant predictor of the main task score for the native speakers (estimate = 2.7, SE = 1.01, z = 2.67, p < 0.01; intercept: estimate = 0.14, SE = 0.50, z = 0.28, p = 0.78). Hence, as can be seen in Figure 10, a higher score in the French author recognition task correlated with a higher score in the main task only for the native speakers.
Figure 10. Relation between the scores in the main task and in the French author recognition task by native and non-native participants.

German Author Recognition Task
Only non-native speakers were analyzed, since we did not test our native control group with the German author recognition task. In comparison to our null model, which included participants and items as random effects but no fixed effect, the model with the score of the German author recognition task showed a better fit (χ² = 9.73, ∆df = 1, p < 0.01). Adding frequency and relation as interacting fixed effects improved the model further (for frequency: χ² = 15.67, ∆df = 3, p < 0.01; for relation: χ² = 88.18, ∆df = 23, p = 1.432 × 10⁻⁹). The final model indicated that the German author recognition task score was a significant predictor of the main task score (estimate = 1.07, SE = 0.53, z = 2.04, p = 0.04; intercept: estimate = 0.05, SE = 0.27, z = 0.20, p = 0.84). As can be seen in Figure 11, a higher score in the German author recognition task predicted a higher score in the main task.
Figure 11. Relation between the scores in the main task and in the German author recognition task by the non-native participants.

Lextale Task
Regarding native speakers, we improved the model's fit by successively adding the score of the Lextale task (χ² = 20.55, ∆df = 1, p = 5.814 × 10⁻⁶), frequency (χ² = 20.78, ∆df = 3, p < 0.001), and relation (χ² = 104.88, ∆df = 23, p = 2.002 × 10⁻¹²). Our final model showed that the Lextale score was a significant predictor of the main task score for native speakers (estimate = 4.00, SE = 1.64, z = 2.44, p = 0.01; intercept: estimate = −1.98, SE = 1.36, z = −1.45, p = 0.15). A higher score on the Lextale task predicted a higher score in the main task for both native and non-native speakers, as can be seen in Figure 12.

Figure 12. Relation between the scores in the main task and in the Lextale task by the non-native and native participants.

Details of the Linguistic Self-Evaluation of the Non-Native Participants
We also tested whether the responses from the self-evaluation of our non-native participants had significant effects on the main task score. To do so, we analyzed the non-native data only and tested the participants' subjective ratings individually as fixed effects, with participants and items set as random effects.
Regarding the factors that the participants had to evaluate concerning their contribution to their personal acquisition of French, we first tested the factor reading. After successively adding to the model the importance for L2 acquisition attributed to reading (improvement of the model: χ² = 10.06, ∆df = 1, p < 0.01), the relation (improvement of the model: χ² = 25.23, ∆df = 11, p < 0.01), and the frequency (improvement of the model: χ² = 86.91, ∆df = 23, p = 2.332 × 10⁻⁹), the result showed that the factor reading had a significant effect on the mastery of the connectives (estimate = 0.18, SE = 0.05, z = 3.61, p < 0.001; intercept: estimate = −0.25794, SE = 0.28, z = −0.94, p = 0.35). This relationship can be clearly seen in Figure 13. Finally, we tested the participants' ratings of the other factors (i.e., school, friends, family, language method, TV/Internet, radio/music, and work) as well, but none of the models that included only the tested factor as a fixed effect showed an improvement in comparison to the null model, meaning that these factors did not show any effect on the main task score.
Figure 13. Relation between the scores in the main task and the subjective importance of reading for the personal acquisition of the French language (0-10) by the non-native participants.

Discussion
In order to investigate the factors that hinder non-native speakers' ability to master discourse connectives, we tested participants' production competence on six coherence relations and 12 different connectives for native and non-native speakers. Participants were asked to fill in blanks with connectives that seemed most appropriate to them. We also tested the exposure to print in L1 and L2 as well as different types of language proficiency, i.e., lexical and grammatical knowledge of L2, as potential explanatory variables.

Scores on the Connectives Task
Our results revealed that the expected factors (cognitive complexity and frequency) did not seem to be the best explanatory factors for non-native speakers' performance. In fact, for non-native speakers, a frequency effect was only present for the relations of cause, contrast, and consequence. For the latter, participants were even better at using the less frequent connective (i.e., c'est pourquoi). Contrary to the initially assumed frequency hypothesis, it appears that non-native speakers rely on "comfort words" (i.e., words that they are comfortable using and that they know how to master, see Hasselgren 1994), independently of their frequency. This is consistent with corpus-based studies showing an overuse of only certain linking devices¹, which decreases with increasing language proficiency (Leedham and Cai 2013). The fact that non-natives achieved a native-like score in our experiment for the connective par contre ('on the other hand') might be explained by the more familiar register of this connective. Some studies have shown that L2-learners tend to have a more colloquial style, that is, to use more informal connectives than natives (Leedham and Cai 2013). The non-native speakers of our experiment might thus be more familiar with par contre, which is frequently used in speech, than with the more formal en revanche, even though the latter is more frequent in writing.
¹ Crewe (1990) encourages L2-learners to rely first on a small set of connectives and to expand it eventually.
In the case of c'est pourquoi ('that's why'), we argue that the high score is due to its lexical transparency. Indeed, c'est ('that is') and pourquoi ('why') are introduced at an early stage in second language acquisition. Even though connectives have been grammaticalized and form linguistic elements of their own, it might be assumed that learners can derive their meaning, even with relatively little language proficiency, when they understand and master the basic lexical elements that form the connective. However, transparency effects seem to differ from language transfer effects, since an equivalent of the French connective c'est pourquoi ('that's why') is practically nonexistent in German. Although a literal translation might be possible, this form is unpopular and hardly ever used: a corpus search in the German web corpus deTenTen13 (over 19 billion words, see Jakubíček et al. 2013) retrieved only 11 occurrences of the literal German translation of c'est pourquoi, das ist warum, and zero occurrences of the similar equivalents das ist wieso and das ist weshalb. Indeed, the more intuitive alternative for a native speaker is the German connective deswegen ('therefore'), which has 885,542 occurrences in the same German web corpus. Therefore, in the case of c'est pourquoi, a direct transfer effect from the German language is unlikely.
Whereas the high scores of c'est pourquoi and par contre by non-native participants might thus be explained by register difference and meaning transparency, the native-like use of néanmoins ('nevertheless') by our non-native participants is intriguing. Here, and this is only speculative, there might have been a transfer effect, as néanmoins is similar to the German concessive connective nichtsdestotrotz ('nonetheless'). Note that native speakers scored comparatively low for this infrequent concessive connective, thus approaching non-native speakers' performance.
Regarding the results of the native speakers of our experiment, we did not find an overall frequency effect, i.e., more frequent connectives were not generally mastered better than infrequent ones, and we thus failed to replicate the findings of Nippold et al. (1992) and Zufferey and Gygax (2020a). However, since our experiment aimed to measure connective mastery by L2 learners, we did not choose highly infrequent connectives. We can therefore assume that the frequency effect might still occur, yet only when less frequent connectives are tested (as in Zufferey and Gygax 2020a).
We did, however, find performance differences between relations in our native speakers, which suggests that the degree of cognitive complexity of a relation is a predicting factor for L1 speakers' connective mastery. Our native speakers scored better for connectives of causality than for those that conveyed addition, conditional, and concessive relations. They also scored worse for connectives that conveyed a concessive relation than for those conveying conditional or consequence relations. These findings support the assumption that causal relations are not only easier to process than concessive ones (Murray 1997; Morera et al. 2017) but are also mastered better by native speakers. In addition, our native participants scored significantly better for causal relations than for additive ones. This is in line with Sanders and Noordman (2000), who found evidence that readers processed causal relations faster than additive ones. They actually suggested that readers have a "preference for detecting meaningful explanation" (Sanders and Noordman 2000, p. 53) as well as a causal expectation of the text structure. The fact that native speakers in our experiment did not perform at ceiling (i.e., 86% correct answers) supports the findings of other studies showing that language proficiency varies even among native speakers (Zufferey and Gygax 2020a). Several studies have shown large differences in connective proficiency between native speakers, which appeared to depend on age and exposure to print (e.g., Bolton et al. 2002; Lamiroy 1994; Zufferey and Gygax 2020a, 2020b).

Link with Exposure to Print and Language Proficiency
In our experiment, exposure to print did predict performance in the connective production task, for both native and non-native speakers. In addition to the results of the author recognition tasks (which are discussed below), the self-reports of reading habits also supported the assumption that a high exposure to print is an important factor for connective mastery in L2. Among all the factors (e.g., school, Internet, friends, music) whose significance for their own acquisition of the French language the participants had to evaluate, only reading was a significant predictor of the mastery of connectives. Participants who considered reading to be personally important for their acquisition of French were more likely to score better in our connective production task. Since we did not find a correlation between the ART scores and the self-evaluated importance of the factor reading, this finding may represent an additional and independent signal that a high exposure to print is central for the mastery of connectives. It is furthermore noteworthy that a higher importance for L2-acquisition attributed to the factors school or language method did not predict a better mastery of the connectives, even though both factors are, by definition, supposed to teach the correct use of connectives.
Regarding the scores of the author recognition tasks, we found that for non-native speakers a high exposure to print in their L1 correlated with a better mastery of connectives in L2. We did not find this effect for the ART-F for our non-native speakers, which is not surprising in light of the rather strong floor effect (i.e., non-native speakers did not recognize many French-speaking authors). Still, the fact that the native speakers' scores on the French version of the author recognition task are nearly identical to those obtained by Zufferey and Gygax (2020a) (mean in our experiment: 8.02, SD = 5.5; mean obtained by Zufferey and Gygax 2020a: 8.05, SD = 5.88) confirms that the French version of the author recognition task can be considered a robust measure of exposure to written text for native French speakers.
The fact that the German author recognition task represents for the non-native speakers a predictive factor may shed light on an effect of the "inter-dependence hypothesis" (Cummins 1984, as cited in Degand and Sanders 2002), and the "Linguistic Coding Differences Hypothesis" (e.g., Sparks et al. 2006). The latter hypothesis states that the acquisition of a second language is based on competence in the native language. As such, individual skills in the first language may be transferred to the L2. Sparks et al. (2006) found for example that "native written language measures were the best predictors of overall FL [foreign language] proficiency" (Sparks et al. 2006, p. 146). As the participants grew older, the factor of literacy in L1 became a predicting factor for L2-proficiency. The researchers suggested "a speculative link" (Sparks et al. 2006, p. 152) between home literacy activities in L1, which resulted in a higher L1-literacy, with the increase of the L2-proficiency. Dufva and Voeten (1999) also found that literacy (as defined by word recognition and comprehension skills) was a significant predicting effect for L2-proficiency. The researchers concluded thus that a high strategic text processing skill in L1 might be transferred to the second language (see also Van Gelderen et al. 2007;Sparks et al. 2012).
The results from our experiment support this hypothesis, as we found high L1-literacy to predict the ability to correctly use connectives in L2. Since we found an effect for the German author recognition task but not for the French version, our study supports the idea that reading, even in a different language, strengthens the general understanding of discourse connectives, raises a global consciousness about the procedural meaning that they convey, and consequently sharpens their use even in L2. As the influence of a differing L1 proficiency among the participants has been, to our knowledge, widely neglected in the research of the acquisition of L2 discourse connectives, further research might focus on participants' competences in their native language as a further explanatory variable. Together with the outcome of the author recognition tasks, our results suggest that the understanding and mastery of discourse connectives may be linked to an overall language-independent reading competence.
In addition to exposure to written language, lexical knowledge of a language, measured in our experiment with the Lextale test, as well as grammatical knowledge, were predictive variables for the mastery of discourse connectives for both native and non-native speakers. As our participants' proficiency with connectives increased along with their proficiency in grammar and lexicon, we can conclude that mastering connectives is an integral part of second language acquisition, which involves both extensive vocabulary and grammatical competence. Future research could address in more detail the actual causal nature of those factors in the mastery of discourse connectives.

Implications for Teaching
Our results seem to have important implications for second language teaching, since a highly transparent connective might be considered a good starting point for non-native speakers to learn connectives in L2 effectively. Indeed, if highly transparent connectives are introduced first, L2-learners could rapidly learn to express a wide variety of coherence relations. However, it should be noted that in these cases learners might fall prey to negative transfer effects, to misconceptions about the use of a connective caused by its grammaticalization, and to issues of register uncertainty. For example, as our results show, non-native speakers' intuitive grasp of the meaning of the highly transparent French consequence connective c'est pourquoi ('that's why') could not only result in an overuse of the connective but could also cause problems regarding the specific register of a situation (i.e., situations where other connectives conveying a consequence relation, such as ainsi or donc, would be more appropriate). Especially for more advanced L2-writers, one should consider all connectives used by skilled native writers (Chen 2014). As stated in Granger and Tyson (1996) and Crewe (1990), teaching connectives with a direct translation or with interchangeable lists of supposed translation equivalents might mislead non-native learners. Instead, introducing connectives with examples and detailed metalinguistic comments on how they convey which kind of relation in a given sentence, including an authentic context, would be more favorable.