Article

That Came as No Surprise! The Processing of Prosody–Grammar Associations in Danish First and Second Language Users

by Sabine Gosselke Berthelsen 1,* and Line Burholt Kristensen 2
1 Center for Languages and Literature, Lund University, 221 00 Lund, Sweden
2 Department of Nordic Studies and Linguistics, University of Copenhagen, 1172 Copenhagen, Denmark
* Author to whom correspondence should be addressed.
Languages 2025, 10(8), 181; https://doi.org/10.3390/languages10080181
Submission received: 6 December 2024 / Revised: 3 July 2025 / Accepted: 7 July 2025 / Published: 28 July 2025

Abstract

In some languages, prosodic cues on word stems can be used to predict upcoming suffixes. Previous studies have shown that second language (L2) users can process such cues predictively in their L2 from approximately intermediate proficiency. This ability may depend on the mapping of the L2 prosody onto first language (L1) perceptual and functional prosodic categories. Taking as an example the Danish stød, a complex prosodic cue, we investigate an acquisition context of a predictive cue where L2 users are unfamiliar with both its perceptual correlates and its functionality. This differs from previous studies on predictive prosodic cues in Swedish and Spanish, where L2 users were only unfamiliar with either the perceptual make-up or functionality of the cue. In a speeded number judgement task, L2 users of Danish with German as their L1 (N = 39) and L1 users of Danish (N = 40) listened to noun stems with a prosodic feature (stød or non-stød) that either matched or mismatched the inflectional suffix (singular vs. plural). While L1 users efficiently utilised stød predictively for rapid and accurate grammatical processing, L2 users showed no such behaviour. These findings underscore the importance of mapping between L1 and L2 prosodic categories in second language acquisition.

1. Introduction

Prosody is rarely considered in second language processing studies. Yet, sensitivity to prosodic cues is not only relevant for studies of emotion processing (de Marco, 2019) and intonation (van Maastricht et al., 2016). It can also be relevant for lexical distinction and efficient grammar processing. In this article, we first summarise previous work on how first language (L1) users and second language (L2) users make use of prosodic cues like word tones and stress to process upcoming suffixes, focusing on the few existing studies of L2 users of Spanish and Swedish, before turning to the prosodic phenomenon of interest in this study: Danish stød. Stød is a complex “laryngeal syllable rhyme prosody”1 (Basbøll, 2014) with many functions in spoken Danish, one of which is to index specific types of upcoming suffixes. Specifically, upon hearing a stød or non-stød realisation at the beginning of a word, the listener can predict the word’s possible endings, which reduces the processing effort of the already-activated endings (cf. Section 2.3). In this study, we contrast how L1 users of Danish process the predictive properties of this prosodic cue with how they are processed by L2 users of Danish whose L1 is German. Using a mismatch paradigm, we measure the additional processing time when a prosodic cue surprisingly is followed by a non-predicted suffix.
Our study is the first experimental study of L2 processing of Danish prosody, and illustrates how sensitivity to prosodic cues in Danish can facilitate the processing of noun morphology, and how such facilitatory effects differ between L1 users and L2 users. Stød is an acoustic-phonetically and distributionally complex prosodic cue that does not easily map onto the prosodic categories in our L2 users’ L1. Focusing on this cue, we show the impact of poor L1-L2 mapping on L2 speech processing, specifically the prediction of grammatical structure, at different acquisitional stages.

2. Theoretical Background

2.1. Predictive Prosody–Grammar Associations in an L1

In many languages, prosody is involved in the morphological construction of words. This is true in particular for languages where the addition of derivational or inflectional morphemes to lexical word stems leads to morphophonological alternations. Depending on the language, different prosodic phenomena, such as lexical stress (e.g., English system, systematic, systematicity), tone (e.g., Somali gooli ‘lioness’, goól ‘lioness’s’ [with ’ indicating a high tone]; Banti, 1988), or word accents (e.g., Japanese kíru ‘to cut’, kiránai ‘cut.neg’; Labrune, 2012), can be alternated in this context. The weighting of acoustic correlates involved in the process differs between languages. For lexical stress, for instance, duration is the strongest cue in Thai (Cutler, 2005), while F0 and intensity are more important cues for the realisation of lexical stress in Spanish (Cutler, 2005; Ortega-Llebaria et al., 2013). Tone can be expressed through differences in pitch height or pitch direction (Banti, 1988; Crysmann, 2015; Gandour, 1983; Oomen, 1981), and word accents are typically realised in pitch patterns (Labrune, 2012; Langston, 1997; Riad, 2014; Rischel, 2008), but related morphophonological features can be expressed through voice quality differences (Basbøll, 2003).
Importantly, and regardless of the acoustic cues involved, morphophonological alternations may lead to a predictive relationship between prosody on the word stem and continuations of the word. For example, a high tone on an initial syllable ki in Japanese would cue the word kíru, but not kiránai. Similarly, an initial unstressed syllable in Armenian, Polish, or Spanish, which all assign lexical stress with respect to the right word boundary (Dolatian, 2019; Gussmann, 2007; Ortega-Llebaria, 2006), can alert the listener to the presence of an upcoming suffix (e.g., Armenian amusin ‘husband.sg’ vs. amusin-ner ‘husband-pl’; Dolatian, 2019).
Experimental work on prosody–grammar associations in Danish, Spanish, and Swedish has shown that first language (L1) users can internalise such morphophonological connections between prosody and grammar and use them when processing speech input. Specifically, owing to the potentially predictive relationship, L1 users have been observed to actively predict word continuations based on prosodic information on the word stem (Roll, 2022).
This was first illustrated for Swedish word accents. In Central Swedish, if the stem hund- ‘dog-’, for example, is produced with accent 1 (a low tone), it can be followed by the definite suffix -en, but not by the plural suffix -ar. The plural -ar is instead preceded by a word stem with accent 2 (a high falling tone). The strong association between prosody and inflectional suffixes makes the prosodic feature predictive. This can facilitate language processing, as has been observed both in brain responses and in decreased response times (Roll, 2015, 2022; Roll et al., 2010, 2013, 2015, 2017, 2023; Söderström et al., 2012, 2016, 2017a, 2017b, 2018). Specifically, upon hearing a more predictively useful prosodic cue, L1 listeners’ brains tend to produce a strong pre-activation negativity (PrAN; Roll et al., 2015). The strength of this brain response is negatively correlated with the number of possible word continuations: the fewer continuations there are available, the stronger the neural response (Söderström et al., 2016). The PrAN is, therefore, considered to index listeners’ active prediction of upcoming suffixes (León-Cabrera et al., 2024). If a predictive word stem with its prosodic cue is subsequently followed by a non-associated ending and predictions are not met, L1 listeners’ brains further produce an increased N400 response (or LAN, depending on the task) and an increased P600. Increased N400 and LAN are neural responses related to failed semantic and morphological prediction, respectively (Gosselke Berthelsen et al., 2018; León-Cabrera et al., 2017, 2024; Söderström et al., 2016), while the posterior P600 indexes the need for a revision or repair after failed morphosyntactic prediction (Kutas et al., 2011; Kuperberg et al., 2020).
Behaviourally, Swedish L1 listeners are also affected when tonal word stems are not followed by the associated suffix: they are considerably slower at accessing the grammatical features of a non-predicted suffix (Söderström et al., 2012).
Similar patterns of response times and neurophysiological data have been observed for prosody–grammar associations in Danish (Clausen & Kristensen, 2015; Hjortdal et al., 2022). In Danish, word stems bear either the prosodic feature stød (a form of laryngealisation) or non-stød (its contrasting modal voice counterpart). Stød and non-stød on word stems are associated with different suffix ranges; see Section 2.3.
These findings are complemented by eye-tracking data and word completion studies for prosody–grammar associations in Spanish verbs, where the stress pattern of the verb is dependent on the subsequent suffix (Sagarra & Casillas, 2018). This is caused by inflectional suffixes in Spanish, which either attract stress or not. As a result, the verb stem is unstressed before certain suffixes (e.g., third person past or future suffixes: tom-ó take-pst.3sg, or tom-ará take-fut.3sg) and stressed before others (e.g., third person present tense suffixes: tom-e take-prs.3sg). Hence, similarly to Swedish word accents and Danish stød, the verb stem prosody in Spanish becomes predictive of possible, associated suffixes.
These findings from the prosody–grammar domain in Swedish, Danish, and Spanish add support to contemporary language processing theories that see prediction as an integral part of language comprehension (Christiansen & Chater, 2016; Clark, 2013; Friston, 2010; Kuperberg & Jaeger, 2016; Levy, 2008). They supplement well-documented effects of failed prediction for other types of phonological cues (DeLong et al., 2005), semantic cues (Kutas et al., 2011), pragmatic cues (Nieuwland & van Berkum, 2006), and information structural cues (Kristensen & Wallentin, 2015), and show that predictions can also be based on acoustic cues such as intensity, duration, F0, or voice quality involved in the realisation of prosodic phenomena like lexical stress, tone, word accents, or stød.

2.2. Processing Predictive Prosody–Grammar Associations in an L2

While L1 users use prosody-based cues when processing upcoming morphology in an effortless and largely automatic way, L2 users do not seem to use these cues to the same extent, at least not in the early stages of second language acquisition (SLA). At the beginner level, neither Swedish L2 users (Gosselke Berthelsen et al., 2018) nor Spanish L2 users (Lozano-Argüelles et al., 2020; Sagarra & Casillas, 2018) use prosodic cues to predict suffixes. With increased language proficiency, learners seem to make better use of the associations. In Spanish, advanced learners can use stress cues to predict verbal suffixes (Lozano-Argüelles et al., 2020; Sagarra & Casillas, 2018), and for Swedish word accents, behavioural responses indicate the onset of grammar prediction at intermediate proficiency (Schremm et al., 2016).
The roles of the learners’ language background and L1-based familiarity with the function of L2 prosody were not examined in these previous studies of L2 users, but various theories on L2 speech production and perception point to the importance of these factors (Flege & Bohn, 2021; So & Best, 2014; Strange, 2011; van Leussen & Escudero, 2015). It is therefore reasonable to propose that the progression of prosody-based predictive processing of grammatical structure is not solely dependent on language proficiency, but also depends on learners’ L1 background and the mapping of L1 and L2 prosodic features and functions.
For L2 speech acquisition, it is usually assumed that speech is initially analysed through the lens of L1 categories (e.g., Flege & Bohn, 2021; So & Best, 2014). This implies that when there is a strong overlap between the phonological systems of the L1 and the L2, learners are able to process the language input through existing categories, and are likely to perceive and produce L2 speech relatively effortlessly and quickly (Flege & Bohn, 2021). For prosody, a learner needs to be familiar with not only the acoustic realisation of the prosodic features, but also their linguistic function (So & Best, 2014). Consider, for instance, Swedish L2 learners, who have been observed to easily and highly preconsciously associate a high falling tone, but not a low rising tone, with L2 grammatical information (Gosselke Berthelsen et al., 2022). Both falling tones and rising tones are part of the Swedish language, but only falling tones have a close association with grammatical suffixes, while rising tones are connected to focus marking (Bruce, 1977, 1983, 1987). With increasing proficiency, Swedish learners should be able to acquire both L2 tonal functions because they map perceptually onto different L1 categories (rise vs. fall), but for the rising tones, they will have to restructure their functional associations. This is supported by data showing that learners from L1 backgrounds where tones are used for intonation only can acquire the functional associations between tonal information in Swedish and grammatical suffixes by upper-intermediate proficiency (Hed et al., 2019). The opposite mapping pattern is found for Spanish L2 lexical stress onto English L1 categories (see Fernandez & Sagarra, 2025). 
While both languages have a functional category of lexical stress which can become predictive with suffixation (Fritz et al., 2025; Lozano-Argüelles et al., 2020; Sagarra & Casillas, 2018), the acoustic cues by which stress is realised across the two languages differ to a certain degree (Ortega-Llebaria et al., 2013). As a result, remapping is necessary, and English L1 learners acquire L2 Spanish stress and its predictive features after beginner proficiency (Lozano-Argüelles et al., 2020; Sagarra & Casillas, 2018). In the present study, we investigated a third type of mapping, where both the perceptual categories and the functional categories of the L2 prosody did not map easily onto the L2 learners’ L1: the mapping of L2 Danish stød onto L2 users’ German L1 categories. The exact mapping is explained in Section 2.4 below. Considering the need to develop both perceptual and functional categories that differ significantly from the L1, we hypothesised that evidence of successful acquisition of stød would not be visible before advanced proficiency levels.

2.3. Complex Prosody–Grammar Association in Danish: Stød

Stød is a cross-linguistically rare prosodic cue. Diachronically related to Swedish word accents (Goldshtein, 2021; Riad, 2003), stød has a very similar distribution and function, but a different realisation. Phonetically, stød is a biphasic prosodic phenomenon. The first phase is characterised by a high F0 contour while the second phase (stød proper) involves a form of laryngealisation (transcribed as ˀ) that is realised through an interplay of different factors, such as pitch, intensity, duration, and periodicity (Fischer-Jørgensen, 1989; G. F. Hansen, 2015; Peña, 2022; Siem, 2024). Stød only occurs when there is a long sonorant phase (i.e., at least a long vowel or a short vowel plus a sonorant consonant), and its exact realisation (i.e., the use and strength of different acoustic cues) can differ both inter- and intra-personally (Ejskjær, 1967, 1990; Fischer-Jørgensen, 1989; A. Hansen, 1943; G. F. Hansen, 2018; Quist, 2002; Ringgaard, 1960; Siem, 2024). As with Swedish word accents, stød is connected to grammatical endings (Basbøll, 2003): Stød can appear before the definite singular suffixes -en (1) and -et (see 7A for an example), but cannot precede the plural suffix -e (2). This systematic use of voice quality in morphological composition is rare cross-linguistically.
(1) hund-en [ˈhunˀ-n̩] dog-def ‘the dog’
(2) hund-e [ˈhun-ə] dog-pl ‘dogs’
(3) hund [ˈhunˀ] dog ‘dog’
(4) hun [ˈhun] she ‘she’
Yet, despite its cross-linguistic rarity, acoustic variability, and phonological restraints, L1 users of Danish show clear neural and behavioural patterns related to prediction when stød is predictive of grammatical suffixes (Clausen & Kristensen, 2015; Hjortdal et al., 2022). However, unlike in Swedish word accents (Roll, 2022), prediction is presumably only one of two important functions of stød in contemporary Danish. Danish has a large number of minimal pairs distinguished by stød alone. Besides unrelated minimal pairs such as (3) and (4), many Danish suffixes have become (segmentally) homonymous as a result of diachronic processes of phonetic reduction producing numerous morphologically related minimal pairs, such as (5) and (6).2 The consequential existence of large numbers of minimal pairs in Danish presumably increases the grammatical informativeness associated with stød and qualifies stød as a lexical discriminator or even a grammatical marker in its own right.
(5) (hun) tal-er [ˈtsæˀl-ɐ] (she) speak-prs ‘(she) speaks’
(6) (en) tal-er [ˈtsæl-ɐ] (a) speak-er ‘(a) speaker’
The reduction processes at the end of words that are characteristic of Danish are still active at present and produce large variation in the contemporary realisation of suffixes. Importantly, in standard Copenhagen Danish, the predominant realisation of the neuter definite singular suffix -et ([ɤ], as in 7A)3 contrasts with the colloquial [ə]-realisation in (7B),4 which makes the suffix segmentally indistinguishable from the standard realisation of the plural suffix -e ([ə]) in (8). This creates a minimal pair distinguished only by the prosody of the word stem, which strengthens the claim for a distinguishing, quasi-grammatical function of prosody in contemporary Danish, complementing the established predictive function.
(7A) hus-et [ˈhuːˀs-ɤ] house-def ‘the house’
(7B) hus-et [ˈhuːˀs-ə] house-def ‘the house’
(8) hus-e [ˈhuːs-ə] house-pl ‘houses’

2.4. German L1 Users Acquiring Danish Stød

To examine the L2 acquisition of Danish stød, we recruited an L2 group with German as their L1. Although stød is likely difficult for L2 listeners from any L1 background, a homogeneous group allows for analysis of the mapping of stød’s form and function onto the L1’s prosodic inventory. German was chosen as the L1 for several reasons: (1) German and Danish are closely related typologically, and Germans should therefore have comparatively little difficulty with vocabulary and grammar acquisition in Danish, giving them ideal preconditions. (2) A previous prosody acquisition study for Swedish used Germans as the L2 group (Gosselke Berthelsen et al., 2018). Employing an L2 group with the same L1 background eases between-study comparability. (3) German has no prosodic feature that corresponds directly to stød. This allowed us to test how learners with German as their L1 map cues that are both unfamiliar perceptually and functionally. Perceptually, the creaky voice and glottal stop components of stød map best onto pre-vocalic boundary markers and plosive substitutes in German: German employs creaky voice and glottal stops consistently before vowel-initial words and stem morphemes (Eisenhuth, 2015; Garellek, 2013; Kohler, 2009). Creaky voice also occurs in German when plosives assimilate to nasals (e.g., Stunden [ˈʃthʊnn̰n] ‘hours’; cf. Kohler, 2009). Based on information-structural descriptions of German prosody (Eisenhuth, 2015; Féry, 2010; Gibbon, 1998; Kohler, 2009), we could further hypothesise that German L1 learners would map the perceptually important pitch component of stød (Fischer-Jørgensen, 1989; Peña, 2022) onto an intonational signal of focus (H*L). Focus may also be the default category that the high, non-dropping intensity curve of non-stød could be mapped onto in the German L1, while the relatively lower pitch of non-stød could instead be interpreted as a lack of focus. 
In sum, there is large variability in the potential mapping of perceptual cues involved in stød production onto L1 German prosodic categories, and German L1 users may perceive stød as a focus signal or a boundary signal. In comparison, non-stød may be mapped onto both focus and non-focus categories. With respect to functional categories, the German language does not have a feature like stød where a word stem can select (and pre-activate) possible word continuations (Roll, 2022). Unlike in Swedish word accents, the large number of minimal pairs in Danish turns this selection and prediction function of stød discriminative in many cases.
Nonetheless, German L1 users would be vaguely familiar with the general concept of other predictively useful cues. Much like Danish, German has different cases of predictive associations in the derivational and inflectional system, particularly related to lexical stress and umlaut: some derivational suffixes attract stress (e.g., nominaliser -ei [cf. Danish -i]), and some derivational and inflectional suffixes can cause umlaut in the word stem (e.g., nominaliser -er as in sing-en sing-inf ‘to sing’—Säng-er ‘sing-er’ [cf. Danish -er: syng-e sing-inf ‘to sing’—sang-er ‘sing-er’] or plural -e in Kuh [ˈkhuː] ‘cow’, Küh-e [ˈkhʏː-ə] ‘cow-s’ [cf. Danish ko [ˈkhoːˀ] ‘cow’, kø-er [ˈkhøːˀ-ɐ] ‘cow-s’]). German L1 users are, in principle, familiar with the idea that information can be used predictively, but they have no comparable functional feature to Danish stød, which acts on top of and in interaction with, but separately from, lexical stress and umlaut. We therefore argued that German L1 users would be essentially unfamiliar with both the perceptual and functional categories of Danish stød, which would negatively affect the feature’s acquirability. Consequently, we expected not to find evidence of rapid prosody-based word form selection and pre-activation below advanced proficiency levels in Danish L2 users with German as their L1.

3. The Present Study

Using prosody to decode words’ morphological structure makes speech processing faster and more efficient, but does not typically affect its overall success. Speech comprehension can proceed—albeit more slowly—even if users cannot make use of the prosodic cues (Roll, 2022; Roll et al., 2010). This makes prosody-based word form selection and pre-activation a less crucial skill for L2 users to acquire. When the prosodic cue is, moreover, highly complex and differs crucially from an L2 user’s L1 prosody inventory in terms of both its perceptual and functional categories, we expect that this skill will be decoded and internalised late in the acquisition process. To investigate when L2 users start to show sensitivity to Danish stød, we constructed a mismatch study where the validity of the prosodic cue was manipulated. Besides the German L1 Danish L2 group, we recruited a group of Danish L1 users as a baseline.
Our main research question was how these two groups would differ in their prosody processing for Danish nouns. Including the L2 users’ Danish proficiency as a variable in our study, we further aimed to explore at what proficiency stage sensitivity to stød as a functional cue would emerge in our L2 users.
Using response times as the most important measure of prediction-facilitated processing, we hypothesised that we would find a general difference between L1 and L2 users of Danish, such that L1 users would use prosody–grammar associations to predict upcoming suffixes for target words and, therefore, would show sensitivity to prosody–suffix matches, while L2 users on the whole would not. For L2 users, we expected a considerably later onset than in previous studies, which tested only partially mismapped prosodic cues, for instance in Spanish or Swedish (Sagarra & Casillas, 2018; Schremm et al., 2016).
Additionally, we expected a general difference in grammatical decision accuracy, such that L1 users would be more proficient in their grammatical decisions than L2 users. Here, an effect of L2 proficiency should be observable. We also hypothesised that the L1 users would use stød patterns to distinguish singular from plural in neuter singular word forms, where suffix realisations can converge. We did not expect this effect in the L2 users, since we anticipated that they would remain relatively agnostic to stød patterns until the end of the acquisition process.

4. Materials and Methods

We constructed an online response time study with Gorilla Experiment Builder (Anwyl-Irvine et al., 2020).

4.1. Participants

A total of 106 people (60 German L1; 46 Danish L1) participated in the study. Six were recruited via Prolific and paid via the platform. The remaining 100 were recruited via posters and social media and participated from home (n = 99) or in the lab (n = 1). They entered a draw to win chocolates. A total of 27 entries (21 German L1; 6 Danish L1) were excluded before analysis because of incomplete data (n = 22), technical problems (n = 1), noncompliance with the task (n = 1), or nonconformity with age requirements (n = 3). Consequently, data from 39 German L1 users (mean age 29 years, SD = 6; 27 female, 11 male, 1 non-binary) and 40 Danish L1 users (mean age 26 years, SD = 4; 23 female, 15 male, 2 non-binary) were included in the analysis. The inclusion criteria were normal hearing, normal or corrected-to-normal vision, and no history of brain trauma or language disorders.
The Danish L1 participants were recruited without limitations regarding their dialect background. None of them had grown up in Southern regions with purely tonal realisations of stød, but two had grown up directly on the border of such regions. Most of the other participants were from Zealand (n = 27). The members of the German L1 group were assigned to sub-groups pertaining to their Danish listening skills according to self-ratings based on the description of listening skills in the Common European Framework of Reference (CEFR: Council of Europe, 2020). There were ten beginner listeners (≤level A2), ten lower-intermediate listeners (level B1), nine upper-intermediate listeners (level B2), and ten advanced listeners (≥level C1). On average, they had started learning Danish at 25 years (SD = 7), with no significant difference between the proficiency groups; F(3,36) = 0.39, p = .762. A detailed overview of the participants’ experience with the Danish language can be found in the Supplementary Materials (Table S1).

4.2. Stimuli

We recorded auditory stimuli consisting of target words embedded in the carrier sentence Det var vist [target], Brit sagde. ‘It was probably [target], Brit said.’ By means of this carrier sentence, the target word was both preceded and followed by a plosive (to facilitate segmentation). The target words followed a specific stød-alternating pattern: stød on the word stem before the definite singular suffixes -en or -et, and non-stød on the word stem before the plural suffix -e; see Table 1.
Five target words were of common gender. For these words, stød is purely predictive, as there is no convergence of definite singular and plural suffixes.5 Six target words were of neuter gender, where the reduction of the definite singular creates potentially ambiguous cases, possibly leading to a distinguishing function for the prosodic cues. For the full list, see Table A1. In the experiment, the different genders were presented in separate blocks whose order was counterbalanced between participants. Unequal numbers were a consequence of a low number of relatively frequent and spliceable words which would be recognisable for L2 users. All the target word stems had a long sonorant rhyme (long vowel, diphthong, or short vowel + sonorant) where stød could be realised. All canonical singular and plural stimuli were produced in one recording session in a sound-attenuated studio by a male adult speaker of standard Copenhagen Danish at a normal speaking rate. Using a single, phonetically trained speaker resulted in stronger and more consistent realisations of stød than learners would typically encounter. This ensured that the prosodic cues would be easily identified if our L2 participants had previously acquired them.
The presence of stød on the stød-bearing word stem (compared to non-stød stems) was attested by several parameters related to stød averaged across the stød-bearing syllable: lower intensity (MDEF = 61.51 dB, MPL = 66.81 dB, t(22) = 3.70, p = .001), higher pitch (MDEF = 132.50 Hz, MPL = 112.06 Hz, t(22) = −3.00, p = .007), more shimmer (measure of amplitude irregularity) (MDEF = 17.26%, MPL = 10.28%, t(22) = −2.58, p = .017), more jitter (measure of frequency irregularity) (MDEF = 4.86%, MPL = 2.26%, t(22) = −4.39, p < .001), and a lower harmonics-to-noise ratio (HNR) (MDEF = 5.44 dB, MPL = 11.77 dB, t(22) = 9.01, p < .001). Although these measures are not necessarily ideal for capturing all features related to the realisation of stød, they are frequently used and were sufficient for showing the perceivable differences between stød- and non-stød stems in our study (Fischer-Jørgensen, 1989; G. F. Hansen, 2015; Hjortdal et al., 2022; Siem, 2024). For an illustration of the larger and earlier intensity drop for word stems with stød and the higher pitch on stød-bearing syllables, see Figure 1.
From the canonical singular and plural target words that were recorded naturally, stem–suffix mismatches were created in Praat (Boersma, 2001) via cutting and splicing: Word stems ending in obstruents were cut before the obstruent (e.g., bol+de [ˈpɒl+tə] ball-pl or ski+bet [ˈskiːˀ+pɤ] ship-def). Word stems ending in sonorants were cut after the sonorant (e.g., navn+et [ˈnaʊnˀ+ɤ] name-def or skjold+e [ˈskjɒl+ə] shield-pl). Subsequently, beginnings from definite singular words were spliced with endings from plural words of the same word pair, and word beginnings from plural words were spliced with endings from definite singular words. In the rare case that splicing resulted in audible transitions visible in the waveform and spectrogram, a small portion (<10 ms) of the concatenated stimuli was cut.
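The cross-splicing logic described above can be sketched in Python. This is only an illustration under stated assumptions: the splicing in the study was done in Praat, and the function name `cross_splice`, the representation of a recording as a list of samples, and the cut indices are all hypothetical.

```python
from typing import List, Sequence, Tuple


def cross_splice(
    def_word: Sequence[float],
    pl_word: Sequence[float],
    def_cut: int,
    pl_cut: int,
) -> Tuple[List[float], List[float]]:
    """Swap the endings of a definite singular and a plural recording.

    def_cut and pl_cut are hypothetical sample indices marking where the
    stem portion ends in each recording.
    Returns (stød stem + plural ending, non-stød stem + definite ending).
    """
    if not 0 < def_cut < len(def_word) or not 0 < pl_cut < len(pl_word):
        raise ValueError("cut points must fall inside the recordings")
    stod_stem, def_ending = list(def_word[:def_cut]), list(def_word[def_cut:])
    plain_stem, pl_ending = list(pl_word[:pl_cut]), list(pl_word[pl_cut:])
    return stod_stem + pl_ending, plain_stem + def_ending


# Toy "recordings": each number labels which portion a sample came from.
definite = [1.0, 1.0, 1.0, 2.0, 2.0]  # stød stem (1.0) + definite ending (2.0)
plural = [3.0, 3.0, 4.0, 4.0, 4.0]    # non-stød stem (3.0) + plural ending (4.0)
mismatch_a, mismatch_b = cross_splice(definite, plural, def_cut=3, pl_cut=2)
```

Both outputs are mismatch stimuli: the first pairs a stød stem with a plural ending, the second a non-stød stem with a definite singular ending.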

4.3. Procedure

During the experiment, all participants listened to carrier sentences with matched and mismatched target words. Their task was to judge the number category of the target word as quickly as possible using the F and J keys. Response options (én ‘one’ for singular, flere ‘several’ for plural) were shown on the screen, and response key association (F = ‘one’ or F = ‘several’) was randomised between participants. Trials ended when a response was given. If no response was given at 2 s after word offset, a time-out screen (1500 ms) urged participants to respond faster. A 500 ms fixation cross cued the next trial; see Figure 2. Participants could escape to a break screen by pressing the P key. Response times (RTs), the number of timed-out responses, and response accuracy (i.e., ‘singular’ or ‘plural’ responses) were recorded. Common-gender words (120 trials per participant: 5 target words in 4 conditions, repeated 6 times) and neuter-gender words (120 trials per participant: 6 target words in 4 conditions, repeated 5 times) were repeated in separate blocks with counterbalanced order across participants. Familiarisation with the procedure was achieved in a short practice phase with eight trials with test words with non-alternating prosody (non-stød in both singular and plural). Explicit feedback was given in the practice phase only. Familiarisation and testing had a mean duration of 8.6 min (range = 6.5–11.0), excluding breaks over 1 min, of which there were a total of 6 across participants.
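The trial counts per block follow from crossing target words with the four stem–suffix conditions and the number of repetitions. The Python sketch below reproduces that arithmetic; the function name `build_block` and the placeholder word labels are hypothetical, and the sketch makes no claim about the trial order used in the experiment.

```python
from itertools import product


def build_block(words, repetitions):
    """Cross each word with 2 stem prosodies x 2 suffixes, then repeat."""
    stems = ("stød", "non-stød")  # prosody of the word stem
    suffixes = ("definite singular", "plural")
    trials = []
    for word, stem, suffix in product(words, stems, suffixes):
        # Stød pairs with the definite singular, non-stød with the plural.
        match = (stem == "stød") == (suffix == "definite singular")
        trials.extend(
            {"word": word, "stem": stem, "suffix": suffix, "match": match}
            for _ in range(repetitions)
        )
    return trials


# Placeholder labels stand in for the actual target words.
common_block = build_block([f"common_{i}" for i in range(1, 6)], repetitions=6)
neuter_block = build_block([f"neuter_{i}" for i in range(1, 7)], repetitions=5)
print(len(common_block), len(neuter_block))
```

With five common-gender words repeated six times and six neuter-gender words repeated five times, both blocks contain 120 trials, half of them matches.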

4.4. Statistics

In 1.3% of all trials, the response was outside of the 2000 ms time window after target onset, and we did not obtain any data. The remaining responses were pre-processed to reduce the impact of outliers. We categorically excluded all responses that occurred earlier than 200 ms after rhyme onset (i.e., the earliest possible onset of the laryngealisation of stød proper6), as they likely represented fast guesses (n = 153; 0.8%). The prosody (stød/non-stød) is unlikely to be encoded and responded to at this point (Whelan, 2008). RTs were further subjected to logarithmic transformation to reduce the influence of long RT outliers and to normalise the data distribution before statistical analysis (Whelan, 2008). We did not exclude incorrect responses from the RT data, since responses in the mismatch conditions (particularly the ambiguous neuter plural mismatch) could not confidently be considered incorrect. Participants could simply have decided to use the prosodic cue rather than the suffix for their grammatical decisions. Therefore, we instead included Accuracy (based on the grammatical properties of the suffix) as a variable in the RT models.
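The two pre-processing steps (exclusion of very fast responses, then log transformation) can be illustrated with a minimal Python sketch. It assumes that response times are expressed in milliseconds relative to rhyme onset, that timed-out trials are stored as `None`, and that the natural logarithm is used; the text specifies none of these details, and the function name `preprocess` is hypothetical.

```python
import math

MIN_RT_MS = 200.0  # responses earlier than this after rhyme onset are dropped


def preprocess(rts_ms):
    """Drop missing and too-fast responses, then log-transform the rest.

    rts_ms: response times in ms relative to rhyme onset (assumed);
    None marks a timed-out trial without data.
    """
    kept = [rt for rt in rts_ms if rt is not None and rt >= MIN_RT_MS]
    return [math.log(rt) for rt in kept]  # natural log is an assumption


log_rts = preprocess([150.0, None, 200.0, 1000.0])
```

Incorrect responses are deliberately not filtered here, in line with the decision to model Accuracy as a predictor rather than to exclude trials.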
The pre-processed RT data were analysed with linear mixed-effects models in R (version 4.1.2; R Core Team, 2021) via RStudio (version 2021.9.2.382; R Studio Team, 2022) using the lme4 package (version 1.1.29; Bates et al., 2015). We fitted a global model including the binary, deviation-coded⁷ fixed-effect variables Match (match vs. mismatch), Number (singular vs. plural), Gender (common vs. neuter), and Accuracy (correct vs. incorrect responses), as well as the continuous variable Trial Number, rescaled to a scale of 0 to 1, and the 5-level variable Proficiency (L1, advanced, upper-intermediate, lower-intermediate, beginner), with L1 as reference level. We further included all possible interactions with the factors Match and Proficiency to capture their potential influence on the data: interactions between Match and Proficiency; Match and Number; Match and Gender; Match and Accuracy; Match, Proficiency, and Number; Match, Proficiency, and Gender; Match, Proficiency, and Accuracy; and Match, Proficiency, Gender, and Number; as well as Proficiency and Gender; Proficiency and Number; and Proficiency and Accuracy. Participant and Item were included as random intercepts. For suffix-based response accuracy, we fitted generalised linear mixed-effects models including the same factors and interactions as in the RT models, except for the factor Accuracy. The MuMIn package (version 1.46.0; Burnham & Anderson, 2002) was used to compare and rank the different models according to Δ values for the corrected Akaike Information Criterion (AICc) and to compute Akaike weights (w). In line with Burnham and Anderson (2002), all models with Δ values < 7 were considered candidate models that might best explain the data.
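The model-ranking logic can be illustrated with a short Python sketch of the standard AICc, Δ, and Akaike-weight formulas. It is not the MuMIn implementation and not the authors’ code; the function names and the example AICc values are hypothetical, and only the Δ < 7 threshold is taken from the text.

```python
import math


def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k estimated parameters and n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Return (deltas, weights): AICc differences to the best model and Akaike weights."""
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    relative_likelihoods = [math.exp(-0.5 * delta) for delta in deltas]
    total = sum(relative_likelihoods)
    return deltas, [r / total for r in relative_likelihoods]


def candidate_set(aicc_values, threshold=7.0):
    """Indices of models whose delta falls below the threshold (7 in the text)."""
    deltas, _ = akaike_weights(aicc_values)
    return [i for i, delta in enumerate(deltas) if delta < threshold]


# Made-up AICc values for three hypothetical models.
values = [100.0, 108.0, 120.0]
deltas, weights = akaike_weights(values)
print(deltas, [round(w, 3) for w in weights], candidate_set(values))
```

An Akaike weight is interpreted relative to the set of models being compared: it approximates the probability that a given model is the best of that set, not the probability that it is correct in an absolute sense.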

5. Results

Section 5.1 presents the results for response times, which are most relevant for the hypotheses of this study. Section 5.2 presents the data for response accuracy (based on the suffixes’ grammatical properties). Section 5.3 summarises the results for both data types.

5.1. Response Times

One model passed the pre-defined Δ threshold, with a Δ value of 0 and an Akaike weight of 0.981, meaning that, among the candidate models, the probability of this being the best model to explain the data was 98.1%. The model showed that RTs were influenced by a range of factors; see Table 2. Log-transformed RTs (approximate millisecond equivalents in brackets) were longer for mismatch (M = 2.973 [~940 ms] ± SD = 0.136) than for match (M = 2.949 [~889 ms] ± 0.140), for neuter (M = 2.969 [~932 ms] ± 0.147) than for common gender (M = 2.953 [~898 ms] ± 0.129), and for plural (M = 2.968 [~929 ms] ± 0.137) than for singular (M = 2.954 [~900 ms] ± 0.139), and were longer at the beginning of the experiment session; r(18693) = −.05, p < .001. There was no overall difference between L1 and L2 users’ RTs, but lower-intermediate users responded more slowly (M = 3.003 [~1007 ms] ± 0.133) than L1 users (M = 2.953 [~897 ms] ± 0.129). This may be related to an increasing awareness of grammar and relatively little experience in the lower-intermediate group resulting in more conscious, slower grammatical decisions overall.
An interaction between Proficiency and Match revealed that the effect of Match was driven by the L1 group; see Figure 3A. Unlike any of the L2 groups, the L1 users responded faster to matched words than to mismatched words. This suggests that only the L1 users used the prosodic cues to predict suffixes, which reduced their RTs when the prosodic cue and suffix matched. An interaction between Proficiency and Gender, on the other hand, revealed that the effect of Gender was driven by the L2 groups; see Figure 3B. That is, there was a significant difference between the L1 user group and the three most advanced L2 user groups, such that only the latter had longer RTs for neuter gender than for common gender. There was no significant difference between the L1 user group and beginner L2 users. The beginner learners’ low proficiency likely made all decisions equally difficult, while the higher-proficiency L2 users had acquired a good command of common-gender but not neuter-gender suffixes. The automatically selected best-fit model for RT had a residual unexplained variation of 0.014.
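The bracketed millisecond values can be related to the log-scale means with a small Python sketch. It assumes a base-10 logarithm, which is an inference from the reported number pairs rather than something the text states explicitly, and the function name is hypothetical.

```python
def log10_to_ms(log_rt):
    """Back-transform a log10 response time to milliseconds (assumed base 10)."""
    return 10.0 ** log_rt


# Log-scale means reported in the text for three of the conditions and groups.
reported = {"mismatch": 2.973, "match": 2.949, "lower-intermediate": 3.003}
for label, log_mean in reported.items():
    print(label, round(log10_to_ms(log_mean)))
```

Because the mean is taken on the log scale, the back-transformed value is a geometric mean of the RTs, which is typically somewhat lower than the arithmetic mean of the raw millisecond values.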

5.2. Response Accuracy

For response accuracy (with respect to the grammatical feature of the suffix), only a single global model passed the Δ threshold. This model (w = 1.00) featured the main effects and interactions of all contrasts that were included in the maximal model. The main effects and the most complex interaction are shown in Table 3.
Suffix-based Accuracy was higher for match (M = 86.4% ± SD = 34.3%) than for mismatch (M = 77.8 ± 41.6%), for common gender (M = 88.2 ± 32.3%) than for neuter gender (M = 75.9 ± 42.8%), and for singular (M = 83.7 ± 36.9%) than for plural (M = 80.5 ± 39.7%), and was higher for later trials; r(18693) = .03, p < .001. There was also a significant effect of Proficiency, with the L1 group being significantly more accurate (M = 89.0 ± 31.3%) than the three lowest-proficiency L2 groups (Mbeg = 63.4 ± 48.2%; Mlow = 79.3 ± 40.6%; Mupp = 72.4 ± 36.4%). The full model, including all the main effects and interactions, is specified in Table S2 in the Supplementary Materials.
The most complex interaction is illustrated in Figure 4. There is a significant effect of Gender, Number, and Match in the L1 group, but not in the L2 groups. Notably, the L1 users recognised matched neuter singulars (stød + neuter singular suffix [ɤ]) more accurately (M = 97.7 ± 14.9%) than any other match (or mismatch) condition, as shown in Figure 4, likely because they perceived the prosodic cue (stød) as a strong additional grammatical marker in neuter words due to suffix reduction processes; see Section 2.3. The relative strength of stød as a prosodic neuter singular cue can also explain the finding that mismatched neuter plurals (stød + neuter plural suffix [ə]) were identified as plurals significantly less often than words in any other condition in the L1 group. In fact, this is the only condition in which the L1 group’s percentage of suffix-based decisions is near chance (M = 59.7 ± 49.1%) and response variation is particularly high; see Figure 4. All other conditions, including mismatches, are close to ceiling (M = 84.9–97.7%), with comparatively little variation in response behaviour (SD = 14.9–35.9%). There is no corresponding effect for neuter words with non-stød on the word stem (i.e., plural match and singular mismatch).
This observation inspired us to look at the L1 group’s response behaviour for neuter words with stød on the word stem at an individual participant level. An exploratory examination of the plural mismatch condition revealed strong interindividual differences: some L1 users consistently classified plural mismatches (stød + [ə]) as plurals (for up to 96.7% of all cases), others as singulars (for up to 96.7% of all cases), and yet others were undecided (~50%), as illustrated in Figure 5. We believe that this response behaviour can be explained by the dialect- and register-based realisation of the -et suffix discussed in Section 2.3, where reduction leads to convergence of singular and plural suffixes. In these cases, some L1 listeners rely on the prosodic cue when making grammatical decisions for the suffix.

5.3. Results Summary

Overall, the RT results show that only L1 users, and not L2 users—regardless of proficiency—were affected by prosody–suffix mismatches. Specifically, Danish L1 users responded significantly faster to matched words than to mismatched words. This suggests that they could use the prosodic cues, stød and non-stød, predictively to make grammatical decisions. German L1 Danish L2 users, on the other hand, could not use the prosody predictively at any proficiency level.
The results for response accuracy illustrate that L1 users could assess the words’ grammatical features more proficiently than lower-proficiency L2 users, but advanced L2 users were indistinguishable from L1 users. A maximal interaction of Proficiency, Gender, Number, and Match for response accuracy further suggests that while L1 users used stød (but not non-stød) grammatically to distinguish word forms in the neuter gender, L2 users did not, regardless of proficiency.
L2 users’ RTs were significantly affected by noun gender. They were considerably slower at identifying the grammatical properties of neuter-gender words. This is likely related to difficulties in distinguishing the definite singular suffix (-et [ɤ]) from the plural suffix (-e [ə]), since [ɤ] does not exist in German, and both L2 sounds might map onto the same German L1 phoneme.

6. Discussion

Due to the expected clear differences between participant groups, we will discuss their processing behaviour separately. In Section 6.1, we illustrate how L1 users of Danish use prosodic cues to predict upcoming suffixes. In Section 6.2, we discuss how L2 users differ from L1 users in their use of the same prosodic cues. The response time differences for definite singular and indefinite plural nouns observed for all participants are discussed in Section 6.3. Finally, Section 6.4 points to the relevance of the present results for languages with different types of prosody–grammar associations.

6.1. Processing Prosody–Grammar Associations in L1 Danish

We hypothesised that L1 users would use stød and non-stød predictively—akin to previous studies (Clausen & Kristensen, 2015; Hjortdal et al., 2022).

6.1.1. The Predictive Function of Danish Prosody–Grammar Associations

As hypothesised, the results corroborate the idea that appropriate prosodic cues facilitate L1 users’ suffix processing. When L1 users are exposed to correctly matched prosody and suffixes, they respond considerably faster than even highly proficient L2 users, and they respond considerably faster than when there is a prosody–suffix mismatch. This type of facilitative predictive behaviour has previously been observed in L1 users for the same type of associations in Danish, Swedish, and Spanish (Clausen & Kristensen, 2015; Hjortdal et al., 2022; Roll, 2015; Sagarra & Casillas, 2018; Söderström et al., 2012), but is likely much more common than this relatively small selection of studied languages suggests. Essentially, the only prerequisite is a reasonably strong association between an acoustic cue and upcoming linguistic information. In the context of languages with inflectional suffixes, such associations can be particularly strong due to frequent iteration and could, therefore, lead to particularly firm predictive relationships; see Section 6.4.
Slower response times for mismatches could simply be a result of error processing, and do not necessarily imply that prediction has taken place beforehand. This is where the use of neurophysiological measurements is advisable. ERP studies have previously not only shown responses in the brain that are associated with error processing after failed predictions (N400, P600), but also responses that occur during prediction and are correlated with the number of possible endings and with lexical certainty (Hjortdal et al., 2024; Söderström et al., 2016). For L1 users of Danish, one ERP study (Hjortdal et al., 2022) came to the same conclusion as this study: Danish L1 listeners use stød and non-stød to predict grammatical structure and upcoming suffixes.

6.1.2. The Distinctive Function of Danish Prosody–Grammar Associations

Besides the predictive function of stød and non-stød, we also anticipated that L1 users would use them to disambiguate word forms in cases where variable, informal pronunciation would make them segmentally identical. In our stimuli, this was the case for definite neuter singulars and indefinite neuter plurals, and our data corroborate our hypothesis. An analysis of L1 users’ response patterns (i.e., of response ‘accuracy’) revealed a high degree of variation in whether L1 users classified words with singular prosody and the plural suffix -e [ə] as singular or plural. Some participants interpreted them as valid singular allophones, while others categorically judged them as invalidly cued plurals. This suggests that the reduced schwa realisation of the singular suffix is not yet fully accepted by all Danish users in formal settings. It is possible, of course, that the experimental settings and paradigm could have affected the results. The relatively formal setting of the experiment could discourage listeners from accepting a relatively more informal word form. At the same time, the presence of the non-schwa-realisation of the definite singular suffix and of other mismatch conditions could have alerted listeners to the fact that the experimental paradigm was designed for the mismatched neuter plural to be considered a mismatch. Finally, a singular interpretation of mismatched plurals would have skewed the ratio of singular and plural words, which some participants may have been sensitive to. In consequence, it is likely that L1 users would have been even more willing to accept a schwa realisation of the definite singular suffix in less formal, more balanced settings. Nonetheless, we argue that L1 users use the prosodic make-up of words to distinguish otherwise identical word forms.
Interestingly, the quasi-grammatical treatment of the stød in ambiguous neuter word forms ending in schwa had a carryover effect to processing of the stød in neuter words overall. As such, we found that L1 users were more consistent in judging the grammatical properties in matched singular neuters than any other word type in the study. That is to say, when both the prosodic cue and the suffix unambiguously pointed to singular, L1 users were highly consistent in their grammatical judgement, more so than in the comparable matched singular condition in common-gender nouns. This indicates that the reduction of [ɤ] to [ə] affects the status of stød in neuter nouns, turning it into a singular marker. Importantly, there is no apparent tendency for L1 users to simultaneously interpret non-stød as a plural marker. This is likely due to the fact that non-stød is less restrictive than stød and can have more possible endings (Hjortdal et al., 2022).
In summary, stød and non-stød are used predictively by L1 users of Danish. In addition, the high degree of variation in neuter suffix pronunciation could potentially turn stød into a singular marker in Danish neuter words. This process seemed to be completed for some L1 users in our study and ongoing for others.

6.2. Processing of Prosody–Grammar Associations in L2 Danish

For the L2 users, in contrast, we anticipated that the poor perceptual and functional mapping of the L2 Danish stød onto German L1 prosodic categories would make it considerably more difficult to acquire than the previously examined prosodic contrasts of Spanish and Swedish, where only one dimension (perception or function) was mapped poorly (Sagarra & Casillas, 2018; Schremm et al., 2016). Therefore, we expected to observe predictive processing only at the highest proficiency level in our study.

6.2.1. The Predictive Function of Danish Prosody–Grammar Associations in L2 Users

Analysing response times for evidence of predictive processing in the L2 user group, we found that they were equally slow in their grammar assessments after valid and invalid prosodic cues. In other words, we saw no facilitatory effect of prosody and, consequently, no indication of prosody-based grammar prediction. Contrary to our prediction, we did not find an emerging prosody-based facilitation effect with increasing proficiency. Our data thus provided no evidence of predictive behaviour for Danish stød in the L2 group, not even at advanced L2 proficiency. This differs markedly from the previous results for Swedish and Spanish, where L2 users showed predictive use of prosody–morphology associations at upper-intermediate proficiency levels and above (Sagarra & Casillas, 2018; Schremm et al., 2016).
Although this finding was somewhat unexpected in its magnitude, it only strengthens our assumption that the poor L1-L2 mapping of both perceptual and functional stød properties inhibits the acquisition of predictive associations. As discussed, stød can be expressed through a multitude of factors which map differently onto the prosodic categories in German and presumably various other languages. In the context of stød and non-stød, the perceptual mapping might even go so far that some parts of the cues’ acoustic profiles map onto opposing categories. Functionally, too, stød is a typologically rare feature, although it is closely related to the mainland Scandinavian word accents. For L2 learners from L1 backgrounds without word accents, the functional mapping would presumably be similarly poor, as was the case for the German L1 users.
In accordance with So and Best (2014), who argue that poorly mapped prosodic features are particularly difficult to perceive and produce, our study shows that this is the case for Danish stød. It illustrates that poor L1-L2 mapping of perceptual and functional prosody categories not only impedes speech perception and categorisation, but also affects higher levels of language processing, like the possibility of forming associations between prosodic features and grammatical regularities and using them predictively. The complexity, variability, and typological rarity of Danish stød make the prosodic categories unlikely to be perceived easily through the lens of other languages, and in our study, they did not allow for the formation of predictive cues among Danish L2 users. We expect, however, that L2 users from languages with a perceptually or functionally similar cue (e.g., L1 users of Swedish or Norwegian) would be able to acquire the Danish stød considerably more easily and resemble English speakers in their acquisition of Spanish stress or non-tonal users in their acquisition of Swedish word accents (Sagarra & Casillas, 2018; Hed et al., 2019).
As expected, the results of this study contrast starkly with the acquisition of tonal predictive cues in Swedish and that of the stress-related cues in Spanish (Sagarra & Casillas, 2018; Schremm et al., 2016) where prediction-related patterns were observed from intermediate levels. Our results for the complex cue in Danish show that we cannot rely on results for prosodic cues in one language to generalise across languages and cues when it comes to prosody processing, even when the cues carry out the same function. Instead, we must consider the perceptual and functional characteristics of the specific prosodic feature (So & Best, 2014) and assess their mapping onto prosodic categories in the learners’ L1 when we make predictions about L2 speech processing.

6.2.2. The Distinctive Function of Danish Prosody–Grammar Associations in L2 Users

As with the predictive function, we found no evidence of the different L2 groups using the prosodic features to distinguish word forms. This would have been visible in a higher response accuracy for one or more Match conditions where the prosodic cue and the suffix cue both indicate the same word category, and in a lower response accuracy for mismatched words where the prosodic cue selects one grammatical category and the suffix cue the other. This was not observed in the L2 group. Similarly, only the L1 group, and not the L2 group, showed an interaction of Match with Gender and Number, in which the stød (but not the non-stød) served as a grammatical discriminator in neuter-gender words.
Instead, we found that the German L1 Danish L2 participants had overall slower response times for neuter-gender words than for common-gender words. This suggests that the distinction of neuter word forms was difficult for our L2 users. While this could be related to the converging pronunciations of suffixes discussed for L1 users above, we believe the difficulty arises from a poor L1-L2 mapping for the neuter suffixes in our study. The sound that constitutes the current standard pronunciation of the neuter definite singular suffix and the pronunciation present in our stimuli (i.e., [ɤ]) does not exist in the German L1 phoneme inventory. In fact, Basbøll and Wagner (1985, p. 81) declare that there is not even a remotely comparable sound in German, with the lateral /l/ coming closest. It is thus likely that the German L1 users would map the reduced central vowel realisation of the syllabic definite singular suffix onto the same L1 category as the indefinite plural suffix [ə], namely the only reduced central vowel in German: schwa (/ə/). This would only be strengthened by the variability in realisations of the definite singular suffix⁸, and would explain the longer response times and relatively low accuracy among the L2 users.

6.3. Response Times for Definite Singular and Indefinite Plural Nouns

Our study replicated a response pattern that is well-known from previous studies using the same paradigm in Danish and Swedish (Clausen & Kristensen, 2015; Hjortdal et al., 2022; Roll, 2015): indefinite plural word forms were processed and responded to significantly more slowly than definite singulars. Previous studies have discussed different possible reasons for the slower processing of plural forms in L1 users (Clausen & Kristensen, 2015). They have suggested that longer response times are based either on differently strong associations between the prosody and the grammatical suffix leading to differently strong surprisal effects, or on a different number of activated word forms for word stem and prosody with more activated forms leading to more effortful processing and, therefore, longer response times. Either of these suggestions could explain the response behaviour in our L1 group. For the L2 group, however, both are difficult to sustain. It is unlikely that the longer response times for plurals were due to differently strong associations between prosody and suffixes in this group, as the lack of a match effect implicitly suggests that the L2 users do not associate suffixes with preceding prosodic features. The same holds true in analogy for activated word forms, as there is no reason to believe that L2 users should activate fewer word forms for stød than for non-stød, especially when our data suggest that they were relatively unaware of the prosodic cue overall. Rather, the L2 users’ response time difference for singular and plural could be explained by previous exposure to overarticulation or regional variation. As discussed in the introduction, the neuter definite singular suffix -et can be pronounced very distinctly as [ət]. In analogy, the common-gender definite singular suffix -en can be pronounced [ən].
While these overly explicit forms do not exist in the Copenhagen standard, both may be frequent in learner-directed classroom language, and [ət] is a common realisation of -et in South and East Jutlandic. When learners hear the plural suffix -e [ə], it is possible that they perceive this as a fragmentary definite singular suffix and, therefore, wait marginally longer for a possible realisation of the definite suffix’s consonant before they make their singular/plural decision. The same strategy could, in theory, also be used by the L1 user group, especially those familiar with South and East Jutlandic speech. However, this would not explain the overlap with results for Swedish. In Swedish, the plural suffix -ar [ar] resulted in longer response times in the same mismatch paradigm as in the present study (Roll et al., 2015; Söderström et al., 2012) and cannot be perceived as a fragment of the definite singular -en [en]. An explanation encompassing all three groups (Swedish and Danish L1 users and Danish L2 users) could be that plurality is inherently more demanding to process than definiteness. It is possible that plural suffixes add important semantic information to a word, specifically that there is more than one object, while definite suffixes only have a referential function that could be less processing-heavy. To our knowledge, however, no research outside of the studies mentioned above has thus far investigated processing differences between definiteness and plurality, and the proposed explanation is, therefore, tentative and would need to be further substantiated.

6.4. The Relevance for Prosody–Grammar Associations in Other Languages

In theory, suffixes can be cued by many different auditory features. In the present study, the cues were related to the acoustically complex Danish stød. In the neighbouring languages Swedish and Norwegian, as well as in some Danish dialects (Ejskjær, 1990) and other pitch accent languages, similar relationships would be expressed through pitch patterns (Langston, 1997; Riad, 2014; Rischel, 2008). Similarly, many African tone languages have tone changes before grammatical suffixes, which makes the tone predictive (Banti, 1988; Crysmann, 2015; Oomen, 1981). Stress can also function as a strong predictor, particularly in languages where stress is assigned (more or less) regularly with respect to the right word boundary. This is the case, for instance, in Spanish, Polish, or Armenian (Dolatian, 2019; Gussmann, 2007; Ortega-Llebaria, 2006). But even more subtle prosodic cues, like vowel duration in English, for instance, can be used predictively to indicate suffixes (Rehrig, 2017). Such subtle cues are presumably present in many languages as segment duration is affected by syllable structure, which, in turn, can be affected by suffixation (Lehiste, 1972). Finally, segmental changes can also function as predictive cues. The vowel changes related to umlaut in German, for instance, can be a cue for plural and diminutive suffixes in nouns, inflections in verbs, comparatives or superlatives in adjectives, and a whole range of derivational suffixes (Féry, 1994). Similar patterns of vowel changes exist in languages like Icelandic, Portuguese, or Greek (Galani, 2005; Thráinsson, 2017; Wetzels, 1995). Consonant changes are also frequent before suffixation, for instance due to final devoicing or assimilation, as in German, Dutch, or Russian (Wetzels & Mascaró, 2001), or even word-initially, as in mutations across Celtic languages (Hannahs, 2011; Stenson, 2019). 
While the strongest predictive relationships are likely to involve inflection and derivation, predictively useful associations also exist between words, as is the case for agreement where articles or adjectives can predict noun gender. Essentially, there is a myriad of regular associations between speech sounds and linguistic information that users can use to make predictions during speech perception. It would be highly rewarding if this were investigated for different languages in future research. This would be carried out most reliably with neurophysiological data, but as the present study emphasises, prediction can also be investigated using behavioural data from response times and response accuracy, where response times would be affected considerably more than accuracy if a speech cue is used predictively rather than as a secondary grammatical cue. Response time studies do not depend on expensive equipment and can be carried out even in relatively remote locations (see Cronhamn et al., 2024).

7. Conclusions

Our study shows that L1 users, but not L2 users, of Danish used the prosodic cues stød and non-stød to predict the upcoming suffix when processing the grammatical structure of words. L1 users’ response times were negatively affected by mismatches, indicating that they associated prosody with suffixes and actively used them during speech processing. At the same time, their grammatical decisions for the segmental homophones were often based on the prosody alone. In contrast, the L2 users did not have longer response times for mismatches, not even at advanced levels, nor did they use prosodic cues to strengthen their grammatical decisions. We argue that this is due to a poor mapping between the perceptual and functional characteristics of stød and the L2 users’ L1 prosodic categories. Instead of suffix or word selection based on prosody, our study revealed that the L2 users were largely unaware of prosody–suffix associations. Additionally, they struggled with the distinction of suffixes for one of the noun genders included in our study. This, too, likely reflects poor L1-L2 mapping: they could probably not perceptually distinguish the familiar reduced central vowel [ə] from the unfamiliar reduced central vowel [ɤ].
The results confirm our hypothesis that prosodic cues in different languages vary in acquirability based on how well they can be interpreted through the lens of the learner’s L1. In previous word processing studies with Swedish word tones and Spanish stress, L2 users at intermediate levels used prosodic cues to predict suffixes. Compared to these prosodic cues, stød is more complex in its acoustic realisation and maps poorly onto the learners’ perceptual and functional L1 prosodic categories (German). This complexity may have impeded the L2 learners’ functional acquisition of the cue and made it inaccessible even to L2 users at advanced language proficiency levels. Further studies on different prosodic phenomena with similar and different L1-L2 mapping in various source and target languages, as well as more neurocognitive studies on the L2 processing of prosodic cues, would further strengthen the conclusions of the present paper.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/languages10080181/s1: Table S1: Language background; Table S2: Response accuracy.

Author Contributions

Conceptualization, S.G.B. and L.B.K.; methodology, S.G.B. and L.B.K.; validation, S.G.B.; formal analysis, S.G.B. and L.B.K.; investigation, S.G.B.; resources, S.G.B. and L.B.K.; data curation, S.G.B.; writing—original draft preparation, S.G.B.; writing—review and editing, S.G.B. and L.B.K.; visualisation, S.G.B.; project administration, S.G.B.; funding acquisition, S.G.B. and L.B.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Swedish Research Council (2021.00269), Independent Research Fund Denmark (7023-00131B), and Riksbankens Jubileumsfond (M23-0052).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by The Research Ethics Committee at the Faculty of Humanities, University of Copenhagen, on 12 January 2022.

Informed Consent Statement

Informed consent was obtained from all the subjects involved in the study prior to participation.

Data Availability Statement

Anonymised data and analysis scripts are available at https://erda.ku.dk/archives/bd1402d1b0788f123fc94eb5dcf7f51b/published-archive.html (uploaded on 19 June 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Neuter- and common-gender nouns used in the experiment. Target words were embedded in the following carrier sentence: Det var vist X, Brit sagde. ‘It was probably X (that) Brit said.’ When the target word started with the letter <t>, vist was replaced with the synonymous nok to facilitate splicing.
| Word (stød-associated suffix -en/-et, definite singular) | Broad transcription: match | Broad transcription: mismatch | Translation | Word (non-stød-associated suffix -e, indefinite plural) | Broad transcription: match | Broad transcription: mismatch | Translation |
|---|---|---|---|---|---|---|---|
| bold-en | ˈpɒlˀt-n̩ | ˈpɒlt-n̩ | ball-def | bold-e | ˈpɒlt-ə | ˈpɒlˀt-ə | ball-s |
| falk-en | ˈfælˀk-ŋ̍ | ˈfælk-ŋ̍ | falcon-def | falk-e | ˈfælk-ə | ˈfælˀk-ə | falcon-s |
| helt-en | ˈhɛlˀt-n̩ | ˈhɛlt-n̩ | hero-def | helt-e | ˈhɛlt-ə | ˈhɛlˀt-ə | hero-s |
| hvalp-en | ˈvælˀp-m̩ | ˈvælp-m̩ | puppy-def | hvalp-e | ˈvælp-ə | ˈvælˀp-ə | puppie-s |
| kælk-en | ˈkʰɛlˀk-ŋ̍ | ˈkʰɛlk-ŋ̍ | sleigh-def | kælk-e | ˈkʰɛlk-ə | ˈkʰɛlˀk-ə | sleigh-s |
| kamp-en | ˈkʰamˀp-m̩ | ˈkʰamp-m̩ | fight-def | kamp-e | ˈkʰamp-ə | ˈkʰamˀp-ə | fight-s |
| navn-et | ˈnaʊˀn-ɤ | ˈnaʊn-ɤ | name-def | navn-e | ˈnaʊn-ə | ˈnaʊˀn-ə | name-s |
| skib-et⁹ | ˈskiːˀp-ɤ | ˈskiːp-ɤ | ship-def | skib-e | ˈskiːp-ə | ˈskiːˀp-ə | ship-s |
| telt-et | ˈtsɛlˀt-ɤ | ˈtsɛlt-ɤ | tent-def | telt-e | ˈtsɛlt-ə | ˈtsɛlˀt-ə | tent-s |
| skjold-et⁹ | ˈskjɒlˀ-ɤ | ˈskjɒl-ɤ | shield-def | skjold-e | ˈskjɒl-ə | ˈskjɒlˀ-ə | shield-s |
| fjeld-et | ˈfjɛlˀ-ɤ | ˈfjɛl-ɤ | mountain-def | fjeld-e | ˈfjɛl-ə | ˈfjɛlˀ-ə | mountain-s |
9 The two mismatch forms for these words exist as real words: Skibet without stød is the past participle of the infrequent verb skibe ‘to ship’. Note that Danish typically uses sejle ‘to sail’ in the context of shipping. Skjoldet without stød means ‘blotched’. Because the experiment required nouns with the targeted inflectional pattern, relatively high frequency, and a splicing-friendly syllable structure, we decided to include these words despite their existing but infrequent non-stød counterparts. We judged that the non-stød participle and adjective would interfere only minimally in a paradigm that focused solely and strongly on the distinction between singular and plural nouns.

Notes

1
Danish stød is a complex prosodic cue; see Section 2.3. In the standard Copenhagen variety, it is best described as a case of laryngealisation with a preceding tonal component (Fischer-Jørgensen, 1989). It is often realised differently across and within speakers (Ejskjær, 1990; A. Hansen, 1943; Siem, 2024). Stød acts at the syllable level, rather than being aligned with specific phonemes: the laryngealisation starts roughly in the middle of the voiced section of a syllable (Peña, 2022). In our definite singular stimulus word skjoldet [skjɒlˀ-ɤ] ‘the shield’ (voiced section underlined), for instance, we would find the onset of laryngealisation in the second half of the vowel, and it would continue through the lateral. The laryngealisation is typically preceded by a high tonal marking on the first part of the syllable rhyme before the onset of the laryngealisation.
2
Since Late Modern Danish (ældre nydansk), the two suffix forms have been largely homonymous, most definitively established with Grundtvig’s spelling conventions (Grundtvig, 1872). Before that, there was a nomen agentis suffix in Late Middle Danish (gammeldansk) that was realised as -ere, presumably developed from a previous -are (compare Old Norse -ari (Næs, 1952) and present-day Swedish -are (Teleman et al., 1999)). The present tense suffix, in contrast, was realised as -er in Late Middle Danish (Petersen & Krogh, 2024), developed from the Old Danish present tense indicative singular -r/-ir (Munch, 1846).
3
The neuter definite singular suffix -et, for instance, has a long history of pronunciation variants. In different regional varieties, it has, since the late 19th century, been realised as either [əð], [ət], [ə], or [ər] (Bennike & Kristensen, 1898–1912). By the middle of the 20th century, the [ə] realisation had become an acceptable variant in virtually all but the traditional [ət] areas in central Jutland (including Denmark’s second-largest city, Aarhus) (Sørensen & Køster, n.d.). In Zealand Danish, including the prestigious, standard variety of the capital, Copenhagen, the traditional [əð] realisation prevails only in careful, distinct speech. In other contexts, -et is today predominantly realised as a reduced centralised alveolar–velar vowel [ɤ] (Schachtenhaufen, 2013), see (7A). The further reduced form [ə], see (7B), is also accepted in Zealand Danish (Sørensen & Køster, n.d.; Schachtenhaufen, 2013; Sørensen, 2014).
4
The neuter definite singular ending -et is, in modern standard Danish, realised as a syllabic sound produced with a raised tongue tip and tongue bed (Grønnum, 2005). In this paper, we follow the recent suggestion for the vowel-based notation /ɤ/ ([ɤ̯̈]) (Horslund et al., 2022; Schachtenhaufen, 2024) rather than using the more traditional /ð/ [ð̱̞(ˠ)] (Grønnum, 2005). This vowel-based transcription better captures the nature of the syllabic definite ending -et.
5
In contemporary Danish, the colloquial assimilation of the plural suffix -e [ə] to previous phonemes would lead to plural forms that are segmentally indistinguishable from definite singulars for word stems ending in nasals. Examples would be ovne [ɒʊnn̩] ‘ovens’ and ovnen [ɒʊnˀn̩] ‘the oven’ or ringe [ˈʁæŋŋ̍] ‘rings’ and ringen [ˈʁæŋˀŋ̍] ‘the ring’. We made sure not to include common-gender word stems ending in nasals.
6
Vowel onset was chosen as the earliest possible onset point for laryngealisation. In our target stimuli, this corresponded to a latency of between 122 ms and 164 ms (M = 147 ms, SD = 16) prior to the actual onset of the laryngealisation of stød (i.e., stød proper). Due to voiced pre-vocalic segments in some target words, vowel onset differed from F0 onset in our data. F0 onset would be the earliest possible time point for the realisation of pitch cues related to stød, and these have been shown to be important in L1 users’ stød perception (Peña, 2023). F0 onset was up to 79 ms earlier than vowel onset (M = 22 ms, SD = 28). However, timing the response time cut-off to F0 onset rather than vowel onset (i.e., F0 onset + 200 ms) would have included only a few additional trials (n = 10), and the majority of these (n = 7) were produced by one participant with a large number of very premature responses (before target word onset). Vowel onset, therefore, seemed the best anchor for the cut-off point in our specific dataset.
7
Deviation coding compares both levels of a variable against the grand mean, rather than using one level as baseline. This type of coding was chosen here since our variables cannot be considered treatment-type variables, where one level possesses the essence of a baseline (before treatment) and the other level a change from said baseline (after treatment). We have no reason to believe, for instance, that Gender (common vs. neuter) or Number (definite singular vs. indefinite plural) have this kind of relationship where one level is the default (baseline) from which the other level is derived. Deviation coding, therefore, best represents the essence of our data.
8
There is traditionally a large degree of migration from Germany to South and West Jutland, where the -et suffix would, in the local variety, often be realised as [ət] rather than [əð], [ɤ], or [ə]. It is therefore possible that at least some of the L2 participants were relatively unaware of the reduced [ə] realisation of -et in informal standard Danish. However, the vast majority of our L2 participants (34 out of 39) indicated that they were acquiring standard Copenhagen Danish, in which -et is most commonly realised as [ɤ], very distinctly as [əð], or very colloquially as [ə]. This suggests that most L2 participants would be highly familiar with both the [ɤ] realisation and the [ə] realisation used in our target words, but some might not.
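The two analysis choices described in Notes 6 and 7 can be illustrated with a minimal Python sketch. It is not the authors' script (their analysis scripts are in the archive named in the Data Availability Statement). The 200 ms offset after vowel onset is inferred from the parenthetical in Note 6, the ±0.5 scaling of the deviation codes is an assumption (R's `contr.sum`, for instance, uses ±1), and the function names and toy trials are hypothetical.

```python
# Hypothetical constant: the 200 ms offset is inferred from Note 6,
# which names "F0 onset + 200 ms" as the alternative anchor.
CUTOFF_OFFSET_MS = 200


def is_premature(response_ms, vowel_onset_ms, offset_ms=CUTOFF_OFFSET_MS):
    """Return True if a response falls before vowel onset + offset.

    Both times must be measured from the same zero point
    (e.g., the start of the sound file).
    """
    return response_ms < vowel_onset_ms + offset_ms


def deviation_code(levels):
    """Map the two levels of a factor to -0.5 and +0.5 (alphabetical order).

    Neither level serves as a baseline. In a balanced linear model with this
    coding, the intercept is the grand mean and the coefficient is the
    difference between the two level means.
    """
    first, second = sorted(set(levels))  # raises if there are not exactly 2 levels
    codes = {first: -0.5, second: 0.5}
    return [codes[level] for level in levels]


# Toy trials; all values are invented for illustration.
trials = [
    {"gender": "common", "response_ms": 310, "vowel_onset_ms": 150},
    {"gender": "neuter", "response_ms": 620, "vowel_onset_ms": 140},
    {"gender": "common", "response_ms": 580, "vowel_onset_ms": 160},
    {"gender": "neuter", "response_ms": 700, "vowel_onset_ms": 120},
]

# Drop premature responses, then code Gender for the remaining trials.
kept = [t for t in trials
        if not is_premature(t["response_ms"], t["vowel_onset_ms"])]
gender_codes = deviation_code([t["gender"] for t in kept])
```

In this toy data, the first trial is excluded because 310 ms is earlier than 150 ms + 200 ms; the remaining trials receive codes that are symmetric around zero, so neither gender is treated as the default.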

References

  1. Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2020). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407.
  2. Banti, G. (1988). Two Cushitic systems: Somali and Oromo nouns. In H. van der Hulst, & N. Smith (Eds.), Autosegmental Studies on Pitch Accent (pp. 11–50). De Gruyter.
  3. Basbøll, H. (2003). Prosody, productivity and word structure: The stød pattern of Modern Danish. Nordic Journal of Linguistics, 26(1), 5–44.
  4. Basbøll, H. (2014). Danish stød as evidence for grammaticalisation of suffixal positions in word structure. Acta Linguistica Hafniensia, 46(2), 137–158.
  5. Basbøll, H., & Wagner, J. (1985). Kontrastive phonologie des deutschen und dänischen: Segmentale wortphonologie und -phonetik (Vol. 160). De Gruyter.
  6. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48.
  7. Bennike, V., & Kristensen, M. (1898–1912). Kort over de danske folkemål med forklaringer. Gyldendalske Boghandel.
  8. Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot International, 5, 341–345.
  9. Bruce, G. (1977). Swedish word accents in sentence perspective. Gleerup.
  10. Bruce, G. (1983). Accentuation and timing in Swedish. Folia Linguistica, 17, 221–238.
  11. Bruce, G. (1987). How floating is focal accent? In K. Gregersen, & H. Basbøll (Eds.), Nordic prosody IV: Papers from a symposium (pp. 41–49). Odense University Press.
  12. Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). Springer.
  13. Christiansen, M. H., & Chater, N. (2016). The Now-or-Never bottleneck: A fundamental constraint on language. Behavioral and Brain Sciences, 39, e62.
  14. Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181–204.
  15. Clausen, S. J., & Kristensen, L. B. (2015). The cognitive status of stød. Nordic Journal of Linguistics, 38(2), 163–187.
  16. Council of Europe. (2020). Common European framework of reference for languages: Learning, teaching, assessment—Companion volume. Available online: https://www.coe.int/lang-cefr (accessed on 30 November 2024).
  17. Cronhamn, S., Hjortdal, A., da Silva, F., & Roll, M. (2024, August 21–24). The predictive function of Baniwa classifiers. 57th Annual Meeting of the Societas Linguistica Europaea, SLE 2024, Helsinki, Finland.
  18. Crysmann, B. (2015). Representing morphological tone in a computational grammar of Hausa. Journal of Language Modelling, 3(2), 463–512.
  19. Cutler, A. (2005). Lexical stress. In D. B. Pisoni, & R. E. Remez (Eds.), The handbook of speech perception (pp. 264–289). Blackwell Publishing.
  20. DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8(8), 1117–1121.
  21. de Marco, A. (2019). Teaching the prosody of emotive communication in a second language. In C. Savvidou (Ed.), Second language acquisition—Pedagogies, practices and perspectives. IntechOpen.
  22. Dolatian, H. (2019). Cyclicity and prosody in Armenian stress-assignment. University of Pennsylvania Working Papers in Linguistics, 25(1), 79–88. Available online: https://repository.upenn.edu/pwpl/vol25/iss1/10 (accessed on 18 March 2025).
  23. Eisenhuth, H. (2015). Production and perception of word boundary markers in German speech [Doctoral thesis, University of Konstanz].
  24. Ejskjær, I. (1967). Kortvokalstødet i sjællandsk. Akademisk forlag.
  25. Ejskjær, I. (1990). Stød and pitch accents in the Danish dialects. Acta Linguistica Hafniensia, 22(1), 49–75.
  26. Fernandez, K., & Sagarra, N. (2025). Game On: Does Computerized Training Promote Second Language Stress–Suffix Associations? Languages, 10(7), 170.
  27. Féry, C. (1994). Umlaut and inflection in German [Master’s thesis, University of Tübingen].
  28. Féry, C. (2010). German intonational patterns. Walter de Gruyter.
  29. Fischer-Jørgensen, E. (1989). Phonetic analysis of the stød in standard Danish. Phonetica, 46, 1–59.
  30. Flege, J. E., & Bohn, O.-S. (2021). The revised Speech Learning Model (SLM-r). In R. Wayland (Ed.), Second language speech learning (pp. 3–83). Cambridge University Press.
  31. Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.
  32. Fritz, I., Kotzor, S., & Lahiri, A. (2025). Línguist~Lingúistics: Phonological alternations in L1 and L2 processing [Manuscript submitted for publication].
  33. Galani, A. (2005). The morphosyntax of verbs in Modern Greek [Doctoral thesis, University of York].
  34. Gandour, J. (1983). Tone perception in Far Eastern languages. Journal of Phonetics, 11(2), 149–175.
  35. Garellek, M. (2013). Production and perception of glottal stops [Doctoral thesis, UCLA].
  36. Gibbon, D. (1998). Intonation in German. In D. Hirst, & A. Di Cristo (Eds.), Intonation systems: A survey of twenty languages. Cambridge Univ. Press.
  37. Goldshtein, Y. (2021). Stødets naturlige historie. In Y. Goldshtein, I. S. Hansen, & T. T. Hougaard (Eds.), 18. Møde om udforskningen af dansk sprog. Aarhus Universitet.
  38. Gosselke Berthelsen, S., Horne, M., Brännström, K. J., Shtyrov, Y., & Roll, M. (2018). Neural processing of morphosyntactic tonal cues in second-language learners. Journal of Neurolinguistics, 45, 60–78.
  39. Gosselke Berthelsen, S., Horne, M., Shtyrov, Y., & Roll, M. (2022). Native language experience shapes pre-attentive foreign tone processing and guides rapid memory trace build-up: An ERP study. Psychophysiology, 59(8), e14042.
  40. Grønnum, N. (2005). Fonetik og fonologi: Almen og dansk. Akademisk forlag.
  41. Grundtvig, S. (1872). Dansk haandordbog med den af kultusministeriet anbefalede retskrivning. C. A. Reitzels Forlag.
  42. Gussmann, E. (2007). The phonology of Polish. OUP Oxford.
  43. Hannahs, S. J. (2011). Celtic Mutations. In M. van Oostendorp, C. Ewen, B. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. 5, pp. 2807–2830). Wiley & Sons, Ltd.
  44. Hansen, A. (1943). Stødet i dansk. Munksgaard.
  45. Hansen, G. F. (2015). Stød og stemmekvalitet. En akustisk-fonetisk undersøgelse af ændringer i stemmekvaliteten i forbindelse med stød [Ph.D. thesis, University of Copenhagen].
  46. Hansen, G. F. (2018, June 7–8). Exploring voice quality changes in words with stød. Proceedings Fonetik 2018 (pp. 21–26), Gothenburg, Sweden.
  47. Hed, A., Schremm, A., Horne, M., & Roll, M. (2019). Neural correlates of second language acquisition of tone-grammar associations. The Mental Lexicon, 14(1), 98–123.
  48. Hjortdal, A., Frid, J., Novén, M., & Roll, M. (2024). Swift prosodic modulation of lexical access: Brain potentials from three North Germanic language varieties. Journal of Speech, Language, and Hearing Research, 67(2), 400–414.
  49. Hjortdal, A., Frid, J., & Roll, M. (2022). Phonetic and phonological cues to prediction: Neurophysiology of Danish stød. Journal of Phonetics, 94, e101178.
  50. Horslund, C. S., Puggaard-Rode, R., & Jørgensen, H. (2022). A phonetically-based phoneme analysis of the Danish consonant system. Acta Linguistica Hafniensia, 54(1), 73–105.
  51. Kohler, K. J. (2009). Glottal stops and glottalization in German: Data and theory of connected speech processes. Phonetica, 51, 38–51.
  52. Kristensen, L. B., & Wallentin, M. (2015). Putting Broca’s region into context: fMRI evidence for a role in predictive language processing. In R. M. Willems (Ed.), Cognitive neuroscience of natural language use (pp. 160–181). Cambridge University Press.
  53. Kuperberg, G. R., Brothers, T., & Wlotko, E. W. (2020). A tale of two positivities and the N400: Distinct neural signatures are evoked by confirmed and violated predictions at different levels of representation. Journal of Cognitive Neuroscience, 32(1), 12–35.
  54. Kuperberg, G. R., & Jaeger, T. F. (2016). What do we mean by prediction in language comprehension? Language, Cognition and Neuroscience, 31(1), 32–59.
  55. Kutas, M., DeLong, K. A., & Smith, N. J. (2011). A look around at what lies ahead: Prediction and predictability in language processing. In Predictions in the brain: Using our past to generate a future (pp. 190–207). Oxford University Press.
  56. Labrune, L. (2012). The phonology of Japanese. Oxford University Press.
  57. Langston, K. (1997). Pitch accent in Croatian and Serbian: Towards an autosegmental analysis. Journal of Slavic Linguistics, 5(1), 80–116.
  58. Lehiste, I. (1972). Manner of articulation, parallel processing, and the perception of duration. Working Papers in Linguistics: The Ohio State University, 12, 33–52.
  59. León-Cabrera, P., Hjortdal, A., Gosselke Berthelsen, S., Rodríguez-Fornells, A., & Roll, M. (2024). Neurophysiological signatures of prediction in language: A critical review of anticipatory negativities. Neuroscience & Biobehavioral Reviews, 160, 105624.
  60. León-Cabrera, P., Rodríguez-Fornells, A., & Morís, J. (2017). Electrophysiological correlates of semantic anticipation during speech comprehension. Neuropsychologia, 99, 326–334.
  61. Levy, R. (2008). A noisy-channel model of human sentence comprehension under uncertain input. In M. Lapata, & H. T. Ng (Eds.), Proceedings of the 2008 conference on empirical methods in natural language processing (pp. 234–243). Association for Computational Linguistics. Available online: https://aclanthology.org/D08-1025 (accessed on 18 March 2024).
  62. Lozano-Argüelles, C., Sagarra, N., & Casillas, J. V. (2020). Slowly but surely: Interpreting facilitates L2 morphological anticipation based on suprasegmental and segmental information. Bilingualism: Language and Cognition, 23(4), 752–762.
  63. Munch, P. A. (1846). Sproghistoriske undersøgelser om det aeldste faellesnordiske sprogs udseende og forsøg til at bestemme den olddanske og oldsvenske mundarts normale orthographi, grammatik og rette forhold til norroena-mundarten. Annaler for Nordisk Oldkyndighed og Historie, 1, 219–283.
  64. Næs, O. (1952). Norsk grammatikk: Bokmål og nynorsk på bakgrunn av språkhistorie og dialekter. Ordlære. Fabritius.
  65. Nieuwland, M. S., & van Berkum, J. J. A. (2006). When peanuts fall in love: N400 evidence for the power of discourse. Journal of Cognitive Neuroscience, 18(7), 1098–1111.
  66. Oomen, A. (1981). Gender and plurality in Rendille. Afroasiatic Linguistics, 8(1), 35–78.
  67. Ortega-Llebaria, M. (2006). Phonetic cues to stress and accent in Spanish. In Selected proceedings of the 2nd conference on laboratory approaches to Spanish phonetics and phonology (pp. 104–118). Cascadilla Press.
  68. Ortega-Llebaria, M., Gu, H., & Fan, J. (2013). English speakers’ perception of Spanish lexical stress: Context-driven L2 stress perception. Journal of Phonetics, 41(3-4), 186–197.
  69. Peña, J. M. (2022). Stød timing and domain in Danish. Languages, 7(1), 50.
  70. Peña, J. M. (2023). Effects of fundamental frequency and harmonics-to-noise ratio on the perception of Danish laryngealized phonation. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th international congress of phonetic sciences (pp. 1736–1740). Guarant.
  71. Petersen, K. T., & Krogh, S. (2024). Vej Lejre græsse nu Faar paa Vold, hvor fordum Kæmperne drukke. Udviklingen af numeruskongruens mellem subjekt og finit verbum i dansk fra ca. 1500 til ca. 1900. Ny Forskning i Grammatik, 31, 176–193.
  72. Quist, P. (2002). Nye danskere, nye dialekter. In Dialekter—Sidste udkald? Modersmål-Selskabets årbog 2002 (pp. 101–107). Hans Reitzels Forlag.
  73. R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 18 March 2024).
  74. Rehrig, G. L. (2017). Acoustic correlates of syntax in sentence production and comprehension [Ph.D. thesis, Rutgers].
  75. Riad, T. (2003). Diachrony of the Scandinavian accent typology. In P. Fikkert, & H. Jacobs (Eds.), Development in prosodic systems (pp. 91–144). De Gruyter Mouton.
  76. Riad, T. (2014). The phonology of Swedish. OUP Oxford.
  77. Ringgaard, K. (1960). Vestjysk stød [Ph.D. thesis, Aarhus University].
  78. Rischel, J. (2008). Morphemic tone and word tone in Eastern Norwegian. In J. Rischel (Ed.), Sound structure in language (pp. 167–174). Oxford University Press.
  79. Roll, M. (2015). A neurolinguistic study of South Swedish word accents: Electrical brain potentials in nouns and verbs. Nordic Journal of Linguistics, 38(2), 149–162.
  80. Roll, M. (2022). The predictive function of Swedish word accents. Frontiers in Psychology, 13, e910787.
  81. Roll, M., Horne, M., & Lindgren, M. (2010). Word accents and morphology—ERPs of Swedish word processing. Brain Research, 1330, 114–123.
  82. Roll, M., Söderström, P., Frid, J., Mannfolk, P., & Horne, M. (2017). Forehearing words: Pre-activation of word endings at word onset. Neuroscience Letters, 658, 57–61.
  83. Roll, M., Söderström, P., & Horne, M. (2013). Word-stem tones cue suffixes in the brain. Brain Research, 1520, 116–120.
  84. Roll, M., Söderström, P., Horne, M., & Hjortdal, A. (2023). Pre-activation negativity (PrAN): A neural index of predictive strength of phonological cues. Laboratory Phonology, 14(1), 1.
  85. Roll, M., Söderström, P., Mannfolk, P., Shtyrov, Y., Johansson, M., van Westen, D., & Horne, M. (2015). Word tones cueing morphosyntactic structure: Neuroanatomical substrates and activation time-course assessed by EEG and fMRI. Brain and Language, 150, 14–21.
  86. RStudio Team. (2022). RStudio: Integrated development environment for R. RStudio, PBC. Available online: http://www.rstudio.com/ (accessed on 18 March 2024).
  87. Sagarra, N., & Casillas, J. V. (2018). Suprasegmental information cues morphological anticipation during L1/L2 lexical access. Journal of Second Language Studies, 1(1), 31–59.
  88. Schachtenhaufen, R. (2013). Fonetisk reduktion i dansk [Doctoral thesis, Copenhagen Business School].
  89. Schachtenhaufen, R. (2024). Utilpasset IPA. Danske Studier, 2023, 21–61.
  90. Schremm, A., Söderström, P., Horne, M., & Roll, M. (2016). Implicit acquisition of tone-suffix connections in L2 learners of Swedish. Mental Lexicon, 11(1), 55–75.
  91. Siem, A. (2024). Phonetic and dialectal variation in phonologically contrastive laryngealisation: A case study of the Danish stød [Doctoral thesis, Lancaster University].
  92. So, K. C., & Best, C. T. (2014). Phonetic influences on English and French listeners’ assimilation of Mandarin tones to native prosodic categories. Studies in Second Language Acquisition, 36, 195–221.
  93. Söderström, P., Horne, M., Frid, J., & Roll, M. (2016). Pre-activation negativity (PrAN) in brain potentials to unfolding words. Frontiers in Human Neuroscience, 10, e512.
  94. Söderström, P., Horne, M., Mannfolk, P., van Westen, D., & Roll, M. (2017a). Tone-grammar association within words: Concurrent ERP and fMRI show rapid neural pre-activation and involvement of left inferior frontal gyrus in pseudoword processing. Brain and Language, 174, 119–126.
  95. Söderström, P., Horne, M., Mannfolk, P., van Westen, D., & Roll, M. (2018). Rapid syntactic pre-activation in Broca’s area: Concurrent electrophysiological and haemodynamic recordings. Brain Research, 1697, 76–82.
  96. Söderström, P., Horne, M., & Roll, M. (2017b). Stem tones pre-activate suffixes in the brain. Journal of Psycholinguistic Research, 46(2), 271–280.
  97. Söderström, P., Roll, M., & Horne, M. (2012). Processing morphologically conditioned word accents. The Mental Lexicon, 7(1), 77–89.
  98. Sørensen, V. (2014). Lyd og prosodi i de danske dialekter. The Peter Skautrup Center for Jutlandic Dialect Research. Available online: https://jysk.au.dk/fileadmin/www.jysk.au.dk/publikationer/centrets_publikationer/lydogprosodi.pdf (accessed on 18 March 2024).
  99. Sørensen, V., & Køster, F. (n.d.). Kort 18—Modsvarigheder til rigsmålets endelser -ede, -et. The Peter Skautrup Center for Jutlandic Dialect Research. Available online: https://jysk.au.dk/samlinger/baandsamling/dialektproever/oversigtoverkort/kort18 (accessed on 14 May 2025).
  100. Stenson, N. (2019). Modern Irish: A comprehensive grammar. Routledge.
  101. Strange, W. (2011). Automatic selective perception (ASP) of first and second language speech: A working model. Journal of Phonetics, 39(4), 456–466.
  102. Teleman, U., Hellberg, S., & Andersson, E. (1999). Svenska akademiens grammatik. Norstedts.
  103. Thráinsson, H. (2017). U-umlaut in Icelandic and Faroese: Survival and death. In C. Bowern, L. Horn, & R. Zanuttini (Eds.), On looking into words (and beyond): Structures, relations, analyses. Zenodo.
  104. van Leussen, J.-W., & Escudero, P. (2015). Learning to perceive and recognize a second language: The L2LP model revised. Frontiers in Psychology, 6, e1000.
  105. van Maastricht, L., Krahmer, E., & Swerts, M. (2016). Prominence patterns in a second language: Intonational transfer from Dutch to Spanish and vice versa. Language Learning, 66(1), 124–158.
  106. Wetzels, W. L. (1995). Mid-vowel alternations in the Brazilian Portuguese verb. Phonology, 12(2), 281–304.
  107. Wetzels, W. L., & Mascaró, J. (2001). The typology of voicing and devoicing. Language, 77(2), 207–244.
  108. Whelan, R. (2008). Effective analysis of reaction time data. The Psychological Record, 58(3), 475–482.
Figure 1. Intensity (A) and pitch (B) patterns for target words, centred around syllable borders. Stød words are represented in red, and non-stød words in blue (pale lines = individual words; dark line = average). Dotted average lines mark time points where there are data for less than 40% of the target words: word-initially and word-finally due to different target word lengths, and word-medially for stød words when voicing becomes aperiodic [or biperiodic] and pitch cannot be tracked.
Figure 2. The experimental procedure. The time-out screen appeared only if participants did not respond within 2 s after the auditory stimulus had finished. It urged participants to respond faster. Optionally, participants could press the P key to escape to a break screen if needed.
Figure 3. Illustration of the interaction effects for Match and Proficiency (A) and Gender and Proficiency (B) for log-transformed RTs. Back-transformed RTs are indicated to the left.
Figure 4. Illustration of the most complex interaction for suffix-based response accuracy. The graphs for the Match and Number conditions are overlapped for each gender. Match is coded in green, and mismatch in grey/black. Singular conditions are indicated in dotted lines, and plural conditions in solid lines.
Figure 5. The percentage of suffix-based responses for neuter words with stød by participant in the L1 group: plural mismatch in dark grey, singular match in light grey. Each pair of columns represents one participant, sorted by the percentage of plural decisions for mismatched plural words (stød + [ə]).
Table 1. Illustration of the prosody–grammar associations in Danish (black and grey). The stimulus types used in the experiment are marked in black (matched suffixes) and red (mismatched suffixes).
| Gender | Word stem | Match | Mismatch |
|---|---|---|---|
| Common | bold- [ˈpɒlˀt] | definite singular -en; indefinite singular: bare stem with no suffix | indefinite plural -e |
| Common | bold- [ˈpɒlt] | indefinite plural -e; definite plural: -ene; second part of compounds, e.g., -pige ‘ball girl’, -øje ‘eye for the ball’ | definite singular -en |
| Neuter | skib- [ˈskiːˀp] | definite singular -et; indefinite singular: bare stem with no suffix | indefinite plural -e |
| Neuter | skib- [ˈskiːp] | indefinite plural -e; definite plural: -ene; second part of compounds, e.g., -sdæk ‘ship’s deck’, -brud ‘shipwreck’ | definite singular -et |
Table 2. Main effects and significant interactions of the best-fit model for log-transformed RT.
| Random effects | Variance | Std. dev. |
|---|---|---|
| Participant (intercept) | 0.005 | 0.070 |
| Item (intercept) | 0.000 | 0.011 |

| Fixed effects | Estimate (β̂) | Std. error | t-value | p-value | Sig. |
|---|---|---|---|---|---|
| (Intercept) | 2.964 | 0.017 | 252.804 | <2 × 10⁻¹⁶ | *** |
| Match | 0.045 | 0.002 | 18.4990 | <2 × 10⁻¹⁶ | *** |
| LevelDanishadvan. | 0.009 | 0.025 | 0.345 | 0.731 | |
| LevelDanishupper | −0.002 | 0.026 | −0.009 | 0.993 | |
| LevelDanishlower | 0.051 | 0.025 | 2.045 | 0.044 | * |
| LevelDanishbegin. | 0.004 | 0.025 | 0.169 | 0.866 | |
| Gender | 0.005 | 0.007 | 0.661 | 0.523 | |
| Number | 0.014 | 0.002 | 8.301 | <2 × 10⁻¹⁶ | *** |
| TrialNumber | −0.021 | 0.003 | −6.950 | 4 × 10⁻¹² | *** |
| Match:LevelDanishadvan. | −0.043 | 0.005 | −7.957 | 2 × 10⁻¹⁵ | *** |
| Match:LevelDanishupper | −0.040 | 0.006 | −7.141 | 1 × 10⁻¹⁴ | *** |
| Match:LevelDanishlower | −0.038 | 0.005 | −7.040 | 2 × 10⁻¹² | *** |
| Match:LevelDanishbegin. | −0.042 | 0.005 | −7.669 | 2 × 10⁻¹⁴ | *** |
| Gender:LevelDanishadvan. | 0.036 | 0.005 | 6.680 | 2 × 10⁻¹¹ | *** |
| Gender:LevelDanishupper | −0.017 | 0.006 | 2.886 | 0.004 | ** |
| Gender:LevelDanishlower | 0.025 | 0.006 | 4.512 | 6 × 10⁻⁶ | *** |
| Gender:LevelDanishbegin. | 0.008 | 0.005 | 1.466 | 0.143 | |
Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1.
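As a reading aid for Table 2, the following minimal sketch back-transforms a few of the log-RT coefficients into more familiar units. It rests on assumptions that the table itself does not state: that the logarithm is base 10 (an intercept of 2.964 would then correspond to roughly 920 ms, whereas a natural logarithm would give an implausible value of about 19 ms), that the predictors are treatment-coded with the L1 group as the reference level, and that continuous predictors are centred. The variable and function names are illustrative only, and the direction of the Match contrast depends on the coding actually used.

```python
# Coefficients copied from Table 2 (best-fit model for log-transformed RT).
INTERCEPT = 2.964          # reference-level estimate on the log scale
MATCH = 0.045              # Match effect in the reference (assumed L1) group
MATCH_BY_ADVANCED = -0.043 # Match:LevelDanishadvan. interaction

LOG_BASE = 10.0  # ASSUMPTION: RTs were log10-transformed (not stated in the table)


def to_ms(log_rt: float, base: float = LOG_BASE) -> float:
    """Back-transform a value on the log scale to milliseconds."""
    return base ** log_rt


def rt_ratio(coef: float, base: float = LOG_BASE) -> float:
    """Multiplicative change in RT implied by an additive log-scale coefficient."""
    return base ** coef


baseline_ms = to_ms(INTERCEPT)
l1_match_ratio = rt_ratio(MATCH)
# Under treatment coding, the Match effect for the advanced L2 group is the
# sum of the main effect and the interaction term.
advanced_match_ratio = rt_ratio(MATCH + MATCH_BY_ADVANCED)

print(f"Reference-level RT: {baseline_ms:.0f} ms")
print(f"Match effect, reference group: x{l1_match_ratio:.3f}")
print(f"Match effect, advanced L2 group: x{advanced_match_ratio:.3f}")
```

Under these assumptions, the Match contrast changes RTs by roughly 11% in the reference group, while the interaction term almost entirely cancels that effect for the advanced L2 group.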
Table 3. The main effects and most complex interaction for the best-fit model for response accuracy.
| Random effects | Variance | Std. dev. |
|---|---|---|
| Participant (intercept) | 0.647 | 0.805 |
| Item (intercept) | 0.031 | 0.175 |

| Fixed effects | Estimate (β̂) | Std. error | z-value | p-value | Sig. |
|---|---|---|---|---|---|
| (Intercept) | 2.482 | 0.151 | 16.418 | <2 × 10⁻¹⁶ | *** |
| Match | −1.463 | 0.089 | −16.354 | <2 × 10⁻¹⁶ | *** |
| LevelDanishadvan. | −0.555 | 0.299 | −1.857 | 0.063 | . |
| LevelDanishupper | −1.590 | 0.306 | −5.203 | 2 × 10⁻⁷ | *** |
| LevelDanishlower | −0.994 | 0.296 | −3.356 | 8 × 10⁻⁴ | *** |
| LevelDanishbegin. | −1.974 | 0.293 | −6.733 | 2 × 10⁻¹¹ | *** |
| Gender | −0.544 | 0.139 | −3.915 | 9 × 10⁻⁵ | *** |
| Number | −0.522 | 0.089 | −5.864 | <5 × 10⁻⁹ | *** |
| Trial Number | 0.405 | 0.080 | 5.074 | 4 × 10⁻⁷ | *** |
| Gender:LevelDanishL1:Matchm:Number | −1.061 | 0.303 | −3.505 | 5 × 10⁻⁴ | *** |
| Gender:LevelDanishadv:Matchm:Number | 0.312 | 0.419 | 0.744 | 0.457 | |
| Gender:LevelDanishupp:Matchm:Number | 0.304 | 0.300 | 1.013 | 0.311 | |
| Gender:LevelDanishlow:Matchm:Number | 0.651 | 0.354 | 1.837 | 0.066 | . |
| Gender:LevelDanishbeg:Matchm:Number | 0.159 | 0.274 | 0.577 | 0.563 | |
| Gender:LevelDanishL1:Matchmm:Number | −1.501 | 0.188 | −7.985 | 1 × 10⁻¹⁵ | *** |
| Gender:LevelDanishadv:Matchmm:Number | 0.494 | 0.403 | 1.228 | 0.220 | |
| Gender:LevelDanishupp:Matchmm:Number | 0.320 | 0.293 | 1.093 | 0.274 | |
| Gender:LevelDanishlow:Matchmm:Number | 0.214 | 0.349 | 0.614 | 0.539 | |
| Gender:LevelDanishbeg:Matchmm:Number | 0.410 | 0.275 | 1.492 | 0.136 | |
Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1.
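The estimates in Table 3 are on the log-odds scale, which can be hard to read directly. The sketch below converts two of them into probabilities and an odds ratio. It assumes a binomial model with a logit link, treatment coding with the intercept as the reference cell, and all other predictors held at their reference or centred values. None of this is stated in the table, so the numbers are an illustration of the arithmetic rather than a restatement of the authors' results. The function and variable names are hypothetical.

```python
import math

# Coefficients copied from Table 3 (best-fit model for response accuracy).
INTERCEPT = 2.482  # log-odds of a correct response in the reference cell
MATCH = -1.463     # main effect of Match on the log-odds scale


def inv_logit(log_odds: float) -> float:
    """Convert log-odds to a probability (inverse of the logit link)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# ASSUMPTION: treatment coding, so the intercept is the reference cell and
# adding MATCH moves to the other level of the Match factor.
p_reference = inv_logit(INTERCEPT)
p_other_match_level = inv_logit(INTERCEPT + MATCH)
odds_ratio = math.exp(MATCH)

print(f"Accuracy in reference cell: {p_reference:.3f}")
print(f"Accuracy at other Match level: {p_other_match_level:.3f}")
print(f"Odds ratio for Match: {odds_ratio:.3f}")
```

Read this way, the Match coefficient corresponds to a drop in predicted accuracy from about 92% to about 73% in the reference group, ignoring the higher-order interactions listed further down the table.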

Share and Cite

MDPI and ACS Style

Gosselke Berthelsen, S.; Kristensen, L.B. That Came as No Surprise! The Processing of Prosody–Grammar Associations in Danish First and Second Language Users. Languages 2025, 10, 181. https://doi.org/10.3390/languages10080181