Article

The Importance of Being Onset: Tuscan Lenition and Stops in Coda Position

Department of Philology, Literature, and Linguistics, University of Pisa, 56126 Pisa, Italy
*
Author to whom correspondence should be addressed.
Languages 2025, 10(6), 129; https://doi.org/10.3390/languages10060129
Submission received: 6 April 2024 / Revised: 29 April 2025 / Accepted: 20 May 2025 / Published: 30 May 2025
(This article belongs to the Special Issue Speech Variation in Contemporary Italian)

Abstract

This paper examines Gorgia Toscana (GT), a phenomenon of stop lenition observed in Tuscan varieties of Italian. Traditionally, this process has been understood to occur in post-vocalic positions, which, in the native lexicon, corresponds to onset position due to the absence of stops in syllable codas in Italian, apart from geminate consonants that straddle the coda and onset of adjacent syllables. However, stops in coda positions are found in both loanwords (e.g., admin, Batman) and bookwords (e.g., ritmo, tecnica). Drawing on original acoustic data collected from 42 native speakers of Florentine Italian, we investigated the realization of stops in such lexical items through allophonic classification and quantitative analysis. Our primary aim was to test the Onset Hypothesis, which posits that Gorgia exclusively affects stops in onset positions, implying that coda stops should not undergo lenition. Our findings support this hypothesis. We provide a phonological analysis within the frameworks of Strict CV and Coda Mirror, emphasizing the importance of syllable structure in understanding the manifestation of Gorgia Toscana, which we argue cannot be adequately captured solely by considering the linear order of segments.

1. Introduction

This contribution deals with the phonology of Gorgia Toscana (henceforth GT), a stop lenition process affecting Tuscan varieties, particularly Florentine, which is also the epicenter from which the phenomenon spreads across the dialectal area. GT, possibly the most emblematic trait of “Tuscanness” (cf. Cravens, 2000), has been the subject of several publications, such as, among others, Contini (1960), Giannelli and Savoia (1978, 1980), Giannelli and Cravens (1997), Marotta (2008), Ulfsbjorninn (2017), and Russo (2022), which have dealt with the phenomenon from different points of view. In this contribution, we aim to investigate the GT pattern in light of syllable structure, focusing on the context of the syllable coda, an aspect that has so far received very little attention (exceptions are Ulfsbjorninn, 2017 and Marotta et al., 2023). This study is based on a corpus of original data collected in the city of Florence, the regional capital of Tuscany.
The paper is structured as follows. In Section 2, we provide a phonological overview of GT, focusing on the effects on the targeted stop segments, while in Section 3, we will concentrate on the context in which GT occurs. In Section 4, the phonological frameworks adopted in the analyses are presented. Subsequently, in Section 5, we present a brief discussion of the sociolinguistic aspects, which might be assumed to interact with the GT phonological pattern. Section 6 presents the data collection, the corpus composition (Section 6.1), and the methodologies (Section 6.2), both concerning the allophonic classification (Section 6.2.1) and the quantitative analysis (Section 6.2.2), whose results are presented in Section 7. The discussion of our phonological interpretation of the GT pattern is found in Section 8, both dealing with its relationship with the syllable (see Section 8.1 and Section 8.2), and the subsegmental mechanism affecting stops when subject to the phenomenon (in Section 8.3). Section 9 concludes the paper.

2. A Phonological Description of What GT Is

GT is traditionally interpreted as a lenition or weakening process of the singleton voiceless stops /p t k/, resulting in the homorganic voiceless fricatives [ɸ θ x], respectively, along with the glottal fricative [h] as an alternative allophone of the velar stop /k/ (cf. Giannelli & Savoia, 1978; Sorianello, 2003). However, multiple further realizations are produced, as the studies of Giannelli and Savoia (1978, 1980) show; more than 30 allophones are reported in the authors’ works (ibidem), approximately 10 allophones for each underlying stop. Such variation has been analyzed as constituting a continuum of lenition, leading to the deletion of stops as the final step. More recently, Sorianello (2003) discusses the occurrence, within a corpus of acoustic data from three Florentine speakers, of six classes of allophones, distinguishable based on the manner of articulation: voiceless stops [p t k] (only sporadically), strong voiceless fricatives¹ [ɸ̝ θ̝ x̝], voiceless fricatives [ɸ θ x h], voiced stops [b d ɡ], voiced fricatives [β ð ɣ], and approximants [β̞ ð̞ ɣ̞ ɦ̞]. Among the outcomes of /p/, Sorianello (2003) also records the occurrence of a partially voiced and fricated unreleased bilabial stop [p̬̚], defined as lenis. It represents the first step on the continuum of the spirantization outcomes, highlighting the gradient nature of GT.
The same kind of spirantization has been observed for voiced stops /b d ɡ/, resulting in homorganic voiced continuant segments, both fricatives and approximants [β β̞ ð ð̞ ɣ ɣ̞], according to the phonetic analysis in Giannelli and Savoia (1978, 1980; see also Villafaña Dalcher, 2008). In these studies, it is also observed that the voiced stops are less systematically affected by GT. Accordingly, Bafile (1997) and Sorianello (2003) consider voiceless stops as the main target of GT.
In Tuscan Italian, post-alveolar affricates /ʧ ʤ/ undergo a deaffrication process, which, as far as the relationship between underlying and surface forms is concerned, appears to be analogous to the spirantization observed with stops, i.e., the closure phase of the segment is lost and a (homorganic) continuant segment surfaces (as, e.g., in /la ˈʧena/ → [la ˈʃeːna] “the dinner”, /la ˈʤɔja/ → [la ˈʒɔːja] “the joy”; see Marotta, 2001, 2008; Sorianello, 2010). Indeed, both processes seem to affect the closure phase by removing it (underlyingly, closure may either extend to the whole segmental duration, as far as full stops are concerned, or to the first portion of the segment only, when affricates are considered). In this view, the label “spirantization” may also be extended to deaffrication processes, provided that the process yields a fricative rather than a stop. The voiceless affricate /ʧ/ is also affected by a deaffrication process outside of the Tuscan dialectal area, in regions where the voiced counterpart /ʤ/ is instead lengthened, as in Roman, Northern Apulian, Abruzzese and Campanian Italian (e.g., Roman [ˈaʤːile] vs. Tuscan [ˈaːʒile] “agile”, see Crocco, 2017).
Moreover, unlike /ʧ/ and /ʤ/, the alveolar affricates /ʦ ʣ/ are not a target of deaffrication in Tuscan, a fact that apparently obscures the pattern, since it can be interpreted as a lack of class-internal uniformity. Indeed, both the diatopic variation and the class-internal inhomogeneity de facto led Bafile (1997) to treat deaffrication and stop spirantization as two distinct phenomena. We discuss these issues below.
Firstly, regarding diatopic variation, we argue that the diverse and heterogeneous patterns of lenition observed in non-Florentine and non-Tuscan varieties (e.g., the Roman dialect, where (a) stops /p t k/ are realized as lenis and partially voiced, and (b) /ʧ/ surfaces as the voiceless fricative [ʃ]) do not justify treating the process that produces continuant outputs from both stops and affricates in Florentine as phonologically distinct. In Florentine phonology, these processes are structurally identical.
By structurally identical, we mean that, pre-theoretically, both spirantization and deaffrication involve the loss of articulatory closure and affect the same class of segments, namely, obstruents. We argue that this is the phonological process active in Florentine referred to as GT. As will be shown in the next sections (see Section 4 and Section 8.1), in several phonological models, e.g., Element Theory (Backley, 2011), both deaffrication and stop spirantization are formalized in the same way, consistent with the view assumed here. Moreover, in ET, affricates and simple stops only differ in the length of the release (Backley, 2011, p. 108; see Jakobson et al., 1952, for affricates as strident stops).
Secondly, as far as the class-internal differences are concerned, one may argue that the unbalanced situation of affricates is only apparent if we consider that the alveolar ones (/ʦ ʣ/) are phonologically geminates, members of the set of Italian intrinsic geminates (Celata & Kaeppeli, 2003; Bertinetto & Loporcaro, 2005; Marotta & Vanelli, 2021). It is well known that geminates resist this kind of lenition, which usually affects singletons². According to Marotta and Vanelli (2021, pp. 69–70), Italian intrinsic geminates are affected by a degemination process when occurring word-initially ([ˈʦiːo] “uncle” vs. [lo#tˈʦːiːo] “the uncle”) and post-consonantally ([ˈɡarʣa] “gauze” vs. [ˈɡaʣːa] “magpie”) due to the constraint against geminates occurring in such contexts (but see Mairano et al., 2025, in this special issue, for a more nuanced view), as their underlying representation is that of coda-onset clusters.
From our perspective, thus, claiming that GT affects the whole class of stop obstruents is a valid generalization. Indeed, as already stated, we follow Marotta (2008) and Ulfsbjorninn (2017) in also including the post-alveolar affricates /ʧ ʤ/ among the targets of GT. As argued above, affricates basically show the same pattern as full stops, resulting in the homorganic fricatives [ʃ ʒ]. Future acoustic studies dealing specifically with affricates will make it possible to investigate whether such segments are subject to a certain degree of phonetic variation, targeting the closure phase, analogous to that encountered with full stops. In addition to stop obstruents, Giannelli and Savoia (1978, 1980) report lenited outputs of fricatives, such as /v/, realized as approximants, and even of sonorants, such as /n/, /l/, /m/ and /r/, which, in an analogous trend, are realized as homorganic approximants or are deleted.
At least as far as obstruents are concerned, from a phonetic standpoint, it is likely that the more analyses are conducted, the more allophones will be found, varying in the degree of frication noise and voicing, or in the presence vs. absence of bursts and other phonetic cues. As Cravens (1984, p. 277, note 4) observes:
“The phonetic results of intervocalic weakening given here are necessarily abstractions from the multitude of realizations which occur in actual speech, determined sociolinguistically and geographically, by speed of speech, etc. Their status is that of the usual realization in relaxed, but not necessarily fast, speech”.
It might be argued that, to an extent, the amount of allophonic variation observed in the literature is directly proportional to the granularity of each analysis. Indeed, the number of possible GT outputs varies across studies. Recall from above that Giannelli and Savoia (1978) found about 10 allophones per place of articulation, while only six classes of allophones are observed by Sorianello (2003). The latter author also recognizes inter-speaker variation as far as, e.g., the presence of voicing is concerned, with some speakers producing both voiceless and voiced outputs, whereas others only produced voiceless segments. The varying number of allophones observed, both across studies and across speakers within a single investigation, suggests that, to some extent, such rich variation might be irrelevant when phonologically formalizing the rule. It is nonetheless important to have a clear understanding of the phonetic facts to be able to assess what is phonologically (ir)relevant. As claimed, extra-phonological factors, such as speech rate, articulatory accuracy, demography, prosodic aspects (Giannelli & Savoia, 1978, p. 32), or sociolinguistic dynamics (see, e.g., Giannelli & Savoia, 1978; Urban, 1986, p. 36), affect phonetic production.
Consistent with the “phonological irrelevance” view, we believe that understanding the output of GT as simply a continuant, homorganic or debuccalized, version of the underlying stop, is—perhaps counter-intuitively—phonologically more accurate than a finer phonetic description of the occurring productions. The possible outputs are probably infinite in number and do constitute an openness continuum ranging from closure preservation to segment deletion.
From our perspective, given a phonological process, extra-phonological conditions being equal, the result is not either a, b, c… or z. Rather, e.g., in the case of lenition, the output is a range of sounds lacking one or some of the sub-segmental properties of the underlying segment, while preserving others. This does not mean that each production must be phonetically identical among speakers, speech styles and other extra-phonological factors. Outputs of a given process are phonologically identical in the sense that each token is characterized by the lack, or the addition—according to the kind of process—of a sub-segmental property. Such properties are identified either as features of various kinds, as node associations (Clements, 1985), or as elements (Backley, 2011), among others, according to the theoretical framework adopted. However, they all share the fact of being discrete phonological entities (Goldstein, 2003), i.e., their occurrence in the subsegmental representation is not gradient. As is known, phonologically, sounds may be continuant or non-continuant, voiced or voiceless, etc., but they do not oppose different degrees of continuity, voicing, etc.; if anything, they can be organized as a sequence of non-continuant and continuant gestures, as is the case with affricates.
All of this said, we reaffirm that phonetic descriptions involving manner-based allophonic classifications are valuable in order for a phonological analyst (i) to isolate the “lowest common denominator” among the productions of a given phoneme in a specific context and, consequently, (ii) to try to provide a description of what is phonologically happening when the phenomenon under examination applies. Analogously to the studies of Bafile (1997), Marotta (2008), and Ulfsbjorninn (2017), this represents the approach adopted in this study.
As far as GT is concerned, the target of the process appears to be a property of segments connected to the closure gesture, as said above. In fact, all the allophones ever observed surfacing by GT involve weak or null contact between articulators. The segmental property mentioned above might have to do with the phonological representation of stopness, that is, the abstraction of the total closure gesture or of the full contact between articulators (Bafile, 1997). From an acoustic point of view, it may instead be understood as the representation of the (steep) decrease in intensity, a feature of oral and nasal stops, affricates and possibly characterizing liquids in some languages (as, e.g., in Backley, 2011).

3. When Does GT Occur?

So far, we have addressed the topic of what happens to stop obstruents affected by GT. A crucial piece of the puzzle, however, concerns the context of the process, i.e., when the obstruents are affected by GT. As is known, GT affects post-vocalic stops followed by vowels, glides or liquids, both in external sandhi and word-internally (see, as a general example, the output of /la#ˈpjɛtra/ → [la#ˈɸjɛːθra] “the stone”; Giannelli & Savoia, 1978, 1980; Giannelli & Cravens, 1997; Marotta, 1995, 2003, 2008; Bafile, 1997, 2003; Sorianello, 2001, 2003). Given that in Tuscan, as in standard Italian, stops may only be followed by the segments mentioned above, Marotta (2008, p. 243) makes explicit that “the process appears to be constrained only by the left side of the string; therefore, the triggering context of gorgia may simply be defined as postvocalic”. This rules out the post-consonantal and absolute-initial positions from the contexts of GT. At the same time, it isolates the syllabic onset as the context affected by the process (Marotta, 2008, p. 244; cf. Bafile, 1997, p. 30; Loporcaro, 2005, p. 420; Giannelli & Cravens, 1997, p. 32; Ulfsbjorninn, 2017).
This derives from the observation that stops, with the exclusion of the first half of a geminate, always occupy the onset position, both in standard Italian and in Tuscan varieties. By definition, onsets are followed by nuclei, i.e., vowels, while they may be preceded by nuclei or codas. Therefore, GT may be described as triggered by the presence of a preceding nuclear position, i.e., a vowel (Marotta, 1995, p. 313; Marotta, 2008, p. 243; Bafile, 1997). By combining the two observations, it results that the target of GT is identified as the postvocalic onset, if this is occupied by a stop obstruent (sonorants are unaffected, but see Giannelli & Savoia, 1978, p. 47).
This generalization is crucially bound to the Tuscan native phonotactics. Diachronic evidence shows that, in Tuscan varieties, non-sibilant obstruents in syllable coda have, at least in the past, been illicit (cf. Celata et al., 2022, for a discussion on the frequency of such clusters in contemporary Italian). Heterosyllabic and heterorganic obstruent clusters, with the exclusion of /s/C, have diachronically been gesturally simplified by the homorganization of the clusters through total regressive assimilation (e.g., Lat. /kt/ > Tsc. /tː/, as in LACTEM > la[tː]e “milk” and NOCTEM > no[tː]e “night”). Synchronically, the native lexicon lacks (non-sibilant) obstruents in the coda position, so the generalization that identifies the target of GT as the onset cannot be falsified (but see Ulfsbjorninn, 2017). Stop geminates, which may be considered as coda-onset stop clusters, are another kettle of fish; unlike heterorganic clusters, such as /kt/, /pt/, etc., geminates are produced through a single articulatory gesture. Accordingly, in, e.g., autosegmental frameworks (cf. Goldsmith, 1976), their phonological representation is usually conceived as a single melodic structure associated with two heterosyllabic timing slots. This way of understanding geminates is in harmony with their “inalterability” (see Callender, 2010, p. 46); they can neither be broken up by vowel epenthesis (cf. Ulfsbjorninn, 2017) nor be targeted by rules affecting one half only. Instead, they behave as a unit (Goldsmith, 1990) from a cognitive point of view. Therefore, they will not be considered in this study as an instance of a potential GT target.
Relying solely on the native lexicon, one could argue that GT targets syllabic onsets just as plausibly as claiming it affects post-vocalic stop obstruents, regardless of syllabic affiliation. In both Italian and Tuscan native vocabularies, such obstruents consistently appear in the onset position, rendering the “Onset Hypothesis” untestable, as noted by Marotta (2008, p. 244). However, loanwords provide more varied phonotactic contexts. Obstruent clusters involving non-sibilant segments occur with some frequency. Fully integrated older borrowings are treated as native by contemporary speakers, who are often unaware of their foreign origin—e.g., /bisˈtekːa/ < Eng. beef steak (Repetti, 1993, pp. 185–187). Their adaptation reflects Italian phonotactic constraints; final obstruent codas have been re-syllabified as internal onsets via paragogic insertion, and illicit clusters have been simplified. However, partially integrated loanwords typically include learned words with Latin or Greek roots and Italian morphology, such as /psiˈkɔloɡo/ “psychologist”, /ˈritmo/ “rhythm”, and /ˈtɛknika/ “technique”. These words include non-native clusters (e.g., /ps/, /tm/, /kn/), where a non-sibilant obstruent is followed by a consonant that is neither a liquid nor a glide. While these clusters remain intact in the standard language, in low diastratic and diaphasic varieties, both Tuscan and non-Tuscan, they often, though not always, undergo nativization via vocalic epenthesis (/ps/ → [pis]) or total regressive assimilation (/ps/, /tm/, /kn/ → [sː], [mː], [nː]) (cf. Ulfsbjorninn, 2017; Repetti, 1993, 2012; Morandini, 2007).
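The two nativization strategies just mentioned can be written out as string rewrites. The following is only a minimal sketch: the function names are hypothetical, the rules are limited to two-consonant clusters, and the epenthetic vowel defaults to [i] solely because that is the vowel in the /ps/ → [pis] example above; the text does not state which vowel is inserted in other clusters.

```python
# Hypothetical helpers: they mirror only the examples cited in the text
# (/ps/ -> [pis]; /ps/, /tm/, /kn/ -> [sː], [mː], [nː]).

def epenthesize(cluster: str, vowel: str = "i") -> str:
    """Break up a two-consonant cluster with an epenthetic vowel."""
    first, second = cluster  # assumes exactly two single-character segments
    return first + vowel + second


def assimilate_regressively(cluster: str) -> str:
    """Total regressive assimilation: C1 takes on the melody of C2, yielding a geminate."""
    _, second = cluster
    return second + "ː"


for cluster in ("ps", "tm", "kn"):
    print(cluster, "->", epenthesize(cluster), "or", assimilate_regressively(cluster))
```

Either repair removes the stop from the coda: epenthesis turns it into the onset of a new syllable, while assimilation replaces it with the first half of a geminate.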
Moreover, in relatively recent times, Italian, as well as other languages, has been borrowing an increasing number of loanwords, mostly from English, such as “weekend”, “cocktail” and, more recently, “lockdown”, “admin”, “catcalling”, “bed-and-breakfast”, and so on. Unlike bookwords, such new loanwords do not usually show Italian morphology, i.e., they lack inflectional morphemes conveying gender and number specifications.
Heterosyllabic clusters involving stops as first segments, recurring in recent loanwords, among other kinds of clusters, have been phonetically investigated by Celata et al. (2022) in a Tuscan variety as produced by 10 speakers aged 24–35. The target obstruents in the study were the alveolar stops /t d/ within internal clusters, as in Batman and podcast. Celata et al. (2022) show that, among Tuscan speakers, non-native stop-initial heterosyllabic clusters (such as /tm/ and /dk/ in the examples above) behave like native sonorant-initial heterosyllabic clusters as far as cluster length is concerned (see also below), as opposed to tautosyllabic clusters. In fact, stop + stop and sonorant + stop clusters (i.e., heterosyllabic clusters) are longer than stop + liquid sequences (i.e., tautosyllabic clusters). Although the study did not look for cluster repair strategies, such as total regressive assimilation, importantly, stop + stop clusters were not reported to undergo any relevant adaptation to the native phonotactics (neither epenthesis nor regressive assimilation was observed). It is, however, difficult to interpret the data therein in the light of GT; the speakers’ L1 might be a Tuscan variety, such as Pisan, where coronal stops are not spirantized, or at least not as systematically as in the Florence–Siena–Pistoia area (see Giannelli & Savoia, 1978, 1980).
The phonological behavior of stop-initial heterosyllabic clusters in Tuscan has been investigated by Ulfsbjorninn (2017). It is observed that, in clusters of the kind, GT does not apply: [ˈiktus] rather than *[ˈixtus] “stroke”, [ˈɛtna] rather than *[ˈɛθna] “(mount) Etna”; moreover, a vowel epenthesis tends to occur between the consonants. The author (ibidem) concludes that /kt/, /tn/, etc. are not coda-onset sequences, but rather two distinct onsets separated by an empty nucleus (i.e., bogus clusters, cf. Harris, 1994), which tends to be pronounced, meaning GT only affects intervocalic onsets. One may observe, however, that epenthesis should create the context of GT, i.e., an intervocalic onset; still, GT is reported not to apply in these cases (cf. Ulfsbjorninn, 2017). The study requires more extensive discussion, though, being highly relevant to the present research. To proceed, we need first to introduce the theoretical frameworks in Ulfsbjorninn (2017) (see Section 4 and Section 8.1). The main aim of the present paper is to test the “Onset Hypothesis” (Marotta, 1995; Bafile, 1997). Our analysis is based on original Florentine Italian acoustic data, the variety representing the GT’s epicenter (cf. Giannelli & Savoia, 1978), using loanwords and bookwords.
Therefore, our main questions are the following:
  • Do obstruents in the coda position undergo spirantization of the same kind experienced by post-vocalic onsets?
  • Are their outputs similar to those of non-postvocalic onset stops, i.e., in post-coda context?
According to the Onset Hypothesis, the first stop of a heterosyllabic consonant cluster should be realized with full closure (cf. Ulfsbjorninn, 2017), preserving the cluster melodic identity. On the contrary, the observation of spirantized stops in the coda position will have to be taken as evidence confuting the hypothesis. In the latter scenario, one should conclude that only the linear order is relevant in the GT pattern.
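The contrast between the two competing predictions can be made explicit with a small sketch. It assumes that syllabic position is annotated by hand for each token; the class name, the field names and the labels "onset" and "linear" are illustrative choices, not terminology from the studies cited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StopToken:
    word: str
    stop: str
    post_vocalic: bool  # is the stop immediately preceded by a vowel?
    position: str       # "onset" or "coda", annotated by hand (assumption)


def predicts_lenition(token: StopToken, hypothesis: str) -> bool:
    """Return True if the given hypothesis predicts a spirantized output."""
    if hypothesis == "onset":
        # Onset Hypothesis: only post-vocalic onsets are targets.
        return token.post_vocalic and token.position == "onset"
    if hypothesis == "linear":
        # Linear alternative: any post-vocalic stop is a target.
        return token.post_vocalic
    raise ValueError(f"unknown hypothesis: {hypothesis!r}")


tokens = [
    StopToken("la pietra", "p", post_vocalic=True, position="onset"),
    StopToken("dentro", "t", post_vocalic=False, position="onset"),
    StopToken("ritmo", "t", post_vocalic=True, position="coda"),
]

for tok in tokens:
    print(tok.word, predicts_lenition(tok, "onset"), predicts_lenition(tok, "linear"))
```

The two hypotheses agree on native-like tokens and diverge only on post-vocalic coda stops such as the /t/ of ritmo, which is why loanwords and bookwords are the crucial test cases.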
In the following section, we present the phonological theoretical models adopted for our analysis.

4. The Analytic Tools: Strict CV, Coda Mirror, and Element Theory

Strict CV (Lowenstamm, 1996; Scheer, 2004) is an offspring of Standard Government Phonology (henceforth, GP), the framework developed by Kaye et al. (1985, 1990). Both Standard GP and Strict CV aim at getting rid of the syllable arboreal structure by postulating the existence of lateral relations, which account for the interaction between adjacent segments and position-dependent phenomena. Standard GP recognizes the existence of syllabic constituents, such as rhymes, onsets and nuclei, while Strict CV only admits onsets and nuclei. Indeed, in Strict CV, the universal syllable structure is simply CV (Lowenstamm, 1996; Scheer, 2004). Therefore, consonantal and vowel surface clusters (as well as geminates and long vowels) are said to include an empty onset or nucleus, underlyingly.
Strict CV captures the different phonological behavior of tauto- and heterosyllabic clusters based on the kind of lateral relations entertained by the empty nucleus between the C positions. Lateral relations work from right to left in Strict CV and are of two kinds: Government and Licensing. The former has inhibitory effects on the target’s subsegmental composition, and it is usually responsible for lenition processes, interpreted as subsegmental decomplexification (see below in this section). The latter potentially enhances the melody of the targeted segment. Such forces are spread by ungoverned nuclei (Scheer, 2004). Ungoverned nuclei are represented by phonetically expressed nuclei, final empty nuclei and nuclei within an Infrasegmental Government, henceforth ‘IG’, domain (Scheer, 2004). This last one is the kind of relation entertained between the second and the first member of a traditional branching onset; IG does not affect the melody of the target. Different from Standard GP—which defines branching onsets as head-initial Constituent Government domains (as well as branching rhymes and nuclei)—in Strict CV, the governor is the second member in the linear order, usually a liquid, a glide or a nasal, while the first recurring one represents the governee. The nucleus circumscribed within an Infrasegmental Government domain does not need to be governed by a following full nucleus in order to remain silent and, being ungoverned, it can exert its forces. In fact, a nucleus may remain phonetically empty only if it is governed by the following ungoverned (full) nucleus.
The interplay between the lateral relations and the distribution of empty nuclei illustrated above is how Strict CV interprets the traditional syllable structures.
As we said, a branching onset is defined as a CVC sequence where the second C governs the first one through Infrasegmental Government; although the nucleus between them is ungoverned, it may remain silent in this case, while still being able to exert both Government and Licensing. In an intervocalic branching onset, e.g., in capra “goat”, the empty nucleus before /r/ governs the stop /p/, while in a post-consonantal position, as in dentro “inside”, it governs the nucleus between the nasal and the stop /t/; being governed, such a nucleus remains silent as well. See Figure 1 and Figure 2 below.
In capra (Figure 1), the two members of the TR cluster (a shorthand for branching onset; ‘T’ standing for any obstruent, ‘R’ standing for any liquid) are both governed and unlicensed, while in dentro (Figure 2), the stop is ungoverned and licensed, and the rhotic is governed.
The first member of a branching onset experiences the same forces as simple onsets do: if surrounded by adjacent filled nuclei, it will be governed and unlicensed, while it will be licensed and ungoverned if preceded by a consonant (i.e., by an empty nucleus, in Strict CV terms).
As far as codas are concerned, they do not experience any lateral force, being both ungoverned and unlicensed: they are followed by a governed empty nucleus, which is unable to be a lateral actor. See Figure 2.
As was claimed above, Government has negative effects on the governee, while Licensing enhances the segmental expression on the surface. According to the forces setting, one may isolate syllabic positions as weak or strong.
Within Strict CV, such work is performed by the Coda Mirror theory (Ségéral & Scheer, 2001, revised in Scheer & Ziková, 2010). Coda Mirror is a theory of lenition and fortition that makes use of Strict CV tools, such as Government and Licensing. Given the structures in Figure 1 and Figure 2, a weak position is one that is governed and unlicensed; this is the situation of intervocalic onsets, both simple and branching. The post-consonantal position, which we have seen to be ungoverned and licensed, is a strong position. In languages that show an initial empty CV string (Lowenstamm, 1999), Italian being one of them, the word-initial C stands in a strong position, since the following full nucleus is called upon to govern the initial empty one, while licensing the consonant. The coda position is the mirror image of the post-consonantal onset: the former is followed by an empty nucleus, while the latter is preceded by one. Being unlicensed, the coda position is a weak one. However, it is not as weak as the intervocalic position: although the lack of Licensing makes strengthening impossible (but see below), codas also lack Government, which is responsible for segmental inhibition. Thus, lateral relations determine a positional strength hierarchy. The post-consonantal onset represents the strongest position, while the intervocalic onset is the weakest, or the least strong, position; the syllable coda stands in between, as it is neutral in strength terms.
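The positional hierarchy described above follows from the two lateral forces alone, which the sketch below illustrates. The numeric score (+1 for Licensing, −1 for Government) is purely an expository assumption and not part of Coda Mirror; the governed-and-licensed combination is not discussed in this section, so no claim is made about it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    governed: bool
    licensed: bool

    def strength(self) -> int:
        # Illustrative score only: Licensing supports the segment (+1),
        # Government inhibits it (-1).
        return int(self.licensed) - int(self.governed)


# Force settings as stated in the text for each position.
POSITIONS = {
    "intervocalic onset": Position(governed=True, licensed=False),
    "coda": Position(governed=False, licensed=False),
    "post-consonantal onset": Position(governed=False, licensed=True),
    "word-initial onset": Position(governed=False, licensed=True),
}

# List positions from weakest to strongest.
for name, pos in sorted(POSITIONS.items(), key=lambda item: item[1].strength()):
    print(f"{name}: {pos.strength()}")
```

Under this encoding the coda receives the neutral score, sitting between the intervocalic onset and the two strong positions, which matches the hierarchy stated in the text.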
The Coda Mirror theory predicts, accordingly, that a coda is stronger than an intervocalic onset; this is in line with the cross-linguistic observation that codas, interpreted as pre-empty nucleus positions, spirantize only if also intervocalic onsets do, while the opposite is not true (Scheer & Ziková, 2010). As far as this study is concerned, we should therefore expect GT to possibly affect codas as well, a weak position, although stronger than the intervocalic one. The alternative possibility defines the Onset Hypothesis presented in Section 3, in which GT only affects intervocalic onsets. In Strict CV and Coda Mirror terms, the Onset Hypothesis must be reformulated by claiming that GT is the Florentine manifestation of Government affecting stop obstruents rather than due to a lack of Licensing.
The data in Giannelli and Savoia (1978) and Russo (2022), however, pose a challenge to Coda Mirror. In these studies, lenited outputs of underlying stops have been found in strong position, as in [la ˈθorθa] “the cake”. Such outputs may represent counterevidence of the theory’s predictions, as argued in Russo (2022). Nevertheless, lenited outputs in strong contexts are not aprioristically excluded by the Coda Mirror theory, since the generalization is only relative, where outputs in strong position are “at least as strong as those that appear in weak position” (Ségéral & Scheer, 2008, p. 140).
In this view, the output, e.g., [la ˈθorθa] “the cake”, should not be taken as a challenge to the Coda Mirror predictions. An example of this would be the output *[la ˈtorθa], in which the stop in weak position is not lenited, while the one in strong position is instead spirantized. Such outputs are not on record. The effects of Government and Licensing may be understood as interacting with the subsegmental composition associated with the targeted position. Within the Standard GP universe, the representation of segments is framed within Element Theory (henceforth, ET), first developed by Kaye et al. (1985, 1990). Currently, Backley’s (2011) offshoot is considered the standard version of Element Theory, to which we refer in this study. In ET, segments are compositions of elements (graphically represented within vertical brackets, e.g., |A|), phonologically pronounceable primes, defined based on acoustic characteristics, but distributed across segments according to their phonological behavior. Elements may either be headed, indicated by an underline, e.g., |A|, or non-headed (e.g., |A|), and the segmental expression will, accordingly, be different.
For consonants, the element |ʔ| contributes to the stopness of the segments; in isolation, it is pronounced as [ʔ], while if associated with other elements, it corresponds to an oral stop. It is indeed known as the “stop element” or the “occlusion element” (Backley, 2011, p. 115). By contrast, |H| is responsible for the frication noise: without |ʔ|, it corresponds to a fricative consonant, while if |ʔ| is part of the elemental setup, it refers to aspiration and VOT. The elements |A|, |I| and |U| define place of articulation: the first characterizes gutturals and some coronals (e.g., /r/), headed |I| represents palatality and, therefore, is a component of palatal consonants, while non-headed |I| may contribute to the composition of some coronal consonants. Finally, headed |U| represents labials and non-headed |U| velar consonants. As for voicing properties, voicelessness is defined by |H|, the element of frication noise, while |L| represents voiced consonants. Headed |H| characterizes languages showing phonological aspiration (e.g., English and Korean), while headed |L| recurs in languages showing regressive voicing phenomena (Backley, 2011, p. 151), such as Spanish and Russian (Backley, 2011, p. 136; Martinez-Gil, 2019; Petrova et al., 2006).
As far as Florentine Italian is concerned, the laryngeal element |L| should not be headed, since voicing is not crucially involved in phonological processes, but simply defines the voice properties of segments (i.e., if present, it will make the consonant voiced). A voiceless stop may therefore be defined as composed of the occlusion element |ʔ|, the frication element |H| and the corresponding place elements |A|, |I| or |U|, either headed or non-headed, which are irrelevant here since place properties are invisible to Florentine GT (i.e., stops are lenited regardless of their place of articulation, unlike, e.g., Pisan GT, cf. Marotta, 2001). In ET, lenition is understood as a decomplexification of the elemental structure by means of the loss of elements. Within this framework, GT may be directly accounted for by proposing the suppression of the |ʔ| element from the stop elemental composition (cf. Bafile, 1997). On the other hand, the framework understands fortition in terms of complexification of the structure, either by the acquisition of an element or by increasing the number of headed elements. Therefore, Florentine stop aspiration in strong position (see Giacomelli, 1934; Giannelli & Savoia, 1978; Russo, 2022) may be accounted for by assuming that |H| acquires head properties, as in the process /p/ → [pʰ], formalized as |U̲ H ʔ| → |U̲ H̲ ʔ| in ET (see Backley, 2011, pp. 140–141).
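The two operations just described, lenition as element loss and fortition as added headedness, can be illustrated with a minimal sketch. The encoding is an assumption made purely for illustration: a segment is a mapping from element symbols to a headedness flag, the function names `lenite`, `aspirate` and `complexity` are hypothetical, and the complexity score (elements plus heads) is only a crude proxy for the notion of structural complexity used in ET.

```python
# Illustrative sketch only: a segment is modelled as {element: is_headed}.
# The encoding, the function names and the complexity score are assumptions
# made for this example; they are not part of Element Theory itself.

def lenite(segment):
    """GT modelled as element loss: drop the occlusion element |ʔ|."""
    return {el: headed for el, headed in segment.items() if el != "ʔ"}

def aspirate(segment):
    """Aspiration modelled as added headedness: |H| becomes headed."""
    out = dict(segment)
    if "H" in out:
        out["H"] = True
    return out

def complexity(segment):
    """Crude score: number of elements plus number of headed elements."""
    return len(segment) + sum(segment.values())

# Voiceless labial stop /p/: headed |U|, non-headed |H|, non-headed |ʔ|
# (the headedness of |ʔ| is an assumption of this sketch).
p = {"U": True, "H": False, "ʔ": False}

spirantized = lenite(p)    # |U H|: no occlusion element left
aspirated = aspirate(p)    # |U H ʔ| with |H| now headed

print(complexity(spirantized), complexity(p), complexity(aspirated))
```

Under this toy scoring, the spirantized output is less complex than the input stop, and the aspirated output more complex, which mirrors the decomplexification and complexification readings of lenition and fortition.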
As anticipated in Section 3, the behavior of coda stops in Tuscan in light of GT was first addressed in Ulfsbjorninn (2017). In this study, stop-initial heterosyllabic clusters are interpreted as bogus clusters, i.e., two onsets separated by an empty nucleus. Bogus clusters differ from coda-onset sequences in lacking any relation between the segments (Harris, 1994). According to Ulfsbjorninn (2017), the bogus cluster analysis is supported by the optional occurrence of vowel epenthesis between the consonants, which is due to the looseness of the cluster. Indeed, according to the Structure Preservation Principle in Harris (1994, p. 190), resyllabification processes are not assumed to occur during the derivation of the surface form, so the occurrence of vowel epenthesis is taken as evidence of the underlying onset-onset syllable structure of stop-initial heterosyllabic clusters. Although it can justify the occurrence of epenthesis in such clusters, which is unobserved both in branching onsets and in native coda-onset clusters, the analysis raises two issues. First, it does not account for the occurrence of total regressive assimilations in bogus clusters, which imply a relation between segments in order for place features to be transferred from the second to the first segment, as in /ˈkaktus/ → [ˈkatːus] “cactus”. Second, it makes a wrong prediction. After epenthesis, both consonants end up in intervocalic position and should therefore be subject to GT; still, lenition does not occur (the output is apparently [ˈkakətus]). Note that Ulfsbjorninn (2017) identifies the intervocalic onset as the target of GT. Based on the above observations, stop-initial heterosyllabic sequences should be analyzed as coda-onset sequences. We interpret both regressive assimilation and vowel epenthesis as nativization strategies, optionally affecting non-Tuscan clusters. This view should predict that in the realization of, e.g., /kt/ in the above example, both consonants should be long.
Indeed, in order to preserve the underlying coda-onset structure, the epenthetic item should be a CVC string rather than simply a vowel V, yielding the output [ˈkakːətːus]. We will briefly discuss outputs of this kind in Section 8.3.
Before outlining the methods of data collection and phonetic analysis (Section 6), a brief discussion of relevant sociolinguistic factors that may influence the GT pattern is presented.

5. Diatopic and Sociolinguistic Variation Concerning GT

Believed to have originated in Florence, GT is not uniformly applied among Tuscan varieties. In the Florence-Siena-Pistoia area, the whole voiceless triplet /p t k/ is affected by GT, which also extends, to a lesser extent, to voiced stops. As one moves away from this area, the phenomenon is active for voiceless stops only, and the bilabial /p/ is affected discontinuously or not at all. Moving further away from the epicenter, within the Tuscan territory, the alveolar /t/ does not participate in the pattern either (Bafile, 1997; Giannelli & Savoia, 1978, 1980). The isogloss of the spirantization of /k/ has the widest areal diffusion, followed by those of the dental and bilabial voiceless stops (Giannelli, 2000).
It seems that Tuscan speakers are only aware of the phenomenon concerning /k/ (Marotta, 2008, 2014; Cravens, 2000; Villafaña Dalcher, 2008). Following Cravens (2000, p. 14), intervocalic /k/ spirantization is a “Tuscan stereotype” in a Labovian sense³.
According to Giannelli and Savoia (1978), intervocalic voiceless stop spirantization in Florence is a phenomenon present across all social classes, at least at the time of the inquiry. The most conservative speakers (workers, craftsmen, elders, and, in general, low-educated speakers) consistently employed GT and other dialectal features across various contexts and speech styles. In contrast, individuals from the middle-upper class with a high level of education tended to avoid dialectal features, including GT, in specific formal contexts.
In the 1990s, much attention was devoted to the spreading of the phenomenon in south-eastern Tuscany. According to the results in Cravens and Giannelli (1995) and Pacini (1998), referring to Cortona and Bibbiena (province of Arezzo) varieties, respectively, white-collar male workers were at the forefront of promoting this linguistic innovation, suggesting that GT was at least attributed a covert prestige (i.e., GT is interpreted as a marker of local solidarity, cf. Cravens & Giannelli, 1995), throughout Tuscany. More recently, perceptual studies by Calamai (2011, 2017) in Leghorn and Arezzo indicated that Tuscan speakers associate Florentine pronunciation with elegance, culture, and tidiness. These findings, coupled with Pacini’s (2010) research showing that GT spreading among young speakers in Cortonese was no longer tied to specific social classes, led Marotta (2014) to consider GT as a potential overt prestige feature within the region. However, she observes that, in recent decades, outside Tuscany, Northern pronunciations are perceived as more prestigious than Tuscan accents, in line with Galli de’ Paratesi (1984); see also De Pascale and Marzo (2016); De Pascale et al. (2017). This might influence Florentine speakers, especially younger ones, to exert control over their stop pronunciation (Marotta, 2014, p. 159).
Given that, after Giannelli and Savoia (1978), no sociolinguistic studies have ever focused on GT in Florentine Italian and, given the changes in prestige that GT seemed to have gone through in the last decades (Marotta, 2014), it is hard to predict the current sociolinguistic distribution of the phenomenon in Florentine. In other words, before testing the Onset Hypothesis (Section 3), one should first check whether GT is produced in the canonical context (i.e., in post-vocalic onset position, see Marotta, 2008) by each sociolinguistic group controlled in this study (see below).
Following the literature on the topic (Giannelli & Savoia, 1978; Cravens & Giannelli, 1995; Pacini, 1998, 2010; Marotta, 2014), we selected age, sex and level of education as sociolinguistic variables (see Section 6.1 for their parameterization). In the onset context, we expect that all speakers will show GT to some extent, even though some variation might emerge related to level of education (at least in diaphasically high contexts, Giannelli & Savoia, 1978) or age (Marotta, 2014), in that young and more educated speakers might try to inhibit spirantization. Since sex has been found to influence GT in Bibbiena (Cravens & Giannelli, 1995) and other local traits in Florentine (Piccardi, 2017), it was also included among the variables, as we cannot exclude that nowadays it might influence GT in Florence as well. If some instances of spirantization are found in the coda, they are not expected to follow any sociolinguistic pattern.
The following section outlines the methods used in the empirical analysis to test hypotheses related to contextual and sociolinguistic factors influencing GT.

6. Materials and Methods

6.1. Participants and Data Collection

The materials presented here are a subset of a larger project aiming to investigate Florentine stop realizations in different contexts. For this project, 42 Florentine participants were recruited through social networks and personal contacts. To mitigate the impact of other Italian varieties, individuals selected for the data collection had to be born and raised in Florence, as well as their parents. According to the variables chosen for the sociolinguistic investigation (age, sex, and level of education), the participants are distributed as shown in Table 1.
As the age factor was chosen to explore potential intergenerational variation on stop realizations (hypothesizing that there might be a difference between young and adult speakers, Marotta, 2014, see Section 5), we treated age as a nominal variable. We classified speakers as part of the “young group” if their age was within the 20–35 range, and as part of the “older group” if their age was within the 45–65 range. This way, age groups are clearly distinguished thanks to a 10-year gap between them.
Unlike Cravens and Giannelli (1995), who distinguished speakers according to their social class (“white collar” vs. “blue collar”, based on their occupation), we selected speakers based on their level of education, discriminating between graduated and non-graduated speakers. The level of education is considered to be related to “sophistication” (Hudson, 1996), which is based on the stereotypes of “rough people” and “sophisticated people”. According to the author, such concepts are grounded not only in occupational (manual vs. intellectual job) connotations but also in cultural background (which is more related to the education); see Piccardi (2017) for a discussion. Moreover, in sociolinguistics, the level of education is frequently taken into consideration (D’Agostino & Paternostro, 2018) as it is related to cultural capital (Bourdieu, 1979), i.e., cultural resources of the individual, such as linguistic abilities, education and cultural practices. Finally, although the use of local traits is usually believed to be related more with gender identity than with biological sex differences (Podesva & Kajino, 2014), our speakers were categorized according to their biological sex, given the impossibility of deepening these aspects with the sample object of this study.
Recordings were conducted in the first author’s or the participants’ home, with a WH20 Shure headset microphone and a Zoom Handy 5 recorder (sampling: 44 kHz, 24 bit format), minimizing background noise and reverberation. All the equipment was provided by the Laboratory of Phonetics of the University of Pisa.
Data were collected through a reading task and a spontaneous conversation concerning topics such as holidays, Tuscan cuisine, sports, hobbies, and opinions about Florence. In the reading task, participants had to read aloud 180 meaningful sentences, containing a target word with a stop in different syllabic contexts. The sentences were presented on a PowerPoint presentation, one for each slide, and randomized via a MACRO script created ad hoc by the first author. Additionally, three slides prompting participants to take a break were included to prevent fatigue. Participants were asked to read the sentences at a normal pace; in cases of mispronunciation, the researcher invited them to repeat the entire sentence.
In this study, we focused mainly on the reading corpus by selecting 18 sentences (Table 2) containing words with voiceless stops in a postvocalic position. Among such items, 9 correspond to native Italian lexemes, in which the stop occupies the intervocalic position, whereas 9 are loanwords or bookwords, including voiceless stops, in the coda position.
The reading corpus for this study is thus composed of 611 tokens: 378 words with an intervocalic stop (V_V) and 233 words with a stop in the coda position (V_C). The difference in the number of tokens between the two contexts is due to technical problems, because of which the productions of four target words (/ˈkapsule/, /apˈnɛa/, /ipˈnɔsi/, /ˈtɛknika/) could be analyzed for six speakers only. In addition, one occurrence had to be discarded due to a mispronunciation of the word ritmo by one of the speakers.
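The token counts can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (42 speakers, 9 onset words, 9 coda words, four coda words analyzable for six speakers, one discarded token); that the remaining five coda words were analyzable for all 42 speakers is an inference of this sketch, not something stated explicitly.

```python
# Token bookkeeping for the reading corpus, based on figures given in the text.
# Assumption of this sketch: the five remaining coda words were analyzable
# for all 42 speakers.
SPEAKERS = 42
ONSET_WORDS = 9
CODA_WORDS = 9
PARTIAL_WORDS = 4      # coda words analyzable for six speakers only
PARTIAL_SPEAKERS = 6
DISCARDED = 1          # mispronounced token of "ritmo"

onset_tokens = ONSET_WORDS * SPEAKERS
coda_tokens = ((CODA_WORDS - PARTIAL_WORDS) * SPEAKERS
               + PARTIAL_WORDS * PARTIAL_SPEAKERS
               - DISCARDED)
total_tokens = onset_tokens + coda_tokens

print(onset_tokens, coda_tokens, total_tokens)  # 378 233 611
```

On this reckoning, the analyzable corpus amounts to 611 tokens, i.e., 612 recorded items minus the single discarded occurrence.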
Because the corpus was not built to investigate this specific context, and given the scarcity of loanwords and bookwords with stops in the coda position, the data for this context are not balanced for place of articulation (/p/ = 60, /t/ = 83, /k/ = 90) or following consonant (V_N = 101, V_T = 84, V_S = 48). While all onset stops occur in post-stress position, coda stops follow both stressed (5 words) and unstressed vowels (4 words). Of these four words, three have bilabial coda stops (/apˈnɛa/, /ipˈnɔsi/, /kapˈtato/) and one has a dental coda stop (/atmosˈfɛra/). We chose to keep them in this study, as their exclusion might have led to an even more unbalanced corpus in relation to the place of articulation and the following consonant. Even though the literature suggests that GT is not affected (Marotta, 2001; Bertinetto et al., 2007; Villafaña Dalcher, 2008) or only slightly affected (Sorianello et al., 2005; Piccardi & Ardolino, 2021) by lexical stress, we could not exclude that keeping these words might introduce a confounding effect. For this reason, we tried to control the effect of stress as much as possible in both the allophonic classification and the acoustic analysis (see Section 6.2), while noting that further research on the coda context should control for this variable more systematically. Regarding the prosodic context, all the target words were sentence-internal. The sentences were designed so that the nuclear accent would not fall on the target word.
Given that Giannelli and Savoia (1978) noticed some cases of spirantization in postconsonantal positions (see also Section 4), we chose 9 words (378 tokens in total) from our broader corpus in which stops are preceded by rhotics (e.g., /ˈarko/ “arch”), laterals (e.g., /ˈfalko/ “hawk”), nasals (e.g., /ˈanta/ “shutter”) and sibilants (e.g., /ˈrɔspo/ “toad”). We checked the frequency of spirantized outputs in the postconsonantal position (see Section 6.2.1) in order to compare them to spirantized outputs of stops in coda and intervocalic onset positions. These words were in sentence-internal unaccented positions as well.
Since it is not implausible that loanwords and bookwords might be unaffected by GT due to their non-native status (e.g., the loanword “sexy”, /ˈsɛksi/), we extracted, from the conversation data, all the words participants produced that contained voiceless stops in the coda position (46 tokens: 43 /k/ and only 3 /t/), intervocalic stops occurring in this kind of lexeme (40 tokens, e.g., “meeting” /ˈmiːtɪŋ/), and intervocalic stops in native words (702 tokens). For these conversation data, orthographically transcribed using the OCTRA tool (Pömp & Draxler, 2017), our analysis focused solely on allophonic classification (Section 6.2.1). If any lack of spirantization in coda stops is due to their non-native status, we might expect that intervocalic stops in loanwords do not undergo lenition either.
Before the analysis, forced alignment was applied to both the conversation and reading task materials using WebMaus (Schiel, 1999). With the help of Praat (Boersma & Weenink, 2023), manual adjustments were made to the TextGrid boundaries of each voiceless stop, the preceding vowel, and the following segment.
For the postvocalic stops from the reading corpus, we executed both an allophonic classification (Section 6.2.1) and an acoustic quantitative analysis (Section 6.2.2), for which we will now discuss the methodology.

6.2. Analytic Procedures

6.2.1. Allophonic Classification

Following previous acoustic studies on GT (Marotta, 2001; Sorianello, 2001) and on stops in the coda position (Marotta et al., 2023), outputs of voiceless stops have been classified based on the presence or absence of a silence phase, burst, frication noise and voicing. This yields the following classification: (a) voiceless stops, when showing a silence (closure) phase for more than half of the phone duration (Figure 3); (b) fricatives (and affricates), when showing noise for the entire phone duration (fricatives) or for more than half of the phone interval, as well as the lack of burst (semifricatives, Marotta, 2008⁴) (Figure 4) and absence of voicing bar; (c) approximants (and voiced fricatives), when showing voicing bar, either associated with the presence of frication (voiced fricatives, see Figure 5) or formant structure (approximants); and (d) (regressively) assimilated allophones, in which the cluster is produced as a long consonant, i.e., a geminate (Figure 6).
Since, from our point of view, what is phonologically relevant is the presence/absence of a clear closure phase (see Section 2), we analyzed our data using “GT application” as a dependent variable, considering non-continuant vs. continuant realizations. Since stop outputs show a closure phase, they have been considered cases in which GT does not apply; assimilated outputs, i.e., geminate outputs, interpretable as a repair strategy (see Section 3), were considered outside of the GT field of pertinence. Fricatives and approximants have been considered cases in which GT applies.
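The classification criteria and their mapping onto the binary “GT application” variable can be sketched as a pair of functions. This is an illustration under stated assumptions, not the procedure actually used: the inputs are hypothetical hand-annotated measurements (proportion of the phone occupied by closure and by noise, presence of burst and voicing bar), the order in which the criteria are checked is a choice made for this sketch, and the names `classify` and `gt_application` are hypothetical.

```python
from typing import Optional

# Illustrative sketch: input features, check order and function names are
# assumptions made for this example.

def classify(closure_ratio: float, noise_ratio: float, burst: bool,
             voicing_bar: bool, geminate: bool = False) -> str:
    """Assign an allophone label from hand-annotated spectrographic cues."""
    if geminate:
        return "assimilated"        # cluster produced as a long consonant
    if closure_ratio > 0.5:
        return "stop"               # silence for more than half of the phone
    if voicing_bar:
        return "approximant"        # includes voiced fricatives
    if noise_ratio > 0.5 and not burst:
        return "fricative"          # includes semifricatives
    return "unclassified"

def gt_application(label: str) -> Optional[bool]:
    """Map a label onto the binary dependent variable (None = excluded)."""
    if label == "stop":
        return False
    if label in ("fricative", "approximant"):
        return True
    return None                     # e.g., assimilated (geminate) outputs

label = classify(closure_ratio=0.1, noise_ratio=0.9, burst=False,
                 voicing_bar=False)
print(label, gt_application(label))
```

Returning `None` for assimilated outputs reflects the decision to treat geminates as falling outside the GT field of pertinence rather than as non-application.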
To test whether there is a difference in “GT application” between the onset and the coda context (Onset Hypothesis, see Section 3) and whether position interacts with social variables (see Section 5) or the segment involved, we employed a Generalized Linear Mixed Model (GLMM with a binomial response) using, as fixed factors, syllabic position (onset/coda) in interaction with each social variable (sex, age, level of education) and phoneme, together with stress. The phoneme factor was included as it is well known that, in the onset position, the place of articulation influences the GT pattern, with /k/ being more prone to lenition and /p/ being the most resistant segment (Sorianello, 2001; Villafaña Dalcher, 2008). The interaction of the sociolinguistic factors with the syllabic position was added as we expect them to affect GT in the onset position only (see Section 5). Similarly, the interaction of phoneme with position was added since, if any spirantization is found in the coda position, we do not expect it to show the same effect of place of articulation. Since coda stops occur both in unstressed and stressed syllables (e.g., /kapˈtato/ “detected” vs. /ˈritmo/ “rhythm”, see Section 6.1), we included stress to observe whether this factor affects spirantization in the coda position. Speaker and target word were included as random intercepts. We manually performed a model reduction by pruning non-significant variables, performing likelihood ratio tests between nested models until the best compromise between the Akaike (Akaike, 1974) and the Bayesian (Schwarz, 1978) Information Criterion (AIC/BIC) was met (Hay & Foulkes, 2016; Piccardi & Ardolino, 2021).
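The quantities involved in this kind of model reduction can be sketched in a few lines. The snippet below is only an illustration: it computes AIC, BIC and a likelihood ratio test for two nested models differing by one parameter, using the standard definitions. The log-likelihoods and parameter counts in the example are invented placeholders, not values from the models fitted in this study, and the function names are hypothetical.

```python
import math

# Standard definitions; all numeric inputs below are hypothetical placeholders.

def aic(log_lik: float, k: int) -> float:
    """Akaike Information Criterion for a model with k parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik

def lrt_statistic(log_lik_full: float, log_lik_reduced: float) -> float:
    """Likelihood ratio statistic between two nested models."""
    return 2 * (log_lik_full - log_lik_reduced)

def chi2_sf_df1(x: float) -> float:
    """Upper-tail chi-square probability for one degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

# Hypothetical example: the reduced model drops one fixed-effect parameter.
n_obs = 611
ll_full, k_full = -150.0, 10
ll_reduced, k_reduced = -150.8, 9

stat = lrt_statistic(ll_full, ll_reduced)
p_value = chi2_sf_df1(stat)
print(round(stat, 2), round(p_value, 3))
print(aic(ll_full, k_full), aic(ll_reduced, k_reduced))
print(bic(ll_full, k_full, n_obs) > bic(ll_reduced, k_reduced, n_obs))
```

In this invented case, the likelihood ratio test is non-significant and both criteria favor the reduced model, which is the situation in which a term would be pruned.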
As for coda stops realized as stops, we noticed that they were either released or unreleased, and some cases of epenthesis were also observed. A more fine-grained classification for the realization of coda stops was thus devised, with the following criteria: (a) released stops, when the silence phase was followed by a burst and a frication phase (VOT) (Figure 3); (b) unreleased stops, when only the silence phase occurred, and no burst and VOT were present (Figure 7); and (c) released stops with vowel epenthesis, when a vocoid (periodic waveform) was found after the stop and before the following consonant (Figure 8).
Although we are interested in the mere application of GT according to different stop syllabic positions—hence, an allophone classification based on spectrographic features would be sufficient—we also provide an analysis based on quantitative parameters. Allophonic classifications are inherently subjective, since they entail arbitrary boundaries along continuous variables, imposing a constrained classification (i.e., organized according to a limited number of categories). The aim of the instrumental acoustic analysis (Section 7.2) is, thus, to provide a quantitative description of the outputs in the two contexts. The acoustic measurements will support the spectrographic classification in providing the most unbiased results concerning GT.

6.2.2. Quantitative Analysis

The quantitative analysis included the extraction of duration and intensity, considered acoustic correlates of intervocalic lenition (Kingston, 2008; Hualde & Nadeu, 2012; Ennever et al., 2017; Katz & Pitzanti, 2019). Previous acoustic studies on GT in intervocalic position (Sorianello, 2003; Villafaña Dalcher, 2008) revealed a correlation between weaker allophones and decreasing duration, along with an increasing intensity. In our study, we will extend this investigation to the coda position, comparing these acoustic measures with those observed in the onset position.
Intensity and duration metrics have been extracted from both the stop consonant and the preceding vowel using a Praat script, developed by the first author. To mitigate potential influences of speech rate, we computed a normalized phone duration measure (NDur), which was defined as the duration ratio between the consonant and the preceding vowel. Likewise, to account for variations in intensity due to recording conditions, we employed the IntDiff measure (Hualde & Nadeu, 2012), which is calculated as the disparity between the maximum intensity observed within the preceding vowel and the minimum intensity in the stop consonant domain.
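The two measures can be written out as a minimal sketch following the definitions just given. The function names and the input values are hypothetical; in practice the durations and intensity contours would come from the annotated recordings.

```python
# Minimal sketch of the two acoustic measures; names and values are
# hypothetical and serve only to illustrate the definitions.

def ndur(consonant_dur: float, vowel_dur: float) -> float:
    """Normalized duration: consonant duration over preceding vowel duration."""
    if vowel_dur <= 0:
        raise ValueError("vowel duration must be positive")
    return consonant_dur / vowel_dur

def intdiff(vowel_intensity_db, consonant_intensity_db) -> float:
    """Maximum intensity in the vowel minus minimum intensity in the consonant."""
    return max(vowel_intensity_db) - min(consonant_intensity_db)

# Hypothetical token: durations in seconds, intensity samples in dB.
print(ndur(0.06, 0.12))
print(round(intdiff([68.0, 72.5, 70.1], [55.2, 49.8, 53.0]), 1))
```

Because NDur is a ratio, the same consonant duration yields different values when the preceding vowel is longer or shorter, which is one reason why vowel duration must be kept comparable across the contexts being contrasted.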
To account for the problem of having coda stops in two different stress conditions (see Section 6.1), we decided to use a sub-corpus, excluding coda stops preceded by an unstressed vowel. The inclusion of coda stops under both stress conditions could have introduced a strong confounding effect on IntDiff and NDur, as these measures rely on the intensity and duration of the preceding vowel, which varies based on stress (e.g., Bertinetto, 1981).
The difference between the coda stops in the two stress conditions was also confirmed in our data by an independent-sample t-test (see Section 7.2). The quantitative analysis thus involves 515 tokens out of 611.
Given the imbalance of the corpus resulting from this reduction, the results shown in Section 7.2 must be interpreted cautiously, as they represent only a preliminary quantitative description of coda stops in Italian. Indeed, regarding Italian stops in this context, Celata et al. (2022) only analyzed the duration of the entire coda-onset cluster, while McCrary (2004) solely focused on the preceding vowel. Therefore, at least to our knowledge, the literature lacks data on the duration and intensity of Italian coda stops.
As a first step, we observed whether, within each context, the allophonic scale shows the expected decrease in duration (NDur) and IntDiff, typical acoustic cues of intervocalic lenition (Ennever et al., 2017). To test whether the difference in IntDiff and NDur among allophones in each context is statistically significant, we performed Kruskal–Wallis tests (non-parametric alternatives to one-way Analyses of Variance, that do not assume normality of data distribution, Baayen, 2008) on both measures in each context. As the following step, we ran post hoc comparisons (Wilcoxon signed-rank test for paired samples with Bonferroni correction) crossing allophone with IntDiff and, separately, with NDur. To investigate whether a relationship between these two measures exists, we also performed two Kendall correlation tests (one for each context).
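For readers unfamiliar with these procedures, the sketch below shows how a Kruskal–Wallis H statistic (with average ranks and tie correction) and a Bonferroni adjustment are computed. It is a plain-Python illustration, not the code used for the analyses, and the IntDiff values in the example are invented.

```python
# Plain-Python illustration; the example data are invented.

def average_ranks(values):
    """Rank values from 1..N, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    return h / (1 - ties / (n ** 3 - n))

def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Invented IntDiff values (dB) for three allophone classes.
approximants = [1.0, 2.0, 3.0]
fricatives = [4.0, 5.0, 6.0]
stops = [7.0, 8.0, 9.0]
print(round(kruskal_wallis_h([approximants, fricatives, stops]), 2))  # 7.2
```

The H statistic is then compared against a chi-square distribution with (number of groups minus one) degrees of freedom, and the Bonferroni step multiplies each pairwise p-value by the number of comparisons.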
After verifying the relationship between allophones and the acoustic measures, we fitted a Linear Mixed Model (LMM), with IntDiff serving as the dependent variable. The aim was to test whether also level of constriction (deemed to be associated with lenition phenomena) is affected by the syllabic position (Onset Hypothesis) interacting with social variables (see Section 5) or the segment involved.
For the most general model, we again selected age, level of education, sex and phoneme, in interaction with the syllabic position, as fixed factors, and word and speaker as random intercepts. We performed a model reduction with the same method described for the GLMM. Since the LMM was fitted using restricted maximum likelihood (REML) settings, which prevent comparisons between models with different fixed factors, model reduction was carried out among models that were automatically refitted (by the anova() function) with maximum likelihood (ML) settings.
To test whether a difference in lenition between the two contexts occurs, we opted not to use duration (NDur) as a dependent variable, since an effect of the syllabic position on NDur would not be easy to interpret. Within each context considered individually, duration has been reported to correlate with lenition (Hualde & Nadeu, 2012; Sorianello, 2001; Katz & Pitzanti, 2019), whereas a possible difference in duration between coda and onset stops might not be interpretable as a difference in lenition only. In fact, the syllabic position itself is likely to influence segment duration (McCrary, 2004), regardless of the presence of lenition processes. Indeed, in other varieties of Italian showing no ongoing lenition phenomena, e.g., Lombard Italian, consonants in the syllable-final position (i.e., in coda) appear to be systematically longer than single intervocalic consonants (Farnetani & Kori, 1986, p. 30). Moreover, considering that NDur is defined as the ratio between the phone duration and the preceding vowel duration (NDur = phone duration/vowel duration), a difference in NDur between onset and coda stops might also come from a difference in the preceding vowel duration between the two contexts. In fact, we need to consider that, in Italian, stressed vowels preceding coda and onset consonants are expected to be of different duration (for the debate on the difference in length between stressed vowels in open and closed syllables see Fava & Magno Caldognetto, 1976; Vogel, 1982; Marotta, 1985; Chierchia, 1986; Farnetani & Kori, 1986; D’Imperio & Rosenthall, 1999; McCrary, 2004).
Celata et al. (2022) have recently confirmed that vowels preceding stop-initial tautosyllabic clusters (thus preceding onset stops) are significantly longer than those preceding stop-initial hetero-syllabic clusters (preceding coda stops).
Conversely, regarding IntDiff, the process seems to target a characteristic of segments associated with the closure gesture, and thus with constriction. Constriction is known to correlate with intensity (Chandrasekaran et al., 2009; Parrell, 2010). Therefore, we suggest that IntDiff may be the measure most closely reflecting the articulatory constriction of the phoneme. Consequently, it may best capture the phonetic effects of GT on stops, even when comparing the two contexts. In conducting the statistical analyses, we used the RStudio software (R Core Team, 2022, v. 2024.12.1). The packages lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) were adopted to run the generalized and linear mixed models. The R-squared values were extracted with the MuMIn package (Barton, 2023).

7. Results

7.1. Allophonic Classification of Intervocalic and Coda Stops

In Table 3, we present the results from the reading corpus of the allophonic categorization of postvocalic voiceless stops in onset and coda positions, according to the place of articulation.
We notice that stops behave very differently in the two contexts. Concerning the onset position, fricatives are the main realization (61.4%), approximants amount to 26.7%, and stops to 9.5% only, while deletion is sporadic (2.4%).
In coda, in contrast, stop realizations outnumber (88.8%) other realizations, i.e., voiceless fricatives (6.4%) and regressive assimilations (4.7%). Approximant realizations were not observed.
In both onset and coda positions, /k/ shows the lowest percentage of stop outcomes, whereas /p/ shows the highest one. In onset, as previously observed in the literature, among the realizations of underlying velar stops, approximants were the most frequent ones (found in 60% of cases). In the coda position, dental stops show slightly higher percentages of spirantized allophones as compared to velars, which show higher percentages of assimilation instead (7%).
Table 4 shows the percentages of “GT application” (continuant/non-continuant allophones), which has been taken as the dependent variable for the GLMM, and their correspondence (following the criteria set out in Section 6.2.1) with the allophonic distribution reported in Table 3, in which continuant outputs correspond to 90.5% in onset and 6.4% in coda.
Among coda stops, the percentage of GT application on the 95 segments preceded by an unstressed vowel amounted to 5.2%, while on the 137 stops preceded by a stressed vowel, it was 7.8%.
Table 5 shows the percentage of GT application for each level of the sociolinguistic factors considered in the study, among stops in onset and coda.
In Table 5, we notice that non-graduated speakers realize more spirantized allophones in both onset (98% versus 82%) and coda (8.7% versus 3.7%). Male and adult speakers produce slightly higher percentages of continuant outputs in the onset and higher percentages of continuant outputs in the coda (9.9% among males versus 3.3% among females, 8.7% among adults versus 3.7% among youths). However, across all age, sex, and education levels, continuant outputs are the main realization for stops in the onset position (percentages above 80%), while stops in coda are mainly produced as non-continuant outputs (percentages above 90%).
The best model (Table 6) on “GT application” was the one with the syllabic position, level of education and phoneme as the main factors (sex, age and stress were pruned out). Speaker and target word were kept as random intercepts.
The syllabic position shows the strongest effect, with GT application being more likely in the onset than in the coda context (Odds Ratio = 2924.63, p < 0.001). As all the interactions between position and the sociolinguistic factors were not significant, accordingly pruned during the model selection procedure, this difference holds true for each level of sex, education and age. The phoneme effect indicates that continuant allophones, for both the coda and the onset positions, are less likely to occur as outputs of the bilabial stop (Odds Ratio = 0.17, p = 0.002). Level of education is the only sociolinguistic factor showing a significant effect on the phenomenon, with lower-educated speakers showing a higher probability of spirantizing, independent of the syllabic position of the stop (Odds Ratio = 7.67, p = 0.003).
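To give a feel for the odds scale on which these effects are reported, the sketch below computes a raw, unadjusted odds ratio from the overall percentages of continuant outputs stated in the text (90.5% in onset, 6.4% in coda). This is only an illustration: the raw ratio is not expected to match the model estimate, which is conditional on the random intercepts and on the other predictors. The function names are hypothetical.

```python
# Raw (unadjusted) odds ratio from the overall proportions given in the text.
# It illustrates the odds scale only and is not the GLMM estimate.

def odds(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1 - p)

def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of the event in condition A relative to condition B."""
    return odds(p_a) / odds(p_b)

p_onset = 0.905   # continuant outputs in onset position
p_coda = 0.064    # continuant outputs in coda position

print(round(odds(p_onset), 2), round(odds(p_coda), 3))
print(round(odds_ratio(p_onset, p_coda), 1))
```

Even on this crude calculation, the odds of a continuant realization are more than a hundred times higher in onset than in coda, in line with the direction of the modeled effect.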
The model R² measures (marginal = 0.71, conditional = 0.85) indicate that fixed factors account for more than 70% of the variance, while an additional 14% is accounted for by the random factors. Among them, speaker seems to explain more variability than the target word. The low variance associated with the target word seems to indicate that GT application is quite similar among the different lexemes.
As for the postconsonantal position, which served as a control, 84% of the 378 stops were realized as plosives, while 16% were realized as spirantized allophones (affricates and fricatives).
Moreover, the results from the spontaneous corpus seem to confirm the different behavior of stops in coda and in the onset position that we found in the reading corpus.
As regards loanwords and bookwords, intervocalic stops (40 occurrences) were observed to spirantize in 98% of cases (e.g., [ˈmiːθɪŋɡ], “meeting”). More specifically, velar and dental stops were never realized as such, with the exception of one token of /t/. The velar stop /k/ was mostly realized as an approximant (82%), less frequently as a voiceless fricative (only 3 cases), while it was deleted in only one case.
By contrast, out of the 46 stops (/k/ and /t/) in the coda position, 42 (91%) were realized as such (e.g., [ˈsɛksi], “sexy”) and 4 (9%) as voiceless fricatives.
The percentages of spirantization of intervocalic stops in loanwords and bookwords are very similar to those of intervocalic stops in the Italian lexicon; here, /k/ was never realized as a stop (0%), it surfaced as a fricative in 23% of cases, as an approximant in 69%, and was deleted in 8% of cases. Similarly, /t/ was realized as a stop in only 3% of instances, as a voiceless fricative in 79%, and as an approximant in just 6%. The fact that intervocalic stops show high percentages of continuant outputs, regardless of the native or loan status of the word, suggests that the preservation of stops in coda is not related to the loan status of the word.
Table 7 shows a finer-grained classification of coda stops, which, unlike Table 3, further distinguishes released and unreleased stop realizations, as well as cases of epenthesis (see Section 6.2.1). They are organized according to the class of the cluster’s second member to check whether it influences the realizations of coda stops.
Among coda stops, released stops (71.4%) represent the main realization, irrespective of the following consonant, and unreleased stops correspond to 15.5%. Before sibilants, unreleased stops represent 27.1% of the realizations, whereas they are less frequent before other stops (14.3% of cases) and nasals (10.9%). Epentheses (3.4% of the total) occur (with one exception) before nasals, where they represent 6.9% of the realizations. Assimilation (4.7% in total) is realized only within stop + stop clusters (where it represents 13% of realizations). No assimilation has been observed when the stop is followed by a sibilant or a nasal (cf. Ulfsbjorninn, 2017). In these two contexts, we have found some occurrences of spirantized allophones instead. Fricatives (6% in total) are indeed present before a nasal (7.9% of the cases) and before a sibilant (14.6% of the cases), but never before another stop.
According to our results, cluster simplification is sporadic (10%), with 3% via assimilation and 7% via epenthesis. The type of cluster simplification seems to be related to the following consonant: vocalic insertions are favored before nasals, while assimilations are more frequent in stop + stop clusters, a context in which spirantization is never present.
Summing up, our allophonic classification shows a strong difference in the stops’ behavior in onset as opposed to coda (Table 1 and Table 6). Stops in coda are almost always realized as stops (89%), with very few cases of spirantization, while in the onset they are almost always spirantized (90%). Spirantization in both contexts is more frequent among low-educated speakers, but a difference in GT application between coda and onset is found for all the sociolinguistic categories considered.
Cluster simplification via epenthesis or assimilation is also rare; the following consonant seems to influence these residual phenomena (Table 7). Moreover, the percentages of spirantization in the coda context (6%) are closer to the postconsonantal ones (16%) than to the intervocalic ones (90.5%). According to the data from the spontaneous corpus, this difference does not seem to depend on the non-native origin of the word, as coda stops do not spirantize, while intervocalic stops in loanwords show the same percentages of spirantization as the intervocalic stops of the native lexicon.

7.2. Quantitative Analysis of Intervocalic and Coda Stops

Before conducting the quantitative analysis, we performed t-tests, which revealed significant differences in acoustic measures (IntDiff: t = 2.13, df = 206.72, p = 0.034; NDur: t = 5.75, df = 108.46, p < 0.001) between coda stops following stressed vowels (N = 137, μIntDiff = 35.3 dB, σIntDiff = 5.2; μNDur = 1.1, σNDur = 0.5) and those following unstressed vowels (N = 96, μIntDiff = 33.9 dB, σIntDiff = 5.2; μNDur = 2.2, σNDur = 1.7). Therefore, according to what we have outlined in Section 6.2.2, the 96 coda stops preceded by an unstressed vowel have been removed from the quantitative analysis, which involves 515 tokens out of 611.
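The fractional degrees of freedom suggest an unequal-variance (Welch) t-test, although this is an inference rather than something stated here. Under that assumption, and taking 137 post-tonic and 96 post-atonic tokens as the group sizes, the statistics can be approximately recomputed from the rounded summary values. The sketch below uses only the standard library. Because the inputs are rounded, the results are expected to be close to, but not identical with, the reported ones.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations and sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Group sizes are an assumption: 137 stressed-context and 96 unstressed-context tokens.
t_int, df_int = welch_t(35.3, 5.2, 137, 33.9, 5.2, 96)  # IntDiff (dB)
t_dur, df_dur = welch_t(1.1, 0.5, 137, 2.2, 1.7, 96)    # NDur (ratio)

print(f"IntDiff: t = {t_int:.2f}, df = {df_int:.1f}")
print(f"NDur:    |t| = {abs(t_dur):.2f}, df = {df_dur:.1f}")
```

The sign of t depends only on the order in which the two groups are passed, so the absolute value is the quantity to compare with the reported statistic.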
As a first step, in Table 8 we report duration (NDur) and intensity (IntDiff) measures distinguished by syllabic context and allophone realization, to check, within each context, for the possible correlation between the lenited outputs, the decrease in segment duration and the increase in intensity observed in the literature on GT and lenition in general (see Section 6.2.2).
In both onset and coda contexts, fricatives and approximants show lower IntDiff and NDur values than stop outputs, as expected. Fricatives in coda are shorter and less constricted (lower IntDiff) than stops; on acoustic grounds too, they may therefore be analyzed as (rare) instances of GT affecting codas. Nevertheless, given the small number of observations, these segments could also represent pure noise without any phonological relevance. The relationship among duration, intensity, and allophones, within each context, can be better visualized in the following graphs (Figure 9) created with the ggplot2 package (Wickham, 2016).
The Kruskal–Wallis test verified that, in the onset context, allophones differed significantly in both NDur (χ² = 128.56, p < 0.001) and IntDiff (χ² = 169, p < 0.001). According to the post hoc comparisons (Wilcoxon paired signed-rank with Bonferroni correction), these differences are significant for both measures among all the allophones (p < 0.001).
As for the coda context, allophones differ significantly in both NDur (χ² = 11.78, p = 0.003) and IntDiff (χ² = 15.23, p < 0.001). However, according to the post hoc comparisons, assimilated outputs and stops differ significantly in IntDiff (p = 0.004) but not in NDur (p = 0.11); stops and fricatives differ significantly in NDur (p = 0.002) but not in IntDiff (p = 0.07).
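As a quick check on how the Kruskal–Wallis statistics map onto p-values, the sketch below assumes three allophone groups per context, hence 2 degrees of freedom. That is an assumption, since the degrees of freedom are not stated here. With 2 degrees of freedom, the upper tail of the chi-square distribution has the closed form exp(−χ²/2), so no statistics library is needed.

```python
import math


def kw_p_value_df2(chi_square: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-x / 2).
    """
    return math.exp(-chi_square / 2.0)


# Reported Kruskal-Wallis statistics; df = 2 is assumed, not reported.
reported = {
    "onset NDur": 128.56,
    "onset IntDiff": 169.0,
    "coda NDur": 11.78,
    "coda IntDiff": 15.23,
}

for label, chi_square in reported.items():
    print(f"{label}: chi2 = {chi_square}, p = {kw_p_value_df2(chi_square):.2e}")
```

Under this assumption, the coda NDur statistic yields a p-value that rounds to 0.003, and the other three fall below 0.001, in line with the values given in the text.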
In the onset context, a clear relationship is observed between normalized duration (NDur) and IntDiff, confirmed by the Kendall correlation test (df = 367, τ = 0.25, p < 0.001). In the coda context, for which the data are more scattered, the correlation between the two acoustic measures is much weaker (df = 135, τ = 0.11, p = 0.04).
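For readers less familiar with Kendall’s τ, the following minimal sketch computes the tie-free variant (τ-a) as the balance of concordant and discordant pairs. The five NDur/IntDiff pairs are invented for illustration and are not data from this study; statistical packages typically report τ-b, which additionally corrects for ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical tokens: normalized duration and intensity difference (dB).
ndur_values = [0.4, 0.6, 0.7, 0.9, 1.1]
intdiff_values = [12.0, 20.0, 15.0, 25.0, 30.0]

tau = kendall_tau_a(ndur_values, intdiff_values)
print(f"tau-a = {tau:.2f}")
```

A positive τ, as found in both contexts, means that tokens with longer normalized duration also tend to show a larger intensity drop relative to the preceding vowel.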
These results seem to indicate that a general relationship among the discrete and the continuous measures of lenition within each context is confirmed, even though this relationship is stronger in the onset context.
When comparing the two contexts, we first notice that stops in coda are, on average, 30 ms longer than onset stops. As far as normalized duration is concerned, we see that stops in coda are 1.1 times as long as the preceding vowels, while stops in the intervocalic position are shorter (0.9). Stops in the two contexts also differ in intensity; stops in coda are, on average, 35 dB less intense than their preceding vowels, while onset stops are only 19 dB less intense. Stops in coda are thus longer and less intense than onset ones. However, as mentioned also in Section 6.2.2, we cannot affirm that the difference in duration we found between the two contexts is an effect of lenition. In fact, it might be an effect of the context itself, given that consonants’ duration changes according to their syllabic position and that Italian consonants (other than stops) in the syllable-final position seem to be systematically longer than single intervocalic ones (Farnetani & Kori, 1986, p. 30; McCrary, 2004). The difference in IntDiff, considered as a correlate of constriction, can more easily be interpreted as a difference in lenition between the two contexts.
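The two acoustic measures can be illustrated with a short sketch. It assumes, based on the wording above, that NDur is the ratio of stop duration to preceding-vowel duration and that IntDiff is the intensity of the preceding vowel minus that of the stop. The `Token` class and the two example tokens are hypothetical, chosen only so that they reproduce the average values mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One measured stop with its preceding vowel (hypothetical structure)."""
    stop_ms: float   # duration of the stop
    vowel_ms: float  # duration of the preceding vowel
    stop_db: float   # intensity of the stop
    vowel_db: float  # intensity of the preceding vowel


def ndur(token: Token) -> float:
    """Normalized duration: stop duration relative to the preceding vowel."""
    return token.stop_ms / token.vowel_ms


def int_diff(token: Token) -> float:
    """Intensity difference in dB: preceding vowel minus stop."""
    return token.vowel_db - token.stop_db


# Invented tokens, not measurements from the corpus.
coda_stop = Token(stop_ms=99.0, vowel_ms=90.0, stop_db=40.0, vowel_db=75.0)
onset_stop = Token(stop_ms=63.0, vowel_ms=70.0, stop_db=56.0, vowel_db=75.0)

for name, token in (("coda", coda_stop), ("onset", onset_stop)):
    print(f"{name}: NDur = {ndur(token):.1f}, IntDiff = {int_diff(token):.0f} dB")
```

On these definitions, a higher IntDiff corresponds to a tighter constriction, which is why it serves as the more direct correlate of lenition.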
The best model for IntDiff (Table 9), obtained through model reduction (see Section 6.2.2), retained only the syllabic position as a fixed factor, and target word and speaker as random factors.
To sum up, from the quantitative analysis, it emerged that coda stops are longer and less intense than intervocalic ones. In the onset position, a clear relationship among allophones, duration and intensity is confirmed, while in the coda, it is present but is much weaker. The model focusing on IntDiff, where constriction was found to be greater in the coda than in the onset context, provided acoustic confirmation of the difference observed between the two contexts in the allophonic classification and in GT application test (Section 7.1). This difference was not dependent on any of the sociolinguistic factors considered.

8. Discussion

8.1. The Importance of Being Onset

The main aim of this study was to test the Onset Hypothesis, i.e., the possibility that, to be affected by GT, the stop does not simply need to be preceded by a vowel, but it also must occupy the syllable onset. As Marotta (2008) observes, in native lexemes a post-vocalic stop always occupies the onset position. However, bookwords and recent loanwords represent a testbed for the hypothesis. Indeed, stop-initial clusters whose second segment is neither a liquid nor a glide are only found in loanwords (admin, Batman, cactus) and bookwords (ritmo “rhythm”, acne “acne”, atmosfera “atmosphere”). Hence, such words have been used to investigate the pattern of GT as far as stops in the coda position are concerned.
Results confirm the validity of the Onset Hypothesis (cf. also Marotta et al., 2023). Continuant and non-continuant outputs were in almost complementary distribution according to the syllable position of the stop. In brief, stops in coda were realized as such in 88% of the total, while the same underlying segments in the postvocalic onset position were produced as fricatives or approximants, or were deleted, in 90% of the total tokens. Concerning the place of articulation effect on spirantization, our data show a trend in line with previous observations in the literature; underlying velar /k/ and dental /t/ stops were spirantized more frequently than bilabial /p/ (see Section 7.1; cf. Sorianello, 2001; Villafaña Dalcher, 2008). This trend was found regardless of the syllabic affiliation of the stop, contrary to what was hypothesized in Section 6.2.1.
As for the sociolinguistic aspects controlled here, we found that the syllabic position influenced the GT pattern, regardless of sociolinguistic variables such as age, level of education and sex. This implies that speakers consistently distinguish between non-triggering codas and triggering onsets, with no role for sociolinguistic variation. However, interestingly, the level of education appeared to influence the general application of GT, i.e., highly educated speakers produced fewer lenited outputs than low-educated speakers, regardless of the syllabic affiliation of the stop. Such a finding is in line with the view that the former group (at least in the context of a reading task) tends to exert more control over dialectal features than the latter group (cf. Giannelli & Savoia, 1978). The fact that age did not show any effect suggests that GT has not diminished its productivity, from an intergenerational point of view.
Finally, contrary to the observations in Cravens and Giannelli (1995) on the Tuscan variety of Bibbiena, Florentine GT is not affected by the common sex/prestige pattern usually found (Labov, 2001), unlike other local traits, as observed by Piccardi (2017).
The difference in the degree of constriction between outputs in coda and the onset position is significant, aligning with the almost total absence of lenition in coda as compared to the almost systematic lack of stop outputs in the post-vocalic onset position. It is interesting to note that the degree of constriction of the underlying stops’ outputs shows a high variation at the interspeaker level, but it is not subject to sociolinguistic forces. This suggests that constriction, i.e., the distance between articulators, is likely to be independent from both sociolinguistic and phonological factors. The high variation related to the target word would require further investigation, controlling also for word frequency (Phillips, 1984; Pierrehumbert, 2001). Although our results need to be verified in a wider and more balanced corpus, they show that the Onset Hypothesis is correct. Stops in syllable coda are not affected by GT, which limits its action to the syllable onset (spirantization was observed in only 6% of the total coda stops).
Consistent with our findings, the locus “postvocalic” does not properly capture the context of GT. By assuming that only the presence of a preceding vowel is required for a stop to lenite, one overgenerates the production of spirantized outputs as the main realization of stops preceding non-liquid consonants, i.e., occupying the syllable coda, which was not found to be the case (cf. also Ulfsbjorninn, 2017). Therefore, the context needs to be specified with reference to the syllable constituents. In other words, Gorgia Toscana affects post-vocalic onsets occupied by stop obstruents. According to the observation that the relevant segments lenite also in branching onsets, the target of GT is most precisely defined as the head of an internuclear onset. Marotta (2008), Bafile (1997), Loporcaro (2005) and Ulfsbjorninn (2017) already argued that GT only affects onsets, although, to our knowledge, this view was never tested empirically, providing a phonetic description of the outputs, until Marotta et al. (2023) and the present study.
Note that one may describe the scenario sub iudice by claiming that GT affects stops that are simultaneously preceded by vowels and followed by liquids, glides or vowels (as, e.g., in Sorianello, 2001, among many others). Such a formulation is unable to explain why the right side imposes fewer restrictions. By assuming instead that syllable structure governs the relations among segments, the reason becomes straightforward. In fact, in Italian, a stop occupying the syllable onset is by definition followed either by a liquid or a vowel (i.e., the onset’s branching and a nucleus, respectively). The presence of any other segment following the stop, such as nasals and obstruents, implies a coda-onset cluster in current Tuscan (see Note 6; Ulfsbjorninn, 2017, Section 4).
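The syllable-based formulation can be sketched as a small classifier: a word-internal stop counts as an onset when followed by a vowel or a liquid, and as a coda otherwise, and GT targets only post-vocalic onsets. The segment inventories, the function names and the treatment of glides as onset-compatible are simplifying assumptions made for illustration. Stress and word-boundary contexts are ignored.

```python
VOWELS = set("aeiouɛɔ")
LIQUIDS = {"l", "r"}
GLIDES = {"j", "w"}  # syllabic affiliation is debated; treated as onset-compatible here
STOPS = {"p", "t", "k", "b", "d", "ɡ"}


def stop_position(segments, i):
    """Return 'onset' or 'coda' for the stop at index i, based on what follows."""
    if i + 1 >= len(segments):
        return "coda"  # a word-final stop has no following nucleus
    following = segments[i + 1]
    if following in VOWELS or following in LIQUIDS or following in GLIDES:
        return "onset"
    return "coda"  # nasals and obstruents imply a coda-onset cluster


def is_gt_target(segments, i):
    """A stop is a GT target only if it is a post-vocalic onset."""
    return (
        segments[i] in STOPS
        and i > 0
        and segments[i - 1] in VOWELS
        and stop_position(segments, i) == "onset"
    )


ritmo = list("ritmo")      # /t/ before a nasal: coda
tecnica = list("tɛknika")  # first /k/ in coda, second /k/ intervocalic
dentro = list("dentro")    # /t/ in a branching onset, but post-consonantal

print(stop_position(ritmo, 2), is_gt_target(ritmo, 2))
print(stop_position(tecnica, 2), is_gt_target(tecnica, 2))
print(stop_position(tecnica, 5), is_gt_target(tecnica, 5))
print(stop_position(dentro, 3), is_gt_target(dentro, 3))
```

A purely linear rule (“stop after a vowel”) would wrongly flag the /t/ of ritmo; adding the onset condition excludes it without any separate statement about what may follow the stop.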
When an internuclear onset position is occupied by a stop segment, closure will be removed from the articulatory configuration, as indirectly shown by our acoustic data. From this perspective, if Florentine allowed nasals, in addition to liquids, as licit onset branchings, the stop would undergo lenition before both classes. The crucial factor is the affiliation with an internuclear syllable onset.
To conclude, the “when” issue, as it was labeled in Section 3, needs to be dealt with by referring to the syllabic domain and the relationships between constituents, i.e., at the suprasegmental level. Conversely, the “what” issue (see Section 2) concerns the subsegmental level (see the relative discussion in Section 8.3).

8.2. Phonological Analysis in Strict CV and Coda Mirror Frameworks

In this section, we analyze the pattern observed according to the frameworks presented in Section 4, i.e., Strict CV (Lowenstamm, 1996; Scheer, 2004) and Coda Mirror (Scheer & Ziková, 2010). As claimed in Section 4, such frameworks avoid the use of the syllable; they instead postulate the action of lateral forces on the segmental linear order to account for positional phenomena. We will discuss first the situation concerning the GT target, i.e., the internuclear onset, as defined in Section 8.1.
Recall from Section 4 (see Figure 1 and Figure 2) that (branching) onsets between (full) nuclei are governed and unlicensed, the weakest condition according to the Coda Mirror theory. GT may, thus, be described as an effect of Government (cf. Marotta, 2008). Intervocalically, the full nucleus on the right governs the immediately adjacent C position on its left. Only 16% of post-C stops (licensed and ungoverned) were indeed realized as continuants, suggesting that the Coda Mirror prediction is generally correct. Governed positions represent the main site of weakening processes, such as GT (see Ségéral & Scheer, 2008; Scheer & Ziková, 2010).
Relative to weak positions, the model also correctly predicts that codas are stronger than intervocalic (branching) onsets. Indeed, our results show that the target of GT is the onset between two ungoverned nuclei, i.e., a governed C position. Onsets preceding governed nuclei, being ungoverned and unlicensed, i.e., the traditional coda, and onsets following governed nuclei, being ungoverned and licensed, i.e., the traditional post-consonantal onsets, are not (as frequently) affected. See Figure 10 and Figure 11.
As shown, GT may be directly accounted for by analyzing the pattern within the Strict CV framework, particularly based on Coda Mirror (Scheer & Ziková, 2010). GT may be understood as a manifestation of Government effects on stops, a kind of relation causing the inhibition of the segmental expression of the stop (as claimed above; see the following Section).
As far as branching onsets are concerned, according to their representation in Strict CV (see Section 4), the cluster-initial stop is lenited since it is governed by the empty nucleus within the Infrasegmental Government (IG) domain. Such an empty nucleus does not need to be governed in order to remain phonetically silent; it is, however, able to govern preceding V and C positions. Accordingly, an infrasegmentally governed stop will occupy a weak position if preceded by a full nucleus, where the following ungoverned empty nucleus will govern the stop. In contrast, an IG-governed stop will occupy a strong position when preceded by an empty nucleus, the stop being licensed in this case (e.g., /t/ in dentro “inside”). According to this perspective, the stop experiences the same lateral forces regardless of whether it occupies a simple or branching onset, in traditional terms.
In brief, the target of GT is defined as a stop obstruent, both plain and affricate, occupying a governed C position in Strict CV terms.
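The three positional configurations described in this section can be summarized in a minimal sketch that derives the governed/licensed status of a C position from the nuclei around it. This is a deliberately simplified rendering: it treats “full” as equivalent to ungoverned and “empty” as governed, leaves out branching onsets and Infrasegmental Government, and uses hypothetical names throughout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CPosition:
    governed: bool
    licensed: bool
    traditional_label: str


def classify_c_position(prev_nucleus_full: bool, next_nucleus_full: bool) -> CPosition:
    """Lateral status of a C position given the nuclei on its left and right."""
    if not next_nucleus_full:
        # Followed by a governed (empty) nucleus: ungoverned and unlicensed.
        return CPosition(False, False, "coda")
    if prev_nucleus_full:
        # Between two full nuclei: governed and unlicensed, the weakest position.
        return CPosition(True, False, "intervocalic onset")
    # Preceded by a governed (empty) nucleus: ungoverned and licensed.
    return CPosition(False, True, "post-consonantal onset")


def is_gt_target(position: CPosition) -> bool:
    """GT targets stops sitting in a governed C position."""
    return position.governed


for prev_full, next_full in ((True, True), (True, False), (False, True)):
    position = classify_c_position(prev_full, next_full)
    print(position.traditional_label, "-> GT target:", is_gt_target(position))
```

Only the intervocalic configuration comes out as governed, which matches the distribution found in the data: frequent spirantization between full nuclei, and mostly preserved stops in the coda and post-consonantal positions.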
Stop-initial heterosyllabic clusters, such as those analyzed in this study, have been reported to have frequently undergone nativization through total regressive assimilation (as in Lat. NOCTEM > Tsc. [ˈnɔtːe] “night” and, from our data, cactus → [ˈkatːus]) and vowel epenthesis (as /ps/ → [pis] in psicologo “psychologist”; see, e.g., Repetti, 1993; Ulfsbjorninn, 2017). Few occurrences of such “strategies” have been detected within our corpus, though.
The low incidence of such realizations within our corpus suggests that heterosyllabic stop-initial clusters are mostly tolerated in current Florentine, contrary to the past as well as recent times. We may interpret such a fact as a loosening of the phonotactic constraints (see Section 3).
As for the vowel epenthesis issue presented in Section 4, we observe that, in our Florentine Italian data, such vocoids do not seem to force stops in the coda to resyllabify as onsets. If they did, we argued, the stops should be subject to GT, being intervocalic on the surface. On the contrary, they were realized as stops in the (few) occurrences in our data. Consistent with this observation, one may hypothesize that cases of word-internal epenthesis are analogous to those occurring after final consonants, not only in Tuscan, as in /la ˈkɔp/ → [la ˈhɔpːə] “the Coop (supermarket company)”, but also, e.g., in Roman Italian and other varieties, as in Rom. [sˈtɔpːə] “stop”, [sˈpɛkːə] “speck” (Broniś, 2016). In the final position, one cannot refer to bogus clusters in order to justify the vowel insertion (cf. Ulfsbjorninn, 2017).
An alternative analysis might interpret these outputs as stemming from the prohibition against stop obstruents appearing in syllable codas, rather than from an inability to form relationships with subsequent non-liquid consonants. Note that in the word-final position, the stop followed by epenthesis is geminated. If word-internal and word-final epentheses were the same process, forms such as /ˈtɛknika/ → [ˈtɛkːənːika], “technique” should be expected. The unified process might be analyzed as a way of preserving the stop underlying syllabic affiliation on the surface form, as the first half of a geminate, the only condition under which coda stops are licit in Florentine, especially in the past. At least from a preliminary observation of the few epenthesis tokens in our data, pre-epenthesis stops appear to be long. We have not controlled for the length of the post-epenthesis consonant.
Although a systematic analysis would require much more data than those within our corpus, it might be suggested that what is being epenthesized is a CV(C) syllable, where C is represented by the second half of a geminate and V by the post-lexical vocoid. Both the supposed CV(C) epenthesis and total regressive assimilations avoid the occurrence of an illicit coda, in favor of a licit one. In Tuscan, a coda stop is only licit when followed by a subsegmentally identical onset segment, i.e., in a geminate.
As a final note, the few instances of coda stop spirantization observed in this study show the same acoustic correlates, appear to be influenced by the same sociolinguistic factor (education), and follow the same distribution based on place of articulation as onset spirantization. Therefore, we cannot exclude that they represent rare instances of Gorgia Toscana. Crucially, coda stops spirantize only before nasals and sibilants. It is not far-fetched to propose that clusters of this kind might receive tautosyllabic parsing. Indeed, such consonantal sequences (stop + nasal/sibilant) show an increasing sonority slope, typical of branching onsets (i.e., they respect the Sonority Sequencing Principle; see, e.g., Vennemann, 1988, 2012, among others). As shown in Section 7.1, stop spirantization in the pre-nasal/sibilant position is more likely to occur among low-educated speakers. At present, we are not able to provide a consistent and valid sociolinguistic justification for such a pattern, which was, however, rarely encountered, as argued above. Possibly, the tautosyllabic parsing might be due to the less intense exposure to English of low-educated speakers, as compared to highly educated speakers. In English, in fact, such clusters are heterosyllabic. However, we are not aware of the degree of exposure to English of either high- or low-educated Florentine speakers within our sample.
Concluding this section, despite the linear view of syllable structure in the phonological frameworks adopted in this contribution (i.e., Scheer, 2004), Tuscan spirantization may be analyzed as a process where the syllable structure plays a fundamental role, since it involves the computation of syllabic constituents, along with the associated subsegmental melody, in seeking articulatory closure to delete. In the following section, we will discuss the “what” issue (see Section 2), i.e., the subsegmental mechanism to which the targeted underlying stop is subject.

8.3. Lenition as Stopness Loss

As already discussed, GT affects stops, regularly voiceless stops (/p t k/), and, less systematically, the voiced ones (/b d ɡ/). In Section 2, we also discussed the deaffrication of /ʧ ʤ/ (Marotta, 2008; Ulfsbjorninn, 2017) as an output of GT. All these segments are characterized by a closure phase, the property that makes them stops; affricates are stops showing a fricated release (cf. Marotta & Vanelli, 2021).
The target of GT is exactly such a property, i.e., closure. The (non-nasal) segments, which otherwise would be articulated by completely blocking the airflow in a point of the oral cavity, in the context of GT are produced without interrupting the airflow, i.e., as fricatives and approximants.
In Section 8.2, we have seen that GT is the Florentine manifestation of Government effect on a C position (Scheer & Ziková, 2010), which inhibits the melodic expression of its target (Scheer, 2004).
Such a kind of weakening, i.e., the deletion of the closure phase, can be formalized within the Element Theory framework (Kaye et al., 1985, 1990; Harris, 1994; Harris & Lindsey, 1995; Backley, 2011, p. 128), as the suppression of the stop element |ʔ| from the element composition (see Bafile, 1997, p. 34). Therefore, the process, e.g., /p/ → [ɸ] is simply represented in ET as the operation |U H ʔ| → |U H|. The operation takes place when the relevant elemental setup is associated with a governed syllabic position, as we have seen in Section 8.2.
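The operation can be pictured as removing one member from a set of elements. The sketch below encodes only the /p/ example given above; the function name and the boolean `governed` flag are illustrative assumptions rather than part of Element Theory’s formal apparatus.

```python
STOP_ELEMENT = "ʔ"


def apply_gt(elements: frozenset, governed: bool) -> frozenset:
    """Suppress the stop element when the segment sits in a governed position."""
    if governed:
        return elements - {STOP_ELEMENT}
    return elements


# Element composition of /p/ as given in the text: |U H ʔ|.
p_elements = frozenset({"U", "H", "ʔ"})

lenited = apply_gt(p_elements, governed=True)      # intervocalic onset
preserved = apply_gt(p_elements, governed=False)   # coda or post-consonantal onset

print(sorted(lenited))
print(sorted(preserved))
```

The place and laryngeal elements are left untouched, which reflects the claim that GT targets closure alone.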
Although the difference between oral and debuccalized outputs was not controlled in the present study, we would like to note that Sorianello (2003, p. 3083; 2004, p. 8) claims that the output of /k/, which is said to mainly result in a glottal fricative [h], is strongly influenced by coarticulation with the following vowel. She argues that the spectral energy of the output of [h] depends on the next vowel quality; a following back vowel favors the production of [x], rather than [h], the latter surfacing before front and low vowels. Before front vowels, however, [h] shows a much higher spectral prominence (ca. 2800 Hz, see Sorianello, 2003) as compared to [h] preceding non-front vowels, roughly approaching the following vowel F2 values.
From the ET point of view, the scenario relative to /k/ may be described as the influence of the resonance elements (|I|, |U|, |A|) of the governing nucleus on the elemental setup of the governed stop /k/. Similarly to our discussion relative to regressive assimilation, the vowel influence might be understood as an adjacency effect affecting the place-defining elements. The reasons why the velar place is much more influenced by vowels as compared to the other places remain to be explored in further studies; it may be hypothesized that this depends on the lingual articulator involved in velars, i.e., the tongue dorsum.

9. Conclusions

This paper dealt with the Gorgia Toscana pattern as observed in Florence. In particular, our aim was to test the Onset Hypothesis, i.e., the possibility for GT to only affect syllable onsets, rather than any post-vocalic stop, which might as well be represented by syllable codas. Indeed, although scholars had already analyzed the process as only affecting onsets (Bafile, 1997; Loporcaro, 2005; Marotta, 2008; Ulfsbjorninn, 2017), our study represents one of the first contributions, along with Marotta et al. (2023), providing empirical evidence. We specifically focused on the phonetic and phonological behavior of word-internal coda stops in light of GT. Results show that stops in coda are not spirantized (i.e., not affected by GT), phonetically surfacing as stops (cf. Ulfsbjorninn, 2017).
The “Importance of being Onset”, thus, refers to the relevance of the syllable structure in the understanding of lenition phenomena, such as GT. Targets are better identified by considering the syllable constituent domain, along with the subsegmental properties involved. If future studies were to replicate the observations made by Giannelli and Savoia (1978, 1980), specifically the approximant realization of liquids and nasals in the GT context, which can be interpreted as a loss of stopness, the syllable’s descriptive and potentially explanatory significance would be reinforced. Contrary to our claim in Section 2, where we stated that GT affects stop obstruents, this scenario suggests that any consonant occupying the internuclear onset position would lose its stopness (see Note 7), if present, regardless of the phonological segmental class.
In conclusion, it appears that stops have likely become “phonotactically licit” in syllable coda positions. The number of nativization processes was highly limited, while stop-initial clusters were largely preserved in their underlying form. The expansion of stops into syllabic positions beyond onsets (and codas in geminates) may thus be considered a phonotactic innovation of Florentine Tuscan.

Author Contributions

Conceptualization, P.C.; methodology, G.A.; software, G.A.; validation, P.C. and G.A.; formal analysis, P.C.; investigation, G.A.; data curation, G.A.; writing—original draft preparation, P.C. (§ 1, 2, 3, 4, 8) and G.A. (§ 5, 6, 7); writing—review and editing, P.C. and G.A.; visualization, G.A.; supervision, P.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the University of Pisa (protocol code 007798/2022, date: 13 June 2022).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Notes

1
Strong voiceless fricatives show a narrower closure than the prototypical ones, cf. Sorianello (2003, p. 3082).
2
That is, when weakened, geminates may degeminate, see, e.g., Kirchner (2004), while they are not commonly targeted by spirantization or any other process involving changes in the manner of articulation or the voicing activity.
3
Indeed, Tuscan speakers usually refer to GT as “ci aspirata”, literally “aspirated C”, in reference to the orthographic representation of /k/, i.e., ⟨c⟩.
4
The choice of setting the classification boundary between stops and fricatives when friction takes more than (the second) half of the phone duration follows Marotta (2001) and Villafaña Dalcher (2008). They consider the occurrence of an occlusion phase followed by frication noise for more than half of phone duration as an instance of weakening, instantiated as a sort of affricate (termed “semifricative” in Marotta, 2008). As we found these allophones in very low percentages (3%) in our corpora, instead of creating an underrepresented category, we grouped them together with fricatives, considering both affricates and fricatives as spirantized voiceless outputs. For the affrication of stops as an instance of lenition in other languages see, among others, Honeybone (2001) for Liverpool English, Pfiffner and Martinez-Garcia (2023) for Dutch, Yaqoub et al. (2023) for the Alma Arabic variety.
5
Distinguishing the 118 post-tonic coda stop realizations according to the finer classification, the means (and standard deviations) of the acoustic measures are the following:
- 99 released stops: phone duration = 99 (28) ms, NDur = 1.2 (0.6), IntDiff = 34 (4) dB.
- 18 unreleased stops: phone duration = 86 (28) ms, NDur = 0.9 (0.3), IntDiff = 37 (4) dB.
- 1 stop with epenthesis: phone duration = 115 ms, NDur = 1.4, IntDiff = 38 dB.
Interestingly, although more data would obviously be needed as it is just one case, the post-stressed stop with epenthesis is the longest and most constricted one, with a normalized duration closer to the assimilated geminate outputs.
6
Glides /w j/ may follow too. Their syllabic affiliation is, however, problematic (see the discussions in, e.g., Marotta, 1988; Canalis, 2018).
7
In rhotics and laterals, stopness may be present as the apical contact, since in Tuscan they are prototypically produced as alveolar trills [r] and apical laterals [l] (Bertinetto & Loporcaro, 2005).
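
The classification boundary described in Note 4 can be sketched in code. This is only a minimal illustration under stated assumptions, not the authors' procedure: the function name, its arguments, and the example durations are hypothetical, and each token is assumed to have already been segmented into an occlusion phase and a frication phase.

```python
def classify_voiceless_output(occlusion_ms: float, frication_ms: float) -> str:
    """Label a token 'fricative' when frication noise takes more than half
    of the phone duration, and 'stop' otherwise (rule paraphrased from Note 4).

    Both arguments are hypothetical hand-measured durations in milliseconds.
    """
    total_ms = occlusion_ms + frication_ms
    if total_ms <= 0:
        raise ValueError("phone duration must be positive")
    # Affricate-like tokens (occlusion followed by long frication) are
    # grouped with fricatives, as the note describes.
    return "fricative" if frication_ms / total_ms > 0.5 else "stop"


# Hypothetical example durations, for illustration only.
print(classify_voiceless_output(60.0, 30.0))  # frication is one third of the phone
print(classify_voiceless_output(30.0, 60.0))  # frication is two thirds of the phone
```

A token with no occlusion phase at all falls under the same rule, since its frication share is 100%.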

References

  1. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
  2. Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R (1st ed.). Cambridge University Press.
  3. Backley, P. (2011). An introduction to element theory. Edinburgh University Press.
  4. Bafile, L. (1997). La spirantizzazione toscana nell’ambito della teoria degli elementi. In Studi linguistici offerti a G. Giacomelli dagli amici e dagli allievi (pp. 27–38). Unipress.
  5. Bafile, L. (2003). Il trattamento delle consonanti finali nel fiorentino: Aspetti fonetici. In G. Marotta, & N. Nocchi (Eds.), Atti delle XIIIe Giornate di studio del GFS (A.I.A.) (pp. 205–212). Pisa 28–30 Novembre 2002. ETS.
  6. Barton, K. (2023). MuMIn: Multi-model inference. Available online: https://CRAN.R-project.org/package=MuMIn (accessed on 8 July 2024).
  7. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
  8. Bertinetto, P. M. (1981). Strutture prosodiche dell’italiano. Accento, quantità, sillaba, giuntura, fondamenti metrici. Accademia della Crusca.
  9. Bertinetto, P. M., & Loporcaro, M. (2005). The sound pattern of standard Italian, as compared with the varieties spoken in Florence, Milan and Rome. Journal of the International Phonetic Association, 35(2), 131–151.
  10. Bertinetto, P. M., Sorianello, P., & Ricci, I. (2007). Sulla gerarchia dei fattori che governano la gorgia. Un’applicazione dell’algoritmo di decisione C&RT. In V. Giordani, V. Bruseghini, & P. Cosi (Eds.), Atti del III Convegno Nazionale AISV ‘Scienze Vocali e del linguaggio’ metodologie di valutazione e risorse linguistiche, Trento (pp. 167–180). EDK Editore.
  11. Boersma, P., & Weenink, D. (2023). Praat: Doing phonetics by computer [computer program]. Available online: http://www.praat.org/ (accessed on 8 July 2024).
  12. Bourdieu, P. (1979). La distinction. Critique sociale du jugement. Minuit.
  13. Broniś, O. (2016). Italian vowel paragoge in loanword adaptation. Phonological analysis of the Roman variety of standard Italian. Italian Journal of Linguistics, 28(2), 25–68.
  14. Calamai, S. (2011). Per una storia della pronuncia degli italiani: Opinioni e atteggiamenti intorno alla pronuncia fiorentina. In A. Nesi, M. S. Scotti, & N. Maraschio (Eds.), Storia della lingua italiana e Storia dell’Italia Unita. L’italiano e lo stato nazionale (pp. 175–184). Cesati.
  15. Calamai, S. (2017). Tuscan between standard and vernacular: A sociophonetic perspective. In M. Cerruti, C. Crocco, & S. Marzo (Eds.), Towards a new standard (pp. 213–241). De Gruyter.
  16. Callender, C. (2010). Trubetzkoy, autosegmental phonology and the segmental status of geminates. In M. Procházka, M. Malá, & P. Šaldová (Eds.), The Prague school and theories of structure (pp. 45–60). V&R Unipress GmbH.
  17. Canalis, S. (2018). The status of Italian glides in the syllable. In R. Petrosino, P. Cerrone, & H. Van Der Hulst (Eds.), From sounds to structures (pp. 3–29). De Gruyter.
  18. Celata, C., & Kaeppeli, B. (2003). Affricazione e rafforzamento in italiano. Alcuni dati sperimentali. Quaderni del laboratorio di linguistica della Scuola Normale Superiore, 4, 43–59.
  19. Celata, C., Meluzzi, C., & Bertini, C. (2022). Acoustic and kinematic correlates of heterosyllabicity in different phonological contexts. Language and Speech, 65(3), 755–780.
  20. Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLoS Computational Biology, 5(7), e1000436.
  21. Chierchia, G. (1986). Length, syllabification, and the phonological cycle in Italian. Journal of Italian Linguistics, 8, 5–34.
  22. Clements, N. (1985). The geometry of phonological features. Phonology Yearbook, 2, 225–252.
  23. Contini, G. (1960). Per un’interpretazione strutturale della cosiddetta ‘gorgia toscana’. Boletim de Filologia, 19, 269–281.
  24. Cravens, T. D. (1984). Intervocalic consonant weakening in a phonetic-based strength phonology: Foleyan hierarchies and the Gorgia Toscana. Theoretical Linguistics, 11(3), 269–310.
  25. Cravens, T. D. (2000). Sociolinguistic subversion of a phonological hierarchy. WORD, 51(1), 1–19.
  26. Cravens, T. D., & Giannelli, L. (1995). Relative salience of gender and class in a situation of multiple competing norms. Language Variation and Change, 7(2), 261–285.
  27. Crocco, C. (2017). Everyone has an accent. Standard Italian and regional pronunciation. In M. Cerruti, C. Crocco, & S. Marzo (Eds.), Towards a new standard: Theoretical and empirical studies on the restandardization of Italian (pp. 89–117). De Gruyter Mouton.
  28. D’Agostino, M., & Paternostro, G. (2018). Speaker variables and their relation to language change. In W. Ayres-Bennett, & J. Carruthers (Eds.), Manual of Romance sociolinguistics (pp. 197–216). De Gruyter.
  29. De Pascale, S., & Marzo, S. (2016). Gli italiani regionali. Atteggiamenti linguistici verso le varietà geografiche dell’italiano. Incontri. Rivista Europea Di Studi Italiani, 31(1), 61–76.
  30. De Pascale, S., Marzo, S., & Speelman, D. (2017). Evaluating regional variation in Italian: Towards a change in standard language ideology? In M. Cerruti, C. Crocco, & S. Marzo (Eds.), Towards a new standard (pp. 118–142). De Gruyter.
  31. D’Imperio, M., & Rosenthall, S. (1999). Phonetics and phonology of main stress in Italian. Phonology, 16(1), 1–28.
  32. Ennever, T., Meakins, F., & Round, E. R. (2017). A replicable acoustic measure of lenition and the nature of variability in Gurindji Stops. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1), 20.
  33. Farnetani, E., & Kori, S. (1986). Effects of syllable and word structure on segmental durations in spoken Italian. Speech Communication, 5(1), 17–34.
  34. Fava, E., & Magno Caldognetto, E. (1976). Studio sperimentale delle caratteristiche elettroacustiche delle vocali toniche e atone in bisillabi italiani. In R. Simone, U. Vignuzzi, & G. Ruggiero (Eds.), Studi di fonetica e fonologia, Atti del IX convegno della S.L.I. (pp. 35–79). Bulzoni.
  35. Galli de’ Paratesi, N. (1984). Lingua Toscana in bocca ambrosiana: Tendenze verso l’italiano standard: Un’inchiesta sociolinguistica. Il Mulino.
  36. Giacomelli, R. (1934). Controllo fonetico per diciassette punti dell’AIS nell’Emilia, nelle Marche, in Toscana, nell’Umbria e nel Lazio. Archivum Romanicum, 18, 153–211.
  37. Giannelli, L. (2000). Toscana (2nd ed.). Pacini.
  38. Giannelli, L., & Cravens, T. D. (1997). Consonantal weakening. In M. Maiden, & M. Parry (Eds.), The dialects of Italy (pp. 32–40). Routledge.
  39. Giannelli, L., & Savoia, L. M. (1978). L’indebolimento consonantico in Toscana I. Rivista Italiana di Dialettologia, 2, 25–58.
  40. Giannelli, L., & Savoia, L. M. (1980). L’indebolimento consonantico in Toscana II. Rivista Italiana di Dialettologia, 4, 39–101.
  41. Goldsmith, J. A. (1976). Autosegmental phonology. Massachusetts Institute of Technology.
  42. Goldsmith, J. A. (1990). Autosegmental and metrical phonology. Basil Blackwell.
  43. Goldstein, L. (2003). Emergence of discrete gestures. In M.-J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th international congress of phonetic sciences, Barcelona, Spain (pp. 85–88). Casual Productions.
  44. Harris, J. (1994). English sound structure. Basil Blackwell.
  45. Harris, J., & Lindsey, G. (1995). The elements of phonological representation. In J. Durand, & F. Katamba (Eds.), Frontiers of phonology (pp. 34–79). Longman.
  46. Hay, J., & Foulkes, P. (2016). The evolution of medial /t/ over real and remembered time. Language, 92(2), 298–330.
  47. Honeybone, P. (2001). Lenition inhibition in Liverpool English. English Language and Linguistics, 5(2), 213–249.
  48. Hualde, J. I., & Nadeu, M. (2012). Lenition and phonemic overlap in Rome Italian. Phonetica, 68(4), 215–242.
  49. Hudson, R. A. (1996). Sociolinguistics. Cambridge University Press.
  50. Jakobson, R., Fant, C. G. M., & Halle, M. (1952). Preliminaries to speech analysis: The distinctive features and their correlates. MIT Press.
  51. Katz, J., & Pitzanti, G. (2019). The phonetics and phonology of lenition: A Campidanese Sardinian case study. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 10(1), 1–40.
  52. Kaye, J., Lowenstamm, J., & Vergnaud, J.-R. (1985). The internal structure of phonological segments: A theory of charm and government. Phonology Yearbook, 2, 305–328.
  53. Kaye, J., Lowenstamm, J., & Vergnaud, J.-R. (1990). Constituent structure and government in phonology. Phonology, 7(1), 193–231.
  54. Kingston, J. (2008). Lenition. In L. Colantoni, & J. Steele (Eds.), Proceedings of the 3rd conference on laboratory approaches to Spanish phonology (pp. 1–31). Cascadilla Proceedings Project.
  55. Kirchner, R. (2004). Consonant lenition. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically based phonology (1st ed., pp. 313–345). Cambridge University Press.
  56. Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26.
  57. Labov, W. (2001). Principles of linguistic change. Vol. 2: Social factors. Blackwell.
  58. Loporcaro, M. (2005). La sillabazione di muta cum liquida dal latino al romanzo. In S. Kiss, L. Mondin, & G. Salvi (Eds.), Latin et langues romanes (pp. 419–430). De Gruyter.
  59. Lowenstamm, J. (1996). CV as the only syllable type. In J. Durand, & B. Laks (Eds.), Current trends in phonology, models and methods (pp. 419–443). European Studies Research Institute.
  60. Lowenstamm, J. (1999). The beginning of the word. In J. Rennison, & K. Kühnhammer (Eds.), Phonologica 1996: Syllables!? (pp. 153–166). Holland Academic Graphics.
  61. Mairano, P., Nodari, R., Ardolino, F., De Iacovo, V., & Mereu, D. (2025). Inherently Long Consonants in Contemporary Italian Varieties: Regional Variation and Orthographic Effects. Languages, 10(6), 118.
  62. Marotta, G. (1985). Modelli e misure ritmiche. La durata vocalica in italiano. Zanichelli.
  63. Marotta, G. (1988). The Italian diphthongs and the autosegmental framework. In P. M. Bertinetto, & M. Loporcaro (Eds.), Certamen phonologicum: Papers from the 1987 Cortona phonology meeting (pp. 399–430). Rosenberg and Sellier.
  64. Marotta, G. (1995). Apocope nel parlato di Toscana. Studi italiani di linguistica teorica e applicata, XXIV/2, 297–322.
  65. Marotta, G. (2001). Non solo spiranti. La ‘gorgia toscana’ nel parlato di Pisa. L’Italia Dialettale, 62, 27–60.
  66. Marotta, G. (2003). Una rivisitazione acustica della ‘gorgia’ toscana. In F. A. Leoni, F. Cutugno, M. Pettorino, & R. Savy (Eds.), Atti del convegno nazionale Il parlato italiano. Napoli (13–15 Febbraio 2003). D’Auria.
  67. Marotta, G. (2008). Lenition in Tuscan Italian (gorgia toscana). In J. B. De Carvalho, T. Scheer, & P. Ségéral (Eds.), Lenition and fortition (pp. 235–272). Mouton de Gruyter.
  68. Marotta, G. (2014). New parameters for the sociophonetic indexes: Evidence from the Tuscan varieties of Italian. In C. Celata, & S. Calamai (Eds.), Studies in language variation (pp. 137–168). John Benjamins Publishing Company.
  69. Marotta, G., Cossu, P., & Avano, G. (2023). Voiceless stops in syllable coda in Tuscan Italian. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th international congress of phonetic sciences (pp. 1995–1999). Guarant International.
  70. Marotta, G., & Vanelli, L. (2021). Fonologia e prosodia dell’italiano. Carocci editore.
  71. Martinez-Gil, F. (2019). Spirantization and the phonology of Spanish voiced obstruents. In S. Colina, & F. Martinez-Gil (Eds.), The Routledge handbook of Spanish phonology (pp. 34–83). Routledge.
  72. McCrary, K. (2004). Reassessing the role of the syllable in Italian phonology: An experimental study of consonant cluster syllabification, definite article allomorphy and segment duration [Doctoral dissertation, University of California].
  73. Morandini, D. (2007). The phonology of loanwords into Italian [Master’s thesis, University College London].
  74. Pacini, B. (1998). Il processo di cambiamento dell’indebolimento consonantico a Cortona: Studio sociolinguistico. RID: Rivista Italiana di Dialettologia, 22, 15–57.
  75. Pacini, B. (2010, December 14–15). Spirantizzazione fiorentina: Da ‘covert prestige’ a ‘overt prestige’? Poster Presented at the Workshop Sociophonetics, at the Crossroads of Speech Variation, Processing and Communication, Pisa, Italy.
  76. Parrell, B. (2010). Articulation from acoustics: Estimating constriction degree from the acoustic signal. The Journal of the Acoustical Society of America, 128(4), 2289.
  77. Petrova, O., Plapp, R., Ringen, C., & Szenyörgyi, S. (2006). Voice and Aspiration: Evidence from Russian, Hungarian, German, Swedish, and Turkish. The Linguistic Review, 23, 1–35.
  78. Pfiffner, A., & Martinez-Garcia, J. (2023). Spirantization of word final plosives in Standard Dutch. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th international congress of phonetic sciences (pp. 877–881). Guarant International.
  79. Phillips, B. S. (1984). Word frequency and the actuation of sound change. Language, 60(2), 320.
  80. Piccardi, D. (2017). Sociophonetic factors of speakers’ sex differences in Voice Onset Time: A Florentine case study. In C. Bertini, C. Celata, G. Lenoci, C. Meluzzi, & I. Ricci (Eds.), Fattori sociali e biologici nella variazione fonetica (pp. 83–106). Officinaventuno.
  81. Piccardi, D., & Ardolino, F. (2021). Gaming variables in linguistic research. Italian scale validation and a Minecraft pilot study. In C. Bernardasci, D. Dipino, D. Garassino, S. Negrinelli, E. Pellegrino, & S. Schmid (Eds.), L’individualità del parlante nelle scienze fonetiche: Applicazioni tecnologiche e forensi (pp. 299–324). Studi AISV 8. Officinaventuno.
  82. Pierrehumbert, J. B. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In J. L. Bybee, & P. J. Hopper (Eds.), Typological studies in language (pp. 137–157). John Benjamins Publishing Company.
  83. Podesva, R. J., & Kajino, S. (2014). Sociophonetics, gender, and sexuality. In S. Ehrlich, M. Meyerhoff, & J. Holmes (Eds.), The Handbook of language, gender, and sexuality (pp. 103–122). Wiley.
  84. Pömp, J., & Draxler, C. (2017, September 28–29). OCTRA—A configurable browser-based editor for orthographic transcription (2017). Phonetik und Phonologie im deutschsprachigen Raum (pp. 145–148), Berlin, Germany.
  85. R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 8 July 2024).
  86. Repetti, L. (1993). L’accento dei prestiti recenti in italiano. Quaderni Patavini di Linguistica, 12, 79–87.
  87. Repetti, L. (2012). Consonant-final loanwords and epenthetic vowels in Italian. Catalan Journal of Linguistics, 11, 167–188.
  88. Russo, M. (2022). Locality domains on lenition. Spirantization (Gorgia) and voicing in Tuscan Dialects. Linx, 84, 1–91.
  89. Scheer, T. (2004). A lateral theory of phonology. Mouton de Gruyter.
  90. Scheer, T., & Ziková, M. (2010). The coda mirror V2. Acta Linguistica Hungarica, 57(4), 411–431.
  91. Schiel, F. (1999, August 1–7). Automatic phonetic transcription of non-prompted speech. ICPhS 1999 (pp. 607–610), San Francisco, CA, USA.
  92. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
  93. Ségéral, P., & Scheer, T. (2001). La coda-miroir. Bulletin de La Société de Linguistique de Paris, 96, 107–152.
  94. Ségéral, P., & Scheer, T. (2008). The coda mirror, stress and positional parameters. In J. B. De Carvalho, T. Scheer, & P. Ségéral (Eds.), Lenition and fortition (pp. 483–518). Mouton de Gruyter.
  95. Sorianello, P. (2001). Un’analisi acustica della ‘gorgia’ fiorentina. L’Italia Dialettale, 62, 61–94.
  96. Sorianello, P. (2003). Spectral characteristics of voiceless fricative consonants in Florentine Italian. In M.-J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th international congress of phonetic sciences, Barcelona, Spain (pp. 3081–3084). Casual Productions.
  97. Sorianello, P. (2004). Proprietà spettrali del rumore di frizione nel consonantismo fiorentino. In F. Albano Leoni, F. Cutugno, M. Pettorino, & R. Savy (Eds.), Il parlato italiano (CD-ROM). D’Auria.
  98. Sorianello, P. (2010). Gorgia toscana. In Enciclopedia dell’Italiano. Treccani.
  99. Sorianello, P., Bertinetto, P. M., & Agonigi, M. (2005). Alle sorgenti della variabilità della ‘gorgia fiorentina’: Un approccio analogico. In P. Cosi (Ed.), Convegno nazionale Dell’AISV—Associazione Italiana di scienze della voce, misura dei parametri. Aspetti tecnologici ed implicazioni nei modelli linguistici (pp. 327–362). EDK Editore.
  100. Ulfsbjorninn, S. (2017). Bogus clusters and lenition in Tuscan Italian: Implications for the theory of sonority. In G. Lindsey, & A. Nevins (Eds.), Language faculty and beyond (pp. 278–296). John Benjamins Publishing Company.
  101. Urban, G. (1986). Linguistic consciousness and allophonic variation: A semiotic perspective. Semiotica, 61(1–2), 33–59.
  102. Vennemann, T. (1988). Preference laws for syllable structure and the explanation of sound change. Mouton de Gruyter.
  103. Vennemann, T. (2012). Structural complexity of consonant clusters: A phonologist’s view. In P. Hoole, L. Bombien, M. Pouplier, C. Mooshammer, & B. Kühnert (Eds.), Consonant clusters and structural complexity (pp. 11–32). De Gruyter.
  104. Villafaña Dalcher, C. (2008). Consonant weakening in Florentine Italian: A cross-disciplinary approach to gradient and variable sound change. Language Variation and Change, 20(2), 275–316.
  105. Vogel, I. (1982). La sillaba come unità fonologica. Zanichelli.
  106. Wickham, H. (2016). ggplot2: Elegant graphics for data analysis (2nd ed.). Springer.
  107. Yaqoub, L., Hellmuth, S., & Bailey, G. (2023). The voiceless velar stop /k/ in Rijal Alma Arabic: Revisited. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th international congress of phonetic sciences (pp. 3686–3690). Guarant International.
Figure 1. Lateral forces on consonants in a branching onset /pr/ in capra “goat”.
Figure 2. Lateral forces on consonants of a heterosyllabic cluster /n.t/ in dentro “inside”.
Figure 3. Soundwave and spectrogram of /ˈtaksi/ realized as [ˈtaksi]; male speaker, 26 years old.
Figure 4. Soundwave and spectrogram of /ˈritmo/ realized as [ˈriθmo]; male speaker, 54 years old.
Figure 5. Soundwave and spectrogram of /ˈdito/ realized as [ˈdiːðo]; female speaker, 51 years old.
Figure 6. Soundwave and spectrogram of /ˈkaktus/ realized as [ˈkatːus]; female speaker, 26 years old.
Figure 7. Soundwave and spectrogram of /ˈtaksi/ realized as [ˈtaksi]; male speaker, 26 years old.
Figure 8. Soundwave and spectrogram of /apˈnɛa/ realized as [apəˈnɛa]; male speaker, 26 years old.
Figure 9. Acoustic parameters (x = IntDiff; y = NDur) extracted from the analyzed voiceless stop segments from the reading corpus, in the intervocalic (a) and coda context (b); colors indicate the allophonic outputs.
Figure 10. Representation of a governed stop /k/ in internuclear position and a licensed word-initial stop, in baco “worm”, target of GT.
Figure 11. Representation of an ungoverned and unlicensed stop /k/ in coda, and a licensed and ungoverned stop /t/ in onset position, in cactus “cactus”, untargeted by GT.
Table 1. Sociolinguistic grouping of this study’s participants.

| Age | Sex | Level of Education |
|---|---|---|
| Young adults (22–32 y.o.), N = 20 | Males, N = 10 | Graduated, N = 5; Non-graduated, N = 5 |
| | Females, N = 10 | Graduated, N = 5; Non-graduated, N = 5 |
| Older adults (45–65 y.o.), N = 22 | Males, N = 10 | Graduated, N = 5; Non-graduated, N = 5 |
| | Females, N = 12 | Graduated, N = 5; Non-graduated, N = 7 |
Table 2. Sentences containing the target words with postvocalic voiceless stops in coda and onset positions included in the present study.

| Position | Phoneme | Token | Sentence | Translation |
|---|---|---|---|---|
| Coda (V_C) | /p/ | /ˈkapsule/ “capsules” | Ho comprato nuove capsule biodegradabili | “I bought new biodegradable capsules” |
| | | /kapˈtato/ “intercepted” | L’antenna ha captato un segnale radio | “The aerial intercepted a radio signal” |
| | | /apˈnɛa/ “freediving” | Segue un corso di apnea subacquea | “He/she is taking a freediving course” |
| | | /ipˈnɔsi/ “hypnosis” | Segue una tecnica di ipnosi terapeutica | “He/she is following a therapeutic hypnosis technique” |
| | /t/ | /ˈritmo/ “rhythm” | Ballavano a ritmo di musica | “They were dancing to the rhythm of music” |
| | | /atmosˈfɛra/ “atmosphere” | C’è un’atmosfera strana | “There is a strange atmosphere” |
| | /k/ | /ˈtɛknika/ “technique” | Segue una tecnica di ipnosi terapeutica | “He/She follows a therapeutic hypnosis technique” |
| | | /ˈtaksi/ “taxi” | Ho preso un taxi alla stazione | “I took a taxi from the station” |
| | | /ˈkaktus/ “cactus” | È caduto su un cactus spinoso | “He fell onto a thorny cactus” |
| Onset (V_V) | /p/ | /ˈkrɛpa/ “crack” | Ho ricoperto la crepa grossa | “I covered the large crack” |
| | | /ˈrapa/ “turnip” | Oggi c’è solo rapa bollita | “Today there is only boiled turnip” |
| | | /ˈtipo/ “guy” | Luca è un tipo a posto | “Luca is an upright guy” |
| | /t/ | /ˈʧɛto/ “class” | Fa parte del ceto medio | “He belongs to the middle class” |
| | | /ˈdito/ “finger” | Ho picchiato il dito mignolo | “I hit my pinky finger” |
| | | /ˈfɔto/ “picture” | Ho ancora quella foto ricordo | “I still have that photo as a souvenir” |
| | /k/ | /ˈbuka/ “hole” | Hanno scavato una buca profonda | “They dug a deep hole” |
| | | /ˈkwɔko/ “cook” | Giuseppe è un cuoco notevole | “Giuseppe is a remarkable cook” |
| | | /ˈfiki/ “figs” | Fanno le nozze coi fichi secchi | “They are celebrating their wedding with dried figs” (idiom) |
Table 3. Distribution of the outputs of 611 voiceless stops from the reading corpus, distinguished according to the syllabic position and the place of articulation.

| | Stop | Fricative | Approximant | Deletion | Assimilated | Total |
|---|---|---|---|---|---|---|
| Coda (V_C) | 207 (88.8%) | 15 (6.4%) | 0 | 0 | 11 (4.7%) | 233 (100.0%) |
| /p/ | 57 (95.0%) | 1 (1.7%) | | | 2 (3.3%) | 60 (100.0%) |
| /t/ | 75 (90.4%) | 8 (9.6%) | | | | 83 (100.0%) |
| /k/ | 75 (83.3%) | 6 (6.7%) | | | 9 (10.0%) | 90 (100.0%) |
| Onset (V_V) | 36 (9.5%) | 232 (61.4%) | 101 (26.7%) | 9 (2.4%) | n/a | 378 (100.0%) |
| /p/ | 19 (15.1%) | 94 (74.6%) | 13 (10.3%) | 0 | | 126 (100.0%) |
| /t/ | 12 (9.5%) | 99 (78.6%) | 15 (11.9%) | 0 | | 126 (100.0%) |
| /k/ | 5 (4.0%) | 39 (31.0%) | 73 (57.9%) | 9 (7.1%) | | 126 (100.0%) |
Table 4. Absolute and relative frequency of GT application on voiceless stops in coda and onset positions (reading corpus), and their correspondence with the allophonic classification.

| Position | GT application | Allophones |
|---|---|---|
| Coda (V_C) | Non-continuant: 218 (93.6%) | Stop: 207 (88.8%); Assimilation: 11 (4.7%) |
| | Continuant: 15 (6.4%) | Fricative: 15 (6.4%); Approximant: 0; Deletion: 0 |
| Onset (V_V) | Non-continuant: 36 (9.5%) | Stop: 36 (9.5%); Assimilation: n/a |
| | Continuant: 342 (90.5%) | Fricative: 232 (61.4%); Approximant: 101 (26.7%); Deletion: 9 (2.4%) |
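
The grouping of allophones into GT application categories can be sketched as follows. This is a minimal illustration, not the authors' script: the dictionary and function names are hypothetical, the counts are the position totals reported in Table 3, and the assignment of stops and assimilated outputs to the non-continuant class (with fricatives, approximants and deletions as continuants) follows the layout of Table 4.

```python
# Allophone counts per position, copied from the totals in Table 3.
CODA = {"stop": 207, "assimilated": 11, "fricative": 15, "approximant": 0, "deletion": 0}
ONSET = {"stop": 36, "fricative": 232, "approximant": 101, "deletion": 9}

# Outputs counted as non-application of GT (assumption based on Table 4's layout).
NON_CONTINUANT = {"stop", "assimilated"}


def gt_application(counts: dict[str, int]) -> dict[str, tuple[int, float]]:
    """Collapse allophone counts into non-continuant vs. continuant,
    returning (count, percentage rounded to one decimal) for each class."""
    total = sum(counts.values())
    non_cont = sum(n for label, n in counts.items() if label in NON_CONTINUANT)
    cont = total - non_cont
    return {
        "non_continuant": (non_cont, round(100 * non_cont / total, 1)),
        "continuant": (cont, round(100 * cont / total, 1)),
    }


for name, counts in (("coda", CODA), ("onset", ONSET)):
    print(name, gt_application(counts))
```

Recomputing the categories this way is also a quick check that the position totals (233 coda tokens, 378 onset tokens) add up to the 611 observations in the reading corpus.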
Table 5. Absolute and relative frequency of GT application on voiceless stops in coda and onset positions (reading corpus), for each level of the sociolinguistic factors considered in our study, i.e., education, sex and age.

| Variable | Level | Onset: Non-Continuant | Onset: Continuant | Onset: Total | Coda: Non-Continuant | Coda: Continuant | Coda: Total |
|---|---|---|---|---|---|---|---|
| Education | graduated | 32 (17.8%) | 148 (82.2%) | 180 (100.0%) | 103 (96.3%) | 4 (3.7%) | 107 (100.0%) |
| | non-graduated | 4 (2.0%) | 194 (98.0%) | 198 (100.0%) | 115 (91.3%) | 11 (8.7%) | 126 (100.0%) |
| Sex | female | 23 (11.6%) | 175 (88.4%) | 198 (100.0%) | 118 (96.7%) | 4 (3.3%) | 122 (100.0%) |
| | male | 13 (7.2%) | 167 (92.8%) | 180 (100.0%) | 100 (90.1%) | 11 (9.9%) | 111 (100.0%) |
| Age | young | 20 (11.1%) | 160 (88.9%) | 180 (100.0%) | 103 (96.3%) | 4 (3.7%) | 107 (100.0%) |
| | adult | 16 (8.1%) | 182 (91.9%) | 198 (100.0%) | 115 (91.3%) | 11 (8.7%) | 126 (100.0%) |
Table 6. Summary of the Generalized Mixed Model with GT application as the dependent variable (1 = continuants, 0 = non-continuants). Observations = 611. N speaker = 42, N target_word = 18. Random factors: speaker (variance: 2.97; std. dev.: 1.72), target word (variance: 0.16; std. dev.: 0.40). Marginal R²/Conditional R² = 0.71/0.85. Reference level (intercept): position = coda, phoneme = /k/, level of education = graduated.

| Predictors | Estimate | Odds Ratios | SE | z Value | p |
|---|---|---|---|---|---|
| (Intercept) | −4.51 | 0.01 | 0.73 | −6.17 | <0.001 |
| position [onset] | 7.98 | 2924.63 | 0.82 | 9.75 | <0.001 |
| level of education [non-graduated] | 2.04 | 7.67 | 0.69 | 2.93 | 0.003 |
| phoneme [p] | −1.78 | 0.17 | 0.58 | −3.06 | 0.002 |
| phoneme [t] | −0.45 | 0.64 | 0.52 | −0.88 | 0.380 |
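
To make the log-odds estimates in Table 6 easier to read, the sketch below converts them to odds ratios and to predicted probabilities of GT application. It is a hedged illustration only: the names are hypothetical, the coefficients are the rounded values printed in the table (so the odds ratios can differ slightly from the published ones, which were presumably computed from unrounded estimates), and the predictions set both random effects to zero, i.e., they describe a typical speaker and a typical target word.

```python
import math

# Fixed-effect estimates (log-odds) as printed in Table 6.
# Reference level: coda position, /k/, graduated speakers.
COEF = {
    "intercept": -4.51,
    "position_onset": 7.98,
    "education_non_graduated": 2.04,
    "phoneme_p": -1.78,
    "phoneme_t": -0.45,
}


def odds_ratio(estimate: float) -> float:
    """Odds ratio corresponding to a log-odds estimate."""
    return math.exp(estimate)


def predicted_probability(onset: bool = False,
                          non_graduated: bool = False,
                          phoneme: str = "k") -> float:
    """Probability of a continuant output, with random effects set to zero."""
    if phoneme not in ("p", "t", "k"):
        raise ValueError("phoneme must be 'p', 't' or 'k'")
    eta = COEF["intercept"]
    if onset:
        eta += COEF["position_onset"]
    if non_graduated:
        eta += COEF["education_non_graduated"]
    if phoneme != "k":
        eta += COEF[f"phoneme_{phoneme}"]
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


print(f"coda /k/, graduated:  {predicted_probability():.3f}")
print(f"onset /k/, graduated: {predicted_probability(onset=True):.3f}")
```

Under these assumptions the contrast between the two positions is what drives the very large odds ratio for position [onset]: the predicted probability of lenition is close to zero in coda and close to one in onset.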
Table 7. Allophonic classification for voiceless stops in coda position from the reading corpus organized by the following segment (manner of articulation).

| Following segment | Stop: Released | Stop: Unreleased | Stop: + Epenthesis | Assimilated | Fricative | Total |
|---|---|---|---|---|---|---|
| V_stop (e.g., /kaktus/) | 60 (71.4%) | 12 (14.3%) | 1 (1.2%) | 11 (13.1%) | 0 (0.0%) | 84 (100.0%) |
| V_s (e.g., /taksi/) | 28 (58.3%) | 13 (27.1%) | 0 | 0 | 7 (14.6%) | 48 (100.0%) |
| V_N (e.g., /ritmo/) | 75 (74.3%) | 11 (10.9%) | 7 (6.9%) | 0 | 8 (7.9%) | 101 (100.0%) |
| Total | 163 (70.0%) | 36 (15.5%) | 8 (3.4%) | 11 (4.7%) | 15 (6.4%) | 233 (100.0%) |
Table 8. Mean and standard deviation of phone duration, normalized duration (NDur) and intensity difference within the preceding vowel (IntDiff). The data, from the reading corpus, are distinguished by context and output.

| | Phone Duration (ms) | NDur | IntDiff (dB) |
|---|---|---|---|
| Coda (V_C) (n = 137) | 100 (36) | 1.13 (0.55) | 35 (5) |
| Assimilated (n = 9) | 156 (37) | 1.46 (0.50) | 42 (5) |
| Stop (n = 118; see Note 5) | 98 (32) | 1.14 (0.55) | 35 (5) |
| Fricative (n = 10) | 74 (30) | 0.75 (0.33) | 31 (6) |
| Onset (V_V) (n = 378) | 63 (22) | 0.90 (0.44) | 19 (8) |
| Stop (n = 36) | 77 (10) | 1.29 (0.42) | 27 (6) |
| Fricative (n = 232) | 73 (14) | 0.97 (0.36) | 22 (6) |
| Approximant (n = 101) | 42 (17) | 0.55 (0.26) | 12 (6) |
| Deletion (n = 9) | 0 | 0 | 0 |
Table 9. Linear mixed model with IntDiff as a dependent variable. Reference level (intercept): position = coda. Observations = 515, N speaker = 42, N target_word = 14. Marginal R²/Conditional R² = 0.47/0.74. Random factors: speaker (variance: 10.98; std. dev.: 3.31); target word (variance: 21.24; std. dev.: 4.60).

| Predictors | Estimates | Std. Error | df | t Value | p |
|---|---|---|---|---|---|
| (Intercept) | 35.57 | 2.24 | 15.21 | 15.88 | <0.001 |
| Syllabic position [onset] | −17.01 | 2.68 | 13.06 | −6.34 | <0.001 |
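
As a reading aid for Table 9, the fixed effects can be turned into predicted IntDiff values per position. The snippet is a small sketch under stated assumptions: the constant and function names are hypothetical, the estimates are the rounded values printed in the table, and random effects for speaker and target word are set to zero.

```python
# Fixed-effect estimates in dB, as printed in Table 9 (reference level: coda).
INTERCEPT_DB = 35.57
ONSET_EFFECT_DB = -17.01


def predicted_intdiff(position: str) -> float:
    """Model-predicted IntDiff (dB) for 'coda' or 'onset', random effects at zero."""
    if position not in ("coda", "onset"):
        raise ValueError("position must be 'coda' or 'onset'")
    return INTERCEPT_DB + (ONSET_EFFECT_DB if position == "onset" else 0.0)


for pos in ("coda", "onset"):
    print(f"{pos}: {predicted_intdiff(pos):.2f} dB")
```

The two predictions (about 35.6 dB in coda and about 18.6 dB in onset) sit close to the raw means of 35 dB and 19 dB reported in Table 8.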
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Avano, G.; Cossu, P. The Importance of Being Onset: Tuscan Lenition and Stops in Coda Position. Languages 2025, 10, 129. https://doi.org/10.3390/languages10060129

AMA Style

Avano G, Cossu P. The Importance of Being Onset: Tuscan Lenition and Stops in Coda Position. Languages. 2025; 10(6):129. https://doi.org/10.3390/languages10060129

Chicago/Turabian Style

Avano, Giuditta, and Piero Cossu. 2025. "The Importance of Being Onset: Tuscan Lenition and Stops in Coda Position" Languages 10, no. 6: 129. https://doi.org/10.3390/languages10060129

APA Style

Avano, G., & Cossu, P. (2025). The Importance of Being Onset: Tuscan Lenition and Stops in Coda Position. Languages, 10(6), 129. https://doi.org/10.3390/languages10060129
