Since the publication of Hinton et al. (1996), studies of sound symbolism have been thriving. Though stigmatized by 19th-century scholars because of its association with simple ideas about language origin, sound symbolism is now a legitimate subject for empirical investigation by linguists and cognitive scientists willing to take on the Saussurian ideal of an all-encompassing arbitrariness underlying human language. Research on sound symbolism now commonly acknowledges that language is characterized by both arbitrary and motivated, iconic characteristics (Cuskley 2013; Dingemanse et al. 2016; Nuckolls 1999). Although the term ‘arbitrariness’ was first used by Saussure to characterize the bond between a word and its concept, or, more specifically, between a signifier and a signified (Saussure 2016, pp. 22–23), much research on sound symbolism has attempted to identify possible correspondences between varieties of sounds and meanings.
Hinton, Nichols, and Ohala classify sound symbolism into four main categories (Hinton et al. 1996, pp. 2–6). ‘Corporeal’ sound symbolism, the first type considered, consists of intonational indications of bodily and emotional states, involuntary reflexes, such as coughs, interjections (what have been called ‘unorthodox oral expressions’ (Ting 2013)), and a variety of unsegmentable but communicative sounds. The second category of sound symbolism, ‘imitative’ sound symbolism, includes words that make use of linguistic sounds to imitate environmental sounds. Such onomatopoeic words, often labelled ideophones (Dingemanse 2011), mimetics (Akita and Tsujimura 2016), or expressives (Diffloth 1972), are often structured in ways that deviate from the normal lexicon (Newman 2001; Nuckolls et al. 2016). A third type of sound symbolism, ‘synesthetic’ sound symbolism, occurs when linguistic sound imitates non-linguistic phenomena, as when, for example, certain vowels, intonational profiles, or repetitive patterns are found to be imitative of size, shape, colors, or rhythm. The fourth main type of sound symbolism, termed ‘conventional’ sound symbolism, is attributed to such phenomena as phonesthemes, which are sounds, or clusters of sounds, such as the gl- cluster in glitter and glow. Because there is no obvious connection between the gl- cluster and the visual meaning of the words it is part of, this type of sound symbolism is considered conventional or arbitrary, rather than iconic and motivated.
Having outlined the major types of sound symbolism and explained their significance for language, we turn now to our research. Our investigation focused on the third type described above, ‘synesthetic’ sound symbolism, and specifically on one well-established variety of it, magnitude sound symbolism. Magnitude sound symbolism occurs when high front vowels, such as [i] and [ɪ], which have high-frequency second formants, are associated with smallness of size and related concepts. Back vowels, such as [o] and [a], by contrast, exhibit magnitude sound symbolism when they are linked with ideas of large size, because their second formants have lower frequencies than those of front vowels. Such communicative tendencies have been synthesized by Ohala into an ambitious theory, which argues that this type of sound symbolism is relevant not only for human language but also for non-human agonistic displays.
Evidence in support of magnitude sound symbolism has also been found in one meticulous, conservative, and comprehensive study. After eliminating many possible confounding factors, including areal contact, genealogical relations, articulatory production costs, and systemic constraints, Blasi et al. found that the high front vowel [i] was 1.58 times more likely to occur in a word for the concept ‘small’ in 78 lineages, representing three out of five macro-areas of world languages.
The modest nature of this claim, however, does not diminish the significance of magnitude sound symbolism, since the lexical inventory of the study was limited to 100 basic vocabulary items that do not represent the range of concepts which may potentially be linked with magnitude. In fact, magnitude sound symbolism has a notable tendency to be ‘stretched’ into a variety of cross-modal relationships. Jurafsky, for example, compared vowels in the names of ice cream flavors and crackers. He discovered a preponderance of front vowels in the names of crackers (Wheat Thins, Triscuit, Ritz), suggestive of smallness and, by implication, desirable qualities of lightness and crispness. By contrast, there was a higher than expected number of back vowels in the names of ice cream flavors (Rocky Road, Jamoca Almond Fudge, Cookie Dough), all of which are suggestive of largeness and, by extension, desirable qualities of heaviness and richness. Coulter and Coulter found that when people mentally rehearsed prices that have numbers containing front (small) vowels, they tended to overestimate the size of the discount that was to be applied. In contrast, they underestimated the discount when the numbers contained back vowels, which are associated with largeness.
The extension of front vowels, which symbolize ‘smallness’, to ‘desirable lightness’ or ‘cheapness’, and of back vowels, which symbolize ‘largeness’, to ‘heaviness’, ‘richness’, and ‘expensiveness’, illustrates just a few of the cross-modal correspondences that have been attested for vowels and consonants. Others are discussed in Dingemanse and Lockwood’s extensive review. What is noteworthy about much of the experimental work on sound symbolism discussed in this review, whether focused on vowels, consonants, or both, is that, in addition to size, the attested correspondences are generally related to static concepts. These include brightness and darkness (Asano and Yokosawa 2011); colors, such as reds, yellows and greens (Moos et al. 2014); shapes, such as spikiness and roundness (Ramachandran and Hubbard 2001; Nielsen and Rendall 2013); and tastes, such as sweet and sour (Simner et al. 2010).
Given the focus of the aforementioned research on links between magnitude sound symbolism and static concepts, the relatively recent line of experimental inquiry into possible links between magnitude sound symbolism and manner of motion is especially welcome. Cuskley (2013) designed an experiment that tested research subjects’ ability to make a connection between speed and sound. Participants heard invented words that varied in terms of reduplication, voicing, and vowel quality. They were then asked to adjust the speed of an animated bouncing ball to match these invented words. The results showed that nonce words containing back vowels were rated as significantly slower than those containing front vowels or those featuring both front and back vowels.
Yet another line of inquiry into magnitude sound symbolism and motion has been followed by Saji et al., whose research catalyzed this work. They conducted an open-ended study, allowing participants to devise their own sound-symbolic utterances, which they were asked to match with actions featured in short video clips. For each clip, participants were first asked to rate the action according to whether it was jerky vs. smooth, large vs. small, heavy vs. light, and energetic vs. non-energetic. They were then asked to invent a word that they felt described the action portrayed. These invented nonce words were constructed to fit the template C1. However, only the first consonant and vowel of the resulting nonce words were coded according to features such as place and manner of articulation, backness, and voicing, because the authors felt the first syllable would be the most influential part of the nonce word. Those features were then mined using a canonical correlation analysis. In this way, features that were correlated with particular actions could be identified, along with the strength of the association between each feature and action type.
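The coding step described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the feature tables `CONSONANTS` and `VOWELS`, the function name `code_first_syllable`, and the output keys are hypothetical and are not the coding scheme used by Saji et al. The sketch shows only how the first consonant and first vowel of a nonce word might be turned into categorical features, with later segments ignored; it does not perform the canonical correlation analysis itself.

```python
# Illustrative feature tables (assumed, not the study's actual coding scheme).
CONSONANTS = {
    "k": {"place": "velar", "manner": "stop", "voiced": False},
    "g": {"place": "velar", "manner": "stop", "voiced": True},
    "n": {"place": "alveolar", "manner": "nasal", "voiced": True},
    "tʃ": {"place": "postalveolar", "manner": "affricate", "voiced": False},
}
VOWELS = {
    "i": {"height": "high", "backness": "front"},
    "u": {"height": "high", "backness": "back"},
    "ɑ": {"height": "low", "backness": "back"},
}


def code_first_syllable(segments):
    """Return feature codes for the first consonant and first vowel only.

    `segments` is an ordered list of phoneme strings, e.g. ["g", "i", "n", "ɑ"].
    Segments after the first consonant and first vowel are ignored.
    """
    first_c = next((s for s in segments if s in CONSONANTS), None)
    first_v = next((s for s in segments if s in VOWELS), None)
    if first_c is None or first_v is None:
        raise ValueError("need at least one known consonant and one known vowel")
    # Prefix the keys so consonant and vowel features stay distinguishable.
    coded = {f"c1_{name}": value for name, value in CONSONANTS[first_c].items()}
    coded.update({f"v1_{name}": value for name, value in VOWELS[first_v].items()})
    return coded


print(code_first_syllable(["g", "i", "n", "ɑ"]))
```

Under this scheme, two nonce words that differ only after the first vowel receive identical codes. That is the limitation addressed by examining non-initial syllables.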
Their results demonstrate that some sound symbolism correlations were made by both Japanese and English speakers. For example, voiced sounds, especially voiced sonorant sounds, such as nasals, were associated with slowness. Other correlations were language-specific. For instance, English speakers associated affricate consonants with heavy actions, in contrast to the Japanese participants, who produced nonce words containing affricates to denote light actions. This example of language-specific sound symbolism is attributed to the different phonological status of affricates in each of the languages. An additional, language-specific difference in sound symbolism was discovered with the high back vowel [u], which symbolized energetic motion for English speakers. The closest equivalent sound in Japanese, however, is an unrounded [ɯ] sound, which symbolized slowness for Japanese participants.
The importance of the study by Saji et al. is that it demonstrated both general and language-specific sound symbolism at work for the same stimuli. Assuming that any study is validated by additional research employing different methodologies, we decided to seek verification of their results for English speakers, using a different set of procedures. In the present paper, we initiated this cross-experimental verification by testing the degree to which English speakers associate light and jerky actions with velars, palatals, glides, and high vowels, and the degree to which they correlate heavy and smooth movements with affricates, glottals, laterals, and non-high vowels. We were also interested in the possibility that non-initial syllables might be sound-symbolically salient.
Our paper offers cross-experimental verification of the results for English language sound symbolism from Saji et al., as well as new findings. We confirmed that magnitude sound symbolism is not just about static qualities of smallness versus largeness. It can be observed, as well, in invented words for actions and processes. This may stem from what Talmy has identified as a human cognitive bias toward dynamism (Talmy 2000, p. 171). The most significant finding of this study, however, is that sounds occurring in non-initial syllables exhibit sound-symbolic effects. This particular discovery challenges claims about the privileged status of initial syllables for sound symbolism judgements. Such claims, which are summarized in Kawahara et al., are based on a concept of psycholinguistic salience, or prominence, for sounds in word-initial syllables.
Our results, by contrast, reveal that, in nonce verbs, some position-specific, sound-symbolic effects can be identified. With respect to the contrast between heavy and light actions, our results verified Saji et al.’s finding that, for English nonce words, the affricate /tʃ/ was sound-symbolic of heavy movements. However, in our study, the heaviness of /tʃ/ was found to be significant mostly in the second syllables of nonce verbs, while in the first syllables its influence was much weaker. Other sounds that symbolized heaviness in second syllables were /n/ and /h/. Heavy movements were also identified when the vowels of initial syllables were /ɑ/ or /aʊ/. Sentences portraying light movements, on the other hand, were more likely to have /i/, /u/, or /aɪ/ as the first vowel, or /j/ and /g/ as the second consonant. Such heavy and light patterning for the vowel sounds generally supports magnitude sound symbolism principles, because front and high vowels are associated with light actions, which can be related to smallness of size, while back vowels are associated with heavy actions, which are related to largeness of size.
For some sounds and meanings, however, position within a word seemed irrelevant. We found, for example, that the consonants /g/ and /k/, and the glide /j/, were associated with jerky actions whether they appeared in the first or second consonant position of the nonce verbs. Moreover, /h/ and /l/ are associated with smooth actions, irrespective of their positions within words. The nasal consonant, /n/, however, is enigmatic, because its sound-symbolic value changes with position. As an initial consonant, it is favored in words with jerky actions, while it is favored in words with smooth actions when it appears as the second consonant.
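One way to picture a position-sensitive analysis of this kind is the following Python sketch, which tallies rating labels separately for each combination of slot and segment. It is a hedged illustration, not the procedure used in our study: the function name `tally_by_position`, the slot names `C1`, `V1`, and `C2`, and the toy items are all assumptions introduced here, and the items are invented purely for demonstration.

```python
from collections import Counter, defaultdict


def tally_by_position(items):
    """Count rating labels for each (slot, segment) pair.

    `items` is an iterable of (word, label) pairs, where `word` maps a slot
    name such as "C1", "V1", or "C2" to a segment, and `label` is a rating
    such as "jerky" or "smooth".
    """
    counts = defaultdict(Counter)
    for word, label in items:
        for slot, segment in word.items():
            counts[(slot, segment)][label] += 1
    return counts


# Toy items, invented purely for illustration (not data from any study).
toy_items = [
    ({"C1": "n", "V1": "i", "C2": "g"}, "jerky"),
    ({"C1": "g", "V1": "ɑ", "C2": "n"}, "smooth"),
    ({"C1": "l", "V1": "u", "C2": "n"}, "smooth"),
]

counts = tally_by_position(toy_items)
print(counts[("C1", "n")])  # labels observed when /n/ is the first consonant
print(counts[("C2", "n")])  # labels observed when /n/ is the second consonant
```

Keying the tally on the slot as well as the segment is what allows a sound such as /n/ to show different associations in different positions. A tally keyed on the segment alone would merge the two patterns.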
The findings of Dingemanse et al. (2016) may be cited to contextualize the significance of our study. They tested Dutch research subjects’ ability to correctly assign meanings to actual words drawn from five different languages. These words were taken from a class of expressions called ideophones, which are, by their nature, sound-symbolically expressive. The ideophones were presented to subjects in resynthesized forms that controlled for the possible iconicity of both segments and prosody. The authors argue that segmental sounds and prosody each make significant contributions to subjects’ ability to correctly assess meaning, and that psycholinguistic research tends to endow segments alone with too much importance. Our discovery that a sound’s position within a word is also significant adds another dimension to considerations of sound symbolism, by pointing to the importance of combinatoric principles, as well as principles that have yet to be identified.