A Narrative Review of Auditory Categorisation and Its Potential Role in Tinnitus Perception

Auditory categorisation is a phenomenon reflecting the non-linear nature of human perceptual spaces which govern sound perception. Categorisation training paradigms may reduce sensitivity toward training stimuli, decreasing the representation of these stimuli in auditory perceptual maps. Reduced cortical representation may have clinical implications for conditions that arise from disturbances in cortical activation, such as tinnitus. This review explores the categorisation of sound, with a particular focus on tinnitus. The potential of categorisation training as a sound-based tinnitus therapy is discussed. A narrative review methodological framework was followed. Four databases (PubMed, Google Scholar, Scopus, and ScienceDirect) were extensively searched for the following key words: categorisation, categorical perception, perceptual magnet effect, generalisation, and categorisation OR categorical perception OR perceptual magnet effect OR generalisation AND sound. Given the exploratory nature of the review and the fact that early works on categorisation are crucial to the understanding and development of auditory categorisation, all study types were selected for the period 1950–2022. Reference lists of articles were reviewed to identify any further relevant studies. The results of the review were catalogued and organised into themes. In total, 112 articles were reviewed in full, from which 59 were found to contain relevant information and were included in the review. Key themes identified included categorical perception of speech stimuli, warping of the auditory perceptual space, categorisation versus discrimination, the presence of categorisation across several modalities, and categorisation as an innate versus learned feature. Although a substantial amount of work focused on evaluating the effects of categorisation training on sound perception, only two studies investigated the effects of categorisation training on tinnitus. 
Implementation of a categorisation-based perceptual training paradigm could serve as a promising means of tinnitus management by reversing the changes in cortical plasticity that are seen in tinnitus, in turn altering the representation of sound within the auditory cortex itself. In the instance that the categorisation training is successful, this would likely mean a decrease in the level of activity within the auditory cortex (and other associated cortical areas found to be hyperactive in tinnitus) as well as a reduction in tinnitus salience.


Introduction
It is well-known that human perceptual systems group stimuli into behaviourally relevant categories [1,2]. Our ability to categorise sounds is governed by the fact that our perceptual spaces are warped in such a way that we are better able to discriminate between-category rather than within-category differences in sounds [3]. This phenomenon has been defined using several terms throughout the literature; namely, categorisation, categorical perception, and generalisation [3][4][5]. Similar to categorisation is the "perceptual magnet effect" which refers to the warping of the perceptual space for speech sounds (specifically some synthetic vowels and semi-vowels). According to Kuhl (1991) [2], who coined the term, what makes the magnet effect distinct from categorisation is that it is characterised by disparities in discriminability for prototypical versus non-prototypical stimuli belonging to the same phonemic category. Discriminability is far greater near non-prototypical members of a category than near prototypical members. The claim that categorisation and the perceptual magnet are different entities has been questioned by other researchers who prefer to think about the two as being one and the same [6][7][8][9][10]. Though the perceptual space appears to be more notably warped for vowels and semi-vowels, the effect is also present for consonants and non-speech stimuli [6][7][8][9][10]. As such, this review will consider categorisation and the perceptual magnet effect as interchangeable terms pertaining to the same phenomenon. Liberman (1957) [11] was one of the first to contemplate the idea of categorisation, proposing two learning processes likely to underpin it: acquired distinctiveness and acquired similarity/equivalence. Acquired distinctiveness describes a rise in perceptual sensitivity for sounds that are constantly categorised differently in a learning situation. 
Conversely, in acquired similarity, sounds that were once distinct from each other become difficult to distinguish following repeated categorisation to the same group [11]. Guenther et al. (1999) [9] expanded on these processes by examining the effect they had when used in auditory perceptual space training. The study investigated the possibility of inducing acquired similarity for category-relevant non-speech sounds using categorisation training and evaluated whether the resulting magnet effect was a consequence of the distribution of the training stimuli, or the type of training used by comparing the results with those obtained by discrimination training. According to Guenther et al. (1999) [9], categorisation and discrimination training paradigms have opposite effects, whereby categorisation training results in a reduction in the ability to distinguish between heavily experienced training sounds, whilst discrimination training leads to an increase in this ability. Furthermore, these effects translate into changes at the level of auditory maps within the brain, with the size of neural representation for the training sounds being determined by the type of training used [9]. Specifically, categorisation of the training sounds leads to a reduction in the number of cells coding these sounds in the auditory map, in turn decreasing the ability of an individual to differentiate sounds in this area of acoustic space, whilst the opposite effect is seen when using discrimination.
The diminished cortical representation that occurs as a result of categorisation training may have clinical implications for conditions that arise from disturbances in cortical activation, such as tinnitus. Tinnitus (ringing of the ears) appears to be the consequence of a series of neuroplastic changes in central auditory pathways that lead to hyperexcitability and an increase in the spontaneous firing rate in these pathways [12][13][14]. Categorisation training might offer a means of tinnitus management by decreasing the area of cortical representation for trained sounds. If successful, categorisation training could be used to reduce activity within the auditory cortex by changing the representation of sound within the cortex itself and in turn reversing the changes in cortical plasticity that are seen in tinnitus, thus reducing tinnitus severity and related distress.
While the notion of perceptual (specifically, categorisation) training is not novel, there are few reviews that offer a concise, up-to-date overview of categorisation within the context of both speech and non-speech sounds, and no reviews known to date that contemplate its potential role in tinnitus perception and management. Given that the cortical changes that occur as a result of categorisation training could oppose or even "reverse" those observed with tinnitus, it is essential to review what is currently known about this form of perceptual training, identify its role in sound perception, and offer insight into the rationale for investigating it as a potential mode of tinnitus therapy. This review will firstly explore the categorisation of sound, providing a general historic overview before expanding on the phenomenon and describing the neural basis believed to underpin it. Secondly, it will discuss its presence across several modalities and evaluate innate vs. passive instances of categorisation. Thirdly, it will bring awareness to the controversies surrounding categorisation and will contemplate its feasibility as a training paradigm. Lastly, this review will consider the clinical implications of categorisation as a tool for tinnitus management, drawing on what is currently known about the cortical effects of categorisation training, as well as the sparse but encouraging evidence emerging from tinnitus studies implementing perceptual training regimes.

Methods
The Green et al. (2006) [15] narrative review methodology was followed in conducting this narrative review. The advantage of using this framework is that it offers a broad and comprehensive overview of a specific topic. Furthermore, it allows various pieces of information to be pooled and ordered to form an understanding of the history and evolution of the topic, and enables speculations to be made based on the scope of current findings [15].
The key words (categorisation, categorical perception, perceptual magnet effect, generalisation, and categorisation OR categorical perception OR perceptual magnet effect OR generalisation AND sound) were extensively searched on four databases: PubMed, Google Scholar, Scopus, and ScienceDirect. Given the exploratory nature of the review and the fact that early works on categorisation are crucial to the understanding and development of auditory categorisation, all study types published in the period 1950–2022 were eligible, with a single exclusion criterion: articles not available in English. Reference lists of articles were reviewed to identify any further relevant studies. The results of the review were catalogued and organised thematically according to common idea threads. The Results and Discussion section is divided into subsections dedicated to each of these threads.
In total, 112 articles were reviewed in full, from which 59 were found to contain relevant information and were included in the review. Each article was read in depth, identifying key information and evaluating the main findings, as well as defining how these findings inform auditory categorisation and its potential role in the perception of tinnitus. Several key themes identified include categorical perception of speech stimuli, warping of the auditory perceptual space, categorisation versus discrimination, the presence of categorisation across several modalities, and categorisation as an innate versus learned feature. Although a substantial amount of work focused on evaluating the effects of categorisation training on sound perception, only two studies investigated the effects of categorisation training on tinnitus.

Results and Discussion
Categorisation: The What, When, and Where

Categorisation and Its Origin
Categorisation became a widely used term following the experiments of Liberman, Harris, Hoffman, and Griffith (1957) [16], who showed that the discrimination of synthetic speech sounds was predominantly governed by the categories to which these sounds were allocated by the individual. Specifically, if the two sounds belonged to different categories, the participants could discriminate between the stimuli quite quickly and with relative ease, whereas if the sounds belonged to the same category, participants found it harder to discriminate between them. Participants tended to display good discriminability for between-category stimuli, and poor discriminability for within-category stimuli, even though the stimulus pairs used in these scenarios were equidistant in frequency space [4,16].
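The contrast between between-category and within-category discriminability can be illustrated with a small numerical sketch. It is not taken from the studies cited above: it assumes a hypothetical logistic labelling function (`p_label_a`, with an arbitrary boundary and slope) and assumes that listeners in an ABX task compare only the covert category labels of the stimuli, guessing when the labels of A and B agree. Under that labelling-only assumption, the expected proportion correct for two stimuli with labelling probabilities p1 and p2 works out to 0.5 × (1 + (p1 − p2)²).

```python
import math

def p_label_a(x, boundary=0.5, slope=20.0):
    """Probability of labelling stimulus x as category A.

    x is a position on a hypothetical 0..1 stimulus continuum; the
    boundary and slope values are illustrative assumptions only.
    """
    return 1.0 / (1.0 + math.exp(slope * (x - boundary)))

def predicted_abx(x1, x2, boundary=0.5, slope=20.0):
    """Expected ABX proportion correct if only covert labels are compared."""
    p1 = p_label_a(x1, boundary, slope)
    p2 = p_label_a(x2, boundary, slope)
    return 0.5 * (1.0 + (p1 - p2) ** 2)

STEP = 0.2  # the same physical distance is used for both stimulus pairs

within = predicted_abx(0.1, 0.1 + STEP)  # both stimuli inside category A
across = predicted_abx(0.4, 0.4 + STEP)  # pair straddles the boundary

print(f"within-category pair:  {within:.3f}")
print(f"between-category pair: {across:.3f}")
```

With these illustrative settings the within-category pair stays close to chance (0.5), whereas the physically equidistant pair that straddles the boundary is discriminated well above chance.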
The idea of "prototypes" and "non-prototypes" was added to the categorisation lexicon. A prototype was considered to be the "best" or "ideal" representative of a category, whilst a non-prototype was regarded as a poor exemplar of the same category [2]. For example, consider the category "bird"; this category is represented by a prototypical member such as "sparrow" or several prototypical properties such as "feathers". Some members will be better than others in reflecting their parent category; although sparrows and ostriches are both birds, sparrows are considered to be better examples of birds as they share more similarities with other birds. As such, it can be said that sparrows are closer approximations of the prototype for birds, as they have more of the crucial features for determining "birdness" [17,18].

Warping of the Auditory Perceptual Space
Ease of discrimination for between-category stimuli was believed to be due to the assimilation of exemplars found near the prototype by the prototype itself. Strictly speaking, good exemplars of a category appear to draw similar members towards themselves comparatively more often than poor exemplars of the same category [18,19]. These observations led to the realisation that the human perceptual system is warped, so that one's ability to discriminate between two stimuli is not linearly related to the physical distance between the stimuli as measured by dimensions such as frequency or time [9,20]. Guenther and Gjaja (1996) [4] proposed a neural model based on auditory map formation that is thought to underlie this nonuniformity (Figure 1).

Figure 1. Neural model proposed by Guenther and Gjaja (1996). The model employs two layers of neurons (formant representation and auditory map) that are connected through adaptive synapses. The adaptive nature of the synapses determines which cells become activated in the auditory map. During early exposure to speech stimuli, the strength of the synapses is altered in such a way that the firing preferences of the neurons in the auditory map change, in turn reflecting the distribution of these sounds. The nonuniformity that comes about as a result of this preferential cell firing produces the magnet effect in this model.
The model is founded on the notion that exposure to a certain language in infancy results in nonuniformities in the distribution of neuronal firing preferences in the auditory neural map, leading to the magnet effect [4]. Specifically, the warping occurs because more cells in the map become tuned to the sounds most experienced by the infant. In other words, it is the distribution of the sounds heard by the infant that influences the firing preferences of neurons, giving rise to the warping observed in the auditory map and leading to the respective reduction in perceptual space close to phonemic category centres, and increase in this space away from centres [4,9]. Performing a series of experiments aimed at defining the effects of categorisation versus discrimination training, Guenther and colleagues (1999) [9] discovered that the sensitivity of their participants for the training stimuli differed depending on the training regime used; namely, categorisation training resulted in a decrease in this sensitivity, whilst discrimination training gave rise to an increase in sensitivity. Based on these findings, the researchers hypothesised that in categorisation training, repeated exposure to the selected training stimuli likely results in a smaller number of cells preferentially coding these sounds in the map, leading to a reduced cortical representation which ultimately weakens the listener's ability to differentiate between the stimuli in that region of acoustic space. Conversely, discrimination training results in more neurons becoming tuned to the stimuli to which the listener was most frequently exposed, leading to an increase in the cortical representation and thus an improved ability to differentiate between the sounds.
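The idea that the distribution of experienced sounds shapes the firing preferences of map cells can be sketched in a few lines of code. The following is a deliberately simplified, one-dimensional stand-in and not the published model of Guenther and Gjaja: the soft competitive learning rule, the Gaussian tuning curve, the two category centres (0.25 and 0.75), and every parameter value are assumptions chosen purely for illustration. It covers only exposure-driven tuning, not the effects of categorisation or discrimination training.

```python
import math
import random

def train_map(n_cells=50, n_inputs=5000, rate=0.05, tuning_width=0.1, seed=0):
    """Return the preferred stimulus value of each map cell after exposure.

    Hypothetical setup: stimuli lie on a 0..1 continuum and cluster around
    two category centres, mimicking the sounds heard most often.
    """
    rng = random.Random(seed)
    # Preferences start evenly spread, i.e. an unwarped map.
    prefs = [i / (n_cells - 1) for i in range(n_cells)]
    centres = (0.25, 0.75)
    for _ in range(n_inputs):
        x = rng.gauss(rng.choice(centres), 0.05)
        for i, p in enumerate(prefs):
            # Cells respond more strongly to sounds near their preference
            # and shift that preference in proportion to their response.
            activation = math.exp(-((x - p) ** 2) / (2 * tuning_width ** 2))
            prefs[i] = p + rate * activation * (x - p)
    return prefs

def count_near(prefs, value, width=0.1):
    """Number of cells whose preference lies within `width` of `value`."""
    return sum(1 for p in prefs if abs(p - value) <= width)

prefs = train_map()
near_centre = count_near(prefs, 0.25)  # around a frequently heard sound
near_gap = count_near(prefs, 0.5)      # between the two categories

print("cells tuned near a category centre:", near_centre)
print("cells tuned between the categories:", near_gap)
```

Before exposure, both regions contain the same number of cells. After exposure, many more cells should be tuned near the category centre than in the region between the categories, which is the kind of nonuniformity the model attributes to the distribution of experienced sounds.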

Neural Changes Resulting from Categorisation Training
To further explore categorisation and its effects on sound representation within the auditory cortex of human adults, Guenther et al. (2004) [1] used functional magnetic resonance imaging (fMRI) and identified the cortical changes that occur as a result of categorisation training. Their study revealed a significantly higher level of cortical activation for non-prototypical stimuli than for prototypical stimuli, particularly within the temporal lobe (Heschl's gyrus and planum temporale areas) (see Figure 2), lending support to the hypothesised neural changes thought to occur as a result of perceptual (categorisation/discrimination) training, as previously proposed by the researchers. Discrimination was believed to be more difficult between sounds located at the centre of a category compared to those found near category boundaries, as there are fewer cells representing these sounds in the auditory cortical areas [1]. Taking the findings of their study in addition to what was already known about categorisation, Guenther et al. (2004) [1] proposed that the brain in fact re-distributes neural resources away from areas of acoustic space where the ability to differentiate between sounds lacks behavioural importance (for example, at the centre of a sound category) and shifts the resources toward areas where precise discrimination is required.

Passive Categorisation: A Consequence of Evolution?
For many years, the nature and origin of categorisation have been debated, with some researchers putting forth evidence for the feature being innate, whilst others have proposed that it is induced by learning [21,22]. The two hypotheses upon which these arguments are built are primarily the "relativist" hypothesis formulated by Whorf (1956) [23], which suggests that language and culture determine how we categorise, and the "universalist" hypothesis, which posits that category boundaries are a feature innate to humans [5,22]. Though initially the Whorf hypothesis was more commonly recognised and adopted amongst researchers, over the years it has been supplanted by universalism [21]. Evidence from psychophysical studies conducted on adults of different cultural backgrounds, studies of categorisation in infancy, and animal studies lends support to the universalist view [4,24–37].

Cross-Cultural Studies
The innate ability for humans to passively categorise sound has been demonstrated by psychophysical and perceptual studies investigating the categorisation of sound in adults of diverse cultural and linguistic backgrounds. Stevens, Liberman, Studdert-Kennedy, and Öhman (1969) [38] explored the discrimination and identification of synthetic vowels by English and Swedish speakers, concluding that vowel perception is not a consequence of experience with linguistic categories, but rather a function of human auditory mechanisms. Building on these results, Guenther and Gjaja (1996) [4] employed their neural model in a simulation-based study investigating the perception of stimuli within or near the American English phonemic categories /r/ and /l/ in Japanese and American adults. The outcomes indicated that American adults, who have been exposed to many instances of /r/ and /l/ in their native language, show perceptual warping around the phonemic categories. Conversely, Japanese adults, who presumably have had less exposure to /r/ and /l/ phonemes as these categories do not have direct correlates in Japanese, do not display this perceptual warping. Their study demonstrated the presence of passive categorisation in humans, suggesting that categorisation and the perceptual magnet effect are a result of neural map formation in the auditory system.

Infancy Studies
Eimas, Siqueland, Jusczyk, and Vigorito (1971) [27] evaluated the extent to which categorisation is an innate feature by examining the perception of voice-onset time in infants between one and four months of age. Using a high-amplitude sucking procedure as a measure of response to sound stimuli, the infants were exposed to three different pairs of sounds of varying voice-onset times: 20 and 40 milliseconds, 0 and 20 milliseconds, and 60 and 80 milliseconds. The 20 and 40 millisecond pair represented stimuli on opposite sides of the category boundary, whilst the remaining two test pairs both fell within the same category. Specifically, the members of the category boundary pair have been found to sound like the syllables BAH and PAH, respectively, to adult speakers of English and other languages, whereas the two within-category pairs were both instances of BAH or PAH. The results of the study indicated that infants, just like adults, perceive differences in voice-onset time categorically, suggesting that categorisation is an innate mechanism tuned to the properties of speech. Expanding on this outcome, Eimas et al. (1971) [27] proposed that the mechanism acts as a precursor for phonemic categories that later in development enables the conversion of speech signals into phonemes that can then be used to form words and meanings. To date, several other studies using the same voicing distinction methods have demonstrated the ability of infants to perceive differences in voice-onset time in a categorical manner, with investigators advocating for an underlying genetic predisposition for speech sound perception and categorisation [39][40][41].
Moving away from the voicing distinction experiments, Kuhl (1979, 1980, 1983) [28,42,43] examined the ability of 6-month-old pre-verbal infants to categorise speech sounds from the same phonetic category without receiving formal training and with a lack of productive skill. Kuhl's studies demonstrated that human infants possess perceptual abilities that enable both the discrimination of phonetically different signals, and the categorisation of those found within the same phonetic category, in turn also supporting the notion of innate mechanisms in speech perception. In concert with these findings, Jusczyk, Rosner, Cutting, Foard, and Smith (1977) [44] reported evidence of passive categorisation for non-speech stimuli in early infancy. Using a high-amplitude sucking technique to determine the perception of rise-time differences for sawtooth stimuli in 2-month-old infants, the investigators concluded that infants, like adults, possess the ability to perceive non-speech sounds in a categorical manner.

Animal Studies
Nonhuman studies serve as another line of evidence for passive categorisation of sound, as they involve species that are capable of sound perception but have no possibility of culture or language. Studies using primates constitute the majority of the work to date, with mixed evidence for and against categorisation. Studies by Sinnott, Beecher, Moody, and Stebbins (1976) [45] and Kuhl (1991) [2] failed to find evidence of categorical perception in Old World and rhesus monkeys, respectively, proposing that the phenomenon is a unique species-specific speech processing mechanism. However, several other experiments conducted on macaques [46,47], rhesus monkeys [48], and owl monkeys [49] have identified the presence of categorical mechanisms underlying sound perception in these animals. Kuhl and Miller (1978) [29] conducted voicing distinction experiments using chinchillas, testing the animals in a categorisation paradigm in which they were trained to respond differently to the endpoints of a synthetic speech continuum (0 ms voice-onset time and +80 ms voice-onset time). The ability of the chinchillas to perceive the stimuli categorically was almost identical to that seen in adult English-speaking listeners, suggesting the mechanism is an innate non-species-specific property. Similar conclusions were drawn from studies undertaken on house mice, which were able to categorically perceive ultrasound vocalisations [50]. Other studies that have demonstrated the presence of passive categorisation in animals include work done on quails [51], wild swamp sparrows [52], songbirds [53], budgerigars [54], rats [55], and gerbils [56,57].

Passive Categorisation in Vision
Though categorisation was first observed for speech sounds, it has since been found that it is not a phenomenon unique to hearing, but one that is present for a variety of stimuli from a number of modalities, perhaps the most important being vision. Numerous human and animal studies have demonstrated the categorical perception of colour [25,26,31,35,36,58], shapes [59,60], and even more complex stimuli such as facial identity [6,61–64] and facial expression [65,66].
Just as in sound, the perception of visual stimuli has been found to be nonuniform. For example, we do not perceive continuous gradations along the visible light spectrum, but rather a range of discrete hues [21,60]. Boynton and Gordon (1965) [26] conducted a study in which they asked participants to identify single wavelengths of colour using solely four basic colour terms (yellow, blue, green, and red), or a combination of two of these hues. The results of their colour-naming experiment revealed that participants could easily describe the wavelengths presented to them using one or two of the four basic colour terms, with a high level of agreement among participants. However, when the participants were allowed to use a broader range of colour terms (e.g., violet or orange), they struggled to define the observed wavelengths, leading to a reduction in agreement.
The findings of the Boynton and Gordon (1965) [26] study serve as a simple yet effective example of categorisation, in that the four primary terms represent a mutually contrastive set that can be used to describe the colour space exhaustively as a result of our discrete categorical perception of wavelengths [21]. Because we deconstruct this spectrum of different wavelengths into a limited number of distinct colour bands, we struggle to distinguish between colours within the same category (for instance, mahogany red and scarlet red), yet have no issue with discriminating between colours that fall on either side of a category boundary (such as green and yellow). Hence, although the spectral continuum itself is homogeneous, its neural representation is not. Rather, it is warped in the same way as described previously with respect to sound; specifically, a larger change in wavelength is required to produce a just-noticeable difference (JND) in some regions in comparison to others [60]. A smaller relative change in wavelength is sufficient to generate a JND for colours that sit on category boundaries, whilst the opposite is true for those found within a colour category. The same principle applies to shapes; we find it easier to generalise and group a set of shapes under a familiar umbrella term than to classify them by specific names (e.g., rhombus, rectangle, and trapezium shapes are all classified as "square" shapes) (Figure 3) [59,67].
ally contrastive set that can be used to describe the colour space exhaustively as a resu of our discrete categorical perception of wavelengths [21]. Because we deconstruct th spectrum of different wavelengths into a limited number of distinct colour bands, w struggle to distinguish between colours within the same category (for instance, mahogan red and scarlet red), yet have no issue with discriminating between those colours that fa on the category boundary (such as green and yellow). Hence, instead of the spectral co tinuum being homogenous, the neural representation of it is not. Rather, it is warped the same way as described previously with respect to sound; specifically, a larger chang in wavelength is required to produce a just-noticeable difference (JND) in some region in comparison to others [60]. Furthermore, a smaller relative change in wavelength is su ficient to generate a JND for colours that sit on category boundaries, whilst the opposi is true for those found within a colour category. The same principle applies to shapes; w find it easier to generalise and group a set of shapes under a familiar umbrella term tha to classify them by specific names (e.g., rhombus, rectangle, trapezium shapes are class fied as "square" shapes) (Figure 3) [59,67]. tween shades of red such as mahogany, scarlet, and rose, individuals often 'categorise' these colou into one group: red. The same principle applies to shapes; instead of specifying what each speci shape is, individuals will group or "categorise" shapes that share similar properties together. Henc rounded shapes are simply seen as round, whereas those with sharp edges (such as the rhombu rectangle, and trapezium) are seen as square.
Figure 3. Categorisation explained in the context of colour and shape. Instead of distinguishing between shades of red such as mahogany, scarlet, and rose, individuals often "categorise" these colours into one group: red. The same principle applies to shapes; instead of specifying what each specific shape is, individuals will group or "categorise" shapes that share similar properties together. Hence, rounded shapes are simply seen as round, whereas those with sharp edges (such as the rhombus, rectangle, and trapezium) are seen as square.
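The warping of the colour continuum described above can be illustrated with a toy numerical model. The sketch below is not drawn from any of the cited studies: it assumes a single category boundary, a logistic mapping from wavelength to a perceptual axis, and a fixed perceptual criterion for a JND. All parameter values, including the 570 nm boundary, are hypothetical and chosen only to make the effect visible.

```python
import math

# Toy model only: every constant below is a hypothetical, illustrative value.
BOUNDARY_NM = 570.0  # assumed location of one category boundary
STEEPNESS = 0.15     # assumed sharpness of the boundary (per nm)
BASELINE = 0.002     # assumed residual sensitivity within a category (per nm)
CRITERION = 0.05     # assumed perceptual change needed to notice a difference


def perceived_position(wavelength_nm: float) -> float:
    """Map a physical wavelength onto a warped (non-linear) perceptual axis."""
    logistic = 1.0 / (1.0 + math.exp(-STEEPNESS * (wavelength_nm - BOUNDARY_NM)))
    return logistic + BASELINE * wavelength_nm


def jnd(wavelength_nm: float, step_nm: float = 0.01) -> float:
    """Smallest wavelength increase whose perceptual change reaches CRITERION."""
    start = perceived_position(wavelength_nm)
    delta = step_nm
    # perceived_position is strictly increasing, so this loop always terminates.
    while perceived_position(wavelength_nm + delta) - start < CRITERION:
        delta += step_nm
    return delta


if __name__ == "__main__":
    for wavelength in (530.0, 570.0, 610.0):
        print(f"{wavelength:.0f} nm: JND of roughly {jnd(wavelength):.2f} nm")
```

Under these assumptions, the JND computed at the boundary is smaller than the JND computed well inside either category, which is the qualitative pattern described in the text.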
As in the case of sound categorisation, the innate nature of visual categorisation is strongly supported with evidence from numerous cross-cultural, infancy, and animal studies. Berlin and Kay (1969) [25] demonstrated that basic colour categorisation transcends culture and language. Bilingual adults from 20 different language communities determined the ideal representations of a limited selection of basic colour terms (chosen from a collection of 320 colours). Irrespective of the linguistic differences between the participants, there was high agreement among the group with respect to which colours they identified as being ideal exemplars, reflecting a universal uniformity in the categorical perception of colour [25].
Developmental studies are useful for the exploration of categorisation as they allow for observations to be made prior to the acquisition of language and culture. Human infants regularly partition the spectrum of different wavelengths into categories of hue [21]. Infants as young as four months are capable of categorising the visible spectrum into the basic psychological hues in the absence of any experience or formal training, with the formed categories being almost identical to those seen in adults [58]. The fact that infants, unaffected by learned experiences such as culture or language, possess the ability to perceive colours in a categorical manner offers undeniable support to the notion that categorisation is a phenomenon innate to humans.
In addition to infancy studies, a number of investigations have presented evidence of hue categorisation in non-human species. Irrespective of the variations present in the general perception of the visible spectrum among different species, it has been demonstrated that categorisation of hues is common and universal across species. Perception studies in the pigeon [35,36], the European honeybee [32], and the monkey [31] indicate that colour-sensitive species have the ability to categorically perceive the colour spectrum in some way, even if they might not replicate exactly what is seen in humans. Work conducted on species that are capable of categorical perception of colour but have no possibility of culture or language stands as another source of support for the innate and universal mechanisms of categorisation.

Categorisation as a Learned Feature
Decades of research lending support to both "passive" and "learned" views of categorical perception have led to the two joining forces and offering a new perspective on categorisation and its origin, proposing that the phenomenon is both an innate feature and one which can be learned through experience. The idea that categorisation is a passive trait underpinning speech perception and comprehension that is fine-tuned through training is one that dominates the literature. Studies of speech perception in infants have demonstrated that although categorisation is for the most part ascribed to biological influences, it is also a feature capable of modification through experience with the parental language [68]. Developmental studies documenting the effects of experience on speech perception have demonstrated that infants as young as 6 months of age already show alterations in speech perception dictated by exposure to a specific language [69,70]. Specifically, an infant's perception of speech reflects the phonetic structure of the language it has been exposed to [71]. Though phonetic discrimination is a language-universal feature in newborns, by the end of the first year of life, infants experience a reduction in this discrimination for non-native phonemes [69]. Over the course of 30 months, infants demonstrate an increasingly negative correlation between native and non-native phonetic perception; the more the infants improve in their ability to perceive native-language phonemes, the worse they become at perceiving non-native phonemes [69]. These results have been replicated by numerous cross-cultural studies exploring the effect that experience and learning have on the ability of infants of various cultural/lingual backgrounds (including English, Japanese, Spanish, Hindi, Salish, Thai, and Mandarin) to categorise speech [4,24,30,33,34,37,39].
As such, it appears that the abilities we as humans possess that allow us to categorise information originate from a bountiful and crucial set of biological constraints that are supplemented in their function by an equally important set of environmental constraints [72].
In addition to speech categorisation, Lynch, Eilers, Oller, and Urbano (1990) [73] have demonstrated a role for learning in non-speech categorisation using musical articulation experiments. Evaluating the ability of American 6-month-old infants and adults to detect mistunings along native major and minor scales, and non-native Javanese pelog scales, has revealed that while infants can equally perceive both native and non-native musical scales, adults show improved perception of native scales relative to their non-native counterparts. Reflecting on these findings, the investigators proposed that infants possess an innate equipotentiality for culturally universal scale perception, and that the culturally specific experience that takes place throughout development into adulthood shapes their perception and categorisation of music.

Controversy Surrounding Categorisation
The categorical nature of categorisation, along with the robustness of the phenomenon, has been called into question. Though numerous researchers have demonstrated its presence in a variety of domains, including the perception of speech [1,2,4,9,16,18], colour [74], familiar faces [6], and facial expressions [66], some have expressed concerns relating to the existence of the effect. Massaro (1998) [75] labels categorisation as a "lasting myth", suggesting that the use of multiple sources of information in perception requires these sources to be continuous rather than categorical. Challenging the logic of categorisation, Massaro (1998) [75] outlined the difficulty with which information from various sources would be integrated if even a single source of information was perceived categorically. If we consider the perception of speech, for example, sentential context would either agree or disagree with the categorisation of the speech stimuli. If the context of the sentence agrees with the input, no further information will be obtained; conversely, if it disagrees with the categorisation of the speech signal, the individual is put in a position in which there is conflict and inconsistency between the context and acoustic input [75]. In addition to this argument, the identification-based methods commonly used to demonstrate categorical perception were also criticised by Massaro (1998) [75] as being weak/questionable and guilty of indicating a categorisation effect even where it does not exist. Massaro (1998) [75] was not the only one to raise questions with respect to categorisation and the extent of its effect. Lively and Pisoni (1997) [17] evaluated the phenomenon with respect to the perceptual magnet effect (PME), replicating the original set of experiments conducted by Kuhl (1991) [2]. 
The three experiments were designed to assess two results deemed by Kuhl as essential to the generation of the PME; namely, (1) whether some examples of a phonetic category are rated as being better representations of that phonetic category than others, and (2) whether category members that approximate an idealised prototype are less discriminable than those not resembling the prototype. Failing to replicate Kuhl's results, and in turn fulfil the requirements of the PME, the authors found that the effect was not as robust as Kuhl had suggested.
Though this controversy cannot be overlooked, it does not mean that categorisation as a concept should be entirely abandoned or ignored. Numerous empirical studies have demonstrated modifications in cortical plasticity following categorisation training, building on the methods previously used to explore categorisation and lending a great deal of support to the phenomenon [1,9]. Rather, the controversy should serve as a reminder for researchers to be mindful of their methodologies so as not to generate an effect where it does not exist.

Categorisation for Tinnitus
The notion that tinnitus is the consequence of a series of neuroplastic changes taking place in the central auditory pathways following peripheral injury is one that currently predominates in the literature. Many tinnitus treatments attempt to disrupt or modify tinnitus-related neural activity through passive sound exposure over weeks or months (for a review, see Searchfield (2020) [76]). What makes auditory perceptual training paradigms attractive is the use of active participatory learning, which should result in more efficient learning and plastic brain changes than the usual passive sound exposure [77,78]. The auditory perceptual training programme that has received most interest for tinnitus therapy has been frequency discrimination training (FDT) [79,80]. The goal of FDT is to teach listeners to differentiate between tones matched closely in pitch with a greater degree of precision. Several studies investigating FDT have reported neuroplastic changes in the auditory cortex as a result of the training; specifically, FDT regimes give rise to an increase in the cortical representation for trained frequencies [49,81,82]. The premise underlying the use of FDT for tinnitus is that it has the potential to disrupt the tinnitus-generating network through this tonotopic reorganisation of the primary auditory cortex [79]. A recent study employing FDT through a game module as a means of tinnitus management noted a change in pitch and reduced loudness for over 70% of the participants, as well as a reduction in the perceived severity and handicap of tinnitus [79]. However, questions remain with respect to the strength and robustness of the effect, highlighting the need for high-quality, unbiased, randomised controlled studies [77,83].
An alternative to FDT is categorisation training (CT). CT teaches listeners to categorise tones within a certain frequency range and identify them as being members of the same category [78]. Although FDT has received the most research interest in tinnitus, the observation that CT reduces cortical activity for trained tones [1] suggests that it may be a more promising strategy. Guenther et al. (1999) [9] conducted a series of psychophysical experiments to determine the effect of discrimination and categorisation training paradigms on the auditory perceptual space of adults with no history of speech, language, or hearing disorders. Their study found that while discrimination training using non-speech stimuli led to an increase in sensitivity to differences in the training stimuli, categorisation training with the same sounds resulted in a decrease in their discriminability [9]. Further investigations using functional magnetic resonance imaging have shown that this decrease in discriminability following categorisation training is the result of a reduction in the size of cortical representation of the training tones [1]. The authors postulated that this reflects the redistribution of neural resources by the brain away from areas where distinction between sounds is not behaviourally important, and toward areas where precise discrimination is necessary [1].
While much work has been conducted on the principle of categorisation, it is evident that there is a relative dearth of tinnitus research on categorisation. The available studies investigating categorisation within the context of tinnitus are sparse (only two studies were identified in this review as containing relevant information) and limited to work by the corresponding author's lab. Recognising the promising nature of CT, Jepsen et al. (2010) [78] applied the findings of Guenther et al. (2004) [1] to study the effects of CT (and FDT) on tinnitus and late auditory evoked potentials (AEPs). Using a handheld personal digital assistant (PDA) device, 24 participants underwent three weeks of either CT or FDT at the tinnitus pitch match. Prior to and following the training period, the participants attended test sessions in which AEPs were measured (in addition to several other assessments, including the tinnitus handicap inventory (THI)). In concert with the findings of Guenther et al. (2004) [1], Jepsen et al. (2010) [78] found a reduction in AEP amplitude following CT, and an increase following FDT, with the two perceptual training paradigms having equal but opposite effects. In addition to this, post-intervention examinations found that FDT led to a slightly greater reduction in THI score than CT; however, participants were marginally more likely to find ignoring their tinnitus easier following CT than following FDT.
These findings lend support to the ability of categorisation training to modify cortical plasticity and reduce the area of tonotopic representation for trained tones, in turn acting as a potential means of tinnitus management. Jepsen et al. (2010) [78] suggested that CT be used in conjunction with some form of counselling or cognitive based methods to enhance the benefits of CT and further reduce tinnitus severity. Taking a step back, appropriate methods of CT administration must first be considered if it is to be used by patients in an everyday setting. Jepsen et al. (2010) [78] issued PDA devices to patients, which meant that they could perform the CT task anywhere, anytime. However, the training involved a series of trials during which participants had to either hear examples of training sounds without responding ("listening trials") or identify one sound from a list of presented sounds as belonging to the training group by pressing a button on the PDA screen ("identification trials"). Though the method is straightforward, its simplicity and lack of an "end goal" might prevent its success in the long term as patients start to lose interest. Strategies for categorisation training must be considered to ensure compliance and increase efficacy. Perhaps taking a passive approach, in which the listener can be trained to categorise the stimuli without needing to physically attend to the training programme, would be reasonable. This could be useful for individuals leading busy lifestyles, offering a means of management that does not require time to be dedicated specifically to completing the training. Durai et al. (2021) [84] attempted to recategorise tinnitus from a sound that represents the sensation in its entirety to a natural real-world sound using such a passive training paradigm.
The study involved acute (30 min) and chronic (3-month) exposure to a training sound that matched the individual's tinnitus (i.e., their tinnitus avatar) cross-faded to a chosen nature sound (cicadas, birds, fan, water sound/rain, water and bird). Durai and colleagues (2021) [84] reported several behavioural findings, including a significant reduction in the tinnitus functional index score and subscales of intrusiveness of the tinnitus signal and ability to concentrate with tinnitus. While the participants did not report a change in tinnitus loudness, they observed changes in pitch, uniformity, and location. At a physiological level, the researchers reported changes in the activation of neural tinnitus networks and greater bilateral hemispheric involvement, specifically relating to attention and discriminatory judgments (dorsal attention network, precentral gyrus, ventral anterior network). The absence of a control group means that it is unclear whether these outcomes are the result of recategorisation induced by the passive auditory training, or simply a reflection of the benefits of masking and/or any emotional benefit gained from listening to pleasant nature sounds. As such, further efforts are required to determine the role of passive perceptual training as a means of tinnitus management, as well as to better understand how it compares to conventional sound therapy (e.g., broad-band noise).
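To make the idea of a cross-faded training sound concrete, the sketch below shows one simple way that two signals could be blended. It is not the procedure used by Durai et al. (2021) [84], whose cross-fade profile is not described here: the linear fade, the 4 kHz pure tone standing in for a tinnitus avatar, the uniform noise standing in for a nature sound, and all function names are assumptions made purely for illustration.

```python
import math
import random

SAMPLE_RATE = 16_000  # Hz; illustrative value only


def tone(freq_hz: float, n_samples: int) -> list[float]:
    """Pure tone used here as a hypothetical stand-in for a tinnitus avatar."""
    return [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def noise(n_samples: int, seed: int = 0) -> list[float]:
    """Uniform noise used here as a hypothetical stand-in for a nature sound."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def cross_fade(start_sound: list[float], end_sound: list[float]) -> list[float]:
    """Linearly fade from start_sound to end_sound over their common length."""
    n = min(len(start_sound), len(end_sound))
    if n < 2:
        raise ValueError("need at least two samples to cross-fade")
    mixed = []
    for i in range(n):
        weight = i / (n - 1)  # share of end_sound: 0.0 at the start, 1.0 at the end
        mixed.append((1.0 - weight) * start_sound[i] + weight * end_sound[i])
    return mixed


if __name__ == "__main__":
    avatar = tone(4000.0, SAMPLE_RATE)   # one second of the assumed avatar
    nature = noise(SAMPLE_RATE)          # one second of the assumed nature sound
    training_sound = cross_fade(avatar, nature)
    print(len(training_sound), "samples in the blended training sound")
```

A linear fade is the simplest choice; an equal-power fade, or a much slower transition spread over the listening session, would be equally plausible alternatives.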
An alternative to using passive methods is to take an active training approach, one in which listener involvement is required in order to complete the training tasks. For example, categorisation training could be incorporated into a video game, necessitating the interaction between the listener and the training itself through a series of tasks in which the listener is required to respond in some way or another in order to progress through the training programme [85]. By designing a categorisation training game with well-defined aims and multiple levels of increasing difficulty, players are inherently motivated to participate and in turn aid in the success of the training [86]. However, active training paradigms are more time consuming to design and create, are prone to issues of a technical nature (e.g., glitches in gaming software), and require the participant's full attention, a greater level of technology use and knowledge, and a time slot dedicated to performing the tasks. As such, the next step in evaluating categorisation training for tinnitus management might be to undertake a more basic passive approach in a proof-of-concept trial, in combination with aspects of counselling and more cognitive based methods (as per the recommendation of Jepsen et al. (2010) [78]), prior to taking the leap to more complex resource-intensive approaches such as active training.

Conclusions
Categorisation is a phenomenon which presents itself as both an innate feature and one which can be learned over time through experience. While the effect has been observed in several different modalities, its presence in hearing and sound is of particular interest due to the implications it may have for conditions such as tinnitus. Implementation of a categorisation-based perceptual training paradigm could serve as a promising means of tinnitus management by reversing the changes in cortical plasticity that are seen in tinnitus, in turn altering the representation of sound within the auditory cortex itself. In the instance that the categorisation training is successful, this would likely mean a decrease in the level of activity within the auditory cortex (and other associated cortical areas found to be hyperactive in tinnitus) as well as a reduction in tinnitus salience, thus giving rise to a reduction in tinnitus severity and related distress. Whether passive or active training paradigms are applied, categorisation training is an avenue worth exploring. While the phenomenon is founded on relatively simple principles, it builds on current methods of sound-based therapies and offers a "next step/level" within this subset of tinnitus management strategies.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.