A Critical Review of the Deviance Detection Theory of Mismatch Negativity

Mismatch negativity (MMN) is a component of the difference waveform derived from passive auditory oddball stimulation. Since its inception in 1978, it has become one of the most popular event-related potential techniques, with over two thousand published studies using this method. This is a testament to the ingenuity and commitment of generations of researchers engaging in basic, clinical and animal research. Despite this intensive effort, high-level descriptions of the mechanisms theorized to underpin mismatch negativity have scarcely changed over the past four decades. The prevailing deviance detection theory posits that MMN reflects inattentive detection of difference between repetitive standard and infrequent deviant stimuli due to a mismatch between the unexpected deviant and a memory representation of the standard. Evidence for these mechanisms is inconclusive, and a plausible alternative sensory processing theory considers fundamental principles of sensory neurophysiology to be the primary source of differences between standard and deviant responses evoked during passive oddball stimulation. By frequently being restated without appropriate methods to exclude alternatives, the potentially flawed deviance detection theory has remained largely dominant, which could lead some researchers and clinicians to assume its veracity implicitly. It is important to have a more comprehensive understanding of the source(s) of MMN generation before its widespread application as a clinical biomarker. This review evaluates issues of validity concerning the prevailing theoretical account of mismatch negativity and the passive auditory oddball paradigm, highlighting several limitations regarding its interpretation and clinical application.


Introduction
Mismatch negativity (MMN) is generally defined as a negative polarity deflection viewed across the 150-250 ms latency range in difference waveforms derived from standard and deviant event-related potential (ERP) responses elicited by passive auditory oddball paradigms [1][2][3]. This waveform feature is considered to be maximal approximately halfway between the center of the forehead and the apex (electrode Fz in the 10-20 system), although, like other ERP components, its morphology changes depending on the reference electrode(s). In passive auditory oddball paradigm experiments, subjects are instructed not to attend to ongoing changes in the auditory environment. Repetitive identical (i.e., standard) stimuli are played at a constant rate with a defined inter-stimulus interval (ISI) until they are unpredictably swapped for a physically different (i.e., deviant) stimulus. This sequence is then repeated multiple times to obtain enough electroencephalogram (EEG) segments to average together to produce corresponding ERP waveforms. Subtraction of the standard ERP from the deviant ERP is performed, and MMN is typically quantified from this difference waveform somewhere within the aforementioned latency range, although precisely where is generally not standardized between studies. While there are many variants on the theme of passive auditory oddball paradigms, the fundamental principle rests on using two or more physically distinct sounds, with the formation of a relatively predictable pattern using one that is subsequently perturbed by another, physically different sound. This aspect of unpredictable change in the stimulus presentation sequence is crucial to popular interpretations of MMN generation, as discussed later.
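The averaging and subtraction steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended analysis pipeline: the epoch values are made-up numbers, the 50 ms sampling step and the function names (average_erp, difference_wave, mean_amplitude) are assumptions introduced here, and the 150-250 ms window simply follows the latency range quoted above.

```python
from statistics import fmean

def average_erp(epochs):
    """Average equal-length EEG epochs sample by sample to obtain an ERP."""
    return [fmean(samples) for samples in zip(*epochs)]

def difference_wave(deviant_erp, standard_erp):
    """Conventional difference waveform: deviant ERP minus standard ERP."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def mean_amplitude(wave, times_ms, start_ms=150, end_ms=250):
    """Mean amplitude of a waveform within an inclusive latency window."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return fmean(window)

# Hypothetical data: latencies in ms after stimulus onset, amplitudes in microvolts.
times_ms = [0, 50, 100, 150, 200, 250, 300]
standard_epochs = [
    [0.0, 0.5, -1.0, -0.5, 0.0, 0.5, 0.0],
    [0.0, 1.5, -1.0, -0.5, 0.0, -0.5, 0.0],
]
deviant_epochs = [
    [0.0, 1.0, -1.5, -2.0, -1.5, -1.0, 0.0],
    [0.0, 1.0, -1.5, -3.0, -2.5, -1.0, 0.0],
]

standard_erp = average_erp(standard_epochs)
deviant_erp = average_erp(deviant_epochs)
diff = difference_wave(deviant_erp, standard_erp)
mmn_amplitude = mean_amplitude(diff, times_ms)
```

Because the window boundaries are free parameters of mean_amplitude, the sketch also makes visible how the lack of a standardized quantification window can change the reported value for the same difference waveform.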
MMN is reportedly elicited by any perceptible change in the physical properties of a repeated sound [2,4,5]. This deviance can be made with regard to duration, frequency, loudness, source location, pitch transition, synthesized vowel, and presumably any combination thereof. While their predicted and unpredicted contexts are generally believed to be of critical importance, the magnitude of MMN is proportional to the degree of physical difference between standard and deviant stimuli [1]. Furthermore, changes in individual physical properties of sound are considered to produce distinct attributes in the MMN response, which may be characteristic of dissociable underlying mechanisms. For instance, studies that have applied source modelling techniques have identified separate loci of activity associated with different physical deviances [6][7][8][9]. This is particularly relevant for the substantial body of clinical research (discussed in a separate section below) where different types of physical deviance have been used in passive auditory oddball paradigms to elicit MMN in patient groups. It is considered to be possible to elicit MMN from patients who are sleeping [10], anaesthetized [11], or in comatose states [12]. A corresponding mismatch response (MMR) is also thought to exist in many different animal species, including amphibians [13], lower mammals [14], pigeons [15], dolphins [16] and monkeys [17].
This concludes the main introductory points concerning MMN without delving into the proposed interpretations of its underlying mechanisms. These may be summarized as follows: (1) it is quantified from the difference wave between standard and deviant ERP responses evoked by a passive auditory oddball paradigm; (2) it can be elicited by any physical difference between standard and deviant stimuli; (3) its magnitude is related to the degree of difference between standard and deviant stimuli; (4) it can be evoked while subjects are awake, asleep, or unconscious; and (5) it is widely shared throughout the animal kingdom. These consistent findings of MMN are themselves evidence of robust sensory processing mechanisms. However, the practical significance of this topic is principally derived from its putative utility as a biomarker for psychiatric disease [18][19][20]. This raises a number of questions concerning its underlying mechanisms that demand greater levels of scrutiny if this promise is to be realized. The next three sections briefly summarize the prevailing deviance detection theory and competing fresh afferent and sensory processing theories of MMN generation. The historical origins of MMN research and its development are then addressed from a meta-research perspective, followed by a critical evaluation of clinical MMN research. Several limitations affecting this field are then discussed, with some tentative recommendations offered, followed by concluding remarks.

Deviance Detection Theory
The mechanisms traditionally believed to underlie MMN generation have recently been expounded in a couple of comprehensive review articles [1,3]. These adequately recapitulate the prevailing theoretical framework, although without sufficiently challenging the limitations of supporting evidence. The present review offers a different perspective in an effort to bring balance to this discussion. From the earliest studies, MMN was interpreted as a difference-detection response that represents automatic stimulus discrimination [21]. This has since been referred to as the deviance detection theory. Using today's terminology, this theory considers that MMN reflects a neurophysiological prediction error signal elicited by an error in perceptual inference; however, it may be noted that this is essentially a rewording of the initially postulated mechanisms [22]. This account assumes processes of auditory object abstraction, whereby the physical properties of discrete sounds are represented together, which presumably occurs during the ascending pathway as sensory signals travel through brainstem nuclei and up towards the cortex. The means by which this abstraction is formulated may perhaps be through the functions of auditory neurons that are tuned to different physical properties of sounds [23], although the manner in which these cellular activities are integrated is uncertain. Further to the encoding of auditory objects, this theory posits that after being encountered, these auditory objects are stored in sensory memory, or in other words, represented by a perceptual predictive model. It is proposed that this memory representation is automatically compared with incoming auditory objects, and if the two mismatch, the MMN response is generated; thus, it is referred to as a prediction error signal [24]. 
If the incoming auditory object and expected input are not sufficiently different to initiate an MMN response, repetition suppression or stimulus-specific adaptation (SSA) occurs, which has been referred to as model adjustment [25]. These hypothesized mechanisms related to predictive modelling and comparative processes can be described as top-down, whereas those associated with formation of auditory objects may be referred to as bottom-up. Various terms have been used to describe this theory, including statistical learning and inference, sensory learning, auditory perceptual learning, change detection, sensory memory, predictive coding, auditory pattern learning, prediction error signaling, novelty processing, hierarchical rule learning, and automatic auditory discrimination. Despite the deviance detection theory being widely supported in over four decades of published research [2,21], it is not universally accepted.

Fresh Afferents Theory
The term SSA is used to describe a phenomenon observed where repetition of an identical stimulus results in attenuation of its evoked response, or repetition suppression. This is generally accepted as an established property in sensory neurophysiology. Some research [26,27] has examined the deviance detection theory and rejected it by concluding that SSA alone can account for MMN responses. This fresh afferents theory presented an opposition to the deviance detection hypothesis by suggesting that differential SSA to standard and deviant stimuli can sufficiently explain differences between their responses, and corresponding difference waveform deflections [26,27]. The fresh afferents theory may be more appealing because it avoids the need to invoke some of the unproven theoretical elements of the deviance detection theory, such as auditory objects, predictive perceptual models and comparative mechanisms. However, this has frequently been portrayed as a less sophisticated means of MMN generation than deviance detection [28][29][30], presumably due to the omission of more elaborate top-down processes. Subsequently, SSA was integrated into the deviance detection theory via the model adjustment hypothesis mentioned above [25], which retained the need for deviance-detection mechanisms to generate MMN. Otherwise, the fresh afferents theory was thoroughly dismissed by proponents of the deviance detection theory [30]. It may be noted that the emergent definition of a genuine MMN response appears to prohibit any mechanism other than the prevailing deviance-detection hypothesis [31], which may constrain the scope for properly evaluating alternatives. While SSA is repeatedly shown in auditory neurons responding to presentation of tones of different frequency, it is unclear whether comparable processes occur in response to sequences of stimuli with changes in other physical aspects of sound [32].
As such, fresh afferents might not be sufficient to describe findings of MMN in response to changes in different physical parameters, and SSA could perhaps be an over-generalized term that refers predominantly to frequency-specific adaptation, although there have been recent efforts to verify this using frequency-balanced multi-tone stimuli [33].
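The core idea of the fresh afferents account can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration and not a model taken from the cited studies: each stimulus type is assumed to drive its own channel, the response of a channel is taken as one minus its current adaptation level, and the parameters adapt_step and recovery are hypothetical values chosen only to make the effect visible.

```python
def simulate_ssa(sequence, adapt_step=0.5, recovery=0.2):
    """Return (stimulus, response) pairs under channel-specific adaptation.

    Toy model (illustrative assumption): a channel responds with 1 minus its
    adaptation level, adapts when stimulated, and all channels partially
    recover before the next stimulus.
    """
    adaptation = {}
    responses = []
    for stim in sequence:
        level = adaptation.get(stim, 0.0)
        responses.append((stim, 1.0 - level))
        # The stimulated channel adapts towards full suppression.
        adaptation[stim] = level + adapt_step * (1.0 - level)
        # Every channel recovers partially during the inter-stimulus interval.
        for key in adaptation:
            adaptation[key] *= (1.0 - recovery)
    return responses

# Hypothetical oddball sequence: eight standards ("S") followed by one deviant ("D").
sequence = ["S"] * 8 + ["D"]
responses = simulate_ssa(sequence)

last_standard_response = responses[7][1]   # response to the eighth standard
deviant_response = responses[8][1]         # response of the unadapted channel
difference = deviant_response - last_standard_response
```

In this sketch the deviant evokes a larger response than the repeated standard purely because its channel has not been adapted, without any comparison against a memory trace or predictive model. Whether such channel-specific adaptation generalizes beyond frequency is exactly the open question raised above.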

Sensory Processing Theory
The deviance detection theory of MMN provides an intriguing conceptualization of how neural systems underlying perception might be orchestrated. However, with due consideration to the limitations summarized in the following sections, it may be worth re-evaluating this hypothesis. There are at least two fundamental aspects of sensory neurophysiology that challenge the interpretation of MMN specifically as a manifestation of the deviance-detection mechanisms described above. These are referred to here as the sensory processing theory: (1) physical aspects of sound stimuli fundamentally influence the intrinsic auditory response and ERP morphologies over the latency range of MMN [34][35][36][37][38], and (2) state changes, long-term adaptation or auditory habituation (collectively referred to as adaptation for the remainder of this article) obscure direct comparisons between auditory responses observed during different stimulus blocks [27][39][40][41][42][43]. This adaptation refers generally to changes in responsiveness to auditory stimulation, as opposed to SSA, which refers to repetition suppression of the response to an identical stimulus; however, it is not discounted that these adaptive processes are potentially related.
The first mechanism explains MMN observed from the passive oddball paradigm as an artifact of intrinsic bottom-up processing of physically distinct sounds. Cortical potentials measured during auditory stimulation are considered to vary with low-level features of sound stimuli, such as stimulus duration, ISI duration, frequency (relative to cortical location, hearing sensitivity, and background stimulation), relative intensity, continuity with background sound level (stimulus onset and offset envelopes), and source location. The sensory processing theory suggests that the modulatory effects of these physical properties on sensory ERP components can result in amplitude differences which may be misinterpreted as evidence of deviance detection.
Adaptation can potentially explain differences between responses to physically identical deviant and control stimuli presented in separate blocks, which may have been overlooked in attempts to validate the deviance-detection interpretation of MMN using additional control sequences [34,44,45]. Auditory neurophysiology is characteristically non-linear and highly dynamic, constantly adapting to signals received from the sensory organs. We undergo changes in hearing sensitivity as we wake up every morning and throughout the day, dynamically according to our environment, and over the course of a lifetime. Physiological responses reflect these continuous relative adaptations in central auditory processing [42], and we suggest that these phenomena and their interaction with intrinsic responses to different physical properties of auditory stimulation are not sufficiently well characterized to omit their effect by counterbalancing stimulus blocks.
These intrinsic and adaptive sensory processing mechanisms can be suggested to account for much of the data presented in the MMN research literature. Given that these proposed mechanisms of the sensory processing theory are generally accepted (although perhaps not comprehensively characterized) principles of auditory neurophysiology, the philosophical law of parsimony, Ockham's razor, may be applied. Where fundamental aspects of sensory processing can explain the observed phenomena, there is questionable benefit in over-interpreting the data to comply with some of the hypothesized but elusive mechanisms of the deviance detection theory.
The deviance detection and sensory processing theories of MMN are illustrated in Figure 1. It may be argued that there is no undisputable evidence that ERP waveforms elicited by the passive oddball paradigm reflect the rapid categorization and representation of discrete sounds as auditory objects which are compared with a predictive model. This concept is based on a psychological construct, and the observed waveforms can be adequately described by basic neurophysiological principles. Importantly, the sensory processing theory can potentially account for differential MMN responses to duration and frequency deviance that hitherto have gone unexplained [20,46] based on the manner in which low-level features of auditory stimuli are preserved in cortical responses, which generate associated deflections in difference waveforms derived from physically distinct stimuli. Specifically, MMN caused by differences in stimulus duration between standard and deviant stimuli, where ISI is constant, may reflect modulation of offset response latency and amplitude. In contrast, the source of MMN caused by differences in stimulus frequencies could be frequency-sensitive obligatory onset-response components.
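The suggestion that duration-deviant MMN may arise from a shifted offset response can be made concrete with a toy waveform model. Everything in the sketch below is an illustrative assumption rather than a fitted or published model: each ERP is taken as the sum of an onset component and an offset component, both negative-going Gaussians, with hypothetical amplitudes, a fixed 100 ms lag after sound onset or offset, and sound durations of 50 ms (standard) and 100 ms (deviant).

```python
import math

def gaussian(t, center, width):
    """Unit-height Gaussian evaluated at time t."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def toy_erp(times_ms, duration_ms, onset_lag=100.0, offset_lag=100.0,
            onset_amp=-2.0, offset_amp=-1.0, width=20.0):
    """Toy ERP: an onset component plus a component time-locked to sound offset."""
    return [
        onset_amp * gaussian(t, onset_lag, width)
        + offset_amp * gaussian(t, duration_ms + offset_lag, width)
        for t in times_ms
    ]

times_ms = list(range(0, 401, 10))
standard = toy_erp(times_ms, duration_ms=50)    # offset component near 150 ms
deviant = toy_erp(times_ms, duration_ms=100)    # offset component near 200 ms

# Deviant-minus-standard difference waveform: the identical onset components
# cancel, leaving only the latency shift of the offset component.
diff = [d - s for d, s in zip(deviant, standard)]
trough_latency = times_ms[diff.index(min(diff))]
peak_latency = times_ms[diff.index(max(diff))]
```

Under these assumptions the difference waveform contains a negative trough inside the 150-250 ms range, preceded by a positive deflection, even though the model includes no memory comparison or prediction error. The polarity and timing depend entirely on the assumed component parameters, so the sketch shows only that such deflections can arise from bottom-up responses, not that they do.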
For some clarity regarding how these competing theoretical accounts explain differences between standard, deviant and control responses, these are illustrated in Figure 2. The deviance-detection theory interprets differences between responses to deviant and control-deviant stimuli to reflect deviance detection, between standard and control-standard to reflect repetition suppression, and between deviant and standard stimuli to reflect a combination of both. In contrast, the sensory processing theory interprets differences between deviant and control-deviant to reflect adaptation, between standard and control-standard to reflect adaptation, and between deviant and standard to reflect intrinsic physical sensitivities of ERP components. These alternative theories predominantly consist of top-down versus bottom-up mechanisms, respectively.
Figure 1. Illustrations of (a) deviance detection, and (b) sensory processing hypotheses of MMN. (a) Surmises automatic categorization of discrete stimuli as auditory objects which are compared with a representation of recently encountered auditory objects; if the two are sufficiently different, the deviance-detection MMN response is generated, otherwise repetition suppression occurs. (b) Explains deflections in difference waveforms by the manner in which physical dimensions of sound normally excite the auditory system, which affects the components of their respective ERP waveforms. In (a), differences between oddball and control paradigms are asserted as evidence of the hypothesized deviance-detection mechanisms; in (b), these differences are explained by dynamic sensory adaptation under changing environmental and physiological conditions.
Figure 2. Annotations on the left-hand side present the prevailing deviance-detection interpretations for differences between responses to deviant, standard and control stimuli; the right-hand side illustrates equally plausible sensory processing-based interpretations. The term adaptation refers to non-linear changes in the sensory response between stimulus blocks. It should be noted that this is an over-simplified representation, and response magnitudes may vary depending on the means of quantification, specific parameters of auditory stimulation, and influence of adaptation across stimulus blocks; therefore relative differences between standard, deviant and control responses need not necessarily match those depicted here. Adapted from [47].

Historical Perspective of MMN Research
The pioneer of MMN research, psychologist Risto Näätänen, was among the first to analyze difference waveforms derived from standard and deviant ERP responses obtained from passive auditory oddball paradigms [21]. This seminal study included five subjects who listened to intensity and frequency deviant oddball paradigms with 70 dB, 1 kHz standards and deviant stimuli of 80 dB or 1.14 kHz, respectively. While some evidence available at the time might have been overlooked [48], it is now accepted that sound intensity has a modulatory influence on ERP amplitudes [36]. Nevertheless, the initial interpretation proposed was that a component of the derived difference waveform reflects a mismatch between expected sensory input, represented by an echoic sensory-memory trace, and an unexpected sensory input, thought to represent processes of automatic stimulus discrimination. This theory can be characterized as a psychological construct, and is essentially the deviance detection theory before model adjustment was suggested to account for fresh afferents, as described above. A negative polarity deflection was observed in the derived deviant-minus-standard difference waveform. Thus, the term mismatch negativity was coined. While the descriptive terms have recently been updated to accommodate the predictive coding framework, with "prediction error" and "predictive model" replacing "change detection" and "sensory-memory trace", this hypothesis has remained relatively unchanged to the present day [1,2]. Näätänen has authored over two hundred articles pertaining to MMN and established multiple international collaborations in connection with this research, as illustrated in Figure 3.
One of the cited benefits of the passive oddball paradigm during the early days was that standard and deviant stimuli are presented during the same stimulus block, so their ERP amplitudes should not suffer from issues related to differences in adaptation of the auditory response between blocks, which were recognized as significant factors affecting ERP research at the time [39,40]. This is revisited later when discussing the limitations associated with control paradigms used in contemporary MMN research.

A situation in which the initial theory conceived to describe the function of a biological process has not been disproven or substantially altered in over forty years of active research perhaps indicates that either (1) the founding hypothesis was reasonably correct and scarcely in need of revision, or (2) the research performed has been deficient in aim or scope to confidently revise this hypothesis. Given our limited intuitive understanding of the central nervous system, the former appears very unlikely, but admittedly not impossible. The second proposition should therefore be considered more closely. It could be argued that much of the research in this field has been conducted and interpreted strictly through the lens of the founding theoretical framework that takes MMN to reflect mechanisms of deviance detection [2], and perhaps this is why theoretical development has been relatively stagnant. Viable alternative mechanisms of SSA (fresh afferents theory) have been convincingly and eloquently proposed to account for the observed phenomena [27]. However, adherents of the prevailing deviance detection hypothesis have refuted this competing theory [27], which has since become amalgamated with, and in some respects subordinated to, the overarching psychological construct of comparison between expected and unexpected auditory objects [3,25]. The tools and techniques applied to study the neurophysiology of MMN generation are subject to several limitations that leave ambiguity concerning the interpretation of resultant data. This affords ample opportunity for speculation, providing avenues through which the prevailing hypothesis can be accepted by default and propagated in the absence of decisive contradictory evidence.
While it might be perceived as a cynical topic, it is not beyond the realms of academic discourse to explore sociological factors that may have influenced the progression of MMN research. Interpersonal relationships undoubtedly play a significant role in governing the propagation of ideas. In academia, ideas can assume a form of currency, and historically academics have been fiercely defensive over their ideas [49]. Prolific authors with many collaborators who endorse their ideas can stand to profit directly and indirectly in terms of academic reputation and its multitude of benefits [50]. This provides unfavorable incentives that regrettably might be unavoidable and lead to less replicable research [51,52]. If the scientific method is pursued rigorously by independent researchers, careful experimentation should lead to improvements in our understanding, with incorrect theories eventually being revised or discarded [52,53]. However, MMN research appears to have been largely centralized within an inter-related network of academics (Figure 3). The sheer volume of publications from this group may be sufficient to persuade others that their conclusions are well supported. Moreover, textbooks that include MMN tend not to challenge the deviance detection theory [54,55]. This influence can be further reflected in the research literature via the anonymous peer-review process, where even well-intentioned reviewers might be inclined to dismiss or recommend changes in the interpretation of experimental data that do not agree with the prevailing hypothesis. Overall, the apparent dominance of the deviance detection theory throughout the literature obfuscates the huge controversy regarding the nature of MMN.

Clinical Applications of MMN
While the theorized mechanisms of MMN can be disputed, this might not be particularly consequential if not for efforts to utilize MMN as a biomarker for the early detection of psychiatric diseases [18][19][20]. A substantial amount of clinical research has been concerned with recording MMN from different patient groups and comparing these waveforms with those of healthy controls. To briefly summarize, reduced amplitude or otherwise abnormal MMN has been associated with a multitude of conditions including schizophrenia spectrum disorders [45,56,57], autism spectrum disorders [58,59], other neurodevelopmental disorders [60][61][62], aging [63], alcohol intoxication and more [29]. There is also a strong association between N-methyl-D-aspartate (NMDA) receptors and MMN generation [46]. The diverse scope of disorders that appear to interfere with the mechanisms of MMN generation may raise questions about its apparent lack of specificity. Perhaps a common pathway is disrupted in each of these conditions, or subsets of these conditions tamper with different sub-processes involved in MMN generation. Proponents of the prevailing hypothesis claim that deviance detection mechanisms are affected in each of these disorders [2]. However, this presumes that the deviance detection theory is correct without direct evidence for its proposed top-down mechanisms and without appropriately excluding the possibility that more fundamental physiological principles might be responsible. For example, differences between groups of patients and healthy controls may reflect altered bottom-up intrinsic sensory processing affecting their ability to distinguish between different sounds [56].
Another of the main criticisms regarding the clinical MMN literature is the lack of standardization across studies. This was recognized by Näätänen et al. (2004), who attempted to develop an "optimal paradigm" to be used in clinical cases. This attempt might have fallen short of satisfying the requirements of clinical research, because many studies continue to favor bespoke implementations of the passive oddball paradigm. This complicates direct comparisons between studies that have used slightly different methods. Even seemingly small differences in detail, such as subjects' behavior during experiments (e.g., reading a book or watching a silent movie), might alter their auditory processing by providing a different set of visual or imaginary cues [55]. Inexplicably, there are instances in the clinical literature where MMN has been computed back-to-front, with the deviant ERP subtracted from the standard ERP [60,64]. Because no explanation is provided for this reversal, it could easily be overlooked by a reader. One might assume that this was done to obtain a negative peak within the agreed-upon time window for MMN, whereas if the computation had been carried out conventionally this peak would have been assigned positive polarity. There are also examples of methodological deviation from the latency range normally used to quantify MMN [65], although the source of these measurements is nevertheless attributed to mechanisms postulated by the deviance detection theory. The flexibility with which the passive oddball paradigm and definitions of MMN appear to have been modified and used to justify theories in cognitive psychology is concerning, particularly where findings could otherwise be explained by fundamental sensory processing mechanisms that are not obviously related to cognition.
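The consequence of reversing the subtraction order is simple arithmetic, as the short sketch below shows. The ERP values are made-up numbers chosen so that the deviant response is more positive than the standard at one latency; the helper name subtract is introduced here only for illustration.

```python
def subtract(a, b):
    """Sample-by-sample subtraction of waveform b from waveform a."""
    return [x - y for x, y in zip(a, b)]

# Hypothetical ERP amplitudes (microvolts) at four successive latencies.
standard_erp = [0.0, -1.0, -0.5, 0.5]
deviant_erp = [0.0, -1.0, 1.0, 0.5]

conventional = subtract(deviant_erp, standard_erp)    # deviant minus standard
reversed_order = subtract(standard_erp, deviant_erp)  # standard minus deviant

conventional_peak = max(conventional, key=abs)
reversed_peak = max(reversed_order, key=abs)
```

The same underlying data yield a positive deflection with the conventional computation and a negative one with the reversed computation, which is why the subtraction order needs to be stated explicitly whenever a difference waveform is reported.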
Differences between MMN responses evoked by duration or frequency changes have been consistently reported throughout the clinical literature, with duration-deviant stimuli generally appearing to be more sensitive than frequency-deviant stimuli to neurological changes believed to occur in various medical conditions [17,57,59,63,66]. However, the underlying mechanisms which are responsible have not been clearly articulated within the original theory, which typically assumes that MMN reflects general deviance detection, irrespective of the particular changing property of sound [20]. Consistent observations of differential responses to separate physical deviances may be suggestive of MMN reflecting low-level features of auditory stimuli that are relayed by bottom-up sensory processes. This deviant feature sensitivity aspect of MMN can also be problematic for interpreting studies that have used spectrally and temporally complex auditory stimuli in their experiments. For example, where synthesized vowels or speech sounds have been used [9,59,67], the effects of multiple physical deviances coincide in quite an uncontrolled fashion, with the MMN response potentially reflecting combined changes in each different physical property of sound.
Some might argue that a complete understanding of the mechanisms of MMN generation is not required for its useful application as a biomarker for diseases; provided that altered MMN has been consistently associated with a given disorder, it can be used to indicate the presence of said disorder. As noted above, however, abnormal difference waveforms are observed in multiple disorders, so other diagnostic criteria would have to take precedence, making MMN obsolete and potentially wasteful in terms of time and resources. Therefore, moving forward with regular clinical use without achieving a greater understanding of the underlying mechanisms may be ill-considered. In contrast, a more comprehensive understanding of MMN generation and the effect of different physical stimuli on the evoked auditory response might provide neurophysiological information that is inaccessible to existing diagnostic methods. Ultimately this rests on a more rigorous investigation of clinical MMN that does not take the nature of its source for granted.

Passive Oddball Paradigm
The majority of MMN studies have incorporated the conventional passive auditory oddball paradigm, which includes physically different standard and deviant stimuli. This is inherently confounded because different physical properties of sound are known to influence ERP amplitudes [34][35][36][37][38], raising questions about the validity of interpreting resulting difference waveforms as evidence of the deviance detection theory. As noted previously, oddball paradigms with deviant stimuli that combine changes in multiple physical properties may elicit complex differences in the ERP that are difficult to tease apart; attempting to do so would require a thorough analysis of the temporal-frequency content of sound stimuli employed and how these relate to the evoked response. In this situation, deviance detection is frequently presumed to be responsible for deflections in the difference waveform between deviant and standard ERP responses [9,59,67]. However, according to the sensory processing theory, these deflections may be accounted for by the intrinsic responses to different physical aspects of the respective stimuli. There are also inconsistencies within the literature concerning the properties of MMN and the oddball paradigm. For instance, it has been reported that a standard presented immediately after a deviant causes an MMN response [68], but this is contradicted by reports that MMN is not elicited by an individual stimulus per se, without being preceded by a few repetitions of a different (standard) stimulus to establish a predictive model/sensory-memory representation [22]. These two properties are irreconcilable because the latter implies that presentation of a single deviant stimulus is not sufficient to form a reliable predictive model; therefore, the deviance detection theory cannot remain a valid interpretation in the former case.
This could be characteristic of an overgrown theory that has been interpreted to fit with evidence in multiple studies while failing to acknowledge inconsistencies with prior work. With such a seemingly broad and flexible definition, the deviance-detection hypothesis of MMN may have been ascribed to any notable difference between standard and deviant responses without suitably excluding possible alternatives. This suggests that interpretations of difference waveforms derived from passive oddball paradigms may need to be reconsidered.

Control Paradigms
Several control sequences have been designed to validate MMN as a measure of deviance detection [44,45,69]. However, virtually all of the studies in this field continue to use the conventional oddball paradigm, which is required for performing the waveform subtraction synonymous with MMN [45,70]. The most commonly used control procedure is known as the many-standards control, where multiple stimuli are presented at the same rate as deviants in the oddball paradigm [44]. In this sequence, an unpredictable pattern of stimuli is presented, with each stimulus repeated in equal proportion to deviants in the oddball paradigm, with the same ISI. These stimuli include those which are physically identical to the standard and deviant from the oddball paradigm [3,31]. Because this sequence does not have a predictable pattern, it is believed to be incapable of generating deviance-detection MMN, although this might be debated, as some evidence suggests that a standard immediately following a single deviant in the oddball paradigm can elicit MMN [68], questioning its reliance on a predictable pattern of preceding auditory stimulation. By comparing the response to deviant stimuli in the oddball paradigm with its physically identical but contextually different counterpart in the many-standards control paradigm, differences between the two are often interpreted as evidence of true deviance-detection MMN. Unfortunately, these assertions fail to recognize that adaptation can affect responses to auditory stimuli presented in different stimulus blocks [1,39,40,41,43]. Adaptation can cause intrinsic auditory evoked responses elicited by physically identical stimuli to decrease in magnitude across successive stimulus blocks, regardless of their context within oddball or control paradigms. Attempts at counterbalancing stimulus blocks cannot eliminate these effects; rather, they merely confound them.
The interdependent relationships between the physical properties of auditory stimulation sequences and adaptive neurophysiological processes are not clearly defined. It is not sufficient to assume that these are approximately linear and can be cancelled out by counterbalancing. As such, comparisons of responses elicited in different stimulus blocks should not be considered legitimate evidence for deviance-detection mechanisms.
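The structure of the many-standards control described above can be sketched as follows. This is a minimal illustration under stated assumptions: the tone frequencies, trial counts, 10% deviant proportion and function names are all hypothetical, and the sketch ignores constraints that real experiments often impose (such as a minimum number of standards between deviants).

```python
import random
from collections import Counter

def oddball_block(standard, deviant, n_trials, n_deviants, rng):
    """Shuffle a fixed number of deviants among repeated standards."""
    sequence = [deviant] * n_deviants + [standard] * (n_trials - n_deviants)
    rng.shuffle(sequence)
    return sequence

def many_standards_block(tones, n_per_tone, rng):
    """Present every tone equally often in an unpredictable order."""
    sequence = [tone for tone in tones for _ in range(n_per_tone)]
    rng.shuffle(sequence)
    return sequence

rng = random.Random(0)  # fixed seed so the example is reproducible
tones_hz = [1000 + 100 * k for k in range(10)]  # hypothetical tone set
standard_hz, deviant_hz = 1000, 1100            # both belong to the tone set

oddball = oddball_block(standard_hz, deviant_hz, n_trials=1000, n_deviants=100, rng=rng)
control = many_standards_block(tones_hz, n_per_tone=100, rng=rng)

oddball_counts = Counter(oddball)
control_counts = Counter(control)
```

The sketch equates only the presentation probability of the deviant tone across the two blocks (100 of 1000 trials in each). It does nothing to equate the state of the auditory system between blocks, which is the concern raised above regarding adaptation.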
The flip-flop control sequence has also been used, especially in animal research, to validate the deviance detection theory [3,31]. In these experiments, two oddball paradigms are presented, each with standard and deviant stimuli assuming opposite identities. For example, in the first oddball paradigm a lower frequency standard and higher frequency deviant might be used, while in the second paradigm, the standard is higher frequency while the deviant is lower frequency. Proponents suggest that genuine MMN can be obtained by comparing the responses of identical stimuli when they are in the deviant and standard conditions. Considerations of general adaptation and SSA also apply here. Moreover, the sensory processing differences between an infrequent ascending frequency tone versus an infrequent descending tone may be complicated by the range of frequencies used, differential hearing sensitivity, and the influence of cross-frequency adaptation that results in attenuation of responses to spectrally similar tones [71]. Using a pair of pure-tone auditory stimuli for example, the one with closest frequency to peak hearing sensitivity, or "best frequency", might be expected to evoke a greater magnitude response when presented as a deviant, and induce greater adaptation when presented as a standard; thereby the difference between deviant and standard responses elicited by the same frequency stimulus in two different conditions could at least partially reflect modulation by spectral hearing sensitivity. Additionally, in a duration-deviant flip-flop sequence, differences in the length of stimulus-on versus stimulus-off time could be reflected in differential adaptation of auditory responses. For example, an oddball paradigm with a long duration standard and short duration deviant may be expected to induce greater levels of adaptation than vice versa. 
Given the present lack of clarity regarding these issues, deviance detection cannot confidently be ascertained by comparing responses to stimuli presented in separate blocks of the flip-flop control.
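The logic of the flip-flop comparison can be made explicit with a short sketch. The tone frequencies and function names below are hypothetical and serve only to show which blocks contribute to each same-stimulus contrast.

```python
def flip_flop_blocks(tone_a, tone_b):
    """Return the role assignments for the two blocks of a flip-flop design."""
    return [
        {"standard": tone_a, "deviant": tone_b},  # block 0
        {"standard": tone_b, "deviant": tone_a},  # block 1: identities swapped
    ]

def same_stimulus_contrasts(blocks):
    """For each tone, find the block where it is deviant and where it is standard."""
    contrasts = {}
    for tone in (blocks[0]["standard"], blocks[0]["deviant"]):
        deviant_block = next(i for i, b in enumerate(blocks) if b["deviant"] == tone)
        standard_block = next(i for i, b in enumerate(blocks) if b["standard"] == tone)
        contrasts[tone] = {"deviant_block": deviant_block,
                           "standard_block": standard_block}
    return contrasts

# Hypothetical pure-tone frequencies in Hz.
blocks = flip_flop_blocks(1000, 1200)
contrasts = same_stimulus_contrasts(blocks)
```

For either tone, the deviant and standard responses being compared necessarily come from different blocks. Any block-dependent state change therefore enters the contrast alongside the contextual difference the design is intended to isolate.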
A slightly different control sequence used in MMN studies is known as the roving oddball paradigm [9,72]. Multiple short trains of physically identical stimuli are presented, followed consecutively by one another, with a constant ISI. In the analysis of data recorded during this type of experiment, the deviant stimulus is considered to be the first in each train (that theoretically mismatches with the predictive model), while the standard is considered to be the last in each train (which should not mismatch with the theorized predictive model). While this method also avoids the confounding factor of using physically different standard and deviant stimuli, it suffers from some of the other limitations mentioned already. For instance, repetition suppression is a likely explanation for differences between responses to standard and deviant stimuli in the roving oddball paradigm [27]. Identification of flaws in each of these control sequences does not necessarily suggest they are without merit, simply that no method is perfect. Due to these issues, and additional technical limitations discussed below, data from these control paradigms cannot be relied upon for confirming the deviance detection theory, particularly when there are established mechanisms of auditory neurophysiology that can otherwise account for the observed phenomena.
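The labeling scheme used in the roving oddball paradigm can be sketched as below. The tone set, train lengths and function name are hypothetical, and excluding the first stimulus of the first train (which has no preceding train to differ from) is an assumption made here rather than a rule taken from the cited studies.

```python
import random

def roving_sequence(tones, n_trains, min_len, max_len, rng):
    """Build consecutive trains of identical tones and label each stimulus."""
    if min_len < 2:
        raise ValueError("each train needs at least two stimuli")
    trials = []
    previous = None
    for _ in range(n_trains):
        # Each new train uses a tone different from the preceding train.
        tone = rng.choice([t for t in tones if t != previous])
        length = rng.randint(min_len, max_len)
        for position in range(length):
            if position == 0:
                # Assumption: the very first stimulus has no prior train.
                role = "deviant" if previous is not None else "excluded"
            elif position == length - 1:
                role = "standard"
            else:
                role = "intermediate"
            trials.append((tone, role))
        previous = tone
    return trials

rng = random.Random(0)  # fixed seed so the example is reproducible
trials = roving_sequence([1000, 1100, 1200], n_trains=20, min_len=3, max_len=8, rng=rng)
```

Every stimulus labeled "deviant" physically differs from its predecessor, whereas every "standard" is preceded by at least two repetitions of the same tone. The two conditions therefore also differ in their recent repetition history, which is why repetition suppression remains a candidate explanation for the observed differences.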

Electrophysiology
The methods used to acquire data during these auditory experiments are also acknowledged to be limited in several dimensions. EEG is believed to reflect synchronized post-synaptic potentials of large numbers of cortical pyramidal neurons [54]. These cells tend to be oriented in the same direction, orthogonally to the cortical surface, so together they can generate an electromagnetic field with enough strength to excite transducers located on the scalp. These cells accept sensory input from thalamic nuclei. Thus, the majority of ERP components are believed to reflect sensory information processing. While this is reasonably well established, there is a requirement for many repetitions of a stimulus or event during EEG recording to obtain an observable ERP waveform. This is due to the notoriously low signal-to-noise ratio of EEG recordings, which are contaminated with interference from biological and non-biological sources. The inability to reliably observe single-trial dynamics in EEG recordings limits the interpretation of responses to oddball and control paradigms, and their derived difference waveforms. Furthermore, the inverse source problem prevents accurate localization of the specific neural generators responsible for the observed ERP components [54,73]. It has been speculated for a long time that two neural generators combine to produce the MMN response: one temporal and one frontal, respectively accounting for auditory and cognitive aspects of the proposed deviance-detection mechanisms. However, evidence regarding this has been inconclusive [74]. Attempting to identify the neural source of an EEG response may be likened to the task of identifying an individual violinist playing among the chorus of a large orchestra with audio recordings from a single microphone positioned at the concert hall ceiling: far from a trivial problem.
Furthermore, heterogeneity in cortical topology between individuals is thought to contribute to differences in auditory evoked electromagnetic fields measured from outside of the skull that are difficult to predict, effectively presenting another confounding biological factor [75]. While other neuroimaging technologies such as MEG and fMRI have been explored, the majority of early and contemporary MMN studies have used EEG to observe functional brain activity.
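The need for many stimulus repetitions noted above can be illustrated with a simple simulation of signal averaging. This is a sketch under idealized assumptions: the evoked response is taken to be identical on every trial, and the noise is taken to be zero-mean, Gaussian and independent across trials and samples. Under those assumptions, the residual noise in the average falls in proportion to the square root of the number of trials. The waveform shape, amplitudes and function names are hypothetical.

```python
import math
import random

def simulate_average(signal, noise_sd, n_trials, rng):
    """Average n_trials noisy copies of a fixed signal, sample by sample."""
    totals = [0.0] * len(signal)
    for _ in range(n_trials):
        for i, value in enumerate(signal):
            totals[i] += value + rng.gauss(0.0, noise_sd)
    return [total / n_trials for total in totals]

def rms_error(estimate, signal):
    """Root-mean-square deviation of the averaged waveform from the true signal."""
    squared = sum((e - s) ** 2 for e, s in zip(estimate, signal))
    return math.sqrt(squared / len(signal))

rng = random.Random(1)  # fixed seed so the example is reproducible
# Hypothetical 2 microvolt response buried in 10 microvolt single-trial noise.
signal = [2.0 * math.sin(2 * math.pi * i / 200) for i in range(200)]
noise_sd = 10.0

error_4 = rms_error(simulate_average(signal, noise_sd, 4, rng), signal)
error_400 = rms_error(simulate_average(signal, noise_sd, 400, rng), signal)
```

With these settings the residual error after 4 trials is expected to be near 5 microvolts, larger than the response itself, and near 0.5 microvolts after 400 trials. The assumption of a trial-invariant response is precisely what adaptation violates, so an averaged waveform summarizes a response that may have changed over the course of the block.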
Much of the relevant animal literature has ascribed differences observed between deviant and standard or deviant and control responses to mechanisms of deviance detection, generally in agreement with the prevailing hypothesis that dominates the human MMN literature [14,69,76]. There are few counterexamples [27,43,77,78]. Curiously, the animal studies that support the deviance detection theory tend to report widely varying latency ranges and waveform morphologies [79], which may raise questions regarding the legitimacy of their conclusions. This also includes studies where single and multi-unit recording techniques have been interpreted to support mechanisms of deviance detection [47,80]. These methods are inherently limited in the number of cells that can be recorded simultaneously, arguably providing quite a restricted view of gross neurophysiological activity. Moreover, acute intracranial studies in animals typically involve using pharmacological compounds to induce anesthesia that alter normal physiological signaling. In contrast, chronic intracranial studies tend to suffer from deteriorating data quality over time, presumably due to the presence of a foreign object provoking an immune system reaction and movement of electrodes over time. Both acute and chronic intracranial studies encounter adaptation in neurophysiological responses, the mechanisms of which are not entirely understood. These limitations notwithstanding, diverse functional specializations have been observed from auditory nerve cells in response to different properties of sound using existing methods [23,81]. While it is important to continue developing these techniques, their current technical capabilities are limited to the extent that conclusions drawn from them regarding the generative mechanisms of MMN must be treated tentatively.

Conclusions
Subtracting responses elicited by physically different stimuli presented in a passive auditory oddball paradigm to make inferences about deviance detection and predictive coding may be considered an inherently flawed approach, given that low-level physical properties of sound fundamentally influence the auditory evoked response. Using this technique to derive a biomarker for early diagnosis of psychiatric disease is not advisable, and could have negative consequences for prospective patients. Experimental procedures designed to validate the deviance detection theory of MMN have been inadequate, mainly due to state changes (both short- and long-term adaptation) which can differentially affect the sensory response during different stimulus blocks. More critical and discerning research is required to understand the causes of altered central auditory processing under different environmental and neurological conditions. The existing body of clinical MMN literature has a significant role to play in this effort by identifying patient groups that potentially exhibit deficiencies in specific sensory ERP components; for example, abnormal physical sensitivities or adaptive properties of on and off responses. However, it is likely that more sophisticated measurement and analysis techniques will be required to gain a fuller understanding of central auditory processing: difference waveforms derived from the passive oddball paradigm are unnecessarily ambiguous and susceptible to misinterpretation. From a meta-research perspective, the enterprise of MMN research has been largely centralized, which might explain why the prevailing deviance-detection theory has survived relatively unchanged for over four decades despite considerable empirical shortcomings.
Author Contributions: Both of the authors have contributed substantially to drafting and revising this manuscript and have approved the final submission. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Ethical review and approval were waived because this study did not involve any experiments on humans or animals.

Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing not applicable.