Open Access Review

A History of Audio Effects

1
Centre for Digital Music, Queen Mary University of London, London E1 4NS, UK
2
Interdisciplinary Centre for Computer Music Research, University of Plymouth, Plymouth PL4 8AA, UK
*
Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(3), 791; https://doi.org/10.3390/app10030791
Received: 16 December 2019 / Revised: 9 January 2020 / Accepted: 13 January 2020 / Published: 22 January 2020
(This article belongs to the Special Issue Digital Audio Effects)

Abstract

Audio effects are an essential tool that the field of music production relies upon. The ability to intentionally manipulate and modify a piece of sound has opened up considerable opportunities for music making. The evolution of technology has often driven new audio tools and effects, from early architectural acoustics through electromechanical and electronic devices to the digitisation of music production studios. Throughout time, music has constantly borrowed ideas and technological advancements from all other fields and contributed back to the innovative technology. This is defined as transsectorial innovation and fundamentally underpins the technological developments of audio effects. The development and evolution of audio effect technology is discussed, highlighting major technical breakthroughs and the impact of available audio effects.
Keywords: audio effects; history; transsectorial innovation; technology; audio processing; music production

1. Introduction

In this article, we describe the history of audio effects with regards to musical composition (music performance and production). We define audio effects as the controlled transformation of a sound typically based on some control parameters. As such, the term sound transformation can be considered synonymous with audio effect. We focus on audio effects that are used for manipulation of recorded or performed sound, as opposed to signal processing techniques in the context of sound synthesis; that is, we take the view of an audio engineer or producer and discuss what is commonly referred to as audio effects in a music production studio. Thus, we will not specifically discuss synthesis, sound generation methods, or musical instruments.
Audio effects are essential tools in contemporary music production and composition. They are applied to enhance the perceived quality of an audio signal or for creative sound design and music composition. To achieve this, sound transformations alter one or more perceptual attributes, such as loudness, time or rhythm, pitch, timbre, and spatialisation [1,2]. Control values, which may be set by the user via a graphical user interface (GUI), determine the behaviour of the audio effect. A basic flowchart of an audio effect is shown in Figure 1.
We structure our article by specific technological innovations which gave rise to new inventions in the field of audio effects, highlighting the role of transsectorial innovation [3], i.e., the transfer of technology from one industrial sector to another, in the development of new audio effect techniques and music technology in general. Théberge [4] discusses this phenomenon with regard to the impact of digital technology on the musical instrument and audio industries, whereas Mulder [5] discusses music amplification technology and the crucial role that innovations in the fields of telephone, radio, and film technology have played in its development. Julien [6] argues that many innovations in audio effects stem from diverting music performance and recording technology, i.e., using the technology for purposes not originally intended. The main technological advances, crucial for innovation in the development of audio effects, are identified as electromechanics (which includes analogue recording technology such as magnetic tape), analogue signal processing, digital signal processing, as well as later developments in computer science and technology such as music information retrieval and machine learning. In order to put artificial sound modification in a wider historical context, we start by discussing architectural acoustics as part of musical composition and performance prior to the age of audio recording.
An exhaustive description of the large number of available audio effects and their technical specifications is out of scope for this article; however, we give examples of implementations of significant technical innovations and refer the reader to further literature where appropriate. There are several books covering the implementation of audio effects, along with some historical background [7,8,9,10,11]. Articles giving detailed overviews of specific audio effects can be found in the literature; for instance, the history of artificial reverberation is described in References [12,13] and equalisation is discussed in References [14,15]. Bode [16] outlines the history of electronic sound modification and analogue signal processing, largely focussing on sound manipulation in the context of sound synthesis. He concludes that many of the technological principles can be “traced back to the very early days of experimentation and that many of them have survived all phases of the technological evolution”. In our work, we investigate how this assessment still holds true in light of the digitalisation of music in the 1980s, the widespread move from hardware to software in music production studios beginning in the 1990s, and the emergent technologies enabled by recent advances in computer technology.
The rest of the paper is structured as follows. We first describe the influence of room characteristics on music and performance prior to the emergence of recording technology, followed by a brief overview of reverberation chambers in music production. A historical overview of artificial audio effects that were developed with the advent of the abovementioned technological innovations is then presented. We conclude by discussing current work and future directions in the field of audio effects.

2. Acoustic Effects in Built Spaces

2.1. Reverberation

Reverberation, “the decay of sound after a sound source has stopped” [17], consists of reflected, attenuated versions of a direct signal in a particular space. Kuttruff [18] models the response of the room as a sum of “sinusoidal oscillations with different frequencies, each dying out with its own particular damping constant”. In an enclosed room, a large number of reflections from surrounding surfaces build up in a diffuse manner. The decay time of the reverberant signal is commonly defined by RT60, the time it takes for the reverberation level to decrease by 60 dB after suitable excitation of the space with an impulse signal. Depending on the shape and surface materials of the room, the reflected echoes are not exact copies of the direct signal, which adds specific colouration (enhancement or attenuation of certain frequencies) to a given audio signal. The complete impulse response of a room is commonly divided into three parts: direct sound, early reflections, and late reverberation (Figure 2).
The direct sound is the original sound and the first to reach the receiver (its arrival depending on the distance from the source and on the conditions of the medium). The early reflections are the first set of reflections that arrive at the listener’s ear and have a large impact on the perception of the size of a room and on colouration of the audio signal. The late reverberation is composed of the build-up and decay of a large number of diffuse reflections, which are often considered to be uncorrelated with the original signal. The time differences between individual reflections in the early reflections and late reverberation are not discernible by the listener due to the integration time of the human auditory system [8]. The transition between early reflections and late reverberation (mixing time) can be found with statistical methods [19,20,21]. Depending on whether the first reflection arrives later or earlier than a certain time threshold after the direct sound, the listener either discriminates two separate sound events or perceives them as fused together. This is called the precedence effect, and the time gap, i.e., the listener’s echo threshold, was found to depend on the kind of sonic material [22] and, in a more complex way, on periodic similarities within the audio signal [23,24].
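The relationship between a damping constant and the 60 dB decay time defined above can be made concrete with a short calculation. The following is a minimal sketch, assuming a single mode whose amplitude envelope decays as exp(-delta * t); the function names and the example value of delta are hypothetical and not taken from the cited literature.

```python
import math


def level_db(delta: float, t: float) -> float:
    """Level in dB (relative to t = 0) of an amplitude envelope exp(-delta * t)."""
    return 20.0 * math.log10(math.exp(-delta * t))


def rt60_from_damping(delta: float) -> float:
    """Time in seconds for the envelope exp(-delta * t) to fall by 60 dB.

    Solving 20 * log10(exp(-delta * t)) = -60 for t gives t = 3 * ln(10) / delta.
    """
    return 3.0 * math.log(10.0) / delta


# Hypothetical damping constant of 3.5 per second.
example_rt60 = rt60_from_damping(3.5)
```

In a real room, each mode has its own damping constant, so a measured decay time summarises many such envelopes rather than a single one.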

2.2. Orchestrating Acoustic Effects

The desire to intentionally change the sonic characteristics of a sound for a musical performance is far older than recording technology. Exploiting the acoustical characteristics of a performance space is an early example of the intentional and artistic application of an audio effect to music. Room acoustics have always been interwoven with music. Indeed, the development of musical traditions may have been directly influenced by the architecture [25,26,27,28]. Sabine [29] is often considered the founder of architectural acoustics. He opened his seminal paper on reverberation with “The following investigation was not undertaken at first by choice but devolved on the writer in 1895 through instruction from the Corporation of Harvard University to propose changes for remedying the acoustical difficulties in the lecture-room of the Fogg Art Museum, a building that had just been completed” [29].
Sabine [30] explained the development of a sophisticated musical scale with the highly reflective surfaces of European buildings made from stone. He argues that the increase in size of the temples and churches eventually led to religious services consisting mainly of Gregorian chants characterised by slow tempo and sequential pitch changes rather than spoken word due to its unintelligibility in highly reverberant spaces. Conversely, open spaces, which can be considered as completely absorbent, may be favourable for the development of rhythmic music.
There are several examples in Western musical history demonstrating that composers did indeed take the acoustical properties of the performance space into account in the composition process. Based on the cori spezzati (divided choirs) tradition, Giovanni Gabrieli specifically exploited the reverberant effect of the large St. Mark’s Cathedral in Venice, which featured two organs, one on each end of the church [31]. Richard Wagner’s composition has been linked to the acoustics of the Bayreuth Festspielhaus, and Berlioz described theoretical performance spaces that would support his musical intention [32]. In the text that follows, we describe the origins of acoustic effects, tracing them back to the earliest known traces of aesthetic intention in the selection or design of certain sites.
The relationship between architecture and acoustics began many centuries ago. According to Blesser [32], we could hypothesise that, at the time of cavemen, paintings and decorations were executed in places presenting a peculiar resonance so that the amplification of the voice or other sounds would lend a more convincing narrative to the depicted scene. Typical cave paintings have always been found in the most acoustically pleasant part of the cave, the part that is best for telling stories [33]. Fazenda et al. [34] confirmed that there may be a correlation between the placement of decorations and the reverberation time in some sections of these ancient caves. Suggestive examples that have aroused the interest of researchers include stone complexes for ritual use such as Stonehenge in England or the temple of Hal-Saflieni in Malta, for which archaeoacoustic projects were launched to recover the original sound and to communicate it to today’s public. In the case of Stonehenge [35,36], the full-scale model at Maryhill in Washington was used to show how the sound of percussion instruments, elaborated in a rhythmic texture, could interact with the stone structures to create a diffuse and resonant spatiality, rich in reflections at low frequencies (48 Hz) and distinguishable echoes, more suitable for ritualistic purposes of popular music production. Moreover, it has been discovered that the site could favour particular infrasonic frequencies (≈10 Hz), which can stimulate relaxation and trance states, favouring the creation and synchronisation of alpha brain waves.
Findings in the field of archaeoacoustics reveal that resonant characteristics of ancient structures may have been deliberately designed to exhibit specific acoustic qualities (the resonance frequencies of a room, or room modes, are the frequencies whose wavelengths are directly related to the room dimensions). Jahn et al. [37] measured the resonance frequencies of five megalithic chambers from the chalcolithic age and one chamber dated to 400 BC; they found the dominant resonance frequency of all the structures to be in the range of 95 to 120 Hz, particularly around 110 Hz. This suggests that the room modes were chosen to enhance male voices during rituals involving chanting. Cook et al. [38] later showed by electroencephalography (EEG) that human brain activity exhibits measurable changes when exposed to frequencies in close vicinity of 110 Hz, similar to those assumed to be associated with emotional processing. These findings lead to the hypothesis that the room resonances of the investigated chambers had an additional purpose in changing the emotional state of participants of events involving musical performances.
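To illustrate how a room mode relates to room dimensions, the sketch below uses the textbook formula for axial modes between two parallel rigid walls, f = n c / (2 L). This is a strong idealisation: the megalithic chambers discussed above are irregular, and the speed of sound of 343 m/s and the function names are assumptions made only for this example.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def axial_mode_hz(length_m: float, order: int = 1, c: float = SPEED_OF_SOUND) -> float:
    """Frequency of the axial mode of given order between two parallel rigid walls."""
    return order * c / (2.0 * length_m)


def length_for_mode_m(freq_hz: float, order: int = 1, c: float = SPEED_OF_SOUND) -> float:
    """Wall spacing whose axial mode of given order falls at freq_hz."""
    return order * c / (2.0 * freq_hz)


# Wall spacing whose first axial mode would lie at the 110 Hz reported above.
spacing_for_110_hz = length_for_mode_m(110.0)
```

Under these assumptions, a first axial mode at 110 Hz corresponds to a wall spacing of roughly one and a half metres, which gives a sense of scale only and says nothing about the actual geometry of the measured chambers.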
Similar intentions have been discovered through studies conducted in the megalithic temples of Malta, in particular, the Hal Saflieni hypogeum. In this temple, the Oracle Room [39,40] seems to have particular acoustical characteristics: a wide resonance at low frequencies, with a reverberation of 16 seconds, and speech that becomes unintelligible over a very short distance. The resulting sound effects therefore seem more suitable for syllables sung in a prolonged fashion, especially by female voices, than for tight rhythms. Moreover, the lack of intelligibility, combined with the sustained amplification of sounds with components in the medium-low frequencies, suggests that this place was used for the atmosphere created by the blend of sounds and their unpredictable spatialisation due to the smooth surfaces rather than for the clear communication of particular messages.
Moving forward a few centuries, a more extensive body of mathematical and geometrical knowledge was developed, useful for discriminating the effects caused by surfaces and constructions on sound phenomena. Vitruvius’ treatise “De Architectura”, dating from between 40 and 20 BC, was well known in ancient times [41]. As Vitruvius suggests, Greek architects and builders were considered experts in theatre design and its acoustic effects, probably also because of the influence that the Pythagorean school may have had in some intellectual contexts. In fact, Policleto, the architect of the theatre of Epidaurus (360 BC) [42,43], was a disciple of Pythagoras [44], who can be considered the main figure promoting the study of acoustics in Greek times. During this period, devices such as the Acoustic Masks [45,46] were created that, according to Cassiodorus, were of surprising efficiency and worked as an impedance adapter for a speaker’s voice, causing a greater emission of sound [47]. Vitruvius is also responsible for introducing the famous acoustical descriptors that have influenced several scholars over the centuries. These consist of architectural solutions defined as resonantes, consonantes, circumsonantes, and dissonantes, depending on the integration of the architecture in the local geographic morphology. In addition to these definitions, Vitruvius suggests a complex use of devices called echeia to obtain particular acoustic effects at certain frequencies. These echeia, or pinakes, consist of ceramic pots, now known as Helmholtz resonators, aimed at amplifying specific frequencies. For a long time, it was believed that such installations could tune the theatre, and numerous attempts were made to achieve perfect acoustics. Arns and Crawford [48] suggested that these acoustic resonators can be considered the first filters.
Such filters can be reproduced through “electronic counterparts of mechanical quantities, with kinetic energy, potential energy, and heat energy corresponding to the energy in inductors, capacitors, and resistors respectively” [49].
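The analogy between an acoustic resonator and an electrical circuit can be checked numerically. The sketch below is an illustration under stated assumptions, not a model of any historical echeia: it uses the standard lumped-element formula for a Helmholtz resonator without end correction, treats the air in the neck as an acoustic mass (the counterpart of an inductor) and the cavity as an acoustic compliance (the counterpart of a capacitor), and uses hypothetical pot dimensions.

```python
import math

AIR_DENSITY = 1.2       # kg/m^3, assumed
SPEED_OF_SOUND = 343.0  # m/s, assumed


def helmholtz_hz(neck_area_m2: float, neck_length_m: float, cavity_volume_m3: float) -> float:
    """Resonance of an idealised Helmholtz resonator (no neck end correction)."""
    return SPEED_OF_SOUND / (2.0 * math.pi) * math.sqrt(
        neck_area_m2 / (cavity_volume_m3 * neck_length_m)
    )


def acoustic_mass(neck_area_m2: float, neck_length_m: float) -> float:
    """Inertia of the air plug in the neck, analogous to an inductance."""
    return AIR_DENSITY * neck_length_m / neck_area_m2


def acoustic_compliance(cavity_volume_m3: float) -> float:
    """Springiness of the air in the cavity, analogous to a capacitance."""
    return cavity_volume_m3 / (AIR_DENSITY * SPEED_OF_SOUND ** 2)


def lc_resonance_hz(inductance: float, capacitance: float) -> float:
    """Resonance frequency of an ideal LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


# Hypothetical pot: 3 cm neck radius, 5 cm neck length, 10 litre cavity.
area = math.pi * 0.03 ** 2
f_acoustic = helmholtz_hz(area, 0.05, 0.010)
f_electrical = lc_resonance_hz(acoustic_mass(area, 0.05), acoustic_compliance(0.010))
```

Both routes give the same frequency, which is the sense in which the resonator behaves like an electrical filter element.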
Nevertheless, the measures taken in Roman times, which transformed the theatre from a natural site into a scenic architectural construction, were much more useful. Already in the Greek age, wooden stages were employed to amplify the actions of the actors and to improve visibility. The Romans introduced a roof working as a sound projector that directs the acoustical energy towards the spectators, allowing the scene behind the stage to be expanded. Hence, as the stage was no longer the only energy loading part, it could serve as non-acoustic scenography with more freedom, incorporating otherwise detrimental galleries and architectural chiaroscuri. In the Roman era, tunnels were introduced above the highest terraces of the steps to collect the sound and to avoid its aerial dispersion, so as to provide an energetic return to the most distant rows [50]. Already in Greek times, geometric design and the ear were the main instruments for calculating the amplification of sound, as demonstrated by the theatre of Epidaurus, which was built on different foci governed by the sectors of the audience to counteract confusion from sound focusing in the middle (Vovolis [46]).
From the Middle Ages, there are not many traces of major acoustic developments. Echeia were found inserted into the walls of several medieval churches, for example in Switzerland [51] and Serbia [52]. It seems that the resonant vases were installed either by blindly following technical traditions or in the misplaced hope of improving the intelligibility of the spoken word and the acoustic performance of the enclosure [51].
In the 17th century, Athanasius Kircher [53] wrote about acoustics as Magia Phonocamptica, the “magic art” of the bent sound. He illustrated numerous devices whose only effect is that of modelling and modulating sound phenomena. He drafted inventions to amplify sound through pipes in buildings and sound shells embedded in house walls, as if they were modern intercoms. He drew elliptical walls and ceilings to make sound travel better along the surface from one focus to another. This visionary activity was accompanied by his theories on the use of geometrical projections as well as listening experiments with choirs, called Musica per Echo, to be performed under specific domes [53].
Among a number of sacred sites, examples of architectural constructions aimed at achieving particular sound effects with scenic vibrance have progressively been found, such as the pyramids of Chichen Itza, of which the steps create the sound of the quetzal bird. Lubman [54] argues that the echo produced by the Mayan Chichen Itza pyramid deliberately resembles the call of the quetzal, which was considered sacred in Mayan culture and hailed as the bird messenger of the gods. The sound is produced by periodic reflections of the sound of a hand clap from the stair faces of the steps on the pyramid (Figure 3).
This sonic effect could be seen more as a special effect than a reverberation as described in the previous examples; however, the principle is the same: the reflected sound, or the addition of the reflections to the original, is the intended sonic outcome and, thus, structures and buildings become means to alter sound in a controlled way [55].
Other examples can be found in Islamic architecture, where finely decorated domes and fractal excavations ensure a uniform and suggestive diffusion. We can find traces of whispering galleries in the late 17th century in Europe, designed to allow speakers to engage in discreet conversations by speaking to walls [25,53]. However, it is debated whether this design was initially intentional [30,56]. Over the 18th century, a number of theatres were designed with specific instructions and guidelines aiming at making them more functional for their purpose [57]. The horseshoe-shaped theatre optimised for opera developed around this time [58]. The Roman “Odeon” was rediscovered by Wagner in his Bayreuth Festspielhaus in order to improve listener envelopment [59]. The science of acoustics began to be employed in architecture to deliver better theatrical performances, but it was not until Sabine that the first coherent formulations of reverberation theory, relationships between surface and volume, and frequency absorption coefficients appeared.
It is to be noted that, as measurement instruments also developed, based on mechanical devices controlled by human force, motors, and later electricity, recording sound became an increasingly viable possibility. As sound started being recorded, it became detached from its original location and its original acoustic justification. On the one hand, this gave birth to a new medium detached from space and its architecture; on the other hand, no longer being dependent on its causality or its performance, it paved the way for the emergence of countless new creative possibilities, depending on the reproduction techniques, their control, and their technology.

2.3. Reverberation Chambers in Music Recording

Artificial reverberation may be the oldest audio effect used in music recording and transmission. With the arrival of electrical recording technology, it was possible to capture some of the room characteristics on a disc. In the mid-1920s, this was mainly done when recording classical music to give the listener the acoustic impression of being transported into the performance space. The “dry” sound was preferred for popular music, providing an experience of more intimacy in that the reverberation depends solely on the listening space [60]. As one of the largest broadcast companies at the time, RCA (Radio Corporation of America) registered a patent for a reverberation chamber in 1926 [61]. Its original application was to replace reverberation that had been lost in sound recordings made with close microphones to avoid background noise [32].
To apply reverberation to an instrument in the recording studio, a signal from the mixing desk is sent to loudspeakers placed at one end of the reverberation chamber. The direct and reflected sound is picked up by microphones in the room, whereby the reverberant sound becomes more prominent as the microphones are placed farther away. The sound produced in the chamber is then fed back to the mixing console and added to the original dry sound, enabling control over the amount of added reverberation. Further control over the sound characteristics is achieved in reverberation chambers that can be dynamically altered, for instance, by variable reflective panels or dual chambers separated by a wall with variable aperture. The design principles of early purpose-built reverberation chambers for recording studios are discussed in Reference [62].
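The send and return signal path described above can be sketched in a few lines. This is a simplified digital illustration, assuming the chamber, loudspeaker, and microphones together behave as a linear time-invariant system summarised by one impulse response; the function names, the toy impulse response, and the return gain are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def chamber_send_return(dry, chamber_ir, return_gain):
    """Add the chamber return, scaled by return_gain, to the dry signal."""
    wet = convolve(dry, chamber_ir)                  # sound picked up in the chamber
    padded_dry = dry + [0.0] * (len(wet) - len(dry))  # align lengths for mixing
    return [d + return_gain * w for d, w in zip(padded_dry, wet)]


# Toy example: a single click sent to a chamber with two reflections.
mixed = chamber_send_return([1.0, 0.0, 0.0], [0.0, 0.5, 0.25], return_gain=0.5)
```

Raising `return_gain` corresponds to bringing up the chamber return on the mixing console; moving the microphones farther from the loudspeaker would instead change the impulse response itself.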

3. Artificial Audio Effect Technologies

3.1. Sound Recording and Reproduction Technology

With the advent of recording technology in the late 19th/early 20th century in the form of the phonograph, new possibilities with regards to composition and performance were quickly discovered by some contemporary composers. While traditionally composers were limited to describing their music in notation form, focusing on pitch, rhythm, and meter, other parameters such as dynamics and articulation could not be described as precisely and were subsequently subject to the performer’s interpretation. Indeed, composer Igor Stravinsky [63] noted in his biography with regards to the player piano and gramophone that, in his view, recording technology offers means to impose “some restrictions on the notorious liberty … which prevents the public from obtaining a correct idea of the author’s intention” and to prevent musicians from “stray[ing] into irresponsible interpretations of … musical text”. Similarly, Bernardini and Rudi [64] identify “a different, deeper, and total control over timbre” as the most important motivation for the use of audio effects as a compositional tool.
Stravinsky [65] proposed to use the gramophone as a tool to apply to musical sounds what could be regarded as audio effects by creating “specific music for phonographic reproduction”. He envisioned a musical form where the intended timbre of a sound is achieved only through mechanical reproduction. Early electric recordings of that time, although an improvement on the quality of previous mechanical recordings, still had a considerably limited frequency range of around 100 to 5000 Hz. Furthermore, the phonograph was prone to distortion and noise, such as crackling; wow and flutter (cyclic speed fluctuations caused by mechanical limitations); the pinch effect (caused by a varying width of a groove of a monophonic recording and the resulting vertical motion of the stylus); the tracing error (caused by the difference in shape of the sharp chisel used for cutting the groove and the rounded stylus); and the tracking error (caused by the varying angle of the mounted tone-arm towards the groove during playback). Indeed, considering this degradation of sound, the deliberate use of the phonograph to shape timbre may be considered an early example of audio distortion and filter effects. Toch [66] made a similar statement with regard to the use of the gramophone as a means to create music. His Grammophonmusik (gramophone music) compositions specifically exploited the gramophone’s new possibilities including its shortcomings and peculiarities with regard to the faithful reproduction of music.
The use of pitch-shifting effects in the form of rerecording sound played back at different speeds can be observed in the gramophone music of Paul Hindemith and Ernst Toch. In Hindemith’s Trickaufnahmen (trick recordings), performed at the annual modern music festival Neue Musik Berlin 1930 alongside Toch’s works, he rerecorded several previously recorded sounds simultaneously at different playback speeds [67]. He created chords by using a technique for which, with the advent of magnetic tape, the term overdubbing would be coined. The concept of the effect is indeed comparable to the modern harmoniser effect, where harmonically pitch-shifted versions of a note are added to create a harmonic chorus effect. The first instance of sound manipulation using the phonograph may be that of Stefan Wolpe, who in 1920 set up a Dada performance featuring eight phonographs that simultaneously played classical and popular music at variable speeds, including reversal of playing direction [68]. Darius Milhaud similarly used the phonograph in musical performances in the 1920s; however, in contrast to Toch’s and Hindemith’s music, these performances did not produce recorded compositions. Nevertheless, the era of gramophone music was relatively short and only a few composers picked up on this work. Edgard Varèse, for instance, experimented with multiple variable-speed turntables in 1935. Another notable example is John Cage, who was in attendance at the Berlin performances of Hindemith and Toch and who later acknowledged the importance of their work. Cage’s work Imaginary Landscape No. 1 from 1939 involved playback of electronically generated sounds at different speeds [69]. Advances in technology to record audio optically on sound film, which had several advantages over disc recordings, also contributed to the diminishing interest in gramophone music at the time. As opposed to film, the 78 rpm discs used for phonographic playback had a recording limit of only 4 min. In addition, film could be spliced and cut. In the world of experimental (and popular) music, however, magnetic tape, as developed in the 1930s by German company AEG (Allgemeine Elektrizitäts-Gesellschaft) and after World War II by American company Ampex, was of greater importance.
Pierre Schaeffer and Pierre Henry were the driving force behind the development of the French school of early electronic music, musique concrète, beginning in the 1940s and the founding of the Groupe de Recherches Musicales (GRM) in Paris. Musique concrète was based mainly on manipulated recorded sound, where music was seen as a “sequence of sound objects”. According to Schaeffer [70], these sound objects “must be distinguished from the sound body or from the device that creates it”. This is also referred to as acousmatic sound, a sound that one hears without seeing the causes behind it [71]. Schaeffer describes this disassociation of sound from the sound source or context by the listener as “reduced listening”, which can be supported by artificial manipulation of sound. Indeed, Schaeffer laid out several postulates and rules for musique concrète, one of them stating the need to learn how to utilise “sound manipulating devices”, such as tape recorders, microphones, and filters.
Although the term musique concrète is mainly associated with compositions for tape recorders, Schaeffer also experimented with turntables in his earlier works, pioneering several audio effects. The first five Études de Bruits, premiered in 1948, contained filtered sounds (1/3 octave, low-pass, and high-pass filters) as well as mechanical reverberation of material recorded on shellac records. Moreover, sound modifications included sound transposition (variable speed playback), reverse playback, and dynamic volume envelopes [72].
The arrival of the tape recorder in the early 1950s opened up new possibilities for sound manipulation and led to the development of dedicated devices, such as the phonogène based on speed variations of the playback, thus changing the pitch of the audio material.
A later version of the device, the universal phonogène from 1963, was capable of changing pitch independently from duration. The device was preceded by the time/pitch changer developed by Gabor [73] based on optical film recording and the Tempophon by German company Springer in 1955 using magnetic tape [7]. Even earlier examples of patents based on the principle of rotating pickup systems for changing the duration of sound recordings can be found. In his review of early time stretching/pitch shifting devices, Marlens [74] identifies related patents filed for instance by French and Zinn [75] and Fairbanks et al. [76]. Marlens [74] presented his own device as well, the Audulator, a keyboard instrument that is capable of reproducing a sound over two octaves without changing the duration. These devices were based on the principle of time-granulation. Gabor [73] replaced the slit through which light is projected to a photocell with a rotating cylinder with multiple slits to achieve this. In a similar fashion, the devices using magnetic tape had a rotating drum with multiple playback heads, thus picking up segments of the recording successively [76]. The pitch was controlled by the relative speed of the heads to the tape, while time stretching could be achieved by multiplicative and chop-out scanning [74].
The pitch is dependent on the relative velocity of the tape to the multiple rotary head. If the head rotates in the direction of the tape movement, the relative speed is lowered, which results in a lower pitch. Moving the playback head in the opposite direction increases the speed at which the information from the tape is read, thus raising the pitch. Multiplicative scanning results in lengthening of the duration; likewise, shortening is achieved by skipping portions of the tape at equal intervals.
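The dependence of pitch on relative head-to-tape velocity, and of duration on tape speed, can be written out as a small calculation. This is a minimal sketch of the idealised kinematics only: it ignores the crossfading between heads and any audible granulation artefacts, the sign convention and function names are assumptions, and the tape speed of 38.1 cm/s is an illustrative value rather than a specification of any device mentioned above.

```python
def rotating_head_playback(tape_speed, head_speed, recording_speed):
    """Return (pitch_ratio, duration_ratio) for a rotating-head playback device.

    head_speed > 0 means the head surface moves in the same direction as the
    tape, which lowers the relative speed and therefore the pitch.
    """
    relative_speed = tape_speed - head_speed
    pitch_ratio = relative_speed / recording_speed
    duration_ratio = recording_speed / tape_speed  # slower tape, longer playback
    return pitch_ratio, duration_ratio


def speeds_for(pitch_ratio, duration_ratio, recording_speed):
    """Tape and head speeds that give the requested pitch and duration ratios."""
    tape_speed = recording_speed / duration_ratio
    head_speed = tape_speed - pitch_ratio * recording_speed
    return tape_speed, head_speed


# Stretch a recording made at 38.1 cm/s to twice its length at unchanged pitch.
tape, head = speeds_for(pitch_ratio=1.0, duration_ratio=2.0, recording_speed=38.1)
pitch, duration = rotating_head_playback(tape, head, recording_speed=38.1)
```

In this example the tape runs at half speed while the head moves against the tape direction (negative `head_speed`), so each portion of tape is scanned more than once, which corresponds to the multiplicative scanning described above.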
This development is significant, since it presents an early example of dissociating pitch from duration, the basis for many effects in use today based on time stretching and pitch shifting as opposed to simple variable speed replay. Furthermore, these developments constitute early examples of granulation. The use of granulation as a compositional tool was pioneered by Iannis Xenakis in Metastasis (1954), Concret PH (1958), and Analogique A-B (1959), the latter being described by Xenakis [77]. For the piece Analogique B (1959), he recorded sinusoids produced by an analogue tone generator and scattered grains onto time-grids after cutting the tones into short fragments. Xenakis proposed the idea that every sound, including musical recordings, can be represented as a combination of elementary sounds. More novel devices were developed at GRM, such as a three-head tape recorder for simultaneous playback and the Morphophone featuring ten heads for the creation of echo effects and looping capabilities.
Soon, a number of standard techniques for the creation of tape music were established. Splicing and cutting of tape in various manners, for instance, created different effects of the transition from one sound to another [78]. Tape delay (see Figure 4), manipulation of playback speed, and tape reversal became common practice.
Although experimental composers and researchers within the academic framework continued to innovate in the audio effects domain, the emerging technologies became increasingly important in the production of popular music post World War II. In 1941, Les Paul started producing repetition effects with disks and multiple playback pickups [16]. Towards the end of the decade, he made extensive use of multitrack recording-based effects and pioneered several techniques in the creation of his music [79]. By recording instruments with a tape recorder while playing back previously recorded tapes, he produced his music by layering the instruments one after the other. Here, the recorded material was sometimes played back at double speed, which also transposes the audio material up by one octave and changes the timbre due to the shifted spectrum. Another technique of Paul was to play back, for instance, a rhythm guitar at half speed while recording a guitar solo on top of it. In the final piece, the combined tracks would be played at normal speed again. This resulted in the solo sounding tremendously fast. Indeed, due to the novelty of these techniques, listeners at the time were often clueless as to how this music could have been produced. Les Paul himself would often give intentionally misleading explanations for his sound. He attributed the high-pitched guitar sound to guitars of small sizes pitched one octave higher and claimed to create the chorus-like effects on vocal and guitar tracks with a fictional device he called the Les Paulverizer capable of multiplying instruments and vocals. In reality, however, the chorus was a result of his multitracking technique of recording several takes on top of each other. Confronted with the dilemma that it was impossible to reproduce his studio sound in a live performance setting, Les Paul did eventually build, around 1956, a little black box to be mounted on his guitar, which he referred to as the Les Paulverizer (see Figure 5). 
This device allowed him to control a tape recorder while playing in order to record and play back several tracks while on stage.
With the new possibilities of multitracking technology, delay-based effects emerged, such as delay (echo), flanging, and slapback. These effects are closely related to each other. The main difference lies in the delay time ranges in which they operate. Table 1 shows the approximate delay time ranges for the members of this family of effects. While flangers usually apply periodic modulation, many chorus implementations perform random modulation of the delayed signal. Furthermore, as opposed to the chorus and flanger effects, echo effects (and resonator and slapback effects) typically do not employ any modulation of the delay time.
While Les Paul is considered the first to use the flanger effect in 1952 [16], its first use with automatic double tracking (ADT) goes back to Ken Townsend experimenting as a recording engineer at Abbey Road Studios in the late 1960s. John Lennon is credited with coining the term “flanging” for this particular effect, which was produced by manually and dynamically changing the speed of the second tape machine [81,82].
Another standard audio effect in a music production studio is the chorus. With the arrival of the first digital delay effect unit, the Lexicon Delta T, in 1973, it was possible to achieve shorter delays than with the previously used tape delays [6]. Although the effect relies on multiple tracks of the same recording, it may still be classified as a timbre-effect [83]. It is a delay-based audio transformation where the output is a linear sum of the dry input signal and the dynamically delayed input signal. The delay time range for this effect is relatively short (under 30 ms) so that the dry and delayed signals are perceived as a single sound.
Modulation of the delay time results in some deviation in pitch, depending on the modulation depth. Specifically, reducing the delay time by reading from the delay line at a faster rate increases the pitch while slowing down reading from the delay line lowers the pitch. The typical application of the chorus in music production is to take a single source sound and to emulate multiple sources playing in unison to represent the natural effect occurring when several performers play or sing the same music. The effect takes into account the fact that it is virtually impossible for musicians to play in perfect synchronicity.
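The chorus principle described above, a linear sum of the dry signal and a copy read from a delay line with a slowly varying delay time, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name `chorus`, the sinusoidal modulator, the linear interpolation, and all default parameter values are choices made for this example (as noted earlier, many chorus implementations use random rather than periodic modulation).

```python
import math


def chorus(x, sample_rate, base_delay_ms=20.0, depth_ms=5.0,
           rate_hz=0.5, mix=0.5):
    """Mix the dry signal with a copy whose delay time is modulated.

    base_delay_ms should exceed depth_ms so the delay stays positive.
    """
    out = []
    for n, dry in enumerate(x):
        # Delay time varies slowly around the base delay (assumed sine LFO).
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        delay_ms = base_delay_ms + depth_ms * lfo
        # Fractional read position in the input history.
        pos = n - delay_ms * sample_rate / 1000.0
        if pos < 0:
            wet = 0.0  # nothing has been written to the delay line yet
        else:
            i = int(pos)
            frac = pos - i
            nxt = x[min(i + 1, n)]
            # Linear interpolation between neighbouring samples.
            wet = (1.0 - frac) * x[i] + frac * nxt
        out.append((1.0 - mix) * dry + mix * wet)
    return out


# One second of a 220 Hz tone at an assumed sample rate of 8 kHz.
tone = [math.sin(2.0 * math.pi * 220.0 * n / 8000) for n in range(8000)]
processed = chorus(tone, sample_rate=8000)
```

While the delay time is decreasing, the read position advances faster than one sample per output sample, which raises the pitch of the delayed copy; while it is increasing, the pitch is lowered.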

3.2. Electromechanical Effects

There are several audio effect devices combining analogue electronics with mechanical systems. Electromechanical effects include different approaches to artificial reverberation as well as other lesser known implementations, such as echo and tremolo. A special class of electromechanical audio effects are those based on recording and playback technology, most importantly the tape player, which we discussed in the previous section.
Research in the field of long-distance communication in the early 20th century led to technological innovations that form the basis for the development of the first electromechanical audio effects. In an effort to develop energy-storage delay lines, Bell Telephone Laboratories built the first electromechanical delay line using helical springs [84] in order to simulate the delay occurring in long-distance telephone calls. In 1939, Laurens Hammond filed a patent for a reverberation system based on this technique [85], which became an integrated part of the model A-100 Hammond organ first sold in 1959. The system creates the effect by sending the signal to a transducer exciting a spring. The resulting mechanical energy is converted into an electrical signal by a transducer at the other end and is added to the original input signal on the output. These spring reverb units usually consist of two or more springs, characterised by their wire gauge, coil diameter, and metal composition, as well as their tension and length. A damping mechanism may be added to adjust the decay time of the reverberated signal. Although (or as a result of) not producing a natural sounding room or hall simulation, spring reverberation became a typical sound in music of the 60s and 70s, particularly its use with the electric guitar. Even today, it is still a sought-after audio effect. However, there are techniques to produce more natural-sounding spring reverberation. For instance, Fidi [86] used helix springs with long time delays in which transfer characteristics were changed statistically by etching the wire surfaces and filtered out residual correlated signals [12].
Based on the same principle, German company Elektro-Mess-Technik (EMT) introduced the first plate-reverberator in 1957. Instead of a spring, a thin metal plate was suspended under tension with a transducer attached to its centre. Due to the more complex vibration pattern, the plate reverb produces a denser, more natural-sounding reverberation effect. Plates of different materials were used, with steel plates [87] preceding thinner gold plates [88], that allowed smaller designs with improved high-frequency response. Plate reverberators also often included a damping system to control the reverberation time.
Improving the sound of the Hammond organ was the motivation behind the development of the rotating speaker first developed by Donald Leslie in 1937. In a Leslie rotating speaker system, the amplified signal is routed to rotating horns for the higher frequency part of the spectrum and a bass speaker is used for the lower frequency components. The bass speaker faces down into a rotating baffle chamber (drum), directing the sound outwards. The result is an effect likened to tremolo, which can be explained by the Doppler effect that occurs when a sound source moves with respect to the observer. A sound source moving towards the observer results in an upward shift in frequency and moving away from the observer results in a downward shift. Leslie introduced the first commercially available rotating speaker model. Somewhat related, Stockhausen invented an apparatus using a rotating speaker to produce continuous movement of a sound source between the audio channels in his electroacoustic compositions. It consisted of a round table of which the axis is mounted with a ball bearing. A speaker is placed on the table projecting sound outwards. Four microphones are positioned around the table and record the signal on one channel each [89].
While the rotating speaker systems and plate reverberators were particularly large and therefore not easily portable, there are several examples of early electromechanical effects with smaller designs. In the DeArmond Tremolo for instance, an electric motor shakes a small canister of electrolytic fluid, grounding the input signal when the splashing fluid comes in contact with a metal connector. The DeArmond 601 Tremolo which uses this design is considered the first guitar effects pedal and became widely available in the late 1940s. However, Story and Clark Piano Co. showcased electric pianos fitted with a DeArmond Tremolo as early as 1941 [90].
There are several examples of delay effect implementations based on alternative storage systems, as opposed to tape. The TEL-RAY Oilcan Echo, invented by Raymond Lubow and sold as a guitar effects pedal, uses a rotating disc inside a small metal drum that is filled with electrolytic oil. A pickup attached to a spinning flywheel inside the drum produces the echo effect [91]. In an effort to simulate the sound of the Leslie rotating speaker in a compact product, Lubow [92] later developed a rotating wah effect pedal, also based on the “oil can” design. The Binson EchoRec delay [93], developed by Bonfiglio Bini, who previously manufactured radios, replaces the tape loop with a memory disc with stainless steel wire wound around an aluminium thread ring. Introduced to the market in the late 1950s, this design allowed for a wider frequency response and did not suffer from artefacts known from tape-based delays, such as wow and flutter.

3.3. Analogue Signal Processing

A large number of sound transformation techniques based on analogue signal processing, i.e., physically altering a continuous signal by changing the voltage or current with electrical components, appeared with the introduction of early electronic musical instruments. In this context, the sound transformations were implemented in order to shape synthesised audio signals as opposed to recordings. Bode [16] reviewed the history of electronic sound modification with an emphasis on sound synthesis. In some of the earliest examples of electronic instruments, the Telefunken Trautonium (marketed from 1933–1935) and the Hammond Novachord (presented 1939 at the New York World’s Fair) formant filters are used for the shaping of overtones. Invented by Dudley [94,95], the voder and vocoder simulated the resonances of human voice using band-pass filters. The vocoder, originally designed to reduce the bandwidth of speech transmission, included a signal analyser of the filtered bands to control the synthesis process, an approach now referred to as envelope following.
The early Hammond organs were capable of amplitude modulation (AM) for a tremolo effect and were later able to perform frequency modulation (FM) for vibrato. Bode [16] notes that the tremolo effect preceded the vibrato capability in the Hammond instruments due to initial difficulties in the implementation. In the 1940s, instruments equipped with ring modulators appeared, for instance, the Bode Melochord [96]. Werner Meyer-Eppler [97], cofounder of the Studio for Electronic Music of the West German Radio in Cologne, which was heavily involved in the exploration of music creation based purely on electronically synthesised sounds since the 1950s, discusses the musical applications of ring modulation. Both AM and ring modulation (RM) produce side bands. AM retains the carrier frequency in the resulting spectrum due to the unipolar modulator; the modulation frequency itself on the other hand is not present. The spectrum of a ring-modulated signal consists of the same side bands; however, due to the bipolar modulator, the carrier frequency is not present.
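The difference between the two spectra can be checked numerically. The following is a minimal sketch, assuming a 1000 Hz carrier, a 100 Hz modulator, and a sample rate of 8 kHz (all values chosen for illustration); the helper `magnitude_at` is a hypothetical single-bin Fourier analysis written for this example.

```python
import math


def magnitude_at(signal, freq_hz, sample_rate):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


SAMPLE_RATE = 8000
CARRIER_HZ = 1000.0
MODULATOR_HZ = 100.0

# One second of audio, so both sinusoids complete whole cycles.
carrier = [math.sin(2.0 * math.pi * CARRIER_HZ * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]
modulator = [math.sin(2.0 * math.pi * MODULATOR_HZ * n / SAMPLE_RATE)
             for n in range(SAMPLE_RATE)]

# Ring modulation: bipolar modulator, plain multiplication.
ring = [c * m for c, m in zip(carrier, modulator)]
# Amplitude modulation: the modulator is offset to be unipolar (0..1).
am = [c * 0.5 * (1.0 + m) for c, m in zip(carrier, modulator)]

for name, signal in (("AM", am), ("RM", ring)):
    levels = [round(magnitude_at(signal, f, SAMPLE_RATE), 3)
              for f in (100.0, 900.0, 1000.0, 1100.0)]
    print(name, levels)
```

Under these assumptions, both signals show side bands at 900 Hz and 1100 Hz and no component at the modulation frequency, while only the AM signal retains energy at the 1000 Hz carrier.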
Based on the ring modulator, Bode [98] presented a frequency shifter, which, unlike the commonly known pitch-shifting effect, alters each component of the spectrum by a fixed amount, thus changing the harmonic relationships and creating new timbres. This device was developed for the electronic musical instrument manufacturer Moog, a company credited with marketing one of the first modular synthesisers. Moog developed a large number of techniques in the field of signal transformation for musical purposes and filed several patents, among them a phase-shift-type frequency shifter, specialised filters, and ring modulators to be integrated in electronic instruments [16].
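The distinction between frequency shifting and pitch shifting is easiest to see on a list of partial frequencies. The sketch below is purely arithmetic and does not model Bode's circuit; the function names and the example partials are assumptions made for illustration.

```python
def pitch_shift_partials(partials_hz, ratio):
    """Pitch shifting scales every partial by the same ratio."""
    return [f * ratio for f in partials_hz]


def frequency_shift_partials(partials_hz, shift_hz):
    """Frequency shifting adds the same offset to every partial."""
    return [f + shift_hz for f in partials_hz]


def is_harmonic(partials_hz, tolerance=1e-9):
    """True if every partial is an integer multiple of the lowest one."""
    fundamental = partials_hz[0]
    return all(abs(f / fundamental - round(f / fundamental)) < tolerance
               for f in partials_hz)


harmonic_tone = [100.0, 200.0, 300.0, 400.0]

pitched = pitch_shift_partials(harmonic_tone, 1.5)
shifted = frequency_shift_partials(harmonic_tone, 50.0)

print(pitched, is_harmonic(pitched))   # ratios between partials preserved
print(shifted, is_harmonic(shifted))   # ratios between partials altered
```

Both operations move the lowest partial to 150 Hz, but only the pitch-shifted tone keeps its partials at integer multiples of it, which is why frequency shifting produces new, often inharmonic timbres.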
Artificial reverberation based on delay lines and all-pass filters was described by Schroeder and Moorer [99,100]. The all-pass filters diffuse the sound by adding frequency-dependent time shifts to the output and are also referred to as impulse expanders or impulse diffusers.
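To illustrate how such an all-pass section spreads an impulse over time without changing the overall energy, the following minimal sketch implements one commonly cited form of the Schroeder all-pass difference equation, y[n] = -g·x[n] + x[n-D] + g·y[n-D]. The function name, the delay of three samples, and the gain of 0.5 are assumptions chosen for this example and are not taken from the cited works.

```python
def schroeder_allpass(x, delay_samples, gain):
    """All-pass section: y[n] = -g*x[n] + x[n-D] + g*y[n-D]."""
    y = []
    for n, sample in enumerate(x):
        # Samples before the start of the signal are treated as silence.
        x_delayed = x[n - delay_samples] if n >= delay_samples else 0.0
        y_delayed = y[n - delay_samples] if n >= delay_samples else 0.0
        y.append(-gain * sample + x_delayed + gain * y_delayed)
    return y


# Impulse response: a single impulse is expanded into a decaying train.
impulse = [1.0] + [0.0] * 199
response = schroeder_allpass(impulse, delay_samples=3, gain=0.5)

print(response[:10])
print(sum(v * v for v in response))  # energy stays close to 1
```

The single input impulse is turned into a decaying sequence of pulses spaced by the delay length, which is the "impulse expander" behaviour mentioned above; a reverberator would typically chain several such sections with different delay lengths.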
The vacuum tube (or thermionic valve), an analogue component invented in the first decade of the 20th century for the amplification of low-level signals, has been used extensively in the design of guitar amplifiers and microphones as well as in audio effects such as equalisers and dynamics compressors. Compressors reduce the dynamic range of an audio signal. Dynamic range compression was initially developed for radio broadcast and has been used since the 1950s to compensate for the limited dynamic range of the broadcasting medium. For instance, AM radio has a dynamic range of 50 dB and FM radio has a dynamic range of 60 dB. In the 1960s, compressors found widespread use in recording studios, again, to reduce the dynamic range of a complete recording to the specifications of the recording medium, e.g., LP (long-playing) records (65 dB) or analogue tape recorder (70 dB). The effect is controlled by several parameters: The threshold determines the sound level above which the volume is reduced. The amount of volume change is governed by the ratio setting. A ratio of 1:1 produces no change on the output; with a ratio of 10:1, for instance, a +10 dB change in level above the threshold on the input results in a +1 dB change on the output. Additional parameters are gain to boost the output, and attack and release, determining how quickly the compressor responds to input levels exceeding and falling below the given threshold (see Figure 6). Additionally, the knee parameter may shape the response at the threshold point, i.e., how curved the transition is. Early compressors used a hinge parameter instead of a threshold parameter, setting a midpoint of the approximated dynamic range of the input signal with the ratio determining the amount of dynamic range reduction (see Figure 7) [101].
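The interaction of threshold, ratio, and gain can be made concrete with the static input/output curve of a compressor. The sketch below assumes a hard knee and ignores attack and release, so it only describes the steady-state behaviour; the function name `compressor_output_db` and the example settings are hypothetical.

```python
def compressor_output_db(input_db, threshold_db, ratio, makeup_gain_db=0.0):
    """Static compressor curve with a hard knee (levels in dB)."""
    if input_db <= threshold_db:
        # Below the threshold the level is left unchanged.
        compressed = input_db
    else:
        # Above the threshold, every `ratio` dB of input yields 1 dB of output.
        compressed = threshold_db + (input_db - threshold_db) / ratio
    return compressed + makeup_gain_db


# Threshold at -20 dB: compare a 10:1 ratio with a 1:1 ratio.
for level_db in (-30.0, -20.0, -10.0, 0.0):
    print(level_db,
          compressor_output_db(level_db, threshold_db=-20.0, ratio=10.0),
          compressor_output_db(level_db, threshold_db=-20.0, ratio=1.0))
```

With the assumed threshold of -20 dB and a ratio of 10:1, an input that rises 10 dB above the threshold leaves the compressor only 1 dB above it, whereas a ratio of 1:1 passes the level through unchanged.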
A compressor with a very high ratio, nearing infinite, is referred to as a limiter. Compressors have been in use since the 1930s following the invention of vacuum tubes; prior to their introduction, audio engineers often relied on manually adjusting the loudness of the incoming signal during recording [10]. In the mastering process of contemporary music production, dynamic compression is applied to the mixed signal of a recording as well as creatively on individual channels. Compressors may also be controlled by feeding a secondary signal into the sidechain, a secondary audio input for the level detection. This technique is applied for instance to attenuate music in a radio program during speech or to create a ducking effect by feeding the bass drum into the sidechain, controlling the attenuation of other instruments [102]. Expanders work in a similar fashion; however, they increase the dynamic range by decreasing the output level when the input level falls under the given threshold. In dynamic range processors, the input sound level is measured in a sidechain; hence, this nonlinear effect is an adaptive effect, i.e., it exhibits variable behaviour dependent on sound characteristics. We discuss adaptive audio effects in more detail in Section 4.1.
The filtering of audio frequencies can be traced back to experiments with frequency-division multiplexing in acoustic telegraphy in the late 19th century [103]. Early audio filters were integrated parts of phonograph playback systems and audio receivers. These equalisers were fixed to specific frequency ranges to be amplified or attenuated. While the inventors of the moving coil loudspeaker, Kellogg and Rice, experimented with equalisers as early as the 1920s to enhance the loudspeakers’ frequency response, John Volkmann, working for RCA in the 1930s, is credited with the invention of the first equaliser designed as a stand-alone device equipped with variable frequency filters. The equaliser found widespread use in the film industry, both to enhance speech in post-production and to improve the sound in cinemas. Filters for the adjustment of bass and treble are described in the 1949 paper by Williamson [104]. By the 1950s, equalisation had become a standard technique, both in record production and playback. An early example of a stand-alone equaliser with sliders to control the attenuation and amplification of a bass shelving filter and a peaking filter is the Langevin Model EQ-251A equaliser introduced in the beginning of the 1960s. While earlier designs featured 2 or 3 tone controls, later professional graphic equalisers often have gain controls for 31 bands, with the centre frequency of each band fixed at 1/3 of an octave away from the centre frequency of the neighbouring bands. In the early 1970s, more flexible equalisers were developed. Parametric equalisers allowed, in addition to the gain control, adjustment of the centre frequency and bandwidth of each filter [105]. The design principles and history of equalisers are further detailed by Välimäki and Reiss [15] and Reiss and Brandtsegg [106].
Based on a sweeping filter, in 1966, Del Casher and Bradley J. Plunkett of Warwick Electronics created a novel effect pedal simulating the effect created by trumpet players modulating the sound by moving a mute at the bell of their instruments. The wah-wah pedal produces its typical sound by moving the centre frequency of a resonant band-pass filter up and down in the spectrum depending on the pedal’s position. Prior to the commercial availability of the effect, self-made designs had been in use since the 1950s. Guitarist Chet Atkins is credited as having recorded the first song using a similar device. He modified a DeArmond volume control pedal by replacing the pedal’s volume potentiometer with a control to move a tone control from low to high frequencies [107]. Since its inception, the wah-wah effect has found widespread use, especially after Jimi Hendrix’s use of the Cry-Baby pedal in the late 1960s, and remains popular to this day.
While the shape of traditional filters discussed above is controlled manually by the user, adaptive filters present an example of an adaptive audio effect in the analogue domain. Adaptive filters adjust their frequency response depending on the incoming signal. They are characterised by (i) the signals being processed, (ii) the structure that defines how the output signal is derived from the input signal, (iii) the parameters that are iteratively adjusted to change the filter’s transfer function, and (iv) the circuit design (or algorithm for digital implementations) that defines the error function and how the parameters are optimised [108]. Although most current applications of adaptive filtering are in the digital domain, analogue implementations of adaptive filters have been especially relevant when digital electronics could not provide sufficient processing speed. The first applications of adaptive filters include adaptive antenna systems mitigating the effect of interfering directional noise sources [109] and equalisation, echo cancellation, and crosstalk cancellation in wired digital telecommunication [110]. In digital magnetic storage systems, analogue adaptive filters are used to provide forward equalisation for the signal received from the read head. The first adaptive noise cancelling system was designed in a 1965 student project at Stanford University [111]. A detailed summary of early applications of analogue adaptive filters can be found in Reference [112].
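The four characteristics listed above can be mapped onto a short digital example. The sketch below uses the least-mean-squares (LMS) update, which is one widely used adaptation rule and is chosen here as an assumption; it is not necessarily the method of the systems cited above. The function name `lms_identify`, the two-tap "unknown" system, the step size, and the random seed are all hypothetical values for illustration.

```python
import random


def lms_identify(x, d, num_taps, step_size):
    """Adapt FIR weights so the filtered input x approximates d (LMS rule)."""
    weights = [0.0] * num_taps          # (iii) iteratively adjusted parameters
    history = [0.0] * num_taps          # most recent input samples, newest first
    for sample, desired in zip(x, d):   # (i) the signals being processed
        history = [sample] + history[:-1]
        # (ii) structure: a transversal (FIR) filter.
        output = sum(w * h for w, h in zip(weights, history))
        # (iv) error function and update: gradient step on the squared error.
        error = desired - output
        weights = [w + step_size * error * h
                   for w, h in zip(weights, history)]
    return weights


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]

# Hypothetical unknown system that the filter should learn to imitate.
unknown = [0.5, -0.3]
d = [unknown[0] * x[n] + (unknown[1] * x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]

weights = lms_identify(x, d, num_taps=2, step_size=0.05)
print(weights)
```

In this noise-free setting the weights settle close to the coefficients of the assumed unknown system, illustrating how the filter's transfer function follows the signals rather than a manual control.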
Controlled nonlinear distortion of audio signals, resulting in added harmonics and a compressed sound, for aesthetic purposes has long been established as a standard audio effect used in modern music production. For instance, it is an essential tool in rock music to produce the typical guitar sound. Distortion as a musical effect to shape timbre can be traced back to the introduction of the electric guitar and, as many other discoveries in the field of audio effects, can be attributed to accidental discovery and unintended use of technology. In order to compete with the loudness of brass instruments when guitars were included in dance bands in the 1920s, guitar pickups were introduced. The first commercial electric guitar was manufactured in 1931 by Rickenbacker in the form of the Frying Pan lap steel guitar. To reduce manufacturing cost, early electric guitar amplifiers used output transformers which exhibited distortion levels of up to 50% when turned up beyond their specification. By the 1940s, blues guitarists discovered that overdriving amplifiers beyond their capacity enabled them to deliberately shape the sound. The first commercial distortion pedal was the Maestro FZ-1 Fuzz Tone manufactured by Gibson and became available in the early 1960s. It was aimed at guitar, banjo, and bass players, and the sound was described as “simulating other instruments such as trumpets, trombones, and tubas” [113,114]. In another example of unintended use of technology, the FZ-1 is the result of reverse-engineering the sound of a recording using an amplifier with a damaged tube for Marty Robbins’ 1961 song “Don’t Worry”. Liking the sound, the producer used the recording for the final mix [115]. There are several other examples of distorted sounds in early rock and roll music that were the result of accidents and subsequent experimentation. 
For instance, the distinct guitar sound in Jackie Brenston and Ike Turner’s song “Rocket 88” reportedly stems from experimentation with an amplifier that fell from the roof of a car, resulting in a damaged speaker cone. Guitarist Paul Burlison of Rock ’n’ Roll Trio produced distortion effects by manipulating a tube of a damaged amplifier that he accidentally dropped. The history of the distorted guitar in rock music is discussed in more detail in References [114,116,117]. Although in today’s electronic devices vacuum tubes have been superseded by semiconductors, they still remain popular, especially with guitarists and hi-fi enthusiasts. Their ongoing popularity can be attributed to their distinctive nonlinear behaviour, creating subtle effects characterised by a warm and smooth sound, which is achieved by adding oddly spaced harmonics to the signal. A more dramatic effect, generally referred to as distortion, can be achieved by increasing the input level further into the nonlinear regions of the circuit. These distortion devices are designed to add higher harmonics to the spectrum. Overdrive, distortion, and fuzz are terms often used for the classification of different types of this effect. Overdrive generally refers to a milder effect where an almost linear audio effect is pushed just over the threshold into the nonlinear region by higher input levels, while the fuzz effect is completely nonlinear and is described as producing a particularly harsh sound [8].
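The difference between a mild and a harsh nonlinearity can be illustrated with two memoryless waveshapers applied to a sine tone. This is a minimal sketch under stated assumptions: a hyperbolic tangent stands in for an overdrive-like soft clipper and a hard limiter with high input gain stands in for a fuzz-like effect. Neither models a particular tube circuit or pedal, and the function names, drive values, and test tone are chosen for illustration only.

```python
import math


def soft_clip(sample, drive):
    """Smooth saturation: nearly linear for small inputs (overdrive-like)."""
    return math.tanh(drive * sample)


def hard_clip(sample, drive, limit=1.0):
    """Abrupt limiting after strong amplification (fuzz-like)."""
    return max(-limit, min(limit, drive * sample))


def harmonic_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of the component at freq_hz (single-bin DFT)."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


SAMPLE_RATE = 8000
FUNDAMENTAL_HZ = 100.0
sine = [math.sin(2.0 * math.pi * FUNDAMENTAL_HZ * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

overdrive = [soft_clip(s, drive=1.5) for s in sine]
fuzz = [hard_clip(s, drive=20.0) for s in sine]

# Compare the energy added at the third harmonic (300 Hz).
print(harmonic_amplitude(overdrive, 300.0, SAMPLE_RATE))
print(harmonic_amplitude(fuzz, 300.0, SAMPLE_RATE))
```

Both curves add harmonics that are absent from the input sine, but the hard limiter driven far into clipping adds considerably more high-frequency content than the gently saturating curve.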

3.4. Digital Signal Processing

With computer technology becoming cheaper and more powerful towards the 1990s, it took an increasingly prominent role in music production. Indeed, the computer is the centre of the modern music production studio today and most recording and editing tasks are performed within a digital audio workstation (DAW), with digital audio effects as an important factor in contemporary music production.
Computer-based DAWs are often modelled after the principle of multitrack tape recorders, emulating the established principles of the analogue recording studio. The components of the traditional recording studio, such as mixers and effects, are replaced by digital signal processing (DSP) implementations. Software synthesisers can replace external hardware synthesisers, in which case the signal can be recorded within the computer system omitting the analog-to-digital conversion step. Many DAWs feature virtual effect racks for real-time digital audio effects. Effect implementations often come in the form of plug-ins that can be integrated in different host DAWs to be applied on a given track. This considerably grows the opportunities for analogue emulation and modelling. The DAW, with its easily accessible and powerful digital audio effects, continues the trend that started with the emergence of multitrack recording: the traditional role of the audio engineer and producer has been the recording of performers and subsequent mastering. Eno [118] argues that, in the latter processes, engineers are part of the creative process through their mixing decisions, such as which instruments are predominant, where the instruments are placed in the stereo field, the clarity and masking of different instruments, and the audio effect modifications of those instruments. Therefore, the studio itself became a compositional tool [64,119]. Producers such as George Martin, Trevor Horn, and Phil Spector, who worked with several successful bands in the second half of the 20th century, are especially noted for their innovations in the field [120,121]. The lines between producer, composer, and performer became increasingly blurred, and today, these roles may indeed be filled by a single person. 
This trend intensified with the emergence of music genres such as disco and sampling-based hip hop followed by popular electronic music genres and was further made possible by the decreasing cost of high-quality recording technology. This exceeds the role of enhancing the sound in a postproduction scenario. For further discussion of this phenomenon, the reader is referred to the literature, for instance [122,123,124,125]. An extensive study on the methodologies and workflows of professional music producers and an analysis of the abstraction mechanisms in digital music production systems is presented in Reference [126].
While the emulation of established analogue devices in the digital domain remains an active field of research, the majority of digital audio effects rely on signal processing principles and existing audio effects developed in the analogue domain. Digital signal processing technology makes it possible to design effects with greater complexity and precision [8]. For instance, effects based on analysis/synthesis techniques can be applied in real time, such as the phase vocoder [127] and noise removal [128]. A notable effect introduced in the late 20th century is the auto-tune effect, the automatic pitch and intonation correction of the singing voice. The effect algorithm uses an auto-correlation function to determine the instantaneous pitch of an input signal and changes its pitch according to a given scale [129]. This effect is another example of both transsectorial innovation and the widespread misappropriation of audio effects. The effect’s inventor Harold A. Hildebrand devised the auto-tune algorithm based on his work in seismic data processing for the oil industry, recognising the shared technologies of music and geophysical applications, such as correlation, linear predictive coding, and formant analysis [130]. The auto-tune effect processor developed by Hildebrand’s company Antares since the late 1990s was originally designed to apply subtle corrections to voice recordings while still keeping a natural timbre and intonation in order to lower costs by eliminating the need for manual corrections or retakes. Soon, however, music producers started to use the effect in such a way that it altered the vocal recordings, deliberately making them sound heavily processed and unnatural. Cher’s 1998 song “Believe” is considered the first mainstream song that features the typical synthetic vocal sound characterised by perfect pitch and unnatural instantaneous pitch changes introduced by the exaggerated auto-tune effect. 
Hildebrand himself stated that he “never figured anyone in their right mind would want to do that” [131].
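The two stages mentioned above, pitch detection by auto-correlation and correction towards a scale, can be sketched as follows. This is a deliberately crude illustration and not the commercial algorithm: the function names, the search range, the C major scale, and the A4 = 440 Hz reference are assumptions, and the final resynthesis step that actually shifts the audio is omitted.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Pick the lag with the largest auto-correlation within a pitch range."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[n] * frame[n + lag]
                    for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


def nearest_scale_pitch_hz(freq_hz, pitch_classes=(0, 2, 4, 5, 7, 9, 11),
                           a4_hz=440.0):
    """Snap a frequency to the closest note of a scale (default: C major)."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
    candidates = [m for m in range(int(midi) - 12, int(midi) + 13)
                  if m % 12 in pitch_classes]
    target = min(candidates, key=lambda m: abs(m - midi))
    return a4_hz * 2.0 ** ((target - 69) / 12.0)


SAMPLE_RATE = 8000
frame = [math.sin(2.0 * math.pi * 200.0 * n / SAMPLE_RATE)
         for n in range(1024)]

detected = estimate_pitch_hz(frame, SAMPLE_RATE)
target = nearest_scale_pitch_hz(detected)
print(detected, target, target / detected)  # last value: required pitch ratio
```

A gentle correction would glide towards the target ratio over time, whereas applying it instantaneously produces the stepped, synthetic character described above.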
An example of a purely digital effect is the bitcrusher, a distortion effect that transforms the sound by reducing the bit-depth and sample rate of the input signal. The output sound is characterised by a decreased bandwidth and added quantisation noise. Furthermore, computer systems allow the organisation of a vast number of grains in granulation-based effects [132] and make it possible to apply reverberation by convolution, where a signal is convolved with an impulse response to recreate the acoustic characteristics of a real or artificial space. This technique arguably produces the most realistic room simulation; however, standard convolution lacks the ability to control the produced sound or to interpolate between a set of given impulses [133]. It was Schroeder [99] who proposed the first digital simulation of reverberation. However, it was not until 1977 that the first completely digital reverberator was made available commercially [134]. Around the same time, the first approach for capturing and reapplying a convolution reverberation was developed [133]. Today, artificial and convolution reverberation are standard tools in audio production, as technology moves further from hardware and closer to software solutions. Modern digital reverb units may be some combination of artificial and convolution techniques or may even emulate vintage analogue solutions. Convolution is also used to emulate other linear systems, such as the tonality of guitar cabinets. A review of techniques for the digital emulation of tube-based amplifiers can be found in References [135,136].
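The bitcrusher mentioned at the start of this paragraph is simple enough to sketch directly. The following minimal example assumes input samples in the range -1 to 1, reduces the sample rate by holding every k-th sample, and reduces the bit depth by rounding to a coarse grid; the function name `bitcrush` and its parameters are hypothetical.

```python
import math


def bitcrush(x, bits, downsample):
    """Reduce bit depth and sample rate of a signal in the range [-1, 1]."""
    levels = 2 ** (bits - 1)   # quantisation steps per polarity
    out = []
    held = 0.0
    for n, sample in enumerate(x):
        if n % downsample == 0:
            # Take a new sample only every `downsample` samples and
            # round it to the nearest available amplitude step.
            held = round(sample * levels) / levels
            held = max(-1.0, min(1.0, held))
        # In between, the last quantised value is held (sample-and-hold).
        out.append(held)
    return out


tone = [math.sin(2.0 * math.pi * 220.0 * n / 8000) for n in range(8000)]
crushed = bitcrush(tone, bits=3, downsample=4)
print(sorted(set(crushed)))  # only a handful of amplitude values remain
```

The staircase-shaped output combines the two characteristics named above: holding samples lowers the effective bandwidth, and the coarse amplitude grid adds quantisation noise.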

4. Current Research and Future Directions

4.1. Adaptive Digital Audio Effects

Adaptive audio effects are often considered a more recent class of audio effect; however, they rely mostly on traditional signal analysis and processing principles and have been around for decades. Adaptive audio effects are typically controlled by mapping higher level features to some audio effect parameters. The high-level signal flow of an adaptive effect is depicted in Figure 8. The construction of an adaptive digital audio effect includes three steps [137]:
  • the analysis/feature extraction aspect
  • the mapping between features and effects parameters
  • the transformation and resynthesis aspect of the digital audio effect
Adaptive digital audio effects may be classified into the following categories [137]:
  • Auto-adaptive effects—control parameters are derived from features extracted from the input signal.
  • External-adaptive effects—control parameters are derived from at least one input signal other than that to which the effect is applied.
  • Feedback-adaptive effects—control parameters are derived from features extracted from the output signal.
  • Cross-adaptive effects—a combination of at least two external-adaptive effects, where the features mapped to control parameters of the effect applied to one signal are extracted from the other.
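The three construction steps and the difference between auto-adaptive and external-adaptive control can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the choice of frame-wise RMS as the feature, the linear feature-to-gain mapping, and all function and parameter names are hypothetical and are not taken from Reference [137].

```python
import math


def frame_rms(signal, frame_size):
    """Analysis step: one RMS value per non-overlapping frame."""
    if frame_size < 1:
        raise ValueError("frame_size must be >= 1")
    features = []
    for start in range(0, len(signal), frame_size):
        frame = signal[start:start + frame_size]
        features.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return features


def map_to_gain(rms, min_gain=0.2, max_gain=1.0):
    """Mapping step: a hypothetical linear map in which louder frames get less gain."""
    level = min(1.0, rms)
    return max_gain - (max_gain - min_gain) * level


def adaptive_gain(signal, frame_size=4, control=None):
    """Transformation step: apply the mapped gain to each frame of `signal`.

    With control=None the features come from the input itself (auto-adaptive);
    otherwise they come from a second signal (external-adaptive).
    """
    source = signal if control is None else control
    if len(source) != len(signal):
        raise ValueError("control must have the same length as signal")
    gains = [map_to_gain(f) for f in frame_rms(source, frame_size)]
    return [x * gains[n // frame_size] for n, x in enumerate(signal)]


if __name__ == "__main__":
    loud_then_silent = [1.0] * 4 + [0.0] * 4
    print(adaptive_gain(loud_then_silent, frame_size=4))
    print(adaptive_gain([0.5] * 4, frame_size=4, control=[0.0] * 4))
```

A feedback-adaptive variant would extract the features from the previously computed output rather than from the input, and a cross-adaptive arrangement would run two such external-adaptive effects with each signal acting as the control for the other.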
One of the earliest examples of an adaptive audio effect is the dynamic range compressor (see Section 3.3), where the extracted envelope of a sound signal is used to apply an adaptive gain function [138,139]. These adaptive effects were developed further into noise gates, expanders, limiters, companders, and upward compressors [140,141,142,143]. From this point, adaptive audio effects were developed in which an audio feature of one track is used to control the processing of another track. This is often described as side-chaining and is found primarily in side-chain compressors used for ducking, though there are cases where it is used with noise gates [140]. In all these cases, the feature used to control the audio effect is a smoothed version of the input signal, often described as the envelope. Effects such as the compressor or noise gate use nonlinear functions to transform the signal according to the incoming signal. More sophisticated effects have been proposed with numerous possible features to be used as control parameters, among them spectral, loudness, and pitch-related features.
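The envelope-driven gain described above can be sketched as a feed-forward compressor with an optional side-chain input. This is a simplified illustration, not the design of any specific device cited here: the one-pole envelope follower, the hard-knee gain curve, and the parameter names (`attack_ms`, `release_ms`, `threshold_db`, `ratio`) are assumptions chosen for brevity.

```python
import math


def envelope_follower(signal, sample_rate, attack_ms=5.0, release_ms=50.0):
    """Smooth the rectified signal with separate attack and release time constants."""
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = 0.0
    out = []
    for x in signal:
        level = abs(x)
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out


def compressor_gain(env, threshold_db=-20.0, ratio=4.0):
    """Static hard-knee curve: linear gain to apply for a given envelope value."""
    if env <= 0.0:
        return 1.0
    level_db = 20.0 * math.log10(env)
    if level_db <= threshold_db:
        return 1.0  # below the threshold the signal passes unchanged
    target_db = threshold_db + (level_db - threshold_db) / ratio
    return 10.0 ** ((target_db - level_db) / 20.0)


def compress(signal, sample_rate, sidechain=None, threshold_db=-20.0, ratio=4.0):
    """Apply the gain derived from the envelope of `signal` or of `sidechain`."""
    key = signal if sidechain is None else sidechain
    if len(key) != len(signal):
        raise ValueError("sidechain must have the same length as signal")
    env = envelope_follower(key, sample_rate)
    return [x * compressor_gain(e, threshold_db, ratio) for x, e in zip(signal, env)]


if __name__ == "__main__":
    # Ducking: a steady signal is attenuated while the side-chain input is loud.
    ducked = compress([0.5] * 2000, 1000, sidechain=[1.0] * 2000)
    print(ducked[0], ducked[-1])
```

Passing a second signal as `sidechain` gives the ducking behaviour mentioned above, while leaving it unset yields an ordinary compressor driven by its own input.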
In more recent implementations, adaptive audio effects are controlled by descriptive audio features. These features are obtained by applying techniques drawn from the field of Music Information Retrieval (MIR). Audio features describe qualities of a given audio signal and are also referred to as audio descriptors [144,145,146]. Low-level audio features include spectral descriptors (obtained, for instance, by computing a Fourier transform) and descriptors for loudness or dynamics, while high-level descriptors cover abstract and semantic concepts such as key and chords, as well as music genre or mood. Audio features, in particular with respect to adaptive digital audio effects, are reviewed in Reference [1]. A detailed discussion of these audio features, their extraction, and their mapping to control parameters can be found in References [137,147]. Although these adaptive digital audio effects as described by Verfaille et al. [83] can be considered a new class of audio effects, the audio transformations themselves are, for the most part, based on established signal processing techniques and audio effect algorithms.
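As one example of a low-level spectral descriptor obtained from a Fourier transform, the sketch below computes the spectral centroid of a single frame. It is a minimal illustration under stated assumptions: the naive discrete Fourier transform is used only to keep the example dependency-free, no window function is applied, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    if not frame:
        return 0.0
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: the centroid is undefined, so 0.0 is returned by convention
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


if __name__ == "__main__":
    # A sinusoid that falls exactly on bin 4 of a 32-sample frame.
    frame = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
    print(spectral_centroid(frame, sample_rate=32))
```

A value such as this could serve as the control feature in an adaptive effect, for instance by mapping a higher centroid to a lower filter cutoff; in practice an FFT routine and a window function would normally replace the naive transform.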
When the analysis stage is part of the effect processor itself, various limitations are introduced to its application. Firstly, if the whole audio signal needs to be taken into account in order to obtain the necessary control data, these transformations cannot be applied in real time. This not only introduces limitations to the creative music production process but also renders these effects unsuitable for live performance settings. On the other hand, obtaining the control data from existing, previously extracted metadata reduces the computational burden by omitting the often very complex feature analysis algorithms. Techniques for implementing content-based audio transformations driven by high-level features that are extracted and stored in a database before the effects are applied have been developed in the course of the Content-based Unified Interfaces and Descriptors for Audio/music Databases available Online (CUIDADO) project [148,149] and in the development of experimental plug-in effect software [150]. A basic diagram of the signal flow of such a metadata-driven adaptive audio effect is given in Figure 9.
Verfaille et al. [83] proposed implementation strategies for a large number of features to be used for non-real-time adaptive audio effects, many of which have been commonly used for timbre space description based on the MPEG-7 proposals [151]. An in-depth review and discussion of adaptive audio effects is presented in Reference [106].

4.2. Intelligent Music Production

The use of audio effects has been extended to automated tools for Intelligent Music Production (IMP) and automatic mixing [152]. Within this field, there has been a strong focus on using signal analysis to automate, or directly control, parameters of a preexisting audio effect. This can be achieved by machine learning from data [153,154], by curve fitting or mapping to higher-level parameters [155,156], or by using direct signal analysis to control an audio effect parameter [157,158].
Stables et al. [152] present one approach to the use of audio effects for automatic mixing, in which a high-level analysis of the audio signals captures an audio feature representation that can then be applied directly to a mixing process in a highly deterministic way. This concept is presented more formally by Moffat et al. [159], who combine the Audio Effects Ontology [160] and the Audio Feature Ontology [161]. This concept is demonstrated in the Semantic Compressor [150]. The great advantage of the approach presented by Moffat et al. [159] is that it allows for a constraint optimisation to be applied, where a number of contradictory rules may be considered and interpreted to allow for one of many suitable answers to be identified. As discussed in Moffat and Sandler [162], the use of a single audio effect for a single purpose is not always applicable in the field of IMP. There is no single use for an audio effect, and as such, there is little scope for fully automating a single audio effect, deterministically, through direct audio feature analysis. The automation of audio effect parameters for automatic mixing is only one approach to performing IMP [163].
Pardo et al. [164] proposed the integration of source separation techniques into the field of IMP, which would allow for a higher level of understanding of the processing taking place. In a mixing context, the engineer’s task is to combine a number of sources in a pleasant and appropriate manner, regardless of how many microphones are used to capture the signal. A prototype of this approach demonstrated that source separation can improve IMP systems [157].
There is growing scope for the use of Deep Neural Network (DNN) approaches in automatic mixing. Martínez Ramírez and Reiss [165] observe that previous automatic mixing approaches do not capture the highly nonlinear processes involved in mixing. Since then, there have been several approaches towards using DNNs for automatic mastering, including source separation and remixing [166], automatic mixing through audio feature transformation [167,168], and audio mixing style transfer [169]. Many of these approaches are heavily derived from the field of image processing, particularly style transfer. This is yet another example of transsectorial innovation being present at all stages of audio effect production. DNNs are also at the centre of current research in modelling audio effects, both for linear transformations such as equalization and for time-varying audio effects involving delay lines and low-frequency oscillators [170,171,172].

5. Conclusions

In this article, we have outlined the history and origins of audio effects for musical applications with a focus on how technical breakthroughs influenced their development. From the first instances of using room acoustics to achieve an intended sound in a composition onwards, shaping the timbre of recorded or performed sound has been an integral part of music composition. When recording techniques were invented, the design of reverberators became more important. This was first achieved with physical reverberation chambers and later through the use of electronics and digital technology. We have shown that the large majority of principles for the audio effects in use were proposed, and in most cases implemented, several decades ago. Indeed, the audio effects currently in use are, with only a few exceptions, merely improvements or variations of well-established techniques.
It is clear that a large number of new developments in the field of audio effects are built upon transsectorial innovation. We discussed several examples where this has been facilitated through transsectorial migration [4], that is, the migration of individuals from one industry to another. In our case, these individuals are often engineers with an interest in music performance or production who recognised the potential of specific technologies for the development of audio effects.
The majority of audio effects have not changed at their core in any meaningful way; what has changed and developed with the times is the way in which they are applied and the frameworks into which they are built. Old analogue hardware is less commonly used for producing new chart records, but the state of the art in digital analogue emulation has been pushed to the forefront of research in an attempt to replicate the nostalgic experience of these traditional audio effects [173].
The evolution of technology has had considerable impact on the range of audio effects available and on the growth in audio effect types and options. The development of mechanical technologies allowed for the capture, transportability, and consistent replication of a range of audio effects. The growth of digital technologies forced the generalisation of a number of these audio effects and grew the field of digital analogue emulations. Digital technologies also lend themselves to providing a mapping layer between high-level abstract parameters and the audio effect parameters themselves [174].

Future Perspectives

The evolution of technology has had a steady impact on the range of audio effects available; however, there seems to be little work applying the latest developments in machine learning to audio effects. These technologies have been used to model and represent existing audio effects [170,171,172], to represent the entire mixing process [165], and to improve audio cleanup technologies [175]. However, although we have identified some examples in this article, there has been limited use of this technological advancement to produce new types of audio effects or to enable mix engineers to interact with audio in a completely new way. Technological advancements are consistently in a position to enable more interesting and diverse creative approaches [173,176].
There has been a recent growth in the use of adaptive audio effects in automatic mixing [106,152]. As discussed in Reference [163], there are a number of opportunities for machine learning approaches to learn the direct transformation of audio, allowing data-driven computational approaches to learn signal processing systems. This approach could produce a multitude of new and different audio effects that are less restricted by traditional analogue hardware design and DSP and can instead represent more intuitive or perceptual attributes or a high-level transformational space of audio. The scope and flexibility of this approach lend themselves well both to recent technological advancements and to the continuing tendency of audio effects to grow with the latest technology whilst still maintaining a grounding in the historical implications and definitions of the technology used to create them.

Author Contributions

Conceptualization, T.W., D.M., and A.M.; writing, T.W., D.M., and A.M.; supervision and funding acquisition, M.B.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by EPSRC grant EP/L019981/1 Fusing Audio and Semantic Technologies for Intelligent Music Production and Consumption, by EPSRC grant EP/S026991/1 RadioMe, and by the CdT Media and Arts Technology through the EPSRC grant EP/G03723X/1.

Acknowledgments

Mark B. Sandler acknowledges the support of the Royal Society as a recipient of a Wolfson Research Merit Award.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Verfaille, V.; Holters, M.; Zölzer, U. Introduction. In DAFX—Digital Audio Effects, 2nd ed.; Zölzer, U., Ed.; John Wiley & Sons, Inc.: Chichester, UK, 2011. [Google Scholar]
  2. Wilmering, T.; Fazekas, G.; Sandler, M.B. Audio Effect Classification Based on Auditory Perceptual Attributes. In Proceedings of the AES 135th Convention, New York, NY, USA, 17–20 October 2013. [Google Scholar]
  3. Piatier, A. Transectorial innovations and the transformation of firms. Inf. Soc. 1988, 5, 205–231. [Google Scholar] [CrossRef]
  4. Théberge, P. Any Sound You Can Imagine: Making Music/Consuming Technology; Wesleyan University Press: Middletown, CT, USA, 1997. [Google Scholar]
  5. Mulder, J. Early history of amplified music: Transectorial innovation and decentralized development. In Proceedings of the Audio Engineering Society Conference: 59th International Conference on Sound Reinforcement Engineering and Technology, Montreal, QC, Canada, 15–17 July 2015. [Google Scholar]
  6. Julien, O. The diverting of musical technology by rock musicians: The example of double-tracking. Pop. Music 1999, 18, 357–365. [Google Scholar] [CrossRef]
  7. Roads, C. The Computer Music Tutorial; MIT Press: Cambridge, MA, USA, 1996. [Google Scholar]
  8. Zölzer, U. DAFX—Digital Audio Effects, 2nd ed.; J. Wiley & Sons, Inc.: Chichester, UK, 2011. [Google Scholar]
  9. Reiss, J.D.; McPherson, A. Audio Effects: Theory, Implementation and Application; CRC Press: Boca Raton, FL, USA, 2015. [Google Scholar]
  10. Réveillac, J.M. Musical Sound Effects—Analog and Digital Sound Processing; ISTE Ltd.: London, UK; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2018. [Google Scholar]
  11. De Man, B. Audio Effects in Sound Design. In Foundations in Sound Design Volume 1: Linear Media; Filimowicz, M., Ed.; Routledge: New York, NY, USA, 2019; Volume 1, Chapter 4. [Google Scholar]
  12. Välimäki, V.; Parker, J.D.; Savioja, L.; Smith, M.J.O.; Abel, J.S. Fifty Years of Artificial Reverberation. IEEE Trans. Audio Speech Lang. Process. 2012, 20, 1421–1448. [Google Scholar] [CrossRef]
  13. Välimäki, V.; Parker, J.D.; Savioja, L.; Smith, M.J.O.; Abel, J.S. More Than Fifty Years of Artificial Reverberation. In Proceedings of the AES 60th International Conference: DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech), Leuven, Belgium, 3–5 February 2016. [Google Scholar]
  14. Bohn, D.A. Operator Adjustable Equalizers: An Overview. In Proceedings of the Audio Engineering Society Conference: 6th International Conference: Sound Reinforcement, Nashville, TN, USA, 5–8 May 1988. [Google Scholar]
  15. Välimäki, V.; Reiss, J.D. All About Audio Equalization: Solutions and Frontiers. Appl. Sci. 2016, 6, 129. [Google Scholar] [CrossRef]
  16. Bode, H. History of Electronic Sound Modification. J. Audio Eng. Soc. 1984, 32, 730–738. [Google Scholar]
  17. Cox, T.; d’Antonio, P. Acoustic Absorbers and Diffusers: Theory, Design and Application; CRC Press: Boca Raton, FL, USA, 2016. [Google Scholar]
  18. Kuttruff, H. Room Acoustics; CRC Press: Boca Raton, FL, USA, 2009. [Google Scholar]
  19. Jot, J.M.; Cerveau, L.; Warusfel, O. Analysis and synthesis of room reverberation based on a statistical time-frequency model. In Proceedings of the Audio Engineering Society Convention 103, New York, NY, USA, 26–29 September 1997. [Google Scholar]
  20. Stewart, R.; Sandler, M. Statistical measures of early reflections of room impulse responses. In Proceedings of the 10th International Conference on Digital Audio Effects (DAFx-07), Bordeaux, France, 11–14 September 2007; pp. 59–62. [Google Scholar]
  21. De Cesaris, S.; Morandi, F.; Loreti, L.; D’Orazio, D.; Garai, M. Notes about the early to late transition in Italian theatres. In Proceedings of the ICSV22, Florence, Italy, 12–16 July 2015. [Google Scholar]
  22. Haas, H. The influence of a single echo on the audibility of speech. J. Audio Eng. Soc. 1972, 20, 146–159. [Google Scholar]
  23. D’Orazio, D.; De Cesaris, S.; Garai, M. A comparison of methods to compute the “effective duration” of the autocorrelation function and an alternative proposal. J. Acoust. Soc. Am. 2011, 130, 1954–1961. [Google Scholar] [CrossRef]
  24. D’Orazio, D.; Garai, M. The autocorrelation-based analysis as a tool of sound perception in a reverberant field. Rivista di estetica 2017, 66, 133–147. [Google Scholar] [CrossRef]
  25. Baumann, D. Music and Space: A Systematic and Historical Investigation into the Impact of Architectural Acoustics on Performance Practice followed by a Study of Handel’s Messiah; Peter Lang: Bern, Switzerland, 2011; p. 33. [Google Scholar]
  26. Herr, C.R.; Siebein, G.W. An Acoustical History of Theaters and Concert Halls: An Investigation of Parallel Development in Music, Performance Spaces, and the Orchestra. In Proceedings of the 86th ACSA Annual Meeting and Technology Conference: Constructing Identity “Souped-up” and “Un-plugged”, Cleveland, OH, USA, 14–17 March 1998. [Google Scholar]
  27. Alvim, D. As the World Leaks into the Work: Composition and architecture. Organised Sound 2018, 23, 51–60. [Google Scholar] [CrossRef]
  28. Forsyth, M. Buildings for Music: The Architect, the Musician, the Listener from the Seventeenth Century to the Present Day; The MIT Press: Cambridge, MA, USA, 1985. [Google Scholar]
  29. Sabine, W.C. Reverberation. In The American Architect; Peninsula Publishing: Los Altos, CA, USA, 1900; p. 4. [Google Scholar]
  30. Sabine, W.C. Collected Papers on Acoustics; Dover: Mineola, NY, USA, 1964; p. 272. [Google Scholar]
  31. Bryant, D. The ‘cori spezzati’ of St Mark’s: Myth and reality. Early Music Hist. 1981, 1, 165–186. [Google Scholar] [CrossRef]
  32. Blesser, B. An Interdisciplinary Synthesis of Reverberation Viewpoints. J. Audio Eng. Soc. 2001, 49, 867–903. [Google Scholar]
  33. Weinel, J. Inner Sound: Altered States of Consciousness in Electronic Music and Audio-Visual Media; Oxford University Press: New York, NY, USA, 2018. [Google Scholar]
  34. Fazenda, B.; Scarre, C.; Till, R.; Pasalodos, R.J.; Guerra, M.R.; Tejedor, C.; Peredo, R.O.; Watson, A.; Wyatt, S.; Benito, C.G.; et al. Cave acoustics in prehistory: Exploring the association of Palaeolithic visual motifs and acoustic response. J. Acoust. Soc. Am. 2017, 142, 1332–1349. [Google Scholar] [CrossRef]
  35. Till, R. Songs of the stones: An investigation into the acoustic history and culture of Stonehenge. IASPM@Journal 2011, 1, 1–18. [Google Scholar] [CrossRef]
  36. Watson, A.; Keating, D. Architecture and sound: An acoustic analysis of megalithic monuments in prehistoric Britain. Antiquity 1999, 73, 325–336. [Google Scholar] [CrossRef]
  37. Jahn, R.G.; Devereux, P.; Ibison, M. Acoustical resonances of assorted ancient structures. J. Acoust. Soc. Am. 1996, 99, 649–658. [Google Scholar] [CrossRef]
  38. Cook, I.A.; Pajot, S.K.; Leuchter, A.F. Ancient Architectural Acoustic Resonance Patterns and Regional Brain Activity. Time Mind J. Archaeol. Conscious. Cult. 2008, 1, 95–104. [Google Scholar] [CrossRef]
  39. Devereux, P. A Ceiling Painting in the Hal-Saflieni Hypogeum as Acoustically-Related Imagery: A Preliminary Note. Time Mind 2009, 2, 225–231. [Google Scholar] [CrossRef]
  40. Till, R. An archaeoacoustic study of the Hal Saflieni Hypogeum on Malta. Antiquity 2017, 91, 74–89. [Google Scholar] [CrossRef]
  41. Baldwin, B. The Date, Identity, and Career of Vitruvius. Latomus 1990, 49, 425–434. [Google Scholar]
  42. Lokki, T.; Southern, A.; Siltanen, S.; Savioja, L. Studies of Epidaurus with a hybrid room acoustics modelling method. In Proceedings of the Acoustics of Ancient Theaters Patras, Patras, Greece, 18–21 September 2011. [Google Scholar]
  43. Lokki, T.; Southern, A.; Siltanen, S.; Savioja, L. Acoustics of Epidaurus–studies with room acoustics modelling methods. Acta Acust. United Acust. 2013, 99, 40–47. [Google Scholar] [CrossRef]
  44. Williams, K.; Ostwald, M.J. Architecture and Mathematics from Antiquity to the Future; Springer: Cham, Switzerland, 2015. [Google Scholar]
  45. Vovolis, T. Mask, actor, theatron and landscape in classical Greek theatre. In Proceedings of the Acoustics of Ancient Theatres Conference, Patras, Greece, 18–21 September 2011. [Google Scholar]
  46. Vovolis, T. Acoustical masks and sound aspects of ancient greek theatre. Classica 2012, 25, 149–173. [Google Scholar] [CrossRef]
  47. Cingolani, S.; Spagnolo, R. Acustica Musicale e Architettonica; UTET Libreria: Torino, Italy, 2005. [Google Scholar]
  48. Arns, R.G.; Crawford, B.E. Resonant Cavities in the History of Architectural Acoustics. Technol. Cult. 1995, 36, 104–135. [Google Scholar] [CrossRef]
  49. Cauer, E.; Mathis, W.; Pauli, R. Life and work of Wilhelm Cauer (1900–1945). In Proceedings of the 14th International Symposium on Mathematical Theory of Networks and Systems (MTNS2000), Perpignan, France, 19–23 June 2000. [Google Scholar]
  50. Spagnolo, R. Manuale di Acustica Applicata; UTET Libreria: Torino, Italy, 2001. [Google Scholar]
  51. Desarnaulds, V. De l’acoustique des églises en Suisse–Une Approche Pluridisciplinaire. Ph.D. Thesis, École Polytechnique de Lausanne, Lausanne, Switzerland, 2002. [Google Scholar]
  52. Mijic, M.; Sumarac-Pavlovic, D.; kralja Aleksandra, B. Acoustic Resonators in Serbian Orthodox Churches; Forum Acusticum Sevilla: Sevilla, Spain, 16–20 September 2002. [Google Scholar]
  53. Kircher, A. Phonurgia Nova sive Coniugium Mechanico-Physicum Artis et Naturae Paranympha Phonosophia Concinnatum; Rudolph Dreherr: Kempten, Germany, 1673; pp. 50, 98–99. [Google Scholar]
  54. Lubman, D. Archaeological acoustic study of chirped echo from the Mayan pyramid at Chichén Itzá. J. Acoust. Soc. Am. 1998, 104. [Google Scholar] [CrossRef]
  55. Garza, C.; Medina, A.; Padilla, P.; Ramos, A.; Zalaquett, F. Arqueoacústica maya. La necesidad del estudio sistemático de efectos acústicos en sitios arqueológicos. Estud. Cult. Maya 2008, 32, 63–87. [Google Scholar] [CrossRef]
  56. Schafer, R.M. The Soundscape: Our Sonic Environment and the Tuning of the World; Simon and Schuster: New York, NY, USA, 1993; p. 221. [Google Scholar]
  57. Postma, B.N.; Jouan, S.; Katz, B.F. Pre-Sabine room acoustic design guidelines based on human voice directivity. J. Acoust. Soc. Am. 2018, 143, 2428–2437. [Google Scholar] [CrossRef]
  58. D’Orazio, D.; Nannini, S. Towards Italian Opera Houses: A Review of Acoustic Design in Pre-Sabine Scholars. Acoustics 2019, 1, 15. [Google Scholar] [CrossRef]
  59. D’Orazio, D.; De Cesaris, S.; Morandi, F.; Garai, M. The aesthetics of the Bayreuth Festspielhaus explained by means of acoustic measurements and simulations. J. Cult. Herit. 2018, 34, 151–158. [Google Scholar] [CrossRef]
  60. Doyle, P. Echo and Reverb: Fabricating Space in Popular Music Recording, 1900–1960; Wesleyan University Press: Middletown, CT, USA, 2005. [Google Scholar]
  61. Round, H.J.; West, A.G.D. Transmission and Reproduction of Sound. U.S. Patent 1,853,286, 12 April 1932. [Google Scholar]
  62. Rettinger, M. Reverberation Chambers for Broadcasting and Recording Studios. J. Audio Eng. Soc. 1957, 5. [Google Scholar]
  63. Stravinsky, I. An Autobiography; W. W. Norton: New York, NY, USA, 1962. [Google Scholar]
  64. Bernardini, N.; Rudi, J. Compositional Use of Digital Audio Effects. J. New Music Res. 2002, 31, 87–91. [Google Scholar] [CrossRef]
  65. Stravinsky, I. Meine Stellung zur Schallplatte. Kult. Schallplatte 1930, 1, 65. [Google Scholar]
  66. Toch, E. Über meine Kantata ’Das Wasser’ und meine Grammophonmusik. Melos 1930, 9, 221. [Google Scholar]
  67. Katz, M. Hindemith, Toch, and Grammophonmusik. J. Musicol. Res. 2001, 20, 161–180. [Google Scholar] [CrossRef]
  68. Davies, H. A History of Sampling. Organised Sound 1996, 1, 3–11. [Google Scholar] [CrossRef]
  69. Holmes, T. Electronic and Experimental Music, 3rd ed.; Routledge: New York, NY, USA, 2008. [Google Scholar]
  70. Schaeffer, P. La Musique Concrète; Presses Universitaires de France: Paris, France, 1967. [Google Scholar]
  71. Schaeffer, P. Traité des Objets Musicaux; Editions du Seuil: Paris, France, 1966. [Google Scholar]
  72. Teruggi, D. Technology and musique concrète: The technical developments of the Groupe de Recherches Musicales and their implication in musical composition. Organised Sound 2007, 12, 213–231. [Google Scholar] [CrossRef]
  73. Gabor, D. Theory of Communication. Part 1: The analysis of information. J. Inst. Electr. Eng. Part III Radio Commun. Eng. 1946, 93, 429–441. [Google Scholar] [CrossRef]
  74. Marlens, W.S. Duration and Frequency Alteration. J. Audio Eng. Soc. 1966, 14, 132–139. [Google Scholar]
  75. French, N.R.; Zinn, M.K. Method of and Apparatus for Reducing Width of Transmission Bands. U.S. Patent 1,671,151, 29 May 1928. [Google Scholar]
  76. Fairbanks, G.; Everitt, W.L.; Jaeger, R.P. Recording Device. U.S. Patent 2,886,650, 12 May 1959. [Google Scholar]
  77. Xenakis, I. Formalized Music; Revised Edition; Pendragon Press: Stuyvesant, NY, USA, 1992. [Google Scholar]
  78. Cage, J. Werkverzeichnis; Edition Peters: Frankfurt, Germany, 1962. [Google Scholar]
  79. Kane, B. Acousmatic Fabrications: Les Paul and the ’Les Paulverizer’. J. Vis. Cult. 2011, 10, 212–231. [Google Scholar] [CrossRef]
  80. Dutilleux, P. Filters, Delay, Modulations and Demodulations: A Tutorial. In Proceedings of the 1st COST-G6 Workshop on Digital Audio Effects (DAFx98), Barcelona, Spain, 19–21 November 1998. [Google Scholar]
  81. Lewisohn, M. The Beatles Recordings Sessions; Harmony Books (Crown Publishing): New York, NY, USA, 1989. [Google Scholar]
  82. Abbey Road Studios. Inside Abbey Road Artificial Double Tracking. Available online: https://www.abbeyroad.com/news/inside-abbey-road-artificial-double-tracking-2530 (accessed on 19 January 2020).
  83. Verfaille, V.; Zölzer, U.; Arfib, D. Adaptive Digital Audio Effects (A-DAFx): A New Class Of Sound Transformations. IEEE Trans. Audio Speech Lang. Process. 2006, 14, 1817–1831. [Google Scholar] [CrossRef]
  84. Wegel, R.L. Wave Transmission Device. U.S. Patent 1,852,795, 5 April 1932. [Google Scholar]
  85. Hammond, L. Electrical Musical Instrument. U.S. Patent 2,230,836, 4 February 1941. [Google Scholar]
  86. Fidi, W. Delay Device Particularly for the Production of Artificial Reverberation. U.S. Patent 3,517,344, 23 June 1970. [Google Scholar]
  87. Kuhl, W.K. Über die akustischen und technischen Eigenschaften der Nachhallplatte. Rundfunktechnische Mitteilungen 1958, 2, 111–116. [Google Scholar]
  88. Kuhl, W.K. Reverberation Device. U.S. Patent 3,719,905, 6 March 1973. [Google Scholar]
  89. Manning, P. The significance of techné in understanding the art and practice of electroacoustic composition. Organised Sound 2006, 11, 81–90. [Google Scholar] [CrossRef]
  90. Exhibits. The Music Trade Review Magazine Article. Music Trade Review Magazine, July 1941; 15–22. [Google Scholar]
  91. Lubow, R. Electrostatic Storage System. U.S. Patent 3,215,911, 2 November 1965. [Google Scholar]
  92. Lubow, R. Vibrato System with Variable Speed Signal Storage Disc. U.S. Patent 3,518,354, 30 June 1970. [Google Scholar]
  93. Taylor, P. History of the Binson Amplifier HiFi Company. Available online: https://www.effectrode.com/knowledge-base/history-of-the-binson-amplifier-hifi-company/ (accessed on 19 February 2019).
  94. Dudley, H. System for the Artificial Production of Vocal or other Sounds. U.S. Patent 2,121,142, 21 June 1938. [Google Scholar]
  95. Dudley, H. Remaking Speech. J. Acoust. Soc. Am. 1939, 11, 169–177. [Google Scholar] [CrossRef]
  96. Rhea, T.L. Bode’s Melodium and Melochord. Contemp. Keyboard Electron. Perspect. 1980, 6, 68. [Google Scholar]
  97. Meyer-Eppler, W. Elektronische Klangerzeugung: Elektronische Musik und Synthetische Sprache; Ferdinand Dümmlers: Bonn, Germany, 1949. [Google Scholar]
  98. Bode, H. Solid State Audio Frequency Spectrum Shifter. In Proceedings of the AES 17th Annual Meeting, New York, NY, USA, 11–15 October 1965. [Google Scholar]
  99. Schroeder, M.R. Natural Sounding Artificial Reverberation. J. Audio Eng. Soc. 1962, 10, 219–223. [Google Scholar]
  100. Moorer, J.A. About This Reverberation Business. Comput. Music J. 1979, 3, 13–28. [Google Scholar] [CrossRef]
  101. Jeffs, R.; Holden, S.; Bohn, D. Dynamics Processors—Technology & Application Tips; RaneNote 155; Rane Corporation: Mukilteo, WA, USA, 2005. [Google Scholar]
  102. Hodgson, J. A field guide to equalisation and dynamics processing on rock and electronica records. Pop. Music 2010, 29, 283–297. [Google Scholar] [CrossRef]
  103. Lundheim, L. On Shannon and “Shannon’s Formula”. Telektronikk 2002, 98, 20–29. [Google Scholar]
  104. Williamson, D.T.N. Design of tone controls and auxiliary gramophone circuits. Wirel. World 1949, 55, 20–29. [Google Scholar]
  105. Massenburg, G. Parametric Equalization. In Proceedings of the 42nd Audio Engineering Society Convention, Los Angeles, CA, USA, 2–5 May 1972. [Google Scholar]
  106. Reiss, J.D.; Brandtsegg, Ø. Applications of Cross-Adaptive Audio Effects: Automatic Mixing, Live Performance and Everything in Between. Front. Digit. Humanit. 2018, 5, 17. [Google Scholar] [CrossRef]
  107. Reinhart, M.S. Chet Atkins: The Greatest Songs of Mister Guitar; McFarland & Company, Inc.: Jefferson, NC, USA, 2014. [Google Scholar]
  108. Douglas, S.C. Introduction to Adaptive Filters. In Digital Signal Processing Handbook; Madisetti, V.K., Williams, D.B., Eds.; CRC Press LLC: Boca Raton, FL, USA, 1999. [Google Scholar]
  109. Widrow, B.; Mantey, P.; Griffiths, L.; Goode, B. Adaptive antenna systems. Proc. IEEE 1967, 55, 2143–2159. [Google Scholar] [CrossRef]
  110. Lucky, R.W. Automatic equalization for digital communication. Bell Syst. Tech. J. 1965, 44, 547–588. [Google Scholar] [CrossRef]
  111. Widrow, B.; Glover, J.R.; McCool, J.M.; Kaunitz, J.; Williams, C.S.; Hearn, R.H.; Zeidler, J.R.; Dong, J.E.; Goodlin, R.C. Adaptive noise cancelling: Principles and applications. Proc. IEEE 1975, 63, 1692–1716. [Google Scholar] [CrossRef]
  112. Carusone, A.; Johns, D. Analogue adaptive filters: Past and present. IEE Proc. Circuits Dev. Syst. 2000, 147, 82–90. [Google Scholar] [CrossRef]
  113. Snoddy, G.T.; Hobbs, R.V. Tone Modifier for Electrically Amplified Electro-Mechanically Produced Musical Tones. U.S. Patent 3,213,181, 19 October 1965. [Google Scholar]
  114. Hicks, M. Sixties Rock: Garage, Psychedelic, and Other Satisfactions; University of Illinois Press: Champaign, IL, USA, 2000. [Google Scholar]
  115. Diekman, D. Twentieth Century Drifter: The Life of Marty Robbins; University of Illinois Press: Champaign, IL, USA, 2012. [Google Scholar]
  116. Waksman, S. The Turn to Noise: Rock Guitar from the 1950s to the 1970s. In The Cambridge Companion to the Guitar; Coelho, V.A., Ed.; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
  117. Herbst, J.P. “My setup is pushing about 500 watts—It’s all distortion”: Emergence, development, aesthetics and intentions of the rock guitar sound. Vox Popular 2020. forthcoming. [Google Scholar]
  118. Eno, B. The Studio as Compositional Tool. In Audio Culture–Readings in Modern Music; Warner, D., Ed.; Bloomsbury Publishing: New York, NY, USA, 2017. [Google Scholar]
  119. Risset, J.C. Examples of the Musical Use of Digital Audio Effects. J. New Music Res. 2002, 31, 93–97. [Google Scholar] [CrossRef]
  120. Warner, T. Pop Music: Technology and Creativity: Trevor Horn and the Digital Revolution; Ashgate Publishing Company: Farnham, UK, 2003. [Google Scholar]
  121. Moorefield, V. The Producer as Composer—Shaping the Sounds of Popular Music; MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
  122. Ramshaw, P. Is music production now a composition process? In Proceedings of the 2005 Art of Record Production Conference, London, UK, 17–18 September 2005. [Google Scholar]
  123. McIntyre, P. The Systems Model of Creativity: Analyzing the Distribution of Power in the Studio. In Proceedings of the 2008 Art of Record Production Conference, University of Massachusetts, Lowell, MA, USA, 14–16 November 2008. [Google Scholar]
  124. Marrington, M. Experiencing musical composition in the DAW: The software interface as mediator of the musical idea. In Proceedings of the 2010 Art of Record Production Conference, Leeds, UK, 2–4 December 2011. [Google Scholar]
  125. Bell, A.P. Dawn of the DAW: The Studio as Musical Instrument; Oxford University Press: New York, NY, USA, 2018. [Google Scholar]
  126. Duignan, M. Computer Mediated Music Production: A Study of Abstraction and Activity. Ph.D. Thesis, Victoria University of Wellington, Wellington, New Zealand, 2008. [Google Scholar]
  127. Flanagan, J.L.; Golden, R. Phase vocoder. Bell Syst. Tech. J. 1966, 45, 1493–1509. [Google Scholar] [CrossRef]
  128. Adlersberg, S.; Stettiner, Y.; Aizner, M.; Berstein, A. Noise Reduction System. U.S. Patent 5,012,519, 30 April 1991. [Google Scholar]
  129. Hildebrand, H.A. Pitch Detection and Intonation Correction Apparatus and Method. U.S. Patent 5,973,252, 26 October 1999. [Google Scholar]
  130. Crockett, Z. The Mathematical Genius of Auto-Tune. Priceonomics 2016. Available online: https://priceonomics.com/the-inventor-of-auto-tune/ (accessed on 16 January 2020).
  131. Tyrangiel, J. Singer’s Little Helper. Time Magazine, 16 February 2009; Volume 173. [Google Scholar]
  132. Roads, C. Microsound; MIT Press: Cambridge, MA, USA, 2004. [Google Scholar]
  133. Heyser, R.C. Acoustical measurements by time delay spectrometry. J. Audio Eng. Soc. 1967, 15, 370–382. [Google Scholar]
  134. Bäder, K.O.; Blesser, B. Digitaltechnik im Studio: Ein Elektronisches Nachhallgerät. Fernseh Kinotech. 1977, 31, 443–445. [Google Scholar]
  135. Pakarinen, J.; Yeh, D.T. A Review of Digital Techniques for Modeling Vacuum-Tube Guitar Amplifiers. Comput. Music J. 2009, 33, 85–100. [Google Scholar] [CrossRef]
  136. Esqueda, F.; Pöntynen, H.; Parker, J.; Bilbao, S. Virtual Analog Models of the Lockhart and Serge Wavefolders. Appl. Sci. 2017, 7, 1328. [Google Scholar] [CrossRef]
  137. Verfaille, V.; Arfib, D.; Keiler, F.; von dem Knesebeck, A.; Zölzer, U. Adaptive Digital Audio Effects. In DAFX—Digital Audio Effects, 2nd ed.; Zölzer, U., Ed.; John Wiley & Sons, Inc.: Chichester, UK, 2011. [Google Scholar]
  138. Giannoulis, D.; Massberg, M.; Reiss, J.D. Digital dynamic range compressor design—A tutorial and analysis. J. Audio Eng. Soc. 2012, 60, 399–408. [Google Scholar]
  139. Stikvoort, E.F. Digital dynamic range compressor for audio. J. Audio Eng. Soc. 1986, 34, 3–9. [Google Scholar]
  140. Oliveira, A.J. A feedforward side-chain limiter/compressor/de-esser with improved flexibility. J. Audio Eng. Soc. 1989, 37, 226–240. [Google Scholar]
  141. Hämäläinen, P. Smoothing of the control signal without clipped output in digital peak limiters. In Proceedings of the International Conference on Digital Audio Effects (DAFx), Hamburg, Germany, 26–28 September 2002; pp. 195–198. [Google Scholar]
  142. McNally, G.W. Dynamic range control of digital audio signals. J. Audio Eng. Soc. 1984, 32, 316–327. [Google Scholar]
  143. Hughes, B.J. Magnetic Recording Dynamic Range Compressor/Expander System. U.S. Patent 2,886,650, 12 May 1959. [Google Scholar]
  144. Herrera, P.; Serra, X. Audio Descriptors and Descriptor Schemes in the Context of MPEG-7. In Proceedings of the International Computer Music Conference (ICMC 1999), Beijing, China, 22–27 October 1999. [Google Scholar]
  145. Peeters, G.; Giordano, B.L.; Susini, P.; Misdariis, N.; McAdams, S. The Timbre Toolbox: Extracting Audio Descriptors from Musical Signals. J. Acoust. Soc. Am. 2011, 130, 2902–2916. [Google Scholar] [CrossRef]
  146. Verfaille, V.; Guastavino, C.; Traube, C. An interdisciplinary approach to audio effect classification. In Proceedings of the 9th International Conference on Digital Audio Effects (DAFx-06), Montréal, QC, Canada, 18–20 September 2006. [Google Scholar]
  147. Moffat, D.; Ronan, D.; Reiss, J.D. An Evaluation of Audio Feature Extraction Toolboxes. In Proceedings of the 18th International Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, 30 November–3 December 2015. [Google Scholar]
  148. Celma, O.; Gómez, E.; Janer, J.; Gouyon, F.; Herrera, P.; Garcia, D. Tools for content-based retrieval and transformation of audio using MPEG-7: The SPOffline and the MDTools. In Proceedings of the AES 25th International Conference, London, UK, 17–19 June 2004. [Google Scholar]
  149. Vinet, H.; Herrera, P.; Pachet, F. The CUIDADO Project: New Applications based on Audio and Music Content Description. In Proceedings of the International Computer Music Conference (ICMC), Göteborg, Sweden, 16–21 September 2002. [Google Scholar]
  150. Wilmering, T.; Fazekas, G.; Sandler, M.B. High Level Semantic Metadata for the Control of Multitrack Adaptive Audio Effects. In Proceedings of the 133rd Convention of the AES, San Francisco, CA, USA, 26–29 October 2012. [Google Scholar]
  151. Peeters, G.; McAdams, S.; Herrera, P. Instrument Sound Description in the Context of MPEG-7. In Proceedings of the International Computer Music Conference (ICMC 2000), Berlin, Germany, 27 August–1 September 2000; pp. 166–169. [Google Scholar]
  152. Stables, R.; Reiss, J.D.; De Man, B. Intelligent Music Production; Routledge: New York, NY, USA, 2019. [Google Scholar]
  153. Moffat, D.; Sandler, M.B. Machine Learning Multitrack Gain Mixing of Drums. In Proceedings of the Audio Engineering Society Convention 147, New York, NY, USA, 16–19 October 2019. [Google Scholar]
  154. Chourdakis, E.T.; Reiss, J.D. A Machine-Learning Approach to Application of Intelligent Artificial Reverberation. J. Audio Eng. Soc. 2017, 65, 56–65. [Google Scholar] [CrossRef]
  155. Moffat, D.; Sandler, M.B. An Automated Approach to the Application of Reverberation. In Proceedings of the Audio Engineering Society Convention 147, New York, NY, USA, 16–19 October 2019. [Google Scholar]
  156. Ma, Z.; De Man, B.; Pestana, P.D.; Black, D.A.; Reiss, J.D. Intelligent multitrack dynamic range compression. J. Audio Eng. Soc. 2015, 63, 412–426. [Google Scholar] [CrossRef]
  157. Moffat, D.; Sandler, M.B. Automatic Mixing Level Balancing Enhanced Through Source Interference Identification. In Proceedings of the 146th Audio Engineering Society Convention, Dublin, Ireland, 20–23 March 2019. [Google Scholar]
  158. Perez Gonzalez, E.; Reiss, J. Automatic gain and fader control for live mixing. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA’09), New Paltz, NY, USA, 18–21 October 2009; pp. 1–4. [Google Scholar]
  159. Moffat, D.; Thalmann, F.; Sandler, M.B. Towards a Semantic Web Representation and Application of Audio Mixing Rules. In Proceedings of the 4th Workshop on Intelligent Music Production (WIMP), Huddersfield, UK, 14 September 2018. [Google Scholar]
  160. Wilmering, T.; Fazekas, G.; Sandler, M.B. AUFX-O: Novel Methods for the Representation of Audio Processing Workflows. In Proceedings of the 15th International Semantic Web Conference (ISWC2016), Kobe, Japan, 17–21 October 2016. [Google Scholar]
  161. Allik, A.; Fazekas, G.; Sandler, M. An ontology for audio features. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), New York, NY, USA, 7–11 August 2016; pp. 258–275. [Google Scholar]
  162. Moffat, D.; Sandler, M.B. Adaptive Ballistics Control of Dynamic Range Compression for Percussive Tracks. In Proceedings of the 145th Audio Engineering Society Convention, New York, NY, USA, 17–20 October 2018. [Google Scholar]
  163. Moffat, D.; Sandler, M.B. Approaches in Intelligent Music Production. Arts 2019, 8, 125. [Google Scholar] [CrossRef]
  164. Pardo, B.; Rafii, Z.; Duan, Z. Audio Source Separation in a Musical Context. In Springer Handbook of Systematic Musicology; Springer: Berlin/Heidelberg, Germany, 2018; pp. 285–298. [Google Scholar]
  165. Martínez Ramírez, M.A.; Reiss, J.D. Deep Learning and Intelligent Audio Mixing. In Proceedings of the 3rd Workshop on Intelligent Music Production (WIMP), Salford, UK, 15 September 2017. [Google Scholar]
  166. Mimilakis, S.I.; Cano, E.; Abeßer, J.; Schuller, G. New sonorities for jazz recordings: Separation and mixing using deep neural networks. In Proceedings of the 2nd AES Workshop on Intelligent Music Production (WIMP), London, UK, 13 September 2016. [Google Scholar]
  167. Martínez Ramírez, M.A.; Reiss, J.D. Analysis and prediction of the audio feature space when mixing raw recordings into individual stems. In Proceedings of the 143rd Audio Engineering Society Convention, New York, NY, USA, 18–21 October 2017. [Google Scholar]
  168. Martínez Ramírez, M.A.; Reiss, J.D. Stem audio mixing as a content-based transformation of audio features. In Proceedings of the 19th IEEE Workshop on Multimedia Signal Processing (MMSP), Luton, UK, 16–18 October 2017. [Google Scholar]
  169. Verma, P.; Smith, J.O. Neural style transfer for audio spectograms. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Workshop for Machine Learning for Creativity and Design, Long Beach, CA, USA, 8 December 2017. [Google Scholar]
  170. Martínez Ramírez, M.A.; Reiss, J.D. End-to-End Equalization with Convolutional Neural Networks. In Proceedings of the 21st International Conference on Digital Audio Effects (DAFx-18), Aveiro, Portugal, 4–8 September 2018. [Google Scholar]
  171. Martínez Ramírez, M.A.; Reiss, J.D. Modeling of Nonlinear Audio Effects with End-to-End Deep Neural Networks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019. [Google Scholar]
  172. Parker, J.D.; Esqueda, F.; Bergner, A. Modelling of Nonlinear State-Space Systems Using a Deep Neural Network. In Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19), Birmingham, UK, 2–6 September 2019. [Google Scholar]
  173. Bromham, G. How can academic practice inform mix-craft? In Mixing Music; Perspectives on Music Production; Hepworth-Sawyer, R., Hodgson, J., Eds.; Routledge: New York, NY, USA, 2016; Chapter 16; pp. 245–256. [Google Scholar]
  174. Sabin, A.T.; Rafii, Z.; Pardo, B. Weighted-function-based rapid mapping of descriptors to audio processing parameters. J. Audio Eng. Soc. 2011, 59, 419–430. [Google Scholar]
  175. Nercessian, S.; Lukin, A. Speech Dereverberation using Recurrent Neural Networks. In Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19), Birmingham, UK, 2–6 September 2019. [Google Scholar]
  176. King, A. Technology as a vehicle (tool and practice) for developing diverse creativities. In Activating Diverse Musical Creativities: Teaching and Learning in Higher Music Education; Burnard, P., Haddon, E., Eds.; Bloomsbury Publishing: London, UK, 2015; Chapter 11; pp. 203–222. [Google Scholar]
Figure 1. An audio effect and its control.
Figure 2. Room impulse response consisting of direct sound, early reflections, and late reverberation. The timing of these components depends on the room size and other properties, such as geometry and surface materials.
Figure 3. The Chichén Itzá pyramid in Tinum, Mexico. Photo license: Creative Commons CC0.
Figure 4. Traditional tape echo: the signal picked up by the playback head feeds back to the record head.
Figure 5. With a black control box (the Les Paulverizer) mounted on a guitar, Les Paul was able to control tape machines during live performances. Photo © 2010 by Mark Zaputil (Zap Ltd. Music), used with permission.
Figure 6. Parameters for gate, expander, compressor, and limiter.
Figure 7. Compressor/expander with hinge parameter.
Figure 8. Diagram of an adaptive digital audio effect: The control parameters are derived from audio features extracted from an input signal. The features may be extracted from the signal that is to be transformed (input 1), from a different input signal (input 2), or from the effect output.
Figure 9. Signal flow of an adaptive audio effect using control parameters obtained from metadata.
Table 1. Approximate delay time ranges for delay-based effects according to Dutilleux [80]. The modulation sources may vary depending on the implementation and musical application.
Delay Range (ms) | Modulation | Effect Name
--- | --- | ---
0–20 | - | Resonator
0–15 | sinusoidal | Flanger
10–25 | random/sinusoidal | Chorus
25–50 | - | Slapback
>50 | - | Echo
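
The delay ranges in Table 1 become buffer lengths once a sample rate is fixed. The following is a minimal sketch under stated assumptions, not code from the article: the function names `delay_ms_to_samples` and `feedback_delay`, the 44.1 kHz default sample rate, and the `feedback` and `mix` parameters are all illustrative. It implements a single unmodulated feedback delay line, which loosely mirrors the playback-to-record feedback path of the tape echo in Figure 4. With a delay above 50 ms it corresponds to the Echo row. The modulated rows (Flanger, Chorus) would additionally need a time-varying delay, which is not shown here.

```python
def delay_ms_to_samples(delay_ms, sample_rate=44100):
    """Convert a delay time in milliseconds to a whole number of samples.

    The 44.1 kHz default is an assumption made for illustration only.
    """
    return int(round(delay_ms * sample_rate / 1000.0))


def feedback_delay(signal, delay_samples, feedback=0.5, mix=0.5):
    """Apply a fixed (unmodulated) feedback delay to a list of samples.

    delay_samples: delay length in samples (must be >= 1)
    feedback:      gain applied to the delayed signal fed back into the line
    mix:           0.0 = dry signal only, 1.0 = delayed signal only
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback must be in [0, 1) to keep the loop stable")

    buffer = [0.0] * delay_samples  # circular buffer holding the delay line
    pos = 0
    output = []
    for x in signal:
        delayed = buffer[pos]  # sample written delay_samples steps ago
        output.append((1.0 - mix) * x + mix * delayed)
        # Write the input plus the attenuated delayed sample back into the
        # line, analogous to the playback head feeding the record head.
        buffer[pos] = x + feedback * delayed
        pos = (pos + 1) % delay_samples
    return output
```

For example, `feedback_delay(samples, delay_ms_to_samples(120.0))` would produce decaying repeats spaced 120 ms apart at the assumed sample rate, while a value in the 25–50 ms range with `feedback=0.0` gives a single slapback-style repeat.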