Audio Enhancement of Physical Models of Musical Instruments Using Optimal Correction Factors: The Recorder Case

Bakogiannis, Konstantinos; Polychronopoulos, Spyros; Marini, Dimitra; Kouroupetroglou, Georgios

doi:10.3390/app11146426

by Konstantinos Bakogiannis, Spyros Polychronopoulos, Dimitra Marini and Georgios Kouroupetroglou *

Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, GR-15784 Athens, Greece

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(14), 6426; https://doi.org/10.3390/app11146426

Submission received: 29 May 2021 / Revised: 30 June 2021 / Accepted: 8 July 2021 / Published: 12 July 2021

(This article belongs to the Special Issue Musical Instruments: Acoustics and Vibration II)

Abstract:

A simulation of a musical instrument is considered successful when the model’s synthesized sound closely resembles the sound of the real instrument. In this work, we propose integrating physical modeling (PM) methods with an optimization process that regulates the generated digital signal. The goal is to find a new set of values for the PM’s parameters that leads to a synthesized signal matching, as closely as possible, reference signals corresponding to the physical musical instrument. The reference signals can be (a) described by their acoustic characteristics (e.g., fundamental frequencies, inharmonicity, etc.) and/or (b) the signals themselves (e.g., impedances, recordings, etc.). We put this method into practice for a commercial recorder, simulated using the digital waveguides PM technique. The reference signals, in our case, are recordings of the physical instrument. The degree of similarity between the synthesized (PM) and the recorded (musical instrument) signals is calculated by the signals’ linear cross-correlation. Our results show that adopting the optimization process produced more realistic synthesized signals by (a) enhancing the degree of similarity between the synthesized and the recorded signal (the average absolute Pearson Correlation Coefficient increased from 0.13 to 0.67), (b) resolving mistuning issues (the average absolute deviation of the synthesized from the recorded signals’ pitches was reduced from 40 cents to the non-noticeable level of 2 cents) and (c) yielding similar sound color characteristics and matched overtones (the average absolute deviation of the synthesized from the recorded signals’ first five partials was reduced from 41 cents to 2 cents).

Keywords:

correlation; musical instruments; optimization; physical modeling; recorder; tuning

1. Introduction

The acoustic simulation of musical instruments using computer models attracts scientists from multiple disciplines (e.g., physics, informatics, musicology) [1,2,3,4,5,6,7,8]. In recent decades, several digital sound synthesis techniques have been developed (e.g., sampling, spectral modeling, and physical modeling) [9]. Physical Modeling (PM) is the technique that generates an instrument’s sound by simulating the physical phenomena that produce it. The audible result of this technique depends purely on the level of detail of the model. Describing all the phenomena in detail is not trivial (e.g., non-linearity of the vibrating reed, complex geometries, etc.). Thus, in the process of simulation, assumptions are made in order to simplify the model, which inevitably affects the final result.

Applying correction factors is a technique that can enhance the accuracy of a signal produced by a digital generator. Its goal is to find the values of the model’s parameters that lead to a generated signal matching a reference signal as closely as possible. In this work, we propose a minimum error method to choose the optimal parameters given the predetermined criteria. Integrating PM methods for musical instruments with our framework can tune the model within the inherent limitations of the PM.

In the next section, we present the integration of PM of musical instruments with the optimization framework (Section 2). Next, we put the proposed method into practice for the case of a commercial Hohner recorder (Section 3). The physical model of the recorder, based on the Digital Waveguides technique, is presented in Section 3.1, followed by a detailed description of how the optimization framework is adopted to enhance the model’s audio (Section 3.2). Finally, we compare the synthesized signals generated by the PM (without the optimization framework) and by the PM-OPT (with the optimization framework) against the corresponding recordings of the real musical instrument, and present our results in terms of the degree of similarity, the tuning accuracy, and the sound color characteristics (Section 3.3). We close with a discussion of the current work (Section 4).

2. Method

In this work, we present the integration of the optimization framework with the PM of musical instruments (Figure 1). In brief, this method enables the modification of the PM parameters through correction factors. It is an iterative process that solves an optimization problem. In every iteration, the optimizer tries a new set of values for the correction factors ($CF_{new}$, Figure 1), resulting in new synthesized signals. Next, the signals are evaluated according to the predetermined criteria. The goal is to determine the optimal set of correction factors, obtained as the solution of the optimization problem. An optimal set of correction factors is the one that, when applied to the corresponding parameters of the PM, results in the synthesized signal with the highest evaluation score (Output, Figure 1). We note that in this work we used known PM techniques for the simulation of the musical instrument; thus, the treatment of any modeling challenges (e.g., nonlinearities) concerning the PM was taken from the relevant literature (see Section 3.1). Although nonlinearities concern the PM and not the adopted optimization framework, they affected our approach by imposing the need to use a different set of correction factors for every note produced rather than a single set for all notes.

The first step is to define the details of the synthesis part (Input 1, Figure 1). It includes the determination of (a) the modifiable parameters, (b) the parameters’ limits, and (c) the specific modification type of every parameter. In this step, the designer should ensure that unwanted alterations of the core elements of the PM algorithm are avoided. Thus, setting the modifiable parameters (determination point a) and their limits (determination point b) is essential to avoid results with no physical meaning, such as placing a tonehole outside the body of a wind instrument. In principle, all the parameters of the algorithm can be considered tunable. However, if parameter A depends on parameter B (i.e., A = f(B)), and B only affects A (i.e., there is no parameter C for which C = g(B)), then modifying both A and B is unnecessary. Based on the above, we chose the tunable and locked parameters. The type of modification (determination point c) in the proposed method is the mathematical expression which, with the use of a correction factor ($CF$), tweaks a parameter $P$ (e.g., $P_{modified} = P_{initial} + CF$, $P_{modified} = P_{initial} \cdot CF$, $P_{modified} = {P_{initial}}^{CF}$, etc.).
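The three modification types listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `apply_correction` and `within_limits`, the tonehole position, and the bore length are hypothetical and do not come from the paper.

```python
def apply_correction(p_initial, cf, kind):
    """Return the modified parameter for one of the three example modification types."""
    if kind == "add":
        return p_initial + cf       # P_modified = P_initial + CF
    if kind == "multiply":
        return p_initial * cf       # P_modified = P_initial * CF
    if kind == "power":
        return p_initial ** cf      # P_modified = P_initial ^ CF
    raise ValueError(f"unknown modification type: {kind}")


def within_limits(value, lower, upper):
    """True if a modified parameter still lies inside its designer-defined limits."""
    return lower <= value <= upper


# Assumed example: a tonehole 0.20 m from the mouthpiece on a 0.30 m bore.
tonehole_position = 0.20
bore_length = 0.30
for kind, cf in [("add", 0.05), ("multiply", 1.2), ("power", 0.5)]:
    moved = apply_correction(tonehole_position, cf, kind)
    # A position beyond the bore length would have no physical meaning.
    print(kind, round(moved, 4), within_limits(moved, 0.0, bore_length))
```

In this toy case the power-type modification pushes the tonehole past the end of the bore, which is the kind of result the parameter limits are meant to exclude.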

The second step is the evaluation of the generated signal. The goal of our approach is essentially the following: the synthesized signal, generated by a PM according to the building details (Input 2, Figure 1), should be as similar as possible to a reference. In our case, the reference is derived from a recorded signal generated by a musical instrument (Input 3, Figure 1). Apart from recordings, the proposed model works with other reference signals as well. For example, the goal signals could be generated by digital synthesis techniques (e.g., additive synthesis) or could even consist of raw numbers describing sound parameters (e.g., inharmonicity, deviation, durations, and other acoustic features found, for example, in [10]). In every iteration, the optimizer assigns new values to the set of correction factors ($CF_{new}$, Figure 1), which modifies the physical model accordingly and results in a new synthesized signal (Figure 1). The synthesized signal is then compared with the recording. An objective function calculates the resemblance of these two signals (in our case, the function is cross-correlation, see Section 3.2), which quantifies their degree of similarity ($DS$, Figure 1). The optimizer determines which specific set of correction factors ($CF_{k}$, Figure 1), when applied to the PM, resulted in the maximum degree of similarity ($DS_{k}$, Figure 1) between the synthesized and the recorded signal (Figure 1). This particular set is the optimizer’s output (Figure 1).
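The loop described in this section can be illustrated with a deliberately small stand-in. The sketch below is not the paper's implementation: `toy_synth` replaces the physical model with a single sine whose frequency is scaled by one correction factor, and a plain random search replaces the optimizer. All names and numeric values are assumptions chosen for illustration.

```python
import math
import random


def pearson(s, r):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(s)
    mean_s, mean_r = sum(s) / n, sum(r) / n
    cov = sum((a - mean_s) * (b - mean_r) for a, b in zip(s, r))
    var_s = sum((a - mean_s) ** 2 for a in s)
    var_r = sum((b - mean_r) ** 2 for b in r)
    return cov / math.sqrt(var_s * var_r)


def toy_synth(cf, n=1000, fs=8000.0, f0=400.0):
    """Stand-in for the PM: a sine at the initial frequency multiplied by CF."""
    return [math.sin(2.0 * math.pi * f0 * cf * k / fs) for k in range(n)]


def optimise(reference, lower=0.5, upper=2.0, iterations=300, seed=0):
    """Random search (a simple stand-in optimizer) for the CF with the highest DS."""
    rng = random.Random(seed)
    cf_k = 1.0                                          # unmodified model
    ds_k = abs(pearson(toy_synth(cf_k), reference))
    for _ in range(iterations):
        cf_new = rng.uniform(lower, upper)              # optimizer proposes CF_new
        ds = abs(pearson(toy_synth(cf_new), reference))  # degree of similarity DS
        if ds > ds_k:                                   # keep the best set so far
            cf_k, ds_k = cf_new, ds
    return cf_k, ds_k


reference = toy_synth(1.1)                              # pretend recording
ds_initial = abs(pearson(toy_synth(1.0), reference))
cf_best, ds_best = optimise(reference)
print(ds_initial, cf_best, ds_best)
```

The returned pair plays the role of $CF_{k}$ and $DS_{k}$ in Figure 1; a real setup would replace `toy_synth` with the physical model and the random search with a proper optimizer.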

3. Case Study: Recorder

3.1. The Physical Model

In order to demonstrate the integration of the optimization framework with PM of musical instruments, we present the case of a recorder. The recorder is a wind instrument with a flute-like (air-jet) excitation mechanism and a cylindrical resonator. The player produces various pitches by changing the fingering (i.e., arrangements of closed or open toneholes), or by overblowing. The instrument in our case study is a typical commercial recorder: Hohner’s melody recorder with baroque fingering (type 1-095.143-1011) and eight toneholes (seven regular toneholes in the upper part of the acoustic pipe and one fingerhole in the bottom part).

The method used for the physical modeling of the instrument is Digital Waveguides (DWGs), a technique introduced by J. Smith [11] that simulates traveling waves with digital delay lines [12]. In this work, we have chosen to demonstrate our framework on a computationally cheap PM, which keeps the runtime low for the large number of iterations required during the optimization process (here 10 k, see Section 3.3). However, every PM technique (e.g., the finite element method, FEM) is built upon parameters that can potentially be tuned with the use of correction factors; hence, it can be integrated with the optimization framework. The only prerequisite is enough computation power for the optimizer to perform the required iterations. Figure 2 shows the block diagram of our recorder’s PM, based on established approaches [13,14,15].

The recorder’s excitation mechanism is based on an air jet that strikes a sharp edge (the labium) [16]. In Figure 2, we simulate the air jet traveling from the player’s lips to the labium by a delay line (jet-delay). The mouth pressure (forming the air jet) is simulated as a constant pressure enriched with vibrato and noise content. In the real world, this constant mouth pressure is not reached and released instantly; thus, in our model, a dynamic envelope is applied to set the duration of the attack, the sustain, and the release. The air jet is modeled by a static non-linear element using a sigmoid function [17]. Here we use the sigmoid function $y = x - x^{3}$, as proposed in [17]. When the air blown by the player enters the instrument’s bore, the air particles inside the resonator’s cavity start to vibrate. The bore is simulated as a one-dimensional DWG using delay lines (one delay line for the right-going and one for the left-going part of the wave, noted as $z^{-M}$ in Figure 2) [12]. The length of the digital delay lines (in samples), which depends on the speed of the acoustic waves, corresponds to the bore’s physical length (in meters). A more accurate model of a wind instrument should also take the end corrections [18,19] into consideration in order to tune the generated pitch. However, our initial model neglects end corrections (and therefore yields a lower correlation than a more accurate model would, due to the frequency mismatch; see Section 3.3 and Figure 3), since the proposed optimization framework itself determines the optimal correction. The pressure waves travel from the mouthpiece along the tube towards the other end (taken here as right-going). On reaching the end, the so-called bell, a portion of the wave is reflected towards the mouthpiece (left-going) and the other portion is transmitted outside the instrument. The superposition of the right- and left-going pressure waves forms a standing wave inside the resonator’s cavity. In particular, the effect of the bell is to radiate the high frequencies out of the instrument and to reflect the low frequencies back. This reflection is simulated as a lowpass filter, RL(z), which is, in our case, a first-order averaging filter. Further, to simplify the simulation, we assume that the first open tonehole defines the effective length of the bore [20] and consequently the length of the digital delay line.
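Three building blocks named in this description can be written down directly: the delay-line length, the $y = x - x^{3}$ jet nonlinearity, and a first-order averaging lowpass. The sketch below is a minimal illustration, not the authors' model. The speed of sound (343 m/s), the input clipping in `jet_nonlinearity`, the zero initial filter state, and all function names are assumptions.

```python
def delay_length_samples(bore_length_m, sample_rate_hz, speed_of_sound=343.0):
    """One-way delay in samples for a wave travelling the (effective) bore length."""
    return bore_length_m * sample_rate_hz / speed_of_sound


def jet_nonlinearity(x):
    """Static nonlinearity y = x - x^3; the clipping to [-1, 1] is an added assumption."""
    x = max(-1.0, min(1.0, x))
    return x - x ** 3


def averaging_filter(samples):
    """First-order averaging lowpass: y[n] = 0.5 * (x[n] + x[n-1]), with x[-1] = 0."""
    out = []
    previous = 0.0
    for x in samples:
        out.append(0.5 * (x + previous))
        previous = x
    return out


# An assumed 0.343 m effective bore at 44.1 kHz needs a fractional delay,
# which is why a tuned model also needs some form of interpolation.
print(delay_length_samples(0.343, 44100.0))
print(jet_nonlinearity(0.5))
print(averaging_filter([1.0, -1.0, 1.0, -1.0]))
```

The last line shows the lowpass behaviour: after the first sample, a signal alternating at the Nyquist rate is cancelled entirely, while a constant input would pass unchanged.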

3.2. Analysis by Synthesis Model

We identified eleven internal parameters, which belong to the components of the block diagram in Figure 2 and affect the synthesized signal in both the time and frequency domains. More specifically, three parameters affect the dynamic envelope of the mouth pressure (the duration of the attack, the sustain, and the release), three parameters affect the properties of the input (the frequency, the content, and the noise of the vibrato), one parameter affects the length of the delay lines, one parameter affects the interpolation used to achieve accurate tuning, and three parameters affect the filters’ coefficients (the transmission of the tube into the mouth, and the reflections at both of its open ends). The number of correction factors is, thus, set to eleven. The chosen modification type is multiplication ($P_{modified} = P_{initial} \cdot CF$), which, after several trials, proved to give the best results and made it possible to set initial generic logical boundaries (i.e., the range of values the optimizer is permitted to assign to the parameters in every iteration). The initial value of all the correction factors is set to one, which corresponds to the synthesized signal generated by the unmodified PM before the integration of the optimization framework. Determining the logical boundaries of eleven parameters is not a straightforward task. Choosing a single modification type (in our case, multiplication) simplifies this task, because generic logical boundaries can be set as a starting point before each boundary is specified individually. These generic logical boundaries were set to half and double the initial values for the lower and upper boundary, respectively. The boundaries of every individual correction factor were then set after several trials, to ensure that all the extreme values are within logical limits and that unwanted alterations of the PM algorithm’s core elements are avoided.
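The starting configuration described above (eleven multiplicative correction factors, all initialized to one, with generic boundaries at half and double) might be set up as follows. The index `DELAY_LENGTH` and the tightened range (0.9, 1.1) are made-up placeholders for the individual refinement step, since the per-parameter boundaries are not given in the text.

```python
N_FACTORS = 11

# All correction factors start at one, i.e. the unmodified physical model.
initial_cf = [1.0] * N_FACTORS

# Generic starting boundaries: half and double the initial value.
bounds = [(0.5, 2.0) for _ in range(N_FACTORS)]

# Individual refinement after trial runs (hypothetical index and range).
DELAY_LENGTH = 6
bounds[DELAY_LENGTH] = (0.9, 1.1)


def clip_to_bounds(cf, bounds):
    """Project a candidate set of correction factors back into its boundaries."""
    return [max(lo, min(hi, value)) for value, (lo, hi) in zip(cf, bounds)]


def modified_parameters(initial_params, cf):
    """Multiplicative modification: P_modified = P_initial * CF."""
    return [p * c for p, c in zip(initial_params, cf)]


candidate = [1.0] * N_FACTORS
candidate[DELAY_LENGTH] = 1.4            # outside the refined range
candidate = clip_to_bounds(candidate, bounds)
print(candidate[DELAY_LENGTH])           # 1.1
```

With a single multiplicative type, a boundary pair has the same meaning for every parameter regardless of its unit, which is what makes a generic starting range possible.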

The core part of integrating the optimization framework with the PM is the comparison between the synthesized sound and the real sound of the corresponding musical instrument. To make this comparison possible, we recorded samples of the commercial Hohner recorder mentioned above. The reference signals (nine signals for nine fingerings) are the recordings of the Hohner recorder. The recordings took place at the audio recording studio of the National and Kapodistrian University of Athens, Department of Informatics and Telecommunications, using an electroacoustic chain with a flat frequency response (microphone: SD Systems LCM 85 MK II with “LP” Preamp Power Supply, soundcard: Apogee Duet, computer: MacBook Air 2019). The distance between the recorder and the microphone was approximately 1 m, and the microphone was placed off-axis from the instrument’s bell. To provide a point of comparison for our method’s performance, we recorded each note of the recorder 15 times and calculated all the possible Pearson correlation coefficients between the 15 recordings of the same note. As the reference signal for evaluating our model, we chose for each note the recording that had the highest average Pearson correlation coefficient with the other 14 recordings of the same note. The total average Pearson correlation coefficient was found to be 0.7, with a standard deviation of 0.16.
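The selection of the reference recording can be sketched as below. This is one possible reading of the procedure, assuming that the takes of a note are already time-aligned and trimmed to equal length and that the signed (not absolute) coefficient is averaged. The names `pearson` and `pick_reference` and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pick_reference(takes):
    """Return (index, mean correlation) of the take most similar to all the others."""
    best_index, best_mean = 0, -math.inf
    for i, take in enumerate(takes):
        others = [pearson(take, other) for j, other in enumerate(takes) if j != i]
        mean = sum(others) / len(others)
        if mean > best_mean:
            best_index, best_mean = i, mean
    return best_index, best_mean


# Toy stand-ins for repeated recordings of one note (the paper uses 15 takes).
takes = [
    [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
    [0.1, 0.9, 0.0, -1.0, 0.1, 1.0, -0.1, -0.9],
    [0.0, 0.8, 0.3, -0.7, -0.2, 0.9, 0.2, -1.0],
]
index, mean_corr = pick_reference(takes)
print(index, round(mean_corr, 3))
```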

The degree of similarity between the synthesized and recorded signals is defined by the Pearson correlation coefficient (Equation (1)), which measures the linear correlation between two variables [21] and takes values between −1 and +1 (+1 corresponds to total positive linear correlation, 0 to no linear correlation, and −1 to total negative linear correlation). As it is irrelevant here whether the correlation is positive or negative, we take the absolute value of this coefficient to define the objective (Equation (2)). Moreover, considering that computational optimizers deal with minimization problems more efficiently, we set our objective to output the negative of the absolute coefficient (Equation (2)). In that way, the objective is passed to the optimizer (in this work we use Nelder-Mead, see Section 3.3), which searches for a set of variables that minimizes the objective and thus maximizes the absolute correlation coefficient. In this work, the objective function leads to a non-convex optimization problem in which the optimizer is looking for a global minimum. Thus, the number of iterations needs to be large to ensure good results. The objective function takes two inputs: (i) the synthesized digital signal generated by the PM and (ii) the recorded reference signal generated by the recorder. The goal of our framework is to obtain the best match between the signals in terms of physical properties. We made this choice because maximizing the resemblance of the reference and the synthesized signals in terms of physical properties would, consequently, maximize the resemblance in terms of perceptual properties.

$$\rho_{S,R} = \frac{\operatorname{cov}(S, R)}{\sigma_{S}\,\sigma_{R}} \qquad (1)$$

$$Obj = -\left| \rho_{S,R} \right| \qquad (2)$$
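For two sampled signals, Equation (1) can be written out sample by sample. The expansion below is the standard sample form of the Pearson coefficient rather than a formula quoted from the paper. The symbols $N$ (number of samples), $S_n$, $R_n$, $\bar{S}$ and $\bar{R}$ are notation introduced here, and the expression assumes that both signals have been time-aligned and trimmed to the same length.

```latex
\rho_{S,R} =
  \frac{\sum_{n=1}^{N} (S_n - \bar{S})(R_n - \bar{R})}
       {\sqrt{\sum_{n=1}^{N} (S_n - \bar{S})^{2}}\;
        \sqrt{\sum_{n=1}^{N} (R_n - \bar{R})^{2}}},
\qquad
\bar{S} = \frac{1}{N}\sum_{n=1}^{N} S_n,
\quad
\bar{R} = \frac{1}{N}\sum_{n=1}^{N} R_n
```

Since $-1 \le \rho_{S,R} \le 1$, the objective of Equation (2) lies in $[-1, 0]$. A value of $-1$ means the two signals are perfectly linearly related, whatever the sign of the correlation, and $0$ means no linear correlation.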

After determining the set of correction factors, the boundaries, and the objective, the next step is to put the optimizer into practice. In this work, our focus is to find the optimal correction factors that tweak the algorithm’s parameters so that the synthesized signal is as close as possible to the corresponding instrument’s signal. At every iteration, the optimizer tries a new set of values for the modifiable parameters of the recorder’s PM, which generates a signal (synthesized signal) to be compared (correlation coefficient) with the recorded signal (goal signal). The optimizer minimizes the objective function for each of the possible notes (fingerings) and outputs an optimal set of correction factors per note.

3.3. Results and Discussion

In this work, we studied the enhancement of the generated signal of the PM of a Hohner melody recorder with baroque fingering using the optimization framework. We studied the fingering sequence that results from the sequential opening of the toneholes (i.e., the one that starts with all toneholes closed and lifts the fingers one by one, beginning with the tonehole closest to the bell end). The recorder’s eight toneholes result in nine notes, which correspond to the sequence from all toneholes closed (note 1) to all open (note 9). The proposed model’s inputs are (a) the building information, i.e., the geometrical details needed to synthesize nine audio files (9 notes), (b) the nine corresponding recordings of the real instrument, and (c) the initial values of the correction factors along with their upper and lower boundaries. Its outputs are nine sets of correction factors, one individual set per note.

In order to calculate the optimal set of correction factors, we put two optimization techniques into practice. We compared their efficiency and embedded the better-performing one in our model. The mathematical optimizers tested here are Nelder-Mead (NM) [22,23] and Simulated Annealing (SA) [24,25], both of which have been used in acoustics-related studies [26,27]. To benchmark their performance, we ran each algorithm ten times, with 10 k iterations per run. After several trials, this maximum number of iterations per run proved to be adequately high to satisfy the need for accurate tuning (a deviation between the recorded and the synthesized signals’ pitch of less than 10 cents). Both techniques achieved best costs (maximum correlation coefficients) of similar values (±5% maximum deviation); however, NM was found to be more efficient than SA, since it reached the best cost value much faster (NM: 100–400 iterations, SA: 1 k–5 k iterations).

Our framework significantly enhanced the similarity between the synthesized and the recorded signals (Figure 3). In the case of the PM synthesized signals (i.e., prior to the optimization integration, when all the correction factors equal their initial value of one), the average Pearson correlation coefficient was 0.13 (the minimum and maximum are 0.03 and 0.48, respectively). In the case of the PM-OPT synthesized signals (PM integrated with the optimization framework), the correlation coefficient reached an average value of 0.67 (the minimum and maximum are 0.59 and 0.76, respectively). The model produced a significant increase in the degree of similarity (Pearson correlation coefficient > 0.59) for all the notes, even for the ones with a low initial value (Pearson correlation coefficient < 0.1, notes 1, 3, 5, 7–9).

The improvement in the degree of similarity resulted in synthesized signals that are more accurately tuned to the corresponding recorded reference signals (Figure 4). The average absolute deviation of the fundamental frequency of the synthesized from the recorded signals was reduced from 40 cents in the case of the PM signals, which corresponds to an interval of almost half a semitone (half a semitone is 50 cents), to only 2 cents in the case of the PM-OPT signals (which is a non-noticeable difference [28]). In 5 out of 9 notes, the PM-OPT model led to synthesized signals that are perfectly in tune with the corresponding recordings (0 cent deviation, Figure 4, notes 2–5 and 7).
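Deviations in cents follow from the frequency ratio, with 1200 cents per octave and 100 cents per equal-tempered semitone. The short sketch below shows what 40 cents and 2 cents mean in hertz around the 522 Hz fundamental of note 1 (Table 1). The function name `cents` and the derived frequencies are illustrative and are not values reported in the paper.

```python
import math


def cents(f_synth_hz, f_ref_hz):
    """Signed deviation of a synthesized frequency from a reference, in cents."""
    return 1200.0 * math.log2(f_synth_hz / f_ref_hz)


def shift_by_cents(f_ref_hz, deviation_cents):
    """Frequency that lies the given number of cents above the reference."""
    return f_ref_hz * 2.0 ** (deviation_cents / 1200.0)


f_ref = 522.0                                    # note 1 fundamental from Table 1
for deviation in (40.0, 2.0):
    f_shifted = shift_by_cents(f_ref, deviation)
    # Round-trip check: converting back recovers the deviation in cents.
    print(deviation, round(f_shifted, 2), round(cents(f_shifted, f_ref), 6))
```

At this pitch a 40 cent error amounts to roughly 12 Hz, whereas 2 cents is well under 1 Hz.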

Moreover, a significant improvement in sound color resemblance was observed. The partials of the synthesized and the recorded signals initially deviated (e.g., PM vs. Recording case in Figure 5), and after optimization they match (e.g., PM-OPT vs. Recording case in Figure 5). In order to measure the sound color resemblance, we studied how well the frequency content of the recorded and synthesized signals matches by taking into consideration the first five partials (the fundamental and the first four overtones, Table 1, Table 2 and Table 3). The sound color resemblance per note between the recorded and the synthesized signals is determined by the average absolute deviation of their first five partials (the two columns on the right of Table 3). This value is significantly lower in all the PM-OPT deviation from Recording cases than in the corresponding PM deviation from Recording cases. The average value over all nine notes prior to the optimization framework integration (PM deviation from Recording) is 41 cents, whereas after the integration (PM-OPT deviation from Recording) it diminishes to only 2 cents. For eight out of nine notes, the PM-OPT and the recording have almost identical spectral content (average absolute deviation of the partials ≤2 cents). This improvement is a byproduct of the precise tuning of the fundamental frequency. The PM produces partials that resemble a harmonic series, which can be found in the recordings as well. Therefore, tuning the fundamental frequency correspondingly tunes the overtones.
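The per-note measure used here, the average absolute deviation of the first five partials in cents, can be sketched as follows. The recorded partials are those of note 1 in Table 1. The synthesized partials are hypothetical (an exact harmonic series on a fundamental 40 cents sharp), chosen only to show how a mistuned fundamental carries over to the overtones.

```python
import math


def mean_abs_partial_deviation(synth_partials_hz, recorded_partials_hz):
    """Average absolute deviation, in cents, between matching partials."""
    deviations = [
        abs(1200.0 * math.log2(s / r))
        for s, r in zip(synth_partials_hz, recorded_partials_hz)
    ]
    return sum(deviations) / len(deviations)


recorded_note1 = [522.0, 1046.0, 1568.0, 2090.0, 2614.0]   # Table 1, note 1

# Hypothetical untuned model output: harmonic series, fundamental 40 cents sharp.
f0_untuned = 522.0 * 2.0 ** (40.0 / 1200.0)
synth_untuned = [f0_untuned * k for k in range(1, 6)]

# Hypothetical tuned model output: harmonic series on the recorded fundamental.
synth_tuned = [522.0 * k for k in range(1, 6)]

print(round(mean_abs_partial_deviation(synth_untuned, recorded_note1), 1))
print(round(mean_abs_partial_deviation(synth_tuned, recorded_note1), 1))
```

Because the recorded partials are themselves close to a harmonic series, correcting the fundamental alone removes most of the deviation across all five partials in this toy case.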

4. Conclusions

In this work, we proposed a method that maximizes the efficiency of the physical modeling (PM) of musical instruments by applying optimal correction factors, and we presented a case study of a specific commercial recorder. PMs of musical instruments simulate the sound production mechanism of the corresponding physical instruments. However, a detailed analytical description of the phenomena governing the sound generation mechanism, as needed to design an accurate PM of a musical instrument, is not trivial. The proposed use of the optimization framework to enhance the audio signal generated by the PM of a musical instrument helps, in practice, to produce more realistic PM-generated signals. The results for the musical instrument used in our study indicate that the proposed model enhances the degree of similarity between the synthesized and the recorded signal (the average absolute Pearson Correlation Coefficient increased from 0.13 to 0.67), resolves mistuning issues (the absolute deviation of the synthesized from the recorded signals’ pitches was reduced from 40 cents to the non-noticeable level of 2 cents), and results in similar sound color characteristics (matching overtones).

We expect that this work will motivate researchers to create more complex optimization techniques using multiple objectives that account simultaneously for both the physical properties (e.g., inharmonicity, amplitude deviation, spectrum entropy) and the perceptual properties (e.g., pitch, loudness, roughness), as well as further validation schemes based on listening tests. We also expect that the approach proposed in this work will improve the efficiency of both existing and future PMs of musical instruments.

Author Contributions

Conceptualization, G.K.; methodology, K.B. and S.P.; software, K.B., S.P. and D.M.; validation, K.B. and S.P.; formal analysis, K.B. and S.P.; investigation, K.B., D.M., S.P. and G.K.; writing—original draft preparation, K.B.; writing—review and editing, S.P.; visualization, D.M.; supervision, G.K.; project administration, G.K.; funding acquisition, G.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship, and Innovation, under the call RESEARCH-CREATE-INNOVATE (project MNESIAS: “Augmentation and enrichment of cultural exhibits via digital interactive sound reconstitution of ancient Greek musical instruments” code: T1EDK-02823/MIS 5031683).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Allen, A.; Raghubanshi, N. Aerophones in flatland: Interactive wave simulation of wind instruments. ACM Trans. Graph. 2015, 34, 134. [Google Scholar] [CrossRef]
Schnell, N.; Battier, M. Introducing composed instruments, technical and musicological implications. In Proceedings of the 2002 Conference on New Interfaces for Musical Expression, Dublin, Ireland, 24–26 May 2002; Brazil, E., Ed.; National University of Singapore: Singapore, 2002; pp. 1–5. [Google Scholar]
Hunt, A.; Wanderley, M.M.; Paradis, M. The importance of parameter mapping in electronic instrument design. J. New Music Res. 2003, 32, 429–440. [Google Scholar] [CrossRef]
Bilbao, S. Direct simulation for wind instrument synthesis. In Proceedings of the 11th International Digital Audio Effects (DAFx-08) Conference, Espoo, Finland, 1–4 September 2008; pp. 1–8. [Google Scholar]
Kontogeorgakopoulos, A.; Tzevelekos, P.; Cadoz, C.; Kouroupetroglou, G. Using the CORDIS-ANIMA Formalism for the Physical Modeling of the Greek Zournas Shawm. In Proceedings of the International Computer Music Conference (ICMC08), Belfast, UK, 24–29 August 2008; pp. 395–398. [Google Scholar]
Tzevelekos, P.; Georgaki, A.; Kouroupetroglou, G.T. HERON: A Zournas Digital Virtual Musical Instrument. In Proceedings of the 3rd ACM International Conference on Digital Interactive Media in Entertainment and Arts (DIMEA), Athens, Greece, 10–12 September 2008; pp. 325–359. [Google Scholar] [CrossRef]
Tzevelekos, P.; Perperis, T.; Kyritsi, V.; Kouroupetroglou, G. A Component-Based Framework for the Development of Virtual Musical Instruments Based on Physical Modeling. In Proceedings of the 4th Sound and Music Computing Conference, Lefkada, Greece, 11–13 July 2007; Spyridis, C., Georgaki, A., Kouroupetroglou, G., Anagnostopoulou, C., Eds.; National and Kapodistrian University of Athens: Athens, Greece, 2007; pp. 30–37. [Google Scholar]
Pfeifle, F.; Bader, R.M. Real-Time Finite-Difference Method Physical Modeling of Musical Instruments Using Field-Programmable Gate Array Hardware. J. Audio Eng. Soc. 2015, 63, 1001–1016. [Google Scholar] [CrossRef]
Smith, J.O., III. Viewpoints on the history of digital synthesis. In Proceedings of the International Computer Music Conference (ICMC 1991), Montreal, QC, Canada, 16–20 October 1991; pp. 1–10. [Google Scholar]
Beauchamp, J.W. Analysis and Synthesis of Musical Instrument Sounds. In Analysis, Synthesis, and Perception of Musical Sounds. Modern Acoustics and Signal Processing; Beauchamp, J.W., Ed.; Springer: New York, NY, USA, 2007; pp. 1–89. [Google Scholar] [CrossRef]
Smith, J.O., III. Physical Modeling Using Digital Waveguides. Comput. Music J. 1982, 16, 74–91. [Google Scholar] [CrossRef]
Scavone, G. Delay-Lines and Digital Waveguides. In Springer Handbook of Systematic Musicology; Bader, R., Ed.; Springer: Berlin/Heidelberg, Germany, 2018; pp. 259–272. [Google Scholar] [CrossRef]
Carpenter, T.G.F. Developing an Audio Unit Plugin Using a Digital Waveguide Model of a Wind Instrument. Master’s Thesis, Acoustics and Music Technology, University of Edinburgh, Edinburgh, UK, 2012. [Google Scholar]
Smith, J.O. Digital waveguide architectures for virtual musical instruments. In Handbook of Signal Processing in Acoustics; Havelock, D., Kuwano, S., Vorländer, M., Eds.; Springer: New York, NY, USA, 2008; pp. 399–417.
Scavone, G.P. An Acoustic Analysis of Single-Reed Woodwind Instruments with an Emphasis on Design and Performance Issues and Digital Waveguide Modeling Techniques. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 1997.
Fletcher, N.H.; Rossing, T.D. The Physics of Musical Instruments, 1st ed.; Springer: New York, NY, USA, 1991; pp. 449–454.
Cook, P.R. A meta-wind-instrument physical model, and a meta-controller for real-time performance control. In Proceedings of the International Computer Music Conference, San Jose, CA, USA, 14–18 October 1992; Michigan Publishing: Ann Arbor, MI, USA, 1992; pp. 273–276.
Fletcher, N.H. Air flow and sound generation in musical wind instruments. Ann. Rev. Fluid Mech. 1979, 11, 123–146.
Wang, S. Wavelength and end correction in a recorder. ISB J. Phys. 2009, 3, 1–5.
Wolfe, J. The acoustics of woodwind musical instruments. Acoust. Today 2018, 14, 50–56.
Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer Topics in Signal Processing; Benesty, J., Kellermann, W., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; Volume 2, pp. 1–4.
Singer, S.; Nelder, J. Nelder-Mead algorithm. Scholarpedia 2009, 4, 2928.
Luersen, M.A.; Le Riche, R. Globalized Nelder–Mead method for engineering optimization. Comput. Struct. 2004, 82, 2251–2260.
Van Laarhoven, P.J.M.; Aarts, E.H.L. Simulated annealing. In Simulated Annealing: Theory and Applications; Van Laarhoven, P.J.M., Aarts, E.H.L., Eds.; Springer: Dordrecht, The Netherlands, 1987; pp. 7–15.
Kirkpatrick, S.; Gelatt, C.D.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
Polychronopoulos, S.; Memoli, G. Acoustic levitation with optimized reflective metamaterials. Sci. Rep. 2020, 10, 4254.
Bakogiannis, K.; Polychronopoulos, S.; Marini, D.; Terzēs, C.; Kouroupetroglou, G.T. ENTROTUNER: A computational method adopting the musician’s interaction with the instrument to estimate its tuning. IEEE Access 2020, 8, 53185–53195.
Fastl, H.; Zwicker, E. Psychoacoustics, 3rd ed.; Springer: Berlin/Heidelberg, Germany, 2007.

Figure 1. Block diagram of the integrated physical modeling and optimization framework, showing the input parameters and the iterative loop that outputs the optimal set of correction factors, i.e., the set that maximizes the similarity between the synthesized and the recorded signal.

Figure 2. Block diagram of the recorder’s physical model, simulated with the Digital Waveguides Technique, based on [13,14,15].

Figure 3. The degree of similarity (the absolute Pearson correlation coefficient) between the synthesized and the recorded signals for each note, for PM (blue) and PM with optimal correction factors (orange).

Figure 4. The average absolute deviation, in cents, of the fundamental frequency of the synthesized signals from that of the recorded signals for each note, for PM (blue) and PM with optimal correction factors (orange).

Figure 5. Spectral comparison of the synthesized signal (blue line) with the corresponding recording (orange line) for note 6: without the optimal correction factors (upper plot) and with the optimal correction factors (lower plot).

Table 1. The first five partials of the recorded signals for the 9 notes, in Hertz.

Recorded Signals Frequency (Hz)
Note	Fundamental	1st Overtone	2nd Overtone	3rd Overtone	4th Overtone
1	522	1046	1568	2090	2614
2	591	1183	1774	2365	2957
3	661	1322	1982	2644	3305
4	721	1439	2160	2883	3604
5	781	1560	2347	3131	3901
6	883	1767	2659	3536	4414
7	1003	2005	3011	4013	5017
8	1117	2234	3352	4469	5586
9	1213	2428	3642	4855	6070

Table 2. The first five partials of the synthesized signals for the 9 notes, in Hertz, without the optimal correction factors (PM) and with the optimal correction factors (PM-OPT).

Synthesized Signals Frequency (Hz)
Note	Fundamental (PM)	Fundamental (PM-OPT)	1st Overtone (PM)	1st Overtone (PM-OPT)	2nd Overtone (PM)	2nd Overtone (PM-OPT)	3rd Overtone (PM)	3rd Overtone (PM-OPT)	4th Overtone (PM)	4th Overtone (PM-OPT)
1	511	523	1022	1045	1533	1568	2044	2091	2555	2613
2	594	591	1188	1183	1782	1774	2376	2365	2970	2957
3	668	661	1337	1322	2005	1982	2673	2643	3342	3304
4	722	721	1444	1441	2165	2162	2887	2883	3609	3603
5	824	781	1649	1563	2473	2344	3297	3125	4122	3906
6	912	884	1823	1767	2735	2651	3647	3538	4559	4415
7	1013	1003	2027	2007	3040	3010	4054	4013	5068	5017
8	1197	1111	2394	2222	3592	3333	4789	4443	5986	5554
9	1222	1214	2444	2428	3666	3622	4888	4856	6110	6073

Table 3. The absolute deviation (in cents) of the synthesized signals from the recorded signals for the 9 notes, without the optimal correction factors (PM) and with the optimal correction factors (PM-OPT). The table also lists, for each note, the average absolute deviation over all five partials, and the total average absolute deviation over all partials and notes.

Deviation of Synthesized Signals from the Recorded Signals (in Cents)
Note	Fundamental (PM)	Fundamental (PM-OPT)	1st Overtone (PM)	1st Overtone (PM-OPT)	2nd Overtone (PM)	2nd Overtone (PM-OPT)	3rd Overtone (PM)	3rd Overtone (PM-OPT)	4th Overtone (PM)	4th Overtone (PM-OPT)	Partials Average (PM)	Partials Average (PM-OPT)
1	37	3	40	2	39	0	39	1	39	1	39	1
2	9	0	7	0	8	0	8	0	8	0	8	0
3	18	0	20	0	20	0	19	1	19	1	19	0
4	2	0	6	2	4	2	2	0	2	1	3	1
5	93	0	96	3	91	2	89	3	95	2	93	2
6	56	2	54	0	49	5	54	1	56	0	54	2
7	17	0	19	2	17	1	18	0	18	0	18	1
8	120	9	120	9	120	10	120	10	120	10	120	10
9	13	1	11	0	12	10	11	0	11	1	12	2
							Total average absolute deviation				41	2
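
The deviations in Table 3 can be reproduced from the frequencies in Tables 1 and 2. The sketch below assumes the standard cents conversion, 1200·log2 of the frequency ratio, which this section does not state explicitly; the function name `cents_deviation` and the variable names are illustrative only. The frequencies are the rows for note 5 copied from Tables 1 and 2.

```python
import math


def cents_deviation(f_synth, f_ref):
    """Absolute deviation of f_synth from f_ref in cents.

    Assumes the usual definition: 1200 cents per octave, so the
    deviation is |1200 * log2(f_synth / f_ref)|.
    """
    return abs(1200.0 * math.log2(f_synth / f_ref))


# Note 5: first five partials in Hz, copied from Tables 1 and 2.
recorded = [781, 1560, 2347, 3131, 3901]
pm = [824, 1649, 2473, 3297, 4122]
pm_opt = [781, 1563, 2344, 3125, 3906]

# Per-partial deviations, rounded to whole cents as in Table 3.
dev_pm = [round(cents_deviation(s, r)) for s, r in zip(pm, recorded)]
dev_opt = [round(cents_deviation(s, r)) for s, r in zip(pm_opt, recorded)]

# Per-note "Partials Average" column: mean of the five rounded deviations.
avg_pm = round(sum(dev_pm) / len(dev_pm))
avg_opt = round(sum(dev_opt) / len(dev_opt))

print(dev_pm, avg_pm)
print(dev_opt, avg_opt)
```

For note 5 this gives the same whole-cent values as the corresponding row of Table 3. Other rows may differ by a cent, because the tables list frequencies rounded to the nearest Hertz.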

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

