Article

Digital Signal Processing of the Inharmonic Complex Tone

by Tatjana Miljković, Jelena Ćertić, Miloš Bjelić * and Dragana Šumarac Pavlović
School of Electrical Engineering, University of Belgrade, 11000 Belgrade, Serbia
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(15), 8293; https://doi.org/10.3390/app15158293
Submission received: 30 June 2025 / Revised: 23 July 2025 / Accepted: 24 July 2025 / Published: 25 July 2025
(This article belongs to the Special Issue Musical Acoustics and Sound Perception)

Abstract

In this paper, a set of digital signal processing (DSP) procedures tailored for the analysis of complex musical tones with prominent inharmonicity is presented. These procedures are implemented within a MATLAB-based application and organized into three submodules. The application follows a structured DSP chain: basic signal manipulation; spectral content analysis; estimation of the inharmonicity coefficient and the number of prominent partials; design of a dedicated filter bank; signal decomposition into subchannels; subchannel analysis and envelope extraction; and, finally, recombination of the subchannels into a wideband signal. Each stage in the chain is described in detail, and the overall process is demonstrated through representative examples. The concept and the accompanying application are initially intended for rapid post-processing of recorded signals, offering a tool for enhanced signal annotation. Additionally, the built-in features for subchannel manipulation and recombination enable the preparation of stimuli for perceptual listening tests. The procedures have been tested on a set of recorded tones from various string instruments, including those with pronounced inharmonicity, such as the piano, harp, and harpsichord.

1. Introduction

Digital signal processing (DSP) is widely used in different areas, including the processing of musical signals [1,2]. However, different types of signals require tailored approaches. Algorithms that work well for certain signals may produce inaccurate or misleading results when applied to others. For example, a musical tone recorded with a standard sampling rate of 48 kHz is technically a wideband and non-stationary signal, exhibiting rapid changes over time and frequency. For this reason, it is important to adapt DSP techniques to this type of signal and to use algorithms designed with its nature in mind.
Musical tones generated by real instruments typically consist of a fundamental frequency and a series of partials. These components form a complex tone that can deviate from strict harmonic alignment due to the physical properties of the instrument [3]. Such deviations are especially pronounced in plucked or struck string instruments, where the stiffness of the string causes inharmonicity—a phenomenon in which the partials are not located at exact integer multiples of the fundamental frequency [4,5,6]. In the literature, the phenomenon of inharmonicity is described by the inharmonicity coefficient B. This coefficient is determined not only by the stiffness of the string but also by other physical parameters such as Young’s modulus of elasticity, tension, radius, and the length of the string [7]. Furthermore, a mathematical model has been developed that relates the inharmonicity coefficient B to the partial frequencies fn of the complex tone:
f_n = n f_0 \sqrt{1 + n^2 B}, \quad n = 1, 2, \ldots, \qquad (1)
where n denotes the order of the partial, f0 is the fundamental frequency, and B represents the inharmonicity coefficient [8]. It should be noted that in the presence of inharmonicity, the fundamental frequency is not exactly the same as the frequency of the first or fundamental partial.
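As an illustration of (1) (a minimal Python sketch, separate from the MATLAB-based MuPI application; the numeric values are assumed for illustration), the inharmonic partial array can be computed and compared with its harmonic counterpart:

```python
import numpy as np

def partial_frequencies(f0, B, N):
    """Inharmonic partials f_n = n * f0 * sqrt(1 + n^2 * B), n = 1..N."""
    n = np.arange(1, N + 1)
    return n * f0 * np.sqrt(1 + n**2 * B)

# Assumed piano-like values for illustration only
f = partial_frequencies(f0=110.0, B=1e-4, N=20)
# The 20th partial lies above its harmonic position 20 * f0 = 2200 Hz
print(f[19] - 20 * 110.0)   # about +43.6 Hz
```

Even a modest B shifts high-order partials by tens of hertz, which is why an analysis grid must follow (1) rather than an exactly harmonic series.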
Typical values of the inharmonicity coefficient B lie within the range of 10^−5 to 10^−3. The inharmonicity coefficient B is related to the physical string parameters by the following equation [5]:
B = \frac{\pi^3 E d^4}{64 T l^2}, \qquad (2)
where E represents Young’s modulus of elasticity (in Pascal), T is the wire tension (in Newtons), and d and l are the wire diameter and active wire length (in meters).
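For orientation, the string-parameter relation above can be evaluated numerically (a Python sketch; the wire values below are assumed illustrative figures, not taken from the paper):

```python
import numpy as np

def inharmonicity_coefficient(E, d, T, l):
    """B = pi^3 * E * d^4 / (64 * T * l^2)."""
    return np.pi**3 * E * d**4 / (64.0 * T * l**2)

# Illustrative steel-wire values: E = 200 GPa, d = 1 mm, T = 700 N, l = 0.6 m
B = inharmonicity_coefficient(E=2.0e11, d=1.0e-3, T=700.0, l=0.6)
print(B)   # about 3.8e-4, inside the typical 1e-5 .. 1e-3 range
```

Note the strong d^4 dependence: thicker wires at a given tension and length are markedly more inharmonic.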
The perceptual consequences of inharmonicity have been examined from multiple perspectives, including pitch perception [9,10], timbre characteristics [11,12], the perception of musical intervals [13,14,15], and overall tone quality [7,16,17,18]. Studies have also investigated perceptual thresholds for inharmonicity, identifying limits beyond which the sound of an instrument is perceived as unnatural or unpleasant [19], thereby underscoring the role of inharmonicity in tone production and instrument tuning. Notably, inharmonicity alters the spacing between successive partials, which affects pitch perception by disrupting the harmonic template typically used by the auditory system. This effect has been empirically demonstrated in listening experiments with piano tones, where inharmonicity causes a perceived upward shift in pitch relative to harmonic tones—an effect that contributes to tuning practices such as the Railsback stretch [18].
Given the inharmonicity, an inharmonic complex tone {x[k]} can be represented as a sum of its partial components:
x[k] = \sum_{n=1}^{N} A_n[k] \cos\!\left(\frac{2\pi}{f_s}\left(f_n + f_n^{v}[k]\right) k + \varphi_n\right) + rest[k], \qquad (3)
where N is the number of prominent partials, An[k] is a time-varying envelope, fn is the frequency of the n-th inharmonic partial, f_n^v[k] is a slowly varying component of the frequency, and φn is the signal phase. The component rest[k] contains random noise and other artifacts, including quasi-sinusoidal components with frequencies other than the expected partial values. Those additional sinusoidal components can arise from different phenomena specific to each instrument. Variation of the partial frequency, often referred to as pitch glide, is typically small and is usually neglected in rough signal analyses. This assumption is supported by empirical findings in the analysis of plucked string instruments, where pitch glide occurs primarily during the note onset and decays rapidly, often falling below perceptual thresholds [9,20]. For many DSP applications, including spectral analysis and partial tracking, this small, transient variation can be neglected in models.
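A synthetic test tone following the model above, with the pitch glide and rest[k] terms omitted and an assumed exponentially decaying envelope per partial, can be generated as follows (Python sketch for illustration; all numeric values are assumptions):

```python
import numpy as np

fs = 48000
K = fs                      # one second of signal
k = np.arange(K)
f0, B, N = 110.0, 1e-4, 10

x = np.zeros(K)
for n in range(1, N + 1):
    fn = n * f0 * np.sqrt(1 + n**2 * B)   # inharmonic partial frequency
    An = np.exp(-3.0 * n * k / fs) / n    # assumed envelope: higher partials decay faster
    x += An * np.cos(2 * np.pi * fn / fs * k)
```

Such synthetic tones are convenient for verifying an analysis chain, because the envelopes, phases, and the coefficient B are known exactly.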
In this paper, a set of digital signal processing procedures developed for the analysis of string instrument musical tones with prominent inharmonicity is proposed and presented. These procedures are organized in the form of the MuPI (Musical Signal Processing with Prominent Inharmonicity) application, comprising three major modules. The application, developed in MATLAB R2023b, is focused on the quick inspection of recorded inharmonic tones following a processing chain. The application is intended for users with a good understanding of musical signal phenomena and basic DSP skills. Each module is designed as a Graphical User Interface (GUI) application.
The approach presented in this paper relies on observing the signal in the time and frequency domains in parallel. As a result, additional information can be extracted from the signal and included in its description, such as the fundamental frequency, the number of prominent partials, and the inharmonicity coefficient B. The last submodule of the application performs processing of the subchannel signals, which correspond to the frequency bands of the prominent partials. Additional features, such as the time envelopes of individual partials, can be extracted from the subchannel analysis. For the decomposition of the signal, a multichannel doubly complementary filter bank with an additional phase correction block is proposed, which improves the time alignment between the subchannel signals.
To position the proposed MuPI application within the landscape of existing audio signal analysis tools, a comparative evaluation was conducted against three well-established frameworks: LTFAT [21], Essentia [22], and MIRtoolbox [23]. The comparison was performed across several criteria relevant to time–frequency analysis, feature extraction, signal manipulation, and system flexibility (Table 1).
Unlike LTFAT and MIRtoolbox, which primarily rely on standard forms of frequency analysis, MuPI integrates a specific doubly complementary filter bank design, enabling more precise decomposition of signals into partials. Furthermore, while Essentia and MIRtoolbox emphasize high-level musical descriptors (e.g., tempo, key, chord progression), MuPI focuses on domain-specific parameters such as the inharmonicity coefficient and partial envelope extraction, which are not directly supported in the aforementioned tools. In terms of signal manipulation, MuPI offers functionality comparable to LTFAT by providing users with detailed control over subchannel signals and their recombination. This capability is particularly important for preparing stimuli in perceptual listening experiments. While Essentia and MIRtoolbox offer limited or no reconstruction capabilities, MuPI supports ideal reconstruction from filter banks, with a user-defined number of channels, further enhancing its flexibility.
Compared to LTFAT, MuPI is less broad but more focused on the specific task of analyzing a single complex tone of a string instrument. The main goal was to keep the application as simple and user-friendly as possible for researchers in the field of musical signal processing who are not deeply familiar with advanced DSP topics. For that reason, the application is designed as an intuitive GUI with some parameters preset and a limited number of options.
In Section 2, an overview of the major algorithms used in the developed application is presented. In Section 3, the details of the developed application are provided. Section 4 contains examples and results of the proposed application. Section 5 discusses the results, and Section 6 gives concluding remarks.

2. DSP Algorithms Used for the Analysis of String Instruments’ Recordings

The single musical tone signal is a wideband signal with prominent sinusoidal components. The signal is usually short, with very short quasi-stationary intervals. For this reason, the analysis in the developed application is performed in sub-bands: the recorded signal is split into sub-bands by a filter bank. The application supports time and frequency analysis, as well as a set of simple subchannel manipulations, i.e., basic modifications of the filtered signals and their recombination into the wideband signal.

2.1. Digital Filter Bank

A digital filter bank is used to decompose the signal into subchannels. The goal is to provide insight into the time and frequency parameters of each partial separately. Although any set of filters can be used for this task [2], the selected filter bank has the advantage of being doubly complementary (both all-pass and power complementary) [24]. The filter bank can be multirate [25] or single-rate. The subchannel output signals are not decimated, allowing users to listen to each subchannel at the same playback sampling frequency as the original signal. The double complementarity of the proposed bank enables the reconstruction of the input signal by simply summing all subchannels.
The filter bank is designed as a tree structure, a cascaded connection of doubly (all-pass and power) complementary two-channel filter banks [26,27], as shown in Figure 1 for four channels. The number of subchannels can be arbitrarily large. With the additional all-pass filters (A02(z) and A03(z) in Figure 1), this structure preserves the double complementarity property. The sum of the channel signals {xn0[k]} is equal to the input signal filtered by the cascaded connection of all-pass filters:
A_0^{total}(z) = \prod_{n=1}^{N+1} A_{0n}(z), \qquad (4)
where N + 2 is the number of channels. The final block in each channel is the phase correction block. It is implemented by inverse filtering of the subchannel signals by A0total(z) [28]. With this additional phase correction block, the sum of all channel outputs is a zero-phase all-pass filtered version of the input signal. In theory, this means that the input signal is completely restored by the sum of all channels. The only discrepancies occur in the samples at the end of the signal and cannot be completely avoided due to the limitations of zero-phase filtering. The proposed application is developed for off-line processing of a prerecorded signal, so the proposed phase correction can be easily implemented. The proposed method preserves the time differences between the signal components. Additionally, it allows modifications of the signal, such as canceling one of the channels or similar simple manipulations.
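The phase-correction idea can be sketched as follows: because A0total(z) is all-pass, filtering a subchannel signal backwards (time-reversed) through the same filter cancels the phase accumulated in the analysis stage while leaving the magnitude untouched. A minimal Python sketch with an illustrative first-order all-pass section (not the application's actual filters):

```python
import numpy as np
from scipy.signal import lfilter

def allpass_phase_correct(y, b, a, pad=512):
    """Cancel the phase of an all-pass filter by time-reversed filtering."""
    yp = np.concatenate([y, np.zeros(pad)])   # room for the filter tail to decay
    return np.flip(lfilter(b, a, np.flip(yp)))[:len(y)]

# Illustrative first-order all-pass A(z) = (alpha + z^-1) / (1 + alpha z^-1)
alpha = 0.5
b, a = [alpha, 1.0], [1.0, alpha]

x = np.random.default_rng(0).standard_normal(1000)
y = lfilter(b, a, np.concatenate([x, np.zeros(512)]))  # analysis-stage all-pass
x_rec = allpass_phase_correct(y, b, a)[:len(x)]
# A(z) * A(1/z) has unit magnitude and zero phase, so the signal is restored
print(np.max(np.abs(x_rec - x)))   # close to machine precision
```

The residual error is confined to the very end of the signal, which matches the off-line usage scenario described above.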
Each two-channel filter bank is obtained by spectral transformation of a prototype half-band filter bank [24,29]. The prototype low-pass filter is designed as an odd-order Butterworth filter or an elliptic minimal Q-factor (EMQF) filter [30,31]. The poles of the real-coefficient half-band low-pass transfer function H_{LP}^{HB}(z) are located on the imaginary axis. One of them is located at the origin, and the remaining poles form complex conjugate pairs. In that case, the complementary filter pair, the low-pass filter H_{LP}^{HB}(z) and the high-pass filter H_{HP}^{HB}(z), can be implemented as a parallel connection of two all-pass filters, A_0^{HB}(z) and A_1^{HB}(z) [31]:
H_{LP,HP}^{HB}(z) = \frac{A_0^{HB}(z) \pm A_1^{HB}(z)}{2}. \qquad (5)
The all-pass branches consist of a cascade connection of simple second-order all-pass sections, with an additional delay corresponding to the pole at the origin in the A1(z) branch:
A_0^{HB}(z) = \prod_{m=2,4,\ldots}^{(M+1)/2} \frac{\beta_m^{HB} + z^{-2}}{1 + \beta_m^{HB} z^{-2}}, \qquad A_1^{HB}(z) = z^{-1} \prod_{m=3,5,\ldots}^{(M+1)/2} \frac{\beta_m^{HB} + z^{-2}}{1 + \beta_m^{HB} z^{-2}}, \qquad (6)
where β_m^{HB} is the square of the pole radius r_m^{HB}, m = 2, 3, …, (M + 1)/2, and β_m^{HB} < β_{m+1}^{HB}. The odd half-band filter order M is calculated from the required stop-band attenuation As and the required pass-band edge frequency ωp0. As can be concluded from (6), the zeros of the all-pass transfer functions are reciprocal to the poles.
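The decomposition in (5) and (6) can be checked numerically. The sketch below (Python, independent of the MuPI code) builds an odd-order digital Butterworth half-band filter, whose poles lie on the imaginary axis, distributes the squared pole radii between the two all-pass branches, and verifies the doubly complementary property:

```python
import numpy as np
from scipy import signal

M = 5                                          # odd prototype order
_, p, _ = signal.butter(M, 0.5, output='zpk')  # half-band Butterworth poles
# One pole sits at the origin (the z^-1 delay); the rest are imaginary pairs
betas = np.sort(np.abs(p[np.abs(p) > 1e-8]))[::2] ** 2  # one beta per pair

def allpass(bs, w, delay=0):
    """Frequency response of a cascade of sections (b + z^-2) / (1 + b z^-2)."""
    H = np.exp(-1j * w * delay).astype(complex)
    for b in bs:
        z2 = np.exp(-2j * w)
        H *= (b + z2) / (1 + b * z2)
    return H

w = np.linspace(0, np.pi, 1024)
A0 = allpass(betas[0::2], w)                   # m = 2, 4, ...
A1 = allpass(betas[1::2], w, delay=1)          # m = 3, 5, ... plus the delay
H_lp, H_hp = (A0 + A1) / 2, (A0 - A1) / 2      # the '+'/'-' branches of (5)
# Power complementary: |H_lp|^2 + |H_hp|^2 = 1; all-pass complementary: H_lp + H_hp = A0
print(np.max(np.abs(np.abs(H_lp)**2 + np.abs(H_hp)**2 - 1)))   # ~ 0
```

Because both branches are all-pass, the power-complementary identity holds at every frequency, and the pair crosses over at ω = π/2 with |H| = 1/√2.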
The advantage of Butterworth and EMQF filters is that this half-band filter pair (a simple two-channel filter bank) can be easily transformed into a filter bank with an arbitrary crossover frequency [24,27,30]. The only additional parameter needed for the spectral transformation is the exact value of the crossover frequency ωc. By the transformation, each second-order section (6) is transformed into a section with changed poles and zeros:
\frac{\beta_m^{HB} + z^{-2}}{1 + \beta_m^{HB} z^{-2}} \;\rightarrow\; \frac{\beta_m + \alpha (1 + \beta_m) z^{-1} + z^{-2}}{1 + \alpha (1 + \beta_m) z^{-1} + \beta_m z^{-2}}, \qquad (7)
where α = −cos ω_c, β_l = (β_l^{HB} + α_1^2)/(β_l^{HB} α_1^2 + 1), and α_1 = (1 − \sqrt{1 − α^2})/α = −tan((π/2 − ω_c)/2).
The trivial first-order section is transformed into a non-trivial all-pass section:
z^{-1} \;\rightarrow\; \frac{\alpha_1 + z^{-1}}{1 + \alpha_1 z^{-1}}. \qquad (8)
The transformation procedures (7) and (8) rely on the spectral transformation of digital filters [32], which can be applied to different classes of filters. In the proposed application, the starting half-band filter is limited to the Butterworth or EMQF type because, for these filter types, the coefficient α is the same in all second-order sections (7), allowing rapid calculation of the resulting filter pair. By the transformation, the pass-band edge frequency ωp of the resulting low-pass filter is set to a value determined by the following [24]:
\tan\frac{\omega_p}{2} = \frac{1}{\xi} \tan\frac{\omega_c}{2}, \qquad (9)
where ωc is the crossover frequency of the transformed filter pair, and ξ is the selectivity parameter determined by the pass-band edge frequency ωp0 of the prototype low-pass half-band filter
\xi = \frac{1}{\tan^2(\omega_{p0}/2)}. \qquad (10)
The stop-band edge frequency of the resulting low-pass filter ωs is determined by the following:
\tan\frac{\omega_s}{2} = \xi \tan\frac{\omega_c}{2}. \qquad (11)
It can be seen from (9)–(11) that the transition band of the resulting filter pair depends on its crossover frequency and on the pass-band edge frequency of the prototype low-pass half-band filter.
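A quick numerical check of the edge-frequency relations above (a Python sketch; the crossover value is an arbitrary illustrative choice):

```python
import numpy as np

wp0 = 0.45 * np.pi                        # prototype half-band pass-band edge
xi = 1.0 / np.tan(wp0 / 2) ** 2           # selectivity parameter
wc = 0.30 * np.pi                         # desired crossover frequency

wp = 2 * np.arctan(np.tan(wc / 2) / xi)   # pass-band edge of the new low-pass
ws = 2 * np.arctan(xi * np.tan(wc / 2))   # stop-band edge of the new low-pass
print(wp < wc < ws)                       # True: transition band straddles wc
```

As ωp0 approaches π/2, ξ approaches 1 and the transition band shrinks toward the crossover frequency.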
In the application presented in this paper, the array of crossover frequencies (ωc) is calculated based on the estimated positions of the partials for determined values of fundamental frequency f0 and inharmonicity coefficient B as a normalized geometric mean of two successive partials:
\omega_{c,n+1} = \frac{2\pi}{f_s} \sqrt{f_n f_{n+1}}, \quad n = 1, 2, \ldots, N; \qquad \omega_{c,1} = \frac{2\pi}{f_s} f_1, \qquad (12)
where fn, n = 1, 2, …, N + 1 are frequencies (in Hz) calculated as in (1), and fs is the sampling frequency of the recorded signal. It should be noted that in the recorded signal, some partials can be totally suppressed (i.e., missing). The 3 dB bandwidth of channel n is defined by the crossover frequencies ωc,n and ωc,n+1. In the proposed application, the total number of channels is the number of partials N enlarged by 2. The first channel (with index zero) contains the components of the signal below the first partial, i.e., below the pitch frequency, and the last channel contains the higher-frequency components of the signal, usually with no prominent partials. However, the same filter bank structure can be used for different analysis scenarios, including octave or uniform bank designs. The overall computational complexity depends on the order of the prototype filter and the number of channels. In the current form of the application, each additional channel requires an additional filter pair, obtained from the prototype filter by applying transformations (7) and (8). The number of channels used in the testing phase of the design was up to 90, with stop-band attenuations of up to 80 dB and ωp0 in the range [0.45π, 0.495π]. Due to the inherent properties of the Butterworth filter, its applicability is limited to a stop-band attenuation of 60 dB and ωp0 up to 0.45π.
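The crossover array of (12) can be sketched as follows (a Python illustration; MuPI itself computes this in MATLAB):

```python
import numpy as np

def crossover_frequencies(f0, B, N, fs):
    """Crossovers at geometric means of successive partials; the first at f_1."""
    n = np.arange(1, N + 2)                    # partials 1 .. N+1
    fn = n * f0 * np.sqrt(1 + n**2 * B)
    wc = np.empty(N + 1)
    wc[0] = 2 * np.pi / fs * fn[0]
    wc[1:] = 2 * np.pi / fs * np.sqrt(fn[:-1] * fn[1:])
    return wc

wc = crossover_frequencies(f0=110.0, B=1e-4, N=10, fs=48000)
print(len(wc))   # 11 crossovers, defining N + 2 = 12 channels
```

Placing each crossover at the geometric mean of two successive partials keeps every partial comfortably inside its channel's 3 dB bandwidth.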
The proposed design of the filter bank is suitable for analyzing a single tone of a harmonic musical signal, with or without noticeable inharmonicity, and for different numbers of prominent partials and fundamental frequencies within a wide range.

2.2. The Short-Time Fourier Transform

The single-tone musical signal is non-stationary. For such signals, analysis is typically performed frame by frame. Frame-by-frame processing is included in the proposed application, primarily for time-dependent spectrum estimation.
The Short-Time Fourier Transform (STFT) is used for spectral analysis [1]:
X_{STFT}(p, e^{j\omega}) = \sum_{k=-\infty}^{\infty} x[k]\, w[k - pR]\, e^{-j\omega k} = \sum_{l=0}^{L-1} x[l + pR]\, w[l]\, e^{-j\omega (l + pR)}, \qquad (13)
where w[l], l = 0, 1, …, L−1 is the window function of length L, p is the frame index, and R (in samples) is the shift between successive frames. The expression can be transformed into the form:
X_{STFT}(p, e^{j\omega}) = e^{-j\omega pR} \sum_{l=0}^{L-1} x[l + pR]\, w[l]\, e^{-j\omega l}. \qquad (14)
It can be implemented by the Discrete Fourier Transform (DFT) of each frame as follows:
X_{STFT}[p, q] = e^{-j\frac{2\pi q}{N_{fft}} pR} \sum_{l=0}^{L-1} x[l + pR]\, w[l]\, e^{-j\frac{2\pi q}{N_{fft}} l}, \quad q = 0, 1, \ldots, N_{fft} - 1, \qquad (15)
where Nfft is the length of the DFT. If it is set to a value greater than L, the time array is zero-padded. Throughout this paper, the term STFT is used for the time-dependent analysis, and the term DFT is used for a single frame or the whole-length signal.
Currently, two types of window functions are supported: Hann and Blackman–Harris. The parameter R is chosen based on the Constant Overlap-Add (COLA) condition [1]:
\sum_{p=-\infty}^{\infty} X_{STFT}(p, e^{j\omega}) = \sum_{p=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} x[k]\, w[k - pR]\, e^{-j\omega k} = \sum_{k=-\infty}^{\infty} x[k]\, e^{-j\omega k} \sum_{p=-\infty}^{\infty} w[k - pR] = X(e^{j\omega}), \quad \text{if} \; \sum_{p=-\infty}^{\infty} w[k - pR] = 1 \; \text{for each} \; k. \qquad (16)
In the current version of the application, the ratio R/L is set to 0.125 because, for that value, condition (16) is satisfied for both supported window functions. In the current form of the application, the inverse STFT is not used, and for that reason, the COLA condition is not strictly required. However, if additional modules relying on the inverse STFT are included, this issue will be considered more carefully.
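Condition (16) can be checked numerically for the two supported windows at the preset hop (a SciPy sketch, outside the MATLAB application):

```python
from scipy.signal import get_window, check_COLA

L = 1024
R = int(0.125 * L)    # hop of L/8, as preset in the application
cola = {name: check_COLA(get_window(name, L), L, L - R)
        for name in ('hann', 'blackmanharris')}
print(cola)           # both windows satisfy COLA at this hop
```

The periodic (DFT-even) window variants are used here, which is the form for which the COLA identity holds exactly.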
In the presented application, an overlapping DFT series is used for visualization instead of the typically used spectrogram. The reason is that the STFT is calculated only for a limited number of frames and is used to track the frequency glide of the signal's partials. A window length set to achieve good time resolution is usually not good enough for satisfactory resolution in the frequency domain. For detecting the positions of the maxima of the channel signal's spectrum, zero padding improves the result; however, it does not resolve the resolution problem, i.e., components that are closely spaced in frequency remain difficult to resolve.

2.3. Envelope Estimation

By the proposed digital filter bank, an array of N + 2 subchannel signals is obtained, {xn[k]}, k = 0, 1, …, K−1, n = 0, 1, …, N + 1. Each element of the array is a signal of the same length K as the selected segment of the recorded signal. If the bank parameters are properly chosen, for channels 1 to N, the content corresponds to one partial of the signal. The signals xn[k] can be further analyzed in the time and frequency domains. The envelope of a musical signal can be estimated in various ways; an overview of methods is presented in [33,34]. Since in this case the signals are prefiltered, i.e., can be classified as narrowband signals, estimation by the Hilbert transform is applied in the proposed application:
A_n[k] = \sqrt{x_n^2[k] + \hat{x}_n^2[k]}, \qquad (17)
where x ^ n k is the Hilbert transform of xn[k]. In some cases, it is useful to express envelopes mathematically. In the proposed application, the approximation equation is a modified version of the expression proposed in [35]:
A_n^{fit}[k] = A_{n1}\, t[k]^{A_{n2}} \left(1 - A_{n3}\, t[k]^{A_{n4}}\right)^{A_{n5}}, \qquad (18)
where parameters An1An5 are calculated by fitting the envelope An[k] in MATLAB by the fit function [36]. In Table 2, the starting, minimum, and maximum input values for all parameters are provided. The other input parameters of the fit function are set to default values.
The proposed approach is simplified compared to [35] because the signal is not split into the segments traditionally used in musical tone analysis and synthesis [37]. Instead, an additional component is inserted into (18). The reason is that the proposed application is intended for rapid automatic processing of recordings, with only a very limited set of allowed modifications and recombinations of subchannel signals. On the other hand, the extracted envelopes are usually not perfect, i.e., additional smoothing or other preprocessing is needed before split-point detection [33,35]. With that in mind, expression (18) was chosen as a compromise between a perfect envelope fit and the complexity of the solution. This form of the envelope expression has proven experimentally to be sufficient for the automatic processing of most test signals in the expected usage scenario. However, if different approaches are needed, the modular form of the current application allows adding new modules or replacing the third submodule, which uses the described envelope approximation approach.
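The Hilbert-transform envelope estimate can be sketched as follows (a Python illustration with an assumed synthetic partial; MuPI performs the equivalent step in MATLAB):

```python
import numpy as np
from scipy.signal import hilbert

fs = 48000
k = np.arange(fs)                         # one second
true_env = np.exp(-4.0 * k / fs)          # assumed decaying envelope
x = true_env * np.cos(2 * np.pi * 440 * k / fs)

env = np.abs(hilbert(x))                  # |analytic signal| = sqrt(x^2 + x_hat^2)
mid = slice(1000, fs - 1000)              # ignore edge effects of the transform
print(np.max(np.abs(env[mid] - true_env[mid])))   # small
```

For narrowband subchannel signals, this estimate is accurate away from the segment edges; the extracted envelope can then be passed to a parametric fit such as (18).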

2.4. Inharmonicity Coefficient Estimation from the Recorded Signal

The presented application is primarily designed for analyzing musical tones produced by string instruments. For plucked or struck string instruments, the inharmonicity is not negligible. Although inharmonicity analysis is not the primary goal of the presented application, the inharmonicity must be estimated because the design of the filter bank relies on the values of the partial frequencies. Even though the position shifts caused by inharmonicity may seem small, for signals with a long array of prominent partials, the mismatch between the harmonic and inharmonic arrays can be significant for higher-index partials. For this reason, it is necessary to estimate the value of the inharmonicity coefficient B before designing the filter bank. In this application, this is achieved through simple visual matching between the spectral coefficients and an array of frequencies obtained with an assumed inharmonicity coefficient value. By manually changing the value of B, a result that is good enough for the filter bank design can be obtained. In the second step, the result is verified once again. After filtering the signal, the spectral maxima of each subchannel are estimated. Based on those values, the inharmonicity coefficient is recomputed using the algorithm proposed in [38]. A mismatch between the values can be used as a trigger for further analysis of the recorded signal. The algorithm [38] is simple and suitable for automatic processing. It begins with an initially presumed value of the inharmonicity coefficient and an array of frequencies corresponding to the partials of the signal, extracted from the recording before the core algorithm starts. The length of the partial frequency array depends on the instrument and can range from just a few to more than 50 components.
The algorithm is an iterative procedure that recalculates the inharmonicity coefficient in each iteration, minimizing the difference between the partial frequencies estimated from the recorded signal and the array of frequencies calculated using (1). In the presented application, it is implemented as an additional module that can be activated after the filter bank is designed.
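The principle behind such an estimation can be sketched with a simple non-iterative least-squares fit (an illustration of the idea only, not the actual PFD algorithm of [38]): squaring (1) gives (f_n/n)^2 = f0^2 + (f0^2 B) n^2, which is linear in n^2:

```python
import numpy as np

def estimate_f0_B(n, fn):
    """Fit (fn/n)^2 = f0^2 + (f0^2 * B) * n^2 by linear least squares."""
    y = (fn / n) ** 2
    A = np.column_stack([np.ones_like(y), n.astype(float) ** 2])
    c0, c1 = np.linalg.lstsq(A, y, rcond=None)[0]
    return np.sqrt(c0), c1 / c0            # f0 estimate, B estimate

# Synthetic check: partials generated with known f0 and B are recovered
n = np.arange(1, 21)
fn = n * 65.41 * np.sqrt(1 + n**2 * 3e-4)
f0_est, B_est = estimate_f0_B(n, fn)
print(f0_est, B_est)   # ~ 65.41, ~ 3e-4
```

On real recordings, the measured maxima are noisy and some partials may be missing, which is why a robust iterative refinement such as [38] is preferred over a single direct fit.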
To evaluate the performance of the MuPI implementation of the algorithm [38] for estimating the inharmonicity coefficient, a comparative test was conducted using reference data published in the original paper [38]. In principle, the algorithm consists of two steps: estimation of the partial frequency array and iterative evaluation of the inharmonicity coefficient B. The original paper introduced a partial frequencies deviation (PFD) algorithm and validated it using synthetic piano tones with known inharmonicity profiles for key numbers 1–35 (f0 ∈ [27.5, 195.99] Hz). The same synthetic dataset was analyzed using the MuPI implementation of the inharmonicity estimation method on the array of frequencies estimated as the maximal values of each subchannel spectrum. The results obtained from MuPI were compared against the original synthetic reference curve published in the paper. Figure 2 illustrates the comparison: the solid black line represents the reference value of the inharmonicity coefficient B used for the synthetic piano model, while red dots denote the values estimated by the MuPI implementation.
The visual overlap of the two curves confirms that the MuPI implementation of the algorithm [38] achieves close agreement with the target values. Quantitatively, all estimated inharmonicity coefficients are within ±2% of the reference values, indicating high numerical accuracy. In particular, the shape of the inharmonicity profile, characterized by a decreasing trend in the bass range and an increase above key number 28, is faithfully reproduced.
This comparison validates the accuracy of the MuPI implementation of the estimation procedure and demonstrates that MuPI's concept is suitable for fine-grained spectral parameter analysis of inharmonic tones.

3. The Developed Application Details

In this section, the features of the developed application are presented. The overall structure follows the scheme given in Figure 3. Three submodules, each with a dedicated GUI, have been developed. The signals and other data are transferred from one submodule to another via temporarily saved files. MATLAB 2023b (Update 5, version 23.2.0.2459199) was used to create the application. The toolboxes required to run the application are the Curve Fitting Toolbox (version 23.2) and the Signal Processing Toolbox (version 23.2). The modular structure of the application enables the introduction of new submodules, the improvement of existing submodules without a full reorganization of the application, or the replacement of existing submodules with new ones.

3.1. Submodule MuPI_S

The starting submodule, MuPI_S, is designed to load the recorded signal and perform simple manipulations, which can be thought of as signal preprocessing. The GUI design is illustrated in Figure 4. The content of the signal remains unchanged, except for the time truncation of the recording. The time truncation allows for the direct processing of recordings, i.e., discarding parts of the recording that contain no content related to the analyzed signal. The whole application is dedicated to the analysis of a complex single musical tone, as modeled by (3). The spectral content of the signal changes over time, and the proposed submodule provides insight through the simultaneous observation of different parts of the signal. Up to four segments of the signal can be briefly analyzed. The window length (control “T Win Spec”), window type (Hann or Blackman–Harris, control “Win type”), and number of DFT/STFT points (control “N DFT”) for further analysis are also set in this phase of the processing. The STFT is calculated as described in Section 2.2 and is defined by the window function and Nfft parameters set by the user. The ratio R/L is preset to 0.125, and the number of successive windows is limited to 50 to speed up the calculation and avoid extremely large arrays or matrices. The first defined segment is considered the “whole” signal and is used in further analysis through the complete processing chain. The second segment is also transferred to the next submodule and can be used to verify the significance of the partials. The outputs of this submodule, transferred to the next submodule via a temporarily saved *.mat file, are the samples of the two signal segments, the selected window length and type, the selected values for the single-frame DFT analysis, and the file path of the recorded signal.

3.2. Submodule MuPI_B

The main task realized in the second submodule, MuPI_B, is the design of the complementary filter bank. The main window of the second submodule GUI is given in Figure 5. Initially, the spectrum of the signal segment selected in the previous step is plotted. The application is designed for analyzing signals with prominent inharmonicity. For that reason, the first step is to manually set the fundamental frequency f0 and the inharmonicity coefficient B (1). The values can be set by a slider control or by entering exact values. A set of dashed red helplines indicates the expected positions of the partials based on the parameters f0 and B set by the user. By adjusting these red lines, the user can manually choose the values of f0 and B. Those values are used for calculating the crossover frequencies of the filter bank (12). The number of bank channels is the number of prominent partials entered by the user, enlarged by 2. The parameters of the prototype half-band filter also have to be set. The default stop-band attenuation As = 60 dB and pass-band edge normalized frequency ωp0 = 0.45π are preset. The selectivity of the filter bank channel filters increases as ωp0 increases [31]. Either the Butterworth or the EMQF filter can be selected. Theoretically, the design explained in Section 2.1 yields a stable digital filter. The practical limits arising from the numerical instability of calculating the order and coefficients of the EMQF prototype and the transformed filters are As = 80 dB and ωp0 = 0.495π. The filter order for the Butterworth filter is larger compared to the EMQF filter and, as a result, the practical limits are further decreased to As = 60 dB and ωp0 = 0.45π. After all parameters are set, the filter bank design is performed by selecting the “New bank” and “Design Bank” controls. As a result of the design procedure, additional figures are generated and plotted as part of the bank design control.
If the design fails, it can be repeated with less strict requirements. After the design is completed, the processing itself is initialized by the “Processing” control. As a result, two new figures are created. The first shows the instantaneous frequencies obtained by frame-by-frame processing of the channel signals for each segment of the input signal (selected in the previous step, described in Section 3.1); these curves can be used to verify the frequency glide. The second shows the values of the inharmonicity coefficient B obtained by the same frame-by-frame processing of the channel signals.
This second diagram can be used to verify the manually set value of the inharmonicity coefficient. Additionally, the parameter M2M is calculated as the ratio between the maximum and mean values in the relevant spectrum range. It is transferred to the next submodule and can serve as a rough estimate of partial prominence: if the ratio is small, the partial might be missing, and the frequency of the maximum is, in fact, a random value within the frequency range of the channel. By selecting the controls located above the diagram, components can be added to or removed from the plot. The x-axis is on a linear scale; although logarithmic frequency scales are common in audio signal processing, a linear scale is more suitable for this type of analysis, with partials in a quasi-harmonic array. The y-axis can be on a linear or logarithmic scale. The components of the plot are as follows:
  • Spectrum of the input signal (segment corresponds to the whole signal) calculated as a whole-length DFT with a rectangular window;
  • Spectrum of the input signal (segment corresponds to the whole signal) calculated as a whole-length DFT with selected window, control “win”;
  • Spectrum of the windowed additional segment, control “seg”;
  • Additional lines corresponding to the expected positions of the phantom partials that arise from non-linear phenomena, controls “d1” and “x2”;
  • Gains of equivalent channel filters, control “Bnk”;
  • Frame-by-frame positions of maximal values of channel spectra, control “oFDs”. This control is activated after the signal has been processed by a designed filter bank, as the values are calculated based on the filtered signal. Darker and smaller markers correspond to the later frames.
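The M2M prominence measure mentioned above can be sketched as follows. This is a Python/NumPy illustration of the max-to-mean ratio, not the MATLAB code of the submodule; the band limits and test signals are hypothetical.

```python
import numpy as np

def m2m(signal, fs, f_lo, f_hi):
    """Max-to-mean ratio of the magnitude spectrum within [f_lo, f_hi] Hz.

    A large ratio suggests a prominent partial in the band; a small ratio
    suggests the partial may be missing (the maximum is then essentially random).
    """
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = spec[(freqs >= f_lo) & (freqs <= f_hi)]
    return band.max() / band.mean()

fs = 48000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)                       # strong partial at 1 kHz
noise = 0.01 * np.random.default_rng(0).standard_normal(fs)
a = m2m(tone + noise, fs, 900, 1100)                      # large: partial present
b = m2m(noise, fs, 900, 1100)                             # small: no prominent partial
```

A threshold on this ratio could separate prominent from missing partials, which matches the intended use of M2M in the next submodule.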
As an example, the plot of the fundamental frequency over time for two segments is shown in Figure 6, along with the corresponding graphics within the GUI window. Both representations show a small frequency change over time; the range of variation is smaller for the second segment.
The outputs of this submodule, transferred to the next submodule through a temporarily saved *.mat file, are the subchannel signals, the parameters of the designed bank, the sampling frequency, the estimated fundamental frequency, the inharmonicity coefficient B, the M2M array, and the path of the recorded signal file.

3.3. Submodule MuPI_A

The third and final submodule, MuPI_A, is designed as a tool for simple manipulation of the subchannel signals (Figure 7). During the initialization of this submodule, the time envelopes of all channels are fitted, and the parameters An1–An5, described in Section 2.3, are extracted. The user can adjust all parameters by inspecting time and frequency representations of the original and modified channel signals, as well as by listening to the original and modified signals. Currently, several options for modification are supported:
  • Amplitude scaling of the signal (including total mute);
  • Replacing the extracted channel signal with
Anfit[k] · cos(2π(fn/fs)k + φn) + s_n · filt_noise[k]    (19)
  • where Anfit[k] is the envelope obtained by fitting the extracted channel envelope; fn is the partial frequency, estimated from the position of the maximum in the DFT of the channel signal; filt_noise[k] are samples of filtered Gaussian noise; and s_n is a scaling factor (it can be zero, i.e., the noise component can be omitted). The filtered noise is the output of the same filter bank used for the signal processing, with a Gaussian random array as the input signal;
  • Replacing the extracted channel signal with the samples of filtered Gaussian noise, as explained in relation to (19).
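The replacement of a channel signal in the form of (19) can be sketched as follows. This is a Python/NumPy illustration, not the application's MATLAB code: the fitted envelope is stood in by a simple exponential decay, and the phase, scaling factor, and noise array are hypothetical inputs.

```python
import numpy as np

def synth_channel(env, f_n, fs, phi=0.0, s_n=0.0, filt_noise=None):
    """Replacement channel per (19): env[k]*cos(2*pi*(f_n/fs)*k + phi) + s_n*filt_noise[k]."""
    k = np.arange(len(env))
    out = env * np.cos(2 * np.pi * f_n / fs * k + phi)
    if s_n != 0.0 and filt_noise is not None:
        out = out + s_n * filt_noise[: len(env)]
    return out

fs = 48000
env = np.exp(-3.0 * np.arange(fs) / fs)   # stand-in for the fitted envelope Anfit[k]
y = synth_channel(env, 440.0, fs)          # noise component omitted (s_n = 0)
y_noisy = synth_channel(env, 440.0, fs, s_n=0.5, filt_noise=np.ones(fs))
```

In the application the noise term would come from the same filter bank driven by Gaussian noise, so that it occupies exactly the channel's frequency band.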
The input signal can be perfectly reconstructed by simply adding all subchannels, with the exception of a small unavoidable error resulting from non-ideal zero-phase filtering, as explained in Section 2.1. The sum can include a modified signal instead of the original channel signal. As a result, a slightly modified overall signal is produced. This basic signal synthesis can be used for a preliminary investigation of the impact of specific signal parameters or components on the perceptual quality of the signal, for example, in verifying stimulus synthesis for listening tests.

4. Experimental Results

In this section, the results obtained by the proposed application in four experiments are explained. All experiments were conducted on a notebook computer with the specifications provided in Table 3. In all experiments, previously recorded signals were used, with a sampling frequency of 48 kHz. All signals were recorded by researchers of the Acoustic Laboratory of the School of Electrical Engineering, University of Belgrade, as part of a joint project with researchers and teachers of the Faculty of Music, University of Arts in Belgrade.

4.1. Signal Processing Chain of the Recording Expected to Exhibit Inharmonicity

The first experiment follows the processing of the recorded harp signal As4. The signal was recorded at the Faculty of Music, University of Arts in Belgrade, in a room typically used for practicing and teaching; the harp was played by a faculty professor. The recording is part of a set dedicated to evaluating the properties of harp tones. The signal and the beginning and end points of the segments are plotted in Figure 8. The first segment corresponds to the entire recording.
After the signal is preprocessed by the submodule MuPI_S, the spectral content is plotted, and f0 and the inharmonicity coefficient B are set using the slider controls, by visually matching the positions of the red help lines with the signal spectrum. The filter bank was designed as a 17-channel bank (the number of prominent partials was set to 15 by the user). The MuPI_B GUI window is shown in Figure 9.
The signal can be further processed, yielding additional results. In Figure 10, the results obtained by the module implementing the algorithm from [38] are presented. The calculation is repeated for 50 frames for both selected segments of the signal and for the whole-length signal.
The estimated values differ slightly from the value evaluated manually by the user. The frame-to-frame variations in the first 50 frames of the signal (blue line) are probably due to the frequency glide, i.e., the spectral content differs slightly between frames. The second segment of the signal starts at 1 s, and at least some of the spectral components have already vanished before this point. It can be assumed that the larger variation of the extracted inharmonicity coefficient values is due to the fact that the spectral maxima are located automatically, regardless of whether a prominent sinusoidal component is present at all.
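One simple way to estimate f0 and B from a set of measured partial frequencies (not necessarily the algorithm of [38], which is a fast dedicated procedure) follows from the model fn = n·f0·√(1 + B·n²): squaring gives (fn/n)² = f0² + f0²·B·n², which is linear in n², so a least-squares line fit yields both parameters. A Python/NumPy sketch with a synthetic check (the parameter values are taken from the harp example only for illustration):

```python
import numpy as np

def estimate_f0_B(freqs):
    """Least-squares fit of (f_n/n)^2 = f0^2 + f0^2*B*n^2 over measured partial frequencies."""
    n = np.arange(1, len(freqs) + 1)
    y = (freqs / n) ** 2
    slope, intercept = np.polyfit(n**2, y, 1)   # y ≈ slope*n^2 + intercept
    f0 = np.sqrt(intercept)
    B = slope / intercept
    return f0, B

# synthetic check with known parameters
n = np.arange(1, 16)
true_f0, true_B = 416.9, 2.2e-4
f_meas = n * true_f0 * np.sqrt(1 + true_B * n**2)
f0_hat, B_hat = estimate_f0_B(f_meas)
```

On real recordings the frame-to-frame scatter discussed above enters through the measured partial frequencies, so missing or phantom partials should be excluded from the fit.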
The next step is to investigate the time envelopes of the signal partials. The GUI of the third submodule, showing the analysis of one of the partials, is presented in Figure 11. The 6th partial is chosen because the spectrum in that range contains two very close components, as indicated by the green curve in the spectrum plot, with the lower one (in frequency) probably arising from a phantom partial. These two components affect the shape of the time envelope. The black line in the time plot indicates the extracted envelope, and the green-black line the fitted, “corrected” envelope. Each partial can be included in the summed signal, excluded, or replaced with a synthesized partial of the form (19). In this example, partials 6, 7, 8, and 9 were replaced with synthesized ones. The spectrum of the “corrected” 6th partial is shown in the spectrum plot in Figure 11 with a green-black line. The resulting signals are saved as audio files and later processed outside the application. The recording Harp_As4_sum_1.wav is imported into MATLAB as array x, and the recording Harp_As4_sum_2.wav as array y. The difference z = x − y is calculated, and its spectrum is plotted in Figure 12. It can be seen that the two signals differ in the frequency range from 2.2 to 4 kHz, which corresponds to the partials replaced by approximated signals.
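The external post-processing step, subtracting the two saved versions and inspecting the spectrum of the difference, is a one-liner in any environment. A Python/NumPy sketch with synthetic stand-ins for the two audio files (loading the actual *.wav files would replace the synthetic arrays):

```python
import numpy as np

def difference_spectrum(x, y, fs):
    """Magnitude spectrum of the difference z = x - y between two versions of a signal."""
    n = min(len(x), len(y))
    z = x[:n] - y[:n]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, np.abs(np.fft.rfft(z))

# synthetic stand-ins for Harp_As4_sum_1.wav / Harp_As4_sum_2.wav
fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 2500 * t)
y = np.sin(2 * np.pi * 440 * t)          # version with the 2.5 kHz component replaced
freqs, mag = difference_spectrum(x, y, fs)
peak = freqs[np.argmax(mag)]             # the difference concentrates where content changed
```

As in Figure 12, the difference spectrum is non-negligible only in the bands where channels were replaced; everything left untouched cancels exactly.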

4.2. Analysis of the Irregularities in the Signal Spectrum

The same signal is used to demonstrate the detection of a phantom partial, i.e., a partial whose frequency is not part of the expected frequency array for the estimated value of B. In Figure 13, the part of the spectrum corresponding to the 6th and 7th partials of the recorded harp As4 signal is presented for the set values f0 = 416.9 Hz and B = 0.00022. The red lines correspond to the expected positions of the partials, and the blue and green lines to the expected positions of the phantom components, whose frequencies are calculated from the spectral content and the positions of the lower-order partials. The spectrum is calculated as the whole-length DFT of the signal with a rectangular window; therefore, further analysis is necessary to determine the time–frequency content of the signal. The STFT cannot provide sufficient frequency resolution. The time envelope of the 6th channel is already plotted in Figure 11, and simultaneous analysis in time and frequency can be used to verify the nature of the additional spectral component. The shape of the envelope of the 6th partial indicates that two spectral components overlap in time, i.e., it can be used to confirm that the additional spectral component is a phantom partial. The frequency of the quasi-sinusoidal part of the envelope is approximately 3.2 Hz and matches half of the difference between the two spectral peaks. The extracted content of the 6th channel is available as Harp_As4_channel_6.wav.
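The relationship between the envelope fluctuation and half the peak separation follows from the beating identity cos(a) + cos(b) = 2·cos((a−b)/2)·cos((a+b)/2): two close components produce a slow modulating factor at half their frequency difference times a carrier at their mean frequency. A numerical check on synthetic data (the component frequencies are hypothetical, chosen 6.4 Hz apart so the slow factor sits at 3.2 Hz, as in the harp example):

```python
import numpy as np

fs = 48000
f1, f2 = 2497.0, 2503.4                 # two close components, 6.4 Hz apart
t = np.arange(2 * fs) / fs
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# product form: a 3.2 Hz slow factor, (f2 - f1)/2, modulating the mean-frequency carrier
slow = 2 * np.cos(np.pi * (f2 - f1) * t)
carrier = np.cos(np.pi * (f1 + f2) * t)
err = np.max(np.abs(x - slow * carrier))
```

This is why a quasi-sinusoidal envelope component in one channel, together with two close peaks in the channel spectrum, is consistent evidence of an overlapping phantom partial.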

4.3. Simple Manipulation of the Recorded Signal

As an example of basic signal manipulation, the exclusion of the first several partials is chosen. The modified signals can be used as test signals in pitch perception listening tests. The signal without the content of the first channel (corresponding to the fundamental partial) is saved as the audio file Piano_A3_sum_min_01.wav. The signal without the content of the first and second channels (the fundamental and second partials) is saved as Piano_A3_sum_min_01_02.wav. Finally, the signal containing only the odd partials is saved as Piano_A3_sum_odd.wav. The files Piano_A3_sum_min_01.wav and Piano_A3_sum_min_01_02.wav are imported into MATLAB, and the spectra of the two modified signals are plotted together with the spectrum of the original signal. In Figure 14, the results for the signals with the first and the first two partials missing are given. It can be seen that the content is identical except for the deliberately removed partials.
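Once the signal is decomposed into subchannels, building such stimuli reduces to summing a subset of the channel signals. A Python/NumPy sketch with toy subchannels standing in for the filter bank outputs (the 220 Hz harmonic tone and 1/n amplitudes are hypothetical):

```python
import numpy as np

def recombine(channels, keep):
    """Sum a subset of subchannel signals (e.g. drop the fundamental, keep odd partials)."""
    return sum(channels[i] for i in keep)

fs = 48000
t = np.arange(fs) / fs
# toy subchannels: partials 1..5 of a 220 Hz harmonic tone with 1/n amplitudes
channels = [np.sin(2 * np.pi * 220 * n * t) / n for n in range(1, 6)]

no_fundamental = recombine(channels, range(1, 5))   # cf. Piano_A3_sum_min_01.wav
odd_only = recombine(channels, [0, 2, 4])           # cf. Piano_A3_sum_odd.wav
full = recombine(channels, range(5))
```

Because the subchannels sum back to the (near-)original signal, removing a channel before summation removes exactly that partial and leaves the rest of the spectrum untouched, which is what Figure 14 shows.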

4.4. Large Number of Channels

This experiment demonstrates the design of a filter bank with a large number of channels. The number of filter bank channels is adjusted to match the recorded harpsichord signal, which features a “rich” spectrum with many prominent partials, as shown in Figure 15. The value of the inharmonicity coefficient B is small (1.8 × 10−5), but because of the large number of partials, the positions of the higher partials deviate significantly from the harmonic sequence, so a bank with a large number of channels is necessary. The target number of channels was 90.
The filter bank design was repeated for the required number of prominent partials from 10 to 90, in steps of 10, with the same prototype low-pass half-band filter. The time required for the bank design and the time required for processing all channels were measured. Each design (for a certain number of partials) was conducted ten times, and the mean value and standard deviation were calculated. It should be noted that the computational cost of a specific design is the same in each trial; however, the experiment was performed on a notebook computer, so variations in time are unavoidable in the standard processing mode. In Figure 16, the mean values and standard deviations of the time needed for the filter bank design are given. The diagram confirms that the filter bank design is efficient and fast; the relatively high standard deviation is expected given the very short design times. The design time increases with the number of channels, as expected.
In Figure 17, the mean values and standard deviations of the time needed for processing the signal through the entire bank (all channels) are given. The processing time depends on the number of channels and on the length of the input signal; the values presented in the diagram are for a signal length of 2 s. The bank is implemented as a tree structure (Figure 1), so the dependence of the processing time on the number of channels is not linear. In the current version of the MuPI_B submodule, filtering is implemented using the MATLAB filter function, where each filter is realized as a parallel connection of two all-pass filters, each implemented as a cascade of second-order sections. The result of this operation is a matrix of size K × (N + 2), where K is the signal length (in samples) and N is the number of prominent partials set by the user.
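The tree decomposition can be illustrated with a single level: a complementary low-pass/high-pass pair around one crossover frequency, applied with zero-phase filtering. The sketch below uses SciPy Butterworth filters and forward-backward filtering as a stand-in for the paper's EMQF all-pass realization; the frequencies and filter order are hypothetical. Because the Butterworth low-/high-pass pair is power-complementary and filtfilt applies the squared magnitude response, the two branches sum back to the input almost exactly.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def split_band(x, fc, fs, order=4):
    """One tree level: complementary low-/high-pass pair at crossover fc, zero-phase filtered."""
    b_lo, a_lo = butter(order, fc, btype="low", fs=fs)
    b_hi, a_hi = butter(order, fc, btype="high", fs=fs)
    return filtfilt(b_lo, a_lo, x), filtfilt(b_hi, a_hi, x)

fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 2000 * t)
low, high = split_band(x, 1000.0, fs)
# the low branch keeps the 440 Hz partial, the high branch the 2 kHz partial;
# low + high approximately reconstructs x away from the signal edges
```

Cascading such splits at the crossover frequencies between partials yields the non-uniform tree of Figure 1; each additional level filters only one branch, which is why the total processing cost grows non-linearly with the number of channels.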

5. Discussion

In Section 4, three different recordings were used to present the different algorithms included in the current version of the application. Table 4 summarizes the major characteristics of the analyzed recordings relevant to the presented experiments.
The signals differ in the number of prominent partials and in the values of the inharmonicity coefficient B. The harp tone is characterized by a low number of prominent partials and by the presence of other sinusoidal components near the expected positions of the partials. Those components can influence the estimation of the frequency array used as input to the PDF-based estimation algorithm for the inharmonicity coefficient B. By simultaneous inspection of the signal in time and frequency, as described in the second example, the problem can be detected and even corrected by replacing the “corrupted” partial with the expected one.
The piano tone is usually the most easily recognized. For that reason, the piano example is used to illustrate the tool for simple recombination of subchannels, which allows the signal to be modified in a controlled manner. The realized application can thus serve as a signal preparation tool for subjective tests.
The harpsichord recording is characterized by a large number of prominent partials, requiring a large number of channels in the filter bank design. It is shown that the proposed approach can provide the required bank.
All presented examples illustrate the scope of the proposed approach based on the decomposition of the signal into channels corresponding to signal partials.

6. Conclusions

In this paper, the analysis of single-tone signals from string musical instruments based on subchannel decomposition is presented. With this approach, the complex signal is divided into an array of less complex signals, and further analysis of each channel is performed simultaneously in the time and frequency domains. The decomposition is performed by a non-uniform multichannel filter bank with additional phase correction, preserving time alignment between channels. Each subchannel is expected to contain only one strong quasi-sinusoidal component; occasionally, however, a subchannel contains an additional strong quasi-sinusoidal component, such as a phantom partial. Due to the relatively short intervals of quasi-stationarity in a musical signal, it is challenging to detect and accurately verify such irregularities. The STFT is a powerful tool for analyzing non-stationary signals, but because of the short window lengths, close peaks in frequency are not separable. In the presented approach, the combination of time and frequency analysis of each channel with the analysis of the overall signal improves the analysis. MuPI enables more accurate analysis of tone signals thanks to a specialized filter bank design that is not present in applications of similar purpose. Unlike other tools, it focuses on parameters such as inharmonicity and on reconstruction, which makes it suitable for the analysis of stringed instruments.
The presented application features a modular structure, enabling the implementation of various additional modules. In future work, an additional processing block can be developed that relies on extrapolation of the signal as a tool for improving the resolution of the STFT [39].
The parameter M2M is calculated as a reasonable merit of partial prominence. However, it is currently only reported as a value and not fully used by the application. In further work, it can serve as a tool for excluding non-prominent or completely “missing” partials from the evaluation of the signal parameters, including the inharmonicity coefficient B. For the proposed filter bank structure, two strong partials are not expected in the same band; future work will therefore also focus on finding a quick and simple tool for detecting channels with more than one frequency peak.

Author Contributions

Conceptualization, D.Š.P.; methodology, M.B. and J.Ć.; software, J.Ć.; validation, T.M. and M.B.; formal analysis, T.M. and J.Ć.; investigation, T.M.; resources, D.Š.P.; data curation, J.Ć. and T.M.; writing—original draft preparation, M.B., J.Ć. and T.M.; writing—review and editing, J.Ć. and D.Š.P.; visualization, J.Ć. and M.B.; supervision, D.Š.P.; project administration, T.M.; funding acquisition, D.Š.P. and J.Ć. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Ministry of Science, Technological Development and Innovation of the Republic of Serbia, under contract numbers 451-03-136/2025-03/200103, 451-03-137/2025-03/200103.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset of measured signals used in this study is openly available at the following SharePoint link: https://etfbgacrs-my.sharepoint.com/:f:/g/personal/bjelic_etf_bg_ac_rs/EoWPGJyUw0ZIqU6ADv7uINMBa1QqSmEQjNW8eyS3I5rUHw?e=xiQ0xa (accessed on 23 July 2025). The software developed for signal processing and analysis is available on GitHub: https://github.com/milosetf/MuPI (accessed on 23 July 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DSP – Digital Signal Processing
GUI – Graphical User Interface
STFT – Short-Time Fourier Transform
DFT – Discrete Fourier Transform
COLA – Constant Overlap-Add
EMQF – Elliptic Minimal Q Factor
PDF – Partial Deviation Frequency

References

  1. Smith, J.O. Spectral Audio Signal Processing. Available online: http://ccrma.stanford.edu/~jos/sasp/ (accessed on 10 June 2025).
  2. Smith, J.O. Introduction to Digital Filters: With Audio Applications. Available online: http://ccrma.stanford.edu/~jos/filters/ (accessed on 10 June 2025).
  3. Polychronopoulos, S.; Bakogiannis, K.; Marini, D.; Kouroupetroglou, G.T. Optimization Method on the Tuning, Sound Quality, and Ergonomics of the Ancient Guitar Using a DSP-FEM Simulation. IEEE Access 2022, 10, 133574–133583. [Google Scholar] [CrossRef]
  4. Rossing, T.D. The Science of String Instruments; Springer: New York, NY, USA, 2010; ISBN 978-1-4419-7109-8. [Google Scholar]
  5. Fletcher, N.H.; Rossing, T.D. The Physics of Musical Instruments, 2nd ed.; Springer: New York, NY, USA, 2010; ISBN 978-1-4419-3120-7. [Google Scholar]
  6. Young, R.W. Inharmonicity of Plain Wire Piano Strings. J. Acoust. Soc. Am. 1952, 24, 267–273. [Google Scholar] [CrossRef]
  7. Fletcher, H.; Blackham, E.D.; Stratton, R. Quality of Piano Tones. J. Acoust. Soc. Am. 1962, 34, 749–761. [Google Scholar] [CrossRef]
  8. Fletcher, H. Normal Vibration Frequencies of a Stiff Piano String. J. Acoust. Soc. Am. 1964, 36, 203–209. [Google Scholar] [CrossRef]
  9. Järveläinen, H.; Välimäki, V. Audibility of Initial Pitch Glides in String Instrument Sounds. In Proceedings of the International Computer Music Conference, Havana, Cuba, 17–23 September 2001. [Google Scholar]
  10. Anderson, B.E.; Strong, W.J. The Effect of Inharmonic Partials on Pitch of Piano Tones. J. Acoust. Soc. Am. 2005, 117, 3268–3272. [Google Scholar] [CrossRef]
  11. Järveläinen, H.; Välimäki, V.; Karjalainen, M. Audibility of the Timbral Effects of Inharmonicity in Stringed Instrument Tones. Acoust. Res. Lett. Online 2001, 2, 79–84. [Google Scholar] [CrossRef]
  12. Galembo, A.; Askenfelt, A.; Cuddy, L.L.; Russo, F.A. Perceptual Relevance of Inharmonicity and Spectral Envelope in the Piano Bass Range. Acta Acust. United Acust. 2004, 90, 528–536. [Google Scholar]
  13. Plomp, R.; Levelt, W.J.M. Tonal Consonance and Critical Bandwidth. J. Acoust. Soc. Am. 1965, 38, 548–560. [Google Scholar] [CrossRef]
  14. Geary, J.M. Consonance and Dissonance of Pairs of Inharmonic Sounds. J. Acoust. Soc. Am. 1980, 67, 1785–1789. [Google Scholar] [CrossRef]
  15. Cohen, E.A. Some Effects of Inharmonic Partials on Interval Perception. Music Percept. 1984, 1, 323–349. [Google Scholar] [CrossRef]
  16. Polychronopoulos, S.; Marini, D.; Bakogiannis, K.; Kouroupetroglou, G.T.; Psaroudakes, S.; Georgaki, A. Physical Modeling of the Ancient Greek Wind Musical Instrument Aulos: A Double-Reed Exciter Linked to an Acoustic Resonator. IEEE Access 2021, 9, 98150–98160. [Google Scholar] [CrossRef]
  17. Giordano, N. Explaining the Railsback Stretch in Terms of the Inharmonicity of Piano Tones and Sensory Dissonance. J. Acoust. Soc. Am. 2015, 138, 2359–2366. [Google Scholar] [CrossRef]
  18. Jaatinen, J.; Pätynen, J. Effect of Inharmonicity on Pitch Perception and Subjective Tuning of Piano Tones. J. Acoust. Soc. Am. 2022, 152, 1146–1157. [Google Scholar] [CrossRef] [PubMed]
  19. Moore, B.C.J.; Peters, R.W.; Glasberg, B.R. Thresholds for the Detection of Inharmonicity in Complex Tones. J. Acoust. Soc. Am. 1985, 77, 1861–1867. [Google Scholar] [CrossRef] [PubMed]
  20. Lee, N.; Smith, J.O., III; Abel, J.; Berners, D. Pitch Glide Analysis and Synthesis from Recorded Tones. In Proceedings of the International Conference on Digital Audio Effects, Como, Italy, 1–4 September 2009; pp. 430–437. [Google Scholar]
  21. Søndergaard, P.L.; Torrésani, B.; Balazs, P. The Linear Time Frequency Analysis Toolbox. Int. J. Wavelets Multiresolution Inf. Process. 2012, 10, 1250032. [Google Scholar] [CrossRef]
  22. Bogdanov, D.; Wack, N.; Gómez, E.; Gulati, S.; Herrera, P.; Mayor, O.; Roma, G.; Salamon, J.; Zapata, J.; Serra, X. ESSENTIA: An Open-Source Library for Sound and Music Analysis. In Proceedings of the 21st ACM international Conference on Multimedia, Barcelona, Spain, 21–25 October 2013; pp. 855–858. [Google Scholar]
  23. Lartillot, O.; Toiviainen, P.; Eerola, T. A Matlab Toolbox for Music Information Retrieval; Springer: New York, NY, USA, 2008; pp. 261–268. [Google Scholar]
  24. Ćertić, J.D.; Milić, L.D. Investigation of Computationally Efficient Complementary IIR Filter Pairs with Tunable Crossover Frequency. AEU Int. J. Electron. Commun. 2011, 65, 419–428. [Google Scholar] [CrossRef]
  25. Necciari, T.; Holighaus, N.; Balazs, P.; Průša, Z.; Majdak, P.; Derrien, O. Audlet Filter Banks: A Versatile Analysis/Synthesis Framework Using Auditory Frequency Scales. Appl. Sci. 2018, 8, 96. [Google Scholar] [CrossRef]
  26. Cassidy, R.J.; Smith, J.O. A Tunable, Nonsubsampled, Non-Uniform Filter Bank for Multi-Band Audition and Level Modification of Audio Signals. In Proceedings of the Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 7–10 November 2004; Volume 2, pp. 2228–2232. [Google Scholar]
  27. Ćertić, J.D.; Šumarac Pavlović, D.; Salom, I. Nonuniform Complementary Filter Bank for Analysis of Audio Signals. In Proceedings of the Forum Acusticum 2011, Aalborg, Denmark, 27 June–1 July 2011; pp. 2565–2570. [Google Scholar]
  28. Lyons, R.G. Understanding Digital Signal Processing, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2011; ISBN 978-0-13-702741-5. [Google Scholar]
  29. Milić, L.D.; Damjanović, S.; Nikolić, M. Frequency Transformations of IIR Filters with Filter Bank Applications. In Proceedings of the APCCAS 2006—2006 IEEE Asia Pacific Conference on Circuits and Systems, Singapore, 4–7 December 2006; pp. 1051–1054. [Google Scholar]
  30. Milić, L.D.; Lutovac, M.D. Efficient Algorithm for the Design of High-Speed Elliptic IIR Filters. AEU Int. J. Electron. Commun. 2003, 57, 255–262. [Google Scholar] [CrossRef]
  31. Milić, L.D.; Lutovac, M.D. Efficient Multirate Filtering. In Multirate Systems: Design and Applications; Jovanovic-Dolecek, G., Ed.; IGI Global: Hershey, PA, USA, 2002; pp. 105–142. ISBN 978-1-930708-30-3. [Google Scholar]
  32. Constantinides, A.G. Spectral Transformations for Digital Filters. Proc. Inst. Electr. Eng. 1970, 117, 1585. [Google Scholar] [CrossRef]
  33. Caetano, M.; Burred, J.J.; Rodet, X. Automatic Segmentation of the Temporal Evolution of Isolated Acoustic Musical Instrument Sounds Using Spectro-Temporal Cues. In Proceedings of the International Conference on Digital Audio Effects (DAFx-10), Graz, Austria, 6–10 September 2010. [Google Scholar]
  34. Caetano, M.; Rodet, X. Improved Estimation of the Amplitude Envelope of Time-Domain Signals Using True Envelope Cepstral Smoothing. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 4244–4247. [Google Scholar]
  35. Jensen, K. Envelope Model of Isolated Musical Sounds. In Proceedings of the 2nd COST G-6 Workshop on Digital Audio Effects (DAFx99), Trondheim, Norway, 9–11 December 1999. [Google Scholar]
  36. MATLAB Fit Function. Available online: https://www.mathworks.com/help/curvefit/fit.html (accessed on 10 June 2025).
  37. Bernstein, A.D.; Cooper, E.D. Piecewise-Linear Technique of Electronic Music Synthesis. AES J. Audio Eng. Soc. 1976, 24, 446–454. [Google Scholar]
  38. Rauhala, J.; Lehtonen, H.-M.; Välimäki, V. Fast Automatic Inharmonicity Estimation Algorithm. J. Acoust. Soc. Am. 2007, 121, EL184–EL189. [Google Scholar] [CrossRef] [PubMed]
  39. Kauppinen, I.; Roth, K. Audio Signal Extrapolation—Theory and Applications. In Proceedings of the International Conference on Digital Audio Effects, DAFx, Hamburg, Germany, 26–28 September 2002; pp. 105–110. [Google Scholar]
Figure 1. Structure of the multichannel doubly complementary non-decimated filter bank (example with four channels). Each level consists of a low-pass HLPq and a high-pass HHPq filter pair with the crossover frequency ωcq. The overall array of signals xn[k], n = 0,…,N + 1, is obtained by the doubly complementary filter bank, whereas the signals x0n[k], n = 0,…,N + 1, are additionally phase corrected.
Figure 2. MuPI Inharmonicity estimation test for synthetic piano. The red dots represent the values obtained by the MuPI application. The solid line represents the correct inharmonicity coefficient values given for synthesized piano tones for keys 1−35 in paper [38].
Figure 3. Organizational structure of the proposed MuPI application by modules. Each module is responsible for certain processing: MuPI_S for the initial signal preprocessing and visualization; MuPI_B for estimating signal parameters and processing the signal through the multichannel bank; and MuPI_A for subchannel manipulation and signal synthesis.
Figure 4. Submodule MuPI_S window, with a typical first step (preprocessing) scenario presented. The signal in time is plotted in the upper diagram, with time stamps of the beginning and ending points of segments given in the legend. The dashed lines indicate starting and ending samples of signal segments, blue for the whole length signal and red for the second segment. The result of the STFT analysis of the first 50 windows of the selected signal part is presented in the lower diagram; the darker line corresponds to the later window.
Figure 5. Submodule MuPI_B window, with an initial view presented. The signal analyzed is a harp signal with a certain degree of inharmonicity. The dashed red lines correspond to the expected positions of the partials for the inharmonicity coefficient B initially set to 0. However, the real frequency positions of the partials differ from the ideal harmonic array, i.e., the red lines do not correspond to local maxima of the plotted spectrum.
Figure 6. The presentation of the frequency glide involves obtaining values of instantaneous frequencies through frame-by-frame analysis: (a) partial frequency in time for two different segments (blue is the whole signal and red is the segment selected in MuPI_S), while (b) markers correspond to (frequency of the maximum, maximum value) pairs. The star (*) corresponds to the whole-length signal. The red dashed help line corresponds to the expected partial frequency. Circles (o) correspond to maxima obtained by frame-by-frame analysis (one marker per frame). Darker and smaller markers correspond to later windows.
Figure 6. The presentation of the frequency glide involves obtaining values of instantaneous frequencies through frame by frame analysis: (a) partial frequency in time for two different segments (blue is whole signal and red is segment selected in MuPI_S), while (b) markers correspond to (frequency of the maximum, maximum value) pairs. Star (*) corresponds to whole length signal. Red dashed help-line corresponds to expected partial frequency. Circles (o) corresponds to maxima obtained based on frame by frame analysis (one marker per frame). Darker and smaller markers correspond to later windows.
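The per-frame maxima in panel (b) can be obtained by windowing the signal, taking the DFT of each frame, and picking the strongest bin inside a band around the expected partial. A minimal NumPy sketch of this idea (the frame length, hop size, and band limits below are illustrative choices, not the application's actual settings):

```python
import numpy as np

def frame_peak_frequencies(x, fs, frame_len, hop, f_lo, f_hi):
    """For each windowed frame, return the (frequency, magnitude) pair of
    the largest FFT bin between f_lo and f_hi (one pair per frame)."""
    win = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    peaks = []
    for start in range(0, len(x) - frame_len + 1, hop):
        spec = np.abs(np.fft.rfft(x[start:start + frame_len] * win))
        k = np.argmax(spec[band])
        peaks.append((freqs[band][k], spec[band][k]))
    return peaks

# Synthetic check: a steady 440 Hz tone should yield peaks near 440 Hz
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)
peaks = frame_peak_frequencies(x, fs, 1024, 512, 300, 600)
```

Plotting one marker per frame, as in the figure, then reveals any frequency glide of the partial over the duration of the tone.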
Figure 7. Submodule MuPI_A window, with an initial view presented. The first channel (displayed in the figure) corresponds to the first partial. The time and frequency content of the signal is plotted, along with the estimated and fitted time envelopes, and all relevant extracted parameters are displayed for the current channel. The graph on the left shows the signal in time (green), the extracted envelope of the signal (solid black line), and the repaired envelope (dashed black line). Color coding indicates the position of the currently selected channel in the entire spectrum (lower channels are marked in green, higher in red). In the graph on the bottom right, the spectrum of the extracted channel is shown in green, and the spectrum of the channel signal with the corrected envelope is shown in dashed black.
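A common way to obtain the kind of per-channel amplitude envelope plotted as the solid black line is the analytic-signal (Hilbert) envelope. Whether MuPI_A uses exactly this method is not stated in this excerpt, so the following NumPy sketch is purely illustrative:

```python
import numpy as np

def analytic_envelope(x):
    """Amplitude envelope via the analytic signal (FFT-based Hilbert
    transform): zero negative frequencies, double the positive ones,
    then take the magnitude of the inverse transform."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(X * h))

# Synthetic check: a decaying partial whose true envelope is exp(-3 t)
fs = 8000
t = np.arange(fs) / fs
env_true = np.exp(-3 * t)
x = env_true * np.sin(2 * np.pi * 440 * t)
env = analytic_envelope(x)
```

The extracted envelope can then be fitted by a parametric model (cf. the fit parameters in Table 2) to obtain the "repaired" envelope shown as the dashed black line.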
Figure 8. The time-domain plot of the recorded signal used for further analysis. The dashed lines indicate the starting and ending samples of the signal segments, blue for the whole-length signal and red for the second segment.
Figure 9. The MuPI_B GUI window after the filter bank has been designed. The bank is fitted to the signal content, i.e., each subchannel corresponds to a single signal partial. The red help-lines correspond to the partials of the signal for the set values of the fundamental frequency f0 and inharmonicity coefficient B. The dashed black lines show the gains of the filters in the designed bank.
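Conceptually, the bank partitions the spectrum into adjacent bands, one per partial, such that summing all subchannels reconstructs the input. The application designs actual digital filters; the toy FFT-mask version below (with made-up band edges) only demonstrates the partition-and-reconstruct idea:

```python
import numpy as np

def fft_mask_bank(x, fs, edges):
    """Toy analysis filter bank via complementary FFT masks: the band
    edges partition the spectrum, so every bin belongs to exactly one
    subchannel and summing all subchannels reconstructs the input."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    chans = []
    lo = 0.0
    for hi in list(edges) + [fs / 2 + 1]:  # last band runs to Nyquist
        mask = (freqs >= lo) & (freqs < hi)
        chans.append(np.fft.irfft(X * mask, n=len(x)))
        lo = hi
    return chans

# Two "partials" split into two subchannels by a single edge at 600 Hz
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 900 * t)
chans = fft_mask_bank(x, fs, [600])
```

In the application, the band edges would be placed between consecutive partial frequencies of the inharmonic grid rather than chosen by hand.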
Figure 10. Values of the inharmonicity coefficient B calculated by the automatic algorithm [38] for 50 frames for two different segments (blue is the whole signal; red is the segment selected in MuPI_S). Solid lines correspond to the mean values.
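The paper estimates B with the automatic PFD-based algorithm [38]; as a simplified illustration of the underlying model, B can also be recovered from measured partial frequencies by a direct least-squares fit, since writing y = (f_n/n)² = f0² + (f0²·B)·n² makes the stiff-string relation linear in n²:

```python
import numpy as np

def estimate_inharmonicity(f_meas, n):
    """Least-squares fit of f_n^2 = n^2 * f0^2 * (1 + B * n^2).
    A linear fit of y = (f_n / n)^2 against n^2 yields f0 and B.
    (Illustrative simplification, not the PFD algorithm of [38].)"""
    y = (f_meas / n) ** 2
    slope, intercept = np.polyfit(n ** 2, y, 1)
    return np.sqrt(intercept), slope / intercept  # (f0, B)

# Synthetic check with the piano-tone values reported in Table 4
n = np.arange(1, 26)
f0_true, B_true = 220.6, 2.65e-4
f = n * f0_true * np.sqrt(1 + B_true * n ** 2)
f0_est, B_est = estimate_inharmonicity(f, n)
```

Applying such an estimator frame by frame, as in the figure, shows how stable the estimate of B is across the duration of the tone.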
Figure 11. The MuPI_A GUI for one selected partial, demonstrating the replacement of the extracted subchannel with a synthesized one. The graph on the left shows the signal in time (green), the extracted envelope of the signal (solid black line), and the repaired envelope (dashed black line). Color coding indicates the position of the currently selected channel in the entire spectrum (lower channels are marked in green, higher in red). In the graph on the bottom right, the spectrum of the extracted channel is shown in green, and the spectrum of the channel signal with the corrected envelope is shown in dashed black.
Figure 12. The spectrum of the difference between two signals x and y. The signal x is obtained by summing all channels of the decomposed input signal. The signal y is obtained by replacing channels 6, 7, 8, and 9 with the synthesized signals. Those channels correspond to the frequency band from 2.2 to 4 kHz.
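The recombination step behind this comparison is a simple sum of subchannel signals, with selected channels optionally swapped for synthesized ones before summing. A minimal sketch (the three sinusoidal "channels" below are a toy decomposition, not the application's filter-bank output):

```python
import numpy as np

def recombine(channels, replacements=None):
    """Recombine subchannel signals into a wideband signal; optionally
    substitute selected channels with synthesized ones before summing.
    `replacements` maps channel index -> replacement signal."""
    replacements = replacements or {}
    return sum(replacements.get(i, ch) for i, ch in enumerate(channels))

fs = 8000
t = np.arange(fs) / fs
channels = [np.sin(2 * np.pi * f * t) for f in (200, 400, 600)]
x = recombine(channels)                          # original sum
y = recombine(channels, {1: 0.5 * channels[1]})  # channel 1 replaced
diff = x - y  # nonzero only in the replaced channel's band
```

As in the figure, the spectrum of the difference x − y then exposes exactly the bands whose channels were replaced.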
Figure 13. The spectrum X of the selected partial, showing a case in which a prominent phantom component is present close to the actual partial of the signal. Signal partials are marked with red dashed lines; cyan and green dashed lines indicate phantom partials originating from nonlinearities.
Figure 14. The spectrum X0 of the recorded signal, the spectrum X1 of the signal with the first (fundamental) partial removed, and the spectrum X12 of the signal with the first two partials removed.
Figure 15. The spectrum of the recorded harpsichord signal. An example of the spectrum of a real recorded signal with a large number of partials (more than 70).
Figure 16. The execution time of the filter bank design (dependency on the number of bank channels). Mean value and standard deviation are shown. Note that the timescale is in milliseconds.
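Execution-time measurements of this kind are obtained by timing repeated runs and reporting the mean and standard deviation. A generic timing helper (illustrative only; the paper's measurements were made in MATLAB on the machine described in Table 3, and the FFT call below is just a stand-in workload):

```python
import time
import numpy as np

def time_fn(fn, repeats=20):
    """Run `fn` several times and return (mean, std) of the execution
    time in milliseconds."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    return float(np.mean(samples)), float(np.std(samples))

mean_ms, std_ms = time_fn(lambda: np.fft.rfft(np.zeros(65536)))
```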
Figure 17. The execution time of the filter bank implementation (dependency on the number of bank channels). Mean value and standard deviation are shown. The input signal is the selected segment of the recorded signal, and the output is an array of signals, one for each channel.
Table 1. Comparison of the proposed application with existing ones according to multiple criteria. The symbol ✔ means that the criterion is met. The symbol ✘ means that it is not.
| Criteria | LTFAT | Essentia | MIRtoolbox | MuPI (Proposed) |
|---|---|---|---|---|
| Language | MATLAB, Octave | C++, Python | MATLAB | MATLAB |
| Time–frequency methods | ✔ Gabor, Wavelet, STFT, filter bank | ✘ Limited | ✘ STFT | ✔ Windowing, DFT, STFT, filter bank |
| Low-level feature extraction | ✔ Manually through frequency analysis | ✔ MFCC, spectral centroid, energy, pitch | ✔ Zero-crossing, RMS, MFCC, spectral flux | ✔ Manually through frequency analysis, enhanced by automatic extraction of auxiliary parameters |
| High-level feature extraction | ✘ | ✔ Beat tracking, tempo, key, chords | ✔ Tempo, beat, pitch, chords | ✔ Envelope partial extraction, inharmonicity coefficient |
| Signal manipulation | ✔ Detailed processing (filter bank, synthesis) | ✘ Minimal | ✘ | ✔ Detailed processing (filter bank, synthesis) |
| Visualization | ✔ Arbitrary | ✘ | ✔ Arbitrary | ✔ Arbitrary |
| Speed and performance | ✔ Fast | ✔ Very fast | ✘ Slow for large data sets | ✔ Fast |
| Flexibility and extensibility | ✔ Modular | ✔ Modular | ✘ Hard to expand | ✔ Modular |
| Real-time processing | ✘ | ✔ Partial (C++) | ✘ | ✘ |
| Inharmonicity coefficient | ✘ | ✘ | ✘ | ✔ Yes, PFD algorithm |
| Filter bank | ✔ Yes (in detail), ideal reconstruction | ✘ Partial (analysis), no reconstruction | ✘ Partial (analysis), no reconstruction | ✔ Yes (in detail), ideal reconstruction |
| Number of channels in filter bank | ✔ Arbitrary | ✘ Cannot be assigned | ✘ Cannot be assigned | ✔ Arbitrary |
Table 2. The input values of the parameters in (14) for the MATLAB function fit.

| Parameter | Starting Value | Min. Value | Max. Value |
|---|---|---|---|
| An1 | 0.5 | 1 × 10−6 | 10 |
| An2 | 2 | 1.001 | Inf |
| An3 | e | 0.1 | Inf |
| An4 | 0 | −0.5 | 0.5 |
| An5 | 2 | 1 × 10−6 | Inf |
Table 3. The specifications of the computer used for the presented experiments.

| Parameter | Value |
|---|---|
| CPU | 13th Gen Intel(R) Core(TM) i7-13650HX, 2.60 GHz |
| RAM | 16.0 GB |
| Operating system | 64-bit Windows 11 Home |
| Graphics | NVIDIA GeForce RTX 4050, 6 GB |
Table 4. Signal parameters used in the experiments.

| Signal | Tone | Number of Prominent Partials | Fundamental Frequency [Hz] | Inharmonicity Coefficient B |
|---|---|---|---|---|
| Harp_As4_recored | As4 | 15 | 416.9 | 2.2 × 10−4 |
| Piano_A3_recorded | A3 | 25 | 220.6 | 2.65 × 10−4 |
| Harpsichord_F3_recored | F3 | >70 | 174.4 | 1.8 × 10−5 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Miljković, T.; Ćertić, J.; Bjelić, M.; Pavlović, D.Š. Digital Signal Processing of the Inharmonic Complex Tone. Appl. Sci. 2025, 15, 8293. https://doi.org/10.3390/app15158293
