A Signal Complexity-Based Approach for AM–FM Signal Modes Counting

: Frequency modulated signals appear in many applied disciplines, including geology, communication, biology and acoustics. They are naturally multicomponent, i.e., they consist of multiple waveforms, with speciﬁc time-dependent frequency (instantaneous frequency). In most practical applications, the number of modes—which is unknown—is needed for correctly analyzing a signal; for instance for separating each individual component and for estimating its instantaneous frequency. Detecting the number of components is a challenging problem, especially in the case of interfering modes. The Rényi Entropy-based approach has proven to be suitable for signal modes counting, but it is limited to well separated components. This paper addresses this issue by introducing a new notion of signal complexity. Speciﬁcally, the spectrogram of a multicomponent signal is seen as a non-stationary process where interference alternates with non-interference. Complexity concerning the transition between consecutive spectrogram sections is evaluated by means of a modiﬁed Run Length Encoding. Based on a spectrogram time-frequency evolution law, complexity variations are studied for accurately estimating the number of components. The presented method is suitable for multicomponent signals with non-separable modes, as well as time-varying amplitudes, showing robustness to noise.

Signal decomposition as well as IFs detection are among the most important goals in signal processing. As a matter of fact, many methods for signal modes recovery have been proposed. They are mainly based on time-frequency (TF) analysis [9,10] and its combination with Radon Transform (RT) or inverse RT [11][12][13], de-chirping procedures [14][15][16] and image processing techniques [17][18][19]. Since modes' number is unknown, the decomposition task is sometimes performed iteratively, according to some energy balance-based stopping criteria [20]. However, most methods require prior knowledge of the number of signal components [11,[21][22][23][24][25].
It is worth noticing that interference between signal modes, as well as its destructive or additive effects, plays a very critical role in any estimate concerning FM multicomponent signals (MCS), including modes counting. An example is shown in Figure 1, concerning the spectrogram of two MCS, respectively affected by additive (a) and destructive interference (b). As it can be observed, when signal modes overlap in the TF plane, only one component is perceived in (a) while very low amplitude is observed in (b). Therefore, independently of the interference effect, spectrogram does not provide a meaningful signal representation over the whole TF plane. As a result, most of decomposition tools and IF estimators are unreliable in such cases, except for some promising but still limited methods [11,12,25,26], as well as procedures designed for a specific signal class [10,13,27]. It is worth observing that sparsification approaches, such as Synchrosqueezing Transform and Reassignment method, are also limited to separable modes [9,28]. In particular, the resulting accuracy is worse in case of destructive interference, because of a greater loss of resolution in TF distributions (TFDs). A contribution to overcome this drawback has been presented in [29,30].
For the reasons discussed above, methods dealing with MCS need effective strategies for modes number estimation, both in additive and destructive interference.
By combining TF analysis and Hough Transform (or RT), Hilbert Huang approach allows for linearly FM-MCS detection, as well as modes number estimation [31][32][33]. Given a TFD, such as Wigner Ville distribution (WVD) or smoothed WVD, RT computes its lines integrals over the whole TF domain. A linear chirp, i.e., a signal with linear IF, is thus mapped to a distribution which is well concentrated at a single point. The latter corresponds to IF line parameters. As a result, a linearly FM-MCS can be easily decomposed by a peaks detection procedure, even if signal modes overlap or cross in the TF plane. Unfortunately, independently of the adopted TF representation, a non-linearly FM component has a more spread distribution in the Radon domain, rather than a single peak, as underlined in [11]. It turns out that this approach remains limited to linearly FM signals. A method for estimating the number of modes of an FM signal has been proposed in [34,35]. This paper introduced a local version of Rényi entropy, namely Short-term Rényi entropy, as a measure of signal complexity, i.e., the number of components. As main advantage, the presented procedure does not require the prior knowledge of one signal mode complexity and it applies to MCS whose modes are differently FM, e.g., a polynomial and a sinusoidal IFs. This approach is very interesting but it suffers from some drawbacks. Specifically, it needs a synthetic reference (piece of) signal to estimate the Rényi entropy and then the components number. As major limitation, constant amplitude as well as non-overlapping components are also required to give a correct final result. In particular, the experimental results presented in [34,35] have confirmed method limitation in overlapped modes analysis. Precisely, in the cases shown in Figure 1, Rényi entropy-based method is expected to detect only one component at the interference regions. Some limitations in Rényi entropy-based approach have also been highlighted in [36]. This paper is aimed at overcoming these drawbacks and proposing an effective counting procedure suitable for MCS with overlapping components. To this aim, the potential of Kolmogorov complexity, adapted to this specific case, is exploited to introduce a proper spectrogram coding. As it will be clearer in the next section, Kolmogorov complexity can be computed just theoretically. It is possible to find practical approximations of it but they are just sub-optimal solutions. Looking at Figure 1, it stands to reason that it can be encoded as a binary map-keeping in mind that we are just interested in the number of components. It turns out that the Run Length Encoding (RLE) may be the fastest and simplest way to represent it.
In order to convert the spectrogram into a binary map, we have simply to threshold it, as shown in Figure 2. The optimal time-varying threshold (i.e., estimated columnwise) can be achieved by a simple Minimum Description Length (MDL) functional, as it provides the best explanation of an object under study (i.e., spectrogram) among a set of candidate models (i.e., thresholded spectrograms). Once the binary map is achieved, a modified RLE is applied to encode complexity concerning a transition between adjacent columns. Asymptotic Equipartition Property (AEP) allows us to characterize complexity variations due to interference, as well as new mode contribution. Complexity variations in subsequent columns are then analyzed, based on a spectrogram TF evolution law, to estimate the number of components-further details are contained in Section 2.
As it will be shown in Section 3, the proposed approach is able to manage time-varying amplitude signals, any interference kind, as well as localized and hyperbolic chirps. In addition, the proposed approach shows a certain robustness to noise. Method's advantages with respect to existing procedures as well as limitations are also discussed in the same section. Finally, Section 4 draws the conclusions.

Materials and Methods
This section presents the theoretical tools needed for addressing the problem of modes number estimation and for introducing the proposed approach. In particular, Section 2.1 is devoted to MCS representation in the TF plane by means of spectrogram, while Section 2.2 introduces the fundamental concepts from Information Theory used in this paper. The proposed method is finally presented in Section 2.3.

AM-FM Signals and Spectrogram Representation
A MCS modulated both in time and frequency (AM-FM) is defined as [8] where f k ∈ L 2 (R) is the k-th mode, a k and φ k are smooth time-varying functions, respectively denoting f k amplitude and phase and N is the number of components. Phases time-derivatives, i.e., φ k (t), k = 1, ..., N are defined as signal IFs. The Short-Time Fourier Transform (STFT) of f , with respect to a real and even analysis window Spectrogram is then defined as STFT squared modulus, i.e., P(u, ξ) = |S g f (u, ξ)| 2 , and it measures the energy of f in the TF neighborhood of (u, ξ).
In case of a single component, i.e., f (t) = a(t) cos φ(t), STFT can be expressed as [37] where s > 0 is the window support length, * stands for the Fourier Transform of * and (u, ξ) denotes a corrective term, which is negligible if a(t) and φ (t) have small relative variations over the support of the window g. Under this assumption, from Equation (3) it follows Equation (4) tells us that, for each fixed u, spectrogram profile is a shifted copy of the analysis windowĝ, modulated by a(u), as depicted in Figure 3. Indeed, the following evolution law can be derived.
Proof. By taking into account Equations (4) and (5) simply follows by direct computation. Indeed, (5) is an advection equation with source term. It is worth underlying that the latter also implies spectrogram's support shifting, for u varying.

Remark 2.
Limited to the support of P 1 , spectrogram satisfies the evolution law in Equation (5). As a result, for each fixed u, we observe two window profiles which move independently in the TF plane, according to the corresponding IF, but with invariant support lengths.
On the contrary, if the separability condition as in Equation (7) is not satisfied at a fixed u, i.e., |φ 1 (u) − φ 2 (u)| < ∆ω, then the single profiles P i , i = 1, 2 have overlapped supports, i.e., they interfere with each other in the TF plane, as shown in Figure 4b.
The interference is said to be additive if spectrogram energy increases at the non-separability region, with respect to the overall energy of the two isolated contributions, i.e., P 1 and P 2 . Conversely, it is referred as destructive if spectrogram presents a lower energy balance. It is worth noticing that the cosine sign in Equation (6) plays a crucial role in this taxonomy.

Remark 3.
Regardless of the interference effects, when modes switch from a separability region to an interference one, a support length variation is observed.

Signal Complexity
This section provides a very short review of some fundamental concepts in Information Theory that are useful to understand the rest of the paper. As presented in Sections 2.2.1 and 2.2.2, Kolmogorov generalization of Shannon entropy focuses on the most effective algorithm to encode the data under exam. In principle, it does not take into account for irreversible information coding. In other words, the underlying hypothesis is that all data produced by the source are representative. As a matter of fact, this is not our case. Rather than dealing with this problem from a general point of view, we focus on the two particular cases necessary to develop the proposed model.
Specifically, we will first need to reduce the available information i.e., spectrogram, to the data really necessary to estimate the number of components. In this case, a powerful and well known approach is the Minimum Description Length, which is topic of Section 2.2.3. The binary map obtained by a thresholding procedure can be easily represented by the Run Length Encoding introduced in Section 2.2.4.
The second aspect to be deepened will be to discard information that is produced by the source but has entropy far from the "true" one. This is possible by exploiting the Asymptotic Equipartition Property, which is presented in Section 2.2.5.

Shannon Entropy
Shannon entropy measures the amount of information concerning an emitting source. More precisely, given a discrete random variable X, with probability density function p(x), its entropy is defined as where log(·) stands for the logarithm to base 2 of (·) and X denotes the set of all possible values taken by X. Thanks to Shannon entropy, the concept of information has been formalized as average uncertainty of a random variable. As an example, the entropy of a deterministic variable is trivially zero, as the outcome is predictable. On the contrary, a uniformly distributed random variable realizes maximum entropy, among all the possible distributions defined on the same probability space. It is worth pointing out that entropy only depends on the probability distribution and not on the specific source values, resulting in a general and flexible tool. Shannon entropy also corresponds to the minimal amount of storage required to encode and transmit any information, thus providing a lower bound in data compression.

Kolmogorov Complexity
Kolmogorov formalized the concept of complexity of an object as the length of the shortest binary computer program that describes the object. Coarsely speaking, the simplest explanation is also the best one, according to Occam's razor principle. More formally, let us denote a generic computer program by p, let L(p) be its length and U (p) its output with respect to the universal computer U . Kolmogorov complexity of a string x is then defined as [38] It is possible to prove that, in simple (low complexity) distributions, the expected Kolmogorov complexity is close to the Shannon entropy. Unfortunately, Kolmogorov complexity is not practically computable and only some approximations are available. However, it represents a fundamental concept in Information Theory.

Minimum Description Length
Minimum Description length (MDL) is a well known and powerful tool to estimate the best hypothesis (among a class i.e., a model, of candidated ones) as well as its parameters [38,39]. It is in agreement with the Occam razor, where this principle is implemented by equating learning with coding. In other words, given a (finite size) data sample, the simplest model that well fits them is also the best one. Formally, the simplest formal way to implement MDL is represented by the crude MDL. Note that even though refined MDL is more performing via a more elegant formalism, it is out of the purposes of this paper. Coming back to crude MDL, it selects a model from a set of candidates M (1) , M (2) , . . . by minimizing the following cost where L(M (k) ) is the cost (in terms of bits) required for coding the model M (k) , while L(X|M (k) ) is the number of bits required for coding the data X given the model. In general, the better the model the higher its cost but the smaller the approximation error. That is why the selection of the best model is a trade off between complexity and good approximation.

Run Length Encoding
Run Length Encoding (RLE) is a simple lossless data compression algorithm, particularly suitable for the acquisition of binary images [40]. An input sequence S is converted into a shorter sequence S , whose entries are the numbers of consecutive pixels of the same colour (black or white) observed in S. For the sake of convenience, in this work we will adopt a couple-based coding. An example of RLE application is shown in Figure 5.
Also two-dimensional data can be processed through a columnwise scanning, i.e., by 2D RLE. When data contain enough uniform regions, high compression is achieved, while in case of more structured patterns, RLE gives little compression or no compression.
It is worth underlying that, besides being a compression tool, RLE is an entropy encoding, i.e., it can be used as a measure of the amount of similarity between data under analysis and a reference data class. This is the strategy adopted in the presented paper.

Asymptotic Equipartition Property
Asymptotic Equipartition Property (AEP) is a direct consequence of the Weak Law of Large Numbers and it is formalized in the following theorem [38]. Theorem 1. Let X, X 1 , X 2 , ..., X n be a sequence of independent and identically distributed random variables From Equation (12), it follows p(X 1 , X 2 , ..., X n ) ≈ 2 −n H(X) , i.e., the joint probability distribution is close to 2 −n H(X) , with high probability. Specifically, let us define the typical set A (n) as the one composed of the observed sequences (x 1 , x 2 , ..., x n ) such that or equivalently the following properties can be derived from Theorem 1.
(i) A (n) has probability nearly 1; (ii) elements in A (n) are nearly equiprobable; (iii) A (n) cardinality is nearly 2 n H(X) .
In other words, the relevant information is only represented by the typical sequences, whose sample entropy is close to the "true" one H(X).

The Proposed Method
In our assumptions, the separability region corresponds to most of the spectrogram domain. According to Section 2.2.5, this is equivalent to say that most of spectrogram columns belong to the typical set (i.e., modes are separated at time u), while few columns do not (i.e., modes do interfere at time u). A sequence of typical columns can be interpreted as a stationary process (entropy does not significantly change). Conversely, switching from a typical to a non-typical column can be seen as non-stationarity (entropy significantly changes). Therefore, spectrogram can be globally interpreted as a non-stationary process where separability regions may be alternated to interference ones. This argument allows us to correctly estimate the number of modes, even in case of interference.
The proposed method is composed of three main steps: Step 1. spectrogram thresholding and binary map derivation; Step 2. binary map encoding by RLE; Step 3. selection of typical columns for modes counting.
It is worth highlighting that Step 3 can be skipped if signal modes are well separated over the whole TF domain. However, the third phase is the most delicate one. It is oriented to differently manage spectrogram columns, that can be seen as sequences produced by the source, discarding those having entropies far from the "true" one. Further details will be provided in the sequel.

Thresholding by MDL
Let us assume that spectrogram is defined on the TF plane discretization (u i , ξ j ), with i = 1, ..., n, and j = 1, ..., m. Given a threshold value T > 0, spectrogram can be converted to a binary map as follows: In practical applications, it is required to select the optimal threshold T in an automatic way. To our purposes, T is required to be a function of the time variable u. Moreover, non-constant amplitude chirps will be considered. It turns out that, for each fixed u i , a threshold T(u i ) will be adopted for each spectrogram column P(u i ).
As outlined in the previous section, a crude MDL approach can be used to determine the threshold, as better described below.
Starting from the i-th column of the spectrogram P(u i ), one may select K different thresholds T k = T k (u i ), k = 1, ..., K, i.e., K different models. The latter can be compared via the functional in Equation (11) to select the best one. More precisely, by denoting with i , the discarded energy during the thresholding process; and, similarly, by denoting the total number of column points with P TOT i , the number of selected points with P Sel i (T k ) and the number of the discarded points P Disc i (T k ), the terms of the MDL functional in Equation (11) can be defined. The first term L(M (k) ) takes into account the fraction of the selected points of the i-th column. Hence, by omitting the dependence on T k , we have With regard to the second term, it can be classically set equal to By replacing Equations (16) and (17) in Equation (11), the optimal threshold is given by arg min In practice, this term accounts for the entropy of the error in approximating the original column by the thresholded one. Such a residual error entropy can be roughly seen as P Sel i Q B , where Q B is the number of bits allocated for each residual coefficient-assuming a uniform quantization.
For a fixed u i , the threshold value T¯k(u i ) defined by Equation (18) is selected as the best spectrogram column model. Spectrogram column is then converted to a binary sequence according to Equation (15), with T = T¯k. By repeating the procedure for each i = 1, ..., n, a binary map is finally obtained.
A few observations are required. Both avoid the commonly used parameter λ whose aim is to balance the entropy of the two terms in the MDL functional. Considering the total entropy (rather than bits/sample), we can multiply both the aforementioned quantities by 100. This is useful to consider positive quantities when managing the logarithm. This trick simplifies the second term of the functional that can be then described by log(·), that is the entropy describing an integer known the range, rather than 2 log(·) + 1, that represents the entropy describing any integer.

Encoding by RLE and Modes Counting
As already seen in Section 2.2.4, RLE is one of the most powerful methods to code a binary map. Coarsely speaking, 1D RLE encodes a binary sequence through couples of values where the first one is the color (i.e., black or white) and the second one is the number of pixels having that color. Such a technique is also an entropy encoding.

Spectrogram encoding and complexity
Without loss of generality, let us consider a two-component signal whose modes are well separated at a fixed u i . Then, the i-th column of the binary map B derived from the spectrogram, denoted as B(u i ), will be coded as follows: where l p is the length of the p-th set of consecutive (black or white) pixels. If modes are still separated at u i+1 , then from Equation (5) it follows that B(u i+1 ) will be coded as: where l 2 and l 4 have not been changed (this is not a crucial aspect in the presented method). It turns out that, if modes are separable, the number of modes can be easily computed by counting how many couples appear in the code. More precisely, by denoting with N couples the number of column couples, the modes cardinality is thus given by: A similar argument can used in the case of MCS with N components. Unlikely, the simple computation in Equation (21) does not work when there are interfering chirps. The 1D RLE has been considered so far. If one would consider the 2D RLE from left to right (from time u = 0 to u = +∞) the interference region would be a region with a greater complexity, whenever the latter is defined as the amount of bits necessary to code a transition between two subsequent binary columns. It is worth pointing out that, in this paper, 2D RLE stands for 1D RLE sequentially applied to columns.
A more critical situation goes on when interference generates some black holes in the binary map, as in the case depicted in Figure 2b. This additional budget of bits, required to code the complexity, is not justified by the real underlying information still composed of just two components. The presence of black holes is particularly noticeable in case of very spread distributions, as in the example shown in Figure 6. It is worth underlying that the window size should be carefully chosen in order to avoid too spread spectrograms and to minimize the non-separability region. However, it is convenient to remove the eventual black holes, as done in Figure 7.
Coming back to 2D RLE, the complexity is characterized by the difference between the number of couples in adjacent RLE representations, i.e., In this way, when the code in Equation (19) switches to the one in Equation (20), τ = 0 and then complexity is zero.
On the contrary, if two modes interfere with each other at u i+2 , then B(u i+2 ) code is More precisely, it holds τ < 0, as some couples disappear due to modes overlapping-see Remark 3 and Figure 4b. The same situation occurs in the general case of N components. As a result, even if some couples disappear, complexity is greater than the separable case, as the amount of bits needed to code the transition is accounted for.  Figure 6b with removed black holes.

Detecting interfering columns
By assuming that the separability region corresponds to most of the spectrogram domain, the interference region can be seen as a sort of outlier in a stationary situation. It implies a worse compression (even though there's a missing couple in the RLE code). It is due to the fact that RLE exploits the similarity between adjacent columns: columns in the non-separability regions have a different code when compared to a column in the separability one. Hence, in case of interfering modes, a larger number of bits is required for recording the dissimilarities between adjacent columns, which means greater entropy and increased global complexity.
Based on this considerations, the "true" entropy of the source is the one corresponding to the separability region. As a result, interference columns can be recognized as the ones that do not belong to the typical set, as defined in Section 2.2.5. Only columns whose entropy H c satisfies are retained, whereH is the "true" (or reference) entropy and is a small positive quantity. Columns whose entropy does not meet the constraints in Equation (24) are not representative of the information conveyed by spectrogram and they won't be considered in modes counting.

New mode contribution
Interference among modes is not the only cause of dissimilarity in adjacent columns. Since modes may have different time supports, one contribution can appear at a different time instant, as in Figure 8b. Both cases yield to different RLE code with respect to the separable case discussed in the previous paragraph, resulting in a greater complexity. For instance, when the third mode appears in the map shown in Figure 8b, the column code switches from the one in Equation (19) to (0, l 1 ), (1, l 2 ), (0, l 3 ), (1, l 4 ), (0, l 5 ), (1, l 6 ), (0, l 7 ) .
As a result, taking into account Equation (22), τ > 0 and the complexity is higher. It is worth observing that, in this case, the complexity increase is justified by the fact that there's really a new chirp somewhere in the TF plane. What happens in the interference region is instead an energy cancellation, which means an entropy value that is significantly far from the reference one. It turns out that the typical columns to be selected are the ones characterized by an entropy H c such that i.e., whose entropy distribution is not symmetric around the entropy distribution peak, allowing new energy blobs located far from the already present contribution. Such additional energy can not be produced by interference and then it has to be due to a new mode. For this reason, columns satisfying the condition in Equation (26) will be included in modes counting. It is worth pointing out that τ > 0 also occurs when leaving an interference region, as in the case depicted in Figure 8a. Therefore, checking only the sign of τ is not sufficient for correctly estimating the number of components. The advection evolution law in Equation (5) can be exploited to solve this issue. More precisely, if modes share the same time support and two adjacent columns are compared, then support boundary points derive from the ones in the previous column, as emphasized by the round markers in Figure 8c,e. On the contrary, an isolated new mode is an outlier in the transport phenomenon. Indeed, by comparing Figure 8d,f, no correspondence between columns is observed at the new mode support. The same happens in the reverse situation, i.e., whenever an isolated contribution disappears.

The Algorithm
The whole procedure for estimating the number of components of AM-FM signals is explained below. Without loss of generality, modes are assumed to be separable at u = 0 (this assumption is not strict, as it is sufficient to start from a time-instant belonging to a separability region, which is most of spectrogram support, in our assumptions).
Given a MCS f (t), as defined in Equation (1), the binary map B is derived from spectrogram. Then, B is processed by 2D RLE, i.e., it is scanned, from left to right, to code each column and also to sequentially compute the number of modes, based on the comparison between adjacent columns. If higher complexity is found, then adjacent columns are further investigated for identifying modes new contributions, as well as disappearances, so that the number of modes can be correctly updated. The required steps are the following.
(2) Threshold spectrogram at the 10% of its absolute maximum (this is a pre-processing step, justified by the assumption that noise is equally distributed over the whole domain). (3) For each u in the signal time support (3.1) select spectrogram column P(u, ξ) and normalize it; (3.2) for each test threshold T k ∈ {0.1, 0.2, ..., 1} evaluate the MDL functional, as in Equation (18), and select T¯k(u) providing the functional minimum; (3.3) threshold spectrogram column P(u, ξ) according to Equation (15), with T = T¯k(u).
(4) Set B as the overall columns obtained in step 3.3.
Step 2. 2D RLE encoding and modes counting (1) Code the first binary column, i.e., B(u 1 ) by RLE and compute N modes (u 1 ) according to Equation (21). The function IsIsolated checks if an observed complexity increase is due to an isolated contribution. In particular, it outputs 1 if a new mode has appeared, −1 if a mode switches off. Its implementation can be found in Algorithm 1, while a flowchart of the whole procedure is provided in Figures 9 and 10.

Experimental Results
The proposed method has been tested on several synthetic MCS with various FM and AM. This section presents the most significant results concerning linear combinations of modes with length L = 512 and localized Gaussian blobs. Both additive and destructive interference effects are considered. In all tests, STFT is computed by using a gaussian window with length s ∈ {32, 46, 58, 64} and the corresponding spectrogram is adopted as representative TFD. Signals are embedded in additive white gaussian (AWG) noise with Signal to Noise Ratio (SNR) ranging from 6 dB to 20 dB. The relative counting error, defined as the number of correct estimations with respect to the time domain dimension, is computed for each considered SNR level.
The first test is aimed at evaluating method effectiveness in case of critical destructive interference. To this aim, a two-component noisy signal (SNR = 10 dB) composed of a cubic and a linear chirp is considered in Figure 11. Signal in the time domain is shown in (a), while its spectrogram, computed with a window of length s = 46, is in (b). As it can be observed, when modes supports overlap in the TF plane, spectrogram presents very low energy and, therefore, it does not clearly reveal the individual components at the interference region. Each spectrogram column is normalized separately and the optimal threshold, as in Equation (18), is selected. The time-dependent threshold realizing MDL functional minimum is shown in (c). The latter is used for thresholding spectrogram columnwise, so that the binary map shown in (d) is obtained. Black holes have been removed. The estimated number of modes, as a temporal function, is provided in (e). It is worth observing that the proposed method is able to correctly detect the number of modes at the interference region. Finally, the relative counting error against SNR is shown in (f). Similar results are achieved for different window lengths, as shown in Figures 12 and 13, respectively referred to s = 32 and s = 64. As it can be easily seen by comparing Figures 12d and 13d, better performances against noise are obtained for a larger window length s, as it provides a more concentrated spectrogram distribution.
In order to compare the proposed method to the ones in the literature, Figure 14 shows the estimated number of modes of the signal in Figure 11a by using the Rényi entropy-based approach [34,35]. The required parameters have been set according to [35]. As it can be observed, the result in Figure 14b is affected by the presence of boundary effects in spectrogram domain (around u = 350). In addition, only one component is detected at the non-separability region, while the proposed method is able to provide the correct number of modes, as shown in Figure 11e.
In Figure 15 a two-component AM-FM signal is considered. Specifically, it is the superposition of a quadratic mode modulated by a linear amplitude and a linear chirp modulated by a gaussian amplitude, embedded in AWG at SNR = 15 dB. The signal in the time domain is shown in (a) and its spectrogram can be found in (b). As before, MDL functional minimization provides the optimal thresholds in (c) for obtaining the map depicted in (d). The detected number of components is shown in (e) and the relative counting error for varying SNR is plotted in (f). Also in this case, the proposed method estimates the correct number of modes at the interference region. Figure 16 concerns a three-component FM signal corrupted by noise (SNR = 10 dB). As it can be observed by looking at spectrogram in (b), two modes interfere with each other in additive way, so that only one component is revealed at the interference region. The optimal threshold level selected by MDL functional minimization is shown in (c), resulting in the binary representation depicted in (d) and the estimate in (e). Even in this case, despite interference, the correct number of modes is estimated and the localized gaussian blob is also detected. Figure 17 shows the result given by the Rényi entropy-based approach in [34]. As it can be noticed, the estimation is incorrect around u = 50 due to the presence of time varying amplitudes and, as expected, only one contribution is detected at the interference region. This limitation is overcome by the proposed method, as confirmed by Figure 16e.
A similar result is achieved by considering the configuration shown in Figure 18, where the gaussian blob time support is close to the one interested by interference.
The following test aims at investigating method effectiveness in dealing with multiple interference regions, as well as new localized mode contributions. In particular, the FM noisy signal in Figure 19a is considered (SNR = 15 dB). The latter consists of the superposition of three interfering polynomial FM modes (respectively cubic, linear and constant chirp) and a gaussian blob. As shown by Figure 19b, three disjointed interference regions occur. By thresholding spectrogram columns at those levels provided by MDL functional minimization and shown in Figure 19c, the binary map in Figure 19d is obtained. The estimated number of modes is provided in Figure 19e, confirming algorithm robustness to multiple interference regions, as well as new mode contribution or mode switch off. Relative counting error against SNR can be found in Figure 19f.
In Figure 20, a noisy hyperbolic chirp is considered as test signal. As it can be noticed by looking at (a), signal frequency rapidly increases with respect to time. The corresponding spectrogram, shown in (b), is processed to select the optimal threshold level, according to MDL functional minimization, resulting in (c). The binary map shown in (d) is then derived. Even if a reduction in spectrogram sections' support is observed in (b), and then different RLE are obtained by processing the column of (d), the method still is able to estimate the correct number of modes, as shown in (e). Finally, (f) depicts the relative counting error when SNR varies. Figure 21 shows the result provided by the Rényi entropy-based approach in [34]. As it can be observed in (b), IF fast increase in time affects the estimation, resulting in inaccurate results. Figure 22 is aimed at showing method capability in dealing with complex signals and refers to the sum of a sinusoidally FM component and a linearly one. As confirmed by Figure 22f, the proposed method provides the correct number of modes, although the presence of cross-terms and intersection between modes. Figure 23 concerns a three-component signal where one mode (the constant chirp) presents a much higher energy with respect to the others, that conversely interfere with each other in a destructive way, as shown by spectrogram in (b). In this very pathological case, the threshold value obtained by MDL functional minimization does not provide a consistent representation-null contribution is observed in the binary map in (c) in a subregion of the second interference region. From Figure 23d, we can observe that only the dominant constant chirp is detected in that subregion.
Finally, the proposed method has been tested on the real-life bat echo location signal shown in Figure 24a. As confirmed by Figure 24d, although some negligible instabilities, the correct number of modes is provided, for each time sample.

Some Observations
As shown in Section 3.1, the proposed method is able to correctly estimate modes number, even if signal components do overlap in the TF plane. This is the main finding in this paper.
Compared to the Rényi entropy-based procedure in [34,35], the presented method is suitable for AM and overlapped modes and it seems to be more robust to boundary effects in spectrogram domain. It is worth underlying that one of the main assumptions in [34,35] is that signal individual modes have similar structure in TF plane. Unfortunately, the presence of boundary effects, as well time-varying amplitudes, can locally alter signal structure, that is why Rényi entropy computation can be inaccurate, resulting in wrong modes counting. For the same reason, the proposed method seems to be more suitable for processing hyperbolic chirps.
Experimental results also showed method robustness to noise, up to SNR = 12 dB. In higher noise environment, a larger relative counting error is sometimes observed, as shown in Figure 15. In this particular case, estimation is incorrect at the time boundary of modes support. In other words, method fails in estimating the exact instant of mode appearance or mode switch off. Nevertheless, it is worth observing that the error is zero at the interference region.
In any case, high noise levels make the detection of low energy mode more critical. Indeed, concentrated fake (due to noise) distributions with large energy could be observed in the spectrogram support, and they may be confused with localized gaussian blobs. In order to overcome this drawback, the adaptive thresholding procedure proposed in [41] can be adopted as a pre-processing for the extraction of significant information from noisy TFDs.
It is worth pointing out that the lack of information observed in Figure 23c is not due to MDL inefficiency, but it depends on critical issues in signal characteristics, i.e., presence of a dominant mode and destructive interference, at the same time. In such cases, an optimal threshold value could not exist.
Numerical examples have also confirmed the robustness of the proposed method to wFMindow size. It is worth pointing out that a larger window size, which provides more concentrated spectrograms, is preferable for reducing the non-separability region and then for getting a more accurate estimate of modes number, as shown in Figures 11-13. Too spread distributions, as the one shown in Figure 2, usually present large amplitude cross-terms which may affect the optimal threshold detection by MDL, resulting in incorrect modes number estimation. For this reason, higher concentration is preferable in case of close signal modes. On the other hand, the application of the presented method to improved resolution TF distributions, such as the reassigned ones, will led to higher computational cost, with respect to simple spectrogram. The effect of grid modifications on method accuracy has been also investigated and method is somewhat robust to TF resolution.
It is also worth pointing out that the spectrogram evolution law in Equation (1) formally allows us to define the columnwise counting procedure. That is why spectrogram is the TFD adopted in this paper. As mentioned above, spectrogram of a MCS is generally affected by cross-terms, especially for close components. Different transforms, such as Smoothed Pseudo Wigner Ville or S-Transform, can significantly reduce cross-terms-even though they can not be completely removed. For this reason, more advanced kernels will be considered in future researches, as they may improve the overall accuracy. However, it is worth highlighting that robustness to cross-terms is precisely one of the advantages of the presented method.
Pathological situations such as interference occurs exactly when a new contribution appears, as well as multiple interference regions at the same time instant and, finally, more than one mode appear or disappear simultaneously have not been considered in this paper. These issues will be the subject of future studies.
In [11], authors proposed a method for separating IF curves in FM-MCS analysis. In particular, the procedure was based on the combination of spectrogram and RT and required signal modes' number. Compared with [11], the present work differs both in methodology and purpose: it encodes and processes spectrogram information to estimate modes number. For this reason, it can be useful as a pre-processing for the method in [11], as well as for all methods dealing with MCS and requiring signal cardinality knowledge.
In conclusion, even though further refinements are needed to make the proposed method suitable for more general situations, experimental results have confirmed its potential in dealing with interfering modes, that represents one of the most challenging scenario in MCS analysis.

Conclusions
This paper introduced a method for estimating the number of modes of a frequency modulated multicomponent signal (MCS). Based on a spectrogram time-frequency (TF) evolution law, the proposed approach exploits Kolmogorov complexity in order to model the information conveyed by spectrogram. In particular, interference as well as new mode contribution are formally explained by means of the Asymptotic Equipartition Property. Based on these theoretical results, the relevant information in spectrogram distribution, i.e., the one sufficient to correctly estimate modes number, is obtained by simply converting spectrogram into a binary map. This is done by an adaptive and automatic thresholding procedure, based on Minimum Description Length principle. Modes counting is then performed by a 2-dimensional Run Length Encoding implementation that measures similarity between adjacent columns.
As confirmed by the experimental results, the proposed method is suitable for time-varying amplitudes MCS and also for interfering MCS, i.e., for signals having close or overlapping modes both in time and frequency. Therefore, it contributes to overcome the limitations in Rényi entropy-based approach introduced in [35]. The proposed method is also able to deal with non-linearly FM signals; it is more general than Hilbert-Huang methods [31]. In particular, the presented procedure can be useful in applications that require a knowledge of modes number. The critical case of very low amplitudes in spectrogram, as well as the applicability of the presented study to other TF distributions will be investigated in depth in future research.

Conflicts of Interest:
The authors declare no conflict of interest.