Advances in Crest Factor Minimization for Wide-Bandwidth Multi-Sine Signals with Non-Flat Amplitude Spectra

Multi-sine excitation signals give spectroscopic insight into fast chemical processes over bandwidths from 10^1 Hz to 10^7 Hz. The crest factor (CF) determines the information density of a multi-sine signal. Minimizing the CF yields higher information density and is the goal of the presented work. Four algorithms and a combination of two of them are presented. The first two algorithms implement different iterative optimizations of the amplitude and phase angle values of the signal. The combined algorithm alternates between the first and second optimization algorithms. Additionally, a simulated annealing approach and a genetic algorithm optimizing the CF were implemented.


Introduction
Dielectric analysis (DEA) is a well-known method for the characterization of material behavior and a technology for monitoring chemical processes, e.g., the curing of thermosetting resins [1], the curing of adhesives [2], and the polymerization process of polyamide 6 [3]. A more general term is electrical impedance spectroscopy (EIS) [4]. In the context of biological processes, it is also referred to as bio-impedance spectroscopy (BIS) [5].
Independent of the application, DEA compares the phase and amplitude of a sinusoidal excitation signal applied to a sensor in contact with a specimen with those of its response signal. Changes in phase and amplitude over time give an indication of the state of the specimen. Ongoing chemical reactions creating new molecular structures result in changing dielectric behavior, which can further be correlated with other physical parameters or states, e.g., the viscosity or the state of cure.
In addition to being a characterization method, DEA has the benefit of being applicable to process monitoring and process control [6,7], thus showing great potential for inline quality monitoring solutions for adhesive part assembly or 3D printing using fast-curing resins [8].
Historically, in order to achieve full spectroscopic results, sweeping approaches using single-frequency sine waves were used. Especially for fast processes that take place in a few seconds or less, new approaches are needed to achieve spectroscopic information. Multi-sine signals provide the means to achieve the desired results. Nevertheless, using multi-sine excitation signals with only a few frequencies for process monitoring and relying on absolute values drastically limits the usage in industrial applications, as the measurement principle is prone to disturbances from external influences, e.g., contamination or parasitic induction. Furthermore, the use of only a small number of frequencies limits the information necessary to derive a complete picture of the processes or effects occurring, not only in the time domain but also in the frequency domain.
In [3,8], an approach was shown using multi-sine excitation signals with up to 20 frequencies incorporated, giving spectroscopic insight into fast chemical processes over high bandwidths. With recent modifications, the system is now able to monitor bandwidths from 10^2 Hz to 10^7 Hz, resulting in a need for excitation signals with more than 20 frequencies distributed over the measurement bandwidth to provide sufficient spectral resolution.
To compare the generated signals objectively, metrics are needed which give insight into the signals and the information they contain. One commonly used metric is the crest factor (CF). The CF determines the information density of a multi-sine signal. Minimizing the CF yields higher information density and is the goal of the presented work.

Multi-Sine
Large-bandwidth impedance spectroscopy (IS) requires a dedicated excitation signal. The main requirement for these signals is that they allow for a combined analysis in the time and frequency domains. Multiple options have been reported for these applications in the literature over the past decades. The most commonly used signals are binary sequences such as maximum length binary sequences (MLBS) [9] or discrete interval binary sequences (DIBS) [10], chirp signals [11], and multi-sine signals [12]. Multi-sine signals for large-bandwidth impedance analysis offer several advantages over other signal types: they allow for a custom amplitude spectrum while having customizable excitation frequencies.
The signal is generated by adding up multiple sine waves, while each wave can be chosen with its particular frequency f_n, amplitude a_n, and phase ϕ_n according to the following equation:

s(t) = ∑_{n=1}^{N} a_n · sin(2π f_n t + ϕ_n). (1)
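A minimal NumPy sketch of the synthesis in Equation (1); the frequencies, amplitudes, and phases below are illustrative values, not from the paper:

```python
import numpy as np

def multi_sine(t, freqs, amps, phases):
    """Equation (1): s(t) = sum_n a_n * sin(2*pi*f_n*t + phi_n)."""
    t = np.asarray(t, dtype=float)
    s = np.zeros_like(t)
    for f, a, phi in zip(freqs, amps, phases):
        s += a * np.sin(2 * np.pi * f * t + phi)
    return s

# Illustrative example: three components sampled over 10 ms
t = np.linspace(0.0, 1e-2, 10_000, endpoint=False)
s = multi_sine(t, freqs=[100, 300, 500], amps=[1.0, 0.5, 0.25],
               phases=[0.0, 1.2, -0.7])
```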

Crest Factor
A widely used metric to evaluate and compare multi-sine signals in the time domain is the crest factor (CF). This metric shows how much amplitude a signal consumes to introduce a certain amount of energy into a system [13]. High values indicate strong constructive interference between components, while low values for multi-sine signals imply little or no interference between the individual frequencies. The CF is calculated as the ratio between the peak value of a signal and its effective (root mean square) value.
For a signal s(t) in the time domain measured over a time interval [0, T], the CF is calculated according to the following formula:

CF(s) = max_{t ∈ [0,T]} |s(t)| / √( (1/T) ∫₀ᵀ s²(t) dt ). (3)
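In discrete time, the CF is simply the ratio of the sampled peak to the sampled RMS; for a pure sine over whole periods this gives the familiar value √2:

```python
import numpy as np

def crest_factor(s):
    """CF = peak value / root-mean-square value of the sampled signal."""
    s = np.asarray(s, dtype=float)
    return np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2))

# Sanity check: a pure sine over whole periods has CF = sqrt(2)
t = np.linspace(0.0, 1.0, 100_000, endpoint=False)
print(round(crest_factor(np.sin(2 * np.pi * 5 * t)), 3))  # → 1.414
```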

State of the Art
Current methods used for the optimization of multi-sine signals are either analytical approaches for calculating the phase angles or iterative algorithms. The idea behind the analytical formulas is to control the crest factor (CF) by appropriately choosing the phases of the regarded components of the multi-sine signal. One of the first attempts to solve this problem was proposed by Schroeder [14]. His approach sets the phase angles of the single multi-sine components according to the following formula:

ϕ_{m+1} = ϕ_m − 2π ∑_{i=1}^{m} p_i, with p_i = a_i² / ∑_{j=1}^{M} a_j², (4)

with a_i being the amplitude of the i-th component, m = 1, . . . , M − 1, and the free start phase ϕ_1 = ϕ_0 ∈ [−π, π]. Schroeder's approach was adopted by Newman [15], yielding a slightly different formula for optimal phase angles.
A major difference between these two formulas is that Schroeder took the amplitude values of the different frequency components of the multi-sine signal into account, whereas Newman's method only uses the number of exciting frequencies. Therefore, Schroeder's scheme often gets better results in the case of non-constant amplitudes [13].
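A sketch of Schroeder's phase rule (Equation (4)) in its recursive form, using the relative power of each component; indexing conventions vary slightly across references, so this should be read as one common variant:

```python
import numpy as np

def schroeder_phases(amps, phi0=0.0):
    """Schroeder's rule, recursive form: phi_{m+1} = phi_m - 2*pi*sum(p_i,
    i <= m), with p_i = a_i^2 / sum_j a_j^2 the relative component power."""
    p = np.asarray(amps, dtype=float) ** 2
    p = p / p.sum()                      # relative power per component
    phases = np.empty(len(p))
    phases[0] = phi0
    for m in range(1, len(p)):
        phases[m] = phases[m - 1] - 2 * np.pi * np.sum(p[:m])
    return phases
```

For a flat amplitude spectrum this reduces to the well-known quadratic phase sequence −π m(m+1)/M, which is why Schroeder phases and Newman phases coincide in character for constant amplitudes but differ for non-flat spectra.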
By now, further approaches to solve this problem analytically have been presented. In recent years, several formulas have been introduced by Ojarand [16]. These three equations Φ1_i, Φ2_i, and Φ3_i calculate the phase angle for frequency component i. They are quite easy to calculate and behave well, especially for sparse frequency distributions; the first formula also shows promising results for denser frequency distributions.
The parameter B can be freely chosen between 0 and 180, and i stands for the currently regarded frequency component. The five formulas described above all behave differently from one another depending on the distribution of the frequencies. Nonetheless, it must be pointed out that these analytical methods do not achieve acceptable results in terms of the CF.
Using iterative algorithms to optimize the phase angles values promises to be a more satisfying approach. Several versions of well-behaving algorithms have been presented in the last decades and yet they do not yield optimal solutions. These can so far only be provided by an exhaustive search of all possible phase combinations. Within the last few years, Ojarand proposed two different iterative algorithms [16,17].
The first one, presented in 2014, optimizes the phase angles by selectively searching through a given range of phase angles. As a result, this algorithm takes a lot of time to find suitable phases for each frequency: for 20 or more frequency components, it would take days to optimize their phase angles, which makes this algorithm unsuitable for industrial applications.
In 2017, Ojarand et al. presented another algorithm, addressing the tendency of their earlier algorithm to find only local minima, a well-known problem of iterative algorithms that minimize the CF of multi-sine signals [17]. The main idea of their new method is to start the iteration with a fixed phase set calculated by an analytical formula, after which the multi-sine signal is modulated. Then, the main part of the algorithm starts, consisting of another iteration: first, the Fourier spectrum is built and the inverse discrete Fourier transformation (IDFT) is calculated. Thereafter, the CF of the resulting signal is calculated and compared to the currently lowest CF; in the case of an improvement, the currently optimal phase set is updated. Subsequently, a logarithmic clipping function clips the current multi-sine signal, and the discrete Fourier transformation (DFT) of the clipped signal is calculated. This last step yields a new phase set from the DFT that may result in a lower CF. The inner iteration can be executed an arbitrary number of times; at the end, a new phase set is calculated by one of the analytic formulas, and the whole process described above is repeated.
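A hedged sketch of this clip-and-restore loop: the paper uses a logarithmic clipping function and an outer restart loop over analytic phase sets, whereas this sketch uses simple hard clipping at a fixed fraction of the peak, an assumed iteration count, and a harmonic frequency grid given as DFT bin indices:

```python
import numpy as np

def crest_factor(s):
    return np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2))

def clip_optimize(bins, amps, phases, n=4096, n_iter=200, clip_level=0.8):
    """Clip-and-restore loop: IDFT -> CF bookkeeping -> clip -> DFT ->
    new phases, keeping the prescribed amplitude spectrum each round."""
    amps = np.asarray(amps, dtype=float)
    phases = np.asarray(phases, dtype=float).copy()
    best_phases, best_cf = phases.copy(), np.inf
    for _ in range(n_iter):
        spec = np.zeros(n // 2 + 1, dtype=complex)
        spec[bins] = amps * np.exp(1j * phases)    # prescribed spectrum
        s = np.fft.irfft(spec, n)                  # IDFT -> time signal
        cf = crest_factor(s)
        if cf < best_cf:                           # track best phase set
            best_cf, best_phases = cf, phases.copy()
        peak = np.max(np.abs(s))
        s_clip = np.clip(s, -clip_level * peak, clip_level * peak)
        phases = np.angle(np.fft.rfft(s_clip)[bins])  # new phases from DFT
    return best_phases, best_cf
```

Starting from random rather than zero phases avoids the symmetric fixed point where all components are cosine-aligned.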
As mentioned before, the first algorithm achieves quite low CFs, because it selectively searches the whole given phase spectrum for a near-to-optimal phase angle combination. This comes at the cost of taking more and more time for an increased number of frequencies in the multi-sine signal. Therefore, this algorithm is not useful for the application discussed in this paper. The second algorithm on the other hand returns promising results, especially for signals whose frequencies are distributed over a rather small bandwidth.

Optimization Approaches
In this section, three new optimization approaches that minimize the CF are presented and subsequently tested to compare their performance in terms of CF minimization to the algorithm Ojarand presented in 2017 [17].

Iterative-Stochastic Optimization
The first new algorithm comprises two separate components, an iteratively operating algorithm and a stochastic algorithm. The iteratively working algorithm optimizes the CF of the currently regarded multi-sine signal by optimizing the phase angles.
An overview of the workflow of the algorithm is given in Figure A1 (Appendix A). The algorithm starts by calculating a first set of phases, e.g., with Equation (6); in our implementation, we use the formula Φ2_i = 180·B/i. Then, each phase angle is regarded and optimized separately. To achieve this, each phase angle is increased once and decreased once. The multi-sine signal is modulated using the new phase angle, resulting in a new multi-sine signal differing in only one component. At the end of this iteration, the CF of the new signal is calculated and compared to the previous one to update the best-found phase set so far. Summing up, this iteratively working algorithm searches for a minimum located close to the consigned set of phases. Therefore, it only delivers a local optimum and should be used in combination with other, globally acting algorithms.
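One pass of this per-phase trial step can be sketched as follows; the step size delta is an assumption, as the text does not specify how far each phase angle is increased or decreased:

```python
import numpy as np

def crest_factor(s):
    return np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2))

def iterative_phase_step(t, freqs, amps, phases, delta=np.deg2rad(5.0)):
    """One pass of the iterative component: each phase angle is increased
    and decreased once by delta; a change is kept only if it lowers the CF."""
    def synth(ph):
        return sum(a * np.sin(2 * np.pi * f * t + p)
                   for f, a, p in zip(freqs, amps, ph))
    phases = np.asarray(phases, dtype=float).copy()
    best = crest_factor(synth(phases))
    for i in range(len(phases)):
        for sign in (+1.0, -1.0):
            trial = phases.copy()
            trial[i] += sign * delta
            cf = crest_factor(synth(trial))
            if cf < best:                 # keep the improving phase set
                best, phases = cf, trial
    return phases, best
```

Repeating such passes until no trial improves the CF yields exactly the local optimum near the starting phases that the text describes.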
To address the drawback of the first algorithm, a second, stochastically operating algorithm was developed to be combined with it. The idea behind this algorithm is to calculate random phase angles ϕ_i and amplitude values a_i in a specified range. For the i-th component, they are defined as follows:

ϕ_i ∈ [−π, π], a_i ∈ [0.9 · A_i, 1.1 · A_i], (7)

where A_i is the amplitude value of the i-th component calculated at the beginning of the algorithm. We restrict the amplitudes because the originally prescribed distribution of the amplitudes should preferably keep its form. As a result, the newly calculated amplitude value may deviate by a maximum of 10% from the value calculated at the beginning. This stochastic algorithm alternately calculates a set of random phase angles and then a set of random amplitude values. These are then used to modulate a new multi-sine signal whose CF is compared to the lowest CF reached so far. In the case of an improvement, the algorithm saves the improved phase and amplitude values and continues the random calculations.
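Drawing one random candidate within these ranges can be sketched as follows; the ±10% amplitude bound is stated in the text, while uniform sampling (and drawing phases and amplitudes together rather than alternately) is a simplifying assumption:

```python
import numpy as np

def random_candidate(A, rng):
    """Draw one random phase set and one amplitude set within the ranges of
    Equation (7): phases over a full circle, amplitudes within +/-10% of
    the initial values A."""
    A = np.asarray(A, dtype=float)
    phases = rng.uniform(-np.pi, np.pi, size=len(A))
    amps = rng.uniform(0.9 * A, 1.1 * A)
    return phases, amps

rng = np.random.default_rng(0)
phases, amps = random_candidate([1.0, 0.5, 0.25], rng)
```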
To combine the benefits of both above-described algorithms, the iterative-stochastic optimization algorithm alternates between the iterative and the stochastic algorithm. In this way, the chance of finding the global minimum rises, and the chance of getting stuck in a local minimum is minimized. The algorithm terminates when a set number of CF calculations is reached.

Simulated Annealing
In addition, a simulated annealing (SA) approach is adapted for the specified problem. SA is a metaheuristic approach to approximate a global optimum. The algorithm consists of iteratively executed steps. For each problem, the components of the annealing schedule, acceptance probability p, current state s, state transition, and cost function have to be defined [18].
The annealing schedule reduces the start temperature T_0 at each iteration k until it reaches the final temperature T_f, at which point the algorithm terminates. We used a schedule where the current temperature T_k is reduced at each iteration by the cooling factor c according to T_k = c · T_{k−1}. Furthermore, the algorithm keeps track of the current state, which corresponds to a possible solution of the problem. Each state consists of a phase angle and an amplitude value for each frequency component. At the start, both parameters are randomly initialized. Afterward, a neighbor state is selected at each iteration by changing the phase angle or amplitude value of a randomly chosen frequency component within the value ranges of Equation (7).
The current state s_{k−1} transitions into the neighbor state s_k if the crest factor (CF) of the neighbor state is smaller than that of the current state. Otherwise, it transitions into the neighbor state with the acceptance probability p = exp(−(CF(s_k) − CF(s_{k−1})) / T_k); if the neighbor is not accepted, the current state is kept. At the end, the algorithm returns the state with the lowest CF. The specific configuration of the algorithm is based on several test runs and consists of the following parameters: T_0 = 100, T_f = 0.00005, and c = (T_f / T_0)^(1/n_CF), where n_CF specifies the number of CF calculations. The parameter n_CF can be chosen freely and determines the runtime duration.
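Under these definitions, the SA loop can be sketched as below; the neighbor move (redrawing one value uniformly within the ranges of Equation (7)) is an assumption, and the example signal is illustrative:

```python
import numpy as np

def crest_factor(s):
    return np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2))

def simulated_annealing(t, freqs, A, n_cf=2000, T0=100.0, Tf=5e-5, seed=0):
    """SA over phase/amplitude states with the geometric schedule
    T_k = c * T_{k-1}, c = (Tf/T0)**(1/n_cf), and Metropolis acceptance."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)

    def synth(ph, a):
        return sum(ai * np.sin(2 * np.pi * f * t + p)
                   for f, ai, p in zip(freqs, a, ph))

    # Random start state: one phase and one amplitude per component
    phases = rng.uniform(-np.pi, np.pi, len(A))
    amps = rng.uniform(0.9 * A, 1.1 * A)
    cf = crest_factor(synth(phases, amps))
    best = (cf, phases.copy(), amps.copy())
    c = (Tf / T0) ** (1.0 / n_cf)
    T = T0
    for _ in range(n_cf):
        nph, na = phases.copy(), amps.copy()
        i = rng.integers(len(A))
        if rng.random() < 0.5:                # neighbor: redraw one phase ...
            nph[i] = rng.uniform(-np.pi, np.pi)
        else:                                 # ... or one amplitude (+/-10%)
            na[i] = rng.uniform(0.9 * A[i], 1.1 * A[i])
        ncf = crest_factor(synth(nph, na))
        # Accept improvements always, worse states with p = exp(-dCF/T)
        if ncf < cf or rng.random() < np.exp((cf - ncf) / T):
            phases, amps, cf = nph, na, ncf
            if cf < best[0]:
                best = (cf, phases.copy(), amps.copy())
        T *= c                                # cooling step
    return best
```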

Genetic Algorithm
The genetic algorithm (GA) is another metaheuristic approach that we adopted for the specified problem. GA is part of the domain of evolutionary algorithms and is defined by the following components: initialization, selection, crossover, and mutation [19].
At the beginning, multiple candidate solutions are randomly generated to build a start population. Each of these is characterized by chromosomes that model the properties of a solution. A chromosome, in turn, is modeled as the phase angle and amplitude value of a frequency component. Afterward, the iterative process begins with the selection of parents for the next generation.
We used tournament selection for the selection process. The tournament size k was set to the value of 3, and each candidate was chosen randomly. The best candidate of the tournament was selected on the basis of the lowest CF. Then, a crossover operation generated the offspring by combining two selected individuals. For that matter, a uniform crossover that chooses random chromosomes from either parent with equal probability was used.
A mutation operation at the end of each iteration changed, for each offspring, the amplitude and phase angle of each frequency component with a probability inversely proportional to n_freq, the number of frequency components. The mutation changes a phase angle or amplitude value by randomly choosing a new value in the value ranges of these parameters given in Equation (7). These steps were repeated until the final number of generations n_generation was reached. To compare our approaches, we used the number of CF calculations n_CF to specify the runtime duration; therefore, we set n_generation = n_CF / n_pop, where n_pop is the population size. We set the population size to 100 and the probability of keeping the original parent in the next population to 40%. The specified configuration was determined by trial and error.
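The GA loop can be sketched as follows. Tournament selection with k = 3 and uniform crossover follow the text; the exact mutation probability (taken here as 1/n_freq) and the omission of the 40% parent-retention step are simplifying assumptions, and the population/generation counts are scaled down for illustration:

```python
import numpy as np

def crest_factor(s):
    return np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2))

def genetic_cf(t, freqs, A, n_pop=30, n_gen=40, k_tour=3, seed=0):
    """GA sketch: each individual holds one phase and one amplitude per
    frequency component; tournament selection, uniform crossover,
    and per-gene mutation within the ranges of Equation (7)."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    n = len(A)

    def fitness(ind):
        ph, a = ind[:n], ind[n:]
        s = sum(ai * np.sin(2 * np.pi * f * t + p)
                for f, ai, p in zip(freqs, a, ph))
        return crest_factor(s)

    pop = [np.concatenate([rng.uniform(-np.pi, np.pi, n),
                           rng.uniform(0.9 * A, 1.1 * A)])
           for _ in range(n_pop)]
    fit = [fitness(p) for p in pop]
    for _ in range(n_gen):
        def tournament():
            idx = rng.integers(n_pop, size=k_tour)
            return pop[min(idx, key=lambda i: fit[i])]
        children = []
        for _ in range(n_pop):
            p1, p2 = tournament(), tournament()
            mask = rng.random(2 * n) < 0.5        # uniform crossover
            child = np.where(mask, p1, p2)
            for g in range(n):                    # mutation per gene
                if rng.random() < 1.0 / n:
                    child[g] = rng.uniform(-np.pi, np.pi)
                if rng.random() < 1.0 / n:
                    child[n + g] = rng.uniform(0.9 * A[g], 1.1 * A[g])
            children.append(child)
        pop, fit = children, [fitness(p) for p in children]
    b = int(np.argmin(fit))
    return pop[b], fit[b]
```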

Experiments
According to the intended application scenarios, different parameters were selected for a detailed investigation. Three amplitude distributions (uniform, linear, and exponential, the latter two decreasing with increasing frequency) were of importance for our research. The upper bandwidth was limited to 10^6 Hz, as this was a reasonable tradeoff between CF reduction and calculation effort. The frequency distribution was fixed, as was the number of iterations.
All possible combinations from Table 1 were tested for each algorithm. A random start state was used, which was the same for all algorithms. Furthermore, each configuration was executed five times because of the stochastic elements of the algorithms; the exception was the algorithm from Ojarand, as it always delivers the same result. To conduct the experiments, the calculations were executed on Microsoft Azure using the programming language Python. The hardware configuration was a Standard-F32s-v2 compute unit using 32 virtual Intel(R) Xeon(R) Platinum 8272CL CPUs at 2.60 GHz. The memory size and the storage size (SSD) were set to 64 GiB. A separate process with an individual configuration was started in parallel on each CPU core.

CF Minimization
The results for the specified configurations and algorithms are visualized in Figure 1. Each panel represents the results for a specific amplitude distribution. The abbreviation Mixed stands for the iterative-stochastic optimization algorithm, and Clip stands for the algorithm presented by Ojarand [17]. Furthermore, Schroeder's analytic formula from Equation (4) was used as a baseline. Because of their stochastic elements, SA, GA, and Mixed were run several times; thus, the mean and standard deviation of the CF after n_CF iterations are shown.

All presented algorithms outperformed Ojarand's algorithm and Schroeder's formula in terms of CF. SA generally delivered the best results, followed by GA and the iterative-stochastic algorithm. Especially for multi-sine signals with high frequencies, the margin became significant, with differences of up to 1-1.5-fold in CF.
In summary, a broad range of algorithms was compared under predefined conditions. The advantage is that this creates comparability between the different algorithms under uniform conditions. Nevertheless, only a limited selection of algorithms and hyperparameter optimization was applied; therefore, it is not guaranteed that no more suitable algorithms exist for the specified problem. Furthermore, the presented algorithms did not always deliver the same results. We analyzed this effect by running the algorithms multiple times, but five repetitions were not enough for a general statement. Nevertheless, our tests indicate only small deviations between different runs.

Time per CF Reduction
The runtime and progression of the algorithms were analyzed by investigating the improvement of the CF over time. Figure 2 illustrates this process. The improvement ∆CF was calculated as the difference between the best-found CF up to a specific iteration and the start value. Only the results for 50 frequency components are shown due to limited space; nevertheless, the CF progression for the other frequency component counts was similar. For each iteration, the mean and standard deviation over all five repetitions of an algorithm were calculated and visualized, with one exception: due to the lack of stochastic elements, the Clip algorithm was not run multiple times. The actual runtime can be calculated by taking the time per iteration from Table 2.
The results show that the final CF for Clip and Schroeder was surpassed by all presented algorithms after only a few iterations. The results using Schroeder's formula were similar to the final CF of the Clip algorithm (see Figure 1 for comparison). The strong reduction lasted until about iteration 10,000 before slowing down. SA was the exception because it depended on the predefined annealing schedule; therefore, its characteristic curve was always the same regardless of the number of iterations. Nevertheless, the curve of SA showed that the start phase of the algorithm was purely random, and the last phase seemed very long. This indicates that a more diligent hyperparameter search could accelerate the algorithm.
Furthermore, none of the algorithms were optimized for runtime, and the values from Table 2 are only reference values. These runtime values depend strongly on various parameters, such as hardware configuration, programming language, or parallelization. Another aspect is the standard deviation across runs compared between the algorithms. In general, the deviation was less pronounced for SA than for the iterative-stochastic and GA methods.

Conclusions and Future Work
Using multi-sine signals with a low CF is a necessity for high-precision measurement systems such as DEA, which relies on the comparison of excitation and response signals. Due to nonlinearities in the electrical components, as well as disturbances in the measuring path, a signal with a low CF is favorable, as the frequency analysis becomes more robust, resulting in a less error-prone measurement device.
With an increase in bandwidth, an increase in frequencies monitored is required, especially if the analysis is difficult to support by a model-based approach using small sets of measurement points for the model fit. For industrial applications, where environmental influences, contamination, material aging, and differences between material batches are more a rule than an exception, a fast implementation is required, and a simplified approach is preferred. The presented methods showed significant improvements in reaching a low CF, as well as obtaining a fair CF, in a short amount of time, especially for high frequencies over a wide bandwidth. Thus, the foundation was laid to apply analytical methods as a function of time-dependent frequency behavior over a wide bandwidth, which opens up new paths for the investigation of fast-curing adhesives or similar chemical processes and phase changes in thermoplastics.
Our plans for future research are to investigate the effect of different start values, e.g., from Schroeder's or Newman's formula, on the performance and results of our presented algorithms. These formulas yielded considerably better results than random values for the phase angles and could, therefore, have a significant impact. Further improvement of our presented algorithms will also be a topic. For the iterative-stochastic method, a selective search instead of the current iterative search is a timesaving option. In addition, the runtime needs to be inspected further; for example, the algorithms can be improved by using a more runtime-oriented programming language or by parallelization. Another goal is to ensure the reliability of the algorithms by increasing the number of repetitions.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.