3.2. System Components and Modeling
The in-vehicle audio system has a complex structure comprising multiple components. Key components include a signal generator, an audio analyzer, BNC connectors, a speaker, and a vehicle multimedia system. The transmission of signals starts with the input sound from the signal generator, traveling through BNC connectors to the audio analyzer, and then to the speaker. At the end of the signal process, the sound waves produced by the speaker are delivered to users via the in-vehicle multimedia system. These components are critical not only for the in-vehicle audio quality but also for directly influencing the auditory experience of drivers and passengers. The final output obtained in the vehicle audio system is the result of multiplying the frequency responses of the system components with each other. Therefore, the frequency response of each component is among the critical factors directly impacting system performance. The system diagram is shown in
Figure 2. Subsequent sections will address the modeling of each component in terms of frequency response and examine the interactions between these responses.
White noise is frequently preferred in the calibration processes of sound systems because it has equal sound intensity in a wide frequency range, such as 20 Hz–20 kHz. This feature allows for objective testing of the system’s frequency response throughout the entire frequency band; thus, it enables a comprehensive analysis of system performance. Particularly in acoustic arrangements of sound systems and rooms, this broad-spectrum sound source is utilized to assess system performance and the necessary adjustments are implemented. Consequently, the responses of sound systems at various frequencies can be objectively measured and optimized.
A similar usage is found in in-vehicle sound systems. Thanks to its balanced and comprehensive frequency spectrum, white noise provides an ideal test signal for accurately detecting the frequency response of sound systems across a wide frequency range. A significant advantage is that during in-vehicle acoustic adjustments using white-noise signals, it is possible to interactively examine and adjust the effects of applied filters across all frequencies [
6].
Acoustic analyzers use frequency weighting curves to simulate the human ear’s sensitivity to different frequencies. The A, B, and C weighting scales, displayed in
Figure 3, represent various frequency response filters employed in sound measurements. The process of weighting adjusts the measured decibel (dB) values of sounds at specific frequencies to match the sensitivity levels perceived by the human ear at those frequencies. Given the human ear’s lower sensitivity to low frequencies and higher sensitivity to high frequencies, these weighting curves are essential for accurately reflecting perceived sound intensity. These curves aim to align the measured sound levels more closely with the natural response of the human ear and enhance the effective assessment of sound’s true impact. Therefore, weighting filters are applied to the white-noise signal used as the input sound.
A-weighting primarily represents the sensitivity of the human ear to ambient noise measurements at low sound levels. This scale is more sensitive between the frequencies of 500 Hz to 10 kHz, while it reduces sounds at lower and higher frequencies, making it ideal for everyday environmental sound measurements. B-weighting is designed for medium to high sound levels (80 to 90 phon) and emphasizes sounds across the frequency range slightly more, making it useful in environments such as cinema and music production. C-weighting is used at high sound levels and provides a flatter response across a wide frequency range. Thereby, low- and high-frequency sounds are measured at nearly their original levels. This achieved accuracy is especially required in industrial settings and at concerts [
14].
B-weighting is a method designed to better reflect the sensitivity of the human ear at medium to high sound levels. Since it is necessary to regulate the system based on the mid to upper sound levels in setting vehicle sound systems, B-weighting, shown in
Figure 3, which presents the frequency response from 20 Hz to 20 kHz, is used. In this context, our study is based on the equal-loudness contour at 1000 Hz, with an amplitude level of 90 dB, as shown in
Figure 1. After applying the B-weighting process to the white-noise signal, the input audio used in this study is generated. The frequency response of this input sound is depicted in
Figure 4.
- (B)
Parametric Equalizer Filters
The parametric equalizer filter is a tool commonly used in audio processing and music production and has become a standard component in automotive multimedia systems. This filter adjusts the audio signal by targeting specific frequency ranges, thereby facilitating the adjustment of the sound’s tonal balance. The parametric equalizer is adjusted based on three primary parameters: center frequency (), bandwidth (or factor), and gain (). The gain modifies the intensity of the sound signal by increasing or decreasing the amplitude within the selected frequency band. The bandwidth determines the range of frequencies where the filter is effective, and it is usually expressed in octaves. The factor is defined as the inverse of the ratio of the bandwidth to the center frequency and indicates the sharpness of the filter. A high factor means a narrower bandwidth and a sharper filter response. These parameters allow users to finely tune how narrowly or broadly they want to affect a specific frequency, thereby precisely achieving the desired sound characteristics.
The parametric equalizer is a crucial tool in audio processing, primarily incorporating various filter types such as peaking (bell), shelving, and notch filters. In vehicle multimedia systems, the peaking filter is particularly favored. This filter facilitates the adjustment of in-vehicle sound systems in, accordance with the principle of equal loudness, through its advantages, such as frequency adjustment flexibility, tone control, and focused frequency intervention. The peaking filter is designed to either emphasize or attenuate signals within a specific frequency band and plays a critical role in sound processing and equalization [
15].
The mathematical model of the peaking filter is represented by a transfer function
that describes its effect on the frequency domain of the signal. This transfer function defines the relationship between the input and output of the signal as a function of frequency and is commonly formulated as (1)
Here,
represents the frequency under study,
is the center frequency (the frequency to be emphasized or attenuated),
denotes the gain (in amplitude level, measured in dB), and
represents the quality factor (which inversely defines the bandwidth of the filter) [
5].
According to the principle of equal loudness, the precise regulation of different frequency bands is necessary. Therefore, multiple parametric equalizer filters are generally used. The ISO 226:2003 standard [
12] defines the characteristics of parametric filters required to achieve an ideal equal-loudness contour. Adjusted according to this standard, ten parametric filters were designed to provide optimal sound correction with corresponding gain and Q factor values at the specified frequencies. Each of these filters is set according to the frequency, gain, and Q factor values detailed in
Table 1.
In this study, ten parametric equalizer filters have been utilized to ensure an ideal sound experience in vehicle multimedia systems. A filter order of 12 was chosen, which is aligned with the order of existing filters in in-vehicle multimedia sound systems. The filters were created using the fdesign.parameq function, which is available in the DSP System Toolbox library of Matlab R2021b (9.11.0.1769968 version) software, and allows users to design a parametric equalizer filter with specified parameters.
Amplifiers serve as fundamental power-boosting devices in sound systems. In vehicle multimedia systems, amplifiers increase the amplitude of the received audio signal, enabling speakers to produce sound at higher volumes and with higher quality. This improves the signal-to-noise ratio (SNR), enhancing the clarity and detail of the sound while minimizing distortions.
Among in-vehicle entertainment systems, Class-D amplifiers are particularly preferred. These amplifiers are advantageous due to their high energy efficiency and quality sound output. Class-D amplifiers process audio signals in digital format, amplifying them directly, without converting them back to analog signals. This process allows them to achieve high sound levels with lower energy consumption, compared to traditional analog amplifiers. Additionally, these amplifiers provide a balanced and consistent response across a wide frequency range, making them ideal for music and sound effects [
16]. In this study, an amplifier suitable for 4-ohm speakers was selected.
Figure 5 details the frequency response of the chosen amplifier at 4 ohms.
Speakers function as the final output component of sound systems; they convert filtered and amplified audio signals into physical sound waves and deliver them to listeners.
The types of speakers used in vehicle sound systems are specially designed to provide optimal performance across different frequency ranges. Essentially, these speakers are categorized into four main types to cover low, mid, and high frequencies: subwoofer/woofer, mid-range, tweeter, and full-range speakers. Each type of speaker is optimized for a specific frequency range, and the frequency ranges of these speakers are detailed in
Table 2 [
6].
The type of speaker used in this study is the full-range speaker, which covers a wide frequency range.
Figure 6 shows the frequency response of the full-range speaker. The manufacturer states that the effective operating frequency of this speaker is between 85 Hz and 12.5 kHz. This wide frequency range indicates that the speaker can adequately produce both low-frequency bass sounds and high-frequency treble sounds. In this study, the modeling of the speaker was based on this frequency response curve.
3.3. Modeling and Optimization of System Output
In this study, to ensure an ideal audio experience in vehicle multimedia systems, the processes of signal processing, filtering, integration of amplifier, and speaker responses were detailed, and the combined effect of these components was optimized. While creating the system model, a B-weighting white-noise signal was used as the input signal, followed by the integration of parametric equalizer filters, amplifier, and speaker components, sequentially. This integration is based on the cascading method, where the output of each system component serves as the input for the next component. This method allows for the step-by-step processing of the signal and the sequential application of each component’s effect.
In this cascading process, the role of convolution, one of the fundamental concepts of signal processing theory, is of great importance. Convolution, as a mathematical operation, involves applying a system response (e.g., the frequency response of an equalizer filter or an amplifier) to an input signal. This process is essentially performed by “folding” each point of the input signal with the system response and summing the results.
The convolution theorem explains the frequency domain representation of this process: the convolution of two signals (input and system response) in the time domain is equivalent to the multiplication of these signals in the frequency domain. This transformation is shown in (2).
Here, represents the output signal in the time domain, denotes the input signal in the time domain, and represents the system’s impulse response in the time domain. The variable signifies the time at which the signal is shifted over the signal. In the frequency domain, represents the output signal, denotes the input signal, and is the system’s frequency response.
This property allows engineers and designers in the field of signal processing, particularly in filter design and audio processing applications, to perform complex signal processing operations more efficiently. In this study, since the system components were modeled in the frequency domain, the convolution operation was also performed in this domain [
17].
To model the resulting signal at the system output, the amplitude responses of the input signal, amplifier, and speaker components described in
Figure 4,
Figure 5 and
Figure 6 were first imported into Matlab. Since the signals were defined in the frequency domain, the frequency responses of the parametric equalizer filters were calculated using the “freqz” function, and their amplitude responses were obtained. Here, a sampling frequency of 96 kHz, which is also used in the actual setup, was employed. The convolution of these obtained signals was performed by multiplication, as shown in Equation (2), and the speaker output was modeled.
In this study, statistical analyses were used to measure the performance by comparing the experimental results obtained from the speaker output. Pearson correlation analysis and RMSE methods were utilized for this purpose.
Pearson correlation analysis is a statistical method used to measure the linear relationship between two data sets. This analysis helps determine how two variables change together, producing a correlation coefficient (r) between −1 and 1. A correlation coefficient close to 1 indicates a strong positive linear relationship, while a value close to −1 indicates a strong negative linear relationship. A value near 0 suggests no relationship between the two data sets. In this study, Pearson correlation analysis was used to examine the linear relationship between the obtained results. The Pearson correlation coefficient is calculated using (3).
Here, x and y refer to the data sets, and n refers to the total number of data points.
The RMSE is a method used to measure the magnitude of differences between two data sets. The RMSE evaluates the deviations among the obtained results and quantitatively indicates the amount of error. A low RMSE value indicates that the results are close to each other. RMSE is calculated using (4).
These statistical analyses played a critical role in evaluating the agreement between the ideal contour and the experimental and simulation results. While Pearson correlation analysis determined the linear relationship between the results, RMSE analysis quantitatively measured the accuracy of this relationship [
18].
To align the signal output from the speaker with the ideal contour given by the equal-loudness principle, the optimization of filter parameters was performed. Metaheuristic algorithms were preferred for the optimization. Metaheuristic algorithms are methods capable of conducting effective and efficient searches over large solution spaces and are not specific to a particular problem. In this study, the genetic algorithm (GA), one of the metaheuristic algorithms, was chosen.
The GA is an optimization technique based on the principles of biological evolution and operates on solution sets called chromosomes. Initially, an initial population consisting of random solutions is created. Then, the fitness of each solution is evaluated according to a specific objective. Solutions with higher fitness values are selected to have greater representation in subsequent generations. Crossover operations are performed among the selected solutions to create new solutions (offspring), and small random changes (mutations) are applied to the solutions obtained from the crossover. Finally, the population is updated with the new solutions, and the process is repeated until a certain fitness value is achieved, or a specified number of iterations is reached [
19].
Genetic algorithms have the capability to operate effectively in large and complex solution spaces. In terms of flexibility and adaptability, genetic algorithms can be tailored to various optimization problems and provide flexibility for different objectives [
20]. Given the numerous frequency and parameter combinations in vehicle audio systems, genetic algorithms are ideal for addressing such problems.
In this study, a GA was implemented using the “optimoptions” and “ga” functions from Matlab’s Global Optimization Toolbox. The GA was configured with a population size of 100 individuals. Thus, 100 different solutions were evaluated in each generation, and it allowed for a more comprehensive exploration of the solution space. The algorithm was allowed to run for a maximum of 50 generations, ensuring that the algorithm had sufficient time to search for an optimal solution, while keeping the computation time under control.
The success of the GA depends on the careful selection of various parameters. In this study, the crossover rate was set to 0.8. This rate helps maintain genetic diversity while allowing successful traits to be passed on to the next generation. Additionally, uniform mutation was chosen as the mutation function, and the mutation rate was set to 0.2. This enables the algorithm to avoid local minima and explore a broader solution space.
The selection function used was stochastic uniform selection. This method increases the probability of selecting individuals based on their fitness values, allowing better-performing solutions to be more represented in the next generation. Arithmetic crossover was used as the crossover function. This method combines the genetic information of the parents to create new individuals, facilitating the discovery of better solutions.
In the application of the GA, the fitness function was determined by comparing the speaker output with the ideal contour. The fitness function was calculated using both the correlation and the RMSE values. Genetic algorithms inherently strive to minimize the fitness function. Therefore, if the correlation value was positive, the fitness function was calculated as the inverse of the correlation (); if the correlation was negative, a penalty value was added. This approach ensures that the genetic algorithm works to increase the correlation, thereby decreasing the fitness value as the correlation becomes more positive. Additionally, the RMSE value was incorporated into the fitness function, ensuring that a smaller RMSE results in more similar outcomes. This method has helped the speaker output to become closer to the ideal contour, thereby enhancing the performance of in-vehicle audio systems.
The genotype representation is achieved by encoding the speaker filter parameters as a series. Each genotype is a vector containing the values of center frequency, gain, and Q factor. The genotype is coded to include three parameters for each filter, as shown in (5).
The genotypes used in this study were determined within the lower and upper bounds, specified by the ISO 226:2003 standards and listed in
Table 1. These bounds aim to bring the filter parameters closer to the ideal contour. The lower bounds (
) and upper bounds (
) for the first to the tenth filter are provided in
Table 3.
The computational complexity of the genetic algorithm depends on the population size and the number of generations. The parameters used in this study were carefully selected to maintain control over the computation time while ensuring sufficient diversity and performance. These parameters maximize the flexibility and adaptability of the GA, creating an ideal environment for optimizing the frequency response of in-vehicle audio systems. Consequently, a higher quality and more comfortable audio experience is achieved for both drivers and passengers.