1. Introduction
The noise inside car cabins is an important factor affecting the perception and comfort of drivers and passengers [1]. Its role in the overall appreciation of cars is expected to become even more important due to the increasing popularity of electric vehicles (EVs) and hybrid-electric vehicles (HEVs). According to a recent International Energy Agency report [2], in 2022, the total number of electric cars on the world’s roads reached 26 million, up 60% relative to 2021. The change in power systems, from internal combustion engines to electric motors, has made car interiors quieter, and operating noises other than powertrain noise have become more noticeable. Among them, heating, ventilation, and air-conditioning (HVAC) systems operate constantly and are thought to have a significant impact on passenger comfort, as the noise level in these vehicles when the HVAC system is off is significantly lower than in internal combustion engine vehicles (ICEVs). Masullo et al. [3] showed that, with electric-driven engines, the interior noise was reduced dramatically, by up to 9 dB. The most significant differences occurred at the lowest (25% and 50%) airflow rates (AFR), where the noise of the internal combustion engine in idle mode masked the air-conditioning noise, and the difference in annoyance (i.e., psychoacoustic annoyance [4] or subjective annoyance [1]) between the HVAC on and off conditions increased considerably in HEVs compared to ICEVs.
The literature is rich in studies that investigated how to reduce HVAC noise levels using more or less conventional methods. Some of them focused their efforts on the re-design of blowers and blades [5,6] or on the zones of flow separations with high turbulent kinetic energy [7]. Bennouna et al. [8] conducted an experimental investigation using two strategies: passive absorbers on the HVAC body and ducts, or HVAC integration into the vehicle. Their findings showed that passive absorbers induce low aeraulic noise reduction, whereas HVAC relocation induces significant noise reduction along both structure radiation and aeraulic paths. Allam and Åbom [9] investigated using splitters made of microperforated plates in cooling-fan inlet/outlet applications, suggesting a locally reacting design. Arenas and Crocker [10] compared old and new sound-absorbing materials in terms of acoustics and other characteristics (i.e., safety, weight, and technological optimization), predicting a rapid expansion in the latter over the next few years. Singh and Mohanty [11] proposed using jute felt and waste cotton as low-cost, lightweight, biodegradable, and recyclable natural materials with high potential for HVAC noise control. Other authors proposed solutions integrating active noise-control systems [12,13].
On the other hand, a limited number of scientific papers have investigated the sound quality of vehicles’ HVAC systems. Yoon et al. [14] used a neural network to predict positive sound evaluations. Nakasaki et al. [15] evaluated the subjective impressions of air-conditioning sounds onboard vehicles in terms of loudness and sharpness, deriving two main factors: volume and thermal factors. Wagner et al. [16] associated sharpness with the participants’ impression of hearing an air conditioner that cools or heats: the less pronounced the sharpness, the more the sound resembles that of a unit that is heating. In line with the review work of Ma et al. [17] on the perceptual dimensions of sound, where the authors carried out a meta-analysis of semantic differential method applications to indoor and outdoor sounds, recent research investigating the perception of sound in car cabins [18,19,20] has highlighted two perceptual dimensions, the aesthetics (or quality) (Factor 1) and the loudness (or powerfulness) of sound (Factor 2), that describe well the subjective impressions of individuals inside car cabins in the presence of HVAC noise. The results of these studies showed that the distribution of factor scores changed as the airflow rate changed, particularly for HEVs compared to ICEVs. The findings suggested that as the airflow rate decreases, the impression of the quality and quietness of sound increases.
In contrast, as shown in Figure 1 [21], the factor scores for the same airflow rate tend to be distributed close to each other, giving similar subjective impressions. The differences between the two types of engines were mainly observed for low airflow rates (e.g., 25%), where HEVs were rated as being of higher quality, more comfortable, and quieter than ICEVs. Moreover, as the airflow rate increases, the difference becomes smaller, resulting in a negative impression, such as being noisy or unpleasant. The conclusions of all these works [18,19,20,21] suggested that the evaluation of HVAC noise inside vehicles was affected by its balance with the powertrain noise. Masullo, Yamauchi et al. [22] also analyzed the differences in perception of car cabin noise between participants from different cultures, revealing that while Japanese participants varied their quality ratings depending on changes in airflow rate, the Italian group was less sensitive to this variation. For both groups, the results were evident mainly in the case of HEVs.
Whereas previous experiments and analyses were only focused on the combination of HVACs and engine noise, more realistic and complex operative conditions should be considered to understand how the overall sound quality perception in the presence of HVAC noise can be improved.
From a sound-design perspective, Haverkamp [23] states that the car interior should be a calm auditory environment with soft sounds that do not attract much attention but contribute to a lively and comfortable ambience. To this end, vehicle infotainment systems that already perform several audio processing tasks [24] (i.e., voice recognition, equalization, and spatial rendering) could be used for a secondary goal, i.e., making the most of the mechanism of informational masking provided by played-back audio cues. In this view, music and broadcast cues can potentially affect the attention and well-being of individuals [25], fostering top-down rather than bottom-up attentional mechanisms [26].
Unlike energetic masking, which is caused by competition between a masker and a masked sound due to overlapping excitation patterns in the peripheral parts of the auditory system [27], informational masking can reduce the capability to segregate a target sound from competing sounds [28,29]. The use of informational masking has recently been extended from speech to other sectors to modulate individuals’ attentional processes. As described by Posner [30], two different attentional orienting mechanisms exist: an endogenous mechanism able to elicit top-down control and an exogenous bottom-up mechanism, which is a primitive response to an external cue. In the working sector, informational masking was proposed to improve performance and satisfaction in open-plan offices [31]. In environmental acoustics, cognitive efforts are minimized by activating voluntary attention (bringing distractions under control through inhibition) and fostering automatic, involuntary attention.
To this end, several authors proposed the use of augmented positive sounds, such as water or bird sounds [32,33,34], demonstrating their effectiveness in terms of the human brain’s responses both in the laboratory [35] and in situ [36].
In this paper, we intend to understand whether the perceived sound quality of a car containing HVAC noise is affected by infotainment-system audio (ISA) cues.
To this end, we conducted laboratory experiments in which music and radio broadcasts of musical and verbal audio were presented simultaneously with air-conditioning noise as competing attentional targets. The research questions (Qs) and related hypotheses (Hs) are listed below:
Q1: Can different ISA cues moderate the perceived sound quality in an electric-vehicle car cabin containing HVAC noise?
H1: Informational masking provided by ISA cues affects the sound quality in electric-vehicle car cabins when the HVAC system is activated.
Q2: How does sound quality perception change using ISA cues at different AFR levels of an HVAC system?
H2: When HVAC systems are activated in electric-vehicle car cabins, ISA cues moderate the perceived sound quality differently at different AFR levels.
Q3: How does sound quality perception change between Japanese (JPN) and Italian (ITA) groups using ISA cues?
H3: When HVAC systems are activated in electric-vehicle car cabins, ISA cues moderate the perceived sound quality differently in the Japanese (JPN) and Italian (ITA) groups.
Q4: What is the optimal SNR between ISA cues and HVAC noise to improve the perceived sound quality in an electric-vehicle car cabin?
H4: When an HVAC system is activated in an electric-vehicle car cabin, the SNR between ISA cues and HVAC noise moderates sound quality perception.
2. Methodology
A within–between experimental design was prepared to investigate the study hypotheses. The experiments took place in parallel in the laboratories of the Department of Architecture and Industrial Design of the Università degli Studi della Campania “Luigi Vanvitelli” (Sens i-Lab) and of the Department of Acoustic Design of Kyushu University (Yamauchi Lab), involving Italian and Japanese samples of participants. The experiments combined audio stimuli of HVAC noise at several AFR levels with broadcast radio recordings, a famous song sung in Japanese and in Italian, and an instrumental version of the same song. All the stimuli were presented at different SNRs. Two preliminary experiments were conducted to choose the duration of the audio stimuli (stimulation length) and the signal-to-noise ratio levels. All auditory stimuli were played back through HD 650 headphones (Sennheiser, Wedemark, Germany). In Sens i-Lab, the playback sound levels of the headphones were calibrated using an HSU III.2 artificial head coupled with a SQobold 4-channel system (Head Acoustics, Herzogenrath, Germany), while in the Yamauchi Lab, the levels were calibrated using an artificial ear Type 4153 (Brüel & Kjær, Darmstadt, Germany) and a sound-level meter Type 2250 (Brüel & Kjær).
2.1. Preliminary Setup n. 1: Determination of Stimulation Length
In a previous experiment on stationary noises combining internal combustion and hybrid-electric-vehicle noise with HVAC noise, the stimulus length was 5 s [3]. However, when conducting experiments on informational masking using infotainment-system audio cues, it is necessary to ensure that the listening duration is sufficient for the conveyed information to be understood entirely [37,38,39]. On the other hand, it is also necessary to ensure that the length of the entire experiment does not place an excessive burden on the participants. We conducted a preliminary test involving two students at a soundproof laboratory in Ohashi Campus, Kyushu University, to determine an appropriate stimulus length for the main experiments.
Stimuli: Air-conditioning noise was recorded inside an electric car in a steady condition, while broadcast audio was synthesized and superimposed on the air-conditioning noise as an auditory stimulus for the infotainment system. There were six different stimulus lengths: 5, 10, 15, 20, 25 and 30 s. Considering the intention to perform a main cross-cultural experiment, the audio broadcast (a weather forecast) was prepared in English and then translated into Japanese. Google Text-to-Speech [40] was used to convert the text into speech. The AFR of the HVAC system was set at 25, 75 and 100%, as in previous experiments [3,22]. The SNR of the audio broadcast and HVAC noise was calculated as indicated in Equation (1) and adjusted to either 0 or −9 dB. Combining 6 stimulus lengths × 3 AFRs × 2 SNRs, 36 different conditions were simulated in total.
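Equation (1) is not reproduced in this section. Section 2.2 describes the SNR as the difference in equivalent sound level between the broadcast and the HVAC noise, so the level adjustment could be sketched as follows. This is a minimal illustration under that assumption: the function names are hypothetical, the signals are synthetic sine tones standing in for the recordings, and frequency weighting and headphone calibration are left out.

```python
import math

def leq_db(samples, ref=1.0):
    """Equivalent level in dB: 10*log10 of the mean square relative to ref**2."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square / (ref * ref))

def snr_db(broadcast, hvac):
    """SNR assumed to be Leq(broadcast) - Leq(HVAC noise), in dB."""
    return leq_db(broadcast) - leq_db(hvac)

def scale_to_snr(broadcast, hvac, target_snr_db):
    """Scale the broadcast so that its Leq sits target_snr_db above the noise Leq."""
    gain_db = target_snr_db - snr_db(broadcast, hvac)
    gain = 10.0 ** (gain_db / 20.0)
    return [gain * x for x in broadcast]

# Synthetic stand-ins for the recordings (assumption: 1 s at 8 kHz).
fs = 8000
hvac = [0.5 * math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
broadcast = [0.2 * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

adjusted = scale_to_snr(broadcast, hvac, -9.0)
print(round(snr_db(adjusted, hvac), 2))
```

Scaling only the broadcast leaves the HVAC recording, and therefore the AFR condition, untouched.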
Method: All the stimuli were calibrated and presented using the abovementioned headphones. Participants rated the stimuli using 14 different 7-point scales. The 36 stimuli were divided into blocks of 6 stimuli of the same length, which were played in random order. Each block was evaluated twice, so that each participant completed 12 blocks.
Results: The time required to evaluate all stimuli within one block was recorded. For one of the two participants, the longer the stimulus, the shorter the time needed to complete a block. The other participant showed a similar tendency, although the changes were minimal. In the introspection reports, one participant commented that “If the length is short, the stimulus will have to be played repeatedly, making it more difficult to evaluate”. It is therefore conceivable that the shorter the stimulus, the greater the workload on the participants. Consequently, we decided that the length of the stimuli to be used in the main experiment would be 30 s, allowing participants to understand the linguistic information better and reducing their burden.
2.2. Preliminary Setup n. 2: Determination of Signal-to-Noise Ratio Steps
A further preliminary test was also conducted to determine the signal-to-noise ratios for the main experiment.
Stimuli: As in preliminary experiment n. 1, a synthesized Japanese speech broadcast was combined with the HVAC noise at four different AFR levels (25%, 50%, 75% and 100%), so that, over cues of 30 s, the differences in equivalent sound level between the broadcast and the HVAC noise (Equation (1)) were 0.0, −3.0, −4.5, −6.0, −9.0 and −12.0 dB. In total, 4 AFR × 6 SNR = 24 combinations of stimuli were prepared.
Method: The experiment was conducted in a soundproof laboratory in Ohashi Campus, Kyushu University, with the participation of 3 students. All the stimuli were calibrated and presented using the abovementioned headphones. While listening to these stimuli, participants were asked to evaluate the loudness of the broadcast compared to the noise using a 7-point Likert scale ranging from “very loud” to “very small”. Each of the three participants rated each stimulus twice.
Results: For each participant and stimulus combination, the average loudness rating of the broadcast audio was obtained. The number of times the rating difference between SNR conditions was zero under the same airflow conditions was then counted. The loudness of the broadcast sound was most often judged to be the same between the SNR values of −3.0 and −4.5 dB (n = 6), followed by the values between −4.5 and −6.0 dB (n = 2) and between −6.0 and −9.0 dB (n = 1). These results suggested that when the difference between SNR conditions was 3.0 dB or less, it became difficult to perceive any difference in the broadcast audio.
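The counting step described above can be sketched as follows. The ratings are invented for illustration and are not the study's data. The dictionary layout, the function name, and the comparison of adjacent SNR levels only are assumptions.

```python
SNR_LEVELS = [0.0, -3.0, -4.5, -6.0, -9.0, -12.0]  # dB, as in the stimuli

def count_zero_differences(mean_ratings):
    """Count identical mean ratings between adjacent SNR levels.

    mean_ratings maps (participant, afr) to a list of mean loudness ratings,
    one per entry of SNR_LEVELS. The result maps each pair of adjacent SNR
    levels to the number of times the two mean ratings were equal.
    """
    pairs = list(zip(SNR_LEVELS, SNR_LEVELS[1:]))
    counts = {pair: 0 for pair in pairs}
    for ratings in mean_ratings.values():
        for i, pair in enumerate(pairs):
            if ratings[i] == ratings[i + 1]:
                counts[pair] += 1
    return counts

# Hypothetical mean ratings (7-point scale, two ratings averaged per stimulus).
example = {
    ("P1", 25): [6.0, 5.0, 5.0, 4.0, 3.0, 2.0],
    ("P1", 50): [6.0, 5.5, 5.5, 5.5, 3.0, 2.0],
    ("P2", 25): [5.0, 4.0, 4.0, 3.0, 3.0, 1.5],
}
counts = count_zero_differences(example)
for pair, n in counts.items():
    print(pair, n)
```

SNR pairs with many zero differences mark steps that listeners could not tell apart, which is the kind of evidence a minimum step size can be based on.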
3. Main Experiment
Based on the two preliminary experiments, a main experiment was conducted to examine the effects of infotainment audio on perceived sound quality when an air-conditioning system is functioning. The experiment was conducted both in Japan, in the Yamauchi Lab in the Department of Acoustic Design at Kyushu University, and in Italy, in the Sens i-Lab in the Department of Architecture and Industrial Design of the Università degli Studi della Campania “Luigi Vanvitelli” (Figure 2).
3.1. Materials and Methods
Besides the air-conditioning noises and the infotainment audio used in the preliminary experiments, combinations of both with different signal-to-noise ratios were prepared. Details of the audio stimulus conditions are described below.
HVAC noise: Audio recordings of the air-conditioning system of an electric-vehicle model were taken from a previous study [3]. In these recordings, the air-conditioning system was functioning at three of the most frequently used airflow rates: 25%, 50% and 75% (hereafter referred to as AFR1, AFR2 and AFR3, respectively).
ISA cues: These consisted of a Japanese (JPN_BR) and an Italian (ITA_BR) version of an audio broadcast created using Google Text-to-Speech and 3 audio tracks which were extracted from the song “Let it Go”, used during the end credits of the movie Frozen: Japanese (JPN_SO)- and Italian (ITA_SO)-lyric versions and a karaoke (INST) version. These were played back using the car audio system of an internal combustion engine vehicle, a C-segment passenger car, while its engine was stopped, and recorded using a head-and-torso simulator Type 4100 (Brüel & Kjær) positioned in the passenger seat.
SNR: Considering the results of the preliminary experiments, the SNR was set in 4 dB steps, and three types were used: −4.0, −8.0 and −12.0 dB.
Combining the 3 HVAC conditions × 5 types of ISA cues × 3 types of SNR generated 45 sound stimuli. Three more conditions with only air-conditioning noise and no masking audio (CTRL) were added. In total, 48 stimuli were used. Representative chunks of 30 s were extracted from each recording and analyzed using the software Artemis Suite 14.3 (Head Acoustics) (see Table 1).
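As a quick check of the stimulus count, the full set of conditions can be enumerated. This is a minimal sketch: the condition labels are taken from the text, while the dictionary layout is an assumption made for illustration.

```python
from itertools import product

AFRS = ["AFR1", "AFR2", "AFR3"]                        # 25%, 50% and 75% airflow rate
ISA_CUES = ["JPN_BR", "ITA_BR", "JPN_SO", "ITA_SO", "INST"]
SNRS_DB = [-4.0, -8.0, -12.0]

# Every combination of HVAC condition, ISA cue and SNR.
masked = [
    {"afr": afr, "isa": cue, "snr_db": snr}
    for afr, cue, snr in product(AFRS, ISA_CUES, SNRS_DB)
]
# Control stimuli: HVAC noise only, one per airflow rate.
control = [{"afr": afr, "isa": "CTRL", "snr_db": None} for afr in AFRS]

stimuli = masked + control
print(len(masked), len(control), len(stimuli))  # 45 3 48
```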
After the calibration, the stimuli were played back through HD 650 (Sennheiser) headphones. Figure 3 shows the octave-band A-weighted equivalent sound levels of the stimuli at different AFRs and of the background noise in the two labs. The spectra represent the experimental listening conditions, that is, when wearing headphones.
The subjective impressions of different sound aspects (quality, pleasantness, power, and spatio-temporal and spectral structure) were evaluated using a 7-point semantic differential scale of 14 adjective pairs translated from English into Japanese and Italian [21] (see Table 2).
3.2. Participants
Twenty-three Japanese (twelve males and eleven females; M = 23.0; SD = 2.7) and twenty Italian participants (eleven males and nine females; M = 28.8; SD = 7.6) participated in the study. All of them participated voluntarily, without receiving any compensation for doing so.
3.3. Procedure
Participants were seated in front of a laptop and wore headphones in the center of the room. First, each participant was administered the Weinstein noise sensitivity scale [41,42] to test their noise sensitivity. After that, the test procedure was explained in detail to each participant: they were asked to rate their impression of each of the 48 presented stimuli on a 7-point semantic differential scale of adjective pairs (see Table 3).
A laptop was used to play the audio and collect the semantic differential ratings. Participants listened to and rated each stimulus using the 14 adjective pairs before moving on to the next stimulus. They could listen to each stimulus as many times as they wanted. Participants were informed that they could take a break whenever they needed to during the experiment. Halfway through the experiment, the interface displayed a message suggesting that participants take a break in any case. The average session duration was 45 min.
3.4. Statistical Analyses
To confirm what emerged from previous research about the main dimensions characterizing onboard sound quality perception when an HVAC system is activated, a Principal Component Analysis (PCA) was initially conducted.
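The idea behind this step can be illustrated with a small sketch that extracts the first principal component from the correlation matrix of rating scales by power iteration. Everything here is an assumption made for illustration: the software and PCA options used in the study are not stated in this section, the ratings are synthetic, and six hypothetical scales stand in for the 14 adjective pairs.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iterations=200):
    """Leading eigenvector (unit length) and eigenvalue via power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

def to_rating(x):
    """Map a latent value onto a 7-point scale (1..7)."""
    return min(7, max(1, round(4 + 1.5 * x)))

# Synthetic ratings: 4 hypothetical "aesthetic" scales driven by one latent
# factor and 2 hypothetical "loudness" scales driven by another.
random.seed(0)
rows = []
for _ in range(200):
    f1, f2 = random.gauss(0, 1), random.gauss(0, 1)
    row = [to_rating(f1 + 0.5 * random.gauss(0, 1)) for _ in range(4)]
    row += [to_rating(f2 + 0.5 * random.gauss(0, 1)) for _ in range(2)]
    rows.append(row)

corr = correlation_matrix(rows)
weights, eigenvalue = first_component(corr)
explained = eigenvalue / len(corr)  # share of total standardized variance
print(f"first component explains {explained:.0%} of the variance")
```

In this toy data the first component is dominated by the four scales that share a latent factor, which is the pattern a perceptual dimension such as aesthetics would produce across its adjective pairs.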
The main and interactive effects of the type of ISA, AFR, and SNR and of group were analyzed with respect to the two main perceptual dimensions (aesthetics and loudness) and each single sound quality attribute by means of the following:
- Mixed-factorial 6 × 3 × 2 ANOVAs that treated the type of ISA as a 6-level within-subject factor (CTRL, INST, ITA_BR, ITA_SO, JPN_BR, and JPN_SO), the AFR as a 3-level within-subject factor (AFR1, AFR2, and AFR3), and the cultural group as a 2-level between-subject factor (ITA and JPN);
- Mixed-factorial 4 × 2 ANOVAs that treated the SNR as a 4-level within-subject factor (CTRL, −12, −8 and −4) and the cultural group as a 2-level between-subject factor.
These latter analyses were conducted to understand whether the informational masking provided by ISA cues could moderate the perception of sound quality in car cabins and to answer the four research questions.
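To make the structure of the 4 × 2 analysis concrete, the sketch below computes the F ratios of a mixed design with one within-subject factor (SNR) and one between-subject factor (group). It is a simplified illustration and not the authors' analysis: it assumes a balanced design with the same number of participants per group (the study had 23 and 20), it omits sphericity corrections and p-values, and the ratings are hypothetical.

```python
def mixed_anova(data):
    """F ratios for a balanced one-between, one-within design.

    data[g][s][t] is the rating of subject s in group g at within-level t.
    """
    a, n, b = len(data), len(data[0]), len(data[0][0])
    grand = sum(x for g in data for s in g for x in s) / (a * n * b)
    group_means = [sum(x for s in g for x in s) / (n * b) for g in data]
    level_means = [sum(s[t] for g in data for s in g) / (a * n) for t in range(b)]
    cell_means = [[sum(s[t] for s in g) / n for t in range(b)] for g in data]
    subj_means = [[sum(s) / b for s in g] for g in data]

    ss_total = sum((x - grand) ** 2 for g in data for s in g for x in s)
    ss_group = n * b * sum((m - grand) ** 2 for m in group_means)
    ss_subj = b * sum((subj_means[g][s] - group_means[g]) ** 2
                      for g in range(a) for s in range(n))
    ss_within = a * n * sum((m - grand) ** 2 for m in level_means)
    ss_inter = n * sum(
        (cell_means[g][t] - group_means[g] - level_means[t] + grand) ** 2
        for g in range(a) for t in range(b))
    ss_error = ss_total - ss_group - ss_subj - ss_within - ss_inter

    df = {"group": a - 1, "subjects": a * (n - 1), "within": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * (n - 1) * (b - 1)}
    ms_subj = ss_subj / df["subjects"]   # error term for the between factor
    ms_error = ss_error / df["error"]    # error term for within effects
    return {
        "F_group": (ss_group / df["group"]) / ms_subj,
        "F_within": (ss_within / df["within"]) / ms_error,
        "F_interaction": (ss_inter / df["interaction"]) / ms_error,
        "df": df,
    }

# Hypothetical ratings; columns are the SNR levels CTRL, -12, -8 and -4 dB.
ita = [[3, 3, 4, 5], [2, 3, 3, 5], [4, 4, 5, 6]]
jpn = [[3, 4, 4, 6], [2, 2, 3, 4], [3, 3, 5, 5]]
result = mixed_anova([ita, jpn])
print(result["df"])
print({k: round(v, 2) for k, v in result.items() if k != "df"})
```

The between-subject effect is tested against the variability of subjects within groups, while the SNR effect and the SNR × group interaction are tested against the subject × SNR residual.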
5. Discussion
The PCA results confirmed what was already found in previous research [21], i.e., inside electric-vehicle car cabins, the most important perceived sound quality dimensions are related to the aesthetic and loudness characteristics. These results are also in line with previous findings of other authors; see Kuwano et al. [43]. However, unlike earlier research, a new and exogenous sound source in the car cabin introduced a further dimension associated with the playback of the ISA cues. Although this dimension deserves interest for better understanding how the car audio system configuration influences perception inside car cabins, it was not addressed in this study.
Considering the four research questions, it can be observed that the informational masking provided by the introduction of ISA cues can moderate the perceived sound quality in electric-vehicle cabins containing HVAC noise. Concerning the first research question, Q1, it can be observed that the introduction of ISA cues can be useful in improving the aesthetic dimension of the sound quality onboard and making it more functional, natural, and pleasant. On the other hand, it also changes the perception onboard from deep to sharp, providing a better sense of quietness, though this was only found for the Italian group. Among the different types of cues, the instrumental one improved the aesthetic ratings. Although obtained in a different context, i.e., car cabins, the findings are in line with previous research which demonstrated how the use of additional natural or augmented natural sounds [36,44,45,46] or music [45] can be considered an effective and soft mitigating strategy to compensate for negative perception of the sound environment in everyday life.
The results for the second research question, Q2, showed that the effect of ISA cues did not interact with the AFR. However, the main effect of AFR on the Japanese group highlighted a dramatic drop in the aesthetic rating caused by a deeper unpleasant impression. This outcome was accompanied by a more rapid increase in the loudness perception at AFR3. This result showed once more that, due to negative perceptions, the Japanese group is sensitive to high AFR levels of HVAC systems.
Considering the third research question, Q3, although the effect of ISA cues did not interact with the Group factor for the two main sound quality dimensions, it can be observed that some differences emerged for specific attributes of the aesthetic and loudness dimensions.
The results for the last research question, Q4, show that the SNR between ISA cues and HVAC noise is important in improving sound quality perception in an electric-vehicle car cabin. Such improvement becomes effective only at an SNR of −4 dB. This condition fosters an aesthetic improvement of the existing sound quality onboard independently of the AFR level, the type of ISA cue, and the cultural group.