Assessing Acoustic Conditions in Hybrid Classrooms for Chinese Speech Intelligibility at the Remote End

Li, Qian; Li, Nan; Wang, Yan; Li, Zheng; Tian, Mengyun; Zhang, Yihan

doi:10.3390/buildings15111909

by Qian Li ¹,²,*, Nan Li ³, Yan Wang ⁴, Zheng Li ¹, Mengyun Tian ¹ and Yihan Zhang ¹

¹ School of Civil Engineering and Mechanics, Yanshan University, Qinhuangdao 066000, China
² Hebei Key Laboratory of Green Construction and Intelligent Maintenance for Civil Engineering, Yanshan University, Qinhuangdao 066000, China
³ State Key Laboratory of Subtropical Building and Urban Science, School of Architecture, South China University of Technology, Guangzhou 510641, China
⁴ School of Architecture and Urban Planning, Guizhou University, Guiyang 550025, China
* Author to whom correspondence should be addressed.

Buildings 2025, 15(11), 1909; https://doi.org/10.3390/buildings15111909

Submission received: 28 April 2025 / Revised: 21 May 2025 / Accepted: 27 May 2025 / Published: 1 June 2025

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

Abstract

Blended Synchronous Learning helps teachers and students communicate without geographical restrictions. The quality of communication between the face-to-face end and the remote end is affected not only by the performance of the equipment but also by the acoustic conditions in the classroom. This paper measured the acoustic parameters in hybrid classrooms and conducted subjective speech intelligibility tests. For the hybrid classrooms with a decentralized sound reinforcement system, the background noise level was high because a large amount of equipment was needed for synchronous learning. The speech intelligibility scores at the remote end were lower than those at the face-to-face end. Reverberation time (RT) and an excessively high signal-to-noise ratio (SNR) were negatively correlated with speech intelligibility scores at the remote end. It is recommended that the sound pressure level (SPL) of the sound reinforcement system not be too high and that appropriate sound absorption treatment be applied. The size of the hybrid classroom should be controlled so that strong sound arriving more than 50 ms after the direct sound does not reach the microphone that picks up sound for the remote end. For hybrid classrooms with an SNR of 33 dB(A), at which speech intelligibility scores at the face-to-face end were good, T₂₀ should be within 0.8 s to achieve the target value of 83% for speech intelligibility (SI) scores at the remote end.

Keywords: blended synchronous learning; hybrid classroom; speech intelligibility; acoustic conditions

1. Introduction

Blended Synchronous Learning is defined as “Learning and teaching where remote students participate in face-to-face classes by means of rich-media synchronous technologies such as video conferencing, web conferencing, or virtual worlds” [1], also defined as “integration of physical classroom and cyber classroom settings using synchronous learning to enable unlimited connectivity for teachers and students from any part of the world” [2]. This model is also called synchronous learning [3], synchromodal learning [4], HyFlex course [5], synchronous hybrid learning [6], hybrid synchronous instruction [7], and Here-or-There (HOT) instruction [8]. In summary, no matter what it is called, this hybrid teaching model uses certain technical means to allow remote students (or teachers) and students (or teachers) in the classroom (face-to-face) to attend classes simultaneously. In this paper, we chose “Blended Synchronous Learning” to describe this model and referred to the classrooms as “hybrid classrooms”.

The existing literature mainly focuses on analyzing the advantages and disadvantages of this mode and the technical problems encountered but pays less attention to the impact of the acoustic environment.

As for the advantages, this model allowed remote participants to experience an instructor’s lesson, ask and answer questions, offer comments in class, and generally engage “in a similar manner to on-campus students” [9]. It stimulated innovative questions and conversations [10], provided greater flexibility in how students choose to attend classes [1,10], and offered more equitable learning experiences for students who were geographically isolated or could not physically attend classes for various reasons [11,12]. This model was therefore widely used during the outbreak of COVID-19, which accelerated its popularity on campus. However, many classrooms only introduced the equipment system to implement Blended Synchronous Learning rapidly, without special acoustic design or renovation. Afterwards, owing to economic and other factors, the classrooms were not acoustically updated.

As for the shortcomings, online students were more easily distracted [13]. Technical issues, such as internet outages, power failures [14], and poor audio quality [4,15,16], disrupted the teaching process. Bower and co-authors identified sound quality as the most critical factor for this mode [17], but they focused on the equipment.

In traditional face-to-face classrooms, bad acoustic conditions decrease the quality of speech communication, reducing the school performance of students and causing the teachers to suffer from fatigue [18]. According to the ISO 9921:2003 standard [19], the quality of speech communication can be expressed in terms of speech intelligibility. Acoustic parameters, such as background noise level (BNL), reverberation time (RT), speech transmission index (STI), signal-to-noise ratio (SNR), and Clarity (C₅₀), were confirmed to have an impact on speech intelligibility [20,21,22,23,24,25,26,27,28]. The STI, which is recognized as an objective parameter of speech intelligibility by IEC 60268-16 (5th edition), has proven effective in evaluating and predicting speech intelligibility [29]. However, traditional classroom acoustic research focused on the face-to-face end and did not consider the remote end.

For sound quality in Blended Synchronous Learning, equipment indeed plays a very important role. Do acoustic conditions also affect speech intelligibility, as they do in traditional classrooms? Recent research showed that hybrid classrooms with poor acoustic conditions decreased students’ ability to comprehend the subject matter because they could not hear the lecture [30]. Elmehdi, H. and Tato, A. [31] measured BNL, RT, C₅₀, and strength (G) in six hybrid classrooms and found that most classrooms failed to meet international standards. Moreover, 88% of surveyed students reported that noise impaired their comprehension of course material and hindered communication with instructors and peers. However, they did not conduct a speech intelligibility test to quantify listening accuracy at both ends. In addition, L. Galbrun and K. Kitapci showed that, in traditional classrooms, the impact of room acoustic conditions on speech intelligibility differed among four languages (English, Polish, Arabic, and Chinese Mandarin) [32]. Therefore, speech intelligibility should be studied for each language.

More cases need to be investigated to clarify the impact of hybrid classroom acoustic conditions on the quality of speech communication of the Blended Synchronous Learning model and answer the following questions:

In different acoustic conditions, do face-to-face users and remote users acquire the same speech intelligibility in Blended Synchronous Learning?
How does the acoustic environment affect the listening experience at the remote end?

To answer these questions, we measured the room-acoustic parameters and SNR and conducted a Chinese Mandarin speech intelligibility test in five hybrid classrooms of different sizes, all with the same hybrid learning equipment system, on a Chinese university campus. Section 2 (Materials and Methods) describes the five selected hybrid classrooms and their equipment for synchronous learning in detail, introduces the measurement system and methods for the acoustic parameters, and explains the design and procedure of the subjective speech intelligibility tests. Section 3 presents the results for BNL and the room-acoustic parameters, the SI scores, and the relationships between them. Section 4 discusses why the relationship between acoustic parameters and SI scores in hybrid classrooms differed from that in traditional classrooms and other spaces with sound reinforcement systems, gives a recommended classroom size and value ranges for the acoustic parameters that affected SI scores, and explains the limitations of the research and future work. Section 5 summarizes the work.

2. Materials and Methods

2.1. Room Description

The research involved five hybrid classrooms located on the middle floors of an office and teaching complex at a university in China. The classrooms face the core area of the university and are not adjacent to a city road. Only a few vehicles pass on the road on this side of the building; hence, the outdoor environment is relatively quiet. The exterior walls and the interior walls separating the classrooms from the corridor were made of 200 mm aerated concrete blocks coated with latex paint. The other walls were made of 120 mm light steel keel covered by double-layer gypsum board and filled with 50 mm rock wool. The ceiling was a metal perforated plate, and the floor was PVC on concrete. Classrooms 1# and 3# did not have windows. The other classrooms had windows of double tempered glass (12 mm + 6 mm air + 12 mm) with very light Venetian blinds. Table 1 lists the geometry, dimensions (width × length × height, W × L × H), floor areas, volumes, and capacities of the rooms.

Each classroom had a static omnidirectional microphone on the podium, which picked up the teacher’s voice for remote students, and two movable handheld omnidirectional wireless microphones, one of which picked up the teacher’s voice for face-to-face students while the other picked up the voices of the face-to-face students. All of them had good sound pickup performance. The static omnidirectional microphone featured 256 ms echo cancellation, intelligent dynamic noise reduction, a full-duplex 360° pickup range, and a pickup radius of 3–6 m. The two movable wireless microphones had very low handling noise, excellent vocal quality, and a windscreen to prevent breath pops. Different numbers of 6.5-inch round speakers were installed in the ceiling according to the room area (shown in Figure 1). A camera in the corner of the classroom captured the picture of the classroom. A power conditioner for transferring audio to the speakers, a receiver for the wireless microphone system, and an HDCP-compliant scaling presentation switcher were housed in a service box (0.6 m in width and length and 0.95 m in height) next to the podium table. The remote and face-to-face users were connected by Tencent Meeting. The classrooms had multiple-mode HVAC systems.

2.2. On-Site Acoustic Measurement Procedures

The acoustic parameters were measured in unoccupied classrooms with doors and windows closed, at times when there were no classes in the surrounding classrooms.

2.2.1. BNL and SNR

A B&K type 2270 class 1 sound level meter was positioned at a height of 1.2 m above the floor, corresponding to the level of the students’ ears. The A-weighted equivalent sound level (L_Aeq) was recorded in one-third octave bands from 63 Hz to 8 kHz for 1 min at each of various locations. A B&K 4231 calibrator was used for calibration before measurement.

The BNL measurements for all rooms fell into two categories. The first category was measured in the face-to-face students’ area under five conditions, and for each condition the results of all positions were averaged: (i) no equipment; (ii) service box only; (iii) service box and medium-speed air conditioning; (iv) service box, computer, and projector; (v) full equipment operation (service box, medium-speed air conditioning, computer, and projector). Conditions (ii) and (iii) represented self-study scenarios, while (iv) and (v) represented actual class conditions. Although neither the Chinese standard GB 50118-2010 [33] nor the American standard ANSI/ASA S12.60/Part 1-2010 (R2020) [34] includes teaching equipment noise in its background noise limit provisions, our study accounted for the significant noise contribution from the service box. This consideration was particularly important because the supporting equipment for remote classes was complex. Most teachers did not know how to operate the equipment in the service box and only switched the microphone on the podium table on and off, so the power supply of the equipment in the service box was often left on. Therefore, the background noise tests under conditions (ii)–(v) took the service box into consideration to reflect actual usage conditions. The first-category BNL was measured at the positions represented by solid circles in Figure 1 and reported as the average over all measurement points in the students’ area of each classroom. The second category of BNL, under condition (v), and the SPL were measured during the speech intelligibility test at the position of the static omnidirectional microphone on the podium table, which picks up sound for remote students, to obtain the SNR. This position is marked by triangles in Figure 1.
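As a minimal sketch, the position-averaged BNL and the SNR at the pickup microphone could be computed from measured levels as follows. The paper does not state the averaging formula, so the energetic (power) mean used here is an assumption, although it reproduces the tabulated averages for classroom 1# under conditions (i) to (iii) in Table A1. The function names and the 45.0 dB(A) noise level in the last line are illustrative, not measured values.

```python
import math

def energetic_mean_db(levels_db):
    """Average sound levels on an energy basis (assumed averaging method)."""
    powers = [10 ** (level / 10.0) for level in levels_db]
    return 10.0 * math.log10(sum(powers) / len(powers))

def snr_db(speech_level_db, noise_level_db):
    """SNR as the level difference between speech and background noise."""
    return speech_level_db - noise_level_db

# Classroom 1#, positions B1 to B4, taken from Table A1
condition_i = [40.3, 37.6, 39.1, 39.0]
condition_ii = [42.6, 43.1, 36.3, 39.1]

print(round(energetic_mean_db(condition_i), 1))   # 39.1, as tabulated
print(round(energetic_mean_db(condition_ii), 1))  # 41.1, as tabulated

# Hypothetical example: speech at about 77.0 dB(A), assumed BNL of 45.0 dB(A)
print(snr_db(77.0, 45.0))  # 32.0
```

An arithmetic mean of the same condition (ii) values would give 40.3 dB(A) instead of 41.1 dB(A), which is why the energetic mean was chosen for this sketch.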

2.2.2. RT, Early Decay Time (EDT) and C₅₀

RT, EDT, and C₅₀ were measured in accordance with ISO 3382-2:2008 [35] using the integrated impulse response method with maximum-length sequence (MLS) signals, as in previous research [36]. A B&K type 4292-L dodecahedral sound source (powered by a B&K 2734 amplifier) was placed in two corners of each hybrid classroom. The receivers were located on the other diagonal, opposite the sound source. The audio signal was received by a B&K type 4966 omnidirectional microphone, amplified by a B&K 1704, and then transferred to a computer through a B&K type ZE0948 USB audio interface. Subsequently, the signal was deconvolved using Dirac 6.0 software to obtain the impulse responses from which RT, EDT, and C₅₀ were derived. The measurement system is shown in Figure 2, and the layout of sound sources and receiver positions is shown in Figure 3. Each receiver position was measured three times, and the calculated RT, EDT, and C₅₀ in each octave band were averaged arithmetically.
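The integrated impulse response method can be illustrated with a short sketch. It applies Schroeder backward integration to a synthetic, noise-free exponential decay (an assumption made so that the example is self-contained; real measurements start from the deconvolved MLS response), then derives T₂₀ from a line fitted between −5 dB and −25 dB and C₅₀ from the early-to-late energy ratio split at 50 ms. The sampling rate, the 0.8 s decay, and all function names are illustrative, and this is not the Dirac 6.0 implementation.

```python
import math

def schroeder_curve_db(impulse_response):
    """Backward-integrated energy decay curve, normalised to 0 dB at t = 0."""
    remaining = 0.0
    curve = []
    for sample in reversed(impulse_response):
        remaining += sample * sample
        curve.append(remaining)
    curve.reverse()
    return [10.0 * math.log10(value / curve[0]) for value in curve]

def t20_seconds(decay_db, fs):
    """Fit a line to the decay from -5 dB to -25 dB and extrapolate to 60 dB."""
    points = [(i / fs, level) for i, level in enumerate(decay_db)
              if -25.0 <= level <= -5.0]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope

def c50_db(impulse_response, fs):
    """Ratio of energy in the first 50 ms to the energy arriving later, in dB."""
    split = int(round(0.050 * fs))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return 10.0 * math.log10(early / late)

# Synthetic decay envelope: amplitude falls by 60 dB in TRUE_RT seconds (assumed)
FS = 1000        # samples per second, illustrative
TRUE_RT = 0.8    # seconds, illustrative
ir = [10.0 ** (-3.0 * (n / FS) / TRUE_RT) for n in range(2 * FS)]

decay = schroeder_curve_db(ir)
print(round(t20_seconds(decay, FS), 2))
print(round(c50_db(ir, FS), 2))
```

For an ideal exponential decay the fitted T₂₀ recovers the decay time that was put in, which makes the synthetic case a convenient check before applying the same functions to measured responses.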

2.2.3. STI

An HBK type 4720 Echo Speech Source was used for STI measurements. STI (male) and STI (female) were calculated with Dirac 6.0 and averaged to obtain the STI. The HBK 4720 Echo Speech Source was located in front of the podium table, 1.5 m above the floor and 1.6 m away from the wall, facing the students. The receiver positions were evenly distributed in the classroom and were the same as those used for the background noise measurements. The layout of the sound source and receivers is shown in Figure 1, and the measurement system is shown in Figure 4.

2.3. Subjective Speech Intelligibility Test

Hastie, M. [2] divided the Blended Synchronous Learning model into nine types, as shown in Table 2.

Mode 3 is one of the most common patterns. Subjective tests were conducted on Mode 3 in each hybrid classroom using the Chinese Mandarin speech intelligibility test word lists (PB word lists) specified by GB 15508-1995 [37]. Each list consists of 25 groups of 3 Chinese syllables (75 syllables in total) and is balanced in difficulty and phonemic characteristics [38]. The three Chinese syllables in each row are randomly arranged with no semantic connection and are preceded by introductory words: “The Y row is X X X”, where “Y” stands for the row number and “X X X” stands for the three Chinese syllables.

The subjects were undergraduate and graduate students aged 22 to 35 years. They could speak, hear, and spell Chinese syllables fluently and had no known hearing problems. Ten subjects at each end (face-to-face and remote) listened to the word list at the same time and were asked to write down the Chinese syllables they heard. A syllable was scored only if both the pronunciation and the spelling were correct. The subjective Chinese Mandarin speech intelligibility scores (SI scores) were averaged across those subjects and expressed as a percentage of the total of 75 syllables. An answer sheet was eliminated if its score deviated from the mean of all subjects by more than three standard deviations.
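The scoring rule can be sketched as follows. The function names are hypothetical, the example counts are invented for illustration rather than taken from the experiment, and the outlier step reflects only one reading of the three-standard-deviation rule: a sheet is dropped when its score lies more than three population standard deviations from the mean of the scores it is compared with.

```python
from statistics import mean, pstdev

SYLLABLES_PER_LIST = 75  # 25 rows of 3 syllables per word list

def si_score_percent(correct_syllables):
    """SI score of one answer sheet as a percentage of the 75 syllables."""
    return 100.0 * correct_syllables / SYLLABLES_PER_LIST

def drop_outliers(scores, n_sigma=3.0):
    """Keep scores within n_sigma population standard deviations of the mean.

    The choice of population standard deviation and of the comparison group
    is an assumption; the paper does not specify either.
    """
    centre = mean(scores)
    spread = pstdev(scores)
    if spread == 0:
        return list(scores)
    return [s for s in scores if abs(s - centre) <= n_sigma * spread]

# Invented counts of correctly written syllables on twelve answer sheets
correct_counts = [60] * 11 + [10]
scores = [si_score_percent(c) for c in correct_counts]
kept = drop_outliers(scores)

print(len(kept), round(mean(kept), 1))
```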

The sound reinforcement system in the classroom was turned on during the Chinese Mandarin speech intelligibility test. There were ten word lists in total. In each room, two word lists were read: one odd-numbered list by a male teacher and one even-numbered list by a female teacher. Both teachers could read the word lists fluently and held the Class IIA Mandarin certificate. They read at a rate of 4.0 words per second from the podium table using the movable handheld omnidirectional wireless microphone. The SPL in each classroom was measured while the teacher read the word list and was nearly the same in every room, at about 77.0 dB(A) (see Section 3.1).

The users at the remote end (each alone in their bedroom) logged into Tencent Meeting on their Apple and Huawei mobile phones and listened through the earphones supplied with the phones, while the face-to-face students listened to the audio from the ceiling speakers and the natural sound from the teacher.

3. Results

3.1. Results of BNL and SNR

The measured background noise levels (BNLs) in the hybrid classrooms under the different conditions ranged from 34.9 to 47.0 dB(A), as shown in Figure 5. The BNL at each position in the five classrooms is listed in Appendix A, Table A1. The large number of devices caused the background noise to be high. The results for condition (ii) illustrated the impact of the service box on BNL. Since the service box was placed on the podium, its noise decreased with distance from it. The BNL of each classroom was the average over the different positions in the room. Classrooms 4# and 5# were larger, and the positions far away from the service box were less affected. Correspondingly, the BNL of these two classrooms decreased more than that of other classrooms. The high values under conditions (iii) and (v), both of which included air conditioning, showed that air conditioning had a significant impact on the BNL. GB 50118-2010 [33] requires that the BNL in classrooms not exceed 45 dB(A). The BNL in room 2# under condition (v), room 3# under conditions (iii)–(v), and room 4# under condition (v) exceeded this limit, consistent with what many researchers have reported for ordinary classrooms [39,40]. Except for condition (i) in rooms 2# and 5# and condition (ii) in room 5#, all BNLs exceeded the US standard limit of 37 dB(A) [34]. Therefore, in actual use, the speakers were turned on to ensure sufficient SNR.

The SPL in each classroom, as shown in Figure 6, was determined by the amplification of the sound reinforcement equipment and was similar across rooms, at about 77.0 dB(A). The SNR, however, varied from 27.2 to 34.6 dB(A) because of the differences in BNL. These values were much higher than the 15 dB(A) set for ordinary classrooms.

3.2. Results of Room-Acoustic Parameters

The room-acoustic parameters T₂₀, T₃₀, EDT, and C₅₀ from 125 to 4000 Hz are shown in Figure 7. Although a metal perforated plate was installed on the ceiling, the sound absorption coefficients of the other surfaces were small and there was no special acoustic design for these classrooms, so the RT was still relatively high and its frequency characteristics were not flat. We used the 500–1000 Hz averages of T₂₀, T₃₀, EDT, and C₅₀, as well as STI, to analyze the effects; the results for each room are shown in Figure 8. T₂₀ was 0.67–0.93 s and T₃₀ was 0.71–0.98 s. According to T/CAIACN 007-2022 [41], the reverberation time of classrooms with sound reinforcement systems should not exceed 0.8 s. Only rooms 2# and 3# for T₂₀ and room 3# for T₃₀ met this requirement. The Italian standard UNI 11532 [42] requires C₅₀ to be greater than 0 dB. All classrooms met this requirement in every band except classroom 5# at 250 Hz. Based on the single-number 500–1000 Hz average, all classrooms met the C₅₀ requirement.

The STI was in the range of 0.53–0.66, as shown in Table 3. According to Appendix G of IEC 60268-16-2020 [29], the smallest STI value in the classroom should be 0.62. Rooms 4# and 5# did not meet the requirements.

3.3. Differences Between SI Scores in Each Hybrid Classroom

The SI scores for each listener are shown in Appendix A Table A2, and the average SI scores of each hybrid classroom in each condition are shown in Figure 9.

Scores at the remote ends were consistently lower than those at the face-to-face ends across all classrooms. According to Appendix E of IEC 60268-16-2020 [29], the STI value of 0.62 corresponds to an SI score of the PB-words of 83%, with better speech intelligibility. This 83% threshold was therefore adopted as the benchmark for assessing compliance with intelligibility requirements in this study.

SI scores differed significantly between the remote and face-to-face ends in each classroom (Mann–Whitney U test, two-tailed p < 0.001). SI scores also differed significantly between at least two classrooms at the remote end (Kruskal–Wallis test, two-tailed p < 0.001) and between at least two classrooms at the face-to-face end (Kruskal–Wallis test, two-tailed p = 0.003).

The SI scores at the face-to-face end were already close to the ceiling, much higher than those at the remote end, and consistent with those of an ordinary multimedia classroom with a sound reinforcement system, for which SI scores have been widely studied. This article therefore focuses on which acoustic parameters affect the SI results at the remote end and how they do so.

3.4. Effects of SNR in Classroom and Room-Acoustic Parameters on SI Scores at Remote End

Spearman correlation analysis of SNR and the room-acoustic parameters against all SI scores showed that SNR, T₂₀, and T₃₀ were significantly correlated with SI scores, with correlation coefficients of −0.554, −0.260, and −0.433 and two-tailed p values of <0.001, 0.009, and <0.001, respectively. The relationship between RT, SNR, and SI scores is shown in Figure 10. The black balls are the SI scores of all remote listeners, and the red cubes are the SI scores of the remote end of each classroom. As the SNR increased, the SI value decreased, and the impact of SNR was greater than that of T₂₀.

STI, EDT, and C₅₀ showed no significant correlation with the SI scores at the remote end, with two-tailed p values of 0.210, 0.408, and 0.728, respectively. Although the STI of classroom 4# (0.53) was lower than 0.62, its SI score was higher than 83%, so STI did not represent the SI scores at the remote end well.
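For readers who want to reproduce this kind of analysis, the sketch below computes a Spearman rank correlation in plain Python as the Pearson correlation of average ranks, which handles ties. The paper does not say which routine was used for the correlation analysis, so this is an illustrative implementation, and the SNR and SI values in the usage lines are made up rather than taken from the measurements.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only, not the measured data
snr_values = [27.2, 29.0, 31.5, 33.0, 34.6]   # dB(A)
si_values = [90.0, 88.0, 85.0, 86.0, 80.0]    # percent

print(round(spearman_rho(snr_values, si_values), 3))  # -0.9
```

A negative coefficient, as in this invented example, means that higher SNR values tend to go with lower SI scores; a significance test would still be needed to obtain the p values reported in the text.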

Prediction equations for SI scores, similar to those of Peng [25] for ordinary classrooms and considering SNR and the room-acoustic parameters, were obtained by regression in Python 3.13, as shown in Table 4. Equation (1), which included SNR and T₂₀, achieved the highest coefficient of determination (0.310), with an RMSE of 0.080 and a p-value below 0.001. These statistics indicate that T₂₀ and SNR explain 31.0% of the variance in SI scores.

4. Discussion

4.1. BNL in Hybrid Classroom and the Way Service Box Should Be Used

The BNL results under condition (v) were in line with findings from other hybrid classroom studies, which reported a range of 43.9–49.6 dB(A) [43]. Unlike [43], we also measured conditions without air conditioning (as in autumn and spring) and self-study conditions, according to how university classrooms are used. During non-instructional hours, the measured BNL consistently exceeded the 40 dB(A) threshold recommended for self-study environments [33], negatively impacting students’ concentration. Keeping the service box powered on not only wasted energy but also caused considerable interference for the students. The service box could be fitted with a main power supply and placed in a location that is easy for teachers to operate, so that turning off the power after class solves the problem of excessive background noise during self-study. A low-noise service box for hybrid learning systems, or a wireless or wired service box that can be installed in the corridor outside the classroom, needs to be developed.

4.2. SNR for Hybrid Classroom, Size of the Hybrid Classroom and Equipment Choice

Compared with [43], which used a satisfaction survey to investigate the experience of remote students, this paper used a subjective speech intelligibility test to show how the sound field conditions in the classroom affect the remote end. Our experiments with the Chinese Mandarin speech intelligibility test found that the SI scores at the remote end showed a significant negative correlation with an overly large SNR, which is the opposite of the results for traditional classrooms in English and Chinese Mandarin [20,25]. The reason may be that echoes whose delay exceeds a rather marked threshold disturb the sound impression of speech, even to the point of complete unintelligibility [44]. In Blended Synchronous Learning, the microphone that picks up sound for the remote end captures not only the natural sound of the teacher after it has passed through the indoor sound field but also the sound broadcast by the sound reinforcement system. The teacher, whose SPL is lower than that of the speakers, is close to the microphone, so the direct sound arrives immediately, while the sound from the speakers arrives much later and at a high SPL. In addition, when the SPL of the speakers is high, more sound energy outside their coverage angles also reaches the microphone. The higher the SNR, the more energy broadcast by the sound reinforcement system reaches the microphone that captures sound for the remote end. If a stronger sound arriving more than 50 ms after the direct sound is picked up, the sound impression of speech is disturbed. A higher SNR in the classroom is therefore not always better.

Sound travels about 17 m in 50 ms. For dispersed speakers, if the classroom has no good sound absorption treatment and the microphone signal is not de-reverberated, the total path length of the first reflected sound from any speaker to the microphone should be less than 17 m. The classroom for blended learning should therefore not be too large. In the classrooms in this paper, which have a distributed sound reinforcement system, the distance between the teacher and the microphone that picks up sound for the remote end is about 1.2 m. Assuming that the loudspeaker farthest from the rear wall and closest to the microphone is 3 m from the blackboard wall, and requiring the path difference between its first reflection from the rear wall and the direct sound reaching the microphone to be less than 17 m, the longest dimension of the classroom should not exceed 11.2 m, as shown in Figure 11.
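The 17 m and 11.2 m figures can be reproduced with a short calculation. This is one reading of the geometry, taken along the room length only. The speed of sound of 340 m/s and the microphone position 1.2 m from the blackboard wall (that is, the teacher standing at the wall) are assumptions made here because Figure 11 is not reproduced in this text, and the function names are illustrative.

```python
SPEED_OF_SOUND_M_S = 340.0  # assumed value; it yields 17 m for a 50 ms delay

def path_length_m(delay_s, c=SPEED_OF_SOUND_M_S):
    """Distance sound travels during the given delay."""
    return c * delay_s

def max_room_length_m(max_delay_s=0.050,
                      teacher_to_mic_m=1.2,
                      speaker_to_front_wall_m=3.0,
                      mic_to_front_wall_m=1.2,   # assumption, see lead-in
                      c=SPEED_OF_SOUND_M_S):
    """Largest room length L that keeps the rear-wall reflection in time.

    Reflected path: speaker -> rear wall -> microphone
        (L - speaker_to_front_wall) + (L - mic_to_front_wall)
    Direct path: teacher -> microphone
        teacher_to_mic
    Condition: reflected path - direct path <= c * max_delay
    """
    allowed_difference = path_length_m(max_delay_s, c)
    return (allowed_difference + teacher_to_mic_m
            + speaker_to_front_wall_m + mic_to_front_wall_m) / 2.0

print(round(path_length_m(0.050), 1))   # 17.0
print(round(max_room_length_m(), 1))    # 11.2
```

Under these assumptions the limit scales directly with the permitted delay, so a microphone with effective de-reverberation or a room with more absorption would relax the length constraint.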

Moreover, for a hybrid classroom, a primary and secondary speaker system, in which the volume of the secondary speakers can be set lower, or column loudspeakers with improved directivity may be more advantageous than a decentralized system with all speakers at the same sound pressure level. The arrangement of the speakers should be carefully designed to ensure that no sound from the speakers arrives too late.

4.3. Correlation Between RT and SI Scores

While Spearman correlation analysis revealed a significant relationship between remote-end SI scores and reverberation time (RT), the coefficient of determination of the resulting prediction equation (R² = 0.31) was substantially lower than the values reported for conventional classrooms [25] and for large spaces with sound reinforcement systems [45]. In Blended Synchronous Learning, the sound at the remote end is affected not only by the acoustic parameters but also by the equipment. Besides the sound reinforcement system, the chain includes extra processing stages for sound pickup and transmission to the remote end, which involve more equipment after the sound is amplified. The role of the equipment in the entire sound transmission process cannot be ignored, and previous research has confirmed its importance [4,15,16,17]. According to Equation (1), for SNR values of 33 dB(A) and 20 dB(A), T₂₀ should be less than 0.8 s and 1.1 s, respectively, for SI scores to reach the target of 83%. That means that with a higher SNR, the room needs more sound absorption to achieve a smaller T₂₀. It is more economical to reduce the signal-to-noise ratio appropriately. However, in classrooms where loudspeakers are used, low reverberation times are needed to prevent acoustic feedback and howling (if no special microphone is used). Therefore, both sound absorption and signal-to-noise ratio must be taken into account, and a balance must be found between the two.

4.4. Limitations and Future Work

The SPL and BNL used to calculate the indoor SNR in this study were not averages over multiple measurement points in the students’ area but values measured only at the position of the microphone that picks up sound for the remote end. If the sound field in the classroom is uneven, this value cannot represent the students’ positions. Hence, it cannot be used for correlation analysis with the SI results at the face-to-face end. We did not consider audio–visual interaction; since students have a screen during classes, the actual listening performance may be underestimated. This article tested only one set of sound transmission equipment, whereas many types of equipment on the market can support hybrid learning. With pickup equipment that has better dereverberation, the results may be better than those reported here. This paper only conducted subjective experiments on Mode 3, and the results are subject to the above constraints. However, students or teachers at the remote end also need to interact with users at the classroom end, so subsequent research on Mode 7 is necessary. This paper was based on experimental data from five classrooms in which the SI scores at the face-to-face end were good. The predictions are limited to the measured data range, and no subjective experiments at other SNR and RT values were conducted. Further research is needed to determine whether the conclusions hold over a wider range of acoustic parameters. In future work, we will test other reverberation times and signal-to-noise ratios in both Mode 3 and Mode 7 to find the relationship between room acoustics and SI scores when the two modes interact.

5. Conclusions

This study explored the influence of acoustic parameters on Chinese speech intelligibility at the remote end of hybrid classrooms. The results showed that for hybrid classrooms with a decentralized sound reinforcement system, although the RT was higher than the limit set by Chinese standards for ordinary multimedia classrooms in primary and secondary schools, the SI scores at the face-to-face end were high and exceeded those at the remote end in all classrooms. The difference in SI scores between the remote end and the face-to-face end was significant. RT and SNR were negatively correlated with remote-end SI scores, and the effect of SNR differed from that in ordinary classrooms because the SNR was overly large. STI, EDT, and C₅₀ showed no significant correlation with the SI scores at the remote end.

To achieve the target SI value at the remote end, the size of the classroom, the arrangement of the speakers, and the RT should be controlled so that excessive sound energy arriving more than 50 ms after the direct sound does not reach the microphone that picks up sound for the remote end. A relation curve linking SI scores to T₂₀ and SNR in the hybrid classroom was established. Its determination coefficient R² was only 0.310, far below the 0.99 reported for ordinary classrooms. The equipment used for Blended Synchronous Learning also plays an important role. According to the established curve, when the SNR was 33 dB(A), T₂₀ should be within 0.8 s to achieve an SI score of 83% at the remote end.
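As a rough illustration of the 50 ms criterion, the sketch below converts a reflection delay into the extra distance that reflection travels relative to the direct sound. The speed of sound (343 m/s, roughly air at 20 °C) and the function names are assumptions introduced here, not values taken from the measurements.

```python
# Hypothetical helpers: relate reflection delay to extra travel distance.
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def extra_path_for_delay(delay_s: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Extra path length (m) of a reflection arriving delay_s after the direct sound."""
    return delay_s * c


def delay_for_extra_path(extra_path_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Delay (s) of a reflection whose path is extra_path_m longer than the direct path."""
    return extra_path_m / c


limit_m = extra_path_for_delay(0.050)
print(f"50 ms corresponds to about {limit_m:.2f} m of extra path")
```

Under this assumption, a reflection whose path exceeds the direct path by more than about 17 m arrives outside the 50 ms window.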

Author Contributions

Conceptualization, Q.L.; methodology, Q.L.; software, Q.L.; validation, Q.L. and N.L.; formal analysis, Q.L. and N.L.; investigation, Q.L., Y.W., Z.L., M.T. and Y.Z.; resources, Q.L.; data curation, Q.L. and Y.W.; writing—original draft preparation, Q.L.; writing—review and editing, Q.L.; visualization, Q.L.; supervision, Q.L.; project administration, Q.L.; funding acquisition, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the China Scholarship Council (No. 202308130168).

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

Thanks to all the volunteers who participated in the subjective speech intelligibility test.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. BNL of each position in each hybrid classroom.

Room	Position	Condition (i)	Condition (ii)	Condition (iii)	Condition (iv)	Condition (v)
1#	B1	40.3	42.6	44.5	42.6	42.8
	B2	37.6	43.1	42.4	43.1	43.6
	B3	39.1	36.3	41.3	43.1	42.5
	B4	39.0	39.1	41.9	42.9	42.4
	Average	39.1	41.1	42.7	42.9	42.8
2#	B1	32.7	37.8	43.8	39.3	42.4
	B2	35.9	37.9	44.0	40.0	43.1
	B3	34.9	49.0	45.4	49.9	48.5
	B4	36.4	38.2	43.3	40.1	43.5
	Average	35.2	43.9	44.2	45.0	45.1
3#	B1	42.7	46.8	48.8	49.3	49.8
	B2	43.9	42.4	44.6	43.4	45.6
	B3	40.6	42.4	42.6	43.7	45.1
	B4	41.6	41.7	43.7	42.4	45.3
	Average	42.4	43.9	45.7	45.7	47.0
4#	B1	36.9	39.1	44.6	39.0	45.1
	B2	36.9	37.9	45.3	38.1	45.5
	B3	35.6	34.0	45.0	36.4	45.7
	B4	37.6	37.8	45.0	38.9	45.2
	B5	38.0	38.2	45.4	40.3	45.4
	B6	36.1	36.4	45.1	36.7	44.7
	B7	37.6	37.1	44.5	36.2	45.2
	B8	36.9	37.0	45.1	37.1	45.0
	B9	35.7	36.0	44.3	36.2	44.7
	Average	36.9	37.3	44.9	37.9	45.2
5#	B1	32.8	33.9	42.9	35.7	43.6
	B2	33.0	34.2	42.5	34.4	41.8
	B3	34.9	35.2	43.3	35.8	43.3
	B4	34.8	34.9	42.5	35.2	43.1
	B5	38.3	38.9	43.0	43.5	44.1
	B6	33.3	35.2	43.5	36.6	43.5
	B7	34.0	34.1	42.8	34.4	43.1
	B8	34.1	33.8	41.9	34.5	42.8
	B9	34.8	36.2	43.8	36.8	45.5
	B10	35.7	36.6	42.5	37.6	41.9
	B11	34.5	35.0	43.8	36.1	43.5
	B12	35.8	35.0	42.7	35.8	43.2
	Average	34.9	35.5	43.0	37.3	43.4
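The "Average" rows in Table A1 appear consistent with an energetic (power-based) average of the decibel values rather than an arithmetic mean. This is an inference from the tabulated numbers, not a procedure stated in this section. A minimal sketch, with a hypothetical function name, using classroom 1#:

```python
import math


def energetic_average_db(levels_db):
    """Average decibel levels on a power basis: 10*log10(mean(10^(L/10)))."""
    powers = [10 ** (level / 10) for level in levels_db]
    return 10 * math.log10(sum(powers) / len(powers))


# BNL at positions B1-B4 in classroom 1# (values copied from Table A1)
condition_i = [40.3, 37.6, 39.1, 39.0]
condition_ii = [42.6, 43.1, 36.3, 39.1]

avg_i = energetic_average_db(condition_i)
avg_ii = energetic_average_db(condition_ii)

# The table lists 39.1 and 41.1 for these two conditions.
print(round(avg_i, 1), round(avg_ii, 1))
```

For condition (ii) the arithmetic mean of the four values is about 40.3, which does not match the tabulated 41.1, whereas the energetic average does.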

Table A2. SI scores in each hybrid classroom.

End	Volunteer	1#	2#	3#	4#	5#
Face-to-face end	1	97.33%	98.67%	94.67%	96.00%	96.00%
	2	96.00%	96.00%	90.67%	98.67%	94.67%
	3	97.33%	98.67%	90.67%	96.00%	90.67%
	4	97.33%	100.00%	90.67%	94.67%	90.67%
	5	97.33%	100.00%	98.67%	96.00%	97.33%
	6	98.67%	93.33%	96.00%	97.33%	98.67%
	7	96.00%	100.00%	97.33%	97.33%	96.00%
	8	98.67%	100.00%	92.00%	89.33%	88.00%
	9	97.33%	93.33%	89.33%	97.33%	97.33%
	10	93.33%	96.00%	93.33%	97.33%	94.67%
	11	98.67%	98.67%	97.33%	93.33%	97.33%
	12	98.67%	100.00%	86.67%	96.00%	89.33%
	13	96.00%	96.00%	93.33%	97.33%	98.67%
	14	94.67%	93.33%	97.33%	98.67%	97.33%
	15	97.33%	98.67%	94.67%	97.33%	97.33%
	16	96.00%	96.00%	92.00%	92.00%	92.00%
	17	93.33%	98.67%	90.67%	98.67%	96.00%
	18	89.33%	93.33%	94.67%	97.33%	96.00%
	19	96.00%	98.67%	96.00%	92.00%	90.67%
	20	94.67%	93.33%	94.67%	86.67%	92.00%
	Average	96.20%	97.13%	93.53%	95.47%	94.53%
Remote end	1	64.00%	57.33%	81.33%	81.33%	74.67%
	2	84.00%	64.00%	92.00%	90.67%	86.67%
	3	69.33%	60.00%	98.67%	81.33%	76.00%
	4	74.67%	77.33%	94.67%	90.67%	42.67%
	5	80.00%	85.33%	97.33%	76.00%	68.00%
	6	70.67%	81.33%	81.33%	94.67%	66.67%
	7	78.67%	89.33%	86.67%	85.33%	89.33%
	8	81.33%	89.33%	78.67%	88.00%	80.00%
	9	73.33%	82.67%	90.67%	86.67%	65.33%
	10	76.00%	92.00%	94.67%	90.67%	73.33%
	11	80.00%	77.33%	82.67%	78.67%	76.00%
	12	74.67%	89.33%	89.33%	89.33%	81.33%
	13	84.00%	82.67%	93.33%	90.67%	80.00%
	14	76.00%	85.33%	76.00%	90.67%	64.00%
	15	78.67%	82.67%	90.67%	86.67%	77.33%
	16	70.67%	77.33%	86.67%	92.00%	62.67%
	17	81.33%	76.00%	81.33%	84.00%	93.33%
	18	78.67%	68.00%	86.67%	81.33%	86.67%
	19	77.33%	64.00%	80.00%	78.67%	65.33%
	20	81.33%	76.00%	88.00%	86.67%	77.33%
	Average	76.73%	77.87%	87.53%	86.20%	74.33%
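To make the gap between the two ends concrete, the sketch below subtracts the remote-end average from the face-to-face average for each classroom, using only the "Average" rows of Table A2. The variable names are illustrative.

```python
# Average SI scores (%) copied from the "Average" rows of Table A2
face_to_face = {"1#": 96.20, "2#": 97.13, "3#": 93.53, "4#": 95.47, "5#": 94.53}
remote = {"1#": 76.73, "2#": 77.87, "3#": 87.53, "4#": 86.20, "5#": 74.33}

# Percentage-point gap per classroom (face-to-face minus remote)
gap = {room: round(face_to_face[room] - remote[room], 2) for room in face_to_face}

for room, points in gap.items():
    print(f"Classroom {room}: {points:.2f} percentage points")
```

On these averages the gap is smallest in classroom 3# (6.00 points) and largest in classroom 5# (20.20 points).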

References

1. Bower, M.; Dalgarno, B.; Kennedy, G.; Lee, M.J.; Kenney, J. Blended Synchronous Learning: A Handbook for Educators, 1st ed.; Office for Learning and Teaching, Department of Education Sydney, Australia: Canberra, Australia, 2014.
2. Hastie, M.; Hung, I.C.; Chen, N.S.; Kinshuk. A blended synchronous learning model for educational international collaboration. Innov. Educ. Teach. Int. 2010, 47, 9–24.
3. Park, Y.J.; Bonk, C.J. Synchronous learning experiences: Distance and residential learners’ perspectives in a blended graduate course. J. Interact. Online Learn. 2007, 6, 245–264.
4. Bell, J.; Sawaya, S.; Cain, W. Synchromodal classes: Designing for shared learning experiences between face-to-face and online students. Int. J. Des. Learn. 2014, 5, 68–82.
5. Abdelmalak, M.M.M.; Parra, J.L. Expanding learning opportunities for graduate students with HyFlex course design. Int. J. Online Pedagog. Course Des. (IJOPCD) 2016, 6, 19–37.
6. Butz, N.T.; Stupnisky, R.H. A mixed methods study of graduate students’ self-determined motivation in synchronous hybrid learning environments. Internet High. Educ. 2016, 28, 85–95.
7. Romero-Hall, E.; Vicentini, C.R. Examining distance learners in hybrid synchronous instruction: Successes and challenges. Online Learn. J. 2017, 21, 141–157.
8. Zydney, J.M.; McKimmy, P.; Lindberg, R.; Schmidt, M. Here or there instruction: Lessons learned in implementing innovative approaches to blended synchronous learning. TechTrends 2019, 63, 123–132.
9. White, C.P.; Ramirez, R.; Smith, J.G.; Plonowski, L. Simultaneous delivery of a face-to-face course to on-campus and remote off-campus students. TechTrends 2010, 54, 34–40.
10. Paul, J.; Jefferson, F. A comparative analysis of student performance in an online vs. face-to-face environmental science course from 2009 to 2016. Front. Comput. Sci. 2019, 1, 7.
11. Norberg, A.; Dziuban, C.D.; Moskal, P.D. A time-based blended learning model. Horiz. 2011, 19, 207–216.
12. Minadzi, V.M.; Segbenya, M. Usefulness and challenges with blended learning during the COVID-19 pandemic in Ghana: The mediating role of human resource factors. Comput. Hum. Behav. Rep. 2024, 16, 100468.
13. Sadeghi, M. A shift from classroom to distance learning: Advantages and limitations. Int. J. Res. Engl. Educ. 2019, 4, 80–88.
14. Sharma, D.; Sood, A.K.; Darius, P.S.; Gundabattini, E.; Darius Gnanaraj, S.; Joseph Jeyapaul, A. A study on the online-offline and blended learning methods. J. Inst. Eng. (India) Ser. B 2022, 103, 1373–1382.
15. Bower, M.; Dalgarno, B.; Kennedy, G.E.; Lee, M.J.; Kenney, J. Design and implementation factors in blended synchronous learning environments: Outcomes from a cross-case analysis. Comput. Educ. 2015, 86, 1–17.
16. Wang, Q.; Huang, C. Pedagogical, social and technical designs of a blended synchronous learning environment. Br. J. Educ. Technol. 2018, 49, 451–462.
17. Bower, M.; Kenney, J.; Dalgarno, B.; Kennedy, G.E. Blended synchronous learning: Patterns and principles for simultaneously engaging co-located and distributed learners. In Proceedings of the 30th ASCILITE—Australian Society for Computers in Learning in Tertiary Education Annual Conference, Sydney, Australia, 1–4 December 2013; pp. 92–102.
18. Astolfi, A.; Pellerey, F. Subjective and objective assessment of acoustical and overall environmental quality in secondary school classrooms. J. Acoust. Soc. Am. 2008, 123, 163–173.
19. ISO 9921:2003; Ergonomics—Assessment of Speech Communication. International Organization for Standardization: Geneva, Switzerland, 2022.
20. Bradley, J.S. Speech intelligibility studies in classrooms. J. Acoust. Soc. Am. 1986, 80, 846–854.
21. Bradley, J.S.; Sato, H. The intelligibility of speech in elementary school classrooms. J. Acoust. Soc. Am. 2008, 123, 2078–2086.
22. Astolfi, A.; Bottalico, P.; Barbato, G. Subjective and objective speech intelligibility investigations in primary school classrooms. J. Acoust. Soc. Am. 2012, 131, 247–257.
23. Hodgson, M.; Nosal, E.-M. Effect of noise and occupancy on optimal reverberation times for speech intelligibility in classrooms. J. Acoust. Soc. Am. 2002, 111, 931–939.
24. Choi, Y.-J. The intelligibility of speech in university classrooms during lectures. Appl. Acoust. 2020, 162, 107211.
25. Jianxin, P. Chinese speech intelligibility at different speech sound pressure levels and signal-to-noise ratios in simulated classrooms. Appl. Acoust. 2010, 71, 386–390.
26. Peng, J. Chinese syllable and phoneme identification in noise and reverberation. Arch. Acoust. 2014, 39, 483–488.
27. Jianxin, P. Relationship between Chinese speech intelligibility and speech transmission index using diotic listening. Speech Commun. 2007, 49, 933–936.
28. Zhu, P.; Mo, F.; Kang, J. Relationship between Chinese speech intelligibility and speech transmission index under reproduced general room conditions. Acta Acust. United Acust. 2014, 100, 880–887.
29. IEC 60268-16; Sound System Equipment-Part 16: Objective Rating of Speech Intelligibility by Speech Transmission Index. International Electrotechnical Commission: Geneva, Switzerland, 2020.
30. Razali, A.W.; Din, N.C.; Yahya, M.N.; Sulaiman, R.; Razak, A.S.A. Classroom transformation for better education: Part 2-A preliminary study on acoustic design strategies for hybrid learning classrooms. In Proceedings of the INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Chiba, Japan, 20–23 August 2023; pp. 116–126.
31. Elmehdi, H.; Tato, A. Assessing Acoustic Conditions in Hybrid Classrooms with COVID-19 Social Distancing at the University of Sharjah. In Proceedings of the INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Glasgow, UK, 21–24 August 2022; pp. 4898–4903.
32. Galbrun, L.; Kitapci, K. Speech intelligibility of English, Polish, Arabic and Mandarin under different room acoustic conditions. Appl. Acoust. 2016, 114, 79–91.
33. GB 50118-2010; Code for Design of Sound Insulation of Civil Buildings. Ministry of Housing and Urban-Rural Development of the People’s Republic of China: Beijing, China, 2010.
34. ANSI S12.60/Part 1-2010 (R2020); Acoustical Performance Criteria, Design Requirements, and Guidelines for Schools. American National Standards Institute: Washington, DC, USA, 2020.
35. ISO 3382-2; Acoustics-Measurement of Room Acoustic Parameters-Part 2: Reverberation Time in Ordinary Rooms. International Organization for Standardization: Geneva, Switzerland, 2008.
36. Iannace, G.; Trematerra, A.; Trematerra, P. Acoustic correction using green material in classrooms located in historical buildings. Acoust. Aust. 2013, 41, 213–218.
37. GB/T15508-1995; Acoustics-Speech Articulation Testing Method. China Zhijian Publishing House: Beijing, China, 1995.
38. Yang, L.; Zhang, J.; Yan, Y. An improved STI method for evaluating Mandarin speech intelligibility. In Proceedings of the 2008 International Conference on Audio, Language and Image Processing, Shanghai, China, 7–9 July 2008; pp. 102–106.
39. Sato, H.; Bradley, J.S. Evaluation of acoustical conditions for speech communication in active elementary school classrooms. In Proceedings of the International Congress on Acoustics, Kyoto, Japan, 4–9 April 2004; pp. 1–4.
40. Knecht, H.A.; Nelson, P.B.; Whitelaw, G.M.; Feth, L.L. Background noise levels and reverberation times in unoccupied classrooms. Am. J. Audiol. 2002, 11, 65–71.
41. T/CAIACN 007-2022; General Technical Specification for Classroom Sound Reinforcement System. Community Associations Institute: Falls Church, VA, USA, 2022.
42. Berardi, U.; Iannace, G.; Trematerra, A. Acoustic treatments aiming to achieve the italian minimum environmental criteria (cam) standards in large reverberant classrooms. Can. Acoust. 2019, 47, 73–80.
43. Elmehdi, H.; Tato, A. Acoustic Comfort in Hybrid Learning Spaces: Students Perspective. In Proceedings of the INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Glasgow, UK, 21–24 August 2022; pp. 4826–4831.
44. Haas, H. The influence of a single echo on the audibility of speech. J. Audio Eng. Soc. 1972, 20, 145–159.
45. Li, X.; Zhao, Y. Exploring Factors Influencing Speech Intelligibility in Airport Terminal Pier-Style Departure Lounges. Buildings 2025, 15, 426.

Figure 1. BNL and SNR positions in hybrid classrooms.

Figure 2. RT, EDT, and C₅₀ measurement systems.

Figure 3. Layout of sound sources and receiver positions for RT measurement.

Figure 4. STI measurement system.

Figure 5. BNL in hybrid classrooms.

Figure 6. SNR on podium table in hybrid classrooms.

Figure 7. T₂₀, T₃₀, EDT, and C₅₀ in octave bands in hybrid classrooms.

Figure 8. Single-number values (averaged over 500–1000 Hz) of T₂₀, T₃₀, EDT, and C₅₀ in hybrid classrooms.

Figure 9. Average SI scores of each condition in each hybrid classroom.

Figure 10. The relationship between RT, SNR, and SI scores.

Figure 11. The longest dimension of the classroom.

Table 1. Details of tested hybrid classrooms.

Room	Geometry	Dimension W × L × H (m)	Floor Area (m²)	Volume (m³)	Capacity (No. of Students)
1#	Rectangle	5.1 × 8.2 × 3.1	41.8	129.6	21
2#	Rectangle	8.45 × 8.2 × 3.1	67.8	210.2	40
3#	Rectangle	5.2 × 6.0 × 3.1	31.2	96.7	21
4#	Rectangle	8.2 × 13.5 × 3.1	109.74	340.2	63
5#	Rectangle	7.9 × 16.25 × 3.1	127.1	394.1	81

Table 2. Blended Synchronous Learning Model [2].

Descriptions	Cyber Classroom Teacher	Cyber Classroom Student	Physical Classroom Teacher	Physical Classroom Student
Blended Synchronous Learning Mode 1	×	√	×	√
Blended Synchronous Learning Mode 2	×	√	√	×
Blended Synchronous Learning Mode 3	×	√	√	√
Blended Synchronous Learning Mode 4	√	×	×	√
Blended Synchronous Learning Mode 5	√	×	√	×
Blended Synchronous Learning Mode 6	√	×	√	√
Blended Synchronous Learning Mode 7	√	√	×	√
Blended Synchronous Learning Mode 8	√	√	√	×
Blended Synchronous Learning Mode 9	√	√	√	√

Note: √ means with one or more participants; × means without any participants.

Table 3. STI in each hybrid classroom.

Room	1#	2#	3#	4#	5#
STI	0.64	0.63	0.66	0.53	0.53

Table 4. Different prediction equations considering the SNR and RT in the remote end of Mode 3.

Formula Number	Acoustic Parameter Indicators	Expression	Determination Coefficient R²
(1)	T₂₀, SNR	SI = −0.041 × SNR + 8.438 × T₂₀ − 5.245 × T₂₀² − 1.182	0.310
(2)	T₃₀, SNR	SI = −0.031 × SNR + 5.014 × T₃₀ − 2.991 × T₃₀² − 0.255	0.290
(3) *	RT, SNR	SI = 3.12 × SNR − 0.064 × SNR² − 6.15 × RT + 57.2	0.99

* Equation (3) is from Peng in ordinary classrooms for undergraduate students [25].
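Equation (1) can be evaluated directly to see what the fitted curve gives at a chosen operating point. The sketch below is only an illustration: the function name is hypothetical, SI is assumed to be returned as a fraction (so 0.83 corresponds to 83%), SNR is taken in dB(A) and T₂₀ in seconds. With R² = 0.310, the output is best read as a rough trend within the measured range rather than a precise prediction.

```python
def si_remote_eq1(snr_dba: float, t20_s: float) -> float:
    """Equation (1) of Table 4: remote-end SI (assumed to be a fraction)."""
    return -0.041 * snr_dba + 8.438 * t20_s - 5.245 * t20_s ** 2 - 1.182


# Operating point quoted in the conclusions: SNR = 33 dB(A), T20 = 0.8 s
si = si_remote_eq1(snr_dba=33.0, t20_s=0.8)
print(f"Equation (1) at SNR = 33 dB(A), T20 = 0.8 s: SI = {si:.3f}")
```

At this point the expression evaluates to roughly 0.86, which is above the 83% target discussed in the conclusions.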


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
