A Self-Evaluated Bilingual Automatic Speech Recognition System for Mandarin–English Mixed Conversations
Abstract
1. Introduction
2. Literature Review
2.1. Early Efforts in Mixed-Language ASR
2.2. Development of Mixed-Language Datasets
2.3. Traditional ASR Approaches and Their Limitations
2.4. Advancements in End-to-End (E2E), Hybrid, and Semi-Supervised Models for ASR
2.5. Remaining Challenges and Research Gaps
3. Research Methodology
3.1. Evaluation Metrics for Monolingual and Bilingual Speech Recognition
3.1.1. Word Error Rate and Character Error Rate
3.1.2. Mixed Error Rate
3.2. Datasets
4. Results and Discussion
4.1. VOSK Monolingual ASR Toolkit
4.1.1. Operational Workflow
4.1.2. Performance Evaluation
4.2. The SEB-ASR System
4.2.1. Operational Workflow
4.2.2. Performance Evaluation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Hou, S.Y.; Wu, Y.L.; Chen, K.C.; Chang, T.A.; Hsu, Y.M.; Chuang, S.J.; Chang, Y.; Hsu, K.C. Code-switching automatic speech recognition for nursing record documentation: System development and evaluation. JMIR Nurs. 2022, 5, 37562.
2. Saksamudre, S.K.; Shrishrimal, P.P.; Deshmukh, R.R. A review on different approaches for speech recognition system. Int. J. Comput. Appl. 2015, 115, 23–28.
3. Gao, J.; Wan, G.; Wu, K.; Fu, Z. Review of the application of intelligent speech technology in education. J. China Comput. Assist. Lang. Learn. 2022, 2, 165–178.
4. Davis, K.H.; Biddulph, R.; Balashek, S. Automatic recognition of spoken digits. J. Acoust. Soc. Am. 1952, 24, 637–642.
5. Rabiner, L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 1989, 77, 257–286.
6. Jurafsky, D.; Martin, J.H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition; Prentice Hall: Upper Saddle River, NJ, USA, 2009.
7. Yeh, C.F.; Huang, C.Y.; Sun, L.C.; Lee, L.S. An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling. In Proceedings of the 7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010), Tainan, Taiwan, 29 November–3 December 2010; pp. 214–219.
8. Guo, P.; Xu, H.; Xie, L.; Chng, E.S. Study of semi-supervised approaches to improving English-Mandarin code-switching speech recognition. In Proceedings of the Interspeech 2018, Hyderabad, India, 2–6 September 2018; pp. 1928–1932.
9. Amazouz, D.; Adda-Decker, M.; Lamel, L. Addressing French/Algerian code-switching Arabic speech. In Proceedings of the Interspeech 2017, Stockholm, Sweden, 20–24 August 2017; pp. 62–66.
10. Guzmán, G.A.; Ricard, J.; Serigos, J.; Bullock, B.E.; Toribio, A.J. Metrics for modeling code-switching across corpora. In Proceedings of the Interspeech 2017, Stockholm, Sweden, 20–24 August 2017; pp. 67–71.
11. Alemu, A.A.; Melese, M.D.; Salau, A.O. Towards audio-based identification of Ethio-Semitic languages using recurrent neural network. Sci. Rep. 2023, 13, 19346.
12. Waibel, A.; Soltau, H.; Schultz, T.; Schaaf, T.; Metze, F. Multilingual speech recognition. In Verbmobil: Foundations of Speech-to-Speech Translation; Springer: Berlin, Germany, 2000; pp. 33–45.
13. Long, Y.; Li, Y.; Zhang, Q.; Wei, S.; Ye, H.; Yang, J. Acoustic data augmentation for Mandarin-English code-switching speech recognition. Appl. Acoust. 2020, 161, 107175.
14. Chan, J.Y.C.; Cao, H.; Ching, P.C.; Lee, T. Automatic recognition of Cantonese-English code-mixing speech. Int. J. Comput. Linguist. Chin. Lang. Process. 2009, 14, 281–304.
15. Winata, G.I.; Madotto, A.; Wu, C.S.; Fung, P. Towards end-to-end automatic code-switching speech recognition. arXiv 2018, arXiv:1810.12620.
16. Li, J. Recent advances in end-to-end automatic speech recognition. APSIPA Trans. Signal Inf. Process. 2022, 11, e8.
17. Shmyrev, N.V. Vosk Speech Recognition Toolkit: Offline Speech Recognition API for Android, iOS, Raspberry Pi and Servers with Python, Java, C# and Node. 2020. Available online: https://github.com/alphacep/vosk-api (accessed on 4 June 2025).
18. Li, D.C.S. Cantonese-English code-switching research in Hong Kong: A Y2K review. World Englishes 2000, 19, 305–322.
19. Yang, H.C.; Hsiao, H.W.; Lee, C.H. Multilingual document mining and navigation using self-organizing maps. Inf. Process. Manag. 2011, 47, 647–666.
20. Zhang, Y.; Tsai, F.S.; Kwee, A.T. Multilingual sentence categorization and novelty mining. Inf. Process. Manag. 2011, 47, 667–675.
21. Segev, A.; Gal, A. Enhancing portability with multilingual ontology-based knowledge management. Decis. Support Syst. 2008, 45, 567–584.
22. Gey, F.C.; Kando, N.; Peters, C. Cross-language information retrieval: The way ahead. Inf. Process. Manag. 2005, 41, 415–431.
23. Jung, J.J. Cross-lingual query expansion in multilingual folksonomies: A case study on Flickr. Knowl. Based Syst. 2013, 42, 60–67.
24. Lee, C.W.; Wu, Y.L.; Yu, L.C. Combining mutual information and entropy for unknown word extraction from multilingual code-switching sentences. J. Inf. Sci. Eng. 2019, 35, 597–610.
25. Lyu, D.C.; Hsu, C.N.; Chiang, Y.C.; Lyu, R.Y. Acoustic model optimization for multilingual speech recognition. Int. J. Comput. Linguist. Chin. Lang. Process. 2008, 13, 363–385.
26. Wu, C.H.; Chiu, Y.H.; Shia, C.J.; Lin, C.Y. Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs. IEEE Trans. Audio Speech Lang. Process. 2006, 14, 266–276.
27. Qian, Y.; Liang, H.; Soong, F.K. A cross-language state sharing and mapping approach to bilingual (Mandarin–English) TTS. IEEE Trans. Audio Speech Lang. Process. 2009, 17, 1231–1239.
28. Lyu, D.C.; Lyu, R.Y.; Zhu, C.L.; Ko, M.T. Language identification in code-switching speech using word-based lexical model. In Proceedings of the 7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010), Tainan, Taiwan, 29 November–3 December 2010; pp. 460–464.
29. Lyu, D.C.; Tan, T.P.; Chng, E.; Li, H. SEAME: A Mandarin-English code-switching speech corpus in South-East Asia. In Proceedings of the Interspeech 2010, Makuhari, Chiba, Japan, 26–30 September 2010.
30. Shen, H.P.; Wu, C.H.; Yang, Y.T.; Hsu, C.S. CECOS: A Mandarin-English code-switching speech database. In Proceedings of the 2011 International Conference on Speech Database and Assessments (Oriental COCOSDA), Hsinchu, Taiwan, 26–28 October 2011; pp. 120–123.
31. Wang, D.; Tang, Z.; Tang, D.; Chen, Q. OC16-CE80: A Mandarin-English mixlingual database and a speech recognition baseline. In Proceedings of the 2016 Conference of the Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA), Bali, Indonesia, 26–28 October 2016; pp. 84–88.
32. Vu, N.T.; Lyu, D.C.; Weiner, J.; Telaar, D.; Schlippe, T.; Blaicher, F.; Chng, E.S.; Schultz, T.; Li, H. A first speech recognition system for Mandarin-English code-switch conversational speech. In Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, 25–30 March 2012; pp. 4889–4892.
33. Vu, N.T. Automatic Speech Recognition for Low-Resource Languages and Accents Using Multilingual and Crosslingual Information. Ph.D. Thesis, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, 2014.
34. Bhuvanagiri, K.; Kopparapu, S.K. An approach to mixed language automatic speech recognition. In Proceedings of the Oriental COCOSDA 2010, Kathmandu, Nepal, 24–25 November 2010.
35. Bhuvanagiri, K.; Kopparapu, S.K. Mixed language speech recognition without explicit identification of language. Am. J. Signal Process. 2012, 2, 92–97.
36. Zellou, G.; Lahrouchi, M. Linguistic disparities in cross-language automatic speech recognition transfer from Arabic to Tashlhiyt. Sci. Rep. 2024, 14, 313.
37. Van Der Westhuizen, E.; Niesler, T. Synthesising isiZulu-English code-switch bigrams using word embeddings. In Proceedings of the Interspeech 2017, Stockholm, Sweden, 20–24 August 2017; pp. 72–76.
38. Adel, H.; Vu, N.T.; Schultz, T. Combination of recurrent neural networks and factored language models for code-switching language modeling. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, 4–9 August 2013; pp. 206–211.
39. Weng, C.; Cui, J.; Wang, G.; Wang, J.; Yu, C.; Su, D.; Yu, D. Improving attention based sequence-to-sequence models for end-to-end English conversational speech recognition. In Proceedings of the Interspeech 2018, Hyderabad, India, 2–6 September 2018; pp. 761–765.
40. Shan, C.; Zhang, J.; Wang, Y.; Xie, L. Attention-based end-to-end models for small-footprint keyword spotting. In Proceedings of the Interspeech 2018, Hyderabad, India, 2–6 September 2018; pp. 2037–2041.
41. Akinpelu, S.; Viriri, S. Speech emotion classification using attention based network and regularized feature selection. Sci. Rep. 2023, 13, 11990.
42. Rahman Chowdhury, F.R.; Wang, Q.; Moreno, I.L.; Wan, L. Attention-based models for text-dependent speaker verification. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 5359–5363.
43. Fang, X.; Gao, T.; Zou, L.; Ling, Z. Bidirectional attention for text-dependent speaker verification. Sensors 2020, 20, 6784.
44. Luo, N.; Jiang, D.; Zhao, S.; Gong, C.; Zou, W.; Li, X. Towards end-to-end code-switching speech recognition. arXiv 2018, arXiv:1810.13091.
45. Huang, Z.; Wang, P.; Wang, J.; Miao, H.; Xu, J.; Zhang, P. Improving transformer-based end-to-end code-switching speech recognition using language identification. Appl. Sci. 2021, 11, 9106.
46. Huang, Z.; Xu, J.; Zhao, Q.; Zhang, P. A two-fold cross-validation training framework combined with meta-learning for code-switching speech recognition. IEICE Trans. Inf. Syst. 2022, 105, 1639–1642.
47. Nga, C.H.; Vu, D.Q.; Le, P.T.; Luong, H.H.; Wang, J.C. MLSS: Mandarin English code-switching speech recognition via mutual learning-based semi-supervised method. IEEE Signal Process. Lett. 2025, 32, 1510–1514.
48. Chen, M.; Pan, J.; Zhao, Q.; Yan, Y. Multi-task learning in deep neural networks for Mandarin-English code-mixing speech recognition. IEICE Trans. Inf. Syst. 2016, 99, 2554–2557.
49. Ali, A.; Renals, S. Word error rate estimation for speech recognition: e-WER. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Melbourne, Australia, 15–20 July 2018; pp. 20–24.
50. Klakow, D.; Peters, J. Testing the correlation of word error rate and perplexity. Speech Commun. 2002, 38, 19–28.
51. Levenshtein, V.I. Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Dokl. 1966, 10, 707–710.
52. Kessler, B. Computational dialectology in Irish Gaelic. In Proceedings of the Seventh Conference of the European Chapter of the Association for Computational Linguistics, Dublin, Ireland, 27–31 March 1995; pp. 60–66.
53. Wieling, M.; Bloem, J.; Mignella, K.; Timmermeister, M.; Nerbonne, J. Measuring foreign accent strength in English: Validating Levenshtein distance as a measure. Lang. Dyn. Change 2014, 4, 253–269.
54. Mustafa, M.B.; Yusoof, M.A.; Khalaf, H.K.; Rahman Mahmoud Abushariah, A.A.; Kiah, M.L.M.; Ting, H.N.; Muthaiyah, S. Code-switching in automatic speech recognition: The issues and future directions. Appl. Sci. 2022, 12, 9541.
55. Povey, D.; Ghoshal, A.; Boulianne, G.; Burget, L.; Glembek, O.; Goel, N.; Hannemann, M.; Motlicek, P.; Qian, Y.; Schwarz, P.; et al. The Kaldi speech recognition toolkit. In Proceedings of the IEEE 2011 Workshop on Automatic Speech Recognition and Understanding (ASRU), Waikoloa, HI, USA, 11–15 December 2011.
56. Ardila, R.; Branson, M.; Davis, K.; Henretty, M.; Kohler, M.; Meyer, J.; Morais, R.; Saunders, L.; Tyers, F.M.; Weber, G. Common Voice: A massively-multilingual speech corpus. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC), Marseille, France, 11–16 May 2020; pp. 4218–4222.
Related Work | Model/Method | Dataset | MER (%) |
---|---|---|---|
Guo et al. [8] | Lattice-Free Maximum Mutual Information (LF-MMI)-based semi-supervised training under code-switching conditions | SEAME | 20.54–29.56 |
Winata et al. [15] | End-to-end system using connectionist temporal classification (CTC) | SEAME | 24.61
Vu et al. [32,33] | Two-pass system with IPA and Bhattacharyya distance, discriminative training, statistical machine translation (SMT)-based text generation, and language identification (LID) system integration | SEAME | 36.60 |
Adel et al. [38] | Recurrent neural network language models (RNNLMs) with factored language models (FLMs) and part-of-speech (POS) features | SEAME | 32.81
Luo et al. [44] | Hybrid CTC/attention-based end-to-end system for code-switching speech | SEAME | 34.24
Huang et al. [45] | CTC transformer-based end-to-end model with integrated LID module | SEAME | 30.95 |
Huang et al. [46] | Meta-learning framework with two-fold cross-validation training | SEAME | 31.02 |
Nga et al. [47] | Mutual learning-based semi-supervised (MLSS) speech recognition approach | SEAME | 17.60
Chen et al. [48] | Multi-task deep neural network | LDC and in-house data | 32.20 |
This work | Self-evaluated bilingual automatic speech recognition (SEB-ASR) system | In-house dataset and YouTube videos | 12.78 |
Error Rate | English | Mandarin |
---|---|---|
Maximum (%) | 22.80 | 25.32 |
Minimum (%) | 0.00 | 0.00 |
Average (%) | 8.43 | 9.04 |
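The English and Mandarin rates above are the word error rate (WER) and character error rate (CER) of Section 3.1.1: the Levenshtein edit distance between reference and hypothesis, counted over words for English and characters for Mandarin, divided by the reference length. A minimal Python sketch of that computation, assuming the transcripts are already normalized with punctuation removed:

```python
# Minimal WER/CER sketch via Levenshtein edit distance (Section 3.1.1).
# Tokens are words for English WER and characters for Mandarin CER.
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions, and deletions."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference tokens
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis tokens
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution/match
    return d[-1][-1]

def wer(ref_text, hyp_text):
    ref = ref_text.split()
    return edit_distance(ref, hyp_text.split()) / len(ref)

def cer(ref_text, hyp_text):
    return edit_distance(list(ref_text), list(hyp_text)) / len(ref_text)
```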
MER (%) | Dataset A (VOSK) | Dataset A (SEB-ASR) | Dataset B (VOSK) | Dataset B (SEB-ASR) | Dataset All (VOSK) | Dataset All (SEB-ASR) | YouTube (VOSK) | YouTube (SEB-ASR) |
---|---|---|---|---|---|---|---|---|
Maximum (%) | 79.81 | 20.64 | 88.00 | 20.93 | 89.93 | 24.00 | 99.57 | 28.91 |
Minimum (%) | 31.89 | 0.00 | 28.00 | 0.00 | 30.03 | 0.00 | 43.22 | 0.00 |
Average (%) | 54.93 | 9.36 | 52.08 | 7.56 | 56.49 | 11.50 | 65.22 | 12.78 |
Start Time | End Time | Confidence | Recognized Word | Original Audio Text |
---|---|---|---|---|
5.22 | 5.97 | 1.0000 | 這個 (zhè ge) | 這個 (zhè ge) |
10.17 | 10.41 | 0.3929 | 哎 (āi) | |
10.41 | 10.95 | 0.5900 | 爹 (diē) | idea |
15.24 | 16.23 | 1.0000 | 非常 (fēi cháng) | 非常 (fēi cháng) |
21.32 | 21.45 | 0.2517 | 而 (ér) | perfect |
27.51 | 28.08 | 1.0000 | 我們 (wǒ men) | 我們 (wǒ men)
28.08 | 28.47 | 1.0000 | 的 (de) | 的 (de) |
32.40 | 32.81 | 0.5419 | 我 (wǒ) | work |
37.14 | 37.74 | 1.0000 | 需要 (xū yào) | 需要 (xū yào) |
37.74 | 38.43 | 1.0000 | 提高 (tí gāo) | 提高 (tí gāo) |
42.42 | 42.75 | 0.2720 | 逸飛 (yì fēi) | |
42.75 | 42.90 | 0.9928 | 是 (shì) | efficiency |
42.90 | 43.32 | 0.9035 | 誰 (shéi) |
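For context, word-level output like the table above can be produced with the VOSK Python API by enabling per-word results; the Mandarin model name and audio file below are illustrative placeholders, not the paper's exact configuration:

```python
# Sketch: decode audio with a Mandarin VOSK model and collect per-word
# timing and confidence. Model and file names are placeholders.
import json
import wave
from vosk import Model, KaldiRecognizer

model = Model("vosk-model-cn-0.22")        # placeholder Mandarin model
wf = wave.open("mixed_speech.wav", "rb")   # assumed 16 kHz mono PCM
rec = KaldiRecognizer(model, wf.getframerate())
rec.SetWords(True)                         # request per-word start/end/conf

words = []
while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        words.extend(json.loads(rec.Result()).get("result", []))
words.extend(json.loads(rec.FinalResult()).get("result", []))

for w in words:  # each entry carries "start", "end", "conf", "word"
    print(f'{w["start"]:.2f}\t{w["end"]:.2f}\t{w["conf"]:.4f}\t{w["word"]}')
```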
Symbol | Start Time | End Time | Confidence | Recognized Word | Original Audio Text |
---|---|---|---|---|---|
+ | 5.22 | 5.97 | 1.0000 | 這個 (zhè ge) | 這個 (zhè ge) |
- | 10.17 | 10.41 | 0.3929 | 哎 (āi) | |
- | 10.41 | 10.95 | 0.5900 | 爹 (diē) | idea |
+ | 15.24 | 16.23 | 1.0000 | 非常 (fēi cháng) | 非常 (fēi cháng) |
- | 21.32 | 21.45 | 0.2517 | 而 (ér) | perfect |
+ | 27.51 | 28.08 | 1.0000 | 我們 (wǒ men) | 我們 (wǒ men)
+ | 28.08 | 28.47 | 1.0000 | 的 (de) | 的 (de) |
- | 32.40 | 32.81 | 0.5419 | 我 (wǒ) | work |
+ | 37.14 | 37.74 | 1.0000 | 需要 (xū yào) | 需要 (xū yào) |
+ | 37.74 | 38.43 | 1.0000 | 提高 (tí gāo) | 提高 (tí gāo) |
- | 42.42 | 42.75 | 0.2720 | 逸飛 (yì fēi) | |
+ | 42.75 | 42.90 | 0.9928 | 是 (shì) | efficiency |
+ | 42.90 | 43.32 | 0.9035 | 誰 (shéi) |
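The '+'/'-' symbols above mark each word as trusted Mandarin ('+') or suspected English ('-') based on its recognition confidence. A minimal sketch of that step follows; the 0.9 threshold is an illustrative value consistent with the rows shown, not necessarily the paper's setting:

```python
# Sketch: label words '+' (trusted) or '-' (suspected English) by confidence.
CONF_THRESHOLD = 0.9  # assumption: in practice tuned on development data

def assign_symbols(words, threshold=CONF_THRESHOLD):
    """Attach a 'symbol' field to each VOSK word dict."""
    return [dict(w, symbol="+" if w["conf"] >= threshold else "-") for w in words]
```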
Assigned Symbol | Start Time | End Time | Reassigned Symbol | Confidence | Recognized Word | Original Audio Text |
---|---|---|---|---|---|---|
- | 42.42 | 42.75 | - | 0.2720 | 逸飛 (yì fēi) | |
+ | 42.75 | 42.90 | - | 0.9928 | 是 (shì) | efficiency |
+ | 42.90 | 43.32 | - | 0.9035 | 誰 (shéi) |
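The reassignment shown above demotes high-confidence words (是, 誰) that are temporally contiguous with a '-' word (逸飛), since one English word often fragments into several short, spuriously confident Mandarin tokens. A sketch under that adjacency assumption, with an illustrative 50 ms gap tolerance:

```python
# Sketch: demote '+' words whose time span touches a '-' neighbor.
# Single left-to-right pass; rightward demotion propagates through a run.
def reassign_symbols(words, max_gap=0.05):
    out = [dict(w) for w in words]
    for i, w in enumerate(out):
        if w["symbol"] != "-":
            continue
        if i > 0 and out[i - 1]["end"] + max_gap >= w["start"]:
            out[i - 1]["symbol"] = "-"   # demote the left neighbor
        if i + 1 < len(out) and w["end"] + max_gap >= out[i + 1]["start"]:
            out[i + 1]["symbol"] = "-"   # demote the right neighbor
    return out
```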
Symbol | Start Time | End Time | Confidence | Recognized Word | Original Audio Text |
---|---|---|---|---|---|
+ | 5.22 | 5.97 | 1.0000 | 這個 (zhè ge) | 這個 (zhè ge) |
- | 10.17 | 10.41 | 0.3929 | 哎 (āi) | |
- | 10.41 | 10.95 | 0.5900 | 爹 (diē) | idea |
+ | 15.24 | 16.23 | 1.0000 | 非常 (fēi cháng) | 非常 (fēi cháng) |
- | 21.32 | 21.45 | 0.2517 | 而 (ér) | perfect |
+ | 27.51 | 28.08 | 1.0000 | 我們 (wǒ men) | 我們 (wǒ men)
+ | 28.08 | 28.47 | 1.0000 | 的 (de) | 的 (de) |
- | 32.40 | 32.81 | 0.5419 | 我 (wǒ) | work |
+ | 37.14 | 37.74 | 1.0000 | 需要 (xū yào) | 需要 (xū yào) |
+ | 37.74 | 38.43 | 1.0000 | 提高 (tí gāo) | 提高 (tí gāo) |
- | 42.42 | 42.75 | 0.2720 | 逸飛 (yì fēi) | |
- | 42.75 | 42.90 | 0.9928 | 是 (shì) | efficiency |
- | 42.90 | 43.32 | 0.9035 | 誰 (shéi) |
Segment Start Time | Segment End Time | Recognized Word | Original Audio Text |
---|---|---|---|
10.17 | 10.95 | idea | idea |
21.32 | 21.45 | perfect | perfect |
32.40 | 32.81 | work | work |
42.42 | 43.32 | efficiency | efficiency |
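The segments above result from merging runs of adjacent '-' words; each merged span is then cut from the original audio and re-decoded with the English VOSK model. A sketch of the merging step, using the same illustrative gap tolerance as before:

```python
# Sketch: collapse consecutive '-' words into (start, end) audio segments
# for English re-recognition. Applied to the table above, this yields
# 10.17–10.95, 21.32–21.45, 32.40–32.81, and 42.42–43.32.
def merge_segments(words, max_gap=0.05):
    segments = []
    for w in words:
        if w["symbol"] != "-":
            continue
        if segments and w["start"] - segments[-1][1] <= max_gap:
            segments[-1][1] = w["end"]               # extend the current run
        else:
            segments.append([w["start"], w["end"]])  # start a new run
    return [tuple(s) for s in segments]
```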
Case | Text | SC | IC | DC | SW | IW | DW | NM | MER |
---|---|---|---|---|---|---|---|---|---|
Recognized text without reassigning symbol | 這個(zhè ge) idea 非常(fēi cháng) perfect 我們的(wǒ men de) work 需要提高(xū yào tí gāo) 逸飛(yì fēi)是誰(shì shéi) | 0 | 3 | 0 | 1 | 0 | 0 | 15 | 26.66%
Recognized text with reassigned symbol | 這個(zhè ge) idea 非常(fēi cháng) perfect 我們的(wǒ men de) work 需要提高(xū yào tí gāo) efficiency | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 0.00%
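The MER column follows Section 3.1.2: substituted, inserted, and deleted Mandarin characters (SC, IC, DC) and English words (SW, IW, DW) are summed and divided by NM, the total number of Mandarin characters plus English words in the reference. Reproducing the two rows above:

```python
# Mixed error rate from the table's counts (Section 3.1.2).
def mixed_error_rate(sc, ic, dc, sw, iw, dw, nm):
    return (sc + ic + dc + sw + iw + dw) / nm

print(f"{mixed_error_rate(0, 3, 0, 1, 0, 0, 15):.2%}")  # 4/15 ≈ 26.67% (tabulated as 26.66%)
print(f"{mixed_error_rate(0, 0, 0, 0, 0, 0, 15):.2%}")  # 0.00%
```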