Quantitative Measurement of Hakka Phonetic Distances
Abstract
1. Introduction
2. Methodology
3. Phonetic Distances in Hailu Hakka Vowels
Articulatory Approach
4. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
1 | I am grateful to the reviewer for the insightful observation on the theoretical motivation behind the selection of IPA symbols. This point has prompted clarification in the current manuscript. |
2 | Each component of the vector is divided by the maximum vector length, corresponding to the square root of n, where n equals 3 for vowels. |
3 | The maximum and minimum observed values are estimated using the “2-sigma rule” of the “68–95–99.7 rule”, based on the mean (μ) and standard deviation (σ): maximum value (upper limit) = μ + 2σ; minimum value (lower limit) = μ − 2σ. |
References
- Allassonnière-Tang, M., & Wan, I. P. (2024). Revisiting the automatic prediction of lexical errors in Mandarin. Linguistics Vanguard, 10(1), 527–535.
- Allassonnière-Tang, M., Wan, I.-P., & Lee, C. (2024). Semantic and phonological distances in free word association tasks. In M. Dong, J.-F. Hong, J. Lin, & P. Jin (Eds.), Chinese lexical semantics (CLSW 2023) (Vol. 14515, pp. 91–100). Lecture notes in computer science. Springer Nature Singapore.
- Chung, R.-F. (2017). Acoustic studies on vowels of Hailu Hakka. Available online: https://cloud.hakka.gov.tw/details?p=11646 (accessed on 15 September 2023).
- Do, Y., & Lai, R. K. Y. (2021). Accounting for lexical tones when modeling phonological distance. Language, 97(1), e39–e67.
- Fisher, W. M., & Fiscus, J. G. (1993, April 27–30). Better alignment procedures for speech recognition evaluation. IEEE International Conference on Acoustics Speech and Signal Processing (Vol. 2, pp. 59–62), Minneapolis, MN, USA.
- Frisch, S., Broe, M., & Pierrehumbert, J. (1997). Similarity and phonotactics in Arabic. Rutgers Optimality Archive, 223, 1–55.
- Gildea, D., & Jurafsky, D. (1996). Learning bias and phonological-rule induction. Computational Linguistics, 22(4), 497–530.
- Heeringa, W. (2004). Measuring dialect pronunciation differences using Levenshtein distance [Doctoral dissertation, University of Groningen].
- Heeringa, W., Gooskens, C., Nerbonne, J., & Kleiweg, P. (2006). Evaluation of string distance algorithms for dialectology. In J. Nerbonne, & E. Hinrichs (Eds.), Linguistic distances: Workshop at the joint conference of the international committee on computational linguistics and the association for computational linguistics (pp. 52–62). Association for Computational Linguistics.
- Hennig, C. (2010). Methods for merging Gaussian mixture components. Advances in Data Analysis and Classification, 4(1), 3–34.
- International Phonetic Association. (2015). IPA chart. Available online: http://www.internationalphoneticassociation.org/content/ipa-chart (accessed on 10 September 2023).
- Jeng, J.-Y. (2011). Speech acoustics: The science of spoken sound. Psyche Publishing.
- Jokisch, O., & Hain, H.-U. (2017). A trainable method for the phonetic similarity search in German proper names. In A. Karpov, R. Potapova, & I. Mporas (Eds.), Speech and computer (Vol. 10458, pp. 46–55). Lecture notes in computer science. Springer International Publishing.
- Kessler, B. (2005). Phonetic comparison algorithms. Transactions of the Philological Society, 103(2), 243–260.
- Kondrak, G. (2003). Phonetic alignment and similarity. Computers and the Humanities, 37(3), 273–291.
- Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Blackwell.
- Li, C. W. (2011). An acoustic study of Hai-lu Hakka vowels [Master’s thesis, National Chengchi University].
- Luce, P. A., Goldinger, S. D., Auer, E. T., & Vitevitch, M. S. (2000). Phonetic priming, neighborhood activation, and PARSYN. Perception & Psychophysics, 62(3), 615–625.
- Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19(1), 1–36.
- Luo, Z.-J. (1984). A study of the Sixian Hakka grammar [Doctoral dissertation, National Taiwan Normal University].
- Luo, Z.-J. (2000). The history of the Hakka community in Taiwan: Language aspect. Historical Records Committee of Taiwan Provincial Government.
- Nerbonne, J., & Heeringa, W. (1997). Measuring dialect distance phonetically. In Computational phonology: Third meeting of the ACL special interest group in computational phonology (pp. 11–18). Association for Computational Linguistics. Available online: https://aclanthology.org/W97-1102/ (accessed on 14 September 2023).
- Oakes, M. P. (2000). Computer estimation of vocabulary in a protolanguage from word lists in four daughter languages. Journal of Quantitative Linguistics, 7(3), 233–243.
- Pierrehumbert, J. (1993). Dissimilarity in the Arabic verbal roots. In Proceedings of north east linguistic society (NELS) (Vol. 23, pp. 367–381). University of Massachusetts.
- Saiegh-Haddad, E. (2004). The impact of phonemic and lexical distance on the phonological analysis of words and pseudowords in a diglossic context. Applied Psycholinguistics, 25(4), 495–512.
- Tang, C. (2009). Mutual intelligibility of Chinese dialects: An experimental approach [Doctoral dissertation, University of Leiden]. Available online: http://hdl.handle.net/1887/13963 (accessed on 14 September 2023).
- Tang, C., & van Heuven, V. J. (2015). Predicting mutual intelligibility of Chinese dialects from multiple objective linguistic distance measures. Linguistics, 53(2), 285–312.
- Vakulenko, M. O. (2019). Calculation of semantic distances between words: From synonymy to antonymy. Journal of Quantitative Linguistics, 26(2), 116–128.
- Vakulenko, M. O. (2021). Calculation of phonetic distances between speech sounds. Journal of Quantitative Linguistics, 28(3), 223–236.
- Vakulenko, M. O. (2023). Unified parametrization of phonetic features and numerical calculation of phonetic distances between speech sounds. Journal of Quantitative Linguistics, 30(1), 67–85.
- Vitevitch, M. S. (1997). The neighborhood characteristics of malapropisms. Language and Speech, 40(3), 211–228.
- Wan, I.-P. (1999). Mandarin phonology: Evidence from speech errors [Doctoral dissertation, State University of New York].
- Wan, I.-P. (2002). Alignments of prenuclear glides in Mandarin. Crane Publishing.
- Wan, I.-P., & Jaeger, J. (2003). The phonological representation of Mandarin vowels: A psycholinguistic study. Journal of East Asian Linguistics, 12(3), 205–257.
- Wang, J., Green, J. R., Samal, A., & Yunusova, Y. (2013). Articulatory distinctiveness of vowels and consonants: A data-driven approach. Journal of Speech, Language, and Hearing Research, 56(5), 1539–1551.
Parameters | Vowels and Vector Value |
---|---|
Horizontal position | 1. front vowels: p1 ≡ ph = 1; 2. central vowels: p1 ≡ ph = 1/2; 3. back vowels: p1 ≡ ph = 0 |
Vertical position | 1. close vowels: p2 ≡ pv = 1; 2. near-close vowels: p2 ≡ pv = 5/6; 3. close-mid vowels: p2 ≡ pv = 2/3; 4. mid vowels: p2 ≡ pv = 1/2; 5. open-mid vowels: p2 ≡ pv = 1/3; 6. near-open vowels: p2 ≡ pv = 1/6; 7. open vowels: p2 ≡ pv = 0 |
Roundness | 1. rounded vowels: p3 ≡ pr = 1; 2. unrounded vowels: p3 ≡ pr = 0 |
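The three-parameter scheme above can be turned into a distance with a few lines of code. The following is a minimal sketch, not the author's implementation: it assumes a Euclidean distance between the (horizontal, vertical, roundness) vectors, divided by √n with n = 3 as described in footnote 2. The assignment of each Hakka vowel to a table row (for example [a] as open central, [ɑ] as open back) is an assumption; it was chosen because it is consistent with the distances tabulated for the Vakulenko (2021) approach.

```python
import math

# (horizontal, vertical, roundness) per vowel, using the table values.
# The row chosen for each vowel is an assumption based on standard IPA labels.
VOWELS_2021 = {
    "i": (1.0, 1.0, 0.0),    # close front unrounded
    "e": (1.0, 2 / 3, 0.0),  # close-mid front unrounded
    "ɨ": (0.5, 1.0, 0.0),    # close central unrounded
    "a": (0.5, 0.0, 0.0),    # open central unrounded
    "ɑ": (0.0, 0.0, 0.0),    # open back unrounded
    "o": (0.0, 2 / 3, 1.0),  # close-mid back rounded
    "u": (0.0, 1.0, 1.0),    # close back rounded
}


def distance_2021(v1, v2):
    """Euclidean distance between two vowel vectors, scaled by sqrt(n)."""
    a, b = VOWELS_2021[v1], VOWELS_2021[v2]
    squared = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(squared) / math.sqrt(len(a))


if __name__ == "__main__":
    for pair in [("i", "e"), ("i", "ɨ"), ("a", "u")]:
        print(pair, round(distance_2021(*pair), 4))
```

Because the maximum possible vector length is √3, the scaled distance always falls between 0 and 1.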
Parameters | Vowels and Vector Value |
---|---|
Horizontal position | 1. front vowels: ph = 5/12; 2. central vowels: ph = 3/8; 3. back vowels: ph = 1/3 |
Vertical position | 1. close vowels: pv = 1/2; 2. near-close vowels: pv = 5/12; 3. close-mid vowels: pv = 1/3; 4. mid vowels: pv = 1/4; 5. open-mid vowels: pv = 1/6; 6. near-open vowels: pv = 1/12; 7. open vowels: pv = 0; approximants (semi-vowels): pv = 3/4 |
Roundness | 1. rounded vowels: pr = 1/2; 2. unrounded vowels: pr = 0; labiovelar approximant w: pr = 3/4 |
Voicing | All vowels: pvoice = 1 |
Nasality | 1. nasal and heavily nasalized vowels: pnasal = 1; 2. lightly nasalized vowels: pnasal = 1/2; 3. the rest: pnasal = 0 |
Retroflexity | 1. rhotacized vowels: pret = 1; 2. the rest: pret = 0 |
Homogeneity | 1. monophthongs: phom = 1; 2. diphthongs: phom = 0 |
Pulmonic feature | All vowels: ppul = 1 |
Continuation feature | All vowels: pcont = 1 |
Sibilant feature | All vowels: psib = 0 |
Laterality | All vowels: plat = 0 |
Trill feature | All vowels: ptrill = 0 |
Ejective feature | All vowels: peject = 0 |
Click feature | All vowels: pclick = 0 |
Implosive feature | All vowels: pimpl = 0 |
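The fifteen-parameter scheme above can be sketched in the same way. This example rests on several assumptions: the vowels are treated as oral, non-rhotacized monophthongs, so that twelve of the fifteen features are identical across them; the distance is Euclidean; and it is divided by √15, extending the √n rule of footnote 2 to fifteen features. That last step is an inference, adopted because it agrees with the distances tabulated for the Vakulenko (2023) approach.

```python
import math

# Feature values shared by all vowels considered here (assumed oral,
# non-rhotacized monophthongs), taken from the table rows above.
SHARED = {
    "voice": 1, "nasal": 0, "ret": 0, "hom": 1, "pul": 1, "cont": 1,
    "sib": 0, "lat": 0, "trill": 0, "eject": 0, "click": 0, "impl": 0,
}


def vowel(ph, pv, pr):
    """Build a 15-feature vector from the three features that vary."""
    return {"h": ph, "v": pv, "r": pr, **SHARED}


# Row assignment per vowel is an assumption based on standard IPA labels.
VOWELS_2023 = {
    "i": vowel(5 / 12, 1 / 2, 0),
    "e": vowel(5 / 12, 1 / 3, 0),
    "ɨ": vowel(3 / 8, 1 / 2, 0),
    "a": vowel(3 / 8, 0, 0),
    "ɑ": vowel(1 / 3, 0, 0),
    "o": vowel(1 / 3, 1 / 3, 1 / 2),
    "u": vowel(1 / 3, 1 / 2, 1 / 2),
}


def distance_2023(v1, v2):
    """Euclidean distance over all 15 features, scaled by sqrt(15)."""
    a, b = VOWELS_2023[v1], VOWELS_2023[v2]
    squared = sum((a[k] - b[k]) ** 2 for k in a)
    return math.sqrt(squared) / math.sqrt(len(a))


if __name__ == "__main__":
    for pair in [("i", "ɨ"), ("i", "e"), ("a", "u")]:
        print(pair, round(distance_2023(*pair), 4))
```

Since the shared features cancel out, only horizontal position, vertical position, and roundness contribute to the distances between these vowels. The compressed value ranges of those three features are what make the resulting distances much smaller than under the three-parameter scheme.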
Vowels | Vectors (Vakulenko, 2021) | Rank | Vowels | Vectors (Vakulenko, 2023) | Rank |
---|---|---|---|---|---|
|[i]-[e]| | 0.1925 | 1 | |[i]-[ɨ]| | 0.0108 | 1 |
|[o]-[u]| | 0.1925 | 1 | |[i]-[e]| | 0.0430 | 2 |
|[i]-[ɨ]| | 0.2887 | 3 | |[o]-[u]| | 0.0430 | 2 |
|[e]-[ɨ]| | 0.3469 | 4 | |[e]-[ɨ]| | 0.0444 | 4 |
|[e]-[a]| | 0.4811 | 5 | |[e]-[a]| | 0.0867 | 5 |
|[ɨ]-[a]| | 0.5774 | 6 | |[e]-[ɑ]| | 0.0887 | 6 |
|[i]-[a]| | 0.6455 | 7 | |[ɨ]-[a]| | 0.1291 | 7 |
|[ɨ]-[ɑ]| | 0.6455 | 7 | |[i]-[a]| | 0.1295 | 8 |
|[ɨ]-[u]| | 0.6455 | 7 | |[ɨ]-[ɑ]| | 0.1295 | 8 |
|[ɨ]-[o]| | 0.6736 | 10 | |[ɨ]-[u]| | 0.1295 | 8 |
|[e]-[ɑ]| | 0.6939 | 11 | |[i]-[ɑ]| | 0.1309 | 11 |
|[ɑ]-[o]| | 0.6939 | 11 | |[i]-[u]| | 0.1309 | 11 |
|[a]-[o]| | 0.7515 | 13 | |[e]-[o]| | 0.1309 | 11 |
|[i]-[ɑ]| | 0.8165 | 14 | |[ɨ]-[o]| | 0.1365 | 14 |
|[i]-[u]| | 0.8165 | 14 | |[i]-[o]| | 0.1378 | 15 |
|[e]-[o]| | 0.8165 | 14 | |[e]-[u]| | 0.1378 | 15 |
|[ɑ]-[u]| | 0.8165 | 14 | |[ɑ]-[o]| | 0.1552 | 17 |
|[i]-[o]| | 0.8389 | 18 | |[a]-[o]| | 0.1555 | 18 |
|[e]-[u]| | 0.8389 | 18 | |[ɑ]-[u]| | 0.1826 | 19 |
|[a]-[u]| | 0.8660 | 20 | |[a]-[u]| | 0.1829 | 20 |
Vowel | Formant (Hz) | Participant 1 (F) | Participant 2 (F) | Participant 3 (F) | Participant 4 (M) | Participant 5 (M) | Participant 6 (M) | Mean | SD |
---|---|---|---|---|---|---|---|---|---|
[i] | F1 | 295.11 | 310.54 | 305.02 | 280.57 | 275.47 | 270.05 | 289.46 | 16.6 |
[i] | F2 | 2987.17 | 3257.02 | 3100.11 | 2563.06 | 2645.47 | 2635.17 | 2864.67 | 288.5 |
[e] | F1 | 560.78 | 610.32 | 575.00 | 497.85 | 450.19 | 505.12 | 533.21 | 60.0 |
[e] | F2 | 2655.19 | 2690.11 | 2877.95 | 2416.00 | 2317.46 | 2399.47 | 2559.36 | 215.6 |
[ɨ] | F1 | 448.72 | 470.00 | 503.56 | 410.45 | 398.78 | 367.35 | 433.14 | 50.2 |
[ɨ] | F2 | 1809.76 | 1811.53 | 1745.22 | 1578.35 | 1303.01 | 1637.34 | 1647.53 | 193.1 |
[a] | F1 | 1312.58 | 1596.12 | 1458.69 | 1058.32 | 986.21 | 958.63 | 1228.4 | 266.7 |
[a] | F2 | 1785.22 | 1989.54 | 1874.01 | 1298.11 | 1346.38 | 1250.87 | 1590.69 | 328.0 |
[o] | F1 | 623.47 | 615.02 | 633.12 | 589.21 | 601.08 | 582.17 | 607.34 | 19.9 |
[o] | F2 | 1103.00 | 1258.25 | 1006.38 | 1102.56 | 987.36 | 954.36 | 1068.5 | 111.2 |
[u] | F1 | 489.21 | 501.28 | 517.65 | 403.65 | 399.01 | 358.11 | 444.81 | 66.0 |
[u] | F2 | 753.21 | 788.96 | 771.02 | 692.27 | 721.01 | 717.11 | 740.59 | 36.6 |
Sex | <F1HH> | <F2HH> |
---|---|---|
male | 664.185 | 1668.87 |
female | 945.615 | 2005.115 |
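The reference values <F1HH> and <F2HH> above coincide numerically with the midpoint of the observed range, (minimum + maximum) / 2, taken over all vowels and speakers of one sex in the participant table. The sketch below checks this reading. It is an inference from the numbers rather than a procedure stated in the surviving text, and the names `FORMANTS`, `SEX_COLUMNS`, and `reference_value` are hypothetical.

```python
# Formant values in Hz, copied from the participant table above.
# List order: participants 1-3 (female), then participants 4-6 (male).
FORMANTS = {
    "i": {"F1": [295.11, 310.54, 305.02, 280.57, 275.47, 270.05],
          "F2": [2987.17, 3257.02, 3100.11, 2563.06, 2645.47, 2635.17]},
    "e": {"F1": [560.78, 610.32, 575.00, 497.85, 450.19, 505.12],
          "F2": [2655.19, 2690.11, 2877.95, 2416.00, 2317.46, 2399.47]},
    "ɨ": {"F1": [448.72, 470.00, 503.56, 410.45, 398.78, 367.35],
          "F2": [1809.76, 1811.53, 1745.22, 1578.35, 1303.01, 1637.34]},
    "a": {"F1": [1312.58, 1596.12, 1458.69, 1058.32, 986.21, 958.63],
          "F2": [1785.22, 1989.54, 1874.01, 1298.11, 1346.38, 1250.87]},
    "o": {"F1": [623.47, 615.02, 633.12, 589.21, 601.08, 582.17],
          "F2": [1103.00, 1258.25, 1006.38, 1102.56, 987.36, 954.36]},
    "u": {"F1": [489.21, 501.28, 517.65, 403.65, 399.01, 358.11],
          "F2": [753.21, 788.96, 771.02, 692.27, 721.01, 717.11]},
}
SEX_COLUMNS = {"female": slice(0, 3), "male": slice(3, 6)}


def reference_value(sex, formant):
    """Midpoint of the observed range over all vowels for one sex."""
    values = [x for v in FORMANTS.values()
              for x in v[formant][SEX_COLUMNS[sex]]]
    return (min(values) + max(values)) / 2


def normalized_mean(vowel, sex, formant):
    """Per-sex vowel mean divided by the per-sex reference value."""
    values = FORMANTS[vowel][formant][SEX_COLUMNS[sex]]
    return (sum(values) / len(values)) / reference_value(sex, formant)


if __name__ == "__main__":
    for sex in ("male", "female"):
        print(sex, reference_value(sex, "F1"), reference_value(sex, "F2"))
    print(round(normalized_mean("i", "male", "F1"), 2))
```

Dividing each per-sex vowel mean by the matching reference value gives dimensionless formants, which is one way to make male and female measurements comparable before averaging them.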
Vowel | Sex | F1HH | F2HH | Mean F1HH (both sexes) | Mean F2HH (both sexes) |
---|---|---|---|---|---|
[i] | male | 275.36 | 2614.57 | 289.46 | 2864.67 |
[i] | female | 303.56 | 3114.77 | | |
[e] | male | 484.39 | 2377.64 | 533.21 | 2559.36 |
[e] | female | 582.03 | 2741.08 | | |
[ɨ] | male | 392.19 | 1506.23 | 433.14 | 1647.53 |
[ɨ] | female | 474.09 | 1788.84 | | |
[a] | male | 1001.05 | 1298.45 | 1228.4 | 1590.69 |
[a] | female | 1455.80 | 1882.92 | | |
[o] | male | 590.82 | 1014.76 | 607.34 | 1068.5 |
[o] | female | 623.87 | 1122.54 | | |
[u] | male | 386.92 | 710.13 | 444.81 | 740.59 |
[u] | female | 502.71 | 771.06 | | |
Vowel | Sex | f1HH | f2HH | <f1HH> | <f2HH> |
---|---|---|---|---|---|
[i] | male | 0.41 | 1.57 | 0.37 | 1.56 |
[i] | female | 0.32 | 1.55 | | |
[e] | male | 0.73 | 1.42 | 0.67 | 1.40 |
[e] | female | 0.62 | 1.37 | | |
[ɨ] | male | 0.59 | 0.90 | 0.55 | 0.90 |
[ɨ] | female | 0.50 | 0.89 | | |
[a] | male | 1.51 | 0.78 | 1.52 | 0.86 |
[a] | female | 1.54 | 0.94 | | |
[o] | male | 0.89 | 0.61 | 0.77 | 0.58 |
[o] | female | 0.66 | 0.56 | | |
[u] | male | 0.58 | 0.43 | 0.56 | 0.41 |
[u] | female | 0.53 | 0.38 | | |
Hakka Vowels | Vector | Rank |
---|---|---|
|[o]-[u]| | 0.2702 | 1 |
|[i]-[e]| | 0.3400 | 2 |
|[ɨ]-[o]| | 0.3883 | 3 |
|[ɨ]-[u]| | 0.4901 | 4 |
|[e]-[ɨ]| | 0.5142 | 5 |
|[i]-[ɨ]| | 0.6841 | 6 |
|[a]-[o]| | 0.8006 | 7 |
|[e]-[o]| | 0.8261 | 8 |
|[ɨ]-[a]| | 0.9708 | 9 |
|[e]-[u]| | 0.9961 | 10 |
|[e]-[a]| | 1.0070 | 11 |
|[i]-[o]| | 1.0585 | 12 |
|[a]-[u]| | 1.0602 | 13 |
|[i]-[u]| | 1.1656 | 14 |
|[i]-[a]| | 1.3463 | 15 |
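The ranking above can be reproduced from the sex-averaged normalized formants <f1HH> and <f2HH> tabulated earlier. The sketch below assumes a plain Euclidean distance in the (f1, f2) plane and uses the averages as printed (two decimals). The names `NORMALIZED`, `acoustic_distance`, and `ranking` are hypothetical.

```python
import math
from itertools import combinations

# Sex-averaged normalized formants (<f1HH>, <f2HH>), as printed in the table.
NORMALIZED = {
    "i": (0.37, 1.56),
    "e": (0.67, 1.40),
    "ɨ": (0.55, 0.90),
    "a": (1.52, 0.86),
    "o": (0.77, 0.58),
    "u": (0.56, 0.41),
}


def acoustic_distance(v1, v2):
    """Euclidean distance between two vowels in the normalized F1-F2 plane."""
    (a1, a2), (b1, b2) = NORMALIZED[v1], NORMALIZED[v2]
    return math.sqrt((a1 - b1) ** 2 + (a2 - b2) ** 2)


# All 15 unordered vowel pairs, sorted from closest to most distant.
ranking = sorted(
    ((pair, acoustic_distance(*pair)) for pair in combinations(NORMALIZED, 2)),
    key=lambda item: item[1],
)

if __name__ == "__main__":
    for rank, ((v1, v2), d) in enumerate(ranking, start=1):
        print(f"{rank:2d}  |[{v1}]-[{v2}]|  {d:.4f}")
```

With six vowels there are 15 unordered pairs, which matches the number of rows in the table.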
Sex | Formant (Hz) | Statistic | a | e | i | o | u | ɨ |
---|---|---|---|---|---|---|---|---|
Male | F1 | mean | 768 | 568 | 383 | 643 | 492 | 412 |
Male | F1 | SD | 77 | 84 | 55 | 89 | 70 | 54 |
Male | F1 | min | 614 | 400 | 273 | 465 | 352 | 304 |
Male | F1 | max | 922 | 736 | 493 | 821 | 632 | 520 |
Male | F2 | mean | 1503 | 1884 | 2207 | 1097 | 1058 | 1445 |
Male | F2 | SD | 164 | 169 | 181 | 148 | 199 | 246 |
Male | F2 | min | 1175 | 1546 | 1845 | 801 | 660 | 953 |
Male | F2 | max | 1831 | 2222 | 2569 | 1393 | 1456 | 1937 |
Female | F1 | mean | 904 | 666 | 453 | 748 | 552 | 475 |
Female | F1 | SD | 108 | 112 | 61 | 109 | 83 | 86 |
Female | F1 | min | 688 | 442 | 331 | 530 | 386 | 303 |
Female | F1 | max | 1120 | 890 | 575 | 966 | 718 | 647 |
Female | F2 | mean | 1687 | 2155 | 2497 | 1231 | 1138 | 1639 |
Female | F2 | SD | 133 | 167 | 220 | 141 | 167 | 167 |
Female | F2 | min | 1421 | 1821 | 2057 | 949 | 804 | 1305 |
Female | F2 | max | 1953 | 2489 | 2937 | 1513 | 1472 | 1973 |
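The min and max rows above follow the 2-sigma rule of footnote 3 (μ ± 2σ). The short sketch below recomputes them for the male F1 row. It also shows that the midpoint of the overall range matches the male <F1HH> reference value reported for this data set (597.5). Reading the reference value as that midpoint is an inference from the numbers, and the function and variable names are hypothetical.

```python
def two_sigma_bounds(mean, sd):
    """Estimated (minimum, maximum) as mean -/+ two standard deviations."""
    return mean - 2 * sd, mean + 2 * sd


# Male F1 (mean, SD) per vowel in Hz, copied from the table above.
MALE_F1 = {
    "a": (768, 77), "e": (568, 84), "i": (383, 55),
    "o": (643, 89), "u": (492, 70), "ɨ": (412, 54),
}

bounds = {v: two_sigma_bounds(m, s) for v, (m, s) in MALE_F1.items()}

# Overall range across vowels and its midpoint (assumed reference value).
lowest = min(low for low, _ in bounds.values())
highest = max(high for _, high in bounds.values())
midpoint = (lowest + highest) / 2

if __name__ == "__main__":
    print(bounds)
    print(lowest, highest, midpoint)
```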
Sex | <F1HH> | <F2HH> |
---|---|---|
male | 597.5 | 1614.5 |
female | 711.5 | 1870.5 |
Vowel | Sex | F1HH | F2HH | Mean F1HH (both sexes) | Mean F2HH (both sexes) |
---|---|---|---|---|---|
[i] | male | 383 | 2207 | 418 | 2352 |
[i] | female | 453 | 2497 | | |
[e] | male | 568 | 1884 | 617 | 2019.5 |
[e] | female | 666 | 2155 | | |
[ɨ] | male | 412 | 1445 | 443.5 | 1542 |
[ɨ] | female | 475 | 1639 | | |
[a] | male | 768 | 1503 | 836 | 1595 |
[a] | female | 904 | 1687 | | |
[o] | male | 643 | 1097 | 695.5 | 1164 |
[o] | female | 748 | 1231 | | |
[u] | male | 492 | 1058 | 522 | 1098 |
[u] | female | 552 | 1138 | | |
Vowel | Sex | f1HH | f2HH | <f1HH> | <f2HH> |
---|---|---|---|---|---|
[i] | male | 0.64 | 1.37 | 0.64 | 1.35 |
[i] | female | 0.64 | 1.33 | | |
[e] | male | 0.95 | 1.17 | 0.94 | 1.16 |
[e] | female | 0.94 | 1.15 | | |
[ɨ] | male | 0.69 | 0.90 | 0.68 | 0.89 |
[ɨ] | female | 0.67 | 0.88 | | |
[a] | male | 1.29 | 0.93 | 1.28 | 0.92 |
[a] | female | 1.27 | 0.90 | | |
[o] | male | 1.08 | 0.68 | 1.06 | 0.67 |
[o] | female | 1.05 | 0.66 | | |
[u] | male | 0.82 | 0.66 | 0.80 | 0.63 |
[u] | female | 0.78 | 0.61 | | |
Hakka Vowels | Vectors | Rank |
---|---|---|
|[o]-[u]| | 0.2667 | 1 |
|[ɨ]-[u]| | 0.2812 | 2 |
|[a]-[o]| | 0.3274 | 3 |
|[i]-[e]| | 0.3597 | 4 |
|[e]-[ɨ]| | 0.3809 | 5 |
|[e]-[a]| | 0.4136 | 6 |
|[ɨ]-[o]| | 0.4420 | 7 |
|[i]-[ɨ]| | 0.4670 | 8 |
|[e]-[o]| | 0.5053 | 9 |
|[e]-[u]| | 0.5469 | 10 |
|[a]-[u]| | 0.5566 | 11 |
|[ɨ]-[a]| | 0.6002 | 12 |
|[i]-[u]| | 0.7369 | 13 |
|[i]-[a]| | 0.7728 | 14 |
|[i]-[o]| | 0.8037 | 15 |
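For this second data set, the tabulated distances appear to be computed from unrounded normalized formants rather than from the two-decimal values printed in the preceding table. The sketch below chains the steps end to end under that assumption: divide each per-sex mean by the per-sex reference value, average the two sexes, then take the Euclidean distance. It is an illustrative reconstruction, and the names `MEANS`, `REFERENCE`, `normalized`, and `distance` are hypothetical.

```python
import math
from itertools import combinations

# Per-sex mean (F1, F2) in Hz and per-sex reference values, from the tables.
MEANS = {
    "male": {"i": (383, 2207), "e": (568, 1884), "ɨ": (412, 1445),
             "a": (768, 1503), "o": (643, 1097), "u": (492, 1058)},
    "female": {"i": (453, 2497), "e": (666, 2155), "ɨ": (475, 1639),
               "a": (904, 1687), "o": (748, 1231), "u": (552, 1138)},
}
REFERENCE = {"male": (597.5, 1614.5), "female": (711.5, 1870.5)}


def normalized(vowel):
    """Sex-averaged (f1, f2): per-sex mean / per-sex reference, then mean."""
    f1 = [MEANS[s][vowel][0] / REFERENCE[s][0] for s in MEANS]
    f2 = [MEANS[s][vowel][1] / REFERENCE[s][1] for s in MEANS]
    return sum(f1) / len(f1), sum(f2) / len(f2)


def distance(v1, v2):
    """Euclidean distance between two vowels in the normalized F1-F2 plane."""
    (a1, a2), (b1, b2) = normalized(v1), normalized(v2)
    return math.sqrt((a1 - b1) ** 2 + (a2 - b2) ** 2)


ranking = sorted(
    ((pair, distance(*pair)) for pair in combinations(MEANS["male"], 2)),
    key=lambda item: item[1],
)

if __name__ == "__main__":
    for rank, ((v1, v2), d) in enumerate(ranking, start=1):
        print(f"{rank:2d}  |[{v1}]-[{v2}]|  {d:.4f}")
```

Using the rounded two-decimal values instead shifts some distances in the third decimal place, so small discrepancies against the table are expected if the rounding stage differs.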
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wan, I.-P. Quantitative Measurement of Hakka Phonetic Distances. Languages 2025, 10, 185. https://doi.org/10.3390/languages10080185