Estimating Sentence-like Structure in Synthetic Languages Using Information Topology
Abstract
1. Introduction
2. Analyzing Language Using Information Topology
2.1. Statistical Manifolds
2.2. Contrasting Distributions on a Riemannian Manifold
2.3. Normalized Ollivier–Ricci Curvature
2.4. Information Topology Manifold
3. Information-Theoretic Sentences
3.1. Incremental Relative Information
3.2. Curvature of Incremental Tangent Normalized Wasserstein Distance
Algorithm 1 Proposed information topology SLU estimation algorithm
3.3. F-Measure Performance Analysis
3.4. An Information-Theoretic Performance Measure
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Model | |
---|---|
KS | 78.91 |
BW | 68.09 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Back, A.D.; Wiles, J. Estimating Sentence-like Structure in Synthetic Languages Using Information Topology. Entropy 2022, 24, 859. https://doi.org/10.3390/e24070859