Advanced NLP Procedures as Premises for the Reconstruction of the Idea of Knowledge
- Hermeneutics, as a body of philosophical reflection, is entirely qualitative, and its procedures were designed to interpret texts, not to evaluate them. Its underlying thread is nevertheless important: it makes the text completely autonomous and detaches it from the instance of the author. In its original version, this autonomy reflected the divine origin of the text undergoing exegesis.
- Textual datasets used as corpora cannot be considered a final representation of the underlying resources. They are subject to internal processes of continuous construction, and they draw directly on the Internet, which is a very unstable source of text.
- Internet text resources are not homogeneous. They are scattered among various communities, including national communities, and do not constitute a coherent whole. Since we are dealing with a complex phenomenon, a deterministic delimitation of separate areas is very difficult.
- The datasets underlying the language model are arbitrary or even random. Model authors often use vague selection criteria, such as quality, which they adopt intuitively without defining them in any way. Their general assumption of maximum textual richness remains only a loose postulate. Each intervention in the corpus increases its degree of randomness, which does not necessarily bring it to the level of statistical representativeness of knowledge, language, etc.
- Each of the procedures (algorithms) used within GPT models has an unknown impact on the semantic structure, starting with the tokenization of the text corpus using Byte Pair Encoding (BPE). The techniques of recurrence and attention, as well as other, less innovative techniques such as softmax, also affect the formation of the final semantic structure. Because this semantics is distributed and implicit, it is also difficult to imagine evaluating these procedures semantically by any method other than trial and error.
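To make the last point more concrete, the following is a minimal sketch of how BPE merges can be learned from word frequencies. It is an illustration under stated assumptions, not the tokenizer actually used in GPT models: the function names (`get_pair_counts`, `merge_pair`, `learn_bpe`) and the toy corpus are hypothetical, and it works on characters, whereas GPT-style tokenizers operate on bytes and apply additional pre-processing.

```python
from collections import Counter

# Hypothetical toy corpus: word -> frequency (an assumption for illustration).
TOY_CORPUS = {"low": 5, "lower": 2, "newest": 6, "widest": 3}


def get_pair_counts(vocab):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` in every word with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Greedily learn up to `num_merges` merges; return (merges, final vocab)."""
    # Each word starts as a tuple of single characters.
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        # Pick the most frequent pair; ties go to the pair seen first.
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


if __name__ == "__main__":
    merges, vocab = learn_bpe(TOY_CORPUS, num_merges=4)
    print(merges)
    print(vocab)
```

On this toy corpus the first two merges join "e" with "s" and then "es" with "t", so the unit "est" arises purely from co-occurrence frequency. The procedure never consults meaning, which is one way to see why its effect on the resulting semantic structure is hard to assess in advance.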
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Maciag, R. Advanced NLP Procedures as Premises for the Reconstruction of the Idea of Knowledge. Proceedings 2022, 81, 105. https://doi.org/10.3390/proceedings2022081105