1. Introduction
Deep learning algorithms perform well in many fields of data processing because of their complexity [1]. From the results achieved by deep learning, it seems that the goal of the field of machine learning and artificial intelligence is within reach, i.e., the goal of making machines as intelligent as humans, with the ability to think like humans. A considerable number of people hold the view that machines can be made to reach a level of intelligence like, or beyond, that of humans simply through the development of algorithms and computing power. Philosophy of information provides a perspective on the nature of human intelligence, from which current deep learning approaches remain a huge gap away from the real intelligence of a subject.
2. The Nature of “Intelligence” in Philosophy of Information
According to philosophy of information, intelligence is the agential way and method of grasping, processing, creating, developing, using, and achieving information by a subject with cognitive or practical ability. In other words, intelligence is the information activity of an agential subject, and in order to achieve subject-like intelligence, it is necessary to have subject-like information activity [2] (pp. 160–177).
Since humans are the most common form of subject, we generally approach artificial intelligence through the study of the phenomenon of human intelligence. Because intelligence is, as mentioned earlier, an information activity, it is necessary to consider human information activity first. Human information activity has a bottom-up hierarchical progression: human cognition, as the higher form of information, evolves layer by layer starting from low-level information. Low-level information in nature enters the human nervous system, generates sensation and perception, and creates information for itself, stored in memory. Based on these memories, the agential subject achieves the creation of information, generating renewed information that is fed back to nature and society through social practice. Human information activity also has top-down orientation and inhibition: higher-level information activity regulates, evaluates, and guides the direction and intensity of lower-level information activity for its own purposes, strengthening those parts that are consistent with its purposes and needs and inhibiting those that are not.
Through the synthesis of these bottom-up and top-down processes, all levels of human information activity become an organic whole. The interaction of all levels of information makes human intelligence a complex system with the evolutionary and emergent characteristics of a self-organizing system. It is thus able to improve itself through the continuous creation and condensation of information, so that intelligence-like phenomena emerge.
3. The Static Nature of Artificial Neural Networks
Many terms in the field of machine learning resemble those used to describe human intelligence, implying a similarity between existing artificial intelligence and human intelligence. For example, it is often assumed that deep neural networks are also “evolutionary” because they improve their parameters during machine learning; however, this so-called evolution is fundamentally different from the evolutionary nature of the human information activity system as a complex system. The essential difference is that the “parameters” improved by the network are only a small part of the overall system.
In machine learning terms, parameters are values in the system that are changed by the algorithm itself during training, as opposed to hyperparameters: values that are determined before learning begins. The performance of a neural network or similar machine learning system often depends on the values of its hyperparameters, and choosing them is an important part of the design of the neural network [3]. One class of hyperparameters determines things like the shape and size of the input matrix or the number of layers of the network; these determine the spatial structure of the neural network. Another class of hyperparameters is temporal: it includes the learning rate, the number of epochs of learning, and the threshold of the loss function that determines when the learning process stops.
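A minimal sketch in Python may make this distinction concrete (all names and values here are illustrative assumptions, not drawn from any particular system): the hyperparameters below fix the spatial structure and the temporal course of learning before training begins, while the parameters are the only values the training algorithm itself will modify.

```python
import numpy as np

# --- Hyperparameters: fixed before training; never touched by the algorithm ---
INPUT_SIZE = 4         # spatial: shape of the input
HIDDEN_SIZE = 8        # spatial: width of the hidden layer
LEARNING_RATE = 0.01   # temporal: pace of learning
NUM_EPOCHS = 200       # temporal: maximum duration of learning
LOSS_THRESHOLD = 1e-3  # temporal: when to stop early

# --- Parameters: the only values the training algorithm itself modifies ---
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(INPUT_SIZE, HIDDEN_SIZE))
b1 = np.zeros(HIDDEN_SIZE)
W2 = rng.normal(scale=0.5, size=(HIDDEN_SIZE, 1))
b2 = np.zeros(1)
```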
For an evolutionary system, experienced time is transformed into the structure of space: the time that has passed is stored in the subsequent spatial structure, thereby generating new properties of the system [4]. During the training of a neural network, however, the hyperparameters representing the spatial structure are designed, i.e., they are other-organized, and they remain constant throughout the process. The length of training time experienced by the network likewise depends on fixed temporal hyperparameters that are pre-designed at the outset, not on the spatial structure from which the network starts. In neural networks, these values representing time and space are static, so the network structure lacks the necessary evolutionary and self-organizing properties.
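Continuing the illustrative sketch above, a training loop makes this static character visible: the loop reads the spatial and temporal hyperparameters but only ever writes to the parameters, so the structure of the network and the conditions of its “experienced time” are fixed from outside before learning starts.

```python
# Toy regression data, purely illustrative.
X = rng.normal(size=(32, INPUT_SIZE))
y = X.sum(axis=1, keepdims=True)

for epoch in range(NUM_EPOCHS):        # duration pre-set, not self-chosen
    h = np.tanh(X @ W1 + b1)           # hidden layer; width fixed by HIDDEN_SIZE
    pred = h @ W2 + b2
    loss = float(np.mean((pred - y) ** 2))
    if loss < LOSS_THRESHOLD:          # stopping rule pre-set before training
        break
    # Backpropagation: gradient descent touches only W1, b1, W2, b2.
    grad_pred = 2 * (pred - y) / len(X)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = (grad_pred @ W2.T) * (1 - h ** 2)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W1 -= LEARNING_RATE * grad_W1
    b1 -= LEARNING_RATE * grad_b1
    W2 -= LEARNING_RATE * grad_W2
    b2 -= LEARNING_RATE * grad_b2
```

Nothing inside the loop can alter INPUT_SIZE, HIDDEN_SIZE, or the stopping conditions; in the paper's terms, the values representing space and time are other-organized and static.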
Therefore, neural networks lack the emergent qualities that are necessary for intelligence to be generated by the system. For a self-organizing system, emergence is the process by which the interaction between the constituent units of the system generates new properties of the whole from the bottom up. Although the training of a neural network yields products that are difficult to predict and explain, the fixed hyperparameters and the inherently deterministic, pre-designed algorithms prevent the individual components of the neural network from changing the whole so as to produce qualitatively new properties. Thus, a neural network that itself lacks self-organization cannot achieve the whole information activity system of the subject, and intelligence cannot emerge from it.
4. What the Perceptron Achieves Is Not Perception
Another similarity in terminology comes from the metaphor of the perceptron for human perception. Multilayer perceptrons are the origin of artificial neural networks, and the name implies that the process they implement is an imitation of human perception, i.e., of the intuitive recognition process of the information subject. The objective of the intuitive recognition process is information for itself. If a process is meant to mimic the intuitive recognition of information, then the input to this process should also be information for itself, or the information field around the system. However, the perceptron as a program takes digital signals as input, not this kind of information; or rather, its input is data. Its desired output, in turn, is a classification of the data. This classification is based not on the data themselves, but on the information of the material entity to which the data refer.
For example, when a photograph is taken of someone, the light emitted and reflected by that person and the surrounding environment passes through the lens of the camera to the camera’s sensor, which samples that light discretely according to its own resolution [5] (pp. 57–63). This is a process of using data to refer to the information of the entity, the person. In the sense of philosophy of information, for the intelligent subject the scenario is a symbol; and given the conventions of photographic technology, the data of the photograph become symbolic information referring to that entity.
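The discrete sampling step can be sketched schematically (a toy one-dimensional “sensor”, not a model of any real camera): a continuous light intensity is reduced to a short array of integers, and everything about the scene that falls between samples or between quantization levels is simply lost.

```python
import numpy as np

# Continuous "scene": light intensity as a function of position (illustrative).
def intensity(x):
    return 0.5 + 0.5 * np.sin(2 * np.pi * x)

RESOLUTION = 8  # number of "pixels" on the toy sensor
BIT_DEPTH = 4   # 2**4 = 16 quantization levels per pixel

positions = np.linspace(0.0, 1.0, RESOLUTION)  # discrete spatial sampling
levels = 2 ** BIT_DEPTH - 1
pixels = np.round(intensity(positions) * levels).astype(int)  # quantization
print(pixels)  # the photograph as data: integers that merely refer to the scene
```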
The input of a perceptron is a symbol associated with an entity, and its output is the category it distinguishes. The gender of the person is likewise a symbol referring to that entity. Thus, although the program is called a perceptron, the process it imitates is more like a logical deduction over symbolic information. The process after the external light enters the lens of the camera and before it is sampled and encoded by the photoreceptor is like the subject’s sensation, but this process involves no intelligence, because the perceptual stage that should follow is missing. For an intelligent subject, perception is the process of understanding and interpreting the identified information. A perceptron only transforms one set of data into another, merely re-referring and re-encoding in the process, without any understanding or interpretation of the symbolic information. In fact, it is usually the designer of the perceptron who does the understanding and interpreting.
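A minimal perceptron sketch (hypothetical weights and input, for illustration only) makes this concrete: the program maps one array of numbers to another number, and any “meaning” attached to the input values or the output class exists only for the human who chose the encodings.

```python
import numpy as np

def perceptron(x, W, b):
    """Single-layer perceptron: re-encodes an input vector as a class index."""
    return int(np.argmax(W @ x + b))

# Hypothetical, pre-trained weights for a 4-value input and 2 output classes.
W = np.array([[ 0.9, -0.2,  0.1, -0.5],
              [-0.3,  0.8, -0.1,  0.6]])
b = np.array([0.0, 0.1])

x = np.array([0.2, 0.7, 0.1, 0.9])  # data sampled from some sensor
print(perceptron(x, W, b))          # prints 1: a number, not an interpretation
```

Whether class 1 “means” a gender, an animal, or anything else is a convention held entirely by the designer; nothing in the computation itself refers to it.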
5. Symbol Origin and Symbol Grounding
Do perceptrons perform symbolic reasoning, then, since they imitate symbolic reasoning processes? Although symbols are often used as synonyms for patterns in fields such as pattern recognition, in human information activities the concept of a symbol must include the process of referring one information pattern to another, which is made possible by some subjective agreement stored in the information activity system. As a result, a person can understand the symbol’s referent and thus make logical deductions with the symbol. The association of symbols with referents requires human perception to understand and interpret these symbols. Although a person uses photograph and gender data to refer to information about someone and treats them as symbols, the perceptron does not hold this conventional information in its memory and cannot use it to correct the learning process. Thus, from the perceptron’s point of view, the input and output information lack symbolic connotations, owing to the lack of perception in the information activity process. Because these information patterns cannot be associated with their meanings as symbols, they degenerate into data. What the perceptron does is nothing more than computation.
To achieve logical deduction, symbolic information must be present: the system needs to contain information related to the referents of the agreed symbols. The cognitive scientist Harnad posed the symbol grounding problem to AI: how do symbols acquire meaning in a symbolic system [6]? In terms of philosophy of information, this is the question of how the information patterns used by AI become symbolic information. As a structure of the subject’s high-level information activity, symbolic information must be based on various low-level information activities. The subjective conventions relating symbols to their referents are closely tied to a lower level of information activity: “perception”. When one attempts to obtain the meaning of a symbol from the pattern alone, the lack of perceptual information prevents this deduction from being made on the basis of a mere informational model.
The only feasible path to symbol grounding is the bottom-up path of research. Only on the basis of an information activity system and its processes of direct information interaction with the environment, i.e., sensation, perception, and memory, can an abstract pattern rely on the conventions of the system itself to produce its referent. In other words, symbols are created through the emergence of the information activity of the subject. The process of symbolic reasoning and creation cannot be separated from the other parts of the self-organized information activity system.
6. Lack of Self-Organization and Emergent Properties in Machine Information Flow
Comparison with the human information activity system makes it more evident that the information flow associated with machine learning is not systematic. At the level of low-level information activity, the “sensory” source of the neural network is data sampled from various external sensors, which are often chosen arbitrarily because only the data matter to the network. As a result, the assimilation and dissimilation of information between the sensors and the surrounding environment, and the condensation of information in these processes, are not reflected in the “learning” of the neural network. In addition, the neural network acquires only part of the static information of a particular sensor at the moment a particular set of data is recorded. Furthermore, these sensors are often controlled by humans or placed in locations not associated with the network, i.e., the sensors and the associated information are other-organized, and the “learning results” of the neural network are usually not translated into feedback to and control of these sensors. Moreover, this information has not acquired interpretation and direct symbolic relations through any machine “perception” process, so “artificial intelligence” cannot emerge from this lower level of “sensory” activity. The “low-level information activity” of this information flow thus lacks self-organization and emergence properties.
At the high level, the renewed information created by a subject can directly become the subject’s purposes and plans, creating real physical entities through the process of social practice. Artificial neural networks, by contrast, are not social. The results of their “symbolic reasoning” usually stop there and cannot be fed back into the environment through social practice. The purposes of neural networks cannot be modified in the process of “learning”; these purposes typically depend on the designs of scientists and engineers in the form of deterministic algorithms, hyperparameters, and dataset labels, none of which are among the parameters that the network can modify.
Essentially, the “decisions” made by these so-called “artificial intelligences” cannot be called decisions, because a real decision requires an emergent process and cannot be achieved from preconditions alone [7]. In the “decision-making” process of neural networks, the relevant factors and processes are other-organized and deterministic. They depend essentially on pre-design rather than on any actual “decision” process of the neural network, and so do not produce anything like the high-level information activity of a subject. As a result, these “high-level information activities” also lack self-organization and emergence properties.
The deep learning approach we currently use, and the information structures embodied in it, fall short of the complexity, holism, self-organization, and evolution required to produce the information activity system of an intelligent subject. Mechanically determined deep learning and related design methods alone are not sufficient if we really want to reach the original goal of artificial intelligence as a research field: to build information systems that exhibit human-like intelligence.