Dendrite Net with Acceleration Module for Faster Nonlinear Mapping and System Identification

Nonlinear mapping is an essential and common demand in online systems, such as sensor systems and mobile phones, so accelerating nonlinear mapping directly speeds up such systems. The authors of this paper previously proposed Dendrite Net (DD), which has far lower time complexity than existing nonlinear mapping algorithms; however, there are still redundant calculations in DD. This paper presents DD with an acceleration module (AC) to accelerate nonlinear mapping further. We conduct three experiments to verify whether DD with AC has lower time complexity while retaining DD's nonlinear mapping and system identification properties. The first experiment concerns the precision and identification of unary nonlinear mapping, reflecting the calculation performance of DD with AC for basic functions in online systems. The second experiment concerns the mapping precision and identification of a multi-input nonlinear system, reflecting the performance of designing online systems via DD with AC. Finally, this paper compares the time complexity of DD and DD with AC and analyzes the theoretical reasons through repeated experiments. Results: DD with AC retains DD's excellent mapping and identification properties and has lower time complexity. Significance: DD with AC can be used for most engineering systems, such as sensor systems, and will speed up computation in these online systems. The code of DD with AC is available at https://github.com/liugang1234567/Gang-neuron


Introduction
The development of online systems, such as sensor systems, mobile phones, and computers, is changing the world we live in [1,2]. Nowadays, increasing attention is paid to the running speed of online systems, as running speed is a key evaluation index of system performance. Nonlinear mapping is an essential and common demand in online systems, for example basic function calculation in computers (e.g., y = sin(x), which maps x to y) [3][4][5]. The time complexity of nonlinear mapping directly affects the running speed of online systems [6]. However, after years of development, the speed of nonlinear mapping has become increasingly difficult to improve, and there is less and less research on improving the running speed of online systems by accelerating nonlinear mapping.
It is typical to store and calculate basic nonlinear functions (nonlinear mapping) in polynomial form in computers (e.g., sin(x) in C [4], or sin(x) in Java [5]). This means that polynomial storage and calculation methods with lower time complexity will increase the running speed of online systems. In mathematics and computer science, Horner's method (or Horner's scheme) is a well-established algorithm with low time complexity (O(n)) for polynomial evaluation [7]. In 2021, the authors of this paper proposed DD [8]. DD can be seen as a polynomial form with lower time complexity than the traditional polynomial form, and the time complexity of DD is consistent with Horner's method. Within a year, DD has already been used in multiple areas, such as energy saving [9], spatiotemporal traffic flow data imputation [10], high-dimensional problems [11], image processing [12], multi-objective optimization [13], accuracy prediction of the RV reducer to be assembled [14], and precipitation correction in flood season [15]. Nevertheless, there are still redundant calculations in DD when the order of the DD model is higher than the number of inputs.
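For context, Horner's method evaluates a degree-n polynomial with only n multiplications and n additions by nesting the terms; a minimal sketch:

```python
def horner(coeffs, x):
    """Evaluate a polynomial with Horner's method.

    coeffs holds [a_n, a_{n-1}, ..., a_0], so the polynomial is
    a_n*x^n + ... + a_1*x + a_0, computed with n multiplications
    and n additions instead of the naive O(n^2) scheme.
    """
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

# 2x^2 + 3x + 1 at x = 2 -> 15
print(horner([2, 3, 1], 2.0))
```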
This paper proposes an acceleration module for DD to reduce redundant calculation; the resulting model, named DD with AC, further speeds up computation in online systems. According to the theory of DD, DD with AC can be used for nonlinear mapping and system identification. Meanwhile, the time complexity of DD with AC should be lower than that of DD, in line with the aim of this study. Consequently, the corresponding theory of DD with AC is explored through the experiments here. The main contributions of this paper are as follows:
1. The redundant calculations in DD are identified, and some characteristics of these redundant calculations are given;
2. An acceleration module for DD is presented to reduce the redundant calculations, and DD with AC is derived by theoretically analyzing the redundant calculations in DD;
3. The proposed concept is experimentally justified and computationally verified based on the theoretical analysis.
After theoretical and experimental analysis, it is demonstrated that DD with AC can be used for nonlinear mapping and system identification with lower time complexity than DD.
The rest of the paper is organized as follows. Section 2 introduces DD and describes the design of DD with AC. Experiments and results are given in Section 3. Section 4 discusses some experimental results and the significance of this study, and the conclusions are drawn in Section 5.

In a previous study, the authors of this paper proposed a basic machine learning algorithm called Dendrite Net (DD) with white-box properties, controlled accuracy to improve generalization, and low computational complexity [8]. DD's main concept is that if the output's logical expression contains the logical relationship of a class among inputs (and/or/not), the algorithm can recognize the class after learning [8]. DD is one of the dendrites of a Gang neuron (an improved artificial neuron) [16], and its essence is a specialized polynomial form.

Design of Dendrite Net with Acceleration
DD consists of DD modules and linear modules (see Figure 1). The DD module is straightforward and is expressed as follows:

A^l = (W^{l,l−1} A^{l−1}) • X,    (1)

where A^{l−1} and A^l are the inputs and outputs of the module, and X denotes the inputs of DD. One of the elements in X can be set to 1 to generate a bias. W^{l,l−1} is the weight matrix from the (l − 1)-th module to the l-th module. • denotes the Hadamard product, which is used to construct interactive items (e.g., x_1 x_2). The last module of DD is the linear module, expressed as follows:

A^L = W^{L,L−1} A^{L−1},    (2)

where A^{L−1} and A^L are the inputs and outputs of the module, L denotes the number of modules, and W^{L,L−1} is the weight matrix from the (L − 1)-th module to the L-th module. The following set of equations describes the gradient descent rule of DD.

The forward propagation of the DD module and the linear module:

A^l = (W^{l,l−1} A^{l−1}) • X,    A^L = W^{L,L−1} A^{L−1}.

The error back-propagation of the DD module and the linear module:

dA^L = Ŷ − Y,
dW^{l,l−1} = (1/m)(dA^l • X)(A^{l−1})^T,    dA^{l−1} = (W^{l,l−1})^T (dA^l • X),
dW^{L,L−1} = (1/m) dA^L (A^{L−1})^T,    dA^{L−1} = (W^{L,L−1})^T dA^L.

The weight adjustment of DD:

W^{l,l−1} := W^{l,l−1} − α dW^{l,l−1},

where Ŷ and Y are DD's outputs and labels, respectively, and m denotes the number of training samples in one batch. The learning rate α can either be adapted with epochs or fixed to a small number based on heuristics.
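The forward pass and one gradient-descent step of DD can be sketched in NumPy (a minimal illustration, not the authors' released implementation; the layer sizes and learning rate below are arbitrary):

```python
import numpy as np

def dd_forward(X, weights):
    """Forward pass of DD: each DD module computes A = (W @ A_prev) * X
    (Hadamard product with the inputs); the final module is linear."""
    A = X
    cache = [A]
    for W in weights[:-1]:          # DD modules
        A = (W @ A) * X
        cache.append(A)
    A = weights[-1] @ A             # linear module
    cache.append(A)
    return A, cache

def dd_backward(X, Y, weights, cache, lr=0.01):
    """One gradient-descent step on the 0.5 * MSE loss."""
    m = X.shape[1]                          # samples in the batch
    dA = cache[-1] - Y                      # error at the linear output
    grads = [(dA @ cache[-2].T) / m]        # linear-module gradient
    dA = weights[-1].T @ dA
    # DD modules, back to front
    for W, A_prev in zip(weights[-2::-1], cache[-3::-1]):
        dAX = dA * X                        # error gated by the inputs
        grads.insert(0, (dAX @ A_prev.T) / m)
        dA = W.T @ dAX
    return [W - lr * g for W, g in zip(weights, grads)]
```

For example, with X = [x; 1] and two DD modules plus a linear module (an order-3 model), repeated forward/backward calls fit simple polynomial targets such as y = x^2.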
The most attractive feature is that DD's operation involves only matrix multiplication and the Hadamard product, which gives DD a white-box property and lower time complexity. White-box property: the trained DD model can be translated into the Relation spectrum of inputs and outputs by formula simplification with software (e.g., MATLAB; an example is shown in Figure 2). Concretely, the optimized weights are assigned to the corresponding matrices in Equations (1) and (2); the Relation spectrum is then obtained through formula simplification in software. The white-box property solves the "black-box" issue of ML illustrated in Figure 3 [17]; thus, DD integrates nonlinear mapping/pattern recognition and system identification, in contrast to other ML algorithms (see Table 1). The lower time complexity makes it possible for DD to become a common algorithm in online systems.

"x-y"system
Samples Generator x y y black box Cannot analyse "x-y"system？  [17]. Traditional ML can generatê y approaching y via x, but cannot analyze the "x-y" system. Interestingly, the trained DD model can be transformed into a Relation spectrum such as the Taylor series for system identification [18,19].
[Analogous to the Fourier transform and Fourier spectrum used to decompose the signal, the DD and the Relation spectrum decompose the system.].

Redundant Calculations in Dendrite Net
DD constructs the interaction terms of the input variables and increases the order by concatenating DD modules (see Figures 1 and 2) [8]. However, when the order of the DD model is higher than the number of inputs, the earlier DD modules build all the interaction terms, while the later DD modules only increase the order without adding interaction terms. For instance, in Figure 1, if the number of inputs is l + 1, then l DD modules have already constructed all the interactive items (the order of l DD modules is l + 1), and the later red DD modules can only increase the order without adding interactive items. Hence, the later DD modules contain redundant calculations, and the order can be increased with less computation than a DD module requires. Therefore, in this paper we replace the later DD modules with an acceleration module that increases the order faster (see Figure 4).
Figure 4 shows the Dendrite Net with the acceleration module. It contains DD modules, a linear module, and an acceleration module. The overall architecture of DD with the acceleration module is shown in Figure 5 and can be written as the composition

A^i = (W^{i,i−1} A^{i−1}) • X (i = 1, …, d),    A^{d+1} = (W^{d+1,d} A^d) • X^c,    Y = W^{d+2,d+1} A^{d+1},

where X and Y denote the input space and the output space. One of the elements in X can be set to 1 to generate a bias. W^{i,i−1} is the weight matrix from the (i − 1)-th module to the i-th module, and the last module is linear. d denotes the number of DD modules, c denotes the power of number, and • denotes the Hadamard product. The DD modules and the linear module have been described previously. The acceleration module is expressed as follows:

A^{d+1} = (W^{d+1,d} A^d) • X^c,    (10)

where A^d and A^{d+1} are the inputs and outputs of the module, and X denotes the inputs of DD with AC (one of the elements in X can be set to 1 to generate a bias). d denotes the number of DD modules, c denotes the power of number, X^c represents X to the power of c, and W^{d+1,d} is the weight matrix from the d-th module to the (d + 1)-th module.
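This composition of d DD modules, one acceleration module of power c, and a linear module can be sketched in NumPy as follows (illustrative only; the module sizes are assumptions):

```python
import numpy as np

def dd_ac_forward(X, dd_weights, ac_weight, lin_weight, c):
    """Forward pass of DD with AC: d DD modules (order +1 each),
    one acceleration module raising the order by c, then a linear module."""
    A = X
    for W in dd_weights:            # DD modules: A = (W A) * X
        A = (W @ A) * X
    A = (ac_weight @ A) * X ** c    # acceleration module: A = (W A) * X^c
    return lin_weight @ A           # linear module
```

With X = [x; 1], identity weights, one DD module, and c = 2, the output order is d + 1 + c = 4, i.e., the model reproduces x^4 when the linear module picks the first channel.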
In order for DD with AC to include all terms up to the target order while using fewer modules, the order of DD with AC n, the number of DD modules d, the power of AC c, and the input dimension of DD with AC a should satisfy the following equations:

d + 1 + c = n,    (c − 1) × a < n,    c × a ≥ n,

where d + 1 + c = n means that the order of DD with AC equals the sum of the order of the DD modules (d + 1) and the order of the acceleration module (c); (c − 1) × a < n means that if c exceeds this range (i.e., the value of c is too large), DD with AC cannot construct all the interactive items; and c × a ≥ n means that if the value of c is too small, redundant computations remain. d and c are calculated by Algorithm 1 for a given order n.
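Algorithm 1 is not reproduced here, but the three constraints determine c and d directly: the two inequalities force c = ⌈n/a⌉, and the equality then gives d = n − 1 − c. A sketch of this derivation (an assumption about Algorithm 1's behavior, derived only from the constraints above):

```python
import math

def ac_architecture(n, a):
    """Given target order n and input dimension a, choose the AC power c
    and the number of DD modules d from the constraints
    d + 1 + c = n,  (c - 1) * a < n,  c * a >= n."""
    c = math.ceil(n / a)        # smallest c with c * a >= n
    d = n - 1 - c               # remaining order supplied by DD modules
    assert (c - 1) * a < n and c * a >= n and d + 1 + c == n
    return d, c
```

For the unary case (a = 2), order n = 4 yields one DD module and c = 2, i.e., the "1DD + AC2" architecture in Table 2.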

Learning Rules
The graphical illustration of the learning rule is shown in Figure 6. As an example, we use one-half of the mean squared error (MSE) as the loss function. The learning rules of the DD modules and the linear module have been described previously. The following set of equations describes the error back-propagation-based learning rule of the acceleration module (see Figure 6) [22].

The error back-propagation of the acceleration module:

dW^{d+1,d} = (1/m)(dA^{d+1} • X^c)(A^d)^T,    dA^d = (W^{d+1,d})^T (dA^{d+1} • X^c).

The weight adjustment of the acceleration module:

W^{d+1,d} := W^{d+1,d} − α dW^{d+1,d},

where dA^{d+1} represents the error from the later module, m denotes the number of training samples in one batch, α is the learning rate, and the other symbols represent intermediate variables or have been explained in Equation (10).
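The back-propagation step of the acceleration module mirrors that of a DD module, with X^c in place of X. A minimal NumPy sketch (the shapes and learning rate are assumptions for illustration):

```python
import numpy as np

def ac_backward(dA_next, X, c, W, A_prev, lr=0.01):
    """One gradient step through the acceleration module
    A_next = (W @ A_prev) * X**c, assuming a 0.5*MSE loss upstream."""
    m = X.shape[1]                      # samples in the batch
    dAXc = dA_next * X ** c             # error gated by X^c
    dW = (dAXc @ A_prev.T) / m          # weight gradient
    dA_prev = W.T @ dAXc                # error passed back to the d-th module
    return W - lr * dW, dA_prev
```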

Experiments and Results
Ref. [8] demonstrates that DD has lower time complexity than traditional polynomials or ML with nonlinear functions, which will speed up the computation of online systems. In addition, DD has white-box properties and controllable accuracy for nonlinear mappings. The main purpose of the following experiments is to demonstrate the feasibility of DD with AC to further speed up the computation while retaining the properties of DD.

Precision and Identification of Unary Nonlinear Mapping
In order to investigate the precision and identification of unary nonlinear mapping, we considered a normalized Bessel function f(x) defined on x ∈ [−10, 0) ∪ (0, 10]; x and f(x) were then normalized to [−1, 1], respectively. We gradually increased the order of DD with AC to approximate the normalized Bessel function (from order 4 to order 15). The architectures of DD were designed following Ref. [8] and are shown in Table 2. The architectures of DD with AC were obtained from Algorithm 1, and the results are shown in Table 2. It is worth noting that the input dimension of DD or DD with AC is set to 2 and one of the inputs is always 1 (bias). A linear module follows each model, and the statistics for this module are not included in this table. "k"DD + AC"j": the model contains "k" DD modules and one acceleration module, and the power of X in the acceleration module is "j".

Figure 7 displays the precision of unary nonlinear mapping using DD with AC. What stands out in this figure is the gradual increase in precision with increasing order, which is present in both DD and DD with AC. Furthermore, the precision of DD with AC is lower than that of DD at the same order, especially at higher orders. This may be because it is more difficult to find the optimal solution using DD with AC than using DD. Although DD with AC has this drawback, it retains the key property of DD; that is, the precision increases with the number of modules, which corresponds to the property of Taylor expansion [8].

The trained DD with AC models were translated into the Relation spectrum of inputs and outputs by formula simplification with MATLAB 2019b [13,18,19]. Concretely, we set the system input variables [1, x] and used them to express the forward propagation formula (see Figure 4). Then, the optimized weights were assigned to the corresponding matrices in DD with AC. Finally, the Relation spectrum was obtained through formula simplification in MATLAB.
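The paper performs this formula simplification in MATLAB; the same step can be sketched with SymPy instead (the 2×2 weight values below are arbitrary placeholders, not trained weights):

```python
import sympy as sp

x = sp.Symbol('x')
X = sp.Matrix([x, 1])                 # inputs [x, 1], bias as the second element

# placeholder weights: two DD modules and one linear module
W1 = sp.Matrix([[2, 1], [0, 1]])
W2 = sp.Matrix([[1, -1], [0, 1]])
W3 = sp.Matrix([[1, 3]])

A = X
for W in (W1, W2):
    A = (W * A).multiply_elementwise(X)   # DD module: (W A) Hadamard X
Y = sp.expand((W3 * A)[0])                # linear module, then simplify
print(sp.Poly(Y, x).all_coeffs())         # the Relation spectrum coefficients
```

Substituting trained weights for the placeholders would yield the model's actual Relation spectrum as an explicit polynomial in x.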
Note: the Relation spectrum, similar to the Fourier spectrum, focuses on transforming the model and observing the corresponding phenomena in the spectrum for analysis. The Relation spectrum presents the polynomial itself after the format transformation, so the transformation can be shown to be valid by observing whether similar relations appear in the Relation spectrum. More explanations and applications can be found in previous studies, such as Refs. [13,18,19].
Turning now to the identification by DD with AC in Figure 8, the comparison between DD and DD with AC shows that DD with AC also retains the properties of DD's Relation spectrum. The differences in the Relation spectrum correspond to the differences in precision in Figure 7: models with large differences in precision also have large differences in the Relation spectrum. In other words, the Relation spectra in Figure 8 explain the models in Figure 7.

Mapping Precision and Identification of Multi-Input Nonlinear System
Modeling a multi-input nonlinear system is an essential and common demand in online systems, such as sensor systems. We randomly constructed a multi-input nonlinear system, whose output is shown in Figure 9 (dotted line) and whose inputs are defined over t ∈ [0, 7].

[Figure 9. Precision comparison between DD and DD with AC for the multi-input nonlinear system. (a) Multi-input nonlinear system using DD with AC. (b) Precision comparison between DD and DD with AC as the target order increases. "k"DD + AC"j": the model has "k" DD modules and one acceleration module, and the power of X in the acceleration module is "j".]

[Table 3. Architectures of DD and DD with AC for the multi-input nonlinear system: number of modules in DD and in DD with AC. A linear module follows each model, and the statistics for this module are not included in this table. "k"DD + AC"j": the model contains "k" DD modules and one acceleration module, and the power of X in the acceleration module is "j".]

Figure 9 presents the precision of the multi-input nonlinear system using DD with AC. What stands out in this figure is that the precision gradually improves with increasing order, for both DD and DD with AC.
The trained DD with AC models were transformed into a Relation spectrum of the inputs and outputs by simplifying the equations in MATLAB 2019b [13,18,19]. Figure 10 provides an identification comparison between DD and DD with AC for a multi-input nonlinear system. The results between DD and DD with AC are similar, revealing that DD with AC also retains the properties of the Relation spectrum in DD. These properties have a wide range of applications, such as analyzing the human brain [19] and physical design [13].

Computation of Time Complexity
Here we take unary nonlinear mapping as an example to calculate the time complexity. The time complexity of the modules is summarized in Table 4.

Table 4. Time complexity in DD and DD with AC for unary nonlinear mapping.

Module                               Time complexity
DD module ("WA • X")                 6 multiplications, 2 additions
Linear module ("WA")                 2 multiplications, 1 addition
Acceleration module ("WA • X^c")     (4 + 2c) multiplications, 2 additions

Unary nonlinear mapping means that the input vector X contains two elements, one of which is 1, for example X = [x, 1]^T. c denotes the power of number; X^c represents X to the power of c.
DD is composed of DD modules and a linear module [8]. Therefore, an order-n DD contains 6(n − 1) + 2 multiplications and 2(n − 1) + 1 additions, where n denotes the order of the polynomial. The time complexity of DD is O(n), which happens to be consistent with Horner's method [7].
We take the two-input system (unary nonlinear mapping) and the four-input system using DD with AC as examples and display the number of modules required for the target order in Figure 11. Concretely, the architecture of DD with AC is obtained by Algorithm 1, and the results are shown in Tables 2 and 3. It is evident from Figure 11 and Table 4 that, compared with increasing the order by one by adding a DD module, adjusting the acceleration module reduces the time complexity by 2 multiplications and 1 addition. Therefore, the time complexity of DD with AC is less than that of DD or Horner's method [7].

[Figure 11. Module numbers required for the target order. (a) Module number for the two-input system; when one of the inputs is 1, it corresponds to the unary nonlinear mapping above. (b) Module number for the four-input system; when one of the inputs is 1, it corresponds to the multi-input nonlinear system above.]
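Using the per-module costs from Table 4, the total operation counts of DD and DD with AC for a given order can be compared directly (a sketch under the assumption, derived from the constraints in Section 2, that c = ⌈n/a⌉ and d = n − 1 − c):

```python
import math

def dd_ops(n):
    """Multiplications/additions of an order-n DD (Table 4):
    (n - 1) DD modules plus one linear module."""
    return 6 * (n - 1) + 2, 2 * (n - 1) + 1

def dd_ac_ops(n, a):
    """Multiplications/additions of an order-n DD with AC for a inputs:
    d DD modules, one acceleration module of power c, one linear module."""
    c = math.ceil(n / a)            # c * a >= n and (c - 1) * a < n
    d = n - 1 - c                   # d + 1 + c = n
    mults = 6 * d + (4 + 2 * c) + 2
    adds = 2 * d + 2 + 1
    return mults, adds

# e.g. order 10, two inputs (unary mapping):
print(dd_ops(10), dd_ac_ops(10, 2))   # DD needs more operations
```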

Experiments of Time Complexity
In addition, to further verify the above results, 20 runs of DD and DD with AC were performed for the two-input system and the four-input system, and the run times (online speeds) were recorded. The tests were executed in MATLAB 2021b on a 2.2-GHz laptop personal computer (PC). Specifically, we recorded the running time of 10,000 forward propagations with 1000 samples for the two-input system and of 10,000 forward propagations with 7000 samples for the four-input system.
All online speeds from the online tests met our expectations (see Figure 12). Comparing the results of Figures 11 and 12, it can be concluded that the online speed corresponds to the number of modules.

Discussion
Prior studies have noted the importance of nonlinear mapping in online computation [1][2][3][4][5]. Our previous studies of DD observed faster nonlinear mapping using DD [8]. Owing to its white-box attribute, controllable precision, and lower time complexity, DD has been applied comprehensively in the areas of energy, traffic, weather, and physical design [9][10][11][12][13][14][15]. However, there are still redundant calculations in DD. This study aimed to eliminate the redundant computation while retaining DD's properties.
According to the analysis of DD, we designed an acceleration module, presented DD with AC, and conducted three experiments. The experimental results are in accord with the theoretical results. DD with AC has a lower time complexity than DD (see Figures 11 and 12), has controllable precision (see Figures 7 and 9), and can also be used for nonlinear mapping and system identification (see Figures 8 and 10). This paper continues our previous studies of DD and may broaden the application of DD in various fields. Table 5 shows some examples of current applications of DD.
The limitation of this paper is the lack of engineering data. Here, DD with AC is verified at a fundamental level through many experiments, which is a stronger form of validation than using a special dataset. Future experiments will be conducted on engineering problems.

Table 5. Some applications of DD.

Application                                                                                         Literature
A hybrid data-driven framework for spatiotemporal traffic flow data imputation                      [10]
Energy saving of buildings for reducing carbon dioxide emissions using novel dendrite net
integrated adaptive mean square gradient                                                            [9]
Unsteady aerodynamics modeling method based on dendrite-based gated recurrent neural
network model                                                                                       [23]
A radial sampling-based subregion partition method for dendrite network-based reliability
analysis                                                                                            [24]
An algorithm for precipitation correction in flood season based on dendritic neural network         [15]
Multi-objective optimization for the radial bending and twisting law of axial fan blades            [13]
An accuracy prediction method of the RV reducer to be assembled considering dendritic
weighting function                                                                                  [14]
An adaptive Dendrite-HDMR metamodeling technique for high-dimensional problems                      [11]
Convolutional dendrite net detects myocardial infarction based on ECG signal measured by
flexible sensor                                                                                     [25]
Photovoltaic power prediction under insufficient historical data based on dendrite network
and coupled information analysis                                                                    [26]

Conclusions
This paper presents a Dendrite Net with an acceleration module for nonlinear mapping and system identification. The theoretical and experimental results suggest that DD with AC retains DD's nonlinear mapping and system identification properties. Interestingly, the time complexity of DD is lower than that of the traditional polynomial form or ML with nonlinear functions and is consistent with Horner's method. The time complexity of DD with AC is lower than that of DD or Horner's method, which provides a new strategy for online systems that require lower time complexity and has the potential to speed up the calculation of basic functions in computers.