An Evolutionary Field Theorem: Evolutionary Field Optimization in Training of Power-Weighted Multiplicative Neurons for Nitrogen Oxides-Sensitive Electronic Nose Applications

Neuroevolutionary machine learning is an emerging topic in the evolutionary computation field and enables practical modeling solutions for data-driven engineering applications. The contributions of this study to the neuroevolutionary machine learning area are twofold. Firstly, this study presents an evolutionary field theorem of search agents and suggests an algorithm for Evolutionary Field Optimization with Geometric Strategies (EFO-GS) on the basis of this theorem. The proposed EFO-GS algorithm benefits from a field-adapted differential crossover mechanism and a field-aware metamutation process to improve the evolutionary search quality. Secondly, the multiplicative neuron model is modified to develop Power-Weighted Multiplicative (PWM) neural models. The modified PWM neuron model involves power-weighted multiplicative units similar to the dendritic branches of biological neurons; it can better represent polynomial nonlinearity, and it can operate in the real-valued neuron mode, the complex-valued neuron mode, and the mixed mode. In this study, the EFO-GS algorithm is used for the training of the PWM neuron models to perform an efficient neuroevolutionary computation. The authors implement the proposed PWM neural processing with the EFO-GS in an electronic nose application to accurately estimate Nitrogen Oxides (NOx) pollutant concentrations from low-cost multi-sensor array measurements and demonstrate improvements in estimation performance.


Introduction
Evolutionary neural networks have been progressing, and neuroevolution, which enables the cooperation of evolutionary computation with neural information processing, contributes to improvements of Artificial Neural Network (ANN) models in data-driven real-world applications [1][2][3][4][5][6][7]. Evolutionary optimization has been used for both architecture optimization of neural network models [8,9] and training of neural networks [5]. Moreover, data-driven evolutionary optimization was shown to be effective in solving real-world problems [10,11]. A comprehensive review of data-driven evolutionary optimization and its engineering applications has been presented in [11]. A prominent advantage of the evolutionary optimization method comes from the easy employment of genetic and evolutionary search processes in searching for solutions of very sophisticated optimization problems [11], and this property can facilitate the training process of neural networks in cases where the gradient-based optimization method is not feasible to apply.

Table 1. Advantages and disadvantages of some fundamental metaheuristic optimization methods that were used for the training of ANNs.

PSO
Advantages: For the training of shallow neural networks, the PSO can present faster convergence than backpropagation algorithms [14] and perform global searching [15].
Disadvantages: Although performing a global search, it is possible to converge to local minima. Inappropriate selection of hyper-parameters of PSO may produce relatively poor results [18].

GA
Advantages: The GA can provide better training performance than backpropagation algorithms [13] because GA performs a gradient-free optimization [15] and global search. It can be effective for training of shallow neural networks.
Disadvantages: The convergence to minimum solution can take longer when hyper-parameters are not well tuned [18].

DE
Advantages: It can perform global searching in the training [23] and find optimal ANN training solutions at the expense of more computation time [27].
Disadvantages: The DE algorithm may cause premature convergence and poor performance [18,28].

The origin of evolutionary computation has a strong connection with evolutionary biology and evolution theory. Algorithms of evolutionary computation may be inspired by the evolution mechanisms of individuals and species at macroscopic and microscopic scales and by the genetics of organisms at the molecular biology level [29][30][31]. In the current study, we propose an evolutionary optimization method on the basis of an Evolutionary Field Theory (EFT) of search agents. To the best of our knowledge, the term evolutionary field theory was used by Papadopoulos et al., who established a stochastic field model of uncertain systems by using the non-homogeneous evolutionary fields developed by Priestley [32,33]. These works considered this term for the spectral analysis of non-stationary processes according to the concept of evolutionary spectra, where spectral functions were assumed to be time-dependent [34]. In the current study, the term evolutionary field is used for a property space where search agent properties evolve in time to better fit the solution environment of a predefined optimization problem. Therefore, we conjecture an evolutionary field theory of the search agents in order to establish a theoretical foundation for the analysis and design of population-based evolutionary algorithms from an agent-environment perspective that is very similar to the basics of reinforcement learning [35]. This theorem can establish a bridge between the majority of population-based evolutionary search algorithms and the foundations of reinforcement learning. The contributions of this study are twofold: (i) suggestion of an evolutionary field optimization; (ii) development of a PWM neural processor for evolutionary nonlinear programming in data-driven applications.
For the training of this PWM neural processor, we suggest an evolutionary metaheuristic optimization method. The proposed algorithm is referred to as Evolutionary Field Optimization (EFO) because it is based on the evolutionary field theory of search agents. This algorithm is suggested to facilitate the training of the power-weighted multiplicative neuron models by implementing neuroevolutionary machine learning. The EFO algorithm performs field-adapted geometric search strategies in the evolution field that is composed of the property codes of search agents. The proposed EFO-GS algorithm implements a hybrid search strategy that combines the advantages of a geometric space search strategy with differential evolutionary search mechanisms. Then, the EFO-GS is implemented for the optimization of weight parameters in the training of PWM neuron models. We provide a deeper analysis of the features of multiplicative neuron models and suggest a power-weighted multiplicative neuron model in order to manage three different operation modes of this neuron, which are the real-valued neuron mode, the complex-valued neuron mode, and the mixed-mode operation. To address the solution of real-valued regression problems, the multiplicative neuron model is modified by appending a special type of activation function, which is referred to as the mapping-to-real function. This function maps the dual properties of complex numbers (real-imaginary parts or magnitude-phase properties) to a real value. Thus, this extension enables us to convert results in the complex-valued domain of the neuron into a real-valued signal at the neuron output. A practical application of a PWM neuron model with EFO training (PWM-EFO) is demonstrated for the estimation of NOx concentration in order to achieve an accurate soft-calibration of a low-cost multisensor array. An experimental study was conducted, and the effectiveness of the proposed estimation models was demonstrated for electronic nose applications.

A Brief Review of Pathways from Additive Neurons and Multiplicative Neurons
In order to perform practical machine learning tasks, ANNs have been widely preferred for the identification of black-box models from very sophisticated and noisy data stacks. The topic of ANN models can be traced back to the suggestion of a simple artificial neural cell model by the physiologist Warren McCulloch and the mathematician Walter Pitts in 1943 [36]. However, harnessing the learning power of ANNs began after Rosenblatt's multilayer perceptron and the proposition of the backpropagation algorithm [37]. These advances have been milestones on the way to deep neural networks. Then, a variety of application areas emerged, such as modeling [38], control [39], signal processing [40], and image processing [41]. The backpropagation algorithm has been a widely preferred training algorithm for multilayer feedforward ANNs in order to establish a neural model of the input-output relations in the training datasets [42]. Although there exist several variants of backpropagation algorithms, the Levenberg-Marquardt (LM) algorithm is widely used since it provides an enhanced training performance for feedforward multilayer neural networks [43,44]. The role of activation functions and the design of parametric activation functions by using evolutionary methods were discussed in [45].
An extension of the basic neuron model, known as multiplicative neural networks, emerged in the 1980s. The use of multiplicative units in neurons and their high-order model representation capability were discussed by Giles et al. [46]. Later, Durbin and Rumelhart suggested a multiplicative neural network structure. Thus, product units have been considered as a new form of computational unit for feedforward neural networks. They conjectured that the multiplication unit was more biologically plausible and more computationally powerful than the addition unit [47]. Another work reported that the multiplicative neural network can solve some problems by using fewer neurons than additive networks, and heuristic methods were suggested for the solution of training problems of multiplicative networks [48]. Afterwards, Schmitt investigated the computational complexity and learning capabilities of multiplicative neural networks and provided a detailed survey of their biological and computational origins [49]. Besides its computational origin, multiplicative neural activity has some neurobiological bases: Salinas and Abbott reported that multiplicative neural responses could arise in the overall responses of the neuron population in the parietal cortex, and that multiplicative gain modulation could play an important role in the transformation of object locations from retinal to body-centered coordinates. They conjectured that neurons with multiplicative responses can act as powerful computational elements in biological neural networks [50]. The main reason for this analogy with multiplication is that nonlinear relations in a neural system can be better represented in modeling by product units than by additive units.
In the machine learning domain, Simon revealed an interesting correspondence between the multiplicative neuron and the additive neuron, according to the identity $\prod_j x_j^{p_{i,j}} = e^{\sum_j p_{i,j} \ln(x_j)}$ [47], and remarked that the multiplicative neuron network can be expressed in the form of an additive neuron network with a different nonlinearity [51]. In a parallel line of work, polynomial neural networks have been progressing, and their advantage over additive networks has been investigated [52][53][54]. Relations between polynomial regression and classical neural networks were discussed, and polynomial activation functions were shown on the basis of the Taylor theorem [54].

Evolutionary Field Search
In general, population-based evolutionary algorithms (e.g., genetic algorithm, differential evolution, particle swarm optimization, etc.) implement a collection of search agents that iteratively reposition candidate solutions in the search space of an optimization problem to find a better solution point during the optimization process. Repositioning of search agents is commonly performed by predefined evolution rules; for instance, fundamental genetic processes for the genetic algorithm and motion equations for swarm-based metaheuristic optimization methods. On the other hand, the field of reinforcement learning is closely related to the optimization of an agent's response in an environment from its experience [35]. This study provides a bird's eye view of population-based evolutionary search algorithms from the perspective of reinforcement learning. This theorem may also be useful for the analysis and design of memetic algorithms and memetic computing [55]. The following section aims to present an evolutionary field theorem that establishes a common foundation for the analysis of these types of search agents.

Evolutionary Field Theorem of Search Agents
The evolution field is a multi-dimensional space of agent property codes, where agent properties are represented by a property code $X_k \in \mathbb{R}^D$ (the parameter $D$ is the dimension of the property code and $k$ is the agent index) and evolve in time. Search agents, which are characterized by their property codes, only act in a solution environment of optimization problems, and each agent represents an individual in the solution environment. Commonly, a selection mechanism is designated such that the chances of survival of the agents depend on their fitness to the solution environment. Therefore, the objective function $F(X_k)$ measures the suitability of the agent property code to represent an optimal solution of the environment. The value of $F(X_k)$ expresses the field value of the property code, and the field value has been widely used in the repositioning of agent property codes within the evolution field. Hence, the field value is considered for the evolution and selection of agents. In essence, objectives assign a suitability value to the property code of the search agent in the evolutionary field. The evolutionary field can be defined by a closed set of property codes and the objective function in the form $(X_k, F(X_k))$, and this closed set is a minimal set to design evolution strategies of agent properties in the field. Figure 1 depicts the evolution field of the property codes and their association with the agents of the solution environment. The property code is represented by a vector $X_k$, where the vector elements $x_{k,j}$ represent the $j$th property of the search agent $k$. To manage geometrical evolution strategies of property codes, agent properties are widely embedded into a Cartesian coordinate system. Thus, distance metrics become valid in order to express evolutionary relations between the agent property codes. Consequently, property codes in Cartesian coordinates establish a metric space $(X_k, d)$, where the operator $d$ represents a metric that satisfies:

1. $d(X_i, X_j) \geq 0$: the metric $d(X_i, X_j)$ expresses the dissimilarity of agent properties in the defined metric space. The equality state $d(X_i, X_j) = 0$ implies that agent $i$ and agent $j$ are the same agent in the solution environment. Values of $d(X_i, X_j)$ can express a measure for the differentiation of agent properties, and they can be used to evaluate the amount of evolution of the agent property code;
2. $d(X_i, X_j) = d(X_j, X_i)$: agent properties do not apply any priority;
3. $d(X_i, X_j) \leq d(X_i, X_k) + d(X_k, X_j)$: agent properties obey the triangle inequality, which allows geometrical evolution strategies to be defined. The shortest evolutionary path does not involve any deflection in the code space.
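The three metric conditions listed above can be checked numerically for a concrete choice of $d$. The sketch below assumes the Euclidean distance as the metric and uses a small random set of hypothetical property codes; the function names are illustrative and not part of the original formulation.

```python
import itertools
import numpy as np

def euclidean(a, b):
    """Euclidean distance between two property codes (assumed metric d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def check_metric_axioms(codes, d=euclidean, tol=1e-12):
    """Return True if d satisfies the three metric conditions on the given codes."""
    for xi, xj in itertools.product(codes, repeat=2):
        if d(xi, xj) < 0:                       # condition 1: non-negativity
            return False
        if abs(d(xi, xj) - d(xj, xi)) > tol:    # condition 2: symmetry
            return False
    for xi, xj, xk in itertools.product(codes, repeat=3):
        if d(xi, xj) > d(xi, xk) + d(xk, xj) + tol:  # condition 3: triangle inequality
            return False
    return True

# Six hypothetical agents with property codes of dimension D = 3.
rng = np.random.default_rng(0)
codes = rng.uniform(-1.0, 1.0, size=(6, 3))
axioms_hold = check_metric_axioms(codes)
```

Any other distance function can be passed as `d` to test whether it is a valid basis for geometrical evolution strategies on a given code constellation.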
Relocation of the agent property code in the evolution field results in a change of agent properties, and this property code relocation is referred to as the evolution of the agent. The amount of evolution can be expressed by the distance metric in the evolution field. Let us assume that the property code $X_k[n]$ of agent $k$ at instance $n$ changes to a property code $X_k[n+1]$ at instance $n+1$. The amount of evolution at this instance can be measured by the seasonal evolution, that is, the distance $d(X_k[n+1], X_k[n])$ between the two codes. The seasonal evolution rate in the property code of agent $k$ can be written relative to the current code as $d(X_k[n+1], X_k[n]) / \|X_k[n]\|$, where the operator $\|X_k[n]\|$ represents the norm of the vector $X_k[n]$. Evolution of some agent properties may be advantageous for the survival of agents that act in the solution environment, and some may not be advantageous. A higher field value $F(X_k[n])$ implies a higher tendency to evolve, and the evolutionary energy density at the code $X_k[n]$ can be expressed by the value of $F(X_k[n])$. Advantageous evolution can be perceived with the condition $F(X_k[n+1]) < F(X_k[n])$. Then, the negative derivative condition to minimize the field value (evolutionary energy density) of agent $k$ can be written on the basis of Lyapunov stability as $F(X_k[n+1]) - F(X_k[n]) < 0$. (See Remark A1 in Appendix A for the mathematical foundation of this negative derivative condition.) Useless seasonal evolution can be detected by checking the condition $F(X_k[n+1]) \geq F(X_k[n])$. Since evolution is a continuing process, a useful evolutionary path for agent property $k$ can be expressed for $L$ seasons as $F(X_k[n+L]) < F(X_k[n+L-1]) < \dots < F(X_k[n+1]) < F(X_k[n])$. To our knowledge, selection mechanisms in nature may not always know or be aware of the most advantageous path over a long horizon of the evolution process. Therefore, the selection mechanism in nature can be assumed to behave in the manner of Markovian processes: useful transitions of agent properties, namely the seasonal advantageous evolution of the property code, can be expressed according to the current state of field values as $F(X_k[n+1]) < F(X_k[n])$. The quality of the property evolution can be expressed by the normalized loss in evolutionary energy, where the seasonal quality index $Q[n]$ takes a value in $[-1, 1]$. A value of $-1$ implies low quality, and a value of $+1$ implies high quality in seasonal evolution.
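To make the seasonal quality index concrete, the following sketch uses one normalized form that is consistent with the stated range $[-1, 1]$: the loss in field value divided by the sum of the absolute field values. This specific formula, the zero-denominator convention, and the function names are assumptions made for illustration, not a statement of the exact expression used in the study.

```python
def is_advantageous(f_old, f_new):
    """Seasonal evolution is advantageous when the field value decreases."""
    return f_new < f_old

def seasonal_quality(f_old, f_new):
    """Assumed quality index: normalized loss in field value, bounded in [-1, 1]."""
    denom = abs(f_old) + abs(f_new)
    if denom == 0.0:
        # Both field values are zero: no change, so quality is taken as zero.
        return 0.0
    return (f_old - f_new) / denom

# A drop of the field value from 4.0 to 1.0 is advantageous with quality 3/5.
q_good = seasonal_quality(4.0, 1.0)
# An unchanged field value gives zero quality.
q_none = seasonal_quality(2.0, 2.0)
# A sign change from positive to negative saturates the index at +1.
q_max = seasonal_quality(1.0, -1.0)
```

Under this assumed form, the triangle inequality guarantees $|f_{old} - f_{new}| \leq |f_{old}| + |f_{new}|$, so the index cannot leave $[-1, 1]$.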

An Evolutionary Field Optimization with Geometric Strategies
The EFO-GS algorithm implements a hybrid search methodology that aims to benefit from the advantages of geometrical search strategies in the evolutionary field to improve the differential evolution processes. Effective geometrical search methods have been shown to converge to minimum points [56][57][58]. The EFO-GS algorithm evolves an initial property code $X_k$ towards the seasonal best of codes ($X_{best}[n] = \arg\min_{X_j} F(X_j[n])$, $j = 1, 2, 3, \dots, h_k$, where the parameter $h_k$ is the size of the agent population) by repeatedly performing advantageous seasonal evolution of the property code according to the scattering geometry of the agents' property codes within the evolution field. Two essential genetic mechanisms are employed to perform the seasonal evolution of agent properties:
(i) Field-adapted differential crossover of property codes: The property difference is expressed as $\Delta(x_{k,i}, x_{p,i}) = x_{k,i} - x_{p,i}$, where $x_{k,i}$ is the $i$th property of the property code $X_k$ and $x_{p,i}$ is the $i$th property of the property code $X_p$. (To keep the formulation simple and clear, we prefer element-wise formulations to vector or matrix forms.) The field-adapted differential crossover is performed by using the property difference $\Delta(x_{k,i}, x_{p,i})$ according to a predefined geometrical rule on the field. Figure 3 depicts a geometrical interpretation of the field-adapted differential crossover for the $x_{k,i}$ and $x_{p,i}$ components of property codes $X_k$ and $X_p$. To perform a high-quality property evolution in the convex part of the field, this geometrical rule is proposed as follows: each property in the property code performs a differential crossover with a magnitude of the seasonal quality factor $|Q[n]|$ towards the agent $X_k[n]$ with the lower field value $F(X_k[n])$. Here, $x_{pk,i}$ represents the evolved property that results from this geometric crossover rule. Arithmetically, this rule can be expressed as $x_{pk,i} = x_{p,i} + |Q[n]| \, \Delta(x_{k,i}, x_{p,i})$. It is useful to consider the change of the quality factor magnitude depending on property codes.
Figure 4 shows the magnitude of the seasonal quality factor for a relative change of the field values of the agent $X_k$ as $F(X_k) = \gamma F(X_p)$, $\gamma \in [-1, 1]$. The case $\gamma = 1$ implies $F(X_k) = F(X_p)$, and it leads to a zero value for the seasonal quality factor magnitude ($|Q[n]|$). No geometrical crossover is applied, and it yields $x_{pk,i} = x_{p,i}$ in this case. The case $\gamma \leq 0$ implies $F(X_k) < F(X_p)$, and it yields the highest value for the seasonal quality factor; meanwhile, the geometrical differential crossover is fully performed toward the property code $X_k$ that has the lower field value. Recently, a different definition of evolution quality and distance metrics was employed to improve differential evolution performance [21]; however, the algorithmic structure and evolution formulations in [21] are not the same as those of the EFO algorithm.

Figure 3. An illustration that describes the field-adapted differential crossover between the $x_{k,i}$ and $x_{p,i}$ components of property codes $X_k$ and $X_p$.

A property code search in locations around the seasonal best agent code $X_{best}[n]$ improves the convergence speed of the optimization process. This leads to a seasonal exploitation in the evolutionary field at the best possible seasonal quality factor $Q[n]$ for each agent. (A proof of this proposition is given with Remark A2 in Appendix A.) All agents perform a geometrical crossover at the highest quality factor toward the seasonal best code $X_{best}[n]$, and thus maximize the total quality (the sum of the seasonal quality factors of the agents). Then, a field-adapted differential crossover toward $X_{best}[n]$ is defined according to a quality function-based evolution rule. This field-adapted differential crossover is applied to the property codes of all agents except the best agent $X_{best}[n]$. The best agent $X_{best}[n]$ performs field-aware metamutation. The evolutionary field values of agents are measured by the objective function according to the performance of all agents in a solution environment. The seasonal best agent of the evolutionary field is found by $X_{best}[n] = \arg\min_{X_j} F(X_j[n])$. Then, the field-adapted differential crossover update of the code property in the evolutionary field moves the $i$th property of each agent toward the corresponding property of $X_{best}[n]$ with the magnitude of its seasonal quality factor. Such an update of the $i$th property of agents enables the transformation of agents, with their experience in the solution environment, toward the more successful agent properties with the highest total seasonal quality factor.
(ii) Field-aware mutation and bifurcated metamutation of property codes: To gain field awareness in the mutation process, the mutation is performed depending on the field value $F(X_k[n])$. Thus, the evolution tendency of agent properties is regulated according to their field values. More mutation tendency is promoted for agent property codes that suit the solution environment less, because their field values are high. This regulation results in field awareness in the mutation process. This behavior is closely related to the conjecture that increased difficulty in living conditions and a rise in environmental stresses can lead to more coincidental mutation of living organisms in nature, and such an increase in the mutation tendency, in turn, contributes to the search for more suitable characteristic properties to manage the adaptation of the organism to the environment. In this mutation rule of the proposed algorithm, less fitting agents of the solution environment should exhibit more tendency to mutate in the evolutionary field, and this leads to more exploration in the evolutionary field. The seasonal field-aware mutation tendency of an agent property is expressed relative to the field value of the best agent of the evolutionary field. The mutation update of agent properties in the field should be a stochastic process to enrich search possibilities; the field-aware mutation update is therefore randomized by the parameter $r_m$, which is a uniform random number in the range $[-0.5, 0.5]$, and scaled by $p_g$, which stands for the field-aware mutation range. This provides a randomization in the mutation of the agent properties in a range that is relative to the field value of the best agent property. When the field value of the best agent decreases to a lower field value, the other agents tend to perform more mutations to become more competitive in exploration. If the best agent does not reach low field values, the other agents become more conservative by reducing their mutation range $p_g$ to perform more exploitation in their local property space. After introducing the field-adapted differential crossover and the field-aware mutation process of the property codes, the next-generation agent property in the field is produced by aggregating the contributions of the crossover and mutation processes to the property code in each season. The property code update rule of an agent can be written as a linear combination of the updates of these genetic processes.
However, the best agent property does not perform the field-adapted differential crossover and the field-aware mutation as the other search agents do. To gain more awareness of the field topology, the best property code is rewarded to mutate around the field center of all property codes. This metamutation behavior considers the constellation of the other property codes and benefits from the geometrical knowledge associated with the center of the property code distribution. This knowledge is extracted by calculating the reverse-weighted center $X_0$ of the code constellation. The reverse-weighted center is a formulation of the behavior that a property code with a lower field value has a higher weight in the determination of the center point. (The roulette wheel selection method of the genetic algorithm uses a similar formulation for its reproduction process.) Since bifurcated metamutation of the best property code around a global property center and a personal property center is useful to increase the exploration potential of the best property code, we perform a bifurcated metamutation by selecting one of two processes with equal probability. The first one is the search of regions around the reverse-weighted center within a scattering radius $A_{scat}$ of the other agent code constellations in the field. The mutation in this field region is called property assimilation. The second is the random exploration of its own surrounding property code space with a scattering radius $A_{scat}$. This mutation region is called property conservatism. The scattering radius of the code constellation is determined by the absolute scattering radius of agent constellations in the evolutionary field. Finally, the best property code produces a new code around one of these two centers, where the selection between them is made by rand, a uniform random number in the range $[0, 1]$. The parameter $r_b$ is a random number in the range $[-0.5, 0.5]$ to randomize the search in these regions, and $p_{out}$ is a weight coefficient to resize the absolute scattering radius $A_{scat}$. The term metamutation was used in several previous works in a different context to define mutation process improvements [59][60][61]. However, these definitions are not the same as the concept of the bifurcated metamutation process that is defined in the current study. Figure 5 shows a depiction of the two search areas of metamutation of the best property code. These regions are indicated by a dashed circle with the center a for the property assimilation and a dashed circle with the center b for the property conservatism. As a result, the best agent property code repositions with a probability of 0.5 in the field space around the reverse-weighted center $X_0$ (a preference of property assimilation to survive among other successful agents) or around its own position $X_{best}$ (a preference of property conservatism to survive with its own possessions), in both cases within the radius $A_{scat}$.
Steps of the EFO-GS algorithm can be summarized as follows:
Step 1 Randomly distribute all agent property codes $X_k$ within the evolution field;
Step 2 Calculate the field values $F(X_k)$ for each property code $X_k$;
Step 3 Select the seasonal best agent property code as $X_{best}[n] = \arg\min_{X_j} F(X_j[n])$;
Step 4 Perform the combination of field-adapted differential crossover and field-aware mutation for all agent property codes except the seasonal best agent property code $X_{best}$ according to Equation (15), and obtain new generation candidates of the seasonal property codes, $\tilde{X}_k$;
Step 5 Perform only bifurcated metamutation for the seasonal best agent property code $X_{best}$ according to Equation (17), and obtain a new generation candidate of the seasonal property code, $\tilde{X}_{best}$;
Step 6 Form a seasonally evolved new generation set of the seasonal property codes as $\{\tilde{X}_{best}, \tilde{X}_k\}$, and calculate the field values for each new generation property code in this set;
Step 7 Select the agent property codes with lower field values from the old and new property code collections $\{X_k, \tilde{X}_k\}$ and update the set of $X_k$;
Step 8 If a predefined stopping criterion is met, select the best agent property code with the lowest field value as the optimal solution of the optimization problem. Otherwise, go back to Step 3.
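The eight steps above can be sketched as a compact loop. The sketch below follows the step structure only; the concrete operator formulas (the quality magnitude `q`, the mutation scaling, the reverse-weighted center weights, the scattering radius `a_scat`, and the pooled selection in Step 7) are simplified assumptions chosen for illustration, and the function and parameter names are hypothetical. It is not a reproduction of Equations (15) and (17).

```python
import numpy as np

def efo_gs_sketch(objective, dim, n_agents=20, n_seasons=200,
                  bounds=(-5.0, 5.0), p_g=0.1, p_out=0.5, seed=0):
    """Simplified EFO-GS-style loop; operator details are assumptions."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # Step 1: randomly distribute the property codes in the field.
    X = rng.uniform(lo, hi, size=(n_agents, dim))
    # Step 2: field values of all codes.
    F = np.array([objective(x) for x in X])
    history = [float(F.min())]
    for _ in range(n_seasons):
        # Step 3: seasonal best code.
        b = int(np.argmin(F))
        x_best, f_best = X[b].copy(), F[b]
        # Assumed quality magnitude of each agent relative to the best agent.
        denom = np.abs(F) + abs(f_best)
        q = np.abs(F - f_best) / np.where(denom > 0, denom, 1.0)
        # Step 4: crossover toward the best code plus field-aware mutation.
        r_m = rng.uniform(-0.5, 0.5, size=X.shape)
        X_new = X + q[:, None] * (x_best - X) + p_g * q[:, None] * r_m * (hi - lo)
        # Step 5: bifurcated metamutation of the best code.
        w = F.max() - F + 1e-12            # lower field value -> higher weight
        center = (w[:, None] * X).sum(axis=0) / w.sum()
        a_scat = np.abs(X - center).mean(axis=0)   # assumed scattering radius
        r_b = rng.uniform(-0.5, 0.5, size=dim)
        anchor = center if rng.random() < 0.5 else x_best
        X_new[b] = anchor + p_out * a_scat * r_b
        X_new = np.clip(X_new, lo, hi)
        # Step 6: field values of the new generation candidates.
        F_new = np.array([objective(x) for x in X_new])
        # Step 7: keep the codes with the lowest field values (pooled selection).
        X_all = np.vstack([X, X_new])
        F_all = np.concatenate([F, F_new])
        keep = np.argsort(F_all)[:n_agents]
        X, F = X_all[keep], F_all[keep]
        history.append(float(F.min()))
    # Step 8: a fixed season budget is used as the stopping criterion.
    b = int(np.argmin(F))
    return X[b], float(F[b]), history

def sphere(x):
    """Simple convex test field."""
    return float(np.sum(x ** 2))

x_opt, f_opt, best_history = efo_gs_sketch(sphere, dim=2, n_seasons=50)
```

Because Step 7 selects from the union of old and new codes, the best field value in `best_history` cannot increase from one season to the next, which mirrors the advantageous-evolution condition on the field value.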

Evolutionary Training of Power-Weighted Multiplicative Neural Processor via Evolutionary Field Optimization Algorithm
Evolutionary optimization methods have been used in the training of artificial neural networks [7,12,62,63], and this training method is known as evolutionary training. Initially, Whitley et al. used a genetic algorithm for the weight optimization of a neural network with binary encoding of weights [62], where the weight coefficients of the neural network are represented by a string of binary values. This causes a limitation for weight values because the precision of binary encoding may not be sufficient for every application [7]. Then, real number encoding was used to express weights in the genetic algorithm [64]. Some works reported that the training performance of GA was comparable with that of the backpropagation method because the backpropagation method uses a gradient-based local search and may easily fall into local minima [13]. Evolutionary training methods can perform a global search strategy, and this may improve the search performance compared to a local search. However, the number of optimized parameters (dimension of the search space) can limit the performance of evolutionary methods due to the exponential growth of the search space. Xiangping et al. showed that a hybridization of GA and backpropagation methods can improve the training performance, where the GA determines optimal initial values of weights for the backpropagation algorithm [65]. In the literature, several Evolutionary Algorithms (EAs) have been used for the training of ANNs [24,66,67], and their performance improvements and shortcomings were discussed. In the current study, EFO training of the suggested power-weighted multiplicative neural processor is carried out, and an application in electronic nose design for NOx measurement and control for the aerospace industry and air quality is presented in the following sections.
Electronic noses have been widely utilized for the detection and classification of gases by implementing machine learning classifiers [68,69]. They have also been used for the accurate measurement of gas concentrations by using machine learning-based sensor calibrators [70][71][72]. Today, electronic nose technologies can contribute to the improvement of many daily-life processes. For instance, gas sensors and electronic nose solutions promise important agricultural applications, such as the monitoring and prediction of important parameters related to the growth and harvest of a crop, and they allow data-driven management practices in several stages of agricultural activities [73]. Another useful application of electronic noses was demonstrated for the discrimination of pathogenic bacterial volatile compounds [74].

Preliminaries for Multiplicative Unit
The addition of multiplicative units to the classical neuron model of McCulloch and Pitts [36] has some origins in biological research studies [47,49] and mathematical studies [46,49]. First of all, multiplicative units can increase the nonlinear approximation and representation capabilities of neural networks [46,49]. Some works, which implemented multiplication in neural processing, have suggested that the use of polynomial nonlinearity [52,53], power series, and Binomial series [54] in the classical neuron model contributes to approximation skills of neuron models compared to classical neuron models. Effects of product units in neural processing have been also elaborated in several preliminary works [47][48][49]. The multiplicative unit (product term) was defined as where the input variable x j is a positive real variable and the power (exponent) p j is a positive real number [48,51]. It is very useful to consider Simon's discussion to gain deeper insight on relations between that multiplicative neural networks and additive neural networks [51]. Simons revealed the fact that the multiplicative neural network can be expressed in the form of an additive network with a different nonlinearity formulation that originates from the identity ∏ j x p i,j = e ∑ j p i,j In(x j ) [47,51]. This was a very important and useful observation from a machine learning point of view because it opens a door for implementation of multiplicative elements in neuron networks similar to additive elements. In this section, we extend this discussion and consider the relation between multiplicative units and the weighted geometric average to observe some assets. The multiplicative unit, which is defined by Equation (20), is a generalization of the weighted geometric average operator, which has been solely utilized in the calculation of multiplicative preference or priorities in decision making [75][76][77][78]. 
The multiplicative unit turns into a weighted geometric average operator when the normalization condition $\sum_{j=1}^{h} p_j = 1$ is satisfied (see Remark A3 in Appendix A). In essence, the multiplication of variables can be useful to process exponential relations between the parameters. Another advantage may be that the multiplication operator spreads results over a wider value set than the addition operator, because the product of parameters is greater than their sum in many cases (for example, whenever all parameters exceed 2). In contrast, additive units perform the weighted sum operation, which corresponds to the arithmetic average.
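The two relations discussed above (the additive log form of the product unit and its reduction to a geometric average under the normalization condition) can be checked numerically. The following is a minimal sketch; the function names and the sample values are illustrative assumptions and do not come from the study.

```python
import math

def multiplicative_unit(x, p):
    """Product of x_j ** p_j for positive real inputs x_j (Equation (20))."""
    u = 1.0
    for x_j, p_j in zip(x, p):
        u *= x_j ** p_j
    return u

def multiplicative_unit_log_form(x, p):
    """The same unit written additively: exp(sum_j p_j * ln(x_j))."""
    return math.exp(sum(p_j * math.log(x_j) for x_j, p_j in zip(x, p)))

# Hypothetical sample values, chosen only for illustration.
x = [2.0, 8.0, 4.0]
p = [0.5, 1.5, 2.0]
product_form = multiplicative_unit(x, p)
log_form = multiplicative_unit_log_form(x, p)

# With equal powers p_j = 1/h (so that sum_j p_j = 1), the unit
# reduces to the plain geometric average of the inputs.
h = len(x)
geo_mean = multiplicative_unit(x, [1.0 / h] * h)
```

Here `product_form` and `log_form` agree up to floating-point rounding, and `geo_mean` equals the cube root of 2 · 8 · 4.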

Power-Weighted Multiplicative Neural Processing
This section introduces a general formulation of the power-weighted multiplicative neuron model for artificial neural processing of data. Figure 6 shows a block diagram that represents the essential functional blocks of this neuron model. There are two additional functional blocks that are appended to the classical neuron model suggested by McCulloch and Pitts [36]. [79] and this is an important effect for processing real-world stimulus. The dendritic activity is not explicitly considered in the classical neuron model as a separate function; instead, both the nonlinearity effect and the output value limiting effect are performed by designing suitable activation functions. We used power-weighted multiplication operations in order to represent nonlinear relations in the unification of dendritic branches. The power-weighted multiplication is expressed to process neuron inputs $x_1, x_2, x_3, \ldots, x_h$ as

$$u_i = \prod_{j=1}^{h} x_j^{p_{i,j}} \qquad (21)$$

where the power weight $p_{i,j}$ is the exponent applied to the jth input in the ith dendritic branch of the neuron. The parameter h is the number of inputs in the neuron model. Then, the results of the dendrites are collected by using a weighted sum operation and adding a bias b as follows:

$$v = \sum_{i=1}^{m} w_i u_i + b \qquad (22)$$

where the parameter $w_i$ is the weight of the ith dendritic branch and m is the number of dendritic branches. Due to the fractional power of inputs, the PWM neural model can produce complex numbers, and accordingly, it can work as a complex-valued neuron. To see this operation, let us assume a negative input $x < 0$ and a fractional power $p_r \in \mathbb{R}$ that is a non-integer number, $p_r \notin \mathbb{Z}$. Then one can write (see Proposition A1 in Appendix A)

$$x^{p_r} = |x|^{p_r} (\cos(\pi p_r) + j \sin(\pi p_r)) \qquad (23)$$
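Equation (23) can be verified numerically against the principal value of a complex power. The sketch below is illustrative only; the helper name `negative_power` and the sample values are assumptions made for this example.

```python
import math

def negative_power(x, p):
    """Right-hand side of Equation (23) for a negative real input x."""
    magnitude = abs(x) ** p
    return complex(magnitude * math.cos(math.pi * p),
                   magnitude * math.sin(math.pi * p))

# Hypothetical sample: a negative input with a non-integer power.
x, p = -8.0, 1.0 / 3.0
direct = complex(x, 0.0) ** p      # principal value computed by Python
formula = negative_power(x, p)     # value predicted by Equation (23)

# An integer power keeps the result on the real axis (up to rounding).
integer_case = negative_power(-2.0, 2)
```

For this sample, both `direct` and `formula` are close to 1 + j√3, whereas `integer_case` has a vanishing imaginary part.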
Equation (23) clearly shows that a negative-valued input $x_j < 0$ and a non-integer power weight $p_{i,j} \in \mathbb{R} \setminus \mathbb{Z}$ result in a complex value $u_i \in \mathbb{C}$. Consequently, the PWM neural processor can perform in the complex-valued neuron mode. Advantages of complex-valued neurons have been comprehensively reviewed by Bassey et al. [80], who highlighted the contributions of the additional phase information to the neural learning process. In order to contribute to this discussion in the current study, Figure 7 illustrates the domain of real-valued neurons and the domain of complex-valued neurons. The figure demonstrates the co-domain expansion of the neural function from a one-dimensional line into a plane by means of processing complex values. The complex signal properties associated with this domain expansion (e.g., real and imaginary components, magnitude and phase properties) are also shown in the figure. Such expansion to the complex number domain can enhance the data processing skills of PWM neurons because the complex-valued operation zone already covers its real-valued counterpart in the computation task. In summary, the proposed PWM neuron model can operate as a real-valued neuron, a complex-valued neuron, or a mixed-mode neuron depending on the interval of input values. Table 2 lists the operation modes of a PWM neuron. To switch the operation mode of the neuron between the complex-valued and the real-valued modes, an interval shifting scheme is suggested for the set of the input x.

Table 2. Operation modes of power-weighted multiplicative neurons.

Intervals | Operation Modes | Interval Reversing to Switch between Operation Modes
x < 0 | Complex-valued Neuron | When x ≥ 0 in the real-valued mode, use −x as input data because −x < 0
x ≥ 0 | Real-valued Neuron | When x < 0 in the complex-valued mode, use −x as input data because −x > 0
x_j < 0 and x_j ≥ 0 | Mixed-mode Neuron | It operates in the mixed mode when input data have positive and negative values

Due to the complex-valued operations of the power-weighted multiplicative neural processing, we added a mapping-to-real function to obtain real-valued outputs. The mapping-to-real function maps the weighted sum of dual complex number properties to a real value, which converts the results of neural processing in the complex number domain to a real-valued signal for transmission to the neuron output. A generic mapping-to-real function is defined as

$$s = a_1 z_{1\lambda}(v) + a_2 z_{2\lambda}(v) \qquad (24)$$

where $z_{1\lambda}(v)$ and $z_{2\lambda}(v)$ are dual property functions, and parameters $a_1$ and $a_2$ are the corresponding weights of the dual properties of complex numbers (e.g., real-imaginary properties or magnitude-phase properties). Complex numbers have two types of dual properties that can be implemented by the $z_{1\lambda}(v)$ and $z_{2\lambda}(v)$ functions with $\lambda = c, p$. These are

(i) Cartesian ($\lambda = c$) properties: real and imaginary parts of the complex number $v = v_r + jv_{im}$:

$$z_{1c}(v) = \mathrm{Re}\{v\} = v_r \quad \text{and} \quad z_{2c}(v) = \mathrm{Im}\{v\} = v_{im} \qquad (25)$$

(ii) Polar ($\lambda = p$) properties: magnitude and phase properties of the complex number $v = v_r + jv_{im}$:

$$z_{1p}(v) = |v| = \sqrt{v_r^2 + v_{im}^2} \quad \text{and} \quad z_{2p}(v) = \arg(v) \qquad (26)$$

If parameters $a_1$ and $a_2$ are determined during the training process, the mapping-to-real function contributes to the learning process and performs a trainable mapping. However, they can be set to fixed values to gain desired properties for the neuron. For example, for polar properties, setting $a_1 = 1$ and $a_2 = 0$ results in a mapping that depends on the magnitude of the complex number and yields a positive real number. In addition, a mapping according to the phase information can be obtained by setting $a_1 = 0$ and $a_2 = 1$.
Following the mapping-to-real function, an activation function can be used to limit the output values of the PWM neuron to predefined output ranges. This may represent the limited amplitude signals in the synaptic transmission of biological neurons. Well-known choices for the activation function $\phi(s)$ are the linear activation, which leaves the output unchanged, the sigmoid activation, which limits the output to the range [0, 1], and the hyperbolic tangent activation, which limits the output to the range [−1, 1]. Other popular activation functions or parametric activation functions can also be used. The output of a neuron is written as

$$y = \phi(s) \qquad (27)$$

This neuron model can be a generalization of other artificial neuron models, and it is capable of expressing several well-known models after properly selecting the PWM neuron model parameters. Table 3 shows several neuron models that are obtained as special cases by a suitable selection of PWM neuron model parameters. This reveals the fact that the model representation capacity of the PWM neuron model covers these models. Such a model coverage enhancement leads to additional parameters to optimize and to the associated training difficulties. Therefore, neuroevolutionary approaches and metaheuristic methods can be preferable in the training of this type of sophisticated neural model. In the current study, the proposed EFO-GS algorithm is implemented to perform the evolutionary training of PWM neural models.

Table 3. Reduction of the PWM neuron model to other neuron models via the proper parameter setting.

Proper Parameter Configuration Neural Network Model Model Formulation
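None

Let us express the overall network function of PWM neurons. To consider a complex-valued neuron, which is the general case in the operation of a PWM neuron, one can rewrite Equation (22) for negative inputs $x_j < 0$ by using Equation (23) in Equation (21) as (see Theorem A1 in Appendix A)

$$v = \sum_{i=1}^{m} w_i \left( \prod_{j=1}^{h} |x_j|^{p_{i,j}} \right) \left( \cos\left(\pi \sum_{l=1}^{h} p_{i,l}\right) + j \sin\left(\pi \sum_{l=1}^{h} p_{i,l}\right) \right) + b \qquad (28)$$

The real and imaginary parts of $v = v_r + jv_{im}$ are obtained as

$$v_r = \sum_{i=1}^{m} w_i \left( \prod_{j=1}^{h} |x_j|^{p_{i,j}} \right) \cos\left(\pi \sum_{l=1}^{h} p_{i,l}\right) + b$$

$$v_{im} = \sum_{i=1}^{m} w_i \left( \prod_{j=1}^{h} |x_j|^{p_{i,j}} \right) \sin\left(\pi \sum_{l=1}^{h} p_{i,l}\right)$$

Then, the mapping-to-real function for Cartesian properties ($\lambda = c$) yields

$$s = a_1 v_r + a_2 v_{im}$$

The mapping-to-real function for polar properties ($\lambda = p$) is calculated as

$$s = a_1 \sqrt{v_r^2 + v_{im}^2} + a_2 \arg(v)$$

These solutions reveal the following remarks:
- When $\sum_{k=1}^{h} p_{i,k} \in \mathbb{Z}$ or $\forall p_{i,k} \in \mathbb{Z}$, it results in $\sin(\pi \sum_{l=1}^{h} p_{i,l}) = 0$ and $\cos(\pi \sum_{l=1}^{h} p_{i,l}) = (-1)^{\sum_{l=1}^{h} p_{i,l}}$, the PWM neuron operates in the real-valued mode, and its function can be simplified to $v = \sum_{i=1}^{m} w_i (-1)^{\sum_{l=1}^{h} p_{i,l}} \prod_{j=1}^{h} |x_j|^{p_{i,j}} + b$.
- When $\sum_{k=1}^{h} p_{i,k} \notin \mathbb{Z}$ (which requires $\exists p_{i,k} \notin \mathbb{Z}$), a PWM neuron operates in the complex-valued mode as shown by Equation (28).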
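The processing chain described in this section (power-weighted products, weighted sum with bias, mapping-to-real function, and activation) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function names, the argument layout, and the sample values are hypothetical, and zero-valued inputs with negative powers are not handled.

```python
import cmath
import math

def pwm_pre_activation(x, P, w, b):
    """Weighted sum v of the dendritic products u_i plus bias b.

    x: list of h real inputs, P: m rows of h power weights,
    w: list of m branch weights, b: real bias.
    """
    v = complex(b, 0.0)
    for w_i, p_i in zip(w, P):
        u_i = complex(1.0, 0.0)
        for x_j, p_ij in zip(x, p_i):
            # Principal-value power; a negative x_j with a non-integer
            # p_ij yields a complex factor, as stated by Equation (23).
            u_i *= complex(x_j, 0.0) ** p_ij
        v += w_i * u_i
    return v

def pwm_neuron(x, P, w, b, a=(1.0, 0.0), mapping="cartesian", activation=None):
    """Real-valued neuron output after mapping-to-real and activation."""
    v = pwm_pre_activation(x, P, w, b)
    if mapping == "cartesian":
        z1, z2 = v.real, v.imag          # real and imaginary parts
    else:
        z1, z2 = abs(v), cmath.phase(v)  # magnitude and phase
    s = a[0] * z1 + a[1] * z2
    return s if activation is None else activation(s)

# Real-valued mode: positive inputs (hypothetical sample values).
v_real = pwm_pre_activation([2.0, 3.0], [[1.0, 2.0], [0.5, 0.0]], [1.0, 2.0], 0.5)

# Negative inputs whose powers sum to an integer stay on the real axis.
v_int_sum = pwm_pre_activation([-2.0, -3.0], [[0.5, 0.5]], [1.0], 0.0)

# Negative inputs whose powers sum to a non-integer give a complex value.
v_complex = pwm_pre_activation([-2.0, -3.0], [[0.5, 0.25]], [1.0], 0.0)

# Magnitude-only polar mapping followed by a hyperbolic tangent activation.
y = pwm_neuron([-2.0, -3.0], [[0.5, 0.25]], [1.0], 0.0,
               a=(1.0, 0.0), mapping="polar", activation=math.tanh)
```

In this sketch, `v_int_sum` illustrates the first remark above (the phase factor reduces to −1), while `v_complex` has phase 0.75π, in line with the second remark.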

An Electronic Nose Application for Monitoring NO x Concentration by Solid-State Multisensor Array
Artificial neural networks are widely utilized in machine learning when applications require computational intelligence with learning ability. Data-driven control of complex real systems is becoming a central topic within the machine learning application domain because it promises intelligent real-world systems. In today's intelligent system concepts, artificial neural networks preprocess the fused data stream from sensor networks, provide reliable estimation of the current and future system states, and may produce suitable control responses to regulate the monitored system status. Inevitably, the preservation of air quality in crowded cities requires an active, data-driven air quality control scheme that can detect local buildups of pollutants in urban areas. One of the important atmospheric pollutants is nitrogen oxides (NOx). Monitoring of NOx emissions has been considered to preserve air quality in crowded cities [70], to increase fuel efficiency with regard to NOx emissions in the aviation and aerospace industry [81], and to improve the design of gas turbine engines for aircraft and power stations [82]. Due to the large size and high cost of chemical analyzers, low-cost solid-state sensor arrays have begun to be utilized in on-field monitoring of pollutant gases [70,71,83,84]. However, measurements of low-cost multisensor arrays are not accurate, and they need calibration according to the precise measurements of chemical analyzers. Therefore, an artificial neural network was implemented to estimate chemical analyzer measurements from measurements of the multisensor arrays, and the effectiveness of this soft-calibration approach (calibration by software) was shown in an air-quality monitoring application [71]. A cooperation of multisensor arrays with measurement systems is referred to as an electronic nose system.
The estimation model performs the sensor calibration in order to improve the precision of measurements [70,71], and machine learning-based sensor calibration was preferred for intelligent systems. Consequently, soft-calibration models have become an essential part of electronic nose systems. The current experimental study shows an implementation of the PWM neural processor with EFO-GS as the soft-calibration model. A PWM neural model was trained for accurate estimation of NOx concentrations from a low-cost multisensor array measurement dataset. This dataset includes hourly measurements from solid-state gas sensors, commercial temperature and humidity sensors, and a conventional air pollution analyzer (the reference chemical analyzer was used for the ground truth data) [70,71]. A microcontroller board, which hosted a microprocessor, a GSM (Global System for Mobile Communications) data transmission unit, and the solid-state sensor array, was used to collect sensor data with a sampling period of 8 s, and an hourly average of the sensor data was used to form the hourly measurement instances in the dataset [70,71]. Table 4 introduces these sensors and the calibrator model parameters. The training dataset was composed of 586 measurement instances that were collected during 24 days, and the following 241 measurement instances were used for the test dataset in order to estimate the next 10-day-long hourly measurements. To implement the EFO-GS algorithm for the training of the PWM neural network, the property code of the EFO-GS includes all coefficients of the single PWM neuron model as

$$X_k = [W_k \;\; b_k \;\; P_{k,1} \;\; P_{k,2} \;\; \ldots \;\; P_{k,m} \;\; A_k] \qquad (36)$$

where the weight coefficients of the sum unit are denoted by $W_k = [w_{k,1} \; w_{k,2} \; \ldots \; w_{k,m}]$, the coefficients of the power-weighted multiplication in the ith dendritic branch are represented by $P_{k,i} = [p_{k,i,1} \; p_{k,i,2} \; \ldots \; p_{k,i,h}]$, and the coefficients of the generic mapping-to-real function are $A_k = [a_{k,1} \; a_{k,2}]$.
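The layout of the property code in Equation (36) can be illustrated by packing and unpacking a flat parameter vector. The sketch below assumes the ordering given in Equation (36); the function names are hypothetical and are not taken from the EFO-GS implementation.

```python
def code_length(m, h):
    """Number of entries in X_k: m weights, 1 bias, m*h powers, 2 mapping weights."""
    return m + 1 + m * h + 2

def unpack_property_code(X, m, h):
    """Split the flat vector X = [W, b, P_1, ..., P_m, A] into its parts."""
    if len(X) != code_length(m, h):
        raise ValueError("property code has an unexpected length")
    W = list(X[:m])
    b = X[m]
    start = m + 1
    P = [list(X[start + i * h: start + (i + 1) * h]) for i in range(m)]
    A = list(X[start + m * h:])
    return W, b, P, A

def pack_property_code(W, b, P, A):
    """Inverse of unpack_property_code."""
    X = list(W) + [b]
    for row in P:
        X.extend(row)
    X.extend(A)
    return X

# A neuron with m = 5 dendritic branches and h = 8 inputs.
m, h = 5, 8
X = [float(i) for i in range(code_length(m, h))]  # placeholder values
W, b, P, A = unpack_property_code(X, m, h)
```

For m = 5 dendritic branches and h = 8 inputs, `code_length` returns 48 entries.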
The EFO-GS algorithm minimizes the sum-of-squares loss function to perform the training of the PWM neuron model. Figure 8 shows a flowchart that describes the implementation of the EFO-GS algorithm for the training of PWM neural processors in order to obtain an estimation model from measurement data. This chart is also a general block diagram of a metaheuristic data analysis scheme, where the PWM neural model is the model that learns from the dataset and the metaheuristic optimization is used to find an optimal solution of the data analysis problem. This application solves a measurement error reduction problem (a soft-calibration problem) for on-field sensor data. Figure 9 shows 241 measurement instances that are hourly averages of the multisensor array data and the reference chemical analyzer measurements for NOx. The figure illustrates the test dataset, which includes the data collected from solid-state multisensors sensitive to CO, NMHC, NOx, NO2, and O3, and from the reference chemical analyzer for NOx during 10 days of observation (the y-axis shows the values of average concentration measurements from the sensors and the chemical analyzer, and the x-axis indicates the measurement instances). The reference chemical analyzer measurements are the correct measurements to be learned by machine learning methods in order to calibrate low-cost sensor arrays. To show the modeling performance of a PWM neuron, a single PWM neuron with 8 inputs and 5 dendrites was implemented in the real-valued mode, and its performance was compared with a multi-layer classical ANN model and a Genetic Programming (GP) model. The training dataset is used to obtain an estimation model in the form of
Figure 10 shows the convergence of the square error during EFO-GS optimization of the single PWM neuron for the NOx concentration estimation model. The EFO-GS performed 2000 iterations and optimized 48 parameters of the PWM neuron.
(The number of fractional power weights $p_{i,j}$ is 8 × 5 = 40, the number of weights $w_i$ for the five dendritic branches is 5, the number of biases (b) is one, and the number of mapping-to-real function parameters ($a_1$, $a_2$) is 2.) Training tasks were performed for 586 hourly measurement data, and performance tests were performed for the subsequent 241 hourly measurement data in Figure 9. The test data were not involved in any stage of the training of the PWM neuron. The classical neural network was implemented with 3 layers. It has 10 neurons in the first hidden layer, 2 neurons in the second hidden layer, and one neuron at the output layer. The total number of weight parameters is 115. The GP model was implemented by using the GP algorithm with Orthogonal Least Squares (GpOls) [85]. The GpOls algorithm was developed for effective identification of nonlinear input-output models by using tree-based genetic programming with a linear least squares modeling technique [85]. Figure 11 shows the NOx concentration estimations of the tested machine learning methods for the test dataset. To better view the convergence of the estimation models to the reference analyzer measurements (ground truth measurements), Figure 12 presents a close view of Figure 11. The figures reveal that all estimation models provide consistent estimates, and these models can be used for the calibration of multisensor arrays in practice. However, the PWM neuron uses considerably fewer optimization parameters than the ANN model to reach this performance level. Table 5 lists performance indices in order to evaluate the concentration estimation performances of the PWM neuron with EFO-GS, classical ANN, and GP models for NOx measurements. Regression performance is widely evaluated by using the Mean Square Error (MSE). The MSE performance of the PWM neuron with EFO-GS is better than that of the other models. The R2 score measures the fitting performance of models to data in the regression analysis.
The PWM neuron with EFO-GS provides an R2 score of 89%, which is higher than that of the other models. Figure 13 shows the change of the sum of square error (SSE) through the estimation period, and it evaluates the cumulative square error distribution for all test data. In the beginning, the SSE performance of the ANN model is better. However, after a 50-h estimation period, the square error of the ANN model sharply increases. The SSE of the GP model exhibits an abrupt rise around 100 h. The PWM neuron with EFO-GS is more consistent for long-term estimation, and this indicates that the data generalization of the PWM neuron with EFO-GS can be better than that of the other methods. It is useful to consider the histogram of instant measurement errors to validate this effect. Figure 14 shows a histogram analysis of estimation errors with respect to the reference analyzer measurements. These figures illustrate the distribution of instant measurement errors around zero. The measurement error distribution of the PWM-EFO indicates more successful estimation and generalization from the training dataset: instant errors accumulate near zero, and the distribution around zero is more balanced and similar to the normal distribution. Results in Table 6 confirm the observations in the histogram analysis. For successful estimation and generalization, the mean value of estimation errors should be zero, the standard deviation of estimation errors should be minimal, and the distribution around zero should be balanced (symmetrical) for the test data. This implies that useful modeling information is absorbed from the training dataset. The instant measurement error of the PWM-EFO has a mean value that is closer to zero, which indicates an improved generalization of data, and it has the lowest standard deviation, which implies better learning from the data.
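The performance indices used above can be computed as in the following minimal sketch. It assumes the standard definitions of MSE, the R2 score, the cumulative SSE, and the population standard deviation of the errors; the study may use slightly different conventions, and the sample values are hypothetical rather than taken from the dataset.

```python
import statistics

def mse(y_true, y_pred):
    """Mean square error between reference and estimated values."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def cumulative_sse(y_true, y_pred):
    """Running sum of square errors over the estimation period."""
    total, curve = 0.0, []
    for t, p in zip(y_true, y_pred):
        total += (p - t) ** 2
        curve.append(total)
    return curve

def error_stats(y_true, y_pred):
    """Mean and population standard deviation of the instant errors."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return statistics.mean(errors), statistics.pstdev(errors)

# Hypothetical reference and estimated values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
test_mse = mse(y_true, y_pred)
test_r2 = r2_score(y_true, y_pred)
sse_curve = cumulative_sse(y_true, y_pred)
err_mean, err_std = error_stats(y_true, y_pred)
```

A flat segment in `sse_curve` corresponds to a period with accurate estimates, whereas a sharp rise marks the kind of error buildup discussed for the ANN and GP models.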

Experimental Results for Complex-Valued Mode PWM-EFO Neuron
This section presents the results of the real-valued neuron mode and the complex-valued neuron mode of the PWM-EFO method for NOx estimation. The complex-valued mode was activated by assigning a negative sign to the multisensor array input data, which were originally all positive-valued. This makes all input values negative real numbers, and the dendritic branches of the PWM neuron yield complex numbers according to Equation (23), so the PWM neuron processes complex numbers. Accordingly, the training dataset was arranged in the form of $(-x_1, -x_2, -x_3, -x_4, -x_5, -x_6, -x_7, -x_8, y_d)$ to shift to the complex-valued neuron mode. In the previous section, the neuron worked in the real-valued mode since the training dataset was arranged in the form of $(x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, y_d)$. Figure 15 shows the estimations of these PWM neuron modes for the test dataset. Table 7 shows the estimation performance indices. Results indicate that the estimation performance of the real-valued mode is slightly better than that of the complex-valued mode. A reason for these results can be that the dataset has a nature of real-valued relations, and there may be no need for computation in the complex domain for this dataset. Figure 16 shows the change of the sum of square error (SSE) during 241 hourly measurement estimations while using real-valued mode and complex-valued mode PWM neurons. Up to the 100th measurement, the complex-valued mode provides a better SSE performance; however, around the 190th measurement, its SSE performance becomes worse. Overall, the long-term SSE performance of the real-valued neuron is better, and these results indicate that the generalization of the training dataset is better for the real-valued mode in this NOx calibration problem.

Conclusions
This study suggested an evolutionary field theorem to establish a theoretical background for the analysis of agent-based evolutionary computation systems, and an EFO-GS optimization algorithm was introduced on the basis of this theorem. This algorithm performs a geometrical evolution according to the evolutionary field values under the assumption of a Markovian search process. The evolutionary field theorem can form a common theoretical basis on which population-based evolutionary optimization algorithms can be analyzed, designed, and compared. Another contribution of this study addressed the improvement of basic neuron models: after briefly reviewing multiplicative neuron studies, the computational scheme of the multiplicative neurons was modified by using non-integer power weights and the mapping-to-real function block. Thus, a PWM neural processing unit with multi-mode operation was suggested as a generalization of classical ANNs. The EFO-GS optimization was implemented for the training of the PWM neurons. Operation modes of the suggested PWM neurons were investigated in detail, and computational advantages of the PWM neurons over conventional neural models were discussed theoretically and shown experimentally in the electronic nose application.
Engineering application of the EFO-GS optimization was demonstrated for the training of a PWM neuron to obtain the soft-calibration model for the improvement of NOx measurements by using a low-cost multisensor array. Figure 17 depicts a block diagram of the electronic nose that combines a soft-calibration model and a multisensor unit. The experimental study on the air quality dataset revealed that the PWM neuron model with EFO-GS optimization can improve the accuracy of NOx measurements from solid-state sensor arrays, and it can be implemented as an integral part of electronic nose applications. This study illustrated the performance of this soft-calibration model to estimate NOx concentration measurements in the range of [14, 368] ppb from multisensor array data. In addition, the PWM neuron with EFO-GS can be used to generate a soft-calibration model for other gases (CO, NO2, NMHC), provided that the reference chemical analyzer measurements are available in the dataset.

Proof. The negative derivative of the evolutionary energy function minimizes the evolutionary energy. Let us take a Lyapunov energy function as the evolutionary energy $F(X_k(t)) > 0$. For convergence to the minimum, the energy function should satisfy the stability criterion $\frac{dF(X_k(t))}{dx} < 0$.
One can substitute the finite difference for discretization of the derivative operator as follows:

$$\frac{dF(X_k(t))}{dx} \cong \frac{F(X_k(t+h)) - F(X_k(t))}{h}$$

For a sequential unit time increment, h can be set to 1 in order to represent discrete evolution seasons:

$$\frac{dF(X_k(t))}{dx} \cong F(X_k[n+1]) - F(X_k[n]) < 0$$

Accordingly, the following condition minimizes the evolutionary energy at each discrete evolution season:

$$\Delta F_k[n] = F(X_k[n+1]) - F(X_k[n]) < 0$$

Remark A2. A geometrical crossover of all agent property codes $X_i[n]$ towards the seasonal best agent code $X_{best}[n]$, at a scale of the magnitude of the quality factor $|Q[n]|$, maximizes the total quality factor $\sum_{i=1}^{h_k} Q_i[n]$ at each seasonal evolution. The maximum total quality factor in seasonal evolution is

$$\max \sum_{i=1}^{h_k} Q_i[n] = \sum_{i=1}^{h_k} \frac{|F(X_{best}[n]) - F(X_i[n])|}{|F(X_{best}[n])| + |F(X_i[n])|}$$

Proof. Let us assume that the property codes $X_i[n]$ change toward $X_{best}[n] = \arg\min_{X_j} F(X_j[n])$ at the season n. The seasonal evolutionary quality factor is written according to Equation (7) as

$$Q_i[n] = -\frac{F(X_{best}[n]) - F(X_i[n])}{|F(X_{best}[n])| + |F(X_i[n])|}$$
Since the selection mechanism guarantees the selection of advantageous evolution that satisfies the condition $\Delta F_i = F(X_{best}) - F(X_i) < 0$ in the algorithm, one can easily write the quality factor as

$$Q_i[n] = \frac{|F(X_{best}[n]) - F(X_i[n])|}{|F(X_{best}[n])| + |F(X_i[n])|}$$

Hence, for the selection of advantageous evolution, one can write $\sum_{i} Q_i[n] = \sum_{i} |Q_i[n]|$. In addition, it is apparent that $|F(X_{best}[n]) - F(X_i[n])| \geq |F(X_i[n]) - F(X_k[n])|$ because $X_{best}[n] = \arg\min_{X_j} F(X_j[n])$. Then, the maximum value of the total quality factor is written for the evolution towards $X_{best}$ in the form of

$$\max \sum_{i=1}^{h_k} Q_i[n] = \sum_{i=1}^{h_k} \frac{|F(X_{best}[n]) - F(X_i[n])|}{|F(X_{best}[n])| + |F(X_i[n])|}$$

Consequently, the seasonal evolution of all property codes towards $X_{best}[n]$ maximizes the total quality factor in the evolution process.
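The quality factor used in Remark A2 can be evaluated for a small population as a sanity check. The sketch below is illustrative: the fitness values are hypothetical, a zero denominator is assumed to give a quality factor of zero, and the comparison over alternative target agents is only a numerical illustration for positive fitness values, not a proof of the remark.

```python
def quality_factor(f_target, f_agent):
    """Seasonal quality factor of an agent that evolves toward a target agent."""
    denominator = abs(f_target) + abs(f_agent)
    if denominator == 0.0:
        return 0.0  # assumption: no gain is defined when both energies are zero
    return -(f_target - f_agent) / denominator

# Hypothetical evolutionary energy (fitness) values of four agents.
fitness = [4.0, 1.0, 9.0, 2.5]
best_index = min(range(len(fitness)), key=lambda i: fitness[i])

# Quality factors when every agent evolves toward the seasonal best agent.
q_best = [quality_factor(fitness[best_index], f) for f in fitness]

# Total quality factor for each possible choice of target agent.
totals = [sum(quality_factor(f_target, f) for f in fitness)
          for f_target in fitness]
```

For these values, every entry of `q_best` is non-negative, and the total quality factor is largest when the seasonal best agent is used as the target.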
Remark A3. The multiplicative unit, which is expressed as $u = \prod_{j=1}^{h} x_j^{p_j}$, turns into a weighted geometric average operator when the condition $\sum_{j=1}^{h} p_j = 1$ is satisfied.
Proof. One can write the geometric average of parameters $x_1, x_2, x_3, \ldots, x_h$ as

$$G = \left( \prod_{j=1}^{h} x_j \right)^{1/h}$$

Let us apply exponent weights $\alpha_j > 0$ for each parameter $x_j$ as $x_j^{\alpha_j}$. The order-weighted geometric average of the $x_1, x_2, x_3, \ldots, x_h$ series is written in the form of

$$G_0 = \left( \prod_{j=1}^{h} x_j^{\alpha_j} \right)^{1/h} = \prod_{j=1}^{h} x_j^{p_j}$$

where the exponent $p_j = \alpha_j / h$ is the power weight. The function $G_0$ expresses a weighted geometric average, and when the power weights satisfy the condition $\sum_{j=1}^{h} p_j = 1$, Equation (20) performs an order-weighted geometric average [75,76].
Proposition A1. Assuming a negative real number ($x \in \mathbb{R}^{-}$) and a non-integer power $p_r \in \mathbb{R} \setminus \mathbb{Z}$, one can state that $x^{p_r}$ can be written as the complex-valued parameter $x^{p_r} = |x|^{p_r} (\cos(\pi p_r) + j \sin(\pi p_r))$.
Proof. Since the parameter x is a negative real number ($x < 0$ and $x \in \mathbb{R}$), one can write $x = (-1)|x|$ and, then,

$$x^{p_r} = ((-1)|x|)^{p_r} = (-1)^{p_r} |x|^{p_r} = (\sqrt{-1})^{2 p_r} |x|^{p_r} = (j)^{2 p_r} |x|^{p_r}$$

When the property $j^2 = e^{j\pi}$ is used in the above expression, $x^{p_r}$ can be written as $x^{p_r} = e^{j \pi p_r} |x|^{p_r}$, and Euler's formula $e^{j \pi p_r} = \cos(\pi p_r) + j \sin(\pi p_r)$ yields the stated form.
Theorem A1. (Complex-valued Neuron Mode): A PWM neuron model, which is defined by Equations (21) and (22), can perform a complex-valued neuron mode that can be expressed in the form of