This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

A multilayered perceptron type neural network is presented and analyzed in this paper. All neuronal parameters, such as input, output, action potential, and connection weight, are encoded by quaternions, a class of hypercomplex number system. A local analyticity condition is imposed on the activation function used in updating the neurons’ states, in order to construct a learning algorithm for this network. An error back-propagation algorithm is introduced for modifying the connection weights of the network.

Processing multi-dimensional data is an important problem for artificial neural networks. A single neuron can take only one real value as its input, so a network must be configured with several neurons in order to accept multi-dimensional data. This type of configuration is sometimes unnatural in applications of artificial neural networks to engineering problems, such as the processing of acoustic signals or of coordinates in the plane. Thus, complex number systems have been utilized to represent two-dimensional data elements as single entities. The application of complex numbers to neural networks has been extensively investigated, as summarized in the references [

Though complex values can treat two-dimensional data elements as a single entity, how should we treat data with more than two dimensions in artificial neural networks? Although this problem can of course be solved by employing several real-valued or complex-valued neurons, it is useful to introduce a number system with higher dimensions, a so-called hypercomplex number system.

The quaternion is a four-dimensional hypercomplex number system introduced by Hamilton [

Thus, there has been a growing number of studies concerning the use of quaternions in neural networks. Multilayer perceptron (MLP) models have been developed in [

One of the difficulties in constructing neural networks in the quaternionic domain is the introduction of suitable activation functions for updating the neurons’ states. A typical choice is the so-called “split” type function, in which a real-valued function is applied to update each component of a quaternionic value [

Recently, another class of analyticity for the quaternionic functions has been developed [

This paper presents an MLP-type quaternionic neural network with a locally analytic activation function. All variables in this network, such as input, output, action potential and connection weights, are encoded by quaternions. A learning scheme, a quaternionic equivalent of the error back-propagation algorithm, is presented and theoretically explored. The derivation of the learning scheme in this paper adopts the Wirtinger calculus [

Quaternions form a class of hypercomplex numbers consisting of one real number and three imaginary numbers: a quaternion q is written as q = q^{(e)} + q^{(i)} i + q^{(j)} j + q^{(k)} k, where q^{(e)}, q^{(i)}, q^{(j)}, and q^{(k)} are real numbers and {i, j, k} are the imaginary bases. The division ring of quaternions is spanned by the bases {1, i, j, k}. In this representation, q^{(e)} is the scalar part of q, and q^{(i)} i + q^{(j)} j + q^{(k)} k is its vector part.

Quaternion bases satisfy the following identities: i² = j² = k² = ijk = −1, from which it follows that ij = −ji = k, jk = −kj = i, and ki = −ik = j.

Next, we define the operations between quaternions p = p^{(e)} + p^{(i)} i + p^{(j)} j + p^{(k)} k and q = q^{(e)} + q^{(i)} i + q^{(j)} j + q^{(k)} k. The addition and subtraction of quaternions are defined componentwise, in a similar manner as for complex numbers or vectors:

p ± q = (p^{(e)} ± q^{(e)}) + (p^{(i)} ± q^{(i)}) i + (p^{(j)} ± q^{(j)}) j + (p^{(k)} ± q^{(k)}) k.

The product of p and q is determined by the basis identities above and is noncommutative in general:

pq = (p^{(e)} q^{(e)} − p^{(i)} q^{(i)} − p^{(j)} q^{(j)} − p^{(k)} q^{(k)}) + (p^{(e)} q^{(i)} + p^{(i)} q^{(e)} + p^{(j)} q^{(k)} − p^{(k)} q^{(j)}) i + (p^{(e)} q^{(j)} − p^{(i)} q^{(k)} + p^{(j)} q^{(e)} + p^{(k)} q^{(i)}) j + (p^{(e)} q^{(k)} + p^{(i)} q^{(j)} − p^{(j)} q^{(i)} + p^{(k)} q^{(e)}) k.

The quaternion norm of q is defined as |q| = √(q q^{∗}) = √((q^{(e)})² + (q^{(i)})² + (q^{(j)})² + (q^{(k)})²), where q^{∗} = q^{(e)} − q^{(i)} i − q^{(j)} j − q^{(k)} k is the conjugate of q.
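To make the quaternion algebra above concrete, the following sketch implements the Hamilton product, conjugate, and norm in plain Python, with quaternions represented as `(e, i, j, k)` tuples. The function names are illustrative, not from the paper:

```python
import math

def qadd(p, q):
    """Componentwise addition, as for complex numbers or vectors."""
    return tuple(a + b for a, b in zip(p, q))

def qmul(p, q):
    """Hamilton product; noncommutative in general (ij = k but ji = -k)."""
    pe, pi_, pj, pk = p
    qe, qi, qj, qk = q
    return (pe*qe - pi_*qi - pj*qj - pk*qk,
            pe*qi + pi_*qe + pj*qk - pk*qj,
            pe*qj - pi_*qk + pj*qe + pk*qi,
            pe*qk + pi_*qj - pj*qi + pk*qe)

def qconj(q):
    """Conjugate: negate the vector (imaginary) part."""
    return (q[0], -q[1], -q[2], -q[3])

def qnorm(q):
    """Euclidean norm |q| = sqrt(q q*)."""
    return math.sqrt(sum(c * c for c in q))
```

For example, `qmul((0, 1, 0, 0), (0, 0, 1, 0))` returns `(0, 0, 0, 1)`, reproducing the identity ij = k, and the norm is multiplicative: |pq| = |p||q|.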

It is important to introduce an analytic function (or differentiable function) to serve as the activation function in the neural network. This section describes the required analyticity of the function in the quaternionic domain, in order to construct activation functions for quaternionic neural networks.

The condition for differentiability of a quaternionic function is obtained by requiring that the limit of the quaternionic difference quotient exists independently of the direction of approach.

The analytic condition for the quaternionic function f, called the Cauchy-Riemann-Fueter (CRF) equation, yields:

∂f/∂q^{(e)} + i ∂f/∂q^{(i)} + j ∂f/∂q^{(j)} + k ∂f/∂q^{(k)} = 0.

This is an extension of the Cauchy-Riemann (CR) equations defined for the complex domain. However, only linear functions and constants satisfy the CRF equation [
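The restrictiveness of quaternionic differentiability can be probed numerically: the right difference quotient (f(q + h) − f(q)) h⁻¹ is independent of the direction of h for a linear function f(q) = aq + b, but not for f(q) = q². A minimal sketch (all names illustrative, not from the paper):

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (e, i, j, k) tuples."""
    pe, pi_, pj, pk = p
    qe, qi, qj, qk = q
    return (pe*qe - pi_*qi - pj*qj - pk*qk,
            pe*qi + pi_*qe + pj*qk - pk*qj,
            pe*qj - pi_*qk + pj*qe + pk*qi,
            pe*qk + pi_*qj - pj*qi + pk*qe)

def qinv(q):
    """Multiplicative inverse q^{-1} = conj(q) / |q|^2."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

def quotient(f, q, h):
    """Right difference quotient (f(q + h) - f(q)) h^{-1}."""
    fq = f(q)
    fqh = f(tuple(a + b for a, b in zip(q, h)))
    diff = tuple(a - b for a, b in zip(fqh, fq))
    return qmul(diff, qinv(h))

a, b = (1.0, -2.0, 0.5, 3.0), (0.0, 1.0, 1.0, 1.0)
linear = lambda q: tuple(u + v for u, v in zip(qmul(a, q), b))  # f(q) = aq + b
square = lambda q: qmul(q, q)                                   # f(q) = q^2

q0 = (1.0, 2.0, 3.0, 4.0)
eps = 1e-6
h_real, h_i = (eps, 0.0, 0.0, 0.0), (0.0, eps, 0.0, 0.0)
# For the linear map the quotient equals a for every direction of h;
# for q -> q^2 the quotient changes with the direction of approach.
```

Approaching along the real axis and along the i axis gives the same quotient only in the linear case, which is the numerical counterpart of the statement above.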

An alternative approach to assure analyticity in the quaternionic domain has been explored in [

A quaternion x can be represented as x = x^{(e)} + μ_x x^{(r)}, where x^{(r)} = |x^{(i)} i + x^{(j)} j + x^{(k)} k| is the magnitude of the vector part and μ_x = (x^{(i)} i + x^{(j)} j + x^{(k)} k)/x^{(r)} is the unit vector along it.

From the definition in Equation (15), we deduce that μ_x² = −1. If μ_x is regarded as an imaginary unit, the pair (x^{(e)}, x^{(r)}) spans a local complex plane at x.

A quaternionic difference dq = (dq^{(e)}, dq^{(i)}, dq^{(j)}, dq^{(k)}) can be decomposed by using:

Then, the following relations hold: μ_x dq_‖ = dq_‖ μ_x and μ_x dq_⊥ = −dq_⊥ μ_x, because μ_x is a quaternion without a real part. Thus, dq_‖ commutes with μ_x, while dq_⊥ anticommutes with it.

Then,

Considering the limit of the difference quotient restricted to the local complex plane (dq_⊥ = 0), the derivative with respect to x is well defined along the directions spanned by 1 and μ_x.

Hence, the function is analytic within the local complex plane determined by μ_x.

The component q^{(k)} is set to zero (q^{(k)} = 0) in this figure because of the difficulty of depicting a four-dimensional vector space. In this example, for a given quaternion x, the local complex plane containing μ_x is defined by the x^{(e)} (real) axis and the x^{(r)} direction in the quaternionic space, and the analytic condition is constrained to this plane.

A schematic illustration of a local complex plane in a quaternionic space, where the component q^{(k)} is set to zero.

The local derivative operators, corresponding to the form of dq_‖, are introduced as follows:

∂/∂x|_‖ = (1/2)(∂/∂x^{(e)} − μ_x ∂/∂x^{(r)}),  ∂/∂x^{∗}|_‖ = (1/2)(∂/∂x^{(e)} + μ_x ∂/∂x^{(r)}).

Note that the variables x and x^{∗} are treated as mutually independent in this calculus.

When dq_⊥ = 0, the local derivative is evaluated within the local complex plane, where ∂f/∂x^{∗}|_‖ = 0 always holds for a locally analytic function.

Moreover, for a function independent of x^{∗}, the local derivative reduces to the ordinary complex derivative, with x and x^{∗} being independent of each other. As a result, we can treat quaternionic functions in the same manner as complex-valued functions under the condition of local analyticity.
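The local complex plane construction can be checked numerically: for a quaternion x with a nonzero vector part, the unit pure quaternion μ_x squares to −1, and x decomposes as x^{(e)} + μ_x x^{(r)}. A minimal sketch (helper names are illustrative):

```python
import math

def qmul(p, q):
    """Hamilton product of (e, i, j, k) tuples."""
    pe, pi_, pj, pk = p
    qe, qi, qj, qk = q
    return (pe*qe - pi_*qi - pj*qj - pk*qk,
            pe*qi + pi_*qe + pj*qk - pk*qj,
            pe*qj - pi_*qk + pj*qe + pk*qi,
            pe*qk + pi_*qj - pj*qi + pk*qe)

def local_plane(x):
    """Return (real part, magnitude of vector part, mu_x) for x."""
    e, xi, xj, xk = x
    r = math.sqrt(xi*xi + xj*xj + xk*xk)   # x^(r), assumed nonzero
    mu = (0.0, xi / r, xj / r, xk / r)     # unit pure quaternion mu_x
    return e, r, mu

x = (1.0, 2.0, -2.0, 1.0)
e, r, mu = local_plane(x)
mu2 = qmul(mu, mu)                          # mu_x^2 = -1 (imaginary unit)
recon = (e + r * mu[0], r * mu[1], r * mu[2], r * mu[3])  # x^(e) + mu_x x^(r)
```

The check confirms that the pair (x^{(e)}, x^{(r)}) together with μ_x behaves exactly like a complex number a + ib in its own plane.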

The structure of the network assumed in this paper is shown in

The numbers of neurons in the input, hidden, and output layers are set to M, N, and K, respectively. The n-th hidden neuron computes its output y_n by applying the activation function to the action potential formed from the input signals through the connection weights w_{nm}.

The structure of the multilayer perceptron in this paper.

Processing of the neurons’ outputs in the output layer is defined in the same manner as in the hidden layer. The output of the k-th neuron in the output layer, y_k, is computed from the hidden-layer outputs through the connection weights w_{kn}.
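The forward pass through such a network can be sketched as follows. For simplicity, this illustration uses a split-type tanh activation (each component squashed separately) rather than the locally analytic function developed in this paper, and all names and layer sizes are illustrative:

```python
import math
import random

def qmul(p, q):
    """Hamilton product of (e, i, j, k) tuples."""
    pe, pi_, pj, pk = p
    qe, qi, qj, qk = q
    return (pe*qe - pi_*qi - pj*qj - pk*qk,
            pe*qi + pi_*qe + pj*qk - pk*qj,
            pe*qj - pi_*qk + pj*qe + pk*qi,
            pe*qk + pi_*qj - pj*qi + pk*qe)

def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def split_tanh(q):
    """Split-type activation: real tanh applied to each component."""
    return tuple(math.tanh(c) for c in q)

def layer(weights, inputs):
    """One layer: y_n = f(sum_m w_nm x_m) for each neuron n."""
    outputs = []
    for row in weights:                       # one weight row per neuron
        s = (0.0, 0.0, 0.0, 0.0)              # action potential
        for w, x in zip(row, inputs):
            s = qadd(s, qmul(w, x))
        outputs.append(split_tanh(s))
    return outputs

random.seed(0)
M, N, K = 3, 4, 2                             # input / hidden / output sizes
rq = lambda: tuple(random.uniform(-1, 1) for _ in range(4))
w_nm = [[rq() for _ in range(M)] for _ in range(N)]   # hidden-layer weights
w_kn = [[rq() for _ in range(N)] for _ in range(K)]   # output-layer weights
x = [rq() for _ in range(M)]                  # quaternionic input signals
y = layer(w_kn, layer(w_nm, x))               # network output: K quaternions
```

Replacing `split_tanh` with a locally analytic activation changes only the activation step; the wiring of the two layers stays the same.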

The connection weights should be modified by the so-called learning algorithms, in order to obtain the desired output signals with respect to the input signals. One of the learning algorithms for MLP-type networks is the error back-propagation (EBP) algorithm. The following section describes our derivation of this algorithm for the presented network.

An EBP algorithm works so that the output error, calculated by the neurons’ outputs at the output layer and the desired output signals, is minimized. In the case of networks with three layers as shown in

First, let d_k denote the desired output signal for the k-th output neuron, and note that the error depends on the weights w_{kn} and their conjugates w_{kn}^{∗}. The output error E at the time t is then defined as

E(t) = (1/2) Σ_k (d_k − y_k)(d_k − y_k)^{∗}.

The output error should be real-valued so that it can be minimized.

Suppose that the connection weights are updated at the time (t + 1) according to w_{kn}(t + 1) = w_{kn}(t) + Δw_{kn}(t).

Note that the local analytic condition in the quaternionic domain should be satisfied in calculating the derivatives. Thus, if we set Δw_{kn} proportional to the negative of the local gradient, Δw_{kn} = −ε ∂E/∂w_{kn}^{∗}|_‖ with a small learning rate ε > 0, the output error decreases at each update.

The derivatives ∂E/∂w_{kn}^{∗}|_‖ are expanded by using the chain rule, and ∂y_k/∂s_k^{∗}|_‖ = 0 from the local analytic condition is applied, where s_k denotes the action potential of the k-th output neuron:

Similarly, the updates for the connection weights w_{nm} in the hidden layer are derived from the local gradient with respect to w_{nm}^{∗}.

Hence, w_{nm} is updated as w_{nm}(t + 1) = w_{nm}(t) + Δw_{nm}(t).

This leads to Δw_{nm} = −ε ∂E/∂w_{nm}^{∗}|_‖, which can be expanded by the chain rule; the local analytic conditions ∂y_k/∂s_k^{∗}|_‖ = 0, ∂y_n/∂s_n^{∗}|_‖ = 0, and ∂s_k/∂y_n^{∗}|_‖ = 0 are applied:

Using the derivatives

Once a set of network outputs {y_k} and the corresponding desired outputs {d_k} are given, the connection weights are updated by propagating the output error backward through the network according to the rules above.
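As a stand-in for the analytic update rules, the learning dynamics can be illustrated with a brute-force numerical gradient on the same real-valued error E = ½ Σ_k |d_k − y_k|². This is not the paper's derivation, only a check that gradient descent on the quaternionic weight components reduces E; the tiny 1-1-1 network and all names are illustrative:

```python
import math
import random

def forward(params, x):
    """Tiny 1-1-1 quaternionic net with split tanh; params holds the 8
    real components of one hidden weight and one output weight."""
    def qmul(p, q):
        pe, pi_, pj, pk = p
        qe, qi, qj, qk = q
        return (pe*qe - pi_*qi - pj*qj - pk*qk,
                pe*qi + pi_*qe + pj*qk - pk*qj,
                pe*qj - pi_*qk + pj*qe + pk*qi,
                pe*qk + pi_*qj - pj*qi + pk*qe)
    f = lambda q: tuple(math.tanh(c) for c in q)
    w_hid, w_out = tuple(params[:4]), tuple(params[4:])
    return f(qmul(w_out, f(qmul(w_hid, x))))

def error(params, x, d):
    """E = 1/2 |d - y|^2 (real-valued, as required for minimization)."""
    y = forward(params, x)
    return 0.5 * sum((a - b) ** 2 for a, b in zip(d, y))

random.seed(1)
x = (0.5, -0.3, 0.8, 0.1)                     # one quaternionic input
d = (0.2, 0.4, -0.1, 0.3)                     # desired output
params = [random.uniform(-1, 1) for _ in range(8)]
eps = 1e-6
E0 = error(params, x, d)
E = E0
for _ in range(100):                           # numerical-gradient descent
    grad = []
    for i in range(len(params)):               # forward finite differences
        params[i] += eps
        Ep = error(params, x, d)
        params[i] -= eps
        grad.append((Ep - E) / eps)
    step = 0.5
    while step > 1e-9:                         # backtracking step search
        trial = [p - step * g for p, g in zip(params, grad)]
        Et = error(trial, x, d)
        if Et < E:
            params, E = trial, Et
            break
        step *= 0.5
```

The analytic EBP rules compute the same descent direction in closed form, layer by layer, instead of by finite differences.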

As an example of an activation function for the neurons, a quaternionic tanh function, obtained by applying the complex tanh within the local complex plane, can be adopted.
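Under the local-analyticity construction, a complex activation such as tanh extends to quaternions by evaluating the complex function inside the local complex plane x^{(e)} + μ_x x^{(r)}. A minimal sketch of this extension (the function name is illustrative):

```python
import cmath
import math

def local_tanh(q):
    """Locally analytic tanh: write q = e + mu * r with mu^2 = -1,
    evaluate w = tanh(e + 1j*r), and map back as Re(w) + mu * Im(w)."""
    e, x, y, z = q
    r = math.sqrt(x*x + y*y + z*z)
    if r < 1e-15:                      # on the real axis: real tanh
        return (math.tanh(e), 0.0, 0.0, 0.0)
    w = cmath.tanh(complex(e, r))      # complex tanh in the local plane
    s = w.imag / r                     # scale factor along mu_x
    return (w.real, s * x, s * y, s * z)
```

On the real axis this reduces to the real tanh, and on the complex plane spanned by 1 and i it agrees with the complex tanh, which is the isomorphism to complex functions mentioned in the text.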

This concern is known as the universal approximation theorem [

This theorem is also discussed in the case of complex-valued networks [

The first type of complex-valued functions comprises functions without any singular points. These functions can be used as activation functions, and networks with this type of activation function have been shown to be good approximators. Although some of these functions are unbounded, they can be used by bounding their regions. The second type comprises functions with bounded singular points, e.g., discontinuous functions. These singularities can be removed, so such functions can also serve as activation functions and achieve universality. The last type is for functions with so-called essential singularities,

In the proposed quaternionic MLPs, for example, a quaternionic tanh function can be used as an activation function. This function is unbounded and may contain several kinds of singularities, as in the case of the complex-valued functions described above. Thus these quaternionic MLPs would face the same problem,

This paper has proposed a multilayer type neural network and an error back-propagation algorithm as its learning scheme in the quaternionic domain. The neurons in this network adopt locally analytic activation functions. Quaternionic functions under the local analytic condition are isomorphic to complex functions, so several activation functions, such as the complex-valued tanh function, can be extended for use in the quaternionic domain. The Wirtinger calculus, in which a quaternion and its conjugate are treated as independent of each other, makes the derivation of the learning scheme clear and compact.

Analytic conditions for quaternionic functions are derived by defining a complex plane at a quaternionic point, which is a kind of reduction from quaternionic domain to complex domain. There exists another type of reduction in quaternionic domains, such as the commutative quaternion, which is a four-dimensional hypercomplex number system with commutativity in its multiplication [

Showing the universality of the proposed network is also an important issue. Quaternionic functions, such as the tanh function, may contain several kinds of singularities, where the values of the functions or their derivatives are undefined in particular regions. In complex-valued networks with fully complex-valued functions [

Also, it is necessary to investigate the performance of the proposed network, though experimental exploration could not be accomplished in this paper. The proposed network is similar to the networks proposed in [

Application of the presented network to engineering problems is also challenging. The processing of three- or four-dimensional vector data, such as color/multi-spectral image processing, prediction of three-dimensional protein structures, and control of motion in three-dimensional space, are promising candidates.

This study was financially supported by Japan Society for the Promotion of Science (Grant-in-Aids for Young Scientists (B) 24700227 and Scientific Research (C) 23500286).