An Artificial Neural Network Approach for the Prediction of Absorption Measurements of an Evanescent Field Fiber Sensor

This paper describes artificial neural network (ANN) based prediction of the response of a fiber optic sensor using evanescent field absorption (EFA). The sensing probe of the sensor is made up of a bundle of five PCS fibers to maximize the interaction of the evanescent field with the absorbing medium. Different backpropagation algorithms are used to train the multilayer perceptron ANN. The Levenberg-Marquardt algorithm, as well as the other algorithms used in this work, successfully predicts the sensor responses.


Introduction
Evanescent field absorption (EFA) has been widely employed for sensing chemical and biological parameters over the past two decades [1,2]. Typical applications include the sensing of pH [3,4], concentration [5], humidity [6], gas [7] and combustible liquids [8], and use as biosensors [9]. The EFA sensing mechanism utilized by fiber optic sensors is based on the absorption of the light carried by the evanescent field that exists in the fiber cladding. In EFA fiber sensors the core is surrounded by an absorptive cladding that consists of a liquid [10,11] or a sol-gel material [3,12] that includes an absorbing dye, since most chemical materials are not able to absorb the optical power directly. The effect to be sensed can be related to the power change because the absorbing dye attenuates the optical power.
In the last decade, ANNs have become popular because of their ability to learn, their fast real-time operation and their ease of implementation [13][14][15]. These features have made ANNs useful in optical fiber technology, especially for extending the measurement range [16][17][18], calibration [19], signal processing [20,21] and the development of intelligent fiber optic sensors [22].
In this work, we have measured the output power of an EFA fiber sensor made up of a PCS fiber bundle. We have then used a canonical variable incorporating the parameters of the sensor and the absorbing dye to predict the response of the sensor by means of an ANN, because an ANN can produce proper outputs for given inputs without requiring a mathematical formulation relating the input and output data.

EFA Sensor Measurements
The sensing process based on EFA has been performed by using the arrangement in Figure 1 [23]. The sensor is adapted from a spectrophotometer, the WPA S105, in which white light is created by a prefocusing incandescent lamp and reaches a diffraction grating by reflecting off a mirror. The diffraction grating is attached to a rotating rod and acts as a monochromator to set the required wavelength. Thus, a wavelength-selectable sensor is obtained. The sensing element was a bundle of five plastic cladding silica (PCS) fibers, as shown in Figure 2. PCS fibers are especially suitable for evanescent field applications in chemical sensing because the plastic cladding can easily be removed mechanically or chemically and replaced by a suitable medium. The use of the bundle for sensing purposes does not affect the sensitivity, because both the input and the output power of the sensor increase by a factor equal to the number of fibers in the bundle. However, the bundle improves the coupling of the light into the fibers, increases the interaction surface, and makes the detection of the light at the bundle output easier. The fibers in the bundle are separated at the sensing region so that the whole of the evanescent field of each fiber can be accessed by the external solution. In this work, the claddings of the PCS fibers in the bundle were mechanically removed in order to maximize the interaction between the evanescent field, which is created by total internal reflection, and the chemical to be sensed. A fiber in the bundle, forming the sensing region, is shown in Figure 3. For absorption measurements, Bromophenol Blue (BPB) indicator dye filled into the cuvette was used as the absorbing cladding material. BPB is an indicator dye whose color is blue near pH 7 and varies with pH. The monochromator was set to a wavelength of 590 nm since the BPB solution has an absorption peak near this wavelength.
The sensor response (P_out/P_in) has been measured for different BPB concentrations by using the arrangement given in Figure 1, and the results are given in Figure 4. In the figure, γ is a canonical variable incorporating the sensor parameters; it is defined in [24,25] in terms of α, l and V, where α is the bulk absorption coefficient of the chemical solution, l is the interaction length of the light with the chemical solution, and V, the normalized frequency, is a dimensionless waveguide parameter of the fiber. The normalized frequency is an important parameter for the sensitivity of EFA sensors because it determines the amount of the evanescent field. Simply put, the smaller the normalized frequency, the larger the evanescent field of the fiber. Therefore, in order to ensure a detectable interaction between the evanescent field and the indicator dye, the normalized frequency must be as small as possible. This can be achieved with a longer wavelength, a smaller fiber core diameter and a smaller relative refractive-index difference between the core and cladding [24,25].
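As a rough illustration of the three dependences listed above, the sketch below evaluates the standard step-index definition of the normalized frequency, V = (2πa/λ)·sqrt(n_core² − n_clad²), where a is the core radius. This expression is not stated in the text and is assumed here; the function name and all numerical values (core radius, refractive indices, second wavelength) are hypothetical and are not the parameters of the sensor used in this work.

```python
import math

def normalized_frequency(core_radius_m, wavelength_m, n_core, n_clad):
    """Standard step-index normalized frequency V = (2*pi*a/lambda) * NA."""
    numerical_aperture = math.sqrt(n_core ** 2 - n_clad ** 2)
    return 2.0 * math.pi * core_radius_m / wavelength_m * numerical_aperture

# Hypothetical values, for illustration only (not taken from the paper).
base = normalized_frequency(100e-6, 590e-9, 1.457, 1.333)
longer_wavelength = normalized_frequency(100e-6, 850e-9, 1.457, 1.333)
smaller_core = normalized_frequency(50e-6, 590e-9, 1.457, 1.333)
smaller_index_step = normalized_frequency(100e-6, 590e-9, 1.457, 1.400)

print(f"V, base case:          {base:.1f}")
print(f"V, longer wavelength:  {longer_wavelength:.1f}")
print(f"V, smaller core:       {smaller_core:.1f}")
print(f"V, smaller index step: {smaller_index_step:.1f}")
```

Each of the three modified cases gives a smaller V than the base case, in line with the qualitative statement in the text.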

Artificial Neural Networks (ANNs)
Artificial neural networks (ANNs) are one of the popular branches of artificial intelligence [13][14][15][26]. They consist of very simple neuron-like processing elements (called nodes or artificial neurons) connected to each other by weighted links. The weight on each connection can be dynamically adjusted until the desired output is generated for a given input. An artificial neuron model consists of a linear combination followed by an activation function. Different types of activation functions can be utilized for the network; however, the common ones, which are sufficient for most applications, are the sigmoidal and hyperbolic tangent functions. In most applications, the hyperbolic tangent transfer function gives a better representation than the sigmoid transfer function.
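The neuron model described above, a linear combination followed by an activation function, can be sketched as follows. This is a minimal illustration and not the implementation used in this work; the function names, the input values, the weights and the explicit bias term are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic sigmoid; its output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical inputs, weights and bias, for illustration only.
x = [0.5, -1.0]
w = [0.8, 0.3]
b = 0.1

out_tanh = neuron_output(x, w, b, math.tanh)   # output in (-1, 1)
out_sigmoid = neuron_output(x, w, b, sigmoid)  # output in (0, 1)
print(out_tanh, out_sigmoid)
```

The two activations are closely related, since tanh(z) = 2·sigmoid(2z) − 1. The main practical difference is the output range, which for the hyperbolic tangent is symmetric about zero.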
Amongst the different types of connections for artificial neurons, feed forward neural networks are the most popular and most widely used models in various applications reported in the literature. They are also known as the multilayered perceptron neural networks (MLPNNs). In an MLPNN, neurons of the first layer send their output to the neurons of the second layer, but they do not receive any input back from the neurons of the second layer.
The general structure of an MLPNN is given in Figure 5 and consists of three layers: an input layer, with a number of neurons equal to the number of variables of the problem; an output layer, where the perceptron response is made available, with a number of neurons equal to the desired number of quantities computed from the inputs; and an intermediate or hidden layer. While an MLPNN consisting of only the input and the output layers is sufficient for linear problems, additional intermediate layers are required in order to approximate nonlinear problems. For example, all problems which can be solved by a perceptron can be solved with only one hidden layer, but it is sometimes more efficient to use two (or more) hidden layers. The only task of the neurons in the input layer is to distribute the input signal x_i to neurons in the hidden layer. Each neuron j in the hidden layer sums up its input signals x_i after weighting them with the strengths of the respective connections w_ji from the input layer and computes its output y_j as a function f of the sum, given by

$$ y_j = f\left( \sum_i w_{ji} x_i \right) $$

where f can be a simple threshold function, a sigmoid, or a hyperbolic tangent function. The output of neurons in the output layer is computed in the same manner. Following this calculation, a learning algorithm is used to adjust the strengths of the connections in order to allow a network to achieve a desired overall behavior. There are many types of learning algorithms in the literature [13][14][15][26]. However, it is very difficult to know which training algorithm will be more efficient for a given problem. The algorithms used to train ANNs in this study are Levenberg-Marquardt Backpropagation (LM), Scaled Conjugate Gradient Backpropagation (SCG), Broyden-Fletcher-Goldfarb-Shanno Quasi-Newton Backpropagation (BFGS), Bayesian Regularization Backpropagation (BR), and Conjugate Gradient Backpropagation with Polak-Ribière updates (CGP).
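The layer computation described above, in which each neuron j applies f to the weighted sum of its inputs x_i with weights w_ji, can be sketched as a forward pass through a small network. The network size, the weight values and the helper names are hypothetical; the linear (identity) output activation is an assumption chosen for the example, and no training is performed.

```python
import math

def layer_forward(inputs, weights, activation):
    """Compute y_j = f(sum_i w_ji * x_i) for every neuron j of one layer.

    weights[j][i] holds w_ji, the connection from input i to neuron j.
    """
    return [
        activation(sum(w_ji * x_i for w_ji, x_i in zip(row, inputs)))
        for row in weights
    ]

def identity(z):
    """Linear activation, assumed here for the output layer."""
    return z

def mlp_forward(x, hidden_weights, output_weights):
    """Forward pass: input -> tanh hidden layer -> linear output layer."""
    hidden = layer_forward(x, hidden_weights, math.tanh)
    return layer_forward(hidden, output_weights, identity)

# Hypothetical weights for a 2-input, 3-hidden-neuron, 1-output network.
hidden_w = [[0.2, -0.5], [0.7, 0.1], [-0.3, 0.4]]
output_w = [[0.6, -0.2, 0.9]]
y = mlp_forward([0.5, -0.2], hidden_w, output_w)
print(y)
```

A learning algorithm such as those listed above would then adjust the entries of `hidden_w` and `output_w` to reduce the error between `y` and the target values.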
The LM algorithm is an iterative technique that locates a local minimum of a multivariate function expressed as the sum of squares of several non-linear, real-valued functions; it updates the weight and bias values according to Levenberg-Marquardt optimization. The SCG, a member of the class of conjugate gradient methods, is a supervised learning algorithm for feedforward neural networks. The BFGS is one of the most powerful and sophisticated quasi-Newton methods and has the advantage over Newton's method that the second partial derivatives are not needed. The BR algorithm updates the weight and bias values according to LM optimization; it minimizes a combination of squared errors and weights, and then determines the correct combination so as to produce a network that generalizes well.
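To make the idea behind the LM algorithm concrete, the sketch below applies a Levenberg-Marquardt style damped Gauss-Newton update to a sum-of-squares problem with a single parameter. It is not the network training used in this work: the one-parameter model y = exp(-k·x), the synthetic data, the starting value and the damping schedule (divide or multiply mu by 10) are all assumptions chosen to keep the example short.

```python
import math

def lm_fit_decay_rate(xs, ys, k=0.2, mu=0.01, iterations=50):
    """Fit y = exp(-k * x) by Levenberg-Marquardt steps on one parameter."""
    def sse(k_value):
        # Sum of squared residuals for a given parameter value.
        return sum((math.exp(-k_value * x) - y) ** 2 for x, y in zip(xs, ys))

    for _ in range(iterations):
        residuals = [math.exp(-k * x) - y for x, y in zip(xs, ys)]
        jacobian = [-x * math.exp(-k * x) for x in xs]  # d(residual)/dk
        jtj = sum(j * j for j in jacobian)
        jte = sum(j * r for j, r in zip(jacobian, residuals))
        candidate = k - jte / (jtj + mu)  # damped Gauss-Newton step
        if sse(candidate) < sse(k):
            k, mu = candidate, mu / 10.0  # accept the step, reduce damping
        else:
            mu *= 10.0                    # reject the step, increase damping
    return k

# Synthetic, noise-free data generated with k = 1.3 (hypothetical).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(-1.3 * x) for x in xs]
k_fit = lm_fit_decay_rate(xs, ys)
print(k_fit)
```

With a small mu the step approaches a Gauss-Newton step, and with a large mu it approaches a short gradient-descent step. When training a network, the scalar k is replaced by the vector of all weights and biases, and the sums `jtj` and `jte` become the matrix and vector products of the Jacobian.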

Application of ANN and Results
An MLPNN consisting of one input layer, two hidden layers and one output layer was used to predict the EFA sensor response. The input and output variables of the network are γ and P_out/P_in, respectively, and the prediction process can be defined as follows: α is a measurable coefficient for a given solution, l is an adjustable length for a sensor configuration, and V can be calculated by using the optical fiber parameters and the wavelength of the optical source. Consequently, γ is a known parameter and can be used for the prediction of the sensor response. A training dataset consisting of randomly selected γ vs. P_out/P_in pairs is applied to the network and the network is trained. By using the weights obtained after training, new responses are then estimated for unseen γ values. Figure 6 shows an example of the networks proposed in this work, the model with the LM algorithm. The model consists of one neuron in the input layer, three and five neurons in the first and the second hidden layers, respectively, and one neuron in the output layer. Each network is trained with nine data points, normalized between -1 and +1 before training. A hyperbolic tangent transfer function is used as the activation function in the hidden layers and a linear one is used in the output layer. The performance of the algorithms used in the network is compared in terms of their mean square errors (MSEs). The training times of all networks used in this work are shorter than three seconds. The network types and resulting MSEs are given in Table 1. The training results in Table 1 show that the ANN model with the BR algorithm has the best performance, with the smallest MSE value, although it has the smallest number of neurons.
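The two data-handling steps mentioned above, normalization of the data to the range between -1 and +1 and comparison by mean square error, can be sketched as follows. The function names and the nine sample values are hypothetical; they are not the measurements of this work, and the linear min-max mapping is an assumption about how the normalization is done.

```python
def scale_to_unit_range(values):
    """Linearly map values so that the minimum becomes -1 and the maximum +1."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def mean_square_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# Nine hypothetical input values, for illustration only.
gamma_samples = [0.02, 0.05, 0.08, 0.12, 0.15, 0.20, 0.26, 0.31, 0.40]
gamma_scaled = scale_to_unit_range(gamma_samples)
print(gamma_scaled)

# Hypothetical targets and network outputs.
print(mean_square_error([0.90, 0.80, 0.70], [0.91, 0.79, 0.72]))
```

The same scaling constants (`lo` and `hi`) obtained from the training data would have to be reused for unseen inputs, and the network output mapped back to the original units before it is compared with measurements.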
The prediction performances of the ANN models are tested with five experimental points obtained from the sensor described in Section 2. The comparisons of the sensor responses and the network outputs are given in Table 2. It can be seen from the table that the ANN model with the LM algorithm has the smallest MSE value. With respect to the MSEs, the LM model gives the best ANN outputs and the SCG model the worst. If the percentage errors with respect to the experimental results are calculated as another performance comparison, it can be seen that the models with BR and LM have the smallest maximum percentage errors, which are smaller than 2.0% (the maximum errors of the other models are smaller than 2.5%).
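The percentage-error comparison described above can be computed as in the following sketch. The five measured and predicted values are hypothetical placeholders, not the entries of Table 2, and the error is assumed to be the absolute difference relative to the experimental value.

```python
def percentage_errors(measured, predicted):
    """Absolute error of each prediction as a percentage of the measured value."""
    return [abs(p - m) / abs(m) * 100.0 for m, p in zip(measured, predicted)]

# Five hypothetical test points, for illustration only.
measured = [0.95, 0.88, 0.80, 0.71, 0.63]
predicted = [0.96, 0.87, 0.81, 0.70, 0.64]

errors = percentage_errors(measured, predicted)
max_error = max(errors)
print([round(e, 2) for e in errors], round(max_error, 2))
```

Reporting the maximum percentage error alongside the MSE shows the worst single prediction, which an average such as the MSE can hide.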
To discuss the influence of the number of neurons in the hidden layers on the MSE and the sensor response, we trained the networks with the same neuron numbers and the same initial weights by using the model given in Figure 6. The network outputs and resulting MSEs are given in Table 3. As can be seen from the table, the BR algorithm produces improper outputs under these conditions. It can be said that network parameters such as the neuron numbers, the number of hidden layers, the activation functions and the initial weights are specific to the algorithm, and a change in one of these parameters can cause significant changes in the outputs.
Table 3. Performance comparisons of the networks for the model given in Figure 6.

Conclusions
An artificial neural network approach has been introduced in this paper to predict the response of an evanescent field absorption fiber sensor. Performance comparisons show that all of the neural models used in this work can predict the sensor responses with small errors (maximum percentage errors below 2.5%). It is useful to note that neural network approaches can tolerate measurement errors. In conclusion, artificial neural network approaches can play an important role in the design and development of intelligent sensors.