A Neural Network-Based Interval Pattern Matcher

Classification is one of the most important tasks in machine learning, and neural networks are important classifiers. However, traditional neural networks cannot identify intervals, let alone classify them. To improve their identification ability, we propose a neural network-based interval matcher. After summarizing the theoretical construction of the model, we conduct a simple experiment and a practical weather forecasting experiment. The experiments show that the accuracy of the recognizer reaches 100%, which is promising.


Introduction
Pattern recognition [1][2][3], which focuses on the recognition of patterns, is a branch of machine learning. In many cases, patterns are learned from training data (supervised learning); when no training examples are provided, some pattern recognition algorithms can instead be used to discover previously unknown patterns (unsupervised learning) [4][5][6].
Pattern recognition assigns a class to a given input value or set of values [7,8]. Generally speaking, pattern recognition algorithms aim to provide a reasonable answer for all possible inputs and to perform the most likely matching of the inputs, taking into account their statistical variation. This contrasts with pattern-matching algorithms, which look for an exact match of the input against one or more pre-existing patterns. A common example of a pattern-matching algorithm is regular expression matching [9][10][11]. Unlike pattern recognition, pattern matching is generally considered a type of artificial intelligence rather than a type of machine learning, although pattern-matching algorithms can sometimes provide output of similar quality to that of pattern-recognition algorithms.
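As a concrete illustration of exact pattern matching, the following minimal sketch uses a regular expression to accept only strings that are exactly an interval literal. The pattern, the function name `parse_interval`, and the sample strings are illustrative assumptions and are not taken from the paper.

```python
import re

# Hypothetical pattern: a closed interval written as "[a, b]" with integer endpoints.
INTERVAL_PATTERN = re.compile(r"\[\s*(-?\d+)\s*,\s*(-?\d+)\s*\]")

def parse_interval(text):
    """Return (low, high) if text is exactly an interval literal, else None."""
    match = INTERVAL_PATTERN.fullmatch(text)
    if match is None:
        return None  # exact matching: anything that deviates is rejected
    return int(match.group(1)), int(match.group(2))

accepted = parse_interval("[0, 10]")
rejected = parse_interval("[0, 10")  # missing bracket, so no match at all
```

The matcher either succeeds exactly or fails; it makes no attempt to return the "most likely" interval for a malformed input, which is the behavior a pattern recognizer would aim for.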
Several pattern-matching algorithms already exist, including ones for intervals. However, most interval-matching algorithms are used in highly applied areas [12] and are closely tied to specific features. In other words, these interval pattern matching methods are designed for specific problems, not for general ones.
The main contribution of this paper is an interval pattern matcher that makes use of a neural network, which we call a "neural network-based interval pattern matcher". This pattern matcher can identify a mathematical interval rather than a single real number. Experiments illustrate that, once the training process has finished, a neural network at the core of the matcher can identify patterns more quickly than a pattern-matching algorithm that uses a searching method.
The paper is organized as follows. Section 2 briefly introduces the preliminaries of the paper: neural networks. Section 3 describes the modeling of the neural network-based interval pattern matcher. Section 4 presents the experiments that test the proposed classifier. Finally, conclusions are given.

Preliminaries: Neural Networks
Artificial neural networks (shown in Figure 1) are electronic neuron structures that simulate the neural structure of the brain. They process one training sample at a time and learn by comparing their actual output with the targets. The errors from the initial training step are used to modify the network weights the second time around, and so on. Roughly speaking, an artificial neural network is composed of a set of input values (xi), associated weights (wij), and a function that sums the weighted inputs and maps the result to the output vector (y). Neurons are organized into layers. The input layer is composed of neurons that represent the feature variable vector (x). The next layer is called the hidden layer; there may be more than one hidden layer and more than one neuron in each hidden layer. The final layer is the output layer, where each node corresponds to one class in classification problems. The neural network is an important tool for classification [13,14]. Basically, a classifier takes objects as inputs and assigns each one to a class; objects are represented as vectors of features, and classes are represented as class labels, such as 0 or 1.
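The layered computation described above can be sketched as a plain forward pass. This is a minimal illustration under stated assumptions: the sigmoid activation, the 2-3-2 layer sizes, the weight values, and the convention that `weights[j][i]` connects input i to neuron j are all hypothetical choices, not details given in the paper.

```python
import math

def sigmoid(z):
    """Assumed activation function; the paper does not name one."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def network_forward(x, layers):
    """Propagate the feature vector x through every layer in order."""
    activation = x
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation

# Hypothetical 2-3-2 network: two inputs, three hidden neurons, two output nodes.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2], [-0.4, 0.6, 0.9]], [0.05, -0.05]),
]
y = network_forward([1.0, 2.0], layers)
```

Here `y` holds one value per output node, so in a classification setting each entry would correspond to one class.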
In the training phase, the correct class for each record is known, and the output nodes can therefore be assigned "class number" values: "1" for the node corresponding to the correct class, and "0" for the others [15,16]. It is thus possible to compare the network's calculated output values with these "correct" (target) values and to calculate an error term for each node, which is the "Delta" rule. These errors are then used to adjust the weights in the hidden layers so that the output values will be closer to the "correct" (target) values the next time. After training, a single forward sweep through the network assigns a value to each output neuron, and the record is assigned to the class whose node has the highest value.
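The target encoding, the per-node error term, and the highest-value decision can be sketched as follows. For brevity this sketch applies the delta rule to a single layer of linear output nodes rather than to hidden layers via back-propagation; the toy samples, the learning rate, and all function names are assumptions made for illustration.

```python
def one_hot(class_index, num_classes):
    """Target vector: 1 for the correct class node, 0 for every other node."""
    return [1.0 if k == class_index else 0.0 for k in range(num_classes)]

def linear_outputs(weights, x):
    """One linear output node per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def delta_rule_step(weights, x, target, learning_rate=0.1):
    """Move each weight in proportion to its node's error (target minus output)."""
    outputs = linear_outputs(weights, x)
    errors = [t - y for t, y in zip(target, outputs)]
    new_weights = [
        [w + learning_rate * e * xi for w, xi in zip(row, x)]
        for row, e in zip(weights, errors)
    ]
    return new_weights, errors

def predict_class(weights, x):
    """Assign the record to the class whose node has the highest value."""
    outputs = linear_outputs(weights, x)
    return max(range(len(outputs)), key=lambda k: outputs[k])

# Hypothetical toy data: two classes; the last feature is a constant bias input.
samples = [([1.0, 0.0, 1.0], 0), ([0.0, 1.0, 1.0], 1)]
weights = [[0.0, 0.0, 0.0] for _ in range(2)]
for _ in range(200):
    for x, label in samples:
        weights, _ = delta_rule_step(weights, x, one_hot(label, 2))
```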
It is well known that there are several types of neural networks, each of which has unique functions. In this paper, we focus on the pattern matching ability of a neural network and propose a pattern matcher that is a neural network with an additional data preparation layer. The traditional neural network is typical enough to be used in our system to illustrate our problem.

Neural Network-Based Interval Classifier
Rules [17][18][19] are often stated as linguistic IF-THEN constructions. A rule has the general form "IF A THEN B", where A is called the premise of the rule and B is the consequence. We use (A; B) to denote this rule. A neural network can express such a rule efficiently. However, both A and B more often take a more complex form.
Take the simple rule set below as an example. There are two rules in this example, and the number of inputs is two. Small and Small' denote intervals over the universe of x and y. R1 can be denoted as (Small, Small; Small); R2 can be denoted as (Small', Small'; Small').
Often, a continuous interval can be represented by some discrete examples, and the discrete examples in the domain can be determined by interpolation. This idea reduces the processing complexity of a system that uses intervals. For example, given the rule R1, we need some specific value (or values) to depict the interval Small in order to minimize its complexity. Otherwise, the neural network would face the difficulty that one set of input values corresponds to more than one output value. In detail: if there is a one-to-many relationship between the inputs and targets in the training data, then no mapping can perform perfectly. One method is to calculate the average value for each interval. A given network may or may not be able to approximate this mapping well at the beginning. However, when trained as well as possible, it will form its best possible approximation to this mean output value.
One problem emerges when average features cannot distinguish one interval from the others. Sometimes, different intervals may have the same average value, especially overlapping intervals. Whatever method we use, each interval should be distinguishable from the others on the limited domain related to the problem that we are trying to solve. For example, if our domain is {Large, Small, Middle}, we need to distinguish these three intervals but do not need to distinguish any others.
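The averaging problem can be made concrete with a short sketch that looks for intervals sharing the same average. It assumes that the average of an interval is its midpoint (that is, values are spread uniformly); the endpoints chosen for Middle and Large are hypothetical, and only Small = [0, 10] comes from the experiment section.

```python
def midpoint(interval):
    """Average value of an interval, assuming uniformly spread values."""
    low, high = interval
    return (low + high) / 2.0

def indistinguishable_by_mean(intervals):
    """Return the pairs of interval names that share the same average value."""
    names = sorted(intervals)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if midpoint(intervals[a]) == midpoint(intervals[b])
    ]

# Hypothetical domain: Middle overlaps Small and has the same midpoint.
domain = {"Small": (0, 10), "Middle": (4, 6), "Large": (50, 100)}
clashes = indistinguishable_by_mean(domain)
```

With these assumed endpoints, Small and Middle both average to 5, so the average alone cannot tell them apart, while Large remains distinguishable.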
Here, we have to emphasize the concept of "relation". A "relation" is just a relationship among objects. If there are some persons in a class and each person has his/her own height, the pair of a person's name and his/her height is a "relation". Besides, these pairs are "ordered", which means that one element comes first and the other comes second. The set of all the starting points is called "the domain", and the set of all the ending points is called "the range".
What we should notice is that all functions are relations, but not all relations are functions. A function is a sub-classification of relation: a well-behaved relation, by which we mean that, given a starting point, we know exactly where to go; given an x, we get one and only one y. The relation {(7,8;0), (7,8;1)}, where "7" and "8" are input values and "0" and "1" are output values, is not a function, since the same x-element is paired with more than one y-element. However, the relation {(1;2), (3;2)}, where "1" and "3" are input values and "2" is an output value, is a function: each x-element is paired with exactly one y-element, even though a y-element may be paired with more than one x-element.
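The distinction between a function and a general relation can be checked mechanically. The sketch below encodes the two relations from the text as lists of (inputs, output) pairs; the representation and the name `is_function` are illustrative assumptions.

```python
def is_function(relation):
    """Return True if every input tuple is paired with exactly one output."""
    seen = {}
    for inputs, output in relation:
        if inputs in seen and seen[inputs] != output:
            return False  # same inputs, different outputs: one-to-many
        seen[inputs] = output
    return True

# {(7,8;0), (7,8;1)}: the inputs (7, 8) map to both 0 and 1.
not_a_function = [((7, 8), 0), ((7, 8), 1)]
# {(1;2), (3;2)}: two different inputs share the output 2, which is allowed.
a_function = [((1,), 2), ((3,), 2)]
```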
It is well known that a neural network can approximate a function very well [20][21][22], but it cannot express a one-to-many mapping correctly, since the weight changes would be a compromise between the different outputs given the same inputs. In a traditional neural network, the training error of a function approaches zero when the network is trained as well as possible [23][24][25], while for a non-function relation, the error does not converge to a single real number, but rather to an interval.
For example, if we are given a relation {(1;2), (1;3)} and train the neural network using the training samples (1;2) and (1;3) alternately, at the beginning the weight change would simulate the relation (1;2), then it would simulate the relation (1;3), and so forth. Thus, after many iterations, the weights would be a compromise between the different outputs ("2" and "3") given the same input ("1"). When we input "1" to this well-trained neural network, our research indicates that it outputs a number between two and three at random. Overall, the range of the output interval is closely related to the error interval.
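The compromise effect can be reproduced with a deliberately tiny model. The sketch below trains a one-weight linear model y = w*x + b with per-sample gradient steps instead of a full neural network, so it is only an analogy to the behavior described above; the learning rate, epoch count, and function names are assumptions. Unlike the network in the text, this toy model is deterministic, so its output settles into a small cycle rather than varying at random.

```python
def train_alternating(samples, epochs=500, learning_rate=0.1):
    """Cycle through the samples, taking one gradient step per sample.

    Returns a prediction function and the absolute errors of the last epoch.
    """
    w, b = 0.0, 0.0
    last_errors = []
    for _ in range(epochs):
        last_errors = []
        for x, target in samples:
            error = target - (w * x + b)
            last_errors.append(abs(error))
            w += learning_rate * error * x
            b += learning_rate * error
    return (lambda x: w * x + b), last_errors

# Non-function relation {(1;2), (1;3)}: the same input with two different targets.
relation_predict, relation_errors = train_alternating([(1.0, 2.0), (1.0, 3.0)])
# Function {(1;2)}: a single target for the input.
function_predict, function_errors = train_alternating([(1.0, 2.0)])
```

In this sketch the prediction for the relation stays strictly between 2 and 3 and its error never shrinks toward zero, whereas the error for the function case vanishes.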
Here, we make use of this error interval to decide whether the training process has finished. Besides, we can also use this error interval together with the testing outputs to identify different interval outputs in the testing process. Specifically, the testing error would fall within an interval that is closely related to the training error interval, and the network would output values that fall within the output interval itself.
The network structure of this example is shown in Figure 2.
Data preparation layer: The main goal of this layer is to generate some training samples randomly, each of which contains independent variable values (x1, …, xn) and a value of the induced variable (y) that obey the corresponding rule. We generate values for each of the variables by drawing from a rule distribution [26][27][28].
Using the example from Section 3, there are two independent variable values (x1, x2) and one induced value y. To generate a training example for rule R1, we would draw a sample for x1 from the Small region of the domain of X1; x2 would be drawn from the Small region for X2, and y would be drawn from the Small region of Y.
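The sampling step for rule R1 can be sketched as follows. The paper does not specify the rule distribution, so uniform sampling is an assumption here, as are the dictionary layout and function names; Small = [0, 10] is taken from the experiment section.

```python
import random

# Rule R1 = (Small, Small; Small), with Small = [0, 10] as in the experiments.
RULE_R1 = {"x1": (0.0, 10.0), "x2": (0.0, 10.0), "y": (0.0, 10.0)}

def draw_sample(rule, rng):
    """Draw each variable from its own region (uniform sampling is assumed)."""
    x1 = rng.uniform(*rule["x1"])
    x2 = rng.uniform(*rule["x2"])
    y = rng.uniform(*rule["y"])
    return (x1, x2), y

def generate_training_set(rule, count, seed=0):
    """Generate `count` training samples that obey the given rule."""
    rng = random.Random(seed)
    return [draw_sample(rule, rng) for _ in range(count)]

training_set = generate_training_set(RULE_R1, 100)
```

Because y is drawn independently of (x1, x2), nearly identical inputs can receive different targets, which is exactly the one-to-many situation discussed earlier.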
Input layer: This layer is similar to that of standard neural networks, with its inputs coming from the data preparation layer.
Hidden layer: The function of this layer is similar to that of standard neural networks, except that we have a separate network for each set of input classes. The training itself is done using the Back-Propagation algorithm. During training, a training sample is used to train only a single network in the neural training layer, not all of them.
Output layer: This layer outputs interval values. For example, if we input (1,3), (1,4), (2,3), or (2,4), the output in each case would be a number between 6 and 7. Thus, we can estimate the interval C based on the outputs of the corresponding well-trained neural network.
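One simple way to turn a set of network outputs into an interval estimate is to take their minimum and maximum. This is a sketch of that idea only; the paper does not state how the interval C is estimated, and the output values below are hypothetical numbers chosen to lie between 6 and 7, not measured results.

```python
def estimate_interval(outputs):
    """Estimate the output interval as the span of the observed outputs."""
    return min(outputs), max(outputs)

def matches_interval(value, interval, tolerance=0.0):
    """Check whether a new output falls inside the estimated interval."""
    low, high = interval
    return low - tolerance <= value <= high + tolerance

# Hypothetical outputs for the inputs (1,3), (1,4), (2,3), (2,4).
observed = [6.2, 6.9, 6.4, 6.7]
estimated_c = estimate_interval(observed)
```

A tolerance parameter is included because an interval estimated from a finite number of outputs will usually be slightly narrower than the true one.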

Comparison Between Two Rules Using Our Program
We conduct this experiment to verify that the range of the interval is closely related to the error interval. We tested the range solution on the simple rules R1 and R2: Small is a set with the interval of [0, 10]. In other words, it is part of the "Small". Table 1 depicts the experimental environment that we used for the tests.
We first tested the range solution on R1. For this example, the universe of all three variables is the interval [0, 100]. To generate a sample, we simply generate a random three-tuple in [0, 100]^3 and then check whether it conforms to the stated rule R1. (If it does not, we discard the sample.) In this experiment, we generated 42,803 samples that satisfied R1 in the rule set. Using 30,000 of them as training samples to train the neural network, we get the error values shown in Figure 3. Once the network finishes the training process, the error falls approximately within the interval [0, 0.2]. Using the other 12,803 samples as testing samples, we get the error values shown in Figure 4. We observe that the testing error also lies approximately between 0 and 0.2. This means that the training process is successful and there is no over-fitting problem.
We used another rule, R2, to test the range solution. Using 20,000 training samples, the training error falls approximately within the interval [0, 0.08], as shown in Figure 5. Using the other 10,000 samples for testing, the testing error falls approximately within the interval [0, 0.07], as shown in Figure 6.
We observe that the training errors of both rules fall within an interval after reaching a stable state, and that the difference between the training errors of the two rules is the range of the error. The reason for this phenomenon is that the same inputs of a training sample correspond to different output values. If the gap between the biggest output value and the smallest output value is big, the range of the error is big; otherwise, the range of the error is small.
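The generate-and-discard procedure and the notion of an error range can be sketched as follows. The reading of R1 as "all three components lie in Small", the use of uniform sampling, the small sample counts, and all function names are assumptions made for illustration; the sketch does not reproduce the paper's sample sizes or error values.

```python
import random

UNIVERSE = (0.0, 100.0)  # universe of all three variables, from the text
SMALL = (0.0, 10.0)      # interval Small, from the text

def conforms_to_r1(sample):
    """Assumed reading of R1 = (Small, Small; Small): every component is Small."""
    return all(SMALL[0] <= value <= SMALL[1] for value in sample)

def rejection_sample(count, seed=0):
    """Draw random three-tuples from the universe and keep those satisfying R1."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < count:
        candidate = tuple(rng.uniform(*UNIVERSE) for _ in range(3))
        if conforms_to_r1(candidate):  # otherwise the candidate is discarded
            accepted.append(candidate)
    return accepted

def split_samples(samples, train_count):
    """Use the first train_count samples for training and the rest for testing."""
    return samples[:train_count], samples[train_count:]

def error_range(errors):
    """Width of the band that the observed errors fall in."""
    return max(errors) - min(errors)

samples = rejection_sample(30)
train, test = split_samples(samples, 20)
```

Under these assumptions only about one candidate in a thousand is accepted, so drawing directly from the Small regions would be far cheaper when the rule regions are known in advance.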
Although the training error of a function also falls within an interval, the range of that interval approaches zero as long as the number of training iterations is sufficient. This is different from a non-function relation, whose training error range does not approach zero no matter how many iterations the training process takes [28][29][30].
We also took the accumulated precipitation in the next 24 hours as the output of the weather prediction models and used the precipitation observations to verify our results. Precipitation falls into five levels, shown in Table 4. Experiments show that the accuracy of the pattern matcher is 100%, which is promising.

Conclusions
This paper presents an interval pattern matcher that uses neural networks to identify patterns with interval elements. Our new system is based on a traditional neural network, but unlike previous neural networks, it can handle interval input values and interval output values. This model is more convenient for some applications than systems that require single real inputs and outputs.
We show that the new system is suitable for interval pattern matching, which looks for an exact output that matches the input against one or more pre-existing patterns. That is, our system can identify interval patterns and, when trained as well as possible, is faster than pattern matchers that use searching algorithms, since search-based matchers have to spend extra time searching. In addition, our system can identify interval-based patterns, while previous neural networks can only deal with simple patterns that have real-number inputs and outputs.
Finally, experiments show that the error intervals are closely related to the output: the bigger the interval output range, the bigger the error range. In addition, practical experiments show that the accuracy of identification is 100%, which is promising.
As we know, a neural network can identify single numbers, and with our system it can now identify intervals as well. This kind of matcher is important because most real-world problems are better characterized by an interval than by a single number.

Figure 3. Training error using range solution of R1.

Figure 4. Training error using range solution of R2.
humidity falls into four intervals shown in the third and fourth columns; dry and wet bulb temperature falls into five intervals shown in the fifth and sixth columns; wind speed falls into five intervals shown in the last two columns.

Table 4. Classification of precipitation outputs.