1. Introduction
There is a long-term interest in extending the fourth industrial revolution (Industry 4.0) to agricultural production [1]. Regarding viticulture, in particular, the interest is in minimizing the human presence in the vineyard during production. In this context, the accurate prediction of grape maturity is critical for engaging both human labor and equipment in time for harvest. Newly introduced technologies from associated areas such as the Internet of Things (IoT), Big Data, and Artificial Intelligence (AI) can be combined with autonomous robotic systems in order to collect and interpret data, monitor and evaluate crop status, and automatically plan effective and timely interventions. An early in-field assessment of fruit maturity level, and therefore an estimate of harvest time, has the potential to enable sustainable farming by balancing economy, ecology, and optimal crop quality [2]. However, the development of autonomous robots for agricultural applications faces the daunting scale of the data involved [3]. Our special interest here is in the prediction of grape maturity level, intended to be integrated into an autonomous grape-harvester robot [4].
Fruit maturity can be studied as a time-series, where the sequence of maturity data $m_1, m_2, \ldots, m_D$ is indexed in time. The objective is to predict the fruit’s future maturity level. A number of different models have been used, including linear ones such as the classic autoregressive (AR), moving average (MA), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models; however, those models typically assume a stationary time-series [5,6], and they may fail with a non-stationary maturity time-series unless a very high order linear model is used to capture the system and its underlying laws. Such a model needs extensive learning, which calls for long processing time as well as substantial computational resources. Therefore, nonlinear models, such as Neural Networks (NN), have been proposed for forecasting problems [7,8,9]. The most straightforward approach for an NN model to learn a time-series is to present past samples at the input of the NN. However, if the time-series is complex, then more past samples are needed, which usually results in a complex system with many inputs and weights.
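To make the linear baseline concrete, the following Python sketch fits an AR model of order p by ordinary least squares on windows of p past samples. It is an illustration only and not a model used in this paper; the function names fit_ar and predict_next are hypothetical, and NumPy is assumed to be available. Note that the number of coefficients grows with the order p, which is the growth in inputs and weights discussed above.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x[t] ~ c + sum of weights times the p previous samples."""
    x = np.asarray(series, dtype=float)
    # Each row holds the p samples that precede the target x[t].
    rows = np.array([x[t - p:t] for t in range(p, len(x))])
    design = np.column_stack([np.ones(len(rows)), rows])
    targets = x[p:]
    coeffs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs  # [c, weight of x[t-p], ..., weight of x[t-1]]

def predict_next(series, coeffs):
    """One-step-ahead prediction from the last p samples of the series."""
    p = len(coeffs) - 1
    window = np.asarray(series[-p:], dtype=float)
    return float(coeffs[0] + window @ coeffs[1:])

# Toy series obeying x[t] = x[t-1] + x[t-2]; an order-2 model can recover it.
toy = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
toy_coeffs = fit_ar(toy, 2)
toy_forecast = predict_next(toy, toy_coeffs)
```

A fixed set of coefficients of this kind describes a stationary relation between past and future samples, which is why such models may struggle when the generating process changes over time.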
This paper uses Intervals’ Numbers (INs) for predicting the maturity of grapes based on real-world measurements by a parametric IN-regressor model implementable by a neural network architecture. Note that INs have originally been introduced, under the name Fuzzy Intervals’ Numbers (FINs), in the context of fuzzy set theory [10]. The interpretation of INs was later extended in the context of the Lattice Computing (LC) information processing paradigm [11]. In particular, an IN has been defined as a mathematical object that can be interpreted either as a fuzzy interval or as a distribution of samples; the latter interpretation implies that an IN can potentially represent all-order statistics [11]. The mathematical properties of the set of INs have been studied for a long time. An algebra of INs has been established [10,12,13]. INs have been employed in logic and reasoning applications [14,15]. Furthermore, they have been employed in interpolation/extrapolation applications [16].
INs have already been applied to time-series classification regarding electroencephalography (EEG) signals [17]. Recently, INs have been employed toward predicting the maturity of grapes [18]. This work is a follow-up of [18] with the following novelties. First, additional computational experiments are carried out using (1) the previous three and four INs to predict a future IN and (2) fewer data for training. Second, a recursive scheme here demonstrates an IN-regressor’s capacity to learn the physical “law” that generates the non-stationary time-series of INs regarding grape maturity. Third, the problem of time-series forecasting in agriculture is described, in mathematical terms, as a non-stationary time-series forecasting problem, thus bringing INs to the foreground as an object for time-series processing in other domains, with the advantage that an IN represents a distribution of data including all-order data statistics.
The layout of the paper is as follows: Section 2 presents two IN-regressor models for prediction. Section 3 details experimental application results. Finally, Section 4 summarizes our contribution and discusses potential future work extensions.
2. IN-Regressor Parametric Models for Prediction
Figure 1 displays an IN in its two equivalent representations, namely the membership–function–representation in Figure 1a and the interval–representation in Figure 1b. More specifically, the membership–function–representation is identical to a probability distribution function and is therefore amenable to interpretation, whereas the equivalent interval–representation lends itself to useful algebraic operations in the context of mathematical lattice theory.
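As an illustration of the interval–representation, the following Python sketch induces an IN from a set of samples. The construction is an assumption made for illustration and is not a procedure stated in this paper: the interval at level h is taken between the h/2 and 1 − h/2 sample quantiles, so that the intervals are nested and the top level h = 1 collapses to the median. The function name induce_in is hypothetical, and the default of 32 levels mirrors the L = 32 levels used in the experiments.

```python
import numpy as np

def induce_in(samples, levels=32):
    """Return a (levels, 2) array of intervals [a_h, b_h] for h = 1/levels, ..., 1.

    Assumed construction: a_h and b_h are the h/2 and 1 - h/2 sample quantiles.
    """
    x = np.asarray(samples, dtype=float)
    h = np.arange(1, levels + 1) / levels
    lower = np.quantile(x, h / 2.0)        # left end of the interval at level h
    upper = np.quantile(x, 1.0 - h / 2.0)  # right end of the interval at level h
    return np.column_stack([lower, upper])

# Stand-in data: pixel intensities 0..100, one sample per value.
example_in = induce_in(range(101))
```

Each row of the returned array is one interval, ordered from the widest (lowest level) to the narrowest (highest level), so an IN with L levels is stored as 2L numbers regardless of how many samples it summarizes.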
The space of INs is known to be a cone in a linear space. Therefore, linear models such as ARMA models can be developed. The work in [18] has proposed a nonlinear model in the space of INs implemented by a three-layer feed-forward neural network, namely the IN-based Neural Network, or INNN for short, with $N = 2$ inputs. In this work, the number of inputs of the INNN is increased to $N$ as shown in Figure 2. More specifically, the input to the INNN is an N-tuple $(F_{k+1}, \ldots, F_{k+N})$ of INs, where $k \in \{0, \ldots, n - N\}$ and $n$ is the total number of INs in a time-series $F_1, \ldots, F_n$. The INNN is trained to learn the mapping from an N-tuple $(F_{k+1}, \ldots, F_{k+N})$ to the true output IN $F_{k+N+1}$. In other words, the nonlinear regressor model implemented by the INNN is trained to learn the physical “law” that generates the non-stationary time-series of INs regarding grape maturity. More specifically, a sliding window of size $N$ INs is used at times $k \in \{0, \ldots, n - N\}$ to generate an N-tuple $(F_{k+1}, \ldots, F_{k+N})$ of INs. We point out that an IN in this work represents grape image data regarding the maturity of a grape bunch as described in [18].
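The sliding window can be sketched in Python as follows. The function name sliding_windows is hypothetical, and string labels stand in for INs. Python lists are 0-based, so series[k] plays the role of the (k+1)-th IN. Only windows that have a measured target inside the series are generated here, i.e., k runs from 0 to n − N − 1; the final window k = n − N is omitted because its target would lie beyond the last IN of the series.

```python
def sliding_windows(series, n_inputs):
    """Yield (k, inputs, target): inputs are N consecutive items, target is the next one."""
    n = len(series)
    for k in range(n - n_inputs):
        yield k, series[k:k + n_inputs], series[k + n_inputs]

# Labels F1..F13 stand in for the n = 13 INs of the time-series, with N = 3.
labels = [f"F{i}" for i in range(1, 14)]
pairs = list(sliding_windows(labels, 3))
```

With n = 13 and N = 3, this generates ten input/target pairs, from (F1, F2, F3) with target F4 up to (F10, F11, F12) with target F13.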
The architecture in Figure 2 is trained to learn a difference equation that calculates an estimate of the true future IN $F_{k+N+1}$ based on the $N$ past INs $F_{k+1}, \ldots, F_{k+N}$. Learning involves the calculation of a set of parameters that minimize the error between the estimate and the true output IN $F_{k+N+1}$ induced from a training image. Algorithm 1 describes the training of the INNN based on a genetic algorithm (GA) [18]. In particular, error minimization is pursued by a GA whose cost function is the metric distance between the estimate and the true output IN $F_{k+N+1}$. Each chromosome in the population encodes a set of parameters including (a) the weights of the neural network, (b) the parameters of the activation function of each neuron, and (c) the biases of all neurons. In this case, the activation function is a sigmoid.
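The text above specifies only that the cost is a metric distance between INs. As a hedged illustration, the following Python sketch assumes one simple choice: at each level, the distance between two intervals is the sum of the absolute differences of their end points, and the IN distance is the average over all levels. The function name in_distance and this specific form are assumptions, not necessarily the metric used in the experiments.

```python
import numpy as np

def in_distance(f, g):
    """Average over levels of |a_h - c_h| + |b_h - d_h| for INs stored as (L, 2) arrays."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    if f.shape != g.shape:
        raise ValueError("both INs must use the same number of levels")
    return float(np.mean(np.abs(f - g).sum(axis=1)))

# Two toy INs with three levels each; g is f shifted to the right by one unit.
f = np.array([[0.0, 4.0], [1.0, 3.0], [2.0, 2.0]])
g = f + 1.0
shift_cost = in_distance(f, g)
```

Under this assumption, a prediction that has the correct shape but is shifted along the measurement axis is penalized in proportion to the shift, at every level of the IN.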
Algorithm 1. IN-Regressor Training by a Genetic Algorithm (GA)
Consider the training data set.
Generate an initial population of parameter sets.
for g generations do
    Evaluate individuals using the distance between the IN-regressor computed output IN (prediction) estimate and the true output IN.
    Apply the genetic operators.
end for
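The loop of Algorithm 1 can be sketched in Python as follows. Algorithm 1 does not specify the genetic operators, so truncation selection with elitism, uniform crossover, and Gaussian mutation are assumptions chosen for brevity. The function name genetic_minimize and its parameters are hypothetical, and a toy quadratic cost stands in for the distance between the predicted and the true output IN.

```python
import random

def genetic_minimize(cost, n_params, pop_size=30, generations=50,
                     mutation_scale=0.1, seed=0):
    """Minimize cost(params) over flat lists of floats with a simple GA."""
    rng = random.Random(seed)
    # Generate an initial population of parameter sets.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Evaluate individuals (lower cost is better) and rank them.
        population.sort(key=cost)
        history.append(cost(population[0]))
        # Elitism: the better half survives unchanged.
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, mutation_scale)
                     for pair in zip(mother, father)]
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return best, history + [cost(best)]

def toy_cost(params):
    """Stand-in cost: squared distance of every parameter from 0.5."""
    return sum((p - 0.5) ** 2 for p in params)

best_params, best_costs = genetic_minimize(toy_cost, n_params=3)
```

Because the better half of each generation is carried over unchanged, the best cost recorded per generation never increases. For an IN-regressor, the flat parameter list would hold the weights, the activation function parameters, and the biases listed above.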
In contrast to the INNN model presented in [18], which used only measured INs as inputs, the IN-regressor model in this paper uses, in addition, previous predictions as inputs to calculate predictions for future days. Figure 3 delineates the operation of the recursive IN-regressor with $N$ inputs. In other words, for the calculation of a maturity prediction for a particular day, one or more of the input INs are actually previous predictions. In this manner, the recursive scheme in Figure 3 tests the capacity of the proposed IN-regressor to learn the physical “law” that generates a time-series of INs regarding grape maturity.
In terms of Computational Intelligence, the proposed IN-regressor model can be interpreted as a multilayer Fuzzy Inference System (FIS) for deep learning. Consequently, knowledge is induced from the data in the form of rules; furthermore, fuzzy lattice reasoning (FLR) explanations of the IN-regressor answers can be given as demonstrated below.
3. Experimental Results
Grape maturity at harvest time is based on the composition balance of several maturity-related chemical compounds and sensory attributes such as color and taste [19]. In order to exploit composition changes and decide on the optimal harvest time, it is necessary to perform sensory assessments, i.e., ripeness evaluation, ideally by using non-destructive methods. Toward this end, Red–Green–Blue (RGB) color imaging has been used to calculate the color intensity distribution on grape images while ripening. More specifically, the green channel histogram was represented by an IN [18].
This work extends the IN-regressor in [18], whose results are partly presented below for comparison. More specifically, in [18], only two inputs were used, i.e., $N = 2$ in Figure 2, whereas, in this work, additional experiments were carried out using both $N = 3$ and $N = 4$ in order to study the robustness of the prediction when incrementally more past data were used for prediction. Moreover, in [18], three different training modes were used, namely (a) only One (the first) data sample, (b) Every Other data sample, and (c) nearly the First Half of the data samples, whereas, in this work, an additional training mode was used, namely (d) nearly the First Third of the data samples, in order to study the robustness of the prediction when incrementally more data were used to train the IN-regressor. Each IN-regressor was trained by the GA in Algorithm 1.
A trained IN-regressor was tested in two different modes, namely “forward” and “recursive”, using all the remaining (non-training) data. During “forward” testing, the N-tuple of INs input to the IN-regressor included exclusively real (true) INs induced from images, whereas, during “recursive” testing, the N-tuple of INs input to the IN-regressor progressively included ever more of its previous IN predictions as shown in Figure 3. Both “forward” and “recursive” testing were preceded by the same training.
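The difference between the two testing modes can be sketched in Python as follows. The function names and the predict argument are hypothetical; predict stands for any trained regressor that maps a window of N past items to the next one. Scalars stand in for INs to keep the example short, and start is assumed to be at least N so that a full window of measured data exists.

```python
def forward_predictions(series, n_inputs, predict, start):
    """One-step predictions for series[start:], using measured inputs only."""
    return [predict(series[t - n_inputs:t]) for t in range(start, len(series))]

def recursive_predictions(series, n_inputs, predict, start):
    """Predictions for series[start:], feeding each prediction back as an input."""
    window = list(series[start - n_inputs:start])  # last measured values
    outputs = []
    for _ in range(start, len(series)):
        estimate = predict(window)
        outputs.append(estimate)
        window = window[1:] + [estimate]  # drop the oldest item, append the prediction
    return outputs

def extrapolate(window):
    """Stand-in regressor: linear extrapolation from the last two items."""
    return 2 * window[-1] - window[-2]

measured = [0, 1, 2, 3, 5, 8]
forward = forward_predictions(measured, 2, extrapolate, start=3)      # [3, 4, 7]
recursive = recursive_predictions(measured, 2, extrapolate, start=3)  # [3, 4, 5]
```

In the recursive mode, after N steps the window consists of predictions only, so errors can accumulate over the forecasting horizon. This is why the recursive mode is a stricter test of whether the regressor has captured the generating “law”.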
Each training/testing mode has a particular experimental value, as explained in the following. More specifically, the One, First Third, and First Half modes use progressively more training data, whereas the Every Other mode indicates the effects of reducing the sampling rate by a factor of 2, i.e., sampling every other day. Finally, the recursive scheme, in particular, demonstrates an IN-regressor’s capacity to learn the physical “law” that generates the time-series of INs regarding grape maturity toward achieving long-term predictions. In the experiments below, an interval-representation of an IN included $L = 32$ levels; moreover, the time-series of $n = 13$ INs in [18] was used.
The results for $N = 2$ have been detailed in [18]. Table 1, Table 2, Table 3 and Table 4 detail the results for $N = 3$. Table 5 summarizes the results for all training/testing modes for $N = 2$, $N = 3$, and $N = 4$.
Table 5 demonstrates that, as $N$ increases, the training error decreases together with the corresponding standard deviation, due to more accurate predictions. A similar observation holds for the testing error, for the same reason, even though the testing error is significantly larger, with a significantly larger standard deviation. For constant $N$, as the number of training data increases, so does the training error, due to “curve-fitting” problems; nevertheless, the corresponding standard deviation often appears to decrease. However, an IN-regressor demonstrates a good capacity for generalization on the testing data because, for constant $N$, as the number of training data increases, the testing error decreases, even though it is clearly larger than the corresponding training error. Especially promising is the performance of the IN-regressor in the recursive mode for $N = 4$ when the First Half of the data were used for training. In that case, an average error of 6.65 with a standard deviation of 2.32 was recorded, compared to 4.30 and 0.96, respectively, in the forward mode. The significance of the latter is that the proposed IN-regressor could potentially make accurate long-term predictions, thus providing time to engage both human labor and equipment for grape harvest.
An N-tuple of INs input to the IN-regressor, followed by its corresponding output IN, can be interpreted as a fuzzy rule (i.e., knowledge) of a “Mamdani type” FIS, induced from the training data as indicated in Figure 4, where Figure 4a shows the rule’s antecedent and Figure 4b shows the rule’s consequent.
4. Discussion and Conclusions
Agriculture 4.0 [1], including viticulture, calls for intelligent decision-making. Of special interest is the accurate prediction of grape maturity in order to engage both human labor and equipment in time for harvest. This work has proposed a parametric regressor model, namely the IN-regressor, for grape maturity prediction.
The IN-regressor processes Intervals’ Numbers (INs) with the advantage that an IN represents a distribution of data including all-order data statistics. Hence, instead of representing the maturity status of grapes by a few numbers, e.g., the mean and standard deviation of a number of measurements, the maturity status of grapes is represented by a distribution of measurements, i.e., all-order data statistics, toward better decision-making. A neural network architecture, namely the INNN, with N inputs (INs) and one output (IN) was shown to implement the proposed IN-regressor.
Extensive computational experiments here have demonstrated that an IN-regressor can accurately predict the grape maturity status, especially for larger N as well as for more training data. Therefore, the IN-regressor can be used for predicting the grape harvest time. Furthermore, especially promising is a recursive IN-regressor scheme for long-term prediction.
The proposed IN-regressor has been interpreted as a deep learning FIS with a capacity to suggest explanations for its answers by “Mamdani type” fuzzy rules.
Technical future work will pursue one (or more) neural network layer(s) at the input as a filter that normalizes the effects of taking a grape image at different azimuth/altitude/distance/lighting conditions, etc. Furthermore, a faster optimization algorithm will be pursued instead of a GA. An extension of this work can also demonstrate far more experimental results using data already acquired in the field.
As grapes mature, their image statistics change with time. Therefore, since an IN represents a distribution of image statistics regarding grape maturity, it follows that a time-series of INs by definition represents a non-stationary time-series process. Hence, the proposed IN-regressor can be used for predicting a future probability distribution function from past probability distribution functions in a non-stationary time-series. Therefore, apart from agriculture, this work has presented potentially useful instruments for other application domains including the environment [20], medicine [21], econometrics [22], stock-market data [23], and others.