Article

A Cloud-Based Intelligence System for Asian Rust Risk Analysis in Soybean Crops

by Ricardo Alexandre Neves 1,2,*,† and Paulo Estevão Cruvinel 1,3,*,†
1 Post-Graduation Program in Computer Science, Federal University of São Carlos, UFSCar, São Carlos 13565-905, SP, Brazil
2 Federal Institute of São Paulo, IFSP, São João da Boa Vista 13871-298, SP, Brazil
3 Brazilian Agricultural Research Corporation, Embrapa, São Carlos 13561-206, SP, Brazil
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
AgriEngineering 2025, 7(7), 236; https://doi.org/10.3390/agriengineering7070236
Submission received: 30 May 2025 / Revised: 30 June 2025 / Accepted: 4 July 2025 / Published: 14 July 2025

Abstract

This study presents an intelligent method for evaluating the risk of Asian rust (Phakopsora pachyrhizi) based on its development stage in soybean crops (Glycine max (L.) Merrill). It has been designed using smart computer systems supported by image processing, environmental sensor data, and an embedded model for evaluating favorable conditions for disease progression within crop areas. The approach also includes the use of machine learning techniques and a Markov chain algorithm for data fusion, aimed at supporting decision-making in agricultural management. Rules derived from time-series data are employed to enable scenario prediction for risk evaluation related to disease development. Measured data are stored in a customized system designed to support virtual monitoring, facilitating the evaluation of disease severity stages by farmers and enabling timely management actions.

1. Introduction

Advances in digital agriculture have fostered the development of risk management tools [1,2,3,4]. Applications in both academia and the marketplace require a significant amount of data, which must be stored and managed before being applied to different models.
Various models and algorithms are used to collect, process, and transform this data, and the results obtained aid in decision-making. Different decision support systems are available for various agricultural crops [5]. Among these, significant attention has been paid to the soybean crop (Glycine max (L.) Merrill). Soy is considered one of the most important legumes in the world and generally has high protein content and numerous nutrients and bioactive factors beneficial to human life.
In 2024, global soybean production reached 420.762 million metric tons, with Brazil and the United States leading the way. These two countries are consistently the top producers (Figure 1), followed by Argentina, China, and India [6].
However, in all the productive soybean countries, yield losses due to soybean diseases vary between harvests and have been considerable over time. Moreover, in such a context, Asian soybean rust (ASR) is one of the most serious diseases affecting soybean crops worldwide. It is known for causing premature defoliation, early maturation, and significant yield losses, potentially reaching up to 90% or total damage in crop areas [7].
Figure 1. A balance of the harvests related to the world’s main soybean-producing countries by year (in millions of tons); forecasts as of June 2023 [8].
In Brazil, according to Godoy and collaborators, soybeans became an economically important product in the 1970s, and their significance in the global agricultural market has increased ever since. Despite this opportunity, data from the 2022–2023 harvest periods indicate that ASR occurred in all the soybean-producing regions of the country. In fact, ASR has been reported at different phenological stages of soybean plants, with a predominance of favorable occurrences between the reproductive (R) stages R4, R5, and R6 [9]. This phenological window typically falls between the 85th and 95th day of the crop's cycle.
Various factors, such as disease spread, climatic conditions, and the influence of other environmental variables, are important for understanding regional severity indices and their direct or indirect impact on losses.
The fungus Phakopsora pachyrhizi is the pathogen responsible for ASR [10]. In its initial stage, the disease appears as yellowish or orange spots; in the intermediate stage, these spots expand into larger reddish areas. In the advanced stage, the affected areas become tan, covering large portions of the leaf.
Owing to different climatic conditions, Brazil has a diversity of soybean-growing regions, with approximately 44,062.6 thousand hectares currently in use. Thus, making generalized recommendations to control a factor that directly influences the severity of ASR and covers all regions is not possible; solutions must be adaptive and customized. The variables that directly contribute to ASR infection are related to the duration of leaf wetness (6–12 h) at 15–28 °C. Rainfall near the dew point contributes directly to the infection and sporulation of the fungus that causes ASR, accelerating epidemics with regional spread. Ref. [11] reaffirmed that the duration of leaf wetness and night air temperature directly affect the spread of ASR and encouraged the use of methods that can measure or estimate the period of leaf wetness using relative humidity (RH). Similarly, some researchers highlighted rainfall as the leading cause of variation in the severity of ASR epidemics, given the correlation between rainfall and disease severity. Lelis et al. used two models to assess conditions favorable to ASR development: one counted the number of hours with RH ≥ 90%, and the other used a dew point depression of <2 °C. In both models, the working temperature range was 18–25 °C, which is considered ideal for the development of the fungus causing Asian rust. Consequently, in Brazil, July and August were identified as having the least favorable conditions for fungus development, whereas October–April was identified as the period with the most favorable conditions.
According to Bedin [12], plants with nutritional deficiencies are more susceptible to pathogen attacks than adequately nourished ones. Ref. [13] emphasized that models should integrate meteorological data, crop and disease information, and other inoculum sources (e.g., contagion or diffusion), as well as wind direction and speed, temperature, RH, leaf wetness, solar radiation intensity, and crop development stage. Mathematical models have also been used to predict soybean diseases, with researchers drawing on varying inputs such as epidemiological knowledge and statistical methods [14,15]. In addition, Zagui and co-authors [16] developed a spatio-temporal model based on a fuzzy system to simulate ASR. Their approach integrated input variables into the decision model, including pathogen presence, susceptible plants, and favorable environmental conditions, thereby providing information on the region's vulnerability to the disease. Yu and collaborators [17] introduced a method for recognizing soybean leaf diseases using traditional deep learning models (AlexNet, ResNet18, ResNet50, and TRNet50). They proposed a model based on an enhanced deep learning algorithm, which enabled effective recognition of soybean leaf diseases.
Recent studies have reinforced the role of mathematical models, hyperspectral sensors, and machine learning algorithms in advancing ASR monitoring and control strategies [18,19,20]. Some authors proposed a mechanistic model based on differential equations to simulate the initial phases of the disease epidemic, incorporating climatic variables and plant characteristics [21,22,23]. Other authors proposed the DC2Net model, which integrates advanced neural network techniques with hyperspectral imaging, achieving high accuracy in early ASR detection, including asymptomatic stages [24]. In contrast, other studies employed algorithms such as Random Forest (RF) and Support Vector Machine (SVM) to classify disease severity based on spectral data, demonstrating both precision and large-scale applicability [25,26]. Climatic risk assessments were also investigated, as in the study that mapped the Brazilian regions most susceptible to ASR based on historical meteorological data [27]. Complementarily, another study applied machine learning techniques to multispectral images obtained via drones to estimate soybean defoliation levels, highlighting the potential of precision agriculture in monitoring symptoms associated with ASR [28]. Likewise, some authors mentioned that digital images, acquired using drones or satellites, can be used to assess severity states [29].
However, in such contexts, relying solely on images, especially those based on partial climatic information, is not sufficient to achieve a complete and precise diagnosis, particularly when false-positive results must be minimized.
In fact, based on the literature and state-of-the-art data fusion techniques, it has become possible to observe opportunities to structure a complete rule base that systematically accounts for different situations in which ASR is likely to occur. Then, an intelligent decision support system can also be defined to assist producers in controlling such an important disease problem, including the rational and localized use of fungicide applications. This study aimed to present such a new method for evaluating the stage of favorability of ASR occurrence in a real crop area.

2. Materials and Methods

Accurately diagnosing the potential occurrence and severity of ASR in the field requires the integration of heterogeneous data. In this study, we combined specific climatic data with patterns observed in digital soybean leaf images and key agronomic parameters (cultivar, plant spacing, and planting period). In this context, a set of techniques, based on the literature and including advanced computational intelligence and vision algorithms for data fusion, was developed to support decision-making and operate in a cloud environment.
Thus, as illustrated in Figure 2, in addition to the materials cited below, the techniques employed are as follows. For data storage and analysis: data lake (DL), data warehouse (DW), data mart (DM), relational database (RD), object storage (OS), autonomous database (AD), and extract, transform, load (ETL). For the computational instances: a data science environment (DSE) and an analytics cloud service (ACS). For interpolating the climatic data series: a cubic spline. For image processing: median filtering, segmentation based on histogram equalization and automatic thresholding, clustering based on K-means, and feature extraction based on HU moments, the Scale-Invariant Feature Transform (SIFT), and the Histogram of Oriented Gradients (HOG). For image pattern classification: principal component analysis (PCA) and SVM. In addition, for data fusion, two different models were evaluated: one representing the state of the art in the literature, namely spatio-temporal modeling and simulation based on fuzzy systems, and one based on hidden Markov chains.

2.1. Materials

The materials used included a dataset of soybean leaf images collected in a real field crop during cultivation [30], a dataset of climatic data [31], and a dataset containing information on the soybean plant cultivated and used in the experimental pilot [32]. These datasets have the following characteristics:
1. Image dataset: organized according to the protocol established by [33], where soybean leaves were collected from georeferenced plots and imaged under controlled laboratory lighting using a 24-megapixel digital camera. The images were acquired at a 90-degree angle with a 19-centimeter camera-to-leaf distance. The resulting dataset consists of sRGB images showing soybean leaves with various ASR symptoms against a complex background, with dimensions of 4128 × 3096 pixels, a resolution of 12.78 megapixels, and three color channels;
2. Climate data: station name and location; station code; municipality; latitude and longitude; start date; end date; measurement periodicity: daily;
3. Plant data: the crop variety (BRS-536 was used), distance between plants and rows, plant height, and number of plants per linear meter.
The primary computing infrastructure, contracted from Oracle Cloud, was configured as follows: a Virtual Cloud Network (VCN) established within a private subnet, contained in a compartment that manages security through policies and security lists. The architecture employs object storage for public data and images in various processing stages. Data processing is conducted by a data science service and a Linux compute instance, with analysis and monitoring provided by the cloud service, which users can access via a Python-based web API and an adequate interface. In addition, a workstation with the following configuration was also used: x64-based PC architecture; Advanced Micro Devices, Inc. (AMD), Santa Clara, CA, USA, 64-bit processor, 3893 megahertz (MHz); 64 gigabytes (GB) of physical memory; and the Microsoft Windows 10 operating system.

2.2. Methods

In relation to the methods, the data source is characterized by input data from public or private sources used in the decision model. These databases may originate from agencies under federal government control, third-sector entities including non-governmental organizations (NGOs), or directly from agricultural producers, provided the relevant variables are measured using sensors.
The daily historical data included the following climatic variables available for public access: total precipitation (mm); maximum temperature (°C); minimum temperature (°C); compensated average temperature (°C); RH (%); and dew point (°C).
Data structuring determines the organization of the data from the data source stage. The following components were used for the structuring (Figure 3): (1) different data sources; (2) data lake; (3) data marts; (4) data warehouse; (5) relational database; (6) data preparation; (7) quality requirements; and (8) data vector.
The infrastructure for organizing the data was planned to meet four possible scenarios: (1) the input of data exported via data marts from legacy systems; (2) the input of semi-structured and unstructured data via data lake; (3) the input of only structured data using data lake and storage in the relational database; and (4) a combination of the three previous scenarios, i.e., the use of input data via data marts and semi-structured, unstructured, and structured data. Algorithm 1 illustrates the steps for structuring the databases in pseudocode.
Algorithm 1: Data Structuring
input: d_1 — climatic data; d_2 — leaf images; d_3 — plant data (seeds, spacing, and location)
output: data vector
 1: d_1 ← climatic data
 2: d_2 ← leaf images
 3: d_3 ← plant data
 4: dimensions ← integrity, consistency, completeness
 5: procedure begin
 6:     s_1 ← Func_receive_data(d_1, d_2, d_3)
 7:     s_2 ← Func_direct_data(s_1)
 8:     s_3 ← Func_prepare_data(s_2)
 9:     s_4 ← Func_validate_quality(s_3, dimensions)
10:     s_5 ← Func_generate_data_vector(s_4)
11: end procedure
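In Python, the flow of Algorithm 1 might be sketched as follows; every function body here is an illustrative placeholder (the paper does not specify the implementations), shown only to make the data flow concrete:

```python
# Sketch of Algorithm 1 (data structuring). All function bodies are
# hypothetical placeholders, not the authors' implementation.

def receive_data(d1, d2, d3):
    # s1: gather the three raw inputs into one package
    return {"climate": d1, "images": d2, "plant": d3}

def direct_data(s1):
    # s2: route each input to its storage target (data lake / data mart / RDB)
    return {k: {"target": "data_lake", "payload": v} for k, v in s1.items()}

def prepare_data(s2):
    # s3: ETL-style preparation (cleaning, typing, normalization)
    return {k: v["payload"] for k, v in s2.items()}

def validate_quality(s3, dimensions):
    # s4: check the quality dimensions (integrity, consistency, completeness)
    assert all(v is not None for v in s3.values())
    return {"data": s3, "checked": list(dimensions)}

def generate_data_vector(s4):
    # s5: flatten the validated data into a single data vector
    return [s4["data"]["climate"], s4["data"]["images"], s4["data"]["plant"]]

def structure_data(d1, d2, d3):
    dimensions = ("integrity", "consistency", "completeness")
    s1 = receive_data(d1, d2, d3)
    s2 = direct_data(s1)
    s3 = prepare_data(s2)
    s4 = validate_quality(s3, dimensions)
    return generate_data_vector(s4)
```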
The complex background of the images constituting the dataset was removed, and image segmentation was applied automatically to investigate the color characterization of the disease. In this context, the band-pass thresholding technique (Equation (1)) was used, which consists of selecting a range of threshold values applied uniformly to all the pixels in the image [34,35]. Pixel values within the specified range are assigned to one category, while values outside this range are assigned to another.
f(c_x, c_y) = \begin{cases} 1, & \text{if } LM_{min} \le I(c_x, c_y) \le LM_{max} \\ 0, & \text{otherwise} \end{cases} \qquad (1)

where I(c_x, c_y) is the pixel value at position (c_x, c_y) of the image, LM_{min} is the lower threshold, and LM_{max} is the upper threshold.
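A minimal, vectorized sketch of the band-pass thresholding in Equation (1); the threshold values in the usage are illustrative, not the ones calibrated in the study:

```python
import numpy as np

def bandpass_threshold(image, lm_min, lm_max):
    """Equation (1): pixels inside [lm_min, lm_max] map to 1, all others to 0."""
    image = np.asarray(image)
    return ((image >= lm_min) & (image <= lm_max)).astype(np.uint8)
```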
For this stage of processing, the following quality indicators of the processed data were considered: the image histogram, the mean squared error (MSE) metric (Equation (2)), the peak signal-to-noise ratio (PSNR) (Equation (3)), the structural similarity index measure (SSIM) (Equation (4)), and outliers.
\mathrm{MSE} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( I_A[i,j] - I_B[i,j] \right)^2 \qquad (2)

where m and n are the width and height of the images, respectively; I_A[i,j] and I_B[i,j] are the values of the pixels at position (i, j) in images I_A and I_B, respectively.
\mathrm{PSNR} = 10 \cdot \log_{10} \left( \frac{MVP^2}{\mathrm{MSE}} \right) \qquad (3)

where MVP represents the maximum value of a pixel in the image (255 for images with 8 bits per channel, as is the case with RGB color images) and MSE is the mean squared error between the reference and processed images.
\mathrm{SSIM}(I_A, I_B) = \frac{(2 \mu_{I_A} \mu_{I_B} + \alpha_1)(2 \sigma_{I_A I_B} + \alpha_2)}{(\mu_{I_A}^2 + \mu_{I_B}^2 + \alpha_1)(\sigma_{I_A}^2 + \sigma_{I_B}^2 + \alpha_2)} \qquad (4)

where I_A and I_B are the reference and processed images, respectively; μ_{I_A} and μ_{I_B} are the averages of the pixel values in I_A and I_B; σ_{I_A} and σ_{I_B} are the standard deviations of the pixel values; σ_{I_A I_B} is the covariance between the pixel values in I_A and I_B; α_1 and α_2 are small constants added to avoid division by zero and stabilize the calculation, such that α_1 = (k_1 · L)^2 and α_2 = (k_2 · L)^2, where L is the dynamic range of the pixel values (e.g., 255 for 8-bit-per-channel images) and k_1 and k_2 are predefined constants.
The MSE, PSNR, and SSIM are frequently applied to evaluate image quality in various image-processing tasks. These metrics compare images under different sensitivities and degradation contexts, independently of human perception [36,37,38].
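The three quality metrics can be sketched as follows; note that the SSIM here is computed globally over the whole image, whereas library implementations typically use a sliding window:

```python
import numpy as np

def mse(a, b):
    """Equation (2): mean squared error between two equally sized images."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.mean((a - b) ** 2)

def psnr(a, b, max_val=255.0):
    """Equation (3): peak signal-to-noise ratio in decibels."""
    e = mse(a, b)
    return float("inf") if e == 0 else 10.0 * np.log10(max_val ** 2 / e)

def ssim(a, b, max_val=255.0, k1=0.01, k2=0.03):
    """Equation (4), computed globally (a simplification of windowed SSIM)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
```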
In this study, a pattern recognition technique was used to extract ASR characteristics from the soybean leaf images. For this, a set of color descriptors defining the patterns was obtained using the Scale-Invariant Feature Transform (SIFT) technique [39] (Equations (5)–(8)) and the HU invariant moment technique [40] (Equations (9)–(20)), and texture descriptors were obtained using the Histogram of Oriented Gradients (HOG) technique [41] (Equations (21)–(25)).
G(c_x, c_y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(c_x^2 + c_y^2)/(2\sigma^2)} \qquad (5)

L(c_x, c_y, \sigma) = G(c_x, c_y, \sigma) * I(c_x, c_y) \qquad (6)

where G(c_x, c_y, σ) is the key location function at a point (c_x, c_y) in the image and at a specific scale σ, representing the response of the Gaussian filter at that position and scale; c_x and c_y are the horizontal and vertical coordinates; σ is the Gaussian scaling parameter that controls the size of the Gaussian filter (the higher the value of σ, the larger the filter and the smoother the response; the smaller the value of σ, the sharper the response, but the greater the sensitivity to details); and L(c_x, c_y, σ) is the scale-space image obtained by convolving (∗) the Gaussian filter with the image I(c_x, c_y).
M_{ij} = \sqrt{(A_{ij} - A_{i+1,j})^2 + (A_{ij} - A_{i,j+1})^2} \qquad (7)

where M_{ij} is the magnitude of the gradient at position (i, j) of the image, representing the change in pixel intensities around that position and calculated from the pixel differences via the Pythagorean theorem; A_{ij} is the value of the pixel at position (i, j) of the original image, representing the intensity or color value at that position; A_{i+1,j} and A_{i,j+1} are the values of the adjacent pixels at positions (i+1, j) and (i, j+1), respectively.
R_{ij} = \mathrm{ATAN2}(A_{ij} - A_{i+1,j},\ A_{i,j+1} - A_{ij}) \qquad (8)

where R_{ij} is the orientation of the gradient at position (i, j) of the image, representing the direction in which the greatest change in intensity occurs in the vicinity of the pixel (i, j); A_{ij} is the value of the pixel at (i, j) of the original image; A_{i+1,j} and A_{i,j+1} are the values of the adjacent pixels at positions (i+1, j) and (i, j+1), respectively.
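Equations (7) and (8) can be sketched directly from the pixel differences (a minimal illustration, not the full SIFT pipeline):

```python
import math

def gradient_mag_ori(A, i, j):
    """Equations (7) and (8): gradient magnitude and orientation at (i, j)
    from simple differences with the two adjacent pixels."""
    dx = A[i][j] - A[i + 1][j]
    dy = A[i][j] - A[i][j + 1]
    mag = math.hypot(dx, dy)                                        # Equation (7)
    ori = math.atan2(A[i][j] - A[i + 1][j], A[i][j + 1] - A[i][j])  # Equation (8)
    return mag, ori
```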
As for the geometric descriptors, the two-dimensional, central, and normalized central moments must first be calculated in order to obtain the seven HU invariant moments [42].
m_{pq} = \sum_{c_x=0}^{M-1} \sum_{c_y=0}^{N-1} c_x^{\,p}\, c_y^{\,q}\, f(c_x, c_y) \qquad (9)

where p = 0, 1, 2, … and q = 0, 1, 2, … are integers.
\mu_{pq} = \sum_{c_x=0}^{M-1} \sum_{c_y=0}^{N-1} (c_x - \bar{c}_x)^p (c_y - \bar{c}_y)^q f(c_x, c_y) \qquad (10)

where p = 0, 1, 2, … and q = 0, 1, 2, … are integers.
\bar{c}_x = \frac{m_{10}}{m_{00}} \quad \text{and} \quad \bar{c}_y = \frac{m_{01}}{m_{00}} \qquad (11)

where \bar{c}_x and \bar{c}_y are the coordinates of the center of mass of the image f(c_x, c_y). Along with these central and two-dimensional moments, the normalized central moments that constitute the set of HU invariant moments, given by Equations (12) and (13), were also considered.
\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\varsigma}} \qquad (12)

\varsigma = \frac{p + q}{2} + 1 \qquad (13)

where p + q = 2, 3, …
\phi_1 = \eta_{20} + \eta_{02} \qquad (14)

where φ_1 is the orthogonal invariant that refers to the first invariant moment; η_{20} is the second-order central moment, calculated from the image or region of interest (ROI), representing the dispersion of the distribution of pixels or voxels along the X axis; η_{02} is the second-order central moment, calculated from the image or ROI, representing the dispersion of the distribution along the Y axis.
\phi_2 = (\eta_{20} - \eta_{02})^2 + 4 \eta_{11}^2 \qquad (15)

where φ_2 is the second rotation-invariant moment, a measure of the geometric characteristics of the image or ROI; η_{11} is the second-order central moment between the X and Y axes, representing the covariance between the axes of the distribution of pixels or voxels.
\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \qquad (16)

where φ_3 is the third rotation-invariant moment, a measure of the geometric characteristics of the image or ROI; η_{30} is the third-order central moment along the principal X axis, representing the dispersion of the pixel or voxel distribution along that axis; η_{12} and η_{21} are third-order central moments involving displacement mixtures along the principal X and Y axes, representing the dispersion of the distribution due to interactions between the axes; η_{03} is the third-order central moment along the Y axis, representing the dispersion of the distribution along that axis.
\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \qquad (17)

where φ_4 is the fourth rotation-invariant moment, a measure of the geometric characteristics of the image or ROI.
\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (18)

where φ_5 is the fifth rotation-invariant moment, a measure of the geometric characteristics of the image or ROI.
\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \qquad (19)

where φ_6 is the sixth rotation-invariant moment, describing the geometric characteristics of an image or ROI.
\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (20)

where φ_7 is the seventh rotation-invariant moment, used to describe the geometric characteristics of an image or ROI.
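The moment chain of Equations (9)–(15) can be sketched as follows for the first two HU invariants (a NumPy illustration; production code would typically use a library routine such as OpenCV's HuMoments):

```python
import numpy as np

def hu_first_two(f):
    """Equations (9)-(15): raw, central, and normalized central moments,
    then the first two HU invariants phi1 and phi2."""
    f = np.asarray(f, float)
    cy, cx = np.indices(f.shape)          # row index = y, column index = x
    m = lambda p, q: np.sum((cx ** p) * (cy ** q) * f)            # Equation (9)
    m00 = m(0, 0)
    xbar, ybar = m(1, 0) / m00, m(0, 1) / m00                     # Equation (11)
    mu = lambda p, q: np.sum(((cx - xbar) ** p) * ((cy - ybar) ** q) * f)  # Eq. (10)
    eta = lambda p, q: mu(p, q) / m00 ** ((p + q) / 2 + 1)        # Equations (12)-(13)
    phi1 = eta(2, 0) + eta(0, 2)                                  # Equation (14)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2      # Equation (15)
    return phi1, phi2
```

A quick sanity check of the invariance property: rotating the image by 90 degrees leaves both values unchanged.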
Gr_{c_x} = \frac{\partial I}{\partial c_x}, \qquad Gr_{c_y} = \frac{\partial I}{\partial c_y} \qquad (21)

where Gr_{c_x} and Gr_{c_y} represent the derivatives of the image I with respect to the coordinates c_x and c_y, respectively; the gradients are calculated using Sobel differentiation operators.
Mag = \sqrt{Gr_{c_x}^2 + Gr_{c_y}^2} \qquad (22)

where Mag is the gradient magnitude calculated from the gradients Gr_{c_x} and Gr_{c_y}.
\Theta = \arctan\left(\frac{Gr_{c_y}}{Gr_{c_x}}\right) \qquad (23)

where Θ is the orientation of the gradient, calculated using the arc tangent function (arctan).
Hist(\theta) = \sum_{\text{pixels per cell}} w(\theta - \Theta) \qquad (24)

where Hist(θ) represents the orientation histogram for a given cell, a distribution showing the number of gradients with orientations in different angular ranges within the cell; θ is the orientation of the gradient at a given pixel within the cell; Θ is the predominant orientation of the gradients in the cell, often calculated from θ and used to weigh the contribution of each pixel to the histogram; w(θ − Θ) is a weighting function that determines the contribution of a specific gradient to the histogram based on the angular difference between θ and Θ; and the summation runs over all the pixels in the cell.
\vartheta' = \frac{\vartheta}{\sqrt{\|\vartheta\|_2^2 + \epsilon^2}} \qquad (25)

where ϑ represents the concatenated vector of orientation histograms in a block; ‖ϑ‖₂ indicates the Euclidean norm (or length) of the vector ϑ, calculated as the square root of the sum of the squares of the vector elements; ϵ is a small constant added inside the square root to avoid possible divisions by zero; and ϑ′ is the normalized block vector.
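A minimal sketch of the cell histogram (Equation (24)) and block normalization (Equation (25)); the binning scheme (9 unsigned orientation bins, magnitude-weighted votes) is the common HOG convention, assumed here rather than taken from the paper:

```python
import numpy as np

def hog_cell_histogram(cell_mag, cell_ori, n_bins=9):
    """Equation (24): unsigned-orientation histogram for one cell, with each
    pixel contributing its gradient magnitude as the weight w."""
    bins = np.zeros(n_bins)
    step = np.pi / n_bins                      # unsigned gradients: [0, pi)
    for mag, ori in zip(cell_mag.ravel(), cell_ori.ravel()):
        bins[int((ori % np.pi) / step) % n_bins] += mag
    return bins

def l2_normalize(v, eps=1e-5):
    """Equation (25): L2 normalization of the concatenated block vector."""
    v = np.asarray(v, float)
    return v / np.sqrt(np.sum(v ** 2) + eps ** 2)
```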
Thus, these descriptors were used to recognize patterns that constitute the image variable for the fusion model, and the recognition of pixels or even clusters of pixels in green, yellow, and brown was considered.
As part of the deliverables of this stage, the process considers feature vectors organized into green, yellow, and brown.
For this stage of processing, the following quality indicators of the processed data were considered: missing values and dimensionality reduction. Missing values were assessed at the point in the process where the feature data vectors were joined.
For the high dimensionality indicator, the dimensionality of the feature vector was reduced to 130 columns.
A machine learning technique was employed to classify the patterns identified in the images, corresponding to each crop leaf. The SVM classifier was applied to process the feature vectors extracted from these patterns. The SVM classifier utilizes functions known as kernels, as shown in Equation (26). A kernel represents abstract spaces and receives two objects, xo_i and xo_j, in the input space to compute their scalar product in the feature space, which may reach very high dimensions, where the computational cost of the mapping Φ can be substantial [43,44,45,46].
K(xo_i, xo_j) = \Phi(xo_i) \cdot \Phi(xo_j) \qquad (26)
For the kernel to represent mappings that facilitate the calculation of scalar products, according to the function defined in Equation (26), the conditions provided by Mercer's theorem were considered; these conditions give rise to positive semidefinite matrices K, where each element is defined by K_{ij} = K(xo_i, xo_j) for all i, j = 1, …, n, and Φ(xo_i) and Φ(xo_j) represent xo_i and xo_j, respectively, after applying the feature mapping function Φ(x).
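As an illustration of Mercer's condition, the Gram matrix of a valid kernel (here the RBF kernel, one common SVM choice) is positive semidefinite:

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix K with K_ij = K(x_i, x_j) for the RBF kernel, a kernel
    satisfying Mercer's theorem (Equation (26) holds for an implicit,
    infinite-dimensional feature map Phi)."""
    X = np.asarray(X, float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # squared pairwise distances
    return np.exp(-gamma * d2)
```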
In this study, the SVM technique was applied with a grid search to obtain the best hyperparameter configurations on the dataset of characteristics originating from soybean leaves. The machine learning processing generates metrics for model evaluation. Various statistical indicators and classification metrics were used, which are fundamental for understanding the quality of the predictions and the robustness of the model. These indicators allow an analysis ranging from data dispersion to the effectiveness of the classifications, providing a comprehensive view of the model's performance. Each metric used is presented individually below.
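The grid search step could be sketched with scikit-learn as follows; the synthetic data and the hyperparameter grid are illustrative stand-ins, not the study's actual 130-dimensional feature vectors or search space:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Stand-in data; the study uses feature vectors extracted from soybean leaves.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grid search over SVM hyperparameters; the grid values here are illustrative.
grid = GridSearchCV(
    SVC(),
    param_grid={"kernel": ["rbf", "linear"], "C": [0.1, 1, 10]},
    cv=3,
)
grid.fit(X_tr, y_tr)
test_accuracy = grid.score(X_te, y_te)
```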
The machine learning model was evaluated based on the following metrics [47]: variance, standard deviation, precision, accuracy, support, recall, F1-score, and area under the ROC curve (involving the true positive rate (TPR) and the false positive rate (FPR)), together with the confusion matrix.
The confusion matrix represents the distribution of classifications made by the model, comparing predicted values with actual values, and involves the measures true positive (TP), true negative (TN), false positive (FP), and false negative (FN). Although not a metric itself, it provides the necessary data for calculating key performance metrics such as precision, recall, and F1-score, which, in turn, compose the classification report.
After the classifier was applied, the dimensionality of the feature vector was reduced using the principal component analysis (PCA) technique. PCA is an unsupervised technique for dealing with high-dimensional data and is also known as the Karhunen–Loève transformation [48], the Hotelling transformation [49], or singular value decomposition [50].
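A minimal PCA sketch via singular value decomposition (one of the equivalent formulations cited in the text):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto the top principal components,
    computed with a singular value decomposition."""
    X = np.asarray(X, float)
    Xc = X - X.mean(axis=0)                        # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T                # scores in the reduced space
```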
For this stage of the processing, the following quality indicators of the processed data were considered: accuracy, precision, F1-score, recall (the classifier report), the TP, TN, FP, and FN of the confusion matrix, and the area under the ROC curve.
Algorithm 2 illustrates the steps involved in processing, such as segmentation, pattern recognition and feature extraction, dimensionality reduction, and machine learning in pseudocode.
Algorithm 2: Image Processing
A structured data vector was adopted for data fusion, combining variables from the climatic time series with those derived from the processing of digital images of soybean crop leaves. When structuring this variable vector (Figure 4), all time-series data are checked for gaps within the ten-day time windows considered for analysis. If any gap is found, data interpolation with a cubic spline is used (Equations (27)–(29)).
The cubic B-spline is a piecewise polynomial function; each piece is a 3rd-degree polynomial on the interval [x_{k−1}, x_k], k = 1, 2, …, n. This yields an interpolation formula that is smooth and continuous in the first and second derivatives, both within each interval and at its boundaries [51].
\Gamma(x_i) = \sum_{i=0}^{n-1} c_i B_{i,g;t}(\iota) \qquad (27)

where c_i are the coefficients, g represents the order of the B-spline, t represents the knots, and B_{i,g}(ι) is defined by Equations (28) and (29).
B_{i,0}(x_i) = \begin{cases} 1, & \text{if } t_i \le \iota < t_{i+1} \\ 0, & \text{otherwise} \end{cases} \qquad (28)
B_{i,k}(x_i) = \frac{x_i - t_i}{t_{i+k} - t_i} B_{i,k-1}(x_i) + \frac{t_{i+k+1} - x_i}{t_{i+k+1} - t_{i+1}} B_{i+1,k-1}(x_i) \qquad (29)
Furthermore, after Equations (27)–(29) were used to complete all the climatic data series, the rules for decision-making could be established. Such rules describe the set of conditions associated with the definition of favorability for ASR occurrences.
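Gap filling with a cubic spline (Equations (27)–(29)) might be sketched with SciPy as follows; the ten-day window logic is omitted for brevity:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def fill_gaps(days, values):
    """Fill missing entries (NaN) in a daily climate series using a cubic
    spline fitted on the observed points."""
    days = np.asarray(days, float)
    values = np.asarray(values, float)
    known = ~np.isnan(values)
    spline = CubicSpline(days[known], values[known])
    filled = values.copy()
    filled[~known] = spline(days[~known])   # interpolate only the gaps
    return filled
```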

2.3. Description of Data Fusion Process

This study compares two data fusion methods for addressing ASR: the first is based on the hidden Markov technique, while the second is a fuzzy logic approach, considered state-of-the-art in the literature [16].
The data fusion process using the hidden Markov chain technique [50,52,53] is based on the integration of variables from different sources and normalized physical quantities, as can be observed from data listed in Table 1.
In addition, this study considers a general rule base that integrates the main climate data and image patterns recognized from soybean leaves since they can be correlated, enabling risk assessment of disease severity and favorability diagnoses. Table 2 presents the general rule base for the decision-making process.
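A hidden-Markov-style fusion step can be sketched as follows; the three favorability states and all probability values are illustrative assumptions, not the calibrated values behind Tables 1 and 2:

```python
import numpy as np

# Hypothetical three-state favorability chain (low, medium, high); the
# transition, emission, and initial probabilities are illustrative only.
STATES = ["low", "medium", "high"]
A = np.array([[0.7, 0.2, 0.1],    # state transition matrix
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
B = np.array([[0.8, 0.15, 0.05],  # emission: P(observed symptom class | state),
              [0.2, 0.6, 0.2],    # columns: green, yellow, brown leaf patterns
              [0.05, 0.25, 0.7]])
PI = np.array([0.6, 0.3, 0.1])    # initial state distribution

def forward(observations):
    """Forward algorithm: normalized P(state | observations so far)."""
    alpha = PI * B[:, observations[0]]
    alpha /= alpha.sum()
    for o in observations[1:]:
        alpha = (alpha @ A) * B[:, o]
        alpha /= alpha.sum()
    return alpha

def most_likely_state(observations):
    return STATES[int(np.argmax(forward(observations)))]
```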
The data fusion based on fuzzy logic, as presented in the literature, is defined by [55]. In such an arrangement, four main types of decision functions are considered [56,57]: (1) Gaussian, (2) trapezoidal, (3) triangular, and (4) singleton. For this development, the triangular function was used, as described by Equation (30), given that L(χ) is a continuous, strictly increasing function with L(a) = 0 and L(b) = 1, and R(χ) is a continuous, strictly decreasing function with R(b) = 1 and R(c) = 0.
$$\mu_\alpha(\chi) = \begin{cases} 0, & \text{if } \chi < a \\ L(\chi), & \text{if } a \le \chi \le b \\ R(\chi), & \text{if } b \le \chi \le c \\ 0, & \text{if } \chi > c \end{cases}$$
Additionally, for a discrete universe $X$, the fuzzy set $\alpha$ was defined according to Equation (31), following Prokopowicz and collaborators [58].
$$\alpha = \sum_{\chi \in X} \mu_\alpha(\chi) / \chi$$
where $\mu_\alpha(\chi)$ is the membership degree of the element $\chi$, the “/” symbol denotes the pair separator (not division), and $\sum$ represents idempotent summation (not arithmetic addition).
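Assuming the usual linear forms for $L(\chi)$ and $R(\chi)$, the triangular function of Equation (30) can be sketched as follows; the band limits used in the example are hypothetical:

```python
def tri_membership(chi, a, b, c):
    """Triangular membership per Equation (30): L rises on [a, b], R falls on [b, c]."""
    if chi < a or chi > c:
        return 0.0
    if chi <= b:                          # L(chi), with L(a) = 0 and L(b) = 1
        return (chi - a) / (b - a)
    return (c - chi) / (c - b)            # R(chi), with R(b) = 1 and R(c) = 0

# Hypothetical "medium favorability" set on a 0-100 scale.
print(tri_membership(50.0, 33.4, 50.0, 66.6))   # peak of the set
```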
In fact, the concepts of fuzzy logic are organized into a fuzzy model given by the configuration of the antecedent and consequent variables. The form (if <antecedent> then <consequent>) is used, with conditions that can be fully or partially satisfied according to the fuzzy inference mechanism, which defines rule firing. The rules were constructed according to the Mamdani inference model [59], as presented in Table 3, which lists the constructed inferences for low, medium, and high favorability, together with the number of rule combinations generated for each inference.
The combinations arise from variations, translated from phenomenological knowledge of the ASR problem, expressed by the seven antecedent variables, V1 to V7, that feed the fuzzy model. These are composed with “OR” and “AND” conjunctions, forming unique rules. Therefore, summing all combinations across the three favorability possibilities yields 120 constructed rules, which comprise the rule base submitted to the fuzzy inference engine for data fusion and the decision-support method.
Moreover, the conditional fuzzy rules were defined in terms of the minimum t-norm function $(\wedge)$ and the maximum s-norm function $(\vee)$, as presented by Equations (32) and (33):
$$\alpha \, T \, \beta = \min(\alpha, \beta) = \alpha \wedge \beta$$
where $\alpha$ and $\beta$ are the fuzzy variables or sets being combined; $T$ is the t-norm operator representing the minimum operation (fuzzy AND) used to combine the fuzzy sets $\alpha$ and $\beta$; and $\alpha \wedge \beta$ means the function’s output is the minimum membership value between the two sets for a given element of the universe of discourse.
$$\alpha \, S \, \beta = \max(\alpha, \beta) = \alpha \vee \beta$$
where $\alpha$ and $\beta$ are the fuzzy variables or sets being combined, each of which can be a scalar value or a fuzzy set; $S$ is the s-norm operator representing the maximum operation (fuzzy OR); and $\alpha \vee \beta$ means the function’s output is the maximum of the membership values of $\alpha$ and $\beta$ for a given element of the universe of discourse.
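A minimal sketch of Equations (32) and (33) as used for rule firing, applied here to two hypothetical membership degrees:

```python
def fuzzy_and(alpha, beta):
    """Minimum t-norm, per Equation (32)."""
    return min(alpha, beta)

def fuzzy_or(alpha, beta):
    """Maximum s-norm, per Equation (33)."""
    return max(alpha, beta)

# Combining two hypothetical membership degrees:
print(fuzzy_and(0.7, 0.4))   # AND keeps the weaker evidence
print(fuzzy_or(0.7, 0.4))    # OR keeps the stronger evidence
```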
Conversely, the defuzzification process consists of calculating a representative numerical output $\beta_0 \in Y$ from the resulting fuzzy set $B(\beta)$ in $Y$. It therefore maps fuzzy sets in $F(Y)$ to a single numerical value in $Y$, i.e., $F(Y) \to Y$. The numerical result is calculated using the Center of Gravity (COG) method, via Equations (34) and (35) [58].
$$\beta_0 = \frac{\int_Y \beta \, \mu_B(\beta) \, d\beta}{\int_Y \mu_B(\beta) \, d\beta}$$
$$\mu_B(\beta) = \bigvee_{i=1}^{m} F^{(i)}(\chi_0) \, \mu_B^{(i)}(\beta)$$
where $\mu_B(\beta)$ represents the membership of $\beta$ in the fuzzy set $B$; $\beta$ is the output variable for which the membership in $B$ is calculated; $\bigvee_{i=1}^{m}$ is the supremum (maximum) operation over the $m$ fuzzy sets resulting from the fuzzy inference; $i$ indexes the fuzzy sets participating in the inference; $F^{(i)}(\chi_0)$ is the membership of the input variable $\chi_0$ in the fuzzy set $\alpha^{(i)}$; and $\mu_B^{(i)}(\beta)$ is the membership of $\beta$ in the fuzzy set $B^{(i)}$.
After defuzzification, a 5% error margin is applied to the resulting numerical value so that the favorability can be determined. The “favorability” consequent ranges from 0 to 100%, maintaining the standard used in the figure-of-merit approach. Given the defuzzified value, the result is low favorability from 0 to 33.3%, medium favorability from 33.4 to 66.6%, and high favorability from 66.7 to 100%. Algorithm 3, presented below, uses methods from the Scikit-Fuzzy library [60].
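The COG mapping and the favorability banding described above can be sketched discretely; the output universe and membership values below are hypothetical, not the study's calibrated sets:

```python
def cog_defuzzify(values, memberships):
    """Discrete Center of Gravity (Equation (34)): membership-weighted mean."""
    num = sum(v * m for v, m in zip(values, memberships))
    den = sum(memberships)
    return num / den if den else 0.0

def favorability_label(beta0):
    """Map the defuzzified value to the favorability bands used in the text."""
    if beta0 <= 33.3:
        return "low"
    if beta0 <= 66.6:
        return "medium"
    return "high"

# Hypothetical aggregated output set on a 0-100 favorability universe.
universe = list(range(0, 101, 10))
member = [0.0, 0.0, 0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1, 0.0, 0.0]
beta0 = cog_defuzzify(universe, member)
print(beta0, favorability_label(beta0))
```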
Algorithm 3: Data Fusion with Fuzzy Logic Approach
Agriengineering 07 00236 i002
Finally, system validation was conducted with input from five specialists in agronomy and phytopathology, focusing on soybean diseases and particularly Asian rust. A questionnaire was designed for expert evaluation, including various occurrence scenarios observed in a soybean cultivation area, along with corresponding tables of climatic data and digital images of the crop leaves. This setup allowed the consulted experts to assess the presence or absence of Asian rust, as well as its severity stage when applicable.

Regarding the hidden Markov chain [61], it is a doubly stochastic process with both observable and unobservable components. Hidden Markov chains extend Markov chains [62,63], defined as a stochastic model $\{X_n,\ n \in \mathbb{N}\}$ describing a sequence of events in which the probability of a future event depends only on the current state and not on previous states. This Markovian property is expressed as
$$\Pr(\zeta_n = \xi_n \mid \zeta_{n-1} = \xi_{n-1}, \ldots, \zeta_0 = \xi_0) = \Pr(\zeta_n = \xi_n \mid \zeta_{n-1} = \xi_{n-1})$$
where $P = (p_{ij})$ is the transition matrix governing the Markov chain; if $\zeta_n$ denotes the state of the chain at time $n$, then $p_{ij} = \Pr(\zeta_n = j \mid \zeta_{n-1} = i)$; that is, every entry of $P$ satisfies $p_{ij} \ge 0$, and every row of $P$ satisfies $\sum_j p_{ij} = 1$.
For the model developed, Markov chains have discrete states representing the possible conditions or configurations of the combination of variables constituting the input data vector. Each state in the discrete-time Markov chain corresponds to a discrete representation of the system’s situation at a given time. According to the probability model, changes in state are referred to as transitions. Transition probabilities describe the likelihood of these transitions between stages of favorability within a given period (time window).
In addition, the hidden Markov chain is characterized by a set of elements [64]. The first is $N$, the number of hidden states in the model, with the individual states denoted by
$$S = \{ s_1, s_2, \ldots, s_N \}$$
where $S$ represents the set of possible states that the Markov chain can assume; $s_1, s_2, \ldots, s_N$ are the individual state variables constituting $S$, each $s_i$ representing a specific state of the chain. The subscript $i$ ranges from 1 to $N$, where $N$ is the total number of states in the Markov chain.
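A minimal sketch of these definitions: a hypothetical three-state transition matrix whose rows sum to one, and the propagation of a state distribution through it:

```python
# Illustrative discrete Markov chain: a hypothetical 3-state transition matrix P
# (rows nonnegative and summing to 1) and the n-step distribution pi_n = pi_0 P^n.

P = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]

def step(dist, P):
    """One transition: next_j = sum_i dist_i * p_ij."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

assert all(abs(sum(row) - 1.0) < 1e-9 for row in P)   # stochastic rows
dist = [1.0, 0.0, 0.0]            # start surely in state s_1
for _ in range(3):                # propagate three transitions
    dist = step(dist, P)
print(dist)
```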
For the result delivery stage, a set of reports was considered. This set was used to construct the decision-making recommendations based on information from management reports and by visualizing the method’s processing content available on the user interface via dashboard.
For this processing stage, quality indicators (accuracy and precision) of the processed data were used to evaluate the outcomes obtained from the application of Markov chains. These indicators were based on autocorrelation theory, where expected values were estimated from the observations [65]. Thus, for a time series of $N$ measurements of the Markov process (Equation (38)), $e_i$ represents the configurations generated for the time series and $i$ is the temporal order of the observations. The estimator of the expected value of $\jmath$ is given by Equation (39), where the bar denotes the sample mean.
$$\jmath_i = \jmath(e_i), \quad i = 1, \ldots, N$$
$$\bar{\jmath} = \frac{1}{N} \sum_{i=1}^{N} \jmath_i$$
The autocorrelation function of an observable $\jmath$ was defined (Equation (40)) assuming translation invariance in time for the equilibrated dataset. According to Equation (41), the variance of $\jmath$ is a special case of the autocorrelation.
$$\hat{C}(t) = \hat{C}_{ij} = \left\langle (\jmath_i - \langle \jmath_i \rangle)(\jmath_j - \langle \jmath_j \rangle) \right\rangle = \langle \jmath_i \jmath_j \rangle - \langle \jmath_i \rangle \langle \jmath_j \rangle = \langle \jmath_0 \jmath_t \rangle - \langle \jmath \rangle^2$$
where $\hat{C}(t)$ is the autocorrelation value of an observable at lag $t$; $\hat{C}_{ij}$ is the autocorrelation between two variables $\jmath_i$ and $\jmath_j$, each of which can be viewed as a time series; $\langle \jmath_i \rangle$ and $\langle \jmath_j \rangle$ are the means of the time series $\jmath_i$ and $\jmath_j$, respectively; $\langle (\jmath_i - \langle \jmath_i \rangle)(\jmath_j - \langle \jmath_j \rangle) \rangle$ is the covariance between the time series; $\langle \jmath_i \jmath_j \rangle$ is the mean of the product of the two series, i.e., their raw covariance; $\langle \jmath \rangle^2$ is the square of the mean of $\jmath$; and $\langle \jmath_0 \jmath_t \rangle$ is the mean of the product of $\jmath_0$, an observation at a given time, and $\jmath_t$, an observation at a later lag $t$, representing the covariance between $\jmath_0$ and $\jmath_t$.
$$\hat{C}(0) = \sigma^2(\jmath)$$
where $\hat{C}(0)$ is the autocorrelation at $t = 0$, i.e., the covariance of the variable $\jmath$ with itself at the same instant; and $\sigma^2(\jmath)$ is the variance of $\jmath$, which measures the dispersion of its values around the mean.
Another point to consider in the theory [65] is the analysis of self-consistency versus reasonable error. This involves examining the system’s equilibrium aspects by evaluating the time series within the context of the Markov chain and monitoring the integrated autocorrelation times obtained from different measurements of $\jmath$. Equations (42)–(44) define the error $\Delta\bar{\jmath}$, the variance of the estimator $\bar{\jmath}$, and the integrated correlation time $\tau_{int}$, respectively.
$$\Delta \bar{\jmath} = \sqrt{\sigma^2(\bar{\jmath})} \quad \text{with} \quad \sigma^2(\bar{\jmath}) = \tau_{int} \, \frac{\sigma^2(\jmath)}{N}$$
where $\Delta\bar{\jmath}$ is the standard error measuring the uncertainty of the sample-mean estimate $\bar{\jmath}$; $\sigma^2(\bar{\jmath})$ is the variance indicating the spread of the sample means relative to the true population mean; $\tau_{int}$ is the integrated correlation time, which describes the autocorrelation of the data; and $\sigma^2(\jmath)/N$ is the naive estimate of the variance of the mean $\bar{\jmath}$ based on the sample size $N$.
$$\sigma^2(\bar{\jmath}) = \frac{\sigma^2(\jmath)}{N} \left[ 1 + 2 \sum_{t=1}^{N-1} \left( 1 - \frac{t}{N} \right) \hat{\epsilon}(t) \right] \quad \text{with} \quad \hat{\epsilon}(t) = \frac{\hat{C}(t)}{\hat{C}(0)}$$
where $\hat{\epsilon}(t)$ is the autocorrelation function normalized so that $\hat{\epsilon}(0) = 1$; it measures the autocorrelation of the variable $\jmath$ at lag $t$, normalized by the variance $\hat{C}(0)$.
$$\tau_{int} = 1 + 2 \sum_{t=1}^{N-1} \left( 1 - \frac{t}{N} \right) \hat{\epsilon}(t)$$
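Equations (40)–(44) can be sketched in plain Python; the short series below is illustrative, and the square root is guarded for the (finite-sample) case where the estimated variance of the mean is not positive:

```python
def autocorr(series):
    """Normalized autocorrelation eps(t) = C(t)/C(0), per Equations (40)-(41)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series) / n
    eps = []
    for t in range(n):
        ct = sum((series[i] - mean) * (series[i + t] - mean)
                 for i in range(n - t)) / (n - t)
        eps.append(ct / c0)
    return eps

def tau_int(series):
    """Integrated correlation time, per Equation (44)."""
    n = len(series)
    eps = autocorr(series)
    return 1.0 + 2.0 * sum((1.0 - t / n) * eps[t] for t in range(1, n))

def error_of_mean(series):
    """Delta j-bar = sqrt(tau_int * sigma^2(j) / N), per Equations (42)-(43)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    value = tau_int(series) * var / n
    return value ** 0.5 if value > 0 else 0.0   # guard against negative estimates

print(error_of_mean([1.0, 2.0, 1.5, 3.0, 2.5]))
```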
The use of the variable rule base in Algorithm 4 is innovative. In the current agronomic literature, as previously observed, such variables are generally considered individually. This approach thus enables the simultaneous consideration of all conditions that may lead to the occurrence of Asian rust in soybeans.
After the execution of Algorithm 4, the flow of methodological steps concludes by integrating data structuring, image processing, and the fusion of the involved variables. Each algorithm fulfills a specific function: Algorithm 1 organizes the data; Algorithm 2 performs image processing and pattern extraction; and Algorithm 4 integrates this information with climatic data to identify the risk of Asian rust occurrence. The combination of these methods enables the analysis of disease favorability within time windows, providing support for decision-making in soybean crop management.
Algorithm 4: Data Fusion with Hidden Markov Chain Approach
input: v — data vector; window — temporal data window; rules — rule base; chain — hidden Markov chain
output: Result of occurrence and favorability
 1: v ← data vector
 2: rules ← rule base
 3: chain ← hidden Markov chain
 4: quality ← Precision σ(f̄), Accuracy σ²(f̄), Ĉ(t)
 5: procedure begin
 6:   s1 ← Func_process_rule_base(rules)
 7:   s2 ← Func_process_fusion(s1, chain)
 8:   s3 ← Func_process_markov_quality(s2, quality)
 9:   s4 ← Func_generate_result(s2, s3)
10: end procedure
For data fusion, the evaluation of the two selected models, as described below, was considered by the following metrics: accuracy, precision, and performance.

3. Results and Discussion

Results were obtained regarding the performance of the computational architecture and the effectiveness of the image processing and classification techniques. In addition, the outcomes of the probabilistic modeling, using both the fuzzy and hidden Markov models, were evaluated for ASR risk analysis and system validation.

3.1. Implementation of the Cloud Architecture and Interfaces of the Intelligent System

For the Oracle Cloud platform, a study was conducted considering three possible architecture scenarios. Among them, the most suitable option identified (Figure 5) featured access to both private and public networks, interconnection of object storage components, infrastructure for compute instances, a data science environment, services for analytical data processing, and support for both transactional and multidimensional databases. Additionally, the computational infrastructure for hosting web services aimed at user monitoring was also highlighted.
Figure 6 illustrates the resources used for system development, without detailing the network configurations, users, and access permissions, which were nonetheless implemented. In this context, a compute instance was also utilized, sized to host the developed Python code and configured to provide external access via public IP for the data fusion stage through a web framework. The compute instance was set up with the Oracle Linux 8.0 operating system, 1 OCPU on an AMD architecture, and 16 GB of RAM. Access was established via the SSH protocol using the PuTTY application and 2048-bit public/private key encryption. Additionally, the object storage menu featured buckets organized according to the processing structure, providing storage for both data source input and processing output. Similarly, appropriate configurations were applied to the Oracle database (Figure 7) for the transactional (relational) and multidimensional (DW) databases, respectively.
The instance configured to process the Python code in the data science environment was prepared using AMD architecture, with four OCPUs, 64 GB of RAM, and a 250 GB disk for storing processing results in the form of a VM.Standard.E3.Flex compute shape.
The technologies provided by Oracle Cloud enabled seamless integration of the architecture modules, which included the data science environments and the Linux Computing Instance. Consequently, the implementation of Python algorithms and their deployment on a web platform were facilitated.
The cloud-based intelligent system for Asian soybean rust risk analysis in soybean crops was designed to present results in a dashboard format. The system’s main interface (Figure 8) supported both the fusion stage processing and the visualization of results through a clean, intuitive navigation layout. Accordingly, tabs were positioned at the top of the interface, ensuring clear and organized information display.
The results of the processing performed on the cloud infrastructure were stored in databases when structured or in buckets when unstructured or semi-structured. These were analyzed using the Analytics Cloud Service, which also supplied the decision support system. The analyses were made available for user monitoring via a web interface through the Linux compute instance.
Additionally, recommendations for the soybean producer were relationally included based on the favorability results obtained from the processing. Thus, when the result indicates low favorability, a corresponding set of considerations is presented, as also occurs for medium- and high-favorability scenarios (Figure 9).
Further aspects were also taken into account in the recommendation reports, such as the inclusion of a link to the Phytosanitary Pesticide System (Agrofit) for consulting registered fungicide options for disease control, in accordance with the technical recommendations issued by the Brazilian Ministry of Agriculture, Livestock, and Supply (MAPA/Brazil).

3.2. Image Processing and Classification Performance

The results of processing the developed model, based on the established cloud infrastructure, include the organization of climatic data, the processing of soybean leaf images, and the fusion of variables through hidden Markov chains. The processing sequence involves interpolation of the time series, image processing and classification results, dimensionality reduction, and ultimately the assessment of disease favorability based on the generated analytical reports.
During the data reading stage, within the established windows (Table 4), interpolation was required in some cases to fill in missing records. Thus, the records were completed using cubic B-spline interpolation, as shown in one of the analyzed cases (Figure 10), which illustrates the arrangement adopted for organizing and using the time-series data of the variables considered for decision support.
Regarding the used interpolation, it was also observed that the correlation coefficients, obtained with the application of the B-spline function, were of the order of 0.66 for the precipitation data series, 0.78 for the maximum temperature data series, 0.82 for the minimum temperature data series, 0.63 for the relative humidity data series, 0.82 for the dew point data series, and 0.72 for the compensated average temperature data series.
Regarding the processing of leaf images collected in a real field, a dataset of sRGB images of soybean leaves, exhibiting various ASR symptoms against complex backgrounds, was used for method validation (dimensions: 4128 × 3096 pixels, i.e., 12,780,288 pixels per image). After splitting the RGB channels, the green channel was selected for processing, as its wavelength is closest to the effects expected from the potential presence of the rust pathogen. Image histogram techniques were applied to this channel, minimizing background effects.
Next, a median filter with a 3 × 3 window was applied to smooth the image for better feature extraction. After this step, a highlight, as an automation point of the process, was the identification of the seed pixel, according to the disease reference colors and the threshold definition process (Figure 11), using statistical techniques such as median calculation, standard deviation, and outlier removal, considering a maximum associated error ≤ 5%.
The choice of thresholds involved analyzing image histograms and evaluating regions to segment the object of interest. The background exhibited a significant number of colors similar to those of the object of interest, i.e., the leaf. The histogram evaluation procedure was supervised, aiming to identify two thresholds capable of segmenting the largest possible background area without compromising the leaf region, which, due to ASR, displayed a variety of color tones. The histogram analysis focused on six different ranges: (a) 0 to 85, (b) 31 to 165, (c) 70 to 159, (d) 83 to 159, (e) 100 to 130, and (f) 18 to 200. Tests conducted with the other ranges, such as (f), resulted in substantial pixel loss in the object of interest. The threshold range (b), from 31 to 165, yielded favorable results and was adopted as the standard for processing the image dataset.
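The masking rule for the adopted range (b) can be sketched as follows; the nested-list "image" is a stand-in for the real OpenCV green-channel arrays:

```python
# Illustrative sketch of the adopted threshold range (b), 31-165, applied to a
# green channel. A real pipeline would operate on OpenCV/NumPy arrays.

LOW, HIGH = 31, 165   # threshold range (b) adopted for the dataset

def segment_green_channel(green):
    """Keep pixels inside [LOW, HIGH]; push the rest to background (0)."""
    return [[px if LOW <= px <= HIGH else 0 for px in row] for row in green]

green = [
    [12, 80, 200],
    [45, 170, 150],
]
print(segment_green_channel(green))   # -> [[0, 80, 0], [45, 0, 150]]
```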
The result of applying segmentation techniques (Figure 12) was organized in stages, i.e., first removing the background and then associating the result based on the reality of the absence, appearance, and presence of the disease in a soybean cultivation area, in other words, considering segmentation in green, yellow, and brown colors, respectively.
Additionally, after thresholding, the k-means technique (Figure 12d) was applied to cluster the image pixels according to the established color class definitions. These results indicated the need to consider up to six different labeled clusters, with label number four being used as it was indeed associated with identifying the occurrence of ASR, both in its intermediate and advanced stages.
Regarding the quality metrics of all the images analyzed: the MSE values (Equation (2)) ranged from 0.01 to 0.06, with a median of 0.03; the SSIM values (Equation (4)) ranged from 0.87 to 0.97, with a median of 0.94; the PSNR values (Equation (3)) ranged from 18.98 to 20.04, with a median of 14.29.
For the ranges of pixel values of an ROI related exclusively to green, yellow, and brown colors, the values summarized in Table 5 were observed for these metrics.
The OpenCV and Skimage libraries were used to extract the features and recognize the patterns. This was achieved by applying SIFT (Equations (5)–(8)), HOG, and HU moments (Equations (9)–(20)), with algorithms written in Python 3.6.8., based on the default parameters of these libraries. Each process generated a file with the characteristics of each color, and its storage was considered in the Oracle Cloud bucket.
For example, part of the processing can be observed in Figure 13, which illustrates the results for texture, color, and geometric shape.
The processing of these features using the HOG, SIFT, and HU invariant moment algorithms resulted in vectors with 130 features. The PCA technique was then used to reduce this vector to one with five features, as shown in Figure 14.
The choice of the ideal number of principal components was based on the total variance (Table 6), adopting a minimum of 70% explained variance as the criterion. To ensure an efficient representation of the information in the feature vectors, nineteen principal components were sufficient to explain 70.79% of the total variance. In contrast, eighteen components explained 69.56%, slightly below the established threshold.
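The 70%-variance selection rule can be sketched with a NumPy eigendecomposition; the random feature matrix below merely stands in for the real 130-feature vectors:

```python
import numpy as np

# Sketch of the component-count selection rule: keep the smallest number of
# principal components whose cumulative explained variance reaches 70%.

def n_components_for(X, threshold=0.70):
    Xc = X - X.mean(axis=0)
    # Eigenvalues of the covariance matrix give the variance per component
    # (eigvalsh returns them ascending; reverse for descending order).
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
    ratios = eigvals / eigvals.sum()
    cum = np.cumsum(ratios)
    return int(np.searchsorted(cum, threshold) + 1)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))        # synthetic stand-in data
print(n_components_for(X))
```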
Based on PCA dimensionality reduction, different classifiers were evaluated (Decision Tree, K-Nearest Neighbor, Naïve Bayes, and Support Vector Machine (SVM)), with the latter selected for yielding the best results. For SVM, three kernels were tested, chosen according to the behavior of the data to be classified: linear, polynomial, and RBF. These configurations are shown in Table 7.
The data for training and testing, intended for selecting the SVM classifier, was organized considering three configuration aspects, namely percentages of 80–20%, 50–50%, and 70–30%, respectively, for the training and testing stages.
From the analyses performed, the third-order polynomial kernel presented the best result (Figure 15), where the best combination evaluated for the training and testing data, according to the classification report metrics (Table 8), was 80–20% (Table 9). That is, it presented the best metrics regarding accuracy, precision, recall, F1-score, area under the curve, and lower mean squared error (Equation (2)). In this context, the final configuration for the polynomial kernel is presented in Table 10.
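A hedged sketch of this selection setup, using scikit-learn with synthetic data rather than the paper's feature vectors (the dataset and random seeds are illustrative):

```python
# Third-degree polynomial-kernel SVM with the best-performing 80-20 split.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the PCA-reduced feature vectors.
X, y = make_classification(n_samples=500, n_features=5, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20, random_state=42)

clf = SVC(kernel="poly", degree=3).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```

The same scaffold, swapping `kernel` for `"linear"` or `"rbf"` and `test_size` for 0.50 or 0.30, reproduces the grid of configurations compared in the text.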
When the polynomial kernel was used and the confusion matrix was analyzed, the main diagonal of the matrix correctly indicated 346 cases belonging to class “0”, i.e., absence of favorability to ASR. These records were indeed class “0”. However, 346 false positives were observed in the upper right quadrant, corresponding to cases that actually belonged to class “1”, i.e., favorable to the occurrence of ASR.
In the second quadrant along the main diagonal, 1312 cases were recorded. The model correctly classified these cases as belonging to class “1”, which they indeed did.
However, in the lower left quadrant, 49 false negatives were identified. These cases actually belonged to class “0”.

3.3. Results of Variable Fusion and Fuzzy Modeling for Favorability Prediction

Table 11 shows the fuzzy variable settings and the corresponding membership functions for the seven variables considered for risk analysis. In addition, Figure 16 shows the obtained results, as indicated for each interval of the membership functions, i.e., with the associated error of approximately ±5%, corresponding to each transition zone between the low, medium, and high favorability levels.

3.4. Results of Variable Fusion and Markovian Modeling for Favorability Prediction

Based on the structuring of the cloud architecture, the data for the variable fusion stage were selected. This dataset encompassed the considered time series period, enabling validation of the method using predefined ten-day temporal windows shifted along the series. Classification information was derived from the analysis of image processing using the Embrapa dataset.
Once the climatic time-series data were structured, and considering the set of images with their classified patterns, the variable fusion algorithm based on the Markovian model was applied, as shown in Figure 17.
For the favorability of ASR occurrence, the probability values for low, medium, and high occurrence were considered to be 0.1, 0.3, and 0.7, respectively. The combinations denoted by “C” represent the $2^7$ possibilities generated from the variables $V_{f1}$ (leaf wetting period), $V_{f2}$ (minimum leaf wetting period), $V_{f3}$ (temperature range), $V_{f4}$ (maximum temperature), $V_{f5}$ (minimum temperature), $V_{f6}$ (dew point), and $V_{f7}$ (result of the image classification based on soybean leaf color related to field truth), totaling 128 combinations. To evaluate disease occurrence, the hidden Markov chain observations were defined by the combinations of these seven variables ($V_{f1}$–$V_{f7}$) and their associated probabilities within the windowing period, corresponding to the time-series data and the classification variable.
The transition probabilities, representing changes in disease favorability states, associated with each variable also comprised the hidden Markov chain and were identified by the percentages indicated in each observation.
The emission probabilities were derived from the state transitions of the observations within the hidden Markov chain. The combinations were selected through a data collection process using a time window, guided by the ASR favorability rule for different stages: (1) transition to the “Low” favorability state, when the set of variables corresponded to the 0–33% range according to the observations; (2) transition to the “Median” favorability state, when the identified variables were within the 34–66% range; and (3) transition to the “High” favorability state, when the variables exceeded 66%. The hidden Markov chain is summarized schematically in Table 12.
The probability for the “Start” state in the Markovian model application was randomly assigned. Additionally, at the model’s onset, low favorability ( S 1 ) was set to 10%, median favorability ( S 2 ) to 20%, and high favorability ( S 3 ) to 70%. For state S 1 , the probability of remaining in S 1 was set at 40%, while the probability of transitioning to state S 2 was 60%. For state S 2 , the probability of remaining in the same state was 30%, whereas the probability of evolving to state S 3 was 70%. Once state S 3 was reached, the probability of remaining in this state was set to 100%, meaning a return to states S 1 or S 2 was not possible.
The hidden Markov chain customized for the process was obtained by using Equation (45) to calculate the probabilities (Table 12) of each combination in the hidden Markov chain.
$$P_{R_c} = \frac{1}{\sum_{var=1}^{n} \left( R_{var} + \beta \right)}$$
where $P_{R_c}$ is the total probability of the hidden Markov chain combination; $n$ is the number of variables involved; $R_{var}$ is the rule value associated with each variable $V_f$; and $\beta$ is a dimensionless constant used to avoid division by zero.
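Under this reading of Equation (45), the combination probability is the reciprocal of the summed rule values; the rule vector and the value of β below are hypothetical:

```python
def combination_probability(rule_values, beta=1e-6):
    """P_Rc = 1 / sum(R_var + beta), per Equation (45).

    beta is a small dimensionless constant (hypothetical value here) that
    keeps the denominator nonzero when all rule values are zero.
    """
    return 1.0 / sum(r + beta for r in rule_values)

# Six of seven hypothetical binary rule values satisfied:
print(combination_probability([1, 1, 1, 0, 1, 1, 1]))
```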
After defining the vector of occurrences corresponding to one of the windowings applied to the time series of climate and classification data, it was used as input for the Markovian algorithm.
As part of the method, the number of values within the same time window that satisfied the rule was counted for each variable. The occurrence counts were then transformed into binary values for the “$V_f$” variables: when the number of occurrences of $V_f$ was $\ge 1$, then $V_f = 1$; otherwise, $V_f = 0$.
After this, the occurrences were transformed to form the input vector for the Markovian algorithm. Table 4 presents an example of the data mapped over a ten-day window, in which the leaf wetness period variable registered seven occurrences of favorability for ASR. In this example, the values for minimum leaf wetting period, maximum temperature, minimum temperature, dew point, and image data were “1”, while no occurrences were recorded for the temperature range variable.
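The transformation can be sketched as follows, mirroring the ten-day-window example described above (seven occurrences for leaf wetness, none for temperature range, single hits for the remaining variables):

```python
def to_binary(occurrences):
    """Vf = 1 when a variable had at least one rule hit in the window, else 0."""
    return [1 if n >= 1 else 0 for n in occurrences]

# Occurrence counts for Vf1..Vf7 over one hypothetical ten-day window.
window_counts = [7, 1, 0, 1, 1, 1, 1]
print(to_binary(window_counts))   # -> [1, 1, 0, 1, 1, 1, 1]
```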
Based on the graph of considered rules (Figure 18) and data input vector (Table 13), the transformed occurrences indicated high favorability, considering values of six variables as “1”.
Thus, based on this input vector, the Markovian model generated an output vector that translated the information on the stage of favorability of ASR, as summarized in Table 14.
It is noteworthy that, depending on the input variables, the state of favorability was defined as low, medium, or high.
Examples of data processed in different time windows and originating from each process cycle are listed in Table 15. To assess the quality of processing, the time windows were selected via the web interface using the hidden Markov chain technique.
The result indicated an error of <1%, demonstrating a high-quality index for the data fusion process. Next, for low favorability, the first point was notable, presenting an error of “0” and accuracy and precision values of “1”.
Regarding the standard deviation (Table 15), the observed error differences were minor, making the obtained autocorrelation values reasonable. The calculated autocorrelations depended on two main factors: the input variables and the value of $\hat{C}(t)$, which depends on the size of the processing time window. The variation in $\hat{C}(t)$ was minimal, affecting only the third or fourth decimal place, as the processing was executed under the same infrastructure configuration. Consequently, under these conditions, this factor contributed minimally to the differences observed in the standard deviation of the reasonable errors for the calculated autocorrelations.
However, the variation in combinations of input variables from V f 1 to V f 7 more significantly influenced the differences in the standard deviation values of the reasonable errors for the calculated autocorrelations. The combinations of variables from V f 1 to V f 7 , as shown in Figure 19, indicate that the increase in standard deviation values was due to the presence of variables with a value equal to “1”. A noteworthy observed behavior was that the standard deviation value reached its peak with up to three variables equal to “1” and stabilized at the fourth. From the fifth variable equal to “1” onward, the standard deviation of the reasonable errors for the calculated autocorrelations began to decrease, indicating greater consistency in information processing.

3.5. Comparative Evaluation of the Results Between Modeling Based on the Fuzzy System and the Hidden Markov Chain

To compare the data fusion models, an evaluation framework based on two distinct scenarios was established: the first used 29 combinations each for low-, medium-, and high-favorability occurrences, while the second used 41 combinations for medium favorability and none for low or high occurrences. Table 16 presents the final comparative results, where, for both scenarios, the model based on the hidden Markov chain behaved best, reaching an accuracy of 100%.
In this context, the processing output was displayed on a dashboard panel (Figure 20), which summarized the main information from the executed procedures, including segmented images, visualization of climatic variables, data fusion and favorability results, and access to a container with decision-support reports prepared from the historical data cube in the data warehouse. Accordingly, reports (Subject 1), (Subject 2), and (Subject 3) could be displayed in separate containers at the top of the dashboard panel.

3.6. Analytical Reports

The analytical reports represent another important aspect of the analyzed results. These reports were generated from OLAP tool queries based on the DW historical database, whose model was constructed according to the defined requirements.
The DW was loaded using an SQL script, and the data of interest to the producer were collected. The script was created by joining the data tables of the transactional database that answered the queries of the developed requirements. These requirements involved (1) the influence of climatic variables on the favorability of ASR (Subject 1, Figure 21); (2) the counts of low, medium, and high favorability per year (Subject 2, Figure 22); and (3) the influence of the soybean leaf image on the favorability of ASR per year during the planting and harvesting stages, primarily R5 and R6, as these are the most affected by the disease (Subject 3, Figure 23).
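A query of the Subject 2 kind can be sketched with an in-memory database. The table and column names below are hypothetical; the actual DW schema is not reproduced here.

```python
import sqlite3

# Illustrative schema and query only; the real data warehouse tables and
# column names are not specified in this sketch.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE favorability_fact (year INTEGER, stage TEXT, level TEXT)"
)
con.executemany(
    "INSERT INTO favorability_fact VALUES (?, ?, ?)",
    [(2019, "R5", "high"), (2019, "R6", "medium"),
     (2020, "R5", "medium"), (2020, "R6", "medium")],
)

# Subject 2: count low/medium/high favorability occurrences per year
rows = con.execute("""
    SELECT year, level, COUNT(*) AS occurrences
    FROM favorability_fact
    GROUP BY year, level
    ORDER BY year, level
""").fetchall()
```

An OLAP tool would issue an equivalent aggregation against the historical data cube; here the grouped counts come back as one row per (year, level) pair.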
The analytical report (Subject 1) showed that the highest incidences of ASR favorability over time were associated with the leaf wetting period, maximum temperature, minimum temperature, minimum leaf wetting period, and soybean leaf image classification data.
Following this line of reasoning, the next most significant variables contributing to disease favorability were the minimum leaf wetness period, followed by the dew point and the temperature range. These findings highlight the variables and their respective relationships on an annual scale in the historical data series.
In the analytical report (Subject 2), the favorability accounting could be evaluated. No cases of low favorability were found during the crop cycle period analyzed. However, both medium and high favorability were identified. Figure 22a illustrates the historical series overview for the high-favorability case, while Figure 22b shows the overview for the medium-favorability case.
The period encompassing the interval between soybean reproductive phenological stages R4, R5, and R6 was assessed in the analytical report (Subject 3). ASR was found to be predominant during stages R5 and R6, which corresponded to the period with the highest incidence of the disease. This 17-day interval (17 November to 4 December) was mapped between the 85th and 95th days (stages Vf5 and Vf6). This result indicates a high degree of favorability for this specific period.
Regarding high favorability (Figure 23), in a few years of the time series, this level of favorability was not observed, even during the R5 and R6 stages of crop development. However, in the remaining years, disease occurrence was recorded. In one instance, during the R5 and R6 stages, only a single record of medium favorability was found.

3.7. Computational Cost

To analyze the computational cost, both the CPU cores and memory were evaluated across the four-stage pipeline: (1) segmentation, (2) pattern recognition and PCA, (3) machine learning, and (4) variable fusion for the decision-support process. This evaluation was conducted from two perspectives. The first was a single-instance analysis, focusing on a specific time point from the climate data and its corresponding digital image of a soybean leaf (Figure 24). The second was a full-dataset analysis, encompassing the processing of the entire climate time series and all digital images. The resulting percentage utilization of the processing units and memory is shown in Table 17.
The resource consumption dynamics during runtime are detailed in Figure 24, while a consolidated statistical summary is presented in Table 17. The machine learning stage and the data fusion stage (with all seven variables) exhibited intense and stable processing demands, with mean usages of 90.55% and 89.26%, respectively. In contrast, the feature extraction stage together with PCA showed more variable behavior, with a mean usage of 27.51% and a high standard deviation (22.06), with processing loads peaking at 75.10%. Memory consumption remained consistently low (mean below 11.18%) and stable across all the stages, peaking at 11.60%, which demonstrates that memory was not a critical computational resource.
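The summary statistics of the kind reported in Table 17 (mean, standard deviation, and peak utilization) can be reproduced from sampled utilization values; the CPU samples below are illustrative, not the measured series.

```python
import statistics

def usage_summary(samples):
    """Mean, sample standard deviation, and peak of a utilization series (%)."""
    return {
        "mean": statistics.mean(samples),
        "std": statistics.stdev(samples),
        "peak": max(samples),
    }

# Hypothetical CPU utilization samples (%) for the feature-extraction + PCA stage
pca_cpu = [5.0, 12.0, 75.1, 40.0, 10.0, 22.0]
summary = usage_summary(pca_cpu)
```

A stage with bursty behavior, as sketched here, shows a high standard deviation relative to its mean, matching the pattern observed for the feature extraction and PCA stage.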

3.8. System Validation with Phytopathologists and Agronomists

The tests submitted to the specialists were also processed by the developed system. The responses of the specialists (phytopathologists and agronomists) were used as references, and the system's accuracy was measured relative to those responses. Additionally, to organize the responses into a unified dataset, min-max normalization was applied based on the maximum and minimum values within the response set.
The system validation results demonstrate a strong correlation between the system and the specialists: the identification of Asian soybean rust presence yielded a coefficient of determination of R 2 = 0.94 , while the estimation of severity levels reached R 2 = 0.88 , as shown in Figure 25 and Figure 26, respectively.
These results express the proportion of data variance explained by the linear model. Therefore, the high R 2 values indicate that the developed system performed satisfactorily in relation to the responses provided by the consulted specialists.
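As a sketch of the validation procedure, the min-max normalization and the coefficient of determination can be computed as follows; the specialist and system scores below are hypothetical.

```python
def min_max_normalize(values):
    """Rescale responses to [0, 1] using the set's minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def r_squared(reference, predicted):
    """Coefficient of determination of predictions against a reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1 - ss_res / ss_tot

# Hypothetical specialist severity grades vs. normalized system scores
specialist = min_max_normalize([1, 2, 3, 4, 5])
system = [0.05, 0.22, 0.55, 0.78, 0.95]
r2 = r_squared(specialist, system)
```

An R2 close to 1 means the system's scores track the specialists' references closely, as in the values of 0.94 and 0.88 reported above.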
It is important to note that the developed method is not universal; it is customized for ASR risk management in soybean crops. However, it was designed to be adaptable to other diseases that can occur in soybeans and other grain crops. Such customization becomes possible as long as these diseases involve favorable climatic conditions and symptoms expressed in the crop phenotype.

4. Conclusions

Crop diseases represent one of the main challenges faced by the agricultural sector. This paper presented a new method for assessing ASR in crops through an advanced, intelligent computational decision-making system based on cloud infrastructure. Early detection of ASR is crucial not only to reduce its severity and spread in the field but also to minimize fungicide use. The decision model was implemented considering not only climatic time series but also digital images of soybean leaves, spatially collected to evaluate changes in their phenotype. For the climatic time series, the use of B-splines resulted in correlation coefficients (CCs) in the interval 0.63 ≤ CC ≤ 0.82, which avoided missing data. The absence of data reduces statistical power, i.e., the probability that a test will reject the null hypothesis when it is false; lost data can also bias parameter estimation and reduce the representativeness of the samples. For ASR risk analysis, the processing operated at a large data scale, incorporating data lake and data warehouse systems, web-based operation, and integrated image feature extraction based on SIFT, HOG, and Hu invariant moments for pattern recognition on leaves, with PCA for dimensionality reduction. Classification with an SVM based on a polynomial kernel achieved an accuracy greater than 84% and an AUC greater than 0.90, demonstrating adequate performance. In addition, the PSNR, MSE, and SSIM metrics demonstrated the robustness of this arrangement, with values in the ranges 14.00 ≤ PSNR ≤ 15.00, 0.03 ≤ MSE ≤ 0.05, and SSIM ≥ 0.91, respectively.
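As a sketch of the B-spline gap-filling step mentioned above, a cubic B-spline fitted to the observed points of a climatic series can be evaluated at the missing positions. The values below are illustrative, and the sketch assumes SciPy's `splrep`/`splev` interface.

```python
import numpy as np
from scipy.interpolate import splrep, splev

# Hypothetical daily temperature series with two missing readings (np.nan)
days = np.arange(10.0)
temps = np.array([21.0, 22.5, np.nan, 24.0, 23.5,
                  np.nan, 22.0, 21.5, 21.0, 20.5])

# Fit a cubic B-spline on the observed points only, then evaluate it
# at the missing positions to fill the gaps.
observed = ~np.isnan(temps)
tck = splrep(days[observed], temps[observed], k=3)
filled = temps.copy()
filled[~observed] = splev(days[~observed], tck)
```

After filling, the series is complete and can enter the fusion pipeline; the quality of such reconstructions is what the reported correlation coefficients quantify.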
For the data fusion of the variables, i.e., the climatic ones and the classified image patterns, the model based on the hidden Markov chain was selected because it presented the best effectiveness, achieving 100% matching for the three possible risk levels. The development of the data quality framework allowed a comprehensive evaluation, supporting the reliability of the method. The quality indicators were also evaluated based on autocorrelation theory and the estimation of expected values from the processed data. According to the results, these indicators showed adequate accuracy and precision, with cross-validation by experts in phytopathology achieving linear regression correlation values above 0.85, confirming the method's reliability. In conclusion, the results validated the developed method, demonstrating significant improvements over traditional climate-only or image-only approaches through the integration of heterogeneous data fusion. Likewise, its practical viability for field implementation was shown through an intuitive web interface, with the potential to reduce ASR-related losses through disease prevention, early detection, and rational use of fungicides. This development is relevant both for advancing computer science techniques related to signal and digital image processing and for reducing production risks in agriculture. The current method was specifically calibrated for soybeans and requires adaptation for other grain cultivars and geographic regions. Future work may employ convolutional networks and evaluate opportunities to enable unsupervised operation for agricultural plant disease assessments.

Author Contributions

This work was conducted collaboratively by both authors. Conceptualization, R.A.N. and P.E.C.; formal analysis, R.A.N. and P.E.C.; writing—original draft proposition, R.A.N.; writing—review and editing, P.E.C.; supervision, P.E.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Embrapa Instrumentation and the São Paulo Research Foundation (Fapesp), project number 17/19350-2.

Data Availability Statement

The original data presented in the study are openly available in the repository 20240219_RAN_PEC on GitHub, accessible since 15 May 2025 at https://github.com/ricardo-a-neves/20240219_RAN_PEC.

Acknowledgments

The authors would like to thank the Brazilian Agricultural Research Corporation (Embrapa) and the Postgraduate Program in Computer Science at the Federal University of São Carlos (UFSCar). They would also like to thank the Federal Institute of São Paulo for allowing the first author to participate in this work, and Luciano Vieira Koenigkan for helpful discussions throughout the soybean crop dataset arrangements.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Acronym | Meaning |
| --- | --- |
| ACS | Analytics Cloud Service |
| AD | Autonomous Database |
| AI | Artificial Intelligence |
| AMD | Advanced Micro Devices |
| API | Application Programming Interface |
| ASR | Asian Soybean Rust |
| AUC | Area Under the Curve |
| CC | Correlation Coefficient |
| CNN | Convolutional Neural Network |
| COG | Center of Gravity |
| DL | Data Lake |
| DM | Data Mart |
| DSE | Data Science Environment |
| DW | Data Warehouse |
| Embrapa | Brazilian Agricultural Research Corporation |
| ETL | Extract, Transform, Load |
| Fapesp | São Paulo Research Foundation |
| FN | False Negative |
| FP | False Positive |
| FSIM | Feature Similarity Index |
| GB | Gigabyte |
| GIS | Geographic Information System |
| HOG | Histogram of Oriented Gradients |
| HU | Hu Moments |
| INMET | Instituto Nacional de Meteorologia |
| IoT | Internet of Things |
| KNN | K-Nearest Neighbor |
| MAPA | Ministry of Agriculture, Livestock and Food Supply (Brazil) |
| MHz | Megahertz |
| MSE | Mean Squared Error |
| OLAP | Online Analytical Processing |
| OS | Object Storage |
| PCA | Principal Component Analysis |
| PSNR | Peak Signal-to-Noise Ratio |
| RD | Relational Database |
| RF | Random Forest |
| RGB | Red, Green, Blue |
| RH | Relative Humidity |
| ROC | Receiver Operating Characteristic |
| ROI | Region of Interest |
| SIFT | Scale-Invariant Feature Transform |
| SQL | Structured Query Language |
| SSIM | Structural Similarity Index |
| SVM | Support Vector Machine |
| TFP | True False Positive |
| TN | True Negative |
| TNR | True Negative Rate |
| TP | True Positive |
| TPR | True Positive Rate |
| TVP | True Positive Rate |
| VCN | Virtual Cloud Network |

References

  1. Rinaldi, M.; Murino, T.; Gebennini, E.; Morea, D.; Bottani, E. A literature review on quantitative models for supply chain risk management: Can they be applied to pandemic disruptions? Comput. Ind. Eng. 2022, 170, 108329. [Google Scholar] [CrossRef] [PubMed]
  2. García-Machado, J.J.; Greblikaitė, J.; Iranzo Llopis, C.E. Risk Management Tools in the Agriculture Sector: An Updated Bibliometric Mapping Analysis. Studies in Risk and Sustainable Development; University of Economics in Katowice: Katowice, Poland, 2024; pp. 1–26. [Google Scholar]
  3. Hackfort, S.; Marquis, S.; Bronson, K. Harvesting value: Corporate strategies of data assetization in agriculture and their socio-ecological implications. Big Data Soc. 2024, 11, 20539517241234279. [Google Scholar] [CrossRef]
  4. Ali, G.; Mijwil, M.M.; Buruga, B.A.; Abotaleb, M.; Adamopoulos, I. A survey on artificial intelligence in cybersecurity for smart agriculture: State-of-the-art, cyber threats, artificial intelligence applications, and ethical concerns. Mesopotamian J. Comput. Sci. 2024, 2024, 53–103. [Google Scholar] [CrossRef] [PubMed]
  5. Sahu, A.; Acharya, B.; Sahoo, P.S. Agricultural farming decision support system using artificial intelligence: A comparative analysis. In Optimizing Smart and Sustainable Agriculture for Sustainability; CRC Press: Boca Raton, FL, USA, 2025; pp. 212–236. [Google Scholar]
  6. Armstrong, M. The World’s Leading Soybean Producers. Statista. 2023. Available online: https://www.statista.com/chart/19323/the-worlds-leading-soybean-producers/ (accessed on 22 June 2025).
  7. Oerke, E.C.; Dehne, H.W. Safeguarding production—Losses in major crops and the role of crop protection. Crop Prot. 2004, 23, 275–285. [Google Scholar] [CrossRef]
  8. U.S. Department of Agriculture, Foreign Agricultural Service. Foreign Agricultural Service. 2025. Available online: https://www.fas.usda.gov/ (accessed on 25 June 2025).
  9. Godoy, C.V.; Seixas, C.D.S.; Soares, R.M.; Meyer, M.C.; Costamilan, L.M.; Adegás, F.S. Best Practices for the Management of Asian Soybean Rust; Technical Bulletin (Infoteca-E); Embrapa Soybean: Londrina, Brazil, 2017; Available online: http://www.infoteca.cnptia.embrapa.br/infoteca/handle/doc/1074899 (accessed on 10 October 2024). (In Portuguese)
  10. Goellner, K.; Loehrer, M.; Langenbach, C.; Conrath, U.; Koch, E.; Schaffrath, U. Phakopsora pachyrhizi, the causal agent of Asian soybean rust. Mol. Plant Pathol. 2010, 11, 169–177. [Google Scholar] [CrossRef]
  11. Beruski, G.C.; Gleason, M.L.; Sentelhas, P.C.; Pereira, A.B. Leaf wetness duration estimation and its influence on a soybean rust warning system. Australas. Plant Pathol. 2019, 48, 395–408. [Google Scholar] [CrossRef]
  12. Bedin, E. Foliar Applications of Copper in the Management of Asian Soybean Rust. Ph.D. Thesis, University of Passo Fundo, Passo Fundo, Brazil, 2018. (In Portuguese). [Google Scholar]
  13. Nunes, C.D.M.; da Silva Martins, J.F.; Del Ponte, E.M. Validation of a Model for Predicting Asian Soybean Rust Occurrence Based on Rainfall Data; Technical Bulletin 1516-8832; Embrapa Clima Temperado: Pelotas, Brazil, 2018; INFOTECA-E. (In Portuguese) [Google Scholar]
  14. Mila, A.; Yang, X.; Carriquiry, A. Bayesian logistic regression of Soybean Sclerotinia Stem Rot prevalence in the US North-central region: Accounting for uncertainty in Parameter Estimation. Phytopathology 2003, 93, 758–764. [Google Scholar] [CrossRef]
  15. de Carvalho Alves, M.; Pozza, E.A.; do Bonfim Costa, J.d.C.; de Carvalho, L.G.; Alves, L.S. Adaptive neuro-fuzzy inference systems for epidemiological analysis of soybean rust. Environ. Model. Softw. 2011, 26, 1089–1096. [Google Scholar] [CrossRef]
  16. Zagui, N.L.S.; Krindges, A.; Lotufo, A.D.P.; Minussi, C.R. Spatio-Temporal Modeling and Simulation of Asian Soybean Rust Based on Fuzzy System. Sensors 2022, 22, 668. [Google Scholar] [CrossRef]
  17. Yu, M.; Ma, X.; Guan, H. Recognition method of soybean leaf diseases using residual neural network based on transfer learning. Ecol. Inform. 2023, 76, 102096. [Google Scholar] [CrossRef]
  18. Ponte, E.M.D.; Godoy, C.V.; Li, X.; Yang, X.B. Models and applications for risk assessment and prediction of asian soybean rust epidemics. Fitopatol. Bras. 2006, 31, 533–544. [Google Scholar] [CrossRef]
  19. Simionato, R.; Torres Neto, J.R.; Santos, C.J.d.; Ribeiro, B.S.; Araújo, F.C.B.d.; Paula, A.R.d.; Oliveira, P.A.d.L.; Fernandes, P.S.; Yi, J.H. Survey on connectivity and cloud computing technologies: State-of-the-art applied to Agriculture 4.0. Rev. Ciênc. Agrôn. 2021, 51, e20207755. [Google Scholar] [CrossRef]
  20. de Oliveira, C.F.; Nanni, M.R.; Furuya, D.E.G.; de Souza, B.A.M.; Antunes, J.F.G. Detecting soybean rust in different phenological stages by vegetation indices from multi-satellite data. Comput. Electron. Agric. 2023, 210, 107923. [Google Scholar] [CrossRef]
  21. González-Domínguez, E.; Caffi, T.; Rossi, V.; Salotti, I.; Fedele, G. Plant disease models and forecasting: Changes in principles and applications over the last 50 years. Phytopathology 2023, 113, 678–693. [Google Scholar] [CrossRef]
  22. Jeger, M.; Madden, L.; Van Den Bosch, F. Plant virus epidemiology: Applications and prospects for mathematical modeling and analysis to improve understanding and disease control. Plant Dis. 2018, 102, 837–854. [Google Scholar] [CrossRef] [PubMed]
  23. Garin, G.; Fournier, C.; Andrieu, B.; Houlès, V.; Robert, C.; Pradal, C. A modelling framework to simulate foliar fungal epidemics using functional–structural plant models. Ann. Bot. 2014, 114, 795–812. [Google Scholar] [CrossRef] [PubMed]
  24. Feng, J.; Zhang, S.; Zhai, Z.; Yu, H.; Xu, H. DC2Net: An Asian soybean rust detection model based on hyperspectral imaging and deep learning. Plant Phenomics 2024, 6, 0163. [Google Scholar] [CrossRef]
  25. Khalili, E.; Kouchaki, S.; Ramazi, S.; Ghanati, F. Machine learning techniques for soybean charcoal rot disease prediction. Front. Plant Sci. 2020, 11, 590529. [Google Scholar] [CrossRef]
  26. Li, W.; Guo, Y.; Yang, W.; Huang, L.; Zhang, J.; Peng, J.; Lan, Y. Severity Assessment of Cotton Canopy Verticillium Wilt by Machine Learning Based on Feature Selection and Optimization Algorithm Using UAV Hyperspectral Data. Remote Sens. 2024, 16, 4637. [Google Scholar] [CrossRef]
  27. Godoy, C.V.; Seixas, C.D.S.; Soares, R.M.; Marcelino-Guimarães, F.C.; Meyer, M.C.; Costamilan, L.M. Asian soybean rust in Brazil: Past, present, and future. Pesqui. Agropecu. Bras. 2016, 51, 407–421. [Google Scholar] [CrossRef]
  28. Neves, R.A.; Cruvinel, P.E. Application of Image Processing and Advanced Intelligent Computing for Determining Stage of Asian Rust in Soybean Plants. In Proceedings of the 2022 IEEE 16th International Conference on Semantic Computing (ICSC), Laguna Hills, CA, USA, 26–28 January 2022; pp. 280–286. [Google Scholar]
  29. Abbas, A.; Zhang, Z.; Zheng, H.; Alami, M.M.; Alrefaei, A.F.; Abbas, Q.; Naqvi, S.A.H.; Rao, M.J.; Mosa, W.F.; Abbas, Q.; et al. Drones in plant disease assessment, efficient monitoring, and detection: A way forward to smart agriculture. Agronomy 2023, 13, 1524. [Google Scholar] [CrossRef]
  30. Embrapa Soja. Digipathos Repository—Embrapa Soybean. 2021. Available online: https://www.digipathos-rep.cnptia.embrapa.br/ (accessed on 12 February 2021). (In Portuguese).
  31. Instituto Nacional de Meteorologia (INMET). Meteorological Database for Teaching and Research. 2019. Available online: https://portal.inmet.gov.br (accessed on 3 July 2019). (In Portuguese)
  32. Embrapa Soja Soy in Numbers (2019/20 Season). 2023. Available online: https://www.embrapa.br/web/portal/soja/cultivos/soja1/dados-economicos (accessed on 21 September 2023). (In Portuguese).
  33. Barbedo, J.G.A.; Koenigkan, L.V.; Halfeld-Vieira, B.A.; Costa, R.V.; Nechet, K.L.; Godoy, C.V.; Junior, M.L.; Patricio, F.R.A.; Talamini, V.; Chitarra, L.G.; et al. Annotated Plant Pathology Databases for Image-Based Detection and Recognition of Diseases. IEEE Lat. Am. Trans. 2018, 16, 1749–1757. [Google Scholar] [CrossRef]
  34. Yanowitz, S.D.; Bruckstein, A.M. A New Method for Image Segmentation. Comput. Vision Graph. Image Process. 1989, 46, 82–95. [Google Scholar] [CrossRef]
  35. Gonzalez, R.C.; Woods, R.E. Digital Image Processing; Pearson Education do Brasil: São Paulo, Brazil, 2010. (In Portuguese) [Google Scholar]
  36. Horé, A.; Ziou, D. Image Quality Metrics: PSNR vs. SSIM. In Proceedings of the 20th International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar] [CrossRef]
  37. Preedanan, W.; Kondo, T.; Bunnun, P.; Kumazawa, I. A Comparative Study of Image Quality Assessment. In Proceedings of the 2018 International Workshop on Advanced Image Technology (IWAIT), Chiang Mai, Thailand, 7–9 January 2018; pp. 1–4. [Google Scholar] [CrossRef]
  38. Sara, U.; Akter, M.; Uddin, M.S. Image Quality Assessment through FSIM, SSIM, MSE and PSNR—A Comparative Study. J. Comput. Commun. 2019, 7, 8–18. [Google Scholar] [CrossRef]
  39. Lowe, D.G. Object Recognition from Local Scale-Invariant Features. In Proceedings of the Seventh IEEE International Conference on Computer Vision (ICCV), Kerkyra, Greece, 20–27 September 1999; Volume 2, pp. 1150–1157. [Google Scholar] [CrossRef]
  40. Hu, M.K. Visual Pattern Recognition by Moment Invariants. IRE Trans. Inf. Theory 1962, 8, 179–187. [Google Scholar] [CrossRef]
  41. Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893. [Google Scholar] [CrossRef]
  42. Zhao, W.; Wang, J. Study of Feature Extraction Based Visual Invariance and Species Identification of Weed Seeds. In Proceedings of the 2010 Sixth International Conference on Natural Computation (ICNC), Yantai, China, 10–12 August 2010; Volume 2, pp. 631–635. [Google Scholar] [CrossRef]
  43. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  44. Vapnik, V. The Nature of Statistical Learning Theory; Information Science and Statistics; Springer: New York, NY, USA, 1999. [Google Scholar]
  45. Faceli, K.; Lorena, A.C.; Gama, J.; Carvalho, A.C.P.L.F. Inteligência Artificial: Uma Abordagem de Aprendizado de Máquina; LTC: Rio de Janeiro, Brazil, 2011. [Google Scholar]
  46. da Silva, G.; Ferreira, A.; Guilherme, D.; Grigolli, J.F.; Weber, V.; Pistori, H. Recognition of Soybean Diseases Using Machine Learning Techniques Based on Segmentation of Images Captured by UAVs. In Proceedings of the 16th Workshop on Computer Vision (WVC), Virtual, 7–10 November 2020; Brazilian Computer Society (SBC): Petrapolis, Brazil, 2020; pp. 12–17. (In Portuguese) [Google Scholar] [CrossRef]
  47. Murphy, K.P. Machine Learning: A Probabilistic Perspective; A comprehensive reference for the following metrics: Variance, standard deviation, precision, recall, F1-score, and ROC/AUC; MIT Press: Cambridge, MA, USA, 2012. [Google Scholar]
  48. Karhunen, K. Über Lineare Methoden in der Wahrscheinlichkeitsrechnung; Annales Academiae Scientiarum Fennicae; Series A. I. Mathematica-Physica; Suomalainen Tiedeakatemia: Helsinki, Finland, 1947. [Google Scholar]
  49. Hotelling, H. Analysis of a Complex of Statistical Variables into Principal Components. J. Educ. Psychol. 1933, 24, 417. [Google Scholar] [CrossRef]
  50. Klema, V.; Laub, A. The Singular Value Decomposition: Its Computation and Some Applications. IEEE Trans. Autom. Control 1980, 25, 164–176. [Google Scholar] [CrossRef]
  51. Greville, T.N.E. Theory and Applications of Spline Functions; Army Mathematics Research Center: Madison, WI, USA; Academic Press: New York, NY, USA, 1969. [Google Scholar]
  52. Boudaren, M.E.Y.; Pieczynski, W. Dempster–Shafer Fusion of Evidential Pairwise Markov Chains. IEEE Trans. Fuzzy Syst. 2016, 24, 1598–1610. [Google Scholar] [CrossRef]
  53. Li, Y.; Jha, D.K.; Ray, A.; Wettergren, T.A. Information-Theoretic Performance Analysis of Sensor Networks via Markov Modeling of Time Series Data. IEEE Trans. Cybern. 2017, 48, 1898–1909. [Google Scholar] [CrossRef]
  54. Neves, R.A. Cloud-Based Computer Vision and Intelligence System for Asian Rust Risk Management in Soybean Crops. Ph.D. Thesis, Federal University of São Carlos, São Carlos, Brazil, 2024. (In Portuguese). [Google Scholar]
  55. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef]
  56. Jang, J.S.R.; Sun, C.T.; Mizutani, E. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence; Prentice Hall: Upper Saddle River, NJ, USA, 1997. [Google Scholar]
  57. Pedrycz, W. Why triangular membership functions? Fuzzy Sets Syst. 1994, 64, 21–30. [Google Scholar] [CrossRef]
  58. Prokopowicz, P.; Czerniak, J.; Mikołajewski, D.; Apiecionek, Ł.; Ślęzak, D. (Eds.) Theory and Applications of Ordered Fuzzy Numbers: A Tribute to Professor Witold Kosiński; Studies in Fuzziness and Soft Computing; Springer International Publishing: Cham, Switzerland, 2017; Volume 355. [Google Scholar] [CrossRef]
  59. Mamdani, E.; Assilian, S. An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller. Int. J.-Hum.-Comput. Stud. 1999, 51, 135–147. [Google Scholar] [CrossRef]
  60. Scikit-Fuzzy. Version 0.4.2. Python Software. 2019. Available online: https://doi.org/10.5281/zenodo.3541386 (accessed on 21 June 2025).
  61. Baum, L.E.; Petrie, T. Statistical Inference for Probabilistic Functions of Finite State Markov Chains. Ann. Math. Stat. 1966, 37, 1554–1563. [Google Scholar] [CrossRef]
  62. Baum, L.E.; Eagon, J.A. An Inequality with Applications to Statistical Estimation for Probabilistic Functions of Markov Processes and to a Model for Ecology. Bull. Am. Math. Soc. 1967, 73, 360–363. [Google Scholar] [CrossRef]
  63. Markov, A. Extension of the Limit Theorems of Probability Theory to a Sum of Variables Connected in a Chain. In Dynamical Probabilistic Systems; Iosifescu, M., Ed.; English translation of the original 1906 Russian publication; John Wiley & Sons: Hoboken, NJ, USA, 1971; Volume 1, p. 552. [Google Scholar]
  64. Ching, W.K.; Ng, M.K. Markov Chains: Models, Algorithms and Applications; International Series in Operations Research & Management Science; Springer: New York, NY, USA, 2006. [Google Scholar] [CrossRef]
  65. Berg, B.A. Markov Chain Monte Carlo Simulations and Their Statistical Analysis: With Web-Based Fortran Code; World Scientific Publishing Company: Singapore, 2004. [Google Scholar]
Figure 2. Conceptual diagram.
Figure 3. Database structuring diagram.
Figure 4. Set of variables considered for analysis in a temporal window.
Figure 5. Oracle Cloud architecture for the intelligent system.
Figure 6. Oracle Cloud architecture for the intelligent system: initial screen.
Figure 7. Oracle Cloud architecture for the intelligent system: autonomous database.
Figure 8. Main interface (input).
Figure 9. Recommendation interface (output).
Figure 10. Arrangement of the time series of data from the set of variables for the decision support system.
Figure 11. Examples of results obtained based on different threshold selections, where (a) 0 ≤ threshold values ≤ 85, (b) 31 ≤ threshold values ≤ 165, (c) 70 ≤ threshold values ≤ 159, (d) 83 ≤ threshold values ≤ 159, (e) 100 ≤ threshold values ≤ 130, (f) 18 ≤ threshold values ≤ 200.
Figure 12. An example of results obtained with the application of the segmentation technique, where (a) is the original RGB image, (b) is the green band of the original image, (c) is the histogram equalization result, (d) shows the processed labels from 0 to 5 obtained after equalization, with the selected label highlighted in a red rectangle, (e) is the selected label, (f) shows the segmented values related to green pixels, (g) the segmented values related to yellow pixels, and (h) the segmented values related to brown pixels.
Figure 13. Non-normalized Hu, HOG, and SIFT descriptors.
Figure 14. Feature data after PCA processing.
Figure 15. Result obtained with an SVM classifier based on a polynomial kernel.
Figure 16. Results based on the membership functions.
Figure 17. The hidden Markov chain model for Asian rust risk analysis in soybean crops.
Figure 18. Chart of accounted rules.
Figure 19. Standard deviation versus status of the variables related to each input in the hidden Markov chain applied to ASR risk evaluation.
Figure 20. Final dashboard interface (during processing).
Figure 21. Data analysis analytical report (Subject 1).
Figure 22. Example of an analytical report (Subject 2). In (a), the example illustrates that the combinations of the rule base variables were between 66.7% and 100%, while in (b) these combinations were between 33.4% and 66.6%.
Figure 23. Data analysis analytical report (Subject 3).
Figure 24. Computational cost analysis.
Figure 25. Validation of the presence or absence of Asian soybean rust.
Figure 26. Validation of Asian soybean rust severity level.
Table 1. Variables and physical quantities: data fusion.

| ID | Description of Variable | Physical Quantity |
|----|-------------------------|-------------------|
| V1 | Leaf Wetness Period | Percentage (%) |
| V2 | Minimum Leaf Wetness Period | Millimeters (mm) |
| V3 | Temperature Range | Degrees Celsius (°C) |
| V4 | Maximum Temperature | Degrees Celsius (°C) |
| V5 | Minimum Temperature | Degrees Celsius (°C) |
| V6 | Dew Point | Degrees Celsius (°C) |
| V7 | Image Classification Data | Classification Unit (0 or 1) |
Table 2. Integral rule base for ASR favorability [54].

Climatic Conditions for Asian Soybean Rust Favorability

| Description | Variable | Estimated Value |
|---|---|---|
| Known Climatological Data | | |
| Leaf Wetness Period | Hours Quantity | Relative humidity greater than or equal to 90% |
| Dew Point | Temperature | Difference less than 2 °C |
| Temperature Range Favorable for Fungus Development | Temperature | Range between 18 °C and 25 °C |
| Minimum and Maximum Temperature during Leaf Wetness Period | Temperature Range | Range between 18 °C and 26.5 °C |
| Minimum Leaf Wetness Period | Time | 6 h |
| New Presented Data | | |
| Soybean Leaf Cultivar Data | Classification | Pixel analysis |
| Phenomenology of Asian Soybean Rust Problem | Discovery of Color Classes | Analysis of green, yellow, and brown pixels |
| Disease Stage Identification | Percentage occurrence of classes | Quantity of pixels for each class |
| Favorability Probability | Set of variables from indicators | Low, Medium, and High |
Table 3. Fuzzy inferences.

| Condition | Options | Favorability | Combinations |
|---|---|---|---|
| IF favorability is TRUE for up to two variables THEN | 1 option: V1 or V2 or V3 or V4 or V5 or V6 or V7 | Low | 1 |
| IF favorability is TRUE for up to two variables THEN | 2 options: V1 or group (V2 or V3 or V4 or V5 or V6 or V7) | Low | 8 |
| IF favorability is TRUE for up to four variables THEN | 3 options: V1 AND V2 AND group (V3 or V4 or V5 or V6 or V7) | Medium | 21 |
| IF favorability is TRUE for up to four variables THEN | 4 options: V1 AND V2 AND V3 AND group (V4 or V5 or V6 or V7) | Medium | 35 |
| IF favorability is TRUE for more than four variables THEN | 5 options: V1 AND V2 AND V3 AND V4 AND group (V5 or V6 or V7) | High | 35 |
| IF favorability is TRUE for more than four variables THEN | 6 options: V1 AND V2 AND V3 AND V4 AND V5 AND group (V6 or V7) | High | 20 |
Table 4. Data series temporal window.

| N. | Precip. | Max. Temp. | Min. Temp. | Relative Humidity | Dew Point | Comp. Average Temperature | Status |
|---|---|---|---|---|---|---|---|
| 1 | 4.20 | 35.50 | 24.00 | 72.75 | 23.08 | 28.44 | Original |
| 2 | 0.00 | 32.50 | 24.40 | 88.75 | 23.80 | 25.80 | Original |
| 3 | 18.00 | 33.30 | 22.50 | 79.00 | 22.85 | 26.80 | Original |
| 4 | 0.00 | 33.00 | 23.20 | 84.00 | 22.62 | 25.52 | Original |
| 5 | 0.00 | 33.60 | 23.80 | 88.25 | 24.02 | 26.12 | Original |
| 6 | 3.00 | 34.50 | 23.40 | 83.00 | 23.06 | 26.18 | Original |
| 7 | 0.00 | 33.50 | 24.00 | 84.25 | 23.47 | 26.34 | Original |
| 8 | 4.20 | 35.50 | 24.00 | 72.80 | 23.10 | 28.40 | Interpolated |
| 9 | 6.10 | 32.80 | 24.90 | 88.70 | 23.80 | 25.90 | Interpolated |
| 10 | 5.40 | 32.60 | 23.90 | 86.80 | 23.70 | 26.00 | Interpolated |
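Table 4 marks three records as "Interpolated", but the interpolation scheme itself is not spelled out in the table. A minimal sketch, assuming simple linear interpolation between two neighboring observed records (the helper name and the fractional-position parameter `t` are hypothetical):

```python
def interpolate_record(prev_row, next_row, t=0.5):
    """Linearly interpolate one missing daily record between two observed
    rows; t in (0, 1) is the fractional position of the missing day.
    Hypothetical helper: the paper reports interpolated rows without
    stating the exact scheme used."""
    return [round(a + t * (b - a), 2) for a, b in zip(prev_row, next_row)]

# Midpoint between (precip 0.0, max temp 32.0) and (precip 4.0, max temp 36.0):
interpolate_record([0.0, 32.0], [4.0, 36.0])  # -> [2.0, 34.0]
```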
Table 5. Segmentation quality analysis: metrics and outliers.

| Segmented Images | MSE | PSNR (dB) | SSIM | Outliers (Seeds) | Outliers (Calculation) |
|---|---|---|---|---|---|
| Green | 0.05 | 13.35 | 0.91 | 0 | 3 |
| Yellow | 0.06 | 12.59 | 0.91 | 14 | 14 |
| Brown | 0.05 | 12.94 | 0.91 | 1 | 1 |
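The PSNR values in Table 5 are consistent with the usual definition PSNR = 10·log10(peak² / MSE) on intensities normalized to [0, 1]: an MSE of 0.05 gives roughly 13 dB, the same order as the reported values. A minimal sketch (the normalized-intensity assumption is ours):

```python
import math

def psnr_from_mse(mse: float, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB. `peak` is the maximum possible
    intensity: 1.0 for normalized images, 255 for 8-bit images."""
    return 10.0 * math.log10(peak * peak / mse)

psnr_from_mse(0.05)  # ~13.01 dB, close to Table 5's green channel (13.35 dB)
```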
Table 6. Explained variance per principal component (19 components).

| PC | Eigenvalue | % of Variance | Cumulative Variance (%) |
|---|---|---|---|
| 1 | 0.64 | 12.30 | 12.30 |
| 2 | 0.57 | 11.02 | 23.31 |
| 3 | 0.33 | 6.33 | 29.65 |
| 4 | 0.28 | 6.28 | 35.93 |
| 5 | 0.28 | 5.31 | 41.24 |
| 6 | 0.18 | 3.56 | 44.79 |
| 7 | 0.18 | 3.59 | 48.18 |
| 8 | 0.13 | 2.79 | 50.97 |
| 9 | 0.14 | 2.56 | 53.53 |
| 10 | 0.12 | 2.47 | 56.00 |
| 11 | 0.13 | 2.57 | 58.52 |
| 12 | 0.11 | 1.87 | 60.39 |
| 13 | 0.09 | 1.63 | 62.02 |
| 14 | 0.09 | 1.65 | 63.66 |
| 15 | 0.08 | 1.57 | 65.23 |
| 16 | 0.08 | 1.54 | 66.76 |
| 17 | 0.07 | 1.41 | 68.17 |
| 18 | 0.06 | 1.31 | 69.48 |
| 19 | 0.06 | 1.23 | 70.79 |
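The last two columns of Table 6 follow mechanically from the eigenvalues: each component's percentage is its eigenvalue divided by the total variance over all components (which is why the 19 retained components accumulate only about 70.8%). A small sketch of that bookkeeping (the function name is ours):

```python
def explained_variance_table(eigenvalues, total_variance):
    """Percent and cumulative percent of variance per principal
    component. `total_variance` is the sum over ALL components,
    not only the retained ones."""
    out, cum = [], 0.0
    for e in eigenvalues:
        pct = 100.0 * e / total_variance
        cum += pct
        out.append((round(pct, 2), round(cum, 2)))
    return out

# Toy example with two components explaining 50% and 25%:
explained_variance_table([2.0, 1.0], 4.0)
```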
Table 7. Hyperparameter settings: grid search.

Polynomial kernel: degree = 3, 5, 7; C = 1, 10, 100, 1000; gamma = 0.001, 0.01, 0.1, 1; class_weight = balanced or {0: 0.1, 1: 0.9}.
RBF kernel: degree = 3, 5, 7; C = 1, 10, 100; gamma = 0.001, 0.01, 0.1, 1; class_weight = {0: 0.3, 1: 0.7} or {0: 0.1, 1: 0.9}.
Linear kernel: C = 1, 10, 100; gamma = 0.01, 0.1, 1; class_weight = {0: 0.1, 1: 0.9}.
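The grid in Table 7 defines a Cartesian product of candidate settings; the paper presumably ran it through a standard grid-search routine such as scikit-learn's GridSearchCV. A stdlib-only sketch that just enumerates the polynomial-kernel candidates (the dictionary layout is our reconstruction of the table, not the authors' code):

```python
from itertools import product

# Reconstruction of the polynomial-kernel grid from Table 7.
grid = {
    "degree": [3, 5, 7],
    "C": [1, 10, 100, 1000],
    "gamma": [0.001, 0.01, 0.1, 1],
    "class_weight": ["balanced", {0: 0.1, 1: 0.9}],
}

# Every combination of the four hyperparameters: 3 * 4 * 4 * 2 = 96 candidates.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
```

Each `combos` entry is one model configuration to be fitted and scored under cross-validation; the best one (Table 10) is then retained.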
Table 8. Classifier report data: polynomial kernel.

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 | 0.88 | 0.50 | 0.64 | 692 |
| 1 | 0.79 | 0.96 | 0.87 | 1361 |
| Accuracy | | | 0.81 | 2053 |
| Macro Average | 0.83 | 0.73 | 0.75 | 2053 |
| Weighted Average | 0.82 | 0.81 | 0.79 | 2053 |
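The derived rows of Table 8 follow from the per-class numbers: the F1-score is the harmonic mean of precision and recall, and the weighted average weights each class by its support. A minimal sketch of both formulas:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def weighted_average(values, supports):
    """Support-weighted mean across classes."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)

# Class 0 in Table 8: precision 0.88, recall 0.50 -> F1 about 0.64.
f1_score(0.88, 0.50)
# Weighted F1 over both classes -> about 0.79, as reported.
weighted_average([0.64, 0.87], [692, 1361])
```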
Table 9. Comparative data: SVM classifier.

SVM Classifier, Linear Kernel

| Descriptive Statistics | Acc. (80-20) | MSE (80-20) | AUC (80-20) | Acc. (70-30) | MSE (70-30) | AUC (70-30) | Acc. (50-50) | MSE (50-50) | AUC (50-50) |
|---|---|---|---|---|---|---|---|---|---|
| Minimum | 0.692 | 0.000 | 0.440 | 0.684 | 0.000 | 0.480 | 0.687 | 0.000 | 0.490 |
| Maximum | 1.000 | 0.308 | 0.690 | 1.000 | 0.316 | 0.690 | 1.000 | 0.313 | 0.640 |
| Mean | 0.787 | 0.213 | 0.587 | 0.792 | 0.208 | 0.588 | 0.777 | 0.223 | 0.583 |
| Standard Error | 0.011 | 0.011 | 0.008 | 0.011 | 0.011 | 0.006 | 0.011 | 0.011 | 0.003 |
| Variance | 0.008 | 0.008 | 0.004 | 0.007 | 0.007 | 0.002 | 0.007 | 0.007 | 0.001 |
| Standard Dev. | 0.089 | 0.089 | 0.062 | 0.085 | 0.085 | 0.050 | 0.085 | 0.085 | 0.025 |
| Median | 0.750 | 0.250 | 0.590 | 0.761 | 0.239 | 0.590 | 0.741 | 0.259 | 0.590 |
| 25th Percentile | 0.731 | 0.167 | 0.540 | 0.729 | 0.179 | 0.550 | 0.724 | 0.212 | 0.570 |
| 75th Percentile | 0.833 | 0.269 | 0.640 | 0.821 | 0.271 | 0.620 | 0.788 | 0.276 | 0.600 |

SVM Classifier, Polynomial Kernel

| Descriptive Statistics | Acc. (80-20) | MSE (80-20) | AUC (80-20) | Acc. (70-30) | MSE (70-30) | AUC (70-30) | Acc. (50-50) | MSE (50-50) | AUC (50-50) |
|---|---|---|---|---|---|---|---|---|---|
| Minimum | 0.692 | 0.000 | 0.820 | 0.795 | 0.034 | 0.800 | 0.769 | 0.041 | 0.800 |
| Maximum | 1.000 | 0.308 | 1.000 | 0.966 | 0.205 | 1.000 | 0.959 | 0.231 | 0.990 |
| Mean | 0.790 | 0.210 | 0.917 | 0.860 | 0.140 | 0.916 | 0.844 | 0.156 | 0.900 |
| Standard Error | 0.011 | 0.011 | 0.006 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 |
| Variance | 0.008 | 0.008 | 0.002 | 0.002 | 0.002 | 0.001 | 0.001 | 0.001 | 0.001 |
| Standard Dev. | 0.088 | 0.088 | 0.043 | 0.042 | 0.042 | 0.039 | 0.036 | 0.036 | 0.039 |
| Median | 0.756 | 0.244 | 0.915 | 0.850 | 0.150 | 0.910 | 0.841 | 0.159 | 0.900 |
| 25th Percentile | 0.731 | 0.167 | 0.900 | 0.829 | 0.128 | 0.890 | 0.815 | 0.133 | 0.870 |
| 75th Percentile | 0.833 | 0.269 | 0.948 | 0.872 | 0.171 | 0.940 | 0.867 | 0.185 | 0.928 |

SVM Classifier, RBF Kernel

| Descriptive Statistics | Acc. (80-20) | MSE (80-20) | AUC (80-20) | Acc. (70-30) | MSE (70-30) | AUC (70-30) | Acc. (50-50) | MSE (50-50) | AUC (50-50) |
|---|---|---|---|---|---|---|---|---|---|
| Minimum | 0.709 | 0.000 | 0.570 | 0.709 | 0.000 | 0.570 | 0.687 | 0.000 | 0.460 |
| Maximum | 1.000 | 0.291 | 1.000 | 1.000 | 0.291 | 1.000 | 1.000 | 0.313 | 1.000 |
| Mean | 0.794 | 0.206 | 0.820 | 0.794 | 0.206 | 0.820 | 0.779 | 0.221 | 0.769 |
| Standard Error | 0.011 | 0.011 | 0.015 | 0.011 | 0.011 | 0.015 | 0.011 | 0.011 | 0.018 |
| Variance | 0.007 | 0.007 | 0.014 | 0.007 | 0.007 | 0.014 | 0.007 | 0.007 | 0.020 |
| Standard Dev. | 0.084 | 0.084 | 0.119 | 0.084 | 0.084 | 0.119 | 0.084 | 0.084 | 0.143 |
| Median | 0.765 | 0.235 | 0.830 | 0.765 | 0.235 | 0.830 | 0.744 | 0.256 | 0.755 |
| 25th Percentile | 0.729 | 0.173 | 0.713 | 0.729 | 0.173 | 0.713 | 0.728 | 0.210 | 0.653 |
| 75th Percentile | 0.827 | 0.271 | 0.930 | 0.827 | 0.271 | 0.930 | 0.790 | 0.272 | 0.878 |
Table 10. Hyperparameters: polynomial kernel.

| Hyperparameter | Value |
|---|---|
| C | 100 |
| Weight (Class 0) | 0.3 |
| Weight (Class 1) | 0.7 |
| Degree | 3 |
| Gamma | 0.1 |
Table 11. Configuration of the membership functions.

| Description | Configuration |
|---|---|
| Antecedent: Leaf Wetness Period | |
| Humidity below threshold | 0, 43, 89 |
| Humidity at threshold | 88, 90, 94 |
| Humidity above threshold | 93, 96, 100 |
| Antecedent: Minimum Leaf Wetness Period | |
| Time below threshold | 0, 14, 24 |
| Time at threshold | 22, 46, 70 |
| Time above threshold | 66, 83, 100 |
| Antecedent: Soybean Leaf Image Classification Data | |
| Unfavorable | 0, 0, 1 |
| Favorable | 1, 1, 1 |
| Antecedent: Dew Point | |
| Temperature below threshold | −2, −1, 0 |
| Temperature at threshold | 0, 1, 2 |
| Temperature above threshold | 2, 3, 4 |
| Antecedent: Temperature Range | |
| Initial: range below threshold | 0, 7, 15 |
| Initial: range at threshold | 14.4, 18, 21.4 |
| Initial: range above threshold | 21, 24, 27 |
| Final: range below threshold | 14, 19, 24 |
| Final: range at threshold | 23.4, 26, 28.4 |
| Final: range above threshold | 28, 36, 44 |
| Antecedent: Minimum Temperature | |
| Minimum temperature below threshold | 0, 7, 15 |
| Minimum temperature at threshold | 14, 18, 22 |
| Minimum temperature above threshold | 21, 24, 27 |
| Antecedent: Maximum Temperature | |
| Maximum temperature below threshold | 14, 19, 24 |
| Maximum temperature at threshold | 23, 26, 28 |
| Maximum temperature above threshold | 27, 35, 43 |
| Consequent: Favorability | |
| Low | 0, 17.15, 33.3 |
| Medium | 32.3, 50, 67.6 |
| High | 66.6, 84, 100 |
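Each three-number configuration in Table 11 reads naturally as the feet and peak of a triangular membership function. A minimal sketch under that assumption (the crisp antecedents such as "Favorable: 1, 1, 1" are degenerate triangles, which this simple helper does not special-case):

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet a, c and peak b.
    Sketch only: assumes a < b < c, matching most rows of Table 11."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# "Humidity at threshold" (88, 90, 94): full membership at 90%,
# half membership at 92%.
tri(90, 88, 90, 94)  # -> 1.0
tri(92, 88, 90, 94)  # -> 0.5
```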
Table 12. The hidden Markov chain data.

| C | Vf1 | Vf2 | Vf3 | Vf4 | Vf5 | Vf6 | Vf7 | S | P_Vf1 | P_Vf2 | P_Vf3 | P_Vf4 | P_Vf5 | P_Vf6 | P_Vf7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 |
| 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 |
| 4 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.50 | 0.50 |
| 5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 |
| … | | | | | | | | | | | | | | | |
| 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 |
| 10 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.50 |
| 11 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.50 | 0.00 | 0.50 | 0.00 |
| 12 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 2 | 0.00 | 0.00 | 0.00 | 0.33 | 0.00 | 0.33 | 0.33 |
| 13 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.50 | 0.50 | 0.00 | 0.00 |
| 14 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 2 | 0.00 | 0.00 | 0.00 | 0.33 | 0.33 | 0.00 | 0.33 |
| 15 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 2 | 0.00 | 0.00 | 0.00 | 0.33 | 0.33 | 0.33 | 0.00 |
| 16 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 2 | 0.00 | 0.00 | 0.00 | 0.25 | 0.25 | 0.25 | 0.25 |
| … | | | | | | | | | | | | | | | |
| 128 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 | 0.14 |
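Table 12 lists all 2⁷ = 128 binary configurations of the seven fused variables, and each active variable receives an equal share of probability (1.00, 0.50, 0.33, 0.25, …, 0.14 for one to seven active variables). That reading is an inference from the table values, not an explicitly stated formula; under it, the table can be enumerated as:

```python
from itertools import product

# Enumerate the 128 chain configurations of Table 12: one row per
# binary vector over (Vf1, ..., Vf7), each active variable receiving
# an equal probability share (equal-share rule assumed from the table).
rows = []
for c, bits in enumerate(product([0, 1], repeat=7), start=1):
    active = sum(bits)
    probs = [round(b / active, 2) if active else 0.0 for b in bits]
    rows.append((c, bits, probs))
```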
Table 13. Data input vector.

| Input (Algorithm) | Vf1 | Vf2 | Vf3 | Vf4 | Vf5 | Vf6 | Vf7 |
|---|---|---|---|---|---|---|---|
| Occurrences | 7 | 6 | 0 | 1 | 1 | 1 | 1 |
| Transformed Occurrences | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
Table 14. Markov chain result.

| Selected Hidden Chain | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
|---|---|---|---|---|---|---|---|
| Selected Probability | 0.17 | 0.17 | 0.00 | 0.17 | 0.17 | 0.17 | 0.17 |

State (S): 3. Favorability: High.
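Tables 13 and 14 together illustrate the decision step: raw occurrence counts are binarized (any count above zero marks the variable as favorable), and the number of favorable variables is mapped to a favorability class using the cut-offs of Table 3. A sketch of that mapping (the function name and the state numbering S = 1..3 are our assumptions, chosen to match Table 14):

```python
def favorability(occurrences):
    """Binarize per-variable occurrence counts and map the number of
    favorable variables to a favorability class, following the
    cut-offs of Table 3 (up to two -> Low, up to four -> Medium,
    more than four -> High)."""
    bits = [1 if c > 0 else 0 for c in occurrences]
    true_count = sum(bits)
    if true_count <= 2:
        return bits, 1, "Low"
    if true_count <= 4:
        return bits, 2, "Medium"
    return bits, 3, "High"

# Table 13's input (7, 6, 0, 1, 1, 1, 1) binarizes to 1101111:
# six favorable variables -> state 3, "High", as in Table 14.
favorability([7, 6, 0, 1, 1, 1, 1])
```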
Table 15. Markovian quality data.

| Favorability | Vf1 | Vf2 | Vf3 | Vf4 | Vf5 | Vf6 | Vf7 | ĉ(t) | σ²(f̄) | σ | Accuracy | Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Low | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.00 | 0.00 | 0.00 | 1.00 | 1.00 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1.00 | 0.12 | 0.35 | 0.88 | 0.65 |
| | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1.00 | 0.12 | 0.35 | 0.88 | 0.65 |
| | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1.00 | 0.20 | 0.45 | 0.80 | 0.55 |
| | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1.00 | 0.12 | 0.35 | 0.88 | 0.65 |
| Medium | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1.00 | 0.24 | 0.49 | 0.76 | 0.51 |
| | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1.00 | 0.24 | 0.49 | 0.76 | 0.51 |
| | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1.00 | 0.24 | 0.49 | 0.76 | 0.51 |
| | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1.00 | 0.24 | 0.49 | 0.76 | 0.51 |
| | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1.00 | 0.24 | 0.49 | 0.76 | 0.51 |
| High | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1.00 | 0.20 | 0.45 | 0.80 | 0.55 |
| | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1.00 | 0.20 | 0.45 | 0.80 | 0.55 |
| | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1.00 | 0.12 | 0.35 | 0.88 | 0.65 |
| | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1.00 | 0.20 | 0.45 | 0.80 | 0.55 |
| | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1.00 | 0.12 | 0.35 | 0.88 | 0.65 |
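The numbers in Table 15 are consistent with treating the fraction p of active variables in a chain as a Bernoulli parameter: σ² = p(1 − p), accuracy = 1 − σ², and precision = 1 − σ (for example, one active variable out of seven gives p = 1/7, σ² ≈ 0.12, σ ≈ 0.35, accuracy 0.88, precision 0.65). This reading is inferred from the table rather than stated explicitly; a sketch under that assumption:

```python
import math

def chain_quality(bits):
    """Quality metrics for one chain configuration, assuming the
    Bernoulli reading of Table 15: p is the fraction of active
    variables, sigma^2 = p(1 - p), accuracy = 1 - sigma^2,
    precision = 1 - sigma."""
    p = sum(bits) / len(bits)
    var = p * (1 - p)
    sigma = math.sqrt(var)
    return (round(var, 2), round(sigma, 2),
            round(1 - var, 2), round(1 - sigma, 2))

# One active variable out of seven, as in Table 15's second row:
chain_quality([0, 0, 0, 0, 0, 0, 1])  # -> (0.12, 0.35, 0.88, 0.65)
```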
Table 16. Performance comparison of the data fusion models.

Favorability Analysis: Evaluation

Fuzzy Logic

| Category | Samples (Scenario 1) | Correct (Count) | Accuracy (%) | Samples (Scenario 2) | Correct (Count) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Low Favorability | 29 | 8 | 27.59 | 0 | N/A | N/A |
| Medium Favorability | 29 | 12 | 41.38 | 41 | 25 | 60.98 |
| High Favorability | 29 | 18 | 62.07 | 0 | N/A | N/A |

Hidden Markov Model

| Category | Samples (Scenario 1) | Correct (Count) | Accuracy (%) | Samples (Scenario 2) | Correct (Count) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Low Favorability | 29 | 29 | 100.00 | 0 | N/A | N/A |
| Medium Favorability | 29 | 29 | 100.00 | 41 | 41 | 100.00 |
| High Favorability | 29 | 29 | 100.00 | 0 | N/A | N/A |
Table 17. Statistical summary of computational cost by process.

| Process | Processing Mean (%) | Std. Dev. | Min. | Max. | Memory Mean (%) | Std. Dev. | Min. | Max. |
|---|---|---|---|---|---|---|---|---|
| Segmentation | 76.24 | 2.10 | 75.10 | 81.10 | 11.18 | 0.44 | 9.80 | 11.40 |
| Feature Extraction with PCA | 27.51 | 22.06 | 5.30 | 75.10 | 8.52 | 0.65 | 7.90 | 9.70 |
| Machine Learning | 90.55 | 3.51 | 83.80 | 94.10 | 10.95 | 0.52 | 10.40 | 11.60 |
| Variable Data Fusion | 89.26 | 3.49 | 83.80 | 94.10 | 10.96 | 0.51 | 10.40 | 11.60 |

Citation: Neves, R.A.; Cruvinel, P.E. A Cloud-Based Intelligence System for Asian Rust Risk Analysis in Soybean Crops. AgriEngineering 2025, 7, 236. https://doi.org/10.3390/agriengineering7070236

