Monitoring Saffron Crops with UAVs

The use of information technology in agriculture has brought significant benefits to producers, such as increased profits and better product quality. Modern technology applications in agriculture rely on the use of unmanned aerial vehicles (UAVs) and wireless ground sensors to provide real-time information about fields and crops. In Europe, these techniques, referred to as Smart Farming (SF), are still in their infancy despite the large agricultural production of a wide range of products. For this reason, in this study, we experimented with SF technologies in the cultivation of Greek saffron, a rare spice with many uses. Its rarity, together with its labor-intensive harvest, makes it one of the most expensive spices. Since the field of SF is relatively new and has not yet been used for this particular crop and area, the equipment and methods of data processing were selected experimentally after a review of the literature. The aim of the study was to remotely acquire image data of the crops and train a machine learning model to detect important objects such as saffron flowers and weeds.


Introduction
Smart Farming (SF) is a concept that essentially represents the application of advanced Information and Communication Technologies (ICTs) in agriculture to manage farming operations in real time. Smart Farming mainly depends on IoT (Internet of Things) technologies such as Wireless Sensor Networks (WSNs), unmanned aerial vehicles (UAVs), cloud services, and big data analytics, aiming to eliminate the need for physical work, to increase crop yields, and to reduce the environmental impact of irrational input usage. SF is considered to be the evolution of Precision Agriculture (PA), whose aim is to manage agricultural operations according to the estimated variability of crops based on spatial and temporal data. Some of the Smart Farming applications are as follows:
• Autonomous vehicles and robots that can assist or replace manual labor.
• Automated control of inputs such as water and fertilizers to lower the cost for the farmers and help protect the environment.
• Remote sensing technologies, such as sensors, UAVs, and satellites, that can be used for monitoring and managing soil, water, and other factors of production. These technologies can help identify factors that are stressing the crops, such as soil moisture, climate conditions, etc.
• Machine learning techniques and big data analytics that can be adopted to analyze the large amount of data collected in order to detect potential threats to plants, such as weeds, animals, or diseases.
• Computerized applications to create precision farming plans, field maps, crop logs, and yield maps. This allows for more precise application of inputs such as pesticides, herbicides, and fertilizers, helping to reduce costs, achieve higher yields, and practice more environmentally friendly agriculture.
This technology is not yet widespread in Europe, so there are cultivars such as Kozani saffron to which it has not yet been applied. Saffron is an expensive spice used in various fields, such as cooking, pharmaceutics, and perfumery. More specifically, Greek saffron is grown only in the Kozani region in Western Macedonia and is an important source of income for local producers. This fact was a great incentive to launch the project DIAS (http://dias-project.gr/, accessed on 5 May 2022) for piloting the application of SF technologies in saffron cultivation in order to help producers reduce production costs and labor and increase their profits and product quality. The application of SF technologies is still in the research stage, so there is no standardization of the methods yet. For this reason, we initially performed a review of related pilot studies, examined the hardware options (UAV and sensor technologies) for remotely monitoring the fields, and selected the appropriate equipment. Finally, we evaluated the relevant data and image-analysis techniques to be applied. The lessons learned during the DIAS project are documented in this study, in which we propose a process that employs UAVs to collect images from saffron fields and applies photogrammetry techniques and machine learning analysis to produce estimation models that detect saffron flowers, weeds, and signs of unwanted mammals. The aforementioned process consists of two main parts:
(a) UAV photogrammetry methods for image acquisition and processing: Photogrammetry is the derivation of precise measurements from photographs. It involves taking a series of overlapping photographs of an object, building, person, or environment and converting them into a 2D or 3D model, using various computer algorithms. The photos can be taken either from the ground or from the air, as in this case from UAVs.
(b) Machine learning techniques for model training: As for the machine learning part, for comparison purposes, we chose two different photo-analysis methods whose products are used as input to a machine learning algorithm: pixel-based and object-based analysis. We also used two learning algorithms for training in order to compare their performance: Random Forest (RF) and Multilayer Perceptron (MLP).
The results showed that the pixel-based method had higher accuracy rates than the object-based method and RF had better accuracy than MLP in both cases.

Related Work
UAV technology is a significant innovation that not only facilitates the monitoring of crop production but also helps farmers estimate the earnings from it. UAVs can provide information that eases the handling of problems detected in the fields and/or optimizes harvesting by estimating the yield. Up until now, UAVs have been utilized in a variety of applications for Precision Agriculture, since they can affect the productivity and efficiency of multiple farming processes. Such applications include soil analysis [1,2], mammal detection [3], and, most notably, the monitoring of field cultivation processes for different kinds of crops.
Weeds growing in agricultural crops are undesirable plants and can affect production significantly. They typically compete with the crops for space or water, interfering with crop growth and harvesting processes. Weed management is basically focused on the utilization of herbicides. However, such spraying not only poses a heavy pollution threat to the environment, but also significantly affects the growth and yield of the crops. Due to the aforementioned issues, Site-Specific Weed Management (SSWM) with UAVs is becoming quite popular [4]. This technique manages the spatially variable application of herbicides, avoiding spraying the entire field. In particular, the field is organized and separated into different management zones, since weed plants are known to spread through only a few spots of the field. This way, UAVs can collect multiple images from the field and then generate an accurate weed cover map for the precise spraying of herbicide according to the needs of each zone.
UAVs are also popular for their ability to provide an estimation regarding the yield via monitoring the growth of vegetation. It is already proven that real-time monitoring of the cultivation process can increase the quality and quantity of agricultural productivity [5,6].
UAVs can collect varied information from the fields by flying at different heights and angles. The acquisition of images on a regular basis enables the recording of variability observed in the growth of crops, as well as the status of the overall biomass. UAV images can also be utilized to produce three-dimensional digital maps of the crops, providing a better view and understanding of parameters such as crop height, distance between rows or between plants, and the Leaf Area Index (LAI). Monitoring the nitrogen content, as well as the biomass, which is the most common crop parameter, can lead to useful conclusions regarding the need for additional fertilizer or other actions during cultivation. Based on this information, farmers can effectively and efficiently manage the use of inputs, the timing of harvesting, and the amount of soil and yield pathogens.
Similarly, collecting images from UAVs can also be beneficial for the detection of diseases in crops, even at early stages [7]. Constant monitoring of the health of vegetation has proven to be of significant importance for preventing economic loss, as well as the further spreading of the disease. Poor vegetation health will eventually lead to reduced yield and consequently to a reduction in the quality of production. Pesticides are a common countermeasure for plant diseases. However, such a strategy is quite costly and also increases the likelihood of groundwater contamination, as well as of pesticide residues in the products. UAV-based data-processing technologies can enable the detection of changes in biophysical and biochemical characteristics of the crops in an automated, non-destructive manner. UAV disease control can occur either at the initial stage of infection, by collecting crop-health-relevant information, or during the treatment of infection, when farmers can use UAVs for targeted spraying, as well as for accurately monitoring the course of their intervention.
In Smart Farming, UAVs can also facilitate crop irrigation processes [8,9] toward achieving significant water conservation, while also being able to detect areas in cultivation with higher irrigation needs. Such processes can support the farmer to save time and increase crop productivity and quality. Furthermore, UAVs in cooperation with image processing tools can generate specialized maps that display the morphology of the soil, thus supporting the more efficient irrigation planning of each crop separately.
Following the same principles, UAVs can leverage crop spraying procedures [10], by applying chemicals in a timely and highly spatially resolved manner. Via a smart monitoring system for Smart Farming, crop fields can be efficiently studied regarding their morphology and crop heights, enabling the deployment of a spray management plan that covers the specific needs of each cultivation. Based on this plan, UAVs will be able to spray the appropriate amount of herbicide spatially, while also adjusting the amount of pesticide depending on the crop site in which it is located.
UAVs are also recommended for monitoring crops that require specific environmental conditions and parameters in order to be cultivated. One of these crops is saffron, which is mainly cultivated in Iran, which accounts for 90% of the world's production, as well as in Greece, Spain, India, Morocco, and Italy [11]. Saffron cultivation is a demanding and delicate process that requires the deployment of tailor-made technologies and continuous crop monitoring to avoid animal and disease interventions [12]. Up until now, only a few studies have focused on the smart monitoring of saffron crops. In Reference [13], the authors discussed the applicability of Wireless Sensor Networks (WSNs) for saffron cultivation in Afghanistan. In Reference [14], findings are presented for mapping saffron across larger areas, as well as for monitoring changes in saffron distribution. Another recent study focused on the estimation of the land area under saffron cultivation by using the time difference method and satellite imagery [15].
In contrast to previous studies, our work presents a complete overview of a methodology for monitoring fields that supports several options. In particular, the main differences with existing works are summarized as follows:

•
This study presented, applied, and evaluated a methodology that incorporates all the steps of the field monitoring process: the remote data collection step with the appropriate equipment, the photo-processing step, the application of machine learning algorithms, the application of the estimation models, and finally the evaluation step.

•
In contrast to previous research, the target of this study was twofold: (a) The first target was to estimate a variety of field attributes (i.e., weeds, flowers, and animal intrusions). The aim was to increase profits through the adoption of new technologies, ensuring that flowers are picked at an appropriate time and age and with proper collection material [12]. Moreover, animal intrusions in the fields, which cause a huge loss in production, will be prevented, and weeds will be detected in time. (b) The second target was to examine the applicability of a variety of machine learning methods to farming estimations and draw conclusions based on their accuracy.

•
This study suggests software and technologies that are of limited cost, since the methodology followed relies on tools that are open source [16-18].

Field Study Design Methodology
In this section, we present the design of the field study performed to collect and analyze data from saffron fields. In the following subsections, we present all the steps involved in the methodology followed (Figure 1) and we explain in detail the reasons that led us to choose the specific equipment, methods, and fields for the study.

Selection of Field Operations to Be Monitored
The first step in the employed methodology was to identify the specific operations of the saffron production process that would be supported by the application of Smart Farming techniques. We conducted a survey with saffron cultivators and agronomists, who are experts on this type of cultivation, and concluded that the main elements that affect the production process and need to be further monitored are as follows:
• The animal intrusions that destroy the crops.
• The weeds that affect the growth of the plant and need to be removed.
• The identification of saffron, which is the main plant and needs to be collected.
Thus, the following research questions emerged for the pilot study performed:
• RQ1: Is it possible to estimate the existence of mammals in saffron crops with the help of UAV imagery?
Rodents are a serious threat to saffron crops. In particular, damage is caused by farm rats, mice, and moles. Rats and mice are dangerous because they eat the plants. Moles, although they are not herbivores, live in the fields and have nests a few meters below the ground. They damage bulbous plants and other crops by destroying them when they open burrows in the ground.
• RQ2: Is it possible to estimate the existence of weeds in saffron crops with the help of UAV imagery?
Weeds are unwanted plants that grow next to saffron flowers and threaten to choke them. The only way to deal with weeds is to manually locate and remove them. It would be very helpful and time saving if farmers could foresee the specific parts of the field where weeds could show up, so that they could prevent their growth.
• RQ3: Is it possible to estimate the production of saffron crops with the help of UAV imagery?
Estimating the production of certain parts of the field at early stages of the cultivation gives farmers the ability to intervene and treat those parts of the crop in order to maximize their production. In addition, with an assessment of the crops' production, farmers can be more confident in their economic planning and investments.
Telecom 2022, 3, FOR PEER REVIEW 5
Figure 1. Steps of the methodology followed in the field study.
In Figure 2, we take a closer look at the individual data to point out the different categories that exist in the cultivation. Figure 2a shows the saffron flower colored purple, Figure 2b shows the traces of the mouse in the form of hatching angles, and Figure 2c shows the weeds detected in this type of cultivation. These three images are zoomed sections of the large one. Using the open-source annotation tool CVAT (https://cvat.org/, accessed on 5 May 2022), the experts (saffron cultivators) labeled the object classes to produce Figure 2d, which shows the basic data of cultivation from a much closer perspective. In this image, the data are shown with different colors, e.g., red for the saffron flower, green for the weeds, blue for the mouse tracks, and black for pixels associated with the soil class (anything not belonging to the three classes).

Site Selection
The data used were RGB and multispectral images collected with the help of UAVs and processed by the method of photogrammetry to obtain the necessary information from the fields. The images came from 6 different crop fields of Kozani Saffron; the first two fields were in the harvest period in October 2021, while the remaining 4 were in the period between June and July 2021. These crops produce organic saffron. The selection of the specific fields was based on the fact that they are located far from each other and are a representative sample from all areas of the Prefecture of Kozani. In this way, it is possible to study different soil samples influenced by environmental elements, such as the presence of a lake, and by the climatic conditions of each region. Table 1 shows the coordinates for each field, the number of flights, and the number of photographs taken.

Equipment Selection
UAVs have emerged as one of the most promising technologies for agriculture. The use of UAVs to monitor fields and investigate moisture and nutrient deficiencies in crops holds tremendous potential for farmers. Two different models were used for this study: Sensefly eBee SQ and DJI Phantom 4 RTK, whose main characteristics are listed in Table 2. The choice of the specific models was based on the different types of sensors attached and the variance in other attributes, such as speed and range. The eBee model was chosen because of its ability to capture multispectral images, which are used for vegetation health calculations, and its relatively long flight duration. Since saffron is a small flower and needs to be monitored from low altitude for detailed images, the Phantom RTK quadcopter was selected to provide this capability.
After the selection of the equipment, we had to decide on the optimal height of the flights. In this study, we experimented with three flight heights: (a) 400 m, (b) 120 m, and (c) 12 m. The test flights showed that the first height (400 m) cannot provide useful information in the case of saffron cultivations, since the flower is very close to the ground and the image resolution is too low. Therefore, the data and images obtained from these flights were not used. The second height (120 m) is ideal for performing long-distance flights that can monitor relatively large field areas. Images from 120 m are detailed enough to accurately estimate the existence of animals, weeds, and saffron. Most of the flights in this pilot study were performed at this height. The third height (12 m) is ideal for acquiring images for training the estimation models. In this case, the low height gives us access to high-resolution images that can be precisely annotated by the experts. The images acquired for fields No. 2 and No. 3 were from this low altitude and were used to train the model.
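To illustrate why flight height matters so much for a small flower, the following minimal sketch computes the ground sampling distance (GSD), i.e., the ground footprint of a single pixel, at the three heights tested. The formula is the standard pinhole-camera relation; the camera parameters are illustrative assumptions for a 1-inch-sensor camera and are not taken from Table 2 or from the study.

```python
def ground_sampling_distance(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """Return the ground footprint of one pixel in centimetres.

    Standard relation: GSD = (sensor width * altitude) / (focal length * image width).
    The mm units of sensor width and focal length cancel, leaving metres per pixel.
    """
    gsd_m = (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)
    return gsd_m * 100.0


# Assumed camera parameters (hypothetical values for illustration only).
SENSOR_WIDTH_MM = 13.2
FOCAL_LENGTH_MM = 8.8
IMAGE_WIDTH_PX = 5472

# The three flight heights examined in the study.
gsd_by_height = {
    height: ground_sampling_distance(height, SENSOR_WIDTH_MM, FOCAL_LENGTH_MM, IMAGE_WIDTH_PX)
    for height in (12, 120, 400)
}

for height, gsd in gsd_by_height.items():
    print(f"{height:>4} m -> {gsd:.2f} cm/pixel")
```

Under these assumptions, the GSD grows linearly with height, so a flower that spans many pixels at 12 m shrinks to a handful of pixels at 120 m and to less than one pixel at 400 m.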

Selection of Photogrammetry Techniques
Photogrammetry is the method of acquiring reliable information about an object or a location through analyzing two-dimensional photographs. This information includes precise measurements of the three-dimensional features of the terrain or the object of interest. More specifically, photogrammetry allows for coordinate calculation; the quantification of distances and heights; and the creation of topographic maps, digital elevation models, and orthophotos.
There are two general types of photogrammetry, aerial and terrestrial, depending on whether the camera is in the air or on the ground. The terrestrial type is also called close-range photogrammetry because it concerns object distances up to 200 m. There is also a hybrid type called small-format aerial photogrammetry that combines the aerial vantage point with close object distances and high image detail. Aerial photogrammetry, especially UAV photogrammetry, has been widely used for Precision Agriculture for the reason that it has many advantages, such as high-resolution images, low cost, and flexible survey planning [19].
Using UAV photogrammetry, we generated the following types of products [20]:
• Orthomosaic: Orthomosaics, or orthophotos (Figure 3), correct any geometric distortion that is inherent in aerial images. By using a process called orthorectification, a highly detailed map referenced to the real world can be created. Orthorectification removes perspective from each individual image to create consistency across the whole map, while keeping the same level of detail from the original image. The final product is a single mosaic built through edge matching and color balancing.
• Normalized Difference Vegetation Index (NDVI): NDVI (Figure 4) is an indicator that calculates the vitality of vegetation based on UAV-captured data. Live vegetation (where chlorophyll is present) reflects more infrared and green radiation (in the electromagnetic spectrum) than other wavelengths. Vegetation absorbs more blue and red radiation, so the human eye observes vegetation as green. The NDVI uses the near-infrared and red imaging channels to measure healthy vegetation. In mathematical terms, the formula is as follows:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}}$$
where NIR is the reflection in the near-infrared range, and RED is the reflection in the red range of the spectrum. Multispectral images were used for NDVI calculation.
• Ground-truth image: "Ground truth" stands for the objective observation, verifiable by humans, of the state of an object or information that can be considered a fact. The term "ground truth" has recently gained popularity, as it has been adopted by machine learning and deep learning approaches. In this context, we refer to a "ground truth image", that is, human-generated classifications of image data on which algorithms are trained or evaluated. This image can be created through a process called annotation, where we label the objects on an image by using an appropriate software tool.
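As a minimal sketch of the NDVI product, the snippet below applies the conventional definition NDVI = (NIR − RED) / (NIR + RED) to two small reflectance arrays with NumPy. The array values and the convention of returning 0 where both bands are zero are assumptions made for illustration; they do not come from the study's data or software.

```python
import numpy as np


def ndvi(nir, red, eps=1e-9):
    """Compute NDVI = (NIR - RED) / (NIR + RED) element-wise.

    Pixels where NIR + RED is (near) zero are set to 0.0 to avoid
    division by zero (an assumed convention for this sketch).
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denom = nir + red
    empty = np.abs(denom) < eps
    safe_denom = np.where(empty, 1.0, denom)
    return np.where(empty, 0.0, (nir - red) / safe_denom)


# Hypothetical 2x2 reflectance values for the near-infrared and red bands.
nir_band = np.array([[0.50, 0.40], [0.10, 0.00]])
red_band = np.array([[0.10, 0.40], [0.30, 0.00]])

ndvi_map = ndvi(nir_band, red_band)
print(ndvi_map)
```

Values close to 1 indicate dense, healthy vegetation, values near 0 indicate bare soil, and negative values typically correspond to non-vegetated surfaces.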



Application of Machine Learning Techniques
Smart Farming produces and analyzes a huge amount of data in order to create the necessary information to help boost agricultural production [21,22]. It is for this reason that scientists and practitioners working in this field have begun to take an interest in machine learning methods. One significant advantage of machine learning technology is that we can create a program (model) and train it to autonomously solve large non-linear problems by utilizing data from many different sources. Machine learning techniques are usually classified into two broad categories (supervised learning and unsupervised learning), depending on the nature of the training "signal" or "feedback" available in a learning system [23,24].
The goal of this study was to train a machine learning model so that it can recognize key features of saffron cultivation, such as saffron flowers, weeds, and mammals. We decided, for comparison reasons, to use two very common machine learning techniques for image classification in Smart Farming, the pixel-based and the object-based image analysis. Both of them take the results of the photogrammetry procedure as input and produce datasets that can be used for the model training.

• Pixel-based image analysis: This method is based on classifying images pixel by pixel, using a set of rules to decide whether different pixels can be grouped according to similar characteristics. Code was developed in Python to export the RGB and NDVI values of the pixels from the images, and an Attribute-Relation File Format (ARFF) dataset was created. ARFF files are ASCII text files that describe a list of instances that have a set of attributes in common. The data that served as input were orthophotos in an RGB and NDVI spectrum. In addition, a separate image was created with the annotations for each pixel, essentially representing the actual data (ground truth) of the image. The Python PIL library, which is suitable for collecting pixel values from an image file, was used for the analysis of each image. For the image annotation, we used the Computer Vision Annotation Tool (CVAT) [25], an open-source software.
• Object-based image analysis (OBIA): OBIA, in contrast to pixel-based analysis, which categorizes each pixel, groups small pixels into a common vector object. This process uses the image segmentation method suggested by Shepherd [26], which divides the pixels into homogeneous parts. Then these sections/objects are arranged in classes based on shape, spectrum, texture, and other features. In more detail, the analysis is based on objects that are built as pixel groups (raster clumps). The properties of each clump are stored in one raster attribute table.
The produced ARFF datasets from both methods were used as inputs for the model training procedure. After a review of the literature, we concluded that Random Forests and Multilayer Perceptron (artificial neural networks) were the most efficient algorithms for our purposes.
• Random Forest (RF): RF is a machine learning algorithm for solving regression and classification problems. Its name derives from the use of a large number of decision trees, which reduces the overfitting (overtraining) that can occur in each individual tree. The scipy.io library was utilized for ARFF file reading, and scikit-learn was used for the RandomForestClassifier function, which applies the RF algorithm to data in pandas format. The number of trees that we used as a parameter for the function was 100.
• Multilayer Perceptron (MLP): MLP is a kind of artificial neural network (ANN) and specifically belongs to the deep neural network category. Artificial neural networks [27] are an attempt to approach the human learning process. They essentially mimic biological neural networks by assigning nerve functions to a single element (neuron), which is only capable of summing its input and normalizing its output. Neurons are interconnected in arbitrarily complex artificial neural networks and are organized into levels: an input level (corresponding to features), one or more hidden levels, and an output level (corresponding to categories). The goal of the learning algorithm is to determine the weights of connections between neurons (which are used to calculate weighted sums in each neuron) in order to reduce the classification error rate. If a neural network consists of more than three levels, it constitutes a deep neural network (DNN). Scikit-learn was also used for the MLP algorithm, with the MLPClassifier function.
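The pixel-based workflow described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the project's actual code: the image tile, NDVI raster, and annotation mask are synthetic stand-ins, the attribute names (r, g, b, ndvi, class), the class names, and the `to_arff` helper are hypothetical, and the MLP hidden-layer size is an arbitrary choice. Only the use of PIL, scipy.io for ARFF reading, pandas, RandomForestClassifier with 100 trees, and MLPClassifier follows the text.

```python
import io

import numpy as np
import pandas as pd
from PIL import Image
from scipy.io import arff
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

CLASSES = ["soil", "saffron", "weed", "mouse"]  # assumed class names and order
rng = np.random.default_rng(0)

# Synthetic stand-ins for an RGB orthophoto tile, an NDVI raster, and the annotation mask.
labels = rng.integers(0, len(CLASSES), size=(20, 20))
base_colors = np.array([[120, 90, 60], [150, 60, 200], [60, 160, 60], [90, 90, 90]])
noisy = base_colors[labels] + rng.normal(0, 10, size=(20, 20, 3))
tile = Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))
ndvi = np.where(labels == 2, 0.7, 0.1) + rng.normal(0, 0.05, size=(20, 20))


def to_arff(pixels, ndvi_values, label_ids):
    """Serialize per-pixel features and labels as ARFF text (hypothetical helper)."""
    lines = [
        "@relation saffron_pixels",
        "@attribute r numeric",
        "@attribute g numeric",
        "@attribute b numeric",
        "@attribute ndvi numeric",
        "@attribute class {" + ",".join(CLASSES) + "}",
        "@data",
    ]
    for (r, g, b), value, label_id in zip(pixels, ndvi_values, label_ids):
        lines.append(f"{int(r)},{int(g)},{int(b)},{value:.4f},{CLASSES[label_id]}")
    return "\n".join(lines) + "\n"


# Export one row per pixel, then read the ARFF text back into a pandas DataFrame.
pixels = np.asarray(tile).reshape(-1, 3)
arff_text = to_arff(pixels, ndvi.ravel(), labels.ravel())
data, meta = arff.loadarff(io.StringIO(arff_text))
df = pd.DataFrame(data)
df["class"] = df["class"].str.decode("utf-8")  # nominal values are loaded as bytes

features = ["r", "g", "b", "ndvi"]
train, test = df.iloc[:300], df.iloc[300:]

# Random Forest with 100 trees, as stated in the text.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(train[features], train["class"])
rf_acc = rf.score(test[features], test["class"])

# MLP with one hidden layer; the layer size and iteration limit are assumptions.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
mlp.fit(train[features], train["class"])
mlp_acc = mlp.score(test[features], test["class"])

print(f"RF accuracy: {rf_acc:.2f}, MLP accuracy: {mlp_acc:.2f}")
```

The simple head/tail split is used here only to keep the sketch short; on real orthophotos, neighboring pixels are strongly correlated, so the split strategy deserves more care.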

Estimations
Estimating crop production is one of the most important issues for agricultural management and one of the areas in which precision farming techniques can offer the greatest benefit. The estimation step enables the visualization of the whole field in a user-friendly modeled image in which the cultivator can immediately locate the mouse traces and the weeds to be removed, and see a percentage of the estimated production per day in the cultivation.
The method used in the present study aims to create images with distinct objects which are the result of the predictions of the machine learning models. Based on the predictions made by the model for each pixel of the orthophoto of the field, an image marked with a different color for each category is created. In this way, saffron crops, weeds, and mouse traces are located and estimated, so that the features of the cultivation can be perceived more easily. Figure 5 presents the initial RGB orthophoto obtained from a field (Figure 5a) and the output of the estimation model for the particular field (Figure 5b). In the predicted image (Figure 5b), the estimated values are depicted with different colors, e.g., red for the saffron flower, green for the weeds, blue for the mouse tracks, and black for pixels associated with the soil class (anything not belonging to the three classes).
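The color-coding step described above can be illustrated with a short sketch. The colors follow the text (red, green, blue, black); the function name render_prediction, the class label strings, and the row-major order of the predicted labels are assumptions made for the example.

```python
from PIL import Image

# Colors as described in the text; the label strings are assumed names.
CLASS_COLORS = {
    "saffron": (255, 0, 0),  # red
    "weed": (0, 255, 0),     # green
    "mouse": (0, 0, 255),    # blue
    "soil": (0, 0, 0),       # black
}


def render_prediction(labels, width, height):
    """Turn a row-major list of predicted labels into a color-coded image."""
    if len(labels) != width * height:
        raise ValueError("expected one label per pixel")
    image = Image.new("RGB", (width, height))
    for i, label in enumerate(labels):
        x, y = i % width, i // width
        image.putpixel((x, y), CLASS_COLORS[label])
    return image


# Tiny synthetic prediction for a 3 x 2 patch.
predicted = ["saffron", "soil", "weed",
             "soil", "mouse", "soil"]
image = render_prediction(predicted, width=3, height=2)
```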

Evaluation
The two evaluation methods that were utilized for the above algorithms are as follows:
• 10-Fold Cross-Validation: Cross-validation [28] is a method of estimating the performance of a trained model. It is a process that makes the results more reliable, helping to draw generalized, safe conclusions about the behavior of the classifier on a data set. During the classification process, the data set is divided into a training set and an evaluation set (test set). The general idea is that the data set is divided into equal parts and an iterative process follows in which, each time, one part is used for evaluation and the rest for training, in a circular order.
• Confusion matrix: One of the basic concepts of classification efficiency is the confusion matrix [29], which depicts the model's predictions versus the ground-truth labels and gives an overall picture of the classification results. Each confusion matrix row represents the instances in a predicted class, and each column represents the instances of a real class. The table compares the predicted categories in which the data were listed to those that they actually belong to and is used in the control phase of the model. It has dimensions c x c, where c is the number of categories. The confusion matrix is a key tool for evaluating the methods used in this study. After the matrix completion, we used the three evaluation metrics described below:
o Precision: The ratio of the data that are assigned to the correct category to the total number of data that are assigned to that category (correctly or incorrectly).
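The fold splitting, the confusion matrix, and the precision metric described above can be illustrated with a small, dependency-free sketch. It follows the convention stated in the text (rows are predicted classes, columns are real classes). The function names, the contiguous fold layout, and the ten example labels are assumptions made for illustration and are not taken from the study.

```python
CLASSES = ["soil", "saffron", "weed", "mouse"]  # assumed label names


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k nearly equal folds of (train, test) indices."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


def confusion_matrix(predicted, actual, classes):
    """Rows are predicted classes, columns are real classes (as in the text)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for p, a in zip(predicted, actual):
        matrix[index[p]][index[a]] += 1
    return matrix


def precision(matrix, i):
    """Correct assignments to class i divided by all assignments to class i."""
    row_total = sum(matrix[i])
    return matrix[i][i] / row_total if row_total else 0.0


# Each fold serves once as the test set while the rest is used for training.
folds = k_fold_indices(25, k=10)

# Ten made-up instances: real labels and the labels a model assigned.
actual = ["soil"] * 6 + ["saffron"] * 2 + ["weed"] + ["mouse"]
predicted = ["soil"] * 5 + ["saffron"] + ["saffron", "soil"] + ["soil"] + ["mouse"]

cm = confusion_matrix(predicted, actual, CLASSES)
for name, row in zip(CLASSES, cm):
    print(f"{name:8s} {row}  precision={precision(cm, CLASSES.index(name)):.2f}")
```

Note that some libraries use the opposite orientation (rows for real classes, columns for predicted classes), so the matrix may need to be transposed when comparing results.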

Results
In this section, we present the results of the estimation models produced to detect (a) saffron flowers, (b) weeds, and (c) mammal traces for both the pixel-based and the object-based analysis methods. In particular, for the two analysis methods, we present the predictive power of the models based on the average values of the evaluation metrics presented in Section 3.7 for all fields participating in the pilot study. Additionally, we provide indicative orthophotos produced based on the RGB images, the NDVI index, the ground-truth images, and the estimated models for Field 1.

Pixel Based Image Analysis
In this subsection, we present the results of pixel-based analysis. Initially, the images collected from the UAV flights (performed at 120 m height) were processed with the help of the photogrammetry techniques presented in Section 3.4 to produce (a) RGB orthophotos, (b) orthophotos depicting NDVI, and (c) ground-truth images. The machine learning model was trained based on the pictures demonstrated in Figure 6, where every instance represents a pixel of the cultivation. Finally, we can calculate the number and position of the flowers of saffron and weeds and detect mouse traces.
This cultivation was preprocessed with photogrammetry techniques to produce the desired products. Figure 6a depicts the flourished Field 1 in the RGB spectrum; it was developed from the shots collected during the multiple UAV flights, as presented in Table 1. Figure 6b depicts Field 1 in the spectrum of NDVI, produced from multispectral images, where the green color represents a higher NDVI value, followed by the yellow color. The soil has a mostly yellow color, and certain points in the image with a slightly red color depict the existence of vegetation. Finally, Figure 6c represents the actual category in which each pixel is classified. This picture was magnified so as to highlight the different categories of the cultivation.
The data were summarized in one dataset, as described in Section 3.5, that was used to train the algorithms. The different values of RGB, NDVI, and the class of each pixel were joined in it.
The results after applying the two algorithms show that the Random Forest algorithm is the more efficient method, presenting an overall accuracy approaching 100% (Table 3), while the Multilayer Perceptron algorithm has an overall accuracy of 95%. Consequently, the models are able to detect the differences between the soil, the weeds, and the saffron flowers, as well as the traces left by the mammals. Another important performance metric that can be used to evaluate the applied algorithms is the average recall of the predicted values. This is because the datasets used in the analysis are evidently imbalanced: the cultivation presents many more instances of soil than instances depicting, for example, weeds or the existence of mammals. As a result, the dataset contains a large volume of soil class data compared to the other categories.
In that case, Random Forest has shown a much more efficient performance for the categories we are interested in, such as saffron flowers, weeds, and mouse traces. On the contrary, the MLP classifier overlooks classes such as weeds or mammal traces by automatically classifying them in the field category, which is in the majority.
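A small worked example shows why accuracy alone can be misleading on such imbalanced data. The numbers below are made up for illustration (95 soil pixels out of 100) and are not results from the study; recall is taken in its standard sense, i.e., the share of the real instances of a class that the model recovers, and the function names are chosen for the example.

```python
CLASSES = ["soil", "saffron", "weed", "mouse"]  # assumed label names


def accuracy(actual, predicted):
    """Share of instances whose predicted label equals the real label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def average_recall(actual, predicted, classes):
    """Unweighted mean of the per-class recall values."""
    recalls = []
    for c in classes:
        total = sum(a == c for a in actual)
        found = sum(a == c and p == c for a, p in zip(actual, predicted))
        recalls.append(found / total if total else 0.0)
    return sum(recalls) / len(recalls)


# Imbalanced synthetic ground truth: soil dominates the other classes.
actual = ["soil"] * 95 + ["saffron"] * 2 + ["weed"] * 2 + ["mouse"] * 1

# A degenerate classifier that assigns every pixel to the majority class.
predicted = ["soil"] * 100

acc = accuracy(actual, predicted)
avg_rec = average_recall(actual, predicted, CLASSES)
print(acc, avg_rec)  # 0.95 0.25
```

The degenerate classifier scores 95% accuracy while finding none of the saffron, weed, or mouse pixels, which is exactly the failure that the average recall exposes.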
A more detailed report of the behavior of the two algorithms is presented in Table 4, where each row presents the accuracy, the precision, and the recall of the estimation models derived to predict animal intrusions, weeds, and saffron flowers. These results demonstrate not only the efficiency and performance of the algorithms in this type of cultivation but also the potential of machine learning on such data. More specifically, Random Forest is able to give distinct results in every category, so that future implementations could predict the appearance of weeds or even the spots on the crop with the highest production.
Moreover, the MLP results clearly indicate that further preprocessing of the images is needed, in order to make the data that the algorithm accepts as input more distinct. Its inability to locate traces of mammals and weeds leads us to the conclusion that there is room for even more detailed study at that stage. This can be seen in Figure 7b, where RF, the more accurate model, locates more weed and mouse points in the same area than MLP.
In addition, Table 5 lists the rate of existence of each type of pixel represented in the image of Field 1 (this field represents a flourished instance of a field; therefore, it is ideal to use it as a benchmark for the level of production in saffron). Figure 8 presents magnified and clearer pictures of the part of the original cultivation with the predicted values of the models. Through these pictures, it is possible to estimate the position of every category in the field and the amount of production. The results in Figure 8 show that saffron flowers account for 8% of the field, weeds for 0.4%, and mouse traces for 8%, based on the RF classifier, which was the most accurate model.
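The rate of existence of each category can be derived directly from the per-pixel predictions. The sketch below is illustrative only: the function name class_shares is an assumption, and the label counts are made up so that they reproduce the shares quoted above (8%, 0.4%, and 8%) on a hypothetical patch of 1000 pixels.

```python
from collections import Counter


def class_shares(labels):
    """Return the percentage of pixels assigned to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Made-up predictions for a 1000-pixel patch, not data from the study.
predicted = ["soil"] * 836 + ["saffron"] * 80 + ["mouse"] * 80 + ["weed"] * 4

shares = class_shares(predicted)
for name, share in sorted(shares.items()):
    print(f"{name:8s} {share:5.1f}%")
```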

Object-Based Image Analysis
In this section, the results of the object-based image analysis are presented. This method uses, in addition to the image types shown in Figure 4, the data from the raster file generated by the segmentation algorithm applied to the RGB image of the crop (Figure 9). This image shows the different objects that the segmentation algorithm (3.5-RSGISLib) detects in the RGB orthophoto. In this way, Figure 9 shows all the alternative objects found on the cropping area. The result of applying the OBIA algorithm is a new dataset where each instance corresponds to an object recognized by the segmentation algorithm.
This approach is recognized to be much more efficient and less time-consuming. The difference in the data set is that each object recognized by the Shepherd algorithm groups all the pixels in bins of the same category. From a set of millions of pixels, the algorithm gives us a few thousand different objects which we categorized into the four respective categories of interest (i.e., traces of animals, weeds, saffron flowers, and soil). At the same time, we calculated the average of RGB and NDVI values of all the pixels that belong to the same category. In this way, a smaller set of data that was much more goal-oriented was created.
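The reduction from pixels to objects can be sketched as follows. The example assumes that the segmentation step yields an integer clump identifier for every pixel and averages each band per clump, which is one reading of the averaging step described above. The function name clump_means, the array layout, and the values are assumptions; the sketch does not call RSGISLib.

```python
import numpy as np


def clump_means(clump_ids, bands):
    """Average each band over the pixels of every clump.

    clump_ids: 2-D array of non-negative integer clump identifiers.
    bands: dict mapping a band name to a 2-D array of the same shape.
    Returns (pixel count per clump, dict of per-clump mean values).
    """
    ids = np.asarray(clump_ids).ravel()
    counts = np.bincount(ids)
    means = {}
    for name, band in bands.items():
        sums = np.bincount(ids, weights=np.asarray(band, dtype=float).ravel())
        means[name] = sums / np.maximum(counts, 1)  # avoid division by zero
    return counts, means


# Synthetic 2 x 3 patch with three clumps (identifiers 0, 1, 2).
clump_ids = np.array([[0, 0, 1],
                      [2, 1, 1]])
bands = {
    "red": np.array([[10, 20, 30],
                     [40, 50, 70]]),
    "ndvi": np.array([[0.1, 0.3, 0.5],
                      [0.2, 0.7, 0.6]]),
}

counts, means = clump_means(clump_ids, bands)
print(counts)          # pixels per clump
print(means["red"])    # mean red value per clump
print(means["ndvi"])   # mean NDVI per clump
```

Each clump then becomes one instance of the dataset, which is how millions of pixels shrink to a few thousand objects.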

The results of the object-based image analysis, as described in Section 3.5, showed an overall accuracy of 80% for RF and 76% for MLP (Table 6). Nevertheless, both models had low average recall rates. Specifically, the average recall of RF was 40% and that of MLP was 35%. This low performance of the two models is probably due to an incorrect configuration of the segmentation algorithm to detect the small flowers of saffron. Figure 8 shows the predicted images produced by both algorithms.

Table 6. Object-based image analysis results.
Model    Accuracy    Precision    Recall
RF       80%         61%          65%
MLP      76%         56%          57%

Table 7 presents the evaluation metrics for every estimation model produced by the two algorithms applied. We can observe that both algorithms, in the context of object-based analysis, present limited estimation power. This is a consequence of the small size of our data and the fact that the original pictures were not precisely annotated by the experts (saffron cultivators).
At the same time, we must keep in mind that saffron flowers, weeds, and even mammal tracks are small objects for this dataset, so a segmentation method such as this one may not be able to detect them, since it has not yet been applied to plants with such a small number of pixels; instead, it has been applied to buildings, trees, and geographic areas that objectively span more pixels [30][31][32][33].
However, we can conclude that the RF classifier was much more efficient than the MLP. This is even more evident from the predicted image, as shown in Figure 10. The MLP classifier fails to classify the important categories, resulting in a blackened image showing a non-flowering crop. Further evidence is presented in Table 8, where the rate of each category of RF and MLP deviates from the real data.
[Figure panels: original RGB image; part of the original image with ground-truth data; Random Forest predicted image; Multilayer Perceptron predicted image.]

Discussion
This section further interprets the results of the pilot study performed on the adoption of UAV technologies for monitoring saffron cultivations. In addition, the limitations of the applied methods are presented, along with future directions for researchers and practitioners of this area.

Interpretation of Results
The methodology followed included the application of two different image-analysis methods (pixel-based and object-based) and the training of two machine learning models (Random Forest and Multilayer Perceptron) on a set of field images obtained by the observation of six cultivation fields. All images obtained were annotated by the cultivators, who pointed out four field attributes (saffron, weeds, mammal traces, and soil). The dataset used to train the model was a saffron crop field for which the images were obtained by UAV flights at a height of 120 m. The main conclusions from this pilot study are as follows:
• The estimation accuracy of both pixel-based and object-based methods was promising, with pixel-based methods presenting overall higher accuracy. The average accuracy rate for the Random Forest (RF) algorithm was at 100%, and for the Multilayer Perceptron algorithm, it was at 95%. The object-based analysis, despite the weaknesses that existed, also presented encouraging results with respect to accuracy (80% for RF and 76% for MLP).
• It is important to note that, in order to evaluate machine learning algorithms, we need to use other performance metrics, such as the recall metric, in addition to the accuracy metric. This is very important because our dataset is not balanced; that is, the soil category outweighs the other categories (saffron, weed, and mammal). Thus, the basic measure for evaluating the algorithms is the recall percentage for each category. The recall rates were at a medium level for both ML algorithms.
• Regarding the object-based analysis method, it is important to mention that it presented relatively low percentages in recognizing each class separately.
• It should also be noted that the pixel-based method gave much better results than the object-based method on higher flights. In this regard, we can estimate that the pixel-based method, although time-consuming, has the potential to improve the detection of weeds, saffron, and, more generally, various types of diseases.
• The results of this study are encouraging. It is possible to estimate the production of saffron, to detect weeds, and to detect the existence of mammals through images collected by UAV. In particular, by applying these ML models to real-time data, it is possible to detect and even predict from a single image the accurate position of the weed to be removed, the amount of production, and the total resources needed to grow and protect a saffron cultivation.

Limitations of the Study
In this section, we discuss the limitations that we have identified for this study. Regarding conclusion validity, which refers to how reasonable the findings of the analysis are, and in particular the statistical power of the results, we experimented with two machine learning techniques and a set of accuracy metrics in order to validate the methodology used. The model comparisons based on statistical tests, as described in Section 3.7, showed a high level of accuracy in the adopted methodology. Furthermore, regarding the heterogeneity of data, we used images from six different saffron fields to ensure the relevance and variability of the data included in the analysis. The field operations that we selected to monitor in this study were specified by the cultivators themselves. Therefore, we believe that the findings are relevant, without excluding future experimentation in other field operations related to saffron cultivations, such as the place and position of seed planting.
Concerning reliability, we believe that our research can be safely replicated and that the overall reliability is ensured. The process that was followed in this study was thoroughly documented in Section 3, so it can be easily reproduced by any interested researcher. In any case, future verification of the accuracy of the machine learning algorithms used would be valuable.
Concerning the external validity and, in particular, the generalizability of the findings, changes might occur if the settings of the field study are altered, e.g., monitoring of different fields, more precise annotation, flights from lower heights, or adoption of different types of sensors. Future replications of this study with different settings are encouraged in order to improve the overall efficiency of the monitoring process.

Future Directions
Under this scope, the overall findings of the adopted methodology and of the pilot study performed for monitoring saffron cultivations can offer useful future directions to both researchers and cultivators.
Researchers are encouraged to engage in the following:

• Further investigate the object-based methodology applied in this type of cultivation so as to more effectively apply the segmentation algorithm in order to recognize the smallest objects of the image. In this direction, it is possible to collect much more information based on the morphology of the ground, or even investigate a more suitable parameterization of the segmentation algorithm. Additionally, in the future, researchers can experiment on a fusion approach of the pixel-based and object-based methods that might provide satisfactory results.
• Perform additional studies in saffron-crop monitoring that will take into consideration the findings of the present study in order to achieve improved estimation accuracy. Researchers can change the parameters of the monitoring process and experiment in different saffron fields, use different types of sensors to make more detailed annotations, etc.
• Compare data of UAV flights at different heights, so as to verify the accuracy of the two models and come up with the optimum height for the saffron cultivation. In this study, we experimented with three flight heights: (a) 400 m, (b) 120 m, and (c) 12 m. The first height (400 m) in the case of saffron cultivations cannot provide useful information, since the flower is very close to the ground and the accuracy of the image is too low to provide information. The second height is ideal for performing long-distance flights that can also provide images that can serve as input to the estimation models, while the third height (12 m) is ideal for acquiring images for training the estimation models. In the last case, the low height gives us access to high-accuracy images that can be precisely annotated by the experts.
Cultivators are encouraged to engage in the following:
• Adopt the presented monitoring process, as the average accuracy of the produced estimation models is overall promising. Since the recall rates of animal intrusion detection were not so high, cultivators are encouraged to collect more spectral information for cultivation to determine the presence of mammals, their extent, and the amount of damage they cause. This information can be thermal images, the use of vegetation indices for moisture, and the information of NIR (near infrared) or RE spectra (red edge).
• Give more attention to the annotation of the images so as to avoid classifying the different types of field attributes into wrong classes. As domain experts, they need to carefully interpret the images and provide the related knowledge to the annotation tools that will be used to train the estimation models.

Conclusions
To sum up, in this study, we present the results of the pilot application of UAVs for monitoring saffron cultivations. We used a model training process for pixel and object recognition and estimation in saffron cultivations. UAV photogrammetry techniques were initially employed to collect and process images taken from the fields. We performed 62 flights over 6 fields, producing a total of 1863 photos. Afterward, two different image-processing methods, pixel- and object-based analyses, were implemented to create datasets that were used as inputs for two different machine learning algorithms, RF and MLP. Both methods presented satisfactory estimation accuracy, with the pixel-based method outperforming the object-based method. In more detail, the pixel-based method resulted in 100% accuracy with the RF algorithm and 95% with MLP, in contrast to the object-based method, which had 80% and 76%, respectively. As for the learning algorithms, RF presented better accuracy (100%-80%) than MLP (95%-76%) in both cases. The results are highly dependent on the initial processing of the images. More specifically, we advise practitioners to annotate the images in more detail, perform flights at lower heights, and obtain images periodically to increase the efficacy of the estimation models. Additionally, we advise researchers to further work on object-based analysis methods in an attempt to increase the accuracy of the methods when applied for the identification of the smallest objects within a field. Finally, the study's findings show that the utilization of such methods can lead to the recognition of object categories with a high success rate.
Funding: This research was co-funded by the European Union and Greek National Funds through the Operational Program Competitiveness, Entrepreneurship, and Innovation, grant number T1EDK-04873, project "Drone innovation in Saffron Agriculture", DIAS.