Data Lifecycle Management in Precision Agriculture Supported by Information and Communication Technology

Featured Application: This paper conducts a literature review on information and communication solutions supporting data lifecycle in precision agriculture.
Abstract: The role of agriculture in environmental degradation and climate change has been at the center of a long-lasting and controversial debate. This situation combined with the expected growth in crop demand and the increasing prices of fertilizers and pesticides has made the need for a more resource-efficient and environmentally sustainable agriculture more evident than ever. Precision agriculture (PA), as a relatively new farming management concept, aims to improve crop performance as well as to reduce the environmental footprint by utilizing information about the temporal and the spatial variability of crops. Information and communication technology (ICT) systems have influenced and shaped every part of modern life, and PA is no exception. The current paper conducts a literature review of prominent ICT solutions, focusing on their role in supporting different phases of the lifecycle of PA-related data. In addition to this, a data lifecycle model was developed as part of a novel categorization approach for the analyzed solutions.


Introduction
Modern agricultural techniques have been strongly criticized for their negative environmental impacts. Examples of such impacts include [1]:
• Loss of biodiversity among plants and animals caused by monocultures;
• Soil and groundwater pollution due to the use of chemical pesticides and fertilizers;
• Soil eroding at a much faster pace than it can be replenished;
• Fish die-offs;
• Use of water and fossil fuels at unsustainable rates.
In the next few decades, the global population is expected to grow substantially, followed by an increase in global food demand [2]. The prices of fertilizers and pesticides are also expected to rise. Precision agriculture (PA) (also known as site-specific management, precision farming, prescription farming, etc.) [3] can make an important contribution to addressing the aforementioned challenges. It can also serve as part of an environmentally sustainable agricultural system while still maintaining profitability [4].
1. Describe: This element stresses the importance of metadata based on standards as well as sufficient documentation for all lifecycle stages. This leads to fewer errors and facilitates current and future use of data.
2. Manage Quality: The second cross-cutting element refers to quality assurance (QA) and quality control (QC) measures for all lifecycle stages.
3. Backup and Secure: The third cross-cutting element underlines the importance of preventing physical data losses and promotes secure data management methods.
Another popular data lifecycle model has been developed by the Digital Curation Centre (DCC). This model offers a generic and highly adaptable approach, which can be used for various purposes (e.g., as a training solution for data creators, for planning and organizing resources, for digital assets risk management). This model consists of: full lifecycle actions (description and representation of information, preservation planning, community watch and participation, curation and preservation), sequential actions (conceptualizing, creating and receiving, appraising and selecting, ingesting, preserving, storing, accessing, using and reusing, transforming), and occasional actions (disposing, reappraising, migrating) [14]. The data lifecycle model proposed by the University of Deusto provides an interesting approach mainly focusing on data management in smart cities [15]. This model encompasses eight lifecycle stages, i.e., discovery, capture, curate, store, publish, linkage, exploit, and visualize. The so-called comprehensive scenario agnostic data (COSA) data lifecycle model, which can be used in many different scientific fields, stresses the importance of data quality in all lifecycle stages and aims to address several data-related challenges. It has three main pillars: (1) data acquisition (including data collection/filtering/quality/description), (2) data processing (encompassing data processing/quality/analysis), and (3) data preservation (including data classification/quality/archive/dissemination). Several other data lifecycle models can be found in the scientific literature. However, further analysis of such models falls outside the scope of this paper.

Proposed Data Lifecycle Model
The data lifecycle model developed and adopted in the context of this paper is depicted in Figure 1. Based on the data lifecycle stages of the USGS model, our model has three constituent elements:
1. Data Collection and Internet-of-Things (IoT): The first element is responsible for directly or indirectly collecting existing data and/or generating new data. Furthermore, it encompasses managing the sources (e.g., data from sensors, databases, historical data) for the data collection. This element also underscores the vital importance of IoT technologies in data collection and in PA as a whole. The term IoT refers to interrelated computers and everyday objects, which can transmit and receive data over a network and often incorporate ubiquitous intelligence [16].
2. Data Analysis and Artificial Intelligence (AI): The second element involves processing of data from various sources as well as extracting valuable knowledge and generating added value from data. It also refers to transforming raw data to a more sophisticated form in order to facilitate subsequent analysis and/or integration. AI has a central role in this element and covers a range of different techniques (e.g., computer vision, fuzzy logic, evolutionary algorithms, machine learning, semantic processing).
3. Data Storage and Distribution: The third element refers to the necessary processes and resources for permanently or temporarily storing sets of data as well as for providing end-users with open or restricted access to data.
Assuring quality, privacy, and security in all data lifecycle stages is a top priority for PA. Therefore, a cross-cutting element titled "Security and Quality Assurance" is included in our proposed model. We followed a compact approach for the model so that it can easily be tailored to the needs of academic users from different scientific fields as well as of various other stakeholders. This compact approach offers compatibility and easy mapping with respect to the main existing data lifecycle models, as studied and presented in Section 2.1, while offering better generalization compared to more finely tuned approaches.

Categorization Method
For the classification of the analyzed PA-related solutions, we developed a two-fold approach based on the data lifecycle model presented above as well as on the technological categorization of the solutions (38 technological subcategories have been defined). The categorization was performed according to the main focus point(s) of each solution. As many solutions encompass different technologies and/or data lifecycle stages, finding a solution in one category does not exclude the possibility of this solution using and/or facilitating technologies from other categories as well.
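As an illustration of the two-fold categorization described above, the following minimal Python sketch tags each solution with its main lifecycle stage(s) and one technological subcategory. The class and function names, the example labels, and the rule of placing multi-stage solutions in a separate group are assumptions made for this example; they are not taken from the reviewed works.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """The three constituent elements of the proposed lifecycle model."""
    COLLECTION_IOT = "Data Collection and IoT"
    ANALYSIS_AI = "Data Analysis and AI"
    STORAGE_DISTRIBUTION = "Data Storage and Distribution"


@dataclass(frozen=True)
class Solution:
    label: str          # hypothetical identifier, e.g. a citation key
    stages: frozenset   # main lifecycle stage(s) the solution focuses on
    subcategory: str    # technological subcategory (free text in this sketch)


def group_by_stage(solutions):
    """Map each lifecycle stage to the labels of the solutions focusing on it."""
    groups = defaultdict(list)
    for s in solutions:
        # Assumption: a solution with several main stages gets its own group.
        if len(s.stages) == 1:
            key = next(iter(s.stages)).value
        else:
            key = "Multiple stages"
        groups[key].append(s.label)
    return dict(groups)


# Hypothetical entries, for illustration only.
examples = [
    Solution("A", frozenset({Stage.COLLECTION_IOT}), "WSN"),
    Solution("B", frozenset({Stage.COLLECTION_IOT, Stage.ANALYSIS_AI}), "machine learning"),
]
print(group_by_stage(examples))
```

Keeping the stages as a set rather than a single value reflects the remark that a solution listed in one category may still use technologies from other categories.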

Information and Communication Technology Solutions for Precision Agriculture
In the context of this literature review, after identifying the main goals and the scope of the present paper, we used the Google Scholar web search engine and the SCOPUS academic database to search for publications (academic articles) on our topic. There is an increasing interest in solutions supporting the data lifecycle in PA, as depicted in Figure 2. Specifically, Figure 2 depicts the number of relevant scientific publications indexed in the SCOPUS database for the years 2016, 2017, 2018, and 2019. The following SCOPUS database query was used: TITLE-ABS-KEY("Precision agriculture" AND ("data management" OR "data analytics" OR "data storage" OR "data collection" OR "data analysis" OR "data lifecycle")).
Out of the 112 papers reviewed, we analyzed 70 ICT solutions that support various stages of the data lifecycle in PA. The remaining 42 papers were excluded from our analysis due to their similarities with other papers and/or their focus on solutions other than PA. A snowballing process was also employed for certain solutions, where papers cited by the authors of the reviewed solutions were also analyzed. Based on our categorization method described in Section 2 as well as on the interconnections among the reviewed papers, we present the analyzed solutions in three subsections (Data Collection and IoT, Data Analysis and AI, and Data Storage and Distribution), complemented by a fourth subsection including solutions which focus on two or more stages of the data lifecycle model. Comments on results, conclusions, and potential future research can be found in Section 4. A diagram describing the different stages of this literature review is provided in Figure 3.

Data Collection and IoT
Various data collection and IoT technologies (e.g., wireless sensor networks (WSN), geographic information systems (GIS), satellite communications) were adopted within several solutions analyzed for the purposes of this paper. The term GIS refers to computer systems for storing, analyzing, displaying, and managing geospatial data [17].

Satellite Communications
The use of receivers mounted on field equipment to acquire real-time geospatial information plays a catalytic role in PA. Stombaugh underscores the importance of global navigation satellite systems (GNSS) in field sampling and vehicle navigation as well as in optimizing the use of fertilizers and pesticides. He also highlights the advantages in accuracy and complexity of GNSS systems over older positioning technologies used in agriculture [18]. Saiz-Rubio et al. refer to several artificial satellites (e.g., American Landsat satellites, European Sentinel-2 satellite system, GeoEye-1 system, WorldView) which provide multispectral imaging information. These artificial satellites can be very beneficial for PA (e.g., by providing information about soil and water cover, using the normalized difference vegetation index (NDVI)) [19]. NDVI is a vegetation index which utilizes the visible and near-infrared spectra to detect live vegetation. It can be used for various purposes (e.g., for assessing the impacts of forest/land fires and floods, predicting land deterioration, detecting occurrences of drought) [20].
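NDVI is commonly computed per pixel as (NIR - Red) / (NIR + Red), where NIR and Red are the reflectances in the near-infrared and red bands; for non-negative reflectances the result lies between -1 and 1, with higher values indicating denser live vegetation. A minimal Python sketch follows. The function name, the sample reflectance values, and the handling of a zero denominator are assumptions of this example.

```python
def ndvi(nir, red):
    """Return the NDVI of one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is reported as 0.
        return 0.0
    return (nir - red) / total


# Hypothetical reflectance pairs (nir, red), for illustration only.
pixels = [(0.50, 0.10), (0.20, 0.20), (0.05, 0.30)]
for nir, red in pixels:
    print(nir, red, round(ndvi(nir, red), 3))
```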

Internet of Things and Wireless Sensor Networks
Solutions based on IoT technologies and WSNs can prove useful for PA. Keswani et al. propose a solution for optimized irrigation, which utilizes various sensors (including soil moisture probes, soil temperature sensors, ambient temperature sensors, humidity sensors, CO2 sensors, and light-dependent resistors) to acquire real-time measurements. Based on these measurements, the precise management of a water valve is achieved using neural network-based prediction of the soil water requirements [21]. Ferrández-Pastor et al. propose an IoT-based low-cost network platform for sensors and actuators. This platform integrates machine-to-machine (M2M) and human-machine interface (HMI) protocols. Edge computing technologies based on the aforementioned multi-protocol approach are also used to develop control processes for several PA scenarios. The experimental results regarding this platform indicated several benefits (e.g., reduction in costs and energy requirements, increased acceptance by agricultural specialists) [22]. Foughali et al. propose a cloud IoT-based solution for disease prevention in PA (further information about cloud computing can be found in the following subsection). In this solution, sets of temperature and humidity sensors send their measurements to a cloud platform through a gateway. These measurements are stored and analyzed in the cloud platform, and a decision-support system notifies the farmer through short message service (SMS) when the first attack of the plant disease takes place. Some of the benefits of this solution include increased efficiency, cost reduction, and minimization of the environmental impact by estimating the exact fungicide quantities required [23]. Palazzi et al. present an autonomous leaf-mounted radio-frequency identification (RFID) wireless temperature sensor for PA. With a weight of less than 3 grams, this sensor can be installed on a leaf to measure leaf-to-air differential temperature. This measurement is used to monitor the water stress levels of plants. The energy requirements of the sensor are covered by a flexible solar panel, and the wireless read range is approximately 3 m. A field evaluation of the proposed sensor demonstrated its capability of discriminating water level variations in plants and thus its suitability for PA [24].
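To make the idea of a leaf-to-air differential temperature concrete, the sketch below computes the difference between leaf and air temperature and flags readings above a threshold. It is only an illustration: the function names, the sample readings, and the 2.0 °C threshold are hypothetical and are not taken from Palazzi et al.; a real deployment would need a crop-specific calibration.

```python
def leaf_air_delta(leaf_temp_c, air_temp_c):
    """Leaf-to-air differential temperature in degrees Celsius."""
    return leaf_temp_c - air_temp_c


def flag_water_stress(readings, threshold_c=2.0):
    """Return one boolean per (leaf_temp_c, air_temp_c) reading.

    Assumption: a leaf noticeably warmer than the surrounding air is treated
    as a sign of possible water stress. The threshold is a placeholder.
    """
    return [leaf_air_delta(leaf, air) > threshold_c for leaf, air in readings]


# Hypothetical readings as (leaf temperature, air temperature) pairs.
readings = [(24.0, 25.0), (29.5, 26.0)]
print(flag_water_stress(readings))
```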

Data Analysis and Artificial Intelligence
Various data analysis and AI tools and technologies supporting PA were reviewed for the purposes of this paper, encompassing big data analytics, computer vision, data fusion/reconstruction, recommender systems, evolutionary algorithms, fuzzy logic, granulation/interpolation techniques, machine learning, heuristic/decomposition/denoising/pattern-matching algorithms, semantic processing, predictive models, simulation/visualization tools, and others.

Data Analysis for the Determination of Management Zones
Being able to determine spatial and temporal variability in crops is of paramount importance for PA, as mentioned in the introduction. To this end, management zones are adopted to define subregions in a given field, which are characterized by similar soil, landscape, and yield-limiting factors [25]. In these subregions, farmers can employ homogeneous management practices in terms of irrigation, use of pesticides and fertilizers, planting density, tillage, soil sampling, etc. The use of management zones leads to increased efficiency and profitability and supports environmental sustainability.
Geostatistical and clustering models, interpolation methods, and GIS are particularly useful for the calculation of the management zones. Buttafuoco et al. underline the contribution of kriging interpolation methods in calculating plausible management zones for PA. These geostatistical methods help overcome several difficulties in identifying subfield areas. Such difficulties are often posed by a complex combination of factors, which can influence the effectiveness of specific inputs (e.g., fertilization, irrigation, use of pesticides) [26]. Ohana-Levi et al. demonstrate a weighted multivariate spatial clustering model for the determination of irrigation management zones. The specific model is based on machine learning and spatial statistics. A comparison with other clustering techniques showed a significant advantage of this model regarding the separability of points and their spatial distribution [27]. Kingsley et al. analyze a solution in which geostatistical models based on GIS are used for the predictive mapping of soil properties. This solution had important benefits over existing tedious, time-consuming, and expensive soil analysis methods. Six geostatistical interpolation methods were utilized for seven soil properties (total nitrogen, soil pH, soil organic carbon, available phosphorus, sand, clay, and cation exchange capacity), drastically improving the resolution of the soil maps by providing detailed information. The results can also be utilized by smallholder farmers to increase their production and reduce their irrigation requirements [28].
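Kriging requires fitting a variogram model and does not fit in a short example. As a much simpler stand-in that shows the general idea of estimating a soil property at an unsampled location from nearby point samples, the sketch below uses inverse distance weighting (IDW). IDW is a different technique from the kriging methods discussed above, and the function name, the power parameter default, and the sample values are assumptions made for illustration.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from a non-empty list of (xi, yi, value).

    Each sample is weighted by 1 / distance**power, so closer samples
    contribute more to the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query point coincides with a sample
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical soil pH samples at field coordinates in meters.
samples = [(0.0, 0.0, 6.0), (10.0, 0.0, 7.0), (0.0, 10.0, 6.5)]
print(round(idw(samples, 5.0, 5.0), 3))
```

Estimates on a regular grid could then be grouped into management zones, for example by thresholding or clustering the interpolated values.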
Groundwater models are also of great importance for PA, as they simulate and predict aquifer conditions by utilizing various hydraulic parameters and input datasets. Meta-heuristic algorithms are used for determining optimal and near-optimal parameters in groundwater models. Haddad et al. compare two meta-heuristic algorithms, i.e., particle swarm optimization and pattern search. The results of this comparison indicated that the latter approach offered a more accurate and efficient calibration of aquifer parameters [29].
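The following sketch illustrates the general idea of a pattern search (here a basic compass-style variant) used for parameter calibration. It is not the implementation compared by Haddad et al.: the objective function is a toy misfit standing in for the difference between simulated and observed aquifer heads, and all names, starting values, and tolerances are assumptions of this example.

```python
def pattern_search(objective, start, step=1.0, tol=1e-6, max_iter=10_000):
    """Minimize objective(point) by probing +/- step along each coordinate.

    The step is halved whenever no probe improves the current best value,
    and the search stops once the step falls below tol.
    """
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    point, best = trial, value
                    improved = True
        if not improved:
            step /= 2.0
    return point, best


def toy_misfit(params):
    """Hypothetical misfit with its minimum at conductivity 3.0, storage 0.5."""
    conductivity, storage = params
    return (conductivity - 3.0) ** 2 + (storage - 0.5) ** 2


calibrated, misfit = pattern_search(toy_misfit, [0.0, 0.0])
print(calibrated, misfit)
```

Unlike particle swarm optimization, this kind of search is deterministic and needs no gradient information, which makes it easy to wrap around an existing simulation model.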

Predictive Models and Decision Support
Decisions regarding farm management can have large economic and environmental footprints. Therefore, predictive and decision-support technologies are particularly useful for PA. Technologies and tools which can support decision-making include big data analytics, machine learning (encompassing deep learning, neural networks, and support vector machines (SVMs), among other technologies), predictive models, satellite imagery, sensors measuring various data, granulation techniques, IoT, etc. The term big data describes large-volume, complex, and ever-growing datasets from diverse and sometimes autonomous sources. Several definitions refer to data variety, velocity, and volume (also known as "3Vs") as constituent parts of big data. A similar "4V" model complements the "3V" model with data veracity as the fourth "V", while the "5V" model also uses data value as the fifth "V" [30]. Machine learning refers to tools and methodologies aiming to enable computational applications to modify and/or adapt their actions so that they become more accurate and closer to the desired result [31].
Bendre et al. demonstrate a rainfall prediction model using big data analytics and neural networks. Based on the results of the research, the authors emphasize that the specific model has the potential of improving the prediction accuracy as well as of facilitating the decision-making of farmers about crop pattern and water management [32]. Ruan et al. propose a novel predictor for big data in PA cyber-physical systems (CPSs). A CPS can be defined as a system integrating computation with physical processes. Unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) are two popular examples of CPSs [33]. Utilizing granulation techniques, genetic algorithms, and SVMs, this solution addresses the problem of reduced efficiency of similar predictors in handling large-scale data. The proposed predictor can facilitate decision-making for management and/or planting issues. Testing of the solution indicated satisfactory computation efficiency and prediction accuracy [34]. Newlands et al. demonstrate a novel forecasting method which can be used for predicting agricultural crop yield by integrating agroclimate and remotely-sensed indices. The integration is conducted within a probabilistic Bayesian framework, incorporating data and model structural uncertainty. The particular method utilizes random forest-tree machine learning techniques as well as Markov chain Monte Carlo simulation. The proposed technique offers an accurate, statistically robust, and flexible tool for yield forecast at a regional scale [35]. Bendre and Manthalkar propose a framework which can be used for predicting future weather conditions based on big data from weather stations. Neural network techniques were utilized, and a time series-based decomposition technique was also developed. The predicted results can be used for evaluating the forecasting performance of weather stations as well as for preventing yield loss caused by extreme weather events [36]. Peerlinck et al. demonstrate a model for yield and protein prediction which utilizes deep neural network techniques. This model can be used for the optimized application of fertilizers which can increase profits and reduce the environmental impact. The proposed deep learning-based methodology was found to offer increased accuracy compared to other methods used for protein and yield prediction (e.g., multiple regression, shallow feed-forward networks) [37].
Risk management systems are strongly interrelated with decision-support systems and mainly refer to processes which identify existing risks and plan actions to mitigate these risks [38]. Li et al. propose an IoT-based risk management system for solar greenhouses. This solution includes disaster forecasting (e.g., fog, haze, and cold temperatures) and utilizes early warning indicators, including low temperature, sparse sunlight, and diseases induced by unfavorable environmental conditions. The specific system aims to provide a framework for analyzing relationships between the vegetable damage dynamics and the meteorological events. Furthermore, it can also support decision-making and encourage farmers to use greenhouse environment control systems, thus improving the quality of their products [39]. Řezník et al. analyze how geospatial big data processing can play a key role in the prevention and the mitigation phases of disaster risk reduction in the agricultural sector. In particular, farm machinery telemetry data, agrometeorological observation data from wireless sensor networks, as well as spatial variability and crop status data from satellite imagery were found to be particularly useful not only for the main purposes of PA but also for disaster risk reduction [40].

Computer Vision
Computer vision is the main constituent element of many ICT solutions for PA. The main idea behind computer vision is to enable computers to understand diverse environments based on visual information. Image acquisition and image processing/analysis are the main components of computer vision systems. Image acquisition refers to devices (e.g., cameras) transforming electronic signals from a sensor to numerical representations. Image processing and analysis revolve around manipulating images or videos (e.g., to improve their quality and facilitate subsequent analysis) and extracting useful knowledge from them [41,42].
After rice transplantation has taken place, rice plant numbers and density are key factors for measuring yield and quality of rice grains. Md Nasim et al. propose an image processing technique based on morphological operation for automatically determining the number of rice plants. For this purpose, images collected by UAVs using satellite navigation are analyzed. The results of the research indicated high efficiency of the proposed image processing technique [43]. Firdaus et al. analyze how the use of convolutional neural networks and genetic algorithms in satellite image processing can facilitate PA. The results of their research showed that these technologies are beneficial for maximizing the economic factor as well as for minimizing CO2 emissions and land degradation [44]. Yu et al. analyze a computer vision method for fruit detection, which can be utilized by harvesting robots. The method is based on mask region convolutional neural network (mask RCNN) technologies. The specific approach enabled the accurate categorization of fruits (ripe or unripe), marked regions with bounding boxes, and indicated the visual location of the fruit picking points by analyzing shape and edge features of the mask images generated by the method [45]. Measuring the surface soil moisture content is vital for assessing soil-plant-water interactions and studying climate change as well as for water budget closure. Shafian et al. propose a new approach for estimating the soil moisture content, based on multispectral satellite imagery. The proposed moisture index uses raw image digital count (DC) data in the thermal infrared, red, and near-infrared bands [46]. Alves and Cruvinel demonstrate a big data environment for soil analysis based on computed tomography (CT) images, which can contribute to the better understanding of problems of agricultural lands as well as to improving productivity and sustainability of crops. In the proposed solution, 3D reconstruction and statistical analysis algorithms are also employed [47]. Bah et al. demonstrate a deep learning-based model for detecting weeds, which uses images taken by UAVs. This model uses convolutional neural networks with unsupervised training datasets. The proposed solution was found to be flexible and provided results similar to those of other methods using supervised datasets while not being dependent on the time-consuming task of the creation of large agricultural training datasets with annotations at a pixel level by experts [48].
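As a simplified illustration of how morphological operations can support plant counting, the sketch below erodes a binary mask with a plus-shaped structuring element and then counts connected regions. It is not the pipeline of Md Nasim et al.: the function names, the structuring element, and the tiny hand-made mask are assumptions, and real UAV imagery would first need segmentation into plant and background pixels.

```python
def erode(mask):
    """Binary erosion with a plus-shaped element; pixels outside count as 0."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0

    def on(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1

    return [
        [1 if on(r, c) and on(r - 1, c) and on(r + 1, c)
              and on(r, c - 1) and on(r, c + 1) else 0
         for c in range(cols)]
        for r in range(rows)
    ]


def count_blobs(mask):
    """Count 4-connected groups of 1-pixels using an iterative flood fill."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Hypothetical mask: two 3x3 "plants" joined by a one-pixel bridge.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(count_blobs(mask), count_blobs(erode(mask)))
```

In this toy mask the two plants touch and are counted as one region before erosion; erosion removes the thin bridge so that they are counted separately.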
Data fusion and evaluation/benchmarking techniques are frequently used in computer vision-related technologies for PA. The term data fusion refers to merging two or more representations of the same object into a single, clear, and consistent representation [49]. Comba et al. present a new data fusion methodology which can be used for describing the canopy status variability. This methodology exploits information from multiple sources including: 2D multispectral imagery from UAVs, 3D point cloud crop models, and thermal imagery from UAVs. The proposed methodology was found to have higher discriminant power than methodologies utilizing a single source of data [50]. Xu et al. propose a model for rapid prediction of soil classes based on the outer product analysis data fusion algorithm. Data from visible near infrared and mid-infrared spectra were fused. This data was used together with other soil information (color, pH, organic matter, texture) for providing soil classification based on a support vector machine model. The proposed data fusion model was found to offer improved soil classification results as compared to the results using a single source of data [51].
Zheng et al. present a crop species classification and detection dataset. This dataset can provide the data benchmark to construct classification and detection models based on deep learning techniques. The aforementioned models can be utilized by agricultural picking robots to identify and classify crops more efficiently. This dataset was found to provide improved accuracy over other existing datasets [52]. Haug and Ostermann demonstrate a benchmark dataset which can be used for the evaluation of machine vision tasks in PA. The proposed evaluation metrics can be used for several segmentation and specification tasks, encompassing crop/weed discrimination, phenotyping of a single plant, etc. The main goal of this dataset is to simplify and improve complex procedures used in PA [53].

Visualization Techniques
Visualization techniques are frequently used in conjunction with computer vision technologies. Data visualization revolves around representing (e.g., through diagrammatic/pictorial representations) large amounts of data in order to enhance situation awareness. Augmented reality (AR) and virtual reality (VR) are two technologies gaining increasing traction nowadays. Both of them utilize data visualization methods. AR encompasses linking physical and virtual objects (e.g., in a computer or a smartphone screen). VR refers to submerging the user in a full three-dimensional experience, where physical objects are linked to the virtual world [54]. Tan et al. demonstrate an extensible architecture for data visualization and analysis in PA. This architecture has three main constituent elements: (1) a module for importing data from various sources, (2) a data-processing and visualization subsystem, and (3) an overall client-server architecture design. The proposed solution offers both 2D and 3D visualization capabilities for serving different needs in PA. One example of a 2D visualization functionality includes a map image overlaid with a 2D image with photosynthetically active radiation data [55]. Okayama and Miyawaki demonstrate an AR-based advanced gardening system aiming to support new farmers/gardeners in executing complex agricultural tasks based on data from sensors and various databases. This system provides visual guidance for several farming operations and is capable of recording the farmers' positions/viewpoints during the operations. In addition to this, it overlays virtual plants on the field where the farmer is working and enables comparisons between real and virtual plants [56]. Lin et al. propose a method for creating 3D panoramic models from image pairs captured with mobile stereo machine vision systems. These 3D models are then integrated in existing GIS platforms to create a virtual reality geographic information system (VRGIS). 
VRGIS is capable of efficiently retrieving spatial information, enabling navigation in VR. Several technologies are utilized for the proposed solution encompassing: image processing, pattern-matching algorithms, data fusion, 3D surface model building, database management, etc. The integrated VRGIS can serve as a platform for establishing field information for PA [57]. Carruth et al. demonstrate an immersive VR greenhouse simulation. The proposed solution enables training of students through realistic interfaces for controlled environment agriculture (CEA) and several physical greenhouse tasks. Through these interfaces, costs related to wrong management of a CEA system by untrained users can be prevented [58]. Phupattanasilp and Tong demonstrate a framework which integrates IoT data into an AR environment, thus facilitating the interpretation of this information as well as simple and effective monitoring of agricultural crops. Field testing of the proposed solution indicated smaller error proneness compared to other visualization methods as well as a potential for saving time and reducing waste [59].

Other Data Analysis Tools
Several other data analysis tools based on fuzzy logic, semantic technologies, decomposition/denoising algorithms, etc. have useful applications in PA. Fuzzy logic can be defined as a many-valued logic with special properties, which is used to model the vagueness phenomenon and the meaning of natural language utilizing a graded approach [60]. The main idea behind semantic technologies is to aid computers in understanding data by providing tools and methodologies for representation, integration, and acquisition of knowledge [61]. Kamilaris et al. propose a semantic framework for IoT-based PA applications, which provides real-time reasoning over heterogeneous sensor data streams. In addition to this, the proposed framework achieves semantic integration of information from different sources, including social media, sensor data, government regulations/alerts, connected farms, etc. Some benefits of the aforementioned solution are increased productivity, better product quality, resource use optimization, faster reactions to unpredictable events, transparency, etc. [62] Ravankar et al. propose an inexpensive semantic monitoring solution for vineyards. This solution takes input data from low-cost cameras, sensors, and processing boards mounted on robotic systems. This data is then semantically labeled in order to pinpoint which locations need to be monitored. Thus, farmers can save time and labor. The proposed solution can also be integrated into existing UGVs [63].
Non-linearity and low control accuracy are two shortcomings of several hydraulic transplanting robot systems, caused by interference from external factors. To overcome these problems, Jin et al. simulated a hydraulic transplanting robot control system by combining fuzzy control theory with proportional-integral-derivative (PID) control theory. The simulated control system helped overcome problems related to low accuracy and non-linearity and improved the overall stability and the performance of the robotic system [64]. Zhang et al. present a novel method for the optimization of a variable-rate fertilizer. This method is based on several techniques including: general regression neural networks (GRNN), multi-objective evolutionary algorithms, and decomposition algorithms. The proposed method was found to contribute to a more accurate and uniform fertilization application as compared to other existing methods [65]. Wang et al. propose a model for retrieving crop phenology data which utilizes synthetic aperture radar polarimetric decomposition techniques as well as neural networks, random forest, regression, and k-nearest neighborhood algorithms. This model was used in different crop types including canola, corn, soybean and wheat and was found to provide robust phenology retrieval results [66].
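The combination of fuzzy control with PID control described above for the transplanting robot can be illustrated with a small sketch. It is not the controller of [64]: the first-order plant, the single fuzzy set "small error", the weighted-average rule for the proportional gain, and all numeric values are assumptions chosen only to keep the example short.

```python
def small_error_degree(error, error_scale=1.0):
    """Membership of |error| in the fuzzy set 'small':
    1 at zero error, falling linearly to 0 at error_scale."""
    return max(0.0, 1.0 - abs(error) / error_scale)


def fuzzy_kp(error, kp_small=1.0, kp_large=3.0):
    """Blend two proportional gains by the degrees of 'small' and 'large'
    error (the two weights sum to 1, so this is a weighted average)."""
    small = small_error_degree(error)
    large = 1.0 - small
    return small * kp_small + large * kp_large


def simulate(setpoint=1.0, steps=400, dt=0.05, ki=1.0, kd=0.05, tau=1.0):
    """Drive a hypothetical first-order plant (time constant tau)
    to the setpoint with a PID law whose Kp is fuzzy-scheduled."""
    y = 0.0
    integral = 0.0
    prev_error = None
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        derivative = 0.0 if prev_error is None else (error - prev_error) / dt
        u = fuzzy_kp(error) * error + ki * integral + kd * derivative
        y += dt * (u - y) / tau  # explicit Euler step of the plant
        prev_error = error
    return y


final_output = simulate()
```

The design intent is that a large error yields a high proportional gain for a fast response, while a small error yields a lower gain for a gentler approach to the setpoint. A hydraulic system would need a more realistic plant model and a fuller rule base.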
Data analysis methods such as harmonic analysis of time series (HANTS) algorithms can be used for reconstructing NDVI-related data. In satellite-derived NDVI time series, adverse atmospheric conditions frequently result in data gaps. Padhee and Dutta propose a moving offset method (MOM), which can be used for prefilling time series prior to applying HANTS algorithms. The proposed method was found to improve the quality of NDVI reconstruction in regions where seasonal obstructions are frequent [67]. Another data reconstruction solution particularly useful for PA is proposed by Dong et al. [68]. In this paper, a 4D reconstruction approach is demonstrated, which contributes to crop monitoring at high spatio-temporal resolution. A spatio-temporal model of dynamic scenes is utilized in order to monitor growing crops, the appearance of which is constantly changing. A robust data association algorithm is also developed to tackle problems related to large changes in the appearance of crops, caused by viewing scenes from different angles and at different times. The proposed model was found to offer qualitatively correct and quantitatively accurate monitoring of crops.
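The principle behind harmonic reconstruction of a gappy NDVI series can be sketched as a least-squares fit of a mean term plus a few sine and cosine terms to the valid samples, with the fitted curve used to fill the gaps. The sketch below is a simplified single-pass fit. It is neither the full HANTS algorithm (which, among other things, iterates and rejects outliers) nor the MOM prefilling of [67]. The function names, the 12-sample period, and the synthetic series are assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def harmonic_fill(series, period, n_harmonics=1):
    """Fit mean + harmonics to the non-missing samples (None = gap)
    and return the series with gaps replaced by the fitted values."""
    def basis(t):
        row = [1.0]
        for h in range(1, n_harmonics + 1):
            w = 2.0 * math.pi * h * t / period
            row += [math.cos(w), math.sin(w)]
        return row

    valid = [(basis(t), y) for t, y in enumerate(series) if y is not None]
    n = 1 + 2 * n_harmonics
    # Normal equations (A^T A) c = A^T y of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r, _ in valid) for j in range(n)]
           for i in range(n)]
    aty = [sum(r[i] * y for r, y in valid) for i in range(n)]
    coef = solve_linear(ata, aty)
    return [y if y is not None
            else sum(c * b for c, b in zip(coef, basis(t)))
            for t, y in enumerate(series)]


# Synthetic seasonal signal with three artificial gaps (hypothetical data).
truth = [0.5 + 0.3 * math.cos(2.0 * math.pi * t / 12) for t in range(24)]
observed = list(truth)
for gap in (3, 4, 15):
    observed[gap] = None
filled = harmonic_fill(observed, period=12)
```

Because the synthetic signal is itself a single harmonic, the fit recovers the missing values almost exactly. Real NDVI series are noisy and cloud-biased, which is why prefilling and outlier handling matter in practice.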

Data Storage and Distribution
Sharing valuable knowledge and experiences among different actors in agriculture as well as efficient and safe storage options are of vital importance for PA. In this regard, solutions around sharing platforms, cloud storage, blockchain, and other relevant technologies were analyzed for this literature review.

Sharing Platforms and Cloud Storage
Chen et al. demonstrate a distributed data sharing platform for multi-source IoT sensor agriculture and forestry data. The platform consists of data center, data adapter, data storage, data publishing, and data transmission subsystems. The data publishing subsystem provides standardized data query interfaces for single or multi-sensor real-time data as well as for historical data. The proposed platform bridges the gap between the data from different IoT sensors and the use of this data in various agriculture or forestry applications [69]. Uchinuno et al. demonstrate a knowledge sharing system for agriculture. This system enables sharing of two sources of data: data from sensors (e.g., temperature, humidity, CO2 measurements) and work information from skillful farmers (e.g., experiences modeled in a digital Extensible Markup Language (XML) form). The proposed system helps maintain agricultural technology at high levels and promotes the concepts of PA [70]. Wu et al. present a heterogeneous agriculture database sharing platform based on XML. Utilizing a browser/server (B/S) architecture, this platform enables users to access heterogeneous databases through their web browser. Databases which could utilize this platform include: practical agrotechnique/agriculture, policy/agricultural scientific talents/market supply/market demand databases. Testing of this platform proved its reliability and usefulness for data sharing in PA [71]. Rhee et al. propose a data-exchange platform for farm management information systems. This platform consists of a data collection service and a data sharing service. Testing of the proposed system revealed its flexibility and extensibility in collecting data from heterogeneous sources, including web services and serial communication devices. In addition to this, the platform was found to be very accurate and efficient for sharing data among different farm management information systems [72].
Cloud computing and its related services can be very beneficial for PA. Cloud computing refers to the technologies enabling easy, ubiquitous, and on-demand access to a shared pool of computing resources, encompassing networks, storage, servers, services, etc. with minimal management/interaction needs [73]. Zhou et al. demonstrate a cloud-based remote sensing observation sharing framework for soil moisture mapping in PA. This framework consists of a cloud-based sensor observation service, a web processing service, and a distributed database subsystem. Testing of the proposed system indicated its contribution to improving earth observation data storage as well as to achieving large-scale mapping of soil moisture [74]. Pavón-Pulido et al. propose a novel, reliable, and easily extensible PA system based on cloud computing and cloud storage, enabling farmers to monitor their crops and plan their agricultural tasks from a PC, a smartphone, or a tablet. The use of cloud storage instead of local storage contributes to minimizing management efforts as well as to optimizing the use of resources [75].

Blockchain and Smart Contracts
Blockchain technology is gaining traction globally in the area of data management and storage. This technology can be rather beneficial for PA as well. Blockchain can be defined as the distributed ledger which contains an ever-growing list of data records. All of the participating nodes confirm the validity and the accuracy of the aforementioned data records [76]. The term block (of blockchain) refers to a record containing data, a value with the hash (digital fingerprint of a block's data) of the current block as well as a value with the hash of the previous block. The link between the hash of the current block and the hash of the previous block serves as a basis for a cryptographically-linked chain of blocks [77]. Precision irrigation encompasses the accurate calculation of the crop water requirements as well as the precise application of water volumes at the right time. Bodkhe et al. highlight the advantages of integrating blockchain technologies for ensuring trust and security in precision irrigation systems as well as for mitigating various attack vectors through a proposed framework [78]. After analyzing several data threats directly connected with PA (e.g., malicious attacks or distortion of environmental data stored in centralized servers), Lin et al. underscore the role of blockchain technology and its cryptoeconomic security features in ensuring that the data and the technological infrastructure are protected from malicious attackers. The authors also provide an example of blockchain technologies utilization for a secure national distributed database which conforms to international agricultural standards and naming conventions [79].
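The block structure described above (data, the hash of the current block, and the hash of the previous block) can be sketched in a few lines. This is an illustrative toy ledger only, without consensus, networking, or signatures. The field names, the choice of SHA-256 over a JSON serialization, the all-zero hash for the first block, and the sensor records are assumptions, not details taken from [76,77].

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index, data, prev_hash):
    """Digital fingerprint of a block: SHA-256 over its serialized content."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new block that stores the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["data"],
                                block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical soil-moisture readings stored as ledger records.
chain = []
append_block(chain, {"sensor": "probe-1", "moisture": 0.31})
append_block(chain, {"sensor": "probe-1", "moisture": 0.29})
append_block(chain, {"sensor": "probe-2", "moisture": 0.35})
```

Altering the data of any earlier block changes its recomputed hash, so `is_valid(chain)` returns False unless every later block is rewritten as well. In a distributed ledger, the participating nodes would reject such a rewritten copy.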
Smart contracts refer to fully digital contracts in the form of small script programs, which are used and stored in blockchains, thus providing a tamper-proof logic [77]. Chun-Ting et al. introduce a decentralized traceability service platform, utilizing blockchain for ensuring data integrity and reliability as well as smart contracts for irreversible and trackable financial transactions among users. This solution uses IoT sensors for the collection of environmental data regarding the crops. These data are stored using a peer-to-peer (P2P) network. As the specific platform has the potential of being scalable by applying data tracking to each component on the agriculture production chain, it can enable farm-to-fork traceability [80]. Smirnov et al. demonstrate a solution for the collection and the distribution of robot sensor data in a trusted information space using smart contracts and blockchain technologies. In addition, utilizing IoT and fuzzy logic technologies, the proposed solution enables forming coalitions among robots with trust from all participants in order to perform tasks in PA which require cooperation [81].

Multi-Purpose Platforms
Several reviewed PA solutions, encompassing digital platforms, mobile applications, and CPSs, support multiple data lifecycle stages.

Management Information Systems and Digital Platforms
Burlacu et al. propose a farm management information system (FMIS) which can be used for land mappings, data collection from various sources, monitoring parameters, data processing, as well as for decision-support. In particular, decision-making is supported by informing the farmer about new technologies, available resources, competitors, prices, financial forecasts, regulations, etc. A relevant ontology was also developed to represent the knowledge which can be used for understanding the main concepts and relations regarding the FMIS [82]. Paraforos et al. demonstrate a FMIS which enables farmers to easily perform profitability analysis based on certain farm parameters, recorded transactions, and the performed agricultural tasks. This FMIS utilizes cloud computing technologies and was field tested on a winter wheat crop. It has the potential of facilitating decision-making of farmers regarding PA practices and use of resources [83]. Morais et al. describe and evaluate an open-source environment for data management in PA. This solution aims to provide full and free support for common low-cost hardware, which is used to support various environmental sensors and their generated data. Visualization tools and several data management tools are easily accessible by the users. Sharing data among registered users and creating rules for executing simple operations are also available [84]. González et al. present a dashboard for PA, which can be used for creating yield and fruit quality maps. The proposed solution utilizes GIS data as well as data from various libraries used for spatial data processing/analysis and provides the user with useful information in terms of farm management. Some of the benefits of the aforementioned dashboard include easier decision-making of farmers and optimized use of water, fertilizers, and pesticides [85]. Ngo et al. 
demonstrate an agricultural ontology which can be used for applying data science technologies in agriculture and can lead to increased productivity and more effective management. The knowledge base created in the context of this solution contains several semantic relations and hierarchies which can link diverse available resources. The ontology addresses many challenges which arise while pre-processing, transforming, analyzing, and integrating real-world big data in the agricultural domain. Furthermore, the proposed ontology can serve as a basis for building bigger and more precise knowledge bases [86].
Supervisory control and data acquisition (SCADA) technology is another example of a multi-purpose platform for PA. The term SCADA focuses on the supervisory level rather than representing a full control system. It is a software package which communicates with programmable logic controllers (PLCs) and/or other commercial hardware modules [87]. Berrú-Ayala present a SCADA system for automated irrigation control of crops. This system enables real-time monitoring of the process variables (e.g., earth temperature/humidity, environment temperature/humidity, water pressure) as well as controlling the solenoid valves which are used for the irrigation. Optimized labor and water resources use, reduced costs, fewer diseases, and less stress on the plants were spotted as some of the potential benefits of the proposed solution [88]. Pandiarajan et al. demonstrate a mobile SCADA system for automatic crop field management. According to the measurements of different sensors, irrigation is controlled automatically, and the proper fertilizer and pesticide amounts are dosed. This system is also capable of detecting disturbances at the crop field (e.g., animal intrusion) and providing alarms accordingly [89].

Cyber-Physical Systems
CPSs were also treated as multi-purpose platforms in this paper, since they often utilize various technologies around data collection/analysis/storage/sharing, AI, IoT, etc. Some prominent CPS solutions can be used for measuring soil properties as well as for applying variable rates of fertilizers and pesticides based on the crops' spatial and/or temporal variability. Mogili et al. demonstrate the use of a UAV-based automatic pesticide distribution system, which reduces the exposure of farmers to dangerous chemicals and optimizes the use of water and pesticides. The aforementioned system captures photos of the crops and analyzes the images to identify which areas require the use of pesticides. Based on the results of this analysis as well as on geolocation data from a global positioning system (GPS), it automatically distributes the proper amount of pesticides to each area using a pulse width modulation (PWM) controller, which regulates the flow of the pesticide through a sprinkler [90]. Taosheng et al. demonstrate a novel robotic variable rate fertilization system which is capable of achieving precise fertilization application rates according to the soil requirements. This system can be controlled remotely using a smartphone application. It uses GPS data for its navigation and utilizes an actuator for the proper dosing of the fertilizer. Reduced costs as well as reduced environmental pollution and resource waste were some of the most important benefits of the proposed system [91]. Pobkrut et al. demonstrate a robotic platform which measures several soil properties by means of penetration measurements. The specific platform automatically detects obstacles which could potentially damage the measurement probes. Navigation can be performed either autonomously via satellite positioning or via remote control [92].
By facilitating phenotyping techniques, CPSs can provide useful information for PA. The term phenotyping refers to the comprehensive assessment of a plant's anatomy and physiology as well as of its complex ontogenetical and biochemical traits. Through phenotyping, useful knowledge can be acquired regarding a plant's growth, yield, stress, etc. [93,94] Young et al. present a low-cost, robust, and high-throughput phenotyping robot. Able to capture images from multiple perspectives within plant rows, this system overcomes certain problems (limited coverage area due to capturing from a single perspective, inability to collect all the required phenotypic data, etc.) faced by other similar robotic solutions. Using trait-extraction techniques, significant information regarding the stem width and the plant height can be obtained. A real-time kinematic global positioning system (RTKGPS) is used for the precise navigation of the robotic system. Field testing of the system above indicated high throughput phenotyping capabilities in energy sorghum crops [95].
CPSs can also facilitate various time-consuming agricultural activities (e.g., tilling, seeding, transplanting, harvesting, weed control). Matsuo et al. analyze the use of a robot for performing unmanned rotary tilling, utilizing a navigation system for obtaining its position and direction. These researchers also propose a double-vehicle operation in which the operator manually operates a conventional tractor while also supervising the operation of a nearby tilling robot. The results estimated 1.8 times increased efficiency as compared to the manned-only operation [96]. Kumar et al. demonstrate an IoT-based seed sowing robot which can be controlled through a mobile application. Several components of the proposed system can be fabricated in-house using 3D printing technologies. Important benefits for the farmers regarding the reduction of labor and total costs were also identified [97]. Tamaki et al. demonstrate a rice transplanting robot using the RTKGPS and data from various sensors for its navigation. The specific system was found to have satisfactory transplanting accuracy. In addition to this, various data (e.g., working path, time, and GPS quality) recorded by the transplanting robot can be used for providing details about the cultivation history, thus contributing to a more credible food safety system [98]. Xiong et al. present an autonomous robot for strawberry harvesting. A novel obstacle-separation algorithm was proposed to overcome the difficulties faced by other similar solutions in harvesting in cluttered and unstructured environments. Using this algorithm, a robotic gripper can actively push surrounding obstacles aside, leading to increased harvesting efficiency. An integration and control algorithm was used for continuously harvesting strawberries along polytunnels [99]. Nørremark et al. demonstrate an automatic weed control system utilizing autonomous vehicles and GPS technologies. 
The specific system addresses the problem of farmers having to remove weeds by hand within rows of their precision seeded crops. An autonomous tractor using GPS satellite navigation was field-tested, and simulation technologies were used for the assessment of the tilled area as well as for the quantification of the system error [100].
UAVs can be used in conjunction with UGVs, providing important benefits to PA. Liebisch et al. demonstrate the main concepts of the Flourish Project, which combines the capabilities of multi-copter UAVs with multi-purpose agricultural UGVs. One of the proposed uses refers to the creation of crop property maps using UAVs, multi-spectral imaging, and 3D reconstruction technologies. This solution facilitates crop management decision-support. Complementing the aforementioned solution, UGVs can be used for plant classification as well as for real-time crop-from-weed differentiating [101].

Smartphone Applications
Smartphones are extensively used across the world nowadays, and there is a plethora of mobile applications supporting various PA practices. Vellidis et al. demonstrate a mobile application which can be used for irrigation scheduling in cotton crops. This application estimates the crop root zone soil water deficits (RZSWD) based on weather station data, various crop and soil parameters, as well as on irrigation applications. After the user has entered the location of the crops, the soil type, and details about the irrigation strategy employed, the application notifies the user when RZSWD surpasses a certain threshold, when rain is reported by a nearby weather station, or when phenological changes take place. Thus, the application facilitates decision-making around irrigation and can contribute to increased productivity and resource saving [102]. Delgado et al. present a smartphone application for executing nitrogen leaching simulations and conducting assessments of nitrogen loss risk while also correlating results with certain observed values. Using this application, farmers can make better decisions regarding nitrogen management and optimize the use of resources [103]. Petrellis analyzes a mobile application for the early diagnosis of plant diseases. Based on image processing, classification, and segmentation techniques, this application enables farmers to upload photos of their plants in order to identify diseases. Evaluation of this application indicated acceptable accuracy. Early detection of diseases provided by the application contributes to the minimization of the financial cost and the environmental impact [104]. Patrignani and Ochsner demonstrate a smartphone application for the measurement of fractional green canopy cover (FGCC). This parameter is of great importance in PA as it can be used to monitor canopy growth, interception of light, as well as the partitioning of evapotranspiration. Several image processing techniques are utilized to estimate FGCC. Users can provide images and/or videos as inputs to the application. Evaluation of the application revealed significantly faster operation as compared to other similar software tools while also maintaining high accuracy [105].
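As a rough illustration of how FGCC can be estimated from an image, the sketch below classifies each RGB pixel as green canopy or background using colour-ratio and excess-green thresholds and reports the green fraction. This is one common style of approach and is not claimed to be the procedure of [105]. The function name, the three threshold values, and the sample pixels are assumptions for illustration.

```python
def fractional_green_cover(pixels, rg_max=0.95, bg_max=0.95,
                           excess_green_min=20):
    """Return the fraction of (R, G, B) pixels classified as green canopy.

    A pixel counts as green when red/green and blue/green are below the
    given ratios and the excess-green index 2G - R - B is above a minimum.
    All three thresholds are assumed values, to be tuned per camera/crop.
    """
    total = 0
    green = 0
    for r, g, b in pixels:
        total += 1
        if (g > 0 and r / g < rg_max and b / g < bg_max
                and 2 * g - r - b > excess_green_min):
            green += 1
    return green / total if total else 0.0


# Hypothetical pixels: two leaf-like, one soil-like, one grey.
sample_pixels = [
    (40, 160, 30),
    (50, 200, 60),
    (120, 110, 90),
    (200, 200, 200),
]
fgcc = fractional_green_cover(sample_pixels)
```

For the four sample pixels, two satisfy all three conditions, giving an FGCC of 0.5. With a real photograph, the pixel list would come from an image-decoding library, and the thresholds would need calibration against manually classified images.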
The table below summarizes the results of the present literature review utilizing our proposed categorization method, which is described in Section 2.

Discussion on the Results-Conclusions
The present literature review surveyed a wide variety of ICT solutions for PA supporting one or multiple data lifecycle stages. These solutions were found to underpin PA applications that in turn contribute to substantial environmental and economic benefits (e.g., higher quality crops with higher yield, prevention of pesticide and fertilizer overuse, optimized use of water and other resources, prevention of soil degradation).
Among the different data lifecycle stages in Table 1, the data analysis and AI stage encompassed the largest number of technological categories, i.e., 22 subcategories. Being the focus of 21 different solutions, computer vision was the most popular technological subcategory of the data analysis and AI lifecycle stage. Computer vision tools and technologies were mainly used to facilitate classification and detection of crops, to extract useful knowledge about the condition and quality of the crops, as well as to enable effective CPS-based operations. Technologies revolving around decision-support and predictive models were also very popular among the data analysis and AI solutions. These technologies offer valuable insights into actions farmers should take to increase productivity and efficiency as well as to prevent losses or destruction of crops due to diseases or natural phenomena. Although the data collection and IoT stage of the data lifecycle model encompassed only four technological subcategories (WSN was the most popular subcategory), it was the focus of several solutions. Sensors of various kinds (mostly temperature, moisture, CO2, and light sensors) were of vital importance for monitoring crop condition and served as inputs to subsequent data analytics/integration/storage/sharing operations. Multispectral satellite imagery was used in many computer vision techniques, and satellite navigation was extensively used for the navigation of the majority of CPSs.
Regarding the papers focusing on storage and distribution lifecycle stages, sharing platforms and cloud storage platforms were very popular, facilitating secure storage and distribution of agricultural data as well as of valuable experiences and knowledge. Blockchain and smart contracts were found to have a great potential in PA, providing immutable and secure data storage and sharing.
Among solutions supporting multiple data lifecycle stages, CPSs were by far the most frequently occurring in the examined bibliography. The use of CPSs in the context of PA can radically change the form of modern farming activities. The reviewed PA solutions enabled several tasks (e.g., seeding, transplanting, weeding, tilling) to be conducted autonomously with minor or no human interaction. Other solutions revolved around dangerous (e.g., due to exposing the farmer to harmful chemicals) and time-consuming activities, which were carried out by CPSs, thus decreasing labor costs and minimizing the risks of health problems.
The proposed categorization method can be used for future research on PA from a single (or multiple) data lifecycle stage(s) perspective(s). Future research may also perform a cost-benefit analysis regarding the use of certain PA-related collection/analysis/storage/distribution technologies described in this review. The developed data lifecycle model can be used in its current form for research works around PA or other technologies encompassing multiple data lifecycle stages. It may also be adapted to meet the specific needs of business or academic members.
In conclusion, ICT solutions act as pillars and facilitators for data management in PA. Diverse ICT tools and technologies support different lifecycle stages in a variety of ways. From the analyzed bibliography, machine vision, CPSs, WSNs, decision-support systems, and satellite navigation were the technologies with the most vital role in PA, while data analysis and AI was the data lifecycle stage receiving the widest support from ICT solutions.
Author Contributions: K.D. was the paper initiator. He identified the need to systematically review the data lifecycle management in precision agriculture. He researched a variety of relevant information and communication solutions, especially those related to data analysis and artificial intelligence. E.D. performed extensive research on state-of-the-art information and communication solutions deployed for supporting the lifecycle of precision agriculture data. He focused on the area of data collection and internet of things as well as on data storage and distribution. It is noted that both authors closely cooperated in order to produce a thorough and high-quality review paper. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.