A Survey on Intelligent Agricultural Information Handling Methodologies

The term intelligent agriculture, or smart farming, typically involves the incorporation of computer science and information technologies into the traditional notion of farming. The latter utilizes plain machinery and equipment used for many decades and the only significant improvement made over the years has been the introduction of automation in the process. Still, at the beginning of the new century, there are ways and room for further vast improvements. More specifically, the low cost of rather advanced sensors and small-scale devices, now even connected to the Internet of Things (IoT), allowed them to be introduced in the process and used within agricultural production systems. New and emerging technologies and methodologies, like the utilization of cheap network storage, are expected to advance this development. In this sense, the main goals of this paper may be summarized as follows: (a) To identify, group, and acknowledge the current state-of-the-art research knowledge about intelligent agriculture approaches, (b) to categorize them according to meaningful data sources categories, and (c) to describe current efficient data processing and utilization aspects from the perspective of the main trends in the field.


Introduction
Over the last hundred years, technological achievements, such as the use of heavy machinery and the industrialization of the production chain, introduced major changes to the agricultural status quo. These changes, although pivotal for the development of agriculture as we know it, are not considered the final stage of its evolution, as the food needs of the world population keep increasing. The term intelligent agriculture, or smart farming, typically involves the incorporation of computer science and information technologies into the traditional notion of farming. It allows the creation of large volumes of data, the progressive introduction of intelligence into their transformation into organized information, and, as the last link of this virtual chain of actions, the extraction of meaningful semantic knowledge.
Factors such as climate change, demographic parameters, population movements, and overall aging and migration issues between rural and urban areas all play a significant role in intelligent agriculture research. Specifically, climate parameters like temperature, precipitation, and soil moisture have a significant impact on most of the aforementioned factors. The exploitation of artificial intelligence and machine learning techniques for their efficient, reliable, and accurate prediction is therefore of great importance in modern agricultural activities, since these parameters may heavily affect production costs, and their prediction can assist in minimizing environmental constraints.
Having established the necessity and importance of the discussed domain, we may further elaborate on the specific aims of this paper. First of all, since intelligent agriculture is a modern application domain, most of the corresponding research literature has been published in recent years. Still, there are important research works, included herein, that deal with aspects either directly or indirectly related to the modern field. Considering this context, our research aims to achieve the following objectives: (a) to identify, group, and acknowledge the current state-of-the-art research knowledge about intelligent agriculture approaches, (b) to categorize them according to meaningful data source categories, and (c) to describe current efficient data processing and utilization aspects from the perspective of the main trends in the field.
Our main motivation for this work was our belief that identifying how computer science interprets intelligent agriculture over time, geographical space, and methodologies may aid the production of new, advanced, and innovative research that covers areas that have received less attention up until now and may introduce brand new applications in the process. Thus, we summarized and organized, in a researcher-friendly tabular manner, important or pioneering related research works deriving from diverse agricultural intelligence domains. In the following sections, we initially focus on different data collection techniques, ranging from meteorological data at the macro- (i.e., geographical region), meso- (i.e., field), and micro-level (i.e., specific intra-field areas), to GIS and remote sensing data, image data from drones, or even innovative crowd-sourcing activities. Moreover, we present the prevailing data processing and utilization methodologies that exploit notions from the machine learning field of computer science. We categorize them into two main groups, namely machine learning algorithms focusing on numerical data processing, and artificial intelligence exemplars from the drone or even satellite imagery domain.
On top of that, efficient environmental monitoring suggests innovative procedures of meaningful knowledge extraction through various observation and evaluation techniques. In the typical case, "machines", under the generic auspice of computer science, support and preserve the acquisition and management of such information in accessible semantic modules [1], whereas the goal of integrating monitoring, physical computing, information technologies, and knowledge representation has been served in the past by the semantic sensor network ontology and its many variations [2]. In this framework, the two main factors are the creation of a knowledge base and the corresponding data visualization.
It should also be noted that this paper attempts to conduct a broad, modern survey of the illustrated topics and, thus, to form a point of reference for fellow researchers. It studies the existence, importance, and impact of agricultural information handling methodologies within the scope of the modern and quite popular "smart" agriculture computing framework. Still, this is a tedious task, mainly because of the inherent diversity of the term, which is nowadays widely acknowledged and has become a topic of interest in several multi-disciplinary sub-fields, ranging from traditional artificial intelligence algorithms to contextual semantics and other recently emerged innovative applications. Focus is also given to the identified challenges and potential future research directions that emerge from the approaches discussed herein. To the best of our knowledge, researchers, academics, interested stakeholders, and even farmers themselves are nowadays eager to obtain comprehensive and perspicacious reviews of related studies on modern smart agriculture information handling. Thus, this survey attempts to provide a rich source of ideas for them and a good point of reference for those who want to start studying related sub-fields in depth, within the framework imposed by the latest advances in the related disciplines, namely agriculture and informatics. Among its goals are the identification of different types of methodologies in agricultural informatics and the provision of an overview of the definition and utilization of related research approaches exploited within recent applications.
The rest of this paper is structured as follows: In Section 2, we present related research works concerning the various data sources utilized with respect to agricultural aspects. Then, in Section 3, we present the main machine learning methodologies that may be efficiently applied on agricultural data. Section 4 discusses ad-hoc topics that are built on top of the data and algorithmic fronts, such as intelligent knowledge acquisition, data visualization, and crowd-wisdom exploitation. Finally, a short discussion together with drawn conclusions and future prospects relating to the proposed framework are presented within Section 5 that concludes the paper.

Data Sources and Collection
The natural conditions of agricultural fields and their neighboring areas are characterized by their meteorological, climatic, and soil attributes. Information Technologies (IT) have been shown to improve agricultural productivity in a number of ways. Furthermore, the study of these conditions is usually performed through different types of datasets and at multiple scales. More specifically, in the following we organize agricultural monitoring into four concrete methods of data retrieval for analysis and machine learning. In Section 2.1 we present works related to on-site climatic sensory observations that largely produce numerical data, in contrast to Section 2.2, which covers spatially correlated physical factors. Additionally, satellite imagery offers a wealth of numerical, spatially aware, and image data from remote sensing acquisition techniques (Section 2.3), which is enriched by the advent of aerial imagery applications in agricultural monitoring. Lastly, drone-aided technology (Section 2.4) unifies all physical characteristics of the surrounding environment in an accessible research context.

Meteorological Sensory Data
Managing climatic and environmental challenges in relation to wine production can be implemented by recording and assessing meteorological data. Hunter et al. [3] studied the impact on microclimatic conditions using the distribution of local conditions for existing vineyards. The vines have a distinctive topology that is subject to the atmospheric, climatic, and soil conditions of their given location. The scientists studied different row orientations of the Vitis vinifera L. cv. Shiraz variety in South Africa and tested their distinct environmental factors at three levels: macro, meso, and micro. They recorded temperature, radiation, rainfall, and wind speed (at the macro level), ambient photosynthetic active radiation on top of the vines, wind speed, and wind direction (at the meso level), and ambient temperature and relative humidity (at the micro level). Additionally, they measured relative humidity, leaf temperature, photosynthetic photon flux density (PPFD), transpiration, CO2 concentration, and photosynthetic activity at the canopy level, six weeks after the berry colouring phase. Hunter et al. provided novel knowledge on the effect of row orientation on vineyard macro-, meso-, and microclimate, in addition to vine physiological status. Their results produced quantitative data related to the complex relationship between row orientation and the physiological characteristics of vines.
The vision of sustainable management in agriculture was studied by Kaewmard and Saiyod [4] by introducing a Wireless Sensor Network (WSN) control system. They deployed several experimental sensing devices to monitor crop irrigation and produced a platform for water management, employing mobile application and services along with a wireless communication platform under the scope of the Internet of Things (IoT). Furthermore, their experimental results were related to the plants' growth rate indicating highly accurate measurements based on soil moisture, air humidity and temperature sensors.
Real-time data collection and distribution is an important aspect of applied monitoring in agriculture. Under this scope, Kiyoshi et al. [5] proposed a decision support system based on multilevel sensor measurements. Their interoperable applications were capable of dynamically querying sensor metadata and analyzing historical weather data. Kiyoshi et al. produced predictions concerning the evaluation of various weather scenarios by implementing crop modeling and simulations. Furthermore, the authors used heterogeneous data sources, promoting efficient crop management and control and subsequently assisting farmers' operation cycles at all stages of cultivation.
The research of Bock et al. [6] concerns grapevine yield and quality and is based on data accumulated over roughly two centuries (1805-2010) in Germany. The authors analyzed long-term time series of grapevine yield, measured in hectoliters per hectare (hL/ha), and of the sugar content of must (i.e., the grape juice), measured in degrees Oechsle (°Oe), as well as climate data from bibliographical references and sensor recordings.
It should be noted that the degree Oechsle measures the relative sweetness of the must and shows how many grams more 1 L of must weighs compared to 1 L of water, whereas the entire dataset was homogenized by monthly observations depending on available resources. This long-term, multi-factorial research resulted in a verified upward trend in both yield and must sugar content over the examined time period. The researchers further distinguished between the anthropogenic and climatic impacts on yield in relation to temperature; their study underlines the need for continuous and calibrated measurements of morphological and chemical characteristics, along with meteorological conditions during vine production, in relation to productivity.
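The Oechsle definition given above can be written as a one-line conversion. The following is a minimal sketch, assuming that densities are expressed in kg/L at a common reference temperature and that water is taken as 1.000 kg/L; the function name is illustrative and does not come from the surveyed work.

```python
def degrees_oechsle(must_density_kg_per_l: float,
                    water_density_kg_per_l: float = 1.0) -> float:
    """Excess mass, in grams, of 1 L of must over 1 L of water.

    Assumption: both densities are given in kg/L at the same temperature.
    """
    excess_kg_per_l = must_density_kg_per_l - water_density_kg_per_l
    return excess_kg_per_l * 1000.0  # kg -> g


if __name__ == "__main__":
    # A must of 1.085 kg/L weighs about 85 g more per litre than water.
    print(round(degrees_oechsle(1.085), 1))
```

Under these assumptions, a must density of 1.085 kg/L corresponds to about 85 °Oe.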
Another WSN agricultural monitoring platform was introduced by Tarange et al. [7]. The researchers proposed an automated control system for crop irrigation based on soil moisture and temperature sensors.
In particular, the sensor information was collected from the nodes in a continuous manner, allowing the system to control irrigation uniformly and reducing the fresh water consumption without the need of human interaction.
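To illustrate the kind of sensor-driven irrigation control described above, the following minimal sketch implements a threshold controller with hysteresis. It is not the control logic of Tarange et al.; the class name, the two thresholds, and the use of the mean moisture across nodes are all assumptions made for illustration, and the temperature input is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class IrrigationController:
    """Hypothetical on/off controller driven by soil-moisture readings (%)."""
    start_below: float = 30.0   # assumed threshold: open the valve below this
    stop_above: float = 40.0    # assumed threshold: close the valve above this
    valve_open: bool = False

    def update(self, node_readings: Sequence[float]) -> bool:
        """Consume one round of readings from all nodes, return valve state."""
        if not node_readings:
            return self.valve_open          # no data: keep the current state
        mean_moisture = sum(node_readings) / len(node_readings)
        if mean_moisture < self.start_below:
            self.valve_open = True          # soil too dry: start irrigating
        elif mean_moisture > self.stop_above:
            self.valve_open = False         # soil wet enough: stop irrigating
        # Between the thresholds the previous state is kept (hysteresis).
        return self.valve_open


if __name__ == "__main__":
    controller = IrrigationController()
    for readings in ([25.0, 28.0], [35.0, 36.0], [42.0, 45.0]):
        print(readings, controller.update(readings))
```

The gap between the two thresholds prevents the valve from toggling rapidly when the readings hover around a single set point, which is one simple way to limit water consumption without human interaction.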
Andreoli et al. [8] developed a simulation of vine growth processes for the understanding of plant conditions at vineyard scale for Vitis vinifera L. Nebbiolo in NW Italy. The simulation is based on experimental observations from model vineyards. Their model, IVINE, required a set of meteorological data, soil parameters, the vine's physical characteristics, geographic information, and variety characteristics, and was evaluated through existing empirical equations. The recording frequency was adjusted depending on the nature of the data, and the sampling took place on 15 grid points at mixed elevations over the Piedmontese territory. Furthermore, the IVINE model produced more stable results for meteorological variables, whereas vine growth and phenological phases were predicted less accurately.
Water is one of the fundamental aspects of life on earth and provides precious nutrients for many types of crop. Yuan and Zhen [9] estimated water consumption by agricultural activities in China. More specifically, their research focused on agricultural water and net irrigation consumption for the period between 1984 and 2008 in the Hebei province. Meteorological data were used to calculate a baseline of reference evapotranspiration (ET0), along with grain yield to find the actual evapotranspiration. Atmospheric and ground measurements included daily average temperature, relative humidity, precipitation, duration of sunshine, atmospheric pressure, vapor pressure, and wind speed, which were obtained from 55 weather stations located in the Hebei area. Their study showed a spatial pattern of increased ET0 that is possibly related to intensified agricultural activity and increasing temperatures. This work points out the use of meteorological monitoring (i.e., large scale monitoring, in this particular case) that may help researchers further understand the interdependent relations of climatic conditions and human activities.
The work of Tagami et al. [10] was related to the development and application of Wireless Sensor Networks (WSN) in viticulture, with the aim of developing appropriate management practices. They deployed a use-case scenario based on a network of highly usable hardware components for data collection. Their study was mainly focused on networking and software infrastructures that support an array of prototype weather stations and sensors; the latter collected information on rainfall, wind speed and direction, barometric pressure, solar radiation, temperature, and humidity. In addition to atmospheric variables, the authors took into consideration the soil matric potential, derived from soil moisture, by using soil measurement probes buried at two different depths. The researchers managed to minimize data loss and improve data accessibility in a long-term data acquisition system. Furthermore, the array of sensors operated properly and collected reliable data due to the correct placement of the sensors within the vineyards.
Among the early adopters of WSN in viticulture were Marino et al. [11], who proposed the integration of specialized electronic sensors for climate monitoring and soil assessment. They deployed an experimental network for on-site automatic data acquisition, based on distributed processes. The network was implemented by connecting Electronic Zonal Stations (EZS) to a web platform and a database. Each EZS consists of a data logger that collects measurements from the station's sensors, and an Ultra High Frequency (UHF) radio modem that forwards the readings to the base station. The produced dataset is organized in registers that comprise measurements (temperature, relative humidity, leaf humidity, soil temperature, solar radiation, rain gauge, and biological parameters), along with date and time. Ultimately, they provided a low-cost and low power consumption network that offers real measurements in order to consistently validate various biological and ecological models used in smart viticulture.
Indices related to grapevine development were employed by Neethling et al. [12] while analyzing trends of climatic and bioclimatic parameters for the main grapevine varieties in France. Their research took place in the Loire valley in northwest France, where the wine is shaped by the unique characteristics of the surrounding geographical environment. The research team used temperature data from 7 stations in the area that were not situated within the vineyards. More specifically, the temperature measurements were used to derive several indices: mean and seasonal temperature, growing degree-days index (GDD), Huglin index (HI), cool-night index (CI), and Diurnal Temperature Range (DTR). Additionally, they gathered monthly rainfall data from 5 stations within the research area, which were used to calculate total rainfall, the dry-spell-mean index, and the number of days of heavy rainfall. Moreover, they divided the indices into 2 groups (climatic and bioclimatic), which helped them conclude that temperature rose during the growing season (April-September). Consequently, Neethling et al. used a large meteorological dataset that helped them indicate the influence of climatic and bioclimatic variations on the berry composition of the main grape varieties cultivated in the Loire valley.
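The temperature-based indices listed above follow standard textbook definitions, which can be sketched as below. This is an illustration under stated assumptions rather than the exact procedure of Neethling et al.: a base temperature of 10 °C is assumed, daily contributions are clipped at zero, the Huglin day-length coefficient is left as a parameter (it depends on latitude), and the cool-night index is taken as the mean September minimum temperature for the Northern Hemisphere.

```python
from statistics import mean
from typing import Sequence

BASE_TEMP_C = 10.0  # assumed base temperature for grapevine development


def growing_degree_days(tmax: Sequence[float], tmin: Sequence[float],
                        base: float = BASE_TEMP_C) -> float:
    """Sum of daily mean temperature above the base, clipped at zero."""
    return sum(max(0.0, (hi + lo) / 2.0 - base) for hi, lo in zip(tmax, tmin))


def huglin_index(tmax: Sequence[float], tmin: Sequence[float],
                 day_length_coeff: float = 1.0,
                 base: float = BASE_TEMP_C) -> float:
    """Huglin heliothermal index over the supplied days (1 Apr to 30 Sep)."""
    total = 0.0
    for hi, lo in zip(tmax, tmin):
        tmean = (hi + lo) / 2.0
        total += max(0.0, ((tmean - base) + (hi - base)) / 2.0)
    return day_length_coeff * total


def cool_night_index(september_tmin: Sequence[float]) -> float:
    """Mean of daily minimum temperatures during the ripening month."""
    return mean(september_tmin)


def diurnal_temperature_range(tmax: Sequence[float],
                              tmin: Sequence[float]) -> float:
    """Mean daily difference between maximum and minimum temperature."""
    return mean(hi - lo for hi, lo in zip(tmax, tmin))


if __name__ == "__main__":
    tmax = [20.0, 30.0, 8.0]   # toy daily maxima in degrees Celsius
    tmin = [10.0, 20.0, 2.0]   # toy daily minima in degrees Celsius
    print(growing_degree_days(tmax, tmin))
    print(huglin_index(tmax, tmin))
    print(diurnal_temperature_range(tmax, tmin))
```

In practice, the inputs would be daily series restricted to the relevant period of each index, for example April to September for the growing-season indices.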
Taking the above research efforts a step further, Internet of Things (IoT) technologies support conventional methods of data acquisition through the development of prediction models. Parra Plazas et al. [13] proposed a Computational Intelligence (CI) model using IoT technologies for climate monitoring. Their dataset originates from weather stations and spans the period between 2009 and 2011. The variables under study are precipitation (mm of water), atmospheric pressure (mm Hg), relative humidity (RH, %), temperature (°C), wind speed (km/h), and wind direction (azimuth degrees). The data collected during this period produced a knowledge base on which two CI models were deployed. The models demonstrated that further experimentation with a larger dataset is needed in order to improve their responsiveness.
Having said that, modern research on agricultural production and adaptation tends to be based on multi-disciplinary methodologies focused on both quantitative and qualitative measurements. Statistics derived solely from arithmetic/numerical data cannot provide the expected results on production growth and durability. Thus, most recent studies deployed several different methodologies from the fields of geography, remote sensing, and informatics, which largely enable researchers to build state-of-the-art agricultural analysis on top of the aforementioned methodologies and approaches. These are discussed in the next subsection.

Location-Based Data
Nowadays, the so-called Geographic Information Systems (GIS) play a powerful role in spatial analysis. The investigation and management of spatial variability in agricultural fields consists of mapping homogeneous management zones with the use of proximal and remote sensing methods that provide increased resolution and accuracy in the spatial characterization of physical parameters. The basis of geographical integration into Decision Support Systems (DSS) was introduced by Tian-en et al. [14], who described spatial work-flows and their components under the scope of precision farming optimization. Their proposed web platform was based on data structures derived from sampling areas. The data comprise soil, water, and climatic factors and are accompanied by location-based data in order to assist agricultural management within the study areas.
Furthermore, it may reduce on-site sampling costs and improve the management of wine quality in relation to soil, crop, and yield features. In this framework, the notion of spatial variability of meteorological conditions is presented by Matese et al. [15] for the distinction of best vintages. The authors presented novel results on spatial variability and its definition at different topographic scales. They deployed several sensors (temperature, humidity, solar radiation) within the study areas and developed varied analytic methodologies depending on the climatic scale (meso, topo/macro, micro 1 and micro 2). All factors were assessed within the context of spatial variation as dependent categorical variables, which were organized hierarchically in order to justify their meaning in parametric monitoring assessment. Their results illustrate the effects of several sources of spatial variation in viticulture and present a "proof of concept" methodology for on-site application. Additionally, the in-field measurements potentially provide support to production and disease management for research and technical purposes. Finally, Matese et al. highlight the need to further develop multifactorial means of analysis for agro-meteorological monitoring, in addition to the management of large environmental datasets.
Bai et al. [16] proposed a multi-sensor system for managing plant phenotypic data, based on five sensor modules that measure crop canopy traits in agricultural terroirs. The research team also assessed data related to canopy reflectance spectra through RGB images on soybean and wheat fields. The integrated sensor system used solar radiation and air temperature/relative humidity sensors, ultrasonic distance sensors, infrared thermal radiometers, NDVI sensors, portable spectrometers, and RGB cameras. The sensor readings were synchronized and analyzed in order to derive inter-correlations among the sensor-based traits, which validated the reliability of the apparatus in data collection. Their study indicated the useful capabilities of location-based sensor systems during plant breeding.
Geo-statistics as a tool for monitoring agricultural practices was the subject of study of Irimia et al. [17]. They used the daily averages of temperature, precipitation, and sunshine duration for the period between 1961 and 2013 at a spatial resolution of 0.1 degrees (i.e., approximately 10 × 10 km) in order to produce a spatial distribution map assessing the climatic suitability for wine production, based on the oenoclimatic aptitude index (IAOe). Their findings highlighted the negative evolution of climate suitability for wine production within the study areas, where the changes are likely to lead to a climatic reclassification of wine-growing regions in Romania.
The simulation of surface flows for erosion management was examined by Rodrigo-Comino et al. [18] in order to estimate soil erodibility, runoff discharge, and erosion rate. They implemented several rainfall simulations in their study areas and produced statistical results that were used in conjunction with spatial coefficients, which indicate the spatial distribution of the mechanical effects of water on soil composition. Furthermore, the authors produced proportional symbol maps that visualized the spatial coefficients, thus indicating the high soil and water losses within the study areas.
Finally, spatial characteristics of different energy consumption patterns in agriculture were presented by Tian et al. [19]. Specifically, the authors implemented a GIS methodology to analyze the distribution of energy types with respect to their contribution to yield. Furthermore, their study recorded numerical data collected on-site, which helped them analyze the spatial variation of energy consumption. They produced several thematic maps that visualize indices related to spatial correlation, clustering, and association. The results ranked the energy types by significance, and Moran's I index proved to be the most productive tool for large scale spatial correlation analysis.
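For readers unfamiliar with Moran's I, the index mentioned above measures global spatial autocorrelation as I = (n / S0) · Σi Σj w_ij (x_i − x̄)(x_j − x̄) / Σi (x_i − x̄)², where w_ij are spatial weights and S0 is their sum. The sketch below is a plain-Python illustration of this standard formula; the binary adjacency weights and the toy values are assumptions and do not reproduce the data of Tian et al.

```python
from typing import Sequence


def morans_i(values: Sequence[float],
             weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for n regions and an n x n spatial weight matrix."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]

    weight_sum = sum(sum(row) for row in weights)        # S0
    variance_term = sum(d * d for d in deviations)       # sum of squared deviations
    if weight_sum == 0 or variance_term == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    # Cross-products of deviations for every pair of regions, weighted by w_ij.
    cross_term = sum(weights[i][j] * deviations[i] * deviations[j]
                     for i in range(n) for j in range(n))
    return (n / weight_sum) * cross_term / variance_term


if __name__ == "__main__":
    # Four regions in a row; neighbours share a border (binary adjacency).
    chain = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
    print(morans_i([1, 2, 3, 4], chain))   # smooth gradient: positive
    print(morans_i([1, 4, 1, 4], chain))   # alternating values: negative
```

Positive values indicate that similar values cluster in space, while negative values indicate that neighbouring regions tend to differ.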

Data from Satellites
Remote sensing has proved to be game-changing for the agricultural sector, as it is considered to be fundamental for the trending "precision agriculture" framework. In principle, remote sensing estimates the properties of a plant through non-destructive processes in a fast and accurate method. It may include a wide range of applications in agriculture, such as crop growth monitoring, crop yield, and quality estimation, identification of irrigation needs, as well as biotic and abiotic damage [20].
Zarco-Tejada et al. [21] studied the effects of hyper-spectral indices in agricultural research and, in particular, in the natural conditions of vineyards. Their research concentrated on vine reflectance along with soil spectra extracted from images collected at a spatial resolution of 1 m and on 8 spectral bands. In particular, they concentrated on the calculation of narrow-band indices of pigment density. Furthermore, this study investigated the optical properties of Vitis vinifera L. leaves by taking into consideration their reflectance and transmittance properties. Additionally, the researchers performed the optical index calculation along with the destructive determination of pigments on a database collected between 2002 and 2003. The images were obtained from 103 study sites that include 24 fields and enabled the scaling up of leaf-level sensitive remote sensing indices. The indices helped the researchers assess the effectiveness of the canopy-level information retrieval methodology.
It is acknowledged that water is a major resource in agriculture, particularly in arid regions. Remote Sensing (RS) toolkits characterize drought at the regional level through dedicated indices developed for remote study. Rhee et al. [22] proposed an RS-based drought index that monitors both arid and humid regions using multi-sensor data. The Scaled Drought Condition Index (SDCI) combines several RS products (i.e., the Normalized Difference Vegetation Index (NDVI), data from the Moderate Resolution Imaging Spectroradiometer (MODIS), and precipitation from the Tropical Rainfall Measuring Mission (TRMM)) in order to collect relevant information on the effect of drought on an annual basis. The SDCI was affected by the varied spatial resolution of its components, and scaling was used to separate the drought effect between arid and humid areas, while the dataset will be enriched over time.
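The idea of combining scaled remote sensing components into a single drought index can be sketched as follows. This is an illustration only: the choice of land surface temperature as the MODIS-derived component, the min-max scaling against historical ranges, and the default weights are assumptions that are not stated in the text above, so the sketch should not be read as the exact SDCI definition.

```python
from typing import Tuple

Range = Tuple[float, float]  # (historical minimum, historical maximum)


def min_max_scale(value: float, lo: float, hi: float,
                  invert: bool = False) -> float:
    """Scale a value to [0, 1] against its historical range."""
    if hi == lo:
        raise ValueError("historical range must have non-zero width")
    scaled = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    # Inverted scaling is used when a high raw value means drier conditions.
    return 1.0 - scaled if invert else scaled


def scaled_drought_index(ndvi: float, lst: float, precip: float,
                         ndvi_range: Range, lst_range: Range,
                         precip_range: Range,
                         weights: Tuple[float, float, float] = (0.25, 0.25, 0.5),
                         ) -> float:
    """Weighted sum of scaled components; 0 = driest, 1 = wettest.

    The weights (NDVI, temperature, precipitation) are assumed values.
    """
    w_ndvi, w_lst, w_precip = weights
    if abs(w_ndvi + w_lst + w_precip - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return (w_ndvi * min_max_scale(ndvi, *ndvi_range)
            + w_lst * min_max_scale(lst, *lst_range, invert=True)
            + w_precip * min_max_scale(precip, *precip_range))


if __name__ == "__main__":
    index = scaled_drought_index(
        ndvi=0.5, lst=300.0, precip=0.0,
        ndvi_range=(0.2, 0.8), lst_range=(290.0, 310.0),
        precip_range=(0.0, 100.0))
    print(round(index, 3))
```

Scaling each component against its own historical range is what allows a single index to be compared across regions with very different baseline conditions.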
Zone scaling on thematic maps of wine grape terroirs was the subject of study of Vaudour et al. [23]. They proposed a mapping methodology of implementing bootstrapped regression trees on distinct combinations of morphometric data in conjunction with 20 meter SPOT satellite images. Furthermore, the authors managed to capture expert knowledge on grape harvest quality at various spots in South Africa on a regional scale. Their proof of concept methodology relied on a limited number of easily available spatial data at a medium spatial resolution; the positive aspect was the fact that the latter could be duplicated to any other homogeneous viticultural terroirs.
Productivity assessment analysis while implementing hyper-spectral reflectance recognition in relation to vines biophysical properties was proposed by Serrano et al. [24]. Their work focused on berry yield growth and quality attributes prediction in rain-fed vineyards. The indices implied differences among vineyards with relation to their vine canopy vigor and water status, on a water deficit time scale. Furthermore, their results indicated specific correlations among indices that are more susceptible to water sufficiency or deficiency on the scope of predicting berry quality attributes in relation to aquatic reserves.
Bourgeon et al. [25] proposed a methodology for characterizing vine foliage with the use of multispectral sensors. More specifically, they introduced a proximal imaging sensor that operates within the visible and near-infrared (NIR) spectral bands. The camera sensor was deployed at ground level and produced absolute reflectance images comparable to NDVI maps produced by commercial satellite systems. The physiological assessment of the vines was implemented through radiometric calibration of the multispectral imagery with image post-processing tools, for the purpose of monitoring the evolution of vine foliage over time.
Tang et al. [26] proposed both a proximal and a remote sensing method, utilizing two different methods for assisting in decision support and yield estimation from existing technologies. They employed ground video feed for green pixel, in conjunction to local thresholding and Self-Organizing-Maps from satellite imagery, in order to identify non-productive vines in the available canopy on the block level. This semi-supervised version provided recognition capabilities at the phenological stage as an automatic and low-cost monitoring application, potentially deployed over different times within the season.
Environmental, socio-economic, and physical effects on Mediterranean vineyards is the subject of study for Vinatier and Gonzalez-Arnaiz [27]. The researchers employed multi-temporal satellite imagery in order to comprehend and track the changes in land cover that shape agricultural landscapes. Furthermore, the changes in land management were composed in transition matrices that interpret gains, losses, and swaps for each category. In addition to temporal changes, images with sufficient spatial resolution were used in order to detect subtle differences in land use/cover on a specific scale.
Their work underlines the usefulness of satellite imagery in classifying alterations in yield distribution and quality, in relation to external parameters that drive land management in agricultural areas.
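The transition matrices mentioned above are cross-tabulations of two classified maps of the same area. The following minimal sketch builds such a matrix and derives per-class gains, losses, and swaps. The class labels are invented for illustration, and the swap is computed as twice the smaller of gain and loss, which is one common definition and is assumed here rather than taken from Vinatier and Gonzalez-Arnaiz.

```python
from collections import Counter
from typing import Dict, List, Sequence


def transition_matrix(before: Sequence[str], after: Sequence[str],
                      classes: Sequence[str]) -> List[List[int]]:
    """Rows: class at the first date; columns: class at the second date."""
    if len(before) != len(after):
        raise ValueError("both maps must cover the same pixels")
    counts = Counter(zip(before, after))
    return [[counts[(src, dst)] for dst in classes] for src in classes]


def gains_losses_swaps(matrix: Sequence[Sequence[int]],
                       classes: Sequence[str]) -> Dict[str, Dict[str, int]]:
    """Summarize change per class from a square transition matrix."""
    summary = {}
    for k, name in enumerate(classes):
        persistence = matrix[k][k]                            # unchanged pixels
        loss = sum(matrix[k]) - persistence                   # left this class
        gain = sum(row[k] for row in matrix) - persistence    # entered this class
        summary[name] = {
            "gain": gain,
            "loss": loss,
            "swap": 2 * min(gain, loss),   # assumed definition of swap
            "net": gain - loss,
        }
    return summary


if __name__ == "__main__":
    classes = ["vine", "scrub", "urban"]          # hypothetical land-cover classes
    before = ["vine", "vine", "vine", "scrub", "scrub", "urban"]
    after = ["vine", "scrub", "urban", "vine", "scrub", "urban"]
    matrix = transition_matrix(before, after, classes)
    for name, stats in gains_losses_swaps(matrix, classes).items():
        print(name, stats)
```

Swap captures simultaneous gain and loss of a class at different locations, a change that is invisible when only the net area per class is compared.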
Yield quality and productivity can be assessed through satellite and proximal sensing by estimating spectral vegetation indices (SVIs). Anastasiou et al. [20] employed satellite and proximal sensing at the various stages of table grape growth for three consecutive cultivation years (2015-2017). Landsat 8 satellite imagery and proximal sensing techniques were used to calculate two SVIs, the Normalized Difference Vegetation Index (NDVI) and the Green Normalized Difference Vegetation Index (GNDVI). The indices measure seasonal plant response, exploiting soil and plant spectra, and express the relationship between visible and near-infrared reflectance. Their results indicated higher accuracy of proximal sensing, which provided higher correlations and temporal accuracy during the earlier period of grape growth.
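Both indices named above are normalized differences between near-infrared reflectance and a visible band: NDVI = (NIR − Red) / (NIR + Red) and GNDVI = (NIR − Green) / (NIR + Green). A minimal per-pixel sketch follows; the reflectance values in the usage example are invented, and returning NaN for a zero denominator is a design choice of this sketch.

```python
def normalized_difference(band_a: float, band_b: float) -> float:
    """Generic normalized difference (a - b) / (a + b) of two reflectances."""
    denominator = band_a + band_b
    if denominator == 0:
        return float("nan")   # no signal in either band
    return (band_a - band_b) / denominator


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    """Green Normalized Difference Vegetation Index."""
    return normalized_difference(nir, green)


if __name__ == "__main__":
    # Toy surface reflectances for a single vegetated pixel.
    nir, red, green = 0.50, 0.10, 0.25
    print(round(ndvi(nir, red), 3), round(gndvi(nir, green), 3))
```

Both indices range from -1 to 1, with dense, healthy vegetation yielding values closer to 1 because it reflects strongly in the near-infrared and absorbs visible light.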
The importance of water resources in the quality of vine production is highlighted by Loggenberg et al. [28]. They introduce a machine-learning based remote sensing methodology for the study of water stress in Shiraz type vineyards. The results are based on terrestrial hyper-spectral imaging capturing the study areas, which were located in Stellenbosch at the Western Cape of South Africa. They employed several remote sensing methodologies upon data acquisition and pre-processing techniques combining on-site and satellite sampling. The remote sensing methodology consists of spectral smoothing for multi-band images, classification (implemented with Random Forest (RF) and Extreme Gradient Boosting (XGBoosting) techniques), dimensional reduction, and accuracy assessment. The researchers concluded that remote sensing machine learning methodologies on hyperspectral data demonstrated that Random Forest (RF) classification optimally corresponds to water stress modelling in applications related to viticulture.

Image Data from Drones
In data-driven agriculture, Unmanned Aerial Vehicles (UAVs), or drones, as they are usually referred to, are a valuable tool for making inferences about grape composition from multispectral measurements. They can acquire high-resolution aerial images, digital RGB (Red-Green-Blue) and hyper-spectral imagery of vineyards, as well as data of other modalities. When equipped with appropriate sensors, UAVs can collect useful data related to leaf temperature, vine water status, and canopy vigor. Depending on the needs of the application, the equipped sensors can gather thermal, visible, hyperspectral, and/or multispectral images. For example, hyperspectral sensors can capture a broader range of wavelengths, whereas a thermal camera can gather image data to determine canopy variables such as water stress.
Comba et al. [29] introduced the adoption of detection techniques in modern precision agriculture for quality production standards. The researchers proposed an image processing algorithm for the automatic detection of vine rows from gray-scale aerial imagery. Near-infrared (NIR) images were acquired with a Tetracam ADC-lite camera (http://www.tetracam.com/Products-ADC_Lite.htm) installed on a Mikrokopter Hexa-II drone (http://wiki.mikrokopter.de/en/HexaKopter). This array has a field of view and is able to achieve a 0.056 m/pixel ground resolution at a flight altitude of 150 m. The proposed method is able to process the original acquired image as it is, without any intervention or feedback from the user, and requires a very limited number of calibration parameters. The results obtained from processing 4 sample vineyard images demonstrate the effectiveness of the proposed method, with a very limited error rate, even in the case of images with disturbances and different vine row orientations in the same image.
The work of Matese et al. [30] describes the implementation possibilities of multi-sensor UAVs for a variety of monitoring tasks. The researchers assessed intra-vineyard variability by characterizing vine vigor with a multispectral camera, leaf temperature with a thermal camera, and missing plants with a dedicated analysis methodology based on a high-spatial-resolution RGB camera. The aforementioned apparatus was designed specifically for precision viticulture, with the aim of rapid and multi-purpose environmental monitoring within the areas of interest. The UAV provided images with discriminative details in the rows and inter-rows of vines and, as a result, provided pure pixels containing solely foliage. The usage of UAVs in agricultural applications in general, and in viticulture in particular, is enhanced by the high flexibility of use, low operating costs, and very high spatial resolution that they offer.
De Castro et al. [31] exploited the capabilities of UAVs to produce image data that support 3D landscape models. The aerial vehicles flew over 3 vineyards with a distinct gap between the vine rows at a different growth rate. More specifically, they chose to make the flights in July and September in order to capture two distinct phases of the crop. With this approach, they gathered information from 3 fields at two crop stages, covering a wide range of situations and thus producing robust data. The UAVs were flown at an altitude of 30 m, taking images with a spatial resolution of 1 cm per pixel (i.e., ground image size: 37 × 28 m) at a rate of 1 s per frame. They thereby achieved a 93% forward lap and a 60% side lap, in order to create a 3D reconstruction of the crops.
Multi-temporal monitoring of agricultural lands demands continuous characterization of vegetation on distinct dates. Pádua et al. [32] proposed a monitoring platform that uses multi-temporal data related to precision viticulture (PV) and can be used to monitor the size, shape, and vigour of grapevine canopies. Their work aims at vegetation characterization through multi-temporal analysis, using accessible and low-cost UAVs for the acquisition of very-high resolution imagery, up to ground sample distance (GSD) and accompanied sensor nodes. Thereafter, the gathered data were analyzed automatically, taking into consideration the natural, topographic, and morphological variables of the area of interest. Their method produced RGB orthophoto mosaics of entire vineyards, a multi-temporal field analysis, and, finally, a crop-related and non-invasive data acquisition method for farmers.
All in all, high-resolution aerial images from drones can be used to determine the spatial distribution of a variety of canopy variables within a vineyard block and between different vineyard blocks. In this way, UAVs can be used to measure the spatial distribution of vigor, water stress, nutrient status, disease, yield components, and berry composition. The rationale for employing drones in precision viticulture includes their higher flexibility of use, lower operational costs compared to conventional aircraft, and insensitivity to cloud cover. However, the utilization of UAVs in precision viticulture is a relatively new area of research that has not been extensively tested. The major challenges to be considered are the massive amount of collected data, the time required for post-processing it, and the need for feature engineering in order to derive useful vegetation indices that are specific to the variables of interest.
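The relation between flight parameters and image overlap can be made concrete with a short sketch. The footprint, overlap percentages, and frame interval are the figures quoted above; which side of the 37 × 28 m footprint lies along the flight direction is not stated, so the assignment below (28 m along-track, 37 m across-track) is an assumption, as are the function names.

```python
def exposure_spacing(footprint_m, overlap):
    """Distance between consecutive image centres for a given overlap fraction."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m * (1.0 - overlap)

def forward_overlap(footprint_m, speed_m_s, interval_s):
    """Forward overlap obtained when flying at a given speed and frame interval."""
    return 1.0 - (speed_m_s * interval_s) / footprint_m

ALONG_TRACK_M = 28.0   # assumed: short side of the footprint along the flight line
ACROSS_TRACK_M = 37.0  # assumed: long side of the footprint across the flight line
FRAME_INTERVAL_S = 1.0

# Spacing between consecutive frames that yields a 93% forward lap.
frame_spacing_m = exposure_spacing(ALONG_TRACK_M, 0.93)
# Ground speed implied by that spacing at one frame per second.
implied_speed_m_s = frame_spacing_m / FRAME_INTERVAL_S
# Spacing between adjacent flight lines that yields a 60% side lap.
line_spacing_m = exposure_spacing(ACROSS_TRACK_M, 0.60)
```

Under these assumptions, consecutive frames are roughly 2 m apart and adjacent flight lines roughly 15 m apart; high overlaps of this kind are typically sought when images are to be matched for 3D reconstruction.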

Machine Learning (ML) Methodologies for Agricultural Data
In recent years we have witnessed a plethora of technological breakthroughs in domains such as computer vision and machine translation, which have led to the establishment of Machine Learning (ML) as one of the core tenets of the current technological landscape. The aforementioned successes can mainly be attributed to the utilization of large and complex Deep Neural Network (DNN) architectures, better known as "deep learning." Moreover, the evolution and transformation of the smart agriculture field over recent years has resulted in the mass production of huge amounts of data, information, and content from many different sources, such as IoT devices and sensors, environmentalists, agronomists, winemakers, or plain farmers and interested stakeholders. Thus, deep learning, along with other more "conventional" machine learning methodologies, can play a pivotal role in addressing numerous research problems within the agricultural domain, mainly through the utilization of the large amount of available digitized agricultural information.
In the following, Section 3.1 presents research works related to exploiting continuous data (e.g., data from sensors such as temperature, soil moisture, etc.) for the purpose of creating predictive Machine Learning models capable of providing various forecasts, whereas Section 3.2 describes the usage of image-based information, mostly by deep learning models, which are capable of classifying plant species, detecting diseases, and, finally, performing spatial-aware object classification.

ML-Based Forecasting on Numerical Data
One noteworthy work in the field is that of Radhika and Shashi [33], who exploit Support Vector Machines (SVMs) for the task of predicting the maximum daily temperature. They use a Multi-Layer Perceptron (MLP), trained through back-propagation, as a baseline. To accomplish the aforementioned prediction task, they rely on weather data provided by the University of Cambridge for the period 2003-2007. Through their experimental findings, the authors show that the SVM-based approach achieves significant results, demonstrating its suitability for this particular prediction problem.
Similarly, Gill et al. [34] develop an SVM regression model in order to predict soil moisture. In particular, the authors utilize both meteorological and soil moisture data from 11 weather stations in the Oklahoma region of the United States of America for training and testing their prediction model. Finally, by comparing their proposed model against an Artificial Neural Network (ANN), they showcase its efficiency.
Mohammadi et al. [35] develop a predictive model for the task of inferring the dew point temperature, i.e., the temperature at which water vapor in the air condenses into liquid, since by predicting this, it is also possible to determine whether it will rain or snow at a particular date. To achieve that, the authors exploit Extreme Learning Machine (ELM), i.e., feed-forward neural networks, in which the weights of the nodes within a hidden layer do not require any kind of tuning. Through an experimental evaluation of their model against conventional machine learning models (e.g., SVMs and ANNs), using as a training dataset measurements from two meteorological stations in Iran, the authors were able to show that ELM can be applied for this problem with promising results.
Salcedo-Sanz et al. [36] propose a Machine Learning model for long-term air temperature prediction, a task that is useful for a plethora of domains, such as agriculture. They use the publicly available monthly mean temperature dataset provided by the Australian Bureau of Meteorology (BOM), covering urban and regional areas in Australia. They develop and evaluate two distinct methods, i.e., an SVM regressor and an ANN, and show that the former outperforms the latter on the air temperature prediction task.
Abbot and Marohasy [37] exploit Artificial Neural Networks in order to predict rainfall, for the region of Queensland in Australia. More specifically, the authors apply the aforementioned category of Machine Learning models for the purpose of providing a medium-term rainfall forecast that outperforms conventional statistical models typically exploited for this task. Similarly, Chithra et al. [38] utilise ANNs, for detecting the impact of climate change on the monthly mean maximum and minimum temperature in the Chaliyar river basin located in India.
Onal et al. [39] proposed an extended IoT framework based on data integration, retrieval, processing, and learning layers through a weather data clustering analysis methodology. Their learning model implemented unsupervised clustering as a method of utilizing big data originating from 8000 weather stations in North America. Air temperature, wind speed, relative humidity, visibility, and pressure data were analyzed by a traditional k-means clustering algorithm with respect to the stations' geographical alignment.
Sehgal et al. [40] developed "ViSeed", a Machine Learning and visualization framework for weather and yield prediction. In particular, the authors train and evaluate a Long Short-Term Memory (LSTM) classifier on the Syngenta 2016 crop data challenge. Song et al. [41], in turn, utilize a Deep Belief Network (DBN) in order to predict the soil moisture content in a corn field. By training and evaluating their proposed model on soil data collected from the Zhangye oasis, in Northwest China, and benchmarking against a multi-layer perceptron, the authors were able to showcase its effectiveness for the aforementioned task.
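To make the clustering step more tangible, the following is a minimal sketch of the standard k-means (Lloyd's) iteration in plain Python. It is not the implementation of [39]: the six station readings (air temperature in °C, relative humidity in %), the choice of k = 2, and the function names are hypothetical, and in practice features with different units would normally be standardized before clustering.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct observations
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(point, centroids[j]))
            for point in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the algorithm has converged
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment

# Hypothetical readings: (air temperature in deg C, relative humidity in %).
stations = [(30.0, 20.0), (31.0, 22.0), (29.0, 21.0),
            (10.0, 80.0), (11.0, 82.0), (9.0, 79.0)]

centroids, labels = kmeans(stations, k=2)
```

On this toy input the hot and dry stations end up in one cluster and the cool and humid ones in the other; which of the two receives label 0 depends on the random initialisation.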
Finally, Priya et al. [42] developed an integrated system for precision agriculture implementing big data analytics, IoT, and Machine Learning methodologies. Their crop recommendation system uses Naïve Bayes to indicate if the conditions are optimal for sowing, plant growth, and harvesting of a plant, in addition to yield prediction and pesticide recommendation. Furthermore, the researchers produced a scalable model based on data collected from various sources, such as satellite images, sensor recorded field data, irrigation related reports, crop data, and weather data in India.
The interested reader may consult Table 1 for a brief comparative overview of the above fundamental research efforts. More specifically, the table includes four columns: the first column contains each work's bibliographic reference number; the second column describes the main task the particular work attempts to tackle; the third focuses on the main methodology or algorithmic approach the authors propose or utilize in order to solve the research task at hand; finally, the fourth column provides information on the utilized dataset(s) (if any).

Scene Classification
In the framework of scene classification, Rahnemoonfar and Sheppard [43] apply a deep neural network to the task of fruit counting. In particular, the authors present two different neural architectures, i.e., a novel convolutional neural network (CNN) and a modified version of the well-known Inception-ResNet network. In order to train the presented deep neural architectures, the authors rely on a synthetic dataset, whereas, in order to validate their performance, they use real images collected from Google Images. Through their experiments, the authors showcase the efficiency of their proposed neural architecture against various baselines.
Similarly, Sa et al. [44] present DeepFruits, a neural network for fruit detection. More specifically, the authors adopt the Faster Region-based CNN architecture and, through transfer learning, train it to recognize sweet peppers contained in an image. Yalcin [45] developed Deep-Pheno, a CNN for the task of recognizing and classifying the phenological stages of plants, i.e., the stages of their life cycle. The author uses images depicting plants such as cotton, wheat, and pepper, taken by cameras installed on various agro-stations in Turkey, for training and testing the model. The presented neural architecture was able to outperform various conventional machine learning algorithms in the aforementioned classification task.
Lee et al. [46] used CNNs in order to extract features from images in a completely unsupervised way, so as to train classifiers able to identify 44 different plant species. Moreover, the authors try to understand the feature representation of the model, so as to shed some light on its inner decision mechanisms. The authors train and test the CNN on a leaf dataset collected from the Royal Botanic Gardens, Kew, in England. Finally, in order to evaluate the usefulness of the features that the network learns, they use them as input to NN and SVM classifiers, and are thus able to achieve results that surpass the state of the art. In a similar approach, Dyrmann et al. [47] propose a CNN architecture for classifying images of seedlings during the early growth stage of a plant. More specifically, the authors train a classifier that is able to distinguish, using vertically photographed images of seedlings, 22 different plant species, achieving state-of-the-art results by training the network on 10,413 images collected from six different datasets.
Furthermore, Bargoti and Underwood [48] present an image-based fruit detection model using the Faster R-CNN neural architecture for mangoes, almonds, and apples, achieving very good results. In order to train the model, the authors use images captured in orchards in the regions of Victoria and Queensland in Australia. They present ablation studies so as to showcase the inner functionality of the neural model, as well as which factors within an image play an important role in the object detection task. Mohanty et al. [49] apply convolutional neural networks to automatically detect and diagnose plant diseases from images. To achieve that, the authors utilize a public dataset containing 54,306 images of both diseased and healthy plant leaves, and train two neural architectures, i.e., AlexNet and GoogLeNet, for classifying crop species and disease status, achieving 99% accuracy.
Similarly, Sladojevic et al. [50] use a convolutional deep learning model in order to automatically classify and detect plant diseases from images of leaves. The proposed model was able to detect leaf presence and distinguish between healthy leaves and 13 different diseases that can be visually diagnosed. Moreover, the authors created a new dataset for training and testing Machine Learning models for this particular task, containing more than 3000 images gathered from the Internet, and then extended it to more than 30,000 images using various data augmentation techniques. Mortensen et al. [51] utilize a deep neural network to perform semantic segmentation for mixed crops. In particular, the authors used a convolutional neural architecture (i.e., VGG16) and, through transfer learning (i.e., fine-tuning the network instead of training it from scratch), had it successfully estimate both the individual components and the total biomass of a crop, using an image dataset from a plot experiment at Foulum Research Center in Denmark.
Rußwurm and Körner [52] perform multi-temporal classification on a time series in order to classify crops over two seasons, achieving state-of-the-art results and thus showcasing the efficacy of deep learning methodologies for this particular domain. More specifically, to perform the aforementioned task, the authors applied an encoder-decoder neural architecture to a time-series dataset from Earth observation (EO) sensors covering the region of Munich, Germany. Finally, they also provide a visualization of the neural network's activations, so as to facilitate a better understanding of its inner decision-making mechanisms. Pound et al. [53] exploit neural networks to perform image-based plant phenotyping, i.e., to describe features of a plant such as its anatomical, ontogenetical, physiological, and biochemical properties. In particular, the authors train two different CNN-based neural networks to perform root and shoot detection for wheat varieties, using a different dataset for each task. Using this experimental approach, they were able to achieve state-of-the-art results (i.e., >97% accuracy) for root and shoot feature identification and localization.
Namin et al. [54] present a neural architecture consisting of a CNN and a Long Short-Term Memory (LSTM) network for the task of plant classification. In particular, the authors utilize the CNN for the unsupervised generation of features, and the LSTM to investigate the growth of plants. By applying their deep learning framework to an image-based time-series dataset depicting 4 different categories of Arabidopsis thaliana plants, they were able to show that deep learning networks can generate more useful features than traditional hand-crafted image analysis features, as well as achieve better results than conventional Machine Learning methodologies. Kussul et al. [55] perform land cover and crop type classification using a dataset consisting of remote sensing data. To effectively tackle this particular problem, they use a multilevel deep learning approach, i.e., both supervised and unsupervised learning through neural networks, for segmenting and classifying a multi-source satellite image dataset. Their proposed neural architecture was able to outperform traditional Machine Learning techniques such as random forests.
Riegler-Nurscher et al. [56] studied the sustainable cultivation of arable lands by introducing a novel generic image analysis method for the estimation of soil cover. They implemented a machine learning methodology for distinguishing between various types of soil cover by training on a manageable number of classification samples. Their method deployed entangled random forests on a wide training dataset of soil images obtained by high-resolution cameras. Furthermore, the pixel-wise method produced an accurate classification of living plant material, building the basis for further development of soil cover estimation in agriculture.
Lastly, Hall et al. [57] investigate whether traditional hand-crafted image features can outperform those generated in an unsupervised way through deep neural networks for the task of leaf classification. By training a deep convolutional neural network on the Flavia dataset, the authors were able to show that features generated by the network are more useful than hand-crafted image features, although they stress that combining both approaches leads to the best results. Table 2 provides a brief comparative overview of the research efforts discussed above. As was the case with Table 1, it includes four columns: the first contains each work's bibliographic reference number, the second describes the main task the particular work attempts to tackle, the third focuses on the methodology the authors propose or utilize in order to solve the research task at hand, and the fourth presents information on the utilized dataset(s) (if any).

Spatial-Aware Object Classification
Modern imagery is characterized by hierarchies of features that are well described through convolutional networks. The goal of producing outputs of suitable size, while keeping inference and learning efficient, is achieved by building fully convolutional structures that handle inputs of arbitrary size. Furthermore, deep machine learning is a powerful and robust tool for analyzing and making predictions from statistical, geographical, and multispectral optical big data. It provides predictions and simulations of landscape expansion and evolution (geographical big data) in a more reasonable and robust manner. In this framework, Long et al. [58] laid the basis for semantic segmentation through convolutional networks. More specifically, they noted that segmentation bridges the gap between semantics and location, based on the distinction between global (i.e., "what?") and local (i.e., "where?") information. Based on deep feature hierarchies, the researchers were able to encode location and semantics in a nonlinear local-to-global pyramid. They defined a "skip architecture" to take advantage of a spectrum of features that combines deep, coarse, semantic information with shallow, fine, appearance information. Furthermore, they stated that fully convolutional networks are a rich class of models, which extend classification nets to segmentation. Additionally, the architecture can be improved with multi-resolution layer combinations that dramatically improve on the state of the art, while simultaneously simplifying and speeding up learning and inference.
The importance of aerial survey is underlined by the existing image collection and interpretation tools. Recent rapid progress in these domains is due to the accessibility of and advancements in technology. However, imagery has little meaning by itself, unless it is processed and meaningful information is derived from it. Therefore, image classification comes as a way to address the problem of algorithm generalization through deep learning for the purposes of object recognition and detection, scene parsing, and classification. Categorizing each image pixel into a semantic class is typically called "Image Segmentation" and aims at understanding an image at the pixel level, although images of a fixed size are required to achieve optimized classification, dense prediction of images of any size, and significantly increased speed in the process. Still, the concept of smart agriculture dictates the automation of such tasks, especially where optical sensors play an important role. Guijarro et al. [59] developed an image processing algorithm based on a new automatic approach for segmenting the main image textures. Additionally, they managed to refine the identification of sub-textures inside the main ones. They implemented a three-step approach to automatic image segmentation that proved effective under different illumination conditions in outdoor environments.
The re-establishment of bee colonies through color satellite image segmentation was the subject of study of Sammouda et al. [60]. They introduced a pixel clustering methodology based on a Hopfield Neural Network (HNN) that accounts for monitoring aspects such as environmental conditions, poor resolution, and poor illumination. The algorithm was run successively with two, three, four, five, and six clusters, as well as with data from the RGB channels, taking into consideration aspects such as population density, ecological distribution, and flowering phenology. Furthermore, supervised multispectral land-use classification may be implemented through a multi-view deep learning methodology; Luus et al. [61] designed a deep CNN (DCNN) for a well-known dataset (UC Merced Land Use dataset: http://weegee.vision.ucmerced.edu/datasets/landuse.html) that is used as a benchmark for several applications. Their end-to-end learning system was capable of complex land-use classification of high-resolution multispectral aerial imagery. Furthermore, it was shown that multi-scale views can be used to train a single network and increase classification accuracy compared to using single-view samples.
High-resolution remote sensing (HRRS) provides pixel-level classification of individual entities in detail. Längkvist et al. [62] proposed a method for classifying objects from multispectral orthoimagery and digital surface models (DSMs). The researchers investigated the use of convolutional neural networks (CNNs) for per-pixel classification of satellite imagery with very high resolution (VHR) characteristics. They managed to categorize large landscapes into 5 classes (vegetation, ground, road, building, and water) and found that a context size of about 24 pixels yields low training and testing times together with high and stable classification accuracy. Additionally, the researchers deployed multiple CNNs in parallel with varying context areas in order to provide a stable and accurate classification result.
Furthermore, entities of the agricultural terrain can be modelled from remotely sensed data through suitably developed algorithms. Comba et al. [29] proposed a categorizing method for parallel and equidistant vine rows in order to eliminate spatial periodicity during the image processing stage. More specifically, the proposed algorithm is able to remove disturbing image elements (e.g., roads, shadows) and minimize cloud obstruction. Their multi-parameter technique was implemented autonomously, without user intervention, while allowing automatic path computation in the corridors between vine rows. Finally, the computational effort is affordable, and results for complex images can be obtained within a few minutes of processing.
The use of crowd-sourced geo-tagged field photos in agricultural monitoring applications forms another interesting approach and was investigated by Xu et al. [63]. They developed a land cover type recognition model for field photos based on deep learning techniques. More specifically, they combined a pre-trained convolutional neural network (CNN), as the image feature extractor, with a multinomial logistic regression model, as the feature classifier. Their model was able to report its prediction confidence, enabling users to distinguish reliable from unreliable predictions. The researchers also opened a new research direction in citizen science by demonstrating that Artificial Intelligence can help with land cover classification and by evaluating the model's performance.
The topic of land classification is strongly related to agricultural applications through the mapping of changes and transformations of the landscape. Wurm et al. [64] investigated the application of Fully Convolutional Networks (FCNs) to the categorization of map entities from various satellite images. They deployed high-resolution optical imagery from Quickbird (https://bit.ly/2Fvpq4u), in conjunction with Sentinel-2 (https://bit.ly/2UVFO3C) and TerraSAR-X (https://bit.ly/2WlZHBb) data, in FCNs, which do without the fully connected layers popularized by CNN architectures. Furthermore, they explored the capabilities of "transfer learning" by adopting a CNN pre-trained on VHR optical Quickbird imagery. The pre-trained model was then applied to datasets with larger mapping areas but lower geometric resolution, such as Sentinel-2. Additionally, they assessed the capabilities of transfer learning from optical imagery to active SAR imagery from TerraSAR-X. Their results showed extremely promising segmentation outcomes that can be used as a proof of concept for transfer learning and fully convolutional networks in landscape mapping from satellite imagery.
Lastly, following the previous notation, we introduce Table 3, which provides a comparative overview of the above-mentioned research efforts in the same four-column tabular manner. It should also be clear at this point that the above research works present image processing techniques that can be used to enhance agricultural practices, by improving the accuracy and consistency of processes, while at the same time reducing farmers' manual monitoring procedures. These techniques are quite important, since they offer flexibility and effectively substitute the farmers' visual decision making, along with the identification of textures belonging to the soil and of other variables, such as humidity and smoothness.
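The idea of a classifier that reports how confident it is can be sketched as follows. The code assumes that an image has already been reduced to a feature vector by some pre-trained CNN and applies a multinomial logistic regression (softmax) layer on top. The weights, the two-dimensional features, the class names, and the 0.6 confidence threshold are all hypothetical and are not taken from [63], whose exact confidence measure may differ.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(features, weights, biases, labels, threshold=0.6):
    """Return (label, probability, is_reliable) for one feature vector."""
    # One linear score per class: w . x + b
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    # The top-class probability is used as the confidence estimate.
    return labels[best], probs[best], probs[best] >= threshold

# Hypothetical classifier parameters (one weight row per class).
LABELS = ["cropland", "forest", "water"]
WEIGHTS = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
BIASES = [0.0, 0.0, 0.0]

clear_photo = predict_with_confidence([2.0, 0.0], WEIGHTS, BIASES, LABELS)
ambiguous_photo = predict_with_confidence([0.5, 0.5], WEIGHTS, BIASES, LABELS)
```

A photo whose features clearly favour one class is flagged as reliable, whereas one with two nearly tied classes falls below the threshold and could be routed to a human annotator instead.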

Intelligent Knowledge Acquisition
The adoption of smart technology (e.g., sensors) within the domain of agriculture has led to an ever-increasing amount of digital information that can be utilized for various knowledge extraction tasks through visualization platforms and predictive models. Moreover, the datasets created from the aforementioned smart devices may be further enriched through manual and/or automated annotation techniques in order to improve their usefulness. In the following, Section 4.1 describes how the domain of knowledge base creation may benefit agriculture, whereas Section 4.2 depicts how leveraging various data visualization techniques can contribute to a better understanding of the applicability of agricultural data. Finally, Section 4.3 showcases the applicability of crowd-sourcing to the domain of agriculture, stressing various use-cases that can potentially enhance the efficiency of agriculture-related tasks.

Knowledge Base Creation Characteristics
As the world becomes full of sensor networks producing information, agriculture enters the fields of data processing, storage, and retrieval at a very fast pace. Through various sources of information deriving from a wide data spectrum (e.g., geo-information, soil measurements, weather forecasts, and chemical analyses, to mention just a few data streams), the agriculture industry is in a position to grow an extremely rich system of integrated IoT data to power more informed decisions. The main challenge lies in the high heterogeneity of data types in the agricultural sector: agriculture makes use of many different kinds of data (e.g., research, meteorological, soil, financial/economic, statistical, satellite/remote sensing, administrative, germplasm, crop experiments, and field trials, to name a few), and for each data type there are numerous standards in use, which are not always linked to each other or at least to the most prominent one of the sector. Thus, all collected data that accompany an agricultural object (i.e., crops, seeds, etc.) are typically stored in metadata records.
Metadata records are critical to the documentation and maintenance of interrelationships between information resources and are used to find, gather, and maintain resources over long periods of time, such as historical viticulture data. The consistent application of a descriptive metadata standard improves the user's search experience and makes information retrieval within a single collection or across multiple datasets more reliable. Descriptive, administrative, technical, and preservation metadata contribute to the management of information resources and help to ensure their intellectual integrity both now and in the future. In parallel with other domains, many researchers in the digital agricultural community have recognized the need to lower the barriers to the management and aggregation of digital resources by implementing some measure of interoperability among metadata standards and, subsequently, with proprietary data structures. As a result, there is a wide range of proposed solutions, including crosswalks, translation algorithms, metadata registries, and specialized data dictionaries.
Typically, a crosswalk provides a mapping of metadata elements from one metadata schema to another and is considered one of the most robust and flexible solutions. A meaningful mapping requires a clear and precise definition of the elements in each schema. The primary difficulty is to identify the common elements in different metadata schemas and to put this information to use in systems that resolve differences between incompatible records. Crosswalks are typically presented as tables of equivalent elements in two schemas and, even though the equivalences may be inexact, they represent an expert's judgment that the conceptual differences are immaterial to the successful operation of a software process that involves records encoded in the two models. A crosswalk enables a retrieval mechanism to query fields with the same or similar content in different data sources. In other words, it supports the well-known notion of "semantic interoperability", which is much desired in the digital world.
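As a minimal sketch of this idea, a crosswalk can be represented as a lookup table between element names and applied to a record before querying. The element names of both schemas, the sample record, and the helper functions below are hypothetical and serve illustration only; they are not taken from any standard or system discussed in this survey.

```python
# Hypothetical crosswalk: element names of a local field-trial schema mapped
# to element names of a hypothetical target schema (illustrative assumption).
CROSSWALK = {
    "crop_name": "title",
    "trial_site": "coverage",
    "sowing_date": "date",
    "researcher": "creator",
}


def apply_crosswalk(record, crosswalk):
    """Translate a metadata record; return its (mapped, unmapped) parts."""
    mapped, unmapped = {}, {}
    for element, value in record.items():
        target = crosswalk.get(element)
        if target is None:
            # No equivalent element exists: set it aside instead of guessing.
            unmapped[element] = value
        else:
            mapped[target] = value
    return mapped, unmapped


def search(records, element, term):
    """Query one target element across records that were already mapped."""
    term = term.lower()
    return [r for r in records if term in str(r.get(element, "")).lower()]


source_record = {
    "crop_name": "Durum wheat trial",
    "trial_site": "Thessaly",
    "sowing_date": "2018-11-02",
    "soil_ph": 7.4,
}
mapped, unmapped = apply_crosswalk(source_record, CROSSWALK)
```

Elements without an equivalent (here `soil_ph`) are returned separately rather than dropped, so that an expert can later decide whether the crosswalk table should be extended.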
Crosswalks are not only important for supporting the demand for a single point of access or cross-domain searching; they are also instrumental for converting data from one format to another. However, aggregating metadata records from different repositories may create confusing display results, especially if some of the metadata were automatically generated or created by institutions or individuals that did not follow best practices or use standard thesauri and controlled vocabularies. Still, mapping metadata elements between different schemas is only one level of crosswalking. Another level of semantic interoperability addresses datatype registration and the formatting of the values that populate the metadata elements, e.g., rules for recording personal names or encoding standards for dates, as well as the alignment between local authority files and adopted terminologies; these topics, however, exceed the scope of the current survey.
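Although a full treatment is out of scope, the value-level aspect can be illustrated briefly for dates. The sketch below assumes that the aggregated repositories use one of three date layouts (an assumption made purely for illustration) and rewrites each value in the ISO 8601 form YYYY-MM-DD; the function name and the list of layouts are hypothetical.

```python
from datetime import datetime

# Date layouts assumed to occur in the source repositories (illustrative).
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def normalize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if unrecognized."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            # This layout does not match; try the next one.
            continue
    return None
```

Returning `None` for unrecognized values, rather than guessing, leaves ambiguous cases (for instance, day-first versus month-first layouts) to be settled by an explicit rule for each repository.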

Data Visualization
Taking the above discussion a step further, conceptual understanding of agricultural data is developed with the assistance of visual representations of natural processes. Graphical representation, computer animation, and spatial overlay of agricultural data are often used to visualize natural phenomena, as a derivative of data analysis and ML. Furthermore, scientific illustrations provide guidance on constructing visual models for knowledge systems and on connecting them to the visual modeling of the corresponding findings. Scientific papers convey the important concepts, schemata, processes, and rules that enhance learning through paradigms. Each figure or diagram captures a particular element and processing step that guides researchers in forming the appropriate research concept. Additionally, dynamic visualizations offer comparisons that investigate the effect of visualization components on the acquisition of environmental and spatial factors of agricultural research. With respect to image processing and analysis, visualization techniques are used to monitor and manage images, reducing transition complexities and raising the future potential of the research.
The goal of data visualization is to bridge the gap between the researcher and data concepts on the basis of their underlying physical laws and properties. The presentation of massive amounts of data as figures allows the researcher to trim ineffectual information, focus on what is necessary, and comprehend the semantics of the data. In this framework, Bajaj [65] categorized the aspects of dynamic representation of data into three categories, namely computation, display, and querying, which address the parameters of every visualization environment. It is worth noting that visualization techniques are based not only on data types, but also on the processing operators that are inherent in each technique [66]. The techniques can be described by a visualization data pipeline that distinguishes four data stages: value, analytical abstraction, visualization abstraction, and view. These stages describe the states of the data and their corresponding processing operations.
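The four stages can be made concrete with a small sketch. The soil moisture readings, the choice of a per-plot mean as the analytical abstraction, and the text bar chart used as the view are all assumptions made for illustration; the pipeline in [66] does not prescribe these particular operators.

```python
from collections import defaultdict

# Stage 1, value: raw (plot, soil moisture %) readings; invented sample data.
values = [
    ("plot A", 21.0), ("plot A", 23.0),
    ("plot B", 35.0), ("plot B", 37.0),
    ("plot C", 12.0),
]


def analytical_abstraction(readings):
    """Stage 2: derive data about the data, here the mean per plot."""
    grouped = defaultdict(list)
    for plot, moisture in readings:
        grouped[plot].append(moisture)
    return {plot: sum(v) / len(v) for plot, v in grouped.items()}


def visualization_abstraction(means, width=20):
    """Stage 3: turn the means into displayable bar lengths (characters)."""
    largest = max(means.values())
    return {plot: round(width * m / largest) for plot, m in means.items()}


def view(bars):
    """Stage 4: render the final presentation, here a text bar chart."""
    return "\n".join(f"{plot} {'#' * n}" for plot, n in sorted(bars.items()))


means = analytical_abstraction(values)
bars = visualization_abstraction(means)
chart = view(bars)
```

Each function plays the role of a processing operator that moves the data from one stage to the next, so replacing only `view` (for example, with a map overlay) would leave the earlier stages untouched.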
The familiarity of information entities in relation to their real-world representation is crucial for human perception. Visualization is commonly distinguished into "science visualization" and "information visualization", and this distinction is implemented through the development of "visual analytics". Information visualization, as a branch of Human-Computer Interaction (HCI), utilizes graphics to assist researchers in grasping and interpreting data for the purpose of forming mental models of the data. Accordingly, numerical entities can reveal specific features and patterns of the obtained information. Furthermore, the deployed toolset aims at improving "the clarity and aesthetic appeal of the displayed information and allows a person to understand large amount of data and interact with it" [67].
The advent of Big Data and ML steers data research towards visual representations that aim at identifying hidden patterns or extremities in data, increasing query flexibility and specialization, comparing various units in order to obtain relative differences in quantities, and enabling on-the-fly human interaction. New possibilities of visualization distinguish its techniques by three factors, namely data type, process, and interoperability. It is worth noting that human perceptual capabilities are not sufficient to take in large amounts of data; thus, each method can support varied data types, various optical tools, and varied methods of interaction. In particular, agricultural space, as a scientific field that employs interdisciplinary research methods, utilizes spatio-temporal transitions of land characteristics in conjunction with their physical parameters. The various facets of digital representation are based on important aspects of each dataset and on the distinctions between data and phenomena. In addition, agricultural research is based on field-collected data that require high-level abstract models of data processing [68]. The data-flow-driven model is overlaid on a cartographic representation of the areas of interest, utilizing static and dynamic maps. Furthermore, such a system is best supported by a client-server architecture, in order to sustain scalability for handling data-intensive agriculture applications. By contrast, numerically based analytics produce outputs that are displayed in the form of queries, reports, charts, and diagrams. These static numerical outputs can typically be used standalone or superimposed on a dynamic visualization platform.

Crowd-Wisdom Exploitation
During the past decade, the wisdom of the crowd, or so-called crowd-sourcing, has established itself as a dominant platform for exploiting the skills and knowledge of online Internet users. In particular, crowd-sourcing can be perceived as outsourcing the task of data collection and/or data annotation to a group of non-professionals. The contributors to a crowd-sourcing platform may receive monetary compensation or provide their services free of charge, with the collective good in mind. In agriculture, crowd-sourcing may be used to increase the efficiency of a multitude of tasks. For example, sensors can be used in a number of agricultural contexts. By developing a platform that stores sensor information, users will be able to submit data stemming from their own sensor-based projects, which can lead to the aggregation of information across different plant and/or crop species and geographical areas. Moreover, the task of identifying the species to which a plant belongs may be enhanced through crowd-sourcing. In particular, a contributor can improve the performance of a Machine Learning classifier by providing hand-annotated images along with the correct labels, as well as various metadata that can be utilized as features by the predictive model. Finally, weather information is pivotal for farmers. A number of mobile applications use forecast information and measurements from official weather stations together with data provided by amateur stations.
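For the species identification use-case, one simple way to turn several contributors' labels into a single training label is a majority vote with an agreement threshold. The works surveyed here do not prescribe this particular scheme; the function name, the threshold of 0.6, and the sample annotations below are assumptions made for illustration.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.6):
    """Majority vote per image; low-agreement images get None for review.

    Assumes every image has at least one contributed label.
    """
    consensus = {}
    for image_id, labels in annotations.items():
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        consensus[image_id] = label if agreement >= min_agreement else None
    return consensus


# Invented contributions: image id -> species labels from different users.
annotations = {
    "img_001": ["Vitis vinifera", "Vitis vinifera", "Vitis labrusca"],
    "img_002": ["Olea europaea", "Olea europaea"],
    "img_003": ["Triticum durum", "Triticum aestivum"],
}
consensus = aggregate_labels(annotations)
```

Images on which contributors disagree (here `img_003`) are marked `None` so that they can be routed to an expert instead of entering the classifier's training set with an unreliable label.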
However, in order for the aforementioned applications to reach their full potential, a number of initiatives and standards must be enforced within crowd-sourcing platforms in the future, so as to ensure open participation, implement data quality policies, and address privacy and security issues with respect to storing and utilizing the crowd-sourced information. In addition, care should be taken regarding the identities of the contributors, and easy-to-use interfaces have to be provided, since the Information and Communication Technology (ICT) skills of the contributors may vary significantly. One may identify different criteria for a crowd-sourcing initiative to be appealing and sustainable on a mass scale in the field. Among others, good participatory user interfaces, the salience of the services and information provided to farmers, and the provision of reliable services are considered absolutely necessary for the advancement of such an initiative. Care should also be taken regarding the credibility of the information provided to farmers or agricultural stakeholders, together with appropriate data quality policies, which are yet to be defined, at least at the European level. As a final thought, sustainability depends considerably on contextual information feedback, both from the farmer's perspective and from that of the community to which the farmer may or may not belong.

Discussion and Conclusions
In this smart agriculture survey paper we attempted to briefly report on the basic steps and methods of indicative solutions that span different disciplines and research areas and, as a result, cover quite diverse research directions. From the above analysis and presentation it should be evident that most of the presented methodologies and scenarios are based both on traditional, well-established computer science algorithms and techniques, which are currently being adapted to work on newly acquired agricultural data, and on innovative aspects enabled by modern technologies, such as drones and satellites, whose usage and exploitation were not available until very recently. Thus, the overall conclusion may be summarized as follows: when dealing with intelligent agricultural information and data, computing applications, and systems, researchers should in principle initially target more flexible and rather generic approaches, and then try to identify additional but inherent sources of meaningful information to boost and enhance their findings in a qualitative manner. Since smart agriculture is now widely accepted to take many forms, the era of extremely focused and domain-specific applications and systems is considered to be largely over. By integrating the works briefly presented herein, and by providing a short discussion and interpretation for each identified group, we anticipate that the proposed classification will be used both by fellow academics and by people involved in almost all steps of the agricultural production chain as a guide towards a range of meaningful research directions.
Within this review we presented several studies focusing on intelligent agricultural information handling methodologies for improving and/or gaining insight into a multitude of different tasks within the agricultural domain. We organized these studies into three major categories, namely data sources and collection, machine learning methodologies for agricultural data, and, finally, intelligent knowledge acquisition. We believe that the economic importance of the agricultural industry worldwide demands the development of innovative methodologies for precision and optimal farming. Irrespective of the geographical region at hand, the harvest varies from year to year due to soil conditions, disease, pests, climate, and variation in yield management practices. These factors make the production of high-quality produce a rather challenging task. Still, traditional and innovative computational methodologies, such as image analysis, remote monitoring, and Machine Learning, have the potential to provide an inexpensive, non-destructive way of capturing precise information about the crop. From the analysis of the research works included herein, it is rather evident that the particular problems identified within most agricultural learning applications concern the specific models and frameworks employed, the sources, actual nature, and pre-processing of the data used, and the overall performance achieved according to the metrics deployed in each research work. Agricultural monitoring provides data and knowledge for deep learning and other popular computational techniques, with respect to differences in classification or regression performance. Considering the impact and applicability of Machine Learning, our motivation was to identify the main recent trends within this particular application domain, in order to facilitate a better understanding of the field for future studies.
Overall, a clear trend can be identified, which may be summarized as the active utilization of Machine Learning methodologies for a multitude of agricultural tasks. Our future plans include the examination of innovative areas of smart agriculture in narrower, yet trending, application domains, such as specific popular social networks (e.g., Twitter, Facebook), so as to be able to monitor evolving trends in agricultural data utilization and identify potential new data models in the process. Clearly, it is not possible to tackle all aspects of modern agriculture-related computing in a single paper, nor to discuss all of its expressions; consequently, several remaining open issues are left for the experienced reader to identify as food for thought. Such considerations should take into account not only the researchers' but also the developers' point of view on the matter. Developers are encouraged to take this survey's observations into account, but they would need to do so without sacrificing the required usability or increasing the computational complexity of their applications. All these remarks are considered crucial for enabling sufficient and innovative agricultural information distribution in wide-area, real-life deployments, within both current and future informatics applications, systems, and scientific research.

Conflicts of Interest:
The authors declare no conflict of interest.