A Platform Approach to Smart Farm Information Processing
Agriculture 2022, 12, 838

With the rapid growth of population and the increasing demand for food worldwide, improving productivity in farming procedures is essential. Smart farming is a concept that emphasizes the use of modern technologies such as the Internet of Things (IoT) and artificial intelligence (AI) to enhance productivity in farming practices. In a smart farming scenario, large amounts of data are collected from diverse sources such as wireless sensor networks, network-connected weather stations, monitoring cameras, and smartphones. These data are valuable resources to be used in data-driven services and decision support systems (DSS) in farming applications. However, one of the major challenges with these large amounts of agriculture data is their immense diversity in terms of format and meaning. Moreover, the different services and technologies in a smart farming ecosystem have limited capability to work together due to the lack of standardized practices for data and system integration. These issues create a significant challenge in cooperative service provision, data and technology integration, and data-sharing practices. To address these issues, in this paper, we propose the platform approach, a design approach intended to guide building effective, reliable, and robust smart farming systems. The proposed platform approach considers six requirements for seamless integration, processing, and use of farm data. These requirements in a smart farming platform include interoperability, reliability, scalability, real-time data processing, end-to-end security and privacy, and standardized regulations and policies. A smart farming platform that considers these requirements leads to increased productivity, profitability, and performance of connected smart farms. In this paper, we aim at introducing the platform approach concept for smart farming and reviewing the requirements for this approach.


Introduction
According to the United Nations' Food and Agriculture Organization (FAO) [1], food production should increase by 70% by 2050 due to the expected growth in the world population. Meeting this demand requires increasing productivity using solutions that take resource shortages and farm profitability into account [2]. The adoption of information and communication technology (ICT) and smart technologies, together with the rapid development of the Internet of Things (IoT) and artificial intelligence (AI), has led to a digitalization of farming known as smart farming. Smart farming is considered to be the fourth revolution in farming [3,4]. By managing inputs, such as fertilizers, pesticides, and animal feed, a smart system can help farmers to reduce waste, employ a smaller workforce, decrease overall costs, and lower their environmental impact while achieving higher productivity [5,6]. According to the Market and Markets report, the global smart farming market amounted to USD 13.8 billion in 2020. Due to farmers' needs to increase yields, improve livestock production, and reduce management costs to meet the growing demand for food, the market is expected to grow rapidly to USD 22 billion by 2025 [7].
In a smart farming scenario, large amounts of real-time and high-resolution data are generated from remote and automated sensor systems. The data can represent different aspects of farming, including but not limited to livestock, crops, soil, and the environment [8]. The data range from time series to spatial images, human experiences, and observations collected via mobile smartphone applications [9]. Such data can be analyzed to filter out invalid or wrong data [2] and to compute personalized recommendations for a farm to improve productivity [10]. Different modern technologies can be utilized for data generation and data analysis in digital agriculture. IoT, AI, and cloud computing are examples of technologies that have recently been widely used in smart farming [11,12]. Although data-driven solutions have provided various benefits in agriculture, data integration, processing, and usage processes and protocols are still prominent challenges that need to be addressed [13]. Some of these challenges are limited awareness and knowledge of digitalization [14], lack of standardization and suitable management to cope with fragmented and heterogeneous data [15], lack of high-quality data and proper analysis [16], privacy and security challenges in an entire smart farming ecosystem [17], and lack of compliance with uniform regulations and policies [18].
A unified solution that considers the requirements in various stages of the data processing lifecycle can address various issues in digital agriculture. Such a solution enhances awareness regarding the needs and requirements of different data sources, technologies, processes, and protocols (including policies). This solution assists the actors in smart farming ecosystems to provide their services and products to other parties in a more usable format. Moreover, this approach increases food safety, production, and sustainability by providing transparent and trustable data about every stage of the food chain. In addition, a unified solution provides an excellent opportunity for building decision frameworks to aggregate data from diverse sources and to provide high-quality data-driven services. This approach also facilitates data sharing and multiparty cooperation in farming applications, and thus, increases productivity and reduces resource waste.
In this paper, we suggest the platform approach as a unified solution that facilitates cooperation in smart farming applications. Furthermore, this approach enforces relationships between farmers and system providers to track crop cycle information, livestock, and dairy production in a secure manner for their decisions and data management.
In recent years, several platforms have been proposed for smart farming applications. SmartFarmNet [2] is an IoT-based platform that has been developed by a multi-disciplinary Australian team to automate the collection of environmental, irrigation, fertilization, and soil data. This platform integrates IoT devices, such as sensors and actuators, with cloud servers for storing and analyzing the collected data. The results presented in the paper showed that SmartFarmNet was capable of providing near real-time query responses. Moreover, the authors demonstrated that increasing the number of sensors had a negligible impact on the performance of the system, which indicates that the proposed platform was scalable. Mehdi et al. [19] proposed Smart Farming Oriented BigData Architecture (SFOBA), a platform for big data processing that provided real-time processing of data acquired from smart farm systems and devices. In this study, the authors utilized the Star Schema Benchmark dataset [20] to show how the proposed platform could finish multidimensional queries on 40 million rows of data in less than one second. Clements et al. [21] developed an interactive digital tool that collected data from different sources such as the Key Indicators Mapping System (KIMS) and the Key Indicators Database System (KIDS), both provided by the Food and Agriculture Organization of the United Nations (FAO). The developed tool provided information regarding livestock production and disease prediction, as well as rules for risk assessment at the country level.
Another field in smart farming is smart dairy farming which aims to utilize modern smart technologies to satisfy the increasing demand for quality dairy products, to reduce consumed resources, and to decrease the ecological footprint [22]. Taneja et al. [23] proposed SmartHerd, a platform enabling data-driven dairy farming by analyzing available data and providing controls for farmers and other stakeholders. This platform has been deployed in Waterford, Ireland, and was developed to gather and analyze data regarding dairy cattle. The presented platform is designed to keep working when the internet connection is lost. The generated data in this platform are stored locally and are shared on a cloud infrastructure when the internet connection is available.
Most of the proposed platforms in smart farming have focused on a specific aspect of smart farming such as crop production recommendations [2], big data technologies [19], data transformation [19], reliability [24], and business model [6]. With the increasing need to collect, integrate, analyze, and manage large amounts of data generated at farms, effective farm data processing continues to be a major hurdle for the adoption and success of digital agriculture solutions. Despite this need, the existing studies have not provided a holistically designed approach that considers the different requirements of smart farming applications. To fill this gap, in this paper, we propose a design thinking approach, which we refer to as the "platform approach", for smart farming data processing. This approach encourages consideration of six core requirements when designing, implementing, and testing smart farming solutions. These requirements are interoperability of data, processes, and technologies; reliability to ensure that a data source is valid and available; scalability in terms of the capability of the platform to be extended to larger applications; real-time data processing to enable timely access to data and services; protection of infrastructure and smart farming assets (privacy and security); and finally, compliance with policies and regulation. The proposed approach facilitates cooperation among different parties in smart farming ecosystems and enables these actors to make the most use of available data sources.
The rest of this paper is organized as follows: In Section 2, the core components of data processing in smart farming systems are described; in Section 3, we discuss the challenges and requirements in smart farming systems; then, in Section 4, we investigate the main requirements and related solutions in a smart farming platform; finally, in Section 5, we conclude the paper with a summary of the platform approach and some future insights.

The Core Components of Data Processing in Smart Farming Systems
To improve smart farming data processing through the integration of various systems, in this paper, we propose the concept of the platform approach, which is a design approach for data processing. The smart farming platform approach allows farmers, researchers, technology providers, and all other stakeholders to have a standard and reliable solution to collect and share information, resources, and experiences to improve the productivity and performance of smart farming solutions. Figure 1 presents an overview of the main components of smart farming data processing, organized in five main layers. This abstract platform sheds light on the components of a smart farming application platform by defining the main tasks and core components for data processing in these systems.
According to Figure 1, in the data acquisition stage, data are collected from diverse sources, such as farmers, sensors, and satellites, as well as external databases such as weather and climate data. The collected data from these different sources can be in different formats that cannot be stored together in a single database. In addition, some collected data may be incomplete or contain missing values, outliers [25], and anomalous instances [26]. To address these problems, in the data preparation stage, best practices for data preprocessing can be used to prepare data for further analysis. These practices include standardizing data to a predefined format, identifying and deleting duplicated data, handling the gaps in generated data, and validating data sources and contents. Additionally, the collected data from different sources can be stored in a common infrastructure and can be integrated with other data sources that are collected from diverse smart farming systems. These methods assist in ensuring data consistency, completeness [27], and accuracy [28].
The components in the second and the third layers are highly correlated, and many of the smart farming data processing functions can be assigned to both layers. The data processing layer components are the brains of smart farming [29]. Mathematical modeling, statistical methods, or AI methods are used to analyze and extract knowledge from farm data. Machine learning (a branch of AI) consists of intelligent techniques to make automated decisions with limited human involvement. Machine learning can perform rapid optimizations, classifications [30], predictions, and recommendations and add value to the entire system [31]. For instance, SomaDetect Inc. of Fredericton [13] utilizes machine learning for identifying individual cows. This system also monitors the cows to detect disease symptoms through data collected from sensors at milking stations on dairy farms. As another example, AirSurf-Lettuce is an open-source analytic platform that uses machine learning to categorize iceberg lettuces to improve the actual yield and crop marketability before harvesting [32]. Model deployment in the context of machine learning refers to applying a trained model to new items in the system. Prediction means estimating the outcomes for unseen data. For example, the growth prediction in plants helps farmers to decide about harvest time and plan for the required workforce. Forecasting is a kind of prediction about the future using time-series data. An example of forecasting in smart farming is weather forecasting using temporal information to plan for irrigation [33].
The decision-making stage includes system monitoring, rule management, and model metadata management to provide the results and recommendations through the deployed model. Metadata (data about data) is provided to establish a common understanding of the meaning of data. For example, the available metadata about automatic milking devices can help farmers to use the proper estimators to predict the milking duration. These data can help smart farming actors such as farmers and growers during the decision-making process. Some examples are decisions about disease prediction, pesticide control, and water management. In the last stage of smart farming applications, end-users, i.e., farmers, farming service providers, agriculture researchers, and governments, access the system results through services. System monitoring can audit the entire system and provide feedback to optimize the predictors and to improve decision making. In addition, rule management can check the compliance of decisions against available rules.
Since trustworthiness is a core requirement to ensure system adoption and use, it is necessary to consider security and privacy concerns in all communications and end-to-end in the platform. Moreover, the data workflow should comply with the available policy and regulations in smart farming. Application programming interfaces (APIs) are common methods for the integration of different components and resources (e.g., data sources, legal processes, and policy protocols) in a platform approach.
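The data preparation practices described in this section (standardizing to a predefined format, removing duplicates, handling gaps, and validating contents) can be illustrated with a minimal sketch. It is not taken from any of the cited platforms. The record layout, the field names (`temp_c`, `temp_f`), the plausibility bounds, and the median-based gap filling are all assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from statistics import median

# Hypothetical raw readings from sources that follow different conventions.
RAW = [
    {"sensor": "soil-1", "ts": "2022-05-01T10:00:00+00:00", "temp_c": 18.5},
    {"sensor": "soil-1", "ts": "2022-05-01T10:00:00+00:00", "temp_c": 18.5},   # duplicate
    {"sensor": "soil-1", "ts": "2022-05-01T11:00:00+00:00", "temp_c": None},   # gap
    {"sensor": "soil-1", "ts": "2022-05-01T12:00:00+00:00", "temp_f": 68.0},   # Fahrenheit
    {"sensor": "soil-1", "ts": "2022-05-01T13:00:00+00:00", "temp_c": 250.0},  # implausible
]

VALID_RANGE_C = (-40.0, 60.0)  # assumed plausibility bounds, not from the paper


def standardize(record):
    """Convert one raw record to a common format: sensor id, UTC time, Celsius."""
    if record.get("temp_c") is not None:
        temp = float(record["temp_c"])
    elif record.get("temp_f") is not None:
        temp = (float(record["temp_f"]) - 32.0) * 5.0 / 9.0
    else:
        temp = None
    ts = datetime.fromisoformat(record["ts"]).astimezone(timezone.utc)
    return {"sensor": record["sensor"], "ts": ts, "temp_c": temp}


def prepare(raw_records):
    """Standardize, deduplicate, validate, and fill gaps in a batch of readings."""
    seen = set()
    cleaned = []
    for rec in map(standardize, raw_records):
        key = (rec["sensor"], rec["ts"])
        if key in seen:  # drop records already seen for this sensor and time
            continue
        seen.add(key)
        value = rec["temp_c"]
        low, high = VALID_RANGE_C
        if value is not None and not (low <= value <= high):
            rec["temp_c"] = None  # treat implausible values as missing
        cleaned.append(rec)

    cleaned.sort(key=lambda r: r["ts"])
    known = [r["temp_c"] for r in cleaned if r["temp_c"] is not None]
    fill = median(known) if known else None
    for rec in cleaned:
        rec["imputed"] = rec["temp_c"] is None  # keep track of filled values
        if rec["imputed"]:
            rec["temp_c"] = fill
    return cleaned


if __name__ == "__main__":
    for row in prepare(RAW):
        print(row["ts"].isoformat(), row["temp_c"], row["imputed"])
```

Marking imputed values explicitly, rather than silently overwriting them, keeps the later processing and decision-making layers able to distinguish measured data from filled-in data.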

Challenges and Requirements in Smart Farming
The platform approach suggests designing smart farming services using platform structures that can be integrated through APIs or other methods to build effective, reliable, and robust systems. This approach is also aimed at providing a common understanding and semantics among smart farming actors to collaborate in data-sharing practices and cooperative service provision. In Section 2, we presented an abstract overview of such a platform as well as the main components of data processing in smart farming systems. In this section, we discuss the challenges in smart farming applications and convey the major requirements that any agricultural platform should have to address these challenges.
In agriculture, data are highly heterogeneous for different reasons such as type, format, and intent of data, different protocols of devices, and the methods of collecting data. For example, poultry stakeholders are interested in monitoring the daily behavior of birds, such as their movement and feeding patterns, to predict the possibility of a disease outbreak in the early stages. A major challenge in such an application is combining heterogeneous data collected from different sensors in smart poultry farms. In addition to collecting and processing data, aggregating diverse data among various farms can improve the prediction outcome. However, lack of interoperability among data, technologies, and data processes is a significant barrier to reaching this goal. A similar challenge may arise when farmers aim to equip their farms with new IoT devices. As devices from different technology providers do not follow a unique protocol, this limitation restricts farmers from adding new digital tools and devices to their smart farming network.
In the context of smart farming, interoperability refers to the ability of two or more different systems, services, and components (i.e., software components and IoT devices) to work together to exchange information, facilitate processes, integrate technology solutions, and comply with policies and legal requirements. There are four categories of interoperability: semantic or data interoperability, technology or system interoperability, operational interoperability, and legal interoperability [34]. For data/semantic interoperability, the entities should mutually agree on the meaning, content, and context of data exchange and use [35]. Interoperability improves the flexibility of smart farming networks and enables farmers to collaborate through data sharing and enhance their farm practices. An important aspect of interoperability is standardization. In the context of data processing, without mechanisms to ingest data from the diverse sources of digital agriculture solutions, making the most of the data would be extremely resource-intensive and time-consuming. Using processes and best practices that convert data and information to harmonized formats is critical to enabling data sharing and integration practices and collaborative decision making.
Alongside interoperability in smart farming devices and systems, controlling data reliability is a major requirement in a smart farm platform. Data reliability, sometimes referred to as data quality, is defined as the extent to which the data and its source are trustworthy, unfailing, authentic, genuine, and representative of the problem [36]. Reliability can be evaluated at different levels, such as on-farm and off-farm levels. From an on-farm point of view, it is essential to make sure the data generation is valid and stable. A smart device that is physically damaged, or an issue in the network connection, might lead to invalid and unstable data generation. From an off-farm perspective, a smart farming platform should assure farmers that the data and algorithms are valid and trustworthy. Data quality issues are also likely to arise in the integration process, as the quality of all data sources cannot be verified [37,38]. Wrong decisions originating from unreliable data in decision-making processes cost agricultural businesses billions per year [39].
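As a rough illustration of an on-farm reliability check, the sketch below flags two symptoms that a damaged device or a broken network link might produce: readings that stopped arriving and readings that no longer change. The function name, the one-hour staleness limit, and the five-reading window are hypothetical choices, not values from the cited literature.

```python
from datetime import datetime, timedelta, timezone


def reliability_flags(readings, now, max_age=timedelta(hours=1), window=5):
    """Return warning flags for a time-sorted list of (timestamp, value) pairs.

    "stale"    : the newest reading is older than max_age (possible link failure)
    "flatline" : the last `window` readings are identical (possible stuck sensor)
    """
    if not readings:
        return ["no_data"]

    flags = []
    last_ts, _ = readings[-1]
    if now - last_ts > max_age:
        flags.append("stale")

    recent = [value for _, value in readings[-window:]]
    if len(recent) == window and len(set(recent)) == 1:
        flags.append("flatline")
    return flags


if __name__ == "__main__":
    now = datetime(2022, 5, 1, 12, 0, tzinfo=timezone.utc)
    stuck = [(now - timedelta(minutes=10 * i), 20.0) for i in range(5, 0, -1)]
    print(reliability_flags(stuck, now))
```

A platform could attach such flags to each data source so that downstream models and decision rules can exclude or down-weight suspect inputs.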
Scalability is another requirement that should be considered while designing a smart farming platform [40]. Scalability in smart farming refers to the adaptability of a system to increase its capacity, for example, the number of technology devices such as sensors and actuators, while enabling timely analysis [41]. Shortcomings of scalability in a smart farming architecture, in the context of data processing, would lead to weak system performance. For instance, integration with other devices and applications may not be possible if scalability requirements are not embedded by design. It has been recommended to consider the possibility of growth in size and diversity of applications of smart farming systems from the initial stages of system design [42]. As an illustration, consider a large poultry producer that owns farms in different locations. This company needs a platform to build a scalable business by integrating the supply chain from feed to hatcheries and poultry farms to distribution [43]. These poultry farms may have different sensors and acquire various types of technologies to collect data of different sizes, formats, content, and complexities [17,44]. The company should be able to expand data processing, network bandwidth, and other computation resources to be able to utilize data collected in these farms. If this need for resource growth is not considered in the design of the system, the company cannot easily expand its system to process and use other data sources. Platforms for smart farming should consider scalability as a core requirement to enable growth and future needs for integration.
Moreover, in smart farming systems, we need fast and accurate decisions and actions. Thus, we need platforms that enable farmers to deliver proper actions at the right time. Real-time data analytics is defined as the ability to analyze large volumes of streaming data when they are created or stored [45]. For instance, in the case that a farmer decides to sell their products or buy supplies, a long waiting time for getting access to data and making decisions might lead to losing a bargain, or incorrectly offering an estimation originated from outdated data. Real-time processing is an approach to capture, process, and export data promptly [46,47]. Furthermore, real-time data processing enables smart farming actors to make proper decisions at the right time, which leads to minimizing risks and undesirable consequences [48]. A smart farming platform can also facilitate real-time processing as a requirement for reinforcing farming practices and preparing farmers against unexpected circumstances [45]. To achieve real-time data processing, it is also necessary to collect the data in real time. The results acquired based on analyzing old data can cause inaccurate decisions, while real-time data facilitates real-time decisions and actions [46].
Similar to any other smart platform, a smart farming platform should bring security and privacy to the core of attention and provide sufficient protection mechanisms. Modern technologies form an ecosystem in which connected devices are accessible remotely, allowing adversaries to plan cyberattacks. The attackers might aim to steal information or carry out disruptive actions on the smart farming systems [49]. For example, Yang et al. illustrated how adversaries used IoT devices in IT infrastructure to compromise security [4]. As an illustration, a malicious actor can get access to a farm's data using an unsecured smart device and steal information about the products, financial decisions, and future plans. The intruder can also damage the products and equipment. Because of the availability of diverse data sources in smart farming systems, including personal and business data, privacy is another paramount concern. Smart farming systems should provide mechanisms to ensure security requirements and data protection. These practices must protect data end-to-end and in different data processing stages, from data collection to data-driven service provision.
Regulations for agricultural practices affect the design and development of smart farming technologies. For instance, farmers' concerns about data sharing and privacy have been discussed in recent studies [50,51]. Farmers are not convinced that the available policies to protect their intellectual property, business profit share, and privacy are sufficient. There are also concerns about the lack of compliance with privacy best practices and farm data agreements by the technology providers [52]. To address these concerns and build trust, there is a need to establish standards and best practices that mandate the rights and responsibilities of different actors in smart farming systems. There is also the need to set standards for data-sharing practices and to apply policies to determine accountability and penalties in the case of possible disputes. Other regulations related to food safety, supply chain management, disease monitoring, and other applications can also impact agriculture data management and processing. These policies can be renewed to be applicable to new and emerging agri-food needs and technologies. Policy requirements can be embedded into technologies for monitoring, compliance, and sustainability purposes.
In summary, as is demonstrated in Figure 2, we recommend consideration of a platform approach that facilitates six requirements for smart farming data processing. These requirements enable seamless integration of data, systems, and processes with the end user in mind. They include interoperability to provide compatibility among different components of the platform; real-time processing to generate fast and accurate information for decision making; scalability that enables the platform to extend the entities and resources; reliability to assure access to valid and up-to-date data; security and privacy to preserve system safety and confidentiality; and, finally, compliance with the related policies and regulations in smart farming. A platform that satisfies all these requirements can enhance farming practices, and therefore, can increase productivity, profitability, and performance while decreasing resource waste.

Requirements, Discussion and Solutions
In Section 3, we recommended six major requirements to be considered in the design of smart farming platforms. We suggest that these requirements are necessary for any platform that aims to provide fast and robust services in a smart farming ecosystem. In the following, we investigate these requirements and the solutions to address them in the platform approach.

Interoperability
Currently, there is an extensive number of digital solutions, including software tools and services, that are used in smart farming. This growing trend has resulted in the generation of heterogeneous data from multiple protocols and communication technologies with different formats and semantics. However, data generated by precision farming tools are not portable and cannot be integrated among smart farming systems. Interoperability among technical software tools, hardware, or processes/protocols is relatively limited, often because of the lack of established standards [53]. Interoperability requirements cut across various components of smart farming technologies (Figure 1), from collecting data with varying accessibility rules, to data integration across multiple sectors/devices, to data processing and protection regulation for data governance and data access.
Interoperability can be viewed from legal, organizational or operational, semantic, and technical dimensions [34]. Legal interoperability involves ensuring the integration of smart farming platforms under various legal frameworks including food policies, privacy policies and procedures, and data agreements, to enable data exchange. In a smart farming system, data could be combined from various sources, resulting in integrated datasets. This integration should be performed in accordance with the farm data agreements. To address legal interoperability requirements in smart farming, the RDA-CODATA principles can be used [34]. These principles are: facilitating the authorized access to and re-use of data, deciding who has access to and is responsible for the data, managing the legal interests, explaining the rights clearly, and promoting harmonization of rights in the data. These principles can be embedded into the farm data processing platform and integration can be enabled through application programming interfaces (APIs).
Organizational interoperability is concerned with agricultural actors' capacity to accept and utilize services from other organizations to collaborate effectively [54]. This is dependent on the degree to which farms' processes, responsibilities, and roles are harmonized toward shared objectives and strategies in the agricultural systems to make services more accessible, easily identifiable, and user focused. To address the organizational interoperability requirements, farm business processes and exchanged data need to be integrated, aligned, and documented using a commonly agreed modeling method [55].
Semantic interoperability ensures that the precise meaning and format of the shared data are understood and preserved. In existing smart farming ecosystems, data are collected from fragmented systems in varying formats and meanings. To tackle the lack of semantic interoperability in data processing, it is necessary to describe all data elements by developing vocabularies and schema to ensure that all communicating agricultural parties understand the data in the same way. The solutions for improving semantic interoperability in smart farming are standardization, metadata, and connecting each data variable to a common language, in the form of taxonomies and ontologies [56]. Semantic interoperability issues in smart farming data processes can be solved by using standardized languages such as agroXML [57] which is an XML dialect for describing farm production processes as well as the real-world objects required in conducting these processes. For semantic interoperability enhancement, a smart farming application can leverage metadata that describes original data, allowing the user to make the best decision possible about how to use it. Metadata is generated to add meaning and context to data values and to establish a common understanding of the meaning and semantics of the data. Metadata enables correct and proper use and interpretation of the data by the owners and users [58]. Furthermore, it also enables farm machines to comprehend data by offering models and eliminating ambiguity [58]. For example, in the modeling and simulation of a decision support platform in smart farming, there is a need for a method to detect conflict between system components and concepts of the model. Metadata is used to address the interoperability of such systems by capturing information about model concepts which allows software agents to reason in an unambiguous and machine-readable form [59].
Data interoperability is considerably resolved in smart farming when metadata is standardized and common language definitions are used. Such standards are defined and promoted by standards agencies such as the International Organization for Standardization (ISO) [60]. The Agricultural Information Management Standards (AIMS) [61], the Agricultural Metadata Element Set (AgMES) [62], and Agrovoc [63] are three notable metadata initiatives in the agriculture area [53]. Moreover, ontologies and taxonomies can enable better interoperability by allowing data to be linked at the semantic level. Taxonomy aids in the interpretation of relationships between data entities and the categorization of data [64]. Ontology explains the structural complexities of databases as well as the semantic relationships among data variables collected in databases [10]. Several ontologies such as FoodWiki, Agrovoc, and FoodOn have been proposed to extract the semantics of food and agricultural data to share and reuse agriculture knowledge. In addition, sensor ontologies such as sensor node ontology [65] and sensor-data ontology [66] are used for semantic interoperability of data collected from IoT devices. The goal of these ontologies is to identify essential elements of sensor data and to model resources, services, and geographical data to ease data access for users.
Another aspect of interoperability is technical interoperability, which refers to the capacity of two or more technical agricultural platforms, infrastructures, services, or protocols to easily integrate and enable process and data flow [54]. This concept is fundamental to software and hardware compatibility. For example, interoperability is required among farm machines, software, IoT devices, and sensors. If the data processing modules do not recognize the data format collected by the combine harvester, the IoT will be of little use in terms of assisting decision making.
Technical interoperability often can be satisfied by selecting and implementing the proper software and/or internet interface (API) and protocols, as well as standardized content encodings for transmission [67]. Technical interoperability could be satisfied by creating a decentralized network of existing systems to enable the exchange of data in smart farming processes. Each system could be independent and built upon its own technical infrastructure. The ATLAS initiative is one of the projects that developed such an open technical interoperable network involving stakeholders from diverse agricultural domains by using open technical specification. This project aims to provide a distributed service-oriented platform that enables some tasks such as watering management, soil management, and behavioral analysis of livestock. This framework involves all actors in the food chain, enhancing the process from farm to fork.
Interoperability is made possible by data standards that allow the sharing and exchange of data. In the absence of standardization of data, processing data from heterogeneous and incompatible sources is challenging and often inefficient. Data standardization is a key step toward data interoperability to promote data quality, data sharing, data reusability, and to optimize data usage. The standardization process of data, processes, policies, and concepts is critical in ensuring all types of interoperability. This goal may be realized by well-defined and well-established protocols and regulations that facilitate the integration of multiple data sources and external knowledge-based services given by stakeholders and partners. Due to the complexity of data standardization, data harmonization [68] can be used as an alternative solution. The main difference between data standardization and data harmonization comes from the level of strictness in standards. On the one hand, the aim of data standardization is to make the data clear and consistent: clear, so that data can be easily understood by individuals who are not involved in data processing, and consistent, so that relevant data can be recognized using common terms and formats. Data harmonization, on the other hand, is the process of merging different data sources into an unambiguous, integrated entity record to be used in the system and to feed the processes of the system [69].
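To make the harmonization step concrete, the following minimal Python sketch merges readings from two hypothetical data sources into one integrated record format. The source layouts, the field names (such as `temp_f` and `soil_temp_c`), and the `Reading` schema are assumptions made for this illustration; they are not taken from agroXML or any other cited standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical unified record used inside the platform."""
    sensor_id: str
    timestamp: str   # ISO 8601 string, assumed to be in UTC
    quantity: str    # common term shared by all sources
    value: float
    unit: str


def from_vendor_a(row: dict) -> Reading:
    # Assumed layout of source A: temperature in Fahrenheit under "temp_f".
    celsius = (row["temp_f"] - 32.0) * 5.0 / 9.0
    return Reading(row["id"], row["time"], "soil_temperature",
                   round(celsius, 2), "degC")


def from_vendor_b(row: dict) -> Reading:
    # Assumed layout of source B: temperature in Celsius, possibly as text.
    return Reading(row["sensor"], row["ts"], "soil_temperature",
                   float(row["soil_temp_c"]), "degC")


def harmonize(rows_a, rows_b):
    """Merge both sources into one list of unified records, ordered by time."""
    merged = [from_vendor_a(r) for r in rows_a]
    merged += [from_vendor_b(r) for r in rows_b]
    # ISO 8601 timestamps in the same zone sort correctly as plain strings.
    return sorted(merged, key=lambda r: r.timestamp)
```

In this sketch each source keeps its native format, and only the adapter functions know about it, which reflects the less strict nature of harmonization compared with imposing one standard on every data producer.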

Reliability
Reliability is a crucial foundation for designing data-driven services across smart farming ecosystems. The reliability of data-driven systems depends on effective processes, data, and technologies [70]. As a major requirement, reliability should be considered in all stages of smart farming, from data acquisition to data preparation and model building. Reliability can be considered from different aspects such as data reliability and technology reliability. The main components of data reliability are data consistency, data completeness [27,71,72], and data accuracy [28]. Consistency, as the main component of reliability [2], refers to keeping data concepts, value domains, and formats unchanged. Changes to farm data might take place while processing, moving through networks, or sharing between applications [28,72]. Data completeness is about the availability of all necessary data for decision making [72]. In other words, the deficiency of a component should not impact the accuracy and integrity of data [28]. Data completeness in smart farming does not mean that all data attributes must be present; rather, it must be decided which data attributes are important and which are optional. Accuracy refers to the extent to which recorded data reflect the true state of source information [28,73]. The values stored by a precision agriculture system may be inaccurate or wrong. This issue can be due to unreliable or broken sensors, lost information, and failed transmissions [74]. Inaccurate data can lead to missed events and difficult interpretation [74].
In addition to data reliability, smart farming platforms require robust and reliable technologies such as high speed and reliable internet connection [75]. The physical safety of IoT devices for precision agriculture systems should be ensured in different environmental conditions to avoid communication failures. IoT data analytics should process data accurately and reliably to enable decision-makers to react quickly to emerging issues and changing conditions [76].
Because agricultural data cover a wide range of heterogeneous and unstructured sources, simple ratios can be used to measure the percentage of data elements that meets specific rules [77]. To manage information structure in a database, Blake et al. [78] suggested using parsing techniques. Using this method, the quality of sensor data, as an example, was calculated by comparing it with data coming from multiple reference sensors, existing historical data, or an alternate data source. There are, however, several challenges to using this approach. For example, the quality of the reference sensors is not always guaranteed, historical data might not be available, and there might be additional costs using alternate data sources. Performing some tests and preprocessing on the aggregated data from sensors and other sources is recommended to raise data reliability in smart farms. For example, outlier detection can help to detect faulty devices or discern the unreliable data exchanged with other farms.
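The two ideas in this paragraph, simple rule-based ratios and outlier detection, can be sketched in a few lines of Python. The rules, the sample readings, and the choice of a modified z-score based on the median absolute deviation (with a cutoff of 3.5) are assumptions of this example, not methods prescribed by the cited works.

```python
from statistics import median


def rule_pass_ratio(values, rule):
    """Share of data elements that satisfy a validation rule (0.0 to 1.0)."""
    if not values:
        return 0.0
    return sum(1 for v in values if rule(v)) / len(values)


def mad_outliers(values, threshold=3.5):
    """Indices of values whose modified z-score exceeds the threshold.

    Missing values (None) are ignored. The score uses the median absolute
    deviation (MAD), so a single faulty reading does not distort the scale.
    """
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)
    if mad == 0:
        return []  # no spread to judge against in this simple sketch
    return [i for i, v in enumerate(values)
            if v is not None and 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical air-temperature readings in degrees Celsius.
readings = [21.0, 21.4, None, 20.9, 85.0, 21.2]
completeness = rule_pass_ratio(readings, lambda v: v is not None)
plausibility = rule_pass_ratio(
    readings, lambda v: v is not None and -10.0 <= v <= 50.0)
suspect_indices = mad_outliers(readings)
```

Here `completeness` measures the share of non-missing values, `plausibility` the share within an assumed physical range, and `suspect_indices` points at readings that may come from a faulty device.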

Scalability
Scalability is another requirement that should be considered in a smart farming platform. Scalability refers to the ability to increase available resources and system capability without the need to go through a major system redesign or implementation. As an illustration, in Figure 1, we can increase the capacity for data processing by increasing the cloud resources in the second layer and computation resources in the third layer.
The challenges related to scalability in smart farming fall into two categories: capacity and performance [79]. Scaling capacity refers to the ability to add new nodes or resources to the system [41]. Scaling performance is the ability to improve performance, or to keep it unchanged, while expanding capacity. The fundamental bottleneck that affects system performance may be caused by different deployment configurations of various components [80]. Other challenges of scalability are identity management and access control, security [81], privacy [44], governance, and fault tolerance [81]. Because the volume of farming data grows every day, the data become too large to be stored on a single node. A fundamental solution to address this need is distributing data collection mechanisms across multiple nodes. For instance, Zhou et al. [82] employed Hadoop to process and store 1.44 million data records for daily temperature monitoring. Because most smart farming data arrive as many small files, Hadoop cannot be effective without a distributed system equipped with high-performance computing resources. To address this problem, the Hadoop Distributed File System (HDFS) has been designed to handle both large and small datasets.
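The idea of spreading data across multiple nodes can be illustrated with a small Python sketch of hash partitioning. This is a generic technique chosen for illustration and is not how Hadoop or HDFS place data; the record layout and the `sensor_id` key are assumptions.

```python
import hashlib


def node_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(records, num_nodes):
    """Group records by target node so each node stores only its share."""
    shards = {n: [] for n in range(num_nodes)}
    for rec in records:
        shards[node_for(rec["sensor_id"], num_nodes)].append(rec)
    return shards


# Hypothetical records: 100 sensors, one reading each, spread over 4 nodes.
records = [{"sensor_id": f"s{i}", "value": i} for i in range(100)]
shards = partition(records, 4)
```

Because all readings of one sensor land on the same node, per-sensor queries stay local. A known limitation of this simple modulo scheme is that changing the number of nodes reassigns most keys.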
Using cloud computing technology in a smart farming platform is another solution that can address scalability challenges related to capacity due to flexible and robust data collection, management, and processing capabilities [83]. Cloud computing provides a high level of flexibility by providing remote services for monitoring and managing farm data. Moreover, these services can provide on-demand storage and computation resources with no need for on-farm hardware installation [84]. The data stored in the cloud systems are usually distributed in the data storage platforms supported by backup mechanisms. The data-driven services are finally offered by web services accessible through diverse tools, including laptops, tablets, and smartphones in the last stage of smart farming tasks, as shown in Figure 1. SmartFarmNet [2] is an example of a scalable platform that utilizes cloud computing technology to provide a scalable solution for smart farming.
Despite the benefits of employing cloud technology in smart farming systems, challenges such as latency, security, and privacy issues in transferring a large amount of data to the cloud persist. To address these issues, the use of edge computing has been recommended [78,85]. Edge computing is a technology that aims to process data at the edge of the network, near the source of the data, in order to reduce the data transmitted to the cloud servers and to decrease the workload on the centralized cloud computing servers. In addition, edge computing can leverage the scalability of cloud computing resources by taking advantage of both cloud and edge computing and decreasing the volume of data transfer. Having an edge-based ecosystem with the right APIs and tools to integrate various data sources can ensure availability and real-time processing for users. An example of using edge computing in agriculture is the real-world platform proposed by Zamora-Izquierdo et al. [24], which utilized edge technology to handle the issues in hydroponic farming. This platform was implemented in Spain to evaluate the water consumption and procedures related to tomato cultivation. Digital tools have also been used in livestock production and health management. Although edge computing increases scalability in a system, it raises concerns regarding the heterogeneity of the devices used in a network. Because this technology utilizes diverse software and hardware products in different layers of the network, the performance of the system is highly correlated with the compatibility of the components used. Ning et al. [86] discussed the heterogeneity issues in edge computing and the solutions to address this problem.
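How edge processing reduces the volume of data sent to the cloud can be sketched as follows. The example assumes an edge node that summarizes fixed-size windows of raw readings before upload; the record fields and the window size are hypothetical and do not describe the platform in [24].

```python
from statistics import mean


def summarize_window(readings):
    """Reduce a window of raw readings to one compact summary for upload."""
    values = [r["value"] for r in readings]
    return {
        "sensor_id": readings[0]["sensor_id"],
        "start": readings[0]["t"],
        "end": readings[-1]["t"],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def edge_batches(readings, window_size):
    """Yield one summary per fixed-size window of consecutive readings."""
    for i in range(0, len(readings), window_size):
        yield summarize_window(readings[i:i + window_size])


# Ten hypothetical raw readings become two summaries sent to the cloud.
raw = [{"sensor_id": "s1", "t": i, "value": float(i)} for i in range(10)]
summaries = list(edge_batches(raw, 5))
```

The trade-off in this design is that the cloud no longer sees individual readings, so the window size has to match the time resolution that downstream services need.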

Near Real-Time Data Processing and Decision Making
Similar to many other emerging real-world applications, smart farming applications need real-time processing of streaming big data [87,88]. In a smart farming system, making fast and accurate decisions is a major goal. To achieve this goal, we need mechanisms to collect and process the diverse available data sources in real time. These data are generated by different technologies including but not limited to IoT, robotics, drones, climate forecasting services, and smartphones. SFOBA [19] is a platform for big data processing which provides real-time processing in many domains of agriculture. This platform integrates different data sources, data modeling, and software products to provide real-time data analysis in farming applications. Al-Thani et al. [89] investigated the set-up process for the use of a drone in monitoring sheep livestock. They also used image processing and ML models in a real field application. With the massive number of pictures from drones and new computer vision and deep learning models, it is possible to predict diseases and pests in greenhouses and farms. These technologies can also be used to estimate plant traits in real time [90]. For example, in an autonomous greenhouse, it is possible to develop a model to estimate the leaf area, dry weight, and fresh weight of lettuce [91]. Real-time data lead to real-time decisions and actions [48]. For example, agile actions can be taken in response to sudden changes in operational conditions or other circumstances such as weather changes and disease prediction alerts.
In addition, the use of IoT, cloud computing, remote sensing, biotechnology, and robotics is increasing in smart farming [48], transforming traditional farming into smart farming. These technologies can establish the networking of machines and control farm activities automatically and in real time [92]. As discussed, storing data at intermediate points at the "edge" of the network rather than always at the central server or data center leads to faster data processing and the shorter response time that is critical for real-time processing [93].
To analyze the large volumes of agriculture data collected from fields and reveal hidden patterns of interest, there is a need to develop forecasting models such as disease, pest, and yield prediction models. The machine learning models most commonly used for prediction include artificial neural networks, support vector machines, and logistic regression [76]. These algorithms can be integrated with data analytics tools such as MapReduce and Spark for real-time analysis and better performance [19,76].
Edge computing also opens another opportunity for data processing, namely federated learning, which can address a major concern in data processing for smart farming: data sharing. Many farmers prefer to keep their data private, which decreases their motivation to participate in collaborative learning activities. With federated learning, the learning model is distributed to the edge nodes, and the farms' data are not shared with a central unit in the system. This approach reduces data privacy concerns and encourages farmers to participate in cooperative learning processes.
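The mechanics of federated learning can be illustrated with a minimal federated averaging sketch in plain Python. The linear model, the learning rate, the number of rounds, and the two small "farm" datasets are assumptions chosen so the example stays self-contained; a real deployment would use a dedicated framework and a far richer model.

```python
def local_update(weights, data, lr=0.01, epochs=20):
    """Run gradient descent on one farm's private data; return new weights.

    The model is y = w * x + b with a mean squared error loss.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates):
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    w = sum(wt[0] * n for wt, n in updates) / total
    b = sum(wt[1] * n for wt, n in updates) / total
    return (w, b)


def train(global_weights, farms, rounds=50):
    """Each round: farms train locally, the server averages their weights."""
    for _ in range(rounds):
        updates = [(local_update(global_weights, data), len(data))
                   for data in farms]
        global_weights = federated_average(updates)
    return global_weights


# Two hypothetical farms whose private data follow y = 2x + 1.
farm_1 = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
farm_2 = [(3.0, 7.0), (4.0, 9.0)]
w, b = train((0.0, 0.0), [farm_1, farm_2])
```

Only the weight pairs leave each farm; the `(x, y)` observations never reach the server. In this noiseless toy setting the averaged model approaches `w = 2` and `b = 1`.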

Security and Privacy
Security and privacy mechanisms are important requirements that should be considered throughout a smart farming platform [4]. There are several solutions to secure a smart farming platform against adversaries. Trust management is one of the mechanisms that can enhance the security and privacy of data in a smart farming platform. It enables a service provider to evaluate the trustworthiness of the actors in the system and to restrict the activities of parties with low trust. AgriTrust [94] is a trust management approach designed for smart farming applications. This framework monitors the interactions in the system and updates trust metrics such as credibility, robustness, and reliability over time. Using this approach, smart farming devices broadcast their feedback on each transaction to the network, and the trust management framework can then use this feedback for trustworthiness evaluation.
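A feedback-driven trust score of this general kind can be sketched as follows. The exponential moving average, the weight, the threshold, and the action names are assumptions of this illustration and do not reproduce the AgriTrust metrics.

```python
def update_trust(current, feedback, weight=0.2):
    """Blend a new feedback score in [0, 1] into the running trust value."""
    if not 0.0 <= feedback <= 1.0:
        raise ValueError("feedback must be in [0, 1]")
    return (1 - weight) * current + weight * feedback


def allowed_actions(trust, threshold=0.5):
    """Restrict parties below the trust threshold to read-only access."""
    return {"read", "write"} if trust >= threshold else {"read"}


# A hypothetical device starts at neutral trust and receives three
# negative feedback reports from its peers.
trust = 0.5
for feedback in (0.0, 0.0, 0.0):
    trust = update_trust(trust, feedback)
permissions = allowed_actions(trust)
```

Recent feedback counts more than old feedback in this scheme, so a device that starts misbehaving loses privileges after a few transactions and can regain them only gradually.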
Another mechanism for security enhancement in smart farming is access control, which applies policies and agreements to handle requests and access permissions. These permissions can be granted based on roles, attributes, or agreements. For example, Chukkapalli et al. [95] proposed Attribute-Based Access Control (ABAC) for smart farming. ABAC utilizes policies that combine different attributes from different sources, including user, data, device, and environment, and provides fine-grained, flexible access management. In this study, the authors modeled a smart farming ecosystem with different agriculture sensors for temperature and soil monitoring, tractor and truck movement controls, as well as labor management, and presented a mechanism to handle access requests to data and devices.
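The core of an attribute-based decision can be sketched in a few lines of Python. The policy format, the attribute names, and the example roles are hypothetical and are not taken from the model in [95].

```python
def is_permitted(policy, user, resource, environment):
    """Grant access only if every attribute condition in the policy holds."""
    context = {"user": user, "resource": resource, "environment": environment}
    return all(
        context[category].get(attribute) in allowed
        for (category, attribute), allowed in policy.items()
    )


# Hypothetical policy: agronomists and farm owners may read soil-sensor
# data during the day shift.
soil_read_policy = {
    ("user", "role"): {"agronomist", "farm_owner"},
    ("resource", "type"): {"soil_sensor"},
    ("environment", "shift"): {"day"},
}

granted = is_permitted(
    soil_read_policy,
    user={"role": "agronomist"},
    resource={"type": "soil_sensor"},
    environment={"shift": "day"},
)
```

A missing attribute is treated as a failed condition, so requests are denied by default when the context is incomplete.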
In smart farming scenarios, data are collected in the field and usually transferred to on-premise or cloud storage servers [96]. Different technologies are used for data transfer in agriculture, including Wi-Fi and cellular networks. To protect data from adversaries in the transfer stage, encryption is a common technique. For this purpose, the data are encrypted before transmission and then decrypted at the destination. Wen Xue et al. [97] presented an encryption method for agricultural information systems. This method builds secure communication among users, farmers, and cloud servers. The statistical results in the paper showed that the presented method reduced the time needed for encryption as compared with other available methods. Another encryption method for smart farming applications was introduced by Ametepe et al. [98]. This study proposed a hybrid method combining two different cryptographic approaches, and then tested the presented method on a crop monitoring system.
A platform for smart farming can leverage the mentioned solutions, including trust management, access control, and encrypted data transmission, to ensure security and privacy during all stages of smart farming procedures, as shown in Figure 2. Another state-of-the-art technology that can be used for enhancing security and privacy in smart farming is the blockchain. This technology has been used in some agriculture applications and could reshape smart farming ecosystems in the future. Blockchain technology is a distributed ledger that keeps the records of all previous transactions in a system [99]. In addition, this technology enables automatic procedure execution using smart contracts. A smart contract is a computer program that can be deployed on a blockchain platform to run some procedures automatically without human intervention [29]. In recent years, blockchain technology and smart contracts have been widely used in different smart farming applications, including trust management [100,101], water management [102,103], food traceability [104], and supply chain [105,106].
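The ledger property that makes past transactions hard to alter can be illustrated with a hash-chained list. This sketch covers only the chaining idea; it omits distribution, consensus, signatures, and smart contracts, and the transaction fields are hypothetical.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transaction):
    """Add a block that commits to the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev,
                  "transaction": transaction})
    return chain


def is_valid(chain):
    """Check that every block still matches its predecessor's hash."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


# Hypothetical food-traceability events for one batch of produce.
ledger = []
append_block(ledger, {"batch": "T-01", "event": "harvested", "qty": 120})
append_block(ledger, {"batch": "T-01", "event": "shipped", "qty": 120})
append_block(ledger, {"batch": "T-01", "event": "received", "qty": 118})
```

Changing any earlier transaction changes that block's hash and breaks the link stored in the next block, which `is_valid` detects. The most recent block has no successor in this sketch, so protecting it would require the additional mechanisms listed above.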

Regulation and Policies
Considering the available regulations and practices in agriculture, a mechanism for compliance with these rules is beneficial in a smart farming platform. The regulations related to smart farming cover different aspects of agriculture, including but not limited to, food security, biosecurity, climate change, and data governance. In the following, some of these regulations in different jurisdictions are briefly reviewed.
The European Union's Common Agricultural Policy (CAP) [107] is a unified policy across Europe to support farmers and to improve agricultural productivity. This policy aims to improve the farming economy, address concerns regarding climate change, and manage natural resource consumption. In smart farming applications, the policy supports technical progress to enhance agricultural productivity as well as reduce ecological footprints. Another objective of the CAP is to control the availability of supplies and to ensure that the prices of products are reasonable for both producers and customers. Another widely used regulation under the European Union's jurisdiction is the European Union (EU) Code of Conduct on Agricultural Data Sharing by Contractual Agreement [108]. The Committee of Professional Agricultural Organizations (COPA) and the General Confederation of Agricultural Cooperatives (COGECA) published this code of conduct in 2020. This code mainly focuses on non-personal data collected on farms and leaves the personal data to be treated under other regulations. The code indicates that, due to the nature of collected data in agriculture, it is impossible to define data ownership in the same way as for physical objects, and it suggests using different levels of rights. Therefore, the key step from this code's point of view is a contract that determines the rights of all parties to protect their sensitive information, while all permissions related to data collection, access, and utilization need to be approved by the data originator. To evaluate the compliance of a product or service with this code, a checklist consisting of fifteen questions has been provided by the publishers. These questions specify the collected data and the rights to share, access, and use these data.
In addition, several regulations related to smart farming are used in the U.S. jurisdiction. An example is the U.S. agricultural policy [109], which covers different aspects of agriculture such as trade, insurance, rural economic growth, bioenergy, and organic farming. This policy aims to support U.S. farmers, to enhance the productivity of the farming process, and to reduce negative environmental impact. Generally, the farm acts are updated in a five-year cycle to govern agriculture, food, and rural development programs. In the context of smart farming, this policy considers different technologies such as GPS, computer mapping, guidance systems, and variable-rate technology [110]. The Food Safety Modernization Act (FSMA) [111] is another law related to smart farming in the USA. This act aims to enhance food safety and to prevent foodborne illness, and it has seven major rules ensuring responsibility and accountability of the different parties that work in the agricultural sector. Different tools have been provided by the U.S. government to facilitate product tracing, build food defense plans, and create food safety plans. Another regulation related to smart farming in the U.S. jurisdiction is the Privacy and Security Principles for Farm Data (PSPFD) [112]. It was established by the American Farm Bureau Federation (AFBF) in 2014. The principles under this regulation are mostly around data ownership, consent, and disclosure. To ensure compliance with PSPFD, the AFBF has developed the Ag Data Transparency Evaluator. This tool assesses contracts among agriculture stakeholders against the principles provided in the PSPFD, and if the assessment is successful, the contract receives an Ag Data Transparency seal that informs the parties that it has been approved.
Despite the efforts on regulations related to big data in smart farming, there are still some gaps in this field. One of the issues is the lack of comprehensive and unified regulations for agriculture data, while the available practices and codes of conduct are not compulsory. To make the most of the resources in digital agriculture, a smart farming platform needs to provide mechanisms ensuring compliance with the available policies in smart farming. A smart farming platform should encourage the parties to trust the services provided and to participate in collaborative practices that enhance the performance of data processing systems in agriculture applications.

Conclusions
Smart farming provides the agricultural industry with diverse data-driven services which improve different applications from farm to fork. These services benefit from the large amount of data available in smart farming ecosystems. However, a major challenge in smart farming information processing is consistency and compatibility among the utilized technologies, procedures, and protocols. To address this issue, in this paper, we suggest the platform approach, a design thinking approach that encourages the different actors in a smart farming ecosystem to facilitate collaboration among services by following a set of requirements. Moreover, we suggest six requirements for a smart farming platform: interoperability, real-time data processing, scalability, reliability, security, and compliance with farming regulations. Such a platform can enhance the available services in the agriculture industry by enabling collaboration among different service providers. Furthermore, such a platform facilitates data-sharing practices by reducing security and privacy concerns and providing a trustworthy environment for agriculture data holders. Currently, a major limitation to developing such a framework is the lack of unified protocols and standards, and more effort from technology providers and policymakers is needed to address this issue. In future work, we aim to develop and implement a sample of the proposed framework. To this end, we plan to utilize available software and hardware products that follow similar protocols and are therefore compatible.