AGRICLIMA: Towards a Federated Platform for Spatiotemporal Risk Analysis in Agriculture
Abstract
1. Introduction
1.1. Climate Change Challenges in Agriculture
1.2. Data Sources for Agricultural Risk Analysis
1.3. Analytical Technologies and Their Requirements
1.4. The Data Fragmentation Problem
1.5. Emerging Paradigms for Data Integration
1.6. Research Objectives and Contributions
- Scalable Open-Source Architecture: We provide a modular, cloud-native platform architecture for robust data integration that supports both batch and real-time data access. The architecture is built entirely on mature open-source components (JupyterHub, GeoNetwork, Keycloak, Apache Airflow, PostgreSQL/PostGIS, MinIO, GeoServer), ensuring reproducibility and broad potential for adoption. The modular design allows individual components to be replaced or extended without disrupting the overall system. This choice of mature, widely adopted components is particularly important given the limited long-term funding certainty characteristic of research infrastructure.
- FAIR-Compliant Data Management: We implement practical mechanisms to support FAIR (Findable, Accessible, Interoperable, Reusable) principles specifically for spatiotemporal agricultural data. This includes adoption of domain-relevant metadata standards (INSPIRE and RNDT for geospatial data), structured metadata entry through guided onboarding sessions with data providers, a searchable catalog for dataset discovery, and standardized access protocols. Importantly, our implementation demonstrates how FAIR principles can be operationalized in a multi-stakeholder environment with both public and private datasets.
- Privacy-Preserving Abstracted Data Access: We provide custom components (AgriclimaClient and Policy Engine) that enable secure, simplified data access for analysts while preserving governance requirements. These components abstract the complexity of policy verification, credential management, and data fetching, allowing researchers to focus on analysis rather than infrastructure. The architecture explicitly supports both shared public datasets and private workspace repositories, enabling confidential data to remain under provider control while still being accessible for authorized analytical workflows.
1.7. Paper Organization
2. Related Work
2.1. Agricultural Data Platforms and Integration Efforts
2.2. FAIR Principles and Agricultural Data
2.3. Federated Learning and Distributed Data Analysis
2.4. Dataspaces and Data Sovereignty
2.5. Semantic Interoperability and Ontologies
2.6. Positioning of This Work
- Operational implementation: Rather than proposing architectural concepts, we present a working platform deployed on cloud infrastructure and evaluated with real datasets and users. This demonstrates practical feasibility and reveals implementation challenges.
- Multi-stakeholder data governance: Our platform explicitly addresses the coexistence of public and private datasets with different access policies, commercial and research users, and varying data sensitivity levels. This reflects realistic requirements for agricultural risk analysis.
- Spatiotemporal focus: The architecture specifically supports spatiotemporal data types and analysis workflows common in agricultural and environmental domains, including geospatial metadata standards, temporal alignment mechanisms, and appropriate storage systems for raster and vector data.
- FAIR operationalization: We demonstrate specific mechanisms to implement FAIR principles in practice, including metadata onboarding processes, catalog implementation, access abstraction layers, and assessment using standardized metrics.
- Open-source implementation: Complete reliance on open-source components with explicit deployment configurations enhances reproducibility and reduces barriers to adoption compared to proprietary platforms.
3. Methods
3.1. Research Approach and Context
3.1.1. Project Organization
3.1.2. Development Methodology
3.1.3. Research Questions
- RQ1—FAIR Compliance: To what extent does the implemented platform achieve FAIR (Findable, Accessible, Interoperable, Reusable) compliance for integrated datasets, and what are the specific strengths and weaknesses across the four FAIR dimensions?
- RQ2—Operational Stability: Does the platform demonstrate stable operation under normal use conditions, and what are the resource utilization characteristics that inform future scaling decisions?
- RQ3—Usability for Analysis: Can researchers effectively conduct spatiotemporal analyses using the platform with minimal infrastructure overhead, and what specific mechanisms enable this capability?
3.2. Platform Architecture Design
- Data integration: Unify heterogeneous datasets from multiple providers that differ in formats, access methods, and policies
- Advanced analytics: Enable researchers to conduct sophisticated spatiotemporal analyses without managing infrastructure complexity
- Practical application: Support development of demonstrators and decision-support tools by domain experts
3.2.1. Modular Architecture Rationale
- Maintainability: Components can be updated independently without propagating changes throughout the system
- Scalability: Individual components can be scaled (vertically or horizontally) based on their specific resource demands
- Replaceability: Alternative implementations of components can be substituted (e.g., different catalog software) without disrupting the overall system
- Testability: Components can be tested in isolation, simplifying quality assurance
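The replaceability and testability properties listed above can be illustrated with a small interface sketch. Everything in it is hypothetical: the `Catalog` contract, the `InMemoryCatalog` stand-in, and the dataset identifiers are invented for illustration and do not reflect the platform's actual code. The point is only that caller code written against a narrow contract keeps working when the backend behind it is swapped.

```python
from typing import Protocol


class Catalog(Protocol):
    """Hypothetical contract that any catalog backend would satisfy."""

    def search(self, keyword: str) -> list[str]:
        """Return identifiers of datasets whose title contains the keyword."""
        ...


class InMemoryCatalog:
    """Stand-in backend, useful for testing callers in isolation."""

    def __init__(self, titles: dict[str, str]) -> None:
        self._titles = titles  # dataset id -> title

    def search(self, keyword: str) -> list[str]:
        needle = keyword.lower()
        return sorted(
            ds_id for ds_id, title in self._titles.items()
            if needle in title.lower()
        )


def discover(catalog: Catalog, keyword: str) -> list[str]:
    """Caller code depends only on the Catalog contract, not on a backend."""
    return catalog.search(keyword)


# Illustrative identifiers and titles only.
catalog = InMemoryCatalog({
    "ds-01": "Snow water equivalent maps",
    "ds-02": "NDVI from Sentinel-2",
})
found = discover(catalog, "ndvi")
```

A different backend (for example, one that queries a real catalog service) could be passed to `discover` unchanged, provided it exposes the same `search` method.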
3.2.2. Component Descriptions
- Metadata and Discovery Service (MDS): Serves as the platform’s metadata repository, ensuring datasets are discoverable, well-documented, and interoperable. It supports standardized metadata schemas (ISO 19115, INSPIRE, RNDT), facilitates dataset reusability, and provides mechanisms for transparent dataset registration and discovery. The searchable catalog allows users to browse available datasets by various criteria (spatial coverage, temporal range, topic, provider, license).
- Analytical and Visualization Environment (AVE): Provides computational resources for advanced data analysis, geospatial modeling, and visualization. It enables collaborative research through shared notebook environments, development of analytical tools using Python/R/Julia, and application of machine learning and artificial intelligence methods. The environment integrates directly with the platform’s data access mechanisms, abstracting infrastructure complexity from researchers.
- Data Ingestion and Harmonization (DIH): Automates the extraction and integration of heterogeneous datasets from various sources. It ensures consistent, up-to-date data pipelines through scheduled workflows and facilitates metadata consistency while preserving original data formats. The component monitors source data for updates and manages the caching layer that improves access performance.
- Access and Identity Governance (AIG): Manages user authentication, authorization, and policy enforcement. It guarantees that access to datasets is aligned with provider-defined policies and complies with privacy and regulatory standards, thereby supporting secure and responsible data sharing. The component implements Single Sign-On (SSO) across all platform services and integrates with external identity providers when needed.
- Data Management and Storage (DMS): Comprises two complementary repositories. The Shared Data Repository (SDR) hosts datasets accessible to authorized users according to their access level, promoting collaboration and reuse. The Private Workspace Repository (PWR) allows users to store and analyze sensitive or user-contributed datasets securely, maintaining data confidentiality while supporting personalized research workflows. This separation is critical for the multi-stakeholder environment where not all data can be fully public.
3.2.3. Component Interactions Supporting FAIR Principles
- Findability: The DIH component ingests and normalizes data while the MDS registers comprehensive metadata generated through guided onboarding sessions with data providers. This metadata is published in a searchable catalog with full-text search and faceted filtering, ensuring datasets are easily discoverable by both humans (through web interface) and machines (through catalog APIs).
- Accessibility: The MDS ensures that data and metadata are retrievable using standard, open communication protocols (HTTP, OGC WMS/WFS/WCS). The AIG establishes clear authentication and authorization procedures through OAuth2/OIDC protocols, enabling users to access only the resources to which they are entitled. Importantly, metadata remains openly accessible even when data itself is restricted, fostering discovery while respecting access policies.
- Interoperability: Heterogeneous data sources (remote sensing imagery, model outputs, tabular IoT data) are made accessible through standardized interfaces. For geospatial data, this includes OGC-compliant services (GeoServer). Common file formats (NetCDF for gridded data, GeoTIFF for raster imagery, GeoPackage for vector data) are used where appropriate. Metadata follows established standards (ISO 19115 for geospatial, Dublin Core for general resources).
- Reusability: In both SDR and PWR, datasets are stored with provenance information tracked through the MDS. Metadata includes explicit licensing information (Creative Commons, Open Data licenses, or commercial terms) and detailed documentation about data collection methods, processing steps, and known limitations. This rich context enables other researchers to understand, validate, and reuse data for new applications, ensuring long-term value.
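As an illustration of the standard-protocol access described under Accessibility and Interoperability, the sketch below builds an OGC WFS 2.0.0 `GetFeature` request URL with the Python standard library. The endpoint URL, the layer name, and the bounding box are placeholders, not actual platform values; `outputFormat=application/json` is a format GeoServer supports, but other servers may differ.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def wfs_getfeature_url(base_url, type_name, bbox, crs="EPSG:4326"):
    """Build a WFS 2.0.0 GetFeature request URL asking for GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": "application/json",
        # Four corner values followed by the CRS identifier. Axis order
        # conventions depend on the CRS notation and the server, so check
        # the server documentation before relying on it.
        "bbox": ",".join(str(v) for v in bbox) + "," + crs,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and layer name (assumptions for illustration).
url = wfs_getfeature_url(
    "https://example.org/geoserver/wfs",
    "demo:field_boundaries",
    (10.5, 45.7, 11.9, 46.5),
)
query = parse_qs(urlparse(url).query)
```

Because the request is plain HTTP with standard parameters, the same URL can be consumed by a browser, a GIS desktop client, or a notebook.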
3.2.4. Key Architectural Features
- FAIR Alignment: Provides the overarching framework, ensuring that datasets, whether public or private, can be shared under clear policies while maintaining transparency, scalability, and reproducibility. Within this framework, particular emphasis is placed on practical realization of interoperability through standardized data formats, open communication protocols, and adoption of community-accepted standards for the specific geospatial agricultural domain.
- Access and Identity Governance: The AIG component manages authentication, authorization, and monitoring, ensuring regulatory compliance and controlled use of both sensitive and openly available information. This builds trust among stakeholders and encourages responsible data sharing, which is essential in an environment where some data has commercial value.
- Support for Innovation: The platform supports innovation by streamlining data preparation (researchers don’t manage authentication, format conversion, or policy checking manually) and by providing contextual metadata that enhances analytical workflows. This enables development of AI-based applications, new risk metrics, and predictive models to inform agricultural decision-making, with researchers focusing on methods rather than infrastructure.
3.2.5. Technology Selection
- Analytical and Visualization Environment (AVE): Implemented using JupyterHub [35], which provides multi-user access to Jupyter notebooks. Jupyter notebooks are interactive computational documents combining code (Python, R, Julia), narrative text, equations (LaTeX), and visualizations in a single file, making them well-suited for reproducible research and data analysis. JupyterHub handles user authentication, spawns individual notebook servers for each user, and manages computational resources. The AVE integrates with the platform’s data access mechanisms through the custom AgriclimaClient Python library. This client provides a high-level interface to access datasets, abstracting policy verification and data fetching. The client interacts with the MDS to obtain dataset location and access information, validates access rules through the Policy Engine, retrieves necessary credentials from the AIG, and fetches data in the requested format.
- Metadata and Discovery Service (MDS): Implemented using GeoNetwork Open Source [36], a catalog application for spatially referenced metadata. GeoNetwork provides: an interactive web interface for metadata browsing and editing; support for metadata standards including ISO 19115/19139 (geospatial), ISO 19119 (services), INSPIRE, and Dublin Core; full-text and spatial search capabilities; a CSW (Catalogue Service for the Web) endpoint for programmatic access; and an integrated map viewer for spatial data preview. The platform includes 26 datasets provided by project partners (described in Section 3.3). New datasets can be integrated with modest effort through the defined onboarding process involving metadata template completion and data source configuration.
- Access and Identity Governance (AIG): Implemented using Keycloak [37] and HashiCorp Vault [38]. Keycloak provides Identity and Access Management including: user authentication and authorization; integration with external identity providers (LDAP, Active Directory, SAML, OIDC); role-based and fine-grained permissions; and Single Sign-On across platform services. Vault provides secrets management with identity-based security, including: secure storage of credentials for accessing data sources; dynamic secret generation; and audit logging. A custom component called the AgriclimaPolicy Engine orchestrates interactions between Keycloak and Vault. It provides a REST API called by the AgriclimaClient to check access permissions and retrieve appropriate credentials for specific datasets based on user identity and roles.
- Data Ingestion and Harmonization (DIH): Implemented using Apache Airflow [39], a platform for programmatically authoring, scheduling, and monitoring workflows. Airflow workflows are defined as Directed Acyclic Graphs (DAGs) in Python, specifying: data source connections and extraction logic; metadata consistency checks; caching and synchronization schedules; and monitoring and alerting rules. Important implementation note: The DIH preserves original data formats provided by sources. It does not perform format transformation or harmonization. This design decision reflects the fact that most contributed datasets already use standard formats appropriate for their data type (NetCDF for gridded climate/weather data, GeoTIFF for raster imagery, GeoPackage for vector features, CSV/Parquet for tabular data). The DIH ensures these datasets remain current through scheduled updates, handles both batch transfers and API-based access as appropriate per source, and validates metadata consistency. Airflow’s scheduling capabilities enable automatic monitoring of dataset updates, with alerts when sources become unavailable or data quality checks fail.
- Data Management and Storage (DMS): Implemented using PostgreSQL [40] with PostGIS extension, MinIO [41], and GeoServer [42]:
  - PostgreSQL/PostGIS: Relational database with spatial capabilities, primarily used for structured tabular data (farm records, insurance data, sensor observations), vector geospatial features (field boundaries, sensor locations), and metadata storage for the MDS.
  - MinIO: S3-compatible object storage for unstructured data including raster images (satellite imagery, climate model outputs), NetCDF files (multidimensional climate and weather data), and user-generated outputs.
  - GeoServer: OGC-compliant geospatial data server providing standardized access through Web Map Service (WMS) for map visualization, Web Feature Service (WFS) for vector data access, and Web Coverage Service (WCS) for raster data access. GeoServer internally accesses data from PostgreSQL/PostGIS and MinIO, presenting them through standard protocols.
The separation between Shared Data Repository (SDR) and Private Workspace Repository (PWR) is enforced through integration with AIG. Keycloak manages users and groups, which are mapped to source-specific access policies: GeoServer layer permissions for spatial data, MinIO bucket policies for object storage, and PostgreSQL row-level security for tabular data.

Important storage note: Due to high data volumes and provider infrastructure constraints, datasets available externally (at data providers’ premises through their APIs or file servers) are not fully replicated in the SDR. Instead, the DIH implements a caching layer that stages frequently accessed data and maintains consistency with sources through scheduled updates. In contrast, private data uploaded by individual users is persistently stored in the PWR with appropriate backup procedures.
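The staging behavior of the caching layer can be sketched in a few lines. This is a simplified stand-in, not the platform's implementation: the platform refreshes cached data through scheduled workflows, whereas the sketch substitutes an on-demand time-to-live check, and the `StagingCache` class, the fake source, and the dataset identifier are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    payload: bytes
    fetched_at: float  # time of last fetch, in seconds


class StagingCache:
    """Hypothetical cache that re-fetches from the source once a copy is stale."""

    def __init__(self, fetch, ttl_seconds, clock):
        self._fetch = fetch        # callable: dataset_id -> bytes
        self._ttl = ttl_seconds    # maximum age of a staged copy
        self._clock = clock        # callable returning the current time
        self._entries = {}

    def get(self, dataset_id):
        now = self._clock()
        entry = self._entries.get(dataset_id)
        if entry is not None and now - entry.fetched_at < self._ttl:
            return entry.payload   # fresh enough: serve the staged copy
        payload = self._fetch(dataset_id)  # missing or stale: go to the source
        self._entries[dataset_id] = CacheEntry(payload, now)
        return payload


# Fake provider and controllable clock, so the behavior is deterministic.
calls = []
now = [0.0]


def fake_source(dataset_id):
    calls.append(dataset_id)
    return f"{dataset_id}-v{len(calls)}".encode()


cache = StagingCache(fake_source, ttl_seconds=3600, clock=lambda: now[0])
first = cache.get("weather-reanalysis")
second = cache.get("weather-reanalysis")   # within the TTL: no new fetch
now[0] = 7200.0
third = cache.get("weather-reanalysis")    # TTL expired: fetched again
```

Injecting the clock and the fetch function keeps the sketch testable without network access or real waiting.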
3.2.6. Platform Workflow
- Data Provider Integration: Partners contribute datasets in standard formats (NetCDF, GeoTIFF, CSV, etc.) either through file transfer to platform storage or by providing API access to data hosted at their facilities. Providers participate in guided metadata onboarding sessions where they complete structured metadata templates covering required elements (title, description, creator, temporal and spatial coverage, access conditions, license, etc.).
- Metadata Registration: The DIH validates metadata completeness and consistency, then registers datasets in the MDS (GeoNetwork). The MDS organizes metadata within the searchable catalog and assigns persistent identifiers. Metadata follows the ISO 19115 standard for geospatial data, with particular emphasis on elements required by the INSPIRE directive and the RNDT (Repertorio Nazionale dei Dati Territoriali) profile.
- Data Access Configuration: The DIH configures appropriate data access mechanisms depending on source characteristics: batch transfer and caching for static historical datasets; API-based access for dynamic or very large datasets; and streaming access for real-time sensor data. Access rules based on dataset licensing and sensitivity are configured in Keycloak and mapped to technical enforcement points (GeoServer permissions, MinIO policies, database access controls).
- User Discovery and Access: Researchers browse the catalog through the MDS web interface or programmatic API to identify relevant datasets. Detailed metadata helps assess dataset suitability (spatial/temporal coverage, variables, quality, license). From the AVE (JupyterHub), researchers use the AgriclimaClient library to request data. The client transparently handles: metadata lookup via MDS; permission verification via Policy Engine and AIG; credential retrieval; and data fetching from appropriate storage (GeoServer, MinIO, PostgreSQL) or source API.
- Analysis and Output: Researchers conduct analyses in Jupyter notebooks with data in memory, benefiting from a rich Python/R ecosystem for geospatial analysis (GDAL, rasterio, geopandas), statistical analysis (numpy, scipy, pandas), machine learning (scikit-learn, tensorflow, pytorch), and visualization (matplotlib, plotly, folium). Private datasets can be uploaded to PWR and integrated with platform datasets. Analysis outputs are saved to PWR for personal use or optionally contributed back to SDR for sharing.
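The four steps the client handles in the User Discovery and Access stage (metadata lookup, permission verification, credential retrieval, data fetching) can be sketched as follows. This is not the AgriclimaClient API, which the text does not specify: `SketchClient`, `DatasetRecord`, the role names, the storage locations, and the credential strings are all assumptions, and plain dictionaries stand in for the MDS, the AIG, and the secrets store.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    dataset_id: str
    location: str        # where the data lives (storage URI or source API)
    required_role: str   # role needed to read the dataset


class AccessDenied(Exception):
    """Raised when the current user lacks the role a dataset requires."""


class SketchClient:
    """Hypothetical stand-in for the four-step access flow."""

    def __init__(self, catalog, user_roles, secrets, fetchers):
        self._catalog = catalog        # dataset_id -> DatasetRecord (catalog stand-in)
        self._user_roles = user_roles  # roles of the current user (identity stand-in)
        self._secrets = secrets        # location -> credential (secrets store stand-in)
        self._fetchers = fetchers      # URI scheme -> callable(location, credential)

    def get(self, dataset_id):
        record = self._catalog[dataset_id]                # 1. metadata lookup
        if record.required_role not in self._user_roles:  # 2. permission check
            raise AccessDenied(dataset_id)
        credential = self._secrets[record.location]       # 3. credential retrieval
        scheme = record.location.split("://", 1)[0]
        return self._fetchers[scheme](record.location, credential)  # 4. data fetch


# Illustrative values only.
catalog = {"ndvi": DatasetRecord("ndvi", "s3://shared/ndvi.tif", "researcher")}
secrets = {"s3://shared/ndvi.tif": "token-123"}
fetchers = {"s3": lambda loc, cred: f"fetched {loc} with {cred}"}

client = SketchClient(catalog, {"researcher"}, secrets, fetchers)
result = client.get("ndvi")

guest = SketchClient(catalog, {"guest"}, secrets, fetchers)
try:
    guest.get("ndvi")
    guest_denied = False
except AccessDenied:
    guest_denied = True
```

From the analyst's side only the single `get` call is visible, which is the abstraction the workflow describes: the permission check happens before any credential is read, so a denied user never touches the secrets store.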
3.2.7. Deployment Configuration
- Kubernetes cluster with autoscaling between 3 and 6 nodes (Standard B2ms VMs: 2 vCPUs, 8 GiB RAM each), providing 6–12 total vCPUs and 24–48 GiB total RAM
- Azure Managed PostgreSQL database (Standard B2s: 2 vCores, 4 GiB RAM, 32 GiB storage) for structured data and metadata
- Persistent storage: 399 GiB on Azure Managed Disk (for containers and smaller datasets) and 100 GiB on Azure Blob (for user workspaces)
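The capacity range quoted for the cluster follows directly from the per-node figures, as the short check below shows. It uses only numbers stated in the list above; the function name is illustrative.

```python
VCPUS_PER_NODE = 2       # Standard B2ms
RAM_GIB_PER_NODE = 8     # Standard B2ms
MIN_NODES, MAX_NODES = 3, 6


def cluster_capacity(nodes):
    """Return (total vCPUs, total RAM in GiB) for a given node count."""
    return nodes * VCPUS_PER_NODE, nodes * RAM_GIB_PER_NODE


low = cluster_capacity(MIN_NODES)    # autoscaler floor
high = cluster_capacity(MAX_NODES)   # autoscaler ceiling
```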
3.3. Dataset Integration
- Weather and Climate (11 datasets): Historical meteorological reanalysis, weather forecasts, climate scenarios to 2050, and extreme event indices. Includes both public datasets at 10 × 10 km resolution and commercial high-resolution (1 × 1 km) versions.
- Phenology (2 datasets): Crop development models including a neural network model for vine phenology (PhenoCNN) and phenological stage models for extensive crops (wheat, corn, tomato) based on Growing Degree Days.
- Hydrology (3 datasets): Snow water equivalent maps and hydrological cycle component modeling (surface/deep water content, river flows, evapotranspiration), covering the Po and Crati basins.
- Agricultural Land Use (4 datasets): Homogeneous land use area segmentation, automatic crop type classification from Sentinel-2, LUCAS land cover surveys, and digital terrain models (DTM).
- Crop Morphological Development (2 datasets): Vegetation indices (MSAVI2, NDVI) derived from Sentinel-2 optical satellite data at 10 × 10 m resolution with 5-day temporal granularity.
- Farm Registries and Insurance (2 datasets): Anonymized farm management records and insurance position data for the Trentino region, with 15 years of historical records. Access is controlled to preserve farmer confidentiality.
- Historical Events and Damages (2 datasets): Reported adverse events (types and intensity) and surveyed field damage data for major crops (apples, vines), collected from on-site inspections following extreme weather events.
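Three of the quantities named in the dataset list have compact standard definitions, sketched below: daily Growing Degree Days by the simple averaging method, NDVI, and MSAVI2. The formulas are the commonly used textbook forms and may differ in detail from the providers' implementations; the base temperature of 10 °C and all sample values are assumptions chosen for illustration.

```python
import math


def growing_degree_days(t_min, t_max, t_base=10.0):
    """Daily GDD, simple averaging method; t_base is an assumed, crop-dependent value."""
    return max(0.0, (t_min + t_max) / 2.0 - t_base)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def msavi2(nir, red):
    """Second Modified Soil-Adjusted Vegetation Index."""
    a = 2.0 * nir + 1.0
    return (a - math.sqrt(a * a - 8.0 * (nir - red))) / 2.0


# Illustrative daily (t_min, t_max) pairs in degrees Celsius.
days = [(8.0, 20.0), (5.0, 11.0), (12.0, 26.0)]
accumulated_gdd = sum(growing_degree_days(lo, hi) for lo, hi in days)

# Illustrative reflectance values.
sample_ndvi = ndvi(0.5, 0.1)
sample_msavi2 = msavi2(0.5, 0.1)
```

Phenological stage models of this kind typically compare the accumulated value against crop-specific thresholds; those thresholds are not given in the text and are not assumed here.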
3.4. Evaluation Methodology
3.4.1. FAIR Compliance Assessment (RQ1)
3.4.2. System Performance Monitoring (RQ2)
3.4.3. Demonstrative Use Case (RQ3)
- Dataset discovery via the MDS catalog.
- Data access using the AgriclimaClient library (metadata lookup, permission verification, data fetching).
- Integration of private spatial indicators uploaded to PWR.
- Geospatial operations (grid-to-vector joining) using standard Python libraries.
- Temporal aggregation and weighted calculation.
- Interactive visualization with region selection.
Algorithm 1. Weighted Meteo Data Aggregation
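The last computational steps of the use case (grid-to-vector joining, temporal aggregation, and weighted calculation) can be sketched as a weighted regional mean. This is one plausible reading under stated assumptions, not the authors' algorithm: the temporal aggregate is taken to be a plain mean, the grid-to-vector join is assumed to have already produced a cell-to-region mapping, and the weights stand in for a private spatial indicator. All names and values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def weighted_regional_mean(cell_series, cell_region, cell_weight):
    """
    cell_series: cell id -> list of values over time (e.g. daily temperature)
    cell_region: cell id -> region id (outcome of a grid-to-vector join)
    cell_weight: cell id -> non-negative weight (e.g. a spatial indicator)
    Returns region id -> weighted mean of the per-cell temporal means.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for cell, series in cell_series.items():
        region = cell_region[cell]
        weight = cell_weight[cell]
        weighted_sum[region] += weight * fmean(series)  # temporal aggregation per cell
        weight_total[region] += weight
    # Regions whose weights sum to zero are skipped to avoid division by zero.
    return {
        region: weighted_sum[region] / weight_total[region]
        for region in weighted_sum
        if weight_total[region] > 0
    }


# Illustrative inputs: three grid cells falling into two regions.
cell_series = {"a": [10.0, 12.0], "b": [20.0, 22.0], "c": [5.0, 7.0]}
cell_region = {"a": "R1", "b": "R1", "c": "R2"}
cell_weight = {"a": 1.0, "b": 3.0, "c": 2.0}

regional = weighted_regional_mean(cell_series, cell_region, cell_weight)
```

In region R1 the cell with weight 3 pulls the result toward its own mean of 21, giving 18.5 rather than the unweighted 16, which is the effect a weighting indicator is meant to have.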
3.4.4. Limitations of Evaluation Approach
- User base size: Evaluation involved 4 concurrent users from a 16-person team, not a large diverse user population. User experience assessment is qualitative (researcher feedback) rather than systematic usability testing.
- Performance scope: System monitoring characterized resource usage under normal load, not performance limits or scalability boundaries. Load testing and stress testing remain future work.
- Comparative analysis: No systematic comparison with other agricultural data platforms was conducted.
- Temporal scope: Evaluation occurred during the initial 10-month deployment. Long-term sustainability, maintenance burden, and evolution patterns are not assessed.
- Generalizability: Results are specific to the Italian agricultural context, the particular consortium composition, and the selected demonstrator applications. Generalization to other contexts requires additional validation.
4. Evaluation
4.1. FAIR Evaluation
4.2. System Performance and Operational Metrics
4.2.1. Cluster-Level Metrics
4.2.2. Node-Level Metrics
4.2.3. Pod-Level Metrics
4.2.4. Storage Metrics
4.3. Demonstrator: Weighted Spatiotemporal Aggregation of Meteorological Data
5. Discussion
5.1. Interpretation of FAIR Compliance Results
5.1.1. Accessibility Success
5.1.2. Findability and Reusability Gaps
5.1.3. Interoperability Variance
5.1.4. Comparison with Prior Agricultural Data Assessments
5.2. System Performance and Scalability Considerations
5.2.1. Current Resource Utilization
5.2.2. Scalability Outlook and Unknowns
5.2.3. Storage Efficiency
5.3. Platform Workflow and Usability Insights
5.4. Limitations and Threats to Validity
- Evaluation Scope Limitations: The evaluation’s transferability is limited by a small user base (4 concurrent users) reflecting only early-stage usage, a 10-month temporal scope missing long-term sustainability insights, and geographic/domain specificity to Italian agriculture and the consortium. Furthermore, performance monitoring only covered normal operational usage, not scalability limits or stress response, and there was no systematic comparative analysis against alternative platforms.
- Methodological Limitations: Methodological constraints include the FAIR assessment’s focus on technical implementation rather than semantic interoperability or actual reuse success. Performance monitoring metrics characterized resource utilization but not user experience or analysis throughput. The use case demonstration was singular, lacking rigorous usability study methodology, and relied on crude measures like lines of code to estimate effort reduction.
- External Validity Concerns: Transferability is potentially limited by the consortium model dependency (raising questions about cooperation in market-driven settings), reliance on institutional backing and resources not available to smaller groups, and cultural factors like the willingness of commercial data providers to participate. Additionally, the technical homogeneity of initial partners suggests higher integration barriers in environments with legacy systems or proprietary formats. Successful platform development requires appropriate organizational, economic, and social conditions, not just technical solutions.
5.5. Lessons Learned
5.5.1. Metadata Is a Social Problem
5.5.2. Access Control Complexity
5.5.3. The Value of Standard Components
5.6. Implications for Agricultural Data Infrastructure
5.6.1. Practical FAIR Implementation
5.6.2. Balancing Openness and Control
5.6.3. Moving Beyond Dataspaces Rhetoric
5.6.4. Practical Semantic Interoperability
- Regional climate risk assessments: Semantic standards for weather, soil, and crop data within limited geographic areas, like the Trentino region, facilitate comparison and strategy validation. The smaller scale aids consensus.
- Recurring event-driven analyses: Standardizing input data for repeated analyses, such as seasonal frost or drought forecasting, allows systematic comparison across seasons and regions, supporting policy.
- Cross-border regulatory compliance: Policy-mandated data exchange, such as for EU reporting, provides the institutional motivation needed for semantic harmonization of compliance-critical data.
- Specific crop-hazard combinations: Focused efforts on specific problems, like apple frost risk or grapevine disease, limit the scope and enable consensus within smaller stakeholder communities, achieving practical interoperability.
5.6.5. Integration with Emerging AI Capabilities
5.7. Platform Evolution
6. Conclusions
6.1. Summary of Contributions
- Scalable Open-Source Architecture: We provided a modular, cloud-native platform architecture built entirely on mature open-source components (JupyterHub, GeoNetwork, Keycloak, Apache Airflow, PostgreSQL/PostGIS, MinIO, GeoServer), ensuring reproducibility and reducing barriers to adoption. The architecture supports both batch and real-time data access through its abstraction layer, handling heterogeneous data types from multiple providers with different access policies.
- FAIR-Compliant Data Management: We demonstrated practical mechanisms to operationalize FAIR principles for spatiotemporal agricultural data, including adoption of domain-relevant metadata standards (INSPIRE, RNDT), structured metadata entry through guided onboarding sessions, searchable catalog implementation, and standardized access protocols. The evaluation using F-UJI metrics showed 80% overall FAIR compliance across 26 integrated datasets, substantially exceeding prior agricultural dataset assessments.
- Privacy-Preserving Abstracted Data Access: We provided custom components (AgriclimaClient and Policy Engine) that enable simplified data access for analysts while preserving governance requirements. The architecture explicitly supports both shared public datasets and private workspace repositories, enabling confidential data to remain under provider control while still being accessible for authorized analytical workflows. The demonstrative use case validated that researchers can conduct spatiotemporal analyses with minimal infrastructure overhead.
6.2. Research Questions and Key Findings
6.3. Limitations and Scope
- The evaluation involved a small user base (4 concurrent users from a 16-person team) during initial deployment, not a large-scale production environment.
- System monitoring characterized normal operational usage, not performance limits or breaking points.
- No systematic comparison with alternative platforms was conducted, due to limited access to such platforms.
- The demonstrative use case focused on platform workflow validation, not advanced AI capabilities.
- Results are specific to Italian agricultural contexts and the particular consortium composition.
6.4. Broader Impact on Agricultural Data Infrastructure
- FAIR principles can be operationalized: The 80% compliance demonstrates that practical implementation of FAIR principles in agriculture is achievable with moderate effort through adoption of existing standards, structured onboarding processes, and appropriate infrastructure. However, achieving full FAIR maturity requires sustained investment in metadata quality, not just initial technical implementation.
- Unified platforms improve outcomes: Platform-mediated data sharing achieved better FAIR compliance than heterogeneous repository federation, justifying investment in integrated infrastructure despite coordination overhead. This suggests value in agricultural data platform development alongside data generation efforts.
- Commercial and open data can coexist: The architecture successfully accommodates datasets with commercial restrictions alongside public data, providing a pragmatic model that engages private sector data providers while maintaining open access where appropriate. This balance is essential for agricultural sectors where significant added-value data comes from commercial sources.
- Standard components reduce risk: Complete reliance on mature open-source components rather than custom development improved sustainability prospects and development speed, a particularly important consideration for research infrastructure with uncertain long-term funding.
- Use case focus enables progress: Rather than attempting to build generic dataspace infrastructure for abstract future needs, focusing on concrete use cases (spatiotemporal agricultural risk analysis) with specific stakeholders, datasets, and workflows provided requirements clarity and validation criteria essential for progress.
6.5. Future Work
6.6. Concluding Remarks
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Area | ID | Description | Spatial resolution | Time span | Temporal granularity | Public/Private | Size/API | Format |
|---|---|---|---|---|---|---|---|---|
| A1 | CD1a | Climate scenarios | 10 × 10 km | 2020–2050 | Daily | Public | 25 GB | NetCDF |
| A1 | CD1b | Chronic and acute indices | 10 × 10 km | 2020–2050 | Daily | Public | 2 MB | NetCDF |
| A1 | CD2a | Extreme events index (hindcast) | 30 × 30 km | 1994–2016 | Monthly | Public | 6 GB | NetCDF |
| A1 | CD2b | Extreme events index (forecast) | 30 × 30 km | from 2017 | Monthly | Public | 200 MB | NetCDF |
| A1 | CD2c | Extreme events index (Projections) | 30 × 30 km | 2015–2100 | Yearly | Public | 300 MB | NetCDF |
| A1 | CD3a | Commercial version of CD1a | 5.5 × 5.5 km | 1994–2016 | Monthly | Private | 82 GB | NetCDF |
| A1 | CD3b | Commercial version of CD1b | 5.5 × 5.5 km | from 2017 | Monthly | Private | 11 GB | NetCDF |
| A1 | CD3c | Commercial version of CD1c | 5.5 × 5.5 km | 2015–2100 | Yearly | Private | 25 GB | NetCDF |
| A1 | MD1 | Weather reanalysis | 10 × 10 km | 1991–2023 | Daily | Public | 100 MB | NetCDF |
| A1 | MD2 | Commercial version of MD1 | 1 × 1 km | 1991–2023 | Daily | Private | API | NetCDF |
| A1 | MD3 | Weather forecast | 4 × 4 km | 15 days | Daily | Public | API | NetCDF |
| A2 | FM1 | Vine phenology model PhenoCNN | 1 km grid | Feb–Nov | Daily | Public | API | JSON |
| A2 | FM2 | Phenology model (selected crops) | 1 km grid | Sowing to harvest | Daily | Public | API | GeoTIFF |
| A3 | ID1 | Snow water equivalent maps | 500 × 500 m | 1992–2022 | Daily | Public | 20 GB | NetCDF |
| A3 | ID2 | Snow water equivalent | 10 × 10 km | 2010–2023 | Daily | Public | 460 MB | NetCDF |
| A3 | IM1 | Modeling of hydrological cycle components | 2 to 5 km² | 2000–2023 | Daily | Public | 56 MB | CSV |
| A4 | SD1 | Mapping of homogeneous land use areas | 10 × 10 m | Selected years | Semi-annual | Public | 1 GB | GeoTIFF |
| A4 | SD2 | Automatic classification of crop types | 10 × 10 m | Selected years | Semi-annual | Public | 1 GB | GeoTIFF |
| A4 | SD3 | Land use/cover LUCAS | Point-based | 2018 | Event-based | Public | API | GeoTIFF |
| A4 | SD4 | Digital Terrain Models | 10 × 10 m | N/A | Event-based | Public | API | GeoTIFF |
| A5 | SCD3a | MSAVI2 index (from Sentinel-2) | 10 × 10 m | 2017 | 5 days | Public | API | GeoTIFF |
| A5 | SCD3b | NDVI index (from Sentinel-2) | 10 × 10 m | 2017 | 5 days | Public | API | GeoTIFF |
| A6 | AD1 | Anonymized data from farm files | Point-based | Last 15 years | Event-based | Public | API | GeoPackage |
| A6 | AD2 | Data regarding insurance positions | Point-based | Last 15 years | Event-based | Public | API | GeoPackage |
| A7 | ED1 | Historical reported adverse events | Point-based | Last 15 years | Event-based | Public | API | GeoPackage |
| A7 | ED3 | Historical damage data for crops | Point-based | Last 15 years | Event-based | Public | API | GeoPackage |

| Principle | ID | Description | Score |
|---|---|---|---|
| Findable | F1-01MD | Metadata and data are assigned a globally unique identifier. | 1 |
| Findable | F1-02MD | Metadata and data are assigned a persistent identifier. | 1 |
| Findable | F2-01M | Metadata includes descriptive core elements (creator, title, data identifier, publisher, publication date, summary and keywords) to support data findability. | 2 |
| Findable | F3-01M | Metadata includes the identifier of the data it describes. | 1 |
| Findable | F4-01M | Metadata is offered in such a way that it can be registered or indexed by search engines. | 2 |
| Accessible | A1-01M | Metadata contains access level and access conditions of the data. | 1 |
| Accessible | A1-02MD | Metadata and data are retrievable by their identifier. | 2 |
| Accessible | A1.1-01MD | A standardized communication protocol is used to access metadata and data. | 2 |
| Accessible | A1.2-01MD | Metadata and data are accessible through a standardized communication protocol which supports authentication. | 2 |
| Interoperable | I1-01M | Metadata is represented using a formal knowledge representation language. | 2 |
| Interoperable | I2-01M | Metadata uses registered semantic resources. | 2 |
| Interoperable | I3-01M | Metadata includes qualified references between the data and its related entities. | 2 |
| Reusable | R1-01M | Metadata specifies the content of the data. | 2 |
| Reusable | R1.1-01M | Metadata includes license information under which data can be reused. | 1 |
| Reusable | R1.2-01M | Metadata includes provenance information about data creation or generation. | 1 |
| Reusable | R1.3-01M | Metadata follows a standard recommended by the target research community of the data. | 1 |
| Reusable | R1.3-02D | Data is available in a file format recommended by the target research community. | 1 |

| Area | ID | F Score | A Score | I Score | R Score |
|---|---|---|---|---|---|
| A1 | CD1a | 4 | 7 | 6 | 4 |
| A1 | CD1b | 4 | 7 | 6 | 4 |
| A1 | CD2a | 4 | 7 | 6 | 4 |
| A1 | CD2b | 4 | 7 | 6 | 4 |
| A1 | CD2c | 4 | 7 | 6 | 4 |
| A1 | CD3a | 2 | 7 | 2 | 3 |
| A1 | CD3b | 2 | 7 | 2 | 3 |
| A1 | CD3c | 2 | 7 | 2 | 3 |
| A1 | MD1 | 4 | 7 | 6 | 4 |
| A1 | MD2 | 2 | 7 | 2 | 3 |
| A1 | MD3 | 4 | 7 | 6 | 4 |
| A2 | FM1 | 4 | 7 | 6 | 4 |
| A2 | FM2 | 4 | 7 | 6 | 4 |
| A3 | ID1 | 4 | 7 | 6 | 4 |
| A3 | ID2 | 4 | 7 | 6 | 4 |
| A3 | IM1 | 4 | 7 | 6 | 4 |
| A4 | SD1 | 4 | 7 | 6 | 4 |
| A4 | SD2 | 4 | 7 | 6 | 4 |
| A4 | SD3 | 4 | 7 | 6 | 4 |
| A4 | SD4 | 4 | 7 | 6 | 4 |
| A5 | SCD3a | 4 | 7 | 6 | 4 |
| A5 | SCD3b | 4 | 7 | 6 | 5 |
| A6 | AD1 | 4 | 7 | 6 | 4 |
| A6 | AD2 | 4 | 7 | 6 | 4 |
| A7 | ED1 | 4 | 7 | 6 | 4 |
| A7 | ED3 | 4 | 7 | 6 | 4 |

| Statistic | F Score | A Score | I Score | R Score |
|---|---|---|---|---|
| Mean | 3.69 | 7.00 | 5.38 | 3.88 |
| Std dev | 0.74 | 0.00 | 1.47 | 0.43 |
| Min | 2 | 7 | 2 | 3 |
| Max | 4 | 7 | 6 | 5 |

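
As a cross-check, the summary statistics can be recomputed directly from the per-dataset scores listed earlier. The following is a minimal sketch using only the Python standard library; the dictionary layout and variable names are illustrative, and the use of the sample standard deviation (n − 1 denominator) is an assumption, chosen because it agrees with the reported values up to rounding.

```python
from statistics import mean, stdev

# Per-dataset scores (F, A, I, R), transcribed from the per-dataset table.
# The four private (commercial) datasets share one lower profile.
PUBLIC = (4, 7, 6, 4)
PRIVATE = (2, 7, 2, 3)

scores = {
    "CD1a": PUBLIC, "CD1b": PUBLIC, "CD2a": PUBLIC, "CD2b": PUBLIC,
    "CD2c": PUBLIC, "CD3a": PRIVATE, "CD3b": PRIVATE, "CD3c": PRIVATE,
    "MD1": PUBLIC, "MD2": PRIVATE, "MD3": PUBLIC,
    "FM1": PUBLIC, "FM2": PUBLIC,
    "ID1": PUBLIC, "ID2": PUBLIC, "IM1": PUBLIC,
    "SD1": PUBLIC, "SD2": PUBLIC, "SD3": PUBLIC, "SD4": PUBLIC,
    "SCD3a": PUBLIC, "SCD3b": (4, 7, 6, 5),  # only dataset with R = 5
    "AD1": PUBLIC, "AD2": PUBLIC,
    "ED1": PUBLIC, "ED3": PUBLIC,
}

# Transpose rows into one tuple of values per FAIR principle.
principles = ("F", "A", "I", "R")
columns = dict(zip(principles, zip(*scores.values())))

# Sample standard deviation (n - 1) is assumed here.
summary = {
    p: {"mean": mean(v), "std": stdev(v), "min": min(v), "max": max(v)}
    for p, v in columns.items()
}

for p, s in summary.items():
    print(f"{p}: mean={s['mean']:.2f} std={s['std']:.2f} "
          f"min={s['min']} max={s['max']}")
```

The zero spread in the A column reflects that every dataset received the same Accessible score, while the lower F and I means are driven entirely by the four private datasets.
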
| Node Name | Running Pods | Node Age |
|---|---|---|
| aks-agentpool-33955547-vmss00000o | 16 | 45 days |
| aks-agentpool-33955547-vmss00000r | 28 | 39 days |
| aks-agentpool-33955547-vmss00000w | 18 | 11 days |
| aks-agentpool-33955547-vmss000011 | 13 | 3 days |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Pincheira, M.; Antonelli, F.; Vecchio, M. AGRICLIMA: Towards a Federated Platform for Spatiotemporal Risk Analysis in Agriculture. Agriculture 2025, 15, 2450. https://doi.org/10.3390/agriculture15232450





